Sunday, 17 June 2018

spaCy rule-based matcher pattern syntax questions

I am using the rule-based matcher in spaCy to look for some patterns in a text. This is an example:

pattern = [{'POS': 'DET'}, {'DEP': 'nsubj', 'OP': '+'}, {'LEMMA': 'can'}, {'ORTH': 'but'}, {'ORTH': 'need'}, {'ORTH': 'not'}]

I want to make my query more efficient, so what I want to do is:

  1. specify that the dependency of a certain token is 'nsubj' OR 'nsubjpass', i.e. add the option 'DEP':'nsubjpass' to {'DEP':'nsubj', 'OP' : '+'}
  2. add to my query that 'zero or more tokens' can occur at a certain position. {'OP':'*'} does not appear to work for this.
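For illustration, here is a minimal sketch of how both points might be written. It assumes spaCy v3: the "IN" set syntax is only available from v2.1 onwards, and `matcher.add` takes a list of patterns in v3. The names `SUBJECT`, `ANY_TOKENS`, `PATTERN` and `find_matches` are hypothetical, as is the position chosen for the optional gap.

```python
# Sketch only: assumes spaCy v3 (the "IN" set syntax needs v2.1 or later).

# 1. One token slot that accepts either dependency label.
SUBJECT = {"DEP": {"IN": ["nsubj", "nsubjpass"]}, "OP": "+"}

# 2. A slot with no attribute constraints matches any token;
#    the "*" operator lets it match zero or more of them.
ANY_TOKENS = {"OP": "*"}

PATTERN = [
    {"POS": "DET"},
    SUBJECT,
    ANY_TOKENS,  # hypothetical position for the optional gap
    {"LEMMA": "can"},
    {"ORTH": "but"},
    {"ORTH": "need"},
    {"ORTH": "not"},
]


def find_matches(doc):
    """Return the text of every span in `doc` that matches PATTERN."""
    # Imported here so the pattern itself can be inspected without spaCy.
    from spacy.matcher import Matcher

    matcher = Matcher(doc.vocab)
    matcher.add("CAN_BUT_NEED_NOT", [PATTERN])  # v3 signature: list of patterns
    return [doc[start:end].text for _, start, end in matcher(doc)]
```

`find_matches` expects a `doc` produced by a pipeline that sets POS tags, lemmas and dependency labels. On an older spaCy release without "IN", one workaround is probably to register two separate patterns, one per dependency label.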

My questions thus relate to pattern syntax, and the spaCy documentation offers little to no help here.

Any ideas on how I can write these queries?

Many thanks!
