Wednesday, May 13, 2015

Python regex pattern max length in re.compile?

I am trying to compile a large pattern with re.compile in Python 3.

The pattern is built from 500 short words that I want to remove from a text. The problem is that the compiled pattern seems to stop after about 18 words.

Python doesn't raise any error.

What I do is:

import re

# Wrap each stop word in word boundaries, then join them into one alternation.
stoplist = [r"\b" + s + r"\b" for s in stoplist]
stopstring = '|'.join(stoplist)
stopword_pattern = re.compile(stopstring)

The stoplist is fine (all the words are in it), but the pattern is much shorter. It even stops in the middle of a word!

Is there a maximum length for a regex pattern?
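
One check that might narrow this down, assuming the shortened pattern was seen by printing the compiled object rather than by a failed match: compare the pattern's .pattern attribute with the joined string, and compare the length of what repr() shows with the real length. The stop list below is a hypothetical stand-in for the real 500 words, and re.escape is added only as a precaution. This is a sketch for diagnosis, not a confirmed explanation.

```python
import re

# Hypothetical stop list standing in for the 500 words in the question.
stoplist = ["word%d" % i for i in range(500)]

stopstring = "|".join(r"\b" + re.escape(s) + r"\b" for s in stoplist)
stopword_pattern = re.compile(stopstring)

# The .pattern attribute holds the exact string that was passed to re.compile.
full_pattern_kept = stopword_pattern.pattern == stopstring

# repr() is what the interactive prompt displays; it can be much shorter
# than the pattern itself.
shown_length = len(repr(stopword_pattern))
actual_length = len(stopword_pattern.pattern)

# Check that the last word in the list is still matched and removed.
cleaned = stopword_pattern.sub("", "keep word499 this")

print(full_pattern_kept, shown_length, actual_length, repr(cleaned))
```

If the first value printed is True and the last word is removed from the sample text, the pattern itself is complete and only its displayed form is shortened.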
