I've been learning NLP text classification from the book "Text Analytics with Python". It requires several modules to be installed in a virtual environment, and I use an Anaconda env. I created a blank env with Python 3.7 and installed the required packages: pandas, numpy, nltk, gensim, sklearn... Then I had to install Pattern. The first problem is that I can't install Pattern via conda because of a conflict between Pattern and mkl_random.
(nlp) D:\Python\Text_classification>conda install -c mickc pattern
Solving environment: failed
UnsatisfiableError: The following specifications were found to be in conflict:
- mkl_random
- pattern
Use "conda info <package>" to see the dependencies for each package.
It's impossible to remove mkl_random because other packages depend on it: gensim, numpy, scikit-learn, etc. I don't know what to do; I couldn't find any conda build of Pattern that installs in my environment. So I installed Pattern using pip instead, and the installation was successful. Is it okay to have packages from conda and from pip in the same environment?
The second problem, I think, is connected with the first one. I downloaded the book's example code from https://github.com/dipanjanS/text-analytics-with-python/tree/master/Old-First-Edition/source_code/Ch04_Text_Classification, added parentheses to the Python 2.x 'print' statements, and ran classification.py. The program raised an exception:
Traceback (most recent call last):
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 609, in _read
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "classification.py", line 50, in <module>
norm_train_corpus = normalize_corpus(train_corpus)
File "D:\Python\Text_classification\normalization.py", line 96, in normalize_corpus
text = lemmatize_text(text)
File "D:\Python\Text_classification\normalization.py", line 67, in lemmatize_text
pos_tagged_text = pos_tag_text(text)
File "D:\Python\Text_classification\normalization.py", line 58, in pos_tag_text
tagged_text = tag(text)
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\en\__init__.py", line 188, in tag
for sentence in parse(s, tokenize, True, False, False, False, encoding, **kwargs).split():
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\en\__init__.py", line 169, in parse
return parser.parse(s, *args, **kwargs)
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 1172, in parse
s[i] = self.find_tags(s[i], **kwargs)
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\en\__init__.py", line 114, in find_tags
return _Parser.find_tags(self, tokens, **kwargs)
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 1113, in find_tags
lexicon = kwargs.get("lexicon", self.lexicon or {}),
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 376, in __len__
return self._lazy("__len__")
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 368, in _lazy
self.load()
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 625, in load
dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if len(x.split(" ")) > 1))
File "C:\Users\PC\Anaconda3\envs\nlp\lib\site-packages\pattern\text\__init__.py", line 625, in <genexpr>
dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if len(x.split(" ")) > 1))
RuntimeError: generator raised StopIteration
I don't understand what is happening. Is the exception raised because I installed Pattern with pip, or is the problem wrong or deprecated code in the book? And is it possible to install Pattern with conda alongside all the other necessary packages?
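For what it's worth, the final error seems to be reproducible without Pattern at all. The sketch below is not Pattern's code: read_lines and read_lines_fixed are hypothetical stand-ins I wrote, assuming that _read() ends its generator with "raise StopIteration" as line 609 of the traceback suggests. Since Python 3.7 (PEP 479), a StopIteration raised inside a generator body is converted to RuntimeError, which matches the message above and would point to the Python version rather than to pip.

```python
def read_lines(lines):
    """Hypothetical stand-in for a Python 2 style generator (not Pattern's code)."""
    for line in lines:
        yield line
    # Python 2 idiom for ending a generator. Since Python 3.7 (PEP 479)
    # this is converted to "RuntimeError: generator raised StopIteration".
    raise StopIteration


def read_lines_fixed(lines):
    """Same generator, ended the Python 3 way with a plain return."""
    for line in lines:
        yield line
    return


def consume(gen_func, lines):
    """Exhaust the generator; report a RuntimeError as text instead of crashing."""
    try:
        return list(gen_func(lines))
    except RuntimeError as exc:
        return "RuntimeError: {}".format(exc)


sample = ["a b", "c d"]  # made-up data for illustration only
result_old = consume(read_lines, sample)
result_new = consume(read_lines_fixed, sample)
print(result_old)
print(result_new)
```

If that assumption holds, the same traceback would appear with any installer on Python 3.7, and the options would be an env with an older Python or a Pattern version whose generators end with return.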
Thank you in advance!