spacy-ru
Spacy-RU integration with Rasa Open Source
Hello.
Installation steps and the package versions used:
apt update && apt install -y python3-venv python3-dev python3-pip
python3 -m venv ./venv
source ./venv/bin/activate
pip install -U pip
pip install rasa --use-feature=2020-resolver
pip install pymorphy2==0.8
pip install spacy==2.1.9
git clone -b v2.1 https://github.com/buriy/spacy-ru.git
cp -r ./spacy-ru/ru2/. ./ru2/
python -V
Python 3.6.9
pip -V
pip 20.2.2 from /home/rasa/venv/lib/python3.6/site-packages/pip (python 3.6)
pip show tensorflow
Version: 2.1.1
pip show tensorflow_addons
Version: 0.7.1
pip show pymorphy2
Version: 0.8
pip show spacy
Version: 2.1.9
rasa --version
Rasa 1.10.12
cat ./config.yml
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: ru
pipeline:
  - name: "SpacyNLP"
    model: ru2
  - name: "SpacyTokenizer"
  - name: "SpacyFeaturizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "SklearnIntentClassifier"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: ResponseSelector
    epochs: 100
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
After running the command: rasa train
Training Core model...
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 3274.24it/s, # trackers=1]
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 1573.97it/s, # trackers=5]
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 405.48it/s, # trackers=20]
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 301.93it/s, # trackers=24]
Processed trackers: 100%|███████████████████| 5/5 [00:00<00:00, 1970.45it/s, # actions=16]
Processed actions: 16it [00:00, 10648.82it/s, # examples=16]
Processed trackers: 100%|███████████████| 231/231 [00:00<00:00, 822.90it/s, # actions=126]
Epochs: 100%|██████| 100/100 [00:26<00:00, 3.71it/s, t_loss=0.084, loss=0.011, acc=1.000]
2020-09-06 16:12:45 INFO rasa.utils.tensorflow.models - Finished training.
2020-09-06 16:12:45 INFO rasa.core.agent - Persisted model to '/tmp/tmpwnqa2h6f/core'
Core model training completed.
Training NLU model...
2020-09-06 16:12:45 INFO rasa.nlu.utils.spacy_utils - Trying to load spacy model with name 'ru2'
2020-09-06 16:12:45 INFO pymorphy2.opencorpora_dict.wrapper - Loading dictionaries from /home/rasa/venv/lib/python3.6/site-packages/pymorphy2_dicts/data
2020-09-06 16:12:45 INFO pymorphy2.opencorpora_dict.wrapper - format: 2.4, revision: 393442, updated: 2015-01-17T16:03:56.586168
2020-09-06 16:12:51 INFO rasa.nlu.components - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-ru2'.
2020-09-06 16:12:51 INFO rasa.nlu.training_data.training_data - Training data stats:
2020-09-06 16:12:51 INFO rasa.nlu.training_data.training_data - Number of intent examples: 33 (7 distinct intents)
2020-09-06 16:12:51 INFO rasa.nlu.training_data.training_data - Found intents: 'mood_unhappy', 'bot_challenge', 'deny', 'mood_great', 'goodbye', 'greet', 'affirm'
2020-09-06 16:12:51 INFO rasa.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2020-09-06 16:12:51 INFO rasa.nlu.training_data.training_data - Number of entity examples: 0 (0 distinct entities)
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component SpacyNLP
2020-09-06 16:12:51 INFO rasa.nlu.model - Finished training component.
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component SpacyTokenizer
2020-09-06 16:12:51 INFO rasa.nlu.model - Finished training component.
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component SpacyFeaturizer
2020-09-06 16:12:51 INFO rasa.nlu.model - Finished training component.
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2020-09-06 16:12:51 INFO rasa.nlu.model - Finished training component.
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component CRFEntityExtractor
2020-09-06 16:12:51 INFO rasa.nlu.model - Finished training component.
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2020-09-06 16:12:51 INFO rasa.nlu.model - Finished training component.
2020-09-06 16:12:51 INFO rasa.nlu.model - Starting to train component SklearnIntentClassifier
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.0s finished
Traceback (most recent call last):
  File "/home/rasa/venv/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/cli/train.py", line 76, in train
    additional_arguments=extract_additional_arguments(args),
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 50, in train
    additional_arguments=additional_arguments,
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 101, in train_async
    additional_arguments,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 188, in _train_async_internal
    additional_arguments=additional_arguments,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 245, in _do_training
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 482, in _train_nlu_with_validated_data
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/nlu/train.py", line 90, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/nlu/model.py", line 191, in train
    updates = component.train(working_data, self.config, **context)
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/nlu/classifiers/sklearn_intent_classifier.py", line 125, in train
    self.clf.fit(X, y)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 739, in fit
    self.best_estimator_.fit(X, y, **fit_params)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/svm/_base.py", line 148, in fit
    accept_large_sparse=False)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 755, in check_X_y
    estimator=estimator)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 578, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
If I set the language model to en, training runs without errors. I would appreciate it if anyone who has managed to use Rasa with Russian could share their experience.
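The ValueError is raised by scikit-learn's input validation inside SklearnIntentClassifier, so one way to narrow it down might be to check which training examples produce a feature vector containing NaN or infinity. The sketch below is only a diagnostic idea, not a confirmed cause or fix: the helper name find_non_finite and the sample vectors are hypothetical, and it assumes the per-example vectors have already been collected into a dict.

```python
import math
from typing import Dict, List, Sequence


def find_non_finite(vectors: Dict[str, Sequence[float]]) -> List[str]:
    """Return the texts whose feature vector contains NaN or infinity."""
    bad = []
    for text, vector in vectors.items():
        # math.isfinite is False for NaN, +inf and -inf.
        if any(not math.isfinite(float(value)) for value in vector):
            bad.append(text)
    return bad


# Hypothetical stand-in vectors, used here only so the sketch runs on its own.
# They are NOT real output of the ru2 model.
sample = {
    "привет": [0.1, -0.3, 0.7],
    "пока": [float("nan"), 0.0, 0.2],
    "да": [0.0, 0.0, 0.0],
}

print(find_non_finite(sample))
```

With spaCy installed, the dict could presumably be built from the real model, for example {t: nlp(t).vector for t in texts} with nlp = spacy.load("ru2") and texts taken from the NLU training data. Whether the ru2 vectors are actually the source of the NaN values is an assumption that this check would help confirm or rule out.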