
building complete gobot.json without separate slotfiller.json and ner.json

Open vitalyuf opened this issue 5 years ago • 4 comments

Hi! I have a gobot based on the dstc2 gobot example. gobot.json references slotfiller.json, and slotfiller.json references ner.json. The embedder gets duplicated at runtime (one instance for ner and one for gobot), resulting in excessive memory usage. So I supposed it might be possible to build one big.json out of the three JSON files above. Is it possible? How can I do it?

vitalyuf, Dec 26 '18 08:12

Hi there! I could not find any embeddings usage in ner_dstc2.json or in slotfill_dstc2.json. The intents config uses fasttext embeddings, but the [gobot config](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/go_bot/gobot_dstc2.json) that references it does not.

If you do want to combine multiple configurations into one, you should replace each component description that has a config_path parameter with the pipeline elements from the referenced configuration file, properly connecting inputs and outputs. Then, if you have multiple identical components, you can replace the later uses with refs as described in the documentation.
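As a rough sketch of that last step (not taken from an actual DeepPavlov config): assuming the merged pipe already tokenizes the inputs into `x_tokens` and `y_tokens`, the fasttext embedder could be declared once with an `id` and then reused, either as another pipeline step via `ref` or as a constructor argument via the `#id` string form that gobot configs already use for `word_vocab`. The id `shared_embedder` and the variable names `x_tokens`, `x_emb`, `y_tokens`, `y_emb` are made up for illustration, and most go_bot parameters are left out.

```json
[
  {
    "id": "shared_embedder",
    "class_name": "fasttext",
    "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin",
    "in": ["x_tokens"],
    "out": ["x_emb"]
  },
  {
    "ref": "shared_embedder",
    "in": ["y_tokens"],
    "out": ["y_emb"]
  },
  {
    "class_name": "go_bot",
    "in": ["x"],
    "in_y": ["y"],
    "out": ["y_predicted"],
    "word_vocab": "#token_vocab",
    "embedder": "#shared_embedder"
  }
]
```

The array stands in for part of the "pipe" list of a single merged config; the first element creates the embedder, and the other two point at that same instance instead of loading the embeddings file again.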

yoptar, Dec 26 '18 08:12

Sorry for the misinformation. I built the ner component based on ner_rus.json. It uses a fasttext embedder, and gobot_dstc2_best.json uses one too. And, yes, the intents classifier uses it as well.

The example in the documentation shows a case where both elements are in the same pipeline. But, for example, the embedder in gobot_dstc2_best.json is not a pipeline element; it is a parameter of the go_bot element. And the slotfiller is a go_bot parameter too.

Let's consider an example:

      {
        "in": ["x"],
        "in_y": ["y"],
        "out": ["y_predicted"],
        "main": true,
        "class_name": "go_bot",
        "load_path": "{MODELS_PATH}/my_gobot_rus/model",
        "save_path": "{MODELS_PATH}/my_gobot_rus/model",
        "debug": false,
        "word_vocab": "#token_vocab",
        "template_path": "{DOWNLOADS_PATH}/dstc2_v2_rus/dstc2-templates.txt",
        "template_type": "DualTemplate",
        "database": "#phone_database",
        "api_call_action": "api_call",
        "use_action_mask": false,
        "network_parameters": {
          "learning_rate": 0.002,
          "end_learning_rate": 0.00002,
          "decay_steps": 10,
          "decay_power": 0.5,
          "dropout_rate": 0.45,
          "l2_reg_coef": 2e-3,
          "hidden_size": 128,
          "dense_size": 64,
          "attention_mechanism": {
            "type": "cs_bahdanau",
            "hidden_size": 32,
            "depth": 3,
            "action_as_key": true,
            "max_num_tokens": 100,
            "projected_align": false
          }
        },
        "slot_filler": {
          "config_path": "{CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json"
        },
        "intent_classifier": "{CONFIGS_PATH}/classifiers/intents_dstc2_big_rus.json",
        "embedder": {
          "class_name": "fasttext",
          "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin"
        },
        "bow_embedder": null,
        "tokenizer": {
          "class_name": "stream_spacy_tokenizer",
          "lowercase": false
        },
        "tracker": {
          "class_name": "featurized_tracker",
          "slot_names": ["surname", "name", "pos_confirm", "neg_confirm", "phone"]
        }
      }

Have I understood correctly that, to include the slotfiller.json pipeline in gobot_dstc2_best.json, I should take the "pipe" component of the file {CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json and copy-paste its content into config_path in place of "{CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json"?

vitalyuf, Dec 26 '18 10:12

Hmmm. No, sorry, I don't think it will work like that. For now there is no way to use pipelines as a component constructor argument without using a config reference... I'll think about what we can do about it.

yoptar, Dec 26 '18 10:12

Hey! Splitting the GO-bot units into separate pipeline components is planned but not implemented yet.

oserikov, May 14 '20 17:05