training-data reg test config fix

Open kedz opened this issue 3 years ago • 5 comments

Proposed changes:

  • ...

Status (please check what you already did):

  • [ ] added some tests for the functionality
  • [ ] updated the documentation
  • [ ] updated the changelog (please check changelog for instructions)
  • [ ] reformat files using black (please check Readme for instructions)

kedz, Jun 17 '21 19:06

Hey @kedz! :wave: To run model regression tests, comment with the /modeltest command and a configuration.

Tips :bulb:: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tips :bulb:: Whenever you want to change the configuration, edit the comment that contains the previous configuration.

You can copy this in your comment and customize:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot" (NLU)
# - "Hermit" (NLU)
# - "Private 1" (NLU)
# - "Private 2" (NLU)
# - "Private 3" (NLU)
# - "Sara" (NLU)
# - "financial-demo" (NLU, Core)
# - "helpdesk-assistant" (NLU, Core)

##########
## Available NLU configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

##########
## Available Core configurations
##########
# - "Rules"
# - "Rules + AugMemo"
# - "Rules + AugMemo + TED"
# - "Rules + Memo"
# - "Rules + Memo + TED"
# - "Rules + TED"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET(seq) + ResponseSelector(t2t)" and
## "BERT + DIET(seq) + ResponseSelector(t2t)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define a branch name to check-out for a dataset repository. Default branch is 'main'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##
## Shortcuts:
## You can use the "all" shortcut to include all available configurations or datasets.
## You can use the "all-nlu" shortcut to include all available NLU configurations or datasets.
## You can use the "all-core" shortcut to include all available core configurations or datasets.

include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```
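
The shortcut rules described in the template can be illustrated with a small Python sketch. This is not the workflow's actual implementation, which this thread does not show; the function names (`expand_datasets`, `expand_configs`, `expand_include`) are hypothetical, and the rule that a Core configuration is only paired with a dataset marked "Core" is an assumption. The dataset and configuration names are copied from the lists above.

```python
from itertools import product

# Names and types copied from the "Available ..." lists in the template.
DATASETS = {
    "Carbon Bot": {"nlu"},
    "Hermit": {"nlu"},
    "Private 1": {"nlu"},
    "Private 2": {"nlu"},
    "Private 3": {"nlu"},
    "Sara": {"nlu"},
    "financial-demo": {"nlu", "core"},
    "helpdesk-assistant": {"nlu", "core"},
}
CONFIGS = {
    "BERT + DIET(bow) + ResponseSelector(bow)": "nlu",
    "BERT + DIET(seq) + ResponseSelector(t2t)": "nlu",
    "Spacy + DIET(bow) + ResponseSelector(bow)": "nlu",
    "Spacy + DIET(seq) + ResponseSelector(t2t)": "nlu",
    "Sparse + BERT + DIET(bow) + ResponseSelector(bow)": "nlu",
    "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)": "nlu",
    "Sparse + DIET(bow) + ResponseSelector(bow)": "nlu",
    "Sparse + DIET(seq) + ResponseSelector(t2t)": "nlu",
    "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)": "nlu",
    "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)": "nlu",
    "Rules": "core",
    "Rules + AugMemo": "core",
    "Rules + AugMemo + TED": "core",
    "Rules + Memo": "core",
    "Rules + Memo + TED": "core",
    "Rules + TED": "core",
}
# Which model types each shortcut selects.
SHORTCUTS = {"all": {"nlu", "core"}, "all-nlu": {"nlu"}, "all-core": {"core"}}


def expand_datasets(names):
    """Replace shortcuts with concrete dataset names, keeping order."""
    out = []
    for name in names:
        if name in SHORTCUTS:
            out.extend(d for d, kinds in DATASETS.items() if kinds & SHORTCUTS[name])
        else:
            out.append(name)
    return list(dict.fromkeys(out))  # drop duplicates, preserve order


def expand_configs(names):
    """Replace shortcuts with concrete configuration names, keeping order."""
    out = []
    for name in names:
        if name in SHORTCUTS:
            out.extend(c for c, kind in CONFIGS.items() if kind in SHORTCUTS[name])
        else:
            out.append(name)
    return list(dict.fromkeys(out))


def expand_include(include):
    """Turn the `include` list into unique (dataset, config) pairs.

    Assumption: a config is only paired with a dataset that supports its type.
    """
    pairs = []
    for entry in include:
        datasets = expand_datasets(entry["dataset"])
        configs = expand_configs(entry["config"])
        for d, c in product(datasets, configs):
            if CONFIGS[c] in DATASETS[d] and (d, c) not in pairs:
                pairs.append((d, c))
    return pairs
```

Under these assumptions, `expand_include([{"dataset": ["all-nlu"], "config": ["all-nlu"]}])` yields every NLU dataset crossed with every NLU configuration. The results posted later in this thread cover fewer combinations per dataset, so the real workflow evidently filters further, presumably by what is available in the dataset repository.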

github-actions[bot], Jun 17 '21 19:06

/modeltest


dataset_branch: "fix-response-selector-t2t-configs"
include:
 - dataset: ["all-nlu"]
   config: ["all-nlu"]

github-actions[bot], Jun 17 '21 19:06

The model regression tests have started. This might take a while, so please be patient. As soon as the results are ready, you'll see a new comment with them.

The configuration used can be found in the comment.

github-actions[bot], Jun 17 '21 19:06

Commit: a03d6ae0607f841bd4a69a356884755c566e33c2. The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: fix-response-selector-t2t-configs, commit: 65c7338ecd2384defadc47638863807f7f1bf134

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m26s, train: 3m48s, total: 5m14s | 0.7942 (0.00) | 0.7529 (0.00) | 0.5382 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m29s, train: 4m25s, total: 5m54s | 0.7942 (0.01) | 0.7529 (0.00) | 0.5515 (-0.01) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 36s, train: 2m31s, total: 3m6s | 0.7437 (-0.02) | 0.7529 (0.00) | 0.5099 (0.01) |

Dataset: Hermit, Dataset repository branch: fix-response-selector-t2t-configs, commit: 65c7338ecd2384defadc47638863807f7f1bf134

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m33s, train: 18m0s, total: 20m32s | 0.8987 (0.00) | 0.7504 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m46s, train: 11m58s, total: 14m44s | 0.9033 (0.00) | 0.8026 (0.00) | no data |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m36s, train: 21m24s, total: 23m59s | 0.8717 (-0.00) | 0.7504 (0.00) | no data |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m53s, train: 13m2s, total: 15m54s | 0.8764 (0.00) | 0.8118 (0.01) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 1m1s, train: 18m43s, total: 19m44s | 0.8290 (0.00) | 0.7504 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 1m24s, train: 11m58s, total: 13m22s | 0.8299 (0.00) | 0.7543 (-0.00) | no data |

Dataset: Private 1, Dataset repository branch: fix-response-selector-t2t-configs, commit: 65c7338ecd2384defadc47638863807f7f1bf134

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m47s, train: 3m10s, total: 4m57s | 0.9096 (0.00) | 0.9612 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m8s, train: 2m58s, total: 5m6s | 0.9127 (0.00) | 0.9728 (0.00) | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 30s, train: 2m25s, total: 2m54s | 0.8420 (0.00) | 0.9574 (0.00) | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 52s, train: 2m57s, total: 3m48s | 0.8545 (0.00) | 0.9353 (0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 25s, train: 2m51s, total: 3m15s | 0.9002 (0.01) | 0.9612 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 43s, train: 2m44s, total: 3m26s | 0.9085 (0.00) | 0.9715 (-0.00) | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 32s, train: 3m20s, total: 3m52s | 0.9075 (0.01) | 0.9574 (0.00) | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 51s, train: 3m9s, total: 4m0s | 0.8971 (0.00) | 0.9717 (0.00) | no data |

Dataset: Private 2, Dataset repository branch: fix-response-selector-t2t-configs, commit: 65c7338ecd2384defadc47638863807f7f1bf134

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m53s, train: 5m3s, total: 6m56s | 0.8830 (0.00) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 41s, train: 4m56s, total: 5m36s | 0.7736 (0.00) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 36s, train: 4m18s, total: 4m54s | 0.8573 (0.00) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 44s, train: 5m24s, total: 6m7s | 0.8541 (0.00) | no data | no data |

Dataset: Private 3, Dataset repository branch: fix-response-selector-t2t-configs, commit: 65c7338ecd2384defadc47638863807f7f1bf134

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 55s, train: 39s, total: 1m34s | 0.8436 (0.00) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 35s, train: 36s, total: 1m10s | 0.6214 (0.00) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 31s, train: 35s, total: 1m7s | 0.8560 (0.00) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 36s, train: 40s, total: 1m16s | 0.8601 (0.00) | no data | no data |

Dataset: Sara, Dataset repository branch: fix-response-selector-t2t-configs, commit: 65c7338ecd2384defadc47638863807f7f1bf134

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m6s, train: 4m3s, total: 6m8s | 0.8541 (0.00) | 0.8683 (0.00) | 0.8913 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m12s, train: 6m7s, total: 8m18s | 0.8629 (-0.00) | 0.8683 (0.00) | 0.8804 (-0.01) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 47s, train: 4m34s, total: 5m20s | 0.8110 (-0.01) | 0.8683 (0.00) | 0.8478 (-0.02) |

github-actions[bot], Jun 18 '21 01:06

This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Apr 16 '22 07:04