
Triplet loss model regression experiments

Open dakshvar22 opened this issue 4 years ago • 7 comments

Proposed changes:

  • ...

Status (please check what you already did):

  • [ ] added some tests for the functionality
  • [ ] updated the documentation
  • [ ] updated the changelog (please check changelog for instructions)
  • [ ] reformat files using black (please check Readme for instructions)

dakshvar22, Apr 22 '21 03:04

Commit: 1a9bba051c1264d06de638b73d1e3b1d843a9b4f, The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 1m30s, train: 6m59s, total: 8m28s | 0.7903 (-0.00) | 0.7529 (0.00) | 0.5629 (0.01) |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 1m48s, train: 4m22s, total: 6m9s | 0.8136 (0.01) | 0.7627 (-0.03) | 0.5497 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow) | test: 1m34s, train: 4m36s, total: 6m9s | 0.7883 (-0.01) | 0.7529 (0.00) | 0.6087 (-0.00) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t) | test: 1m55s, train: 4m56s, total: 6m51s | 0.7981 (-0.00) | 0.7824 (-0.01) | 0.5847 (0.04) |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 41s, train: 2m53s, total: 3m33s | 0.7417 (0.00) | 0.7529 (0.00) | 0.5563 (0.09) |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 1m1s, train: 4m7s, total: 5m8s | 0.7456 (0.02) | 0.6685 (-0.02) | 0.5629 (0.07) |

Dataset: Hermit, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 2m44s, train: 19m52s, total: 22m35s | 0.9043 (0.02) | 0.7504 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 3m0s, train: 12m37s, total: 15m36s | 0.8931 (0.00) | 0.8088 (0.01) | no data |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow) | test: 2m47s, train: 23m33s, total: 26m20s | 0.8922 (0.02) | 0.7504 (0.00) | no data |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t) | test: 3m4s, train: 13m41s, total: 16m44s | 0.8866 (0.02) | 0.8150 (-0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 1m8s, train: 20m35s, total: 21m43s | 0.8569 (0.02) | 0.7504 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 1m23s, train: 12m25s, total: 13m48s | 0.8615 (0.03) | 0.7536 (-0.01) | no data |

Dataset: Private 1, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 1m56s, train: 3m34s, total: 5m29s | 0.9064 (-0.00) | 0.9612 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m17s, train: 3m18s, total: 5m34s | 0.9054 (-0.01) | 0.9753 (0.00) | no data |
| Spacy + DIET(bow) + ResponseSelector(bow) | test: 34s, train: 2m50s, total: 3m23s | 0.8430 (-0.01) | 0.9574 (0.00) | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t) | test: 55s, train: 3m12s, total: 4m7s | 0.8534 (-0.00) | 0.9405 (0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 29s, train: 3m17s, total: 3m45s | 0.9002 (0.01) | 0.9612 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 49s, train: 3m5s, total: 3m54s | 0.9023 (-0.00) | 0.9672 (-0.00) | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow) | test: 38s, train: 3m56s, total: 4m34s | 0.8992 (0.01) | 0.9574 (0.00) | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t) | test: 1m0s, train: 3m37s, total: 4m36s | 0.8950 (-0.00) | 0.9734 (0.00) | no data |

Dataset: Private 2, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 2m3s, train: 10m47s, total: 12m49s | 0.8734 (0.00) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow) | test: 41s, train: 5m31s, total: 6m11s | 0.7543 (0.03) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t) | test: 48s, train: 5m33s, total: 6m20s | 0.7800 (-0.01) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 37s, train: 4m59s, total: 5m36s | 0.8530 (0.01) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 42s, train: 4m52s, total: 5m34s | 0.8637 (0.01) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow) | test: 46s, train: 7m23s, total: 8m9s | 0.8691 (0.02) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t) | test: 51s, train: 6m3s, total: 6m53s | 0.8755 (0.02) | no data | no data |

Dataset: Private 3, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 1m0s, train: 1m2s, total: 2m2s | 0.9177 (0.00) | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 1m3s, train: 45s, total: 1m48s | 0.9136 (0.06) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow) | test: 36s, train: 52s, total: 1m28s | 0.7037 (0.10) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t) | test: 41s, train: 40s, total: 1m20s | 0.7654 (0.17) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 34s, train: 1m0s, total: 1m34s | 0.8642 (0.02) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 38s, train: 41s, total: 1m18s | 0.8724 (0.04) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow) | test: 38s, train: 1m11s, total: 1m48s | 0.8724 (0.00) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t) | test: 42s, train: 46s, total: 1m28s | 0.8765 (0.00) | no data | no data |

Dataset: Sara, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 2m25s, train: 4m45s, total: 7m10s | 0.8609 (0.01) | 0.8683 (0.00) | 0.8630 (-0.00) |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m43s, train: 3m42s, total: 6m25s | 0.8511 (-0.00) | 0.8884 (0.01) | 0.8826 (0.01) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow) | test: 2m32s, train: 7m7s, total: 9m38s | 0.8668 (0.01) | 0.8683 (0.00) | 0.8804 (-0.01) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m53s, train: 4m51s, total: 7m44s | 0.8648 (0.01) | 0.9072 (0.00) | 0.8761 (-0.02) |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 54s, train: 5m23s, total: 6m17s | 0.8306 (0.01) | 0.8683 (0.00) | 0.8413 (-0.02) |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 1m16s, train: 4m4s, total: 5m19s | 0.8580 (0.02) | 0.8228 (0.01) | 0.8500 (-0.01) |

github-actions[bot], Apr 22 '21 10:04

Hey @dakshvar22! :wave: To run model regression tests, comment with the /modeltest command and a configuration.

Tips :bulb:: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tips :bulb:: Whenever you want to change the configuration, edit the comment that contains the previous configuration.

You can copy this in your comment and customize:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot"
# - "Hermit"
# - "Private 1"
# - "Private 2"
# - "Private 3"
# - "Sara"

##########
## Available configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET(seq) + ResponseSelector(t2t)" and
## "BERT + DIET(seq) + ResponseSelector(t2t)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define the dataset repository branch to check out. The default branch is 'main'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]


include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```
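
The `include` syntax above, with its "all" shortcut, can be read as a request for a set of (dataset, configuration) pairs. The following is a minimal sketch of that expansion, not the workflow's actual implementation: the function name `expand_include` is hypothetical, the input is assumed to be the already-parsed YAML as plain Python lists and dicts, and the dataset and configuration names are copied from the comment above. The reports in this thread list fewer configurations for some datasets, so the real workflow presumably filters the combinations; this sketch does not.

```python
from itertools import product

# Names copied from the "Available datasets" / "Available configurations" lists.
DATASETS = ["Carbon Bot", "Hermit", "Private 1", "Private 2", "Private 3", "Sara"]
CONFIGS = [
    "BERT + DIET(bow) + ResponseSelector(bow)",
    "BERT + DIET(seq) + ResponseSelector(t2t)",
    "Spacy + DIET(bow) + ResponseSelector(bow)",
    "Spacy + DIET(seq) + ResponseSelector(t2t)",
    "Sparse + BERT + DIET(bow) + ResponseSelector(bow)",
    "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)",
    "Sparse + DIET(bow) + ResponseSelector(bow)",
    "Sparse + DIET(seq) + ResponseSelector(t2t)",
    "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)",
    "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)",
]


def expand_include(include, datasets=DATASETS, configs=CONFIGS):
    """Expand `include` entries into unique (dataset, config) pairs.

    Hypothetical helper: a value of ["all"] is replaced by the full list,
    and every dataset in an entry is paired with every config in that entry.
    """
    pairs = []
    for entry in include:
        ds = datasets if entry["dataset"] == ["all"] else entry["dataset"]
        cf = configs if entry["config"] == ["all"] else entry["config"]
        for pair in product(ds, cf):
            if pair not in pairs:  # keep first occurrence, drop duplicates
                pairs.append(pair)
    return pairs


# The third example from the comment: everything for two datasets,
# two specific configurations for "Hermit".
example = [
    {"dataset": ["Carbon Bot", "Sara"], "config": ["all"]},
    {
        "dataset": ["Hermit"],
        "config": [
            "Sparse + DIET(seq) + ResponseSelector(t2t)",
            "BERT + DIET(seq) + ResponseSelector(t2t)",
        ],
    },
]
pairs = expand_include(example)
```

Under these assumptions the example yields 22 pairs (2 datasets × 10 configurations, plus 2 for "Hermit"), and `dataset: ["all"]` with `config: ["all"]` yields the full 6 × 10 grid.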

github-actions[bot], Apr 26 '21 13:04

/modeltest

dataset_branch: "triplet"
include:
 - dataset: ["all"]
   config: ["all"]

github-actions[bot], Apr 26 '21 13:04

The model regression tests have started. This might take a while, so please be patient. As soon as the results are ready, you'll see a new comment with them.

The configuration used can be found in the comment.

github-actions[bot], Apr 26 '21 13:04

Commit: 8d3fe68eaae4afe017355c987ed62fc8a45ab974, The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 1m20s, train: 4m1s, total: 5m21s | 0.7922 (-0.00) | 0.7529 (0.00) | 0.5781 (0.02) |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 1m42s, train: 4m9s, total: 5m50s | 0.8039 (0.00) | 0.7622 (-0.03) | 0.5695 (0.02) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow) | test: 1m30s, train: 4m19s, total: 5m49s | 0.7883 (-0.01) | 0.7529 (0.00) | 0.5515 (-0.02) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t) | test: 1m47s, train: 4m42s, total: 6m29s | 0.7922 (-0.01) | 0.7769 (-0.01) | 0.6067 (0.01) |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 39s, train: 2m44s, total: 3m22s | 0.7204 (-0.01) | 0.7529 (0.00) | 0.5800 (0.10) |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 57s, train: 3m56s, total: 4m53s | 0.7379 (0.01) | 0.7032 (0.01) | 0.5847 (0.06) |

Dataset: Hermit, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 2m36s, train: 18m57s, total: 21m32s | 0.8913 (0.00) | 0.7504 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m49s, train: 12m25s, total: 15m14s | 0.8968 (0.00) | 0.8126 (0.01) | no data |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow) | test: 2m46s, train: 22m11s, total: 24m57s | 0.8959 (0.03) | 0.7504 (0.00) | no data |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m54s, train: 13m26s, total: 16m20s | 0.8894 (0.02) | 0.8096 (-0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 1m1s, train: 19m28s, total: 20m29s | 0.8625 (0.03) | 0.7504 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 1m29s, train: 12m33s, total: 14m1s | 0.8578 (0.03) | 0.7521 (-0.01) | no data |

Dataset: Private 1, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 1m59s, train: 3m42s, total: 5m41s | 0.9054 (-0.00) | 0.9612 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m20s, train: 3m20s, total: 5m39s | 0.9075 (-0.00) | 0.9745 (0.00) | no data |
| Spacy + DIET(bow) + ResponseSelector(bow) | test: 35s, train: 2m50s, total: 3m25s | 0.8326 (-0.02) | 0.9574 (0.00) | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t) | test: 1m0s, train: 3m20s, total: 4m20s | 0.8524 (-0.00) | 0.9431 (0.01) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 29s, train: 3m21s, total: 3m50s | 0.9085 (0.02) | 0.9612 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 48s, train: 3m6s, total: 3m54s | 0.9075 (0.00) | 0.9717 (0.00) | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow) | test: 38s, train: 3m56s, total: 4m33s | 0.8950 (0.00) | 0.9574 (0.00) | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t) | test: 58s, train: 3m38s, total: 4m36s | 0.8950 (0.00) | 0.9681 (-0.00) | no data |

Dataset: Private 2, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 2m1s, train: 11m5s, total: 13m6s | 0.8691 (-0.00) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow) | test: 40s, train: 5m40s, total: 6m19s | 0.7382 (0.01) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t) | test: 47s, train: 5m32s, total: 6m19s | 0.7672 (-0.02) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 36s, train: 5m0s, total: 5m35s | 0.8466 (-0.01) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 41s, train: 4m50s, total: 5m30s | 0.8444 (-0.01) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow) | test: 46s, train: 7m41s, total: 8m27s | 0.8519 (0.00) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t) | test: 51s, train: 6m5s, total: 6m56s | 0.8648 (0.01) | no data | no data |

Dataset: Private 3, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 1m0s, train: 1m2s, total: 2m1s | 0.9177 (0.00) | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 1m4s, train: 46s, total: 1m49s | 0.9300 (0.07) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow) | test: 38s, train: 52s, total: 1m29s | 0.6996 (0.09) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t) | test: 40s, train: 41s, total: 1m21s | 0.7572 (0.16) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 33s, train: 1m0s, total: 1m33s | 0.8683 (0.03) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 37s, train: 41s, total: 1m17s | 0.8765 (0.04) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow) | test: 39s, train: 1m11s, total: 1m49s | 0.8930 (0.02) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t) | test: 42s, train: 47s, total: 1m28s | 0.8807 (0.01) | no data | no data |

Dataset: Sara, Dataset repository branch: triplet

| Configuration | Run time | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow) | test: 2m22s, train: 4m44s, total: 7m5s | 0.8619 (0.01) | 0.8683 (0.00) | 0.8652 (-0.00) |
| BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m41s, train: 3m44s, total: 6m24s | 0.8541 (0.00) | 0.8805 (0.00) | 0.8674 (-0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow) | test: 2m32s, train: 6m56s, total: 9m28s | 0.8697 (0.00) | 0.8683 (0.00) | 0.8696 (-0.02) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t) | test: 2m50s, train: 4m52s, total: 7m42s | 0.8570 (-0.00) | 0.8998 (-0.00) | 0.8717 (-0.03) |
| Sparse + DIET(bow) + ResponseSelector(bow) | test: 54s, train: 5m21s, total: 6m14s | 0.8296 (0.00) | 0.8683 (0.00) | 0.8435 (-0.02) |
| Sparse + DIET(seq) + ResponseSelector(t2t) | test: 1m13s, train: 4m1s, total: 5m13s | 0.8394 (0.00) | 0.8168 (-0.00) | 0.8500 (-0.01) |
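
To compare the two reports in this thread (commits 1a9bba0… and 8d3fe68…), the score cells can be parsed and subtracted. The sketch below is one way to do that; the helper names `parse_cell` and `score_change` are hypothetical, and it assumes every cell is either `no data` or a score followed by a parenthesised number. The thread does not say what the parenthesised number measures; it is treated here as a delta against some reference run, which is an assumption.

```python
import re

# A cell looks like "0.7903 (-0.00)": a score, then a parenthesised number.
CELL = re.compile(r"^(?P<score>\d+\.\d+) \((?P<delta>-?\d+\.\d+)\)$")


def parse_cell(cell):
    """Return (score, delta) for a cell such as "0.7903 (-0.00)".

    Returns None for "no data" or any other text that does not match.
    """
    match = CELL.match(cell.strip())
    if match is None:
        return None
    return float(match["score"]), float(match["delta"])


def score_change(old_cell, new_cell):
    """Score in new_cell minus score in old_cell, or None if either is missing."""
    old, new = parse_cell(old_cell), parse_cell(new_cell)
    if old is None or new is None:
        return None
    return round(new[0] - old[0], 4)


# Carbon Bot, "Sparse + DIET(bow) + ResponseSelector(bow)", intent micro F1:
# first report vs. second report in this thread.
change = score_change("0.7417 (0.00)", "0.7204 (-0.01)")
```

For that row the intent micro F1 moves from 0.7417 to 0.7204 between the two runs, a change of about -0.021. Rows reported as `no data` return `None` instead of a number.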

github-actions[bot], Apr 26 '21 19:04

This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Apr 16 '22 07:04

CLA assistant check
All committers have signed the CLA.

CLAassistant, Jul 18 '22 22:07