rasa Swap non-deterministic ops with possibly deterministic ones

Proposed changes:

...

Status (please check what you already did):

[ ] added some tests for the functionality
[ ] updated the documentation
[ ] updated the changelog (please check changelog for instructions)
[ ] reformat files using black (please check Readme for instructions)

Mar 03 '22 18:03 dakshvar22

Status of the run: Failed

Commit: 911ad2ae00c0bbff9539620136d1886063cc70c3, The full report is available as an artifact.

Datadog dashboard link

Mar 03 '22 18:03 github-actions[bot]

Status of the run: Succeeded

Commit: 911ad2ae00c0bbff9539620136d1886063cc70c3, The full report is available as an artifact.

Datadog dashboard link

Dataset: Customer 1, Dataset repository branch: mr-tests (external repository), commit: c36f81a7b7a012e8b42c8286d3fee8c0a3e3b896 Configuration repository branch: nib-test

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m42s`, train: `6m36s`, total: `8m17s`	0.7816 (`no data`)	0.9967 (`no data`)	`no data`

Mar 03 '22 18:03 github-actions[bot]

Status of the run: Failed

Commit: 911ad2ae00c0bbff9539620136d1886063cc70c3, The full report is available as an artifact.

Datadog dashboard link

Dataset: Carbon Bot, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `2m7s`, train: `5m45s`, total: `7m51s`	0.7922 (-0.00)	0.7529 (0.00)	0.5430 (-0.01)
`Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `3m9s`, train: `6m26s`, total: `9m35s`	0.7806 (0.00)	0.7880 (0.00)	0.5563 (0.00)
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `50s`, train: `3m37s`, total: `4m27s`	0.7456 (-0.00)	0.7529 (0.00)	0.5232 (0.01)
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m50s`, train: `5m15s`, total: `7m5s`	0.7398 (0.00)	0.7022 (0.00)	0.5364 (0.01)

Dataset: Hermit, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `3m46s`, train: `31m0s`, total: `34m45s`	0.8717 (-0.00)	0.7504 (0.00)	`no data`
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `1m15s`, train: `24m10s`, total: `25m24s`	0.8271 (0.00)	0.7504 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m53s`, train: `18m42s`, total: `20m35s`	0.8346 (0.00)	0.7585 (0.00)	`no data`

Dataset: Private 1, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `37s`, train: `4m35s`, total: `5m12s`	0.9002 (0.00)	0.9612 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m17s`, train: `4m15s`, total: `5m31s`	0.9096 (0.00)	0.9735 (0.01)	`no data`

Dataset: Private 2, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `43s`, train: `6m48s`, total: `7m31s`	0.8498 (0.01)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `49s`, train: `5m42s`, total: `6m30s`	0.8530 (0.00)	`no data`	`no data`

Dataset: Private 3, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `36s`, train: `1m17s`, total: `1m52s`	0.8683 (0.00)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `41s`, train: `53s`, total: `1m34s`	0.8642 (0.00)	`no data`	`no data`

Mar 03 '22 20:03 github-actions[bot]

Status of the run: Failed

Commit: 911ad2ae00c0bbff9539620136d1886063cc70c3, The full report is available as an artifact.

Datadog dashboard link

Dataset: Carbon Bot, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `2m8s`, train: `5m49s`, total: `7m56s`	0.7883 (-0.01)	0.7529 (0.00)	0.5563 (0.00)
`Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `3m13s`, train: `6m29s`, total: `9m41s`	0.7806 (0.00)	0.7880 (0.00)	0.5563 (0.00)
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `53s`, train: `4m12s`, total: `5m5s`	0.7573 (0.01)	0.7529 (0.00)	0.5249 (0.01)

Mar 04 '22 03:03 github-actions[bot]

Status of the run: Failed

Commit: 911ad2ae00c0bbff9539620136d1886063cc70c3, The full report is available as an artifact.

Datadog dashboard link

Dataset: Carbon Bot, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `2m10s`, train: `5m50s`, total: `8m0s`	0.7922 (-0.00)	0.7529 (0.00)	0.5629 (0.01)
`Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `3m6s`, train: `6m13s`, total: `9m19s`	0.7806 (0.00)	0.7880 (0.00)	0.5364 (-0.02)
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `51s`, train: `3m37s`, total: `4m28s`	0.7476 (0.00)	0.7529 (0.00)	0.5364 (0.02)
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m54s`, train: `5m6s`, total: `6m59s`	0.7398 (0.00)	0.7022 (0.00)	0.5364 (0.01)

Dataset: Hermit, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `3m48s`, train: `31m4s`, total: `34m51s`	0.8690 (-0.00)	0.7504 (0.00)	`no data`
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `1m15s`, train: `24m7s`, total: `25m22s`	0.8262 (0.00)	0.7504 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `2m1s`, train: `18m38s`, total: `20m39s`	0.8364 (0.00)	0.7571 (0.00)	`no data`

Dataset: Private 1, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `38s`, train: `4m48s`, total: `5m26s`	0.8940 (-0.00)	0.9612 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m16s`, train: `4m17s`, total: `5m33s`	0.9075 (0.00)	0.9717 (0.01)	`no data`

Dataset: Private 2, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `44s`, train: `6m51s`, total: `7m34s`	0.8455 (0.01)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `49s`, train: `5m43s`, total: `6m32s`	0.8530 (0.00)	`no data`	`no data`

Dataset: Private 3, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `38s`, train: `1m20s`, total: `1m57s`	0.8683 (0.00)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `42s`, train: `55s`, total: `1m37s`	0.8642 (0.00)	`no data`	`no data`

Mar 04 '22 04:03 github-actions[bot]

Status of the run: Succeeded

Commit: 911ad2ae00c0bbff9539620136d1886063cc70c3, The full report is available as an artifact.

Datadog dashboard link

Dataset: Sara, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `6m24s`, train: `12m7s`, total: `18m31s`	0.6967 (0.01)	0.7949 (0.00)	0.7907 (-0.01)
`Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `8m16s`, train: `9m45s`, total: `18m1s`	0.7039 (-0.00)	0.7848 (-0.01)	0.7891 (-0.01)
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `2m6s`, train: `9m37s`, total: `11m42s`	0.6630 (-0.01)	0.7949 (0.00)	0.7736 (-0.01)
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `2m47s`, train: `6m42s`, total: `9m30s`	0.6794 (-0.00)	0.7692 (-0.00)	0.7907 (-0.01)

Mar 04 '22 07:03 github-actions[bot]

Status of the run: Failed

Commit: c2b0bdc9bba09e0779b549cd420631c8ec4acbaf, The full report is available as an artifact.

Datadog dashboard link

Mar 04 '22 07:03 github-actions[bot]

Status of the run: Failed

Commit: c2b0bdc9bba09e0779b549cd420631c8ec4acbaf, The full report is available as an artifact.

Datadog dashboard link

Mar 04 '22 07:03 github-actions[bot]

Status of the run: Succeeded

Commit: c2b0bdc9bba09e0779b549cd420631c8ec4acbaf, The full report is available as an artifact.

Datadog dashboard link

Dataset: Private 1, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `39s`, train: `5m59s`, total: `6m38s`	0.7495 (-0.15)	0.9612 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m17s`, train: `4m27s`, total: `5m44s`	0.8170 (-0.09)	0.9709 (0.00)	`no data`

Dataset: Private 2, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `45s`, train: `6m53s`, total: `7m38s`	0.2650 (-0.57)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `54s`, train: `5m54s`, total: `6m48s`	0.5923 (-0.26)	`no data`	`no data`

Mar 04 '22 08:03 github-actions[bot]

Status of the run: Failed

Commit: 8bd612eadadfecd9d616f066bc32fc3a73485ae4, The full report is available as an artifact.

Datadog dashboard link

Mar 04 '22 09:03 github-actions[bot]

Status of the run: Succeeded

Commit: 5a78a9fddd7966886546eca346cce126b7bc477c, The full report is available as an artifact.

Datadog dashboard link

Dataset: Private 1, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `42s`, train: `5m5s`, total: `5m47s`	0.8919 (-0.01)	0.9612 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m25s`, train: `4m16s`, total: `5m40s`	0.9012 (-0.01)	0.9663 (0.00)	`no data`

Dataset: Private 2, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `47s`, train: `7m1s`, total: `7m47s`	0.8423 (0.00)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `52s`, train: `5m49s`, total: `6m41s`	0.8552 (0.00)	`no data`	`no data`

Mar 04 '22 10:03 github-actions[bot]

Status of the run: Failed

Commit: 63d18b65a0ae76ba98b161a418f3c9dee0e90497, The full report is available as an artifact.

Datadog dashboard link

Mar 04 '22 11:03 github-actions[bot]

Status of the run: Failed

Commit: 63d18b65a0ae76ba98b161a418f3c9dee0e90497, The full report is available as an artifact.

Datadog dashboard link

Dataset: Private 1, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `42s`, train: `4m51s`, total: `5m33s`	0.8992 (0.00)	0.9612 (0.00)	`no data`

Mar 04 '22 11:03 github-actions[bot]

Status of the run: Succeeded

Commit: e9e8b437a1238a78a4638b9819fb0aaf0f78d06d, The full report is available as an artifact.

Datadog dashboard link

Dataset: Private 1, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `42s`, train: `4m50s`, total: `5m32s`	0.8950 (-0.00)	0.9612 (0.00)	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m25s`, train: `4m32s`, total: `5m57s`	0.9023 (-0.00)	0.9655 (-0.00)	`no data`

Dataset: Private 2, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `45s`, train: `6m59s`, total: `7m44s`	0.8573 (0.02)	`no data`	`no data`
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `55s`, train: `5m53s`, total: `6m47s`	0.8552 (0.00)	`no data`	`no data`

Mar 04 '22 12:03 github-actions[bot]

Hey @dakshvar22! :wave: To run model regression tests, comment with the /modeltest command and a configuration.

Tips :bulb:: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tips :bulb:: Whenever you want to change the configuration, edit the comment that contains the previous configuration.

You can copy this in your comment and customize:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot" (NLU)
# - "Customer 1" (NLU, Core)
# - "Hermit" (NLU)
# - "Private 1" (NLU)
# - "Private 2" (NLU)
# - "Private 3" (NLU)
# - "Sara" (NLU, Core)
# - "financial-demo" (NLU, Core)
# - "helpdesk-assistant" (NLU, Core)
# - "insurance-demo" (NLU, Core)
# - "retail-demo" (NLU, Core)

##########
## Available NLU configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

##########
## Available Core configurations
##########
# - "Rules"
# - "Rules + AugMemo"
# - "Rules + AugMemo + TED"
# - "Rules + Memo"
# - "Rules + Memo + TED"
# - "Rules + TED"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET(seq) + ResponseSelector(t2t)" and
## "BERT + DIET(seq) + ResponseSelector(t2t)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define a branch name to check-out for a dataset repository. Default branch is 'main'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
#
## Example: Define the number of repetitions. This sets how many times all runs defined in the include section are repeated. Default is 1
## num_repetitions: 2
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)"]
##
## Shortcuts:
## You can use the "all" shortcut to include all available configurations or datasets.
## You can use the "all-nlu" shortcut to include all available NLU configurations or datasets.
## You can use the "all-core" shortcut to include all available core configurations or datasets.

include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```

Mar 04 '22 13:03 github-actions[bot]

/modeltest

dataset_branch: "nib-test"
include:
 - dataset: ["Sara", "Sara", "Sara", "Sara"]
   config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "Sparse + DIET(bow) + ResponseSelector(bow)", "Sparse + BERT + DIET(bow) + ResponseSelector(bow)", "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"]

Mar 04 '22 13:03 github-actions[bot]

The model regression tests have started. It might take a while, please be patient. As soon as results are ready you'll see a new comment with the results.

The configuration used can be found in the comment.

Mar 04 '22 13:03 github-actions[bot]

Status of the run: Succeeded

Commit: e9e8b437a1238a78a4638b9819fb0aaf0f78d06d, The full report is available as an artifact.

Datadog dashboard link

Dataset: Sara, Dataset repository branch: nib-test, commit: 1434bff267f2be514caea55c94638ecd8c4bd864

Configuration	Intent Classification Micro F1	Entity Recognition Micro F1	Response Selection Micro F1
`Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `6m22s`, train: `11m57s`, total: `18m18s`	0.7010 (0.01)	0.7949 (0.00)	0.7907 (-0.01)
`Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `7m27s`, train: `9m6s`, total: `16m32s`	0.6948 (-0.01)	0.7913 (-0.00)	0.8031 (0.01)
`Sparse + DIET(bow) + ResponseSelector(bow)` test: `1m57s`, train: `8m37s`, total: `10m34s`	0.6688 (0.00)	0.7949 (0.00)	0.7804 (-0.01)
`Sparse + DIET(seq) + ResponseSelector(t2t)` test: `2m42s`, train: `6m30s`, total: `9m11s`	0.6774 (-0.00)	0.7629 (-0.01)	0.7876 (-0.01)

Mar 04 '22 15:03 github-actions[bot]

Status: Even with the above changes, model training is still not deterministic. Will need to try a different approach that first determines where the first point of non-determinism occurs in the model architecture. Clearly, it's not sparse.sparse_dense_matmul.
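
One way to look for the first point of non-determinism is to record the intermediate outputs of two runs that use the same seed and data, then walk the layers in forward order until the outputs differ. The following is only a minimal sketch of that idea, not what this PR implements. The function name `first_divergent_layer`, the layer names, and the recorded values are all hypothetical, and the outputs are assumed to have been flattened into plain lists of floats beforehand.

```python
from math import isclose
from typing import Dict, List, Optional, Sequence


def first_divergent_layer(
    run_a: Dict[str, Sequence[float]],
    run_b: Dict[str, Sequence[float]],
    layer_order: List[str],
    tol: float = 0.0,
) -> Optional[str]:
    """Return the first layer, in forward order, whose outputs differ.

    `run_a` and `run_b` map a (hypothetical) layer name to its flattened
    output values. With the default `tol=0.0` the comparison is exact.
    NaN values never compare equal, so they are reported as a divergence.
    """
    for name in layer_order:
        a, b = run_a[name], run_b[name]
        if len(a) != len(b):
            return name
        for x, y in zip(a, b):
            if not isclose(x, y, rel_tol=0.0, abs_tol=tol):
                return name
    return None


# Hypothetical recorded outputs from two runs with identical seeds.
order = ["sparse_to_dense", "transformer", "loss"]
run_1 = {"sparse_to_dense": [0.5, 1.25], "transformer": [0.1, 0.2], "loss": [0.7]}
run_2 = {"sparse_to_dense": [0.5, 1.25], "transformer": [0.1, 0.2000001], "loss": [0.71]}

print(first_divergent_layer(run_1, run_2, order))  # prints "transformer"
```

Comparing exactly by default is deliberate: any difference at all between two supposedly identical runs points at a non-deterministic op, while a non-zero `tol` helps to skip tiny floating-point noise and find where the runs diverge materially.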

Mar 10 '22 10:03 dakshvar22

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Daksh seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Jul 18 '22 22:07 CLAassistant

Closing this as it does not look like this line of work yielded improvements.

Sep 14 '22 07:09 twerkmeister
