rasa Watson Training Data importer

Proposed changes:

Added a watson data importer that takes exported watson training for and converts them into rasa format for training an NLU model Currently accepted transformations
Intents
Entities
Entity Synonyms

Links to Issue #1154

Status (please check what you already did):

[x] added some tests for the functionality
[x] updated the documentation
[x] updated the changelog (please check changelog for instructions)
[x] reformat files using black (please check Readme for instructions)

Feb 10 '21 18:02 souvikg10

All committers have signed the CLA.

Feb 10 '21 18:02 CLAassistant

Thanks for opening a draft pull request 🚀If you have any questions, you can direct them to @alopez ✨

Feb 11 '21 07:02 sara-tagger

@akelad , @tmbo - I am not sure whether the reviewer is active, it automatically assigned.

Thanks for checking. I may be missing some prep for a PR but happy to add whatever is missing.

Feb 15 '21 16:02 souvikg10

Hey @souvikg10

Thanks for working on this! I will review the PR this week 👍

Mar 09 '21 13:03 degiz

@degiz - any update on this?

thanks

Mar 30 '21 07:03 souvikg10

👋 Thanks for your contribution. The failing step wait for docs test in the CI is an error and it is fixed. If you update the PR with the base branch, the wait for docs test step should be green. 🙌🏻

Apr 08 '21 18:04 HotThoughts

@degiz Hi, I added some updates to use .get() for the dictionaries to handle better the loops if keys dont exist. Could you review please.

Thanks

Apr 23 '21 08:04 souvikg10

@degiz - i made the changes as requested. Thanks for the feedback.

May 10 '21 09:05 souvikg10

Hey @souvikg10 Thanks for addressing the comments! 🚀 You can re-request the review if the changes are ready 👍

May 11 '21 11:05 degiz

Hi @degiz , not sure if i did it right, ,but i think i requested review.. can you please check and let me know if there is anything else to add on this?

Jun 01 '21 14:06 souvikg10

Hey @souvikg10

Could you please merge Rasa main into your branch? Let's see if the tests are passing now 🤔

Jun 01 '21 14:06 degiz

I rebased and fixed the failling tests. i hope :) @degiz - let's see if this passes

Jun 04 '21 12:06 souvikg10

All tests are passing from my side. @degiz . there is just one about unused imports but i am not sure why

Jun 08 '21 07:06 souvikg10

rasa/shared/nlu/training_data/formats/watson.py:4:1: F401 'rasa.shared.nlu.constants.VALID_FEATURE_TYPES' imported but unused

You need to remove the unsed import there.

rasa/shared/nlu/training_data/formats/init.py:10:1: F401 'rasa.shared.nlu.training_data.formats.rasa.RasaReader' imported but unused

Please put # noqa: F401 on the same line as import

Jun 09 '21 14:06 degiz

@wochinge - updated changelog as well.

Jun 17 '21 11:06 souvikg10

@wochinge yeah it makes sense to convert the data first specially I only wrote the part for watson NLU data and not the conversation part which i don't even know how to integrate just. could be something i can add in the future, convert watson flows into rules in Rasa.

I actually followed the Dialogflow implementation. I suppose there will be one for Dialogflow training data converter as well. I can work with that. no problem.

I will let you know if I am stuck at some point.

Jun 18 '21 08:06 souvikg10

could be something i can add in the future, convert watson flows into rules in Rasa.

That sounds like a super cool idea!

suppose there will be one for Dialogflow training data converter as well.

Actually there isn't 😬 For historic reasons we treat DialogFlow etc. as first class citizen training data (cc @TyDunn ). It's done by this part of the code.

Jun 18 '21 08:06 wochinge

@wochinge - sorry for the delay.

I have added the ref to create a watson training data converter. Could you take a look. I will remove the other implementation if this is all okay. wrote some tests but i don't know where to add the docs..

I had to add a param in the CLI to indicate the source type as watson using -s arg. let me know if that is okay...

Jul 16 '21 13:07 souvikg10

Thanks @souvikg10 I'm just heading out and on vacation next week: Will have a look the week after 🤞🏻

Jul 16 '21 15:07 wochinge

Sorry @wochinge I have been taking a loong time to get this done 😢 Time constraints

Aug 09 '21 15:08 souvikg10

@souvikg10 How is it going with this? Any way I can help you out?

Oct 18 '21 08:10 wochinge

@wochinge - sorry, i thought the latest build was successful, because i refactored and removed the older implementation out. Just need to fix some tests it seems. I will do it now. Was off for a couple of weeks.

Oct 19 '21 07:10 souvikg10

@wochinge - this is becoming too big for me 😢 I don't understand some of the failing tests. the implementation is done but could you please take a look at the failing tests. I will try my best to get them sorted but i am under a lot of workload recently. sorry for this dragging this PR for a long time

Nov 08 '21 08:11 souvikg10

@wochinge - i have some code complexity issues here. Cognitive complexity is too high ☹️ I am going to try to rework some of it

Dec 02 '21 10:12 souvikg10

@souvikg10 Code complexity is not a required check and we haven't adjusted the code climate setting for this setting yet to the codebase. Unless it's an easy change / a very obvious thing, I'd ignore it.

I'll have a look at your changes right away. Sorry having you wait so long!

Dec 03 '21 14:12 wochinge

@souvikg10 Do you still plan on completing this PR? If you don't have time for it right now, I'd suggest we close it for now.

Apr 22 '22 11:04 indam23

@melindaloubser1 - Unfortunately i won't have time on this until summer when things are a little relaxed :( there has been a lot of review on this including a rewrite few months back. I didn't get the time afterwards to work on the tests as tobias reviewed. my apologies

Apr 25 '22 07:04 souvikg10

All good, just wanted to check if you needed support on it - stalebot might close it in the future, but you can reopen it if you have capacity later in the summer.

Apr 25 '22 09:04 indam23

rasa rasa copied to clipboard

Watson Training Data importer

rasa
rasa copied to clipboard