MapBot
Gather more data for the chatbot's database via crowd-sourcing
Requirement
The sentences.csv file has very limited data, which can be used for the initial training. The aim is to gather more data via crowd-sourcing and other sources to help improve the responses of the bot via ML models.
Pre-requisite
Elementary knowledge of Python
Elementary understanding of the available data
Dependencies
None
Description
This is an open-ended issue where participants can explore crowd-sourcing to gather the data required for improving the bot's NLP capabilities. We can either use a crowd-sourcing platform (like Amazon Mechanical Turk) or a simple survey form distributed amongst friends.
The primary aim here is to get a wide variety of questions that people may ask a mapbot, i.e. a bot that primarily answers queries about directions and locations. Please provide the details of the different APIs we're planning to include in the bot and ask folks to frame their questions based on the set of available capabilities.
As discussed in a similar issue #52, elementary pre-processing of the data might be required before we put it in the db. Please look at sentences.csv to get an idea of the kind of questions we're handling right now.
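As a rough illustration of what "elementary pre-processing" could mean for crowd-sourced questions, here is a minimal Python sketch. It only assumes the responses arrive as a list of strings; the function name `clean_questions` is hypothetical and not part of the MapBot code base, and the column layout of sentences.csv is deliberately not assumed here.

```python
import re


def clean_questions(raw_questions):
    """Normalize whitespace, drop blanks, and remove case-insensitive duplicates.

    Hypothetical helper for illustration only; not part of MapBot.
    """
    seen = set()
    cleaned = []
    for text in raw_questions:
        # Collapse runs of whitespace and trim both ends.
        normalized = re.sub(r"\s+", " ", text).strip()
        if not normalized:
            continue  # skip empty survey responses
        key = normalized.casefold()
        if key in seen:
            continue  # skip duplicates that differ only in case or spacing
        seen.add(key)
        cleaned.append(normalized)
    return cleaned


if __name__ == "__main__":
    responses = [
        "  How do I get to the airport?  ",
        "how do I get   to the airport?",
        "",
        "Where is the nearest ATM?",
    ]
    print(clean_questions(responses))
```

The first spelling of each question is kept, so the original casing of the earliest response survives. Any further steps (spell-checking, filtering to navigation-related questions) would depend on what the mentors decide in the discussion.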
Please have your method of gathering data reviewed before actually putting it up on a site or sharing it with your friends/batchmates/colleagues.
Interested
Hi!!! Can I work on this issue?
@preeti13456 I'm assigning this to you, please share your approach here and we can have a short discussion before you proceed with the implementation. @shreyanshi2228 This issue is already assigned but you can take a look at the others and we'll be opening more soon :)
Okay:)
@preeti13456 how's the progress here? Are you stuck with something? When do you think you can raise a PR to resolve this?
No, I will start today.
I am unassigning this issue, as a guideline we only recommend claiming one issue at a time.
ok
Hey @vishakha-lall and @janakrajchadha, I found something that may be useful: https://github.com/hellohaptik/haptik_open_datasets
https://lionbridge.ai/datasets/15-best-chatbot-datasets-for-machine-learning/
There are many. @shreyanshi2228 and @preeti13456, maybe you can try to manually label this dataset to enrich the current data, and then propose it to the mentors of this project. Data enrichment is one of the important components of a data science pipeline. I have commented here just to help!
Thank you :)
https://github.com/hellohaptik/haptik_open_datasets/blob/master/domain_classification/test_data.csv I guess we want this for this issue. Can I work on this issue?
@shreyanshi2228 while the link is interesting, the idea of using crowd-sourcing was so that we could concentrate on the intent of the bot (which is navigation related). Do you think you would be able to crowd-source the data?