reinforcement learning as a method to design conversations
Related: https://github.com/wechaty/wishlist/issues/43
We can try to plan a perfect path through a conversation ahead of time and write out a script for our bots. This is "top-down design".
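As a toy illustration of what such a script amounts to (a hedged sketch, not Wechaty or Rasa code; `Step` and `refundScript` are made-up names), a top-down design is a fixed sequence of bot turns, each paired with the one user reaction the designer anticipated:

```ts
// Hypothetical names throughout; a toy script, not any real bot API.
interface Step {
  botSays: string;          // the bot's next utterance
  expectUserIntent: string; // the single user reaction the designer planned for
}

const refundScript: Step[] = [
  { botSays: 'Hi! How can I help you today?',    expectUserIntent: 'request_refund' },
  { botSays: 'Sure. What is your order number?', expectUserIntent: 'give_order_number' },
  { botSays: 'Thanks, your refund is on the way.', expectUserIntent: 'thanks' },
];
```

The problem described in the next paragraph is already visible here: the script encodes a single path, and every user turn that doesn't match `expectUserIntent` falls outside it.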
But often the user will take the conversation in a completely different direction; if they were talking to a real human agent, the conversation would flow in a different sequence. Some authoring systems, such as Rasa, start with this top-down approach (https://rasa.com/docs/rasa/writing-stories)
and then try to use annotations of actual conversations to refine the flow. However, the current tools on the market are quite unusable for this: Rasa stories, IMHO, quickly devolve into a huge mess that is impossible to view or reason about.
So this project would be a fresh start at combining NLU conversation insights, gathered from "human in the loop" choices or from post-review of past conversations, with the top-down designed stories. The choices a human makes should affect future conversations in a probabilistic way.
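A minimal sketch of what "affect future conversations in a probabilistic way" could mean, assuming a simple bandit-style policy: each dialogue state keeps weights over candidate next actions, the bot samples in proportion to those weights, and every human-in-the-loop choice or post-review annotation reinforces the action the human picked. All names here (`sampleAction`, `reinforce`, etc.) are hypothetical; this is not Rasa's mechanism or the prototype's, just one possible shape for the idea:

```ts
// Dialogue policy as per-state weights over candidate next actions.
type ActionWeights = Map<string, number>; // action name -> unnormalized weight

const policy = new Map<string, ActionWeights>(); // dialogue state -> weights

function weightsFor(state: string): ActionWeights {
  if (!policy.has(state)) policy.set(state, new Map());
  return policy.get(state)!;
}

// Sample the next action in proportion to its learned weight.
// Unseen actions default to weight 1, so the hand-authored story
// acts as a uniform prior. Assumes `candidates` is non-empty.
function sampleAction(state: string, candidates: string[]): string {
  const w = weightsFor(state);
  const scores = candidates.map((a) => w.get(a) ?? 1);
  let r = Math.random() * scores.reduce((sum, s) => sum + s, 0);
  for (let i = 0; i < candidates.length; i++) {
    r -= scores[i];
    if (r <= 0) return candidates[i];
  }
  return candidates[candidates.length - 1];
}

// A human choice ("in this state, the right reply was `action`"),
// whether made live or during post-review, reinforces that action.
function reinforce(state: string, action: string, reward = 1): void {
  const w = weightsFor(state);
  w.set(action, (w.get(action) ?? 1) + reward);
}

// Example: an annotator marks that after a `request_refund` intent
// the bot should have asked for the order number.
reinforce('request_refund', 'ask_order_number');
```

Under this scheme the top-down story survives as the prior, and human corrections gradually reshape the transition probabilities without anyone having to maintain an ever-growing story graph by hand.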
A simple prototype exists at https://dc.rik.ai/projects/convoai, but it is not yet connected to any kind of neural-network model.
SOwC (Summer of Wechaty): https://github.com/wechaty/summer-of-wechaty/issues/30