awesome-ai-ml-dl
Add more features to the BetterNLP library
Following on from this discussion, https://github.com/pandas-profiling/pandas-profiling/issues/278, @shahanesanket and I will take this further. Some high-level ideas:
- Missing value analysis
- Text length analysis
- 2.1 min, max, average, quantiles
- 2.2 frequent words, infrequent words (could use the deepmoji project's tokenizer; it's very robust)
- 2.3 word cloud (if it isn't too much of a stretch goal)
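To make the ideas above concrete, here is a rough sketch of the missing-value, text-length and word-frequency checks using only the Python standard library. The function name `profile_texts`, its `top_n` parameter and the returned keys are made up for illustration and are not taken from BetterNLP or pandas-profiling; the whitespace tokenizer is a placeholder for a sturdier one, and the word cloud is left out.

```python
from collections import Counter
from statistics import mean, quantiles


def profile_texts(texts, top_n=3):
    """Hypothetical helper: summarise a list of text cells (None or blank = missing)."""
    # Missing value analysis: treat None and whitespace-only strings as missing.
    present = [t for t in texts if t is not None and t.strip()]
    if len(present) < 2:
        raise ValueError("need at least two non-empty texts to compute quantiles")

    # Text length analysis, measured in characters.
    lengths = [len(t) for t in present]

    # Naive whitespace tokenizer (an assumption); swap in a better one as needed.
    words = Counter(w.lower() for t in present for w in t.split())
    ranked = words.most_common()  # sorted from most to least frequent

    return {
        "missing_count": len(texts) - len(present),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": mean(lengths),
        "length_quartiles": quantiles(lengths, n=4),  # three cut points
        "frequent_words": ranked[:top_n],
        "infrequent_words": ranked[-top_n:],
    }


if __name__ == "__main__":
    sample = ["good movie", None, "bad movie", "", "a very good movie"]
    for key, value in profile_texts(sample).items():
        print(key, value)
```

Returning a plain dict keeps the sketch easy to fold into a DataFrame column-by-column later, if that turns out to be the direction we take.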
@shahanesanket let's continue with our discussions here.
@shahanesanket any thoughts on the above? Shall we get started with your ideas and then draft some code on top of them?
@shahanesanket
Please have a look at this implementation and let me know what you think. It follows on from the issue you raised as a discussion point on the Pandas Profiling repo: https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/examples/better-nlp/notebooks/jupyter/nlp_profiler.ipynb
I'm happy to expound on this further after hearing your response and feedback on it.
NLP Profiler has been moved from under the Examples: BetterNLP section into its own repo: https://github.com/neomatrix369/nlp_profiler