nidaba
modified get_emoticons util functions
Used split() instead of regex to extract emoticons.
Thanks for this PR Chillar. I hope you've had a good Christmas break :)
I'd say that rather than returning a set of the emoticons, we should return a dictionary with the count of how many times the emoticon is used.
So it may return, for example:
get_emoticons(" This is my text :D :D :D but it has no code :(")
# {':D': 3, ':(':1}
This would allow us to take into account over-use of smileys etc., but it would also mean we could still get the same set of emoticons using dict.keys() if need be.
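Something along these lines is what I have in mind. It's only a rough sketch: the EMOTICONS set here is a made-up placeholder rather than whatever list Nidaba actually uses, and it assumes we keep the split() approach from this PR.

```python
from collections import Counter

# Hypothetical emoticon list, for illustration only.
# The list Nidaba really uses may well differ.
EMOTICONS = {":)", ":(", ":D", ":P", ";)"}


def get_emoticons(text):
    """Return a dict mapping each emoticon found in `text` to its count."""
    # split() with no arguments splits on any run of whitespace,
    # so leading spaces and repeated spaces are harmless.
    tokens = text.split()
    return dict(Counter(tok for tok in tokens if tok in EMOTICONS))


counts = get_emoticons(" This is my text :D :D :D but it has no code :(")
print(counts)              # per-emoticon counts
print(set(counts.keys()))  # the plain set of emoticons, as before
```

One thing to bear in mind with split(): an emoticon stuck to punctuation or a word (e.g. ":)." or "hi:)") stays as a single token and won't be counted, so it's worth deciding whether that's the behaviour we want.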
I'm planning on doing some more Nidaba over the next few days.
We also need much more thorough unit tests. Add more if you can, but if you can't think of any more thorough ones then just PR it and I'll add a few more before merging.
get_emoticons accepts string or list of words. modified unit tests.
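For reference, a minimal sketch of what accepting either a string or a list of words could look like, with a few tests of the kind asked for above. The name count_emoticons, the KNOWN_EMOTICONS set and the count-dict return type are all assumptions made for illustration; this is not the code in the commit.

```python
import unittest
from collections import Counter

# Hypothetical emoticon list, for illustration only.
KNOWN_EMOTICONS = {":)", ":(", ":D"}


def count_emoticons(text_or_words):
    """Count emoticons in a string or in an iterable of words."""
    if isinstance(text_or_words, str):
        # A plain string is tokenised on whitespace.
        words = text_or_words.split()
    else:
        # Anything else is assumed to be an iterable of word strings.
        words = list(text_or_words)
    return dict(Counter(w for w in words if w in KNOWN_EMOTICONS))


class TestCountEmoticons(unittest.TestCase):
    def test_string_and_list_agree(self):
        self.assertEqual(count_emoticons("hi :) :)"),
                         count_emoticons(["hi", ":)", ":)"]))

    def test_no_emoticons(self):
        self.assertEqual(count_emoticons("plain text"), {})

    def test_empty_inputs(self):
        self.assertEqual(count_emoticons(""), {})
        self.assertEqual(count_emoticons([]), {})

    def test_attached_punctuation_is_not_matched(self):
        # split() leaves ":)." as one token, so it is not counted.
        self.assertEqual(count_emoticons("nice :)."), {})
```

Saved to a file, the tests can be run with `python -m unittest <file>`.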