editdojo

Automatically detect if the given text is Japanese or English with Python

Open ykdojo opened this issue 7 years ago • 10 comments

I think I'm going to release the Twitter-based version of this product for Japanese and English first. So, we should be able to detect with Python whether a given tweet is written in Japanese or English. This way, we can show native Japanese speakers only the Japanese tweets coming from Japanese learners. The same goes for English.

ykdojo avatar Nov 03 '18 19:11 ykdojo

Nice. @ykdojo

yuruyuri16 avatar Nov 04 '18 15:11 yuruyuri16

@ykdojo Does this mean that whenever someone posts a Japanese tweet, only people who are familiar with Japanese will be able to see it? Or will all the members of the community see it? If we notify only people familiar with Japanese, then in this Twitter app they would have to be registered as knowing Japanese and learning English? Is that similar to what you have in mind? That is what I have understood. By the way, I am very interested in contributing to this app idea, and I can gain more knowledge from it. We could do this for other languages as well, here in India :)

Small doubt :(

ghost avatar Nov 05 '18 14:11 ghost

Hmm here's an example to clarify.

Suppose User A is learning Japanese, and her native language is English.

She starts using one of her Twitter accounts, say, @user_a_jp, to tweet stuff in Japanese.

Then, Japanese native speakers should start seeing these tweets so they can fix them.

My only concern is: what if @user_a_jp starts tweeting stuff in both Japanese and English? We should probably be able to ignore all English tweets in that case.

ykdojo avatar Nov 05 '18 20:11 ykdojo

For something like this, we could look into the langdetect library. Following along with the above example, if @user_a_jp writes a tweet that langdetect identifies as 'en', we would ignore the tweet.
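
A minimal sketch of what that could look like, assuming the `langdetect` package from PyPI is installed. The helper names `detect_language` and `keep_tweets_in_language` are made up for illustration and are not part of the library or of this repo; the library calls used are `detect`, `DetectorFactory.seed`, and `LangDetectException`.

```python
from typing import Callable, Iterable, List, Optional


def detect_language(text: str) -> Optional[str]:
    """Return a language code such as 'ja' or 'en', or None if detection fails."""
    # Imported lazily so the filtering helper below can be used (and tested)
    # without langdetect installed.
    from langdetect import DetectorFactory, detect
    from langdetect.lang_detect_exception import LangDetectException

    # langdetect is non-deterministic by default; fixing the seed makes
    # results reproducible, which helps with short texts like tweets.
    DetectorFactory.seed = 0
    try:
        return detect(text)
    except LangDetectException:
        # Raised when the text has no usable features (e.g. only digits or symbols).
        return None


def keep_tweets_in_language(
    tweets: Iterable[str],
    target_lang: str,
    detector: Callable[[str], Optional[str]] = detect_language,
) -> List[str]:
    """Keep only the tweets whose detected language equals target_lang.

    `detector` is a hypothetical hook so another backend can be swapped in.
    """
    return [tweet for tweet in tweets if detector(tweet) == target_lang]
```

With this, `keep_tweets_in_language(tweets, "ja")` would drop any tweet from @user_a_jp that is detected as 'en'. Detection on very short or mixed-language tweets can be unreliable, so the results are worth spot-checking.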

emills11 avatar Nov 07 '18 18:11 emills11

Oh yeah, the langdetect library looks good!

ykdojo avatar Nov 07 '18 19:11 ykdojo

Would you like me to go ahead and create a few functions that make use of the library? @ykdojo

emills11 avatar Nov 08 '18 14:11 emills11

Yeah that would be awesome! Thank you.

ykdojo avatar Nov 08 '18 16:11 ykdojo

NOTE: there's already a PR for this. https://github.com/ykdojo/editdojo/pull/29

Will come back to this when it's more immediately useful.

ykdojo avatar Dec 06 '18 05:12 ykdojo

Would it be easier to use Google Translate's automatic language detection feature, or is that something extra and unnecessary? @ykdojo
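
For reference, a rough sketch of that option, assuming the Google Cloud Translation client library (`google-cloud-translate`) is the feature meant here and that API credentials are already configured. The function name `detect_with_google` and its `client` parameter are hypothetical, added only for illustration.

```python
from typing import Any, Optional


def detect_with_google(text: str, client: Optional[Any] = None) -> str:
    """Return the language code Google's detection reports for `text`."""
    if client is None:
        # Imported lazily; requires the google-cloud-translate package
        # and valid Google Cloud credentials in the environment.
        from google.cloud import translate_v2 as translate

        client = translate.Client()

    # For a single string, detect_language returns a dict that includes
    # the detected "language" code alongside a "confidence" value.
    result = client.detect_language(text)
    return result["language"]
```

Unlike a local library, this makes a network call per request and is a paid API, so it may be worth batching tweets or caching results.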

tushar-punjabi avatar Jan 02 '19 23:01 tushar-punjabi

Yeah, actually I think that will be ideal.

ykdojo avatar Jan 03 '19 18:01 ykdojo