exercises_thushv_dot_com

Sentence formatting in nmt_tutorial.ipynb

Open 07Agarg opened this issue 7 years ago • 0 comments

Hi @thushv89,

Thanks for your tutorial on neural machine translation. I am new to this field and am stuck on the data pre-processing part of the code.

def split_to_tokens(sent,is_source):
    #sent = sent.replace('-',' ')
    sent = sent.replace(',',' ,')
    sent = sent.replace('.',' .')
    sent = sent.replace('\n',' ')
    sent_toks = sent.split(' ')

Could you explain this part? Why did you replace the newline character with a space? Shouldn't it denote the end of a sentence () or in this case. Or, if \n is replaced with a space, shouldn't the split be done on the full stop ' .'?
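To make the question concrete, here is a minimal sketch of how I understand the excerpt to behave. The `return` statement, the dropped `is_source` argument, and the sample string are my own assumptions, since the excerpt above ends before the function does.

```python
def split_to_tokens(sent):
    # Same replacements as the notebook excerpt. Returning the token list is
    # an assumption; the excerpt stops at the split.
    sent = sent.replace(',', ' ,')   # make each comma its own token
    sent = sent.replace('.', ' .')   # make each full stop its own token
    sent = sent.replace('\n', ' ')   # a newline becomes an ordinary space
    return sent.split(' ')


# Hypothetical two-line input, not taken from the tutorial data.
sample = "Hello, world.\nBye."
tokens = split_to_tokens(sample)
print(tokens)  # ['Hello', ',', 'world', '.', 'Bye', '.']
```

If this reading is right, both lines end up in one flat token list, and the only remaining trace of the sentence boundary is the '.' token.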

Also, why should target sentences start and end with ?

Thanks a lot!

07Agarg, Aug 25 '18 12:08