chamanti_ocr_theano

Training with my own data

Open gayatri-devarakonda opened this issue 9 years ago • 6 comments

Can I train this with my own data and then use it?

Thank you

gayatri-devarakonda avatar Jul 14 '16 06:07 gayatri-devarakonda

Since you asked in pure Telugu, I will upload a PDF in pure English. In a day or two. Thanks, Rakeshvara.

rakeshvar avatar Jul 14 '16 07:07 rakeshvar

@gayatriMahesh What kind of data do you have? I have just upgraded the system so that it generates the training data on the fly using the scribe.py file. If you have sentences and corresponding labels, you can replace scribe.py with code that returns one sample each time. That's it.
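To make the "one sample each time" idea concrete, here is a minimal sketch of a provider that serves pre-labelled samples instead of generating them. Everything in it is an assumption for illustration: the class name `OwnDataProvider`, the method name `get_sample`, and the `(image, labels)` return shape are hypothetical and are not taken from scribe.py, so check what the training script actually calls before adapting it.

```python
import random


class OwnDataProvider:
    """Hypothetical stand-in for an on-the-fly generator: serves stored samples.

    The interface here is an assumption, not the real scribe.py API.
    """

    def __init__(self, samples, shuffle=True, seed=0):
        # samples: list of (image, label_ids) pairs, where image is a
        # 2D list of pixel values and label_ids is a list of integers.
        if not samples:
            raise ValueError("need at least one (image, labels) pair")
        self._samples = list(samples)
        self._shuffle = shuffle
        self._rng = random.Random(seed)
        self._order = []  # indices still to be served in the current pass

    def get_sample(self):
        """Return one (image, labels) pair; start a new pass when exhausted."""
        if not self._order:
            self._order = list(range(len(self._samples)))
            if self._shuffle:
                self._rng.shuffle(self._order)
        return self._samples[self._order.pop()]


if __name__ == "__main__":
    # Two toy "line images" with made-up label ids.
    data = [([[0, 1], [1, 0]], [7, 2]), ([[1, 1], [0, 0]], [5])]
    provider = OwnDataProvider(data)
    image, labels = provider.get_sample()
    print(len(image), labels)
```

Each sample is served once per pass before any repeats, so a small annotated set is cycled evenly during training.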

rakeshvar avatar Oct 15 '16 09:10 rakeshvar

@gayatriMahesh If possible, can you share training data?

ChillarAnand avatar Oct 16 '16 07:10 ChillarAnand

I have multiple volumes of the Bhagavatam that I want to digitize. If it works correctly, there are other nonprofit organizations that want to use the same approach for other old Telugu texts. So I was trying to understand the end-to-end process. Right now I have some images, but they are not manually annotated yet.

gayatri-devarakonda avatar Oct 16 '16 11:10 gayatri-devarakonda

I have started a site to collect and digitize all Telugu books. If you have any Telugu books, you can share them.

ChillarAnand avatar Oct 16 '16 12:10 ChillarAnand

Hi,

I have many old historic books, such as Puranas, Upapuranas, Itihasas, etc., and I am trying to digitize them. I initially tried Tesseract with no luck, and now I am trying to use the ocropus engine. I need a Telugu language model; it would be great if you could help. Thank you.

harinath141 avatar Dec 27 '16 06:12 harinath141