
[documentation] Add example with byte buffer

Open bjrne opened this issue 2 years ago • 1 comments

I wanted to use the library with a file that I get from another server, so I already had the file in memory. It took me a while to understand how to hand it to tika without first writing it to disk, partly because there were no examples.

Counterintuitively, the from_file method does not work here: it relies on the name attribute of a FileIO object, which an in-memory file (a BytesIO) does not have. I don't think the name is really necessary for in-memory files, but fixing that would be a separate issue.

The from_buffer method is always shown with strings, and its parameter is called "string" all the way down until it reaches the callServer method, where it is actually called data. So byte data works, but that is not obvious. I therefore suggest adding this bit of documentation.
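
For illustration, here is a minimal sketch of the kind of example I have in mind. The helper name parse_in_memory and its parse argument are hypothetical (they are not part of tika-python); the only library call used is parser.from_buffer, and it assumes tika-python is installed and a Tika server can be started or reached.

```python
import io


def parse_in_memory(source, parse=None):
    """Parse an in-memory document without writing it to disk.

    `source` is either raw bytes or a binary file-like object such as
    io.BytesIO. `parse` is a hypothetical hook for swapping in another
    parser function; by default tika's parser.from_buffer is used.
    """
    if parse is None:
        # Imported lazily: this needs tika-python plus a reachable Tika server.
        from tika import parser
        parse = parser.from_buffer

    # from_buffer wants the data itself, so read file-like objects first.
    if isinstance(source, (bytes, bytearray)):
        data = source
    else:
        data = source.read()

    return parse(data)
```

With the bytes of a downloaded file in a variable, say payload, calling parse_in_memory(payload) or parse_in_memory(io.BytesIO(payload)) should return the usual dictionary from from_buffer, with the extracted text under "content" and the metadata under "metadata".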

bjrne avatar Oct 07 '21 11:10 bjrne

@chrismattmann Could you please merge this PR? I'd like to help out by writing some documentation as well, but first I want to see whether PRs will be merged.

dogweather avatar Mar 06 '22 19:03 dogweather

yes, sorry for the delay @dogweather !

chrismattmann avatar Dec 31 '22 20:12 chrismattmann

Thanks! NP.

dogweather avatar Jan 03 '23 05:01 dogweather