tika-python
[documentation] Add example with byte buffer
I wanted to use the library with a file that I get from another server, so I already had the file in memory. It took me a while to understand how to hand it to tika without first having to write it to disk, partly because there were no examples.
Counterintuitively, the from_file method does not work, since it relies on the name attribute of a FileIO object, which an in-memory file (a BytesIO object) does not have. I don't think the name is really necessary for in-memory files, but fixing that would be a separate issue.
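To illustrate the difference described here, the following sketch uses only the Python standard library (no tika calls) to compare a file opened from disk with a BytesIO object. The file name sample.bin and the variable names are made up for illustration.

```python
import io
import os
import tempfile

# A file opened from disk exposes a `name` attribute (its path).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"example bytes")
    with open(path, "rb") as on_disk:
        disk_name = getattr(on_disk, "name", None)

# An in-memory stream has no `name` attribute at all.
in_memory = io.BytesIO(b"example bytes")
memory_name = getattr(in_memory, "name", None)

print("on-disk name:", disk_name)
print("in-memory name:", memory_name)
```

Any code that reads the name attribute unconditionally will therefore fail on a BytesIO object unless it handles the missing attribute.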
The from_buffer method is always shown with strings, and the parameter is called "string" all the way down to the callServer method, where it is actually called data. Thus, byte data works, but that is not obvious. Therefore, I suggest adding this bit of documentation.
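As a minimal sketch of the usage this PR documents, the snippet below hands in-memory bytes to parser.from_buffer. The helper name parse_in_memory and its parse_buffer parameter are hypothetical, added only so the sketch can be tried without a running Tika server. It assumes tika-python is installed and a Tika server is reachable when the default is used.

```python
import io


def parse_in_memory(data, parse_buffer=None):
    """Parse bytes (or a BytesIO) with Tika without writing to disk.

    `parse_buffer` is a hypothetical hook for substituting the parser;
    by default tika's parser.from_buffer is used.
    """
    if parse_buffer is None:
        # Imported lazily: requires tika-python and a reachable Tika server.
        from tika import parser
        parse_buffer = parser.from_buffer

    # Accept either raw bytes or an in-memory stream.
    if isinstance(data, io.BytesIO):
        data = data.getvalue()

    return parse_buffer(data)


# Example usage (assumes `payload` holds bytes fetched from another server):
#   parsed = parse_in_memory(payload)
#   print(parsed.get("content"))
```

The point is simply that the bytes go straight into from_buffer; no temporary file or name attribute is involved.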
@chrismattmann Could you please merge this PR? I'd like to help out by writing some documentation as well, but first I want to see whether PRs will be merged.
yes, sorry for the delay @dogweather !
Thanks! NP.