ocr-fileformat
Selecting multiple files in the web interface
Currently, the GUI allows choosing only one file at a time. It would be great if one could select a large set of files at once, convert them, and download them together.
Thanks a lot.
Especially for many files, drag-and-drop support would also be helpful.
Converting many files would still return only a single download, so the result could be, for example, a zip file containing the converted files.
But maybe a Web API would be even better for handling lots of files.
For batch transformations I would suggest using the command line tools, e.g. something like
for filename in *.alto; do
    docker run --rm -it -v "$PWD":/data ubma/ocr-fileformat ocr-transform alto2.0 hocr "$filename"
done
Maybe we could allow multiple files directly as arguments to the CLI scripts.
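Until then, a small wrapper function could approximate multi-file arguments. This is only a rough sketch: `transform_many` and `OCR_TRANSFORM` are made-up names, and it assumes that `ocr-transform` writes the result to stdout when no output file is given (worth checking against `ocr-transform --help`).

```shell
# Hypothetical helper: transform_many FROM TO FILE...
# OCR_TRANSFORM may be overridden, e.g. to run the tool via docker.
# It is expanded unquoted on purpose, so it can hold a command plus arguments
# (which also means paths with spaces inside it will not work).
OCR_TRANSFORM=${OCR_TRANSFORM:-ocr-transform}

transform_many() {
    if [ "$#" -lt 3 ]; then
        echo "usage: transform_many FROM TO FILE..." >&2
        return 2
    fi
    from=$1
    to=$2
    shift 2
    status=0
    for infile in "$@"; do
        # Naive output name: replace the last extension with the target format.
        outfile="${infile%.*}.$to"
        # Assumption: the converted document is written to stdout.
        if ! $OCR_TRANSFORM "$from" "$to" "$infile" > "$outfile"; then
            echo "failed: $infile" >&2
            status=1
        fi
    done
    return "$status"
}
```

Usage would be something like `transform_many alto2.0 hocr *.alto`, which leaves one `.hocr` file next to each input and returns non-zero if any conversion failed.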
However, I am skeptical that the Web GUI is suitable for larger transformation tasks, or that a Web API is the direction we should take.
BTW, there is already some kind of Web API (though it is undocumented), e.g. https://digi.bib.uni-mannheim.de/ocr-fileformat/ocr-fileformat.php?do=transform&from=alto2.0&to=hocr&url=https://rawgit.com/kba/ocr-fileformat-samples/master/samples/alto/2.0/wetzel_reisebegleiter_1901_0021.alto
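For scripting, the same request could be sent with curl. A minimal sketch, assuming the endpoint keeps accepting the `do`, `from`, `to` and `url` query parameters seen in the example URL; since the interface is undocumented, this may change. `transform_via_web` and `ENDPOINT` are hypothetical names.

```shell
# Hypothetical helper: transform_via_web FROM TO SOURCE_URL
# Endpoint and parameter names are taken from the example URL in this thread.
ENDPOINT=${ENDPOINT:-https://digi.bib.uni-mannheim.de/ocr-fileformat/ocr-fileformat.php}

transform_via_web() {
    # -G turns the --data-urlencode pairs into a URL-encoded GET query string.
    curl --fail --silent --show-error -G "$ENDPOINT" \
        --data-urlencode "do=transform" \
        --data-urlencode "from=$1" \
        --data-urlencode "to=$2" \
        --data-urlencode "url=$3"
}
```

It would be called as `transform_via_web alto2.0 hocr <url-of-alto-file> > page.hocr`. Letting curl do the URL encoding avoids problems with special characters in the source URL.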