eikek
That's an interesting idea! I have to do research here, but I believe it is something where NLP techniques can help and maybe also with the document structure. It depends...
Hi @nero82, thank you! I'm not sure if this is a different case: do you mean file names or the item name? I think the proposal with NLP is...
The item name only defaults to the file name (or some text like "3 files" if there are multiple files on an item) when processing. I also think that most...
> Cool would be the possibility to use user defined fields and tags of a certain category. The patterns can be extended, custom fields are a good idea – I...
I actually have a source per language. But it's only for two languages :) You can reprocess the files which will reuse the language from the previous run. If you...
> Could you elaborate on your first statement? What do you mean by "a source per language but only for two languages"? Sure. I have several source URLs and I...
Oh, strange, yes! I need to look at the code. It's probably a bug. What is your collective's default language? Edit: oh, sorry - it's using the language from the...
> I don't know the first thing about Scala (I work with Python and JS) but I think I'm gonna give it a go (provided I'm able to set up...
In addition to what @Snify89 said: you could disable some processing to save resources if you want. For example, running ocrmypdf is not required when you have PDFs in the first...
Hi @ohagene - thank you for your thoughts! I have to admit I'm a bit reluctant to add complete pagination. It is always a bit of a hassle to...