Mehmed Mustafa
@bertsky I have pushed my latest changes to the benchmarking branch. I have not worked on that experiment since then. @mweidling is investigating this topic in more depth and...
@bertsky you are right. I have not changed the actual modules. To implement my own versions of the functions, I extended the `OcrdMets` class. The reason I...
> Also, why have separate `_fileGrp_cache` and `_file_cache` - doesn't the latter also contain the fileGrp info in a fast-to-access way? What would be more beneficial would be caching the...
Building the METS file with 50 files improved from 8.7 s to 508 ms! Here we go, less electricity consumption for everyone :)
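To illustrate the caching idea discussed above, here is a minimal sketch of a single nested cache that maps fileGrp to file ID to element, so the fileGrp information does not need a second structure. It uses only the standard library; `CachedFileIndex`, `find_cached`, `find_uncached` and the sample document are hypothetical names made up for this example and are not the `OcrdMets` API or the code from the branch.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Tiny METS-like sample, invented for this sketch.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-IMG">
      <mets:file ID="IMG_0001"/>
      <mets:file ID="IMG_0002"/>
    </mets:fileGrp>
    <mets:fileGrp USE="OCR-D-BIN">
      <mets:file ID="BIN_0001"/>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


class CachedFileIndex:
    """Hypothetical helper (not part of ocrd_models)."""

    def __init__(self, root):
        self._root = root
        # Nested cache: fileGrp USE -> {file ID -> element}.
        self._file_cache = {}
        for grp in root.iterfind(".//mets:fileGrp", NS):
            files = self._file_cache.setdefault(grp.get("USE"), {})
            for el in grp.iterfind("mets:file", NS):
                files[el.get("ID")] = el

    def find_uncached(self, file_id):
        # Walks the whole tree on every call: O(number of files).
        for el in self._root.iterfind(".//mets:file", NS):
            if el.get("ID") == file_id:
                return el
        return None

    def find_cached(self, file_grp, file_id):
        # Two dictionary lookups, independent of document size.
        return self._file_cache.get(file_grp, {}).get(file_id)

    def file_groups(self):
        # The outer keys already give the list of fileGrps.
        return list(self._file_cache)


index = CachedFileIndex(ET.fromstring(SAMPLE))
hit = index.find_cached("OCR-D-IMG", "IMG_0002")
```

A real implementation would also have to keep the cache in sync whenever files or groups are added or removed, which this sketch leaves out.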
> Related issue: #723
>
> Related discussion: [OCR-D/zenhub#39](https://github.com/OCR-D/zenhub/issues/39)
>
> Still missing IIUC:
>
> > I would suggest diversifying test scenarios BTW:
> >
> > * thousands of pages...
@bertsky, here are the results for 50, 500, 1000, 2000, and 5000 pages. I forced iterations to 1 because it was already taking 3 days for 5000 pages (non-cached) to finish....
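For anyone who wants to reproduce the shape of such a comparison on a small scale, here is a rough sketch using `timeit` with a configurable iteration count. It is not the benchmark from the branch: `build_files`, `linear_lookup`, `build_cache` and `benchmark` are hypothetical helpers, and the data is a synthetic list of (fileGrp, file ID) pairs rather than a real METS file.

```python
import timeit


def build_files(n_pages, files_per_page=3):
    # Synthetic (fileGrp, file ID) pairs standing in for METS file entries.
    return [
        (f"GRP_{g}", f"FILE_{p:04d}_{g}")
        for p in range(n_pages)
        for g in range(files_per_page)
    ]


def linear_lookup(files, file_id):
    # Non-cached variant: scan every entry until the ID matches.
    for grp, fid in files:
        if fid == file_id:
            return grp
    return None


def build_cache(files):
    # Cached variant: one dictionary keyed by file ID.
    return {fid: grp for grp, fid in files}


def benchmark(page_counts=(50, 500), iterations=1):
    # Returns {page count: (seconds non-cached, seconds cached)}.
    results = {}
    for n in page_counts:
        files = build_files(n)
        target = files[-1][1]  # worst case for the linear scan
        cache = build_cache(files)
        t_linear = timeit.timeit(
            lambda: linear_lookup(files, target), number=iterations
        )
        t_cached = timeit.timeit(lambda: cache.get(target), number=iterations)
        results[n] = (t_linear, t_cached)
    return results


timings = benchmark()
```

With `iterations=1` the numbers are noisy, so they are only useful for spotting order-of-magnitude differences.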
> Perhaps you could include your local results as a Markdown table into `ocrd_models/README.md` or a new file under `tests/model/README.md`?

@bertsky, yes, I will do that once I have the...
@bertsky, I have canceled the test execution because even building the METS file for the non-cached version had not finished after almost 5 days. Since I could not...
> @MehmedGIT understood. (Perhaps a server machine or SSH build on CircleCI would help with your resource limitation.)
>
> It would help at least knowing how much that test...
Since the caching_functionality branch was 150 commits behind master, I merged master into it. The test example with 1500 files per page and 500 pages is running inside...