Mehmed Mustafa comments

Results: 82 comments of Mehmed Mustafa

performance of input_files on large workspaces

@bertsky I have pushed my latest changes to the benchmarking branch. I have not worked on that experiment since then. @mweidling is investigating this topic in more depth and...

performance of input_files on large workspaces

@bertsky, you are right. I have not changed the actual modules. To implement my own version of the functions, I have extended the OcrdMets class. The reason I...

Cache functionality

> Also, why have separate `_fileGrp_cache` and `_file_cache` - doesn't the latter also contain the fileGrp info in a fast-to-access way? What would be more beneficial would be caching the...

![Screenshot from 2022-06-07 17-01-51](https://user-images.githubusercontent.com/17258874/172416468-7b155c0c-df6c-462e-a584-dd1ea3b9ee60.png) Building the METS file with 50 files improved from 8.7 s to 508 ms! Here we go, less electricity consumption for everyone :)
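To illustrate why a cache can help here, below is a minimal sketch, assuming the slow path is a scan of the whole tree on every lookup. It is not the OcrdMets code: `CachedFileIndex`, `find_file_uncached`, and the plain `fileSec`/`fileGrp`/`file` element names are hypothetical stand-ins, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET


class CachedFileIndex:
    """Hypothetical illustration only; not the OcrdMets implementation."""

    def __init__(self):
        self.root = ET.Element("fileSec")
        self._groups = {}  # fileGrp USE -> fileGrp element
        self._files = {}   # fileGrp USE -> {file ID -> file element}

    def find_file(self, file_id, file_grp=None):
        # Dictionary lookups only: cost depends on the number of groups,
        # not on the number of files.
        groups = [file_grp] if file_grp is not None else list(self._files)
        for grp in groups:
            el = self._files.get(grp, {}).get(file_id)
            if el is not None:
                return el
        return None

    def add_file(self, file_grp, file_id):
        # The duplicate check is the step that gets expensive without a cache.
        if self.find_file(file_id) is not None:
            raise ValueError(f"duplicate file ID: {file_id}")
        if file_grp not in self._groups:
            self._groups[file_grp] = ET.SubElement(self.root, "fileGrp", USE=file_grp)
            self._files[file_grp] = {}
        el = ET.SubElement(self._groups[file_grp], "file", ID=file_id)
        self._files[file_grp][file_id] = el
        return el


def find_file_uncached(root, file_id):
    # Linear scan over every file element, repeated for each lookup.
    for el in root.iter("file"):
        if el.get("ID") == file_id:
            return el
    return None
```

With the scan, adding n files costs on the order of n² comparisons in total, while the dictionary version stays roughly linear. Whether that accounts for the whole difference measured above is an assumption.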

Cache functionality

> Related issue: #723
>
> Related discussion: [OCR-D/zenhub#39](https://github.com/OCR-D/zenhub/issues/39)
>
> Still missing IIUC:
>
> > I would suggest diversifying test scenarios BTW:
> >
> > * thousands of pages...

Cache functionality

![Screenshot from 2022-06-28 06-39-30](https://user-images.githubusercontent.com/17258874/176134118-8c85c919-84c0-4851-b114-bd4277e30387.png) @bertsky, here are the results for 50, 500, 1000, 2000, and 5000 pages. I forced iterations to 1 because it was already taking 3 days for 5000 pages (non-cached) to finish....
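For readers who want to try a comparison like this on a small scale, here is a rough sketch of a timing harness. It is not the benchmark behind the screenshot: `build_ids` and `benchmark` are hypothetical names, and the workload is just a stand-in that checks each new ID for duplicates, either with a linear scan (non-cached) or with a set lookup (cached).

```python
import time


def build_ids(n, cached):
    """Add n IDs, checking each one for duplicates first (stand-in workload)."""
    ids_list = []
    ids_set = set()
    for i in range(n):
        file_id = f"FILE_{i:04d}"
        if cached:
            exists = file_id in ids_set  # constant-time lookup
        else:
            exists = any(existing == file_id for existing in ids_list)  # full scan
        if exists:
            raise ValueError(f"duplicate file ID: {file_id}")
        ids_list.append(file_id)
        ids_set.add(file_id)
    return len(ids_list)


def benchmark(sizes, iterations=1):
    """Return {(size, cached): mean seconds per iteration}."""
    results = {}
    for n in sizes:
        for cached in (False, True):
            start = time.perf_counter()
            for _ in range(iterations):
                build_ids(n, cached)
            results[(n, cached)] = (time.perf_counter() - start) / iterations
    return results
```

Calling `benchmark([50, 500, 1000, 2000, 5000], iterations=1)` mirrors the page counts above. The absolute timings will not be comparable to the real METS runs, only the growth pattern.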