Kelson
After discussion, we should be able to implement a solution that:
* Saves files smaller than the cluster size in memory
* Keeps the file handle open for bigger files
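To make the two cases concrete, here is a rough sketch in Python (only for illustration, not the actual implementation). The names `PendingItem` and `open_item` and the `CLUSTER_SIZE` value are hypothetical; the real threshold would be whatever cluster size the writer is configured with.

```python
import os
from dataclasses import dataclass
from typing import BinaryIO, Optional

# Assumed threshold in bytes; the real cluster size is not stated in this thread.
CLUSTER_SIZE = 2 * 1024 * 1024


@dataclass
class PendingItem:
    """Hypothetical holder: either the bytes of a small file or an open handle."""
    path: str
    data: Optional[bytes] = None
    handle: Optional[BinaryIO] = None

    def read(self) -> bytes:
        # Small files are served from memory, big ones from the open handle.
        if self.data is not None:
            return self.data
        self.handle.seek(0)
        return self.handle.read()

    def close(self) -> None:
        # Only big files hold a handle that needs releasing.
        if self.handle is not None:
            self.handle.close()
            self.handle = None


def open_item(path: str, cluster_size: int = CLUSTER_SIZE) -> PendingItem:
    """Load small files fully into memory, keep a handle open for the rest."""
    if os.path.getsize(path) < cluster_size:
        with open(path, "rb") as f:
            return PendingItem(path, data=f.read())
    return PendingItem(path, handle=open(path, "rb"))
```

The trade-off is that small files cost RAM but no file descriptor, while big files cost one descriptor each but almost no RAM, so the number of simultaneously open big files still has to stay below the process limit.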
@rgaudin @MiguelRocha @satyamtg It seems difficult from the `zimcheck` perspective to easily tell the difference between a template and a "normal" HTML page. IMO we are missing here some kind...
@rgaudin This is a regex parser
Agree with @rgaudin: the core of the problem here is that we have a regex-based parser, and we should have a DOM-based one.
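A small illustration of why the regex approach is fragile, written in Python for brevity. Note the assumptions: `NAIVE_HREF` is a made-up pattern and not the one `zimcheck` actually uses, and Python's `html.parser` is an event-based tokenizer rather than a real DOM, so this only stands in for "a parser that understands HTML structure".

```python
import re
from html.parser import HTMLParser

# Hypothetical naive pattern, NOT the regex used by zimcheck.
NAIVE_HREF = re.compile(r'href="([^"]*)"')


class LinkCollector(HTMLParser):
    """Collects href/src attribute values from real start tags only."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name in ("href", "src") and value is not None:
                self.links.append(value)


def extract_links(html):
    """Return link targets in document order, ignoring commented-out markup."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


SAMPLE = """<!-- <a href="commented-out.html">old</a> -->
<a href='single-quoted.html'>one</a>
<img src=unquoted.png>
<a href="plain.html">two</a>"""
```

On `SAMPLE`, `NAIVE_HREF.findall(SAMPLE)` reports the commented-out link and misses the single-quoted and unquoted ones, whereas `extract_links(SAMPLE)` returns `single-quoted.html`, `unquoted.png` and `plain.html`.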
@maneeshpm This might be a good candidate for you: replacing the functions which retrieve the links with a DOM (pugixml) parser.
@maneeshpm Thank you very much. If you have other tasks ongoing, please try to finish them first.
@maneeshpm This is a really pertinent remark. pugixml indeed does not seem to be the proper tool. I am not sure for the moment how to proceed.
@veloman-yunkan Question: could it be that the problem is worse because we don't have error codes and therefore store a lot of free text, see #239?
@veloman-yunkan I would prefer to just fix the software elegantly so that the memory consumption does not go wild. I understand that with the current approach (in RAM), there will always be...
@ballerburg9005 Thanks for your bug report, but this bug is complicated to reproduce considering the size of the website/directory. Do you have a simpler reproduction case?