mwoffliner
Audio/Video within blacklisted CSS containers is still downloaded
Take this HTML:
<div class='noprint'>
<audio><source src='...'/></audio>
</div>
The source URL is added to the list of media to download during pre-processing, but the element itself is removed later in processing. This means we download the content even though it is never visible in the output.
Sir, should I start working on this issue now?
@vaibhavmatta That would be great! Could you first explain your approach to solving this problem?
Thanks a lot, @kelson42 sir! I think I am capable of working on several bugs simultaneously. I will reply with my resolution strategy as fast as I can.
@ISNIT0 I'll let you handle the technical details with @vaibhavmatta.
My recommendation would be to apply the blacklists/html processing before extracting the urls.
This would be done by re-ordering the calls to processArticleHtml, getModuleDependencies, and templateArticle: https://github.com/openzim/mwoffliner/blob/master/src/util/saveArticles.ts#L90
I'm not sure if it'll just work or need further tweaking.
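To illustrate why the ordering matters, here is a minimal sketch, not mwoffliner's actual code. The `HtmlNode` type, `pruneBlacklisted`, `collectMediaUrls`, and the file names `hidden.ogg` and `visible.ogg` are all hypothetical stand-ins for the real DOM handling and URL extraction:

```typescript
// Hypothetical, simplified stand-in for a parsed article DOM (not mwoffliner's real types).
interface HtmlNode {
  tag: string;
  classes?: string[];
  src?: string;
  children?: HtmlNode[];
}

// Return a copy of the tree without any element that carries a blacklisted class.
function pruneBlacklisted(node: HtmlNode, blacklist: string[]): HtmlNode | null {
  const classes = node.classes ?? [];
  if (classes.some((c) => blacklist.indexOf(c) !== -1)) {
    return null;
  }
  const children: HtmlNode[] = [];
  for (const child of node.children ?? []) {
    const kept = pruneBlacklisted(child, blacklist);
    if (kept !== null) {
      children.push(kept);
    }
  }
  return { ...node, children };
}

// Collect the `src` of every node in the tree that has one.
function collectMediaUrls(node: HtmlNode | null): string[] {
  if (node === null) {
    return [];
  }
  const urls: string[] = node.src !== undefined ? [node.src] : [];
  for (const child of node.children ?? []) {
    urls.push(...collectMediaUrls(child));
  }
  return urls;
}

// Same shape as the HTML in the report, plus one audio element outside the blacklisted container.
const article: HtmlNode = {
  tag: "body",
  children: [
    {
      tag: "div",
      classes: ["noprint"],
      children: [{ tag: "audio", children: [{ tag: "source", src: "hidden.ogg" }] }],
    },
    { tag: "audio", children: [{ tag: "source", src: "visible.ogg" }] },
  ],
};

// Current order: extract first, so the hidden file is still queued for download.
const urlsBefore = collectMediaUrls(article);

// Proposed order: apply the blacklist first, then extract.
const urlsAfter = collectMediaUrls(pruneBlacklisted(article, ["noprint"]));
```

With this ordering, `urlsBefore` contains both files while `urlsAfter` contains only `visible.ogg`, which is the behaviour the re-ordering is meant to achieve.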
Yes, that's exactly what I was thinking, @ISNIT0.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.