Audio/Video within blacklisted CSS containers is still downloaded

Open ISNIT0 opened this issue 5 years ago • 8 comments

Take this HTML:

<div class='noprint'>
    <audio><source src='...'/></audio>
</div>

The source URL is added to the list of media to download during pre-processing, but the element itself is removed later in the treatment process. This means we download the content even though it is never visible.

ISNIT0 avatar Jun 28 '19 08:06 ISNIT0

Sir, should I start resolving this issue now?

vaibhavmatta avatar Aug 02 '19 07:08 vaibhavmatta

@vaibhavmatta That would be great! Could you please first explain what your approach to solving this problem would be?

kelson42 avatar Aug 02 '19 08:08 kelson42

Thanks a lot, @kelson42 sir! I think I am capable of working on many bugs simultaneously. I will reply with resolution strategies as fast as I can.

vaibhavmatta avatar Aug 02 '19 08:08 vaibhavmatta

@ISNIT0 I'll let you handle the technical details with @vaibhavmatta.

kelson42 avatar Aug 02 '19 08:08 kelson42

My recommendation would be to apply the blacklists/HTML processing before extracting the URLs.

This would be done by re-ordering the calls to processArticleHtml, getModuleDependencies, and templateArticle: https://github.com/openzim/mwoffliner/blob/master/src/util/saveArticles.ts#L90

I'm not sure whether it'll just work or will need further tweaking.
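To illustrate why the order matters, here is a minimal sketch that contrasts the two orderings on a toy element tree. It is not mwoffliner's actual code: `HtmlNode`, `removeBlacklisted`, `collectMediaUrls`, the blacklist contents, and the `.ogg` file names are all hypothetical stand-ins, and the real functions named above may behave differently.

```typescript
// Hypothetical, simplified stand-in for a parsed HTML element.
interface HtmlNode {
  tag: string;
  classes: string[];
  src?: string;
  children: HtmlNode[];
}

// Assumed blacklist; 'noprint' is the class used in the example HTML of this issue.
const BLACKLISTED_CLASSES: string[] = ["noprint"];

// Return a copy of the tree without blacklisted nodes and their subtrees.
function removeBlacklisted(node: HtmlNode): HtmlNode | null {
  if (node.classes.some((c) => BLACKLISTED_CLASSES.indexOf(c) !== -1)) {
    return null;
  }
  const children = node.children
    .map(removeBlacklisted)
    .filter((child): child is HtmlNode => child !== null);
  return { ...node, children };
}

// Collect the src of every <source> element that is still in the tree.
function collectMediaUrls(node: HtmlNode | null, urls: string[] = []): string[] {
  if (node === null) {
    return urls;
  }
  if (node.tag === "source" && node.src !== undefined) {
    urls.push(node.src);
  }
  for (const child of node.children) {
    collectMediaUrls(child, urls);
  }
  return urls;
}

// Toy article: one audio file inside a blacklisted container, one outside.
const article: HtmlNode = {
  tag: "body",
  classes: [],
  children: [
    {
      tag: "div",
      classes: ["noprint"],
      children: [
        {
          tag: "audio",
          classes: [],
          children: [{ tag: "source", classes: [], src: "hidden.ogg", children: [] }],
        },
      ],
    },
    {
      tag: "audio",
      classes: [],
      children: [{ tag: "source", classes: [], src: "visible.ogg", children: [] }],
    },
  ],
};

// Order described in this issue: URLs are extracted before the blacklist is applied.
const extractFirst = collectMediaUrls(article);

// Proposed order: apply the blacklist first, then extract URLs from what remains.
const filterFirst = collectMediaUrls(removeBlacklisted(article));

console.log(extractFirst); // [ 'hidden.ogg', 'visible.ogg' ]
console.log(filterFirst); // [ 'visible.ogg' ]
```

In this sketch the download list only shrinks because filtering runs on the same tree that extraction later walks. If the real pipeline extracts URLs from a different copy of the HTML than the one being filtered, re-ordering the calls alone would not be enough.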

ISNIT0 avatar Aug 02 '19 09:08 ISNIT0

Yes, that's exactly what I was thinking, @ISNIT0.

vaibhavmatta avatar Aug 02 '19 09:08 vaibhavmatta

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Oct 01 '19 09:10 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Jan 20 '21 08:01 stale[bot]