New ZIM request: NHS conditions

Open dattaz opened this issue 4 years ago • 19 comments

  • Website URL: https://www.nhs.uk/conditions/
  • License: OGL (needs a second check: 3.4 https://www.nhs.uk/our-policies/terms-and-conditions/)
  • Desired ZIM Title: Health A to Z
  • Desired ZIM Description: Your complete guide to conditions, symptoms and treatments, including what to do and when to get help
  • Desired ZIM Icon –png (URL or attach one): https://www.nhs.uk/static/nhsuk/img/favicons/favicon-192x192.43924bfe6c7e.png
  • Language (ISO 639-3): eng
  • Desired Main Page (homepage, if different from website URL): https://www.nhs.uk/conditions/ (not all the website, only the descending pages)
  • Is this a MediaWiki?: no

dattaz avatar Feb 02 '21 18:02 dattaz

Excellent idea. It looks like a pretty straightforward design, have you tried it on zimit?

Popolechien avatar Feb 02 '21 18:02 Popolechien

running it through youzim.it seems to do a great job :clap: (maybe just hitting the 1000 file limit).

dattaz avatar Feb 03 '21 22:02 dattaz

Recipe created: https://farm.openzim.org/recipes/nhs.uk-conditions_en_all (I'll update the library link once the file is ready)

RavanJAltaie avatar Aug 12 '24 23:08 RavanJAltaie

File is ready at the library https://library.kiwix.org/viewer#nhs.uk-conditions_en_all_2024-08

RavanJAltaie avatar Aug 15 '24 11:08 RavanJAltaie

Same CSS fix should be applied as in https://github.com/openzim/zim-requests/issues/1138

benoit74 avatar Sep 02 '24 10:09 benoit74

Custom CSS created; recipe updated to publish to dev with this custom CSS, and a run requested. Let's see.

benoit74 avatar Sep 17 '24 07:09 benoit74

@benoit74 I've just noticed with this ZIM (I'm testing for the first time, having been away) that none of the videos appear to work. See for example the Heart Attack video at bottom of this page: https://library.kiwix.org/viewer#nhs.uk-conditions_en_all_2024-09/www.nhs.uk/conditions/heart-attack/ . There are other examples such as the Menstrual Cycle video at the bottom of this page: https://library.kiwix.org/viewer#nhs.uk-conditions_en_all_2024-09/www.nhs.uk/conditions/periods/ .

Clearly this is Zimit-related, and not specific to this ZIM, but I thought I should note it here.

EDIT: I tested in library.kiwix.org and in the PWA. Videos don't play in either.

Jaifroid avatar Sep 17 '24 15:09 Jaifroid

I'm testing for the first time, having been away

For the record, you published this file to production on August 15, so you probably already tested it, or at least you should have.

The fact that videos don't work is a known limitation of the scraper. Only YouTube videos are known to work in Zimit/Warc2zim, and this is not going to change in the coming months or years.

Is it critical enough that we remove the ZIM from production? Or is the information present sufficiently valuable without videos?

benoit74 avatar Sep 19 '24 06:09 benoit74

Hi @benoit74 I think you think you're replying to a different person! (I am not involved in publishing ZIMs.) The decision on whether it's critical is more for your team to make, but personally I'd say it's not critical because there is a lot of textual information. I don't know whether the underlying video files have been scraped, but if they have, they bloat the ZIM without being accessible, and it might be an idea to exclude them.

Jaifroid avatar Sep 19 '24 06:09 Jaifroid

Sorry @Jaifroid, too soon in the morning, I was convinced it was Ravan speaking ^^

Your point regarding whether videos are bloating the ZIM is indeed a good one

benoit74 avatar Sep 19 '24 06:09 benoit74

I confirm the ZIM is bloated with the first seconds of every video. Unfortunately, I don't think we have sufficient tooling to exclude them from the ZIM; AFAIK we can only do it once https://github.com/openzim/zimit/issues/353 is implemented.

I think it would be super cool if we could also replace or even watermark video posters in such situations, so that we have something saying "videos not available in ZIM". I've opened https://github.com/openzim/warc2zim/issues/396 to keep track of the idea.

I've also opened https://github.com/openzim/warc2zim/issues/397 for a "let's dream a bit" scenario.

Regarding the current NHS conditions ZIM, and until these issues are solved, should we manually remove the useless items and publish it manually? This is work only a developer can do, but if we agree that we will not update the ZIM for the coming year, it might be worth it to avoid a big ZIM for nothing.

benoit74 avatar Sep 19 '24 07:09 benoit74

I was going to ask how bloated is bloated, but considering that NHS conditions is 4.5GB and NHS medicine is 13.5MB, I suspect I have an answer. @benoit74 can you please remove these unviewable videos?

Popolechien avatar Sep 19 '24 12:09 Popolechien

can you please remove these unviewable videos?

Do we agree this is a one-shot manual operation, and that I will not do it again for many months (i.e. the recipe will be disabled)?

We have no tooling for this, so I will have to do it "by hand", which is quite time-consuming.

benoit74 avatar Sep 19 '24 13:09 benoit74

@benoit74 Personally (but I guess it's @Popolechien's call), I'd say it is not something you should have to do "by hand", but rather something that could wait till https://github.com/openzim/zimit/issues/353 is ready and it can be done automatically. I don't think it's so urgent as to take up valuable time that could be spent on other things. Sorry if I'm speaking (writing) out of turn! JMHO.

Jaifroid avatar Sep 19 '24 13:09 Jaifroid

We have no tooling for this, so I will have to do it "by hand"

Ah no, I thought that your hands would be writing a handy script and voilà. Never mind, then. Let's wait for openzim/zimit/issues/353 as flagged by @Jaifroid

Popolechien avatar Sep 19 '24 13:09 Popolechien

Then we have to remove the file from production, right? If so, then please open a separate issue since the assignees are different.

benoit74 avatar Sep 19 '24 13:09 benoit74

Yup. Opened #1163

Popolechien avatar Sep 19 '24 13:09 Popolechien

Wait - what's the policy again here? Keep it open as it's not ready, or close it because the recipe exists?

Popolechien avatar Sep 19 '24 13:09 Popolechien

Never close unless we know we will never make the ZIM. Here we have good hopes of making the ZIM, so just flag it as upstream + bug.

benoit74 avatar Sep 19 '24 13:09 benoit74