New request: OEIS (The On-Line Encyclopedia of Integer Sequences)
- Website URL: https://oeis.org
- License: Creative Commons Attribution Share-Alike 4.0 (source)
- Desired ZIM Title: OEIS
- Desired ZIM Description: The On-Line Encyclopedia of Integer Sequences
- Desired ZIM Icon –png (URL or attach one): https://oeis.org/oeis_logo.png
- Language (ISO 639-3): eng
- Is this a MediaWiki?: yes
Recipe created: https://farm.openzim.org/recipes/oeis.org_en_all. I'll update the library link once it is ready.
The recipe was taking 7 days to scrape 11%, so I've stopped it and re-run it.
There is a mediawiki at https://oeis.org/wiki/
We cannot scrape MediaWiki websites with zimit unless a very special configuration is put in place (and even that is not recommended).
There are 49 languages supported, so I don't get why we create only one ZIM.
The website seems to be centered around a search box used to look up integer sequences. This search functionality is not going to work inside the ZIM, since it needs an online server. Are we sure the ZIM will still be usable without this search box? (I doubt it will, at least not as-is with zimit, or we need to find a proper home page.)
Some "sub-sites" like https://oeis.org/play and https://oeis.org/plot2.html (and maybe others, I did not investigate all of them) are not going to work, since they need a server to generate the audio file based on user input.
To move forward, we need to more precisely define:
- what do we want inside the ZIM?
- do we create one ZIM per language as usual?
- what is the strategy for the wiki? Do we create a separate ZIM? Or do we try to tune the zimit config to scrape the MediaWiki, because it would be too cumbersome for users to have two ZIMs, one with the integer sequences and one with the wiki?
Note: task e130bc44-0dfc-4901-ba92-1cf894731d05 is marked as succeeded, but in fact the crawler crashed with a "Browser disconnected (crashed?), interrupting crawl" message, so the ZIM is not usable.
What about changing the website to https://oeis.org/wiki/Welcome? @benoit74 I think this is the target in the request?
I don't know
I've disabled the recipe which was still running but was wrong and not working.
@tdeitch can you please explain what you are interested in having in a ZIM from this website? Is it the database of integer sequences, the wiki explaining things, something else, or all of these?
Sorry, I should have been clearer and not said that it was a MediaWiki. The database of integer sequences is what's interesting to me, and I expect to most people. The wiki is useful for people looking to contribute, but not something I care about day to day.
Currently, I can look up terms in a local clone of https://github.com/oeis/oeisdata. I just thought it'd be cool to have it as a ZIM file so I could search it alongside all of my other offline references.
@tdeitch no worries, and thanks a lot for the clarification. I expected your answer but didn't want to bias the request with my own assumptions ^^
Due to the very dynamic nature of the OEIS website, we cannot scrape it with our general-purpose zimit scraper.
This means we have to create a custom scraper to put the database in a ZIM, and develop a web UI capable of interacting with the in-ZIM database (and querying it with JS). This is not infeasible and probably not extremely hard, but it is definitely going to take some time, so without funding or a volunteer contributor committing to it, we might never see this happen, or at least not very soon.
Thanks a lot for the link to https://github.com/oeis/oeisdata. It is very important to know that this database exists standalone; it is a great enabler for this custom scraper.
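To give an idea of what the first step of such a custom scraper could look like, here is a minimal sketch in Python. It assumes the entries in oeisdata use the OEIS "internal format" (lines tagged `%I`, `%S`, `%T`, `%N`, `%K`, ...); the sample text is hand-written and abbreviated, and the names `parse_entry` and `contains_run` are hypothetical helpers, not part of any existing openZIM or OEIS tool.

```python
import json
import re
from collections import defaultdict

# Hand-written, abbreviated sample in the assumed OEIS internal format.
SAMPLE = """\
%I A000045 M0692 N0256
%S A000045 0,1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,
%T A000045 1597,2584,4181,6765
%N A000045 Fibonacci numbers: F(n) = F(n-1) + F(n-2) with F(0) = 0 and F(1) = 1.
%K A000045 nonn,core,nice,easy
"""

# One line = "%<tag> <A-number> <body>"
LINE_RE = re.compile(r"^%(\w) (A\d+) ?(.*)$")


def parse_entry(text):
    """Turn one internal-format entry into a JSON-friendly dict."""
    fields = defaultdict(list)
    a_number = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        tag, a_number, body = match.groups()
        fields[tag].append(body)
    # Terms are assumed to be split across the %S, %T and %U lines.
    raw_terms = "".join(fields["S"] + fields["T"] + fields["U"])
    terms = [int(t) for t in raw_terms.split(",") if t.strip()]
    keywords = [k for k in ",".join(fields["K"]).split(",") if k]
    return {
        "id": a_number,
        "name": " ".join(fields["N"]),
        "terms": terms,
        "keywords": keywords,
    }


def contains_run(terms, query):
    """Return True if `query` appears as consecutive terms in `terms`."""
    n = len(query)
    return any(terms[i:i + n] == query for i in range(len(terms) - n + 1))


entry = parse_entry(SAMPLE)
print(json.dumps(entry, indent=2))
print(contains_run(entry["terms"], [13, 21, 34]))
```

The scraper would write one such JSON record per sequence into the ZIM, and the in-ZIM web UI would reimplement something like `contains_run` in JS to replace the online search box. A linear scan over every sequence is probably too slow for the full database, so a real implementation would likely need a precomputed index.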
And anyway, I'm very supportive of creating this scraper, and I think it would be cool as well. I just lack time / have more important or funded topics to handle ^^
So thanks again for proposing the idea!
Since the wiki might be interesting as well, I will create a secondary issue focusing only on the wiki and modify your first comment here to make it clear that what we want is the database.
Note: I've deleted https://farm.openzim.org/recipes/oeis.org_en_all since it made no sense.