json-wikipedia

Json Wikipedia contains code to convert the Wikipedia XML dump into a JSON/Avro dump.

Results 10 json-wikipedia issues

Bumps [gson](https://github.com/google/gson) from 2.4 to 2.8.9. Release notes, sourced from gson's releases. Gson 2.8.9: Make OSGi bundle's dependency on sun.misc optional (#1993); Deprecate Gson.excluder() exposing internal Excluder class (#1986); Prevent...

dependencies

Bumps xercesImpl from 2.12.1 to 2.12.2. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=xerces:xercesImpl&package-manager=maven&previous-version=2.12.1&new-version=2.12.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a...

dependencies

Can you add documentation to the README on how to extend this solution to other languages? FR and ES did not work. E.g... $ ./scripts/convert-xml-dump-to-json.sh fr /u01/wikip/dumps.wikipedia/frwiki/frwiki-latest-pages-articles.xml.bz2 ./frwiki-latest-pages-articles.json Converting...

Commit 58d4b1e9608fcb883ab0d4cc7291e01492e94b7f introduced a regression: languages are no longer plain strings but values of an enum, and in the Avro enum I specified only Italian and English. Requesting...
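
To illustrate the kind of failure described here, the following is a minimal sketch of a closed enum that accepts only English and Italian and rejects every other language code. It is not the project's code: the project defines its enum in an Avro schema, and the names `Language`, `parse_language` and `is_supported` are hypothetical, chosen only for this example.

```python
from enum import Enum


class Language(Enum):
    """Hypothetical stand-in for an enum that lists only two languages."""
    EN = "en"
    IT = "it"


def parse_language(code: str) -> Language:
    """Map a language code to the enum, failing loudly for unsupported codes."""
    try:
        return Language(code.lower())
    except ValueError:
        supported = ", ".join(lang.value for lang in Language)
        raise ValueError(
            f"unsupported language {code!r}; supported: {supported}"
        ) from None


def is_supported(code: str) -> bool:
    """Return True only if the code maps to a declared enum member."""
    try:
        parse_language(code)
        return True
    except ValueError:
        return False
```

Under this assumption, supporting another language such as French or Spanish would mean adding a member to the enum, since any code outside the declared set is rejected.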

Once you reach the main XML content of the wiki dump, transforming the XML into JSON can be sped up considerably by running on Spark. This has already been...

enhancement

Commit https://github.com/diegoceccarelli/json-wikipedia/commit/d3d6398ece696119a96f1ea08132e5ad2657c6a6 disabled sending the report to Codecov because we had to switch from Cobertura to JaCoCo (Cobertura wasn't compatible with OpenJDK 11). Find a way to send the report to...

enhancement

https://github.com/HeidelTime/heideltime

Bumps [ch.qos.logback:logback-classic](https://github.com/qos-ch/logback) from 1.2.0 to 1.2.13. Commits: 2648b9e prepare release 1.2.13; bb09515 fix CVE-2023-6378; 4573294 start work on 1.2.13-SNAPSHOT; a388193 Merge branch 'branch_1.2.x' of github.com:qos-ch/logback into branch_1.2.x; de44dc4 prepare release...

dependencies

Bumps org.apache.avro:avro from 1.8.1 to 1.11.3. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.avro:avro&package-manager=maven&previous-version=1.8.1&new-version=1.11.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a...

dependencies