petermr comments

Results: 310 comments of petermr


Investigate DOAJ

Absolutely right. We actually need both. If I understand correctly, SOLR will do a high-volume generic index. Then we use specific dictionaries. Andy has, I think, run AMI...

Investigate DOAJ

Wow! Exciting. Any ETA? Are you able to get a sneak preview of the content? UTF-8? HTML? JSON? PDF? My guess is it's flat text, without style. On Thu, Apr...

Face Mask knowledge extraction

That's a lovely Wikipedia page you've found. I'll show how to make the dictionary. The surgical mask is less clear. We can probably hoover some terms from Wikipedia pages. On...

Communication within project and further

Record EVERYTHING on the site IMMEDIATELY. Here's what I wrote to Clyde:
* Create a new directory on the site. Call it textIndexing/
* Create a subdirectory SOLR
* Open...

Communication within project and further

# MEETING at 1200 UTC 2020 03 27
I'm talking with Dan Hagon at 1200. I suggest as many as possible join. We'll all try to give an overview.

check dictionaries and add to OpenNotebook

4 more dictionaries added but NOT checked
```
├── antiretroviral_drug
│   └── mwk
│       ├── antiretroviral_drug.html
│       └── antiretroviral_drug.xml
├── baltimore_(virus_classification)
│   └── mwk
│       ├── baltimore_(virus_classification).html
│       └── baltimore_(virus_classification).xml
...
```

Scraper for biorxiv and medrxiv

This is organized as a `picocli` commandline (as is almost all of AMI). My current style is to develop new functionality as Tests, based on the commandline, and then add this to...

Scraper for Royal Society Publishing

Correct. `getpapers` creates a directory with PMC* child directories, each with a `fulltext.pdf`. The `api` is switchable so you could have a `--api rs`. Note that the default is...
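
As a rough illustration of that layout, here is a minimal Python sketch that walks a project directory and collects the `fulltext.pdf` found in each PMC* child directory. The function name `find_fulltext_pdfs` and the idea of checking the output this way are assumptions made for illustration; they are not part of `getpapers` or AMI.

```python
from pathlib import Path


def find_fulltext_pdfs(project_dir):
    """Return {PMC directory name: path to fulltext.pdf}.

    Hypothetical helper: it only assumes the layout described above
    (PMC* child directories, each holding a fulltext.pdf).
    """
    project = Path(project_dir)
    found = {}
    for child in sorted(project.glob("PMC*")):
        pdf = child / "fulltext.pdf"
        # Skip stray files named PMC* and directories without a PDF.
        if child.is_dir() and pdf.is_file():
            found[child.name] = pdf
    return found
```

Calling `find_fulltext_pdfs("myproject")` on a download directory would then give a quick view of which papers actually arrived with a PDF.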

Scraper for Royal Society Publishing

Test Quickscrape - I haven't used it for some years. https://github.com/ContentMine/quickscrape It should be possible to build a Royal Society scraper. P.

Scraper for Royal Society Publishing

# quickscrape on RS
Found a paper on COVID-19: `https://doi.org/10.1098/rsos.191420`. Run with scraper dir:
```
quickscrape --url https://doi.org/10.1098/rsos.191420 --scraperdir ../../journal-scrapers/ --output rs-191420 --outformat bibjson
info: quickscrape 0.4.7 launched with...
info:...
```