petermr

Results 310 comments of petermr

Absolutely right. We actually need both. If I get it right, SOLR will do a high-volume generic index. Then we use specific dictionaries. Andy has, I think, run AMI...

Wow! Exciting. Any ETA? Are you able to get a sneak preview of the content? UTF-8? HTML? JSON? PDF? My guess is it's flat text, without style. On Thu, Apr...

That's a lovely Wikipedia page you've found. I'll show how to make the dictionary. The surgical mask is less clear. We can probably hoover some terms from Wikipedia pages. On...

Record EVERYTHING on the site IMMEDIATELY. Here's what I wrote to Clyde:
* Create a new directory on the site. Call it textIndexing/
* create a subdirectory SOLR
* Open...
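The first two steps of that list can be sketched in Python. This is only an illustration: the function name `make_indexing_dirs` and the `site_root` parameter are hypothetical, and the remaining steps of the list are not covered because they are cut off above.

```python
from pathlib import Path


def make_indexing_dirs(site_root):
    """Create textIndexing/ with a SOLR subdirectory under site_root.

    site_root is an assumption: any local checkout or folder that
    stands in for "the site". Existing directories are left untouched.
    """
    solr_dir = Path(site_root) / "textIndexing" / "SOLR"
    # parents=True creates textIndexing/ first; exist_ok=True makes the call repeatable.
    solr_dir.mkdir(parents=True, exist_ok=True)
    return solr_dir
```

Calling `make_indexing_dirs(".")` from the top of the site checkout would create both directories in one step.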

# MEETING at 1200 UTC 2020 03 27
I'm talking with Dan Hagon at 1200. I suggest as many as possible join. We'll all try to give an overview.

4 more dictionaries added but NOT checked
```
├── antiretroviral_drug
│   └── mwk
│       ├── antiretroviral_drug.html
│       └── antiretroviral_drug.xml
├── baltimore_(virus_classification)
│   └── mwk
│       ├── baltimore_(virus_classification).html
│       └── baltimore_(virus_classification).xml
...
```

This is organized as a `picocli` commandline (as is almost all AMI). My current style is to develop new functionalities as Tests, based on commandline and then add this to...

Correct. `getpapers` creates a directory with PMC* child directories, each with a `fulltext.pdf`. The `api` is switchable so you could have a `--api rs`. Note that the default is...
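To check a download against the layout described here (PMC* child directories, each with a `fulltext.pdf`), a minimal Python sketch could look like the following. The function name `find_fulltext_pdfs` and the `project_dir` parameter are hypothetical and not part of `getpapers`; the code only inspects the directory and does not call `getpapers` itself.

```python
from pathlib import Path


def find_fulltext_pdfs(project_dir):
    """Map each PMC* child directory name to its fulltext.pdf path.

    The value is None when the directory has no fulltext.pdf, which
    makes incomplete downloads easy to spot.
    """
    result = {}
    for child in sorted(Path(project_dir).glob("PMC*")):
        if not child.is_dir():
            continue  # ignore stray files whose names start with PMC
        pdf = child / "fulltext.pdf"
        result[child.name] = pdf if pdf.is_file() else None
    return result
```

Running `find_fulltext_pdfs("myproject")` on an output directory (the name is a placeholder) returns a dictionary keyed by PMC directory name, so entries with `None` show which papers still lack a PDF.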

Test Quickscrape - I haven't used it for some years. https://github.com/ContentMine/quickscrape It should be possible to build a RoyalSociety scraper. P.

# quickscrape on RS
Found a paper on COVID-19 `https://doi.org/10.1098/rsos.191420`

run with scraper dir
```
quickscrape --url https://doi.org/10.1098/rsos.191420 --scraperdir ../../journal-scrapers/ --output rs-191420 --outformat bibjson
info: quickscrape 0.4.7 launched with...
info:...
```