petermr

Results 310 comments of petermr

ami-pdf will read the PDFs in bulk and split them into characters and images. After that we need to know the application. Try http://discuss.contentmine.org/t/cm-ucl-ii-semantic-content-enhancement-of-table-data/396/2 for an overview of extracting tables. You...

Much of this is available through Java tests on petermr/normami, now moved to petermr/ami3. ami3 has the tests but not the data. It's image-based, so probably of limited value. Back...

How many documents do you have? The first step is to turn them into a CProject: put them in a directory, e.g. simon20190919, then ami-makeproject gives the help, then ami-makeproject...
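
The "put them in a directory" step above could be sketched roughly as follows. This is only an illustration: `collect_pdfs` is a hypothetical helper, not part of `ami`, and it assumes the documents are PDFs sitting in a single source folder.

```shell
#!/bin/sh
# Hypothetical helper (not an ami command): copy every PDF from a source
# folder into a project directory such as simon20190919.
collect_pdfs() {
    src_dir="$1"      # assumption: folder that currently holds the PDFs
    project_dir="$2"  # directory that will become the CProject
    mkdir -p "$project_dir" || return 1
    for pdf in "$src_dir"/*.pdf; do
        # the glob stays unexpanded when there are no PDFs, so skip it
        [ -e "$pdf" ] || continue
        cp "$pdf" "$project_dir"/ || return 1
    done
}
```

For example, `collect_pdfs /path/to/pdfs simon20190919` would fill the directory; after that, ami-makeproject can be run to see its help, as described above.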

Here's a stack of `ami` commands
```
#! /bin/sh
# your path should include the /bin directory of the appassembler distrib, e.g.
# ami-forestplot => /Users/pm286/workspace/cmdev/normami/target/appassembler/bin/ami-forestplot
# edit this to...
```

don't send it; add it in a new folder here unless there are copyright issues

from the 25K try to select ca. 20 which are:
* newish (old docs are problematic, but maybe that is the point)
* born digital if possible
* OPEN (we...

if it's publicly visible I'm happy. We did that with phylotrees. We are allowed to extract data if we can legally read it somewhere. Doesn't have to be CC BY....

happy to talk on phone/Skype if it helps

if you have 100-year-old records as bitmaps I am happy to try those, but they must be homogeneous in type

see table extraction at http://discuss.contentmine.org/t/ami-eppi-cm-ucl-table-extraction-project/322/14