Investigate URL location of scrapes

Open dd207 opened this issue 4 years ago • 0 comments

@lizgzil commented on Thu Mar 19 2020

The comparison with Uber's results showed that, for some parts of websites, Uber found many citations of Wellcome Trust-funded research papers while Reach found none.

Investigate the scraper to see whether this points to a real problem or was to be expected.

| Source | URL stem | Number in Reach results | Number in Uber results | Only Uber gets? |
|---|---|---|---|---|
| UK Parliament | researchbriefings.files.parliament.uk/documents | 0 | 58 | Yes |
| UK Parliament | oxfordmartin.ox.ac.uk/downloads | 2 | 0 | |
| UK Parliament | artshealthandwellbeing.org.uk/appg-inquiry | 2 | 0 | |
| UK Parliament | data.parliament.uk/writtenevidence | 2 | 0 | |
| UK Parliament | assets.publishing.service.gov.uk/government | 1 | 0 | |
| Gov.uk | assets.publishing.service.gov.uk/government/uploads/system | 21 | 232 | |
| Gov.uk | gov.uk/government/uploads/system | 0 | 472 | Yes |
| NICE | nice.org.uk/guidance | 0 | 343 | Yes |
| NICE | nice.org.uk/process | 1 | 0 | |
| WHO | apps.who.int/iris/bitstream/10665 | 1301 | 2381 | |
| MSF | msf.org.uk/sites/uk/files | 2 | 6 | |
| MSF | msf.org/sites/msf.org/files | 0 | 22 | Yes |
| MSF | doctorswithoutborders.org/sites/usa/files | 0 | 6 | Yes |
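
A comparison like the one above could be reproduced with something along these lines. This is only a rough sketch, not the code used to produce the table: `url_stem` and `compare_stems` are hypothetical helper names, the inputs are assumed to be plain lists of source document URLs from each tool, and stripping the scheme and a leading `www.` is an assumption about how the stems were normalised. The stems in the table have different path depths, so `depth` is left as a parameter.

```python
from collections import Counter
from urllib.parse import urlparse


def url_stem(url: str, depth: int = 2) -> str:
    """Return the host plus the first `depth` path segments.

    Assumption: the scheme and a leading 'www.' are dropped.
    """
    # urlparse only fills in netloc when a scheme or '//' prefix is present.
    parsed = urlparse(url if "://" in url else "//" + url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parsed.path.split("/") if s]
    return "/".join([host] + segments[:depth])


def compare_stems(reach_urls, uber_urls, depth=2):
    """Count URLs per stem in each result set and flag Uber-only stems."""
    reach = Counter(url_stem(u, depth) for u in reach_urls)
    uber = Counter(url_stem(u, depth) for u in uber_urls)
    rows = []
    for stem in sorted(set(reach) | set(uber)):
        only_uber = reach[stem] == 0 and uber[stem] > 0
        rows.append((stem, reach[stem], uber[stem], only_uber))
    return rows
```

Each returned row is `(stem, reach_count, uber_count, only_uber)`. Stems flagged as Uber-only would be the first places to check whether the scraper ever visited those paths.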