trafilatura
Web API idea
I've been briefly experimenting with Postlight (née Mercury) Parser. Most of my projects are Python-based, so I was looking at a couple of the server implementations. Any interest in implementing the API described in:
- https://github.com/postlight/parser-api
- https://github.com/postlight/parser-api-express
- https://github.com/HenryQW/mercury-parser-api/
Example (copied from HenryQW/):
GET /parser?url=[required:url]&contentType=[optional:contentType]&headers=[optional:url-encoded-headers]
curl localhost:3000/parser?url=https://www.bbc.co.uk/news/science-environment-35876621
I'm thinking about knocking something up and wondering whether anyone else would find it useful (e.g. for switching backends).
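A rough sketch of what the `GET /parser?url=...` endpoint could look like on the Python side, to make the idea concrete. Everything here is an assumption rather than an existing trafilatura feature: the names `handle_parser_request` and `trafilatura_backend` are hypothetical, the optional `contentType` and `headers` parameters are ignored, and the response only fills in `url` and `content`, not the full set of fields the Mercury API returns. The only trafilatura calls used are `fetch_url` and `extract`.

```python
import json
from urllib.parse import urlsplit, parse_qs


def trafilatura_backend(url):
    """Hypothetical backend: fetch a page and extract its main text.

    trafilatura is imported lazily so the routing logic below can be
    exercised (or pointed at another backend) without it installed.
    """
    import trafilatura

    downloaded = trafilatura.fetch_url(url)
    if downloaded is None:  # download failed
        return None
    return trafilatura.extract(downloaded)  # may also return None


def handle_parser_request(path, backend=trafilatura_backend):
    """Map a GET request path such as /parser?url=... to (status, JSON body).

    Assumption: only the required `url` parameter is handled; the status
    codes and error payloads are illustrative choices, not Mercury's.
    """
    parts = urlsplit(path)
    if parts.path != "/parser":
        return 404, json.dumps({"error": "not found"})

    params = parse_qs(parts.query)
    url = params.get("url", [None])[0]
    if not url:
        return 400, json.dumps({"error": "missing required parameter: url"})

    content = backend(url)
    if content is None:
        return 502, json.dumps({"error": "could not fetch or extract", "url": url})

    return 200, json.dumps({"url": url, "content": content})
```

Because the backend is just a callable taking a URL, swapping extractors would only mean passing a different function, e.g. `handle_parser_request("/parser?url=https://www.bbc.co.uk/news/science-environment-35876621")` for the default, or `backend=some_other_extractor` for something else.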