domingohui

Results 16 comments of domingohui

What are the ML models used for? I must have lost track in the Slack convo...

What was the URL that you were trying to scrape?

I think None makes more sense. If there has to be a date, a fallback could be the latest date mentioned in the article (or the latest Report).
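A rough sketch of that fallback, just to make the idea concrete. The function name `fallback_date` and the restriction to ISO-style `YYYY-MM-DD` dates are assumptions for illustration, not anything from the scraper itself; real articles would need a more forgiving date parser.

```python
import re
from datetime import date
from typing import Optional

# Assumption: only ISO-style dates (YYYY-MM-DD) are recognised here.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def fallback_date(article_text: str) -> Optional[date]:
    """Return the latest valid date mentioned in the text, or None if there is none."""
    found = []
    for year, month, day in ISO_DATE.findall(article_text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            # Skip strings that look like dates but are not valid (e.g. month 13).
            continue
    return max(found) if found else None
```

With this shape, callers still get None when the article mentions no dates at all, so the "None makes more sense" default is preserved.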

Hey I just saw this issue about extracting details from a PDF. Since we have the updated schema, I'm just wondering if we should pick this up again. I came...

Looks like the scraper is already using textract, which uses pdfminer, which is similar to PyPDF2. But they both seem to have difficulty extracting things like titles. Content (text) wise, there's...

I haven't found a tool that can handle this reasonably well...