Mark Feblowitz comments

Results: 13 comments of Mark Feblowitz

BERT SQuAD 2 fails on specific types of questions - Found New Info...

I was only able to make it go away by purging the apostrophes and/or the commas from the text.
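
For illustration, the workaround described above could look something like the following. This is only a sketch under assumptions: the helper name `purge_punctuation` is hypothetical, and the exact set of characters removed (straight and curly apostrophes plus commas) is a guess based on the comment, not the code actually used.

```python
import re

# Assumption: these are the characters to purge (straight apostrophe,
# curly apostrophe, comma). The original comment does not list them exactly.
_PURGE_TABLE = str.maketrans("", "", "'\u2019,")


def purge_punctuation(text: str) -> str:
    """Remove apostrophes and commas, then collapse runs of whitespace."""
    cleaned = text.translate(_PURGE_TABLE)
    return re.sub(r"\s+", " ", cleaned).strip()


if __name__ == "__main__":
    print(purge_punctuation("Mark's model, it seems, fails"))
```

Note that purging changes the text (possessives and contractions lose their apostrophes), so any answer spans returned by the model refer to the cleaned text rather than the original.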

BERT SQuAD 2 fails on specific types of questions - Found New Info...

OK - so it appears that there might be multiple reasons why this failure comes up: complex sentences, unhandled special characters in the text (including embedded apostrophes or commas, ......

BERT SQuAD 2 fails on specific types of questions - Found New Info...

Edits to the original problem statement were made after the comments above.

BERT SQuAD 2 fails on specific types of questions - Found New Info...

Question to @Anton-Velikodnyy RE #477: Does this fix address the situation above, or merely prevent a mid-processing crash? The latter is good, the former would also be good.

Grobid consistently drops characters, e.g., "fi", "ff"

Um, no. Sorry to have not been clear. Updating the description. I'm pulling the pdfs from the web and extracting from them. Thus, I have no control of the production...

Grobid consistently drops characters, e.g., "fi", "ff"

Interesting... The origin of the pdf document (linked above) was the product of saving that web page to a pdf file. The contents are (mostly) binary. And pdftotext indeed revealed...

Grobid consistently drops characters, e.g., "fi", "ff"

Now, if only there were a way to be alerted when the ligature substitution _might have occurred_, so excruciating manual examination of all processed documents would not be required...

Grobid consistently drops characters, e.g., "fi", "ff"

That's the rub. To know whether it has the characters, you'd need a good extraction to compare against. Or you'd need a comprehensive (huge) set of patterns to look for...
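
One way to approximate the "patterns to look for" idea is a rough heuristic: flag any token that is not a known word but becomes one when a common ligature is inserted. The sketch below is not part of Grobid. The function name `suspect_dropped_ligatures`, the ligature list, and the caller-supplied `vocabulary` are all assumptions made for illustration.

```python
# Assumption: these are the ligatures most likely to be dropped.
LIGATURES = ("ffi", "ffl", "ff", "fi", "fl")


def suspect_dropped_ligatures(tokens, vocabulary):
    """Return (token, candidate) pairs where inserting a ligature
    into an unknown token yields a word found in `vocabulary`."""
    suspects = []
    for token in tokens:
        word = token.lower()
        if word in vocabulary:
            continue  # known word, nothing to flag
        for pos in range(len(word) + 1):
            for lig in LIGATURES:
                candidate = word[:pos] + lig + word[pos:]
                if candidate in vocabulary and (token, candidate) not in suspects:
                    suspects.append((token, candidate))
    return suspects


if __name__ == "__main__":
    # Hypothetical toy vocabulary; a real check would need a large word list.
    vocab = {"the", "office", "find", "different"}
    print(suspect_dropped_ligatures(["the", "oce", "nd", "dierent"], vocab))
```

This only raises an alert that a substitution might have occurred. It misses cases where the damaged token is itself a valid word, and it will flag unknown words that merely resemble a known one, so it narrows manual review rather than replacing it.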

CONSTRUCT query fails.

Ok - I have one, and a comment about the error handling. First, the query. Submit this query to _http://live.dbpedia.org/sparql_ : ``` PREFIX rdfs: PREFIX : PREFIX d: PREFIX do:...

Docker "0c6d765c54" upgrade: image build problem: "Directory renamed before its status could be extracted"

I'm under a time crunch. I'll try the upgrade again when I get a chance and will send the diagnostics.