ExpatError: no element found: line 5399, column 292

Open · tsj83 opened this issue 5 years ago · 1 comment

Hello,

I've been able to parse well over 2,346 patent entries from a bulk XML file using xmltodict.parse(), so I'm a big fan, but I've found a few entries that I haven't been able to parse so far (sample attached).

Previously, I was able to fix the problem of skipped patents by calling unidecode(str(this_parsed_entry)), but this time it's something else. The error I get is: ExpatError: no element found: line 5399, column 292

I checked for unclosed tags, but that doesn't seem to be the cause.
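
For context, expat (the parser xmltodict uses under the hood) reports "no element found" when the input ends before the root element has been closed, which includes completely empty input. Here is a minimal sketch using only the standard library. `try_parse` and the sample strings are hypothetical, made up for illustration:

```python
from xml.parsers import expat


def try_parse(xml_text):
    """Return None if expat accepts xml_text, otherwise the error message."""
    parser = expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final one.
        parser.Parse(xml_text, True)
    except expat.ExpatError as err:
        return str(err)
    return None


complete = "<patent><title>Widget</title></patent>"
truncated = "<patent><title>Widget</title>"  # root element is never closed

print(try_parse(complete))   # no error expected
print(try_parse(truncated))  # expected to mention "no element found"
print(try_parse(""))         # empty input triggers the same kind of error
```

So an entry that looks fine tag by tag can still fail this way if the text handed to the parser stops early.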

Core pieces of code (be warned, not a computer scientist and I have a sense of humor):

    lets = ""
    this_entry = lets.join(patent_entry)  # patent_entry is a list
    this_parsed_entry = xmltodict.parse(this_entry, dict_constructor=dict)
    this_stringfied_parsed_entry = unidecode(str(this_parsed_entry))
    temp.write(this_stringfied_parsed_entry)
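
One way to narrow this down, sketched under assumptions: check each joined entry with expat directly and compare the reported error line with the number of lines in the entry. If the error sits at the very end of the input, the entry was probably cut off before its closing root tag, which is a different problem from a bad character in the middle. `describe_failure` is a hypothetical helper, not part of xmltodict, and `patent_entry` is assumed to be a list of strings as in the snippet above.

```python
from xml.parsers import expat


def describe_failure(patent_entry):
    """Join the entry's lines and parse them with expat.

    Returns None on success, otherwise a dict describing where parsing failed.
    """
    xml_text = "".join(patent_entry)
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except expat.ExpatError as err:
        total_lines = len(xml_text.splitlines())
        return {
            "message": str(err),
            "error_line": err.lineno,      # line where expat gave up
            "error_column": err.offset,    # column within that line
            "total_lines": total_lines,
            # True when the parser ran out of input instead of hitting
            # a malformed construct somewhere in the middle.
            "at_end_of_input": err.lineno >= total_lines,
        }
    return None


# Hypothetical entries for illustration only.
cut_off_entry = ["<patent>\n", "<title>Widget</title>\n"]
mismatched_entry = ["<patent>\n", "<title>Widget</name>\n", "</patent>\n"]

print(describe_failure(cut_off_entry))
print(describe_failure(mismatched_entry))
```

Calling this before xmltodict.parse() and logging the result for each failing entry would show whether all of the skipped patents end prematurely.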

The attached file is what I got from printing the first patent entry (saved as a .txt file), with file size = 0.

Big thanks in advance. All the best.

entry2347.txt

tsj83, Oct 15 '19 00:10

I'm facing the same error. Please post a solution if there is one. Thanks!

skwolvie, Dec 10 '20 17:12