
Missing SAX parsing logic for XML files in extract_wiki.py

Open priyankaforu opened this issue 2 months ago • 1 comment

Terms

Behavior

I am unable to parse the enwiki dump files because the SAX parsing logic for XML files is missing in extract_wiki.py.

I would like to complete the functions that trigger the callbacks for start and end elements (tags), so that the text can be extracted.
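
To make the request concrete, here is a minimal sketch of the kind of SAX callbacks I have in mind, using only Python's standard `xml.sax` module. The names `WikiXmlHandler` and `parse_dump` are hypothetical and are not taken from the current extract_wiki.py, and the sketch assumes the usual dump layout of `<page>` elements containing `<title>` and `<text>`:

```python
import io
import xml.sax


class WikiXmlHandler(xml.sax.ContentHandler):
    """Hypothetical handler: collects (title, text) pairs from <page> elements."""

    def __init__(self):
        super().__init__()
        self._buffer = []     # character chunks of the element being captured
        self._current = None  # name of the tag being captured, if any
        self._values = {}     # captured values for the page in progress
        self.pages = []       # finished (title, text) tuples

    def startElement(self, name, attrs):
        # Start buffering only for the tags we want to extract.
        if name in ("title", "text"):
            self._current = name
            self._buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate them.
        if self._current is not None:
            self._buffer.append(content)

    def endElement(self, name):
        # Closing a captured tag: store its joined text.
        if name == self._current:
            self._values[name] = "".join(self._buffer)
            self._current = None
        # Closing a page: emit the pair and reset for the next page.
        if name == "page":
            self.pages.append(
                (self._values.get("title", ""), self._values.get("text", ""))
            )
            self._values = {}


def parse_dump(stream):
    """Parse a file-like object holding dump XML and return the collected pages."""
    handler = WikiXmlHandler()
    xml.sax.parse(stream, handler)
    return handler.pages


if __name__ == "__main__":
    sample = (
        b"<mediawiki><page><title>Apple</title>"
        b"<revision><text>An apple is a fruit.</text></revision>"
        b"</page></mediawiki>"
    )
    print(parse_dump(io.BytesIO(sample)))
```

For a compressed dump, a file object from `bz2.open(path, "rb")` could be passed to `parse_dump` in place of the in-memory sample. Because the handler streams through the file, it would not need to hold the whole dump in memory, although this sketch still keeps every extracted page in a list.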

@axif0 @andrewtavis

Please let me know if I can work on this issue and move ahead with using Scribe-Data for autosuggestions.


priyankaforu avatar Oct 10 '25 08:10 priyankaforu

Thanks so much for the issue, @priyankaforu! Thanks also for assigning, @axif0 :) Let's discuss this in the call later!

andrewtavis avatar Oct 11 '25 09:10 andrewtavis