llama_index

Skip when trafilatura extraction failed

Open • ihfazhillah opened this issue 2 years ago • 1 comment

Sometimes an item in the list of URLs fails to fetch or parse, especially when the URLs come from a bulk URL generator such as serpapi. Instead of failing the whole trafilatura reader, we should skip that item and not add its result to the documents.

ihfazhillah, Jan 25 '23 14:01

Thanks! Could we add a param error_on_missing to __init__ that defaults to False? That way, users who want to can still explicitly fail on None (raise ValueError).

Good idea, will add that.

ihfazhillah, Jan 26 '23 08:01
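
For anyone reading this thread later, here is a minimal sketch of the behaviour discussed above: skip URLs whose fetch or extraction returns None by default, and raise ValueError only when error_on_missing is set. This is not the reader's actual code. The function name load_texts and its fetch/extract parameters are hypothetical, introduced only for illustration; in the real reader they would presumably be trafilatura.fetch_url and trafilatura.extract, which, as far as I know, both return None on failure. The stand-in functions at the bottom exist only so the sketch runs without network access.

```python
from typing import Callable, List, Optional


def load_texts(
    urls: List[str],
    fetch: Callable[[str], Optional[str]],
    extract: Callable[[str], Optional[str]],
    error_on_missing: bool = False,
) -> List[str]:
    """Return the extracted text for each URL.

    Hypothetical helper: `fetch` and `extract` stand in for
    trafilatura.fetch_url and trafilatura.extract (assumption).
    """
    texts = []
    for url in urls:
        downloaded = fetch(url)
        # Only try to extract when the download itself succeeded.
        text = extract(downloaded) if downloaded is not None else None
        if text is None:
            if error_on_missing:
                # Opt-in strict mode: fail loudly on the first bad URL.
                raise ValueError(f"Failed to fetch or extract text from {url}")
            # Default: skip this URL and keep going.
            continue
        texts.append(text)
    return texts


# Stand-ins for the real fetch/extract calls, so no network is needed.
pages = {"https://example.com/ok": "<p>hello</p>"}
fake_fetch = pages.get  # returns None for unknown URLs


def fake_extract(html: str) -> Optional[str]:
    # Strip the <p> tags; an empty result counts as a failed extraction.
    return html.replace("<p>", "").replace("</p>", "") or None


urls = ["https://example.com/ok", "https://example.com/missing"]
print(load_texts(urls, fake_fetch, fake_extract))  # ['hello']
```

With error_on_missing=True the same call raises ValueError on the second URL instead of skipping it, which keeps the strict behaviour available for users who want it.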