John Bumgarner

Results: 81 comments by John Bumgarner

@BenMake I'm working on releasing the beta code in the coming days. I need to squash a few more bugs before releasing. What sites are you trying to use Newspaper3k...

I have no direct affiliation with this project, and its owner has left it stagnant. Because the project is stagnant, it has LOTS of bugs, such as the...

Thanks for mentioning my usage document in this Issue. What sites give you a 403?

Paul, can you share an example of what you are trying to do?

So you want to search the Wayback Machine archives. I wrote an [example](https://github.com/johnbumgarner/newspaper3_usage_overview#extraction-from-wayback-machine-archives) of this in my overview document for Newspaper. If you provide me with some more details I will add...

Why are you setting `config.number_threads` to 1? The default is 10. Also take a look at the [threading section](https://github.com/johnbumgarner/newspaper3_usage_overview#newspaper-newspool-threading) in my `Newspaper3k` Overview Document.

If `memoize_articles` is not set to `False`, Newspaper will cache the articles' URLs and associated data in your system's temp directory. Here are some details on this cache in...

@AndyTheFactory Yes, I agree that @steeljardas was looking for a way to delete all the memoized articles. The document that I mentioned contains information on the cache's location.
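
For anyone who wants to clear that cache by hand, here is a minimal sketch using only the Python standard library. The helper name `clear_memoized_cache` is hypothetical, and the cache path shown (`.newspaper_scraper/memoized` under the system temp directory) is an assumption that should be checked against your installed version before you delete anything.

```python
import shutil
import tempfile
from pathlib import Path


def clear_memoized_cache(cache_dir: Path) -> bool:
    """Delete cache_dir and everything under it.

    Returns True if the directory existed and was removed,
    False if there was nothing to delete.
    """
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


# ASSUMPTION: the memoize cache lives in this folder under the system
# temp directory. Verify the location before running the deletion.
assumed_cache_dir = Path(tempfile.gettempdir()) / ".newspaper_scraper" / "memoized"

if __name__ == "__main__":
    removed = clear_memoized_cache(assumed_cache_dir)
    print(f"Cache removed: {removed} ({assumed_cache_dir})")
```

Passing the directory in as a parameter keeps the deletion explicit, so you can point the helper at whatever location your installation actually uses.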

@AndyTheFactory Thanks. I will reference your fork in my document. You reference that newspaper3k was last updated in September 2020. The correct date is September 2018. That is the date...

The library has lots of limitations, because the code base is old. You can parse the BBC site text with some additional code. Here is a document that I wrote...