Scrapegraph-ai
Scrapegraph-ai copied to clipboard
Local HTML cache
Is your feature request related to a problem? Please describe.
Implement a local HTML caching. To avoid re-process all scraping each time. It's notably long with recursive depths scrapings.
Describe the solution you'd like
Add a local cache near:
https://github.com/ScrapeGraphAI/Scrapegraph-ai/blob/aa6a76e5bdf63544f62786b0d17effa205aab3d8/scrapegraphai/nodes/fetch_node.py#L245
Describe alternatives you've considered
- Implement myself, not feasible as this is bundled in many graph implementations
Related
- #898
Hi, @clemlesne. I'm Dosu, and I'm helping the Scrapegraph-ai team manage their backlog. I'm marking this issue as stale.
Issue Summary:
- Proposal to add local HTML caching for performance optimization during recursive depth scrapings.
- Suggested implementation location in the code, noting complexity due to integration with various graph implementations.
- Linked to related issue #898, indicating broader context or dependency.
- No comments or activity since the issue was created.
Next Steps:
- Please let me know if this issue is still relevant to the latest version of the Scrapegraph-ai repository by commenting here.
- If there is no further activity, the issue will be automatically closed in 7 days.
Thank you for your understanding and contribution!
Still relevant
@PeriniM, the user has indicated that the proposal for adding local HTML caching for performance optimization is still relevant. Could you please assist them with this issue?