Fabric
Scrape url
What this Pull Request (PR) does
- added --scrape_url or -u CLI command to curl the content of a webpage in markdown form
- uses Jina AI
- no API key is needed
Screenshots
How to use it
The easiest way to use it is with this format: fabric --scrape_url {URL} | fabric --stream --pattern {fabric}. This scrapes the {URL}, converts it to markdown, and pipes the result into fabric.
Also, s.jina could be added. I've added it on my own path and it's easy to use.
Hi, thank you for the PR.
I see two problems:
- You are calling an external command here. It will not work on Windows, or anywhere curl is not installed. We would need a native Go implementation. Maybe there is a Go client library for the service, or we could use a REST API client library to call it.
    curlCommand := fmt.Sprintf("curl https://r.jina.ai/%s", url)
    fmt.Println("Executing command:", curlCommand) // Debug print
    if err := exec.Command("sh", "-c", curlCommand).Run(); err != nil {
        return "", fmt.Errorf("failed to run curl command: %w", err)
    }
- Fabric uses an OOP programming style, so we would need a struct/methods implementation. For example, see the YouTube implementation in youtube/youtube.go.
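To make the two review points concrete, here is a minimal sketch of what a native, struct-based version could look like using only the Go standard library. The names Client, NewClient, ReaderURL, and ScrapeURL are hypothetical and are not taken from the PR or from Fabric's code. The only detail carried over from the original is the https://r.jina.ai/ prefix used in the curl command.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// Client is a hypothetical wrapper around the reader endpoint.
// It replaces the shell-out to curl with net/http, so it does not
// depend on sh or curl being installed.
type Client struct {
	BaseURL    string       // endpoint prefix, taken from the curl command in the PR
	HTTPClient *http.Client // injectable for testing
}

// NewClient returns a Client with a default timeout (the 30s value is an assumption).
func NewClient() *Client {
	return &Client{
		BaseURL:    "https://r.jina.ai/",
		HTTPClient: &http.Client{Timeout: 30 * time.Second},
	}
}

// ReaderURL appends the target URL to the endpoint prefix,
// mirroring what the curl command did with fmt.Sprintf.
func (c *Client) ReaderURL(target string) string {
	return c.BaseURL + target
}

// ScrapeURL fetches the target page through the reader endpoint
// and returns the response body as a string.
func (c *Client) ScrapeURL(ctx context.Context, target string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.ReaderURL(target), nil)
	if err != nil {
		return "", fmt.Errorf("failed to build request: %w", err)
	}
	resp, err := c.HTTPClient.Do(req)
	if err != nil {
		return "", fmt.Errorf("failed to fetch %s: %w", target, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("failed to read response: %w", err)
	}
	return string(body), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: scrape <url>")
		return
	}
	content, err := NewClient().ScrapeURL(context.Background(), os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println(content)
}
```

Because the HTTP client is a field on the struct, it can be swapped for a test double, which is harder to do with an exec.Command call.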
I made a similar tool (https://github.com/johnconnor-sec/fabric/tree/jina-tool).
Maybe we all could work together to make it great... ?!
Holy crap this is incredible. And you did a video showing it!
One question / concern is how scalable / dependable jina ai will be as a solution, especially if tons of us are using this.
Also, how well it scrapes content that's resistant to scraping.
Thoughts?
Great points here. We need to make sure all functionality in the app works across all the platforms.
It's completely free with a rate limit, and if we add an API key, which is also free, we get millions of tokens without the rate limit.
@eugeis @danielmiessler @johnconnor-sec
I made an update video on the new commits that were pushed.
Some changes can be made before bringing this to production. I wanted to commit these as soon as possible to start the feedback loop.
Thanks for the great feedback -- this was a great excuse to learn Go!
@noamsiegel Thank you for the adaptations.
Did I understand correctly that the API key is optional?
If yes, we need to adapt here: ... client.ApiKey = client.AddSetupQuestion("API Key", true), ... should pass false instead of true.
Then check whether the key is empty and, if it is, don't send the Auth.. header.
@eugeis I just pushed the fix to make the Jina API key optional. If blank, it does not set the header token for the request.
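For readers following along, the optional-key behaviour described above can be sketched roughly as below. This is not the code from the PR. The function name newReaderRequest is hypothetical, and the Authorization header with a Bearer prefix is an assumption about how the service expects the key.

```go
package main

import (
	"fmt"
	"net/http"
)

// newReaderRequest is a hypothetical helper that builds a GET request
// for the reader endpoint. The Authorization header is only set when
// apiKey is non-empty, so unauthenticated use keeps working.
func newReaderRequest(target, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, "https://r.jina.ai/"+target, nil)
	if err != nil {
		return nil, fmt.Errorf("failed to build request: %w", err)
	}
	if apiKey != "" {
		// Assumption: the key is sent as a Bearer token.
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Build one request without a key and one with a placeholder key,
	// then report whether the header was attached. No network call is made.
	for _, key := range []string{"", "example-key"} {
		req, err := newReaderRequest("https://example.com", key)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		_, hasAuth := req.Header["Authorization"]
		fmt.Printf("key set: %t, Authorization header present: %t\n", key != "", hasAuth)
	}
}
```

Skipping the header entirely, rather than sending an empty token, avoids sending a malformed credential when no key is configured.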