Julia Turc
### System Info accelerate==0.28.0 diffusers==0.27.0 peft==0.9.0 transformers==4.38.2 Amazon EC2 instance (g5.2xlarge) Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.0.1 (Amazon Linux 2) ### Who can help? @pacman100 @younesbelkada ###...
### System Info ```shell The same problem manifests on both of these systems: System 1 --- - Amazon EC2 instance - Type: g5.2xlarge - Image: Deep Learning OSS Nvidia Driver...
See [relevant SO thread](https://stackoverflow.com/questions/12518499/pip-ignores-dependency-links-in-setup-py). Since dependency_links doesn't take effect, the users of this library have to manually `git clone https://github.com/openai/CLIP.git`. This burden keeps propagating to libraries that depend on this...
When setting `refiner=base_image_refiner`, the `refine_steps` argument promises to control how many denoising steps will be performed with the refiner: [image] However, in practice, this is only true if you additional...
When setting `refiner=base_image_refiner`, the `refine_steps` argument promises to control how many denoising steps will be performed with the refiner: [image: Screenshot 2024-05-09 at 11 39 35 AM] However, in practice, this...
I'm scraping https://huggingface.co/docs/transformers/model_doc/mistral Here's my code: ``` from firecrawl import FirecrawlApp app = FirecrawlApp(api_key=FIRECRAWL_API_KEY) def run_firecrawl_scrape(url: str): scrape_result = app.scrape_url(url, params={'formats': ['markdown']}) return scrape_result sample_url = "https://huggingface.co/docs/transformers/model_doc/mistral" sample_crawl = run_firecrawl_scrape(sample_url)...
Scraping https://azure.microsoft.com/en-us/solutions/hugging-face-on-azure/ I get:
```
url = "https://azure.microsoft.com/en-us/solutions/hugging-face-on-azure/"
scrape_result = app.scrape_url(url, params={"formats": ["markdown"]})
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
python3.12/site-packages/firecrawl/firecrawl.py", line 86, in scrape_url
    raise Exception(f'Failed to scrape URL. Error: {response["error"]}')
                                                    ~~~~~~~~^^^^^^^^^
KeyError: 'error'
```
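The traceback suggests that the error path itself fails: the client indexes `response["error"]` on a payload that has no such key, so the real failure reason is hidden behind a `KeyError`. Below is a minimal sketch of a more defensive pattern, assuming the response is a plain dict. `describe_scrape_failure` is a hypothetical helper for illustration, not part of the firecrawl library, and the payload shapes are assumptions.

```python
from typing import Any, Mapping


def describe_scrape_failure(response: Mapping[str, Any]) -> str:
    """Build an error message without assuming an "error" key exists.

    Hypothetical helper; the payload shape is an assumption.
    """
    # .get() avoids the KeyError seen in the traceback when "error" is absent.
    detail = response.get("error")
    if detail is None:
        # Fall back to the whole payload so the underlying cause is not lost.
        detail = f"no 'error' field in response: {dict(response)!r}"
    return f"Failed to scrape URL. Error: {detail}"


# Illustrative payloads only; real responses may look different.
print(describe_scrape_failure({"error": "timeout"}))
print(describe_scrape_failure({"success": False}))
```

With this pattern, a response that lacks `error` still produces a readable message instead of a secondary exception.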
When scraping https://huggingface.co/docs/transformers/main_classes/pipelines, I'm seeing a lot of back slashes: Firecrawl Markdown: ``` ### FillMaskPipeline\ \ ### classtransformers.FillMaskPipeline\ \ [](https://github.com/huggingface/transformers/blob/v4.21.2/src/transformers/pipelines/fill_mask.py#L34)\ \ (model: typing.Union\[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')\]tokenizer: typing.Optional\[transformers.tokenization\_utils.PreTrainedTokenizer\] = Nonefeature\_extractor: typing.Optional\[ForwardRef('SequenceFeatureExtractor')\] = Nonemodelcard:...
**Problem Description** Firecrawl is very useful for scraping API documentation pages. However, it seems to drop Swagger tables, which contain the bulk of the information (endpoints, parameters, etc.) **Proposed Feature**...