docs-scraper

Scrape documentation into Meilisearch

Results 40 docs-scraper issues

Bumps [meilisearch](https://github.com/meilisearch/meilisearch-python) from 0.28.2 to 0.28.4. Release notes Sourced from meilisearch's releases. v0.28.4 🐍 🚀 Enhancements Support user dictionary loading (#870) @ellnix Support text separator customization (#871) @ellnix Maintenance Add...

skip-changelog
dependencies

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 4 to 5. Release notes Sourced from docker/build-push-action's releases. v5.0.0 Node 20 as default runtime (requires Actions Runner v2.308.0 or later) by @crazy-max in docker/build-push-action#954 Bump @actions/core...

skip-changelog
dependencies

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 2 to 3. Release notes Sourced from docker/setup-buildx-action's releases. v3.0.0 Node 20 as default runtime (requires Actions Runner v2.308.0 or later) by @crazy-max in docker/setup-buildx-action#264 Bump @actions/core...

skip-changelog
dependencies

Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 2 to 3. Release notes Sourced from docker/setup-qemu-action's releases. v3.0.0 Node 20 as default runtime (requires Actions Runner v2.308.0 or later) by @crazy-max in docker/setup-qemu-action#102 Bump @actions/core...

skip-changelog
dependencies

Bumps [docker/login-action](https://github.com/docker/login-action) from 2 to 3. Release notes Sourced from docker/login-action's releases. v3.0.0 Node 20 as default runtime (requires Actions Runner v2.308.0 or later) by @crazy-max in docker/login-action#593 Bump @actions/core...

skip-changelog
dependencies

Bumps [scrapy](https://github.com/scrapy/scrapy) from 2.10.1 to 2.11.0. Release notes Sourced from scrapy's releases. 2.11.0 Spiders can now modify settings in their from_crawler methods, e.g. based on spider arguments. Periodic logging of...

skip-changelog
dependencies

**Description** There is a problem with null byte characters being inserted in HTML pages created with Docusaurus when the language is cjk. Of course, the issue mentioned is also registered...

bug

**Description** > TL;DR: This scraper used to support Keycloak SSO, but now it no longer does (unless you're running a very old Keycloak). Support for Keycloak SSO was added in...

help wanted

**Description** With meilisearch:v1.15.2, doc scraping fails with the error: Docs-Scraper: https://gitdocs.test.com/gitdocs/chaoseng/prod/test-report.html 72 records) JSONDecodeError while adding documents for https://gitdocs.test.com/gitdocs/testone/prod/tech-acceptance.html: Expecting value: line 1 column 1 (char 0) First record: {'anchor':...

# Pull Request

## Related issue

Fixes #105

## What does this PR do?

- url supports relative paths
- In the config file, use "relative_url": true
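
The PR text does not show how relative paths are resolved, so the following is only a minimal sketch of one way a `"relative_url": true` setting could be honored. The helper name `resolve_url`, the `base_url` argument, and the example URLs are all hypothetical and not taken from the PR; only the `relative_url` key comes from the description above.

```python
from urllib.parse import urljoin


def resolve_url(base_url: str, url: str, relative_url: bool = False) -> str:
    """Hypothetical helper (not from the PR): turn a relative path into an
    absolute URL when the config opts in via "relative_url"."""
    if relative_url:
        # urljoin follows standard URL resolution rules: a path without a
        # leading slash is resolved against the base directory, while a
        # leading slash resolves against the site root.
        return urljoin(base_url, url)
    # Without the flag, the URL is passed through unchanged.
    return url


# Assumed config fragment; only "relative_url" appears in the PR description.
config = {"relative_url": True}

base = "https://example.com/docs/"
print(resolve_url(base, "guide/intro.html", config["relative_url"]))
print(resolve_url(base, "/api/index.html", config["relative_url"]))
```

With the flag off, the helper returns its input untouched, so existing configs that list absolute URLs would behave as before under this assumption.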