web-miner

Crawls sites to find new content and scrape it

10 web-miner issues

Bumps [py](https://github.com/pytest-dev/py) from 1.6.0 to 1.10.0. Changelog Sourced from py's changelog. 1.10.0 (2020-12-12) Fix a regular expression DoS vulnerability in the py.path.svnwc SVN blame functionality (CVE-2020-29651) Update vendored apipkg: 1.4...

dependencies

Bumps [pygments](https://github.com/pygments/pygments) from 2.2.0 to 2.7.4. Release notes Sourced from pygments's releases. 2.7.4 Updated lexers: Apache configurations: Improve handling of malformed tags (#1656) CSS: Add support for variables (#1633, #1666)...

dependencies

Bumps [jinja2](https://github.com/pallets/jinja) from 2.10.1 to 2.11.3. Release notes Sourced from jinja2's releases. 2.11.3 This contains a fix for a speed issue with the urlize filter. urlize is likely to be...

dependencies

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 0.14.1 to 0.15.3. Release notes Sourced from [werkzeug's releases](https://github.com/pallets/werkzeug/releases). 0.15.3 Blog: https://palletsprojects.com/blog/werkzeug-0-15-3-released/ Changes: https://werkzeug.palletsprojects.com/en/0.15.x/changes/#version-0-15-3 0.15.2 ...

dependencies

Currently the only use case is to `request arxiv` data, which is actually not 100% correct, because 1. the main use case of this repository is to provide the parsed...

enhancement :rocket:
good first issue :new:
Hacktoberfest

Arxiv.org updates the papers once a day. Multiple requests in a short period of time waste the arXiv server's resources by re-querying the same data. A solution for this would be daily-caching of...

enhancement :rocket:
help wanted :hand:
good first issue :new:
Hacktoberfest
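
One way the daily caching described in this issue could look is sketched below. It is only an illustration: what exactly should be cached is truncated in the issue text, so the sketch assumes it is the result of the arXiv query. `DailyCache`, `fetch`, and `today` are hypothetical names, not part of the web-miner code base.

```python
import datetime as dt
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class DailyCache(Generic[T]):
    """Hypothetical helper: keeps the result of `fetch` for one calendar day."""

    def __init__(
        self,
        fetch: Callable[[], T],
        today: Callable[[], dt.date] = dt.date.today,
    ) -> None:
        self._fetch = fetch          # assumed: the function that queries arXiv
        self._today = today          # injectable clock, which makes testing easy
        self._cached_on: Optional[dt.date] = None
        self._value: Optional[T] = None

    def get(self) -> T:
        today = self._today()
        # Re-query only when nothing is cached yet or the date has changed.
        if self._cached_on != today:
            self._value = self._fetch()
            self._cached_on = today
        return self._value
```

With this shape, repeated calls to `get()` on the same day reuse the stored value, and the first call on a new day triggers a single fresh query. The cache lives in memory only, so it would be lost on restart; persisting it is left open here.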

**Describe the bug** The `MAINTAINER` instruction in the Dockerfile is deprecated. See the [official documentation](https://docs.docker.com/engine/reference/builder/). **Expected behavior** `MAINTAINER` shouldn't be used.

bug :bug:
dev ops / ci :robot:
Hacktoberfest

To ensure that commits reference the issues they address, all commits shall have the format `# `. For example, see [this commit](https://github.com/Keep-Current/web-miner/commit/7d331aa8151958a8a783d42b20fa84f2b6f230a7). ![image](https://user-images.githubusercontent.com/22077628/43955476-b36763a0-9ca0-11e8-91b7-c05f8ef4e0fc.png)

improve open source :free:
Hacktoberfest

[See file here](https://github.com/Keep-Current/web-miner/blob/master/webminer/external_interfaces/flask_server/rest/arxiv_document.py). See the discussion on this issue [on Slack](https://keep-current.slack.com/archives/CAU31TV8F/p1533830565000519).

Right now the Response is JSON dependent (end of the file):

```python
return Response(
    json.dumps(response.value, cls=ser.ArxivDocEncoder),
    mimetype="application/json",
    status=STATUS_CODES[response.type],
)
```
...

enhancement :rocket:
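
The issue's own proposal is truncated, so the following is only one possible direction for decoupling the response from JSON: look up the encoder by mimetype instead of hard-coding `json.dumps`. `SERIALIZERS` and `serialize` are hypothetical names introduced for illustration; a real version would presumably pass `cls=ser.ArxivDocEncoder` to `json.dumps` as the existing code does.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry: mimetype -> function that turns a value into a body string.
SERIALIZERS: Dict[str, Callable[[Any], str]] = {
    "application/json": lambda value: json.dumps(value),
    "text/plain": lambda value: str(value),
}


def serialize(value: Any, mimetype: str = "application/json") -> Tuple[str, str]:
    """Return (body, mimetype) for the requested format, or raise ValueError."""
    try:
        encode = SERIALIZERS[mimetype]
    except KeyError:
        raise ValueError(f"unsupported mimetype: {mimetype}") from None
    return encode(value), mimetype
```

The route could then build its `Response` from the returned body and mimetype, keeping the `status=STATUS_CODES[response.type]` argument unchanged, and supporting a new format would mean adding one registry entry.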

Add a pre-push hook to check for:
- linting
- formatting
- tests

Yes, this is done in Travis, but it's good to have immediate feedback before pushing bad code.

enhancement :rocket:
help wanted :hand:
dev ops / ci :robot:
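
A minimal sketch of such a pre-push hook, written in Python since Git hooks can be any executable. The issue does not name specific tools, so `pylint`, `black`, and `pytest` are assumptions, as are the names `CHECKS`, `failed_checks`, and `main`; the `webminer` package name is taken from the repository path.

```python
import subprocess
import sys
from typing import Callable, List, Sequence, Tuple

# Assumed tool choices; the issue only asks for linting, formatting, and tests.
CHECKS: List[Tuple[str, List[str]]] = [
    ("linting", ["pylint", "webminer"]),
    ("formatting", ["black", "--check", "."]),
    ("tests", ["pytest"]),
]


def failed_checks(
    checks: Sequence[Tuple[str, List[str]]],
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> List[str]:
    """Run every check and return the names of those that exited non-zero."""
    return [name for name, command in checks if run(command) != 0]


def main() -> int:
    failures = failed_checks(CHECKS)
    for name in failures:
        print(f"pre-push check failed: {name}", file=sys.stderr)
    # Git aborts the push when the hook exits with a non-zero status.
    return 1 if failures else 0
```

To try it, the file would be saved as `.git/hooks/pre-push` with a Python shebang line, made executable, and ended with `raise SystemExit(main())`. Note that `subprocess.call` raises `FileNotFoundError` if one of the assumed tools is not installed.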