Improve alternative index handling.
This improve the handling of indexes and error messages.
Before this PR all errors reaching an index would be ValueError, thus not finding a wheel was the same as the index page having a non-handled content type, or even failing to parse the html because of a typo.
This made it hard to debug, or even realize that an index url was wrong, or that the pages served by the index were incorrect.
I thus introduce 3 New Errors that do not inherit from ValueError:
- IndexMetadataFetchError
- UnsupportedParserContentTypeError
- WheelNotFoundError
The first two of which trigger real error, as they likely suggest a problem with the index or the user config, and should not be ignored.
In addition the verbose= option was unconditionally setting the log level. I think this is improper as if the logging level is set somewhere else (to DEBUG, or something else), then it is overwritten.
- This then adds the option to pass
verbose=None, (and make it the default), in which case it will not change the default.
python -m http.server will serve ContentType with ; charset=utf-8, which is not recognized.
-
This now handle the case where the index ContentType header contains parameters for example
; charset=utf-8, by discarding everything after the semicolon;; this is not proper parsing, but should be sufficient here. -
I found that more debug logging would be useful and added a number of debug logs calls
With this I appear to get proper error message and debug statements when trying to work on the Cors proxy.
Should close #123, and #121, and help with pyodide/pyodide#4898
There are still a few case that needs handling, in particular when fetching for a given wheel and the page does not exists, is it a problem with the index, or does the package really not exists.
I think I might have to tweak the behavior a bit more.
Ok, one of the issues is that in particular PYPI does not set CORS on 404, therefore we get an OS error when trying to fetch a non-existing page.
I'm wondering if this is a bug upstream
Thanks @Carreau.
PYPI does not set CORS on 404
This does seem to be a warehouse bug.
Thanks @Carreau.
PYPI does not set CORS on 404
This does seem to be a warehouse bug.
Yep and it's known https://github.com/pypi/warehouse/issues/14229
Yeah, I'm looking at understanding the failing tests.
I'd like to figure out how to both get proper error messages if the index is invalid, and just keep going on 404, but because of PyPI not setting CORS on 404 I don't know if it's currently possible.
So I'm wondering if i I should bake a an boolean for indexes no_cors_on_404 or similar. Otherwise we'll get back to original behavior of swallowing almost every errors (including parse error). I'm not even sure the case where an index return a real 404 with proper cors is properly handled, and does not raise a parse error as query_package only error handing is OSError and does not check response status code, or does not use raises_for_status.
(I also have limited time, so might not touch this that much by next week).
We should look into fixing the warehouse bug. I tried a little bit but I didn't get the warehouse test suite running.
Yep, +1 for fixing the warehouse bug. Let me see if I can understand and modify the warehouse codebase.
Also https://github.com/pyodide/pyodide/pull/4974 will be relevant for later refactor.
crosslinking https://github.com/pypi/warehouse/pull/16339 as well.
So should I assume that we'll always get micropip on the latest pyodide, or can we have conditional that depends on versions ?
If there are some changes in Pyodide and if you want to use those features in the micropip, you are welcome to do so.
We try to keep the compatibility of micropip and Pyodide versions if possible, but it is not mandatory.
I'm going to close for now, there are too many conflicts, but I may coma back to it later.