Python-to-Rust conversion of an integer raises `OverflowError` instead of `ValueError` when the value is out of bounds
Bug Description
Currently, a PyO3 function that takes an integer raises an OverflowError when called from Python with an out-of-bounds value.
According to the Python documentation, using OverflowError for out-of-range integers seems like an odd choice:
Raised when the result of an arithmetic operation is too large to be represented. This cannot occur for integers (which would rather raise MemoryError than give up).
However, for historical reasons, OverflowError is sometimes raised for integers that are outside a required range. Because of the lack of standardization of floating-point exception handling in C, most floating-point operations are not checked.
On the other hand, it is clearly stated that ValueError is the way to go when passing an invalid value of the right type (the current issue is literally the given example 😄 ):
Passing arguments of the wrong type (e.g. passing a list when an int is expected) should result in a TypeError, but passing arguments with the wrong value (e.g. a number outside expected boundaries) should result in a ValueError.
In practice this is a footgun because 1. the PyO3 documentation doesn't mention this behavior and 2. ValueError is the de facto standard for this kind of error.
This creates hidden bugs when types implemented in PyO3 are then used in a validation framework, typically Pydantic.
Currently, avoiding this issue means either manually handling the PyInt conversion in the Rust code or manually handling OverflowError in the Python code. In both cases this is error-prone, since forgetting to do it has no impact on the happy case...
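The Python-side workaround mentioned above could look roughly like the decorator below. This is only a minimal sketch and not part of PyO3: `overflow_as_value_error` is a hypothetical helper name, and `double_u64` is a pure-Python stand-in that mimics the out-of-range behaviour of a `u64`-taking extension function so that the example runs without a compiled module.

```python
import functools

U64_MAX = 2**64 - 1


def overflow_as_value_error(func):
    """Hypothetical helper: re-raise OverflowError from `func` as ValueError."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except OverflowError as exc:
            # Keep the original exception as the cause for debugging.
            raise ValueError(str(exc)) from exc

    return wrapper


def double_u64(x):
    """Stand-in for a PyO3 function taking a u64 (assumption, not real PyO3 code)."""
    if not 0 <= x <= U64_MAX:
        raise OverflowError("out of range integral type conversion attempted")
    return x * 2


safe_double = overflow_as_value_error(double_u64)
```

Note that such a wrapper also converts any OverflowError raised inside the function body, not only the ones coming from argument conversion, and it has to be applied to every exposed function, which is exactly the "easy to forget" problem described above.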
Steps to Reproduce
use pyo3::prelude::*;

// Takes a u64, so a Python int outside 0..=u64::MAX fails during argument conversion.
#[pyfunction]
fn foo(x: u64) -> u64 {
    x * 2
}
>>> import foomodule
>>> foomodule.foo(-1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: can't convert negative int to unsigned
Backtrace
Your operating system and version
Pop!_OS 22.04 LTS
Your Python version (python --version)
Python 3.12
Your Rust version (rustc --version)
rustc 1.85.0 (4d91de4e4 2025-02-17)
Your PyO3 version
0.22.6 (but the code responsible for this is present in master)
How did you install python? Did you use a virtualenv?
poetry
Additional Info
No response
Thanks for the report, this is a tough one to judge. I think we are mostly correct to continue to use OverflowError here, as even new CPython APIs added in 3.14 are continuing to use it: https://github.com/python/cpython/issues/120389
That said, those APIs will use ValueError for negative values passed to unsigned types. We could probably check our conversions and ensure we do the same for that case (and maybe also for 0 passed to nonzero types).
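To make the proposed split concrete, here is a minimal Python sketch of the rule described above: ValueError for a negative value (or zero for a nonzero type), OverflowError only for a value that is too large. The function names are hypothetical and this only illustrates the policy; it is not how PyO3 or CPython implement their conversions.

```python
U64_MAX = 2**64 - 1


def convert_to_u64(value):
    """Hypothetical conversion following the policy discussed above."""
    if not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    if value < 0:
        # Logically invalid for an unsigned type: ValueError.
        raise ValueError("negative value cannot be converted to an unsigned integer")
    if value > U64_MAX:
        # Valid in principle, but does not fit in 64 bits: OverflowError.
        raise OverflowError("int too big to convert to u64")
    return value


def convert_to_nonzero_u64(value):
    """Same policy, with zero also rejected as a ValueError."""
    converted = convert_to_u64(value)
    if converted == 0:
        raise ValueError("zero is not allowed for a nonzero integer type")
    return converted
```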
as even new CPython APIs added in 3.14 are continuing to use it: https://github.com/python/cpython/issues/120389
I think the situation is not comparable with PyO3:
- CPython already uses OverflowError, so consistency may be the reason why it is used here (especially since the CPython documentation considers this use to be "for historical reasons")
- This code concerns a few APIs that are clearly documented.
The last point is crucial: with PyO3, on the other hand, we are talking about a surprising behavior that occurs every time somebody implements an API that takes an integer parameter, so the burden of documentation is passed to every person implementing such an API. As I said in my original post, this behavior is totally invisible in the happy case, so there is a very high chance that nobody will bother with it (or even notice it!)...
Is there an actual benefit to raising an OverflowError instead of a ValueError?
We are built atop those same CPython APIs which raise OverflowError, which is why the current behaviour exists.
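The same behaviour can be observed from pure Python, without any extension module. The sketch below uses `int.to_bytes` only as a readily available example of a CPython integer conversion with a fixed width; it is not one of the C APIs PyO3 calls, and `describe` is a hypothetical helper.

```python
def describe(thunk):
    """Return the name of the exception raised by `thunk`, or "ok"."""
    try:
        thunk()
    except Exception as exc:
        return type(exc).__name__
    return "ok"


# Unsigned 8-byte conversion: both kinds of out-of-range input raise OverflowError.
negative = describe(lambda: (-1).to_bytes(8, "big"))
too_big = describe(lambda: (2**64).to_bytes(8, "big"))
in_range = describe(lambda: (2**64 - 1).to_bytes(8, "big"))

print(negative, too_big, in_range)  # OverflowError OverflowError ok
```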
If there was a strong reason to make the change, we could catch the OverflowError exceptions and raise ValueErrors instead, however this comes at a complexity and performance cost. This would also be a silent breaking change for existing code depending on us throwing OverflowError.
While I agree with you that it would probably be nicer if the integer conversions only raised ValueError, I'm not yet sufficiently convinced that the existing paths which raise OverflowError are bad enough that we should consider changing things here.
If the point is that documentation in CPython makes it clear where OverflowError might be raised, we can add documentation to PyO3 to detail the conversions.
We are built atop those same CPython APIs which raise OverflowError, which is why the current behaviour exists.
Regarding the use of OverflowError in CPython, https://bugs.python.org/issue29833 suggests that it is only meant for niche cases where hardware limits leak through the abstraction:
Author: Guido van Rossum (gvanrossum) If I had to do it over again I would have used OverflowError only for some very narrowly defined conditions and ValueError for "logical" range limitations. In particular OverflowError suggests that the abstraction is slightly broken (since we usually don't think much about how large an integer fits in a register) while ValueError suggests that the caller passed something of the right type but with an inappropriate value.
and is not expected to be caught by end-users:
Author: STINNER Victor (vstinner) I don't expect that any code rely on OverflowError. I don't remember any code catching explicitly this exception.
As MemoryError, it's not common to catch these exceptions.
we could catch the OverflowError exceptions and raise ValueErrors instead, however this comes at a complexity and performance cost.
I was under the impression that changing this code was enough to correct the behavior: https://github.com/PyO3/pyo3/blob/b050c874ec87f496a5e5daef97104d5d3d97cf3c/src/conversions/std/num.rs#L48-L52
I don't know much about the PyO3 codebase, so I guess there is more under the surface: typically, changing this code would fix the issue for some types but not all (for instance, floats may be handled by different code that itself relies on CPython APIs, hence raising the OverflowError). Do you agree?
While I agree with you that it would probably be nicer if the integer conversions would only raise ValueError
The perfect solution would be that OverflowError inherits from ValueError, but that ship has sailed loooong ago 😭
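For reference, the built-in hierarchy places OverflowError under ArithmeticError, not ValueError, which is why an `except ValueError` clause does not catch it. A small check (the `classify` helper is hypothetical):

```python
def classify(exc):
    """Report which except clause ends up handling `exc`."""
    try:
        raise exc
    except ValueError:
        return "handled as ValueError"
    except ArithmeticError:
        return "handled as ArithmeticError"


print(issubclass(OverflowError, ArithmeticError))  # True
print(issubclass(OverflowError, ValueError))       # False
print(classify(OverflowError("too big")))          # handled as ArithmeticError
print(classify(ValueError("bad value")))           # handled as ValueError
```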
I'm not yet sufficiently convinced that the existing paths which raise OverflowError is bad enough that we should consider changing things here.
I did a crawl across the flagship projects using PyO3 to see how this behavior impacts different projects; see the "Investigation: is OverflowError used in library using PyO3" section at the bottom of this post.
TL;DR: Raising an OverflowError for numeric conversion doesn't impact in a significant way those libraries.
However, keep in mind this is obviously not the full picture, because all those projects are libraries and no end projects are included. I would say end projects are much more sensitive to this issue because they implement business logic by combining multiple libraries, hence two libraries with seemingly okay behaviors can lead to a bug when put together.
Typically, I came across this issue by implementing a business logic type with PyO3 that is used in a validation lib (Pydantic) within an HTTP route (FastAPI). In this case, the OverflowError breaks the validation layer and leads to HTTP 500 errors when passing the wrong data 😭
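The failure mode can be simulated without any of the libraries involved. The sketch below is a deliberately simplified stand-in, not Pydantic's or FastAPI's actual code: it assumes a validation layer that treats ValueError as "invalid input" (reported as a 422) and lets every other exception surface as a server error (500). All function names are hypothetical.

```python
U64_MAX = 2**64 - 1


def parse_with_overflow_error(raw):
    """Stand-in for a PyO3-backed type: out-of-range input raises OverflowError."""
    if not 0 <= raw <= U64_MAX:
        raise OverflowError("out of range integral type conversion attempted")
    return raw


def parse_with_value_error(raw):
    """Same parser, but out-of-range input is reported as ValueError."""
    try:
        return parse_with_overflow_error(raw)
    except OverflowError as exc:
        raise ValueError(str(exc)) from exc


def handle_request(parser, raw):
    """Toy request handler returning an HTTP-like status code."""
    try:
        parser(raw)
    except ValueError:
        return 422  # assumed: the validation layer reports invalid input
    except Exception:
        return 500  # anything else looks like a server-side bug
    return 200


print(handle_request(parse_with_overflow_error, -1))  # 500
print(handle_request(parse_with_value_error, -1))     # 422
print(handle_request(parse_with_overflow_error, 1))   # 200
```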
we can add documentation to PyO3 to detail the conversions.
Regardless of whether this behavior is here to stay, I agree that documenting the exceptions would be most welcome 🙏
Currently the argument type conversion page doesn't mention anything about the exceptions raised.
I guess the most obvious place to add it is in Using Rust library types vs Python-native types (https://pyo3.rs/v0.25.1/conversions/tables.html#using-rust-library-types-vs-python-native-types)
However, I also wonder about adding a column to the argument-types table: typically a `python-to-rust conversion exceptions` column with `n/a` (for the `PyAny` type), `TypeError only` (e.g. `PyBytes`), `TypeError, OverflowError` (for integer-based types), etc. We could also add footnotes to indicate in which case each error is raised.
Investigation: is OverflowError used in library using PyO3
Impacted (i.e. OverflowError is raised for numeric conversion error)
robyn
>>> import robyn
>>> robyn.Response(status_code=-1, headers={}, description=b'')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: out of range integral type conversion attempted
In practice the response status is always built from constants and no arithmetic is done,
so raising an OverflowError is no big deal.
arro3
>>> from arro3.core import *
>>> Array([-1], DataType.uint16())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: out of range integral type conversion attempted
>>> Array([2**64], DataType.int64())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long
blake3_py
>>> import blake3
>>> blake3.blake3(b"foobarbaz").hexdigest(length=-1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: can't convert negative int to unsigned
>>> blake3.blake3(b"foobarbaz").hexdigest(length=2**64)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: int too big to convert
Since the parameter corresponds to a length, ValueError would be expected for a negative value, as opposed to e.g. length=2**64, which is in theory a valid value but is unusable because of technical constraints (in this case OverflowError is arguably acceptable according to CPython behavior).
Interestingly enough, OverflowError is also used by this library for further bound checking (my guess is this has been done to stay consistent with the exception raised by PyO3)
#[pyo3(signature=(length=32, *, seek=0))]
fn hexdigest<'p>(
    &self,
    py: Python<'p>,
    length: usize,
    seek: u64,
) -> PyResult<Bound<'p, PyString>> {
    if length > (isize::max_value() / 2) as usize {
        return Err(PyOverflowError::new_err("length overflows isize"));
    }
Not impacted
cryptography
Interestingly enough, cryptography converts an OverflowError (raised by CPython conversion) into ValueError:
def _dynamic_truncate(self, counter: int) -> int:
    ctx = hmac.HMAC(self._key, self._algorithm)
    try:
        ctx.update(counter.to_bytes(length=8, byteorder="big"))
    except OverflowError:
        raise ValueError(f"Counter must be between 0 and {2**64 - 1}.")
orjson
>>> import orjson
>>> orjson.dumps({'a':2**64})
OverflowError: int too big to convert
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Integer exceeds 64-bit range
ormsgpack
>>> import ormsgpack
>>> ormsgpack.packb({'a':2**64})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Integer exceeds 64-bit range
jsonschema
>>> import jsonschema_rs
>>> jsonschema_rs.validate({"type": "integer"}, 2**64)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: int too big to convert
Not impacted (no function implemented in Rust with integer conversion from Python)
- fastuuid
- granian
- pydantic_core
- pytauri
- primp
Not investigated
- bed_reader
- cellular_raza
- connector_x
- css_inline
- datafusion_python
- deltalake_python
- fastbloom
- feos
- forust
- geo_index
- haem
- html2text_rs
- html_py_ever
- inline_python
- johnnycanencrypt
- mocpy
- obstore
- opendal
- polars
- pycrdt
- rateslib
- river
- rust_python_coverage
- rnet
- sail
- tiktoken
- tokenizers
- tzfpy
- utiles
Thank you for the thorough analysis.
Typically, I came across this issue by implementing a business logic type with PyO3 that is used in a validation lib (Pydantic) within an HTTP route (FastAPI). In this case, the `OverflowError` breaks the validation layer and leads to HTTP 500 errors when passing the wrong data 😭
Ouch. I agree that's broken, and it's probably a bug in Pydantic that that's not handled. Whether the root cause is that we should fix it here... 🤔
However I also wonder about adding a column in the argument-types array: typically a `python-to-rust conversion exceptions` with `n/a` (for the `PyAny` type), `TypeError only` (e.g. `PyBytes`), `TypeError, OverflowError` (so for integer based types) etc.. We could also add footnotes to indicate in which case each error is raised.
I completely agree with adding this column 👍
I was under the impression that changing this code was enough to correct the behavior:
Unfortunately it's a bit more complicated than that: we currently use CPython APIs like https://docs.python.org/3/c-api/long.html#c.PyLong_AsLong which raise OverflowError. We would have to either rework the conversions to avoid these APIs or, where unavoidable, wrap them to catch the OverflowError and raise ValueError instead.
cc @Icxolu - I notice that when we introduce FromPyObject::Error we could potentially optimize these conversions by using methods like PyLong_AsLongLongAndOverflow which sets a flag rather than creating a Python exception. (I think we'd need to allow for the possibility of a type error too, but at least in the case of out-of-range maybe we can avoid the cost of a Python exception object.)
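As a rough illustration of the flag-based approach, `PyLong_AsLongLongAndOverflow` can be called from Python through `ctypes`. This is only a sketch under assumptions: it requires CPython with `ctypes.pythonapi` available, `to_i64` is a hypothetical helper, and PyO3 itself would call the C API directly from Rust. Mapping the flag to ValueError is the behaviour under discussion here, not what PyO3 does today.

```python
import ctypes

# C signature: long long PyLong_AsLongLongAndOverflow(PyObject *obj, int *overflow)
_as_long_long = ctypes.pythonapi.PyLong_AsLongLongAndOverflow
_as_long_long.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_int)]
_as_long_long.restype = ctypes.c_longlong


def to_i64(value):
    """Hypothetical helper: convert to a C long long, reporting range errors via a flag."""
    if not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    overflow = ctypes.c_int(0)
    result = _as_long_long(value, ctypes.byref(overflow))
    if overflow.value != 0:
        # The flag is 1 (too large) or -1 (too small); no Python exception
        # object was created by the C API on this path.
        raise ValueError("integer out of range for a 64-bit signed integer")
    return result
```

The point of the design is that the out-of-range case is detected by inspecting an integer flag, so the caller is free to choose which exception (if any) to construct.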
Given that I'm warming to the idea that the current behaviour is annoying, maybe this could fall in scope of the 0.27 release with the general rework to FromPyObject? Definitely if we did make changes to behaviour here, it would be nice to slot them into a release which is themed around rethinking the from-Python conversions. 🤔
cc @Icxolu - I notice that when we introduce `FromPyObject::Error` we could potentially optimize these conversions by using methods like `PyLong_AsLongLongAndOverflow` which sets a flag rather than creating a Python exception. (I think we'd need to allow for the possibility of a type error too, but at least in the case of out-of-range maybe we can avoid the cost of a Python exception object.)
Yeah, this should be possible. We would need to think about what the error type should be, given that it can be "overflow or exception". I think this is something that occurs in other places as well, for example "downcast or exception", which I believe I saw, and maybe others too. Maybe we can design something more general, a "common error or exception" type, that can be used for all (or at least most) of these cases. It would definitely make sense to try to do something like this together with the FromPyObject rework.