pyo3 icon indicating copy to clipboard operation
pyo3 copied to clipboard

Add a proc macro for exceptions

Open konstin opened this issue 5 years ago • 11 comments

I would be great to have a custom derive for exceptions, similar but more powerful than create_exception. This would e.g. allow reimplementing JSONDecodeError:

#[pyexception(module = json)]
struct JSONDecodeError {
    msg: String,
    doc: String,
    pos: usize,
    lineno: Option<usize>,
    colno: Option<usize>,
}

The steps required are roughly:

  1. Look at examples in syn and pyo3's existing proc macros to learn how to implement a proc macro. wasm-bindgen could be another useful resource.
  2. Implement the macro using the existing impl_exception_boilerplate and create_exception_type_object macros. The #[proc_macro_attribute] function should go in pyo3cls/src/lib.rs, while the actual implementation should be in pyo3-derive-backend.
  3. Write tests. Most important is one for a struct without members and one with at least 2 members.
  4. Add a usage example to the guide. You can just copy one of the tests for that.

If you want to work on implementing this or have any questions, feel free ask here or on gitter!

konstin avatar Dec 01 '18 12:12 konstin

I'd like to try my hand at this. Would that be fine?

If so, I'm not super familiar with the details here, so would this issue be a good place to ask questions?

pradyunsg avatar Dec 03 '18 21:12 pradyunsg

Thanks, please don't hesitate to ask us!

kngwyu avatar Dec 04 '18 03:12 kngwyu

Yes, just ask here!

konstin avatar Dec 04 '18 21:12 konstin

I'm nominating this for 0.12 - I actually already have a PR almost ready for #682 which would lay the groundwork for this.

davidhewitt avatar Jun 21 '20 23:06 davidhewitt

Nearly everything else in 0.12 is close to being ready to go, and this is a potentially complex feature but not yet started. As much as I'm looking forward to having this, I'm going to delay this to the 0.13 release so we have time to design and implement properly.

davidhewitt avatar Sep 01 '20 22:09 davidhewitt

For those watching this, in #1591 we made it possible to write #[pyclass(extends=PyException)] (or other exception types) to create exception types.

This doesn't yet replace the create_exception! macro, because there's a few things which create_exception! does which #[pyclass(extends=PyException)] does not:

  • The type created by create_exception! implements Deref<Target=PyBaseException>, but PyCell<MyClass> currently implements Deref<Target=PyAny>. (However I'm wondering about proposing some changes in this area...)
  • create_exception! also adds impl std::error::Error for the newly created type.
  • create_exception! adds MyCustomException::new_err function to create a PyErr
  • create_exception! adds conversion from MyCustomException to PyErr

I'm wondering if we can eliminate the differences by adding a #[derive(Error)] macro to PyO3. This macro would only work alongside #[pyclass(extends=PyException)], and would implement the bits missing as listed above.

davidhewitt avatar Jun 01 '21 18:06 davidhewitt

create_exception! also adds impl std::error::Error for the newly created type.

...thinking about this one in particular and interaction with #[pyclass], I think the equivalent would be to implement std::error::Error for &PyCell<MyCustomException>.

davidhewitt avatar Sep 26 '21 12:09 davidhewitt

Saw this had a Good First Issue on it and I was wanting exception types with some fields attached, so I tried a quick swipe at this derive macro in pyo3-macros (since that crate has proc-macro = true in its Cargo.toml).

However, the PyErr type isn't available in pyo3-macros which means I can't trivially write a conversion from the custom exception type to PyErr nor can a new_err return it.

How should I solve that issue? (Moving important bits like that felt like it deserved some feedback first)

jmhodges avatar Aug 16 '22 09:08 jmhodges

However, the PyErr type isn't available in pyo3-macros which means I can't trivially write a conversion from the custom exception type to PyErr nor can a new_err return it.

How should I solve that issue? (Moving important bits like that felt like it deserved some feedback first)

You shouldn't have to do that - nothing in the macro crate has any idea about pyo3's types.

Do you have experience writing proc macros?

mejrs avatar Aug 16 '22 15:08 mejrs

You shouldn't have to do that - nothing in the macro crate has any idea about pyo3's types.

Do you have experience writing proc macros?

Ah, the thing I was missing that while you write the macro in that crate, you re-export and test it within the other depending crate!

I don't have much proc macro experience, you're right!

Now, I've reached the point where I tried to follow the recommendation of implementing std::error::Error for &PyCell<MyCustomException> by making the macro generate a wrapper struct that contains a &'a PyCell<MyCustomException> as a field and then implementing std::error:Error on that, but I'm running into the usual "Python doesn't know about lifetimes" issue with that.

I've been reaching for ouroboros::self_referencing in my own code using pyo3 to solve that, but I suspect that's not desirable here and might not work, anyhow, because maybe it'd required keeping a runtime object with it and that's not a thing.

But that whole strategy of making a wrapping struct might not be the direction folks were wanting to go in and I'm missing some other (possibly very obvious!) technique I could explore. I'm happy to hear ideas!

jmhodges avatar Aug 19 '22 21:08 jmhodges

Now, I've reached the point where I tried to follow the recommendation of implementing std::error::Error for &PyCell<MyCustomException> by making the macro generate a wrapper struct that contains a &'a PyCell<MyCustomException> as a field and then implementing std::error:Error on that, but I'm running into the usual "Python doesn't know about lifetimes" issue with that.

Hmm, perhaps my above comment was slightly too fleeting a thought without enough consideration.

Maybe an alternative approach would be to have a PyErrorClass trait, which we could help users derive automatically for their #[pyclass], along with a blanket implementation of impl<T> std::error::Error for PyCell<T> where T: PyErrorClass?

davidhewitt avatar Aug 20 '22 06:08 davidhewitt

Any update on this? If I'm being honest I'm not the most positive on what's going on with the above stuff, but I'd like this functionality in my own code to enhance the usability of my exceptions.

Anything I can do to help push things along?

hwittenborn avatar Oct 03 '22 09:10 hwittenborn

@hwittenborn no further progress here other than the above thread.

Really what this issue needs is someone who's able to invest the time to experiment with design to come up with a proposal of what we still need. Implementation can be worked out along the way, I can mentor and review PRs, though I wasn't expecting to find time to dedicate myself to this one at the moment.

to enhance the usability of my exceptions.

It would be interesting to hear what you feel you need which #[pyclass] doesn't give you at the moment. https://github.com/PyO3/pyo3/issues/295#issuecomment-852358088 is still the status quo and lists the pieces which are implemented by the create_exception! macro but not (yet) by #[pyclass]. Maybe there's something else entirely which you'd like to have?

davidhewitt avatar Oct 04 '22 19:10 davidhewitt

It would be interesting to hear what you feel you need which #[pyclass] doesn't give you at the moment

I want to be able to return exceptions with custom fields, such as the line number an error occured on. Currently I have this:

#[derive(Debug)]
#[pyclass(extends=PyException)]
pub struct ParserError {
    /// A message describing the parsing error.
    pub msg: String,
    /// The line number the error occured on. This will always be the [`Some`] variant unless there
    /// was an issue with the file as a whole, in which case the [`None`] variant will be returned.
    pub line_num: Option<usize>,
}

And then in another part of my code where I'd like to raise that exception, which I'm not sure how to make work with #[pyclass] (the issue is with my ParserError stuff at the bottom):

#[new]
fn new(content: String) -> PyResult<Self> {
    match RustSrcInfo::new(&content) {
        Ok(srcinfo) => Ok(SrcInfo { srcinfo }),
        Err(err) => {
            let msg: String;

            if let Some(line_num) = err.line_num {
                msg = format!("[Line {}] {}", line_num, err.msg);
            } else {
                msg = err.msg;
            }

            let py_err = ParserError { msg: err.msg, line_num: err.line_num};
            Err(py_err)
        }
    }
}
error[E0308]: mismatched types
   --> src/python.rs:42:21
    |
42  |                 Err(py_err)
    |                 --- ^^^^^^ expected struct `PyErr`, found struct `ParserError`
    |                 |
    |                 arguments to this enum variant are incorrect
    |

I can send a more general example if needed.

hwittenborn avatar Oct 07 '22 00:10 hwittenborn

Okay, I've learned a good deal about how the Python runtime manages exceptions and some of what needs to change in PyO3 to accommodate the goal of having instance attributes (instance members, whatever) on Python exceptions defined in Rust-land. There's a few plausible next steps one could follow and I know less about PyO3's state, so I figured I should write down what I know about the Python side in case others pick up the torch.

The current and the desired implementations

So, right now, PyO3 uses the cpython-provided PyErr_NewExceptionWithDoc to create exception types in PyErr::new_type. But the PyErr_NewExceptionWithDoc function (and the PyErr_NewException func it calls) only creates Python exception classes with the same constructor and memory layout of the BaseException type -- namely, the usual "single string message" constructor and attribute. That means, there's no way to add or change what instance attributes are on the exception nor the types that are used in its constructor. Confusingly, PyErr_NewExceptionWithDoc (and PyErr_NewException) has a dict arg but that's only for defining class variables and methods, not for adding instance attributes.

If one wants a more custom exception type, you're meant to first create a PyTypeObject with its fields set up to a) inherit from an exception type; b) set the arrays of PyMemberDef and PyGetSetDefs to add instance attributes (our goal!); c) set the init, dealloc, clear, and traverse fields to funcs that handle the new type's attributes and call -- somewhere in their callstack -- the BaseException version of themselves for various Py_TRASHCAN_BEGIN reasons. And then it's meant to call PyType_Ready with the new PyTypeObject to expose it in the runtime as a class that can be referenced in other Python code. (That PyType_Ready is usually desired but technically optional if you only want code that has a handle on the new PyTypeObject to have access to it.)

This is basically what you're supposed to do with normal Python classes with the extra sauce of BaseException tooling needing to be in the mix. There are links in the "How I learned the Python part of this problem" section that folks could use to double check if I missed anything important in the PyTypeObject definitions.

I assume everything in this PyTypeObject strategy has an equivalent in PyO3, but I'm not positive if all those bits are available for use in PyErr::new_type or if they're only available to the #[pyclass] macro.

If PyErr::new_type is the only code needing adjustment, we also need to replace some other Python C calls that rely on its output. PyO3 may be using Python C functions that assume string message-style exception constructors exist on all exceptions made from PyErr::new_type. Calls to C funcs like PyErr_SetString, PyErr_Format and similar will need to, instead, be to PyErr_SetObject or similar. Those new calls have the additional complication of needing to be passed the values from the Rust-land type's fields. I believe PyErrArguments was added to make this possible, but I'm unfamiliar with the trait and its PyErrState cousin, so it's possible there's more work to do.

How I learned the Python part of this problem

How to create exception types using the C API is difficult to search up because the answer is basically "make them like other Python classes but take care with the inheritance" while the search results you get are ones that explicitly mention exceptions like the docs for the underpowered PyErr_NewException APIs or StackOverflow answers about writing Python code in string literals in C[^1].

We can figure out what to do by looking at how the cpython runtime defines its exceptions. Specifically, the built-in exceptions (like RuntimeError or NameError) are defined using C macros that create global PyObjectType objects and then take a pointer to them to create global PyObject objects. Those C macros are called SimpleExtendsException, MiddlingExtendsException, and ComplexExtendsException. Those global PyObject values are added to a global array called static_exceptions which is iterated over in an initialization function called at Python interpreter start-time. That iteration calls a func that (with a lil bit of extra flag setting because these are built-ins) calls the PyType_Ready func on those global PyObject exception values.

PyType_Ready is the bit that exposes the value to other Python code as interpretation time. You can test that out by adding a new exception type to Objects/exceptions.c but leaving it out of the static_exceptions array. I actually performed that experiment, tested in the built Python interperter, and also got the new exception passing the already-existing tests that use the stable_abi.py/stable_abi.toml controlled definition of Python's public ABI to ensure backwards compatibilty.

Meanwhile, PyErr_NewException (which is the meat of PyErr_NewExceptionWithDoc) works somewhat differently. It constructs a PyTypeObject by calling the construtor of PyType_Type with the base type and a dict of class-level methods and attributes. Given the need to use write an unchanging format string in order to call the constructor correctly, and reading between the lines of some of the comments in PyType_Ready, I'm pretty sure this constructor call is not what the cpython folks would have us reproduce.

Instead, the strategy of using PyTypeObject with careful inheritance from another exception type and PyType_Ready seems to be correct.

Next actions

There's a couple possible next actions.

  1. Attempt to change the guts of PyErr::new_type to create PyTypeObject directly and PyType_Ready for exposing it. Then, it'd be a matter of fixing up the other PyErr_* C func calls in PyO3 (as described in the above text) to use PyErr_SetObject and pass in PyErrArguments values. Getting the inheritance of the func fields (init, traverse, etc) right might require more from the code underlying the [#pyclass] macro than is currently exposed, but I've not dug deep, yet. I'm beting this also requires understanding PyErrState better than I currently do. ~Finally, PyO3 doesn't seem to be directly using PyType_Ready anywhere, and I've not looked to see how PyO3 is currently exposing types to other Python code.~ Edited: Actually, PyO3 calls PyType_Ready by way of calling PyType_FromSpec in PyTypeBuilder.build. If PyErr::new_type reused that (which might be a lift), it'd be finalizing the exception type correctly.
  2. If 1 is unworkable (and it could be in some way that's immediately obvious to PyO3 experts), we'd need to alter the PyO3 type hierarchy (including macros). We already kind of know some similar work will be needed from prior discussion in this issue. Specifically, the suggestion of creating a new PyErrorClass trait separate from the PyErr trait was offered and my own realization that one kind of macro can't add calls of another kind of macro to the same type (iirc, I was thinking of having a pyexception attribute macro adding derive(Debug) or something to the type). So, there likely does need to be a separation of the ideas of "custom exception classes" from the idea of "using Python exceptions as Rust error values" but how that's done best is unclear.

[^1]: If you google around for things like "how do I add member attributes to exceptions", you'll find things like a StackOverflow saying to write some Python code inside your C code. Doing that for a single exception defined and controlled by the same extension might be considered okay by some folks, but, for the PyO3 use case, doing a bunch of string templating to create Turing complete code in a different language seems, uh, perilous.

jmhodges avatar Dec 28 '22 02:12 jmhodges

(Meta point: I'm pretty sure this ticket should have the "Good First Issue" label dropped. It's been open for a few years and it doesn't seem to have a trivial solution)

jmhodges avatar Dec 28 '22 02:12 jmhodges

There's a couple possible next actions.

I think the approach outlined in https://github.com/PyO3/pyo3/issues/295#issuecomment-1270895332 would be nicer, i.e. we use the existing machinery around #[pyclass] which includes inheritance and which should provide ready-to-go type objects under the hood (as for another #[pyclass]). I suspect that we would rather need a way to raise these arbitrary #[pyclass]es which checks that the inheritance is correct?

adamreichold avatar Dec 28 '22 20:12 adamreichold

I managed to get the basics of this working by loading a python string:

fn raise_MemoryError() -> PyErr {
    Python::with_gil(|py| {
        let icicle = py.import("icicle").unwrap();
        let exception = icicle.getattr("MemoryError").unwrap();
        let inst = exception.call0().unwrap();
        PyErr::from_value(inst)
    })
}

#[pymodule]
fn icicle(py: Python<'_>, m: &PyModule) -> PyResult<()> {
    PyModule::from_code(py, r#"
class MemoryError(Exception):
    pass
"#, "icicle_exceptions.py", "icicle")?;
    Ok(())
}

Obviously this is extremely ugly, but I couldn't figure out how to do this the 'proper' way...

mrexodia avatar Jun 02 '23 19:06 mrexodia

Slightly modified it to also accept a message + code:

fn raise_MemoryError(message: String, e: MemError) -> PyErr {
    Python::with_gil(|py| {
        let icicle = py.import("icicle").unwrap();
        let exception = icicle.getattr("MemoryError").unwrap();
        let xx = MemoryErrorCode::from(e);
        let args = (message, xx, );
        let inst = exception.call1(args).unwrap();
        PyErr::from_value(inst)
    })
}

#[pymodule]
fn icicle(py: Python<'_>, m: &PyModule) -> PyResult<()> {
    PyModule::from_code(py, r#"
class MemoryError(Exception):
    def __init__(self, message, code):
        super().__init__(message)
        self.code = code
    def __str__(self):
        return f"{super().__str__()}: {self.code}"
"#, "icicle_exceptions.py", "icicle")?;
    Ok(())
}

Super ugly, but this ended up working exactly as I needed:

    try:
        vm.mem_protect(addr, 0x2000, icicle.MemoryProtection.ExecuteRead)
    except icicle.MemoryError as x:
        message = x.args[0]
        print(message, x.code)

mrexodia avatar Jun 02 '23 21:06 mrexodia