[META] Declarative API for ASN.1 in Python
This is a meta-issue/design issue for tracking a declarative ASN.1 API for Cryptography!
The goal: an importable Python API that users of Cryptography can define ASN.1 structures with, which can then be ser/de'd to and from DER (and only DER).
A rough sketch, demonstrating the basic idioms of the API:
from cryptography.hazmat import asn1

# Corresponding to:
#
# Signature ::= SEQUENCE {
#     r INTEGER,
#     s INTEGER
# }
@asn1.sequence
class Signature:
    r: int
    s: int

sig = Signature.from_der(b"...")
raw: bytes = sig.to_der()
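For reference, here's roughly what `to_der()` would have to emit for the sketch above, hand-rolled in pure Python. This is only meant to illustrate the DER layout (tag, length, value): `der_length` and `der_integer` are made-up helper names, not part of any proposed API, and the actual implementation would presumably delegate to rust-asn1 rather than do this in Python.

```python
from dataclasses import dataclass


def der_length(n: int) -> bytes:
    """Encode a DER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(value: int) -> bytes:
    """Encode an INTEGER (tag 0x02) as minimal big-endian two's complement."""
    # Number of magnitude bits, leaving one extra bit for the sign.
    bits = (value if value >= 0 else ~value).bit_length()
    body = value.to_bytes(bits // 8 + 1, "big", signed=True)
    return b"\x02" + der_length(len(body)) + body


@dataclass
class Signature:
    # Illustrative stand-in for the decorated class, not the real API.
    r: int
    s: int

    def to_der(self) -> bytes:
        # A SEQUENCE (tag 0x30) is the concatenation of its fields' TLVs.
        body = der_integer(self.r) + der_integer(self.s)
        return b"\x30" + der_length(len(body)) + body


print(Signature(r=1, s=2).to_der().hex())  # 3006020101020102
```

Note the leading `00` byte that shows up for values like 128: DER integers are signed, so the high bit can't be left set on a positive number.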
This declarative API should have full generality/expressivity with respect to ASN.1's own feature set, including qualifiers like EXPLICIT and IMPLICIT:
from typing import Annotated

@asn1.sequence
class Signature:
    r: Annotated[int, asn1.explicit(0)]
    s: Annotated[int, asn1.implicit(1)]
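To make the difference between the two qualifiers concrete, here's a small byte-level sketch. EXPLICIT wraps the whole inner TLV in a new constructed context-specific tag, while IMPLICIT replaces the inner tag and keeps the length and value. The `tlv`, `explicit` and `implicit` functions below are hypothetical helpers for illustration only (they are not the `asn1.explicit`/`asn1.implicit` annotations above), and they assume tag numbers below 31 and contents shorter than 128 bytes.

```python
def tlv(tag: int, content: bytes) -> bytes:
    """Build a tag-length-value triple (short-form lengths only)."""
    if len(content) >= 0x80:
        raise ValueError("sketch only handles contents shorter than 128 bytes")
    return bytes([tag, len(content)]) + content


def explicit(number: int, inner: bytes) -> bytes:
    """EXPLICIT [n]: wrap the complete inner TLV in a constructed context tag."""
    if not 0 <= number < 31:
        raise ValueError("sketch only handles low tag numbers")
    return tlv(0xA0 | number, inner)


def implicit(number: int, inner: bytes) -> bytes:
    """IMPLICIT [n]: swap the inner tag for a context tag.

    The constructed bit (0x20) of the original tag is preserved.
    """
    if not 0 <= number < 31:
        raise ValueError("sketch only handles low tag numbers")
    constructed = inner[0] & 0x20
    return bytes([0x80 | constructed | number]) + inner[1:]


five = tlv(0x02, b"\x05")           # INTEGER 5
print(five.hex())                   # 020105
print(explicit(0, five).hex())      # a003020105
print(implicit(1, five).hex())      # 810105
```

One consequence worth keeping in mind for the API: after IMPLICIT tagging the original type is no longer visible on the wire, so the decoder can only recover it from the class definition.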
This would also (naturally) include generality over user-defined types:
@asn1.sequence
class Frob:
    ...

@asn1.sequence
class FrobHolder:
    frob: Frob
Key design constraints:
- This API should be 100% declarative: there should be no imperative effects on ASN.1 ser/de
Open design questions:
- What's the best way to handle `ANY`? `asn1.Any` as a generic TLV type, similar to rust-asn1?
- What's the best way to handle `ANY DEFINED BY`? Maybe something like this:

      @asn1.sequence
      class VaryMe:
          content_type: asn1.ObjectIdentifier
          content: Annotated[Varied, asn1.defined_by("content_type")]

      # or maybe we can do this with the standard enum.Enum? or maybe @asn1.varied?
      class Varied(asn1.Enum):
          foo: Annotated[VariantA, asn1.defined_by(SOME_OID)]
          bar: Annotated[VariantB, asn1.defined_by(ANOTHER_OID)]

- To what extent/how can we best support trivial "native" Python types (`int`, `str`, etc.) versus "synthetic" types?
  - Moreover, what's the appropriate isomorph for `str`? Probably `UTF8String`, with all other string-ish types being `bytes`?
- What about non-trivial native types like `list[T]`, `set[T]`, etc? Should we support these with fixed mappings (e.g. `list[T] -> SEQUENCE OF`), or should we have our own types that don't require as much object conversion (e.g. `asn.List[T]`)?
- To what extent should we support `datetime` as a time type/map between `datetime` and `UTCTime`/`GeneralizedTime`?
  - One pitfall that we want to avoid is surprising serializations, e.g. a user really wants `UTCTime OR GeneralizedTime` but instead gets only `GeneralizedTime`
- What's the best way to handle ASN.1 type constraints, e.g. ranged integers and min/max sequence/set lengths?
  - Probably additional fields on `Annotated`, e.g. `Annotated[list[T], asn1.size(1...10)]`
  - Not all of these make sense for an MVP, since plenty are obscure/not widely used (e.g. contained subtypes)
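On the constraints question, here's a rough sketch of how a size constraint carried in `Annotated` metadata could be read back at runtime. Everything named here is an assumption for illustration: `Size` is a hypothetical marker class standing in for something like `asn1.size(...)`, and `size_constraints`/`within_size` are made-up helpers. It also assumes Python 3.9+ for `typing.Annotated` (older versions would need `typing_extensions`).

```python
from dataclasses import dataclass
from typing import Annotated, List, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class Size:
    """Hypothetical marker for an ASN.1 SIZE (min..max) constraint."""
    min: int
    max: int


class Names:
    # Roughly: SEQUENCE SIZE (1..3) OF UTF8String
    values: Annotated[List[str], Size(1, 3)]


def size_constraints(cls: type) -> dict:
    """Map each annotated field name to its Size marker, if it has one."""
    found = {}
    # include_extras=True keeps the Annotated metadata instead of stripping it.
    for name, hint in get_type_hints(cls, include_extras=True).items():
        if get_origin(hint) is Annotated:
            # get_args() returns (underlying_type, *metadata).
            for meta in get_args(hint)[1:]:
                if isinstance(meta, Size):
                    found[name] = meta
    return found


def within_size(cls: type, field: str, value) -> bool:
    """Check a candidate value against the field's Size constraint."""
    size = size_constraints(cls).get(field)
    return size is None or size.min <= len(value) <= size.max


print(size_constraints(Names))
print(within_size(Names, "values", ["a", "b"]))  # True
print(within_size(Names, "values", []))          # False
```

The nice property of this shape is that the constraint stays purely declarative: the marker is inert data, and enforcement can happen wherever encoding and decoding happen.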
Open integration questions:
- Where should this live within `cryptography`? Does `cryptography.hazmat.asn1` make sense, or should it be `cryptography.asn1`, or something else?
There are probably many other questions too, and I'm sure I've missed some in my notes 🙂
CC @facutuesca
As an implementation/integration consideration: we'll (@facutuesca and I) begin prototyping this as its own independent codebase, which should then be relatively straightforward to fold into Cryptography (since the only deps should be pyO3 and rust-asn1).
Some notes as I begin to look into this:
- We need to extract each class definition's annotations with `cls.__annotations__`, `inspect.get_annotations()`, or `typing.get_type_hints()`. Which one of these to use will depend on the version of Python we're on, since Cryptography supports 3.7 (but we're targeting 3.8).
  - https://docs.python.org/3/library/typing.html#typing.get_type_hints
  - https://docs.python.org/3/howto/annotations.html
  - Pydantic's approach to resolving annotations: https://docs.pydantic.dev/latest/internals/resolving_annotations/
    - Relevant internals: https://github.com/pydantic/pydantic/blob/dac74dc1f87104b9ad47c8d2aae7e0d44ed05c77/pydantic/_internal/_typing_extra.py#L965-L1098
- Once the annotations are extracted, we need to put them into a form that's sufficiently convenient for the Rust side to walk and generate a set of parsing actions from.
- A successful parse should end with an instantiated `T` (for whatever `T` began the parse), which should also have some methods added to it (`from_der()`, `to_der()`, `__init__()`, etc.). This last bit can be done by the decorator itself.
  - The decorator itself will probably be purely on the Python side.
  - Decorators are possible in PyO3 but not super ergonomic: https://mat-his.medium.com/creating-python-decorators-in-rust-e89073aa3534
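As a rough illustration of the Python-side half of these notes, here's a toy decorator that resolves a class's annotations with `typing.get_type_hints()`, attaches a generated `__init__()`, and stores a flat, ordered field list. The decorator name `sequence` mirrors the sketch at the top of the issue, but the body and the `__asn1_fields__` attribute are assumptions made up for this example; `from_der()`/`to_der()` are left out since they'd live on the Rust side. `include_extras=True` needs Python 3.9+.

```python
from typing import get_type_hints


def sequence(cls):
    """Toy stand-in for @asn1.sequence: collect fields and add __init__."""
    # Resolve annotations, keeping any Annotated[...] metadata intact.
    hints = get_type_hints(cls, include_extras=True)
    fields = tuple(hints.items())

    def __init__(self, **kwargs):
        missing = [name for name, _ in fields if name not in kwargs]
        unexpected = [name for name in kwargs if name not in hints]
        if missing or unexpected:
            raise TypeError(f"missing={missing} unexpected={unexpected}")
        for name, _ in fields:
            setattr(self, name, kwargs[name])

    cls.__init__ = __init__
    # Ordered (name, type) pairs: a shape that's easy to walk from elsewhere.
    cls.__asn1_fields__ = fields
    return cls


@sequence
class Frob:
    value: int


@sequence
class FrobHolder:
    frob: Frob
    label: str


holder = FrobHolder(frob=Frob(value=7), label="example")
print(FrobHolder.__asn1_fields__)
print(holder.frob.value)  # 7
```

Because nested types like `Frob` carry their own `__asn1_fields__`, a walker can recurse through user-defined types without any extra registration step.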
This would be a huge win for cryptography: I could use it to implement custom ASN.1 formats. I am currently using asn1crypto for that, but it has no type hints, and converting between the cryptography objects and those from asn1crypto is cumbersome.
Maybe adding a small helper to convert to and from PEM would also be helpful. This is not very complicated, but doing it in each project individually feels wrong.
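For what it's worth, the kind of helper I have in mind is roughly the following, using only the standard library. The function names `der_to_pem` and `pem_to_der` and the `EXAMPLE` label are placeholders I made up, not an existing or proposed cryptography API. It wraps base64 at 64 characters per line, as RFC 7468 specifies for PEM-style encodings, and ignores things like headers or multiple blocks in one file.

```python
import base64


def der_to_pem(der: bytes, label: str) -> str:
    """Wrap DER bytes in PEM armor with 64-character base64 lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    armor = [f"-----BEGIN {label}-----", *lines, f"-----END {label}-----"]
    return "\n".join(armor) + "\n"


def pem_to_der(pem: str, label: str) -> bytes:
    """Strip PEM armor for the given label and decode the base64 body."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    begin, end = f"-----BEGIN {label}-----", f"-----END {label}-----"
    if len(lines) < 2 or lines[0] != begin or lines[-1] != end:
        raise ValueError(f"not a PEM block labelled {label!r}")
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode("".join(lines[1:-1]), validate=True)


der = bytes.fromhex("3006020101020102")  # SEQUENCE { INTEGER 1, INTEGER 2 }
pem = der_to_pem(der, "EXAMPLE")
print(pem, end="")
print(pem_to_der(pem, "EXAMPLE") == der)  # True
```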
@facutuesca could we put a list of remaining tasks on this issue so we can have a meta view on how close we are to fully functional? 😄
Current progress:
ASN.1 types
- [x] SEQUENCE
  - https://github.com/pyca/cryptography/pull/13325
  - https://github.com/pyca/cryptography/pull/13449
- [ ] SEQUENCE OF
- [ ] SET
- [ ] SET OF
- [x] BOOLEAN
  - https://github.com/pyca/cryptography/pull/13482
- [x] INTEGER
  - https://github.com/pyca/cryptography/pull/13325
  - https://github.com/pyca/cryptography/pull/13449
- [ ] BIT STRING
- [x] OCTET STRING
  - https://github.com/pyca/cryptography/pull/13482
- [ ] NULL
- [ ] OBJECT IDENTIFIER
- [x] UTF8String
  - https://github.com/pyca/cryptography/pull/13482
- [x] PrintableString
  - https://github.com/pyca/cryptography/pull/13496
- [ ] IA5String
- [x] UTCTime
  - https://github.com/pyca/cryptography/pull/13513
- [x] GeneralizedTime
  - https://github.com/pyca/cryptography/pull/13513
- [ ] (generic) TLV

ASN.1 Annotations
- [x] OPTIONAL
  - https://github.com/pyca/cryptography/pull/13542
- [x] DEFAULT
  - https://github.com/pyca/cryptography/pull/13562
- [ ] EXPLICIT
  - https://github.com/pyca/cryptography/pull/13735
- [ ] IMPLICIT
  - https://github.com/pyca/cryptography/pull/13735
- [ ] SIZE
- [ ] DEFINED BY