syrupy Decouple different snapshot functionality

trafficstars

Decouple different snapshot functionality

Description

e.g. The plugin architecture is defined with multiple subparts. This plugin allows implementing those separately - e.g. you can use a different serializer from comparator.

Related Issues

Addresses use-case mentioned in #749, using default serializer but separate per-test snapshot files.

Checklist

[ ] This PR has sufficient documentation.
[ ] This PR has sufficient test coverage.
[ ] This PR title satisfies semantic convention.

Additional Comments

No additional comments.

Jun 03 '23 04:06 rotu

This might be a very bad idea, but it also seems like it might be in line with the original plugin architecture intent. Curious your thoughts.

Jun 03 '23 04:06 rotu

Will give this more thought at some point this week. Initial thought is that this should be doable with a custom extension like so (not tested, but I think this should work)

from syrupy.extensions.amber import AmberDataSerializer
from syrupy.extensions.single_file import SingleFileSnapshotExtension

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    def serialize(self, data, **kwargs) -> "SerializedData":
        return AmberDataSerializer.serialize(data, **kwargs)

Jun 05 '23 15:06 noahnu

Will give this more thought at some point this week. Initial thought is that this should be doable with a custom extension like so (not tested, but I think this should work)
from syrupy.extensions.amber import AmberDataSerializer
from syrupy.extensions.single_file import SingleFileSnapshotExtension

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    def serialize(self, data, **kwargs) -> "SerializedData":
        return AmberDataSerializer.serialize(data, **kwargs)

I think you're right but it's very confusing to me how the library is set up. It seems like there are two broad philosophies to plugins:

Inherit from the base thingy and override certain functions with a new implementation.
Inject new implementation into well-defined places for hooks.

Inheriting from an extension and overriding seems weird to me because it's doing both.

That said, there's a lot I don't understand about this codebase which prevents me from being confident in this PR:

what "amber" means in "AmberSnapshot" and why it's used instead of pickle (I'm guessing because pickle is not text-based).
why snapshot takes extension_class instead of extension
why some extension methods like get_snapshot_name, write_snapshot are class methods instead of instance methods
why SnapshotAssertion mutates itself in _post_assert.

Jun 05 '23 19:06 rotu

Inheriting from an extension and overriding seems weird to me because it's doing both.

I agree with that. There's definitely room to improve here. The codebase has already gone through several refactors, though it's pretty stable right now so I'd be hesitant to change anything too drastic without good reason. We also want to reduce the number of breaking changes if we can help it.

what "amber" means in "AmberSnapshot" and why it's used instead of pickle (I'm guessing because pickle is not text-based).

Amber is a custom format we built intended to reduce the size of diffs and be easy to review in pull requests. We don't need the ability to deserialize, only serialize. The name is a play on words with the theme of "syrupy" that used to be more prevalent in the codebase. There was a strong Jurassic Park theme (syrupy = amber / fossils etc). We've expunged most of this from the codebase to improve readability.

why snapshot takes extension_class instead of extension

The extension defines an interface so to speak. It does not have any instance state.

why some extension methods like get_snapshot_name, write_snapshot are class methods instead of instance methods

There was some recent refactoring to support pytest-xdist. Moving off of instances made it a lot simpler to reason about. Helps with performance as well. We cache some function calls using the extension class type. Not something we could easily do if there was instance data we had to consider.

why SnapshotAssertion mutates itself in _post_assert.

It lets you customize specific assertions. The post_assert resets the snapshot fixture in between assertions. Lets you do things like:

def test_case(snapshot):
    assert x == snapshot(exclude=props("uuid"))
    assert y == snapshot(exclude=props("other_prop"))

Jun 05 '23 20:06 noahnu

why snapshot takes extension_class instead of extension The extension defines an interface so to speak. It does not have any instance state.

True, though that's beside the point. One place where this is awkward is with SingleFileSnapshotExtension which must be subclassed in order to use TEXT mode. If you passed an extension object, you could add it as an argument to the constructor, e.g. something like snapshot(extension=SingleFileSnapshotExtension(text=True)).

There was some recent refactoring to support pytest-xdist. Moving off of instances made it a lot simpler to reason about. Helps with performance as well. We cache some function calls using the extension class type. Not something we could easily do if there was instance data we had to consider.

This makes sense. Something I'll have to look into.

why SnapshotAssertion mutates itself in _post_assert.

It lets you customize specific assertions. The post_assert resets the snapshot fixture in between assertions. Lets you do things like:
 def test_case(snapshot):
     assert x == snapshot(exclude=props("uuid"))
     assert y == snapshot(exclude=props("other_prop"))

I see that. The more obvious way to do this seems to be to have SnapshotAssertion.__call__ return a new object (similar to how use_extension creates and returns a new SnapshotAssertion). One consequence of the current setup is it's not at all clear what the following does:

def test_case(snapshot):
   s = snapshot(exclude=props("a"))
   assert x == s
   assert y == s

It might make sense to use dataclasses.replace to implement both __call__ and use_extension.

Jun 06 '23 16:06 rotu

Just posting to let you know I haven't forgotten about this. I do agree there's plenty of opportunity to simplify the architecture. I'd prefer to try more incremental changes in advance of any larger public API change (to reduce breaking changes and adoption pains).

I'm not sure I have too much time over the next few months to support large scale changes though.

FWIW, a lot of the initial implementation was based on attrs before dataclasses, and when we moved to dataclasses we only did the bare minimum so it was functional. Leaning on newer patterns to simplify the code and make it easier to contribute to is definitely something I'm interested in.

And if you're curious, you were 100% right about the initial design for the extensions/plugins being closer to the snapshot(storage=..., serializer=...) model (if you look at some of the earlier commits in this repo, you'll see something a bit closer to that).

Sep 27 '23 21:09 noahnu

Will give this more thought at some point this week. Initial thought is that this should be doable with a custom extension like so (not tested, but I think this should work)
from syrupy.extensions.amber import AmberDataSerializer
from syrupy.extensions.single_file import SingleFileSnapshotExtension

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    def serialize(self, data, **kwargs) -> "SerializedData":
        return AmberDataSerializer.serialize(data, **kwargs)

Only .encode() at the end sees to be missing. Only issue: the Amber Extension gives proper diffs. How to get those back?

Nov 22 '23 16:11 StephanMeijer

The following snippet works beautifully:

class SingleFileAmberSnapshotExtension(SingleFileSnapshotExtension):
    _file_extension = 'ambr'
    _write_mode = WriteMode.TEXT

    def serialize(self, data, **kwargs):
        return AmberDataSerializer.serialize(data, **kwargs)

Nov 22 '23 16:11 StephanMeijer

syrupy syrupy copied to clipboard

Decouple different snapshot functionality

Description

Related Issues

Checklist

Additional Comments

syrupy
syrupy copied to clipboard