csla icon indicating copy to clipboard operation
csla copied to clipboard

CSLA, Source Generators, and Serialization

Open JasonBock opened this issue 3 years ago • 32 comments

Is your feature request related to a problem? Please describe. Not a problem.

Describe the solution you'd like

Overview

The intent is to start incorporating source generators into CSLA. The first target is serialization - specifically, generating the serialization path to (potentially) improve performance and reduce memory allocations. Note that this would only work for .NET 5 and beyond, so care should be taken as to how "integrated" this is with CSLA. I'm guessing some kind of "opt-in" strategy would need to be in place such that users on older TPMs can use serialization as it currently works in CSLA.

Proposal

The idea is to update how serialization works in CSLA such that the serialization path is baked into code via source generation. The generator would look at the contents of a business object, figure out what properties need to be serialized, and create the necessary calls to the stream. This should provide speed and memory allocation gains for the following reasons:

  • There's no need for Reflection as the serialization strategy is baked in.
  • The tables that's used to minimize the amount of string duplication can go away. There's no need to store any type name information in the stream, and any duplicated strings would be rare. Compressed streams should be able to address any duplicated information, that's what they're good at.

There are a couple of disadvantages:

  • This would only work for .NET 5 as its the only compiler that understands source generation.
  • There is the potential for code bloat as the generated code is baked directly into the assembly. However, the amount of code generated should be minimal and not affect performance.

How Serialization Currently Works

It's kind of confusing (well, at least to me it is :) ). MobileFormatter implements ISerializationFormatter, but it also has these helper DTO methods that create other instances of MobileFormatter. These DTO methods get a list of SerializationInfo. At some point, CslaReaderWriterFactory is used to get either a reader or writer.

Source Generator Strategy

The overall goal is to greatly simplify and improve how serialization works in CSLA. Basically start from scratch and build a new serializer based on ISerializationFormatter, tentatively called SourceSerializer (but call it whatever you want). The difference would be that any business object (or, more formally, anything derived from IMobileObject) would have a partial class created that would implement a new interface, ISourceSerializable. This would have two methods, Serialize and Deserialize (note that this may cause a name collision, so we may want to implement this interface explicitly). Here's a brief example. Let's say I defined this object:

[Serializable]
public partial sealed class Customer
  : BusinessBase<Customer>
{ ... }

Let's say it has properties like an Id as a Guid, a Name as a string, a Buffer to hold arbitrary byte data (for some bizarre reason), and some child objects.

Arguably, we wouldn't need SerializableAttribute there, but that's a debate for another time. Note that the class must be partial as we are going to expand on the definition of Customer in another class. The source generator would find the definition written by the developer, and create another class definition that will look something like this:

public partial sealed class Customer
	: ISourceSerializable
{
  void Serialize(Stream stream)
  {
    stream.Write(this.Id);
    stream.Write(this.Name);
    stream.Write(this.Buffer.Length);
    stream.Write(this.Buffer);
	
    stream.Write(this.Children.Length);
	
    foreach(var child in this.Children)
    {
      // ... more serialization from the children...
    }
  }

  Customer Deserialize(Stream stream)
  {
    this.Id = new Guid(stream.ReadBytes(16));
    this.Name = stream.ReadString();
    this.Buffer = stream.ReadBytes(stream.ReadInt32());
    var childCount = stream.ReadInt32();
	
    foreach(var x = 0; x < childCount; x++)
    {
      // ... more deserialization from the children...
    }
  }
}

The names of the methods I have on the Stream might be wrong, but I hope the core of the idea comes through. If you look at the current serialization implementation in CSLA, there's a bunch of Reflection code, "well known types", and other junk that just disappears with this. We know what's being serialized and in what order, so deserialization can assume that happy-path.

Next Steps

My first suggestion would be to mock up this idea without doing source generators first. Meaning, handwrite the code that the generator would produce, and do extensive testing (preferably with Benchmark.NET) to determine if this will be a "better" option overall. There's no need to actually do the source generator work if this ends up being a bad idea in the first place :).

If this proof-of-concept work proves to be feasible, then start looking at how CSLA would work with making serialization strategies configurable. I really don't know if this is something that is easy and doable today or not. I'm guessing this is already built-in, but ensure that it is. Meaning, you want to make sure you can allow the source generation serialization to be opt-in and allow older versions of .NET to work with what's there (or let people stick with the current serialization implementations).

Once that's addressed, then start looking at what it would take to do it as a source generator. If you haven't worked with source generators in C# 9 yet, it's a powerful feature, but you have to be comfortable with the Compiler APIs. If you're not, there is a learning curve to address, and don't underestimate it :).

Make sure your implementation does not break existing expectations. There's probably a bunch of things I'm missing with serialization in CSLA. Just looking at the tests (e.g. TestWithoutSerializableHandler(), it looks like the GlobalContext is updated, which is odd to me. I think CSLA also tracks duplicated object references within a graph so that the references are deserialized correctly as one object referenced in multiple places. I don't have all the answers here, just make sure you don't break anything here.

Describe alternatives you've considered No alternatives considered.

JasonBock avatar Mar 15 '21 18:03 JasonBock

Yep, this seems to make sense. I suppose there are some additional nice to have features - either now, or as separate issues at a later date:

  1. The ability to customise behaviour with minimal effort and code. For example, I think support for the [NonSerialized] attribute might be wise, to provide a minimal ability to customise by excluding certain parts of the object graph.
  2. If we want to try to offer this as a plug-in replacement for the existing serializer in the future then ensuring there is support for child classes of differing types - continuing to serialize the type information - might be useful under certain circumstances. Consider for example a child collection containing different concrete classes that all implement a common interface. As an example, Employee and Customer both implementing the IPerson interface, and members of the same collection (not the best example, but the one I can think of for now.)
  3. I agree that understanding how to handle issues regarding circular references and shared object references is important.

It feels like deciding on what the required serialization behaviours are is the most tricky part. I'm not saying that the code will be simple, but it's exactly the sort of thing that source generators are made for, so I imagine this to be perfectly feasible.

TheCakeMonster avatar Mar 16 '21 16:03 TheCakeMonster

Question: in the serialization strategy suggested, is there anything to be considered about endianness across different .NET platforms? I'm not really good on multi-platform scenarios, but is Linux little-endian whereas WIndows is big-endian, for example? Is that relevant in .NET, or does .NET define that for us so that we can safely ignore it?

I did also wonder whether differences in code pages used on different platforms were relevant, but I think the answer is no - .NET uses Unicode for strings.

TheCakeMonster avatar Mar 16 '21 16:03 TheCakeMonster

I've spotted an interesting GitHub repository that might offer a useful reference for the source generator, once we've determined what it needs to do:

https://github.com/RudolfKurka/StructPacker

Found from a list of source generators being compiled at https://github.com/amis92/csharp-source-generators

I don't know the provenance of the repos to which I have provided links, so don't assume that they are the ideal reference materials. There might be better examples out there.

TheCakeMonster avatar Mar 16 '21 17:03 TheCakeMonster

Source generators cookbook from the Roslyn repo: https://github.com/dotnet/roslyn/blob/main/docs/features/source-generators.cookbook.md

TheCakeMonster avatar Mar 16 '21 17:03 TheCakeMonster

The endianness question would affect CSLA right now. It wouldn't be a source generator-specific thing.

FWIW I have three repos in flight that use source generators. They have nothing to do with serialization but maybe they'll help pick up a couple of things related to generators.

  • https://github.com/JasonBock/PartiallyApplied/
  • https://github.com/JasonBock/InlineMapping/
  • https://github.com/JasonBock/Rocks

JasonBock avatar Mar 16 '21 19:03 JasonBock

The endianness question would affect CSLA right now.

When using the binary reader/writer. Today, at least, the easy workaround would be to switch back to a text-based reader/writer.

rockfordlhotka avatar Mar 16 '21 19:03 rockfordlhotka

My point is, this serialization idea really isn't doing anything different from what MobileFormatter does right now. So MobileFormatter would have the same issues if a binary reader/writer is being used.

JasonBock avatar Mar 16 '21 19:03 JasonBock

I understand, though the design for this new model should probably rely on a configurable reader/writer concept as well so it is possible to address the issue if it arises for someone.

rockfordlhotka avatar Mar 16 '21 19:03 rockfordlhotka

I would assume that's the case currently in CSLA?

JasonBock avatar Mar 16 '21 19:03 JasonBock

Yes, MobileFormatter relies on a reader/writer that can be swapped out. The default is a binary reader/writer, but there's also an XML (?) reader/writer that wouldn't have endian issues b/c numbers transfer as text.

rockfordlhotka avatar Mar 16 '21 20:03 rockfordlhotka

For reference, here is the announcement about BinaryFormatter being obsoleted in .NET 5. It was done because of the security risks associated with the deserialization implementation. These security issues cannot be resolved, because they are baked into the design of what the deserialization does.

https://docs.microsoft.com/en-us/dotnet/standard/serialization/binaryformatter-security-guide

We need to be mindful of the risk in implementing the new serialization strategy, although I think the intended solution probably sidesteps a lot of the issues.

My understanding is a bit limited at the moment, but I think one of the main problems will be that BinaryFormatter will deserialize any type using the type information passed as part of the serialization stream. This is done in order to support polymorphic deserialization, but means that clients can effectively inject new types into a host application.

One of my suggestions was that we support polymorphic deserialization, so we need to be careful about how such a thing might be implemented. The issue is not that it cannot be supported, but instead that it should only be supported for types declared in advance within the code performing deserialization. A good example of this is DataContractSerializer, which requires the definition of known types for polymorphic deserialization scenarios, so that there is a list that can be used to validate the types that are to be deserialized prior to an attempt to do so.

TheCakeMonster avatar Mar 17 '21 10:03 TheCakeMonster

Having thought about it some more, I think the serialisation strategy would be most useful if it fulfilled the following rules:

  1. All public fields and properties that do not have a opt-out attribute applied (e.g. [NonSerialized])
  2. PLUS any non-public field or property that have a special opt-in attribute applied (e.g. [MustSerialize])
  3. We need to handle the risk of circular references. Not sure how to do that.
  4. Ideally it should handle serializing implementations of base classes or interfaces, but for security this MUST only work for previously defined/declared types, such as is achieved with DataContractSerializer through the [KnownTypes] attribute.

TheCakeMonster avatar Mar 29 '21 14:03 TheCakeMonster

I've done a bit of proof of concept work on this feature this afternoon.

As this is only PoC, it's in my own repo. Have a look at the following location and see what you think of it so far.

https://github.com/TheCakeMonster/Examples/tree/master/SourceGenerators/CslaSerialization

There are 4 projects:

  1. Core is the definitions of attributes which must be applied to classes to control serialization.
  2. Generators is the classes associated with the source generator - i.e. the code that does generation.
  3. Objects is a target project. A class exists in here on which generation is to be performed.
  4. UnitTests. Very incomplete, but there are a few tests that check that the serialized code works as expected.

The implementation is very limited. This is not intended to be ready to ship. It's something to kick the tyres of what might be required, intended to prompt further discussion.

The approach I've taken is to automatically implement IMobileObject. However, it only works for properties of primitive types, and therefore does not cope with children/complex graphs.

TheCakeMonster avatar Apr 12 '21 19:04 TheCakeMonster

I've done a couple of improvements to the code this morning to exclude properties that are not exposed outside of the class, and then be able to opt in properties of those lower scopes. However, I then started getting CodeLens throwing NullReferenceExceptions in VS2019 16.10 Preview 1 when viewing the target class (i.e. CslaSerialization.Objects.Person) in a code window - always 3 exceptions, one after another, when you switched to the Person code window.

I'm not sure if the exceptions were caused by the changed I made, or an update to something I got on my machine at the end of the day yesterday - but the changes in the code might be the culprit.

I tried to track down the issue by slowly commenting out code in the generator, but even when it was all commented out it still happened. Only removal of the [AutoSerializable] attribute fixed it. As soon as I added the attribute back in, it started happening again.

In the end, I removed the reference from the Objects project to the Generators project and it stopped happening. Then when I added that reference back, it didn't start happening again, and now it seems OK with all of the code back in circuit. I hate it when that sort of thing happens! There was something slightly messed up and no matter what I cleaned or rebuilt, how many restarts of VS I did and so on, it didn't clear it.

If you have the same problem let me know and I'll see if I can make it happen again on my machine - but it isn't happening at the moment.

TheCakeMonster avatar Apr 13 '21 11:04 TheCakeMonster

The CodeLens issue is a known one. Seems to happen when you have SGs in play (look here for details).

Here are some observations on the code...

Is there a reason you're introducing a new attribute for serialization? Source generators can key off of any type, and CSLA developers are used to having serialization "just work" via the inclusion of [Serializable].

If there isn't a need to extend a type, I'd seal it by default. Note that with attributes this can provide a small perf benefit, though with source generators this is probably not applicable.

I'd use a receiver to find candidates, especially context receivers (check out my receiver for InlineMapping). They can make it easier to look for the nodes that match the condition you want.

If possible, don't do a string name search to confirm that the attribute is the type you want (line 55 in SerializationPartialsGenerator). You can use a Compilation object from the SemanticModel to load a type symbol based on the type's name (again, look at my MapReceiver for details on this). That ensures that the match is exactly what you're looking for.

Consider using an IndentedTextWriter to write out the code. It makes it a bit easier to handle indentation than trying to do it via a StringBuilder.

In AppendSetStateMethod(), you're including a type name (line 154). The namespace of that type will (probably) need to be included as a using in your gen'd code.

Be careful of types not defined in a namespace. They'll end up in a "global" namespace, and it may break assumptions you're doing with namespaces (this has bit me with SGs I've created).

JasonBock avatar Apr 13 '21 15:04 JasonBock

@JasonBock Thanks for the thorough and comprehensive review. That will be very useful when I do the actual implementation.

Yes, I specifically chose not to use the [Serializable] attribute as a selector for target classes. There's a lot of existing CSLA code that could be broken if we were too aggressive with our targeting. I chose to use a new attribute that allows this to be a feature that a developer can opt into. It's not intended to be a comprehensive solution to all serialization at this stage - it will not do anything other than primitive types in the first version. Therefore, a developer should make a choice of whether to use it based on whether the limitations are appropriate to the scenario.

I'm not all that happy with the names of the attributes - particularly those used to opt public properties out and private properties (and, in future, fields) in - but I think the need for them is sound. The job of AutoSerializationExcluded could be done instead with the NonSerialized attribute, but I don't know of an equivalent for opting things in, so I chose to create two new ones that were closely related by name.

TheCakeMonster avatar Apr 15 '21 08:04 TheCakeMonster

Ooh, your .NET Oxford talk on source generators just appeared in my YouTube feed. What are the chances of that!

TheCakeMonster avatar Apr 15 '21 08:04 TheCakeMonster

@TheCakeMonster I understand how this is "opt-in", but right now with serialization in CSLA....it just "works". For the most part, you don't have to do a lot to change what serialization is doing (at least I haven't, others may have different experiences). For a long time, the pattern has been, "put [Serializable] on the class, and you're done". Would you be requiring users to attribute the class and all properties to state whether they are part of the serialization process or not? Just something to keep in mind.

JasonBock avatar Apr 15 '21 15:04 JasonBock

@JasonBock I was trying to allow for customisation of the behaviour in situations where it is required. By default, all public and internal properties are included in serialization. However, under certain circumstances you may want to exclude some of those properties, or possibly include private properties or fields that are not included by default.

TheCakeMonster avatar Apr 15 '21 16:04 TheCakeMonster

@rockfordlhotka @JasonBock Where do you think we should put the new source generator in the codebase? Csla.Analyzers isn't entirely unreasonable, but maybe a new assembly is better? Source generators do not need to be deployed; they are only build time dependencies.

TheCakeMonster avatar May 04 '21 19:05 TheCakeMonster

I don't think they should go in the Analyzers project, as this is something different. And also because we don't want to mix dependencies and make our lives more complex.

So probably something else, and I'm open to ideas. Probably should try and follow patterns others are using (if any yet).

rockfordlhotka avatar May 13 '21 21:05 rockfordlhotka

Is there any need for version tolerance in the serialized data? If a user adds a property to a previously serialized object, could that cause an issue?

For what it's worth, this serializer I made uses a source generator: https://github.com/ReubenBond/Hagar (and I am integrating it into Orleans directly here, with many improvements: https://github.com/dotnet/orleans/pull/7070)

I would be wary of simply serializing fields one after another with no type information or information which can be useful for version tolerance. Type information is useful for supporting polymorphism. Another issue is the possibility of cyclic references in a serialized object graph.

Polymorphism can be implemented in a relatively safe manner by only allowing creation of Type instances which are explicitly allowed

ReubenBond avatar May 13 '21 21:05 ReubenBond

@ReubenBond the current serialization models (BinaryFormatter, NDCS, MobileFormatter) are all full-fidelity, and that is a requirement. 100% exact clone of the original object graph, including children with multiple references to the same child.

Right now they are all version sensitive, and of course BinaryFormatter and NDCS are controlled by Microsoft, so that's not going to change. MobileFormatter is part of CSLA, and so could relax the requirement for a version match, with the caveat that it opens up a whole new avenue of runtime bugs for people to troubleshoot. So if we did go down that road, I'd want to make sure there's extensive logging in place so people could easily discover that they created a problem for themselves with mismatched versions.

Such a change - relaxing version parity - is not in the scope of this issue.

rockfordlhotka avatar May 14 '21 00:05 rockfordlhotka

I think there are a number of decisions we should make before I undertake the main development, based mainly on unresolved discussion on this thread. Here is a list of the outstanding queries I have about what we want this to do.

In summary, what the generator will do is to generate a partial class that implements the IMobileObject interface, so that the consumer class does not need to.

General Questions:

  1. How do we identify the classes for which to perform generation? We can't do all CSLA classes, both because we risk breaking existing implementations, and because the implementation is not going to be comprehensive to begin with. [Suggestion: create a custom attribute which developers would apply to their class to opt into generation]
  2. If we use an attribute for 1, then what should it be called? [Suggestion: AutoSerializable]
  3. What serialization scheme should we mimic - in other words, what rules should be applied to determine what members are - and what members are not - serialized? [Suggestion: JSON, or maybe Xml?]
  4. I would like to create an attribute that can be applied to a member that would be included as a result of the automatic member serialization rules, but which a developer does not want to be serialized - similar to NonSerialized. What should this attribute be called? [Suggestion: perhaps AutoSerializableExclude? I'm not entirely happy with than name. I am suggesting that we do NOT use NonSerialized, because of the next question.]
  5. I would like to create an attribute that can be applied to a member that would normally be excluded as a result of the automatic member serialization rules, but which a developer does additionally want to be serialized - the opposite of NonSerialized. What should this attribute be called? [Suggestion: perhaps AutoSerializableInclude? I'm not entirely happy with than name, but as far as I know we do not already have an attribute that is the logical opposite of NonSerialized]

CSLA-specific questions:

  1. I need to preserve multiple references to the same child objects in the object graph, if a complex graph is serialized automatically. Does MobileFormatter already deal with this for me, or do I need to do something specific in my implementation of IMobileObject in order for this to happen correctly?
  2. As we have chosen to put the new functionality in a new assembly, I will need to automatically register an additional assembly as a build-time reference, in the same way as the reference to Csla.Analyzers is added. This is presumably built into the Nuget package somehow, as Csla.Analyzers doesn't seem to be referenced in a consuming assembly after installation of the Csla package, yet the analyzers run automatically. If anyone can point me to the right place to do this, it would be appreciated; is there a nuspec file that needs to be updated to add this additional reference?

TheCakeMonster avatar May 23 '21 22:05 TheCakeMonster

Regarding the CSLA-specific questions:

  1. Yes, MobileFormatter already handles all the complexity of serialization/deserialization as long as all the objects in the graph implement IMobileObject so they participate in the process.
  2. Right now all the nuget package generation is handled by nuspec files in the /csla/nuget/definitions directory; which is nice because it is centralized and controlled; which is a PITA because a lot of the automation that comes from relying on the csproj files doesn't exist; but I haven't gone through the process of changing (yet)

General questions:

  1. I think the end game would be to build CSLA itself with the generator in place, so it would generate the strongly typed onget/onset/ongetchildren/onsetchildren methods in the base classes too - rather than the hand-crafted implementations that exist today; in the short term I think we should have a more modest target focused on user-created business classes
    1. Right now the FieldDataManager implements the get/set code for managed backing fields, and it is this code that would need to not run if a subclass was serializing its own field data - at least I think that's how it would work 😉
    2. Maybe the attribute should be Csla.GenerateSerializationAttribute or something like that? [GenerateSerialization]
  2. See above
  3. I don't think you need to worry about that level of detail if the generated code implements IMobileObject, because that allows MobileFormatter to do the heavy lifting - and it already does it correctly and with some nice optimizations
  4. This is an open issue in general: #1628 because it currently isn't supported by MobileFormatter; so whatever solution we come up with should address that issue plus the new code-gen solution; I rather like [NonSerialized] - but applied to the property instead of the field
  5. I'm not sure of the scenario here? Managed backing fields are always serialized today (as per number 4), and private backing fields are never serialized automatically
    1. We could change this behavior in the long run - automatically serializing all fields declared in a class unless they are marked with NonSerialized - kind of what's proposed via reflection in #2148
    2. This would be a nice step toward eliminating the hand-crafted get/set state methods in the CSLA base classes as I mentioned in number 1 above

rockfordlhotka avatar May 25 '21 19:05 rockfordlhotka

@rockfordlhotka Thanks for the responses, that is very helpful.

The answers to the CSLA-specific questions give me everything I need, so we can focus all of our attention on the general questions from this point on.

Here are my thoughts on your responses to those general questions:

  1. It is interesting that you would want to target the base classes as well. FieldDataManager seems to do a really good job, so if it doesn't use reflection during serialization (and my informed guess was that it didn't) I wasn't sure there would be much benefit to changing its implementation. However, I agree that even if we do it, this should be a long term aspiration rather than be our focus at the moment.

My intention with this task was to provide serialization for classes that do not inherit from the base classes. A POCO that needed to be passed via a Command object is the specific scenario I encountered myself, as I was writing a list of log entries to a database in a batch.

  1. I'd be happy with another attribute name, although including the word Generate does perhaps limit its scope to this usage scenario. I was wondering whether a naming convention that would work in more, future, scenarios, was what we wanted. If we come up with a new serialization scheme in the future, might it be nice that people not need to change the attributes on their classes? Having said that, maybe any future scheme would be opt in as well, so maybe this consideration is ill-conceived?

  2. With my question I was trying to tease out all of the detail of the rules for what should be included and what should be excluded. It seemed easier and more intuitive for a developer to know it was the same as some other, industry-standard scenario - and less for us to type in defining what we want. However, we don't have to mimic anything else if we think we know what the rules are. A default behaviour of including public members and excluding private ones makes sense to me, but this itself perhaps conveys insufficient detail. What about properties with a scope of internal? Are there restrictions on whether getters and setters are both of the same scope? There will need to be a setter for the generator to create valid code, but a setter of any scope would work I think, for a partial class.

I will be generating the implementation of IMobileObject and users will not be able to change that implementation - the output from a source generator cannot be edited. For that reason, we need to give developers clear guidelines on what they must do for their objects to be compatible with the new scheme. Being able to refer to the documentation of another scheme seemed like a useful additional benefit.

  1. I'd be comfortable using (indeed I would very much like to use) [NonSerialized] if there were an equivalent for the opposite - for requesting inclusion of private fields and/or properties. I think the logical opposite (covered by item 5) is both appropriate and necessary, but I'll cover that next.

  2. This is about customisation of the serialization behaviour. I can see (and have personally encountered) scenarios where people will need to customise serialization. We could require them to manually implement IMobileObject entirely if they encounter any such scenario, thereby not concerning ourselves with them. However, since it seems to be relatively simple to do, I thought allowing them to customise things using attributes would be a nice feature.

Here are some scenarios I have encountered where manually including private members into serialization could be necessary:

i. An object exposes a public property that only has a getter, but whose underlying value changes based on the calling of one or more methods. Take for example the state of an object whose lifecycle is to be very carefully controlled, and implemented using a private backing field. In this scenario you would have to mark the public property as excluded and the private backing field as included in serialization instead. ii. An object that exposes a read-only property of the time and date it was created, implemented using a private, read-only backing field. iii. If an object must carry with it a timestamp, row version or other value for managing concurrency; this must not be editable and should probably not even be publicly exposed at all, but must be serialized and carried as part of the object. This one might make less sense, in that it suggests an editable object, and editable objects would probably always benefit from inheriting from a base class, and making use of all of the other functionality that FieldDataManager offers. iv. An object which uses a private list to manage deleted items may need customisation. However, this is different in the sense that this should affect the behaviour of the OnGetChildren method rather than OnGet. As I haven't implemented that method yet, this might somehow be less relevant than I am currently imagining, as I have no code to refer to. The scenario also implies it is likely to be an editable object, so this might also make less sense than the first two scenarios.

In short, I am imagining the need to manually request inclusion of one or more private fields or properties in the generated serialization code. The code created by a source generator cannot be edited, so if customisation is to be possible, we must provide a mechanism to achieve it in what we build. I intended to do this straight away; whilst we could delay it for later, it fitted in my head as part of the same development task for better efficiency.

TheCakeMonster avatar May 26 '21 11:05 TheCakeMonster

I've run a test this morning, and confirmed a suspicion I'd had - that the backing fields of auto-implemented properties are not available in a source generator. Source generators use the code as it exists as their source of information, and auto-implemented properties are a compiler trick (which is likely to happen after source generators have run) that therefore makes the backing field unavailable to us in a source generator.

Serializing a class that has auto-implemented properties can't be done by serializing only fields.

BinaryFormatter's serialization (which we are trying to mimic) happens at runtime, by which point the backing fields have been created by the compiler.

At the moment, it's not entirely clear to me how you would identify a property as being auto-implemented, so treating these differently from other properties might be hard. I'm not saying that it's not possible to identify auto-implemented properties, just that I don't know how to do so just yet.

TheCakeMonster avatar Jun 24 '21 09:06 TheCakeMonster

Ah, OK. It looks like identifying auto-implemented properties might not be too difficult; PropertyDeclarationSyntax exposes a list of accessors for a property, and auto-implemented properties have accessors whose bodies are null. Properties whose get accessor is null is one way to identify them, although maybe there being 2 accessors and the bodies of both being null might be the most accurate mechanism of all.

TheCakeMonster avatar Jun 24 '21 12:06 TheCakeMonster

The underlying writers and whether they support typed writing is relevant to the discussion of avoiding boxing and unboxing of values types. Here is a list of the writers, and what support they have for typed writing:

CslaLegacyBinaryWriter and CslaBinaryWriter use BinaryWriter: ushort, uint, ulong, sbyte, bool, byte, short, int, long, decimal, float, double, char, char[], byte[] (in summary: fairly comprehensive) CslaXmlBinaryWriter uses XmlDictionaryWriter: bool[], int[], long[], float[], double[], decimal[], Guid[], TimeSpan[], DateTime[] (in summary: a bit patchy, and all for arrays; does this negate the benefit?) CslaXmlWriter uses XmlWriter: bool, int, long, decimal, float, double, DateTime, DataTimeOffset (in summary: quite limited)

There is some support in the underlying writers for typed serialization, but it's somewhat incomplete. There are advantages to be had for some of the most common value types in most circumstances.

TheCakeMonster avatar Jun 24 '21 13:06 TheCakeMonster

Does anybody faced issue with "losing" Kind of DateTime object after serialization?? Before serialization had Utc kind, but after Unspecified...

serialization_comaprison

BaHXeLiSiHg avatar Jun 29 '21 14:06 BaHXeLiSiHg