Reasons and Solutions for Deserialization Failures of Data Persisted as JSON When Representing State with Types in C#

"JsonDerivedType" is a System.Text.Json attribute in C# that can be attached to a parent interface to declare which types implement that interface, allowing state to be represented by types.

By using this method, you can define the data type with the original parent interface, save it to JSON with that structure, and also deserialize it from JSON.

In C#, you can also use pattern matching to branch on the concrete type behind the parent interface. Instead of implementing every function on the parent interface, you define behavior on the derived types, so that a given function can only be executed in the states where it is valid in the first place.

The disadvantage of this method compared to ADTs in pure functional languages is that the set of types implementing an interface is not closed, so the compiler cannot check exhaustiveness when pattern matching and cannot confirm that every case is covered.

A union type feature has recently been proposed for the C# language. If it is completed, we will be able to use unions and enjoy the power of totality checking in switch cases.
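The trade-off described above can be sketched as follows. This is a minimal illustration, not code from Sekiban: IOrderState, DraftOrder, PaidOrder and OrderDescriptions are hypothetical names chosen for the example. Because any other type could implement the interface, the compiler cannot prove that the switch is exhaustive, so a discard arm is added by hand.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Serialize through the interface type so that the "$type" discriminator is written.
IOrderState original = new PaidOrder("order-1", 1200m);
string json = JsonSerializer.Serialize(original);
Console.WriteLine(json);

// Deserialize through the interface; the discriminator selects the concrete record.
var restored = JsonSerializer.Deserialize<IOrderState>(json);
if (restored is not null)
{
    Console.WriteLine(OrderDescriptions.Describe(restored));
}

public static class OrderDescriptions
{
    public static string Describe(IOrderState state) => state switch
    {
        DraftOrder => "draft",
        PaidOrder paid => $"paid {paid.Amount} for {paid.OrderId}",
        // The interface is open, so exhaustiveness cannot be verified by the compiler.
        // The discard arm covers implementations that might be added later.
        _ => throw new InvalidOperationException($"Unhandled state: {state.GetType().Name}")
    };
}

// Hypothetical state types; the string argument is the discriminator written to "$type".
[JsonDerivedType(typeof(DraftOrder), nameof(DraftOrder))]
[JsonDerivedType(typeof(PaidOrder), nameof(PaidOrder))]
public interface IOrderState { }

public record DraftOrder : IOrderState { }
public record PaidOrder(string OrderId, decimal Amount) : IOrderState;
```

If a third implementation of IOrderState is added and the switch is not updated, the mistake only surfaces at runtime through the discard arm, which is exactly the gap that a closed union type would remove.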

For example, in the case of a shopping cart:

  • Shopping cart during item addition

  • Purchased but unsettled shopping cart

  • Paid shopping cart

  • Shopping cart in shipping

  • Shopping cart received

You can change the type for each state like this.

In that case, define the functions that can be performed for each type.

  • Shopping cart during item addition

    • Add an item to the cart

    • Remove an item from the cart

    • Specify shipping destination and payment destination

    • Purchase confirmation process

  • Shopping cart confirmed for purchase, not yet settled

    • Credit settlement process

  • Paid shopping cart

    • Shipping process

  • Shopping cart in transit

    • Cancellation request

  • Shopping cart received

    • Return shipping label creation process

By restricting each process in this way to prevent errors, it is possible to reduce the likelihood of executing incorrect processing due to bugs.
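As a rough sketch of this idea, the first few cart states could be modeled as separate records, with each operation accepting only the state it is valid for. All names here (ICart, EditingCart, ConfirmedCart, PaidCart, CartOperations) are hypothetical and chosen for illustration; serialization attributes are left out to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Each transition accepts only the state it is valid for and returns the next state.
EditingCart editing = CartOperations.AddItem(new EditingCart(Array.Empty<string>()), "keyboard");
ConfirmedCart confirmed = CartOperations.Confirm(editing, "Tokyo");
PaidCart paid = CartOperations.Settle(confirmed, "payment-001");
Console.WriteLine($"{paid.Items.Count} item(s) paid, shipping to {paid.ShippingAddress}");
// prints: 1 item(s) paid, shipping to Tokyo

// CartOperations.AddItem(paid, "mouse"); // does not compile: a PaidCart is not an EditingCart

public interface ICart { }
public record EditingCart(IReadOnlyList<string> Items) : ICart;
public record ConfirmedCart(IReadOnlyList<string> Items, string ShippingAddress) : ICart;
public record PaidCart(IReadOnlyList<string> Items, string ShippingAddress, string PaymentId) : ICart;

public static class CartOperations
{
    // Items can only be added or removed while the cart is still being edited.
    public static EditingCart AddItem(EditingCart cart, string item) =>
        cart with { Items = cart.Items.Append(item).ToList() };

    public static EditingCart RemoveItem(EditingCart cart, string item) =>
        cart with { Items = cart.Items.Where(i => i != item).ToList() };

    // Purchase confirmation turns an editing cart into a confirmed, unsettled cart.
    public static ConfirmedCart Confirm(EditingCart cart, string shippingAddress) =>
        new(cart.Items, shippingAddress);

    // Settlement is only possible for a confirmed cart.
    public static PaidCart Settle(ConfirmedCart cart, string paymentId) =>
        new(cart.Items, cart.ShippingAddress, paymentId);
}
```

The commented-out call shows the point of the design: adding an item to a paid cart is rejected by the compiler rather than by a runtime check.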

At J-Tech Japan, we develop and operate Sekiban, an open-source framework that makes it easy to perform event sourcing in C#.

Recently, we added support for PostgreSQL. PostgreSQL can be easily run in a local environment, making it ideal as a database during development. PostgreSQL has a JSONB column type, which allows storing JSON data in a binary format to improve searchability.

In event sourcing, all types of events are stored in a single Events table, so each payload is stored in JSON format to keep the data flexible. The JSONB column mentioned above was used for that purpose. In actual event sourcing, event data within the payload is not used as a search condition to retrieve events, but I thought it was a convenient and good feature, so I used it in the initial version. (Little did I know that this would cause bugs later on.)

Deserialization issues that occurred in Postgres

After the Postgres event store was released, I switched the local execution of the solution we are building to Postgres, so that it no longer needs to connect to Azure Cosmos DB and other services. Running the tests against a local database also improved local development speed, and I was very satisfied.

While adding many types of events and testing during development, I encountered the following error.

Deserialization of interface types is not supported. Type 'Project.Domain.ValueObjects.Accumulations.IAccumulationTarget'.

It says that deserialization failed. Normally, interface types cannot be deserialized, but in this case, since JsonDerivedType is configured, it should be possible to deserialize.

Here, the aggregation target is defined as an interface, and the domain is defined using the interface and JsonDerivedType. The code looks like this.

public record MonthlyAccumulationRegistered(
    IAccumulationTarget AccumulationTarget,
    YearMonth YearMonth) : IEventPayload<MonthlyAccumulation, MonthlyAccumulationRegistered>
{
...
}

[JsonDerivedType(typeof(BranchAccumulationTarget), nameof(BranchAccumulationTarget))]
[JsonDerivedType(typeof(CompanyAccumulationTarget), nameof(CompanyAccumulationTarget))]
public interface IAccumulationTarget
{
    public string GetAccumulationTargetKey();
    public string GetAccumulationTargetName();
    public IArea GetArea();
}

public record CompanyAccumulationTarget(OtherDomainId OtherDomainId, CompanyId LogiCompanyId, CompanyName Name, IArea TargetArea)
    : IAccumulationTarget
{
...
}

[JsonDerivedType(typeof(JapanesePrefecture), nameof(JapanesePrefecture))]
[JsonDerivedType(typeof(JapaneseZipCode), nameof(JapaneseZipCode))]
[JsonDerivedType(typeof(UnspecifiedArea), nameof(UnspecifiedArea))]
public interface IArea : ILocation
{
    public string GetName();
    public PrefectureValues? GetPrefectureValues();
}

public record JapanesePrefecture(
    [property: Range(1, 47)]
    [property: Required]
    PrefectureValues Value) : IArea
{
...
}

MonthlyAccumulationRegistered is a type that is saved as JSONB, and it has a property of type IAccumulationTarget, which is actually a CompanyAccumulationTarget. Furthermore, inside this actual type, there is another property of type IArea, and its actual type is JapanesePrefecture.

By modeling in this way, it is possible to define multiple types of aggregation targets and also define areas by postal code or by prefecture.
However, the failure to deserialize this data has raised many questions.

  • Is nesting of JsonDerivedType not allowed?

  • Are there cases where deserialization fails, such as when Japanese characters are included in the JSON?

  • Is it really impossible to realize the programming style of defining state with types in C# after all?

I became very anxious that what I had been working on for the past year might not work after all: expressing complex domains with types by defining state as interfaces, implementing those interfaces with multiple concrete types, and then serializing and deserializing that data as JSON.

However, I calmed down and reasoned that it had worked with Cosmos DB, so I wrote various tests to verify the behavior and successfully found the problem.

Compatibility issue between Postgres's JSONB column and JsonDerivedType

Through various verifications, I found the following.

The JSONB column type in Postgres stores JSON in a decomposed binary form and does not preserve the order of object keys, so properties may come back in a different order from the one in which they were saved.

Also, JsonDerivedType in C#'s System.Text.Json saves the class identifier in a property called $type, and it appears that, by default, $type must be the first property of the object for deserialization to succeed. (I want to find the source in the documentation.)

The Microsoft Learn documentation on polymorphic serialization describes how to persist polymorphic objects in JSON, but I don't think it states that the type identifier must be first...

In other words, the following JSON will fail to deserialize.



{
    "YearMonth": {
        "Year": 2024,
        "Month": 1
    },
    "AccumulationTarget": {
       "Name": {
            "Name": "株式会社 テスト会社名",
            "NameKana": "テスト",
            "ShortName": "テスト"
        },
        "$type": "CompanyAccumulationTarget",
        "TargetArea": {
            "$type": "JapaneseZipCode",
            "Value": "2440004"
        },
        "CompanyId": {
            "Value": "f7867c89-774d-3bcb-32c8-be17c173abb9"
        },
        "OtherDomainId": {
            "Value": "f7867c89-774d-3bcb-32c8-be17c173abb9"
        }
    }
}

However, the following will deserialize successfully. You can see that the "$type" property has moved to the beginning of the AccumulationTarget object.

{
    "YearMonth": {
        "Year": 2024,
        "Month": 1
    },
    "AccumulationTarget": {
        "$type": "CompanyAccumulationTarget",
       "Name": {
            "Name": "株式会社 テスト会社名",
            "NameKana": "テスト",
            "ShortName": "テスト"
        },
        "TargetArea": {
            "$type": "JapaneseZipCode",
            "Value": "2440004"
        },
        "CompanyId": {
            "Value": "f7867c89-774d-3bcb-32c8-be17c173abb9"
        },
        "OtherDomainId": {
            "Value": "f7867c89-774d-3bcb-32c8-be17c173abb9"
        }
    }
}

Presumably, the identifier is required at the beginning for performance reasons: when deserializing a large JSON object, the deserializer would otherwise have to read ahead through the object to find the type before it could start building the instance.

Normally, when deserializing an object that was serialized with System.Text.Json as is, this causes no problem. This time, however, the properties were reordered within the JSONB column when the data was saved to Postgres, and deserialization failed.
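The ordering behavior can be checked without a database. The following is a minimal sketch using simplified, hypothetical types (ITarget, CompanyTarget, BranchTarget, OrderingDemo) rather than the domain types above, and default JsonSerializerOptions. It reports the exception type instead of assuming one, since the exact exception may differ between runtime versions.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// "$type" is the first property: the shape System.Text.Json itself writes.
const string typeFirst = "{\"$type\":\"CompanyTarget\",\"Name\":\"Test\"}";
// "$type" comes after another property, as can happen after a JSONB round trip.
const string typeLater = "{\"Name\":\"Test\",\"$type\":\"CompanyTarget\"}";

Console.WriteLine(OrderingDemo.TryRead(typeFirst));
Console.WriteLine(OrderingDemo.TryRead(typeLater));

public static class OrderingDemo
{
    // Returns the concrete type name on success, or "failed: <exception type>" on failure.
    public static string TryRead(string json)
    {
        try
        {
            var target = JsonSerializer.Deserialize<ITarget>(json);
            return target is null ? "null" : target.GetType().Name;
        }
        catch (Exception ex)
        {
            return "failed: " + ex.GetType().Name;
        }
    }
}

// Hypothetical, simplified stand-ins for the accumulation target types.
[JsonDerivedType(typeof(CompanyTarget), nameof(CompanyTarget))]
[JsonDerivedType(typeof(BranchTarget), nameof(BranchTarget))]
public interface ITarget { }

public record CompanyTarget(string Name) : ITarget;
public record BranchTarget(string Name) : ITarget;
```

With default options, the first call should resolve to CompanyTarget and the second should fail. On .NET 9 and later, JsonSerializerOptions.AllowOutOfOrderMetadataProperties is intended to relax this requirement; it is not used here so that the sketch also compiles on .NET 7 and 8.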

Solution

The JSON column type in Postgres stores the received JSON text as is. Therefore, I confirmed that using a JSON column instead of a JSONB column resolves the problem.

With this fix, you can use JsonDerivedType in Sekiban to persist and restore interfaces as JSON, allowing you to represent domains using types and to express event sourcing in Cosmos DB, DynamoDB, and PostgreSQL.
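When the column type cannot be changed, one possible alternative is to normalize the JSON after reading it, so that "$type" comes first in every object before the text is handed to the deserializer. The helper below is only a sketch of that idea under stated assumptions: TypeFirstNormalizer is a hypothetical name, it is not part of Sekiban or of the fix described above, and it reparses every payload, which has a cost.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

string reordered =
    "{\"Name\":\"Test\",\"$type\":\"CompanyTarget\",\"Area\":{\"Value\":\"2440004\",\"$type\":\"ZipCode\"}}";
Console.WriteLine(TypeFirstNormalizer.Normalize(reordered));

public static class TypeFirstNormalizer
{
    // Rewrites the JSON so that "$type" is the first property of every object.
    public static string Normalize(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            Write(document.RootElement, writer);
        } // disposing the writer flushes it into the stream
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    private static void Write(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                // Emit the discriminator first when the object has one.
                if (element.TryGetProperty("$type", out JsonElement discriminator))
                {
                    writer.WritePropertyName("$type");
                    discriminator.WriteTo(writer);
                }
                foreach (JsonProperty property in element.EnumerateObject())
                {
                    if (property.NameEquals("$type")) continue; // already written
                    writer.WritePropertyName(property.Name);
                    Write(property.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                writer.WriteStartArray();
                foreach (JsonElement item in element.EnumerateArray())
                {
                    Write(item, writer);
                }
                writer.WriteEndArray();
                break;
            default:
                // Strings, numbers, booleans and null are copied unchanged.
                element.WriteTo(writer);
                break;
        }
    }
}
```

Non-ASCII characters may be written as \u escapes by the default writer settings, which changes the text but not the deserialized values. Switching to a JSON column remains the simpler fix when search on the payload is not needed.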

Summary

By specifying JsonDerivedType for an interface and restricting types in this manner, you can simplify the representation of complex data within a domain and implement features such as switching operations based on state transitions.

This time I learned that, with serialization using JsonDerivedType, the $type type identifier must be placed at the beginning of the object. This knowledge is also useful in cases other than Postgres, such as when sending typed data from the frontend, so I thought it would be good to remember. I will continue to explore various ways to easily express efficient programming.

Written by

Tomohisa Takaoka

CTO of J-Tech Creations, Inc. Recently working on the development of the event sourcing and CQRS framework Sekiban. Enthusiast of DIY keyboards and trackballs.