Bug in nanoFramework.Json: First character of string value is lost when it starts with \u00XX

Open ababere opened this issue 4 months ago • 9 comments

Library/API/IoT binding

nanoFramework.Json

Visual Studio version

VS2022

.NET nanoFramework extension version

2022.14.1.2

Target name(s)

ESP32-S3-WROOM

Firmware version

N/A

Device capabilities

No response

Description

When deserializing a JSON string using JsonConvert.DeserializeObject in .NET nanoFramework, the first character of a string value is dropped when it immediately follows a Unicode escape sequence \u00XX at the beginning of the value (e.g., the 5 in \u00225, the 1 in \u00311, the 2 in \u00412).


Example

Input JSON (valid, with escaping):

{\u0022Product\u0022:\u00225 test\u0022,\u0022Count\u0022:10}

Equivalent without escaping:

{"Product":"5 test","Count":10}

Target class:

public class Item
{
    public string Product { get; set; }
    public int Count { get; set; }
}

Deserialization code:

var obj = JsonConvert.DeserializeObject(json, typeof(Item)) as Item;
Debug.WriteLine(obj.Product);

Expected Result

5 test

Actual Result

 test

The character 5 (from \u00225) is completely lost.


Conditions

  • The string value starts with \u00XX.
  • \u00XX represents an ASCII character (digits 0-9, letters A-F, etc.).
  • The escaping is valid and compliant with the JSON standard.

Root Cause

The nanoFramework.Json parser:

  • Processes \uXXXX sequences before completing string literal parsing.
  • Treats \u00XX at the start of a value as a separate token.
  • Discards the decoded character if it is the first one in the string.

This violates the JSON standard (RFC 8259), which requires \uXXXX inside strings to be properly decoded.


Impact

  • Data loss when receiving JSON from external APIs, CMS, or cloud services.
  • Inability to parse valid but escaped JSON.
  • Critical for IoT devices where JSON is the primary data exchange format.

Recommendation

  • Fix the parser in nanoFramework.Json to correctly handle \uXXXX inside string values.
  • Add unit tests for Unicode escapes at the start of strings.
  • Until fixed, always apply preprocessing when receiving escaped JSON.
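
The preprocessing mentioned in the last bullet could look roughly like the sketch below. It rewrites each \uXXXX escape in the raw JSON text before the text is handed to the parser: \u0022 becomes \", \u005C becomes \\, control characters keep their original escape, and everything else becomes the literal character. JsonEscapePreprocessor, NormalizeUnicodeEscapes, ReadHex4, and HexValue are hypothetical names, not part of nanoFramework.Json, and the sketch assumes the parser handles the short escapes correctly, which has not been verified here.

```csharp
using System;
using System.Text;

// Hypothetical workaround helper; not part of nanoFramework.Json.
public static class JsonEscapePreprocessor
{
    // Returns the value of one hex digit, or -1 if the character is not a hex digit.
    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Reads exactly four hex digits starting at 'start'; returns -1 if they are not all present.
    private static int ReadHex4(string s, int start)
    {
        if (start + 4 > s.Length) return -1;
        int code = 0;
        for (int k = 0; k < 4; k++)
        {
            int h = HexValue(s[start + k]);
            if (h < 0) return -1;
            code = (code << 4) | h;
        }
        return code;
    }

    // Rewrites \uXXXX escapes in raw JSON text so the parser no longer sees them.
    public static string NormalizeUnicodeEscapes(string json)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < json.Length)
        {
            char c = json[i];
            if (c != '\\' || i + 1 >= json.Length)
            {
                sb.Append(c);
                i++;
                continue;
            }

            char next = json[i + 1];
            int code = next == 'u' ? ReadHex4(json, i + 2) : -1;
            if (code < 0)
            {
                // Any other escape (including an escaped backslash): copy both characters untouched.
                sb.Append(c);
                sb.Append(next);
                i += 2;
                continue;
            }

            if (code == '"') sb.Append("\\\"");                   // keep the quote escaped
            else if (code == '\\') sb.Append("\\\\");             // keep the backslash escaped
            else if (code < 0x20) sb.Append(json.Substring(i, 6)); // control chars stay as \uXXXX
            else sb.Append((char)code);                           // everything else becomes literal
            i += 6;
        }
        return sb.ToString();
    }
}
```

Calling NormalizeUnicodeEscapes on the raw payload before JsonConvert.DeserializeObject leaves no \u0022 in front of a digit for the parser to misread; an escaped backslash followed by u0022 is copied unchanged.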

Status: Critical Bug
Priority: High
Repository: nanoframework/nanoFramework.Json

How to reproduce

No response

Expected behaviour

No response

Screenshots

No response

Sample project or code

No response

Additional information

No response

ababere avatar Oct 24 '25 08:10 ababere

This issue doesn't seem to come from the Json NuGet package but rather from the native side:

Image

And the test result is green with the virtual device:

Image

Ellerbach avatar Oct 24 '25 09:10 Ellerbach

It's green as well on an ESP32:

Image

Ellerbach avatar Oct 24 '25 09:10 Ellerbach

Excuse me, it looks like there is an error in the provided line: {\u0022Product\u0022:\u00225 test\u0022,\u0022Count\u0022:10}. It does not match the expected {"Product":"5 test","Count":10}.

After decoding, the required " symbol is missing, so the JSON is invalid. Ideally, deserialization should not succeed at all; it should fail with an error.

RelaxSpirit avatar Oct 27 '25 12:10 RelaxSpirit

After adjusting the test to output the string, it shows properly:

//
// Copyright (c) .NET Foundation and Contributors
// See LICENSE file in the project root for full license information.
//

using nanoFramework.Json.Test.Shared;
using nanoFramework.TestFramework;
using System;

namespace nanoFramework.Json.Test
{
    [TestClass]
    public class CharacterEncoding
    {
        [TestMethod]
        public void Test_u00xx_encoding()
        {
            // arrange
            var json = "{\u0022CompanyName\u0022:\u00225 test\u0022,\u0022CompanyID\u0022:10}";
            Console.WriteLine(json);

            // act
            JsonTestCompany jsonTestCompany = (JsonTestCompany)JsonConvert.DeserializeObject(json, typeof(JsonTestCompany));

            // assert
            Assert.AreEqual("5 test", jsonTestCompany.CompanyName);
        }
    }
}
Image

Ellerbach avatar Oct 27 '25 16:10 Ellerbach

I will clarify the task.

I receive a request through MQTT, and I get a Message from it. If I only use the Result for testing (currently commented out), the tests pass. However, if I process the main Message and get a responseRpc.Result, the tests fail.

using nanoFramework.TestFramework;
using System.Diagnostics;
using nanoFramework.Json;

namespace UTestVodomatLib
{
    [TestClass]
    public class UTestVodomatLib
    {
        [TestMethod]
        public void Test_u00xx_encoding()
        {
            var message = "{\"CorrelationId\":\"48c972d2-787c-4151-96ab-70eb26cd21b8\",\"Status\":1,\"Result\":\"{\\u0022ProductId\\u0022:1,\\u0022Name\\u0022:\\u0022A 5-liter bottle\\u0022,\\u0022Description\\u0022:\\u00225 liters\\u0022,\\u0022Price\\u0022:100,\\u0022LiterCount\\u0022:5}\"}";
            var responseRpc = JsonConvert.DeserializeObject(message, typeof(RpcResponse)) as RpcResponse;
            var currentProduct = JsonConvert.DeserializeObject(responseRpc.Result, typeof(Product)) as Product;

            //            var result = "{\u0022ProductId\u0022:1,\u0022Name\u0022:\u0022A 5-liter bottle\u0022,\u0022Description\u0022:\u00225 liters\u0022,\u0022Price\u0022:100,\u0022LiterCount\u0022:5}";
            //var currentProduct = JsonConvert.DeserializeObject(result, typeof(Product)) as Product;

            Assert.AreEqual("5 liters", currentProduct.Description);
            Assert.AreEqual("A 5-liter bottle", currentProduct.Name);
        }
    }
}
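
The test above references RpcResponse and Product, which are not shown in this thread. To reproduce it, classes along the following lines are probably needed. Their shape is only inferred from the field names in the JSON payload; the property types in particular (for example int for Price) are assumptions.

```csharp
// Assumed shapes, inferred from the JSON in the test; the real classes may differ.
public class RpcResponse
{
    public string CorrelationId { get; set; }
    public int Status { get; set; }

    // Holds the nested JSON document as a string.
    public string Result { get; set; }
}

public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int Price { get; set; }
    public int LiterCount { get; set; }
}
```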

Image

ababere avatar Oct 27 '25 17:10 ababere

Are you certain that the value in responseRpc.Result is identical to the value you have manually input to the result string?

Try changing your test to this:

[TestMethod]
public void Test_u00xx_encoding()
{
	var message = "{\"CorrelationId\":\"48c972d2-787c-4151-96ab-70eb26cd21b8\",\"Status\":1,\"Result\":\"{\\u0022ProductId\\u0022:1,\\u0022Name\\u0022:\\u0022A 5-liter bottle\\u0022,\\u0022Description\\u0022:\\u00225 liters\\u0022,\\u0022Price\\u0022:100,\\u0022LiterCount\\u0022:5}\"}";
	var responseRpc = JsonConvert.DeserializeObject(message, typeof(RpcResponse)) as RpcResponse;
	var currentProduct = JsonConvert.DeserializeObject(responseRpc.Result, typeof(Product)) as Product;

	var result = "{\u0022ProductId\u0022:1,\u0022Name\u0022:\u0022A 5-liter bottle\u0022,\u0022Description\u0022:\u00225 liters\u0022,\u0022Price\u0022:100,\u0022LiterCount\u0022:5}";
	//var currentProduct = JsonConvert.DeserializeObject(result, typeof(Product)) as Product;

	Assert.AreEqual(responseRpc.Result, result);
	Assert.AreEqual("5 liters", currentProduct.Description);
	Assert.AreEqual("A 5-liter bottle", currentProduct.Name);
}

CoryCharlton avatar Oct 27 '25 23:10 CoryCharlton

After the first call, the Result field contains an incorrect value:

            var responseRpc = JsonConvert.DeserializeObject(message, typeof(RpcResponse)) as RpcResponse;
            Debug.WriteLine($"Field result:{responseRpc.Result}");

Field result:{"ProductId":1,"Name":"A 5-liter bottle","Description":" liters","Price":100,"LiterCount":5}

Description = " liters"

The "Description" field should have the value "5 liters", but it has " liters"

ababere avatar Oct 28 '25 13:10 ababere

@ababere I finally managed to reproduce the issue.

It's coming from the manual Unicode decoding logic here: https://github.com/nanoframework/nanoFramework.Json/blob/bdbad0dc65f2c0f9a6e609c53ab716044756c29c/nanoFramework.Json/JsonConvert.cs#L1133

Somewhere, when parsing the 00XX part, it's taking all the digits that follow instead of just four. If you are willing to dig into this, that's where the issue is!
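
For anyone digging in, the behaviour to aim for is a decoder that consumes exactly four hex digits after \u and then continues with the next character. The following is a minimal sketch in plain C#, not the library's actual code; UnicodeEscapeSketch, HexValue, and Unescape are hypothetical names, and the input is assumed to be the text between the quotes of a JSON string literal.

```csharp
using System;
using System.Text;

// Illustrative only; not the nanoFramework.Json implementation.
public static class UnicodeEscapeSketch
{
    // Returns the value of one hex digit, or -1 if the character is not a hex digit.
    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Decodes the body of a JSON string literal (the text between the quotes).
    public static string Unescape(string s)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < s.Length)
        {
            char c = s[i++];
            if (c != '\\')
            {
                sb.Append(c);
                continue;
            }

            if (i >= s.Length) throw new FormatException("Dangling backslash");
            char e = s[i++];
            switch (e)
            {
                case '"': sb.Append('"'); break;
                case '\\': sb.Append('\\'); break;
                case '/': sb.Append('/'); break;
                case 'b': sb.Append('\b'); break;
                case 'f': sb.Append('\f'); break;
                case 'n': sb.Append('\n'); break;
                case 'r': sb.Append('\r'); break;
                case 't': sb.Append('\t'); break;
                case 'u':
                    if (i + 4 > s.Length) throw new FormatException("Truncated \\u escape");
                    int code = 0;
                    for (int k = 0; k < 4; k++)
                    {
                        int h = HexValue(s[i + k]);
                        if (h < 0) throw new FormatException("Bad hex digit in \\u escape");
                        code = (code << 4) | h;
                    }
                    i += 4; // consume exactly four digits, never more
                    sb.Append((char)code);
                    break;
                default:
                    throw new FormatException("Unknown escape");
            }
        }
        return sb.ToString();
    }
}
```

With this approach, the input \u00225 liters\u0022 decodes to "5 liters" including the quotes, because the 5 is left for the next loop iteration instead of being read as part of the escape.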

Ellerbach avatar Oct 30 '25 09:10 Ellerbach

I have added a draft PR with a fix generated by GitHub Copilot. Hopefully it is helpful (though it needs testing).

networkfusion avatar Oct 30 '25 15:10 networkfusion