jackson-core icon indicating copy to clipboard operation
jackson-core copied to clipboard

Include forward slash ('/') as character to escape by default in String values

Open cowtowncoder opened this issue 5 years ago • 2 comments

Jackson 2.x only escapes minimum set of characters, as defined by JSON specification. This does not include forward slash character ('/'). But while legal, it turns out that more often than not users do want escaping, to guard against potential inclusion-in-HTML problems, particularly for embedded JSON constants in Javascript sources, in script tags.

Now: although it is possible to enable escaping already (via CharacterEscapes), it is bit verbose, and also adds some measurable (not huge, but not completely trivial) overhead. So for 3.0 let's add this character as escape-by-default, but also add a simple mechanism for turning that off if feasible (JsonWriteFeature, most likely?).

cowtowncoder avatar Jan 15 '19 04:01 cowtowncoder

I just encountered this difference in a project migrating PHP code to Kotlin. I'm not sure it should be enabled by default, as it makes URLs in Json objects hard to read, but it should be a feature that can easily be enabled. As you have stated, the main reason is to prevent the injection of closing script tags.

Here is an implementation of CharacterEscapes for Java & Kotlin: https://stackoverflow.com/questions/6817520/escape-forward-slash-in-jackson/73770950#73770950

MrBuddyCasino avatar Sep 15 '22 11:09 MrBuddyCasino

Right, the setting probably should be disabled by default for Jackson 2.x for backwards-compatibility. But possibly enabled for 3.0 (master).

But first, it'd need to be implemented; something that is surprisingly bit more complicated than I thought (at least to do it efficiently, and across generator implementations).

cowtowncoder avatar Sep 15 '22 15:09 cowtowncoder

cc @JooHyukKim This is something that would be nice to implement -- basically JsonWriteFeature to force escaping of /, without requiring full CharacterEscapes but also not changing current default behavior.

cowtowncoder avatar Jan 08 '24 19:01 cowtowncoder

WDYT of usage like below? @MrBuddyCasino @cowtowncoder

    @Test
    public void testEscapeForwardSlash() throws Exception
    {
        // Given
        Writer jsonWriter = new StringWriter();
        JsonGenerator generator = JsonFactory.builder()
                .enable(JsonWriteFeature.ESCAPE_FORWARD_SLASHES)
                .build()
                .createGenerator(jsonWriter);

        // When
        generator.writeStartObject(); // start object
        generator.writeStringField("url", "http://example.com");
        generator.writeEndObject(); // end object
        generator.close();

        // Then
        assertEquals("{\"url\":\"http:\\/\\/example.com\"}", jsonWriter.toString());
    }

JooHyukKim avatar Jan 09 '24 05:01 JooHyukKim

I understand the concern about backward compatibility w/o a major version change. However, the proposal to configure every new instance of ObjectMapper to get the desired behavior is challenging in practice. Because new ObjectMapper() is public it is hard to enforce consistent configuration.

I would propose a way to register a static method to register a Factory for the relevant configuration class. I'm not sure if this should be a Factory of JsonFactory or a lower level configuration class (CharacterEscapes?). This would be a backward compatible mechanism and generally useful beyond the specific issue of escaping the '/' character.

e.g.

JasonFactory registerDefaultJsonFactory(Factory<JsonFactory> f)

labkey-matthewb avatar Jan 10 '24 18:01 labkey-matthewb

I am generally against any static, JVM-wide configuration, due to Jackson's usage as an embedded library. This means that trying to force changes for your app/framework/library's needs can easily break other usage.

But there is precedence for doing something like this with StreamReadConstraints / StreamWriteConstraints, wherein ultimate defaults can be changed but can then still be overridden by explicit settings. This could be considered in this case.

Although TBH, I am not sure I see this as necessarily warranting such change: use case of producing JSON serializations for embedding in HTML is an existing but dominating use case.

Either way, the first step would be to implement JsonWriteFeature, after which possible defaulting could be considered (including implementation technique, something along lines suggested).

cowtowncoder avatar Jan 10 '24 18:01 cowtowncoder

@JooHyukKim yes, exactly, that'd be way to do it. Question is that of how to change encoding tables to add minimal/no overhead.

cowtowncoder avatar Jan 12 '24 02:01 cowtowncoder

Looking at JsonGeneratorImpl subtypes -- WriterBasedJsonGenerator and UTF8JsonGenerator -- the way to go is to figure out how to use alternate output encoding table _outputEscapes. Care needs to be taken as these are shared arrays, but the thing to do is just to make sure index for / (0x2F) forces ecaping; and that's it really.

This would be quite simple if it wasn't for existing setting of custom quote character, for which encoding table is changed.

cowtowncoder avatar Jan 12 '24 04:01 cowtowncoder

Wrote #1197 draft. Seems oddly straight-forward. Could you PTAL at what I am missing 🤔, @cowtowncoder ?

JooHyukKim avatar Jan 21 '24 04:01 JooHyukKim

This is great, and cleans up my code quite a bit. So thank you. Is this request considered closed for 3.0? Or is the option of changing the default to include '/' still on the table?

labkey-matthewb avatar Jan 23 '24 00:01 labkey-matthewb

So for 3.0 let's add this character as escape-by-default, but also add a simple mechanism for turning that off if feasible (JsonWriteFeature, most likely?).

Seems like original plan was to enable by default (currently disabled in 2.x) @labkey-matthewb.

JooHyukKim avatar Jan 23 '24 01:01 JooHyukKim

I'll create follow-up ticket to change the default for 3.0, and yes agree with @JooHyukKim .

cowtowncoder avatar Jan 23 '24 01:01 cowtowncoder