json-ld.net icon indicating copy to clipboard operation
json-ld.net copied to clipboard

Support dependency-free API

Open goofballLogic opened this issue 3 years ago • 16 comments

The current API requires the use of Newtonsoft.Json in order to pass data into and out of the API. It would be good to provide a dependency-free core implementation (acting on either strings or simple objects), but also provide a drop-in version of the package which supports the existing Newtonsoft.Json signature.

goofballLogic avatar Jul 05 '20 11:07 goofballLogic

Proposed design here: https://github.com/linked-data-dotnet/json-ld.net/issues/59#issuecomment-653392194

goofballLogic avatar Jul 05 '20 11:07 goofballLogic

#66 seems particularly relevant to this.

asbjornu avatar Jul 05 '20 12:07 asbjornu

I think the best way to proceed with this might be to extract a namespace (either as a project within this repository, or as a distinct repository + nuget package) and to then evolve the existing code to call through to the underlying dependency-free version.

At the moment the namespace seems to be split e.g. JsonLD.Core, JsonLD.Util, JsonLD.Impl etc. The nuget package is named "json-ld.net"

How about extracting the logic of JsonLD.Core into a separate repository and nuget package json-ld.core.net?

In order to retain backwards compatibility with existing users of json-ld.net we would have to preserve (or extend) the existing signature of the objects currently in the JsonLD.Core namespace. This also implies (I think?) that we couldn't simply reuse JsonLD.Core as the default namespace for the new json-ld.core.net repository, so we'd have to do something like JsonLD.Core.Logic or JsonLD.Core.Base which is a bit painful...

goofballLogic avatar Jul 11 '20 12:07 goofballLogic

+1 to the namespace split, but I'm not sure I agree with the namespaces themselves. I'm also wary to the idea of separating things into several repositories, as that makes maintenance of the different components harder.

Although I think Core makes sense as the name of an assembly, having it in a namespace does more harm than good since it doesn't allow for natural and automatic namespace scoping. Instead of everything living inside a sub-namespace of JsonLD. having a reference "upwards" to JsonLD.*, every sub-namespaced class needs using JsonLD.Core if the core namespace is JsonLD.Core and not just simply JsonLD.

I'm also not a fan of Util, Helper, etc.. If we can avoid such classes and namespaces, that would be a good thing. The .Impl namespace feels very Java-ish to me, so it would be good to get rid of that as well.

And while we're speaking of naming, perhaps the word JSON shouldn't be in the namespace at all? While the library can still be named json-ld.net, I think both namespace and assembly wise, we can go for the more serialization agnostic LinkedData and then have LinkedData.Newtonsoft for JSON.NET serialization and LinkedData.System.Text.Json (or something thereabouts) for System.Text.Json serialization.

Thoughts?

asbjornu avatar Jul 13 '20 21:07 asbjornu

I agree with all of your sentiments, but I was trying to stick to the principle of evolving the package forward, rather than doing rewrites (which to me implies breaking changes).

What I'd like to see is a way for us to pull the dependency-free parts out of the existing code without breaking the interface established by the package at present. Once we are happy with the new dependency-free logic, we can then introduce a more sane naming standard in parallel to the existing one, and then finally deprecate the existing one over time.

So, like this:

  1. Refactor the code, exposing a dependency-free API but retaining backwards interface compatibility (minor version change) ~ 6 months
  2. Introduce new naming convention, but not yet deprecating the old one (major version change) ~ 9 months
  3. Deprecate the old naming conventions and interface (major version change) ~ 18 months

goofballLogic avatar Jul 14 '20 09:07 goofballLogic

Ok, thanks for clarifying. 💯 to your proposed plan.

asbjornu avatar Jul 14 '20 10:07 asbjornu

I have a crazy idea for a dependency-free internal JSON representation that's trivially compatible with Newtonsoft.Json, System.Text.Json, IDictionary<string,object> and any other reasonable JSON DOM. I'll try to whip up a concrete proposal in the next week or so.

sblom avatar Oct 07 '20 03:10 sblom

thanks @sblom i'm very much looking forward to us achieving this (via your idea or any other). I've been chomping at the bit to do this myself but stopped when I saw the state of the tests, public API and code generally. So, I'm tackling those first and will come back to this when we have a solid foundation. Interested to see your ideas.

goofballLogic avatar Oct 09 '20 14:10 goofballLogic

Just wanted to ping this issue--I'm still working on this. I have an 80% complete prototype with an approach that I'm 95% happy with, but there are definitely some bumps that I'll want to chat through with y'all.

In the meanwhile, here's a very quick overview of the approach.

Overview

Nothing fancy happens in the algorithm implementations that strictly requires anything more than IOrderedDictionary<string,object>, and IList<object>. (I'll call this simplistic Perl-esque data model the JavaScript Data Model or JSDM.) However, using a wrapper like the Json.NET type JToken gives us some conveniences such as a polymorphic indexer (JToken this[object key]), as well as collection helpers and type inspection/conversion helpers that enable DOM-style navigation of a JSDM.

My approach is to generalize the design of JToken (and its subclasses JObject, JArray, etc.) to provide a GenericJsonToken wrapper for any JDSM-style model, whether the ubiquitous generic interfaces mentioned above, JToken, or any other JSON libraries with existing JSDM wrappers.

Internally, we will operate on GenericJsonToken values, but at API boundaries we can provide a shim that will encapsulate a JSDM in a GenericJsonToken wrapper, and another shim that can Unwrap() back to the source data type. We can also provide helpers that will get from string to a GenericJsonToken wrapping some dependency-free JSDT model, and certainly round-trip back to a string.

One reason I'm motivated to find some interface other than simply string-in-string-out is that, in my experience consuming JSON-LD.NET in the NuGet Gallery, there are many cases where we needed to do several operations on the data and would rather "stay inside" some representation, thus avoiding repeated parsing and string building.

I figure we can give consuming developers interfaces that let them keep a GenericJsonToken or JToken around for intermediate activities. That is, I'm okay with GenericJsonToken or JsonLdToken(?) being public types that we expose and support. The goal won't be parity with JToken, and in fact the outermost public wrapper could be an opaque Token that only supports Unwrap() to get the internal representation.

sblom avatar Nov 14 '20 20:11 sblom

I like the sound of that, @sblom! 👍 to JsonLdToken and a 👎 to Unwrap() since it will only make the object we unwrap to a part of our external API interface. People will expect that when they perform Unwrap() and do all sorts of crazy manipulation on the "inside", that manipulation is going to continue working in the future. That locks us into a corner and will make it impossible to change from JSON.NET to System.Text.Json without bumping major version, or to change our mapping and interpretation of the internal JToken to our exposed JsonLdToken.

asbjornu avatar Nov 14 '20 22:11 asbjornu

For my part I think we should use a simple Dictionary and List as the internal representation, and provide string, Json.Net and System.Text.Json as external representations that the library can work with.

I'm nervous about the Unwrap functionality - would like to see an example of client code which would use this facility to understand how much of a negative impact this would have on developer ergonomics.

The existing interface is to my mind quite bloated already (https://user-images.githubusercontent.com/910448/94048322-3f469480-fdcb-11ea-9fb1-b9582a25224e.png). Things like RDFDataset, RDFDatasetUtils, JsonLdError, UniqueNamer et al. complicate our external interface with custom types which I think we should avoid.

For this library to be competitive, I think we need to make the core library as lean as humanly possible, and then wrap adapters around it, rather than adding types into the core.

goofballLogic avatar Nov 15 '20 17:11 goofballLogic

Definitely on board with supporting fewer public types. TBH, I don't know what real use cases of this library look like well enough to have an intuition for what types we SHOULD support. Do we know any real usage (other than nuget.org) that we can use to guide us here?

I'm sure we don't want to require our users to completely de-/re-serialize potentially large documents into some intermediate format that we dictate if they already have data in a JSON DOM of a supportable flavor lying around. That's many megabytes and also many hundreds of milliseconds. Speaking from personal experience, nuget.org would have resisted this library if it cost too many megabytes or milliseconds, and it seems we generally want server developers to feel like we meet their data where it already stands.

My prototype now has almost all of the conformance tests passing with JsonLdToken wrappers for both Json.NET and JSDM (and I have very high confidence it could be adapted to nearly any other rational JSON lib). I have a JSDM-targeting System.Text.Json deserializer that can help users on .NET Core 3.1 get to/from strings with no dependencies.

(I don't expect anyone to be wholly persuaded of anything until you can put your hands on my prototype, and I know y'all don't know me well enough to trust me when I say "this is shaping up pretty well", but it is. ;) )

IMHO, we should continue to internally use JsonLdToken--it gives us good internal ergonomics, and keeps our intermediate state very close to the the form that the consuming developer will want at the end. As it stands, the final Unwrap() is just returning a private field.

Also, I have a design for good package-based support for wrapping/unwrapping into JsonLdToken to make it essentially transparent--I'll move on to a demonstration of that once I have a preview of my prototype in a readable state to discuss. (I expect a big burst of progress on this in the second half of December, but it's coming along.)

sblom avatar Dec 06 '20 02:12 sblom

@tpluscode, aren't you using jsonld.net? Can you provide some assistance to understand the requirements from a consuming application?

asbjornu avatar Dec 08 '20 14:12 asbjornu

I think your design idea and mine are heading in two rather different directions @sblom . I suggest that I stop my work if you are going ahead.

FWIW, I am a real-life user using this library in production applications.

goofballLogic avatar Dec 11 '20 15:12 goofballLogic

I suggest that I stop my work if you are going ahead.

@goofballLogic It's definitely premature to put all our chips on my not-quite-finished proof-of-concept.

I promise I'm not trying to steal your wind--I just have an idea I'm excited about for which I'm building a demo. Once we can look at my code, we can compare and contrast its ergonomics and performance with a strings-everywhere approach. We'll probably find some downsides in addition to the upsides, and I don't claim to represent any perspective other than my own in the trade-offs discussion that we'll need to have.

sblom avatar Dec 13 '20 03:12 sblom

This issue has been automatically marked as stale because it has not had recent activity. After 30 days from now, it will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 15 '21 06:03 stale[bot]