GSoC 2025: Comprehensive JSON Schema linting for encouraging best practices and catching anti-patterns early

Open jviotti opened this issue 11 months ago • 17 comments

Brief Description

Writing well-crafted schemas is extremely hard. Not only is JSON Schema a complex schema language in which it is easy to shoot yourself in the foot, but as an organisation, we have never properly encoded and shared what the best practices and anti-patterns are. As a consequence, our users (including users of API specifications like OpenAPI and AsyncAPI) don't know how to write great schemas, and don't even know how good the schemas they already have are.

I kickstarted some of this work in my open-source tooling (https://github.com/sourcemeta/jsonschema/blob/main/docs/lint.markdown) and would like to take it to the next level.

Expected Outcomes

  • Reach agreement on what the anti-patterns and best practices are across dialects, documenting these in the official JSON Schema organisation as a style guide / advice document on the JSON Schema website. There, we can publish each rule with a stable URL so any linter out there can link to it when reporting failures, saving all of us the time of explaining the rules in each tool 😅
  • Extend popular open-source tooling made by TSC members and endorsed by the org (like the Sourcemeta JSON Schema CLI, but potentially others too) to encode the aforementioned anti-patterns and best practices in a runnable form
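
To make the "rule with a stable URL" idea concrete, here is a minimal Python sketch of how a linter might attach such a URL to each finding. Everything in it is an assumption for illustration: the base URL, the `Rule` class, the two rule names, and the traversal (which only descends into `properties` and `items`) are hypothetical and are not taken from the Sourcemeta CLI or any agreed rule set.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

# Hypothetical base URL: the thread does not say where rules would be published.
RULE_BASE_URL = "https://example.org/lint/"


@dataclass(frozen=True)
class Rule:
    name: str                        # stable identifier, also used as the URL slug
    message: str                     # short explanation shown to the user
    applies: Callable[[dict], bool]  # True when the subschema violates the rule

    @property
    def url(self) -> str:
        return RULE_BASE_URL + self.name


# Illustrative rules only; names and wording are not an agreed standard.
RULES = [
    Rule(
        "enum-with-single-value",
        "an `enum` with one value can be written as `const`",
        lambda s: isinstance(s.get("enum"), list) and len(s["enum"]) == 1,
    ),
    Rule(
        "const-with-type",
        "`type` is redundant next to `const`",
        lambda s: "const" in s and "type" in s,
    ),
]


def subschemas(schema: object, pointer: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (pointer, subschema) pairs, descending only into `properties` and `items`."""
    if not isinstance(schema, dict):
        return
    yield pointer, schema
    properties = schema.get("properties")
    if isinstance(properties, dict):
        for key, child in properties.items():
            # Property names are not JSON Pointer escaped in this sketch.
            yield from subschemas(child, f"{pointer}/properties/{key}")
    yield from subschemas(schema.get("items"), f"{pointer}/items")


def lint(schema: dict) -> list[str]:
    """Return one line per violation, each ending with the rule's stable URL."""
    return [
        f"{pointer or '(root)'}: {rule.message} ({rule.url})"
        for pointer, subschema in subschemas(schema)
        for rule in RULES
        if rule.applies(subschema)
    ]
```

Because the URL is derived from the rule name, renaming a rule would break links, which is one reason the names would need to be agreed on before publishing.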

Skills Required

  • Strong communication skills, as the first point might involve talking to lots of people in the community and driving discussions to reach agreement.
  • A good understanding of JSON Schema, enough to appreciate and understand what the discovered anti-patterns and best practices would be. Of course, we will mentor a lot in this area.
  • A good understanding of the programming language(s) used in the endorsed open-source linting tooling we will extend. The Sourcemeta JSON Schema CLI is written in C++ (it also has a library form that could then be used by other projects).

Mentors

  • @jviotti
  • @Relequestual

Expected Difficulty

Medium.

Expected Time Commitment

350 hours. We expect most of the time to be spent driving discussions to reach agreement. Doing so often takes a lot of time!

jviotti avatar Jan 15 '25 14:01 jviotti

Great idea Juan, thanks for your submission.

Honyii avatar Jan 15 '25 20:01 Honyii

Hi Juan, have you seen the work done by @gregsdennis here:

https://github.com/json-schema-org/json-schema-linting

benjagm avatar Jan 18 '25 11:01 benjagm

Ah, very nice @benjagm. We should definitely take that as inspiration. Cool stuff. @gregsdennis Is it something you are actively working on?

jviotti avatar Jan 20 '25 13:01 jviotti

No. I started some Spectral stuff a year or so ago, but never got beyond a start. Feel free to overwrite what I've done.

gregsdennis avatar Jan 20 '25 17:01 gregsdennis

Adding myself as a mentor as discussed with @jviotti =]

Relequestual avatar Jan 24 '25 14:01 Relequestual

Hi @jviotti and @Relequestual, I've been learning JSON Schema from the official docs for the past few weeks and now have a good understanding of it. I've also explored various linting methods, including the Sourcemeta JSON Schema Linting Guide and Json Schema Linting, to understand existing approaches. I'm interested in contributing to this issue for GSoC 2025 under your guidance if this project gets selected. To align with the project's needs, I'd love to get involved in community discussions on various linting methods and best practices. I'm also open to guidance on the most critical anti-patterns to focus on. Looking forward to your thoughts 🙂.

cbum-dev avatar Jan 31 '25 04:01 cbum-dev

Please also see the prior work at

  • https://github.com/json-schema-org/json-schema-spec/issues/1079
  • https://github.com/orgs/json-schema-org/discussions/323
  • ..and the other issues linked from those.

karenetheridge avatar Feb 07 '25 00:02 karenetheridge

Hi everyone, JSON Schema is now officially on the GSoC 2025 organisations list, and I want to work on this project with the respected mentors. I have read the necessary documents, as well as the other reference documents and resources shared by the mentors. Looking forward to a response.

saurabhraghuvanshii avatar Feb 27 '25 19:02 saurabhraghuvanshii

Hi @jdesrosiers,

I’m super excited about the opportunity to work on a JavaScript library that transforms standard JSON Schema (draft-2019-09) validation outputs into clear, human-friendly error messages for GSoC 2025! I’ve got a strong grasp of JavaScript (e.g., crafting modular libraries with Node.js) and have been diving into JSON Schema through tools like @hyperjump/json-schema. I love the idea of making technical outputs more accessible.

Inspired by libraries like better-ajv-errors, I’d like to propose a solution that delivers concise messages, supports language packs for multilingual use, and offers customization options. To get started, here’s how I’d approach it:

  1. Parse the Output: Study the draft-2019-09 error format and write a utility to extract key details (e.g., instancePath, schemaPath, message).
  2. Message Templates: Create a default set of human-friendly templates (e.g., “Value at /age must be a number, got string”) with fallback handling.
  3. Language Packs: Design a simple system to load JSON-based language files (e.g., en.json, fr.json) for easy i18n support.
  4. Customization: Add an API for users to override messages or define custom ones via a config object.
  5. Testing & Publishing: Test against @hyperjump/json-schema and other implementations, then package it for npm with a solid README.
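
Steps 1 to 4 above can be sketched roughly as follows. The sketch is in Python for brevity even though the proposal targets JavaScript, and it assumes the draft 2019-09 "basic" output shape, whose error units carry `keywordLocation`, `instanceLocation`, and `error`. The language packs, template wording, and the `humanize` and `resolve` helpers are hypothetical names, and the pointer lookup assumes `keywordLocation` does not pass through a `$ref`.

```python
# Hypothetical language packs: locale -> failing keyword -> message template.
LANGUAGE_PACKS = {
    "en": {"type": "Value at {instance} must be of type {expected}"},
    "fr": {"type": "La valeur à {instance} doit être de type {expected}"},
}

# Used when the active pack has no template for the failing keyword.
FALLBACK = "Value at {instance} is invalid ({keyword})"


def resolve(document, pointer: str):
    """Follow a JSON Pointer (RFC 6901) through nested dicts and lists."""
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        document = document[int(token)] if isinstance(document, list) else document[token]
    return document


def humanize(output: dict, schema: dict, locale: str = "en", overrides=None) -> list[str]:
    """Turn basic-format error units into readable messages.

    `overrides` maps keywords to custom templates and wins over the language pack.
    """
    templates = {**LANGUAGE_PACKS.get(locale, LANGUAGE_PACKS["en"]), **(overrides or {})}
    messages = []
    for unit in output.get("errors", []):
        # The last pointer segment of keywordLocation is the failing keyword.
        keyword = unit["keywordLocation"].rsplit("/", 1)[-1]
        template = templates.get(keyword, FALLBACK)
        messages.append(
            template.format(
                instance=unit["instanceLocation"] or "(root)",
                keyword=keyword,
                expected=resolve(schema, unit["keywordLocation"]),
            )
        )
    return messages
```

Keeping the templates as plain data is what would let the language packs live in separate JSON files such as `en.json` and `fr.json`.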

I’m planning to draft a full GSoC proposal soon—would you be willing to share feedback on it? I’d also love to hear your thoughts on these steps or any specific priorities you’d like to emphasize. If it helps, I can whip up a quick prototype to showcase the concept.

Can’t wait to collaborate with you and the Hyperjump team—this feels like a perfect fit for my skills and passion! Thanks for considering me.

GANESHSHARMA1 avatar Feb 28 '25 12:02 GANESHSHARMA1

Here is a proposal for the qualification task for this project (cc @Relequestual we can discuss it here in the open if you have better ideas!)

  • Spend some time of your choosing looking around the JSON Schema Slack, StackOverflow, and other sources where people are asking questions about JSON Schema
  • Based on that data, create a PDF report of the top 10 linting rules you would want to see us standardise and implement, including a brief explanation, an example, and why you picked it

The point here is not to have a comprehensive list, but show us that:

  • You have the ability to digest a large amount of information (Slack / StackOverflow)
  • You have the ability to turn diffuse, noisy information (e.g. from user conversations) into an understanding of what the users were trying to do and why the recommended solution helped (i.e. whether it reflects a best practice or an anti-pattern)
  • You have enough understanding of JSON Schema to judge whether something mentioned indeed looks like a best practice or not, and why.
  • You have enough understanding of JSON Schema to be able to prioritise which should be the top linting rules in your opinion

jviotti avatar Feb 28 '25 19:02 jviotti

(@GANESHSHARMA1, it looks like you posted in the wrong issue 😄. If you want to repost in the right place, I'd be happy to respond.)

jdesrosiers avatar Mar 03 '25 04:03 jdesrosiers

Here is a proposal for the qualification task for this project (cc @Relequestual we can discuss it here in the open if you have better ideas!) - @jviotti

I think this, plus:

  • Determine whether, and how, 1-5 of these can be detected using JSON Schema
  • Propose an approach on how to autofix the same linting violations
  • Give each of the rules a level (error, warn, info), and justify your choice

Too much?

Also, do we want the rule definitions to be JSON based as opposed to code based, and then for there to be an engine to process the rules? Using JSON Schema may not be faster than pure code, but it will likely still be faster using Blaze, and will allow the rulesets to be interoperable.
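
As a rough illustration of a JSON-based rule definition with a level, a detection pattern written as JSON Schema, and an autofix, consider the Python sketch below. The rule format, its field names (`name`, `level`, `match`), and the helper functions are all hypothetical. The `matches` function evaluates only a tiny subset of JSON Schema (`type` for objects and arrays, `required`, `properties`, `minItems`, `maxItems`) as a stand-in for a real evaluator such as Blaze, and the fix is kept as code rather than expressed in JSON.

```python
import copy
import json

# Hypothetical rule format: `match` is itself a JSON Schema describing the anti-pattern.
RULE = json.loads("""
{
  "name": "enum-with-single-value",
  "level": "warn",
  "match": {
    "type": "object",
    "required": ["enum"],
    "properties": {
      "enum": {"type": "array", "minItems": 1, "maxItems": 1}
    }
  }
}
""")


def matches(pattern: dict, instance: object) -> bool:
    """Evaluate a small JSON Schema subset; unsupported keywords are ignored."""
    expected = pattern.get("type")
    if expected == "object" and not isinstance(instance, dict):
        return False
    if expected == "array" and not isinstance(instance, list):
        return False
    if isinstance(instance, dict):
        if any(key not in instance for key in pattern.get("required", [])):
            return False
        for key, subpattern in pattern.get("properties", {}).items():
            if key in instance and not matches(subpattern, instance[key]):
                return False
    if isinstance(instance, list):
        if len(instance) < pattern.get("minItems", 0):
            return False
        if "maxItems" in pattern and len(instance) > pattern["maxItems"]:
            return False
    return True


def check(rule: dict, schema: dict):
    """Return a (level, name) finding when the rule matches, otherwise None."""
    return (rule["level"], rule["name"]) if matches(rule["match"], schema) else None


def fix_enum_with_single_value(schema: dict) -> dict:
    """Autofix: replace a one-value `enum` with the equivalent `const` (Draft 6 and later)."""
    fixed = copy.deepcopy(schema)
    fixed["const"] = fixed.pop("enum")[0]
    return fixed
```

Since the rule itself is plain JSON, the same definition could in principle be loaded by linters written in other languages, which is the interoperability argument made above; whether autofixes can be made equally portable is left open here.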

Relequestual avatar Mar 03 '25 11:03 Relequestual

Hello @jviotti @Relequestual, I've been an active contributor in the JSON Schema organisation and have also tried and contributed to Sourcemeta repositories like core and jsonschema (a CLI wrapper on Blaze and core), and I created a VS Code extension using jsonschema-cli. Based on this conversation, I plan on working on my open PRs, adding new features to the CLI, and updating the extension with better linting as I complete adding --lint to the json command, once I'm done with my mid-sem exams. I have been learning JSON Schema from the tour and https://www.learnjsonschema.com/. I've started creating the doc with the 10 linting rules I think must be present. Could you please give an update on what the final qualification task is?

Karan-Palan avatar Mar 03 '25 12:03 Karan-Palan

@Relequestual

Also, do we want the rule definitions to be JSON based as opposed to code based, and then for there to be an engine to process the rules? Using JSON Schema may not be faster than pure code, but it will likely still be faster using Blaze, and will allow the rulesets to be interoperable.

We can try to prototype this during GSoC as a bonus and see where we end up, in parallel with researching and collecting the actual set of linting rules?

jviotti avatar Mar 03 '25 12:03 jviotti

Really excited to contribute to this! Can't wait to get started! 🚀

peter-abhinav avatar Mar 03 '25 13:03 peter-abhinav

@jviotti @Relequestual I am looking forward to working on this project, though I am relatively new to JSON Schema. Could you point me to some sources for learning JSON Schema best practices? Is there any existing linting doc available?

shivambisen avatar Mar 07 '25 09:03 shivambisen

Hello @ShivamBisen, you can learn JSON Schema from https://tour.json-schema.org/ and https://www.learnjsonschema.com/2020-12/. After that, write some schemas yourself.

Karan-Palan avatar Mar 07 '25 09:03 Karan-Palan

Hello! :wave:

This issue has been automatically marked as stale due to inactivity :sleeping:

It will be closed in 180 days if no further activity occurs. To keep it active, please add a comment with more details.

There can be many reasons why a specific issue has no activity. The most probable cause is a lack of time, not a lack of interest.

Let us figure out together how to push this issue forward. Connect with us through our Slack channel: https://json-schema.org/slack

Thank you for your patience :heart:

github-actions[bot] avatar Sep 07 '25 01:09 github-actions[bot]

closing this as done

Honyii avatar Sep 29 '25 10:09 Honyii