GSoC 2025: Comprehensive JSON Schema linting for encouraging best practices and catching anti-patterns early
Brief Description
Writing well-crafted schemas is extremely hard. Not only is JSON Schema a complex schema language in which it is easy to shoot yourself in the foot, but as an organisation we have never properly encoded and shared what the best practices and anti-patterns are. As a consequence, our users (including users of API specifications like OpenAPI and AsyncAPI) don't know how to write great schemas, and don't even know the quality of the schemas they already have.
I kickstarted some of this work in my open-source tooling (https://github.com/sourcemeta/jsonschema/blob/main/docs/lint.markdown) and would like to take it to the next level.
Expected Outcomes
- Reach agreement on what the anti-patterns and best practices are across dialects, and document them in the official JSON Schema organisation as a style guide / advice document on the JSON Schema website. There, we can publish each rule with a stable URL so that any linter can link to it when reporting failures, saving all of us the time of explaining the rules in each tool 😅
- Extend popular open-source tooling made by TSC members and endorsed by the org (like the Sourcemeta JSON Schema CLI, but potentially others too) to encode the aforementioned anti-patterns and best practices in a runnable form
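To make the two outcomes above concrete, here is a minimal Python sketch of what "a rule with a stable URL, encoded in a runnable form" could look like. Everything in it is an assumption for illustration: the base URL, the rule id `single-value-enum`, and the `Rule`, `walk`, and `lint` names are hypothetical and are not taken from the Sourcemeta CLI or any agreed style guide. The walker only descends into a few common applicators, so it is nowhere near a complete traversal.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

# Hypothetical base URL: where the style guide would live is not decided here.
RULE_BASE_URL = "https://example.com/json-schema-lint/rules/"


@dataclass(frozen=True)
class Rule:
    rule_id: str
    message: str
    matches: Callable[[dict], bool]

    @property
    def url(self) -> str:
        # Stable link a linter can print next to each finding.
        return RULE_BASE_URL + self.rule_id


# Illustrative rule: an `enum` with exactly one value can be written as `const`.
RULES = [
    Rule(
        "single-value-enum",
        "An `enum` with a single value can be expressed as `const`",
        lambda s: isinstance(s.get("enum"), list) and len(s["enum"]) == 1,
    ),
]


def walk(schema, pointer: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (JSON Pointer, subschema) pairs for a few common applicators."""
    if not isinstance(schema, dict):
        return  # boolean schemas and malformed values are skipped
    yield pointer, schema
    properties = schema.get("properties", {})
    if isinstance(properties, dict):
        for name, sub in properties.items():
            escaped = name.replace("~", "~0").replace("/", "~1")
            yield from walk(sub, f"{pointer}/properties/{escaped}")
    if "items" in schema:
        yield from walk(schema["items"], f"{pointer}/items")
    for keyword in ("allOf", "anyOf", "oneOf"):
        subschemas = schema.get(keyword, [])
        if isinstance(subschemas, list):
            for index, sub in enumerate(subschemas):
                yield from walk(sub, f"{pointer}/{keyword}/{index}")


def lint(schema: dict) -> list[dict]:
    """Return one finding per (location, rule) match."""
    return [
        {"rule": rule.rule_id, "pointer": ptr, "message": rule.message, "url": rule.url}
        for ptr, sub in walk(schema)
        for rule in RULES
        if rule.matches(sub)
    ]


example = {
    "type": "object",
    "properties": {
        "kind": {"enum": ["user"]},
        "age": {"type": "integer"},
    },
}
findings = lint(example)
for finding in findings:
    print(f'{finding["pointer"]}: {finding["message"]} ({finding["url"]})')
```

Because every finding carries the rule's URL, a tool only needs to report the location and link out for the explanation, which is the time saving described in the first outcome.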
Skills Required
- Strong communication skills, as the first point might involve talking to lots of people in the community and driving discussions to reach agreement.
- A good understanding of JSON Schema, enough to appreciate and understand what the discovered anti-patterns and best practices would be. Of course, we will mentor a lot in this area.
- A good understanding of the programming language(s) used in the endorsed open-source linting tooling we will extend. The Sourcemeta JSON Schema CLI is written in C++ (it also has a library form that could then be used by other projects).
Mentors
- @jviotti
- @Relequestual
Expected Difficulty
Medium.
Expected Time Commitment
350 hours. We expect most of the time to be spent driving discussions to reach agreement. Doing so often takes a lot of time!
Great idea Juan, thanks for your submission.
Hi Juan, have you seen the work done by @gregsdennis here:
https://github.com/json-schema-org/json-schema-linting
Ah, very nice @benjagm. We should definitely take that as inspiration. Cool stuff. @gregsdennis Is it something you are actively working on?
No. I started some Spectral stuff a year or so ago, but never got beyond a start. Feel free to overwrite what I've done.
Adding myself as a mentor as discussed with @jviotti =]
Hi @jviotti and @Relequestual, I've been learning JSON Schema from the official docs for the past few weeks and now have a good understanding of it. I've also explored various linting methods, including the Sourcemeta JSON Schema Linting Guide and Json Schema Linting, to understand existing approaches. I'm interested in contributing to this issue for GSoC 2025 under your guidance if this project gets selected. To align with the project's needs, I'd love to get involved in community discussions on various linting methods and best practices. I'm also open to guidance on the most critical anti-patterns to focus on. Looking forward to your thoughts 🙂
Please also see the prior work at
- https://github.com/json-schema-org/json-schema-spec/issues/1079
- https://github.com/orgs/json-schema-org/discussions/323
- ...and the other issues linked from those.
Hi everyone, JSON Schema is now officially on the GSoC 2025 organisations list, and I would like to work on this project with the mentors. I have read the necessary documents as well as the other reference documents and resources shared by the mentors. Looking forward to your response.
Hi @jdesrosiers,
I’m super excited about the opportunity to work on a JavaScript library that transforms standard JSON Schema (draft-2019-09) validation outputs into clear, human-friendly error messages for GSoC 2025! I’ve got a strong grasp of JavaScript (e.g., crafting modular libraries with Node.js) and have been diving into JSON Schema through tools like @hyperjump/json-schema. I love the idea of making technical outputs more accessible.
Inspired by libraries like better-ajv-errors, I’d like to propose a solution that delivers concise messages, supports language packs for multilingual use, and offers customization options. To get started, here’s how I’d approach it:
- Parse the Output: Study the draft-2019-09 error format and write a utility to extract key details (e.g., instancePath, schemaPath, message).
- Message Templates: Create a default set of human-friendly templates (e.g., “Value at /age must be a number, got string”) with fallback handling.
- Language Packs: Design a simple system to load JSON-based language files (e.g., en.json, fr.json) for easy i18n support.
- Customization: Add an API for users to override messages or define custom ones via a config object.
- Testing & Publishing: Test against @hyperjump/json-schema and other implementations, then package it for npm with a solid README.
I’m planning to draft a full GSoC proposal soon—would you be willing to share feedback on it? I’d also love to hear your thoughts on these steps or any specific priorities you’d like to emphasize. If it helps, I can whip up a quick prototype to showcase the concept.
Can’t wait to collaborate with you and the Hyperjump team—this feels like a perfect fit for my skills and passion! Thanks for considering me.
Here is a proposal for the qualification task for this project (cc @Relequestual we can discuss it here in the open if you have better ideas!)
- Spend some time of your choosing looking around the JSON Schema Slack, StackOverflow, and other sources where people are asking questions about JSON Schema
- Based on that data, create a PDF report of the top 10 linting rules you would want to see us standardise and implement, including a brief explanation, an example, and why you picked it
The point here is not to have a comprehensive list, but to show us that:
- You have the ability to digest a large amount of information (Slack / StackOverflow)
- You have the ability to turn diffuse, noisy information (e.g. from user conversations) into an understanding of what the users were trying to do and why the recommended solution helped (i.e. whether it was a best practice or an anti-pattern)
- You have enough understanding of JSON Schema to judge whether something mentioned indeed looks like a best practice or not, and why
- You have enough understanding of JSON Schema to be able to prioritise which should be the top linting rules in your opinion
(@GANESHSHARMA1, it looks like you posted in the wrong issue 😄. If you want to repost in the right place, I'd be happy to respond.)
> Here is a proposal for the qualification task for this project (cc @Relequestual we can discuss it here in the open if you have better ideas!) - @jviotti
I think this, plus:
- Work out whether, and how, 1-5 of these can be detected using JSON Schema
- Propose an approach on how to autofix the same linting violations
- Give each of the rules a level (error, warn, info), and justify your choice
Too much?
Also, do we want the rule definitions to be JSON based as opposed to code based, and then for there to be an engine to process the rules? Using JSON Schema may not be faster than pure code, but it will likely still be faster using Blaze, and will allow the rulesets to be interoperable.
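As a rough illustration of the ideas in this comment (JSON-based rule definitions, a level per rule, and an autofix), here is a small Python sketch. It is built on assumptions: the rule file layout (`id`, `level`, `match`), the rule id, and the `warn` level are hypothetical, and `matches` is a deliberately tiny evaluator for a handful of JSON Schema keywords, standing in for a real engine such as Blaze, whose API is not used here. How fixes should be encoded is left open, so the fix is plain code keyed by rule id.

```python
import json

# Hypothetical rule file: each rule's `match` is itself a JSON Schema that
# describes schemas exhibiting the anti-pattern.
RULES = json.loads("""
[
  {
    "id": "single-value-enum",
    "level": "warn",
    "match": {
      "required": ["enum"],
      "properties": {
        "enum": {"type": "array", "minItems": 1, "maxItems": 1}
      }
    }
  }
]
""")

SUPPORTED = {"type", "required", "properties", "minItems", "maxItems"}


def matches(pattern: dict, instance) -> bool:
    """Evaluate a tiny subset of JSON Schema; a real engine would replace this."""
    unsupported = set(pattern) - SUPPORTED
    if unsupported:
        raise ValueError(f"unsupported keywords: {sorted(unsupported)}")
    expected = pattern.get("type")
    if expected not in (None, "array", "object"):
        raise ValueError(f"unsupported type: {expected}")
    if expected == "array" and not isinstance(instance, list):
        return False
    if expected == "object" and not isinstance(instance, dict):
        return False
    if isinstance(instance, dict):
        # `required` and `properties` only constrain objects.
        if any(key not in instance for key in pattern.get("required", [])):
            return False
        for key, sub in pattern.get("properties", {}).items():
            if key in instance and not matches(sub, instance[key]):
                return False
    if isinstance(instance, list):
        # `minItems` and `maxItems` only constrain arrays.
        if len(instance) < pattern.get("minItems", 0):
            return False
        if "maxItems" in pattern and len(instance) > pattern["maxItems"]:
            return False
    return True


def check(schema: dict) -> list[tuple[str, str]]:
    """Return (rule id, level) for every rule whose pattern matches the schema."""
    return [(r["id"], r["level"]) for r in RULES if matches(r["match"], schema)]


def fix_single_value_enum(schema: dict) -> dict:
    """Rewrite `enum: [x]` as `const: x`, leaving the input untouched."""
    if "const" in schema:
        return dict(schema)  # do not overwrite an existing `const`
    fixed = {k: v for k, v in schema.items() if k != "enum"}
    fixed["const"] = schema["enum"][0]
    return fixed


# Fixes stay in code here; only detection is expressed as JSON.
FIXES = {"single-value-enum": fix_single_value_enum}

schema = {"title": "Kind", "enum": ["user"]}
for rule_id, level in check(schema):
    schema = FIXES[rule_id](schema)
print(schema)
```

Keeping detection in JSON is what would make rulesets portable between tools; the fix step is the part that is harder to express declaratively, which is one reason it is kept as code in this sketch.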
Hello @jviotti @Relequestual, I've been an active contributor in the JSON Schema organisation, have tried out and contributed to Sourcemeta repositories like core and jsonschema (a CLI wrapper on Blaze and core), and have created a VS Code extension using jsonschema-cli. Based on this conversation, once I'm done with my mid-semester exams I plan to work on my open PRs, add new features to the CLI, and update the extension with better linting as I finish adding --lint to the json command. I have been learning JSON Schema from the tour and https://www.learnjsonschema.com/. I've started creating the doc with the 10 linting rules I think must be present. Could you please give an update on what the final qualification task is?
@Relequestual
> Also, do we want the rule definitions to be JSON based as opposed to code based, and then for there to be an engine to process the rules? Using JSON Schema may not be faster than pure code, but it will likely still be faster using Blaze, and will allow the rulesets to be interoperable.
We can try to prototype this during GSoC as a bonus and see where we end up, in parallel to researching and collecting the actual set of linting rules?
Really excited to contribute to this! Can't wait to get started! 🚀
@jviotti @Relequestual I am looking forward to working on this project, though I am relatively new to JSON Schema. Could you point me to some sources for learning JSON Schema best practices? Is there any existing linting documentation available?
Hello @ShivamBisen, you can learn JSON Schema from https://tour.json-schema.org/ and https://www.learnjsonschema.com/2020-12/. After that, write some schemas yourself.
Closing this as done.