sql-formatter
sql-formatter copied to clipboard
Explicitly include a single dialect
We are using this library in production in the browser, so bundle sizes matter.
When I analyze my bundle app, the sql-formatter library uses about 443kb, but a large part of this is all the dialetcs:
We only need 1 dialect, so there is no need for the others to be in the bundle.
Would it be possible to have an option to explicitly import a language so that the rest can be tree-shaken from the bundle?
I was thinking, maybe something like this:
import { format } from 'sql-formatter/sqlite';
Yeah, that's definitely a feature that would be nice to have.
I did a quick comparison of the minified bundle sizes:
- with all dialects included: 256KiB
- with only the default "sql" dialect: 77KiB
Possibly a better approach syntax-wise would be to separately export the core engine and language definitions. Something like:
import { format } from 'sql-formatter/core';
import { sqlite } from 'sq;-formatter/sqlite';
format("SELECT *", { language: sqlite });
The language
option already accepts passing in a custom dialect implementation, so that would work out more naturally.
However, currently this feature isn't really a priority. And for the time being, it's the first time such code-splitting has been requested.
To put things in perspective, the node-sql-parser library has the same issue and in its case each language definition is way larger than in sql-formatter.
highlight.js provides this API for this:
import hljs from 'highlight.js/lib/core'
import python from 'highlight.js/lib/languages/python'
hljs.registerLanguage('python', python)
hljs.highlightElement('<some Python in html>', {language: 'python'})
I wouldn't want to pass Formatter objects everywhere for the same reason you didn't make that the API initially. It's tedious to import and pass around an object instead of just a string. If I have a bunch of SQL snippets and I'm storing the SQL dialect with the snippet, if I'm doing anything with those snippets besides passing them to sql-formatter, I'm storing the information "what dialect is this SQL snippet in?" as a string, not as an sql-formatter Formatter object. So that means in my application I'm going to have to have a mapping of languageName -> FormatterObject
somewhere, but it makes much more sense for that mapping to live in the library, not everyone's applications.
I don't really know what to suggest here because obviously it's nicer that the library exports a function instead of an object with a .format()
property. You could probably have the format
function have the registerLanguage
function as a property, like
import { format } from 'sql-formatter/core';
import { sqlite } from 'sq;-formatter/sqlite';
format.registerLanguage("sqlite", sqlite);
format("SELECT *", { language: 'sqlite' });
but even if that's possible it's not a good idea.
Hey, thanks for the highlight.js example.
I agree with your concern that it could be inconvenient to perform this language mapping in code. However, there is a fundamental problem with the way highlight.js has solved this. What if in one part of code you do hljs.registerLanguage('lisp', commonLisp)
and in another part you do hljs.registerLanguage('lisp', scheme)
. And when you then finally call hljs.highlightElement(el, {language: 'lisp'})
you can't be certain which way it will get highlighted.
That's definitely a contrived example and very unlikely to happen in practice, but it's nevertheless always a possibility with this kind of shared global state solution. Perhaps it's worth the tradeoff, but perhaps not. Not really sure.
Regarding your concern over attaching registerLanguage()
to format
, there's really no need to do that. We can simply export it separately:
import { format, registerLanguage } from 'sql-formatter/core';
import { sqlite } from 'sql-formatter/sqlite';
registerLanguage("sqlite", sqlite);
format("SELECT *", { language: 'sqlite' });
Then again, the mapping isn't really that hard to do. Just wrap the calling of sql-formatter into your own function:
import { format, sqlite, postgresql, transactsql } from "sql-formatter/core";
const languageMap = { sqlite, postgres, transactsql };
export function formatSql(code: string, lang: keyof typeof languageMap) {
return format(code, { language: languageMap[lang], tabWidth: 4, keywordCase: 'upper' });
}
Which is probably a good idea anyway, as you likely want to apply the same settings to all languages you're formatting, plus you're shielding the rest of your code base from any changes that might happen in the library.
What if in one part of code you do
hljs.registerLanguage('lisp', commonLisp)
and in another part you dohljs.registerLanguage('lisp', scheme)
This is just setting a key on a dictionary. I would expect the last one I ran (scheme
) to happen, and if I'm doing this asynchronously or something (is that possible in JS?) then I can expect that setting the same key on a dictionary asynchronously is going to lead to problems, that's not sql-formatter's fault.
the mapping isn't really that hard to do
I'm only suggesting that having a way to modify the library's internal mapping
https://github.com/sql-formatter-org/sql-formatter/blob/3d750e1d76cc08df430ecb06fc738e08f652f48a/src/sqlFormatter.ts#L21-L39
and having a way to load it with an empty mapping would make code just a little simpler for some users, but if you don't think it's the right call then whatever API lets me not have to load every language would be great. I'm also just using one formatter, so this doesn't matter to me.
you likely want to apply the same settings to all languages you're formatting
In my case you're right that's what I'm already doing, but there's probably people with simpler usecases that don't need to do that.
Causing this mapping conflict in one's own code is, I'd guess, a fairly unlikely scenario, where one is simply shooting himself in the foot. The more problematic case is when you use sql-formatter and also use a library which internally uses sql-formatter. In that case the library can screw up the mapping you have defined, and you'll be having very hard time figuring out why the formatter is not working as it should (you might not even be aware that there's some library which is also using sql-formatter).
That's a fundamental problem with shared mutable state, and the best practice is to avoid it in the first place.
If we're talking about code that imports sql-formatter and imports a library that itself imports sql-formatter, wouldn't those be two different instances of the library and have their own, unshared state?
And actually, since formatters
is exported, sql-formatter already lets users do this
import { format, formatters } from 'sql-formatter';
formatters.trino = {}
console.log(format('SELECT * FROM tbl', { language: 'trino' })); // errors when it tries to use {} as a Formatter
and get it to use a custom formatter. You can even almost do what I'm suggesting letting people do (through a function in the library, but I guess just through directly editing the dictionary is good too)
import { format, formatters } from 'sql-formatter';
formatters.some_other_language = {}
console.log(format('SELECT * FROM tbl', { language: 'some_other_language' }));
but it doesn't work just because this check
https://github.com/sql-formatter-org/sql-formatter/blob/ad8dd40119941fed7b68cf864919313aa9e3db78/src/sqlFormatter.ts#L84-L86
doesn't expect formatters
to change after initialization.
You could have this API:
import { format, formatters } from 'sql-formatter';
console.log(formatters) // all the formatters
console.log(format('SELECT * FROM tbl', { language: 'redshift' }));
import { format, formatters } from 'sql-formatter/core';
import { TrinoFormatter } from 'sql-formatter/languages/trino';
console.log(formatters) // empty {}
formatters.trino = TrinoFormatter
console.log(format('SELECT * FROM tbl', { language: 'trino' }));
If we're talking about code that imports sql-formatter and imports a library that itself imports sql-formatter, wouldn't those be two different instances of the library and have their own, unshared state?
This depends on whether they're using the same or different versions of the library. And also on how exactly the versions are specified, like when one library requires ^2.0.0
and another requires ^2.1.0
, then npm can only download 2.1.0
to satisfy both. In that case the code and also the internal state of the library is shared.
And actually, since
formatters
is exported, sql-formatter already lets users do this
Good catch. That's been exported by mistake. Should remove.
@nene I tried the new feature, it works fine, but the output is the same as before. When using the code from the example it doesn't treeshake the other languages away just yet.
Thanks. Will investigate...
Found the culprit. Did another release that should fix the problem: 12.0.0-beta.6
Let me know if this works better.
Looks good, before:
312.63KB
Now:
84.36KB