sh icon indicating copy to clipboard operation
sh copied to clipboard

cmd/shfmt: support loading editorconfig conf when formatting a suffix-less file

Open sirosen opened this issue 3 years ago • 8 comments

In a project where there are shell scripts with a shebang line and no suffix, editorconfig for configuration doesn't work cleanly. If I run shfmt -w scripts/some-script scripts/other-script.sh (where some-script is actually a bash script), the editorconfig section isn't loaded for that file and different formatting is used between the two.

Ideally I would be able to run without flags and get behavior driven by *.sh in editorconfig as the "default" config.

I've looked at #577 (I can't use -filename PLACEHOLDER.sh because the file isn't coming from stdin) and #393 as relevant. I'm also looking at https://github.com/editorconfig/editorconfig/issues/404 which seems to have gone off the rails and stalled out. It's a shame because I already have pre-commit and it's identify tool solving the detection problem for me, so an editorconfig section for [[shell]] would be perfect. I'm not sure at this stage if a comment on that editorconfig issue would be productive, so I'm staying away.

In case it's handy and anyone reading this wants to try shfmt via pre-commit, here's a snippet of pre-commit config for running shfmt
- repo: local
  hooks:
    - id: shfmt
      name: "Format shell scripts (shfmt)"
      entry: shfmt
      language: system
      types: [shell]
      # we need to use explicit options, not rely on editorconfig, because
      # pre-commit will run this on detected shell scripts without the *.sh
      # suffix
      args: ["-w", "-i", "2", "-ci", "-sr"]

I think it would be viable to use a special editorconfig section for shfmt behavior when the filename, matched against editorconfig, does not produce a configuration section. Something like [*.{sh,bash,SHFMT_DEFAULT)]. That's obviously a huge hack, and much worse on the surface than [[shell]], but it would probably work okay.

I very much appreciate shfmt's decision not to litter my filesystem with more itty-bitty files, but in this case it's a frustration. I'm not able to configure shfmt behavior in a directory in a direct way, only through a layer of indirection (editorconfig) which does not support my project's structure.

sirosen avatar Feb 09 '21 18:02 sirosen

I've looked at #577 (I can't use -filename PLACEHOLDER.sh because the file isn't coming from stdin) and #393 as relevant. I'm also looking at editorconfig/editorconfig#404 which seems to have gone off the rails and stalled out.

I started reading your comment and was going to reply with these two threads. You have an accurate understanding of how I tried to solve the common case of extension-less shell scripts :)

Another option which you didn't mention is to have a [scripts/*] section for Bash, and a later [scripts/*.sh] section for POSIX shell to override settings matched from the previous section. It's still a bit hacky, but it might just work depending on what your setup is.

I think it would be viable to use a special editorconfig section for shfmt behavior when the filename, matched against editorconfig, does not produce a configuration section. Something like [*.{sh,bash,SHFMT_DEFAULT)]. That's obviously a huge hack, and much worse on the surface than [[shell]], but it would probably work okay.

I think using this thread to brainstorm ways to make this work in an editorconfig-compatible way is reasonable. I tried pushing the upstream spec to support languages without explicit filename extensions, but that stalled out, like you said.

I personally think that supporting [[language]] sections for our shell language variants is fine. It won't be following the EditorConfig spec, but that should be fine as long as the feature doesn't break a significant amount of existing users.

mvdan avatar Feb 09 '21 19:02 mvdan

Another option which you didn't mention is to have a [scripts/*] section for Bash, and a later [scripts/*.sh] section for POSIX shell

Just to address this really quickly: I bet this works for >90% of cases like mine, to the point where I almost didn't open an issue. In my real project, I actually have dir trees containing python scripts, bash scripts, and non-script files without extensions -- often utilities going into the PATH in some context.

I personally think that supporting [[language]] sections for our shell language variants is fine. It won't be following the EditorConfig spec, but that should be fine as long as the feature doesn't break a significant amount of existing users.

I took a look at a couple of the parser implementations to see how dangerous it would be to add this.

The ruby one would be fine, so would the python one, ... but then I checked the docs and editorconfig explicitly says brackets are allowed in section names, so I think this is 100% safe. 😂

Is there some other case you're concerned about this breaking, which we should discuss?


Also, after a bit of thought, I think I should maybe make some comment on the editorconfig issue to put forth a potential way forward.

sirosen avatar Feb 09 '21 21:02 sirosen

Is there some other case you're concerned about this breaking, which we should discuss?

Syntactically, it's entirely safe, like you said. [[lang]] is valid.

The only potential problem is anyone currently using [[lang]], whose behavior would change. I very much doubt this will be a problem in practice though, because such a pattern would only be matching the filenames l, a, n, g to begin with.

All in all, I think it's fine if we support this kind of "extension" to EditorConfig. It's essential to scripting languages like shell. It can perhaps act as a driver to extending the upstream spec in the future, too.

mvdan avatar Feb 10 '21 11:02 mvdan

Thanks for being so receptive to this!

I was looking at editorconfig.go (that's your parsing code, so I figure that's what's in use here) and trying to understand what change would need to be made. I think Find needs to take an optional list of language names in addition to the filename, and then that needs to get passed through to Filter to check for language names. That way you can call Find("path.sh", ["bash" "shell"]) or something like that.

I see that the code currently loads sections in order and "last match wins". In the comment I left on https://github.com/editorconfig/editorconfig/issues/404 I suggested that language matches should be lower precedence, but it's not necessary. If order in the file matters, then users should just define [[shell]] before [[bash]] before [foo/bar/*.sh], and the existing logic will just take care of it.

I would offer to help, but the last Go I wrote was "hello world" circa 2015. I'm not sure I'll be much help other than trying to define the right behavior. But if you like, I could give it a shot sometime.

sirosen avatar Feb 12 '21 16:02 sirosen

I can't use -filename PLACEHOLDER.sh because the file isn't coming from stdin

As a temporary measure, have you considered shfmt -filename foo.sh <scripts/bar? If that doesn't work, I'm also open to ideas to make -filename more flexible, like allowing it on a single file argument and not just stdin. I get that modifying a file in-place with redirections is tricky.

I think Find needs to take an optional list of language names in addition to the filename, and then that needs to get passed through to Filter to check for language names.

That sounds about right. Though I'd keep it in a separate method, to avoid breaking users and to keep the non-standard functionality to an entirely separate API function. Something like FindWithLanguage.

I see that the code currently loads sections in order and "last match wins".

I don't have a strong opinion here, but I do think that consistency with the existing EditorConfig spec (i.e. "last match wins") would be good. Is there a particular reason you think we should define a different match priority?

mvdan avatar Feb 15 '21 08:02 mvdan

Is there a particular reason you think we should define a different match priority?

When I wrote my comment on editorconfig, I didn't know about that existing behavior, so I proposed that language blocks should be lower precedence. But now, knowing that multiple matches are covered by the spec, I see no reason to do that.

As for -filename, I'm running via pre-commit, which invokes tools on a list of filenames. Making -filename work in that case is possible for shfmt, but I'd rather help pursue [[language]]. It's strange to write shfmt -w -filename X.sh scripts/foo.sh otherscripts/bar ..., even though that would work for me if support for it were added.

sirosen avatar Feb 15 '21 17:02 sirosen

It's strange to write shfmt -w -filename X.sh scripts/foo.sh otherscripts/bar ..., even though that would work for me if support for it were added.

I agree, and it would also result in error messages referencing the wrong file.

I'll try to take a look at language support soon. That is:

  • [[shell]] to match all shell scripts
  • [[bash]] to match bash scripts

The open question is what would we do for posix (just matching POSIX Shell), mksh, bats, and other shell language variants. We can keep them out of the first working version, I guess, but for consistency we should make them work with language detection too in the future. It doesn't seem like any "list of languages" standard includes all the shell variants we support.

mvdan avatar Feb 16 '21 16:02 mvdan

I just realised that we don't support language detection in shfmt per se; that's https://github.com/mvdan/sh/issues/622. So we should probably implement that first, then piggyback off the same language detection for this EditorConfig feature.

mvdan avatar Feb 24 '21 10:02 mvdan

My suggestion is that if you are going to continue deviating from the EditorConfig Specification, support should be added for a dedicated configuration file. Ideally this would be via a specific filename like .shfmtrc, but setting the path via a command line flag would also be fine. That file can use the same EditorConfig-ish syntax already supported.

I am running into the same problems with configuring shfmt for shell files that don't have a file extension, but am not comfortable with making my .editorconfig files deviate from the specification since they are consumed by multiple tools other than shfmt.

per1234 avatar Dec 08 '22 10:12 per1234

@per1234 to be clear, we are not deviating from the spec today. If configs per detected language is the only bit we lack from editorconfig, I don't think that's reason enough to add our own custom and specific config file. I agree it would be a nice to have, but once we add our own config file, we're stuck with it basically forever :)

mvdan avatar Dec 09 '22 09:12 mvdan

It feels to me like you are unnecessarily using .editorconfig for things it is not intended for. The only two keys from the specification that are used are indent_style and indent_size. I love that shfmt recognizes those keys. But all the rest are non-standard keys used only to configure shfmt. I don't think these things have any business being in my .editorconfig files. They are shfmt-specific, so they should be in a shfmt-specific file.

per1234 avatar Dec 09 '22 09:12 per1234

I've always understood that EditorConfig encourages domain-specific properties. For example, see https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties#ideas-for-domain-specific-properties.

mvdan avatar Dec 09 '22 09:12 mvdan

Hi @mvdan, is there a solution in sight? I am having the same issue and none of the suggested workarounds cut it for me. We want to set per project settings with a reusable Workflow for CI and the same configuration in the developers editor and the command line. Without any duplication of settings this should be possible with a config file. Hence, we rely on .editorconfig. However, in order to configure shfmt for file without a suffix we need to add shfmt settings to [*].

I am concerned that other formatting tools might pick these settings up for scripts in different scripting languages. My suggestion to solve this is to let a .shfmtrc have the same format as .editorconfig but also let it take precedence over the settings in .editorconfig. This would be fully backwards compatible and work without any new flags.

dcd-arnold avatar Sep 25 '23 13:09 dcd-arnold

I'm still opposed to adding .shfmtrc. Once we add support for the file, we can basically never go back, and we're stuck with our own config file forever.

That said, the EditorConfig thread about [[language]] still isn't moving. It's nearly three years since I proposed implementing it anyway in https://github.com/mvdan/sh/issues/664#issuecomment-779935709, so I think we should just go ahead. We would slightly deviate from the EditorConfig spec, and they might eventually add this feature in a different form, but we can't just sit idle for a decade and hope for the best.

For now we could just support [[shell]] and [[bash]]; we can worry about other shell variants like POSIX at a later time. PRs welcome; just make sure to include tests. If noone wants to take this, I'll take a look when I can.

mvdan avatar Sep 25 '23 14:09 mvdan

I've just pushed a first implementation to master. Please try it out over the next few weeks; I would like to do a release in about a month's time. If there are any disagreements or significant input about how it works, I would like to flesh that out before a release.

mvdan avatar Dec 28 '23 22:12 mvdan

I stumbled over this issue by chance and read about [[shell]], which is useful if there are a lot of file types in a repo, and shell scripts don't have file extensions. I just can't find it in the documentation of shfmt. Also not here: https://editorconfig.org/ Can we rely on it in the future? It seems to work and it could have saved us a bit of trouble if we had known about it earlier :)

edit: I tested it with shfmt 3.5.1 on openSUSE Leap 15.5

perlpunk avatar Jan 30 '24 10:01 perlpunk

Support in shfmt is in master, and not in a release yet, per my last comment above. See https://github.com/mvdan/sh/commit/7f96e7d84a265f4d1005b96493422cde800bf9d1 for the changes with docs.

Upstream editorconfig has no support for [[language]] sections. They might at some point (https://github.com/editorconfig/editorconfig/issues/404), and if they do, we can tweak shfmt to align with upstream. But for now, it's been years and upstream isn't moving, so I think it's reasonable for us to implement it on top of the spec support.

mvdan avatar Jan 30 '24 10:01 mvdan

Oh, nevermind, I misread the diff. It was apparently touching the right files when i used [[shell]], but then it was using the default formatting rules.

perlpunk avatar Jan 30 '24 10:01 perlpunk

Just for the record, what we are now using in our CI is this, without using editorconfig:

shfmt -d [options] $(shfmt -f . | grep -v files-to-ignore)

But we are looking forward to [[shell]] :)

perlpunk avatar Jan 30 '24 13:01 perlpunk

I just tried this out in a branch of the repo where I wanted this most badly, and it worked great! I used pre-commit config with language: docker and entry: mvdan/shfmt:latest.

I'm very much looking forward to the release! As always, thanks so much for shfmt -- it's a boon to every project where I've applied it.

sirosen avatar Jan 30 '24 20:01 sirosen