
Inconsistent handling of `UnqualComponentName`

Open tfausak opened this issue 4 years ago • 6 comments

Describe the bug

cabal-install appears to allow using component names that Cabal can't parse. I'm not sure if this is a bug with cabal-install being too lax or Cabal being too strict.

To Reproduce

Steps to reproduce the behavior:

$ cat example.cabal
cabal-version: >= 1.8
name: example
version: 0.0.0.0
build-type: Simple
executable x-1
  build-depends: base
  main-is: Main.hs

$ cat Main.hs
main = pure ()

$ cabal v2-run x-1
Resolving dependencies...
Build profile: -w ghc-9.0.1 -O1
In order, the following will be built (use -v for more details):
 - example-0.0.0.0 (exe:x-1) (first run)
Configuring executable 'x-1' for example-0.0.0.0..
Preprocessing executable 'x-1' for example-0.0.0.0..
Building executable 'x-1' for example-0.0.0.0..
[1 of 1] Compiling Main             ( Main.hs, /private/var/folders/dw/qn5kg6091gq3x8_106kw6ptm0000gn/T/tmp.h2GBaNkl/dist-newstyle/build/x86_64-osx/ghc-9.0.1/example-0.0.0.0/x/x-1/build/x-1/x-1-tmp/Main.o )
Linking /private/var/folders/dw/qn5kg6091gq3x8_106kw6ptm0000gn/T/tmp.h2GBaNkl/dist-newstyle/build/x86_64-osx/ghc-9.0.1/example-0.0.0.0/x/x-1/build/x-1/x-1 ...

Expected behavior

I would expect either cabal-install to reject the component for having an invalid name, or Cabal to successfully parse the component name.

System information

$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.15.7
BuildVersion:	19H1217

$ uname -a
Darwin TayBook.local 19.6.0 Darwin Kernel Version 19.6.0: Thu May  6 00:48:39 PDT 2021; root:xnu-6153.141.33~1/RELEASE_X86_64 x86_64

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.0.1

$ cabal --version
cabal-install version 3.4.0.0
compiled using version 3.4.0.0 of the Cabal library 

Additional context

I discovered this problem by accident while looking at this package: https://hackage.haskell.org/package/hsc3-graphs-0.15. It has many components. I eventually narrowed it down to runs of numbers, like hsc3-berlin-1977. I assume the parser rejects names like this because they might accidentally be confused with version numbers.

tfausak avatar Jun 13 '21 20:06 tfausak

Oops, I meant to include a snippet showing Cabal being unable to parse that component name:

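ghci> :set -XTypeApplications
ghci> import Distribution.Parsec (eitherParsec)
ghci> import Distribution.Types.UnqualComponentName (UnqualComponentName)

ghci> eitherParsec @UnqualComponentName "x1"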
Right (UnqualComponentName "x1")

ghci> eitherParsec @UnqualComponentName "x-1"
Left "\"<eitherParsec>\" (line 1, column 4):\nunexpected end of input"

ghci> eitherParsec @UnqualComponentName "x-1-x"
Left "\"<eitherParsec>\" (line 1, column 5):\nunexpected Empty component, after x-1"

tfausak avatar Jun 13 '21 20:06 tfausak

Confirmed. This seems to be a Cabal bug, not a cabal-install one. The package description parser (in Cabal) accepts that component name, even when called through Setup.hs (without cabal-install).

fgaz avatar Jun 13 '21 20:06 fgaz

The (generic) package description parser appears to use a special parser for component names rather than using the Parsec instance.

https://github.com/haskell/cabal/blob/00a2351789a460700a2567eb5ecc42cca0af913f/Cabal/src/Distribution/PackageDescription/Parsec.hs#L422-L423

https://github.com/haskell/cabal/blob/00a2351789a460700a2567eb5ecc42cca0af913f/Cabal/src/Distribution/PackageDescription/Parsec.hs#L389-L390

https://github.com/haskell/cabal/blob/00a2351789a460700a2567eb5ecc42cca0af913f/Cabal/src/Distribution/PackageDescription/Parsec.hs#L392-L405

That suggests you can use any SecArgName or SecArgStr as a component name, which does appear to be the case. All of the following work: `executable 1`, `executable 1.2`, `executable "1"`, `executable -`. You can start getting into trouble if you pick a component name that collides with other syntax. For example:

$ grep executable example.cabal
executable "s p a c e"

$ cabal v2-build
Resolving dependencies...
Build profile: -w ghc-9.0.1 -O1
In order, the following will be built (use -v for more details):
 - example-0.0.0.0 (exe:"s p a c e") (first run)
cabal-3.4.0.0: Unrecognised build target 'exe:s p a c e'.
Examples:
- build foo -- component name (library, executable, test-suite or benchmark)
- build Data.Foo -- module name
- build Data/Foo.hsc -- file name
- build lib:foo exe:foo -- component qualified by kind
- build foo:Data.Foo -- module qualified by component
- build foo:Data/Foo.hsc -- file qualified by component

$ grep executable example.cabal
executable ":"

$ cabal v2-build
Resolving dependencies...
Build profile: -w ghc-9.0.1 -O1
In order, the following will be built (use -v for more details):
 - example-0.0.0.0 (exe:":") (first run)
cabal-3.4.0.0: Unrecognised build target 'exe::'.
Examples:
- build foo -- component name (library, executable, test-suite or benchmark)
- build Data.Foo -- module name
- build Data/Foo.hsc -- file name
- build lib:foo exe:foo -- component qualified by kind
- build foo:Data.Foo -- module qualified by component
- build foo:Data/Foo.hsc -- file qualified by component
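
To make the lax path concrete, here is a toy model of what the linked section parser seems to do: take a single bare name or quoted string and use it verbatim as the component name. This is a sketch under assumptions, not Cabal's code. The constructor names mirror the ones mentioned above, but the real types also carry source positions and use ByteString; `sectionName` and `renderTarget` are hypothetical helpers made up for illustration.

```haskell
-- Toy stand-in for Cabal's section arguments (assumption: simplified,
-- no source positions, String instead of ByteString).
data SectionArg
  = SecArgName String   -- bare token, e.g. x-1
  | SecArgStr String    -- quoted string, e.g. "s p a c e"
  | SecArgOther String  -- anything else
  deriving (Eq, Show)

-- Hypothetical: accept exactly one name or string argument and return it
-- unchanged. Note that no UnqualComponentName-style validation happens here.
sectionName :: [SectionArg] -> Either String String
sectionName [SecArgName n] = Right n
sectionName [SecArgStr n]  = Right n
sectionName []             = Left "name required"
sectionName args           = Left ("Invalid name " ++ show args)

-- Hypothetical: build the kind-qualified target string for an executable.
renderTarget :: String -> String
renderTarget name = "exe:" ++ name

main :: IO ()
main = mapM_ (print . fmap renderTarget . sectionName)
  [ [SecArgName "x-1"]
  , [SecArgStr "s p a c e"]
  , [SecArgStr ":"]
  , []
  ]
```

In this model the names `s p a c e` and `:` are accepted as-is and become the targets `exe:s p a c e` and `exe::`, which are the same strings the target parser rejects in the output above.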

tfausak avatar Jun 13 '21 21:06 tfausak

I bumped into this issue again.

I think the intent is for component names to be the same as package names. This is enforced for library names, but not for other types of components. For library names, that makes sense. The default public library implicitly has the same name as the package. Other sub-libraries may be public, in which case you can depend on them with something like a-package:a-library >= 1.2.3. Clearly in those cases the sub-library needs to be constrained somewhat, and matching package names makes a lot of sense.

However, for other component types, especially executables, it's unclear how the component name should be constrained. I analyzed packages on Hackage to see what they used for component names. Most of them can be parsed as package names. Of the ones that can't be parsed as package names, the only invalid characters are `_`, `:`, and `.` (underscore, colon, and period). For example:

  • inspection-testing has a test suite called NS_NP.
  • MagicHaskeller has an executable called MagicHaskeller.cgi.
  • ds-kanren has a test suite called test-unify:.
    • This appears to be a mistake. Instead of defining a test suite using section syntax (test-suite x\n), this package puts a colon at the end of the line (test-suite x:\n).
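
For reference, a rough approximation of the package-name rule being compared against can be written with base alone. It assumes the rule is "one or more alphanumeric words separated by hyphens, each word containing at least one non-digit", which is consistent with the `eitherParsec` results earlier in the thread. It is not Cabal's parser, and the names `looksLikePackageName`, `validWord`, and `invalidChars` are made up for this sketch.

```haskell
import Data.Char (isAlphaNum, isDigit)

-- Split on hyphens, keeping empty pieces so "a--b" and "a-" are detectable.
splitOnHyphen :: String -> [String]
splitOnHyphen s = case break (== '-') s of
  (w, [])       -> [w]
  (w, _ : rest) -> w : splitOnHyphen rest

-- A word is acceptable if it is non-empty, alphanumeric, and not all digits.
validWord :: String -> Bool
validWord w = not (null w) && all isAlphaNum w && not (all isDigit w)

-- Approximation of "parses as a package name" (assumption, see above).
looksLikePackageName :: String -> Bool
looksLikePackageName = all validWord . splitOnHyphen

-- Characters that are neither alphanumeric nor a hyphen.
invalidChars :: String -> String
invalidChars = filter (\c -> not (isAlphaNum c || c == '-'))

main :: IO ()
main = mapM_ report
  [ "x1", "x-1", "hsc3-berlin-1977"
  , "NS_NP", "MagicHaskeller.cgi", "test-unify:" ]
  where
    report n = putStrLn (n ++ ": " ++ show (looksLikePackageName n)
                           ++ " " ++ show (invalidChars n))
```

Under this approximation only `x1` passes. `NS_NP`, `MagicHaskeller.cgi`, and `test-unify:` fail because of an invalid character, while `x-1` and `hsc3-berlin-1977` use only valid characters and fail because one word is all digits.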

tfausak avatar Jun 18 '22 19:06 tfausak

The original post shows the hsc3-berlin-1977 executable name, which also can’t be parsed as a package name, right?

ulysses4ever avatar Jun 18 '22 19:06 ulysses4ever

Yes, that's correct. I was looking for components that used invalid characters in their names.

Looking for component names that are actually invalid package names is a little more complicated. I found 376 of them:


tfausak avatar Jun 18 '22 19:06 tfausak