syft
feat: add BeamVM Hex support
📝 Description
Adds support for parsing rebar.lock and mix.lock files to add cataloguing support for Elixir & Erlang projects that use the Hex package manager. Placed under the beam cataloger.
Closes: https://github.com/anchore/syft/issues/1071
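For readers unfamiliar with the lock file format, here is a minimal sketch of pulling a name and version out of a single mix.lock entry. It is not the parser from this PR: the `hexPackage` type, the `parseMixLockLine` helper, and the regular expression are all hypothetical, and the sample hashes are made-up placeholders. It assumes the common one-entry-per-line layout and skips non-Hex entries.

```go
package main

import (
	"fmt"
	"regexp"
)

// hexPackage is a hypothetical holder for the fields taken from one lock entry.
type hexPackage struct {
	Name    string
	Version string
}

// mixLockEntry matches lines shaped like:
//   "jason": {:hex, :jason, "1.4.0", "<hash>", [:mix], [], "hexpm", "<hash>"},
// and captures only the package atom and the version string.
var mixLockEntry = regexp.MustCompile(`^\s*"[^"]+":\s*\{:hex,\s*:"?([^,"\s]+)"?,\s*"([^"]+)"`)

// parseMixLockLine returns the package described by a single line, or false
// when the line is not a Hex entry (for example the opening "%{" or a git dep).
func parseMixLockLine(line string) (hexPackage, bool) {
	m := mixLockEntry.FindStringSubmatch(line)
	if m == nil {
		return hexPackage{}, false
	}
	return hexPackage{Name: m[1], Version: m[2]}, true
}

func main() {
	// Placeholder hashes; a real lock file carries full checksums here.
	line := `  "jason": {:hex, :jason, "1.4.0", "abc123", [:mix], [], "hexpm", "def456"},`
	if pkg, ok := parseMixLockLine(line); ok {
		fmt.Printf("%s %s\n", pkg.Name, pkg.Version)
	}
}
```

A rebar.lock entry uses Erlang term syntax instead, so it would need its own pattern; that is left out of this sketch.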
Thanks for another great contribution @cpendery! I do think having beam as the language is going to be confusing. My suggestion would be to break it up into a separate erlang cataloger for rebar.lock and elixir cataloger for mix.lock, and both having the hex package type. I'm not sure if sharing the same package type between two languages will cause any other problems though.
With splitting the languages into Elixir and Erlang, when we see a purl for the Hex pkg:hex/ package manager (supports both languages), I wouldn't know what language to resolve it to. I get that using the ecosystem as a language is confusing. Any ideas @westonsteimel?
Hmm, can we encode the language as a parameter to the package url when syft creates it? @spiffcs, any thoughts? Do we ever need to decode language from a package url (perhaps this is something that happens from one of the non syft-native formats)?
The purl specification doesn't have a language property for the Hex purl type, unfortunately. However, it seems like an advantageous thing to add, so maybe @pombredanne would be up for adding this! :smile:
It could be useful for Conan (C/C++) and Cocoapods (Objective-C/Swift) since both support dual languages like Hex. I made a PR below just as a starting point for the discussion in the purl-spec repo, since it may be a better place to continue talking.
After forming the PURL we have a decoding function called LanguageFromPURL:
https://github.com/anchore/syft/search?q=LanguageFromPURL
This would lead to the issue @cpendery brings up where we don't have enough information at that point to assign a language. If his PR is accepted then we can make the split in the specification itself and no longer encounter this issue. Because of this design choice, catalogers are loosely bound to support on the PURL side.
@cpendery this looks really good. I'll wait on merging or updating in any way until we hear back on the PR you made for the purl-spec.
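The ambiguity described above can be sketched roughly as follows. This is not syft's LanguageFromPURL: the `languagesByPURLType` table, `purlType`, and `resolveLanguage` are hypothetical names made up for illustration, and returning an empty string for an ambiguous type is just one possible way to handle it.

```go
package main

import (
	"fmt"
	"strings"
)

// languagesByPURLType is an illustrative table, not syft's real mapping.
// The hex type is shared by two languages, which is the problem at hand.
var languagesByPURLType = map[string][]string{
	"npm":   {"javascript"},
	"cargo": {"rust"},
	"hex":   {"erlang", "elixir"},
}

// purlType extracts the type segment from "pkg:<type>/<name>@<version>".
// It returns "" when the input does not look like a package URL.
func purlType(purl string) string {
	if !strings.HasPrefix(purl, "pkg:") {
		return ""
	}
	rest := strings.TrimPrefix(purl, "pkg:")
	i := strings.Index(rest, "/")
	if i < 0 {
		return ""
	}
	return strings.ToLower(rest[:i])
}

// resolveLanguage returns a language only when the purl type maps to exactly
// one; unknown and ambiguous types both yield "".
func resolveLanguage(purl string) string {
	langs := languagesByPURLType[purlType(purl)]
	if len(langs) == 1 {
		return langs[0]
	}
	return ""
}

func main() {
	for _, p := range []string{"pkg:cargo/serde@1.0.0", "pkg:hex/jason@1.4.0"} {
		fmt.Printf("%s -> %q\n", p, resolveLanguage(p))
	}
}
```

With only the purl type to go on, nothing in a pkg:hex/ URL distinguishes an Erlang package from an Elixir one, which is why a spec-level property was proposed.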
TODO: update cataloger to new generic cataloger pattern
I think we can leave language as blank / unknown in these circumstances -- the cataloger is more valuable than resolving the language from the pURL IMHO.
I can help rebase what is here and update the patterns some based on the drift.
The main changes I made were:
- Split the cataloger into erlang and elixir. The conflict for parsing the language from the pURL has been kicked down the road.
- Split the HexMetadata into MixLockMetadata and RebarLockMetadata. Why do this since they contain essentially the same information? If we intend to support extracting more information from these sources, they aren't exactly the same, so we want future room to grow here without having to make a breaking change. The package type may map to many different metadata types, and the metadata should most closely represent the source you are parsing from; this is the general rule of thumb.
- Updated the cataloger to use the new generic cataloger and surrounding patterns.
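To make the metadata split concrete, here is a small sketch of one package type backed by two source-specific metadata structs. Only the names MixLockMetadata and RebarLockMetadata come from the discussion; the fields, the `Package` struct, and the constructor helpers are assumptions for illustration and do not reflect syft's actual types.

```go
package main

import "fmt"

// MixLockMetadata and RebarLockMetadata mirror their source files separately,
// so either can gain fields later without affecting the other.
// The fields shown here are hypothetical.
type MixLockMetadata struct {
	Name    string
	Version string
}

type RebarLockMetadata struct {
	Name    string
	Version string
}

// Package is a simplified, hypothetical stand-in for a catalogued package.
type Package struct {
	Name     string
	Version  string
	Language string
	Type     string
	Metadata interface{}
}

func newMixLockPackage(m MixLockMetadata) Package {
	return Package{Name: m.Name, Version: m.Version, Language: "elixir", Type: "hex", Metadata: m}
}

func newRebarLockPackage(m RebarLockMetadata) Package {
	return Package{Name: m.Name, Version: m.Version, Language: "erlang", Type: "hex", Metadata: m}
}

func main() {
	a := newMixLockPackage(MixLockMetadata{Name: "jason", Version: "1.4.0"})
	b := newRebarLockPackage(RebarLockMetadata{Name: "certifi", Version: "2.9.0"})
	for _, p := range []Package{a, b} {
		// Same package type, different language and metadata type.
		fmt.Printf("%s: type=%s language=%s metadata=%T\n", p.Name, p.Type, p.Language, p.Metadata)
	}
}
```

Both packages share the hex type while the metadata type records which lock file each came from.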
I'll push shortly, and I think this will be good to go!