Add BAML

Open imalsogreg opened this issue 8 months ago • 5 comments

Description

This PR adds support for the BAML programming language. Thank you for considering our PR. We would love to have our syntax recognized by GitHub's language breakdowns and syntax highlighters!

Checklist:

  • [x] I am adding a new language.
    • [x] The extension of the new language is used in hundreds of repositories on GitHub.com.
      • Search results for each extension:
        • https://github.com/search?type=code&q=NOT+is%3Afork+path%3A*.baml+class
    • [x] I have included a real-world usage sample for all extensions added in this PR:
      • Sample source(s):
        • https://github.com/BoundaryML/baml/blob/canary/integ-tests/baml_src/test-files/semantic_streaming/semantic_streaming.baml
        • https://github.com/BoundaryML/baml/blob/canary/integ-tests/baml_src/test-files/functions/output/mutually-recursive-classes.baml
      • Sample license(s): Apache-2.0
    • [x] I have included a syntax highlighting grammar: https://github.com/boundaryml/textMate-baml
    • [x] I have added a color
      • Hex value: #a855f7
      • Rationale: This is the primary color from our language's marketing site and physical swag.
    • [ ] I have updated the heuristics to distinguish my language from others using the same extension. (N/A - there are no languages using the same extension)

imalsogreg avatar Apr 11 '25 21:04 imalsogreg

Sorry for the lack of updates on this, @lildude. We really appreciated the super fast initial response but never followed up.

We'll get this shored up when we hit the popularity milestone (twas also cool for us to see that we had a user open #7456 on our behalf) - it looks like we're getting closer.

Is the "hundreds of repos" requirement an eyeball check based on file count, or is there tooling somewhere that we can run?

sxlijin avatar Aug 19 '25 18:08 sxlijin

There's no tooling but https://github.com/github-linguist/linguist/issues/5756 gives you an idea of how we currently assess popularity.

lildude avatar Aug 19 '25 18:08 lildude

Thanks @lildude, is this a representative query in that case?

https://github.com/search?q=path%3A**.baml+NOT+owner%3ABoundaryML+NOT+is%3Afork+NOT+owner%3Aai-that-works++%22generator+%22&type=code

  • excludes our orgs
  • looks for the generator keyword, which should occur only once per repo (it only really appears in 1 file)

hellovai avatar Aug 19 '25 22:08 hellovai

I wouldn't include the keyword unless this is the only way to identify BAML files, as it cuts down quite a bit on the search results. As it appears a repo is expected to have more than one file, the file-count requirement needs to be met for inclusion.

lildude avatar Aug 20 '25 05:08 lildude

Gotcha, thanks - we'll keep an eye on the keywordless search for the overall file count.

We were trying to figure out how to count the number of unique repos, which is why we were using the generator keyword.

sxlijin avatar Aug 21 '25 05:08 sxlijin
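
One way to get both numbers discussed above (total file count and unique repos) from a keywordless search is to group the results by repository. The sketch below is only an illustration, not Linguist tooling: the function name `count_baml_files_by_repo` and the sample data are hypothetical, and it assumes each result item carries `path`, `repository.full_name`, and `repository.fork` fields in the style of a code-search API response. Fetching the results is left out.

```python
from collections import Counter

# Owners excluded in the query from the thread (compared case-insensitively).
EXCLUDED_OWNERS = {"boundaryml", "ai-that-works"}


def count_baml_files_by_repo(items, excluded_owners=EXCLUDED_OWNERS):
    """Return a Counter mapping "owner/repo" to its number of .baml files.

    `items` is assumed to be a list of dicts shaped like:
    {"path": "...", "repository": {"full_name": "owner/repo", "fork": False}}
    """
    per_repo = Counter()
    for item in items:
        repo = item["repository"]
        owner = repo["full_name"].split("/", 1)[0].lower()
        if repo.get("fork") or owner in excluded_owners:
            continue  # skip forks and the project's own orgs
        if not item["path"].endswith(".baml"):
            continue  # only count files with the .baml extension
        per_repo[repo["full_name"]] += 1
    return per_repo


# Hypothetical sample data, made up for illustration only.
sample = [
    {"path": "baml_src/main.baml", "repository": {"full_name": "acme/app", "fork": False}},
    {"path": "baml_src/clients.baml", "repository": {"full_name": "acme/app", "fork": False}},
    {"path": "x.baml", "repository": {"full_name": "BoundaryML/baml", "fork": False}},
    {"path": "a.baml", "repository": {"full_name": "someone/copy", "fork": True}},
    {"path": "generators.baml", "repository": {"full_name": "other/tool", "fork": False}},
    {"path": "README.md", "repository": {"full_name": "other/tool", "fork": False}},
]

counts = count_baml_files_by_repo(sample)
unique_repos = len(counts)          # number of distinct repositories
total_files = sum(counts.values())  # overall .baml file count
print(unique_repos, total_files)
```

Grouping by repository removes the need for a once-per-repo keyword such as generator, so the same result set yields the overall file count and the unique-repo count together.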