[Feature] dbt Cloud CLI: rename binary to `dbt-cloud` or make it `pipx`-compatible

Open jaklan opened this issue 1 year ago • 1 comments

Is this your first time submitting a feature request?

  • [X] I have read the expectations for open source contributors
  • [X] I have searched the existing issues, and I could not find an existing issue for this feature
  • [X] I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Hi, the dbt Cloud CLI repo has issues disabled, so let me put it here. Our goal is to use both dbt-core and dbt Cloud CLI side by side, as we work with both types of projects, and to make this setup easily configurable for our Analytics Engineers, incl. the less advanced ones.

When reading the docs, you discover one completely ridiculous thing - both binaries use exactly the same name. It's explained in docs this way:

For compatibility, both the dbt Cloud CLI and dbt Core are invoked by running dbt.

Tbh - I don't buy that argument at all. Having a dbt-cloud binary would solve all the issues with the hybrid setup - and if someone uses dbt Cloud CLI exclusively, you could put a hint in the docs to create an alias like: alias dbt="dbt-cloud". You could argue "it's because they support the same commands, it's just easier from the docs perspective", but that's also not really true - a) dbt Cloud CLI has its own unique commands b) flags for the common commands differ, so the difference has to be taken into account anyway.

But okay, let's assume you won't agree to rename (for whatever reason) and we have to make the whole setup more complex than needed. Initially I thought pipx could solve the problem with its --suffix flag, but... dbt Cloud CLI is written in Go - the Python package is just a wrapper that downloads the Go binary when setup.py is executed during installation. It means there are no entry_points defined there, which means pipx would fail with:

No apps associated with package dbt or its dependencies. If you are attempting to install a library, pipx should not be used. Consider using pip or a similar tool instead.

And that makes sense - pipx is not a tool to install Go binaries. However, it's possible to work around it - let's look at a sample uv wheel which includes Rust binaries:

 ├──uv-0.3.3.data
 │  └──scripts
 │     ├──uv
 │     └──uvx
 ├──uv-0.3.3.dist-info
 │  ├──licenses
 │  │  ├──LICENSE-APACHE
 │  │  └──LICENSE-MIT
 │  ├──METADATA
 │  ├──RECORD
 │  └──WHEEL
 └──uv
    ├──__init__.py
    ├──__main__.py
    └──py.typed

The scripts directory contains the Rust binaries. The whole magic is handled by __init__.py and __main__.py, available here: https://github.com/astral-sh/uv/tree/main/python/uv - but generally it's all about detecting the venv, identifying the binary path and spinning up a subprocess:

  if sys.platform == "win32":
      import subprocess

      completed_process = subprocess.run([uv, *sys.argv[1:]], env=env)
      sys.exit(completed_process.returncode)
  else:
      os.execvpe(uv, [uv, *sys.argv[1:]], env=env)
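To make the idea concrete, here is a minimal sketch of what such a launcher module could look like for a binary shipped in a wheel's scripts directory. It is not the uv code and not anything dbt Labs ships: the function names and the dbt-cloud binary name are hypothetical, and it assumes the wheel has placed the executable in the environment's scripts directory.

```python
import os
import subprocess
import sys
import sysconfig
from pathlib import Path
from typing import Optional


def find_binary(name: str, scripts_dir: Optional[Path] = None) -> Path:
    """Locate a bundled executable in the environment's scripts directory."""
    if scripts_dir is None:
        # Resolves to <venv>/bin (or <venv>\Scripts on Windows) inside a venv.
        scripts_dir = Path(sysconfig.get_path("scripts"))
    exe = name + (".exe" if sys.platform == "win32" else "")
    candidate = scripts_dir / exe
    if not candidate.is_file():
        raise FileNotFoundError(f"{exe} not found in {scripts_dir}")
    return candidate


def main() -> None:
    """Forward all CLI arguments to the bundled binary."""
    # "dbt-cloud" is a hypothetical binary name used only for illustration.
    binary = os.fspath(find_binary("dbt-cloud"))
    args = [binary, *sys.argv[1:]]
    if sys.platform == "win32":
        # No reliable exec() on Windows: run a child and mirror its exit code.
        sys.exit(subprocess.run(args).returncode)
    else:
        # Replace the current Python process with the binary.
        os.execv(binary, args)
```

A console-script entry point pointing at main would then give pipx an "app" to install, which is what it complains about being missing.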

pipx wouldn't fully solve the issue (more below), but it would a) give more options to mitigate it b) make the installation process much easier than it is currently.

Describe alternatives you've considered

  • Homebrew doesn't allow (afaik) modifying the binary name, so using pipx for dbt-core and brew for dbt Cloud CLI won't solve the problem - whatever is first on PATH would win.

  • Downloading binaries manually is not an option as that requires manual upgrades.

  • go install is not an option because the source code is not available (and the package doesn't exist on pkg.go.dev).

  • The workaround from docs:

    • manually create a dedicated venv
    • activate it
    • run pip install dbt
    • deactivate it
    • add alias like alias dbt-cloud="path_to_dbt_cloud_cli_binary"

    Yes, that works, but we would need to wrap it in a custom script to make this process repeatable for our devs and to allow one-command upgrades.

  • Installing dbt-core with a custom suffix using pipx - probably the most reasonable approach for now, so we use dbt-core for, well, dbt-core, and dbt for dbt Cloud CLI.

    But it's also not perfect - we would need to adjust all our docs and local scripts, and people would be confused about why it differs from the dbt docs and would need to get used to it. Even then, anyone who installs dbt-core not via pipx but e.g. via Poetry in their project environment has to remember to use dbt there... Ofc the same issue would exist in the opposite scenario, where we add a custom suffix to dbt Cloud CLI, but in our case the first situation would be much more common.
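For illustration, the venv workaround from the docs listed above could be wrapped in a small script along these lines. This is only a sketch: the function names and the venv location are made up, and it assumes that pip install dbt leaves the Cloud CLI executable, named dbt, in the venv's bin (or Scripts) directory.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path


def venv_bin_dir(venv_dir: Path) -> Path:
    """Directory where a venv keeps its executables."""
    return venv_dir / ("Scripts" if sys.platform == "win32" else "bin")


def install_plan(venv_dir: Path) -> list[list[str]]:
    """Commands that create the dedicated venv and install/upgrade the CLI."""
    python_name = "python.exe" if sys.platform == "win32" else "python"
    venv_python = venv_bin_dir(venv_dir) / python_name
    return [
        # Step 1: create the dedicated venv (reused if it already exists).
        [sys.executable, "-m", "venv", str(venv_dir)],
        # Steps 2-4: calling the venv's own interpreter avoids activate/deactivate.
        [str(venv_python), "-m", "pip", "install", "--upgrade", "dbt"],
    ]


def alias_line(venv_dir: Path, alias: str = "dbt-cloud") -> str:
    """Step 5: shell alias pointing at the binary (path is an assumption)."""
    target = venv_bin_dir(venv_dir) / "dbt"
    return f'alias {alias}="{target}"'


def install(venv_dir: Path) -> str:
    """Run the plan and return the alias line to add to the shell profile."""
    for cmd in install_plan(venv_dir):
        subprocess.run(cmd, check=True)
    return alias_line(venv_dir)
```

Re-running install(Path.home() / ".venvs" / "dbt-cloud-cli") would double as the one-command upgrade, but it is still a custom script every team has to maintain on its own.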

Generally speaking, something that is supposed to be as easy as possible, i.e. installing the package, becomes unnecessarily complicated and needs manual steps or custom workarounds just to make it usable for dbt-core users. That's not how it should look, and tbh I can't think of any other library creating such problems just because they decided to introduce a conflict with their own (!) package.

As mentioned at the beginning - all these problems wouldn't exist if you just kept the names unique - dbt for dbt-core, dbt-cloud for dbt Cloud CLI. Not to mention the burden on the shoulders of all the poor guys supporting people who raise issues like "I have installed dbt with pip install dbt, dbt command works, but I can't execute anything, because it asks for some Cloud config?!".

PS If you decide to rewrite dbt-core in Go or Rust one day and want to publish it via brew then... It would be even more entertaining - you would need to either introduce a new binary name anyway ("compatibility" anyone?) or mark the two packages as conflicting with each other (which doesn't make sense as long as dbt Cloud CLI doesn't include dbt-core functionality).

Who will this benefit?

Users of both dbt-core and dbt Cloud CLI.

jaklan, Aug 25 '24 08:08

Love it - it took me hours to set up dbt Cloud CLI more or less automatically for all devs while using Poetry.

CommonCrisis, Aug 27 '24 07:08