dd-trace-rb icon indicating copy to clipboard operation
dd-trace-rb copied to clipboard

New profiler breaks libgit2 ssh transport

Open tjk opened this issue 1 year ago • 4 comments

EDIT: While disabling the new profiler seems to fix the issue, there might be an underlying problem related to different libssl due to different libssh2 that is just heightened with this feature on... will continue investigating and report back.


Current behaviour

Setting profiling.advanced.force_enable_new_profiler to true makes using rugged (libgit2) ssh transport break.

Steps to reproduce

Make sure to compile rugged with :ssh feature, and the do the following:

creds = Rugged::Credentials::SshKey.new(username: 'git', publickey: '/path/to/key.pub', privatekey: '/path/to/key')
Rugged::Repository.clone_at('ssh://[email protected]/<username>/<repo>.git', '/tmp/some-directory', credentials: creds)

You will see an error like Failed to open SSH channel: Error waiting on socket (Rugged::SshError)

Environment

  • ddtrace version: 1.9.0
  • Configuration block (Datadog.configure ...): see above
  • Ruby version: 2.7.7
  • Operating system: Debian Buster
  • Relevant library versions: rugged 1.5.0.1 (means libgit2 1.5.0)

tjk avatar Mar 24 '23 21:03 tjk

Thanks a lot @tjk for the report. This definitely looks similar to the mysql2 issue another customer reported (see known issues in the release notes).

You mentioned that

EDIT: While disabling the new profiler seems to fix the issue, there might be an underlying problem related to different libssl due to different libssh2 that is just heightened with this feature on... will continue investigating and report back.

Please do let us know about you find out! I'll work to reproduce this as well... For now indeed the best solution is to not use the new profiler for this application.

ivoanjo avatar Mar 27 '23 19:03 ivoanjo

I was able to reproduce this easily -- thanks for the clear notes! It's indeed the same issue as called out in the release notes for the mysql2 gem.

I can confirm the observation that the issue seems to be triggered in the ssh transport, and when cloning via http there seems to be no issue. It's unclear to me if the issue right now is in libssh2 (used by the ssh transport) or libgit2 (I don't think it comes from rugged directly).

@tjk: Actually getting a fix in may take a while, since I'll need to report this to the rugged maintainers (and possibly the libgit2 and libssh2 maintainers as well).

Are you OK with not enabling the new profiler for this specific app while we work on this?

Alternatively, I can look into exposing an API that could be used to wrap the clone_at call to avoid the profiler interrupting it, if that's something you'd be interested in using in your application.

I'll also make sure to document this and add the rugged gem to the list of gems we detect and avoid enabling the new profiler for, so that others don't run into this issue while we work to get the full fix in place!

ivoanjo avatar Apr 04 '23 11:04 ivoanjo

Update: I've reported the issue to the libssh2 developers and also asked the rugged developers if they'd be open to shipping a workaround built-into the gem.

ivoanjo avatar Apr 12 '23 11:04 ivoanjo

Looks like there's a PR to fix libssh2 ( https://github.com/libssh2/libssh2/pull/1058 ), hopefully it will get accepted soon.

ivoanjo avatar Jun 27 '23 16:06 ivoanjo