quarkus-langchain4j icon indicating copy to clipboard operation
quarkus-langchain4j copied to clipboard

Security context propagation in Tools using non-blocking APIs

Open NMichas opened this issue 8 months ago • 17 comments

Hello,

I'm trying to get a SecurityIdentity injected into one of my AI tools but keeps coming as Anonymous. I'm wondering if it's something I'm doing wrong, or if there is an alternative way to go about it?

AI service:

@SessionScoped
@RegisterAiService (tools = DeviceTools.class)
public interface ChatbotAgentService {
	Multi<String> chat(@MemoryId String memoryId, @UserMessage String userMessage);
}

WebSocket:

@BearerTokenAuthentication
@WebSocket(path = "/api/v1")
public class ChatboxWebSocketV1  {
	@Inject SecurityIdentity securityIdentity;
	@Inject ChatbotAgentService chatbotAgentService;
	...
	@OnTextMessage
	@RolesAllowed(AppConstants.ROLE_USER)
	public Multi<ChatbotMessageDTO> onTextMessage(ChatbotMessageDTO message) {
		return chatbotAgentService.chat(securityIdentity.getPrincipal().getName(),
			message.getMessage()).map(
			reply -> new ChatbotMessageDTO()
				.setMessage(reply)
		);
	}
	
}

Tool:

@ApplicationScoped
public class DeviceTools {

	@Inject SecurityIdentity securityIdentity;

	@Tool("find the number of devices assigned to a specific status")
	public Uni<Long> devicesStatus(String status) {
		...
	}

In ChatboxWebSocketV1 SecurityIdentity is properly populated, whereas in DeviceTools it's not (it produces an AnonymousIdentityProvider).

Thanks!

NMichas avatar Apr 09 '25 13:04 NMichas

Thanks for reporting!

This is something that @sberyozkin is actively working on already so I am sure he can give the most up to date info.

geoand avatar Apr 09 '25 13:04 geoand

Could you by any chance share the sample application you are using so we can take a look at your exact case?

Thanks

geoand avatar Apr 09 '25 13:04 geoand

There isn't much to share I'm afraid, as it's pretty much us dipping our toes into the Quarkus/LangChain4j integration, so it's just "hello world" code for now. Once we glue everything together, it will become part of our AI assistant tool of our OSS project https://esthes.is.

NMichas avatar Apr 09 '25 13:04 NMichas

Gotcha, thanks

geoand avatar Apr 09 '25 13:04 geoand

Thanks @NMichas, @geoand

I believe this is a specific restriction related to using HttpAuthenticationMechanism annotations with WebSockets Next only. @michalvavrick fixed it in this PR, see this doc in particular. That fix is available only in 3.21.0 though.

Do you have more than authentication mechanism ? If not, please use @Authenticated and it will work

sberyozkin avatar Apr 09 '25 13:04 sberyozkin

Have you perhaps tried using blocking APIs instead of the reactive ones?

I am asking because returning Multi<ChatbotMessageDTO> implies that the result is going to be streamed back and that might cause issues with the request context tracking.

FWIW I tried a similar example using blocking types everywhere and it worked fine.

geoand avatar Apr 09 '25 13:04 geoand

Thank you both for your prompt replies.

@sberyozkin indeed, I'm using WebSockets Next and Quarkus 3.21.1. I understand @Authenticated is not required on my WebSocket endpoint, as I'm already using @BearerTokenAuthentication with quarkus.http.auth.proactive=false (I, anyway, did try it and didn't make a difference). I only have a single authentication mechanism. As I mentioned above, the problem isn't with the SecurityIdentity in the WebSocket endpoint - that works fine, but with propagating it to the AI tool bean.

@geoand I tried your suggestion and it works for me too! I'd really like to keep it as a "streaming" endpoint though, as I'm now using OpenAI API for my tests but on the actual implementation the plan is to switch to self-hosted Ollama which might be rather slow in producing long/complicated replies on commodity hardware, so the AI bot could be perceived as unresponsive/slow by the end-users if it can't stream back its reply.

So, I guess at this point the question is, is this something specific to the langchain4j integration with non-blocking APIs that might be fixed/supported in the future, or a more general issue that should be just handled in a different way? I changed the title of this issue to better reflect that.

NMichas avatar Apr 10 '25 05:04 NMichas

@NMichas I'll need to investigate next week

geoand avatar Apr 10 '25 05:04 geoand

Today was longer than I expected, but I can check this one tomorrow and let you know what I think.

michalvavrik avatar Apr 13 '25 20:04 michalvavrik

This is how you reproduce this issue without langchain4j https://github.com/quarkusio/quarkus/commit/3f870b0d7058c7ea1de559b3c12db85291e6fe81 ; WDYT @mkouba ? Thoughts are enough, I can have a look into this over the weekend. My understanding is that io.quarkus.websockets.next.runtime.WebSocketEndpoint#onTextMessage onComplete must not match Multi completion for this to happen. But I didn't check how the endpoint method is generated yet.

michalvavrik avatar Apr 14 '25 21:04 michalvavrik

This is how you reproduce this issue without langchain4j quarkusio/quarkus@3f870b0 ; WDYT @mkouba ? Thoughts are enough, I can have a look into this over the weekend. My understanding is that io.quarkus.websockets.next.runtime.WebSocketEndpoint#onTextMessage onComplete must not match Multi completion for this to happen. But I didn't check how the endpoint method is generated yet.

Hm, I think that in your test it's the delayIt operator that offloads execution to a thread that does not have the context propagated...

mkouba avatar Apr 15 '25 09:04 mkouba

Hm, I think that in your test it's the delayIt operator that offloads execution to a thread that does not have the context propagated...

In this issue DeviceTools are executed on a new CDI request context and when I run some tests in this langchain extension (core deployment test) I thought it was the same duplicated context, but I know very little about this extension. Do you basically say that SR Context propagation should do the trick or what?

michalvavrik avatar Apr 15 '25 09:04 michalvavrik

@NMichas given the confusion around this, I think it would be very useful if you shared a barebones sample application that demonstrates the problem so we know exactly what to look at and figure out what needs to be fixed.

Thanks

geoand avatar Apr 15 '25 10:04 geoand

Do you basically say that SR Context propagation should do the trick or what?

@michalvavrik I'm not sure to be honest. From what I see in the log the duplicated context is propagated (I can see the same WebSocket connection there) but there is a different state object associated with the CDI request context and honestly I have no idea what kind of magic is SR CP doing here... CC @manovotn @FroMage

mkouba avatar Apr 15 '25 10:04 mkouba

ok, thank you @mkouba

@NMichas given the confusion around this, I think it would be very useful if you shared a barebones sample application that demonstrates the problem so we know exactly what to look at and figure out what needs to be fixed.

repro would be great, but it is a strange chance that I could create WS Next only test that fails on the same so easily

michalvavrik avatar Apr 15 '25 10:04 michalvavrik

there is a different state object associated with the CDI request context and honestly I have no idea what kind of magic is SR CP doing here

I suspect this is a question for @manovotn because I've lost track of how you split context propagation between SR-CP and Arc and Vert.x 🤷

FroMage avatar Apr 25 '25 12:04 FroMage

there is a different state object associated with the CDI request context and honestly I have no idea what kind of magic is SR CP doing here

I suspect this is a question for @manovotn because I've lost track of how you split context propagation between SR-CP and Arc and Vert.x 🤷

I am afraid I am in the exact same boat here I thought this was done by SR CP and its handling for reactive parts which I was hoping you knew more about :D

manovotn avatar Apr 26 '25 21:04 manovotn

Hey @geoand, all, apologies for the delay - not much time for coding these days, however my good colleague Pablo (@pablokintopp) prepared a small PoC here: https://github.com/pablokintopp/token-propagation-issue

Instructions on the README.md (Pablo can follow up questions/suggestions here, so feel free to ask).

NMichas avatar Jun 09 '25 13:06 NMichas

Thank you @NMichas. I will be travelling this week, so hopefully I'll have time to check next week

geoand avatar Jun 10 '25 07:06 geoand

Thanks a lot for the reproducer @NMichas.

If I am not mistaken, what you are seeing is the expecte behavior - i.e. the request scope is not kept open for the duration of the Multi , right @mkouba ?

geoand avatar Jun 16 '25 14:06 geoand

I know this wasn't question on me, but I'll provide opinion, or sort of a question. Even though this may be "expected behavior", we have one identity linked to one connection, I think it would be fine to support that and it's not difficult (I actually did half-step in that direction by storing SecuritySupport directly on the server connection, unfortunately that PR review takes ages and it's still open). As long as we have connection stored on duplicated context local data, we can support it?

michalvavrik avatar Jun 16 '25 16:06 michalvavrik

Right, that would definitely be nice

geoand avatar Jun 16 '25 16:06 geoand

Hey, I have slightly adjusted the reproducer (to avoid installing websoc and running requests manually in DEV mode) with 1.1.0.CR1 and Quarkus 999-SNAPSHOT. I can reproduce the issue. I also checked that the duplicated context has WS Server connection with security support we just added 2 days ago and wrote an identity association that can takes the identity from local data, which fixes the issue. Thanks for the reproducer, I'll open PR in Quarkus probably tomorrow.

michalvavrik avatar Jul 08 '25 19:07 michalvavrik

Hi all

Thanks for the detailed investigation here — this thread has been very insightful.

I wanted to ask a follow-up question related to context propagation and RAG integration via LangChain4j in non-blocking WebSocket endpoints, particularly when using Multi.

Since the PR https://github.com/quarkusio/quarkus/pull/48845 is not merged yet, I’m wondering: - Has anyone observed reliable workarounds for propagating context into ContentRetriever / custom retrievers or tool handlers? - Is there any officially supported way to propagate metadata into those downstream components without relying on CDI injection (e.g. passing it via metadata)? - If SecurityIdentity cannot be safely used inside retrievers/tools in a Multi context, what is the best practice to still carry forward things like user ID or other metadata?

yurikotikov avatar Sep 13 '25 08:09 yurikotikov

Since the PR https://github.com/quarkusio/quarkus/pull/48845 is not merged yet

https://github.com/quarkusio/quarkus/pull/48845 is fine, @sberyozkin just need to find the big green button

  • Has anyone observed reliable workarounds for propagating context into ContentRetriever / custom retrievers or tool handlers?
  • Is there any officially supported way to propagate metadata into those downstream components without relying on CDI injection (e.g. passing it via metadata)?
  • If SecurityIdentity cannot be safely used inside retrievers/tools in a Multi context, what is the best practice to still carry forward things like user ID or other metadata?

I don't want to answer because it is a question for context propagation specialists (I can see them already here in "Participants"), the security was a special case for how we populate it and close it in WS Next.

michalvavrik avatar Sep 17 '25 15:09 michalvavrik

This issue is fixed in Quarkus 3.29, earlier Quarkus versions won't be supported as the fix bit sensitive.

michalvavrik avatar Oct 13 '25 08:10 michalvavrik

Hi all,

I’ve tested the reproducer against Quarkus 3.29.0.CR1, and I can confirm that the issue is resolved.

Thanks to everyone involved for the fix

pablokintopp avatar Oct 23 '25 12:10 pablokintopp

Thanks a lot!

geoand avatar Oct 23 '25 12:10 geoand

Just to clarify, context propagation (done by MP-CP) is supposed to work for Multi just as well as for Uni: https://quarkus.io/guides/context-propagation#usage-example-with-mutiny

I am not entirely sure how CDI contexts are propagated anymore, since they stopped using MP-CP and stored them in Vert.x contexts.

If you switch thread in your multi, and for some reason, you switch Vert.x context, I suspect you lose MP-CP.

But in this case anyway, I'm pretty sure this is a CDI scope issue and so MP-CP can't help 🤷

FroMage avatar Nov 07 '25 09:11 FroMage