quarkus-langchain4j
quarkus-langchain4j copied to clipboard
Security context propagation in Tools using non-blocking APIs
Hello,
I'm trying to get a SecurityIdentity injected into one of my AI tools but keeps coming as Anonymous. I'm wondering if it's something I'm doing wrong, or if there is an alternative way to go about it?
AI service:
@SessionScoped
@RegisterAiService (tools = DeviceTools.class)
public interface ChatbotAgentService {
Multi<String> chat(@MemoryId String memoryId, @UserMessage String userMessage);
}
WebSocket:
@BearerTokenAuthentication
@WebSocket(path = "/api/v1")
public class ChatboxWebSocketV1 {
@Inject SecurityIdentity securityIdentity;
@Inject ChatbotAgentService chatbotAgentService;
...
@OnTextMessage
@RolesAllowed(AppConstants.ROLE_USER)
public Multi<ChatbotMessageDTO> onTextMessage(ChatbotMessageDTO message) {
return chatbotAgentService.chat(securityIdentity.getPrincipal().getName(),
message.getMessage()).map(
reply -> new ChatbotMessageDTO()
.setMessage(reply)
);
}
}
Tool:
@ApplicationScoped
public class DeviceTools {
@Inject SecurityIdentity securityIdentity;
@Tool("find the number of devices assigned to a specific status")
public Uni<Long> devicesStatus(String status) {
...
}
In ChatboxWebSocketV1 SecurityIdentity is properly populated, whereas in DeviceTools it's not (it produces an AnonymousIdentityProvider).
Thanks!
Thanks for reporting!
This is something that @sberyozkin is actively working on already so I am sure he can give the most up to date info.
Could you by any chance share the sample application you are using so we can take a look at your exact case?
Thanks
There isn't much to share I'm afraid, as it's pretty much us dipping our toes into the Quarkus/LangChain4j integration, so it's just "hello world" code for now. Once we glue everything together, it will become part of our AI assistant tool of our OSS project https://esthes.is.
Gotcha, thanks
Thanks @NMichas, @geoand
I believe this is a specific restriction related to using HttpAuthenticationMechanism annotations with WebSockets Next only. @michalvavrick fixed it in this PR, see this doc in particular. That fix is available only in 3.21.0 though.
Do you have more than authentication mechanism ? If not, please use @Authenticated and it will work
Have you perhaps tried using blocking APIs instead of the reactive ones?
I am asking because returning Multi<ChatbotMessageDTO> implies that the result is going to be streamed back and that might cause issues with the request context tracking.
FWIW I tried a similar example using blocking types everywhere and it worked fine.
Thank you both for your prompt replies.
@sberyozkin indeed, I'm using WebSockets Next and Quarkus 3.21.1. I understand @Authenticated is not required on my WebSocket endpoint, as I'm already using @BearerTokenAuthentication with quarkus.http.auth.proactive=false (I, anyway, did try it and didn't make a difference). I only have a single authentication mechanism. As I mentioned above, the problem isn't with the SecurityIdentity in the WebSocket endpoint - that works fine, but with propagating it to the AI tool bean.
@geoand I tried your suggestion and it works for me too! I'd really like to keep it as a "streaming" endpoint though, as I'm now using OpenAI API for my tests but on the actual implementation the plan is to switch to self-hosted Ollama which might be rather slow in producing long/complicated replies on commodity hardware, so the AI bot could be perceived as unresponsive/slow by the end-users if it can't stream back its reply.
So, I guess at this point the question is, is this something specific to the langchain4j integration with non-blocking APIs that might be fixed/supported in the future, or a more general issue that should be just handled in a different way? I changed the title of this issue to better reflect that.
@NMichas I'll need to investigate next week
Today was longer than I expected, but I can check this one tomorrow and let you know what I think.
This is how you reproduce this issue without langchain4j https://github.com/quarkusio/quarkus/commit/3f870b0d7058c7ea1de559b3c12db85291e6fe81 ; WDYT @mkouba ? Thoughts are enough, I can have a look into this over the weekend. My understanding is that io.quarkus.websockets.next.runtime.WebSocketEndpoint#onTextMessage onComplete must not match Multi completion for this to happen. But I didn't check how the endpoint method is generated yet.
This is how you reproduce this issue without langchain4j quarkusio/quarkus@3f870b0 ; WDYT @mkouba ? Thoughts are enough, I can have a look into this over the weekend. My understanding is that
io.quarkus.websockets.next.runtime.WebSocketEndpoint#onTextMessageonCompletemust not matchMulticompletion for this to happen. But I didn't check how the endpoint method is generated yet.
Hm, I think that in your test it's the delayIt operator that offloads execution to a thread that does not have the context propagated...
Hm, I think that in your test it's the delayIt operator that offloads execution to a thread that does not have the context propagated...
In this issue DeviceTools are executed on a new CDI request context and when I run some tests in this langchain extension (core deployment test) I thought it was the same duplicated context, but I know very little about this extension. Do you basically say that SR Context propagation should do the trick or what?
@NMichas given the confusion around this, I think it would be very useful if you shared a barebones sample application that demonstrates the problem so we know exactly what to look at and figure out what needs to be fixed.
Thanks
Do you basically say that SR Context propagation should do the trick or what?
@michalvavrik I'm not sure to be honest. From what I see in the log the duplicated context is propagated (I can see the same WebSocket connection there) but there is a different state object associated with the CDI request context and honestly I have no idea what kind of magic is SR CP doing here... CC @manovotn @FroMage
ok, thank you @mkouba
@NMichas given the confusion around this, I think it would be very useful if you shared a barebones sample application that demonstrates the problem so we know exactly what to look at and figure out what needs to be fixed.
repro would be great, but it is a strange chance that I could create WS Next only test that fails on the same so easily
there is a different state object associated with the CDI request context and honestly I have no idea what kind of magic is SR CP doing here
I suspect this is a question for @manovotn because I've lost track of how you split context propagation between SR-CP and Arc and Vert.x 🤷
there is a different state object associated with the CDI request context and honestly I have no idea what kind of magic is SR CP doing here
I suspect this is a question for @manovotn because I've lost track of how you split context propagation between SR-CP and Arc and Vert.x 🤷
I am afraid I am in the exact same boat here I thought this was done by SR CP and its handling for reactive parts which I was hoping you knew more about :D
Hey @geoand, all, apologies for the delay - not much time for coding these days, however my good colleague Pablo (@pablokintopp) prepared a small PoC here: https://github.com/pablokintopp/token-propagation-issue
Instructions on the README.md (Pablo can follow up questions/suggestions here, so feel free to ask).
Thank you @NMichas. I will be travelling this week, so hopefully I'll have time to check next week
Thanks a lot for the reproducer @NMichas.
If I am not mistaken, what you are seeing is the expecte behavior - i.e. the request scope is not kept open for the duration of the Multi , right @mkouba ?
I know this wasn't question on me, but I'll provide opinion, or sort of a question. Even though this may be "expected behavior", we have one identity linked to one connection, I think it would be fine to support that and it's not difficult (I actually did half-step in that direction by storing SecuritySupport directly on the server connection, unfortunately that PR review takes ages and it's still open). As long as we have connection stored on duplicated context local data, we can support it?
Right, that would definitely be nice
Hey, I have slightly adjusted the reproducer (to avoid installing websoc and running requests manually in DEV mode) with 1.1.0.CR1 and Quarkus 999-SNAPSHOT. I can reproduce the issue. I also checked that the duplicated context has WS Server connection with security support we just added 2 days ago and wrote an identity association that can takes the identity from local data, which fixes the issue. Thanks for the reproducer, I'll open PR in Quarkus probably tomorrow.
Hi all
Thanks for the detailed investigation here — this thread has been very insightful.
I wanted to ask a follow-up question related to context propagation and RAG integration via LangChain4j in non-blocking WebSocket endpoints, particularly when using Multi.
Since the PR https://github.com/quarkusio/quarkus/pull/48845 is not merged yet, I’m wondering: - Has anyone observed reliable workarounds for propagating context into ContentRetriever / custom retrievers or tool handlers? - Is there any officially supported way to propagate metadata into those downstream components without relying on CDI injection (e.g. passing it via metadata)? - If SecurityIdentity cannot be safely used inside retrievers/tools in a Multi context, what is the best practice to still carry forward things like user ID or other metadata?
Since the PR https://github.com/quarkusio/quarkus/pull/48845 is not merged yet
https://github.com/quarkusio/quarkus/pull/48845 is fine, @sberyozkin just need to find the big green button
- Has anyone observed reliable workarounds for propagating context into ContentRetriever / custom retrievers or tool handlers?
- Is there any officially supported way to propagate metadata into those downstream components without relying on CDI injection (e.g. passing it via metadata)?
- If SecurityIdentity cannot be safely used inside retrievers/tools in a Multi context, what is the best practice to still carry forward things like user ID or other metadata?
I don't want to answer because it is a question for context propagation specialists (I can see them already here in "Participants"), the security was a special case for how we populate it and close it in WS Next.
This issue is fixed in Quarkus 3.29, earlier Quarkus versions won't be supported as the fix bit sensitive.
Hi all,
I’ve tested the reproducer against Quarkus 3.29.0.CR1, and I can confirm that the issue is resolved.
Thanks to everyone involved for the fix
Thanks a lot!
Just to clarify, context propagation (done by MP-CP) is supposed to work for Multi just as well as for Uni: https://quarkus.io/guides/context-propagation#usage-example-with-mutiny
I am not entirely sure how CDI contexts are propagated anymore, since they stopped using MP-CP and stored them in Vert.x contexts.
If you switch thread in your multi, and for some reason, you switch Vert.x context, I suspect you lose MP-CP.
But in this case anyway, I'm pretty sure this is a CDI scope issue and so MP-CP can't help 🤷