integreat-cms Implement Chat API

Short description

This PR adds an implementation for the Chat-API.

In the issue description, a deletion function is mentioned; however, there's no reasonable way to actually delete Zammad tickets/articles via API, so this will have to be solved in some other way.

Proposed changes

add a new model UserChat storing mappings of device_ids -> zammad_ids
add ZammadChatAPI, a wrapper around zammad_py, plus corresponding API endpoints for our usecase
implement ratelimiting
add tests

Side effects

none I am aware of, this feature is very stand-alone

Resolved issues

Fixes: #2619 (partially)

Pull Request Review Guidelines

Apr 08 '24 15:04 charludo

In the issue description, a deletion function is mentioned; however, there's no reasonable way to actually delete Zammad tickets/articles via API, so this will have to be solved in some other way.

You're referring to "App users can close and start a new chat which will delete the ID on the app."? This is basically just intended as a front end thing to start a new chat. This should not remove Zammad tickets.

Apr 10 '24 07:04 svenseeberg

Question for @svenseeberg. Does this PR make #2568 redundant?

Apr 10 '24 11:04 JoeyStk

Question for @svenseeberg. Does this PR make #2568 redundant?

Yes, but I opted for creating a new PR since these two only share the same goal, not any implementation details.

Apr 12 '24 08:04 charludo

In the issue description, a deletion function is mentioned; however, there's no reasonable way to actually delete Zammad tickets/articles via API, so this will have to be solved in some other way.

You're referring to "App users can close and start a new chat which will delete the ID on the app."? This is basically just intended as a front end thing to start a new chat. This should not remove Zammad tickets.

Right, looks like the only mention of deletion in the issue, my bad. I do however remember discussing this at the last conference, and felt it was right to point this out on here.

Apr 12 '24 08:04 charludo

Yes, but I opted for creating a new PR since these two only share the same goal, not any implementation details.

No worries. My underlying question is if we can close #2568 then :)

Apr 12 '24 14:04 JoeyStk

I looked into the overall design and I think it does what we need. I will next create a test setup and try it out.

Great! Let me know if there's trouble, but I think the only important note I have is the one remarked in the Region form fields, i.e. both the "bot" account and any personnel wishing to be notified must have full group permissions for the settings.USER_CHAT_TICKET_GROUP group.

Apr 17 '24 09:04 charludo

Blocked due to #2763

Since this is ready for review apart from the mentioned issue, I've marked it as such regardless of the blocked status.

Apr 29 '24 07:04 charludo

We had a UX Design meeting last week and we decided it would be best to have a random user exposure with a defined chance of the feature being visible to app users. That means we want to be able to set a chance value per region that allows the app to decide whether it should show the feature or not. Would it be possible to include a chat_feature_probability value per region that is also returned in the API? The app would do the rest.

May 22 '24 07:05 svenseeberg

We had a UX Design meeting last week and we decided it would be best to have a random user exposure with a defined chance of the feature being visible to app users. That means we want to be able to set a chance value per region that allows the app to decide whether it should show the feature or not. Would it be possible to include a chat_feature_probability value per region that is also returned in the API? The app would do the rest.

Maybe doing this in a separate PR would make more sense? The required changes seem simple enough, but would probably draw out the review process even more. Plus it woud be nice to have a single PR/commit for reference when the feature is rolled back at a later point.

May 22 '24 08:05 charludo

Code Climate has analyzed commit 8cc4bd1e and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 91.3% (50% is the threshold).

This pull request will bring the total coverage in the repository to 82.7% (0.1% change).

View more on Code Climate.

May 28 '24 06:05 codeclimate[bot]

The tests finally pass... 😭 🎉

This is finally ready for review :)

May 28 '24 06:05 charludo

Hey, thank you so much for this PR! :green_heart: Is there already a timeline for this? We'd need the API to complete work in the frontend as well with sadly the same deadline :see_no_evil: Would it be possible to do a release with this to the test cms already?

Jun 04 '24 09:06 steffenkleinle

Regarding the device_id: We decided to generate a UUID in the frontend (e.g. using randomUUID) and save it to the local storage for that purpose, right? In theory, what happens if a user changes city/language and tries to open a new chat? Then previous messages are "lost" and a new chat is started, even though the device_id stays the same?

Jun 04 '24 09:06 steffenkleinle

Did we decide on a way to notify the CMS that the user wants to talk to a human counselor? I seem to remember that we talked about a specific "message", right? Probably also @svenseeberg. I am not sure if this works well in the multilingual setting. Perhaps we should add an additional attribute request_counselor or similar to the POST body.

I think we decided to implement this on the Zammad side with triggers. No need to have anything on the CMS?!

Jun 05 '24 07:06 svenseeberg

Can our app detect if a Zammad backend is configured? This might be necessary in order to hide the chat feature in case there is no backend configured.

That will be done manually anyways.

How will we handle live updates and push notifications for chat messages? This is crucial for real-time communication.

We actively decided to not implement push in the first iteration. We will work with polling on the app side.

Before storing the Zammad Access Token, we should validate it with a test case to ensure its correctness. Failing fast in this case also has the benefit of allowing us to show the user an error message if, for example, they made a typo.

While this would be nice, this will usually be done by admins/experts. They would need to set up Zammad anyways.

We need to ensure that messages are not lost if the Zammad backend is temporarily unavailable (e.g., during an update). Implementing a retry mechanism might be necessary.

Yes, this is currently an issue. But for a prototype this could be sufficient. I have no strong opinion here.

What happens if the distributed state becomes unsynchronized, and a chat_id in our database no longer exists in the Zammad backend?

If the Zammad Access Token becomes invalid over time (e.g., due to deletion), how will our backend handle it? We should display an alert or send a notification in such cases.

What is the device id, and how can a user restore their chat if they change their device? Are we certain that the device id cannot be taken over by someone who recycles the device?

Valid points. But I'd suggest to move that into new issues and address them over time.

Jun 05 '24 07:06 svenseeberg

Before storing the Zammad Access Token, we should validate it with a test case to ensure its correctness. Failing fast in this case also has the benefit of allowing us to show the user an error message if, for example, they made a typo.

While this would be nice, this will usually be done by admins/experts. They would need to set up Zammad anyways.

In my humble opinion admins and experts can benefit from input validation just as any other user.

Jun 05 '24 13:06 deen13

What happens if the distributed state becomes unsynchronized, and a chat_id in our database no longer exists in the Zammad backend?

If the Zammad Access Token becomes invalid over time (e.g., due to deletion), how will our backend handle it? We should display an alert or send a notification in such cases.

What is the device id, and how can a user restore their chat if they change their device? Are we certain that the device id cannot be taken over by someone who recycles the device?

Valid points. But I'd suggest to move that into new issues and address them over time.

Why should we address them gradually instead of immediately? Especially considering that points related to data structure and architecture, such as the first one, will become increasingly challenging to tackle once we've shipped the first version.

Jun 05 '24 13:06 deen13

In my humble opinion admins and experts can benefit from input validation just as any other user.

Sure. Its just a matter of priorties. We only have one test region for now and currently I can live w/o input validation. I know that we're potentially building some kind of tech debt. But this is just a prototype functionality. We do not even know yet if we're going to use it productively. For comparison: we have to manually set up the Matomo access tokens as well. There never was an issue with the tokens (about 150 data points).

Why should we address them gradually instead of immediately? Especially considering that points related to data structure and architecture, such as the first one, will become increasingly challenging to tackle once we've shipped the first version.

Because we need some kind of usable prototype soonish (end of June). We are a very long way from using it in wide spread production. And I would rather choose to get some first feedback from some users than producing an elaborate architecture without knowing whether we need it. We mostly need to get some kind of usable tech put together with wich we can start integrating the language models and test the quality of the answers in our context / validate our approach.

To address your concerns: I definitely agree that we should not postpone improving things. I would just rather get this merged, improve with a new PR and in parallel collect first feedback.

Jun 05 '24 17:06 svenseeberg

Hey, thank you so much for this PR! 💚 Is there already a timeline for this? We'd need the API to complete work in the frontend as well with sadly the same deadline 🙈 Would it be possible to do a release with this to the test cms already?

Tricky. Is it possible to complete most of the work without the concrete API endpoints, according to the API docs added in this PR?

Did we decide on a way to notify the CMS that the user wants to talk to a human counselor? I seem to remember that we talked about a specific "message", right? Probably also @svenseeberg. I am not sure if this works well in the multilingual setting. Perhaps we should add an additional attribute request_counselor or similar to the POST body.

Since BOT answers are not part of the initial MVP, this doesn't have to be done here though.

Not implemented yet, but as you said, it's also not relevant for the MVP. I'll open an issue for the CMS.

Regarding the device_id: We decided to generate a UUID in the frontend (e.g. using randomUUID) and save it to the local storage for that purpose, right? In theory, what happens if a user changes city/language and tries to open a new chat? Then previous messages are "lost" and a new chat is started, even though the device_id stays the same?

Changing the region would have to start a new chat, yes, since the Zammad backend used is dependant on the region.* The best way to handle this scenario (wipe chat? inject error message informing the user that the chat cannot be continued? offer to switch back to the previous region?) is probably a question for UX. "As is", if the user has a chat history with region A, then switches to region B, then

if B has the chat backend configured, a new chat is initialized with B when the user sends a message, and an error ("The requested chat does not exist. Did you delete it?") is shown when trying to get the messages
if B does not have the backend configured, an error ("No chat server is configured for your region.") is shown.

Changing the language does not have any effect. The language the user has the app set to when initializing the chat is included as part of the ticket subject, but not checked afterwards.

* Edit: unless you simply store the region slug alongside the device_id whenever a new chat is started, and simply use that slug instead of the current region's when communicating with the API! Honestly, this might be the cleanest solution.

Jun 06 '24 08:06 charludo

Architecture Decision Record (ADR):

I am not at all opposed to this, but I do think we should bring it up in the next team call and decide how we want to employ this. I think adding new tooling and processes to this specific (already huge...) PR is not a good idea.

Dependence on Zammad Without Abstraction:

IMO this is a non-issue. Right now, the feature is explicitly wanted in the context of Zammad, and we don't have evidence that other regions want to use a different backend. Until (if) that changes, another layer of abstraction does not provide any benefit.

(If we do want to add it at a later point though, all the pieces are already in place: we'd just need to add an abstract ChatAPIProvider class from which ZammadChatAPI and any other providers inherit, then change the line client = ZammadChatAPI(request.region) in api/.../user_chat.py to use the region's configured provider.)

Can our app detect if a Zammad backend is configured? This might be necessary in order to hide the chat feature in case there is no backend configured.

In addition to what Sven said, {"status": 503, "error": "No chat server is configured for your region."} is returned if Zammad is not configured. ~~Should be sufficient IMO.~~ Edit: You are right, having this as a separate endpoint would be better. I'll add it. Edit 2: Done

How will we handle live updates and push notifications for chat messages? This is crucial for real-time communication.

(see Sven's answer)

Before storing the Zammad Access Token, we should validate it with a test case to ensure its correctness. Failing fast in this case also has the benefit of allowing us to show the user an error message if, for example, they made a typo.

Validate how exactly? We could probably check the token length, but other than that? I don't think checking Zammad connectivity on save of the region form is a good idea. We could maybe add a "test connection" button, but I'd rather move that to a new issue/PR

We need to ensure that messages are not lost if the Zammad backend is temporarily unavailable (e.g., during an update). Implementing a retry mechanism might be necessary.

{ "status": 500, "error": "An error occurred while attempting to connect to the chat server." } is returned in that case. Since we do definitely not want to store any actual messages, the client needs to handle this case (for example by something like "tap to retry sending")

What happens if the distributed state becomes unsynchronized, and a chat_id in our database no longer exists in the Zammad backend?

{ "status": 404, "error": "Couldn't find Ticket with 'id'=<the id>" } is returned by the Zammad server. The client should probably just create a new chat in this case.

If the Zammad Access Token becomes invalid over time (e.g., due to deletion), how will our backend handle it? We should display an alert or send a notification in such cases.

{ "status": 500, "error": "An error occurred while attempting to connect to the chat server." }

Besides that, I do not think it's our responsibility to monitor region's Zammad instances. I guess the "test connection" functionality mentioned above could be run periodically in theory, though. But yes, I agree with Sven, new issue for this.

What is the device id, and how can a user restore their chat if they change their device?

The device_id is a random string generated by the client (app). Chats cannot be restored on new devices. Currently, AFAIK, this is by design; if we did want to be able to transfer chats to new devices, that would need to be handled 100% on client side.

Are we certain that the device id cannot be taken over by someone who recycles the device?

Yes, it's probably worth mentioning that device_id is not actually tied to the physical device, but just a random ID which we use to associate clients with Zammad tickets.

Jun 06 '24 09:06 charludo

@charludo Sorry, I didn't want to add another new tool to the project. Maybe there's already a place in the project suited for some architecture documentation? 👀

Jun 07 '24 08:06 deen13

Thanks for the answers @svenseeberg and @charludo. I don't want to block the pull request further. ☺️

No worries. Having to merge this fast is a not-too-great situation to be in. Btw though, if you have implementation-specific concerns or suggestions I am all ears! 😅

@charludo Sorry, I didn't want to add another new tool to the project. Maybe there's already a place in the project suited for some architecture documentation? 👀

Not sure if our docs would be the right place for this, this seems a bit too fine-grained for them. I did read up on ADRs, but am a bit lost about what exactly there's to document here. AFAIK the way API requests are handled for the chat are in line with all other API request handling in this project.

Jun 07 '24 09:06 charludo

Deadline was pushed forward to Friday (not for the release but for merging to the test CMS), as the Frontend needs at least two weeks to test :) If we don't have a second review until Friday we should just merge it with only one approve :)

Jun 11 '24 12:06 JoeyStk

Deadline was pushed forward to Friday (not for the release but for merging to the test CMS), as the Frontend needs at least two weeks to test :) If we don't have a second review until Friday we should just merge it with only one approve :)

Thanks! Its maybe important to note that this is not a final product but instead just an MVP for some early testing. Bugs and non-perfect behavior is okay! Therefore, if you have any issues you don't like about the current implementation, feel free to continue to work on this while we test and implement the frontend side.

Thank you all for your help :heart:

Jun 11 '24 12:06 steffenkleinle

LGTM, thanks 🚀

One thing I would change in the future is not showing internal comments as official responses. The red bordered message is also returned in the API, while Zammad defines this as "internal" information.

But for testing purposes this is totally sufficient.

*edit: I just opened a follow up PR for this: #2844

Good catch!

And also, great, I'll solve the conflict and then merge... Let's see what happens 😅

Jun 12 '24 13:06 charludo

Thanks to all of you! Amazing :heart: Could you please notify us once this is ready to test (in the test cms)?

Jun 12 '24 13:06 steffenkleinle

integreat-cms integreat-cms copied to clipboard

Implement Chat API

Short description

Proposed changes

Side effects

Resolved issues

integreat-cms
integreat-cms copied to clipboard