Implement a streaming Identity Debugger

Open dabeeeenster opened this issue 11 months ago • 12 comments

Is your feature request related to a problem? Please describe.

Sometimes it can be hard to grok why identity data or SDK implementations are not working as expected.

Describe the solution you'd like.

Add a page in the dashboard that shows a real-time stream of identities being sent to the API from SDKs. This is similar to the real-time debugger pages in analytics tools (https://www.docs.developers.amplitude.com/data/debugger/#ingestion-debugger).

Describe alternatives you've considered

None

Additional context

We could reuse the identity webhook to stream content back into the app.

dabeeeenster avatar Feb 26 '24 09:02 dabeeeenster

I am working on this

tranchung163 avatar Jun 21 '24 23:06 tranchung163

Hi @dabeeeenster,

Our team is currently working on creating a graph page that will showcase the identity requests being sent from SDK clients to the REST API. We have a couple of questions and would appreciate your guidance:

Regarding the real-time data you mentioned, do you prefer implementing real-time streaming using WebSocket, or would a polling mechanism be sufficient for this purpose?

You mentioned an identity webhook in the documentation but we were unable to find specific details about it. Could you please point us in the right direction or provide more information on how to implement this webhook?

Additionally, we found the following two endpoints related to identity data:

api/v1/environments/{environment-api-key}/identities
api/v1/environments

Are these the correct endpoints we should be using for fetching and streaming identity data? If there are any other relevant endpoints or additional considerations we should be aware of, please let us know.

Thank you for your assistance!

tranchung163 avatar Jul 18 '24 03:07 tranchung163

Hi @tranchung163 , thanks for your work on this.

In answer to your first question, I think polling is fine.
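A minimal sketch of what a polling approach could look like, assuming the caller supplies a `fetch` function that returns the identifiers currently visible from the API. The names `poll_new_identifiers`, `fetch`, `interval_seconds`, and `max_polls` are hypothetical and are not part of the Flagsmith codebase.

```python
import time
from typing import Callable, Iterable, Iterator, Set


def poll_new_identifiers(
    fetch: Callable[[], Iterable[str]],
    interval_seconds: float = 5.0,
    max_polls: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield each identifier the first time a poll returns it.

    `fetch` is a hypothetical callable standing in for an HTTP request
    to whichever identities endpoint applies.
    """
    seen: Set[str] = set()
    for poll in range(max_polls):
        for identifier in fetch():
            # Only surface identifiers that earlier polls have not returned.
            if identifier not in seen:
                seen.add(identifier)
                yield identifier
        # Wait between polls, but not after the final one.
        if poll < max_polls - 1:
            sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.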

Regarding the webhooks, the documentation is here.

Regarding the endpoints, it gets a little more complicated when you talk about identities since there are 2 separate locations that the identity data can come from (for our SaaS platform at least). I've tried to detail this below, but please let me know if anything isn't clear.

Identities

Determining where to look for Identity data

The endpoint at /api/v1/projects/:id has an attribute called use_edge_identities. This value should be used to determine which endpoint(s) to use to retrieve data about a given environment's identities.

SaaS - Core & Self-Hosted ("use_edge_identities": false)

The endpoints at api/v1/environments/{environment-api-key}/identities will give you the data you need for the identities in this case.

SaaS - Edge ("use_edge_identities": true)

The endpoints at api/v1/environments/{environment-api-key}/edge-identities will give you the data you need for the identities in this case.

Note that we wouldn't necessarily expect you to do the work to handle the edge identities (although we are looking to open source our Edge API in the near future), but we may include it in our thinking when it comes to reviewing the implementation, since certain usage patterns are more difficult when it comes to edge identities (since they are stored in DynamoDB vs Postgres).
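The endpoint selection described above can be sketched as follows. This is an illustration only: the helper name `identities_path` is hypothetical, and `project` is assumed to be the parsed JSON body returned by /api/v1/projects/:id.

```python
from typing import Mapping


def identities_path(project: Mapping[str, object], environment_api_key: str) -> str:
    """Return the identities endpoint path for an environment.

    `project` is assumed to be the parsed JSON body of /api/v1/projects/:id.
    """
    # Edge projects expose identities under edge-identities; Core and
    # self-hosted projects use identities. Treating a missing flag as
    # False is an assumption of this sketch.
    segment = "edge-identities" if project.get("use_edge_identities") else "identities"
    return f"api/v1/environments/{environment_api_key}/{segment}"
```

For example, a project payload containing `"use_edge_identities": true` would resolve to the edge-identities path for the given environment key.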

matthewelwell avatar Jul 18 '24 08:07 matthewelwell

Hello, our team has been using the /api/v1/environments/{environment-api-key}/identities endpoint, which we found only returns the count of identities and lacks real-time streaming capabilities.

To address this, we explored the /api/v1/organizations/{organization_pk}/usage-data endpoint and successfully created a usage data graph. However, this endpoint only provides identity tracking data at a daily granularity.

To achieve more real-time streaming, we are considering modifying the database schema in api/app_analytics/influxdb_schema.py to track identity data at hourly and minute intervals.

Would implementing these changes align with the intended scope of the project, or is there an alternative approach recommended for real-time identity data streaming within Flagsmith? Thank you for your assistance.

SebastianRamos3 avatar Aug 05 '24 17:08 SebastianRamos3

Can you clarify what services you are running? Are you using InfluxDB?

I'm also not 100% clear what you are trying to achieve. What is your use case? Are you interested in a real-time stream, or just historic data but with finer resolution?

dabeeeenster avatar Aug 05 '24 18:08 dabeeeenster

Hello @dabeeeenster,

Sorry if we were not clear before. To answer your first question, yes, we are using InfluxDB and looking at api/app_analytics/influxdb_schema.py.

For the second question, our current graph shows the identities data at a daily level, and we want to display identities being sent by the hour or even minute. We believe this finer resolution would be more helpful for debugging.
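To illustrate the finer resolution we have in mind, here is a minimal sketch that groups request timestamps into minute or hour buckets in plain Python. It does not reflect the contents of api/app_analytics/influxdb_schema.py; the function name `count_per_bucket` and the `hourly` parameter are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable


def count_per_bucket(timestamps: Iterable[datetime], hourly: bool = False) -> Dict[datetime, int]:
    """Count events per minute, or per hour when `hourly` is True."""
    counts: Counter = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its minute.
        bucket = ts.replace(second=0, microsecond=0)
        if hourly:
            # Truncate further to the start of the hour.
            bucket = bucket.replace(minute=0)
        counts[bucket] += 1
    return dict(counts)
```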

In your response, you mentioned historic data. Could you clarify what you mean by this? Is there any other endpoint that would give us a real-time stream of identities? Are we heading in the direction that you envision for this feature?

Doremegalul avatar Aug 05 '24 20:08 Doremegalul

Can you make a PR and we can take a look at the code?

dabeeeenster avatar Aug 05 '24 20:08 dabeeeenster

We have gone ahead and made the pull request so you can take a look at the code.

SebastianRamos3 avatar Aug 05 '24 21:08 SebastianRamos3