tabby
Include language in events message or completion ID in completions message
Please describe the feature you want
In order of preference:
- Include the programming language in the `/v1/events` message
- Include the completion ID in the `/v1/completions` message
Additional context
I have been snooping traffic between our Tabby server and the user to see what Tabby sends when it offers a suggestion and when the user either completely accepts or partially accepts a suggestion. The end goal is to collect the following metrics:
- What languages are being used
- For each language:
  - How many suggestions have been offered
  - How many of those suggestions are completely accepted by the user
  - How many of those suggestions are partially accepted by the user
From my analysis of the messages, I noticed this:
- The `/v1/completions` message contains a programming language but no completion ID
- The `/v1/events` message contains a completion ID but no programming language
- The `/v1/events` message indicates the following:
  - A suggestion is offered (`type` is `view`)
  - A suggestion is partially accepted (`type` is `select` and `select_kind` is present)
  - A suggestion is completely accepted (`type` is `select` and `select_kind` is absent)
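The rules above could be expressed as a small classifier. This is a minimal sketch, assuming the `/v1/events` body is a JSON object with the `type` and `select_kind` fields as observed in the traffic; the function name and the returned labels are hypothetical and not part of Tabby.

```python
from typing import Optional


def classify_event(event: dict) -> Optional[str]:
    """Map an observed /v1/events payload to a metric label.

    Assumes `type` is "view" or "select" and that `select_kind` is only
    present for partial acceptance, as described in the list above.
    """
    event_type = event.get("type")
    if event_type == "view":
        return "offered"
    if event_type == "select":
        # A missing (or null) select_kind is treated as complete acceptance.
        if event.get("select_kind") is not None:
            return "partially_accepted"
        return "completely_accepted"
    # Any other event type is ignored for these metrics.
    return None
```

Treating a null `select_kind` the same as an absent one is a design choice of this sketch, not something confirmed by the observed messages.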
I can get pretty much all the data I need from these messages, but I cannot easily correlate the language in the `/v1/completions` message with the corresponding event in the `/v1/events` messages. Ideally, I'd like to use only the `/v1/events` message, since that would simplify how I store this information in a database. That's why I would like the programming language in the events message. As a fallback, if the completion ID were in the `/v1/completions` message, I could use that ID to identify the corresponding event. That's workable because I could store the ID in a database as a placeholder, indicating that a suggestion was requested, and update the database entry when a corresponding event message comes in.
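The fallback described above (store a placeholder keyed by completion ID, then update it when an event arrives) might look like the following. This is only a sketch under the assumption that the completion ID becomes available alongside the language; the table layout, status labels, and function names are all hypothetical.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory table with one row per completion ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE suggestions ("
        " completion_id TEXT PRIMARY KEY,"
        " language TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def record_completion(conn: sqlite3.Connection, completion_id: str, language: str) -> None:
    """Insert the placeholder row when a completions message is seen."""
    conn.execute(
        "INSERT OR IGNORE INTO suggestions (completion_id, language) VALUES (?, ?)",
        (completion_id, language),
    )


def record_event(conn: sqlite3.Connection, completion_id: str, status: str) -> bool:
    """Update the placeholder when an events message arrives.

    Returns False if no placeholder exists for this completion ID.
    """
    cursor = conn.execute(
        "UPDATE suggestions SET status = ? WHERE completion_id = ?",
        (status, completion_id),
    )
    return cursor.rowcount == 1
```

With the language stored on the placeholder row, per-language counts reduce to a `GROUP BY language, status` query.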
Please reply with a 👍 if you want this feature.
Hello @rzuckerm, thank you for the detailed feature request. Have you checked the events we log at `~/.tabby/events`? They are more organized and should be ideal for extracting the insights you have in mind.
Example content:
{
  "ts": 1705686904479,
  "event": {
    "view": {
      "completion_id": "cmpl-2360b276-c5c0-4394-8b0b-200db391271e",
      "choice_index": 0
    }
  }
}
{
  "ts": 1705705936255,
  "event": {
    "completion": {
      "completion_id": "cmpl-c4498fd3-d541-4ead-8492-67184d86c539",
      "language": "python",
      "prompt": "<fim_prefix>def is_prime(n):\n<fim_suffix>\n<fim_middle>",
      "segments": {
        "prefix": "def is_prime(n):\n"
      },
      "choices": [
        {
          "index": 0,
          "text": " if n == 2:\n return True\n if n % 2 == 0:\n return False\n for i in range(3, int(n ** 0.5) + 1, 2):\n if n % i == 0:\n return False\n return True"
        }
      ]
    }
  }
}
{
  "ts": 1705705985123,
  "event": {
    "completion": {
      "completion_id": "cmpl-46b600fc-a541-4375-b6f5-8b4bc55952ab",
      "language": "python",
      "prompt": "<fim_prefix>def is_prime(n):\n<fim_suffix>\n<fim_middle>",
      "segments": {
        "prefix": "def is_prime(n):\n"
      },
      "choices": [
        {
          "index": 0,
          "text": " if n == 2:\n return True\n if n % 2 == 0:\n return False\n for i in range(3, int(n ** 0.5) + 1, 2):\n if n % i == 0:\n return False\n return True"
        }
      ]
    }
  }
}
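To illustrate how these logged events could be joined, here is a rough sketch that counts viewed suggestions per language by matching `view` events to `completion` events on `completion_id`. It assumes the records have already been parsed into dictionaries shaped like the example content; the function name is hypothetical, and how the files under `~/.tabby/events` are named or delimited is not assumed here.

```python
from collections import Counter
from typing import Iterable


def views_per_language(records: Iterable[dict]) -> Counter:
    """Count `view` events per language using the logged `completion` events."""
    language_by_id = {}
    viewed_ids = []
    for record in records:
        event = record.get("event", {})
        if "completion" in event:
            completion = event["completion"]
            language_by_id[completion["completion_id"]] = completion["language"]
        elif "view" in event:
            viewed_ids.append(event["view"]["completion_id"])
    # Join after the full pass so the order of log entries does not matter.
    return Counter(language_by_id.get(cid, "unknown") for cid in viewed_ids)
```

Views whose completion was not seen in the same batch of records are counted under `"unknown"`. Other event kinds could presumably be joined the same way if they also carry a `completion_id`, but their shape is not shown in the example.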
One requirement that I forgot to mention is that I'd like to collect the IP addresses of the messages in order to keep track of the number of Tabby users. That information is not available in the log file. Also, while possible, monitoring the log files doesn't fit well into the architecture of our analytics service.
Well, since the completion ID is only created after the request to `/v1/completions`, I don't think we have a way to make it available to traditional nginx-style logging.
Shall we follow up in the Slack channel? Happy to learn about and discuss your use case.
The `/v1/completions` change was my 2nd choice. My 1st choice is to add the language to the `/v1/events` message. As for Slack, I don't use that.
Sorry, I don't use Slack, but I do have a Discord account.
As for my use-case, my team has an analytics service that tracks various metrics for tools that we develop and maintain. Our analytics server listens to HTTP requests that contain analytics information and logs that information to a database. We use Grafana to query that database in order to produce various graphs and charts that allow us to track how these tools are being used and where we need to focus our development efforts.
One of the metrics we would like to track is the number of Tabby users as well as the percentage of acceptance (partial and complete) of suggestions from Tabby for different languages over time. If it is not feasible to add the language to the `/v1/events` message, then tracking the overall acceptance is fine, and I will withdraw this request.
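For reference, the acceptance percentages mentioned above could be derived from three counters per language. This is a small sketch; the function and key names are hypothetical, and it assumes each offered suggestion is counted once and accepted at most once.

```python
def acceptance_rates(offered: int, partial: int, complete: int) -> dict:
    """Return partial, complete, and overall acceptance as percentages of offers."""
    if offered == 0:
        # Avoid division by zero when a language has no offered suggestions.
        return {"partial_pct": 0.0, "complete_pct": 0.0, "overall_pct": 0.0}
    return {
        "partial_pct": 100.0 * partial / offered,
        "complete_pct": 100.0 * complete / offered,
        "overall_pct": 100.0 * (partial + complete) / offered,
    }
```

If the language is never added to the events message, the same function still applies to the overall counts across all languages.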
I'd suggest going with the logs under `~/.tabby/events`, using something like http://vector.dev/ to poll the logged events and forward them to your data storage.
That still doesn't meet the requirement to track users. We get the user from the IP address of the message, and we only use the IP address to count the number of unique users. That is not in the event logs. In our event logs, the `user` field is `null`. I guess that's because we have "Disable anonymous usage tracking" set in VSCode, since we don't want to leak any data outside of our company (sorry, we're paranoid about that 😄).
One thing we will be working on is filling the `user` field with the user's email account for an authenticated server. It should be ready around version 0.10, and will certainly be ready before the 1.0 release.
(You can track the progress at https://github.com/TabbyML/tabby/issues/1324.)
> we don't want to leak any data outside of our company (sorry, we're paranoid about that 😄).
Understood—ultimately, that's something that drove us to build Tabby from the very beginning.