sketch Privacy Policy

First off, amazing package and thoughtful design! As mentioned in the readme, the default behavior is to call out to https://prompts.approx.dev. Is there more information on that endpoint regarding the privacy policy? How is the data used, is it stored, is it used for other purposes, etc?

Jan 24 '23 21:01 tszumowski

Hi @tszumowski

Short version: Stored state: right now we store Nginx logs, which we use to measure engagement (number of calls), and a prompt history, which gets deleted on every update of sketch. Separately, we rely on OpenAI (a third party) for completions, so prompts are sent there as well. Current use: the Nginx logs are used only to track engagement.

Longer, full-transparency version with all the details:

About the current running environment: Right now (23/01/26), the sketch server runs as a single Docker image on a single machine. It is based on https://github.com/approximatelabs/example-lambdaprompt-server . (I plan to update the base example server soon to include the sketch server example and Dockerfile; it's a 6-line Dockerfile plus one extra line, import sketch # noqa, in the main.py file.) Since it is built on the lambdaprompt server, it stores the calls to prompts in a local SQLite database for debugging: https://github.com/approximatelabs/lambdaprompt/blob/a85b5f4214f531c92bcf5de4ed9c7a0c72e1b5f7/lambdaprompt/server/main.py#L26 .

This docker image has an Nginx proxy in front of it (Nginx proxy manager) which stores logs.

Lastly, in both non-hosted mode and prompts.approx.dev mode, we are using OpenAI models as the completion endpoint right now (until we have our own model, which is likely many months away), so this sends the prompt to OpenAI. This means that the raw prompt that is generated and completed is subject to their Privacy Policy as well (I suspect).

Together, these two places represent 100% of the state that is stored on our server (the Nginx logs and the automatic history.db file). Right now, I am not mounting or persisting history.db, so every time I update sketch, any previous calls recorded in it are deleted. The Nginx logs I do store, and I use them to track the total number of calls to sketch.
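To illustrate what "tracking the number of calls from Nginx logs" can amount to, here is a minimal sketch. It is not the script actually used for sketch; it assumes Nginx's stock "combined" access log format, and the function name count_calls_per_day, the sample IPs, and the /prompt/ask path are all hypothetical. Note that it reads only the date and status code of each line.

```python
import re
from collections import Counter

# Assumption: lines follow Nginx's stock "combined" access log format, e.g.
# 203.0.113.5 - - [26/Jan/2023:17:01:00 +0000] "POST /prompt/ask HTTP/1.1" 200 512 "-" "ua"
LINE_RE = re.compile(
    r'\[(?P<day>[^:\]]+):[^\]]*\] '      # [26/Jan/2023:17:01:00 +0000]
    r'"(?P<method>[A-Z]+) \S+[^"]*" '    # "POST /prompt/ask HTTP/1.1"
    r'(?P<status>\d{3}) '                # 200
)

def count_calls_per_day(lines):
    """Count successful (2xx) requests per day; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status").startswith("2"):
            counts[match.group("day")] += 1
    return dict(counts)

# Hypothetical sample lines (documentation-range IPs, made-up path).
sample = [
    '203.0.113.5 - - [26/Jan/2023:17:01:00 +0000] "POST /prompt/ask HTTP/1.1" 200 512 "-" "python-requests/2.28"',
    '203.0.113.9 - - [26/Jan/2023:18:10:42 +0000] "POST /prompt/ask HTTP/1.1" 502 0 "-" "python-requests/2.28"',
    '203.0.113.5 - - [27/Jan/2023:01:01:13 +0000] "POST /prompt/ask HTTP/1.1" 200 640 "-" "python-requests/2.28"',
]

calls = count_calls_per_day(sample)
print(calls)
```

In practice the lines would come from the access log file (for example, by iterating over an open file handle) rather than from an in-memory list.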

So, net total: I am using the Nginx logs to track calls, I am not using the data for any other purpose, and we are relying on OpenAI for processing, so their policy on the data needs to be considered as well.

I need to figure out how to put together a proper Privacy Policy soon, so my answers here are just a technical description of what is happening; a more formal answer will follow. I recognize this matters to users of sketch, and I want to make sure we address it transparently and completely.

I also want to fully admit my naivety here: I don't know what documents (official privacy policy documents) I should be hosting or storing about this. What would be the most confidence-boosting form of privacy policy for me to look at, model after, or start from? Do you have any opinions here, @tszumowski?

Jan 26 '23 17:01 bluecoconut

@bluecoconut wow this was a very thoughtful and detailed reply. Much appreciated! That definitely gives me more comfort knowing about what is behind the endpoint and how the data is handled. Thank you.

Regarding suggestions on documentation: I perhaps shouldn't have used the term "Privacy Policy". While there are some templates/examples out there (e.g. OpenAI's), a privacy policy tends to be more of a legal document, with corresponding legalese, and is more often used by companies than by FOSS projects. So for this project that's overkill.

I think some variant of what you wrote above, placed in a README, will do just fine. As an example, the WhyLogs README has a Usage Statistics snippet that links to a brief but informative docs page. (That could all just be in the README too.)

Hope that helps! And thank you!

Jan 27 '23 01:01 tszumowski