magrit
Data privacy
Hi,
I was wondering which processes occur locally in my browser, and which processes take place on your servers - for which data has to be sent and eventually stored on them. According to the documentation, only user basemaps as well as representations that generate a new geometry ("carte lissée, carroyage, discontinuité, cartogramme et liens") are stored on your servers. Thus am I correct in understanding that no raw data ever travels to your servers? Or are there certain cases in which it does, e.g. for the purpose of calculating new geometries? Or does it always travel to your servers, but is stored only in the above-mentioned cases?
Thanks, Alice
Hi,
TL;DR:
- Almost all the data imported into the public Magrit instance is sent to the server at some point, because some data massaging (and rudimentary caching) happens there.
- The connection to magrit.cnrs.fr uses TLS/SSL (edit 30/08/2022) ~~The connection to magrit.cnrs.fr don't use TLS/SSL.~~
There are some (very basic) details in French here: http://magrit.cnrs.fr/docs/privacy.html#quelques-d%C3%A9tails-permettant-de-comprendre-le-chemin-des-donn%C3%A9es-
Example:
- You import layer A -> layer A is sent to the server, where it undergoes a coordinate transformation, a conversion to UTF-8 encoding and a cleanup of some characters in the column names. Layer A is then both sent back to the client and stored in RAM for the duration of your session (so it can be retrieved quickly if you later ask for a feature such as a smoothed map or a gridded map).
- You render a (Choropleth | Prop. symbols | Categorical | Pictogram | Waffle) map using layer A -> everything happens in your browser (but the server already 'knows' layer A).
- You render a (Smoothed | Gridded | Cartogram | Link | Discontinuity) map using layer A -> the computation is done on the server (and the resulting layer is stored in RAM during your session, in case you want to export it using Magrit's export option).
- You import a tabular file (let's call it B) -> if B is a CSV file and doesn't contain coordinates, it stays in your browser; otherwise (if B is an odt/xls/xlsx file or a CSV file with coordinates) it is sent to the server for conversion (where it is neither written to disk nor stored temporarily in RAM).
- If your tabular dataset B was a CSV file, and you then join it with layer A and ask for a (Smoothed | Gridded | Cartogram | Link) map on layer A using one of the columns coming from dataset B, that column will now be known by the server.
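To make the cases above easier to check against your own workflow, here is a minimal Python sketch of the same rules. It is only a summary of this answer, not Magrit's actual code: the function names and the lowercase map identifiers are hypothetical and chosen for illustration.

```python
# Map types named in the answer above, grouped by where the computation happens.
# The identifiers are hypothetical labels, not names used by Magrit itself.
BROWSER_MAPS = {"choropleth", "prop_symbols", "categorical", "pictogram", "waffle"}
SERVER_MAPS = {"smoothed", "gridded", "cartogram", "link", "discontinuity"}
# Maps listed for the "joined CSV column" case in the answer.
JOINED_COLUMN_MAPS = {"smoothed", "gridded", "cartogram", "link"}


def tabular_import_reaches_server(file_format: str, has_coordinates: bool) -> bool:
    """A CSV without coordinates stays in the browser; anything else is sent for conversion."""
    return not (file_format.lower() == "csv" and not has_coordinates)


def render_reaches_server(map_type: str) -> bool:
    """Tell whether rendering this map type is computed on the server."""
    if map_type in BROWSER_MAPS:
        return False
    if map_type in SERVER_MAPS:
        return True
    raise ValueError(f"map type not covered by the answer: {map_type}")


def csv_column_reaches_server(joined_to_layer: bool, requested_maps: list[str]) -> bool:
    """A column from a browser-only CSV becomes known to the server once it is
    joined to a layer and used for one of the maps listed for that case."""
    return joined_to_layer and any(m in JOINED_COLUMN_MAPS for m in requested_maps)
```

For instance, `csv_column_reaches_server(True, ["choropleth", "waffle"])` is `False`, which corresponds to the "private data staying in your browser" scenario.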
Example of a map with private data staying in your browser:
- Import layer A (with no private data, as it is sent to our server), import dataset B in CSV format, join A and B, then render one of the (Choropleth | Prop. symbols | Categorical | Pictogram | Waffle) maps.
In the end, even though we can honestly tell you that we do not save, use or look at the layers that pass through our server, we can't offer strong guarantees or proof of it (but I can point you to a way of verifying that no data is sent to our server in the specific case I described just above).
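One possible way to do that kind of check yourself (an assumption on my side, not an official procedure) is to record the session in your browser's developer tools, export it as a HAR file, and list the requests to magrit.cnrs.fr that carried a body. The sketch below relies only on the standard HAR layout (`log.entries[].request`); the function name is hypothetical, and traffic that the HAR export does not record (for example WebSocket messages in some browsers) would not be seen.

```python
from urllib.parse import urlparse


def requests_with_body(har: dict, host: str) -> list[tuple[str, str, int]]:
    """Return (method, url, body_size) for each request to `host` that carried a body."""
    found = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        if urlparse(request["url"]).hostname != host:
            continue  # request went to another host
        body_size = request.get("bodySize", -1)  # -1 means "unknown" in HAR
        if body_size > 0 or "postData" in request:
            found.append((request["method"], request["url"], body_size))
    return found
```

Load the export with `json.load()` and call `requests_with_body(har, "magrit.cnrs.fr")` after performing the CSV import, the join and the rendering: if the dataset stayed in the browser, no entry in the result should contain it.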
If you want to use Magrit without sending data over the network, we strongly advise running Magrit locally (for example with Docker: you will only need internet access to download the Magrit Docker image). Of course, you can also deploy it on your local network (for example, we had a development version running in our office that was only accessible to my office-mates).
Just out of curiosity, don't hesitate to tell us whether Magrit with Docker is a suitable solution for you (and please explain why if not!).
I was interested in progressively migrating some features from the server to the client side... but it requires quite a lot of work...so there is clearly no roadmap about it :)
Hi,
Thank you so much for your answer, and my apologies for not answering sooner.
Magrit with Docker is indeed an excellent solution (if not the solution :) and we're looking into this. I had just one question about data persistence: is Redis (and thus its folder /var/lib/redis) the only storage brick (brique de stockage)? How much storage space should we plan for?
Thanks a lot!
I guess the storage space would depend on the foreseen usage of the app. Our public instance (magrit.cnrs.fr) uses 16 GB of RAM. @mthh is Redis the only storage brick?
Redis is indeed the only storage brick.
You don't need much space to host Magrit because Redis does not use disk persistence, so everything is in RAM. The biggest thing (in the case of a Docker installation) is the Magrit container itself (because we have quite a few dependencies, and tens or hundreds of MB of example datasets).
Speaking of disk storage, the Magrit Docker image weighs about 2.70 GB once unpacked. So I guess having a total of 4-5 GB of disk storage available before pulling the Magrit image is enough.
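If you want to check that figure on the target machine before pulling the image, a small Python sketch like the following can help. The 5 GB threshold is simply the upper end of the estimate above, and the path to check is an assumption: pass the filesystem where Docker keeps its images (commonly `/var/lib/docker` on Linux).

```python
import shutil

# Upper end of the estimate given above (image is about 2.70 GB unpacked).
RECOMMENDED_FREE_GB = 5.0


def free_gb(path: str = ".") -> float:
    """Free space, in GB (10**9 bytes), on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1e9


def enough_space(free: float, required: float = RECOMMENDED_FREE_GB) -> bool:
    """Tell whether `free` GB meets the recommended headroom."""
    return free >= required
```

For example, `enough_space(free_gb("/var/lib/docker"))` tells you whether that filesystem has the recommended headroom.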
I'm closing this issue as the discussion is no longer relevant due to the application's new architecture since v2, where all data remains in the user's browser (or in the app in the case of the standalone version).