logseq
logseq copied to clipboard
Security issue: Logseq leaks graph's content outside of its directory.
What happened?
The issue described here is a security flaw.
One would expect that by encrypting all contents in the Logseq notes folder, the notes are safe. But in reality all notes can be read in plain text at ~/.logseq/graphs
.
This is a deal breaker for many users.
Reproduce the Bug
- create a logseq graph
- encrypt folder with favorite encryption software
- check ~/.logseq/graphs and see that all contents are still there available in plain text
Expected Behavior
one would expect that after encrypting notes, they are no longer accessible to anyone without the encryption keys.
Screenshots
No response
Desktop Platform Information
No response
Mobile Platform Information
No response
Additional Context
No response
Yep Logseq saves notes data outside of the users selected folder for their graph/notes.
I can see my notes in plaintext in the corresponding ~/.logseq/graphs/xxx.transit
file.
For something like VeraCrypt to be useful, all we need is a portable option for the graph, where these files are all stored in the users graph/notes folder.
For example in config.edn
:
:make-graph-portable? true
If true, the users graph folder will have one more folder: .graph
rather than being in the ~/.logseq/graphs
folder.
For example:
.git/
.sync/
assets/
draws/
exports/
journals/
logseq/
pages/
whiteboards/
.graph/ <-------- This is a new folder that contains the graph `.transit` file from `~/.logseq/graphs/`
This way, the graph is 100% portable and can be stored on a VeraCrypt volume securely.
Also you will want to add .graph/
to .gitignore
if you use git.
The above is my guess at what a possible solution might look like.
Thanks! The solution looks good to me. I'm happy to review PR on this.
I would argue that this should not even be a configurable option, it should just be the case always. I don't see a reason to have this plain text graph inadvertently placed separately from the other files, which trips a lot of people.
@danilofaria .transit
is a cache file of the index, it's too big for file sync service to sync.
@cnrpman the file could be there and just be ignored by the sync service. I don't see any issues. Or am I missing something?
@danilofaria It's not a problem for Logseq Sync or git with .gitignore
setup. But it's a big issue for common users that using general sync services like dropbox without knowledge about this.
We should keep the default setting simple and make it easy for most people. So a configurable option is more reasonable to me.
@cnrpman honestly idk if the biggest surprise is having this large file as part of the contents or having a plaintext file with all your sensitive data lying somewhere else in your computer without you knowing about it.
To me, the latter is a much bigger issue, as it goes against Logseq's philosophy of "you have control of your data".
And in any case, this file is not that big, since it is just text, and also Logseq already has a bunch of text under logseq/bak
that takes a comparable amount of space.
It could be configurable, but I would argue that the default should be for it to be together with the other files.
@danilofaria The search cache file is usually has a size of multiple times of the corresponding raw text - it's structured search index. It's totally unacceptable to be synced by default, as it's likely not be delta synced. Any other idea of consolidating the search cache is welcome
@cnrpman I don't think it is so big that you can't put it on Dropbox or similar. But if you think so, then it is alright to do it as @drawingthesun suggested and have it configurable and by default keep it out.
Some users will be surprised to find out their notes are stored elsewhere, but at least they will have a way to configure it to keep all files together.
Do you think there's any chance this ticket will get picked up any time soon?
@danilofaria
I can give some pointers if anyone is happy to work on the code:
Basically, it's to make this path configurable:
https://github.com/logseq/logseq/blob/bc568f6d5b9c8cfdae49f97de79ab98e9544d99b/src/electron/electron/handler.cljs#L255-L260
And need to use a dummy file to replace the .transit
file used here:
https://github.com/logseq/logseq/blob/bc568f6d5b9c8cfdae49f97de79ab98e9544d99b/src/electron/electron/handler.cljs#L208-L215
And need to use a dummy file to replace the .transit file used here:
@cnrpman would you mind explaining this bit in more detail?
@cnrpman can you also give pointers on how to define config options and how to access them in code and make the value accessible from this file you linked?
@cnrpman I found a workaround to this which was making the leaked .transit
file read-only (and deleting its contents). There are some errors in the console logs, but otherwise everything seems to work great and the file remains empty. Do you think I could have any issues with this workaround?
And need to use a dummy file to replace the .transit file used here: @cnrpman would you mind explaining this bit in more detail?
get-graphs
lists all .transit
files under the graph
directory and count them as graphs. So a fast workaround would be creating dummy files for the portable graphs (since they have no .transit
file stored in here)
For example, the portable graphs would touch empty files in name of <your graph path>.portable
can you also give pointers on how to define config options and how to access them in code and make the value accessible from this file you linked?
If you mean the global config, it's available in main process (which the file I mentioned above belongs to) https://github.com/logseq/logseq/blob/master/src/electron/electron/configs.cljs
If you mean the per-repo config helper, it is only accessible from renderer process, so IPC required to send the config to main process..
I found a workaround to this which was making the leaked .transit file read-only (and deleting its contents). There are some errors in the console logs, but otherwise everything seems to work great and the file remains empty. Do you think I could have any issues with this workaround?
Logseq would load internal DB cache from the .transit
file at start.
Oudated .transit
would make internal DB data outdated.
May need to re-index every time you start Logseq.