Encrypted libraries leak lots of information

Open ef4 opened this issue 11 years ago • 75 comments

I spent some time auditing the crypto constructs for Seafile's encrypted repos, because I'd like to help make Seafile more secure and trustworthy. I found some significant problems.

An attacker who obtains a copy of the encrypted library without the key can:

  • read the complete list of directory and file names
  • know the size of every file
  • know which files share some of the same information
  • see the history of who changed each file, when, and what byte ranges were altered

Furthermore, since the same initialization vector is reused for all chunks, the library is vulnerable to watermarking and known-plaintext attacks.
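A minimal sketch of why a reused IV matters, assuming CBC mode with a 16-byte block. The block cipher here is a toy Feistel construction built from HMAC, used only as a stand-in for AES so the example runs on the standard library; the key, the all-zero IV, and the chunk contents are hypothetical and do not reproduce Seafile's actual code.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes, matching AES


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation standing in for AES. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """CBC mode over the toy cipher; data length must be a multiple of BLOCK."""
    assert len(data) % BLOCK == 0
    prev, out = iv, b""
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
        prev = toy_block_encrypt(key, mixed)
        out += prev
    return out


key = b"k" * 32
fixed_iv = b"\x00" * BLOCK  # hypothetical library-wide IV, reused for every chunk

# Two different chunks that happen to start with the same 16 bytes.
chunk_a = b"#include <glib.h" + b"A" * 16
chunk_b = b"#include <glib.h" + b"B" * 16

ct_a = cbc_encrypt(key, fixed_iv, chunk_a)
ct_b = cbc_encrypt(key, fixed_iv, chunk_b)

same_first_block = ct_a[:BLOCK] == ct_b[:BLOCK]    # shared prefix leaks through
same_second_block = ct_a[BLOCK:] == ct_b[BLOCK:]   # diverges once plaintext differs
print(same_first_block, same_second_block)
```

With a fresh IV per chunk, the first ciphertext blocks would differ even though the plaintext prefixes match.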

The first problem is straightforward to solve: encrypt all file and directory entries, not just the content chunks.

The second problem (predictable IVs) is not as easy to fix. To maintain Seafile's existing deduplication and synchronization capabilities, you want deterministic encryption. But maintaining semantic security with deterministic encryption is probably not possible.

As a practical improvement, you could use an HMAC of each chunk as its IV. This is still deterministic, but it would at least prevent chunks with the same prefix from sharing the same ciphertext prefix.
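A rough sketch of that idea, assuming HMAC-SHA256 truncated to 16 bytes and a dedicated IV key kept separate from the encryption key. Both choices, and the name `derive_iv`, are illustrative assumptions rather than anything Seafile implements.

```python
import hashlib
import hmac


def derive_iv(iv_key: bytes, chunk: bytes) -> bytes:
    """Deterministic per-chunk IV: truncated HMAC-SHA256 of the plaintext chunk."""
    return hmac.new(iv_key, chunk, hashlib.sha256).digest()[:16]


# Hypothetical key used only for IV derivation.
iv_key = b"hypothetical-iv-key-separate-from-the-encryption-key"

# Same 16-byte prefix, different remainder.
chunk_a = b"#include <glib.h" + b"A" * 16
chunk_b = b"#include <glib.h" + b"B" * 16

iv_a = derive_iv(iv_key, chunk_a)
iv_b = derive_iv(iv_key, chunk_b)
iv_a_again = derive_iv(iv_key, chunk_a)

print(iv_a == iv_a_again)  # identical chunks still get identical IVs (dedup works)
print(iv_a == iv_b)        # chunks that merely share a prefix get different IVs
```

Identical chunks still encrypt identically, so equality of whole chunks remains visible; only the shared-prefix leak goes away.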

To achieve strong secrecy, Seafile would need to give up deterministic encryption. This can still provide reasonable deduplication and efficient sync, but it would require clients to maintain their own cached mapping from chunk SHA-1 sums to their encrypted identities. You may want to look at how the Tarsnap client does something similar.
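One way such a client-side cache could look, as a sketch only. The class name `ChunkCache`, the random 16-byte block IDs, and the in-memory dict are assumptions for illustration; this is not a description of how Tarsnap or Seafile actually store the mapping.

```python
import hashlib
import secrets


class ChunkCache:
    """Maps the SHA-1 of a plaintext chunk to the random ID it was uploaded under.

    The mapping lives only on the client. The server sees random IDs and
    ciphertext encrypted under fresh random IVs, so it learns nothing about
    which chunks share content.
    """

    def __init__(self) -> None:
        self._by_digest: dict[str, str] = {}

    def lookup_or_assign(self, chunk: bytes) -> tuple[str, bool]:
        """Return (block_id, needs_upload) for a plaintext chunk."""
        digest = hashlib.sha1(chunk).hexdigest()
        if digest in self._by_digest:
            return self._by_digest[digest], False  # already stored: deduplicated
        block_id = secrets.token_hex(16)            # unrelated to the content
        self._by_digest[digest] = block_id
        return block_id, True


cache = ChunkCache()
id_first, upload_first = cache.lookup_or_assign(b"chunk contents")
id_again, upload_again = cache.lookup_or_assign(b"chunk contents")
id_other, upload_other = cache.lookup_or_assign(b"different contents")
```

The trade-off is that the cache itself becomes state the client has to persist and protect, since it cannot be rebuilt without decrypting the stored blocks.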

ef4 avatar Sep 03 '13 21:09 ef4

Thanks for your suggestions.

Encrypting file and directory entries has been discussed for some time. It'll be added in the future. But it takes a lot of effort so we don't have time to work on it currently. If file and directory entries are encrypted, we have to decrypt them when browsing the files from the web. That requires a lot of changes.

I don't quite understand why using a single IV for the whole library is vulnerable to known-plaintext attacks. The IV is derived from the password with salt. It's deterministic, but not so predictable. If it's so bad to use this method to produce an IV, it's bad to derive the key from the password too. I know it's better to use a different IV and key for each file/block. But that would greatly increase complexity.

killing avatar Sep 04 '13 01:09 killing

It's bad for the same reason that ECB mode is bad: patterns in the data remain visible as patterns in the ciphertext. The only difference is that in this case we're talking about patterns of chunks instead of patterns of cipher blocks. I think it's possible to guess with good probability whether a repo contains certain files, just based on the statistical patterns of blocks with shared common prefixes.

An attacker can often guess that a repo might contain some well-known file. But combined with the plaintext file & directory metadata, I don't even need to guess. For example, if I see a file containing 35684 bytes named "seafile/common/processors/blocktx-common-impl-v2.h" in your encrypted repo, it wouldn't take much guessing for me to know exactly what the plaintext of that file must be. Then I can run the chunking algorithm over it, and I learn what the first 16 bytes of each block always encrypt to (because the IV never changes). I can then search for other blocks that start with those same patterns, and now I know some of the plaintext of your other files, too.
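The matching step can be sketched as a codebook lookup. In this illustration the function `first_ct_block` is a stand-in that models "first ciphertext block under a fixed key and IV" as a deterministic function of the first 16 plaintext bytes; it is not real encryption, and the block IDs and chunk contents are made up. The attacker only ever touches `observed`, never the secret.

```python
import hashlib
import hmac

BLOCK = 16


def first_ct_block(secret: bytes, chunk: bytes) -> bytes:
    """Victim-side stand-in: with a fixed key and IV, the first ciphertext
    block depends only on the first 16 plaintext bytes."""
    return hmac.new(secret, chunk[:BLOCK], hashlib.sha256).digest()[:BLOCK]


# --- Victim side (secret unknown to the attacker) ---
secret = b"unknown-to-attacker"
known_plain = b"/* -*- Mode: C; tab-width: 4 */ public header"
private_plain = b"/* -*- Mode: C; tab-width: 4 */ secret code"
unrelated_plain = b"quarterly figures, do not share"

# What the attacker can read from the stored library: block ID -> first block.
observed = {
    "blk-known": first_ct_block(secret, known_plain),
    "blk-private": first_ct_block(secret, private_plain),
    "blk-unrelated": first_ct_block(secret, unrelated_plain),
}

# --- Attacker side ---
# Plaintext metadata identifies "blk-known" as a public file, so its
# plaintext is known and yields one codebook entry.
codebook = {observed["blk-known"]: known_plain[:BLOCK]}

# Any other block with the same first ciphertext block starts the same way.
recovered = {
    block_id: codebook[ct]
    for block_id, ct in observed.items()
    if block_id != "blk-known" and ct in codebook
}
print(recovered)
```

Every additional known or guessable file adds entries to the codebook, which is how the statistical picture builds up.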

Repeat over and over for every known or guessable plaintext, and an accurate statistical picture of some of your private data can emerge. Or even worse, maybe I can send you some chosen plaintext (like a specially crafted PDF) and get you to stick it in your repo. That opens up even better attacks, because I can choose the chunking, tag my chunks with common shared prefixes of various lengths, and then accurately sample how the rest of each chunk encrypts.

Anyway, I'm just an amateur interested in improving open source security. Maybe try to engage some dedicated crypto discussion groups -- I bet you'll get an earful about potential risks. Then you can decide if those risks matter to your threat model.

Thanks for Seafile. If I didn't think it was useful, I wouldn't spend time looking for security bugs. :-)

ef4 avatar Sep 04 '13 03:09 ef4

I second the need to encrypt all file and directory entries within encrypted libraries. Furthermore, the use of an HMAC of each chunk as its IV seems like a useful enhancement.

q000te avatar Jan 15 '14 21:01 q000te

Having the same IV for every chunk is the same as having no IV. You are not using the protocol in the way it was designed, so you are vulnerable to the attacks the protocol was designed to defend against. Using the HMAC of each chunk as an IV is better, but it introduces a relation between the IV and the data that was not researched by the designers of the protocol, so it might introduce a weakness too. Intuitively the relation is faint and deliberately obfuscated, but a true cryptographer must judge whether that is good enough, and then other cryptographers should critique that judgement.

tinco avatar May 13 '14 14:05 tinco

Any news on this? Filenames (and the other mentioned issues) leaking information is a major problem. And unless it's clearly mentioned when creating an encrypted library, it may give people a false sense of security.

3onyc avatar May 14 '14 10:05 3onyc

Your page states:

Privacy Protection

  • Deploy on your own machine
  • Encrypt files with your own password
  • Keep password only on clients

Seafile provides an advanced feature called encrypted library to protect your privacy. The file encryption/decryption is performed in the client-side. The password of an encrypted library is not stored in the server. Even the system admin of the server can't view the file contents. https://seacloud.cc/group/3/wiki/faq-for-security-features/

But as @ef4 points out above, Seafile really isn't ready to be used with sensitive material yet.

Your response is totally understandable:

Killing: But it takes a lot of effort so we don't have time to work on it currently.

BUT you need to state on your site that Seafile is beta software and only rudimentarily secure right now, instead of making it look like it is fully secure.

Thanks!

patrickwolf avatar Jun 08 '14 04:06 patrickwolf

kudos! @patrickwolf

Finkregh avatar Jun 08 '14 15:06 Finkregh

Is there any update on this topic? Are the issues raised by @ef4 still valid in the latest release?

Would be glad to have your insight on this.

fstoerkle avatar Oct 15 '14 09:10 fstoerkle

Agreed with @patrickwolf, the documentation is currently misleading regarding what to expect from Seafile on the security/privacy side.

A fair move would be to add a "Limitations" notice to the documentation.

@killing: Any update on this?

benoitv-code avatar Jan 21 '15 05:01 benoitv-code

Another limitation, albeit not a security one: auto upload of photos/videos on Android is not supported for encrypted libraries https://github.com/haiwen/seadroid/issues/201

To my knowledge Seafile is still the best Open Source file syncing solution that has integrated encryption and thus no need to fiddle around with third party encryption tools, which is especially annoying on mobile.

midi avatar Jan 21 '15 11:01 midi

For the time being, the only thing we plan to improve is to use a different salt for each encrypted library. The current version still uses a static salt for all encrypted libraries. The other proposed security enhancements to encrypted libraries are not scheduled. Those include:

  • Full encryption of file metadata, history, etc. Currently, only file content is encrypted.
  • Use of a separate IV for each file.

We'll make these limitations clear in our documentation. The users can make their own choice.
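For illustration, the difference between a static salt and a per-library salt can be sketched with PBKDF2 from the Python standard library. The iteration count, the SHA-256 hash, the 32-byte key plus 16-byte IV split, and the salt lengths are all assumptions for the example and not necessarily what Seafile uses.

```python
import hashlib
import os

ITERATIONS = 100_000  # assumed work factor, for illustration only


def derive_key_iv(password: bytes, salt: bytes) -> tuple[bytes, bytes]:
    """Derive a 32-byte key and a 16-byte IV from a password and salt."""
    material = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=48)
    return material[:32], material[32:]


password = b"correct horse battery staple"

# Static salt: two libraries with the same password end up with the same
# key and IV, so one precomputed guess table works against all of them.
static_salt = b"\x01" * 8  # hypothetical fixed value
key_lib1, iv_lib1 = derive_key_iv(password, static_salt)
key_lib2, iv_lib2 = derive_key_iv(password, static_salt)

# Per-library random salt: same password, unrelated keys.
salt_a, salt_b = os.urandom(16), os.urandom(16)
key_a, iv_a = derive_key_iv(password, salt_a)
key_b, iv_b = derive_key_iv(password, salt_b)

print(key_lib1 == key_lib2, key_a == key_b)
```

A per-library salt does not change the fact that the derived IV is then reused for every chunk inside that library.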

killing avatar Feb 01 '15 03:02 killing

It's much worse to have bad encryption than no encryption...and don't even try to implement security or distributed code if you are not an expert in the field. Use proven libs the recommended way only, and get the code audited. Otherwise it's worse than useless, and the users cannot make their own choice because users know nothing about security.

CoolkcaH avatar Feb 18 '15 20:02 CoolkcaH

Full encryption of file metadata, history etc. Only file content is encrypted.

Wait, are histories decrypted and stored or something? The content in the history should be encrypted as well, right?

shuhaowu avatar Feb 23 '15 21:02 shuhaowu

Look at the first post. I don't think anything has changed so far. And the dev already commented on this 24 days ago; I don't think it's that hard to read.

If you think your data is so important/confidential that it is not enough to keep it on your own hardware, I don't think anyone will stop you from implementing this or hiring someone to do so.

shoeper avatar Feb 25 '15 00:02 shoeper

Yes, file contents in the history are encrypted.

2015-02-25 4:52 GMT+08:00 Aaron Hastings [email protected]:

Wait, are histories decrypted and stored or something? The content in the history should be encrypted as well, right?

I certainly hope so. Can a dev comment on this?


killing avatar Feb 26 '15 03:02 killing

@patrickwolf I agree.

akaho avatar May 27 '15 20:05 akaho

Holy shit. This needs to be advertised loudly in the manual and the UI. I was about to send data to S3 and decided to look into how exactly Seafile is encrypting my library contents. This is security theater. I love Seafile, but people need to know to not use this functionality if they want something secure.

curtiszimmerman avatar Nov 28 '15 12:11 curtiszimmerman

Any news on this topic?

kantorkel avatar Feb 07 '16 09:02 kantorkel

@ef4 - Just wanted to say I appreciate the in-depth explanation you have given; very insightful to read!

stevesbrain avatar Feb 07 '16 22:02 stevesbrain

@kantorkel We'll update the encryption algorithm to use separate salt for each library in version 5.1 or 5.2. Other improvements proposed in this thread are not planned, as I said before. We'll make the limitation clear in our documentation.

killing avatar Feb 15 '16 07:02 killing

Where is the limitation section in the documentation? I cannot find it here: http://manual.seafile.com/security/security_features.html

shuhaowu avatar Apr 11 '16 01:04 shuhaowu

Which limitation? Are you talking about the metadata?

In the manual it says the following in bold characters.

Even the system admin of the server can't view the file contents.

shoeper avatar Apr 11 '16 10:04 shoeper

This ticket seems to be a blocker for Seafile adoption in Freedombox https://wiki.debian.org/FreedomBox/LeavingTheCloud

kelson42 avatar Feb 23 '17 21:02 kelson42

So what is the status on this? The issue remains Open, but @killing has stated that there are no plans for other improvements...

I sincerely hope that the issue related to the re-use of IVs is not going to stay unfixed forever. In fact, IMO, moving to at least a unique IV per file/block should be one of your highest priorities if you are serious about Seafile being used for anything other than hobby file servers.

If there are not going to be any further improvements, then this issue should be closed, so that those of us who are hoping for an improvement can look for a product which better meets our needs. If there is any intention to improve the current implementation of encrypted libraries, then it would be good to get some kind of definitive timeline on when we can expect it.

k-ninja avatar Jun 28 '17 04:06 k-ninja

Shouldn't this ticket be split into two (or more) separate issues: the single IV vulnerability and the unencrypted metadata/history? The first one really is a bug that should be fixed, while the second one is more of a design choice.

In the long term, I agree with other comments that even the metadata of the "filesystem" should be encrypted. If you want to claim without fine print that "Seafile uses encryption", I think some effort should be put into having full encryption of the libraries.

instantname avatar Mar 15 '18 09:03 instantname

More than 4 years? Any news, @killing? I wasn't aware of this and, as much as I like Seafile, that makes me want to stop using it…

Chouchen avatar Apr 23 '18 15:04 Chouchen

Will this issue ever be resolved? I mean, it's a serious problem and it's been 4 years.

FloTheSysadmin avatar May 22 '18 10:05 FloTheSysadmin

If an issue takes 4.5 years to solve, sadly, I'd say it's not a priority @RichardRMatthews . When security isn't a priority, I find it hard to put much faith in the project to protect secure data.

stevesbrain avatar May 22 '18 13:05 stevesbrain

Haha yeah, @stevesbrain totally beat me to the punch. I was thinking like man, this issue has been open for half as long as the length of the entire Apollo Program. So no, it's not going to ever be resolved.

curtiszimmerman avatar May 22 '18 15:05 curtiszimmerman

see https://github.com/haiwen/seafile/issues/350#issuecomment-184094215

We'll update the encryption algorithm to use separate salt for each library in version 5.1 or 5.2. Other improvements proposed in this thread are not planned

kantorkel avatar May 22 '18 15:05 kantorkel