[messaging] Tahoe-Lafs + miniLock: A device agnostic & user friendly zero knowledge file system?

totient at riseup.net
Tue Nov 4 08:53:21 PST 2014


While this isn't exactly a proposal for a “messaging” system, I thought 
this might be a good place to float this idea, since it involves usable 
crypto, elliptic curves, and since the authors of both systems are 
subscribed to this list.

Currently, there is no zero-knowledge file system that is user friendly 
and fully open source.  SpiderOak is easy to use, but not open source.  
Tarsnap isn't easy to use, and the server side code hasn't been 
published.  Tahoe-LAFS is open source, but only really easy to use if 
you're a sysadmin or Linux autodidact.  MiniLock is open source and easy 
to use, but isn't a filesystem.

What would it look like if we combined the browser-based ease of use of 
miniLock with the redundant, zero-knowledge properties of the Tahoe 
file system?

In the typical Tahoe-LAFS setup, the user must control the “gateway,” 
which is responsible for file encryption/decryption.  This can be the 
user's laptop.  However, if the user wants to access her files from 
multiple devices, or share files with the ability to revoke access, she 
must set up the “gateway” as a server (to assure file availability) with 
an HTTP proxy login (to control access revocation).  As it exists now, 
Tahoe-LAFS is a brilliant tool for system administrators and advanced 
users who want a secure and redundant back-up solution.  Tahoe's file 
and folder sharing properties are harder to take advantage of presently, 
since the knowledge barrier to setting up a Tahoe client is high enough 
to keep the community of users quite small.

However, if Tahoe-LAFS and miniLock were glued together, then the user 
wouldn't need to control her own gateway in order to assure the 
confidentiality, integrity, and authenticity (for shared files) of her 
files.

A Tahoe-miniLock hybrid could be implemented in various ways by the end 
user:

1.)	The user could rely on a service provider to run a Tahoe-LAFS 
gateway, in which case the service provider would know how many files 
she has uploaded, the names of those files, and the times when she 
uploads/downloads them.  The service provider would not know the 
contents of the files.

2.)	The user deploys her own gateway server, perhaps as a sysadmin for 
an organization, and relies on Amazon EC2 to deploy storage nodes.  In 
this scenario, file metadata would be known only to the sysadmin and 
authorized users.  The user community would be dependent on Amazon EC2 
for file availability.  The sysadmin would not be able to see the 
content of users' files, unless a file is explicitly shared with her.

3.)	An organization deploys their own gateway server and their own 
storage grid.  The whole system is in house.

The user flow could look something like this:

-User enters her email address and password into a browser extension.
-The gateway server authenticates the client using SRP.  Since the 
server does not store password-equivalent data, the user may safely (I 
believe?) use the same password to authenticate and to encrypt her files 
with miniLock.
-The gateway server fetches the user's files from the storage node 
servers and decrypts them, per Tahoe-LAFS' protocol.  However, these 
decrypted files are still encrypted with the user's public key, per 
miniLock.
-Once logged in, the user sees her file directory structure, including 
plaintext file names.  When she clicks on a file, that file will be 
decrypted locally, by the browser extension, and downloaded onto her 
machine, per miniLock.
-To upload a file, the user selects a local file, and the browser 
extension will encrypt the file with her miniLock public key.  The 
extension will then upload the file to the gateway server, which will 
encrypt the file per Tahoe-LAFS and distribute it to the storage nodes.
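
The first step of this flow, turning an email address and password into 
key material inside the extension, might look roughly like the sketch 
below.  As I understand miniLock's design, the passphrase is hashed with 
BLAKE2s and then stretched with scrypt, using the email address as the 
salt.  The exact parameters here (N = 2^17, r = 8, p = 1, 32-byte 
output) and the function name are assumptions for illustration, not a 
verified reimplementation.  Turning the 32 bytes into a Curve25519 key 
pair would need a NaCl-style library and is left out.

```python
import hashlib


def derive_secret_key(email: str, passphrase: str, log2_n: int = 17) -> bytes:
    """Derive 32 bytes of secret key material from an email and passphrase.

    Hypothetical helper: the parameters are assumed to mirror miniLock's
    (BLAKE2s pre-hash, then scrypt salted with the email address).
    """
    # Pre-hash the passphrase to a fixed 32-byte value.
    digest = hashlib.blake2s(passphrase.encode("utf-8"), digest_size=32).digest()

    n = 1 << log2_n  # scrypt CPU/memory cost; 2**17 is the assumed value
    r = 8            # scrypt block size
    return hashlib.scrypt(
        digest,
        salt=email.encode("utf-8"),
        n=n,
        r=r,
        p=1,
        maxmem=256 * n * r,  # about twice the 128 * n * r bytes scrypt needs
        dklen=32,
    )
```

Because the derivation is deterministic, any device that knows the email 
address and password can re-derive the same key, so there is no key file 
to carry between machines.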

Alternatively, a group of people could deploy their own gateway server 
and allow members of their group to connect to it.  This would protect 
file names from a third-party gateway server.  This way, users could 
easily compartmentalize access to sensitive information within an 
organization on a need-to-know basis.  This would be useful for a 
hospital or health clinic, for example, since information could be 
restricted to relevant health care providers without sacrificing the 
redundancy and availability of storing data on a server.  Some 
information, such as non-attributed epidemiological information, could 
be shared with all health care workers in the hospital or clinic.  This 
setup would also be beneficial for a large media organization with 
several investigative journalists, who may want to take advantage of 
data redundancy and availability without sharing their work product with 
all of their colleagues.

Sharing files, how would that work?

Sharing could be done a few different ways.  This is what I've come up 
with:

A user enters contact names and their respective miniLock public keys 
into a “contacts” tab in the browser extension.  This information is 
encrypted with her public key and stored as a hidden part of her 
directory structure, which is then uploaded to the gateway server, 
encrypted again per Tahoe-LAFS, and distributed to the storage nodes.

In order to share a file, the user uploads a new file to the browser 
extension and selects the contacts with whom she wants to share the 
file.  The application encrypts the file with her public key and the 
public keys of the selected contacts.  The application then instructs 
the gateway server to create a new shared directory, with its own unique 
URI (per Tahoe-LAFS).  The gateway server returns the new URI to the 
client application.  The client application then encrypts the URI with 
the public keys of each contact the user selected, including the user's 
own public key.  The application then instructs the user to send the 
encrypted URI to her contacts.  Since the URI is encrypted with their 
respective public keys, it may be safely shared over email, or any 
other insecure channel.
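
The gateway side of this sharing flow maps fairly directly onto the 
Tahoe-LAFS web API.  Here is a rough sketch, assuming the gateway 
listens at http://127.0.0.1:3456 (the usual local default; a hosted 
gateway would have a different address), that POST /uri?t=mkdir creates 
a new directory and returns its cap, and that PUT /uri/<dircap>/<name> 
uploads a file into it.  The helper names and the example cap are made 
up, and the requests are only built here, not sent.  The miniLock 
encryption of the file and of the returned URI happens in the client and 
is not shown.

```python
from urllib.parse import quote
from urllib.request import Request

GATEWAY = "http://127.0.0.1:3456"  # assumed gateway address


def mkdir_request(gateway: str = GATEWAY) -> Request:
    """Build the request asking the gateway for a new, unlinked directory."""
    return Request(f"{gateway}/uri?t=mkdir", data=b"", method="POST")


def upload_request(dircap: str, name: str, ciphertext: bytes,
                   gateway: str = GATEWAY) -> Request:
    """Build the request that stores already-encrypted bytes in a directory."""
    # Caps contain ':' characters, so percent-encode every path segment.
    url = f"{gateway}/uri/{quote(dircap, safe='')}/{quote(name, safe='')}"
    return Request(url, data=ciphertext, method="PUT")


# Hypothetical usage; the cap below is a placeholder, not a real one.
example = upload_request("URI:DIR2:abc:def", "report.minilock", b"ciphertext")
```

Passing mkdir_request() to urllib.request.urlopen and reading the 
response body would yield the new directory's URI, which the client 
would then encrypt to each selected contact before handing it to the 
user.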

Another problem that would need to be sorted out is how to keep the 
names of individual files hidden from the gateway server.  This would be 
especially important for a user whose gateway is deployed by an 
application service provider, and not her own organization.  The file 
name could be encrypted with the user's public key for unshared files.

This is obviously just a very rough outline of what a system like this 
could look like.  I'm curious to know what people on this list make of 
this in terms of viability and security.

-totient

