[messaging] Tahoe-Lafs + miniLock: A device agnostic & user friendly zero knowledge file system?
totient at riseup.net
Tue Nov 4 08:53:21 PST 2014
While this isn't exactly a proposal for a “messaging” system, I thought
this might be a good place to float this idea, since it involves usable
crypto, elliptic curves, and since the authors of both systems are
subscribed to this list.
Currently, there is no zero knowledge file system that is user friendly
and fully open source. SpiderOak is easy to use, but not open source.
Tarsnap isn't easy to use, and the server side code hasn't been
published. Tahoe-LAFS is open source, but only really easy to use if
you're a sysadmin or Linux autodidact. miniLock is open source and easy
to use, but isn't a filesystem.
What would it look like if we combined the browser-based ease of use of
miniLock with the redundant, zero knowledge properties of the Tahoe
file system?
In the typical Tahoe-LAFS setup, the user must control the “gateway,”
which is responsible for file encryption/decryption. This can be the
user's laptop. However, if the user wants to access her files from
multiple devices, or share files with the ability to revoke access, she
must set up the “gateway” as a server (to ensure file availability) with
an HTTP proxy login (to control access revocation). As it exists now,
Tahoe-LAFS is a brilliant tool for system administrators and advanced
users who want a secure and redundant back-up solution. Tahoe's file
and folder sharing properties are harder to take advantage of presently,
since the knowledge barrier to setting up a Tahoe client is high enough
to keep the community of users quite small.
However, if Tahoe-LAFS and miniLock were glued together, then the user
wouldn't need to control her own gateway in order to assure the
confidentiality, integrity, and authenticity (for shared files) of her
files.
A Tahoe-miniLock hybrid could be implemented in various ways by the end
user:
1.) The user could rely on a service provider to run a Tahoe-LAFS
gateway, in which case the service provider would know how many files
she has uploaded, the names of those files, and the times when she
uploads/downloads them. The service provider would not know the
contents of the files.
2.) The user deploys her own gateway server, perhaps as a sysadmin for
an organization, and relies on Amazon EC2 to deploy storage nodes. In
this scenario, file metadata would be known only to the sysadmin and
authorized users. The user community would be dependent on Amazon EC2
for file availability. The sysadmin would not be able to see the
content of users' files, unless a file is explicitly shared with her.
3.) An organization deploys their own gateway server and their own
storage grid. The whole system is in house.
The user flow could look something like this:
-User enters her email address and password into a browser extension.
-The gateway server authenticates the client using SRP. Since the
server does not store password-equivalent data, the user may safely (I
believe?) use the same password to authenticate and to encrypt her files
with miniLock.
-The gateway server fetches the user's files from the storage node
servers and decrypts them, per Tahoe-LAFS's protocol. However, these
decrypted files are still encrypted to the user's public key, per
miniLock.
-Once logged in, the user sees her file directory structure, including
plaintext file names. When she clicks on a file, that file will be
decrypted locally, by the browser extension, and downloaded onto her
machine, per miniLock.
-To upload a file, the user selects a local file, and the browser
extension will encrypt the file with her miniLock public key. The
extension will then upload the file to the gateway server, which will
encrypt the file per Tahoe-LAFS and distribute it to the storage nodes.
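To make the two layers in the flow above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `toy_seal`/`toy_open` are a throwaway symmetric cipher standing in for both miniLock's public-key encryption and Tahoe-LAFS's own encryption, and the `Client` and `Gateway` classes are hypothetical names, not part of either project. It only shows the ordering of the layers and what the gateway gets to see.

```python
import hashlib
import hmac
import secrets


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy cipher (NOT real crypto): SHAKE-256 keystream XOR plus an HMAC tag."""
    nonce = secrets.token_bytes(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def toy_open(key: bytes, blob: bytes) -> bytes:
    """Reverse toy_seal, refusing blobs made with another key or altered."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class Gateway:
    """Hypothetical gateway: adds the outer layer; sees names, not contents."""

    def __init__(self):
        self._grid_key = secrets.token_bytes(32)  # stand-in for Tahoe-LAFS keys
        self._storage = {}                        # stand-in for the storage nodes
        self.seen = {}                            # what the gateway itself handled

    def upload(self, name: str, inner_blob: bytes) -> None:
        self.seen[name] = inner_blob
        self._storage[name] = toy_seal(self._grid_key, inner_blob)

    def download(self, name: str) -> bytes:
        return toy_open(self._grid_key, self._storage[name])


class Client:
    """Hypothetical browser extension: adds and removes the inner layer."""

    def __init__(self, gateway: Gateway):
        self._user_key = secrets.token_bytes(32)  # stand-in for the miniLock key
        self._gateway = gateway

    def upload(self, name: str, data: bytes) -> None:
        self._gateway.upload(name, toy_seal(self._user_key, data))

    def download(self, name: str) -> bytes:
        return toy_open(self._user_key, self._gateway.download(name))


gateway = Gateway()
client = Client(gateway)
client.upload("notes.txt", b"meeting at noon")
assert client.download("notes.txt") == b"meeting at noon"
# The gateway handled the file name in the clear, but only ciphertext contents.
assert b"meeting at noon" not in gateway.seen["notes.txt"]
```

Note that the file name still reaches the gateway in plaintext here, which matches the flow as described.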
Alternately, a group of people could deploy their own gateway server,
and allow members of their group to connect to it. This would protect
file names from a third-party gateway server. This way, users could
easily compartmentalize access to sensitive information within an
organization on a need-to-know basis. This would be useful for a
hospital or health clinic, for example, since information could be
restricted to relevant health care providers without sacrificing the
redundancy and availability of storing data on a server. Some
information, such as non-attributed epidemiological information, could
be shared with all health care workers in the hospital or clinic. This
setup would also be beneficial for a large media organization with
several investigative journalists, who may want to take advantage of
data redundancy and availability without sharing their work product with
all of their colleagues.
Sharing files, how would that work?
Sharing could be done a few different ways. This is what I've come up
with:
A user enters contact names and their respective miniLock public keys
into a “contacts” tab in the browser extension. This information is
encrypted with her public key and stored as a hidden part of her
directory structure, which is then uploaded to the gateway server,
encrypted again per Tahoe-LAFS, and distributed to the storage nodes.
In order to share a file, the user uploads a new file to the browser
extension and selects the contacts with whom she wants to share the
file. The application encrypts the file with her public key and the
public keys of the selected contacts. The application then instructs
the gateway server to create a new shared directory, with its own unique
URI (per Tahoe-LAFS). The gateway server returns the new URI to the
client application. The client application then encrypts the URI with
the public keys of each contact the user selected, including the user's
own public key. The application then instructs the user to send the
encrypted URI to her contacts. Since the URI is encrypted with their
respective public keys, it may be safely shared over email, or any
other insecure channel.
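The "encrypt the URI once per contact" step could be sketched roughly as follows. This is a toy under loud assumptions: the Diffie-Hellman group (`P`, `G`) is far too small for real use and merely stands in for miniLock's Curve25519 keys, the envelopes are not authenticated, the function names are hypothetical, and the URI string is a placeholder rather than a real Tahoe-LAFS capability. The point is only that the sender needs nothing but each contact's public key.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group (assumption): NOT miniLock's Curve25519, not secure.
P = 2**127 - 1  # a Mersenne prime, far too small for real use
G = 3


def keypair():
    """Return (secret, public) for the toy group."""
    secret = secrets.randbelow(P - 3) + 2
    return secret, pow(G, secret, P)


def _xor(data: bytes, shared: int) -> bytes:
    # Stretch the shared secret into a keystream as long as the data.
    stream = hashlib.shake_256(shared.to_bytes(16, "big")).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def seal_for(recipient_public: int, message: bytes):
    """Encrypt to one public key using a fresh ephemeral key pair."""
    eph_secret, eph_public = keypair()
    shared = pow(recipient_public, eph_secret, P)
    return eph_public, _xor(message, shared)


def open_with(secret: int, envelope) -> bytes:
    """Recover the message using the recipient's own secret key."""
    eph_public, body = envelope
    shared = pow(eph_public, secret, P)
    return _xor(body, shared)


def share_uri(uri: str, contacts: dict) -> dict:
    """One envelope per contact; each can be sent over an insecure channel."""
    return {name: seal_for(pub, uri.encode()) for name, pub in contacts.items()}


alice_secret, alice_public = keypair()  # the sharing user, included as a recipient
bob_secret, bob_public = keypair()      # one of her contacts
uri = "URI:placeholder-for-shared-directory"  # placeholder, not a real cap
envelopes = share_uri(uri, {"alice": alice_public, "bob": bob_public})

assert open_with(bob_secret, envelopes["bob"]).decode() == uri
assert open_with(alice_secret, envelopes["alice"]).decode() == uri
# Bob's envelope is useless to anyone holding a different secret key.
assert open_with(alice_secret, envelopes["bob"]) != uri.encode()
```

A real implementation would lean on miniLock's own multi-recipient file format rather than anything like this.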
Another problem that would need to be sorted out is how to keep the
names of individual files hidden from the gateway server. This would be
especially important for a user whose gateway is deployed by an
application service provider, and not her own organization. The file
name could be encrypted with the user's public key for unshared files.
This is obviously just a very rough outline of what a system like this
could look like. I'm curious to know what people on this list make of
this in terms of viability and security.
-totient