[messaging] Zero-metadata object store
Peter Parkanyi
me at rhapsodhy.hu
Sun Dec 1 16:44:47 PST 2019
Hi Mikalai,
> Do metadata objects contain file tree information?
>
> Is file tree continuously packed into mobjects, i.e. many folders' data is placed into every mobject? Or, is it a one mobject per folder structure?
>
File system information and keying information are in mobjects, and they are continuously packed. The aim is to sync the repo index independently from any other content using as little bandwidth as possible.
> Suppose that I sync objects onto a server. Suppose also that one of the files there is big, like GBs BIG. What will happen if I change some 100 bytes in the middle? What objects change? Are new objects created? What should be sent to the server for sync?
A rolling hash splits a large file into smaller chunks. A 100-byte change would therefore realistically only change the flow of chunks from that point onwards, until the hash syncs up again. A new object is written with the new chunks, and the file metadata is updated accordingly. Mobjects are easy to rewrite, and version control is also easy to add because the object format is flexible.
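
To illustrate the resync behaviour, here is a minimal sketch of content-defined chunking. It is not the chunker zerostash uses: the Gear-style hash, the `GEAR` table, the cut condition and the roughly 256-byte average chunk size are all assumptions picked for the demonstration.

```python
import random

rng = random.Random(42)
# Assumed lookup table for a Gear-style rolling hash (not zerostash's).
GEAR = [rng.getrandbits(32) for _ in range(256)]

def split(data: bytes) -> list[bytes]:
    """Cut after every byte where the top 8 bits of the rolling hash are zero."""
    out, start, h = [], 0, 0
    for i, b in enumerate(data):
        # Each byte is shifted out of the 32-bit state after 32 steps, so a
        # cut point depends only on the 32 bytes that precede it.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        if h >> 24 == 0:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])  # trailing chunk without a cut point
    return out

original = bytes(rng.getrandbits(8) for _ in range(256 * 1024))

# Overwrite 100 bytes in the middle (flip every bit so they really change).
mid = len(original) // 2
edited = (original[:mid]
          + bytes(b ^ 0xFF for b in original[mid:mid + 100])
          + original[mid + 100:])

old_chunks, new_chunks = split(original), split(edited)
changed = [c for c in new_chunks if c not in set(old_chunks)]
print(f"{len(changed)} of {len(new_chunks)} chunks are new after the edit")
```

Only the chunks around the edit differ; everything before it, and everything after the first cut point that no longer sees the edited bytes, is reused as-is.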
>
> Is it correct that for reading a small section of a multi-GB file, I'll have to download from the server and decrypt all 4MB of some dobject?
Data objects are encrypted chunk by chunk, each chunk with a different key, so you could e.g. create a server that supports HTTP range requests and serves only the part of a dobject that’s referenced by the index.
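
As a rough sketch of the client side of that idea: given an index entry that says where a chunk lives inside a dobject, the request only needs a `Range` header. The `ChunkRef` type and its field names are hypothetical and not taken from the zerostash index format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical index entry: where one encrypted chunk sits in a dobject."""
    object_id: str
    offset: int   # byte offset of the chunk inside the dobject
    length: int   # length of the encrypted chunk in bytes

def range_header(ref: ChunkRef) -> dict[str, str]:
    """Build the HTTP header that asks for exactly this chunk."""
    if ref.offset < 0 or ref.length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    last = ref.offset + ref.length - 1
    return {"Range": f"bytes={ref.offset}-{last}"}

header = range_header(ChunkRef(object_id="ab12", offset=4096, length=1024))
print(header)  # {'Range': 'bytes=4096-5119'}
```

A server that honours the header would answer with `206 Partial Content` and just those bytes, which can then be decrypted with that chunk's key.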
>
>> I’ve uploaded the specification and code to https://github.com/rsdy/zerostash, and would be grateful for some feedback.
>
> Is it possible to give object ids and some keys as a way of sharing only part of the file tree, without giving out key material for other parts of the tree? Cause https://github.com/rsdy/zerostash#key-management reads like there's one data key (derived subkey).
The data key changes from chunk to chunk, so you could create an index that lists only the relevant chunks and share just that with someone else. That would leak the data key, but would not make any other data in the stash accessible without the relevant metadata.
>
> Quote from https://github.com/rsdy/zerostash#data-objects """The key to encrypt each chunk is Blake2s(plaintext) XOR Argon2(user key)""" Is that the plaintext of the content? When decrypting, I don't know the content, hence the key must be in some mobject. Why can't it be a randomly generated key, stored in a mobject, providing the same claimed feature: """therefore compromising a user key in itself does not necessarily result in full data compromise without access to indexing metadata."""
It could be, at the cost of metadata size and performance. I am looking into solutions to this, but for single-user cross-device syncing I didn’t consider this a big issue.
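
For reference, the derivation quoted from the README can be sketched as follows. This is an illustration only: it assumes both values are 32 bytes long, and it takes the Argon2 output as a ready-made parameter (`derived_user_key`, a hypothetical name) instead of running Argon2.

```python
import hashlib

def chunk_key(plaintext: bytes, derived_user_key: bytes) -> bytes:
    """Blake2s(plaintext) XOR derived_user_key, both assumed to be 32 bytes."""
    if len(derived_user_key) != 32:
        raise ValueError("expected a 32-byte derived key")
    content_hash = hashlib.blake2s(plaintext).digest()  # 32-byte digest
    return bytes(a ^ b for a, b in zip(content_hash, derived_user_key))

# Stand-in for Argon2(user key); a real key would come from the KDF.
user_key = bytes(range(32))

k1 = chunk_key(b"first chunk", user_key)
k2 = chunk_key(b"second chunk", user_key)

# Same content gives the same key; different content gives a different one.
print(k1 == chunk_key(b"first chunk", user_key), k1 != k2)
```

Because the key depends on the plaintext, it cannot be recomputed before decryption, which is why it has to be recorded in the metadata either way.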
>
> How are nonces generated?
Chunk nonces are `objectid XOR chunk size`, where chunk size is 4 bytes in little-endian. Meta object nonces are the lower 12 bytes of the object id.
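
A sketch of one possible reading of that rule. Several details are assumptions here and should be checked against the specification: that object ids are 32 bytes, that nonces are 12 bytes, that "lower" means the first 12 bytes of the id, and that the chunk size is XORed into the first 4 bytes of the nonce.

```python
import struct

NONCE_LEN = 12  # assumed nonce length

def meta_nonce(object_id: bytes) -> bytes:
    """Assumed reading: the first 12 bytes of the object id."""
    return object_id[:NONCE_LEN]

def chunk_nonce(object_id: bytes, chunk_size: int) -> bytes:
    """Assumed reading: XOR the 4-byte little-endian size into the id prefix."""
    size_le = struct.pack("<I", chunk_size)        # 4 bytes, little-endian
    padded = size_le.ljust(NONCE_LEN, b"\x00")     # zero bytes leave the id as-is
    return bytes(a ^ b for a, b in zip(object_id[:NONCE_LEN], padded))

oid = bytes(range(32))  # dummy object id: 00 01 02 ... 1f
print(meta_nonce(oid).hex())               # 000102030405060708090a0b
print(chunk_nonce(oid, 0x01020304).hex())  # 040200020405060708090a0b
```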
Peter