[messaging] Opportunistic encryption and authentication methods

Wed Sep 3 16:37:38 PDT 2014

We've discussed methods for distributing and authenticating
public keys in email-like messaging.  I'll argue that "Opportunistic
Encryption" (OE) might be a good approach to this type of secure
messaging, and evaluate authentication methods in that light.

Background on Opportunistic Encryption
------
OE is an old concept, and people have different ideas about what it
means [1,2].  My take on the core ideas:
 * Authenticating a public key is harder than distributing it.
 * Thus authentication and encryption should be decoupled, so that
encryption can be deployed on a wide scale even without
authentication.

This has traditionally been controversial:  An
encrypted-but-not-authenticated connection is vulnerable to active
attack, so OE might not be worth much.  It might even have negative
value if it gives a false sense of security.

On the other hand:

1) There may be value to resisting large-scale passive eavesdropping
if switching to large-scale active attack is costly.

2) OE provides a foundation on which authentication can be added (e.g.
TOFU, fingerprints) [3].

3) A small number of users performing "stealthy" authentication could
protect other users by creating uncertainty about which connections
can be undetectably attacked [4].

This debate has played out different ways in different protocols.  For example:

 * STARTTLS between mail servers generally uses OE, and has some good
deployment between large providers [5].  People are thinking about how
to add authentication [6,7].

 * HTTPS is a non-OE protocol.  OE for HTTP (not HTTPS) is being proposed [8,9].

OE for email-like messaging
------
There's another argument for OE in the person-to-person case:

4) In the absence of widespread OE, users who publish their public key
and encrypt conversations will draw unwanted attention.

There's a new argument *against* widespread OE in the asynchronous
messaging case:  A key directory might get out of sync with a user,
and return a public key that the user has (for example) lost the
private key for.

I'll contend that 1-4 make a good case for widespread OE, and the risk
of messages encrypted to an out-of-sync public key is manageable:

 * At minimum, a service provider could implement a sort of "half-OE"
by registering key pairs for users and simply holding the private
keys.  This would hide from outsiders whether the user had opted for
full end-to-end encryption, and would provide some confidentiality for
messages that flow through multiple providers (like email; this is an
idea from UEE [21]).

 * A service provider could store most users' private keys encrypted
by a password, so that even a lost device doesn't result in
undecryptable messages.  A user could try password cracking in the
worst case of a forgotten password.

 * A third option is to simply give every user control of their own
private key, and if they lose their device(s) then they might lose
some messages sent before they upload a new key.  That might be
acceptable, or might not.

This could be debated more, but if you accept that OE makes sense
here, some principles follow:

A) Since we want widespread OE, the goal should be for encryption to be
as frictionless as possible (ideally enabled by default, including
multiple-device support), scalable, and reliable.  Users who don't
care about end-to-end authentication should not be inconvenienced by
it.

B) Since widespread OE would limit provider-based spam and malware
filtering, figuring out how to move these to the client is important
[10].

C) Authentication mechanisms should be evaluated on "stealthiness" as
well as usability and security.  Ideally it should be hard for any
observer (including service providers) to tell which conversations are
authenticated and which are not.

D) Authentication mechanisms will be built on top of OE, so can assume
that "identity public keys" and "key directories" already exist.

Evaluating authentication methods for secure messaging with OE
------
We can take the above principles and see whether different
authentication methods are compatible with widespread OE for
messaging.

TOFU: Compatible with OE since users could "stealthily" enable
notification of TOFU key changes, there's no effect on users who
don't, and no scalability issues that would inhibit widespread OE.
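
As a rough illustration of why TOFU stays out of the way of users who
don't opt in, here is a minimal Python sketch.  The class name
TofuStore, the notify_on_change flag, and the policy of silently
accepting changed keys for users who haven't opted in are all
assumptions made for illustration, not part of any existing client.

```python
import hashlib


class TofuStore:
    """Hypothetical local pin store: identifier -> hash of first-seen key."""

    def __init__(self, notify_on_change=False):
        # Opt-in flag; it lives only on the client, so the key directory
        # cannot tell which users have enabled it.
        self.notify_on_change = notify_on_change
        self.pins = {}

    def observe(self, identifier, public_key):
        """Record a key fetched from a directory.

        Returns True only when the key differs from the pinned one AND
        the user asked to be notified; otherwise returns False.
        """
        fingerprint = hashlib.sha256(public_key).digest()
        pinned = self.pins.get(identifier)
        if pinned is None or pinned == fingerprint:
            # First use, or the same key as before: trust and continue.
            self.pins[identifier] = fingerprint
            return False
        if self.notify_on_change:
            # Keep the old pin so the user can decide what to do.
            return True
        # Plain OE behaviour: silently follow the directory.
        self.pins[identifier] = fingerprint
        return False
```

A client with the flag off behaves exactly like plain OE, and a client
with the flag on gets a warning on the first changed key, for example
when careful.observe("bob@example.com", new_key) returns True.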

FINGERPRINTS: Compatible with OE since users could "stealthily"
communicate about fingerprints out-of-band, there's no effect on users
who don't, and no scalability issues.  In conjunction with TOFU, this
is Moxie's "simple thing" argument [11,12].
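
For concreteness, a fingerprint that two users could read to each
other out-of-band might be derived as in the Python sketch below.  The
choice of SHA-256, the truncation to 25 base32 characters (125 bits),
and the fixed five-character grouping are assumptions for
illustration; real systems choose their own hash, length, and layout.

```python
import base64
import hashlib


def fingerprint(public_key: bytes, length: int = 25, group: int = 5) -> str:
    """Return a lowercase base32 fingerprint split into readable groups."""
    digest = hashlib.sha256(public_key).digest()
    # Base32 avoids mixed case and most look-alike characters.
    encoded = base64.b32encode(digest).decode("ascii").lower().rstrip("=")
    truncated = encoded[:length]
    return " - ".join(
        truncated[i:i + group] for i in range(0, len(truncated), group)
    )
```

Both parties compute fingerprint() over the key they hold for the same
contact and compare the strings by voice or in person; nothing about
that comparison is visible to the key directory.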

KEYS AS IDENTIFIERS: Using public keys or fingerprints directly as
identifiers, or attaching them to identifiers, has a long history
(Bitcoin, YURLs / S-Links, SMTorP, CGA, etc.).  The argument is that
identifiers are being exchanged anyway, so we might as well piggyback
authentication data.

I argue this violates the OE concept by inconveniencing users who
don't care about end-to-end authentication (A).  In particular, it
adds costs such as:
 i) usability cost of dealing with long, random-looking identifiers
 ii) switching cost of replacing widely-distributed identifiers with
new ones (in address books, memory, published materials, etc.)
 iii) operational cost of redistributing identifiers whenever the
private key changes.  If users change keys frequently due to new
devices, software reinstallation, lost passwords, etc., it would be
inconvenient to change email addresses every time [11,15].

PROVIDER-IMMUTABLE NAME/KEY MAPPING VIA VERIFIABLE LOG:  Namecoin
proposes that users register a name for their public key in a
cryptocurrency-type blockchain.  Once the public key is registered, it
can only be changed by expiration or a chain of signatures (signing a
new key, which can sign another key, etc.)

There are some questionable design decisions in Namecoin [13,14], but
the general idea of first-come, first-served names for public keys that
are widely witnessed seems potentially useful.

If these names are the user's primary identifier, then this is similar
to the "keys as identifiers" approach except keys are given better
names by a public infrastructure.  So while this improves (i), it
still violates the OE concept due to (ii) the cost of switching to new
names and (iii) the operational cost of having your identifier tied to
a key.  Additionally, publishing all names and relying on a new
infrastructure raises hard-to-answer questions about privacy,
reliability, and scalability.

If these names aren't primary identifiers, but are instead exchanged
out-of-band to authenticate a specific public key, then this is
similar to fingerprints except keys are given better names:
 * my public key is "trevor_perrin_1970_email_2014 at Namecoin"
 * my public key is "gacuqk - aqoq - ecsag - biza - sjebre" (base32 fingerprint)

But this trades off "stealth" (C), as users with named keys are
advertising that they care about end-to-end authentication and might
be comparing keys out-of-band.  Users without named keys can probably
be attacked with impunity.

It's possible that the usability benefit of "named keys" instead of
fingerprints might justify the infrastructure cost and loss of
stealthy authentication, but the tradeoff is hard to evaluate.

PROVIDER-UPDATEABLE NAME/KEY MAPPING VIA VERIFIABLE LOG:  This is the
idea of a "transparency log", inspired by Certificate Transparency,
which is being explored by Keybase and Google's End-to-End [16,17,18].

Compared to a "provider-immutable" log, this accepts a more modest
security goal (notify on key changes) so that it works with existing
identifiers.  Moxie argues this goal is not much different than what
TOFU + fingerprints can achieve [19].  That's worth exploring more,
but to me this seems different enough that it would add security.

In any case, this doesn't suffer from (ii) or (iii), so the main
questions regarding compatibility with OE are privacy and
infrastructure cost.

Privacy:  Hashing identifiers won't be that effective [20], so this is
asking service providers to publish identifiers for a large portion of
their userbase.

Infrastructure cost:
 - Instead of just looking up Bob's public key, Alice needs to look up
a proof-of-inclusion, which might increase the response size to 1 KB+
for large providers.
 - Storage of all the log data, and recalculating new logs, might be
significant, depending on the frequency of log publication, the
frequency of key changes, the size of the userbase, etc.
 - To be practical, new keys would probably be batched into a new log
every 24 hours or so, which adds a delay that's not trivial to deal with.
 - To be effective, third-party monitors would need to download and
review log entries, and it's not clear who these are and what costs
they'd have to pay to keep up.
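
To give a feel for the proof-of-inclusion cost, here is a minimal
Python sketch of a Merkle-tree log.  Everything in it is an assumption
for illustration: the leaf encoding, the use of SHA-256, the function
names, and the shortcut of duplicating the last node on odd-sized
levels (which is simpler than, and not the same as, the tree defined
for Certificate Transparency).  It is not the design of Keybase or
End-to-End.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(identifier: str, public_key: bytes) -> bytes:
    # 0x00 prefix separates leaf hashes from interior-node hashes.
    return _h(b"\x00" + identifier.encode("utf-8") + b"\n" + public_key)


def build_levels(leaves):
    """Return all tree levels, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # assumption: duplicate the last node
        levels.append(
            [_h(b"\x01" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)]
        )
    return levels


def inclusion_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf hash and its proof."""
    node = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


def proof_size_bytes(num_entries: int) -> int:
    """Size of the sibling hashes alone: 32 bytes per tree level."""
    return (num_entries - 1).bit_length() * 32
```

Under these assumptions a log with 2**30 entries needs 30 sibling
hashes, so proof_size_bytes(2**30) is 960 bytes before adding the
signed root and any framing, which is in the same range as the 1 KB+
estimate above.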

ANONYMIZED LOOKUP AND AUDITING:  Some projects (e.g. Nyms [22]) have
suggested key lookups be performed via anonymized connections (e.g.
Tor, or a similar chain of proxies).  Then users could audit their own
key directory just by looking up their own key.

For widespread OE these lookups would be frequent.  Whether the
latency, reliability, and infrastructure cost of anonymizing them is
acceptable seems like an open question.

Conclusions
------
Not sure.  The TL;DR is that there might be value to deploying
end-to-end encryption at scale, even without end-to-end authentication
(OE), so it would be good to have authentication methods that enhance
the value of that instead of impeding it.

Trevor

[1] http://en.wikipedia.org/wiki/Opportunistic_encryption
[2] https://datatracker.ietf.org/doc/draft-dukhovni-opportunistic-security/?include_text=1
[3] http://www.ietf.org/mail-archive/web/uta/current/msg00311.html
[4] https://moderncrypto.org/mail-archive/messaging/2014/000229.html
[5] https://www.eff.org/encrypt-the-web-report
[6] https://github.com/jsha/starttls-everywhere/blob/master/README.md
[7] https://datatracker.ietf.org/doc/draft-ietf-dane-smtp-with-dane/
[8] http://httpwg.github.io/http-extensions/encryption.html
[9] http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1727.html
[10] https://moderncrypto.org/mail-archive/messaging/2014/000727.html
[11] https://moderncrypto.org/mail-archive/messaging/2014/000718.html
[12] https://moderncrypto.org/mail-archive/messaging/2014/000723.html
[13] https://moderncrypto.org/mail-archive/messaging/2014/000679.html
[14] https://moderncrypto.org/mail-archive/messaging/2014/000685.html
[15] https://moderncrypto.org/mail-archive/messaging/2014/000234.html
[16] https://moderncrypto.org/mail-archive/messaging/2014/#226
[17] https://moderncrypto.org/mail-archive/messaging/2014/000706.html
[18] https://moderncrypto.org/mail-archive/messaging/2014/#708
[19] https://moderncrypto.org/mail-archive/messaging/2014/000723.html
[20] https://moderncrypto.org/mail-archive/messaging/2014/000766.html
[21] https://github.com/tomrittervg/uee/blob/master/proposal.md
[22] http://nyms.io/