[messaging] Linked Identities (was: affirmations)

Wed Jan 21 14:45:24 PST 2015

Hi Vincent--

On Wed 2015-01-21 06:40:14 -0500, Vincent Breitmoser wrote:

> I am at this point halfway through implementing support in
> OpenKeychain. While thinking about it and talking with various people, I
> have come to dislike the name "affirmation" as too arbitrary and
> unspecific. Best I could come up with till now is "linked identity", but
> maybe someone has an even better idea :)

I'm fine with the name change, and think it makes sense :)

> A linked identity is a mutual, verifiable connection between a pgp key
> and a resource on the web. The keywords here are verifiable and mutual:
> A linked identity should be considered valid if and only if the resource
> it points to links back not only to the pgp key, but to that exact
> linked identity packet, in a manner which can be verified by the client.

Sorry to be pedantic below, but we're talking about trying to define
something that is potentially pretty esoteric in a space that doesn't
have the clearest threat model, so i'm going to go ahead and pick the
nits...

You've said "a resource on the web" above, but your example below
appears to be a DNS example.  DNS is not the web -- do you mean "a
resource on the Internet?"

Who is supposed to make sense of these linked identities?  If it's a
normal person (a "user"), then they need to be able to understand what
the resource is.  A human name fits that definition, as does an e-mail
address, which is something that (most) internet users can recognize and
compare.  a domain name itself might also fit that definition, and
*maybe* a full URL could too (though more and more people are using web
browsers that hide URLs from them too, so this is not as widely
understood).  i'm not sure how much more complex you can get and still
have this linked identity be something that normal people can
understand.  And if they can't understand it, what is the goal here?

> The reliable statement we want from a linked identity is that the owner
> of the keyring simultaneously controlled both the keyring (ie, a subkey
> with 'certify' capability, ie, its master key) and the (context of the)
> linked resource at one point in time.

I'm not sure what you mean by "(context of the)" here.  Do you mean
"content" instead of "context"?

> This implies that the linked identity packet requires some sort of
> identifier, which is then part of the resource content.

I'm not sure that it does.  Let's look at a normal User ID for a second,
of the usual form "Jane Doe <jane@example.org>".  This identifies "the
person named Jane Doe, who has e-mail address <jane@example.org>".  The
certification (selfsig) over that User ID and the primary key says that
the person identified by the uid is the same person who controls the
primary key (and by extension, the subkeys).  The e-mail address is the
network resource.
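
To make the two parts of such a User ID concrete, here is a minimal
sketch in Python.  It assumes the conventional "Name <address>" layout;
OpenPGP itself treats the User ID as an opaque string, so the split is a
convention rather than something the format guarantees.

```python
from email.utils import parseaddr

# Conventional User ID layout; the split is a convention only,
# since OpenPGP stores the User ID as an opaque UTF-8 string.
uid = "Jane Doe <jane@example.org>"

# parseaddr returns (display name, address) for an RFC 2822 style string.
name, address = parseaddr(uid)

print(name)     # Jane Doe  (the human-readable identity)
print(address)  # jane@example.org  (the network resource)
```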

When i use caff or monkeysign or pius to certify someone else's key, i
verify their control of the resource (the e-mail address) as well as the
key by sending an encrypted copy of my certification to the e-mail
address.  Only someone with control of the resource can decrypt the
message to import and republish the certification.

Note that there was no extra identifier there.

If i say that the "network resource" i control is
https://social.example/dkg and there's a way to send me private messages
at that web page, how is that different from the e-mail workflow?

> So now, a linked identity consists of a URI to some resource, and an
> identifier. We *could* encode this identifier in the uri, but I would
> not consider the identifier as part of the URI to the resource - quite
> the opposite actually, the identifier is what we expect to find in the
> *content* of the resource, and is specifically un-involved in the
> process of resolving the URI for its content.

I don't yet understand why we need the identifier.  Why not provide the
OpenPGP fingerprint itself in the content of the resource instead of an
arbitrary identifier??

> Still, we could put the identifier in the URI and put that in a user
> id. At that point, we have something like
> pgpid:0123456789abcdef01234567@dns:domain.com?TYPE=TXT
> and I would argue that without client support, this will not only not
> lead the user to the right conclusions, but confuse or even mislead them.

the example above looks pretty opaque to me, and i agree that you
wouldn't want to expose that in a User ID.  By the same token, though,
it's not clear what you could do with it to make it useful to the end
user directly.  What is Alice supposed to do with the knowledge that Bob
can place a TXT record at domain.com ?

> Another important point is that, even though a resource may be correctly
> identified by a URI, it might not be available by generic means. The
> best example I have is twitter, where the https-uri of a tweet bears
> sufficient information to find the tweet, but the implementation still
> needs to specifically know how to parse just the tweet text from the
> website, not any replies or whatever else may be on the website as
> retrieved from the https URI. The generic 'fetch website, grep for
> backlink' resource is actually the special case here, because it can
> only be used if a user has guaranteed exclusive access to the linked
> resource, which is not the case for many resources published by accounts
> on social networks.

Yes, i agree that if we want things to be able to be verified in an
automated way, then there needs to be explicit support from the network
service provider to make sure that the resource is both
human-comprehensible and machine-extractable.
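
The limitation of the generic "fetch website, grep for backlink" check
can be sketched as follows.  This is an illustration only: the helper
names are hypothetical, the page text is assumed to have been fetched
already, and the two sample pages are invented.

```python
import re

def normalize_fingerprint(fpr: str) -> str:
    """Drop whitespace and lowercase, so 'D4AB 1929 ...' equals 'd4ab1929...'."""
    return re.sub(r"\s+", "", fpr).lower()

def contains_backlink(content: str, fingerprint: str) -> bool:
    """Generic check: does the fingerprint appear anywhere in the text?

    The lookarounds keep the match from being part of a longer hex run.
    """
    wanted = normalize_fingerprint(fingerprint)
    pattern = r"(?<![0-9a-f])" + re.escape(wanted) + r"(?![0-9a-f])"
    return re.search(pattern, content.lower()) is not None

FPR = "d4ab192964f76a7f8f8a9b357bd18320deadfa11"

# A page the key owner controls exclusively (invented sample).
own_page = "Verifying my PGP key; pgp+linked:" + FPR

# A shared page where a third party posted the same text (invented sample).
shared_page = "Bob: nice weather today\nMallory (reply): pgp+linked:" + FPR

print(contains_backlink(own_page, FPR))     # True
print(contains_backlink(shared_page, FPR))  # True, although Bob never posted it
```

Both calls return True, which is why a plain grep is only safe when the
account holder has exclusive control over everything at that URI.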

> For linked identities to work, I think proper client support for
> verification is a requirement. Conversely, being able to see a linked id
> as something other than 'opaque user ids' in a client which does not
> support them otherwise is hardly helpful. That said, handling a new user
> attribute packet is trivial for any implementation which already has
> routines for the jpeg type, and still very easy otherwise since user
> attributes are extremely similar to user ids.

well, it's easy to "handle" from the data perspective.  It's much less
clear to me how to handle it from the UI/UX perspective.  Say Alice
fetches the key that she suspects belongs to Bob, and it includes one of
these proposed User Attribute packets.  Alice decides she wants to sign
Bob's key.  her OpenPGP client now has to decide what to show her about
the different UIDs and UATs that she can certify or not.  What does it
show her?

> Different point: my current idea for the "link back" from the resource
> to the linked identity packet, which I would intuitively call "cookie",
> looks like this:
>
> [Verifying my PGP key; pgp+linked:fingerprint#nonce]
> e.g.
> [Verifying my PGP key; pgp+linked:d4ab192964f76a7f8f8a9b357bd18320deadfa11#0123456789abcdef01234567]
>
> So it's a very short text, followed by a uri including the fingerprint,
> plus the identifier for the linked identity packet as fragment. The text
> is meant to mitigate "hey tweet this text for me real quick lol"
> attackers, since people will hopefully be more wary of posting something
> which implies they are verifying anything, than just random line noise.

I'm still not sure what the fragment/nonce is supposed to do here.
Sorry if i'm being dense!  What attack does the cookie prevent?

I agree with you that making the assertion of the link from the network
resource back to the key itself should be both machine-interpretable and
human-interpretable, though, to avoid the attack you describe.
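
For the machine-interpretable half, a minimal Python sketch of parsing
the proposed cookie might look like this.  The pattern is modeled only
on the example quoted above; the fixed lengths (40 hex digits for the
fingerprint, 24 for the fragment) and the names Cookie and parse_cookie
are assumptions, not part of any specification or of OpenKeychain.

```python
import re
from typing import NamedTuple, Optional

class Cookie(NamedTuple):
    fingerprint: str  # 40 hex digits, lowercased (length is an assumption)
    nonce: str        # 24 hex digits, lowercased (length taken from the example)

# Literal text and URI scheme as in the quoted example.
COOKIE_RE = re.compile(
    r"\[Verifying my PGP key; "
    r"pgp\+linked:([0-9a-fA-F]{40})#([0-9a-fA-F]{24})\]"
)

def parse_cookie(text: str) -> Optional[Cookie]:
    """Return the first cookie found in text, or None if there is none."""
    match = COOKIE_RE.search(text)
    if match is None:
        return None
    return Cookie(match.group(1).lower(), match.group(2).lower())

post = ("[Verifying my PGP key; pgp+linked:"
        "d4ab192964f76a7f8f8a9b357bd18320deadfa11#0123456789abcdef01234567]")

cookie = parse_cookie(post)
print(cookie.fingerprint)  # d4ab192964f76a7f8f8a9b357bd18320deadfa11
print(cookie.nonce)        # 0123456789abcdef01234567
```

A client would still have to compare the parsed fingerprint with the key
it is actually looking at; parsing alone verifies nothing.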

I hope this feedback is useful to you.

Regards,

        --dkg