Diffstat (limited to 'post/2019-02-18-ghcq-exceptional-access-e2ee-decentralization-reproducible.md')
-rw-r--r-- | post/2019-02-18-ghcq-exceptional-access-e2ee-decentralization-reproducible.md | 973 |
1 files changed, 973 insertions, 0 deletions
diff --git a/post/2019-02-18-ghcq-exceptional-access-e2ee-decentralization-reproducible.md b/post/2019-02-18-ghcq-exceptional-access-e2ee-decentralization-reproducible.md new file mode 100644 index 0000000..ebdd49b --- /dev/null +++ b/post/2019-02-18-ghcq-exceptional-access-e2ee-decentralization-reproducible.md @@ -0,0 +1,973 @@ +# GCHQ's "Exceptional Access", End-To-End Encryption, Decentralization, and Reproducible Builds + +Late last November, + Ian Levy and Crispin Robinson of the GCHQ (the British intelligence + agency) published a proposal for intercepting end-to-end encrypted + communications, + entitled ["Principles for a More Informed Exceptional + Access Debate"][proposal]. +Since then, + there have been a series of notable rebuttals to this proposal + arguing why this system would fail in practice and why it should be + rejected. +Completely absent from these responses, however, + is any mention of existing practices that would prohibit this attack + outright---the + combination of free/libre software, reproducible builds, and + decentralized or distributed services. + +[proposal]: https://www.lawfareblog.com/principles-more-informed-exceptional-access-debate + +<!-- more --> + +This proposal is just the latest episode in the [crypto + wars][crypto-wars]: + Users need secure communications to protect their privacy and defend + against attackers, + but law enforcement and governments argue that this leaves them in + the dark. +But this one's a bit different. +The proposal states: + +[crypto-wars]: https://en.wikipedia.org/wiki/Crypto_wars + +> The U.K. government strongly supports commodity encryption. The Director +> of GCHQ has publicly stated that we have no intention of undermining the +> security of the commodity services that billions of people depend upon +> and, in August, the U.K. signed up to the Five Country statement on access +> to evidence and encryption, committing us to support strong encryption +> while seeking access to data. [...] 
We believe these U.K. principles will +> enable solutions that provide for responsible law enforcement access with +> service provider assistance without undermining user privacy or security. + +The suggestions in the article are a pleasant deviation from past proposals, + such as [key escrow schemes][key-escrow-eff]; + in fact, + it categorically denounces such schemes: + +[key-escrow-eff]: https://www.eff.org/deeplinks/2015/04/clipper-chips-birthday-looking-back-22-years-key-escrow-failures + +> There is no single solution to enable all lawful access, but we definitely +> don’t want governments to have access to a global key that can unlock any +> user’s data. Government controlled global key escrow systems would be a +> catastrophically dumb solution in these cases. + +So how do the authors propose intercepting communications? +They suggest inserting a third party---a + "ghost", as others have been calling it---into + the conversation. + +To understand the implications of adding a third party to an + end-to-end (E2E) encrypted protocol, + you have to understand how end-to-end encryption usually works in + practice.^[ + For another perspective, + see [Matthew Green's overview][green-ghost] in his response to + the GCHQ proposal.] + +[green-ghost]: https://blog.cryptographyengineering.com/2018/12/17/on-ghost-users-and-messaging-backdoors/ + + +## Undermining End-to-End Encrypted Communication Systems +Let's say that three users named Alice, Bob, and Carol wish to communicate + with one-another privately. +There are many ways to accomplish this, + but for the sake of this discussion, + we need to choose a protocol that attempts to fit into the model that + Levy and Robinson had in mind. +Alice and the others will make use of a centralized messaging service that + relays messages on behalf of users.[^centralized] +Centralized services are commonplace and include popular services like + Signal, WhatsApp, Facebook Messenger, iMessage, and many others. 
+They all work in slightly different ways, + so to simplify this analysis, + I'm going to talk about an imaginary messaging service called FooRelay. + +[^centralized]: See section [The Problem With Centralized + Services](#centralized-services). + +FooRelay offers a directory service that allows participants to find + one-another by name or pseudonym. +The directory will let Alice know if Bob and Carol are online. +FooRelay also offers private chat rooms supporting two or more participants. + +Alice, Bob, and Carol don't want anyone else to know what they are + saying---that + includes FooRelay's servers, + their Internet Service Providers (ISPs), + their employers, + their governments, + or whomever else may be monitoring the network that any of them are + communicating over.[^threat-model] +Fortunately for them, + FooRelay makes use of _end-to-end encryption_.[^primitive-e2e] + +[^threat-model]: The process of determining potential threats and + adversaries is called [threat modeling][]. + Since this article is about a proposal from a government spy agency, + it's also worth noting that global passive adversaries like the GCHQ and + NSA have the ability to monitor and store global traffic with the hopes + of later decrypting it. + [I have written about pre-Snowden revelations][national-uproar], + and [the EFF has compiled a bunch of information on NSA spying][eff-nsa]. + +[threat modeling]: https://en.wikipedia.org/wiki/Threat_model +[national-uproar]: /2013/06/national-uproar-a-comprehensive-overview-of-the-nsa-leaks-and-revelations +[eff-nsa]: http://eff.org/nsa-spying + +[^primitive-e2e]: Here I will describe a fairly elementary public-key + end-to-end encrypted protocol that omits many important features + (most notably, forward secrecy). + For detailed information on a modern and well-regarded key exchange + protocol, + see [X3DH][] (Extended Triple Diffie-Hellman), + which is employed by Signal. 
+ Following a key agreement, + the [Double Ratchet][] algorithm is widely employed for forward + secrecy even in the event of a compromised session key. + +[X3DH]: https://signal.org/docs/specifications/x3dh/ +[Double Ratchet]: https://www.signal.org/docs/specifications/doubleratchet/ + +Alice, Bob, and Carol each hold secret encryption keys known only to + them---their + _private keys_, + which are generated for them automatically by the FooRelay client + software running on their systems. +These keys can be used to _decrypt_ messages sent to them, + and can be used to _sign_ messages to assert their authenticity. +But these private keys must never be divulged to others, + including FooRelay's servers. +Instead, + each private key has a _public key_ paired with it. +The public key can be used to _encrypt_ messages that can only be decrypted + using the associated private key.[^pke] +Alice, Bob, and Carol each publish their public keys into FooRelay's + directory so that others may discover and use them. +When Alice wants to start a chat with Bob and Carol, + she can ask FooRelay to provide their public keys from the directory. + +[^pke]: This is called [_public-key cryptography_][public-key-crypto] + (or _asymmetric encryption_). + +[public-key-crypto]: https://en.wikipedia.org/wiki/Public-key_cryptography + +But making the public keys available in a directory is only part of the + problem---how + do Alice, Bob, and Carol know that the keys published to the directory + are actually associated with the _real_ Alice, Bob, and Carol?^[ + This topic is known as [_key distribution_][key-distribution].] +**This is the first opportunity to spy**, + if FooRelay is poorly designed. + +[key-distribution]: https://en.wikipedia.org/wiki/Key_distribution + +As stated by the proposal: + +> It’s relatively easy for a service provider to silently add a law +> enforcement participant to a group chat or call. 
The service provider +> usually controls the identity system and so really decides who’s who and +> which devices are involved - they’re usually involved in introducing the +> parties to a chat or call. You end up with everything still being +> end-to-end encrypted, but there’s an extra ‘end’ on this particular +> communication. + + +### Man-in-the-Middle + +Let's start by assuming a pretty grim scenario. +This is not quite the plan of attack that Levy and Robinson had in mind, + but it's important to understand why it would not work in practice. + +The FooRelay client software running on Alice's computer retrieves Bob's + public key from the identity service and initiates a chat. +FooRelay's server creates a new private chat room to accommodate the + request and adds two initial participants---Alice and Bob. +The FooRelay client then generates an invitation message containing + the identifier of the new room, + signs it using Alice's private key to prove that it was from Alice, + and sends it off to FooRelay's servers. +FooRelay's server verifies Alice's signature to make sure that she is + authorized to invite someone to the room, + and then sends the invitation off to Bob.[^whatsapp-group-chat] + +[^whatsapp-group-chat]: As it turns out, + getting invitations right can be difficult too. + [WhatsApp had a vulnerability that allowed for users to insert themselves + into group conversations][whatsapp-vuln] because it didn't implement a + similar protocol. + A better defense would be for Bob to publish the invitation from Alice + when he joins the room, + allowing anyone else in the room (like Carol) to verify that he was + invited by someone authorized to do so. + Only after verifying the invitation's signature would Carol decide to + encrypt messages to him. + +[whatsapp-vuln]: https://techcrunch.com/2018/01/10/security-researchers-flag-invite-bug-in-whatsapp-group-chats/ + +Bob is also running the FooRelay client on his computer. 
+It receives the invitation from Alice, + looks up her public key from the identity service, + and uses it to verify the signature on the invitation to make sure it + originated from Alice. +If the signature checks out, + FooRelay asks Bob if he'd like to join the chat. +Bob accepts. + +Alice enters a message into the FooRelay client to send to the chat room. +But remember, + Alice does not want the FooRelay server to know what message is being + sent. +So the FooRelay client on Alice's computer encrypts the message using Bob's + public key, + signs it using Alice's private key to assert that it was from her, + and sends it. +The FooRelay server---and + anyone else watching---see + junk data. +But Bob, + upon receiving the message and verifying its signature, + is able to decrypt and read it using his private key.[^sending] + +[^sending]: This is omitting many very important details that are necessary + for a proper implementation. + While this portrayal isn't necessarily dishonest at a high level, + there is a lot more that goes into sending a message. + See information on the [Double Ratchet][] algorithm for information on + one robust way to handle this exchange. + +**Now let's explore how to intercept communications.** +Enter Mallory. +Mallory works for the GCHQ. +FooRelay has been provided with a wiretap order against Carol. + +Alice wants to bring Carol into the conversation with her and Bob, + so she requests Carol's key from the identity service. +FooRelay's identity service, + subject to the wiretap order, + doesn't return Carol's public key; + instead, it returns Mallory's, + _who is pretending to be Carol_. +Alice sends the invitation to Mallory + (again, thinking he's Carol), + and the fake Carol (Mallory) joins the room. +Now when sending a message, + Alice encrypts using both Bob and Mallory's public keys, + so both of them can read it. 
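The key substitution just described can be sketched as a toy model. This is only an illustration under stated assumptions: the `Directory` class, its `wiretap` method, and the opaque key strings are hypothetical stand-ins for FooRelay's identity service, and no real cryptography is performed.

```python
class Directory:
    """Toy stand-in for FooRelay's identity service (hypothetical)."""

    def __init__(self, keys):
        # Maps a user name to the public key that user actually published.
        self._keys = dict(keys)
        # Maps a wiretapped user name to the interceptor's public key.
        self._substitutions = {}

    def wiretap(self, target, interceptor_key):
        """Serve the interceptor's key whenever the target is looked up."""
        self._substitutions[target] = interceptor_key

    def lookup(self, name):
        """Return the key the directory claims belongs to `name`."""
        return self._substitutions.get(name, self._keys[name])


def recipients_for(directory, participants):
    """Keys a naive client encrypts to: whatever the directory returns."""
    return {name: directory.lookup(name) for name in participants}


directory = Directory({"bob": "BOB-PUB", "carol": "CAROL-PUB"})
directory.wiretap("carol", "MALLORY-PUB")

# Alice's client has no way to tell that "carol" now maps to Mallory's key.
print(recipients_for(directory, ["bob", "carol"]))
```

The client in this sketch trusts whatever `lookup` returns, which is the blind trust the attack depends on.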
+ +But when Alice and Carol meet up tomorrow for lunch, + it will be pretty clear that Carol was not part of the conversation. +So Mallory is clever---he + has FooRelay provide him with Carol's _real_ public key. +When Alice sends Mallory an invitation to the room, + Mallory instructs FooRelay to create a covert _fake_ chat room with the + same identifier. +Mallory then sends an invitation to Carol to that new chat room, + _pretending to be Alice_. +But Mallory doesn't have access to Alice's private key, + and so cannot sign it as her; + he instead signs it using his own private key. + +FooRelay on Carol's computer receives the invitation, + which claims to be from Alice + (but is really from Mallory). +When it attempts to retrieve the key from the identity service, + rather than receiving Alice's key, + _the identity service sends back Mallory's_. +Now Mallory is impersonating _both_ Alice and Carol. +The signature checks out, + and Carol joins the covert chat. +FooRelay---still + under the wiretap order---announces + that Alice and Bob are both in the room, + even though they aren't. + +Now, + when Mallory receives a message from Alice that is intended for Carol, + he encrypts it using Carol's public key, + signs it using his own, + and sends it off to Carol. +Since Carol's FooRelay client thinks that Mallory's key is Alice's + (remember the invitation?), + the signature checks out and she happily decrypts the message and + reads it. +If Bob sends a message, + we repeat the same public key lookup procedure---FooRelay's identity + service lies and provides Mallory's key instead, + and Mallory proxies the message all the same.^[ + Of course, + it may be suspicious if Alice and Bob both have the same key, + so maybe Mallory has multiple keys. + Or maybe the FooRelay software just doesn't care.] + +This is a [man-in-the-middle (MITM)][mitm] attack. 
+But notice how **the conversation is still fully end-to-end encrypted**, + between each of Alice, Bob, Carol, and Mallory. + +[mitm]: https://en.wikipedia.org/wiki/Man-in-the-middle_attack + +Why is this attack possible? +Because FooRelay has not offered any insight into the identity + process---there + is no _authentication_ procedure. +Blind trust is placed in the directory, + which in this case has been compromised. + + +#### Mutual Authentication + +If the FooRelay client allowed Alice, Bob, and Carol to inspect each others' + public keys by displaying a [public key "fingerprint"][fingerprint], + then that would have immediately opened up the possibility for them to + discover that something odd was going on. +For example, + if Alice and Carol had previously communicated before Mallory was + involved, + then maybe they would notice that the fingerprint changed. +If they met _after_ the fact, + they would notice that the fingerprint Alice had for Carol was not the + fingerprint that Carol had for _herself_. +Maybe they would notice---perhaps + by communicating in person---that + the fingerprint that Alice associated with Carol and the fingerprint that + Carol associated with Alice were in fact the same (that is, Mallory's). + +[fingerprint]: https://en.wikipedia.org/wiki/Key_fingerprint + +To mitigate the first issue, + Mallory would have to MITM communications from the moment that Carol first + signed up for FooRelay, + and permanently thereafter. +The second could not be mitigated unless Mallory compromised Carol's device, + or FooRelay cooperated with Mallory to plant a defective FooRelay client + on Carol's device. +To mitigate the third, + maybe Mallory would use separate keys. +But if Alice, Bob, or Carol ever compared public keys in person with someone + else that was outside of their group of three, + then they would notice that the fingerprints did not match. 
+So FooRelay would have to always provide the wrong key to _everyone_ trying + to communicate with Carol, + and for _everyone_ Carol tried to communicate with, + in perpetuity---an + everlasting wiretap. + +This issue of mutual authentication is another complex topic that is very + difficult to solve in a manner that is convenient for users.[^wot] +For example, + Alice, Bob, and Carol could all meet in person and verify that + one-anothers' fingerprints look correct. +Or they could post their fingerprints to something outside of FooRelay's + control, + like social media. +This is the ["safety number"][safety-number] concept that Signal employs. + +[^wot]: One distributed model of associating a key with an owner is PGP's + [Web of Trust][wot], + which has been in use since the 1990s. + While it does enjoy use in certain communities, + it has failed to take off with average users due to the [complexities of + implementing the model properly][debian-keysign]. + PGP's author also came up with a short authentication string (SAS) + authentication protocol for VoIP systems called [ZRTP][], + but it relies on users being able to identify the authenticity of + one-anothers' voices, + a luxury that may be undermined in the near future by speech + synthesis systems [trained to reproduce real voices][ss-deep]. 
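The fingerprint comparison discussed in this section can be sketched in a few lines. This is a minimal illustration, assuming a fingerprint is simply the SHA-256 digest of the raw public key bytes; real systems such as Signal derive their safety numbers differently, and the key byte strings below are placeholders rather than real keys.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Return a human-comparable fingerprint of a public key.

    Assumption for illustration: the fingerprint is the SHA-256 digest
    of the key bytes, shown as groups of four hex characters.
    """
    digest = hashlib.sha256(public_key).hexdigest()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


# Placeholder key material (hypothetical, not real keys).
carol_real_key = b"carol-public-key"
key_alice_was_given = b"mallory-public-key"  # served by a compromised directory

# Carol reads out the fingerprint of her own key; Alice reads out the
# fingerprint of the key her client holds for Carol.
if fingerprint(key_alice_was_given) != fingerprint(carol_real_key):
    print("Fingerprint mismatch: someone may be in the middle")
```

The comparison only helps if it happens over a channel FooRelay does not control, such as meeting in person.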
+ +[safety-number]: https://signal.org/blog/safety-number-updates/ +[wot]: https://en.wikipedia.org/wiki/Web_of_trust +[debian-keysign]: https://wiki.debian.org/Keysigning/ +[zrtp]: https://en.wikipedia.org/wiki/ZRTP +[ss-deep]: https://en.wikipedia.org/wiki/Speech_synthesis#Deep_learning + +FooRelay could also implement a [trust-on-first-use (TOFU)][tofu] + policy---the + client software would remember the last public key that it saw for a + user, + and if that key ever changed, + then a prominent warning would be displayed.[^ssh-tofu] +For example, + if Alice communicates once with the real Carol, + the TOFU policy in the FooRelay client would record that real public key. +Then, + when Mallory tries to MITM the conversation, + Alice's FooRelay client would say: + "Hold up; the key changed! Something is wrong!" + +[tofu]: https://en.wikipedia.org/wiki/Trust_on_first_use + +[^ssh-tofu]: SSH users, for example, may be familiar with the almost-violent + warning when the server fingerprint changes. + Server fingerprints are stored in `~/.ssh/known_hosts` the first time they + are contacted, + and those fingerprints are used for verification on all subsequent + connection attempts. + +In any case, + let's assume that FooRelay's cooperation in serving up the wrong public + key is no longer sufficient because of these mitigations. +What does Mallory do without the ability to MITM? + +No respectable communication software should be vulnerable to this sort of + attack. +Knowing this, + Levy and Robinson had a different type of attack in mind. + + +### A Ghost in the Room + +Back when most people used land lines for communication via telephone, + wiretapping was pretty easy. +Conversations were transmitted in an unencrypted, + analog form; + anyone could listen in on someone else's conversation if they had some + elementary technical know-how and knew where to apply it. 
+By severing or exposing the line at any point, + an eavesdropper could attach [alligator clips][]---or + "crocodile clips", if you're east of the Atlantic---to + route the analog signal to another phone or listening device. + +[alligator clips]: https://en.wikipedia.org/wiki/Crocodile_clip + +Levy and Robinson try to apply this same concept as a metaphor for Internet + communications, + presumably in an effort to downplay its significance. +But the concepts are very different. +Continuing from the previous quote of Levy and Robinson's proposal: + +> This sort of solution seems to be no more intrusive than the virtual +> crocodile clips that our democratically elected representatives and +> judiciary authorise today in traditional voice intercept solutions and +> certainly doesn’t give any government power they shouldn’t have. +> +> We’re not talking about weakening encryption or defeating the end-to-end +> nature of the service. In a solution like this, we’re normally talking +> about suppressing a notification on a target’s device, and only on the +> device of the target and possibly those they communicate with. That’s a +> very different proposition to discuss and you don’t even have to touch the +> encryption. + +This statement is disingenuous. +We can implement the quoted suggestion in two different ways: +The first is precisely the situation that was just previously + described---allow + MITM and remain ignorant about it. +The second way is to have the FooRelay server _actually invite Mallory_ to + the chat room, + but _have the FooRelay client hide him from other participants_. +**He would be a ghost in the room;** + nobody would see him, + but Alice, Bob, and Carol's FooRelay software would each surreptitiously + encrypt to him using his public key, + as a third recipient. + +Sure, + the actual ciphers used to encrypt the communications are not weakened. +Sure, + it is still end-to-end encrypted. 
+But this is _nothing_ like alligator clips on a phone line---instead, + _an anti-feature has been built into the software_. +As the EFF notes, + [this is just a backdoor by another name][eff-ghost]. + +[eff-ghost]: https://www.eff.org/deeplinks/2019/01/give-ghost-backdoor-another-name + +If software has to be modified to implement this backdoor, + then it has to either be done for _every_ user of FooRelay, + or individual users have to be targeted to install a malicious version + of the program. +If either of these things is possible, + then _everyone_ is made less secure. +What if a malicious actor figures out how to exploit either of those + mechanisms for their own purposes? +Or what if someone tricks FooRelay into thinking they're from the GCHQ? + +And since this is a backdoor in the software running on the user's computer, + it is very difficult to be covert. +Nate Cardozo and Seth Schoen of the Electronic Frontier Foundation + [analyze various ways to detect ghosts][detect-ghosts], + which would tip Alice, Bob, and Carol off that Mallory is watching them. + +[detect-ghosts]: https://www.lawfareblog.com/detecting-ghosts-reverse-engineering-who-ya-gonna-call + +This is bad, + and everyone knows it. +The proposal is a non-starter. +But this shouldn't be the end of the conversation---there + is a much more fundamental issue at play that has received no + attention from the mainstream responses. + + +## Betrayed By Software {#betrayed} +All of these mainstream discussions make an implicit assumption: + _that users are not in control of the software running on their systems_. +The [detection methods][detect-ghosts] are discussed in terms of binary + profiling and side-channels. 
+[The GCHQ's proposal itself][proposal] fundamentally relies on the software + being modified in ways that are a disservice to the user---adding + a backdoor that surreptitiously exfiltrates messages to a third + party (Mallory) without the consent of other participants (Alice, Bob, + or Carol). + +When a user has full control over their software---when + they have the freedom to use, study, modify, and share it as they + please---we + call it [_free software_][free-sw]. +If FooRelay's client were free software, + then Alice, Bob, and Carol would all have the right to inspect it to make + sure no nasty backdoors were added,[^proprietary-malware] + or ask someone else to inspect it for them. +Or maybe they could depend on the fact that many other people are + watching---essentially + anyone in the world could at any moment look at FooRelay's client source + code. +This helps to keep FooRelay honest---if + they _did_ implement a feature that suppresses notifications as Levy and + Robinson suggest, + then they would have done so in plain sight of everyone, + and they would immediately lose the trust of their users. + +[free-sw]: https://www.gnu.org/philosophy/free-sw.en.html + +[^proprietary-malware]: Unfortunately, + [proprietary (non-free) software is often malware][proprietary-malware], + hiding things that work in the interests of its developers but + _against_ the interests of its users. + +[proprietary-malware]: https://www.gnu.org/philosophy/proprietary.html + +FooRelay could try to make the change in a plausibly deniable way---to + make the change look like a bug---but + then _anyone with sufficient skill in the community could immediately fix + it_ and issue a patch. +That patch could be immediately circulated and adopted by other users + without the blessing of FooRelay itself. +If FooRelay didn't implement that patch, + then users would [_fork_][software-fork] it, + making their own version and ditching FooRelay entirely. 
+Forking is a commonly exercised and essential right in the free + software community. + +[software-fork]: https://en.wikipedia.org/wiki/Software_fork + +The popular program Signal is free software.[^moxie-signal] +The [OMEMO specification][omemo]---which + implements many of the encryption standards that were developed by + Signal---is + also [implemented by multiple free software projects][omemo-yet], + some of which include [Pidgin][] (GNU/Linux, Windows, Mac OSX), + [Conversations][] (Android), + [ChatSecure][] (iOS), + and [Gajim][] (GNU/Linux, Windows). + +[omemo]: https://conversations.im/omemo/ +[omemo-yet]: https://omemo.top/ +[pidgin]: https://pidgin.im/ +[conversations]: https://conversations.im/ +[chatsecure]: https://chatsecure.org/ +[gajim]: https://gajim.org/ + +[^moxie-signal]: Unfortunately, + its author has caused some friction in the free software community by + [strongly discouraging forks and saying they are unwelcome to connect to + Signal's servers][moxie-fdroid]. + This also relates to the issue of centralization, + which is the topic of the next section; + Moxie [explains in a blog post why he disagrees with a federated + Signal][moxie-federation]. + +[moxie-fdroid]: https://github.com/LibreSignal/LibreSignal/issues/37 +[moxie-federation]: https://signal.org/blog/the-ecosystem-is-moving/ + + +If a program does not respect users' freedoms, + we call it _non-free_, or _proprietary_. +**Most of the popular chat programs today are non-free**: + Apple iMessage, Facebook Messenger, and WhatsApp are all examples of + programs that keep secrets from their users. +Those communities are unable to inspect the program, + or modify it to remove anti-features; + they are at the mercy of the companies that write the software. + +For example, + a recent [bug in Apple's FaceTime][facetime-vuln] left users + vulnerable to surveillance by other FaceTime users. +FaceTime likely has hundreds of thousands of users. 
+If it were free software and only a tiny fraction of those users actually + inspected the source code, + it's possible that somebody would have noticed and maybe even fixed the + bug before it was exploited.[^bugs-shallow] +Further, + after it _was_ discovered, + users had no choice but to wait for Apple themselves to issue a fix, + which didn't come until a week later. +The person who did discover it [tried to contact Apple with no + success][bad-apple], + and the world only found out about the issue when a video demoing the + exploit went viral eight days after its initial discovery. +This differs from free software communities, + where bugs are typically posted to a public mailing list or bug tracker, + where anybody in the community can both view and immediately act upon + it.[^embargo] + +[facetime-vuln]: https://9to5mac.com/2019/01/28/facetime-bug-hear-audio/ +[bad-apple]: https://www.wsj.com/articles/teenager-and-his-mom-tried-to-warn-apple-of-facetime-bug-11548783393 + +[^bugs-shallow]: This is often cited as [Linus's Law][linus-law], + which states that "given enough eyeballs, all bugs are shallow". + While this may be true, + that is certainly not always the case. + It is a common argument in support of open source, + [which covers the same class of software][floss-class] as free software. + However, + it's important not to fixate too much on this argument---it + [misses the point of free software][oss-misses-point], + and is a shallow promise, + since open source software is not always superior in technical + quality to proprietary software. + +[linus-law]: https://en.wikipedia.org/wiki/Linus's_Law +[floss-class]: https://www.gnu.org/philosophy/free-open-overlap.html +[oss-misses-point]: https://www.gnu.org/philosophy/open-source-misses-the-point.html + +[^embargo]: Sometimes an exception is made for severe security + vulnerabilities. 
+ For example, + the [`linux-distros` mailing list][linux-distros] is used to coordinate + security releases amongst GNU/Linux distributions, + imposing an embargo period. + This practice ensures that exploits are not made publicly available to + malicious actors before users are protected. + +[linux-distros]: https://oss-security.openwall.org/wiki/mailing-lists/distros + +But free software alone isn't enough. +How does Alice know that she _actually_ has the source code to the + program that she is running? + + +### Reproducibility and Corresponding Source Code {#reproducibility} + +The source code to FooRelay can't provide Alice with any security + assurances unless she can be confident that it is _actually_ + the source code to the binary running on her machine. +For example, + let's say that FooRelay has agreed to cooperate with the GCHQ to implement + ghosts by introducing a backdoor into the FooRelay client. +But since FooRelay is a free software project, + anyone can inspect it. +Rather than tipping off the community by publishing the _actual_ source + code, + _they publish the source code for a version that does not have the + backdoor_. +But when Alice downloads the compiled (binary) program from FooRelay, + she receives a backdoored version. + +To mitigate this, + **Alice wants to be sure that she has the _corresponding source code_**. + +One way for Alice to be confident is for her to compile the FooRelay client + herself from the source code. +But not everybody has the technical ability or desire to do + this.[^bootstrap] +Most users are instead going to download binaries from their operating + system's software repositories, + or from FooRelay's website, + or maybe even from other convenient third parties. +How can _all_ users be confident that the FooRelay client they download + actually corresponds to the source code that has been published and vetted + by the community? 
+ +[^bootstrap]: And then you have the issue of ensuring that you have the + corresponding source to the rest of your system so that it does not + [alter the behavior of the produced binary][trusting-trust]. + System-wide reproducibility is the topic of [_bootstrappable + builds_][bootstrappable-builds]. + +[trusting-trust]: https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf +[bootstrappable-builds]: http://bootstrappable.org/ + +[_Reproducible builds_][reproducible-builds] are required to solve this + problem. +When FooRelay is built, + it is done so in a manner that can be completely reproduced by others. +Bit-for-bit reproducibility means that, + if two people on different systems follow the same instructions for + building a program in similar enough environments, + every single bit of the resulting binary will match--- + they will be exact copies of one-another.[^unreproducible] + +[reproducible-builds]: http://reproducible-builds.org/ + +[^unreproducible]: Additional effort often has to be put into building + reproducibly because a build may produce timestamps corresponding to the + time of the build, + information specific to the environment in which the program is being + built, + and various other sources of nondeterminism. + +This has powerful consequences. +Alice no longer has to build the program herself---she + can trust that others have checked FooRelay's work. +FooRelay wouldn't dare try to distribute a tainted binary now, + since the community could trivially detect it. +Further, + Alice, Bob, and Carol could all verify that they have the _exact same + version_ of the FooRelay client, + and _all_ be confident that it was compiled from the same source code + that was published.[^verify-checksum] +They could even accept FooRelay from complete strangers and _still_ be + confident that it was compiled from the published source code! 
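The comparison that makes this possible can be sketched as a small script. This is a minimal illustration, assuming the community publishes a SHA-512 digest for each reproducible build; the helper names `file_sha512` and `matches_published` are hypothetical and not part of any FooRelay tooling.

```python
import hashlib


def file_sha512(path, chunk_size=65536):
    """Return the hex SHA-512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large binaries need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_digest):
    """Check a downloaded binary against a digest obtained elsewhere.

    `published_digest` is assumed to come from a source other than the
    one that served the binary, for example someone who rebuilt the
    program from the published source code.
    """
    return file_sha512(path) == published_digest.strip().lower()
```

A single differing bit changes the digest, so two parties holding matching digests can be reasonably confident that they hold identical binaries.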
+ +[^verify-checksum]: This verification can be done trivially by verifying + the _checksum_ of a program or distribution archive. + For example, + running `sha512sum foorelay` on a GNU/Linux system would output a _hash_ + of the contents of the file `foorelay`. + Alice, Bob, and Carol could then compare this value if they are all + running the same operating system and CPU architecture. + Otherwise they can compare it with published checksums, + or with others they trust. + +Reproducible builds have made a lot of progress in recent years. +As of February 2019, + for example, + [over 93% of all packages on Debian GNU/Linux are reproducible on the + `amd64` architecture][debian-reproducible], + which includes the aforementioned Pidgin and Gajim projects that + implement OMEMO. +[Signal also offers a reproducible Android build][signal-reproducible]. + +[debian-reproducible]: https://tests.reproducible-builds.org/debian/reproducible.html +[signal-reproducible]: https://signal.org/blog/reproducible-android/ + + +So let's go back to Levy and Robinson's proposal. +How do you implement a ghost in FooRelay where its client source code is + publicly available and its builds are reproducible? +You don't, + unless you can hide the implementation in a plausibly-deniable way and + write it off as a bug. +But anyone that finds that "bug" will fix it and send FooRelay a patch, + which FooRelay would have no choice but to accept unless it wishes to lose + community trust (and provoke a fork). + +Mallory could instead target specific users and compromise them + individually, + but this goes beyond the original proposal; + if Mallory can cause Alice, Bob, or Carol to run whatever program he + pleases, + then he doesn't need to be a ghost---he + can just intercept communications _before they are encrypted_. +Therefore, + reproducible builds---if + done correctly---make + Levy and Robinson's attack risky and impractical long-term. 
+ +But there is still one weak link---the + fact that Alice, Bob, and Carol are communicating with FooRelay's servers + at all means that Mallory still has the ability to target them by coercing + FooRelay to cooperate with him. + + +## The Problem With Centralized Services {#centralized-services} + +The final issue I want to discuss is that of centralized services. + +A centralized service is one where all users communicate through one central + authority---all + messages go through the same servers. +The hypothetical FooRelay is centralized. +Signal, iMessage, Facebook Messenger, WhatsApp, and many other popular chat + services are centralized. +And while this offers certain conveniences for users, + it also makes certain types of surveillance trivial to perform, + as they are bountiful targets for attackers, governments, and law + enforcement. + +But services don't have to be centralized. +_Decentralized_ services contain many separate servers to + which users connect, + and those servers can communicate with one-another. +The term _"federated"_ is also used, + most often when describing social networks.[^decentralize-term] +Consider email. +Let's say that Alice has an email address `alice@foo.mail` and Bob has an + email address `bob@quux.mail`. +Alice uses `foo.mail` as her provider, + but Bob uses `quux.mail`. +Despite this, + Alice and Bob can still communicate with one-another. +This works because the `foo.mail` and `quux.mail` mailservers send and + receive mail to and from one-another. + +[^decentralize-term]: While the term "decentralized" has been around for + some time, + there's not really a solid agreed-upon definition for "federated". + [Some people use the terms interchangeably][uu-federated]. + The term "federation" is frequently used when talking about social + networking. 
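The email example above can be sketched as a toy model in Python.
Everything here is hypothetical and greatly simplified:
  the `MailServer` class is my own invention for illustration,
  and the shared `network` dictionary merely stands in for the DNS lookup
  that real mail servers use to find one-another.

```python
class MailServer:
    """Toy mail server responsible for a single domain (not a real MTA)."""

    def __init__(self, domain, network):
        self.domain = domain
        self.mailboxes = {}      # local user -> list of (sender, body)
        self.network = network   # domain -> MailServer; stands in for DNS
        network[domain] = self

    def add_user(self, user):
        self.mailboxes[user] = []

    def send(self, sender, recipient, body):
        """Accept a message from a local user and route it by domain."""
        user, domain = recipient.rsplit("@", 1)
        # Deliver locally if we host the recipient; otherwise hand the
        # message to whichever server is responsible for their domain.
        server = self if domain == self.domain else self.network[domain]
        server.receive(sender, user, body)

    def receive(self, sender, user, body):
        self.mailboxes[user].append((sender, body))


network = {}
foo = MailServer("foo.mail", network)
quux = MailServer("quux.mail", network)
foo.add_user("alice")
quux.add_user("bob")

foo.send("alice@foo.mail", "bob@quux.mail", "Hi Bob!")
```

Alice only ever talks to `foo.mail` and Bob only ever talks to `quux.mail`,
  yet the message still arrives.
No single operator sits in the middle of _every_ conversation,
  which is the property that matters for the rest of this discussion.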
+ +[uu-federated]: http://networkcultures.org/unlikeus/resources/articles/what-is-a-federated-network/ + +[XMPP][]---the protocol on which OMEMO is based---is + a federated protocol. +Users can choose to sign up with existing XMPP servers, + or they can even run their own personal servers.[^me-prosody] +Federation is also the subject of the [ActivityPub][] social networking + protocol, + which is implemented by projects like [Mastodon][], [NextCloud][], and + [PeerTube][]. +[Riot][] is an implementation of the [Matrix][] protocol for real-time, + decentralized, end-to-end encrypted communication including chat, voice, + video, file sharing, and more. +All of these things make Mallory's job much more difficult--- + instead of being able to go to a handful of popular services like + FooRelay, Signal, WhatsApp, iMessage, Facebook Messenger, and others, + Mallory has to go to potentially _thousands_ of server operators and ask + them to cooperate.[^risk-popular] + +[^me-prosody]: I run my own [Prosody][] server, + for example, + which supports OMEMO. + +[^risk-popular]: Of course, + there's always the risk of a few small instances becoming very + popular, + which once again makes Mallory's job easier. + +[xmpp]: https://en.wikipedia.org/wiki/XMPP +[prosody]: https://prosody.im/ +[activitypub]: https://www.w3.org/TR/activitypub/ +[mastodon]: https://joinmastodon.org/ +[nextcloud]: http://nextcloud.org/ +[peertube]: https://joinpeertube.org/ +[riot]: https://about.riot.im/ +[matrix]: https://matrix.org/docs/guides/faq + +[_Peer-to-peer (P2P)_][p2p] (or _distributed_) services forego any sort of + central server and users instead communicate directly with + one-another.[^dht] +In this case, + Mallory has no server operator to go to; + Levy and Robinson's proposal is ineffective in this + environment.[^excuse-me] +[Tox][] is an end-to-end encrypted P2P instant messaging program. +[GNU Jami][jami] is an end-to-end encrypted P2P system with text, audio, and + video support. 
+Another example of a different type of P2P software is Bittorrent, + which is a very popular filesharing protocol. +[IPFS][] is a peer-to-peer Web. + +[^excuse-me]: "Excuse me, kind sir/madam, + may I please have your cooperation in spying on your + conversations?" + Another benefit of distributed systems is that they help to + evade censorship, + since no single server can be shut down to prohibit speech. + +[^dht]: Though some P2P services offer discovery services. + For example, + [GNU Jami][jami] offers a distributed identity service using + [_distributed hash tables_][dht] (DHTs). + Bittorrent uses DHTs for its trackers. + +[p2p]: https://en.wikipedia.org/wiki/Peer-to-peer +[tox]: https://tox.chat/ +[jami]: https://jami.net/ +[ipfs]: https://ipfs.io/ +[dht]: https://en.wikipedia.org/wiki/Distributed_hash_table + +_Decentralization puts users in control._ +Users have a _choice_ of who to entrust their data and communications with, + or can choose to trust no one and self-host.[^metadata-leak] +Alice, Bob, and Carol may have different threat models---maybe + Carol doesn't want to trust FooRelay. +Maybe Alice, Bob, and Carol can't agree at _all_ on a host. +Nor should they have to. + +[^metadata-leak]: Though it is important to understand what sort of data are + leaked (including metadata) in decentralized and distributed systems. + When you send a message in a decentralized system, + that post is being broadcast to many individual servers, + increasing the surface area for Mallory to inspect those data. + If there are a couple popular servers that host the majority of users, + Mallory can also just target those servers. + For example, + even if you self-host your email, + if any of your recipients use GMail, + then Google still has a copy of your message. 
+ +Self-hosting has another benefit: it helps to [put users in control of their + own computing][saass].[^online-freedom] +Not only do they have control over their own data, + but they also have full control over what the service does on their + behalf. +In the previous section, + I mentioned how free software helps to keep FooRelay honest. +What if FooRelay's _server software_ were _also_ free software? +If Alice can self-host FooRelay's server software and [doesn't like how + FooRelay implements their group chat][whatsapp-vuln], + for example, + she is free to change it. +If Mallory forces FooRelay to implement a feature on their server to allow + him to be added to group chats, + the community may find that as well and Alice can remove that + anti-feature from her self-hosted version. + +[^online-freedom]: I go into more information on the problems with modern + software on the web in [my LibrePlanet 2016 talk "Restore Online + Freedom!"][rof]. + +[saass]: http://www.gnu.org/philosophy/who-does-that-server-really-serve.html +[rof]: https://mikegerwitz.com/talks#online-freedom + + +## Please Continue Debating + +This article ended up being significantly longer and more substantive than I + had originally set out to write. +I hope that it has provided useful information and perspective that + was missing from many of the existing discussions, + and I hope that I have provided enough resources for further research. + +The prominent responses to which I referred (some of which were already + referenced above) are analyses by + [Susan Landau][landau], + [Matthew Green][green-ghost], + [Bruce Schneier][schneier], + [Nate Cardozo and Seth Schoen of the EFF][detect-ghosts], + and [another by Nate Cardozo][eff-ghost]. +There are surely others, + but these were the ones that motivated this article. + +It is important to keep these encryption debates alive. +The crypto wars are far from over. 
+We must ensure that we provide users with the tools and information + necessary to defend themselves and one-another---tools + and practices that are immune from government interference unless they + themselves become illegal. +What a grim and dangerous world that would be. + +I'm most concerned by the lack of debate from community leaders about the + issues of [software freedom](#betrayed), + [reproducibility](#reproducibility), and + [decentralization](#centralized-services). +These are essential topics that I feel must be encouraged if we are to + ensure the [safety and security][sapsf] of people everywhere.[^disagree] +We need more people talking about them! +If you found these arguments convincing, + I would appreciate your help in spreading the word. +If you didn't, + please reach out to me and tell me why; + I would very much like to hear and understand your perspective. + +[landau]: https://www.lawfareblog.com/exceptional-access-devil-details-0 +[schneier]: https://www.schneier.com/essays/archives/2019/01/evaluating_the_gchq_.html +[sapsf]: /talks/#sapsf + +[^disagree]: But I also know that there are many people that disagree with + me on each of these points! + If that weren't the case, + I wouldn't need to be an activist.