go-ethereum/docs/_whisper/Achieving-Darkness.md

8.0 KiB

title
Achieving Darkness

Whisper is designed to be a building block in the next generation of unstoppable ÐApps. It was designed to provide resilience and privacy at considerable expense. At its most secure mode of operation Whisper can theoretically deliver complete darkness. Whisper should also allow the users to configure the level of privacy (how much information it leaks concerning the ÐApp content and ultimately, user activities) as a trade-off for performance. In this article we will discuss the strategy of achieving complete darkness.

Adversary

In the worst case, you are playing against a very powerful adversary, which has a capacity to run a lot of Whisper nodes. In addition to that, it might generate Whisper identities in order to target a specific node. For example, it might generate IDs that will be very close to the ID of the target node, which then fit into Kademlia topology in such a way that will allow a traffic analysis from this node. As a result, it will be possible to see which messages originate from the target node.

Recipient vs. Sender

Even if most nodes of the network belong to adversary, it will be still impossible to identify the recipient of particular message by the means of simple traffic analysis. But the sender might be identified. In order to prevent that, the node might maintain a certain noise level. For example, if the node sends on average several messages per hour, it might still send random messages every minute. Those extra messages should resemble the meaningful messages, but should contain randomly generated bytes and be encrypted with random key, so that adversary will not be able to distinguish the meaningful messages from the noise. It is necessary to encrypt (not only generate) random data because encrypted messages might have some common traits, which distinguish them from noise. E.g. in case of asymmetric encryption some bytes are allowed to have only certain values.

In some cases recipient might also be probabilistically identified if two nodes send each other messages with common traits, e.g. same Topic or same size. That's why it is important to use different Topics for sending and receiving the messages within the session.

Topics

It is possible to send asymmetrically encrypted messages without any Topic. However, in case of huge traffic it might be not practical to try decrypting all the incoming messages, since decrypt operation is quite expensive. Some filtering might be necessary. Whisper allows to subscribe for messages with particular Topic or specify several possible Topics. It is also possible to subscribe for partial Topic(s), e.g. first one, two, or three bytes of the Topic.

Topic collisions are part of the plausible deniability strategy. Some messages from different nodes (and encrypted with different keys) should have the same Topic (or partial Topic). Therefore it is recommended to set the Topics dynamically, so that collisions will regulary occur on your node.

For example, if there are 6000 nodes in the network and each of them on average sends ten messages per minute, it would make sense to set only first byte of the Topic (leaving the other three bytes random). In this case each node will receive on average 1000 messages per second. The node will try to decrypt approximately four messages (those matching the Topic), and one of them will be successfully decrypted. The rest of the messages will be ignored and just forwarded to the other peers.

Message Size and Padding

When sending the messages, another important metric which could potentially leak metainformation is the size. Therefore, depending on the circumstances, it might be useful to send messages only of certain size or always generate random size (not exceeding certain limit). The API allows easy setting of padding. The default implementation of Whisper also uses the padding to align the messages along the 256-byte boundaries.

Symmetric vs. Asymmetric

In case of symmetric encryption, the nodes must have already exchanged the encryption key via some secure channel. So, they should use the same channel to exchange the Topics (or partial Topics), and maybe PoW requirement for individual messages. In case of asymmetric encryption, the public key is exchanged via open channels, and it might be impossible to exchange any other information before they engage in secure communication. In this case different strategies might be employed.

For example, the first message might be sent without any Topic. This message may contain all the information necessary to start a symmetrically encrypted session, complete with (partial) Topics, PoW requirement and symmetric keys. Because it requires trying to decrypt all the incoming asymmetric messages, the PoW requirement for such messages might be set very high, which might be acceptable since such messages are only sent once per session.

Alternatively, a partial Topic should be published along with the public key and PoW requirement. E.g., if we use two or three bytes for the partial Topic of symmetrically encrypted messages, we might only use one byte for the asymmetric encryption.

Example of Session

Suppose two nodes want to engage in secure communication. One of them sends an asymmetrically encrypted message of certain (pre-defined) format, which contains two symmetric keys and corresponding partial Topics, all randomly generated. The sender of this message subscribes for the messages with the first Topic (encrypted with the first key), and always encrypts outgoing messages with the second key. The recipient subscribes for the messages with the second Topic (encrypted with the second key), and always encrypts outgoing messages with the first key. They also include the message number in each outgoing message, in order to ensure the correct order of the messages, and detect if some of them did not arrive. After the session is over, the keys are deleted and never used again.

In case of several participants (e.g. a group chat) they just need one additional key and one additional subscription for any additional participant.

Plausible Deniability

Suppose a user has lost control of his key. It might happen due to hacker attack on his computer, or becuase a hostile entity forces him to reveal the key. In case he signed any message with his private key, it might be used as evidence against him. So, the user might prefer to use his key only to establish the session, and then engage in symmetrically encrypted communication without signing the messages. His peer may be sure about his identity, since the symmetric key was signed with his private key. But these messages can not be used as evidence, since this message could have been sent by anyone in possession of the symmetric key, especially if this key was used in a group chat session.

Steganography

However, in some extreme cases even plausible deniability might not be enough. In those cases users might utilize the padding mechanism for steganographic purposes. It is possible to set arbitrary data as padding. The Whisper API allows easy access to padding for both writing and reading. Since vast majority of the messages will contain some kind of padding (either default or customized), it will be impossible to distinguish between randomly generated and encrypted padding bytes. The users should only use such encryption which does not reveal any metainformation (preferably symmetric stream ciphers), and then add the resulting data as padding to a message with some irrelevant payload.

Private Networks

It is very easy to set up a private Whisper network, since Whisper does not contain any genesis block or indeed any state at all. You just start a bootstrap node to which everybody should connect. It might be useful for organizations with restricted access. Private network may have several advantages. For example, it might be very difficult for any adversary to join the network, which will make the traffic analysis almost impossible. Also, PoW requirement might be dropped, thus allowing instant messaging with very little overhead.