“The Internet never forgets.”
That’s Tadayoshi Kohno, an assistant professor of computer science and engineering at the University of Washington, explaining the inspiration behind a new program called Vanish, which causes data posted online to self-destruct. In recent years, popular Web-based applications such as Hotmail, Facebook and Google Docs have opened up new ways to work, communicate and socialize. But these “cloud computing” services—in which data and applications are stored remotely rather than on a user’s personal computer—also can erode individual users’ control of their own words and data.
“Copies of your data are stored at all of these services or intermediaries that you may not know about, and that you don’t control,” explains Roxana Geambasu, ’07, a Ph.D. student at UW who is working on Vanish as part of her dissertation. Moreover, as computer disk space becomes ever cheaper and more plentiful, these services have little reason to delete your data. In fact, it’s easier for them to keep it, more or less forever (in some cases, even if you ask them to erase it). A sensitive e-mail or an ill-advised joke in a Facebook post could come back to haunt you years from now. Geambasu says, “In the current era of cloud computing, we can hardly ever control the lifetime of our data.”
“Forgetting is actually a very important property of the human evolution,” says Kohno, a computer security specialist with an almost military look—close-cropped hair, chiseled cheekbones—who is also interested in the interaction between technology and human values. “The ability to forget allows healing, allows a number of other things. So forgetting is actually very important to society.”
Vanish grew out of Geambasu’s project for a class Kohno teaches on computer security. The research team includes Hank Levy, ’81, a systems specialist who is chair of the Computer Science and Engineering Department, and Amit Levy, ’09 (no relation to Hank), a master’s-degree student who joined the project as an undergraduate. Though hardly decentralized, the group’s collaborative and non-hierarchical working style has a lot in common with a computing cloud. Kohno says, “It’s kind of hard to attribute who came up with which specific idea.”
The ideas they came up with add up to this: The Vanish program encrypts a message, breaks the encryption key into many tiny pieces, and then sprinkles these pieces throughout a large peer-to-peer network that consists of more than a million computers all over the world. As individual computers leave the network and those that remain purge their memories, pieces of the key are gradually lost. Once a certain number of pieces are lost, the key can never be reassembled and the message can’t be decrypted. A Vanish message can be read for at least eight hours after it is sent, but becomes permanently indecipherable by the nine-hour mark.
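For readers who want to see how a key can be split so that losing enough pieces makes it unrecoverable, here is a minimal sketch. The article does not name the splitting scheme, so this assumes Shamir's secret sharing, a standard threshold technique. The function names, the field prime, and the choice of 10 pieces with a threshold of 7 are illustrative assumptions, not Vanish's actual code or settings.

```python
import secrets

# Assumption: arithmetic is done modulo a Mersenne prime far larger than a 128-bit key.
PRIME = 2**521 - 1


def split_key(key: int, n: int, k: int) -> list[tuple[int, int]]:
    """Split `key` into n pieces so that any k of them can rebuild it."""
    if not 0 <= key < PRIME:
        raise ValueError("key must fit in the field")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k-1 whose constant term is the key.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    pieces = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        pieces.append((x, y))
    return pieces


def recover_key(pieces: list[tuple[int, int]]) -> int:
    """Interpolate the polynomial at x = 0 from the surviving pieces."""
    result = 0
    for i, (xi, yi) in enumerate(pieces):
        num, den = 1, 1
        for j, (xj, _) in enumerate(pieces):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        result = (result + yi * num * pow(den, -1, PRIME)) % PRIME
    return result


# Illustrative use: 10 pieces, any 7 of which are enough.
key = secrets.randbits(128)
pieces = split_key(key, n=10, k=7)
recovered = recover_key(pieces[:7])   # seven survivors: the key comes back
too_few = recover_key(pieces[:6])     # six survivors: an unrelated number
```

In this sketch, once more than three of the ten pieces have disappeared, the remaining ones reveal nothing useful about the key, which mirrors the "once a certain number of pieces are lost" behavior described above.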
Crucially, the person who encrypts and sends the message never holds the key, and so can be neither hacked nor forced to give it up. “A major advantage of Vanish is that users don’t need to trust us, or any service that we provide, to protect or delete the data,” Geambasu adds. The recipient of the 2009 Google Ph.D. Fellowship in Cloud Computing, Geambasu is animated in manner and kempt in appearance, with a tidy ponytail, glasses and even, round features.
Vanish isn’t the first attempt at self-destructing data, but the use of the peer-to-peer network to hide the pieces of the key is a particularly elegant approach: The same vast, decentralized nature of the cloud that poses a problem also provides the solution. “I think the really novel part was this idea of using a natural system that already exists to self-destruct data,” Amit Levy says.
“The analogy that we had in mind was of writing a message on the sand at the beach at low tide,” Kohno explains. “As the waves come, a natural process just starts to wash away the message, and the message disappears without any explicit action by any individual.” The constant evolution of peer-to-peer networks emerged as the digital analogue to waves on the shore. Or, more precisely—since a message isn’t erased but merely becomes unreadable—it’s as if the Rosetta Stone needed to read the message gets eroded by the Internet sands of time.
Vanish does have limitations. Some are technical—users can’t choose exactly how long the decryption key will remain available, both the sender and the recipient of a message must have Vanish installed, and Vanish must be in use when data is posted in order for it to self-destruct later. Some are legal—some companies and government agencies have rules about electronic record-keeping that may make it inadvisable to use Vanish in certain situations. Still others are philosophical—it’s not always clear up-front that a message is going to be sensitive, so how do you decide when to use Vanish?
Vanish is a free, open-source program that can be downloaded from http://vanish.cs.washington.edu/download.html. But Geambasu cautions that it’s a research prototype, not a fully supported piece of software. “We encourage people to experiment with it, but not rely on it for perfect security.”
In fact, other researchers are working on how to hack Vanish. In September a group from the University of Texas at Austin, Princeton University and the University of Michigan announced that they had created a system called Unvanish, which searches the peer-to-peer network for anything that looks like a piece of a Vanish key (key pieces have a distinctive size). Unvanish saves these key pieces, enabling a user to reconstruct a Vanish decryption key long after it has disappeared from the network.
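The idea behind this kind of harvesting can be sketched in a few lines. This is a toy illustration only, not Unvanish's code: the `Harvester` class, the 16-byte piece size, and the way stored values are "observed" are all hypothetical stand-ins for whatever the real system does on a live network.

```python
from dataclasses import dataclass, field

# Hypothetical value: the article says only that key pieces have a distinctive size.
PIECE_SIZE_BYTES = 16


@dataclass
class Harvester:
    """Keeps a private copy of every observed value that is the size of a key piece."""
    piece_size: int = PIECE_SIZE_BYTES
    archive: dict[bytes, bytes] = field(default_factory=dict)

    def observe(self, index: bytes, value: bytes) -> bool:
        """Record the value if it looks like a key piece; report whether it was kept."""
        if len(value) != self.piece_size:
            return False
        # Keep the first copy seen; it outlives the network's own purge.
        self.archive.setdefault(index, value)
        return True


harvester = Harvester()
harvester.observe(b"slot-1", b"\x01" * 16)    # right size: archived
harvester.observe(b"slot-2", b"\x02" * 400)   # ordinary traffic: ignored
```

Because the archive sits outside the network, the pieces it holds are not subject to the natural turnover that Vanish counts on.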
“I think this is very exciting!” Geambasu says of this “attack” on Vanish. “It’s a little bit stressful for me, because I have to come up with defenses now, but this is exactly what we wanted to do”—that is, raise awareness of “immortal” data as a problem, and stimulate research into how to solve it. Mission accepted, and accomplished.