Amazon Echo: How It Works, What It Stores and How It Hurts Your Privacy

Feb. 16, 2017, 7:21 PM UTC

The Amazon.com Inc. Echo may seem like a wonder device—an always-attentive digital butler willing to try to fulfill your every spoken whim. Before you turn that microphone on, you should know a few things about how it can be used to eavesdrop on you, your friends and your family and some of the ways law enforcement can use your Echo against you. In future articles, you will find out how it can be used in civil cases (such as divorce or child custody), ways it can expose Amazon (and its customers) to legal liability and ways blackhats can use it to steal your information and identity.

How the Amazon Echo Works.

Typically, the Amazon Echo is always on and listening for commands. The Echo “listens” by recording everything it hears in its on-device storage. The recordings are processed to determine if a “wake word” (such as “Alexa”) is present. If a wake word is not detected, Echo continues to listen and the recordings are periodically overwritten. If a wake word is detected, the Echo uploads the recording to Amazon’s servers for further processing. The Amazon servers store the recording, process it and then respond based on the results of the processing.
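The listen, overwrite and upload cycle described above can be sketched in a few lines of Python. This is a toy simulation, not Amazon's firmware: text strings stand in for audio chunks, the buffer size of three chunks is an assumed value, and the function name `run_listener` is hypothetical. How much buffered audio the real device sends along with the wake word is also an assumption here.

```python
from collections import deque

WAKE_WORD = "alexa"   # example wake word from the text
BUFFER_CHUNKS = 3     # assumed buffer size; the real capacity is not public


def run_listener(chunks, wake_word=WAKE_WORD, buffer_size=BUFFER_CHUNKS):
    """Simulate the listen / overwrite / upload loop on text 'audio' chunks.

    Returns a list of uploads; each upload is the buffer contents at the
    moment the wake word was detected.
    """
    buffer = deque(maxlen=buffer_size)  # oldest chunk is dropped when full
    uploads = []
    for chunk in chunks:
        buffer.append(chunk)
        if wake_word in chunk.lower():
            uploads.append(list(buffer))  # stand-in for sending to the server
            buffer.clear()
    return uploads


heard = ["dinner was great", "see you at noon", "what a day",
         "Alexa what time is it", "thanks"]
uploads = run_listener(heard)
print(uploads)
```

In this sketch the earliest chunk is silently overwritten, while the chunks still in the buffer when the wake word appears are uploaded together with the command.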

You can also interact with the Echo through applications installed on your smartphone. The recordings sent to Amazon from your Echo can be reviewed through the Amazon Alexa application. They can be deleted file by file in the Alexa app or en masse through the “Manage Your Content and Devices” view of your account on Amazon’s website.

The Echo’s processing is not perfect. Often it mistakes sounds or ordinary conversation for a wake word. As a result, it can record your (or your friends’) conversations without your knowledge and upload the dialog to Amazon’s servers. (In its URLs, Amazon refers to any recorded information as an “utterance.”)

Amazon’s processing of the recording is also imperfect. In order to respond to a command, it must process the recording to extract the instructions and convert them into a usable form. When the speech in the recording is converted to text, errors often occur—words are mangled or misinterpreted (particularly when spoken with an accent or in a dialect). This parsed form of the recording is stored separately by Amazon’s systems and may be preserved longer than the recording itself, as it consumes significantly less storage space.

Finally, the Echo does not discriminate between speakers—everyone within range of the device is monitored, recorded, and could have their conversations uploaded to Amazon’s servers.

Where Are My Conversations Stored?

There are at least three places where your recorded conversations can be found—in the local storage of the Amazon Echo, online in Amazon’s systems, and in any smartphone that has been used to control the Echo. These conversations are stored as audio files and as data structures that are the output of any processing.

Your Conversations Are in the Echo Itself.

The device itself is the least likely to contain significant amounts of recoverable information. Computers have to store information in random access memory (RAM) to be able to process it. The Echo’s RAM is one location where audio files will be found. Because RAM is limited in space, these files will be frequently overwritten by newer recordings. Once the device is powered down, the audio files in RAM will be unrecoverable. The Echo may also cache these recordings in local storage, such as flash memory. Because the Echo does not contain large amounts of such storage, it will overwrite these locations as well, though not as often as it will overwrite RAM. As a result, it is unlikely that much more than the last few conversations can be recovered from the device itself.

Your Conversations Are in the Alexa App.

Apps on your smartphone that connect with the Echo, such as Amazon’s Alexa app, also contain stored conversations. Champlain College’s Senator Patrick Leahy Center for Digital Investigation has done a preliminary investigation into the forensic artifacts stored in the user’s smartphone. If you have replayed any stored conversations using the Alexa app, the last such conversation is stored as a .wav file on your smartphone. There are also Alexa-specific data structures (“cards”) in .json format. These cards contain Amazon’s speech-to-text conversion of the conversation but not the audio recording of the conversation (though they provide the URL to the stored copy of the recording in Amazon’s systems).
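To make the idea of a .json “card” concrete, here is a minimal sketch of reading such a file. The layout is entirely hypothetical: the field names `timestamp`, `transcript` and `audio_url`, the sample values and the `example.invalid` URLs are illustrative assumptions, not Amazon's actual schema. The only details taken from the text are that a card holds the speech-to-text result and a URL pointing to the stored recording rather than the audio itself.

```python
import json

# Hypothetical card layout; the field names are illustrative assumptions,
# not Amazon's actual schema.
SAMPLE_CARDS = """
[
  {"timestamp": "2017-02-01T08:15:00Z",
   "transcript": "what is the weather today",
   "audio_url": "https://example.invalid/utterance/1"},
  {"timestamp": "2017-02-01T19:42:00Z",
   "transcript": "play some jazz",
   "audio_url": "https://example.invalid/utterance/2"}
]
"""


def summarize_cards(raw_json):
    """Return (timestamp, transcript, audio_url) tuples from a card list."""
    cards = json.loads(raw_json)
    return [(c.get("timestamp"), c.get("transcript"), c.get("audio_url"))
            for c in cards]


for timestamp, transcript, audio_url in summarize_cards(SAMPLE_CARDS):
    print(timestamp, "|", transcript, "|", audio_url)
```

Even without the audio, a list like this shows what was said and when, which is why the text-only cards matter to anyone examining the phone.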

Your Conversations Are in the Amazon Cloud.

Amazon’s servers hold the bulk of your conversations. On their servers they have stored copies of every conversation that the Amazon Echo has sent them as well as the speech-to-text conversion of the conversations and any metadata concerning the conversation (time, place, duration, and so forth). Amazon does not reveal much about what information it is storing, nor does it reveal how long it will keep such information. Nor does Amazon reveal all the ways it is using your stored conversations.

At the very least, Amazon is using the conversations to improve their speech-to-text processing and to develop additional software for Amazon’s commercial gain. Similar to Alphabet Inc.'s Google and Facebook Inc.'s treatment of user information, Amazon may also be mining the information contained in your conversations to build profiles of you, your friends and family, and anyone else whose voice has been captured and uploaded by an Amazon Echo. This means your conversations may be stored in many different places within Amazon’s systems, and you do not have the ability to control their use of your conversations.

You can use your Amazon account to remove your Echo audio files. However, there is no guarantee that an audio file is actually deleted at that time rather than simply hidden from your account’s view. Amazon has not said whether other copies of the file are also deleted when you remove the version in your account.

Amazon may be using your conversations for its own commercial gain, such as by generating user profiles similar to Google or Facebook. If so, it is likely that Amazon will keep other copies of your conversations for its own use, separate from any that you are able to delete through your Amazon account.


So What Location Should Worry You the Most?

You should worry the most about the conversations that have been uploaded to Amazon’s servers.

While information is stored on your Echo, it likely contains little beyond the last conversation. Your Alexa app is more troubling since it contains more information—the speech-to-text conversions of previous conversations and the last audio conversation. The Amazon servers, however, contain every conversation that has been uploaded. There is one key distinction—both the device and the app are in your possession and under your control, but the Amazon servers are not. To ordinary users, the physical location of information may not matter much. To the law, however, it makes all the difference.

The law has not evolved as quickly as technology. In the physical world, privacy (and the protection from searches and seizures) varies by physical location: you have greater protection from searches and seizures in your home than on your front lawn, and greater protection on your front lawn than at your bank. The virtual world is different: information can be stored in a variety of places, often without the ordinary user being aware of it. Your expectations of privacy are not governed by the information’s physical location; you expect the information to be equally private whether you store it on your computer or in a cloud. The law, stuck analogizing the physical to the virtual, does not share this view.

An illustrative example of this flaw is something called the “Third Party Doctrine,” part of the rules that govern when, where, and how law enforcement can intrude on your privacy to search your information.

What is the ‘Third Party Doctrine’?

Courts often view information stored with third parties—such as on Amazon’s systems—as being less worthy of privacy protection than information you store on your own devices. This theory, embodied within the “Third Party Doctrine,” arises from antiquated thinking that confuses secrecy with privacy. Under this doctrine, if you provide private information to a third party to use or store on your behalf, you’ve given up any and all interests you may have in keeping that information private.

How Does the Third Party Doctrine Affect Your Privacy?

The Third Party Doctrine eviscerates your privacy rights. Under this doctrine, you have “no reasonable expectation of privacy” in the information. This makes the information more readily discoverable in civil cases, and in criminal cases it removes any Fourth Amendment protections you otherwise would possess. This means law enforcement does not need a warrant to get the information; they can instead use a prosecutor’s subpoena, which does not require court approval. (In later articles we will discuss other laws that may govern how law enforcement can access this information, including the Stored Communications Act and state laws, such as California’s Invasion of Privacy Act.)

While law enforcement’s choice between a warrant and a subpoena can affect whether and how the third party can refuse to produce the information, the doctrine prevents you from challenging the request. Under the doctrine, you have lost the right to challenge the seizure because you “gave up” your privacy interests by giving the information to the third party. Even worse, you may not even know about the seizure until much later, as there is no requirement that law enforcement or the third party tell you.

Additionally, under the doctrine, only the third party can raise a direct challenge to the seizure (assuming one can even be raised—an issue currently being considered by the New York Court of Appeals in the pending In re 381 Search Warrants case). If the third party chooses not to challenge the subpoena or warrant, your options are limited. In some jurisdictions you may be able to attempt to “suppress” (prevent) the use of the seized information by law enforcement at your trial. Unfortunately, this motion to suppress takes place well after the information has been turned over to law enforcement and thus provides little real protection—law enforcement will have already been using the information against you in investigating the case and during negotiations over a possible plea bargain (which is how the majority of criminal cases are resolved). In other jurisdictions, the doctrine bars you from suppressing the use of the seized information.

Are We Stuck With the Third Party Doctrine?

Thankfully, some courts have begun moving away from the flawed Third Party Doctrine and instead are using a more nuanced view of privacy. Under this approach, privacy is not an all-or-nothing system. Instead, privacy is treated as a bundle of permissions allowing or barring the use of information for various purposes and by various individuals. Similar to filesystem permissions or access control lists (ACLs), when you turn over information to a third party, you do so expecting them to only use the information for certain reasons and to only disclose the information in certain circumstances. This defines your expectation of privacy for the information and the level of protection your information receives.

For example, when you turn over your financial information to your bank, you reasonably expect them to only use that information in the course of providing you services. You do not expect them to sell your financial information to others or to post it for all to see online. Your expectation of privacy for such information turned over to your bank is higher than your expectation of privacy for information you turn over to your public Facebook profile. As a result, seizures of your bank information would receive greater protection than your Facebook profile.
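The permission-bundle view and the bank example above can be modeled the way an access control list would be. The sketch below is only an analogy in code, not a statement of how any court applies the idea; the class name `PrivacyGrant`, the recipient names and the purpose labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyGrant:
    """Permissions a person attaches to information given to a holder."""
    holder: str
    # Maps a recipient to the set of purposes that recipient is allowed.
    allowed: dict = field(default_factory=dict)

    def permits(self, recipient, purpose):
        """True only if this recipient was granted this specific purpose."""
        return purpose in self.allowed.get(recipient, set())


bank_records = PrivacyGrant(
    holder="bank",
    allowed={"bank": {"provide_services"}},
)
public_profile = PrivacyGrant(
    holder="social_network",
    allowed={"anyone": {"view"}, "social_network": {"view", "display"}},
)

print(bank_records.permits("bank", "provide_services"))   # True
print(bank_records.permits("advertiser", "marketing"))    # False
print(public_profile.permits("anyone", "view"))           # True
```

Under the all-or-nothing Third Party Doctrine, by contrast, handing the records to the bank would amount to granting every recipient every purpose.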

While this move towards a modern view of privacy may eventually provide you with the same privacy protections for your information regardless of where you store it, as of today the danger still exists that your information could be provided to law enforcement without your consent—and even without your knowledge.

What Can You Do to Protect Your Privacy?

First, let Amazon (or any other third party that holds your information) know that you value your privacy. If they feel their users care about privacy, they are more inclined to fight overbroad and unconstitutional requests by law enforcement.

Second, periodically log in to your Amazon account and remove your Amazon Echo’s stored audio files. If you have a large volume of such files, you may want to look at using a macro or script to batch-remove such files rather than having to click on each one in sequence.
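The general shape of such a batch-removal script might look like the sketch below. It deliberately does not assume any Amazon interface: the caller supplies the actual deletion step as a function (here the hypothetical `delete_one`), and the recording IDs and the `fake_delete` stand-in exist only so the example runs on its own.

```python
import time


def batch_delete(recording_ids, delete_one, pause_seconds=0.0):
    """Call delete_one(recording_id) for each ID; return (deleted, failed)."""
    deleted, failed = [], []
    for recording_id in recording_ids:
        try:
            delete_one(recording_id)
            deleted.append(recording_id)
        except Exception:
            failed.append(recording_id)  # keep going; retry these later
        if pause_seconds:
            time.sleep(pause_seconds)    # optional pause between requests
    return deleted, failed


# Stand-in for the real deletion step, which is not specified here.
def fake_delete(recording_id):
    if recording_id == "rec-2":
        raise RuntimeError("simulated failure")


deleted, failed = batch_delete(["rec-1", "rec-2", "rec-3"], fake_delete)
print(deleted, failed)
```

Collecting failures instead of stopping at the first error lets you rerun the script on only the files that were not removed.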

Third, turn off your Amazon Echo’s microphone when you aren’t going to be using it—when the microphone is disabled, it won’t listen for wake words and will not transmit audio files to Amazon for processing.

Finally, here is a fun project for those of you who really can’t live without your Amazon Echo but don’t want to turn off the microphone.

If you are concerned about Amazon using your conversations to profile you and your friends, yet you still want to use the Echo, your options are limited. Your best bet is to send large amounts of obscuring commands to the Echo so that any profiling will be overwhelmed with false information taken from the obscuring commands rather than true information from your actual commands. The most elegant solution would be to send automatically-generated audio commands directly to Amazon’s systems over the network while masquerading as your Echo. A less elegant way would be to set up a speaker next to the Echo’s microphone, then use the speaker to periodically play a random selection of commands that are either pre-recorded or generated using a text-to-speech converter and text randomizer.
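The “text randomizer” half of the speaker approach can be sketched as follows. The templates and word lists are made-up examples, and the function name `decoy_commands` is hypothetical; a real setup would need far larger lists and would pass each string to a text-to-speech tool, which is outside this sketch.

```python
import random

# Illustrative templates and fillers; a real word list would be far larger.
TEMPLATES = [
    "Alexa, what is the weather in {city}?",
    "Alexa, play {genre} music.",
    "Alexa, add {item} to my shopping list.",
]
FILLERS = {
    "city": ["Oslo", "Lima", "Nairobi", "Denver"],
    "genre": ["jazz", "polka", "opera", "reggae"],
    "item": ["batteries", "oat milk", "a garden hose", "chess clocks"],
}


def decoy_commands(count, seed=None):
    """Return `count` randomly filled command strings."""
    rng = random.Random(seed)  # a seed makes a run repeatable for testing
    commands = []
    for _ in range(count):
        template = rng.choice(TEMPLATES)
        # Pick a value for every slot; format() ignores the unused ones.
        values = {key: rng.choice(options) for key, options in FILLERS.items()}
        commands.append(template.format(**values))
    return commands


for command in decoy_commands(3, seed=1):
    print(command)
```

Leaving `seed` unset gives a different mix on each run, which is what you would want when the goal is to obscure a real usage pattern.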
