Google has been caught in a privacy blunder over voice collection, with the company admitting that more than 1,000 sound recordings were leaked to a Belgian news site.
Like Amazon, Google employs workers who listen to users' Google Assistant queries in order to improve its AI. These recordings are supposed to stay anonymous and, most importantly, secure on Google's servers.
But over 1,000 of these recordings were leaked by a contractor to Belgian broadcaster VRT NWS. The recordings allegedly contain sensitive information including addresses, according to VRT. To confirm the recordings were real, VRT "let ordinary Flemish people hear some of their own recordings." These people could confirm hearing their own voice on the audio clips.
While the leak of this information is bad enough, 153 of the recordings were conversations captured without the "Ok Google" wake word being said. Google's smart speakers listen for these wake words and should only start recording once they're spoken. Among these voice clips were private conversations, business calls and the like, which the participants almost certainly didn't want recorded.
Google admitted the leak in a blog post, where it confirmed a language reviewer had violated the company's security policies and shared the recordings with VRT.
"Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action," said Google's product manager of search, David Monsees. "We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again."
It's the latest in a series of security blunders from voice companies. Amazon has been in the headlines a lot recently for the way it's handled Alexa recordings, with reports that it is not sufficiently anonymizing the data.
Voice companies use workers to listen to recordings in order to "train" the AI. It's an established practice, but one that a lot of users aren't aware of. And that's understandable, because Google doesn't say it's doing this in its privacy policies.
OK, Google: Time to be more transparent with us
Google defended the practice in its blog post, calling it a "critical part" of improving the Assistant.
"Language experts only review around 0.2 percent of all audio snippets," said Monsees. "Audio snippets are not associated with user accounts as part of the review process, and reviewers are directed not to transcribe background conversations or other noises, and only to transcribe snippets that are directed to Google."
Google also acknowledged what it calls "false accepts" - times when the Assistant thinks it's heard the wake words and starts recording, when in fact the trigger command was never given. This could explain the 153 unintended recordings, but 153 out of a sample of just over 1,000 is a false-accept rate of roughly 15 percent. Extrapolate that across the millions of Assistant devices out there, and it's not a good look. Furthermore, Google's policies state that only intended recordings are sent to Google - which is clearly not the case.
So what can you do? You can switch off storing audio data on your Google account, or you can have it auto-delete your data after three or 18 months. Google promised it will be "reviewing opportunities" to clarify how data is used. About time too.