For AI to improve, human input is needed - which is why Amazon reportedly has thousands of employees listening in on Alexa voice recordings.
A report from Bloomberg claims that Amazon has a voice review program, made up of full-time employees and contractors around the world, tasked with listening to recordings captured by Echo and other Alexa devices. Those recordings are transcribed and analyzed to improve Alexa's intelligence - a common practice among speech AI companies.
However, according to insiders speaking to Bloomberg, recordings that get sent to Amazon for review are associated with a user's first name, account number and the device serial number. This is where the report becomes quite damning. Amazon's AI training program isn't unique - Apple does the same thing with Siri - but all information that could identify the user is usually stripped from the data, as it is with Apple.
The report also has team members recalling specific recordings they'd heard, such as a woman singing in the shower, a child screaming, and suspected instances of sexual assault. Again, quite damningly, these employees reportedly share some of these recordings in an internal chatroom when they need help parsing a hard-to-decipher recording.
In response, Amazon told Bloomberg that "an extremely small sample of Alexa voice recordings" was being used to train its AI.
"We have strict technical and operational safeguards, and have a zero tolerance policy for the abuse of our system," the company also said. "Employees do not have direct access to information that can identify the person or account as part of this workflow. All information is treated with high confidentiality and we use multi-factor authentication to restrict access, service encryption and audits of our control environment to protect it."
Again, while the use of humans to train AI isn't inherently nefarious, it's the potential abuse that is concerning here, particularly Amazon's reported failure to properly anonymize users' data.