In the Google blog post explaining how Google Duplex works, one line jumps out at me:
... To obtain its high precision, we trained Duplex’s RNN on a corpus of anonymized phone conversation data ...
A number of concerns arise from this statement:
- How did Google acquire phone data?
- Anonymised or not, the data must still consist of real conversations between real people, or it would be useless to Google.
- To continue training the code to pretend to be human in other circumstances, Google will need to acquire ever more phone conversation data. This feels like quite a contentious issue, especially around consent.
Personally, I would always want to know whether I am talking to a machine or a human.