What do we call this industry anyway?
As part of the Linux Foundation’s Open Voice Network Ethical Use Task Force (which makes it sound like I’ll be dropping out of helicopters and kicking down doors in search of people using conversational AI for nefarious means), I’ve been exposed to some fascinating discussions among some of the most knowledgeable and thoughtful people in our industry.
One of these discussions recently came back to one of my favourite subjects: terminology!
(Yes, I’m a nerd.)
We’re currently developing an ethical standard (basically a set of rules) for implementing conversational AI projects in a way that emphasises the positive benefits to society and mitigates harms.
But that led us to a discussion about how we should refer to the industry we work in.
If we call it the “Voice industry” then we’re excluding the people building conversational experiences through chat, even though we think the standards are just as relevant to them.
If we call it the “Conversational AI industry” then we could be overlooking people working in areas fraught with ethical considerations, such as voice cloning or gathering metadata from speech patterns, which aren’t really conversational at all.
And what if we go with something like the “Digital assistant industry”? That could cover all sorts of things to do with storing and using both implicit and explicit user data, plus a whole load of questions about who should own these digital chokepoints.
I think we all agree that we can’t give up and just call ourselves the “AI industry”. Apart from taking in a load of things we don’t currently have a view on, such as the efficacy of self-driving cars, the term AI is so poorly defined that it should never be used in anything other than films about robots overthrowing the human race.
So how do we square the circle?
Usually, when you’re having difficulty defining a term, it’s because the word (or words) is being loaded with more meaning than it can bear.
And perhaps that’s a sign to us that our industry has grown to the point where it’s not really one thing anymore.
The ethical considerations of using Large Language Models to automatically generate responses for your bank chatbot are quite different from those of building an app that lets users mimic their friends’ voices.
Instead, we need to focus on the practical: these standards will help you avoid ethical missteps if you’re engaged in these specific activities. Rather than a single overarching standard, maybe it’s better to be granular.
Or is this just a cop out? Let me know what you think.