Alexa, Here is My Christmas Wish List for Developers
Ah, Christmas, a time when you can finally take a well-deserved break and eat your body weight in mince pies. At times like these, keeping up with the latest large language model (LLM) papers and #voicefirst tweets falls by the wayside in favour of watching Elf for the third time this week.
But, after a few years of faltering confidence amongst Alexa developers and some shake-ups in Amazon’s voice assistant teams, Voicebot asked me to put together a wishlist of things that could make 2017’s must-have toy hot again for developers in 2023. How could I resist?
Now I just have to hope I’ve managed to avoid the naughty list!
1. Let developers run ads
It’s clear that a lot of users don’t like ads on their Alexa devices. Letting developers implement them, rather than making them part of the first-party home screen, would shield Amazon from the negative sentiment, since it would be directed at individual developers instead.
Amazon can organise the sale and provision of the ads while taking a cut, developers can figure out the use cases and experiences in which Alexa customers are willing to see or hear ads, and users can access skills they like without paying. It’s a win/win/win.
I’d argue that letting developers monetize through ads is the single biggest thing Amazon could do to bring the hype back to the developer ecosystem. There are plenty of use cases ill-suited to direct payments (would anyone have started using Facebook if you had to pay $5 a month for it?), and the bar an ad-supported product has to clear before it starts earning is significantly lower.
2. Bring multimodal to the mobile app
I have a reputation for being bearish on multimodal experiences, but that is far from the truth. SoundHound’s Dynamic Interaction demo was my voice experience highlight of the year and something the whole industry should aspire to replicate.
For Alexa, the first (and most crucial) step is to get multimodal working in the app. Creating a world where every Alexa user is guaranteed to have at least one screen available would radically alter the calculus of how much effort is worth devoting to adding visuals to your skill.
I’d also like to see Amazon pushing the app publicly as a way to access Alexa, rather than predominantly as a companion to the devices, as this could open up a whole new group of customers and use cases.
3. Enable an actual voice-first account linking process
Voice-first account linking has been on developers’ wishlists since the very beginning, and at this point, it’s hard to understand why it hasn’t happened. Judging by the ‘voice-forward’ account linking process announced at Alexa Live, the Alexa team must be working under some serious constraints; the kindest thing that can be said about that process is that it looks like something designed by legal.
The thing is, this isn’t really good enough for a company that prides itself on being as customer-centric as Amazon does. Even if account linking really can’t be done without a step in the app, there are still obvious ways to make it better.
Voice experiences are, by their nature, ephemeral, which makes it even more essential that they connect into a broader service in order to deliver meaningful value to users. It’s not much use asking Alexa to book next year’s labworks Christmas party if getting the confirmation email takes more steps than just doing it directly in the browser.
4. Provide a more open notification policy
Retention on Alexa skills (beyond some games) is poor compared to their mobile equivalents. Partly this is down to the skills being less refined services, but partly it is because one of the key retention tools on Alexa is hamstrung.
Amazon clearly doesn’t want users to get spammed, which is fair enough, but a situation where users who have played all the content in your game cannot even ask to be notified when new content is added seems ridiculous. Letting users request notifications from developers on pretty much anything would support a whole new set of use cases while keeping the user in control.
5. Get back to the original vision of many skills working together
Several years ago, Skill Connections launched as part of a vision in which skills would work as a network, each handling the parts of the conversation it knew about and then handing the user off to the next skill for the next part. This vision made a lot of sense within the constraints of building reliable Alexa skills while providing real value for the customer.
Unfortunately, that vision seems to have fallen by the wayside, and the decidedly non-seamless way Skill Connections work today prevents this approach from creating positive and enjoyable user experiences. The reality is that asking the user whether it’s OK to be handed off between skills every single time is not going to work, and Amazon needs to look at whether that confirmation is actually necessary in most use cases.
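For anyone who hasn’t used them, a handoff is triggered by returning a Connections.StartConnection directive from your skill’s endpoint. Here is a minimal sketch of what that looks like with the ASK SDK for Node.js; the intent name, token, and input fields are illustrative only, and the exact input schema for each Amazon-defined task should be checked against the Skill Connections documentation.

```typescript
// Minimal sketch of a skill-to-skill handoff via Skill Connections,
// using the ASK SDK for Node.js. Intent name, token, and input payload
// are illustrative; each task defines its own input schema.
import * as Alexa from 'ask-sdk-core';

const BookTaxiHandler: Alexa.RequestHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'BookTaxiIntent';
  },
  handle(handlerInput) {
    // Hand the user to a skill that can fulfil the booking, then resume
    // this session once the other skill has finished its part.
    return handlerInput.responseBuilder
      .addDirective({
        type: 'Connections.StartConnection',
        uri: 'connection://AMAZON.ScheduleTaxiReservation/1',
        input: {
          // Illustrative fields only; not the task's exact request schema.
          partySize: 2,
          pickupLocation: { city: 'London' },
        },
        token: 'office-party-taxi',
        onCompletion: 'RESUME_SESSION',
      } as any) // loosely typed to keep the sketch short
      .getResponse();
  },
};
```

Even in a flow like this, Alexa still checks with the user before making the handoff, which is exactly the friction I’d like Amazon to reconsider where it isn’t strictly necessary.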
6. Give developers more control over the copy in the ISP payment flow
It is the end of 2022, and in-skill purchases (ISPs) have been around since the start of 2019. That’s four years, and yet users still can’t hear the price of an ISP until they’ve already agreed to start the payment flow. I’ve lost count of the users I’ve spoken to over the years who said they didn’t agree to start the payment flow because they thought they were going to be charged straight away and didn’t know how much it would cost.
The funny thing is that Google solved this for Conversational Actions (RIP) as soon as it launched payments by providing developers with a variable they could place in their copy that Google would overwrite when the speech was being generated. Again, it’s just really hard to understand why a small iteration like this hasn’t appeared at some point over the last four years, but hopefully, 2023 will be the year it happens!
Amazon could even go one better and let developers control the entire copy of the payment flow, so there’s no need to deal with tortuous prompts half-written by Amazon and half by the developer. This is bound to improve conversion rates.
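To make the ask concrete, here is roughly what a standard ISP upsell looks like with the ASK SDK for Node.js, together with the kind of placeholder I have in mind. The {price} token is purely hypothetical (modelled on what Google offered, not anything Alexa supports today), and the intent name and product ID are made up.

```typescript
// Sketch of an in-skill purchase upsell using the standard
// Connections.SendRequest / Upsell directive. The {price} placeholder is
// hypothetical: today there is no way to speak the price before the user
// agrees to enter Amazon's payment flow.
import * as Alexa from 'ask-sdk-core';

const HintPackUpsellHandler: Alexa.RequestHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'BuyHintPackIntent';
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .addDirective({
        type: 'Connections.SendRequest',
        name: 'Upsell',
        payload: {
          InSkillProduct: {
            // Made-up product ID, for illustration only.
            productId: 'amzn1.adg.product.your-hint-pack-id',
          },
          // What I would like to be able to write. No {price} variable
          // exists today; Alexa only reads the price out after the user
          // has already agreed to start the payment flow.
          upsellMessage: 'The hint pack costs {price}. Want to learn more?',
        },
        token: 'hint-pack-upsell',
      } as any) // loosely typed to keep the sketch short
      .getResponse();
  },
};
```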
7. Let developers certify skills one locale at a time
Amazon is always keen for developers to expand their Alexa skills to new locales, even going as far as offering a tool to clone your skill into a new locale. But they still haven’t fixed one of the biggest drawbacks of doing so: every locale has to go through certification again, even if you only make a change in one.
Having had a broken version of a skill stuck in production in the U.S. for six weeks, while the new version was held up by certification in Italy even though the change didn’t affect that locale, makes me wonder whether Kafka was an Alexa developer. The closest I’ve come to swearing off Alexa has been in these moments of complete powerlessness in the face of this glacial bureaucracy.
8. Improve the Alexa skill experience on Fire TV
There are a lot of Fire TV devices out there, and customers are actively seeking content on them. For entertainment and media Alexa developers, at least, these customers could become a significant additional market without needing to do much extra work.
In particular, the way you invoke skills on Fire TV needs to become more natural. Today, if you don’t explicitly say that you want a skill, Fire TV will try to load an app or content that matches your request instead. It would also be great to see things like Alexa notifications appearing on the TV, and, in some far-off future, the same skill session showing different content on the phone and the TV.
So, those are the top things on my Alexa wishlist. What have I missed? Do you think we’ll actually get any of them in 2023? Let me know your thoughts either on LinkedIn or email me at tom[at]labworks.io. I hope you have very happy holidays!