Speech Synthesis Markup Language, or SSML, is a standardized markup language that allows developers to control pronunciation, intonation, timing, and emotion. SSML support on Alexa allows you to control how Alexa generates speech from your skill’s text responses. You can add pauses, change pronunciation, spell out a word, add short audio snippets, and insert speechcons (special words and phrases) into your skill. These SSML features provide a more natural voice experience.
Today, we are excited to announce five new SSML tags in the US, UK, and Germany that you can use with Alexa, including whispers, expletive bleeps, and more.
Today we also rolled out speechcons in the UK and Germany. Let me explain what the new tags are and how to use them.
Five New SSML Tags
- Whispers – Convey a softer dialog with <amazon:effect name="whispered">.
- Expletive bleeps – Bleep out words with <say-as interpret-as="expletive">.
- Sub – Use the <sub> tag when you want Alexa to say something other than what’s written.
- Emphasis – Add <emphasis> to change the rate and volume at which Alexa speaks.
- Prosody – Use the <prosody> tag to control the volume, pitch, and rate of speech.
Using the New SSML Features
The new amazon:effect tag, used with name="whispered", lets Alexa convey a softer dialog. Notice in the sample below that amazon:effect requires a closing tag.
<speak> The user name is Alexa Devs and the password is… wait, come closer… <amazon:effect name="whispered"> the password is whisper. </amazon:effect> </speak>
To hear Alexa whisper, copy the example above and paste it into the voice simulator on the developer portal.
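If you would rather return the whisper from your skill code than paste it into the simulator, here is a minimal Python sketch. The helper names whisper and build_ssml_response are made up for illustration, and the response shape assumes the standard Alexa Skills Kit JSON response with an outputSpeech of type "SSML"; adapt it to whatever SDK or framework you use.

```python
import json
from xml.sax.saxutils import escape


def whisper(text: str) -> str:
    """Wrap plain text in the whispered effect, escaping XML special characters."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def build_ssml_response(ssml_body: str) -> dict:
    """Return a minimal skill response whose speech is rendered from SSML.

    Assumption: the standard ASK response envelope; only the speech part is filled in.
    """
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": f"<speak>{ssml_body}</speak>"},
            "shouldEndSession": True,
        },
    }


body = (
    "The user name is Alexa Devs and the password is… wait, come closer… "
    + whisper("the password is whisper.")
)
print(json.dumps(build_ssml_response(body), indent=2, ensure_ascii=False))
```

Escaping the text before wrapping it matters because a stray & or < in user-supplied content would otherwise make the SSML invalid.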
Sometimes you need to “bleep” a word or two to make content acceptable for a general audience. This is exactly what interpret-as="expletive" does: it bleeps out a word that may cause offense. Notice that “expletive” is used with the <say-as> tag.
<speak> Give me liberty or give me <say-as interpret-as="expletive">death</say-as>. </speak>
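In a real skill you will probably want to bleep words from a list rather than tag them by hand. The sketch below shows one way to do that in Python. The bleep function and the blocked-word set are hypothetical, and matching is a simple case-insensitive whole-word comparison, so treat it as a starting point rather than a complete content filter.

```python
import re
from xml.sax.saxutils import escape


def bleep(text: str, blocked_words: set[str]) -> str:
    """Wrap each blocked word in <say-as interpret-as="expletive">."""
    parts = []
    # Split into alternating word and non-word runs so punctuation and spacing survive.
    for token in re.findall(r"\w+|\W+", text):
        safe = escape(token)
        if token.lower() in blocked_words:
            safe = f'<say-as interpret-as="expletive">{safe}</say-as>'
        parts.append(safe)
    return "<speak>" + "".join(parts) + "</speak>"


print(bleep("Give me liberty or give me death.", {"death"}))
```

The blocked words are expected in lowercase, since each token is lowercased before the lookup.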
By itself, <sub> is a little less intuitive than the rest of this group. You can use this tag when you want Alexa to say something other than what is written. For example, if you want her to speak the full words "aluminum" and "magnesium" rather than just their chemical symbols, you would use <sub> like this:
<speak> My favorite chemical element is <sub alias="aluminum">Al</sub>, but Al prefers <sub alias="magnesium">Mg</sub>. </speak>
For extra credit (and a laugh!), try <sub alias="aluminum">Al</sub> in the American (en-US) voice simulator and <sub alias="aluminium">Al</sub> in the British (en-GB) one.
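The same idea can be handled in code by choosing the alias per locale. This is only a sketch: the ELEMENT_ALIASES table and the sub helper are invented for this example, and the locale keys assume the usual en-US and en-GB codes.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical lookup table: written symbol -> spoken alias, per locale.
ELEMENT_ALIASES = {
    "en-US": {"Al": "aluminum", "Mg": "magnesium"},
    "en-GB": {"Al": "aluminium", "Mg": "magnesium"},
}


def sub(written: str, locale: str) -> str:
    """Return a <sub> element that speaks the locale's alias for the written symbol."""
    alias = ELEMENT_ALIASES[locale][written]
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    return f"<sub alias={quoteattr(alias)}>{escape(written)}</sub>"


for locale in ("en-US", "en-GB"):
    print(f"<speak>My favorite chemical element is {sub('Al', locale)}.</speak>")
```

An unknown symbol or locale raises KeyError here, which is usually what you want while you are still building out the table.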
The emphasis tag allows you to change the rate and volume at which Alexa speaks. Remember when you were little (and in trouble), a parent would begin talking low and slow and this got your attention? Well, that’s exactly what this does for Alexa and the tenor of the conversation. It varies dialog and thus maintains engagement.
<speak> I already told you, I <emphasis level="strong"> really like </emphasis> making skills for Alexa. </speak>
Try the various options in the voice simulator: none, moderate, strong, reduced. Note that "reduced" lowers volume and increases speed, which reduces emphasis.
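To compare the levels side by side, you could generate one sample per level and paste each into the simulator. A small Python sketch, assuming the four level names listed above; the emphasis helper is hypothetical and only checks the name, not whether a given voice honors it.

```python
from xml.sax.saxutils import escape

# Levels named in the text above.
EMPHASIS_LEVELS = ("none", "moderate", "strong", "reduced")


def emphasis(text: str, level: str = "moderate") -> str:
    """Wrap text in <emphasis>, rejecting level names that are not in the list."""
    if level not in EMPHASIS_LEVELS:
        raise ValueError(f"unknown emphasis level: {level!r}")
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


for level in EMPHASIS_LEVELS:
    print(f"<speak>I already told you, I {emphasis('really like', level)} making skills for Alexa.</speak>")
```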
Finally, <prosody> provides the ultimate control over volume, pitch, and rate of speech for Alexa. But with greater control comes greater responsibility. While it’s fun to make Alexa sound like ET, it’s really not what we’re aiming for here. To maintain intelligible speech and to provide the best user experience, the amount of change applied to rate, pitch, and volume is limited.
Try these examples in the voice simulator on the developer portal.
<speak> <prosody pitch="low">This is a low pitch.</prosody> </speak>
<speak> <prosody pitch="medium">This is a medium pitch.</prosody> </speak>
<speak> <prosody rate="slow">This is a slow change in rate.</prosody> </speak>
<speak> <prosody rate="fast">This is a fast change in rate.</prosody> </speak>
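Because <prosody> takes several attributes at once, it can help to build the tag in code and check that the result is well-formed XML before sending it. The following Python sketch does both. The prosody and speak helpers are hypothetical, and the check covers XML well-formedness only; it does not validate attribute values against what Alexa accepts.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

PROSODY_ATTRIBUTES = {"rate", "pitch", "volume"}


def prosody(text: str, **attrs: str) -> str:
    """Wrap text in <prosody> with any combination of rate, pitch, and volume."""
    unknown = set(attrs) - PROSODY_ATTRIBUTES
    if unknown or not attrs:
        raise ValueError("use at least one of rate, pitch, volume and nothing else")
    # Sort the attributes so the output is stable regardless of call order.
    rendered = "".join(f" {name}={quoteattr(value)}" for name, value in sorted(attrs.items()))
    return f"<prosody{rendered}>{escape(text)}</prosody>"


def speak(body: str) -> str:
    """Wrap a body in <speak> and fail early if the document is not well-formed XML."""
    document = f"<speak>{body}</speak>"
    ET.fromstring(document)  # raises xml.etree.ElementTree.ParseError if malformed
    return document


print(speak(prosody("This is a low pitch.", pitch="low")))
print(speak(prosody("This is slow and quiet.", rate="slow", volume="soft")))
```

One caveat: the well-formedness check rejects tags with an undeclared prefix such as amazon:effect, so this speak helper as written is only suitable for the unprefixed tags.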
SSML and the Quiz Game
In this post, we’ve covered five new SSML tags that help control speech output for Alexa. Now it’s time to put theory into practice. Take a look at the quiz game template, which makes use of SSML. Assemble the basic skill and get it working. Then make it your own and see if you can add any of the new tags we’ve just reviewed.
<speak> SSML is a powerful tool in shaping the tenor of dialog and user experience. You’ll want to bookmark this page and keep reviewing until you’ve got this stuff <emphasis level="strong"> down cold</emphasis>. </speak>
For more information about getting started with Alexa and SSML, check out the following:
- Speech Synthesis Markup Language (SSML) Reference
- Alexa Skills Kit (ASK)
- Alexa Dev Chat Podcast
- Alexa Developer Forums
And join us for a live webinar on SSML on May 18. We’ll walk through all of the supported tags and show you how to level up your Alexa responses with SSML.
Build a Skill, Get a Shirt
The Alexa Skills Kit (ASK) enables developers to build capabilities, called skills, for Alexa. ASK is a collection of self-service APIs, documentation, tools, and code samples that make it fast and easy for anyone to add skills to Alexa.
Developers have built more than 10,000 skills with ASK. Explore the stories behind some of these innovations, then start building your own skill. Once you publish your skill, mark the occasion with a free, limited-edition Alexa dev shirt. Quantities are limited.