Does Android TTS support Speech Synthesis Markup Language?

Posted by: admin May 14, 2020


Passing the following SSML (Speech Synthesis Markup Language) document to the com.svox.pico TextToSpeech engine produced a reading of the text content, but the phoneme and emphasis elements had no audible effect. The result (no apparent SSML control) is the same on a Nexus One running Android 2.2 and on the emulator running an AVD at SDK level 8.

            String text = "<?xml version=\"1.0\"?>" +
                "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" " +
                    "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " +
                    "xsi:schemaLocation=\"http://www.w3.org/2001/10/synthesis " +
                        "http://www.w3.org/TR/speech-synthesis/synthesis.xsd\" " +
                    "xml:lang=\"en-US\">" +

                    "tomato " +
                    "<phoneme alphabet=\"ipa\" ph=\"t&#x259;mei&#x325;&#x27E;ou&#x325;\"> tomato </phoneme> " +

                    "That is a big car! " +
                    "That <emphasis> is </emphasis> a big car! " +
                    "That is a <emphasis> big </emphasis> car! " +
                    "That is a huge bank account! " +
                    "That <emphasis level=\"strong\"> is </emphasis> a huge bank account! " +
                    "That is a <emphasis level=\"strong\"> huge </emphasis> bank account!" +
                "</speak>";
            mTts.speak(text, TextToSpeech.QUEUE_ADD, null);

Does any Android TTS engine support any of the SSML elements?

Answers:

I’ve been experimenting with SSML, and it seems that the TTS engine automatically wraps its input in the root <speak> element. If you leave that element out of your own text, it works fine and you don’t get a parser error.


String text = "Testing <phoneme alphabet=\"xsampa\" ph=\"&quot;{[email protected]`\"/>.";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);
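
The workaround described above, dropping the root element before handing the text to the engine, could be sketched as a small helper. This is only an illustration: the class name SsmlBody and the method stripRoot are hypothetical, and the regular expression assumes a single well-formed <speak> root with no '>' inside its attribute values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SsmlBody {
    // Optional XML declaration, then <speak ...>body</speak>.
    // DOTALL lets the body span several lines.
    private static final Pattern ROOT = Pattern.compile(
            "^\\s*(?:<\\?xml[^>]*\\?>\\s*)?<speak\\b[^>]*>(.*)</speak>\\s*$",
            Pattern.DOTALL);

    /** Returns the content of the root speak element, or the input unchanged if there is none. */
    static String stripRoot(String ssml) {
        Matcher m = ROOT.matcher(ssml);
        return m.matches() ? m.group(1) : ssml;
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\"?>"
                + "<speak version=\"1.0\">That <emphasis>is</emphasis> a big car!</speak>";
        // prints: That <emphasis>is</emphasis> a big car!
        System.out.println(stripRoot(doc));
    }
}
```

The returned fragment is what would then be passed to mTts.speak(...). Whether the engine honors the remaining elements is a separate question.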


The answer seems to be “sort of”. Not all the SSML tags are supported yet, but some test examples of the use of the <phoneme> tag are at https://android.googlesource.com/platform/external/svox/+/89292811b7fe82e5c14fa13942779763627e26db
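
Because a <phoneme> element carries the pronunciation in an XML attribute, characters such as the X-SAMPA stress mark (a double quote) have to be escaped before the string is handed to the engine. A minimal sketch, assuming nothing about the engine itself: the names PhonemeTag, escapeXml and phoneme are hypothetical, and the X-SAMPA value in main is an illustrative transcription of "testing", not one taken from the linked test examples.

```java
class PhonemeTag {
    /** Escapes characters that cannot appear verbatim in XML text or a double-quoted attribute. */
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds a phoneme element; alphabet would be "ipa" or "xsampa" in the examples on this page. */
    static String phoneme(String alphabet, String ph, String text) {
        return "<phoneme alphabet=\"" + escapeXml(alphabet) + "\" ph=\"" + escapeXml(ph) + "\">"
                + escapeXml(text) + "</phoneme>";
    }

    public static void main(String[] args) {
        // Illustrative X-SAMPA; the leading double quote marks primary stress.
        // prints: <phoneme alphabet="xsampa" ph="&quot;tEstIN">Testing</phoneme>
        System.out.println(phoneme("xsampa", "\"tEstIN", "Testing"));
    }
}
```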

Though the test examples produce the desired speech output, they also produce XML parser error messages in logcat. I’ve opened an issue about these seemingly incorrect error messages at the Android issue tracker (issue 11010).


It does appear that android.speech.tts at SDK level 23 supports a subset of SSML. Speech text can be wrapped in <speak> tags, and <say-as> is observed, while <break> is not. There is no documentation regarding SSML support.
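
A string of the kind described in that last answer, text wrapped in <speak> with a <say-as> element inside, might be assembled as follows. This is a hedged sketch: SayAsExample and its methods are hypothetical names, the interpret-as value "characters" comes from the SSML specification and is not confirmed here to be among the values Android honors, and the speak(...) call in the comment is the four-argument overload available from API level 21.

```java
class SayAsExample {
    /** Minimal escaping for text placed between tags. */
    static String escapeText(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** interpretAs is expected to be a fixed token chosen by the caller, so it is not escaped. */
    static String sayAs(String interpretAs, String text) {
        return "<say-as interpret-as=\"" + interpretAs + "\">" + escapeText(text) + "</say-as>";
    }

    static String speak(String body) {
        return "<speak>" + body + "</speak>";
    }

    public static void main(String[] args) {
        String ssml = speak("Your code is " + sayAs("characters", "AB12") + ".");
        // prints: <speak>Your code is <say-as interpret-as="characters">AB12</say-as>.</speak>
        System.out.println(ssml);
        // On a device the string would then be passed on, for example:
        // mTts.speak(ssml, TextToSpeech.QUEUE_ADD, null, "utterance-1");
    }
}
```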