Lost for Words: Speech Synthesis with Limited Data Using Linear Networks

This article is part of the Academic Alibaba series and is based on the paper “Linear Networks Based Speaker Adaptation for Speech Synthesis” by Zhiying Huang, Heng Lu, Ming Lei, and Zhijie Yan, accepted by IEEE ICASSP 2018. The full paper can be read here.

Speaker-dependent acoustic models allow speech synthesis systems to give accurate results: given an adequate amount of training data from a target speaker, a system can generate speech that closely resembles that speaker. However, collecting enough data from target speakers is a persistent constraint.

Speaker adaptation can be used to obtain satisfactory target speaker voice fonts using only limited data. This approach is less labor-intensive than mass recording, manual transcribing and review, and ultimately reduces the cost of creating new voices.

To improve the stability of these adapted voices, the Alibaba tech team has investigated Linear Network (LN)-based speaker adaptation methods and Low-Rank Plus Diagonal (LRPD) decomposition. A breakdown of these approaches is illustrated below.

[Figures: breakdown of the LN-based speaker adaptation and LRPD approaches]
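
To make the LRPD idea concrete, here is a minimal NumPy sketch of a linear network whose square weight matrix is factored into a diagonal plus a low-rank product. It is an illustration under stated assumptions, not the paper's implementation: the factorization W ≈ D + PQ, the identity initialization, the class name `LRPDLinear`, the helper `lrpd_param_count`, and the layer sizes are all chosen here for the example.

```python
import numpy as np


def lrpd_param_count(dim, rank):
    """Parameters in a diagonal (dim) plus two low-rank factors (2 * dim * rank)."""
    return dim + 2 * dim * rank


class LRPDLinear:
    """Hypothetical linear adaptation layer with W = diag(d) + P @ Q.

    Assumption: the layer starts as the identity so that, before any
    adaptation, it leaves the speaker-independent model unchanged.
    """

    def __init__(self, dim, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.d = np.ones(dim)                    # diagonal part, identity at init
        self.P = np.zeros((dim, rank))           # zero, so P @ Q vanishes at init
        self.Q = rng.normal(scale=0.01, size=(rank, dim))

    def matrix(self):
        """Materialize the full dim x dim matrix (for inspection only)."""
        return np.diag(self.d) + self.P @ self.Q

    def __call__(self, h):
        """Apply the layer to a batch of row vectors h with shape (batch, dim)."""
        # h @ W.T computed without building W: diagonal term plus low-rank term.
        return h * self.d + (h @ self.Q.T) @ self.P.T


if __name__ == "__main__":
    dim, rank = 256, 8
    full = dim * dim
    reduced = lrpd_param_count(dim, rank)
    print(f"full LN parameters: {full}, LRPD parameters: {reduced}")
```

With a 256-unit layer and rank 8 (both assumed values), the full square matrix has 65,536 parameters while the factored form has 4,352. Having fewer speaker-specific parameters to estimate is the intuition for why a decomposition of this kind can help when adaptation data is extremely limited.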

The effectiveness of these approaches was evaluated by conducting female-to-female, male-to-female, and female-to-male speaker adaptation. The results showed that LN with LRPD decomposition is the most effective approach when adaptation data is extremely limited. Moreover, a speaker-adapted model built with this method from only 200 adaptation utterances achieved quality comparable to that of a speaker-dependent model trained with 1,000 utterances, in terms of both naturalness and similarity to the target speaker.

Alibaba Tech
