Google Cloud Text-to-Speech API now offers Custom Voices

In a blog post published on 1 September, Google Cloud announced a number of updates to its Contact Center AI product. Among them is a custom voice feature for businesses that use Google’s Text-to-Speech API.

The beta version of the Cloud Text-to-Speech API now includes this Custom Voice feature. It lets a business train its own voice model from scripted recordings of a speaker of its choosing. The model is trained on those recordings and then used to synthesise audio in that voice.

For now, Google must approve each individual use case before the process begins. Google then sends a script for the voice recordings, and the company sends back a studio-quality recording. The model is then trained to sound as much like the actual voice as possible. The feature is, however, in its early stages and requires immaculate training data. As a result, Google may return the recordings to the business and ask for adjustments.

The training of the model takes several weeks, and no Service Level Agreement (SLA) support has been made available for this beta version of the product.

The feature suits businesses with a distinct preference for a particular voice and speech pattern. It is, however, only available for American English (en-US) for now, although Google may have plans to expand to other dialects, and perhaps other languages, in the future.
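
To make this concrete, here is a rough sketch of what a synthesis request for a trained custom voice might look like. It only builds the JSON body and does not call Google. The field names (in particular `customVoice` and `model`), the model resource name and the helper `build_synthesize_request` are assumptions for illustration, not details from Google's announcement, so check them against the current API reference before relying on them.

```python
import json

# Hypothetical resource name; a real one would only exist after Google
# has approved the use case and trained the model.
CUSTOM_MODEL = "projects/my-project/locations/us-central1/models/my-custom-voice"


def build_synthesize_request(text, model=CUSTOM_MODEL, language_code="en-US"):
    """Build a JSON-style request body for a custom-voice synthesis call.

    The field layout is an assumption modelled on the public
    Text-to-Speech REST API, not taken from the announcement.
    """
    # The beta is described as American English only, so reject anything else.
    if language_code != "en-US":
        raise ValueError("Custom Voice beta is described as en-US only")
    return {
        "input": {"text": text},
        "voice": {
            "languageCode": language_code,
            "customVoice": {"model": model},  # assumed field names
        },
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


body = build_synthesize_request("Thanks for calling. How can I help?")
print(json.dumps(body, indent=2))
```

Asking for any language other than en-US raises a `ValueError` in this sketch, mirroring the beta's current language restriction.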
