Microsoft recently released new features for its Cognitive Speech Services to accelerate language learning with pronunciation assessment, new speech-to-text (STT) languages, and prebuilt and custom neural voice enhancements.
Microsoft Azure Cognitive Speech Services is a comprehensive collection of technologies and services, such as Speech to Text, Text to Speech, Custom Neural Voice (CNV), Conversation Transcription Service, Speaker Recognition, Speech Translation, Speech SDK, and Speech Device Development Kit (DDK), to accelerate speech incorporation into applications.
Pronunciation Assessment is a feature of Speech Services in the Azure Cognitive Services portfolio. It is publicly available in more than 10 languages and variants, including American English, British English, Australian English, French, Spanish, and Chinese, with additional languages in preview. It uses Azure Neural Text-to-Speech and Transformer models, ordinal regression, and a hierarchical structure to improve the accuracy of the word-level assessment, helping language learners of all backgrounds improve their skills.
Source: https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/speech-service-update-hierarchical-transformer-for-pronunciation/ba-p/3740866
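As a rough illustration of how an application might request a word-level assessment, the sketch below builds the configuration value that the Speech-to-text REST API for short audio is documented to accept in a `Pronunciation-Assessment` request header: a base64-encoded JSON object. The helper name `build_pronunciation_assessment_header` is hypothetical, and the parameter names and values (`ReferenceText`, `GradingSystem`, `Granularity`, `Dimension`, `EnableMiscue`) are assumptions that should be checked against the current Microsoft documentation. No network call is made.

```python
import base64
import json


def build_pronunciation_assessment_header(reference_text: str,
                                          granularity: str = "Word") -> str:
    """Return a base64-encoded JSON string for the Pronunciation-Assessment header.

    Hypothetical helper: parameter names follow the REST API documentation
    as an assumption and are not taken from the article.
    """
    params = {
        "ReferenceText": reference_text,  # the text the learner reads aloud
        "GradingSystem": "HundredMark",   # request scores on a 0-100 scale
        "Granularity": granularity,       # "Phoneme", "Word", or "FullText"
        "Dimension": "Comprehensive",     # request more than the accuracy score
        "EnableMiscue": "True",           # compare spoken words with the reference
    }
    payload = json.dumps(params).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


# Build the header value and decode it again to inspect the settings.
header_value = build_pronunciation_assessment_header("Good morning.")
decoded = json.loads(base64.b64decode(header_value))
print(decoded["Granularity"])
```

In a real request, this value would be sent together with the audio and a valid subscription key, both of which are omitted here.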
In addition, Azure Speech to Text supports real-time language identification for multilingual language learning scenarios and assists human-to-human interaction with better understanding and readable context. The service's new speech-to-text (STT) languages are built on large amounts of data, leveraging the latest multilingual modeling technology and transfer learning techniques. Their output includes Inverse Text Normalization (ITN), capitalization (when appropriate), and automatic punctuation to improve readability.
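To make the three readability steps concrete, here is a deliberately tiny, rule-based sketch of what ITN, capitalization, and punctuation do to a raw transcript. It is a toy written for illustration only: the function `toy_itn` and its word lists are hypothetical, it handles only the numbers 0 to 99, and it does not reflect how the Azure service is implemented.

```python
# Toy vocabulary; a production ITN system also covers dates, currency, and more.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def toy_itn(spoken: str) -> str:
    """Turn number words (0-99) into digits, then capitalize and add a period."""
    out = []
    pending = None  # a tens value waiting for an optional units word

    def flush():
        nonlocal pending
        if pending is not None:
            out.append(str(pending))
            pending = None

    for word in spoken.lower().split():
        if word in TENS:
            flush()
            pending = TENS[word]
        elif word in UNITS:
            if pending is not None and 1 <= UNITS[word] <= 9:
                # "twenty" + "five" -> 25
                out.append(str(pending + UNITS[word]))
                pending = None
            else:
                flush()
                out.append(str(UNITS[word]))
        else:
            flush()
            out.append(word)
    flush()

    text = " ".join(out)
    # Capitalize the first character and close the sentence with a period.
    return text[:1].upper() + text[1:] + "."


print(toy_itn("the class starts in twenty five minutes"))
```

The spoken-form input "twenty five" becomes the written form "25", which is the kind of conversion ITN performs before capitalization and punctuation are applied.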
Finally, Microsoft Azure AI offers a range of prebuilt neural voices for AI teachers, content read-aloud capabilities, and more. Custom Neural Voice (CNV) also allows users to create a unique, customized synthetic voice for their applications, using human speech samples as training data. CNV is based on neural text-to-speech technology and is well suited to representing brands and personifying machines for conversational interactions. Education companies such as Duolingo and Pearson are using this technology to personalize language learning.
Qinying Liao, a principal program manager at Microsoft, said in an Azure Tech Community blog post:
Microsoft offers over 400 neural voices covering more than 140 languages and locales. With these Text-to-Speech voices, you can quickly add read-aloud functionality for a more accessible app design or give a voice to chatbots to provide a richer conversational experience to your users.
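Read-aloud requests to a neural voice are typically expressed in Speech Synthesis Markup Language (SSML). The sketch below only assembles such a document as a string and makes no call to the service. The helper `build_ssml` is hypothetical, and `en-US-JennyNeural` is used as an example voice name; available voices should be confirmed in the service's voice list.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str,
               voice: str = "en-US-JennyNeural",
               lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document that selects one voice."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="{lang}">'
        f'<voice name="{voice}">{escape(text)}</voice>'  # escape &, <, >
        "</speak>"
    )


ssml = build_ssml("Welcome back! Today's lesson covers past & present tense.")

# Parse the result to confirm it is well-formed XML.
root = ET.fromstring(ssml)
voice_element = root.find(f"{{{SSML_NS}}}voice")
print(voice_element.get("name"))
```

Escaping the text matters because learner-facing content often contains characters such as `&` that would otherwise make the SSML invalid.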
More broadly, Andy Beatman, a senior product marketing manager at Azure AI, said in an Azure AI blog post:
The integration of AI, specifically speech services, into the education sector is becoming increasingly important as it can greatly enhance the learning experience and improve the effectiveness of teaching. Speech services such as Azure Pronunciation Assessment and Custom Neural Voice provide personalization, automation, and analytics in education platforms, which can lead to better student engagement and achievement.
Lastly, more details on Azure Cognitive Speech Services are available on the documentation landing page. Customers can also use Speech Studio to test how custom speech features would help improve recognition for their audio.