Audio Corpora Libraries

Custom Developed and Existing Audio Corpora

Existing Corpora Libraries for 2016

MatrixHCI is developing a variety of quality acoustic speech corpora for use in speech recognition research and development.  These libraries are available for license to companies in need of large collections of transcribed speech samples.

The libraries that are being developed for 2016 release include:

Conversational speech, dictational speech, and aviation-related speech corpora.

Spoken languages include English and Spanish.  We are also compiling an audio corpus of English speakers who use “vocal fry,” technically called “creaky voice,” in their speech.  Vocal fry is produced by compressing the vocal cords, which reduces both the airflow through the larynx and the frequency of vibration, causing speech to sound rattled or “creaky.”

It is most common among younger women in the US, and is becoming ever more pervasive in everyday communication.

Unfortunately, vocal fry can cause speech recognition accuracy problems.  This new audio corpus is specifically designed to bridge that gap and allow acoustic models to better handle this speaking behavior.

You can find an example of vocal fry in the video below.


English Conversational Audio Corpora Available in 2016

MatrixHCI will be releasing the following audio corpora spoken in the English language.

English Conversational Speech 200 hrs.

– Geographically diverse set of 400 speakers, each speaking conversationally for 30 minutes.

– Recorded at 96 kHz from 6 different recording sources, including a high-quality cardioid microphone, a lapel microphone, an omnidirectional conference-table microphone, and a noise-canceling headset microphone.

– Additionally, audio has been collected and passed through both iPhone/iOS and Android devices over the AT&T phone network.
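As a rough check on the figures above, the sketch below computes the corpus duration from the speaker count and per-speaker recording time, then estimates uncompressed storage at 96 kHz.  The function names are illustrative, and the 16-bit mono PCM format is an assumption; the bit depth and channel count are not stated here.

```python
def corpus_hours(speakers: int, minutes_per_speaker: float) -> float:
    """Total recorded speech in hours for one recording source."""
    return speakers * minutes_per_speaker / 60


def raw_pcm_gigabytes(
    hours: float,
    sample_rate_hz: int = 96_000,  # stated in the corpus description
    bit_depth: int = 16,           # ASSUMPTION: not stated in the description
    channels: int = 1,             # ASSUMPTION: mono
) -> float:
    """Approximate size of uncompressed PCM audio in gigabytes (10^9 bytes)."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return hours * 3600 * bytes_per_second / 1e9


if __name__ == "__main__":
    hours = corpus_hours(speakers=400, minutes_per_speaker=30)
    per_source_gb = raw_pcm_gigabytes(hours)
    print(f"{hours:.0f} hours, about {per_source_gb:.1f} GB per recording source")
```

Under these assumptions, each of the 6 recording sources would hold its own copy of the audio, so the total footprint scales with the number of sources licensed.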

For more information about license details on either of these audio corpora, please contact us.

English Dictational Audio Corpora Available in 2016

English Dictational Speech 200 hrs.

– Geographically diverse set of 400 speakers, each speaking from a set of phonetically balanced utterances for 30 minutes.

– Recorded at 96 kHz from 6 different recording sources, including a high-quality cardioid microphone, a lapel microphone, an omnidirectional conference-table microphone, and a noise-canceling headset microphone.

– Additionally, audio has been collected and passed through both iPhone/iOS and Android devices over the AT&T phone network.

For more information about license details on either of these audio corpora, please contact us.


English (Vocal Fry) Conversational Audio Corpora Available in 2016

MatrixHCI will be releasing the following audio corpora spoken in the English language.  This corpus will include only speakers who exhibit vocal fry in their speech.

It is intended to help increase the accuracy of speech recognition acoustic models for younger speakers who use this style of speaking.

English (Vocal Fry) Conversational Speech 100 hrs.

– Geographically diverse set of 400 speakers, each speaking conversationally for 30 minutes.

– Recorded at 96 kHz from 6 different recording sources, including a high-quality cardioid microphone, a lapel microphone, an omnidirectional conference-table microphone, and a noise-canceling headset microphone.

– Additionally, audio has been collected and passed through both iPhone/iOS and Android devices over the AT&T phone network.

For more information about license details on either of these audio corpora, please contact us.

Spanish Conversational Audio Corpora Available in 2016

MatrixHCI will be releasing the following audio corpora spoken in the Spanish language.

Spanish Conversational Speech 200 hrs.

– Geographically diverse set of 400 speakers, each speaking conversationally for 30 minutes.

– Recorded at 96 kHz from 6 different recording sources, including a high-quality cardioid microphone, a lapel microphone, an omnidirectional conference-table microphone, and a noise-canceling headset microphone.

– Additionally, audio has been collected and passed through both iPhone/iOS and Android devices over the AT&T phone network.

For more information about license details on either of these audio corpora, please contact us.


Spanish Dictational Audio Corpora Available in 2016

Spanish Dictational Speech 200 hrs.

– Geographically diverse set of 400 speakers, each speaking from a set of phonetically balanced utterances for 30 minutes.

– Recorded at 96 kHz from 6 different recording sources, including a high-quality cardioid microphone, a lapel microphone, an omnidirectional conference-table microphone, and a noise-canceling headset microphone.

– Additionally, audio has been collected and passed through both iPhone/iOS and Android devices over the AT&T phone network.

For more information about license details on either of these audio corpora, please contact us.

Air Traffic Control Audio Corpus for 2016

MatrixHCI will be releasing audio corpora of Air Traffic Control (ATC) transmissions from many major airports throughout the US.

Audio has been collected from many phases of flight, including Ground, Taxi to Takeoff, Departure, Approach, and En Route.

Additionally, audio has been collected at selected major airports during periods of abnormally high traffic activity.

The ATC audio corpus can be custom configured to meet your needs.  Contact MatrixHCI for a consultation and to learn how we can help enable speech recognition in the aviation and aerospace sectors.

Audio was collected using custom-built J-pole antennas tuned to the middle of the ATC frequency band.  All recordings are captured as raw WAV files at 96 kHz using MatrixHCI’s remote audio collection bus, located directly at each airport for quality reception.
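To give a sense of scale for an antenna tuned to mid-band, the sketch below computes the band center and the corresponding free-space wavelength.  The 118–137 MHz band edges are an assumption (the commonly cited civil VHF voice airband), not a MatrixHCI specification, and the function names are illustrative.  A J-pole is conventionally a half-wave radiator fed by a quarter-wave matching stub, so those two fractions are shown; physical elements are typically cut somewhat shorter than the free-space values.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # exact, by definition of the metre

# ASSUMPTION: civil VHF voice airband edges in MHz; not stated in the text.
ATC_BAND_LOW_MHZ = 118.0
ATC_BAND_HIGH_MHZ = 137.0


def band_center_mhz(low_mhz: float, high_mhz: float) -> float:
    """Arithmetic midpoint of a frequency band."""
    return (low_mhz + high_mhz) / 2


def wavelength_m(freq_mhz: float) -> float:
    """Free-space wavelength in metres for a frequency given in MHz."""
    return SPEED_OF_LIGHT_M_S / (freq_mhz * 1e6)


if __name__ == "__main__":
    center = band_center_mhz(ATC_BAND_LOW_MHZ, ATC_BAND_HIGH_MHZ)
    lam = wavelength_m(center)
    print(f"center frequency: {center:.1f} MHz")
    print(f"free-space wavelength: {lam:.3f} m")
    print(f"half-wave radiator (free space): {lam / 2:.3f} m")
    print(f"quarter-wave stub (free space): {lam / 4:.3f} m")
```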

This audio includes both ATC-to-pilot and pilot-to-ATC transmissions.

Additionally, many recordings also have ADS-B data associated with the transcribed recordings.


MatrixHCI offers Custom Software Development with an emphasis on Advanced Cutting-Edge Speech Recognition

If you have a need for specialized custom speech recognition solutions, please contact us for a free confidential consultation.