What Happened When Google Threw All Voice Data To The Blender. Answer: SpeechStew (Analytics India)

Analytics India: What Happened When Google Threw All Voice Data To The Blender. Answer: SpeechStew. “Training large models is a massive challenge as it requires collecting and annotating vast amounts of data. It is particularly challenging in the case of speech recognition models. To overcome this challenge, a team from Google Research and Google Brain have introduced an AI model, SpeechStew. The model is trained on a combination of datasets to achieve state-of-the-art results on various speech recognition benchmarks.”

VentureBeat: Artie releases tool to measure bias in speech recognition models

VentureBeat: Artie releases tool to measure bias in speech recognition models. “Artie, a startup developing a platform for mobile games on social media that feature AI, today released a data set and tool for detecting demographic bias in voice apps. The Artie Bias Corpus (ABC), which consists of audio files along with their transcriptions, aims to diagnose and mitigate the impact of factors like age, gender, and accent in voice recognition systems.”

Stanford News: Stanford researchers find that automated speech recognition is more likely to misinterpret black speakers

Stanford News: Stanford researchers find that automated speech recognition is more likely to misinterpret black speakers. “The technology that powers the nation’s leading automated speech recognition systems makes twice as many errors when interpreting words spoken by African Americans as when interpreting the same words spoken by whites, according to a new study by researchers at Stanford Engineering.”
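A comparison like "twice as many errors" comes down to a ratio of per-group error rates. The sketch below is only an illustration of that arithmetic, not the Stanford team's method: the function names, the group labels, and every number in the sample data are hypothetical.

```python
from collections import defaultdict


def error_rate_by_group(results):
    """Pool word errors and reference word counts per speaker group.

    `results` is an iterable of (group, word_errors, reference_words)
    tuples, one per transcribed audio clip. The layout is an assumption
    made for this example.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, n_errors, n_words in results:
        errors[group] += n_errors
        words[group] += n_words
    # Skip groups with no reference words to avoid dividing by zero.
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


def disparity_ratio(rates, group_a, group_b):
    """How many times higher group_a's error rate is than group_b's."""
    return rates[group_a] / rates[group_b]


# Made-up clips: (group, word errors, words in the reference transcript).
sample = [
    ("group_a", 3, 20),
    ("group_a", 5, 20),
    ("group_b", 2, 20),
    ("group_b", 2, 20),
]
rates = error_rate_by_group(sample)
ratio = disparity_ratio(rates, "group_a", "group_b")
```

Pooling errors and words before dividing weights each group's rate by transcript length, so a few very short clips cannot dominate the result.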

Harvard Business Review: Voice Recognition Still Has Significant Race and Gender Biases

Harvard Business Review: Voice Recognition Still Has Significant Race and Gender Biases. “Voice AI is becoming increasingly ubiquitous and powerful. Forecasts suggest that voice commerce will be an $80 billion business by 2023. Google reports that 20% of their searches are made by voice query today — a number that’s predicted to climb to 50% by 2020. In 2017, Google announced that their speech recognition had a 95% accuracy rate. While that’s an impressive number, it begs the question: 95% accurate for whom?”

Mozilla: More Common Voices

Mozilla: More Common Voices. “Today we are excited to announce that Common Voice, Mozilla’s initiative to crowdsource a large dataset of human voices for use in speech technology, is going multilingual! Thanks to the tremendous efforts from Mozilla’s communities and our deeply engaged language partners you can now donate your voice in German, French and Welsh, and we are working to launch 40+ more as we speak. But this is just the beginning. We want Common Voice to be a tool for any community to make speech technology available in their own language.”

Seeker: AI Earthquake Tracker Is Inspired by Speech Recognition Technology

Seeker: AI Earthquake Tracker Is Inspired by Speech Recognition Technology. “The state of Oklahoma has witnessed a stunning rise in the frequency of earthquakes, which has been linked to an increase in the use of fracking technology in the oil and gas sector. Starting in 2009, the annual number of quakes measuring above magnitude 3.0 in the state exploded from fewer than three to as many as 903 in 2015. Now, all this seismic activity has prompted scientists to develop a new tool for tracking it — drawing on speech recognition technology.”

TechCrunch: Microsoft’s speech recognition system hits a new accuracy milestone

TechCrunch: Microsoft’s speech recognition system hits a new accuracy milestone. “Microsoft announced today that its conversational speech recognition system has reached a 5.1% error rate, its lowest so far. This surpasses the 5.9% error rate reached last year by a group of researchers from Microsoft Artificial Intelligence and Research and puts its accuracy on par with professional human transcribers who have advantages like the ability to listen to text several times.”
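Error rates like these are conventionally word error rates (WER): the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The excerpt does not say how Microsoft scored its system, so the following is a minimal sketch of the standard metric with a hypothetical function name and simplified text normalization (lowercasing and whitespace splitting only).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For instance, scoring the hypothesis "the cat sat on mat" against the reference "the cat sat on the mat" counts one deleted word out of six. A WER of 5.1% therefore means roughly five word errors per hundred reference words.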

Fast Company: Mozilla is crowdsourcing a massive speech-recognition system

Fast Company: Mozilla is crowdsourcing a massive speech-recognition system. “From Amazon’s Alexa to Apple’s Siri, speech recognition and response are becoming mainstays of how we interact with computers, apps, and internet services. But the technology is owned by giant corporations. Now the Mozilla Foundation, maker of the free Firefox browser, is recruiting volunteers to train an open-source speech recognition system.”

Google Opening Up Speech Recognition API

More from Google: the company has opened access to its speech recognition API. “The Google Cloud Speech API, which will cover over 80 languages and will work with any application in real-time streaming or batch mode, will offer full set of APIs for applications to ‘see, hear and translate,’ Google says. It is based on the same neural network tech that powers Google’s voice search in the Google app and voice typing in Google’s Keyboard. There are some other interesting features, such as working in noisy environments and in real-time.”