Pete Warden: Launching spchcat, an open-source speech recognition tool for Linux and Raspberry Pi

Pete Warden: Launching spchcat, an open-source speech recognition tool for Linux and Raspberry Pi. “I’ve been following the Coqui.ai team’s work since they launched, and was very impressed by the quality of the open source speech models and code they have produced. I didn’t have an easy way to run them myself though, especially on live microphone input. With that in mind, I decided my holiday project would be writing a command line tool using Coqui’s speech to text library. To keep it as straightforward as possible I modeled it on the classic Unix cat command, where the default would be to read audio from a microphone and output text (though it ended up expanding to system audio and files too) so I called it spchcat.”

News@Northeastern: The Race To Save Indigenous Languages, Using Automatic Speech Recognition

News@Northeastern: The Race To Save Indigenous Languages, Using Automatic Speech Recognition. “Growing up in the windy plains near the Northern Cheyenne Indian Reservation, [Michael] Running Wolf says that although his family—which is part Cheyenne, part Lakota—didn’t have daily access to running water or electricity, sometimes, when the winds died down, the power would flicker on, and he’d plug in his Atari console and play games with his sisters. These early experiences would spur forward a lifelong interest in computers, artificial intelligence, and software engineering that Running Wolf is now harnessing to help reawaken endangered indigenous languages in North and South America, some of which are so critically at risk of extinction that their tallies of living native speakers have dwindled into the single digits.”

CNN: Irish tech firm helps kids’ voices be heard

CNN: Irish tech firm helps kids’ voices be heard. “While personal artificial intelligence (AI) assistants are becoming increasingly integrated in our everyday lives, they are just one use of voice tech — and are primarily designed for adults. Irish tech startup SoapBox Labs wants that to change. The Dublin-based firm has developed speech recognition technology designed specifically for children — and it’s already in use across a range of applications, from toys to education apps.”

Analytics India: What Happened When Google Threw All Voice Data To The Blender. Answer: SpeechStew

Analytics India: What Happened When Google Threw All Voice Data To The Blender. Answer: SpeechStew. “Training large models is a massive challenge as it requires collecting and annotating vast amounts of data. It is particularly challenging in the case of speech recognition models. To overcome this challenge, a team from Google Research and Google Brain have introduced an AI model, SpeechStew. The model is trained on a combination of datasets to achieve state-of-the-art results on various speech recognition benchmarks.”

VentureBeat: Artie releases tool to measure bias in speech recognition models

VentureBeat: Artie releases tool to measure bias in speech recognition models. “Artie, a startup developing a platform for mobile games on social media that feature AI, today released a data set and tool for detecting demographic bias in voice apps. The Artie Bias Corpus (ABC), which consists of audio files along with their transcriptions, aims to diagnose and mitigate the impact of factors like age, gender, and accent in voice recognition systems.”

Stanford News: Stanford researchers find that automated speech recognition is more likely to misinterpret black speakers

Stanford News: Stanford researchers find that automated speech recognition is more likely to misinterpret black speakers. “The technology that powers the nation’s leading automated speech recognition systems makes twice as many errors when interpreting words spoken by African Americans as when interpreting the same words spoken by whites, according to a new study by researchers at Stanford Engineering.”

Harvard Business Review: Voice Recognition Still Has Significant Race and Gender Biases

Harvard Business Review: Voice Recognition Still Has Significant Race and Gender Biases. “Voice AI is becoming increasingly ubiquitous and powerful. Forecasts suggest that voice commerce will be an $80 billion business by 2023. Google reports that 20% of their searches are made by voice query today — a number that’s predicted to climb to 50% by 2020. In 2017, Google announced that their speech recognition had a 95% accuracy rate. While that’s an impressive number, it begs the question: 95% accurate for whom?”

Mozilla: More Common Voices

Mozilla: More Common Voices. “Today we are excited to announce that Common Voice, Mozilla’s initiative to crowdsource a large dataset of human voices for use in speech technology, is going multilingual! Thanks to the tremendous efforts from Mozilla’s communities and our deeply engaged language partners you can now donate your voice in German, French and Welsh, and we are working to launch 40+ more as we speak. But this is just the beginning. We want Common Voice to be a tool for any community to make speech technology available in their own language.”

Seeker: AI Earthquake Tracker Is Inspired by Speech Recognition Technology

Seeker: AI Earthquake Tracker Is Inspired by Speech Recognition Technology. “The state of Oklahoma has witnessed a stunning rise in the frequency of earthquakes, which has been linked to an increase in the use of fracking technology in the oil and gas sector. Starting in 2009, the annual number of quakes measuring above magnitude 3.0 in the state exploded from fewer than three to as many as 903 in 2015. Now, all this seismic activity has prompted scientists to develop a new tool for tracking it — drawing on speech recognition technology.”

TechCrunch: Microsoft’s speech recognition system hits a new accuracy milestone

TechCrunch: Microsoft’s speech recognition system hits a new accuracy milestone. “Microsoft announced today that its conversational speech recognition system has reached a 5.1% error rate, its lowest so far. This surpasses the 5.9% error rate reached last year by a group of researchers from Microsoft Artificial Intelligence and Research and puts its accuracy on par with professional human transcribers who have advantages like the ability to listen to text several times.”

Fast Company: Mozilla is crowdsourcing a massive speech-recognition system

Fast Company: Mozilla is crowdsourcing a massive speech-recognition system. “From Amazon’s Alexa to Apple’s Siri, speech recognition and response are becoming mainstays of how we interact with computers, apps, and internet services. But the technology is owned by giant corporations. Now the Mozilla Foundation, maker of the free Firefox browser, is recruiting volunteers to train an open-source speech recognition system.”

Google Opening Up Speech Recognition API

More Google: the company has opened access to its speech recognition API. “The Google Cloud Speech API, which will cover over 80 languages and will work with any application in real-time streaming or batch mode, will offer full set of APIs for applications to ‘see, hear and translate,’ Google says. It is based on the same neural network tech that powers Google’s voice search in the Google app and voice typing in Google’s Keyboard. There are some other interesting features, such as working in noisy environments and in real-time.”