Classic FM: Genius Google tool turns your tuneless humming into a lovely violin solo

Classic FM: Genius Google tool turns your tuneless humming into a lovely violin solo. “Using your phone or desktop, you can transform any unpolished melody into a violin, saxophone, flute or trumpet solo. And when we say unpolished melody, we literally mean any noise. Honestly, anything.”

Artnet News: This Newly Discovered Radio Clip May Be the Only Known Recording of Frida Kahlo’s Voice. Listen to It Here

Artnet News: This Newly Discovered Radio Clip May Be the Only Known Recording of Frida Kahlo’s Voice. Listen to It Here. “Everyone knows the face of Frida Kahlo. Now, we may finally know what she sounded like, following the discovery of what could be the only known recording of the great Mexican artist’s voice. The recording discovered in Mexico’s national sound library, the Fonoteca Nacional, could be the only known record of Kahlo speaking.”

Mozilla Blog: Sharing our Common Voices – Mozilla releases the largest to-date public domain transcribed voice dataset

Mozilla Blog: Sharing our Common Voices – Mozilla releases the largest to-date public domain transcribed voice dataset. “From the onset, our vision for Common Voice has been to build the world’s most diverse voice dataset, optimized for building voice technologies. We also made a promise of openness: we would make the high quality, transcribed voice data that was collected publicly available to startups, researchers, and anyone interested in voice-enabled technologies. Today, we’re excited to share our first multi-language dataset with 18 languages represented, including English, French, German and Mandarin Chinese (Traditional), but also for example Welsh and Kabyle. Altogether, the new dataset includes approximately 1,400 hours of voice clips from more than 42,000 people.”

Quartz: Google’s voice-generating AI is now indistinguishable from humans

Quartz: Google’s voice-generating AI is now indistinguishable from humans. “Humans have officially given their voice to machines. A research paper published by Google this month—which has not been peer reviewed—details a text-to-speech system called Tacotron 2, which claims near-human accuracy at imitating audio of a person speaking from text.”

TIME: China Is Creating a Database of Its Citizens’ Voices to Boost its Surveillance Capability: Report

TIME: China Is Creating a Database of Its Citizens’ Voices to Boost its Surveillance Capability: Report. “The Chinese government has collected tens of thousands of ‘voice pattern’ samples from targeted citizens and is inputting them into a national voice biometric database, according to a Human Rights Watch report published Monday. The idea is that an automated system, thought to still be in development, will use the database to pick out individual voices in telephone and other conversations, boosting the government’s already expansive surveillance capabilities.”

Korea’s Law Enforcement Will Use Voice Sample Database

I’ve heard of DNA databases, and fingerprint databases, but never voice sample databases. “According to a statement from Korean Prosecution Services on April 27, its Forensic Science Division (director Young-dae Kim) will develop what is being called the ‘Korean Voice Sample Database’, which will be used to identify subjects and informants, conduct voice comparisons, and verify vocal evidence. The project, which began in 2014, will be completed by the end of this year.”

Create a Font With the Sound of Your Voice

This is nifty: create a font with the sound of your voice. “You know those games where you use something as a metaphor for your personality? Like, if you were a fruit what fruit would you be? Or are you a Chandler, a Joey, a Rachel, a Carrie, a Samantha, a Miranda? Now to celebrate its 20th anniversary, The Webbys has launched a design and digital nerd version of this parlor game/social masochism with a site that can turn your voice into an original font.”

Google Voice Search Takes Another Quality Step Forward

Google Voice Search has taken another quality step forward. “Our improved acoustic models rely on Recurrent Neural Networks (RNN). RNNs have feedback loops in their topology, allowing them to model temporal dependencies: when the user speaks /u/ in the previous example, their articulatory apparatus is coming from a /j/ sound and from an /m/ sound before. Try saying it out loud – “museum” – it flows very naturally in one breath, and RNNs can capture that. The type of RNN used here is a Long Short-Term Memory (LSTM) RNN which, through memory cells and a sophisticated gating mechanism, memorizes information better than other RNNs. Adopting such models already improved the quality of our recognizer significantly.”
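The memory cells and gating mechanism the post describes can be sketched in a few lines of numpy. This is a toy single-step LSTM cell for illustration only, with made-up dimensions and random weights, not Google’s actual acoustic model: the point is just how the gates let the cell carry information across time steps (like the /j/ and /m/ sounds that precede the /u/ in “museum”).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step: gates decide what to forget, write, and expose.

    W has shape (4*H, D+H) and b has shape (4*H,), packing the four
    gates' weights together (a common implementation trick).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0:H])        # forget gate: how much old memory to keep
    i = sigmoid(z[H:2*H])      # input gate: how much new input to write
    o = sigmoid(z[2*H:3*H])    # output gate: how much memory to expose
    g = np.tanh(z[3*H:4*H])    # candidate memory content
    c = f * c_prev + i * g     # memory cell carries context across steps
    h = o * np.tanh(c)         # hidden state passed to the next time step
    return h, c

# Run a toy sequence of "acoustic frames" through the cell
rng = np.random.default_rng(0)
D, H = 3, 4                              # input and hidden sizes (toy values)
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for frame in rng.normal(size=(5, D)):    # 5 fake frames of audio features
    h, c = lstm_step(frame, h, c, W, b)
```

Because `c` is updated additively (old memory plus gated new content) rather than overwritten, gradients and information survive across many frames, which is why LSTMs memorize temporal context better than plain RNNs.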