Motherboard: Someone Turned 50,000 Hours of UFO Podcasts Into a Searchable Database. “a UFO enthusiast and barrister in England who goes by the pseudonym Isaac Koi is transcribing archives of UFO-related shows. So far, he’s catalogued over 50,000 podcast episodes and videos.”
A big thanks to Diane R for telling me about FluidDATA, available at https://fluiddata.com/ . It’s a search engine for podcast transcripts! From the About page: “FluidDATA podcast search engine allows you to search millions of podcast transcripts. Search all the best podcasts for keywords, topics, or people. View podcast statistics and trends in all the top podcasts. Add podcast transcript search to your podcast app or website with our advanced search API.” I sense an article in the future.
Phys.org: Multidisciplinary study provides new insights about French Revolution. “New research from experts in history, computer science and cognitive science shines fresh light on the French Revolution, showing how rhetorical and institutional innovations won acceptance for the ideas that built the French republic’s foundation and inspired future democracies. The researchers, including an Indiana University professor, doctoral student and undergraduate, used data-mining techniques to comb through transcripts of 40,000 speeches from the two-year tenure of the National Constituent Assembly, the first parliament of the revolution.”
Duke University: Interactive Transcripts have Arrived! “This week Duke Digital Collections added our first set of interactive transcripts to one of our newest digital collections: the Silent Vigil (1968) and Allen Building Takeover (1969) collection of audio recordings. This marks an exciting milestone in the accessibility efforts Duke University Libraries has been engaged in for the past 2.5 years. Last October, my colleague Sean wrote about our new accessibility features and the technology powering them, and today I’m going to tell you a little more about why we started these efforts as well as share some examples.”
USC Shoah Foundation: Nearly 1,000 English Transcripts Added to Visual History Archive. “USC Shoah Foundation integrated the first 984 English-language transcripts into the Visual History Archive over the weekend – the first such update since ProQuest began working on transcribing testimonies as part of its partnership with USC Shoah Foundation last year. The English-language transcripts join 898 German-language transcripts produced by Freie Universität in Berlin that are already available in the VHA.”
From The New Stack, with a hat tip to Angela G.: Big Data Simpsons. “Thanks to the work of Benjamin M. Schmidt, an assistant professor of history at Northeastern University, 25 years of dialogue from The Simpsons have been smashed into a giant data set, connected to a user-friendly search window.”
PRNewswire: Trump Database Factba.se Raises New Round of Seed Funding, Incorporates as FactSquared (PRESS RELEASE). “FactSquared (http://factsquared.com), parent company of the acclaimed and often-cited platform Factba.se (https://factba.se) today announced it has raised seed funding, led by serial entrepreneurs and investors Mark Walsh and Matt Koll. FactSquared will use the capital to expand its outreach in the political, corporates and media fields and continue development of its platform – which has made instantly searchable every word spoken by President Donald Trump since the 1980s.”
Poynter: A new game puts the public into public radio archives. “The game, called Fix It, was launched by the American Archive of Public Broadcasting, a collaboration between the Library of Congress and the WGBH Educational Foundation. It asks the public for help in identifying and correcting errors in public media transcripts — which improves both the searchability and accessibility of archival material from the collection.”
Going through my Google Alerts, I stumbled on this brief writeup about a Web-based dictation tool which supports many different languages. “Available in a multitude of languages including Korean, Lithuanian, Spanish (of many varieties!), Greek, and almost every other language taught in LCSL, this web-based transcription tool is surprisingly accurate when used with good quality audio and even uses context to automatically correct its errors.”