Clouds and blackberries: how web archives can help us to track the changing meaning of words (Alan Turing Institute)

Alan Turing Institute: Clouds and blackberries: how web archives can help us to track the changing meaning of words. “The meaning of words changes all the time. Think of the word ‘blackberry’, for example, which has been used for centuries to refer to a fruit. In 1999, a new brand of mobile devices was launched with the name BlackBerry. Suddenly, there was a new way of using this old word. ‘Cloud’ is another example of a well-established word whose association with ‘cloud computing’ only emerged in the past couple of decades. Linguists call this phenomenon ‘semantic change’ and have studied its complex mechanisms for a long time. What has changed in recent years is that we now have access to huge collections of data which can be mined to find these changes automatically. Web archives are a great example of such collections, because they contain a record of the changing […]

News@Northeastern: The Race To Save Indigenous Languages, Using Automatic Speech Recognition

News@Northeastern: The Race To Save Indigenous Languages, Using Automatic Speech Recognition. “Growing up in the windy plains near the Northern Cheyenne Indian Reservation, [Michael] Running Wolf says that although his family—which is part Cheyenne, part Lakota—didn’t have daily access to running water or electricity, sometimes, when the winds died down, the power would flicker on, and he’d plug in his Atari console and play games with his sisters. These early experiences would spur forward a lifelong interest in computers, artificial intelligence, and software engineering that Running Wolf is now harnessing to help reawaken endangered indigenous languages in North and South America, some of which are so critically at risk of extinction that their tallies of living native speakers have dwindled into the single digits.”

The ML Glossary: Five years of new language (Google Blog)

Google Blog: The ML Glossary: Five years of new language. “Over guacamole and corn chips at a party, a friend mentions that her favorite phone game uses augmented reality. Another friend points her phone at the host and shouts, ‘Watch out—a t-rex is sneaking up behind you.’ Eager to join the conversation, you blurt, ‘My blender has an augmented reality setting.’ If only you had looked up augmented reality in Google’s Machine Learning Glossary, which defines over 460 terms related to artificial intelligence, you’d know what the heck your friends are talking about. If you’ve ever wondered what a neural network is, or if you chronically confuse the negative class with the positive class at the doctor’s office (‘Wait, the negative class means I’m healthy?’), the Glossary has you covered.” I tried to keep context while not including the “Oh look, you humiliated yourself by not consulting Google” lede, but […]

New York Times: How Word Lists Help — or Hurt — Crossword Puzzles

New York Times: How Word Lists Help — or Hurt — Crossword Puzzles. “If we were to go by the New York Times Crossword, Lake ERIE would be the most dazzling body of water on Earth. Mining ORE would be the most lucrative business venture. According to xwordinfo.com, ERIE is the third most popular word in the New York Times Crossword. It has appeared over 1,350 times. ORE is seventh, with over 1,200 appearances. ORE and ERIE are examples of crosswordese, words that appear often in crossword puzzles but rarely in day-to-day conversation.”

Nature: ‘Tortured phrases’ give away fabricated research papers

Nature: ‘Tortured phrases’ give away fabricated research papers. “In April 2021, a series of strange phrases in journal articles piqued the interest of a group of computer scientists. The researchers could not understand why researchers would use the terms ‘counterfeit consciousness’, ‘profound neural organization’ and ‘colossal information’ in place of the more widely recognized terms ‘artificial intelligence’, ‘deep neural network’ and ‘big data’. Further investigation revealed that these strange terms — which they dub ‘tortured phrases’ — are probably the result of automated translation or software that attempts to disguise plagiarism.”

There’s an official $#@&ing terminology for censoring swears like $#@&: Grawlix (Boing Boing)

Boing Boing: There’s an official $#@&ing terminology for censoring swears like $#@&: Grawlix. “this tweet was the first time I have ever seen a ‘$&%#@!’ word referred to as ‘Grawlix.’ It’s one of those weird linguistic things that I’ve always just accepted, and taken for granted, without considering that someone would have named, identified, and categorized it. According to a 2013 article from Slate, the term ‘grawlix’ was coined by Beetle Bailey creator Mort Walker.”

CNET: Uber will offer free Rosetta Stone to ride-hail and delivery drivers

CNET: Uber will offer free Rosetta Stone to ride-hail and delivery drivers. “The drivers will have free access to all 24 languages Rosetta Stone offers, directly from the Uber Driver app. The partnership will be available to drivers and delivery people who have achieved gold, platinum or diamond status through the Uber Pro program in more than three dozen countries, such as Argentina, Brazil, South Africa, the UK and US. Uber also worked with Rosetta Stone to develop some language education focused on interactions drivers often have with their riders.”

Slator: Searchable Database Gives Users an Overview of Language Policies in Europe

New-to-me, from Slator: Searchable Database Gives Users an Overview of Language Policies in Europe. “The database, called the European Language Monitor (ELM), is searchable for topics such as what language regulations and technologies exist in an EU member country. It is currently divided into four databases according to years of data collection. The goal, to provide up-to-date, ‘qualitative and quantitative data, links to rulings and legislation and other types of documentation.’”

Mind Matters News: How A Searchable Database Is Helping Decipher A Lost Language

Mind Matters News: How A Searchable Database Is Helping Decipher A Lost Language. “There was once a flourishing civilization on the island of Crete called the Minoan culture (3000–1100 B.C.). Two languages are associated with it, Minoan A and, later, Minoan B. Minoan B was deciphered but Minoan A has remained a mystery that has ‘tormented linguists for many decades,’ as Patricia Klaus puts it. Deciphering it would give us a window back as far as 1800 BC.”

CNET: What is cheugy? And how do you know if you’re a cheug?

CNET: What is cheugy? And how do you know if you’re a cheug? “You might have noticed the word ‘cheugy’ popping up online and wondered what it means and how to pronounce it. New slang is a surefire way to make you question your fleeting youth. In this case, that couldn’t be more true. In short, cheugy is a trendy way to say something is passe, and the word’s having a moment on TikTok, where folks are busy labeling what’s cheugy, having existential crises over being cheugy or just embracing life as a cheug.” Oh, so, like, someone who listens to disco and says “groovy” all the time?… oh. >cough<

Science Magazine: Want other scientists to cite you? Drop the jargon

Science Magazine: Want other scientists to cite you? Drop the jargon. “If you want your work to be highly cited, here’s one simple tip that might help: Steer clear of discipline-specific jargon in the title and abstract. That’s the conclusion of a new study of roughly 20,000 published papers about cave science, a multidisciplinary field that includes researchers who study the biology, geology, paleontology, and anthropology of caves. The most highly cited papers didn’t use any terms specific to cave science in the title and kept jargon to less than 2% of the text in the abstract; jargon-heavy papers were cited far less often.”

Moscow Times: Russians Post More Profanities After Social Media Swearing Ban

Moscow Times: Russians Post More Profanities After Social Media Swearing Ban. “Russian-speaking social media users have posted 10% more profanity-laced content in the two months since a law requiring platforms to delete them came into force than before, the RBC news website reported Sunday. The Medialogia media monitor tallied 20.2 million posts containing swear words on Facebook, TikTok, YouTube, Instagram and Twitter, as well as three Russian platforms, from Feb. 1-March 31.”