EurekAlert: One class in all languages. “Now anyone from around the world can listen live to a Nobel Prize Laureate lecture or earn credits from the most reputable universities with nothing more than internet access. However, the possible information to be gained from watching and listening online is lost if the audience cannot understand the language of the lecturer. To solve this problem, scientists at the Nara Institute of Science and Technology (NAIST), Japan, presented a solution with new machine learning at the 240th meeting of the Special Interest Group of Natural Language Processing, Information Processing Society of Japan (IPSJ SIG-NL).”
Ars Technica: Microsoft open sources algorithm that gives Bing some of its smarts. “Microsoft has released today the SPTAG [Space Partition Tree and Graph] algorithm as MIT-licensed open source on GitHub. This code is proven and production-grade, used to answer questions in Bing. Developers can use this algorithm to search their own sets of vectors and do so quickly: a single machine can handle 250 million vectors and answer 1,000 queries per second. There are some samples and explanations in Microsoft’s AI Lab, and Azure will have a service using the same algorithms.”
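For anyone wondering what "searching a set of vectors" means in practice, here is a minimal sketch. It is not SPTAG and does not use Microsoft's code: it is a plain brute-force cosine-similarity search, and the `VECTORS` data, `cosine_similarity`, and `nearest_neighbors` names are made up for illustration. A brute-force scan like this compares the query against every stored vector, which is the cost that tree-and-graph indexes are meant to avoid at large scale.

```python
import math

# Toy, hand-made vectors (hypothetical data, not from any real model).
VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, vectors, k=2):
    """Brute-force search: score every stored vector, keep the top k names."""
    scored = [(cosine_similarity(query, v), name) for name, v in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

if __name__ == "__main__":
    print(nearest_neighbors([1.0, 0.0, 0.0], VECTORS, k=2))
```

The query vector here points along the first axis, so the two stored vectors that lean the same way rank ahead of the one that does not.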
Quartz: The emails that brought down Enron still shape our daily lives. “The Enron Corpus, as the collection is known, has been used in more than 100 projects since that research team presented it to the public in 2004. As the biggest public collection of natural written language in an organizational setting, it has been used to study everything from statistics to artificial intelligence to email attachment habits. An online art project by two Brooklyn artists will send every single one of the emails to your personal inbox, a process which (depending on the frequency of emails you request) will take anywhere from seven days to seven years.”
Towards Data Science: I trained fake news detection AI with >95% accuracy, and almost went crazy. “With so many advances in Natural Language Processing and machine learning, I thought maybe, just maybe, I could make a model that could flag news content as fake, and perhaps take a bite out of the devastating consequences of the proliferation of fake news.”
Researchers at Yahoo have developed an abuse-detecting algorithm. “The Yahoo team used a number of conventional techniques, including looking for abusive keywords, punctuation that often seemed to accompany abusive messages, and syntactic clues as to the meaning of a sentence. But the researchers also applied a more advanced approach to automated language understanding, using a way of representing the meaning of words as vectors with many dimensions.” The technique has a success rate of about 90%, which is wow.
Geektime has a writeup on a tool that translates natural language questions into SQL queries. “Kueri’s system enables developers to implant a unique search box within apps. The search box knows how to take questions from end users in natural language … and translate them into SQL queries in real time. The app can run the queries through the database and display the results to the user. In addition, in order to make it even easier for the end user, it facilitates automatic completion during typing, with completions of words and smart suggestions according to the context of the search and database.”
Google has launched a new natural language API. “Google today announced the public beta launch of its Cloud Natural Language API, a new service that gives developers access to Google-powered sentiment analysis, entity recognition, and syntax analysis.”