Quartz: Analysis of 500 million Reddit comments shows how the alt-right made the alt-left a thing

Quartz: Analysis of 500 million Reddit comments shows how the alt-right made the alt-left a thing. “By taking a deep-dive into the data, we can see the different ways in which the alt-right have attempted to capitalize on Trump’s speech and the opportunity to turn the ‘alt-left’ into a mainstream political concept. Through examining the last six months of Reddit comments—all half a billion of them—we can see the intensity with which /r/The_Donald has attempted to push the focus of public conversation toward condemnation of the left, not the right.”

Eos: A New Tool for Deep-Down Data Mining

Eos: A New Tool for Deep-Down Data Mining. “The primary goal of our U.S. National Science Foundation EarthCube building block project, GeoDeepDive, is to facilitate the creation and augmentation of literature-derived databases and to leverage published knowledge and past investments in data acquisition. The project combines library science (the aggregation and curation of digital documents and bibliographic metadata), geoscience (the generation of research questions and labeling of terms in externally managed scientific ontologies), and computer science (the use of high-throughput computing infrastructure and machine reading systems to parse and extract data from millions of documents). Here we report on the status of this project and describe how the GeoDeepDive infrastructure can be used in scientific research and education applications.”

VentureBeat: Google launches Cloud Dataprep in public beta to help companies clean their data before analysis

VentureBeat: Google launches Cloud Dataprep in public beta to help companies clean their data before analysis. “At its Google Cloud Next conference in San Francisco back in March, Google unveiled Cloud Dataprep, a service that lets companies clean their structured and unstructured datasets for analysis in, for example, Google’s BigQuery, or even for use in training machine learning models. Over the past six months, Cloud Dataprep has been in private beta, but Google is now officially graduating the service to public beta for anyone to use.”

The Atlantic: The Case for Sharing All of America’s Data on Mosquitoes

The Atlantic: The Case for Sharing All of America’s Data on Mosquitoes. “For decades, agencies around the United States have been collecting data on mosquitoes. Biologists set traps, dissect captured insects, and identify which species they belong to. They’ve done this for millions of mosquitoes, creating an unprecedented trove of information—easily one of the biggest long-term attempts to monitor any group of animals, if not the very biggest. The problem, according to Micaela Elvira Martinez from Princeton University and Samuel Rund from the University of Notre Dame, is that this treasure trove of data isn’t all in the same place, and only a small fraction of it is public. The rest is inaccessible, hoarded by local mosquito-control agencies around the country.”

Pennsylvania Historic Preservation: Datasets on PA Historical Markers and Agriculture Now Online

Pennsylvania Historic Preservation: Datasets on PA Historical Markers and Agriculture Now Online. “The Pennsylvania Historical and Museum Commission (PHMC) is pleased to announce the recent addition of two datasets to OpenDataPA: Pennsylvania Historical Markers and the 1850 Agricultural Production in Pennsylvania. Wondering what datasets and open data have to do with the PA SHPO? Lots, apparently.”

Vice: You Can Now Download Information From Every Congressional Session Since 1973

Vice: You Can Now Download Information From Every Congressional Session Since 1973. “Since 2009, developers have been able to use the ProPublica Congress API (first developed by The New York Times) to retrieve data about the thousands of bills introduced during every two-year session in the House of Representatives. Until now though, you had to download each piece of information separately, and you needed to know how to write API calls…. That’s no longer the case. Wednesday, ProPublica announced that you can now download all the information about all of the bills in each legislative session using its new bulk bill data set.”

TechCrunch: Salesforce is using AI to democratize SQL so anyone can query databases in natural language

TechCrunch: Salesforce is using AI to democratize SQL so anyone can query databases in natural language. “SQL is about as easy as it gets in the world of programming, and yet its learning curve is still steep enough to prevent many people from interacting with relational databases. Salesforce’s AI research team took it upon itself to explore how machine learning might be able to open doors for those without knowledge of SQL.”