Dartmouth: Using Social Media Big Data to Combat Prescription Drug Crisis. “Researchers at Dartmouth, Stanford University, and IBM Research, conducted a critical review of existing literature to determine whether social media big data can be used to understand communication and behavioral patterns related to prescription drug abuse. Their study found that with proper research methods and attention to privacy and ethical issues, social media big data can reveal important information concerning drug abuse, such as user-reported side effects, drug cravings, emotional states, and risky behaviors.”
Washington Post: FBI database for gun buyers missing millions of records. “The FBI’s background-check system is missing millions of records of criminal convictions, mental illness diagnoses and other flags that would keep guns out of potentially dangerous hands, a gap that contributed to the shooting deaths of 26 people in a Texas church this week. Experts who study the data say government agencies responsible for maintaining such records have long failed to forward them into federal databases used for gun background checks — systemic breakdowns that have lingered for decades as officials decided they were too costly and time-consuming to fix.”
There is a new subreddit for open source databases. Very small and not much there yet, but I subscribed in a blink.
ZDNet: Open-sourcing data will make big data bigger than ever. “Free software has been with computing since day one, but proprietary software ruled businesses. It took open source and its licenses to transform how we coded our programs. Today, even Microsoft has embraced open source. Now, The Linux Foundation has created a new open license framework, Community Data License Agreement (CDLA), which may do for data what open source did for programming.”
The Next Web: Companies are collecting a mountain of data. What should they do with it? “From our tweets and status updates to our Yelp reviews and Amazon product ratings, the internet-connected portion of the human race generates 2.5 quintillion bytes of computer data every single day. That’s 2.5 million one-terabyte hard drives filled every 24 hours. The takeaway is clear: in 2017, there’s more data than there’s ever been, and there’s only more on the way. So what are savvy companies doing to harness the data that their human users shed on a daily basis?”
NOAA: NOAA and partners release database for research to bridge weather to climate forecast gap. “Wouldn’t it be nice to know now what the weather is going to be like for the vacation you have planned next month? Or, if you’re a farmer, whether you’re going to get enough rainfall during a crucial planting time coming up in a few weeks? Weather forecasts help us make decisions about the next few days to a week, and seasonal climate forecasts give us information on the time scale of three months to a year or more. But a significant gap in scientists’ understanding has limited the ability to forecast what will happen two weeks to two months from now, also called the subseasonal scale…. Two new datasets, funded in part by NOAA Research’s Modeling, Analysis, Predictions, and Projections (MAPP) Program, now provide easy access to 60 terabytes of climate forecasts containing predictions of rainfall, temperature, winds and other variables at the subseasonal level.”
NiemanLab: The internet isn’t forever. Is there an effective way to preserve great online interactives and news apps? “[Meredith] Broussard and colleague Katherine Boss, the librarian for journalism, media, culture, and communication at NYU, are working on a workflow and on building tools to help organizations effectively and efficiently preserve their big data journalism projects, and putting together a scholarly archive of data journalism projects.”