The Guardian: Could the global Covid death toll be millions higher than thought? “The World Mortality Dataset contains information on more than 100 countries. Among those missing are most African and many Asian countries, including some of the world’s most populous and – judging by news reports and other sources – worst-affected. India, for example, does not routinely release national vital data, yet some researchers estimate its Covid death toll could be as high as 4 million.”
Penn State News: What was really the secret behind Van Gogh’s success? “By using artificial intelligence to mine big data related to artists, film directors and scientists, the researchers discovered this pattern is not uncommon but, instead, a magical formula. Hot streaks, they found, directly result from years of exploration (studying diverse styles or topics) immediately followed by years of exploitation (focusing on a narrow area to develop deep expertise).”
Wired: Humans Can’t Be the Sole Keepers of Scientific Knowledge. “Writing scientific knowledge in a programming-like language will be dry, but it will be sustainable, because new concepts will be directly added to the library of science that machines understand. Plus, as machines are taught more scientific facts, they will be able to help scientists streamline their logical arguments; spot errors, inconsistencies, plagiarism, and duplications; and highlight connections. AI with an understanding of physical laws is more powerful than AI trained on data alone, so science-savvy machines will be able to help future discoveries. Machines with a great knowledge of science could assist rather than replace human scientists.” I have so many conflicting thoughts about this article that I gave myself a headache. Be warned.
Washington Post: Messy, incomplete U.S. data hobbles pandemic response. “The contentious and confusing debate in recent weeks over coronavirus booster shots has exposed a fundamental weakness in the United States’ ability to respond to a public health crisis: The data is a mess. How many people have been infected at this point? No one knows for sure, in part because of insufficient testing and incomplete reporting. How many fully vaccinated people have had breakthrough infections? The Centers for Disease Control and Prevention decided to track only a fraction of them. When do inoculated people need booster shots? American officials trying to answer that have had to rely heavily on data from abroad.”
Wolfram Blog: Analyzing Episode Data for The Office Series with the Wolfram Language. “Which episode was best for laughs? How did the episodes vary over the course of a season? Which season is the best? (According to Kevin, while every season thinks it’s the best, nothing beats the cookie season.) So here, I endeavor to present that additional analysis.” I have never seen the show, so I didn’t get some of the explanatory references. I need to find a version of this for Monty Python or MST3K.
BioSpectrum Asia: Korea to establish national digital library on health and genome data by 2028. “The second pilot project will analyze the genetic makeup of 12,500 donated DNA samples from Korean patients living with a rare disease. Over the next year, the resulting data will be used by the Illumina-backed consortium to prepare for the main project in analyzing and comparing the genes of 1 million Koreans to advance the country’s medical technology and improve future public health.”
The Markup: The Secret Bias Hidden in Mortgage-Approval Algorithms. “An investigation by The Markup has found that lenders in 2019 were more likely to deny home loans to people of color than to White people with similar financial characteristics—even when we controlled for newly available financial factors that the mortgage industry for years has said would explain racial disparities in lending.”
New York Times: Show Me the Data! “Who should get vaccine booster shots and when? Can vaccinated people with a breakthrough infection transmit the virus as easily as unvaccinated people? How many people with breakthrough infections die or get seriously ill, broken down by age and underlying health conditions? Confused? It’s not you. It’s the fog of pandemic, in which inadequate data hinders a clear understanding of how to fight a stealthy enemy.”
UC Santa Barbara: Taming Satellite Data. “More than 700 imaging satellites orbit the Earth, and every day they beam vast amounts of information to databases on the ground. There’s just one problem: While the geospatial data could help researchers and policymakers address critical challenges, only those with considerable wealth and expertise can access it. Now, a team of scientists, including UC Santa Barbara’s Tamma Carleton… has devised a machine learning system to tap the problem-solving potential of satellite imaging.”
Scientific Data: The Upworthy Research Archive, a time series of 32,487 experiments in U.S. media. “This archive records the stimuli and outcome for every A/B test fielded by Upworthy between January 24, 2013 and April 30, 2015. In total, the archive includes 32,487 experiments, 150,817 experiment arms, and 538,272,878 participant assignments. The open access dataset is organized to support exploratory and confirmatory research, as well as meta-scientific research on ways that scientists make use of the archive.”
Tech Xplore: Turning network traffic data into music. “Cybersecurity analysts deal with an enormous amount of data, especially when monitoring network traffic. If one were to print the data in text form, a single day’s worth of network traffic may be akin to a thick phonebook. In other words, detecting an abnormality is like finding a needle in a haystack.”