Scientific Data: LocalView, a database of public meetings for the study of local politics and policy-making in the United States

Scientific Data: LocalView, a database of public meetings for the study of local politics and policy-making in the United States. “This article introduces LOCALVIEW, the largest existing dataset of real-time local government public meetings–the central policy-making process in local government. In sum, the dataset currently covers 139,616 videos and their corresponding textual and audio transcripts of local government meetings publicly uploaded to YouTube–the world’s largest public video-sharing website–from 1,012 places and 2,861 distinct governments across the United States between 2006–2022.”

Google Blog: Using new technology and old books to combat disease

Google Blog: Using new technology and old books to combat disease. “Hundreds of millions of people are affected by insect-borne diseases every year, and climate change is only making the problem worse. Increases in temperature and rainfall have expanded the range of insects, including ticks and mosquitos, contributing to outbreaks of diseases such as dengue fever, Lyme disease and malaria. Where can humanity find answers to the newest challenges? One idea: old books.”

FathomNet: A global image database for enabling artificial intelligence in the ocean (Nature)

Nature: FathomNet: A global image database for enabling artificial intelligence in the ocean. “Recent advances in machine learning enable fast, sophisticated analysis of visual data, but have had limited success in the ocean due to lack of data standardization, insufficient formatting, and demand for large, labeled datasets. To address this need, we built FathomNet, an open-source image database that standardizes and aggregates expertly curated labeled data.”

University of Michigan: Open source platform enables research on privacy-preserving machine learning

University of Michigan: Open source platform enables research on privacy-preserving machine learning. “The biggest benchmarking data set to date for a machine learning technique designed with data privacy in mind has been released open source by researchers at the University of Michigan. Called federated learning, the approach trains learning models on end-user devices, like smartphones and laptops, rather than requiring the transfer of private data to central servers.”

Scientific Data: The Multilingual Picture Database

Scientific Data: The Multilingual Picture Database. “In this paper we present the Multilingual Picture (Multipic) database, containing naming norms and familiarity scores for 500 coloured pictures, in thirty-two languages or language varieties from around the world. The data was validated with standard methods that have been used for existing picture datasets. This is the first dataset to provide naming norms, and translation equivalents, for such a variety of languages; as such, it will be of particular value to psycholinguists and other interested researchers. The dataset has been made freely available.”

LitHub: How Empirical Databases Have Changed Our Understanding of Early American Slavery

LitHub: How Empirical Databases Have Changed Our Understanding of Early American Slavery. “In historical scholarship during the early 21st century, some of these new methods and tools of truth-seeking have been put to work on a large scale in the history of slavery and race in America. Among the most important and useful of these tools are the careful construction of empirical databases. Increasingly, this work has been done by teams of scholars, who combine traditional sources with digital methods on a new scale.”

Monterey Herald: Monterey Bay Aquarium shares a treasure trove of data about young white sharks

Monterey Herald: Monterey Bay Aquarium shares a treasure trove of data about young white sharks. “The Monterey Bay Aquarium and its collaborators have released a cache of data about great white sharks they’ve been collecting for over 20 years. Earlier this month, an international team of scientists and aquarists led by John O’Sullivan, the director of collections at the Monterey Bay Aquarium and Chris Lowe of CSU Long Beach published a dataset… containing decades’ worth of information about juvenile white sharks. Researchers all over the world can now use the data to help them understand where white sharks go during their seasonal migrations, what ocean conditions they prefer and how they interact with other fish.”

NIWA: Easy access to environmental research data

National Institute of Water and Atmospheric Research (NIWA): Easy access to environmental research data. “New Zealand’s seven Crown Research Institutes (CRIs) have created the National Environmental Data Centre (NEDC) website to make the environmental information held by CRIs more accessible to all New Zealanders. The datasets include a huge range of information from climate and atmosphere, freshwater, land and oceans, including biodiversity and geological data.”

Press release: Big data in geochemistry for international research (University of Göttingen)

University of Göttingen: Press release: Big data in geochemistry for international research. “Large data sets are playing an increasingly important role in solving scientific questions in geochemistry. Now the University of Göttingen has inherited GEOROC, the largest geochemical database for rocks and minerals from the Max Planck Institute for Chemistry (Mainz). The database has been revised and modernised in its structure and made available to its global users in a new form. The ‘GEOROC’ database, the largest global data collection of rock and mineral compositions, currently contains analyses from over 20,000 individual publications (the oldest dating back to 1883) from 614,000 samples. Together, these data represent almost 32 million individual analytical values.”

Our most dangerous streets: Huge new collision database points to Toronto’s postwar suburbs (Toronto Star)

Toronto Star: Our most dangerous streets: Huge new collision database points to Toronto’s postwar suburbs. “A Star analysis of a huge new database of Toronto traffic collisions is shining a bright spotlight on a distinctly suburban problem. The new data set, much larger and more complete than any previously available records, offers a comprehensive account of nearly 500,000 collisions reported to Toronto police between 2014 and 2021, most mapped to the nearest intersection.”

Butterfly Conservation: Database brings together all known ecological facts about UK butterflies and moths for the first time

Butterfly Conservation: Database brings together all known ecological facts about UK butterflies and moths for the first time. “Butterfly Conservation and the UK Centre for Ecology & Hydrology have worked together on the database, which has collated information that previously existed in a wide range of sources such as field guides, books and journals. Until now, most of this information wasn’t available in a single location nor in a digital format. The new database has brought this information into one usable, digital resource. This involved many months of inputting data from books into spreadsheets, categorising data, and condensing the data into a suitable format for use in data analysis software such as R.”