Search Engine Journal: How to Block ChatGPT From Using Your Website Content. “There is concern about the lack of an easy way to opt out of having one’s content used to train large language models (LLMs) like ChatGPT. There is a way to do it, but it’s neither straightforward nor guaranteed to work.” Unlike a lot of the “how to” articles I index, this one is fairly speculative. It's useful, with lots of good information, but speculative.
BusinessWire: Encord Launches Open Source Active Learning Toolkit to Speed Up Real-World Applications of Computer Vision (PRESS RELEASE). “Encord, the platform for data-centric computer vision, has released Encord Active, a free open source industry agnostic toolkit that enables machine learning (ML) engineers and data scientists to understand and improve their training data quality and help boost model performance.”
Data Descriptor: A crowdsourced dataset of aerial images with annotated solar photovoltaic arrays and installation metadata. “Overhead imagery is increasingly being used to improve the knowledge of rooftop PV installations with machine learning models capable of automatically mapping these installations. However, these models cannot be reliably transferred from one region or imagery source to another without incurring a decrease in accuracy. To address this issue, known as distribution shift, and foster the development of PV array mapping pipelines, we propose a dataset containing aerial images, segmentation masks, and installation metadata (i.e., technical characteristics).”