Census Bureau Building Tool to Scrape Tax Data
The folks at the Census Bureau are working on a tool to scrape tax data from the Web. “…researchers at the Census Bureau are studying and applying methods for unstructured data, text analytics and machine learning. These methods belong to the realm of ‘Big Data.’ Big Data refers to large and frequently generated datasets representing a variety of structures. As opposed to designed survey data, Big Data are ‘found’ or ‘organic’ data. Typically, these data are created for a click log, a social media blog or an online PDF report, but are innovatively repurposed and used for something else such as inferring behavior. Since the data were not specifically designed to infer, they often have unique challenges.”