Global Investigative Journalism Network: Video Resources for Data Investigations. “For 20 years, GIJN conferences have helped spread data journalism around the world. Our last Global Investigative Journalism Conference — GIJC21, held in November — was no different. GIJN’s first fully online conference featured a full track of data workshops and panels, ranging from analysis with spreadsheets and SQL to programming with R and Python, from tips on scraping and cleaning to data visualization and social network mapping. The sessions were led by a team of all-star trainers from seven countries. This is the second installment of GIJC21 videos, which until now have been available only to conference attendees.” All the videos I spot-checked had captions.
Ars Technica: Rookie coding mistake prior to Gab hack came from site’s CTO. “Over the weekend, word emerged that a hacker breached far-right social media website Gab and downloaded 70 gigabytes of data by exploiting a garden-variety security flaw known as an SQL injection. A quick review of Gab’s open source code shows that the critical vulnerability—or at least one very much like it—was introduced by the company’s chief technology officer.”
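For anyone who hasn't run into SQL injection before, here is a minimal sketch of the general class of flaw. This is not Gab's code; the table, function names, and payload are made up for illustration, and it uses Python's built-in sqlite3 module rather than whatever stack the site runs. The unsafe version splices user input into the query text, while the safe version passes it as a bound parameter.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself,
    # so a crafted value can change the meaning of the query.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the driver binds the value separately from the SQL text,
    # so the input is only ever treated as data.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"  # illustrative attack string

leaked = find_user_unsafe(conn, payload)   # the OR clause matches every row
blocked = find_user_safe(conn, payload)    # no user has this literal name

print(len(leaked), len(blocked))
```

With the unsafe function, the payload turns the WHERE clause into a condition that is always true, so every row comes back. The parameterized version looks for a user literally named `nobody' OR '1'='1` and finds nothing.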
Medium: Analysis of Google Political Ads using BigQuery. “Hello everyone, this is my first article on Medium. I have been interested in data science and analytics while working on my Masters project. I have tried my hand with different beginner datasets to learn some of the basics of Python, SQL, and other languages. However, I felt that repeating the same exercises got boring after a while, and I started losing interest in the subject. Then I got a hold of Google Cloud Services and the BigQuery platform.”
Code (Love): 21 of the best free resources to learn SQL. “I self-taught myself SQL after I bombed a technical interview that involved SQL. It got me a bit mad at myself, so I went ahead and started looking for different resources to help me practice and learn SQL. I wasn’t looking to spend any money so I focused on getting the best free resources. The list below is the fruit of my efforts. I hope that it helps you on your journey to learn SQL.”
TechCrunch: Salesforce is using AI to democratize SQL so anyone can query databases in natural language. “SQL is about as easy as it gets in the world of programming, and yet its learning curve is still steep enough to prevent many people from interacting with relational databases. Salesforce’s AI research team took it upon itself to explore how machine learning might be able to open doors for those without knowledge of SQL.”
TechCrunch: Google launches Cloud Spanner, its new globally distributed relational database service. “Google today announced the beta launch of Cloud Spanner, a new globally distributed database service for mission-critical applications. Cloud Spanner joins Google’s other cloud-based database services, like Bigtable, Cloud SQL and the Cloud Datastore, but with the crucial difference of offering developers the best of both traditional relational databases and NoSQL databases — that is, transactional consistency with easy scalability.”
InfoQ: Google BigQuery Adds New Public Datasets. “Stack Overflow recently announced making its dataset available through Google’s BigQuery. Using regular SQL statements, developers can query the full set of Stack Overflow data including posts, votes, tags, and badges. Using BigQuery’s REST API developers can export data on demand using their tool of choice. Available datasets in BigQuery can be JOINed using plain SQL allowing developers to derive useful insights across domains.” Other datasets are available as well.
This is for SQL fans only. My SQL chops are not great, but what I could understand out of this paper I liked: SQL Query Parser: An Automated Tool for Translating the Queries Into Spreadsheets. “Many people find difficulties in working with databases queries so they completely want to migrate from databases to another application. Hence here it is the solution to combine the database concept with another application to make it simple and easy. The concept called spreadsheet is combined with the databases which forms a method which is called as a SQL Query Parser. Spreadsheets are the most popular application for data analysis and manipulations. Thus SQL Query Parser is an automated tool which translates the Database SQL query to Formula based Spreadsheet. Also it parses the statements into the parse tree and generates the syntax tree providing validation to the statements at an early stage.”
Data analysis without tears: is this a new trend, something I’m just paying attention to because I’m currently up to my elbows in MySQL, or something else? I don’t know. I do know that VQL sounds pretty cool. “VQL connects to a SQL database or Relational Database Systems, such as PostgreSQL, Amazon Redshift and Heroku. It can also upload data from a CSV or spreadsheet. In all cases, the solution imports the information, predicts column categories and automatically divides the data into a comprehensive table in a matter of minutes. Users can then make instant inquiries sans code, searching for certain text, numbers and dates throughout the dataset. If they aren’t feeling the spreadsheet layout, they can also create histograms – similar to Excel, but with a lot more information.”
Geektime has a writeup on a tool that translates natural language questions into SQL queries. “Kueri’s system enables developers to implant a unique search box within apps. The search box knows how to take questions from end users in natural language … and translate them into SQL queries in real time. The app can run the queries through the database and display the results to the user. In addition, in order to make it even easier for the end user, it facilitates automatic completion during typing, with completions of words and smart suggestions according to the context of the search and database.”
And in our “shut up and take my money” department, we’ve got an MIT project that’s got me drooling. Democratizing databases: With a new tool, any competent spreadsheet user can construct custom database interfaces. “New software from researchers at MIT’s Computer Science and Artificial Intelligence Laboratory could make databases much easier for laypeople to work with. The program’s home screen looks like a spreadsheet, but it lets users build their own database queries and reports by combining functions familiar to any spreadsheet user. Simple drop-down menus let the user pull data into the tool from multiple sources. The user can then sort and filter the data, recombine it using algebraic functions, and hide unneeded columns and rows, and the tool will automatically generate the corresponding database queries.”
A new, free tool helps find sensitive data in SQL databases (PRESS RELEASE). “IDERA SQL Column Search searches the string definitions of table columns in an SQL Server database to match them to a set of user-defined strings. Rather than searching the data itself, the tool searches the column name definitions for typically sensitive words like social security, date of birth or account number. The flexible design comes preconfigured with 45 common sensitive data search strings with the ability to easily create custom strings and multiple search profiles.”
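The idea of searching column names rather than the data itself is easy to sketch. The example below is not IDERA's tool or its API. It is a rough illustration that assumes a SQLite database (via Python's built-in sqlite3 module) instead of SQL Server, and the search terms, table names, and the `find_sensitive_columns` helper are all hypothetical.

```python
import sqlite3

# Hypothetical search strings; a real profile would be longer and tuned per site.
SENSITIVE_TERMS = ["ssn", "social_security", "date_of_birth", "account_number"]


def find_sensitive_columns(conn, terms=SENSITIVE_TERMS):
    """Return (table, column) pairs whose column name contains a search term."""
    hits = []
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    for table in tables:
        # Quote the identifier so unusual table names do not break the PRAGMA.
        quoted = '"' + table.replace('"', '""') + '"'
        for column in conn.execute(f"PRAGMA table_info({quoted})"):
            column_name = column[1]  # table_info rows are (cid, name, type, ...)
            if any(term in column_name.lower() for term in terms):
                hits.append((table, column_name))
    return hits


# Demo schema, invented for this example. No rows are needed because
# only the column definitions are inspected.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, full_name TEXT, date_of_birth TEXT, ssn TEXT)"
)
conn.execute("CREATE TABLE orders (id INTEGER, total REAL, account_number TEXT)")

hits = find_sensitive_columns(conn)
for table, column_name in hits:
    print(f"{table}.{column_name}")
```

Matching on names alone is cheap and never touches the stored values, but it only flags columns that are named descriptively. A sensitive field called `field7` would slip through.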