T-SQL Tuesday is a monthly blog party for the SQL Server community. This month’s party is hosted by Rob Volk (b|t), who has asked us to share analogies that help explain databases in simple terms or, as he puts it, to “explain databases like I’m five.”
Recently I’ve been doing more with Apache Spark and Databricks, and as a result I’ve been using Parquet files, which store data in a columnar format. Because of this, I needed an analogy to explain how columnar databases store information and why that layout can be better for querying large amounts of data.
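The row-versus-column idea can be sketched in plain Python without any Parquet library (the data below is made up for illustration): a row store keeps each record together, while a column store keeps each field together, so a query that touches only one column reads only that column’s values.

```python
# Row-oriented storage: each record's fields are kept together.
rows = [
    {"id": 1, "name": "Alice", "amount": 120},
    {"id": 2, "name": "Bob",   "amount": 75},
    {"id": 3, "name": "Carol", "amount": 200},
]

# Column-oriented storage (the Parquet idea): each field's values
# are kept together in one contiguous list.
columns = {
    "id":     [1, 2, 3],
    "name":   ["Alice", "Bob", "Carol"],
    "amount": [120, 75, 200],
}

# SELECT SUM(amount): the row store must walk every full record...
total_from_rows = sum(r["amount"] for r in rows)

# ...while the column store reads just the one column it needs.
total_from_columns = sum(columns["amount"])

assert total_from_rows == total_from_columns == 395
```

On disk the difference matters even more: a columnar file can skip the `id` and `name` bytes entirely, and values of one type compress far better when stored together.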
T-SQL Tuesday is a monthly blog party for the SQL Server community. This month’s party is hosted by Elizabeth Noble (b|t), who has asked us to write about automating our stress away with a story about something we automated to make our lives easier.
Maybe I was influenced by the DevOps example that Elizabeth gave in her invitation post, but the first thing I thought of was a project I worked on a few years back using Amazon’s Redshift data warehouse. For most of the project there were three of us, and we all came from a more Microsoft-centric background. Part of the project involved a SQL Server database, and we had designed an excellent deployment pipeline for it. We aimed to release about once a week, and the SQL Server portion could be done in minutes, but we had not found any good CI/CD tools for Redshift, and that release often used up most of the workday, usually with rollbacks and some investigation into why the Dev, QA, and Prod environments differed.
The simple answer to the question “What is Azure Databricks?” is that it is a Databricks Workspace integrated into the Azure platform. This leads to the question “What is Databricks?” The simple answer to that is that Databricks is a platform that allows you to easily manage and interact with an Apache Spark analytics service, which of course leads to the question “What is Apache Spark?”
If you are curious about Cosmos DB, the link above is a good place to start. The current definition of the service is much longer than that one sentence, but I like the simplicity of it. For many of the developers Cosmos is targeted towards, this is really enough to know: Cosmos is a database service that you can use as the backend of your application. It works anywhere in the world that you can connect to Azure services, it works relatively quickly, and it allows you to connect through a variety of API models. Some of the more common ones are the SQL Core API (which grew out of Azure DocumentDB), the MongoDB API, and a graph API using Gremlin.
It is the last week of Code Chrysalis, and the last two weeks have been spent working on our final team project. In addition to everything we learned during the first half of the course, we learned several new things to complete the project, including the Facebook Graph API, the NYT Developer API, the Python Requests-HTML library, and AWS Lambda functions. We also stretched our CSS and design muscles a bit.
For weeks nine and ten of Code Chrysalis, the pace and focus have changed. I am now working on a final project with two of my classmates, and we are using everything we learned in the course (and then some) to create an amazing web application, which we will demo in two weeks. The experience has shifted from an intensive learning environment to something that feels more like going to a job every day. A job that you really like.
Week six of Code Chrysalis is over. The highlight of this week was visiting Pivotal Labs in Tokyo and learning their style of agile development, called Lean XP.
In addition, each student created an MVP* of an application in two days. We started by brainstorming ideas and then each turned an idea into reality. If you are looking for a good book to read, check out my MVP app, What The Heck Should I Read.