Database, what is it?

Every beginner in data analysis eventually comes to the same conclusion: it is impossible to work without access to a database and its millions of rows of data.

So what is a database, and how do you deal with it?

A database is storage for data: it saves the data, organizes it, and delivers it on call.

There are 2 types of databases:

  1. RDBMS - relational database management system, where data is stored in rows and columns that make up tables.
  2. NoSQL - non-relational database, which stores unstructured or loosely structured data that does not fit neatly into rows and columns. For example documents, audio files, images.
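To make "rows and columns" a bit more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module only as a small stand-in for an RDBMS, and the customers table with its three columns is a made-up example, not something from a real project.

```python
import sqlite3

# In-memory SQLite database, used here only as a small stand-in for an RDBMS.
conn = sqlite3.connect(":memory:")

# A table is defined by its columns (hypothetical example schema).
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")

# Each inserted tuple becomes one row of the table.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Anna", "Kyiv"), (2, "Ben", "Berlin"), (3, "Chloe", "Paris")],
)

cursor = conn.execute("SELECT * FROM customers ORDER BY id")
columns = [col[0] for col in cursor.description]  # column names
rows = cursor.fetchall()                          # list of row tuples

print(columns)
for row in rows:
    print(row)

conn.close()
```

Every row has exactly one value for each column, and that is what makes the data easy to filter and combine later.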

RDBMS is the one we are interested in now. As I said earlier, it can deliver data on call, and it receives this call through SQL (Structured Query Language), the language used to communicate with databases. We can not only retrieve data but also process, filter, sort, and aggregate it before retrieving it, which gives us a great opportunity to get exactly the data we need.
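Here is a rough sketch of what filtering, sorting, and aggregating in a single query can look like. The orders table and its values are invented for illustration, and the query runs on SQLite through Python's sqlite3 module, so treat it as a small local stand-in rather than the setup you would have at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical example data: one row per order.
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Anna", 20.0, "paid"),
        ("Anna", 35.0, "paid"),
        ("Ben", 50.0, "paid"),
        ("Ben", 15.0, "cancelled"),
        ("Chloe", 10.0, "paid"),
    ],
)

query = """
    SELECT customer, SUM(amount) AS total_paid   -- aggregate per customer
    FROM orders
    WHERE status = 'paid'                        -- filter out cancelled orders
    GROUP BY customer
    ORDER BY total_paid DESC                     -- sort, biggest total first
"""
result = conn.execute(query).fetchall()

for customer, total_paid in result:
    print(customer, total_paid)

conn.close()
```

The database does the work and hands back only three summary rows instead of all five orders, which matters a lot more once the table has millions of rows.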

There is only one Standard SQL, but it has many dialects. How did that happen?

Many database management systems add functionality and change the syntax to suit their needs. As a result, even if you know Standard SQL, you still need to learn some specifics, depending on the database you are using.
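One small illustration of a dialect difference, as far as I understand it: the SQL standard caps the number of returned rows with FETCH FIRST n ROWS ONLY, while SQLite (used below through Python's sqlite3 module) expects LIMIT n and rejects the standard spelling. The cities table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical example table.
conn.execute("CREATE TABLE cities (name TEXT)")
conn.executemany(
    "INSERT INTO cities VALUES (?)",
    [("Berlin",), ("Kyiv",), ("Paris",)],
)

# SQLite's way of capping the number of rows.
limit_rows = conn.execute(
    "SELECT name FROM cities ORDER BY name LIMIT 2"
).fetchall()

# The spelling from the SQL standard, which SQLite does not parse.
try:
    conn.execute("SELECT name FROM cities ORDER BY name FETCH FIRST 2 ROWS ONLY")
    standard_form_accepted = True
except sqlite3.OperationalError:
    standard_form_accepted = False

print(limit_rows)
print(standard_form_accepted)

conn.close()
```

The idea is the same in both queries, only the wording changes, and that is usually what learning a new dialect comes down to.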

I’ve done a couple of online courses on SQL, initially with different dialects of it, but lately I began to use BigQuery, which uses Standard SQL, and I’m totally fine with it now. I think if you know Standard SQL, you can pretty easily switch to whatever dialect your company is using. So all my future code examples will be in Standard SQL.

BigQuery is a data warehouse with built-in BI features for working with data. You can use it as storage and work with your data within BigQuery, or use BigQuery to access your data from other sources. And what is most important for us, with its help we can query our own datasets or use the public datasets that are already available in this data warehouse.

To get started, go to the BigQuery page in the Google Cloud Console, create an account, and choose the Sandbox option, which was originally made for first-time customers to try it out before buying a paid version with many more options. But aren’t we the future first customers?

Maryna Demchenko's website. I use this website to share my experience of becoming a data analyst.

Copyright © 2021
