The goal of this assignment was to dive deeper into data analytics. It asked for a paper explaining a data analytics topic, how the topic relates to what we learned in class, and a real-world example of the topic. Below is my assignment on big data.
The world creates 2.5 quintillion bytes of data per day. Every Instagram like, text message, and Google search adds to this number. In fact, 90% of the data in the world was created within the last two years, and the volume doubles every two years. This is big data. Big data is commonly defined by three characteristics: a large volume of data, in a variety of forms (structured and unstructured), arriving at high velocity (often in real time). With all this information coming in, companies can organize it to make well-informed, forward-looking business decisions. Businesses use big data for marketing, expansion, and time and cost efficiency, among other purposes. Big data is first gathered, then stored, retrieved, and finally interpreted to support these important company decisions.
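To put the 2.5 quintillion figure in more familiar units, here is a small arithmetic sketch. It assumes decimal (SI) units, where one exabyte is 10^18 bytes, and a 365-day year; the variable and function names are only illustrative.

```python
# 2.5 quintillion bytes per day, the figure quoted in the paragraph above.
BYTES_PER_DAY = 2_500_000_000_000_000_000

# Assumption: decimal (SI) exabyte, i.e. 10**18 bytes.
BYTES_PER_EXABYTE = 10**18


def to_exabytes(num_bytes):
    """Convert a byte count to exabytes (decimal units)."""
    return num_bytes / BYTES_PER_EXABYTE


daily_eb = to_exabytes(BYTES_PER_DAY)
yearly_eb = daily_eb * 365  # assumption: a 365-day year

print(f"{daily_eb} EB per day, about {yearly_eb} EB per year")
```

Under these assumptions, the daily figure works out to 2.5 exabytes, or a little under a thousand exabytes over a year.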
In our course, we got firsthand experience with some of the databases and tools used to store and retrieve big data. Big data also relates to our lessons on structured vs. unstructured data. Structured data is pre-defined and formatted to a fixed structure; unstructured data is the opposite, neither pre-defined nor formatted. Tables in a relational database are an example of structured data; semi-structured data includes formats such as CSV, JSON, and XML; and unstructured data includes images, free text, and documents. Through databases such as MySQL (a relational database) and MongoDB (a document-oriented, non-relational database), big data is sorted and organized so that it can be easily accessed with queries and interpreted to make decisions. A benefit of relational databases is that each piece of information is stored once, which reduces redundancy. Normalization splits the data into tables that each describe one kind of thing, with no duplicated facts, a very important quality when 2.5 quintillion bytes of data circulate every day.
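The idea of storing each fact once can be shown with a minimal sketch. It uses Python's built-in `sqlite3` module rather than MySQL so that it runs without a server, and the `customers` and `orders` tables and their contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; a stand-in for a relational database like MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical normalized schema: customer details live in one table only,
# and each order points to its customer by id instead of repeating the details.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ana', 'Boston')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 40.0)],
)

JOIN_QUERY = """
SELECT c.name, c.city, o.amount
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
ORDER BY o.order_id
"""

# A query joins the two tables back together when the details are needed.
rows = conn.execute(JOIN_QUERY).fetchall()

# Because the city is stored once, a single update is seen by every order.
conn.execute("UPDATE customers SET city = 'Chicago' WHERE customer_id = 1")
cities_after = [city for _, city, _ in conn.execute(JOIN_QUERY)]

print(rows)
print(cities_after)
```

If the customer's city had been copied into every order row instead, the same change would have to be repeated for each order, which is the redundancy that normalization avoids.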
Almost all companies use relational databases to organize their data, and one specific example is Uber. In fact, Uber uses MySQL! Uber says that MySQL allows its operators to use the data stored in the tables without having to understand the underlying technologies. By using MySQL, Uber can access information about customers and their demographics, as well as their rides and payments. It can also access information about the drivers, their demographics, car details, and all the data in between. With this data, Uber can make decisions about hiring, improving its direction mapping, and advertising its product.
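As a rough illustration of how ride data in tables could feed a decision, the sketch below summarizes trips per driver. The `drivers`, `riders`, and `trips` tables, their columns, and the sample rows are hypothetical and are not Uber's actual schema; it also uses Python's built-in `sqlite3` module in place of MySQL so it can run on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical ride-sharing schema and sample data, for illustration only.
conn.executescript("""
CREATE TABLE drivers (driver_id INTEGER PRIMARY KEY, name TEXT, car TEXT);
CREATE TABLE riders  (rider_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE trips (
    trip_id   INTEGER PRIMARY KEY,
    driver_id INTEGER REFERENCES drivers(driver_id),
    rider_id  INTEGER REFERENCES riders(rider_id),
    fare      REAL
);
INSERT INTO drivers VALUES (1, 'Dana', 'Toyota Prius'), (2, 'Eli', 'Honda Civic');
INSERT INTO riders  VALUES (1, 'Sam'), (2, 'Kim');
INSERT INTO trips   VALUES (1, 1, 1, 12.0), (2, 1, 2, 18.0), (3, 2, 1, 9.0);
""")

# Count trips and average the fare for each driver, busiest driver first.
summary = conn.execute("""
SELECT d.name, COUNT(*) AS trip_count, AVG(t.fare) AS avg_fare
FROM trips AS t
JOIN drivers AS d ON d.driver_id = t.driver_id
GROUP BY d.driver_id
ORDER BY trip_count DESC
""").fetchall()

for name, trip_count, avg_fare in summary:
    print(f"{name}: {trip_count} trips, average fare {avg_fare:.2f}")
```

The person running a query like this only needs to know the table and column names, not how the database stores the rows, and a summary of this kind could inform choices such as where more drivers are needed.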
“Big Data: What It Is and Why It Matters.” SAS, www.sas.com/en_us/insights/big-data/what-is-big-data.html.
Ben Hemo, Itamar. “Big Data Statistics: How Much Data Is There in the World?” Rivery, 12 June 2020, rivery.io/big-data-statistics-how-much-data-is-there-in-the-world/.
Shiftehfar, Reza. “Uber’s Big Data Platform: 100+ Petabytes with Minute Latency.” Uber Engineering Blog, 6 Apr. 2020, eng.uber.com/uber-big-data-platform/.