Distributed Data Technologies: Hadoop

For this assignment I will be examining Hadoop and its role as a distributed data technology. The topic matters to the business world because distributed data technologies are a key contributor to the data analytics field. Their purpose is to store and process large amounts of data, also known as Big Data. Hadoop is one of the most popular of these technologies: it is an open source software platform that can store massive data sets and analyze data that is spread across many different servers (Proffitt).

 

The ultimate purpose of Hadoop for companies is to reduce the risk and inefficiency of storing data on the company’s own hardware and running analyses across the company server(s). Hadoop relates to the material covered in MIS 2502 because it shows the scale and necessity of big data in the real world. In class we learn about big data as a concept, and we analyze individual data sets as a learning exercise. However, we don’t see the company side of things, that is, how overwhelming big data can be when there are several thousand data sets that must all be stored somewhere and put to use. Essentially, without big data these analyses would not be possible, but without software to handle it, big data would not be a usable resource. This is something we don’t talk about in MIS 2502; however, it is covered in MIS 0855, as it is a vital concept to understand in business.

 

Hadoop has been a fundamental resource for many companies. Its origins date to the early 2000s (under other names, as part of the Apache Nutch project); it became its own top-level Apache project in 2008 and gained significant popularity as technology grew. Since then it has accumulated a variety of different projects (services offered), each serving a different purpose, from governance and integration to data access. For example, a key part of Hadoop is its distributed file system (HDFS), which is its solution for data management (“Apache Hadoop Ecosystem and Open Source Big Data Projects”). HDFS is essential for the many companies that rely on storing large amounts of data. In practice, it is applied when a company has data that doesn’t fit on its hardware and is slowing down its systems.
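To make the idea of a distributed file system more concrete, here is a small Python sketch of how a large file might be split into blocks and copied onto several servers. This is only a toy illustration and not Hadoop's actual code: the function name `place_blocks`, the node names, and the round-robin placement are my own assumptions. The 128 MB block size and three copies per block match HDFS's usual defaults, but real HDFS chooses nodes with a more careful, rack-aware policy.

```python
import math

BLOCK_SIZE_MB = 128  # typical HDFS default block size (assumed here)
REPLICATION = 3      # typical HDFS default number of copies per block


def place_blocks(file_size_mb, nodes,
                 block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Split a file into blocks and assign each block to distinct nodes.

    Hypothetical helper for illustration only: it uses simple round-robin
    placement, whereas real HDFS placement is rack-aware.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block_id in range(num_blocks):
        # Pick `replication` consecutive nodes, wrapping around the list.
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


# A 1000 MB file stored on a hypothetical four-server cluster.
layout = place_blocks(1000, ["node1", "node2", "node3", "node4"])
for block_id, replicas in layout.items():
    print(f"block {block_id}: {replicas}")
```

Because every block lives on more than one server, the file survives if a single server fails, and no single machine has to hold the whole file. That is the storage problem described above.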

 

Works Cited

“Apache Hadoop Ecosystem and Open Source Big Data Projects.” Hortonworks, hortonworks.com/ecosystems/.

Proffitt, Brian, et al. “Hadoop: What It Is And How It Works.” ReadWrite, 23 May 2013, readwrite.com/2013/05/23/hadoop-what-it-is-and-how-it-works/.
