Data Analytics - NoSQL

This semester, I took a course called MIS 2502: Data Analytics. Throughout this course, I learned tools to analyze data and apply it in a business setting.

In today’s data-centric world, gathering and accessing data is not the issue; the issue companies face is that once they have the data, they do not know how to use it to improve business processes. MIS 2502 has taught me techniques to analyze data. The hands-on structure of the class has also given me experience using business intelligence tools and strategies. Throughout the semester I created decision trees, clustering analyses, visualizations, and more. I think what sets an MIS student apart from a computer science student is our familiarity with business strategy. When we receive the results of different forms of data analysis, we are able to implement technology in the workplace that solves problems from a business perspective, maximizing profit and getting the most out of the investment.

Below is my portfolio project, in which I researched a current data topic that interests me. The topic is NoSQL, a family of databases that store and retrieve data using models other than just the relational one.


NoSQL

[Images: NoSQL; key-value store]

What is NoSQL?

NoSQL, or “Not only SQL,” is a family of databases that store data in non-relational models rather than in related tables. NoSQL systems are designed to process large data sets spread over many servers, instead of relying on joins between tables like we saw during MySQL exercises in class. The name “Not only SQL” reflects that these databases are not limited to the relational model, although some of them still support SQL-like query languages. Both NoSQL and relational databases have pros and cons depending on the business process involved.

The capabilities of NoSQL include increased capacity, flexibility, fault tolerance, and availability. NoSQL can store structured and unstructured data and handle frequent updates. The technology is newer and often lighter than the sometimes massive IT architecture of relational databases. In addition, it does not require mapping out schemas ahead of time like we practiced with ERD exercises in class. This flexibility suits agile software development, or fast iteration, which cuts down on rigorous up-front planning and allows the user to change the structure of the data as needs change. In addition to saving time, this can make integration easier and reduce downtime.

Furthermore, NoSQL scales horizontally rather than vertically, which means it spreads data across many servers instead of moving to one bigger server. Distributing the data automatically is called auto-sharding. Many NoSQL databases also replicate data, keeping copies on more than one server, so that there is little to no service interruption if one server fails, for example because of a power failure. Together these features support not only a technology disaster recovery plan but also a business continuity plan. Finally, many NoSQL databases have integrated caching, which keeps frequently used data in memory so it can be retrieved quickly.
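
To make the idea of spreading data across servers more concrete, here is a minimal Python sketch of sharding with replication. It is only an illustration under my own assumptions: the server names, the number of copies, and the helper functions `shard_for` and `read` are hypothetical, and real NoSQL products use more sophisticated schemes (for example, consistent hashing) than the simple modulo used here.

```python
import hashlib

# Hypothetical server names, used only for illustration.
SERVERS = ["server-a", "server-b", "server-c", "server-d"]
REPLICAS = 2  # assumed number of copies kept of each record


def shard_for(key, servers):
    """Return the servers that should hold `key` (primary first, then copies)."""
    # A stable hash, so the same key always maps to the same servers.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(REPLICAS)]


def read(key, stores, down):
    """Read `key` from the first responsible server that is still running."""
    for server in shard_for(key, SERVERS):
        if server not in down and key in stores[server]:
            return stores[server][key]
    raise KeyError(key)


# Each server is modeled as its own dictionary of records.
stores = {name: {} for name in SERVERS}

# Write every record to each server responsible for it.
records = {"cust:1": "Alice", "cust:2": "Bob", "cust:3": "Carol"}
for key, value in records.items():
    for server in shard_for(key, SERVERS):
        stores[server][key] = value

# Simulate a power failure on the primary server for one record.
failed = {shard_for("cust:1", SERVERS)[0]}
print(read("cust:1", stores, failed))  # still served from the second copy
```

Because each record lives on two servers in this sketch, losing any single server does not make a record unavailable, which is the business continuity benefit described above.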

Data in NoSQL can be organized in key-value stores (pictured above), document databases, graph stores, and wide-column stores. Key-value stores are the simplest of the NoSQL databases and are often used for unstructured data. Each record has a unique key, much like a primary key attribute in an ERD, and the key is paired with a value; in some products, such as Aerospike, the value is made up of fields stored in “bins,” which play a role similar to the columns of a row. Document databases pair each key with a document, which can contain many key-value pairs, key-array pairs, and embedded documents. Graph stores hold information about networks, such as social networks. For example, Neo4j is one of the premier open-source graph platforms. Its value proposition is to store data in graphs rather than tables and to connect its users, or “graphistas,” at startups and Global 2000 companies. Finally, wide-column stores organize data by columns rather than rows, which suits queries over large data sets. For example, HBase is a column-oriented database written in Java that runs on top of Hadoop.
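
The four models above can be sketched with plain Python data structures. This is only a rough illustration: the keys, names, and the helper `followers_of` are made up for this example, and real products such as Neo4j or HBase have their own storage formats and query languages.

```python
# Key-value store: one value per unique key.
key_value = {
    "session:42": "logged_in",
    "cart:42": "sku-1001,sku-2002",
}

# Document database: a key paired with a nested document.
documents = {
    "customer:42": {
        "name": "Alice",                      # key-value pair
        "emails": ["alice@example.com"],      # key-array pair
        "address": {"city": "Philadelphia"},  # embedded document
    }
}

# Graph store: nodes plus the relationships between them.
graph = {
    "Alice": {"FOLLOWS": ["Bob", "Carol"]},
    "Bob": {"FOLLOWS": ["Carol"]},
    "Carol": {"FOLLOWS": []},
}

# Wide-column store: each row key maps to column families,
# and different rows may hold different columns.
wide_column = {
    "row-1": {"profile": {"name": "Alice"}, "orders": {"2015-04-01": "sku-1001"}},
    "row-2": {"profile": {"name": "Bob", "phone": "555-0100"}},
}


def followers_of(person, graph):
    """Return everyone who has a FOLLOWS relationship pointing at `person`."""
    return sorted(name for name, rels in graph.items() if person in rels["FOLLOWS"])


print(followers_of("Carol", graph))
```

The graph example shows why this model fits social networks: a question like "who follows Carol?" is answered by walking relationships instead of joining tables.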

Why is NoSQL important?

In MIS 2502, we learned about the difference between relational and analytical databases. Relational databases, also known as transactional databases, capture data describing an event rather than data organized to support analysis. This is the framework for Online Transaction Processing (OLTP): a series of tables with logical associations between them. These databases are good for storage and data integrity, but less suited to analysis because the tables must be joined together before the analysis is done. The primary goal of relational databases is to minimize redundancy, whereas the primary goal of analytical databases is to interpret the data for better business solutions.

Because companies now collect immense amounts of ever-changing data and data analysis is a hot topic, NoSQL’s capabilities are increasingly valuable. NoSQL can store business data about customers, products, frequencies, performance, and more. For these workloads, NoSQL can be more efficient than relational databases, which often cannot adapt quickly enough to changing data and new technologies and can fall behind in storage, capacity, and speed.
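
The point about joining tables before analysis can be sketched in Python. The example below uses the built-in `sqlite3` module as a stand-in for a relational database like MySQL, and a list of dictionaries as a stand-in for a document database; the table names, column names, and data are all hypothetical.

```python
import sqlite3

# Relational version: customers and purchases live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO purchase VALUES (1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5);
""")

# The tables must be joined before total spending per customer can be analyzed.
relational_totals = dict(conn.execute("""
    SELECT c.name, SUM(p.amount)
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.id
    GROUP BY c.name
"""))

# Document version: each customer document already embeds its purchases,
# so the same analysis needs no join.
customers = [
    {"name": "Alice", "purchases": [20.0, 5.0]},
    {"name": "Bob", "purchases": [12.5]},
]
document_totals = {c["name"]: sum(c["purchases"]) for c in customers}

print(relational_totals)
print(document_totals)
```

Both approaches give the same totals. The trade-off is the one described above: the relational design minimizes redundancy, while the document design keeps related data together so it is ready to analyze.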

Works Cited

“Neo4j, the World’s Leading Graph Database.” Neo4j Graph Database. N.p., 2015. Web. 23 Apr. 2015. <http://neo4j.com/>.

“NoSQL Databases Explained.” NoSQL Databases Explained. MongoDB, 2015. Web. 23 Apr. 2015. <http://www.mongodb.com/nosql-explained>.

“What Are NoSQL Key-Value Store Databases?” Aerospike. N.p., 2015. Web. 23 Apr. 2015. <http://www.aerospike.com/what-is-a-nosql-key-value-store/>.

“What Is HBase?” IBM. Hadoop, n.d. Web. 23 Apr. 2015. <http://www-01.ibm.com/software/data/infosphere/hadoop/hbase/>.
