Here are some things you may need to pay attention to:
Week 3 Quiz Due (you can click the highlights to get the related content)
- please read the reading materials for Week 2 (four readings)
- please finish the week 3 quiz before the class on 9/6 (Tuesday).
Week 3 guest speaker
- I will invite Joe, the senior associate director of the MIS department, to give us a 15-minute talk in class on 9/6 about the ways to earn an MIS major, minor, and/or certificate.
Some key things from week 2:
In-class activity submission
- You will get up to 5 points for submitting the complete dictionary
- The points from this activity carry the same weight as those from your individual assignments, which account for 20% of your final score
- We will have other in-class activities in later classes, which may have higher points
- For this particular activity, I will only evaluate how you build the dictionary
- If you didn’t provide variable meanings or units because you are not familiar with the abbreviations or the car functions, that is totally fine. You will not lose points for that.
- So, if you labeled the data types correctly (we talked about the common data types in class), I will give you the full 5 points.
In-class quiz
- You will get up to 4 points for taking the in-class quiz
- The in-class quiz carries the same weight as the before-class quiz, which accounts for up to 10% of your final score
Week 2 Key Takeaways
- Data is important today
- data is the new oil/gold
- massive data contains massive useful insights
- Your competitors use data to drive revenue
- advanced tech makes deriving the insights easier
- Big Data means
- Volume: the amount of data has exploded
- Variety: many different sources of data are combined
- Velocity: data can change very quickly
- Compared with the old days, data storage and collection are convenient and cheap
- Data Science is the generalizable extraction of knowledge from data
- Dangers of big data analytics
- it is easy to find what is not there
- causality is harder to establish; in particular, the direction of causality can be tricky
- dirty data appears more often, yet errors are harder to find
- If I have big data, do I still need to think?
- Yes. It is hard to find actionable insights from big data
- Data can suffer from survivorship bias
- A hypothesis is an “educated guess” that is also a testable prediction. A good hypothesis should
- be testable
- be falsifiable
- be grounded in a rationale
- Metadata: data that describes other data. It contains
- Variable name (a dataset column label)
- Variable description (in a data dictionary)
- Data type (in a data dictionary)
- Value (the datum itself)
- Metadata is often stored as a data dictionary attached to a dataset
- Common data types include
- Integer: whole numbers; it is called Number in Excel and Number (whole) in Tableau;
- Floating point: decimal values; it is called Number in Excel and Number (decimal) in Tableau;
- Boolean: binary values; it is called <N/A> in Excel and Boolean in Tableau;
- String: alphanumeric characters; it is called Text in Excel and String in Tableau;
- Date/Time: calendar date and time; it is called Date in Excel and Date or Date & time in Tableau.
- The data type is important
- it determines the type of values that a data field can have
- it defines the kind of operations that can be performed on data
- Incorrect data types can create problems in analysis and lead to wrong results (e.g., the gene name errors shown in the week 2 reading materials)
- Why does Metadata matter?
- missing or misread metadata can be very costly (e.g., the unit mix-up behind the loss of the Mars Climate Orbiter)
- it can put human lives in danger (e.g., the unit mix-up behind the Gimli Glider incident)
- Advantages of Metadata
- metadata facilitates understanding and data processing
- it makes navigating data and data-based objects easier
- Weaknesses of Metadata
- it takes some time to read it
- creating and maintaining it is laborious
- the creator of the data often doesn’t need it, so it never gets created
- In practice, we often need to reverse engineer or just guess the missing parts of metadata when analyzing datasets.
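If you are curious how the guessing in the last point could be automated, here is a minimal Python sketch. It is only an illustration and not part of the class materials: the function names (`guess_data_type`, `build_data_dictionary`), the sample car columns, and the assumption that dates are written as YYYY-MM-DD are all made up for this example.

```python
from datetime import datetime


def _parses(value, parser):
    """Return True if parser(value) succeeds without a ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


def _is_date(value):
    # Assumption: dates are written as YYYY-MM-DD; other formats count as String.
    return _parses(value, lambda v: datetime.strptime(v, "%Y-%m-%d"))


def guess_data_type(values):
    """Guess a common data type for one column of raw text values."""
    cleaned = [v.strip() for v in values if v.strip()]
    if not cleaned:
        return "Unknown"  # nothing to inspect, so do not guess
    if all(v.lower() in ("true", "false") for v in cleaned):
        return "Boolean"
    if all(_parses(v, int) for v in cleaned):
        return "Integer"
    if all(_parses(v, float) for v in cleaned):
        return "Floating point"
    if all(_is_date(v) for v in cleaned):
        return "Date/Time"
    return "String"


def build_data_dictionary(columns):
    """Map each variable name to its guessed data type."""
    return {name: guess_data_type(values) for name, values in columns.items()}


# Hypothetical sample columns, invented for this sketch.
sample_columns = {
    "model": ["Car A", "Car B", "Car C"],
    "cyl": ["4", "6", "8"],
    "mpg": ["21.5", "18", "30.2"],
    "automatic": ["true", "false", "true"],
    "sold_on": ["2022-08-30", "2022-09-01", "2022-09-06"],
}

for name, data_type in build_data_dictionary(sample_columns).items():
    print(f"{name}: {data_type}")
```

A guess like this still needs a human check. For example, a ZIP code column would be labeled Integer even though it should be treated as a String, and the sketch cannot recover variable descriptions or units at all.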
Exam, Assignment, Project, and Quiz Due Dates
- three in-person, closed-book exams will take place on 9/29 (Thursday), 10/27 (Thursday), and 12/01 (Thursday)
- two individual assignments will be due on 10/13 (Thursday) and 11/10 (Thursday)
- the final group project requires groups of 4-5 students and is due on 12/06 (Tuesday)
- quizzes are normally due before the start of the first class (Tuesday) of the week
Instructions to get Tableau software:
- Download the latest version of Tableau Desktop and Tableau Prep Builder from here: https://www.tableau.com/tft/activation
- Click on the link above and select “Download Tableau Desktop” and “Download Tableau Prep Builder” (we need to use both Tableau Desktop and Tableau Prep Builder)
- On the form, enter your school email address for Business email and enter the name of your school for Organization.
- Activate with the product key TCVB-B1D4-A680-1321-2667