Here are some important things to do to prepare for Week 8:
Weekly quiz (click the highlighted text to open the related content)
- Please read the reading materials for Week 8 (one reading)
- Please finish the Week 8 quiz before the class on 10/11 (Tuesday).
First Assignment (due before the class on 10/13/2022)
- Please don’t wait until the last minute to start the assignment!
- Based on what we have learned so far, you are able to complete it.
- The assignment specification, answer sheet, required dataset, and submission link are posted on the course website.
- If you finish early and want to see whether your answers are correct, please email your answers and Tableau workbook to me or Anna (anna.boykis@temple.edu) for a pre-check.
In-class activity
- I moved In-class activity 7.2 to the class on 10/13 (8.2), where you can submit your cleaned Excel file and earn the extra points.
- For in-class activities, I separated the instructions and the data on the course website to make them easier to find and understand.
- I strongly suggest you read the instructions as class preparation before coming to class.
- During our demonstration, I also suggest you keep the instructions open and follow along.
- Because I am demonstrating, I cannot keep two files open on one screen (shrinking them makes them hard to read).
- All of my demonstrations are based on the instructions.
- Finally, In-class activity 7.2 is important. It affects how well you will do on your second assignment and the second exam.
Week 7 key takeaways
The Agency Problem
- The data creator is often NOT the data consumer.
- The agency problem can be one of the reasons for dirty data
Problems with data are a big deal
- People doing analytics spend 50% of their time i) searching for data, ii) correcting errors, and iii) verifying correctness (Redman 2013).
- Bad data can create a vicious cycle that kills a data-driven decision-making culture
Some Best Practices
- Focus on getting new data right.
- Limit time fixing old data.
- Data producers should communicate with data consumers.
- Have a mindset to check your work constantly.
Data Cleansing
- Outlier: an observation that lies an abnormal distance from other values in a random sample from a population
- Outliers can be removed or replaced
- Guessing is never the right way to deal with an outlier
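As a rough illustration of the Data Cleansing points above, here is a minimal Python sketch (it is not part of the Excel-based in-class activity). It assumes the common 1.5 x IQR rule of thumb for flagging outliers and uses the median of the remaining values as the replacement, so the result follows a documented rule and is not a guess. The function names and the sample ages are made up for illustration.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the 1.5 * IQR rule of thumb (an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles of the data
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def remove_outliers(values, k=1.5):
    """Option 1: drop every value that falls outside the fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]

def replace_outliers(values, k=1.5):
    """Option 2: replace each outlier with the median of the non-outliers."""
    low, high = iqr_bounds(values, k)
    inliers = [v for v in values if low <= v <= high]
    fill = statistics.median(inliers)
    return [v if low <= v <= high else fill for v in values]

# Hypothetical sample: student ages with one data-entry error (210).
ages = [21, 22, 23, 22, 24, 25, 23, 210]

print(remove_outliers(ages))   # [21, 22, 23, 22, 24, 25, 23]
print(replace_outliers(ages))  # [21, 22, 23, 22, 24, 25, 23, 23]
```

Whether to remove or replace depends on the analysis. Either way, the rule used should be written down so that data consumers know how the data was changed.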