Data Science

Department of Management Information Systems, Temple University

MIS 0855.003 ■ Fall 2022 ■ Guohou Shan

Week9 Prep and Week8 Summary

October 13, 2022

Here are some important things to do to prepare for Week 9:

Weekly quiz (click the highlighted links to open the related content)
  • Please read the reading materials for Week 9 (two readings)
  • Please finish the week 9 quiz before the class on 10/18 (Tuesday).
Additional extra credit task (due before class on 10/20/2022)
  • This task has three steps:
    • Step 1: leave your information here
    • Step 2: attend a mock interview
    • Step 3: fill out a survey
  • Before starting, please note the following:
    • Complete the steps in order: step 1 first, then step 2, then step 3. Otherwise, half of your credit will be deducted because we cannot use your data.
    • You need to finish all three steps; otherwise, half of the credit will be deducted.
    • Please find a quiet place, because the mock interview records your voice and video.
    • The links for step 2 and step 3 also appear at the end of step 1, so following the links there in the correct order works too.
    • The mock interview requires a laptop, and you must grant access to your microphone and camera.
    • If you have taken this before, please use a different email address.
    • You don’t need to submit anything to me; I will add the extra credit to your gradebook next Thursday.
    • If you want to get the $20 Amazon gift card, please take this task seriously and try to score above 85%.
    • I will contact you if I see that you scored above 85%. Alternatively, you can email me about your score.
  • Please let me know if you have any questions.

Week 8 key takeaways

  • Relational database
    • CSV and Excel files are flat files.
    • A relational database stores different data in different tables.
  • Benefits of a relational database
    • Integrity: it’s easier to maintain the integrity of data when each item is recorded in one place only.
    • Flexibility: you can create different cuts of the data.
    • Efficiency: it’s faster to retrieve and update data when you don’t have to plough through lots of redundant values.
  • However, relational databases are more complex to operate and use than flat files.
  • ETL: Extract, Transform, Load
  • Why Do We Need ETL?
    • The power of data analytics is often based on combining data from different sources
    • However, data stored in different places are often formatted differently
    • It can be very difficult to enforce a consistent schema
  • Steps to consider when setting up an ETL process
    • Read metadata (data dictionary)
    • Choose the correct version of the data
    • Set up rules for resolving inconsistencies, duplicates, omissions, and other problems in the data, and validate the data.
  • ‘Big Data’ Is a Set of Technologies
    • Hadoop: stores data in smaller chunks on different computers (nodes) across a network.
    • MapReduce: processes the pieces of data in parallel on different nodes and combines the results.
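
To make the relational database and ETL takeaways above more concrete, here is a minimal Python sketch. It is only an illustration, not something from the class materials: the two data sources, the customer and order values, the table names, and the function names (extract, transform, load, spend_per_customer) are all made up. It extracts orders from two sources that are formatted differently, transforms them into one schema while applying simple rules for duplicates and omissions, and loads them into two linked tables.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical source 1: a CSV export (flat file) with US-style dates.
STORE_CSV = """order_id,customer,email,order_date,amount
1,Ann Lee,ann@example.com,10/11/2022,25.00
2,Bob Roy,bob@example.com,10/12/2022,40.50
2,Bob Roy,bob@example.com,10/12/2022,40.50
"""

# Hypothetical source 2: web orders with ISO dates and different field names.
WEB_ORDERS = [
    {"id": 3, "name": "Ann Lee", "mail": "ANN@example.com", "date": "2022-10-13", "total": "12.25"},
    {"id": 4, "name": "Cy Diaz", "mail": "cy@example.com", "date": "2022-10-13", "total": ""},
]


def extract():
    """Extract: read raw rows from both sources without changing them."""
    store_rows = list(csv.DictReader(io.StringIO(STORE_CSV)))
    return store_rows, WEB_ORDERS


def transform(store_rows, web_rows):
    """Transform: map both sources onto one schema, then clean the rows."""
    unified = []
    for r in store_rows:
        iso_date = datetime.strptime(r["order_date"], "%m/%d/%Y").date().isoformat()
        unified.append({"order_id": int(r["order_id"]), "customer": r["customer"],
                        "email": r["email"].lower(), "order_date": iso_date,
                        "amount": r["amount"]})
    for r in web_rows:
        unified.append({"order_id": int(r["id"]), "customer": r["name"],
                        "email": r["mail"].lower(), "order_date": r["date"],
                        "amount": r["total"]})

    clean, seen = [], set()
    for row in unified:
        if row["order_id"] in seen:  # rule: drop duplicate orders
            continue
        if row["amount"] == "":      # rule: drop rows with a missing amount
            continue
        seen.add(row["order_id"])
        row["amount"] = float(row["amount"])
        clean.append(row)
    return clean


def load(rows):
    """Load: store customers and orders in separate, linked tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
        "email TEXT REFERENCES customers(email), order_date TEXT, amount REAL)"
    )
    for row in rows:
        # Each customer is recorded once, however many orders they place.
        conn.execute("INSERT OR IGNORE INTO customers VALUES (?, ?)",
                     (row["email"], row["customer"]))
        conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)",
                     (row["order_id"], row["email"], row["order_date"], row["amount"]))
    conn.commit()
    return conn


def spend_per_customer(conn):
    """One possible 'cut' of the data: total spend per customer via a join."""
    query = (
        "SELECT c.name, SUM(o.amount) FROM customers c "
        "JOIN orders o ON o.email = c.email GROUP BY c.email ORDER BY c.name"
    )
    return conn.execute(query).fetchall()


conn = load(transform(*extract()))
print(spend_per_customer(conn))  # [('Ann Lee', 37.25), ('Bob Roy', 40.5)]
```

In this sketch the customer's name lives only in the customers table, so correcting it means changing one row, which is the integrity benefit listed above. The cleaning rules here (drop duplicates, drop rows with a missing amount) are just one possible choice; a real ETL process would take its rules from the data dictionary.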
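
The MapReduce idea above can also be sketched in a few lines of Python. This is a single-machine simulation under stated assumptions, not Hadoop itself: threads stand in for separate nodes, and the function names (split_into_chunks, map_chunk, reduce_counts, word_count) and the sample sentences are made up for illustration. The data is split into chunks, each chunk is counted on its own (map), and the partial counts are combined (reduce).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(lines, n_chunks):
    """Stand-in for distributed storage: deal the lines out to n_chunks 'nodes'."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: each node counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(lines, n_chunks=3):
    chunks = split_into_chunks(lines, n_chunks)
    # Threads stand in for separate computers here; on a real cluster the
    # map step would run on the machines that already hold each chunk.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


lines = [
    "big data is a set of technologies",
    "hadoop stores data in chunks",
    "mapreduce processes data in parallel",
]
totals = word_count(lines)
print(totals["data"], totals["in"])  # 3 2
```

The result does not depend on how many chunks are used, because each word's partial counts are simply added together in the reduce step.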

  • Teaching Team

    Guohou (Jack) Shan (Instructor)
    guohou.shan@temple.edu

    I am available to meet students on Tuesdays (3:30 pm – 4:30 pm) in Speakman 208E. I also hold virtual office hours on Mondays (1 pm – 3 pm) and Fridays (4 pm – 5 pm) over Zoom (https://temple.zoom.us/j/2292311746). If these times are not convenient for you, please send me an email.

    Anna M Boykis (Information Technology Assistant)
    anna.boykis@temple.edu

    Email Anna to arrange a meeting.

    GREAT DATA SITES

  • FiveThirtyEight
  • Guardian Data Blog
  • Flowing Data
  • Financial Times Data Blog
  • Pew Research Data
  • US Government Open Data
  • OpenDataPhilly
  • Copyright © 2025 · Department of Management Information Systems · Fox School of Business · Temple University