Data Science

Department of Management Information Systems, Temple University

MIS 0855.003 ■ Fall 2022 ■ Guohou Shan

Week 13 summary

November 19, 2022

Week 13 key takeaways
  • Descriptive analytics summarizes and helps understand past data (e.g., detect fraudulent behavior)
  • Predictive analytics predicts values for data points we do not have. If these data points are in the future, we call this forecasting (e.g., weather forecasting)
  • Prescriptive analytics facilitates decision-making directly by suggesting an action (e.g., whether or not to grant a car loan to a consumer) or by automatically triggering it (algorithmic management).
  • Confidence interval: a range around a prediction that expresses how uncertain it is; the higher the confidence level, the wider the band around the prediction curve.
  • Machine learning is a broad category of techniques to learn model parameters from data so that the model can be used to predict values or classify items.
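As a rough illustration of the predictive analytics and machine learning takeaways above, the following sketch learns two model parameters (a slope and an intercept) from past data using ordinary least squares, then forecasts a value for a future data point. The weekly sales figures and the function names are made up for illustration; they are not from the course materials.

```python
from statistics import mean


def fit_line(xs, ys):
    """Learn a slope and an intercept from data with ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Use the learned parameters to predict a value we do not have yet."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical past data: sales observed in weeks 1 to 5.
weeks = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

model = fit_line(weeks, sales)   # "learning" the parameters from data
forecast = predict(model, 6)     # week 6 is in the future, so this is a forecast
print(f"slope={model[0]}, intercept={model[1]}, week 6 forecast={forecast}")
```

Because week 6 lies in the future, the prediction here counts as forecasting in the sense used above. Real data would not fall on a perfect line, which is why predictions are usually reported together with a confidence interval.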

Week13 Prep and Week12 Summary

November 10, 2022

I am happy to have completed our 12th week with you!
 
Here are some important things for preparing for the 13th week, which is our last week of lectures:
 
Weekly Quiz (you can click the highlights to get the related content)
  • Please read the reading materials for Week 13 (three readings)
  • Please finish the week 13 quiz before the class on 11/15 (Tuesday)
  • This is our last weekly quiz! Woohoo!
Class Activity
  • I set five extra credit points for our 11/17 (Thursday) class. Please come to the class to get them!
  • 11/17 is our last lecture, and I plan to take a photo with you all. Please come and enjoy our time together!
  • Also, I plan to spend some time inviting you guys to write evaluations for me! 
  • Please come to help me! Thanks!
Your final group projects
  • If you still have not formed a group, please let me know!
  • If you formed a group, please send me an email with the names of your team members and tell me which day your team prefers to present!
Week 12 key takeaways
  • structured data are recorded as well-defined fields that correspond to distinct variables. Examples include star rating, book price, book page count, exam score, and sentiment score.
  • unstructured data, such as natural language writings, consist of a mishmash of semantic entities. Examples include the textual content of a book, images, video, and audio.
  • word clouds can visualize the most prominent terms in a collection of texts
  • sentiment analysis categorizes whether a statement is positive or negative and assigns it a score accordingly
  • sentiment analysis is a form of predictive analytics
  • sarcasm is particularly difficult for sentiment analysis
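To make the sentiment analysis takeaways above concrete, here is a minimal lexicon-based sketch: it counts positive and negative words in a statement and assigns a score accordingly. The tiny word lists and the example reviews are hypothetical, and real sentiment tools use far larger lexicons or trained models. The last review shows why sarcasm is difficult for this kind of approach.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"great", "love", "good", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "boring"}


def sentiment_score(text):
    """Score = number of positive words minus number of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(score):
    """Turn a numeric score into a category."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "I love this book, the plot is excellent",
    "Terrible pacing and a boring ending",
    "Oh great, another three-hour delay",  # sarcastic: a human reads this as negative
]

for review in reviews:
    score = sentiment_score(review)
    print(f"{label(score):>8} ({score:+d}): {review}")
```

The sarcastic review is labeled positive because the word "great" is in the positive list, even though the writer clearly means the opposite.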

Project Presentation Team

November 10, 2022

I made this list in the order in which I received your team member names for the final project.

Team # | Members | Tentative presentation day
1 | Kaiya Palmer,  , Andrew Michael Horwatt, Victoria R Adams |
2 | Luke J Morgan, Carly, William Kenneally, Jacob Hartnaft |
3 | Kaitlyn M Weissmann, Taylor Zahodnick, Dhriti Patel, Victoria Sharpadskaya, Faiza Ahmed | 11/29
4 | Joseph Sadler, Max Fuster, Cole Messer, Nikolas Keszthelyi | 11/29
5 | Bradley Greenfield, Rhys Laubscher, Lauren E Caracci, Annabeth Pisinski-Cutler |
6 | Susan, Ann, Daniel, Luis |
7 | Amy, Sharon, J’Niya, Julia, Justis | 11/29
8 | Nishit Jani, Alex Elsmore, Sagun Patel, Salvi Patel, Patrick Elliker |
9 | Zachary J Brennan, Jacob Juul, Khalil Hamid, Cassidy |
10 | Zachary Douglas, Wyatt |
11 | John Ramgolam, Kyle Elliot, Tom Cornell, Matt Korytnyuk, Kavi Pragji |

Week12 Prep and Week11 Summary

November 3, 2022

Here are some important things for preparing for the 12th week:
 
Weekly Quiz (you can click the highlights to get the related content)
  • Please read the reading materials for Week 12 (three readings)
  • Please finish the week 12 quiz before the class on 11/8 (Tuesday)
  • We have almost finished all of the weekly quizzes. This is the penultimate one!
Second Individual Assignment 
  • The due date is 11/10
  • Please finish it as soon as possible, and ask me or Anna to help check your answers.
  • You can check with us multiple times.
  • Don’t be shy! We are glad to help!
  • Please send us your answers and Excel file via email.
Guest Speaker
  • Jim from The Hilltop Institute will come to our class to give a virtual talk!
  • Please try to come in person (we may continue some content from 12.1 if we didn’t finish then)
  • You can also join over zoom: https://temple.zoom.us/j/2292311746
Your final group project
  • Please form a team of 4-5 members.
  • Please send me an email with the names and emails of your team members after you finalize your team.
Week 11 key takeaways
  • Dashboards
    • Multiple perspectives to the data rather than just one scorecard
    • A well-designed dashboard can provide a powerful overview of a phenomenon
    • Dashboards are often interactive
      • customize visualizations
      • drill deeper into the data.
    • Dashboards are not easy to design
    • Recommendations for better data discovery using dashboards
      • Link KPIs to detailed views
      • Create dynamic reports
      • Give users an ‘information scent’
  • Pivot Table
    • A pivot table is a table of statistics that summarizes the data of a more extensive table.
    • This summary might include sums, averages, or other statistics, which the pivot table groups together in a meaningful way.
    • All related data must be in the same table to be pivoted
    • It can entail lots of redundant fields
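The pivot table idea above can be sketched in a few lines of plain Python: one larger flat table is summarized by grouping rows on two fields and summing a third. The sales rows and the `pivot_sum` helper are hypothetical and only meant to show the mechanics; in class this is done with the pivot table feature of a spreadsheet tool.

```python
from collections import defaultdict

# Hypothetical flat table: all related data sit in one table,
# so region and product names are repeated (redundant) across rows.
rows = [
    {"region": "East", "product": "Book", "sales": 100},
    {"region": "East", "product": "Pen", "sales": 40},
    {"region": "West", "product": "Book", "sales": 70},
    {"region": "East", "product": "Book", "sales": 50},
    {"region": "West", "product": "Pen", "sales": 30},
]


def pivot_sum(rows, index, columns, values):
    """Group rows by `index` and `columns`, summing the `values` field."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[index]][row[columns]] += row[values]
    # Convert nested defaultdicts to plain dicts for easy printing.
    return {key: dict(inner) for key, inner in table.items()}


pivot = pivot_sum(rows, index="region", columns="product", values="sales")
for region, totals in pivot.items():
    print(region, totals)
```

The same grouping could compute averages or counts instead of sums by changing how each cell is accumulated.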

Week10 Prep and Week9 Summary

October 20, 2022

Here are some important things for preparing for the 10th week:
 
Exam
  • Please come to the class held on 10/27 to take an in-person exam.
  • Please read the study guide and prepare your questions if you have any for the review summary session held on 10/25.
  • The exam format is the same as exam 1. 
Please feel free to let me know if you need any accommodations or have questions.
Also, I will move the next virtual office hour from Monday (10/24) to Wednesday (10/26), aiming to help you prepare for your second exam.
 

Week 9 key takeaways

  • Indicators: variables that measure (indicate) a business-relevant value.
  • KPIs: the critical (key) indicators of progress toward an intended result. KPIs help focus attention on what matters most. They can measure whether a piece of equipment, a process, a team, an individual, or an organization is performing at an adequate level.
  • KPI Criteria: SMART
    • Specific purpose for the business
    • Measurable
    • Achievable by the organization
    • Relevant to success
    • Time-phased
  • KPIs can go wrong
    • The “tyranny of success”: The difference between success and failure in non-profits is not clearly defined
    • What can you do to avoid it?
      • Use multiple KPIs as a measure, not just one
      • Be sure to use all of the SMART criteria
  • Scorecards
    • A scorecard measures performance against several goals (KPIs).
    • Often used to compare different products in terms of relevant dimensions.
    • A neutral comparison needs to define a specific use case and choose the dimensions and their weightings accordingly.
    • A biased comparison, by contrast, sets out to show that one item is superior to the others and chooses the dimensions that put the preferred item on top.
  • Dashboards
    • Multiple perspectives to the data rather than just one scorecard
    • A well-designed dashboard can provide a powerful overview of a phenomenon
    • Dashboards are often interactive
      • customize visualizations
      • drill deeper into the data.
    • Dashboards are not easy to design
    • Recommendations for better data discovery using dashboards
      • Link KPIs to detailed views
      • Create dynamic reports
      • Give users an ‘information scent’
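The scorecard takeaway above, where a comparison depends on the chosen dimensions and their weightings, can be sketched as a weighted average. The laptops, the 0-10 scores, and both sets of weights are invented for illustration; the point is only that the same scores can rank items differently under different use cases.

```python
def weighted_score(scores, weights):
    """Weighted average of an item's scores across the chosen dimensions."""
    total_weight = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight


def best_item(items, weights):
    """Return the name of the item with the highest weighted score."""
    return max(items, key=lambda name: weighted_score(items[name], weights))


# Hypothetical scores on a 0-10 scale (higher is better).
laptops = {
    "Laptop A": {"price": 8, "battery": 6, "performance": 5},
    "Laptop B": {"price": 5, "battery": 7, "performance": 9},
}

# Two hypothetical use cases, each with its own weightings.
student_weights = {"price": 0.5, "battery": 0.3, "performance": 0.2}
gamer_weights = {"price": 0.2, "battery": 0.1, "performance": 0.7}

for use_case, weights in [("student", student_weights), ("gamer", gamer_weights)]:
    print(f"Best for {use_case}: {best_item(laptops, weights)}")
```

Since the winner changes with the weights, a neutral scorecard should state its use case and weightings before the scores are compared.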

Week9 Prep and Week8 Summary

October 13, 2022

Here are some important things for preparing for the 9th week:
 
Weekly quiz (you can click the highlights to get the related content)
  • Please read the reading materials for Week 9 (two readings)
  • Please finish the week 9 quiz before the class on 10/18 (Tuesday).
Additional extra credit task (due before the class on 10/20/2022)
  • There are three steps for doing this task
    • Step 1: leave your information here
    • Step 2: attend a mock interview
    • Step 3: fill out a survey
  • Before starting, there are several things you need to pay attention to
    • You need to complete step 1 first, then step 2, and then step 3. Otherwise, half of your credit will be deducted because we cannot use your data
    • You need to finish all three steps, or else half of the credit will be deducted
    • Please find a quiet place because mock interviews record your voice and video
    • Step 2 and step 3 links are also put at the end of step 1. So, if you use/click the link there in the correct order, you will also be fine.
    • The mock interview requires a laptop, and you need to give permission to access your microphone and camera
    • If you have taken this before, please use a different email address
    • You don’t need to submit anything to me. I will put the additional extra credit in your gradebook next Thursday.
    • If you want to get the $20 Amazon gift card, please take this task seriously and try to get more than an 85% score.
    • I will contact you once I see that you scored more than 85%. Alternatively, you can email me about your score.
  • Please let me know if you have any questions.

Week 8 key takeaways

  • Relational database
    • CSV and Excel are flat files
    • A relational database stores different data in different tables.
  • Benefits of relational database
    • integrity. It’s easier to maintain the integrity of data when the same item is recorded in one place only.
    • flexibility. You can create different cuts to data.
    • efficiency. It’s faster to retrieve and update data when you don’t have to plough through lots of redundant values.
  • However, relational databases are more complex to operate and use than flat files.
  • ETL: Extract, Transform, Load
  • Why Do We Need ETL?
    • The power of data analytics is often based on combining data from different sources
    • However, data stored in different places are often formatted differently
    • It can be very difficult to enforce a consistent schema
  • Steps need to be considered for setting up an ETL process
    • Read metadata (data dictionary)
    • Choose the correct version of the data
    • Set up rules for resolving other inconsistencies, duplicates, omissions, and other problems in the data and validate the data.
  • ‘Big Data’ Is a Set of Technologies
    • Hadoop: stores data in smaller chunks across a network on different computers (nodes).
    • MapReduce: processes the pieces of data in parallel in different nodes and combines the results together.
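The MapReduce idea above can be illustrated with a word count, run here on a single machine. This is a simplified sketch under stated assumptions: the three text chunks stand in for pieces of data stored on different nodes, the map step runs in a plain loop rather than in parallel, and the function names are hypothetical. A real Hadoop job would distribute both the storage and the processing.

```python
from collections import Counter
from functools import reduce

# Hypothetical data, already split into chunks as if stored on different nodes.
chunks = [
    "big data is big",
    "data is stored in chunks",
    "chunks are processed in parallel",
]


def map_chunk(chunk):
    """Map step: count the words within one chunk, independently of the others."""
    return Counter(chunk.split())


def reduce_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


# On a cluster, each node would run map_chunk on its own chunk at the same time.
partial_counts = [map_chunk(chunk) for chunk in chunks]
total = reduce(reduce_counts, partial_counts, Counter())

print(total.most_common(5))
```

Because each chunk is counted without looking at the others, the map step can be spread across as many nodes as there are chunks, and only the small partial counts need to be combined at the end.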

Week8 Prep and Week7 Summary

October 6, 2022

Here are some important things for preparing for the 8th week:
 
Weekly quiz (you can click the highlights to get the related content)
  • Please read the reading materials for Week 8 (one reading)
  • Please finish the week 8 quiz before the class on 10/11 (Tuesday).
First Assignment (due before the class on 10/13/2022)
  • Please don’t wait until the last minute to start the assignment!
  • Based on what we have learned so far, you are able to complete it
  • The assignment specification, answer sheet, required dataset, and submission link are posted on the course website
  • If you finish it earlier and want to see whether your answers are correct, please send your answers and Tableau workbook to me or Anna (anna.boykis@temple.edu) via email for a pre-check. 
In-class activity
  • I moved in-class activity 7.2 to the class on 10/13 (8.2), where you can submit your cleaned Excel file and get the extra points.
  • For in-class activities, I separated the instructions from the data on the course website to make them easier to find and understand.
  • I strongly suggest you read the instructions as class preparation before coming to class.
  • During the demonstration, I also suggest you keep the instructions open on your own screen.
  • While demonstrating, I cannot show two files on one screen (shrinking them makes them hard to read).
  • All of my demonstrations follow the instructions.
  • Finally, in-class activity 7.2 is important. It affects how well you will do on your second assignment and the second exam.


Week 7 key takeaways
 
The Agency Problem
  • The data creator is often NOT the data consumer.
  • The agency problem can be one of the reasons for dirty data

Problems with data are a big deal

  • People doing analytics spend 50% of their time i) searching for data, ii) correcting errors, and iii) verifying correctness (Redman 2013).
  • Bad data can create a vicious cycle that kills a data-driven decision-making culture

Some Best Practices

  • Focus on getting new data right.
  • Limit time fixing old data.
  • Data producers should communicate with data consumers.
  • Have a mindset to check your work constantly.

Data Cleansing

  • Outlier: an observation that lies an abnormal distance from other values in a random sample from a population
  • Outliers can be removed or replaced
  • Guessing is never the right way to deal with an outlier
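As a hedged sketch of the data cleansing points above, the following code flags outliers with the interquartile range (IQR) rule and then shows both options, removing them or replacing them. The 1.5 x IQR threshold is one common convention and an assumption here, not a rule stated in class; the exam scores are made up. Replacing with the median of the remaining values is a documented rule rather than a guess, but whether it is appropriate depends on the data.

```python
from statistics import median, quantiles


def find_outliers(values, k=1.5):
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical exam scores; 250 is impossible on a 100-point exam.
scores = [72, 75, 78, 80, 81, 83, 85, 88, 90, 250]

outliers = find_outliers(scores)

# Option 1: remove the outliers.
removed = [v for v in scores if v not in outliers]

# Option 2: replace each outlier with the median of the remaining values.
typical = median(removed)
replaced = [typical if v in outliers else v for v in scores]

print("outliers:", outliers)
print("removed: ", removed)
print("replaced:", replaced)
```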

Week6 Quiz and Week5 Summary

September 22, 2022

Here are some important things for preparing for the 6th week:
 
Exam day (9/29/2022)
  • in-person, in class
  • only you, your pen, exam paper, and exam answer sheet allowed
  • closed book, no mobile phone, no desktop, no iPad, no talking with others, no notes
Review session and Q&A (9/27/2022)
  • in-person, in class 
  • The study guide is here
  • Bring your questions, if any
Week 5 key takeaways 
  • Elements of telling a good story
    • Find the compelling narrative
    • Think about your audience
    • Be objective and offer balance
    • Don’t Censor
    • Finally, Edit, Edit, Edit
  • Steps to tell a powerful story with data
    • What is the [business] problem
    • How to measure the [business] impact?
    • What’s the available data?
    • The initial solution hypothesis
    • The solution
    • The [business] impact of the solution
  • Avoid these when telling a story with data
    • Technical terminology
    • Step-by-step method description
    • Complex statistics
  • Infographics versus data visualization
    • Infographics can be made up of data visualizations, but data visualizations are not made up of infographics
  • The value of infographics
    • Get a better sense of big data
    • Reduce information overload
    • Match the way humans process information
  • Storytelling with infographics is not that different from any other type of storytelling
  • Tips for designing infographics
    • 5-second rule
    • Tell one story well
    • Minimize text – visualize when possible
    • Eliminate legends as much as possible
    • Be data transparent (cite sources)
 
I will release your quiz grades this Saturday morning.
If you have not submitted the quiz yet, please finish it by this Friday night.

Week5 Quiz and Week4 Summary

September 15, 2022

Here are some important things for preparing for the 5th week:
 
Week 5 quiz due (you can click the highlights to get the related content)
  • please read the reading materials for Week 5 (one reading)
  • please finish the week 5 quiz before the class on 9/20 (Tuesday).
 
Preparing for the exams
  • I don’t want to signal urgency or panic, but our first exam will take place on 9/29.
  • I will hold a review session and Q&A on 9/27.
  • In the meantime, if you have any questions regarding the class content (e.g., concepts), please feel free to contact me or come to my office hours directly.
  • I plan to release the new office hour schedules next Tuesday based on your common time availability selections from this link: https://fox.az1.qualtrics.com/jfe/form/SV_9BxknEnNQ8HOS34
 
Week 4 key takeaways 
  • Data visualization is part science, part art
  • Some basic principles of data visualization
    • Simple
    • Compare
    • Attend
    • Explore 
    • View diversity
  • Some basic elements of a data visualization
    • Content
    • Context
    • Construction
  • Some additional notes about data visualization
    • The Y axis of a figure should start from 0
    • The size of a visual element should be proportional to the value it represents
    • The aspect ratio can misrepresent the trend shown in a figure
    • 3-D effects make a visualization confusing to interpret
    • Too many sectors make a pie chart hard to read
    • The sectors of a pie chart must add up to 100%
  • Some basic operations on data
    • Function: a mapping from one set of values to another set of values
    • Filtering: dropping observations from consideration using some criteria
    • Sorting: allows us to change the order in which the items are presented in a sequence.
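The three operations just listed (function, filtering, sorting) can be sketched on a small table of records. The book data and the `to_price_band` helper are hypothetical, and the $20 and 4.0 cutoffs are arbitrary choices for illustration.

```python
# Hypothetical records, for illustration only.
books = [
    {"title": "Data Basics", "price": 15.0, "rating": 4.2},
    {"title": "Viz Handbook", "price": 35.0, "rating": 3.8},
    {"title": "Story Lab", "price": 25.0, "rating": 4.6},
]


def to_price_band(price):
    """Function: maps each price (one set of values) to a band (another set)."""
    return "budget" if price < 20 else "premium"


# Function: apply the mapping to every record.
bands = [to_price_band(book["price"]) for book in books]

# Filtering: drop observations that do not meet a criterion.
well_rated = [book["title"] for book in books if book["rating"] >= 4.0]

# Sorting: change the order in which the items are presented.
by_price = [book["title"] for book in sorted(books, key=lambda b: b["price"])]

print(bands)
print(well_rated)
print(by_price)
```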
I will release your quiz and in-class activity grades this Saturday morning.
If you have not submitted them yet, please finish by this Friday night.

Week4 Quiz and Week3 Summary

September 8, 2022

Here are some important things for preparing for the 4th week:
 
Week 4 quiz due (you can click the highlights to get the related content)
  • please read the reading materials for Week 4 (one reading)
  • please finish the week 4 quiz before the class on 9/13 (Tuesday).
 
Review sessions and office hour re-determination
  • I have received some suggestions for holding review sessions. I will do that in the class before the exam day. 
  • At the same time, please click the following link to select all of your available times for an office hour. I will make the final decision based on your common selections.
  • Link: https://fox.az1.qualtrics.com/jfe/form/SV_9BxknEnNQ8HOS34
  • You can also leverage the office hour to address your questions regarding the class content/concepts.
Some explanations for the class so far:
 
MIS community website and Gradebook
  • Holding the class content on a website is not convenient for students who heavily use Canvas, but it is the department’s policy. Sorry for that. 
  • You can find your grades in the gradebook, which is located under “About–>Gradebook” on the website.
  • You can click “About–>Gradebook” and then select our course (“Fall 2022 – MIS 0844 -003 – Data Science”) and click “VIEW GRADES” (I plan to demonstrate in the class)
In-class quiz and activities
  • I normally decide on the extra points one week before the lecture.
  • I have two general principles: (1) I think it is necessary to test, via a class quiz or activity, whether a concept delivered in class has been understood; (2) I think it is time to assign some extra points. In either case, my principle is to help you.
Week 3 key takeaways 
  • How to Get Data?
    • Collect ourselves
      • For example, we can collect library visit data by sitting in front of the library. Then we can learn when visits peak, which can be used to manage the library better.
      • Advantage: we know how the data were produced
      • Drawback: it is laborious
    • Get access to the existing data
      • Directly downloading
      • Using reporting tool
      • Scraping
      • Using API
      • Benefit: convenient
      • Weakness: difficult to know whether the data represents the real phenomenon we care about
    • Open data
      • Different organizations have different motivations to open-source their data. However, there are some reasons for not sharing data
      • Difficult to gain value from sharing data
      • Afraid of competition and possible liabilities
      • Data breaches
      • Laborious to prepare data
  • Biases in (Big) Data
    • Survivorship bias
    • Intentional bias (e.g., fake reviews and ratings)
    • Confirmation bias
  • Assessing the trustworthiness of data
    • What are your hypotheses?
    • What are your biases?
    • What is the sample (size)?
    • What is the data source?
    • How good are your (customer) measures?
I will release your grades this Saturday morning (this should be our regular grade release time). My principle here is to give students who are extremely busy during a given week a chance to still earn a grade for a particular quiz or assignment.

LECTURE RECORDINGS

Recordings are as follows.

14th week 11/29/2022 12/1/2022
13th week 11/15/2022 11/17/2022
12th week 11/08/2022 11/10/2022
11th week 11/01/2022 11/03/2022
10th week 10/25/2022 10/27/2022
9th week 10/18/2022 10/20/2022
8th week 10/11/2022 10/13/2022
7th week 10/04/2022 10/06/2022
6th week 09/27/2022 09/29/2022
5th week 09/20/2022 09/22/2022
4th week 09/13/2022 09/15/2022
3rd week 09/06/2022 09/08/2022
2nd week 08/30/2022 09/01/2022
1st week 08/23/2022 08/25/2022

  • Teaching Team

    Guohou (Jack) Shan (Instructor)
    guohou.shan@temple.edu

    I am available to meet students on Tuesday (3:30 pm – 4:30 pm) at Speakman 208E. I will also hold virtual office hours from 1-3 pm on Monday and Friday (4 pm – 5 pm) over zoom (https://temple.zoom.us/j/2292311746).  If the time is not convenient for you, please send me an email.

    Anna M Boykis (Information Technology Assistant)
    anna.boykis@temple.edu

    Email Anna to arrange a meeting.

    GREAT DATA SITES

  • FiveThirtyEight
  • Guardian Data Blog
  • Flowing Data
  • Financial Times Data Blog
  • Pew Research Data
  • US Government Open Data
  • OpenDataPhilly
  • Copyright © 2025 · Department of Management Information Systems · Fox School of Business · Temple University