Section 005, Instructor: Joe Spagnoletti

Schedule

Sect

Date

Topics / Questions In Class Exercise Readings Assignments
1.1

Jan 18

Instructor and class introductions

Course and schedule overview

Introduction (slides)

  • What is the difference between data, information, and knowledge?
  • What makes “big data” big?
Data Is Everywhere

  • Identify data embedded into the environment
  • Differentiate between the data source and the data
  • Describe the information that could be derived from that data
Review the course site and email me any questions.
1.2

Jan 20

 

 

Science and Data Science (slides)

  • What is data science?
  • What is the difference between a theory and a hypothesis?
  • What are the dangers of data analysis without a hypothesis?
Developing Hypotheses

  • Create hypotheses about things you experience in your daily life
  • Develop testable hypotheses
  • Propose an underlying rationale for that hypothesis
  • Explain the difference between a hypothesis, its rationale, and a theory
Data Science and Prediction (Dhar)

  • (This link will take you to the library page. Type in the title and view the PDF of the article.)

http://www.wired.com/2013/03/three-science-words-we-should-stop-using

  • (Three Science Words We Should Stop Using – Allain)

 

2.1

Jan 23

 

A Brief Introduction to Data (slides)

  • What are the forms data can take?
  • Where does data come from?
  • What is metadata? A data dictionary?

 

Building a Data Dictionary

  • Identify pre-assigned data types from an existing Microsoft Excel spreadsheet
  • Deduce the purpose of individual data fields based on context and sample values
  • Develop a plan for finding additional information about the meaning and units of measurement for data fields
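As a rough illustration of the data dictionary exercise (not part of the course materials), the sketch below guesses a coarse data type for each field from its sample values. The field names, the sample rows, and the `guess_type` helper are all hypothetical; a real data dictionary would also record meaning and units, which cannot be deduced automatically.

```python
from datetime import datetime

def guess_type(values):
    """Guess a coarse data type from a list of sample strings."""
    def all_parse(parser):
        # True only if every sample value parses without error.
        try:
            for v in values:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "decimal"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "text"

# Hypothetical sample rows, as they might be exported from a spreadsheet.
rows = [
    {"order_id": "1001", "order_date": "2017-01-18", "amount": "19.99", "region": "East"},
    {"order_id": "1002", "order_date": "2017-01-20", "amount": "5.00", "region": "West"},
]

# Draft data dictionary: field name -> guessed type.
data_dictionary = {
    field: guess_type([row[field] for row in rows])
    for field in rows[0]
}
```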
I’m Beating the NSA to the Punch by Spying on Myself (Stein)

What the NSA Wants to Know About Your Phone Calls (Di Justo)

The Ashley Madison Hack Is Only the Beginning (Aguilar)

What the Fox Knows (Silver)

Open Data (Wikipedia)

In Search of America’s Best Burrito (Silver)

Reading Quiz #1

(by 2:00pm)

 

Jan 25  No Class Weekly Question #1

(by 2:00pm)

2.2

Jan 27

 

Identifying Sources of Data (slides)

  • What kinds of data are available in different disciplines (arts, sciences, medicine, business, government, etc.)?
  • What kinds of problems and issues can data insight address?

 

Finding Sources of Data

  • Identify uses for data from open data sites
  • Navigate metadata repositories and explore the data sets
  • Understand the types of data available through open data sources
  • Formulate possible uses for new data sets
3.1

Jan 30

 

Learning to (Mis)Trust Data (slides)

  • How do you spot reliable sources of data?
  • How do you assess data quality?
  • What is the “Filter Bubble”?

 

Assessing Trustworthiness of Data Sources

  • Identify potentially unreliable data and information within a data source.
  • Differentiate between questionable data and questionable information.
  • Differentiate between reliable and unreliable sources.
Bubble Trouble: Is Web Personalization Turning Us into Solipsistic Twits? (Weisberg)

The Hidden Biases in Big Data (Crawford)

In Data We Trust (Hayes)

Reading Quiz #2

(by 2:00pm)

 

3.2

Feb 1

 

 

Guest Speaker  (Subject to Change)  Weekly Question Discussion Weekly Question #2

(by 2:00pm)

3.3

Feb 3

Guest Speaker  (Subject to Change)  Weekly Question Discussion

 

Module 2: Telling Stories with Data
4.1

Feb 6

 

Viewing Data  (slides)

  • What are the different ways of viewing data?
  • When do you need to visualize data?
  • What are the basic techniques of data visualization?

 

Chapter 2: Good Graphics? Handbook of Data Visualization (Unwin, pages 57-77)

Stephen Few on Data Visualization: 8 Core Principles (Hoven)

Watch out, Terrorists: Big Data is on the Case (Acohido)

Reading Quiz #3

(by 2:00pm)

 

Assignment #1: Create a Data Analysis Plan

4.2

Feb 8

 

Finding Good and Bad Data Visualizations

  • Understand the difference between effective and ineffective data visualizations
  • Identify the message a graphic is trying to convey
  • Evaluate how successful the graphic is at conveying that message
  • Explain why, according to the principles discussed in class, the graphic is (or is not) effective
Weekly Question #3

(by 2:00pm)

 

Feb 10

 

 Guest Speaker

Josh Mann

Marketing Director Comcast

5.1

Feb 13

 

Communicating Using Data (slides)

  • What are the principles of communicating data?
  • How do you communicate complex ideas using data?
  • How do you construct visualizations that complement a report? That stand on their own?

 

 Review of Assignment #2 Telling a Story with Data (Davenport)

Visualizing a Day in the Life of a New York City Cab (Matlin)

Chapter 1: The Science of Infographics. Cool Infographics (Krum)

Chapter 6: Designing Infographics. Cool Infographics (Krum)

Reading Quiz #4

(by 2:00pm)

4.3

Feb 15

 

Introduction to Tableau

  • What is Tableau? What can you do with it?
  • How is it different from Microsoft Excel?
Getting Familiar with Tableau

  • Learn the basics of Tableau by visualizing a sample data set
  • Create textual and graphical visualizations in Tableau.
  • Combine data from multiple tables to create visualizations.
  • Create calculated fields to categorize data.
  • Create a dashboard to view multiple visualizations at once.
Weekly Question #4

(by 2:00pm)

5.1

Feb 17

 

Storytelling with Infographics (slides)

  • How are infographics different from other types of visualizations?
  • How do infographic tools differ from other data tools we’ve used so far?

 

Telling a Story through Visualization

  • Identify which visualizations are most appropriate to convey a message
  • Discover the ways in which data can be represented using Tableau.
  • Select visualizations that best describe relationships in the data.
  • Suggest and implement modifications to improve existing visualizations.
5.2

Feb 20

 

 Creating Infographics with Piktochart

  • Learn how to create an infographic using Piktochart.
  • Import graphics made in Tableau into Piktochart
  • Create relevant content for an infographic
  • Export a finished graphic from Piktochart
 

NO WEEKLY QUIZ

 

 

6.2

Feb 22

 

 Exam Review  Assignment 2: DUE

NO WEEKLY QUESTION

6.3

Feb 24

 

EXAM 1

 

Module 3: Working with Data in the Real World
7.1

Feb 27

 

Dirty Data (slides)

  • How does data get dirty?
  • What are the consequences (i.e., ethical, financial) of dirty data?
  • How do you clean it?

 

 

Data’s Credibility Problem (Redman)

Damn Excel! How the ‘Most Important Software Application of All Time’ is Ruining the World (Gandel)

Stupid Data Corruption Tricks (Taber)

Top Ten Ways to Clean Your Data (Microsoft)

 Reading Quiz #5

(by 2:00pm)

7.2

Mar 1

 

Data Cleansing (slides)

  • How do you identify data problems?
  • How do you correct data problems?
  • When is fixing the data not worth it?
 How Data Gets Dirty

  • Analyze and understand the process of evaluating data quality.
  • Identify threats to data quality
  • Design mechanisms to identify quality problems in collected data
  • Develop remedies to prevent future data quality problems
Weekly Question #5

(by 2:00pm)

7.3

Mar 3

 

Locating “Bad Data” Using Excel

  • Find and fix a data set with incorrect values
  • Use Excel to identify incorrect values and outliers in a data set
  • Selectively apply corrections to a data set
  • Understand the positive and negative impacts of changing data, even if that change is intended to correct it
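The in-class exercise uses Excel, but the idea of flagging suspicious values can be sketched in a few lines of Python. This is only one possible approach, assuming Tukey's 1.5 × IQR rule as the definition of an outlier; the `ages` data and the `find_outliers` helper are made up for illustration.

```python
import statistics

def find_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical survey column; 210 is likely a typo for 21.
ages = [21, 19, 22, 20, 23, 21, 210, 20]
suspect = find_outliers(ages)
```

A flagged value is only a candidate for correction: whether 210 should become 21, be removed, or be left alone is a judgment call, which is the point of the last objective above.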
8.1

Mar 6

 

Choosing Relevant Data (slides)

  • How do you identify Key Performance Indicators (KPIs)?
  • How do you identify the right measure for the selected problem?
 

Performance Indicator (Wikipedia)

The Tyranny of Success: Nonprofits and Metrics (Schambra)

Wearable Tech is Plugging Into Health Insurance (Olson)

Tracking Health One Step at a Time (Bialik)

Reading Quiz #6

(by 2:00pm)

8.2

Mar 8

Evaluating KPIs (slides)

  • How do you categorize and visualize KPIs according to a threshold?
  • How do you use Tableau to evaluate KPIs? How would you use Excel?
Identifying Key Performance Indicators

  • Select Key Performance Indicators (KPIs) that facilitate evaluation for a given scenario
  • Identify “good” KPIs that adhere to the SMART criteria
  • Select the best KPIs from a list of potential metrics
  • Describe the limitations of using KPIs to make an evaluation
Weekly Question #6

(by 2:00pm)

 

 

8.3

Mar 10

Visualizing Key Performance Indicators

  • Create a set of visualizations that enable comparison using Key Performance Indicators using Tableau
  • Create heat maps to visualize key performance indicators.
  • Use calculated fields to categorize acceptable and unacceptable performance.
  • Visualize those calculated fields using easy-to-read symbols.
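The logic behind a threshold-based calculated field can be sketched outside Tableau as well. The example below is a hypothetical illustration: the target of 100, the 90% warning band, the store figures, and the symbols are all assumptions chosen for the sketch.

```python
# Simple text symbols standing in for the shapes a dashboard might use.
SYMBOLS = {"on target": "OK", "warning": "!", "below target": "X"}

def kpi_status(value, target, warn_ratio=0.9):
    """Classify a KPI value against its target."""
    if value >= target:
        return "on target"
    if value >= target * warn_ratio:
        return "warning"
    return "below target"

# Hypothetical monthly sales per store, compared with one shared target.
monthly_sales = {"Store A": 105, "Store B": 93, "Store C": 70}
TARGET = 100

report = {
    store: SYMBOLS[kpi_status(value, TARGET)]
    for store, value in monthly_sales.items()
}
```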
Assignment 3 Due: Cleaning a Data Set
9.1

Mar 20

 

Connecting Diverse Data (slides)

  • How do you identify data sets that can be combined?
  • How do you combine data sets?
  • How do you resolve conflicts?
How Data Integration Works (Strickland)

The GOP Arms Itself for the Next “War” in the Analytics Arms Race (Gallagher)

Best Practices for Designing Views and Dashboards (Tableau)

The One Skill You Really Need for Data Analysis (Farmer)

Reading Quiz #7

(by 2:00pm)

 

 

9.2

Mar 22

 

Creating Interactive Dashboards (slides)

  • How does a dashboard differ from an Infographic? A chart?
  • How do dashboards facilitate decision-making?
Connecting Data Sets

  • Analyze two data sets at the same time by combining them within Tableau.
  • Identify common data between data sets that allow them to be connected.
  • Generate a common field that facilitates connection by software such as Tableau.
  • Analyze data from two different data sets once they are combined.
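Connecting two data sets usually depends on a field whose values match exactly in both. A minimal sketch, assuming two hypothetical tables whose store names are spelled inconsistently: the `make_key` helper generates a common field, and rows without a match are kept aside instead of being dropped silently.

```python
def make_key(name):
    """Build a common join field: lowercased, trimmed, single-spaced."""
    return " ".join(name.lower().split())

# Hypothetical data sets that describe the same stores.
stores = [
    {"store_name": "North Branch", "manager": "Lee"},
    {"store_name": "South Branch", "manager": "Kim"},
]
sales = [
    {"store": " north  branch", "sales": 1200},
    {"store": "SOUTH BRANCH", "sales": 800},
    {"store": "East Branch", "sales": 500},
]

stores_by_key = {make_key(s["store_name"]): s for s in stores}

combined, unmatched = [], []
for sale in sales:
    store = stores_by_key.get(make_key(sale["store"]))
    if store is None:
        unmatched.append(sale["store"])  # a conflict to resolve by hand
    else:
        combined.append({
            "store": store["store_name"],
            "manager": store["manager"],
            "sales": sale["sales"],
        })
```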
Weekly Question #7

(by 2:00pm)

9.3

Mar 24

 

Creating Interactive Dashboards

  • Create a dashboard with interactive data filtering using Tableau
  • Understand how to create an interactive dashboard in Tableau.
  • Select and apply filters that add to the user’s ability to understand the data.
  • Navigate a data set using an interactive dashboard to find answers about the data.
10.1

Mar 27

Exam Review

 

Assignment 4

Kickoff

 

 

 

10.2

Mar 29

 

Guest Speaker or additional Exam Review

 

10.3

Mar 31

EXAM 2  
11.1

Apr 3

 

Storing and Retrieving Data (slides)

  • What is a database? How are spreadsheets just a type of database?
  • How are advances in technology changing how we think about storing data?
  • What are the core technologies of big data analytics?
Knowing Just Enough about Relational Databases (Rosenblum and Dorsey)

How To Explain Hadoop to Non-Geeks (Bertolucci)

How to Structure Source Data for Excel Pivot Tables & Unpivot (Acampora)  NOTE: Use Firefox or Chrome to open.

 Reading Quiz #8

(by 2:00pm)

11.2

Apr 5

 

Using Tableau for Aggregating Data (slides)

Assignment 4 Teamwork

Creating a Simple Database

  • Create a simple database to store song playlists
  • Create a table structure for data that can capture relevant information
  • Develop a data set to populate an empty database
  • Explain why data is often split into multiple tables
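One way to picture the playlist database is the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The table and column names and the sample songs are hypothetical, and the class exercise may use a different tool. Splitting the data into three tables means each song and each playlist is stored once, even when a song appears on several playlists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (
    song_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    artist  TEXT NOT NULL
);
CREATE TABLE playlists (
    playlist_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- Link table: one row per (playlist, song) pairing.
CREATE TABLE playlist_songs (
    playlist_id INTEGER REFERENCES playlists(playlist_id),
    song_id     INTEGER REFERENCES songs(song_id),
    PRIMARY KEY (playlist_id, song_id)
);
""")

# Populate the empty database with made-up rows.
conn.executemany("INSERT INTO songs VALUES (?, ?, ?)",
                 [(1, "Song A", "Artist X"), (2, "Song B", "Artist Y")])
conn.executemany("INSERT INTO playlists VALUES (?, ?)",
                 [(1, "Study"), (2, "Workout")])
conn.executemany("INSERT INTO playlist_songs VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

# Reassemble the playlists by joining the three tables.
rows = conn.execute("""
    SELECT p.name, s.title
    FROM playlist_songs ps
    JOIN playlists p ON p.playlist_id = ps.playlist_id
    JOIN songs s ON s.song_id = ps.song_id
    ORDER BY p.name, s.title
""").fetchall()
```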
Weekly Question #8

(by 2:00pm)

11.3

Apr 7

 

 Assignment 4 Teamwork  Working with “Pivot Tables” in Tableau

  • Work with dimensional data to navigate a data set
  • Summarize a table of data organized along dimensions
  • Create hierarchies to enable drill-up/drill-down capability
  • Select the correct dimensions and measures to answer a question
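The summarizing step behind a pivot table can be sketched in plain Python: pick two dimensions, pick a measure, and add the measure up for each combination. The `orders` rows and the `pivot` helper are hypothetical and only illustrate the idea, not how Tableau implements it.

```python
from collections import defaultdict

# Hypothetical order data: two dimensions (region, category), one measure (sales).
orders = [
    {"region": "East", "category": "Books", "sales": 100},
    {"region": "East", "category": "Music", "sales": 50},
    {"region": "West", "category": "Books", "sales": 70},
    {"region": "East", "category": "Books", "sales": 30},
]

def pivot(rows, row_dim, col_dim, measure):
    """Sum a measure for every combination of two dimensions."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_dim]][row[col_dim]] += row[measure]
    return {r: dict(cols) for r, cols in table.items()}

# Drill-down view: sales by region and category.
by_region_category = pivot(orders, "region", "category", "sales")

# Drill-up view: collapse the category level to get totals per region.
by_region = {r: sum(cols.values()) for r, cols in by_region_category.items()}
```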
12.1

Apr 10

 

Assignment 4 Teamwork Unstructured Data in a Big Data Environment (Hurwitz et al.)

Techniques and Applications for Sentiment Analysis (Feldman)

Don’t Worry, Facebook Still Has No Clue How You Feel (Wohlsen)

 Reading Quiz #9

(by 2:00pm)

12.2

Apr 12

 

Beyond Numbers (slides)

Twitter Sentiment Analysis using Excel and Google Drive (slides)

Manually Determining the Sentiment of Textual Data

  • Differentiate between positive and negative sentiment in text
  • Perform a manual sentiment analysis of a Twitter stream
  • Develop rules for classifying a message as positive or negative
  • Explain the problems and issues with accurately describing sentiment within text
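Classification rules of the kind developed in this exercise can be written down as a tiny program, which also makes their weaknesses easy to see. The word lists, the messages, and the `classify` helper below are made up for illustration; this is a deliberately naive word-counting rule, not a recommended method.

```python
import re

# Hypothetical rule set: a message is scored by counting listed words.
POSITIVE = {"love", "great", "happy", "good"}
NEGATIVE = {"hate", "terrible", "sad", "bad"}

def classify(message):
    """Label a message by comparing positive and negative word counts."""
    words = re.findall(r"[a-z']+", message.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

messages = [
    "I love this class, great lecture!",
    "Terrible traffic today, I hate it",
    "Oh great, another delay. Not good.",  # sarcasm and negation
]
labels = [classify(m) for m in messages]
```

The third message is labeled positive even though a human reader would call it negative: the rule ignores sarcasm and the word "not", which is exactly the kind of problem the last objective asks you to explain.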
 

 

Weekly Question #9

(by 2:00pm)

12.3

Apr 14

 

 Assignment 4 Teamwork Sentiment Analysis Using Excel

  • Differentiate between positive and negative sentiment in text
  • Perform a sentiment analysis of a Twitter stream using software tools
  • Compare automatic and manual sentiment analysis methods
  • Explain the limitations of automatic versus manual sentiment analysis
13.1

Apr 17

 

 Assignment 4 Teamwork Predictive Analytics Using Tableau

  • Perform a forecasting analysis
  • Perform a simple association analysis
What Analytics Can Teach Us About the Beautiful Game (Paine)

Big Data Analytics: Descriptive vs. Predictive vs. Prescriptive (Bertolucci)

They’re Watching You at Work (Peck)

 Reading Quiz #10

(by 2:00pm)

13.2

Apr 19

 

Predicting the Future (slides)

More on Predictive Analytics (slides)

Predicting Undergraduate Success at Temple University

  • Identify data that is useful in predicting an outcome
  • Identify the data required to answer a problem
  • Determine potential sources for that data
  • Explain the limitations of an analysis using that data
Weekly Question #10

(by 2:00pm)

13.3

Apr 21

 

 Assignment 4 Teamwork Simple Predictive Analytics Using Tableau

  • Analyze a data set to make inferences about future outcomes
  • Forecast future sales based on order transaction data
  • Perform association analysis to determine which products are purchased together
  • Interpret the meaning of the results from these analyses
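Association analysis can be illustrated with a short sketch that counts how often pairs of products appear in the same order. The baskets below are made up, and the `confidence` helper is a hypothetical name for the usual ratio "orders with both items divided by orders with the first item". Forecasting is not covered in this sketch.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order transactions: each set is one order's products.
orders = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in orders:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def confidence(a, b):
    """Share of orders containing a that also contain b."""
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / item_counts[a]
```

Here `confidence("butter", "bread")` is 1.0 because every order with butter also has bread, while `confidence("bread", "butter")` is only 2/3. Interpreting that asymmetry is part of the final objective above.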
 

Apr 24

 

Assignment 4 Presentations
 

Apr 26

Assignment 4 Presentations
 

Apr 28

Assignment 4 Presentations  
 

May 1

EXAM REVIEW  
May 3 NO CLASS
May 5

1-3 pm

FINAL EXAM  


Office Hours

Joe Spagnoletti (instructor)

Office: Speakman 207H

Hours: (1:20-1:50, 3:00) M, W, F by appointment.

Email: joespag@temple.edu

TA: Prince Patel

Email: Prince@temple.edu