Section 005, Instructor: Shana Pote

Take Home: In-Class Exercise 2.2: Finding Sources of Data

As discussed in class on Thursday, please comment on this post with the following:

  • What dataset did you find?
  • Where did you find it?
  • Why did you think it was interesting?
  • What did you learn from the data?
  • How could you use the data, or what decisions could you make based on it?

Please post your comments before class on Thursday, September 14, 2017.

14 Responses to Take Home: In-Class Exercise 2.2: Finding Sources of Data

  • I found a dataset containing information on healthy corner stores throughout Philadelphia. I found it on I thought it was interesting because I have a meal plan and, being a vegetarian, I don’t have a whole lot of options in the cafe, let alone healthy vegetarian options. I learned that a lot of these corner stores offer screenings for blood pressure, height and weight checks, and nutrition education. I could use this data to locate a corner store close to campus to buy healthy things like fruits and vegetables to supplement my diet.

    I found the above data set on the website. It is average price information on 156 of the most commonly consumed fruits and vegetables (fresh and processed). I find this interesting because I try to make healthy fresh produce the majority of what I consume, and being aware of the price per serving helps make this choice more affordable. I also found it interesting that they included a price comparison of processed snacks per serving (pizza, cookies, chips, etc.) vs. each fruit/vegetable per serving. Interestingly, many of the unhealthy options were more expensive per serving, but not all of them. Quite a few healthy produce options were less expensive per serving, which would be useful information for educating parents and children on affordable healthy options.

  • I found a data set on about the SEPTA trains’ average lateness for each stop on the Paoli/Thorndale line. I found this helpful and interesting because I take SEPTA a lot to get home, since I live about five minutes from the station in my area. I’m also late to everything, so it was good to find out that, on average, the train shows up more than 3 minutes late at each stop. This is helpful for knowing when to get to the station, and it helps me figure out if I’m going to be late or if the train will be.

  • I found a data set from OpenDataPhilly listing the aggregate energy usage and environmental impact of different types of public and private sector buildings in Philadelphia. I thought the associated map and data visualization were very interesting and showed, in an interactive way, which types of buildings were using the most energy and giving off the most emissions. I learned that hotels have a relatively massive energy usage compared to their emissions. This data could be used to make decisions about where the city should invest funds or institute measures to cut down on emissions and be more “green”. It is also being used as a benchmark for the energy use and emissions of city buildings.

  • I found a data set on Farmers Market locations on I found it interesting because in the city you would think there aren’t many options for fresh and local produce/goods. The only one I’ve seen before is on Cecil, which is really convenient, but I wanted to see if there were others. I found it interesting that there were so many options in all parts of the city. I also would have expected more options in wealthier areas, like Rittenhouse, but really there are a lot more in the North/Northeast. Also, the data set provided public transportation options, which I think would help a lot of people realize how easily accessible fresh produce is.

    I would definitely use this information to find more sources of fresh and local produce. I also use public transportation to get around the city, so it would be very helpful for me to check this before visiting a new market.

    The data set I found is concerned with the bicycle network in Philadelphia. It tells which streets are bicycle friendly, what type of street each one is, and the length of the bike lane. I found this data set interesting because as I was walking in the rain and humidity today, I could not help but feel that the air was dirty. The haze around Philadelphia’s skyscrapers could not just be fog or clouds; it had to be air pollution. One way to combat this ever-growing problem could be to expand bike networks around the city. Since I also work in a bike shop, I have a personal affinity for bikes and their value, and I feel that there needs to be greater investment in these networks, especially in cities. Bikes are healthy, affordable, and eco-friendly, which is helpful all around.

  • The dataset I found was trying to predict this season’s NFL rankings. I found it on Over the years, data has become an integral part of sports as they have advanced, especially if you look at baseball with its Moneyball system, which is now used in all kinds of sports. The data is also very interesting for us viewers, given the popularity of fantasy football: people use this data to try to draft their teams or predict good and bad matchups for their fantasy teams. I learned about their Elo system and algorithm, how some results surprised them, and that they had to adjust for luck. While it might not be a very ethical use of the data, I could use it to bet, to play in a fantasy league, or to learn more about the Elo system.

    The data set that I found listed FDIC insured banks that failed between 2000 and 2014. I found the data set after searching through I thought this data set was interesting because a surprisingly high number of banks closed between 2000 and 2014. From the data it is possible to learn how many banks have closed in each state and the dates that they closed. It is possible to use the data to avoid placing your money into banks in states that have a high bank closure rate. In addition, it is possible to examine why many banks closed around the same time.

  • I found a data set about salaries for all the Philly city employees on This data set tells you the name, job position, and annual salary of city employees, including elected officials and court staff, in Philly. I found this data set interesting because it lists the money they make per year and their job position. From this information I can find out which types of jobs have the highest salary and which have the lowest. The person at the top of the list is Sam Gulino, who works for the Department of Public Health as a medical examiner; his annual salary is $268,533.

  • I found a data set on a game (the rules are in the article) that emulates having to choose whether or not to use nuclear weaponry. The data set is a graph of the numbers participants chose in the game and how many times each was picked. I found this article on FiveThirtyEight, and I thought it was interesting because I liked how they designed the game to simulate a truly tough decision, such as having to decide whether to nuke another country. I learned from this data that most of the participants picked 0, 50, or 100, and that 100 was picked the most. This suggests that there may have been a few times where two people picked 100 and lost 10,000 dollars. I could use this data to pick a high number that is not 100 in order to not lose a lot of money, or zero would be a good pick as well. If I split the earnings with another person, I could get 100 dollars every time without having to risk losing 10,000 dollars.


    I found a data set on about parking violations in the different neighborhoods in Philly. I found this interesting because you get to see which neighborhoods got the most parking tickets in the years 2012-2015. I learned that some neighborhoods have huge numbers of parking violations and some have very few. I could use this data to see which areas of the city have more available parking and which neighborhoods lack parking, thus resulting in more tickets and fines. Based on whether I am going to a neighborhood with a large or small number of tickets, I can decide if I will be able to find parking or not. If I am going to an area with many parking violations, there probably isn’t a lot of parking, and I should take Uber/Lyft or public transportation to reduce the chances of getting a ticket or spending an hour trying to find parking.


    The data set that I found through lists the average price of what is covered by insurance, Medicare, and average payments for inpatient treatments in medical facilities throughout the country. One thing that I found interesting in this data was how much of the cost is covered in places like California and how little is covered in places like Maryland, which covers under $10,000 in most of the medical facilities listed. Based on this data, a person in need of inpatient care can decide where they want to be treated (if they are able to choose) so that they can afford their treatment.

    The data set I found through is related to the inspection of restaurants and their safety compliance, as inspected by the Chicago Department of Public Health. The data includes the time inspections were conducted, comments made during the visit, and any violations. This information would be very useful for customers deciding where to eat. It could also be included in Yelp or other restaurant review sites.



Office Hours

Shana Pote

Alter Hall 232
Class time: 5:30-8pm, Thursdays
Office hours: Thursdays, 1 hour before class, or by appointment.