-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Some quick instructions:
You must complete the quiz by the start of class on March 7, 2016. The quiz is based on the readings for the whole week.
When you click on the link, you may see a Google sign in […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the exercise.
And here is the dataset you’ll need [Vandelay Orders by Zipcode.xlsx].
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the exercise.
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here are the instructions (in Word) (and as a PDF). Make sure you read them carefully! This is an assignment that should be done individually.
And here is the data file you’ll need: Vande […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Leave your response as a comment on this post by the beginning of class on February 24, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your op […]
-
http://www.nytimes.com/2014/05/27/upshot/is-college-worth-it-clearly-new-data-say.html?_r=0
This data is interesting because we are all college students right now. Basically everyone who goes to college is trying to get a degree so they can get a job once they graduate. With all the money that goes into that degree, a lot of students are starting to wonder if it’s even worth it. This data suggests that it is worth it, considering that people with a four-year degree made 98% more money than people without one in 2013. The article also states that, by a rough calculation, the true cost of college is about negative $500,000 in the long run; in other words, according to the data collected, not going to college would cost someone about $500,000.
-
These days companies are trying to help employees and keep them in good shape. This article is talking about how companies are gathering data from various wellness firms and insurers. They are using that data to see how many people get sick and get prescribed medications. For example, in the article it says that Wal-Mart is gathering data to see how many people are about to get diabetes. That way they can help treat their employees and prevent them from getting sick. The data they are gathering is kept confidential. They are gathering the data to help their employees and keep them safe.
-
http://fortune.com/2016/02/17/castlight-pregnancy-data/
This article is interesting because companies and employers can now use data to predict if someone is pregnant before she is ready to disclose that information. They are able to do this by accessing workers’ medical claims, pharmacy claims, and search queries to figure out if a person is pregnant. It just shows that data is in our everyday life and it’s kind of impossible to hide from it because there are so many different ways to access it. It also showed me that employers might know more about me than I imagine, and that can change my thought process or attitude while I am working.
-
http://www.theactuary.com/news/2016/02/life-expectancy-for-pensioners-highest-ever/
As an actuary, we use prior data to “predict” future outcomes. As I prepare to enter the insurance industry specific to retirement, it is interesting to see how prior data is used to predict that future lifetimes are increasing. This is important to me because as future lifetimes increase, future pension payouts will be greater. Therefore, this data is very important in considering how much money needs to be reserved for retirement, both for others and for myself.
-
As an aspiring actuary*
-
http://www.cnbc.com/2016/02/19/us-consumer-price-index-jan-2016.html
As a business major, I was particularly interested in this article because it’s about our economy in general. The article mentions that in January of this year, costs for things such as rent and medical necessities began to rise. The U.S. Consumer Price Index has risen the most since August 2011, so in four and a half years. While gasoline prices have managed to fall, medical costs such as prescription drugs, doctor visits, and hospital costs have all slowly risen. It’s interesting to see this information because the rise in prices is also being used to predict whether or not the Federal Reserve will raise interest rates later this year.
-
I have always been interested in public health and even considered it for one of my majors. I found this article interesting because it explains how the spread of the Zika virus could be stopped with the help of open data. It explains how the World Health Organization is calling for health journals to share the information that they have collected concerning the virus. It goes further, saying that they should share information concerning any public health emergency. The International Committee of Medical Journal Editors has made available all the data it has on the virus and urges other journals to do the same.
-
http://www.sacbee.com/news/local/education/article61424987.html
This article is interesting because it shows how much of a gray area data privacy is. Cyberlaw is a fairly recent area of the law, and it shows. This is why data is such a controversy every time it is brought to court – nobody knows what to do about it. Yet the consequences are huge; in this case alone, private school records of 10 million students up to 25 years old could be released. If such data were to be leaked, the collateral damage would reach millions of dollars. There has been harassment and petitions thrown around, and not nearly enough security to assure people that the data would actually be safe. After all, if a Twitter hacker could steal 20,000 employee records from the FBI earlier this month, no data is going to actually be safe. The age of data is moving entirely too fast, and everyone else is struggling to catch up with the times through trial and error – and judging from this article, that is a whole lot of errors we have coming.
-
http://www.businessinsider.com/after-stock-crash-tableau-cfo-digs-in-2016-2
Business Insider emphasized how Tableau’s stock dropped 50% in a single day, a $2.2 billion loss in market capitalization. This happened even though the company beat its fourth-quarter expectations. The massive loss seems to stem from earnings expectations for the year ahead.
-
This article notes the permanence of Harper Lee’s posthumous influence on American culture. “Mockingbird” is taught by more high school teachers (35%) than any other fiction book. “Mockingbird” is also widely taught on college campuses, ranking 255 on a list of books mandated by syllabi, a list that also includes textbooks for a variety of studies. To me this was interesting because I read this book in middle school and was curious to see how large an influence this text has nationwide.
-
This article elaborates on how privacy policies differ. I found this data interesting because of our class discussions on privacy breaches and the filter bubble. I had never realized that the policies are actually stating how they will be violating our privacy rather than protecting it. The article focuses on companies such as Google and Facebook, which leads back to our class discussions: they invade our privacy in order to personalize the content we view on their websites. This is relevant to me because I use both of these sites, and I now know to delete my search history in an attempt to combat this and have some form of privacy on the internet.
-
One major story in the sports world is Steph Curry and the Golden State Warriors basketball team. They are cruising through the NBA regular season, preparing to make a run at consecutive championships, but they also have an opportunity to break the season record of 72-10 set by Michael Jordan’s Chicago Bulls in 1996. It’s interesting just following the team because they are so fun, but history is also currently being made. The article talks about their odds of winning and includes many stats and data visualizations to show their greatness. FiveThirtyEight currently gives the Warriors a 54 percent chance of getting to 73 wins. Regardless of whether or not they get there, this brings excitement to basketball fans, and this article sums up what they bring to the data.
-
Can Big Data Analytics Save Billions in Healthcare Costs?
by Christine Donato
Specializing in SAP HANA and healthcare technology, Christine has extensive background in sales enablement, social media marketing, and digital writing. She is a strong proponent of the positive impact big data analytics can have on personalized medicine and preventative care. Christine is local to the Philadelphia area yet has a strong passion for travel and adventure.
This article is very appealing to me as it describes how data can transform and revolutionize the healthcare system once it gets organized. Working with data in the healthcare system can be very difficult because different doctors use inconsistent definitions and terms to describe patients. Inconsistent, unstructured data can be hard to aggregate, so organizations like HIMSS (Healthcare Information and Management Systems Society) are holding conferences that bring together IT professionals, clinicians, executives, and vendors from around the world and show them different ways to improve the quality, cost-effectiveness, access, and value of healthcare through information technology. Digital transformation and the use of data are helping doctors come up with unique treatment tactics that are different for each patient. This can provide personalized, precise, and paced treatment to patients. Donato mentions that the use of data and technology for personal health is increasing rapidly via mobile apps and wearable devices that can keep track of a person’s every activity. She describes how nonprofit societies like the American Society of Clinical Oncology (ASCO) can help doctors create treatments tailored down to a micro bucket and/or individual level and understand how they will affect different people with different genes. Analyzing data of 97% of the cancer patients will allow societies like HIMSS to recognize patterns and treatment effects, enabling better, more data-driven decision making.
Using and improving data in healthcare can provide proper treatment by analyzing patterns and cases that involve people with similar problems, while still personalizing treatment according to a patient’s genes and resistance.
-
This article recaps some of the newest tech innovations that were shown at the recent Mobile World Congress (MWC). This year there was a common theme of using data to make cities run more efficiently. For example, data from cars is being recorded and transmitted to big data repositories. Analysts can use data from car crashes to quicken emergency response. As a big player in this field, AT&T has invested in tracking sensors on pedestrians and office workers in its move into space planning. A few smart city examples that were showcased at the show include a solar-powered, pedestrian-sized parking meter in Paris whose display gives users information on nearby shopping and points of interest. It can be programmed to accept payments for train tickets as well. In Australia, some meters along the beaches even give surfers insight on the day’s waves.
-
Although I know virtually nothing about operas and am not very interested in them, this article caught my attention. I found it interesting that the same four operas were being performed over and over again. This article discusses how the Metropolitan Opera will be including four very common operas in its 2016-17 schedule. These four operas are Aida, La Boheme, Carmen, and La Traviata. There is a visualization showing these four operas with three other operas that will be performed next season. This visualization shows how frequently these operas have been performed since 1883. There are very few years where none of the four common operas were performed. There are also large spans where those operas were performed in consecutive years. The article shows that the Met is struggling to make new operas for new opera goers, while trying to maintain its current audience of opera lovers. The article also stated that opera houses have experienced a decrease in attendance over the past few seasons. Even though I am not that interested in opera, I would like to see how they progress over the next few years. -
This article is interesting to me because I am a huge basketball fan and a fan of Golden State Warriors star Steph Curry. I think the article does a great job of pointing to several different sets of data that add to the original topic of the team’s odds to break the record for most wins in a season. From the data presented, it seems like they do have a pretty good chance! It’s been interesting to see how data analytics has become such an important part of professional sports, and I’m curious to see how its role changes in the future.
-
The obvious choice for an article about data in recent news is an article about the presidential race. This article discusses Cruz and Trump putting more focus on their “offensive” ad campaigns, which is due partially to the results, or data, of the primary state elections. The candidates use this data to know whether they are the front runner of the race or a straggler.
-
http://www.nasdaq.com/article/bosses-harness-big-data-to-predict-which-workers-might-get-sick-20160216-01321
This article is explaining how companies are using data such as where people shop, whether they vote, and credit score to make predictions about employees’ health. I find this article extremely interesting because these factors seem unrelated to a person’s health. Employers are using these tactics to minimize the costs of medical care and reach their employees about improving their health before it becomes a serious problem. It allows employers to reach their employees and try and help them in a different way than has been used before. It is also a quite controversial topic. A question of privacy comes up because some individuals may not be comfortable with an employer accessing their private information and records. This is a very interesting technique used by employers and I am interested in how it will continue to be used (or not) by companies. -
http://blogs.wsj.com/digits/2016/02/22/most-americans-say-apple-should-help-unlock-terror-suspects-iphone/?mod=ST1
This article talks about America’s opinion on the FBI’s attempts to have Apple create a backdoor into iOS, giving the FBI a more effective means to track down terrorists. The interesting part about this article is not the subject but that 51% of the people surveyed said that Apple should create the backdoor. If this policy were to come to fruition, it would pave the way for a whole new set of policies that would limit our privacy rights. Another situation that could arise is someone getting access to the backdoor who shouldn’t. iOS is one of the most used pieces of software in the world, and if a hacker were to find a way to get access to that backdoor, it could lead to a level of cyber crime that we have never seen before. Worst case scenario, a terrorist organization could gain access to it. This leads me to wonder what percentage of the surveyed population has any kind of knowledge of these types of matters, or if they are the typical American who thinks that anything that can fight terrorism is immediately a good thing for everybody.
-
http://money.usnews.com/careers/best-jobs/financial-adviser
This website was very useful for me because I am pursuing a career as a financial advisor. It gave me the data I needed to further increase my interest in the financial advising career. It gave me the median, high, and low salary; unemployment rate; and number of jobs. It also gave me a scorecard on future growth, stress, work-life balance, and job market. Lastly, it gave me the ranking compared to other occupations. This is all useful information for me, and it helped me decide to continue to pursue my career.
-
http://www.pgatour.com/statsreport/2016/02/22/strokes-gained-northern-trust-open-riviera.html
My article is the strokes gained data from last week’s PGA Tour event in Los Angeles at Riviera Country Club. As a college golfer, I am always trying to use professional data to understand the best way to shoot low scores. I can look at numbers from different situations and players and use them to make myself better when I am in the same position. I use stats such as strokes gained to know percentages and to make risk/reward decisions on the course.
-
Republican-Leaning Cities Are At Greater Risk Of Job Automation
The title of this article immediately caught my attention because I am a Republican. In addition, anything about job industry growth catches my attention because in a few short years, I will be entering the workforce as a college-educated professional. The facts in this article directly relate to me. For example, it states that jobs such as cashiers will most likely be replaced by machines. I am currently a cashier at one of my jobs. I found this article interesting. It says that Republicans should fear for their economy more than Democrats, but my hypothesis would probably be the other way around. I wonder why Republicans have more jobs that can be replaced than Democrats.
-
The government is currently in the process of taking Apple to court. The government wants to be able to extract data from iPhones; however, Apple’s CEO Tim Cook is refusing. Apple is arguing that helping the FBI could endanger Apple’s users, therefore threatening the trust between Apple and its customers. As an Apple user, I find this whole situation very interesting. I can understand the situation from both sides. The FBI wants the data to help solve a terrorist case, but Apple wants to protect and respect its customers. I personally like that Apple is resisting. It shows that they truly care about their customers’ safety and privacy. We should not have to worry about the FBI being able to hack into the data on our devices.
-
http://aiddata.org/subnational-geospatial-research-datasets
I found this open data for international development. I am interested because I learned what open data is in class. This page is about sub-national, geospatial research datasets. It introduces the benefits of open data and how things are going at AidData. There is so much data in it, and I am so excited about open data.
-
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Some quick instructions:
You must complete the quiz by the start of class on February 22, 2016. The quiz is based on the readings for the whole week.
When you click on the link, you may see a Google sign in […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the exercise.
And here is the graphic file you’ll need: Philadelphia Area Obesity Rates.png.
If you don’t have a local saved copy of the Milk vs. Soda exercise (4.2 – Getting Familiar with Tab […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the exercise.
Before you start, save this Tableau file and the studentloans2013 Excel workbook to your computer.
Remember, to download a file, click on the file link above, which takes you to the […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the study guide for the first exam.
Wednesday, Feb. 17 is exam review. (If we are current on material, we may have some time Monday, Feb. 15 as well.)
Format for review is:
Unstructured, […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the exercise.
And here is the spreadsheet you’ll need to complete the exercise [In-Class Exercise 4.2 – FoodAtlas.xlsx].
Make sure you download the Excel file to do this exercise. To dow […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the assignment.
Here is the worksheet as a Word document to make it easy to fill in and submit (along with your Tableau file).
And here is the data file you will need to complete the assignment […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Leave your response as a comment on this post by the beginning of class on February 10, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your op […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Here is the exercise. And in Word format.
Remember to leave a comment on this post with the link to your graphic for our discussion.
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
In preparation for tomorrow’s activity on good and bad visualizations, you might enjoy this talk on the topic. Known as the “king of infographics” David McCandless is a renowned author and designer of spectacular […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 5 months ago
Some quick instructions:
You must complete the quiz by the start of class on February 8, 2016. The quiz is based on the readings for the whole week.
When you click on the link, you may see a Google sign in […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 6 months ago
Here is the exercise.
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 6 months ago
This week we discussed “discovering” relationships in data that weren’t really meaningful (spurious correlations). There is a site dedicated to this called Spurious Correlations.
You can scroll down to the bott […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 6 months ago
Leave your response as a comment on this post by the beginning of class on February 3, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your op […]
-
I think the most important takeaway from this week’s discussion has to do with the idea of bias. In the case of restaurants, the reviews on websites are usually bad because the person felt so strongly that they decided to go and write a review. For instance, last week I went to a good number of restaurants for Restaurant Week in Center City. Before I went to each place, I tried to get a general idea of it by reading the reviews and looking at the pictures. For one of them, the reviews did not do it justice. The place, in my opinion, was fantastic, but the reviews did not reflect it. I usually feel that people who post bad reviews expect a lot, which in turn causes them to never be satisfied. The only way to really counteract this problem is by posting a good review of a place if you had a good experience. This will help to balance out the reviews and make them fairer.
-
The most important takeaway from last week’s discussion was how skewed marketing ads can be. We learned about how a lot of data is biased or is just a sample representation of what is really occurring. So it makes me wonder how much of what I see on billboards and TV is accurate and precise. I am a telemarketer, and every day I use data to determine what territory I am going to call. I may look at a lead sheet and scan over the customer’s name, residence, and list of referrals, and based on that data I will decide whether or not I will call them. I do have a bias favoring the Virginia and Maryland areas when I am calling. This did not cause a signal problem for me, and it hardly ever does, because the data I use is backed pretty strongly by my results and percentages.
-
I think the most important takeaway is about being biased. Some people write bad reviews just because they had a problem the first time they went there. They do not give the place a second chance. For example, last week I went to Dave & Buster’s with some of my friends. Before I went, I looked at a few of the reviews customers had posted. Most of the reviews said some bad things. However, when I went with my friends, it was amazing. The employees were nice and they got everything we asked for. The reviews did not do that place any justice. We had an amazing night, and I would have given it a good review.
-
In my opinion, the most important takeaway from bias and the Signal Problem was that bias can expose you to only certain information, not all of the information available, and can lead you to make the wrong decision. For example, suppose I wanted to buy a video game and went to look at the reviews to see if it was actually a game I wanted to get. The bias could be the genre of the game, because some people might review sports games higher than action games or vice versa, and that’s what I would see in the reviews; or my own preference could get in the way of making a clear decision. This did result in a signal problem, because more people favoring action games could’ve given bad reviews to a sports game, and I would’ve seen so many bad reviews that I could’ve thought a lot of people dislike it when that’s not a clear representation of all the people reviewing this game. One way I could have counteracted this bias is to go to different websites that promote games I wouldn’t normally see, which could expose me to information I wouldn’t normally have seen.
-
The most interesting thing to me was how bias is everywhere. Bias can lead to you only receiving certain information or even getting the wrong information. For example, my roommates and I wanted to go out to dinner at a place none of us had been to. We all looked up the reviews, and there were so many different reviews that you could tell some were definitely biased. Everyone has different experiences and opinions. In reading the reviews, we trusted the people who had made multiple posts and seemed reliable, not the people who had only posted once because they felt so strongly about their experience.
-
Personally, I think that bias within data is much more harmful in the long run, especially considering that bias in data isn’t considered in everyday activities. Recently, I’ve been looking for a pair of boots with very specific qualities: short, flat heel, and a light brown suede. I always use the same browser, and stick to the same few retail websites. As my search for these boots has continued, I have seen an increase in ads for short boots as well as for the stores I have been searching. Not until today did I realize how prominent the Signal Problem is: I was searching once again for boots under $150, and the results of my search were sorted by featured items. However, these featured items all fit within my previous search terms. Although this is not a negative issue, the same bias and signal issues can easily carry over to larger-scale problems.
-
I personally am very careful about the information I use to make decisions. I cannot think of any decision of mine that was influenced by bias, but I believe it does influence me. It is possible that I just did not notice. I believe bias exists everywhere and influences a lot of people’s lives. I will pay more attention to it. However, the Signal Problem is very annoying. Every time I buy or view something online, many similar ads for those items show up when I view other websites. It is just so annoying, and I have not figured out a way to avoid it.
-
I think the most important thing about the last discussion is how easy it is to fake trustworthiness. It might take a while, but when just putting up a profile photo that might not even be real makes the user/reviewer’s words a little more reliable, there is no telling what a dedicated liar can accomplish. Of course, most people don’t have the time or energy to pull off such scams, but it’s a scary thought nevertheless. As for a recent example, just a few days ago I was looking for datasets for the assignment. I noticed that I kept looking for data on a few places I know, like New York, Columbus, or California, to compare with Philadelphia. In addition, because I like living in Philadelphia, I usually look for information that confirms my beliefs and ignore datasets that have Philadelphia being worse off than some other places. I only fixed this problem by looking back after I was done, changing my hypotheses to a fair mix of positive and negative opinions, and looking for corresponding datasets.
-
In my opinion, the most important takeaway from the discussion was that you truly need to look into the specifics of the data. Draw your own conclusions about whether or not the data is reliable, because there is a large amount of unreliable data out there. A prime example is when I went to the fish store yesterday to purchase pet fish. I went to a large aquatic pet superstore that had a great selection. However, when I searched for it I realized that it had a 2.5/5 star rating, and that made me cautious. After looking into the reviews, I realized that the bad ones were quite biased and based on petty complaints. Issues such as overpriced food led to a 1-star review; I agree that isn’t great, but is 1 star really necessary? Therefore, I went to the store anyway, and had a great experience! Awesome selection, and the employees were very helpful. Always remember to truly look into the data, and try to figure out the true reason that it is presented in that manner.
-
In my opinion, the most interesting topic from the discussion last week was how often bias can be found in reviews. After searching for reviews about a couple of different places on Yelp (such as Temple, restaurants, and my previous place of employment), I noticed that there were many reviews with either a one-star or a five-star rating, and rarely any with a rating in between. To add to that, many of those extremely one-sided views came from people who had barely done any reviews before, and for some it was even their only one. This made me realize that those reviews can’t necessarily be trusted, because a person may have been basing their opinion on only one visit to a location. In fact, I even saw a one-star rating for Temple on Yelp from a student who was angered because classes weren’t cancelled last Monday due to the snowstorm. The talk about bias has definitely made me pay close attention to reviews and make sure I take them with a grain of salt unless there is a consistent number of reviews saying similar things.
-
I think the most important takeaway was the institution of bias. Personally, it would take the absolute worst and best experiences imaginable for me to write a review about a place. If I did so, it would be extremely biased, either staunchly in favor or in opposition of the restaurant. Most reviews of restaurants are similar, biased towards favor or opposition. My opinion/review would not be very trustworthy because my experience at the restaurant is exclusive to me. Thus, a lot of reviews should be taken with a grain of salt, because you are not guaranteed the same bad or good experience as others.
-
I believe the most important takeaway from the discussion was identifying trustworthy data. On sites like Yelp, many people are going to be biased one way or another, whether it be for a school, restaurant, or retail store. I find the reviews that explore both the good and bad of a place the most helpful. A rating that is in the middle rather than towards an extreme is a review I am likely to trust more. Recently I have been searching for a place to live next year, and I have been reviewing each complex on whoseyourlandlord.com. Many reviews on this site are extreme opinions, with more negative comments than positive. I am using this data to understand the positives and negatives of living in each of these complexes before I make a decision.
-
I think the most important take-away from last week’s discussion is that you can’t necessarily believe that the information you come across is completely accurate. Especially in the case of the Internet, where people are free to post/edit – there are biases present. I recently used the Google reviews and ratings when looking up a restaurant. I realized that I simply look at the star rating along with the number of “$” indicating the restaurant’s cost, whether it is cheap, moderate, or expensive. I never really took into account that what may be inexpensive to someone may be expensive to another. I also just took anything above a 4 star rating as a good restaurant, and was disappointed when I disagreed. I did not take the time to look at the number of reviews – one 4.5 star review could end up being from the owner of the restaurant, yet it would have indicated, to me anyway, that the restaurant is really good. Next time, I will likely examine the number of reviews and maybe even read a few before making a decision based on the metadata summarized on the Google search page when looking up a restaurant.
-
In my opinion, the most important takeaway from the discussion last week was that not all data is entirely accurate. For example, reviews are loaded with biased data and can’t be credited as trustworthy, reliable data forms. It seems as though most reviews on Yelp and Google are extreme opinions, where individuals post either very negative or positive reviews, and give extremely high or low ratings. These reviews are based off of individual’s personal opinions and experiences and it seems as though only people with extreme opinions take the time to write reviews on these websites. It is very rare to find reviews and ratings that are in the middle, and I personally believe these reviews are most credible. The next time I look to buy something online, I will be certain to read a couple reviews to decide whether or not the data is reliable and not just simply filled with extreme ratings and reviews.
-
Last week’s discussion brought up many good points about data and how a lot of it can display a bias. For example, Google has an option to review a place, and one review for Temple University was rated one star. The person who posted the review was very upset about how Temple University students had class on a very snowy day, and that caused him to leave a bad review. One bad experience at Temple caused him to say that Temple University is unsafe and he doesn’t feel comfortable having loved ones go here, which is completely untrue.
-
I found last week’s readings and discussions very interesting, especially the talk of the different biases in data. This is very important when using data because there are many underlying factors not even considered that could have a significant effect on data. Last week, when my girlfriend and I were deciding where to eat, we took to Yelp to decide. This can be a fantastic resource but it also can be filled with hidden bias. It is very possible that accounts are fake and are used by business owners to laud what’s theirs and to manipulate reviews of the competition. Realizing this, we only paid attention to profiles that seemed credible.
-
The biggest point that stood out to me this past week was that data can be very untrustworthy in certain situations. Our discussion on the Yelp and Google reviews proved that to me. Last week during restaurant week I was picking a place to go and eat. Instead of using the reviews on Yelp I asked a few close friends from around the area about a few different restaurants and got their opinions on the matter because I knew that they would be unbiased and truthful in telling me which restaurant to go to. I also got more in depth with them about what they ordered and suggested. Their reviews could also have a certain bias but I trust them more than I trust the Yelp reviews.
-
The most important strategy that I took away from our class discussion on the Signal Problem was being able to differentiate the truthful reviews from the biased reviews. About a week ago I was looking online for a place nearby to get my hair cut. I found Diamond Cutz and looked through a few reviews. Every single review I looked at was very positive except one. The customer gave a poor rating and no explanation as to why they gave the bad review. I took this as a biased review since it was the only one of dozens and went to Diamond Cutz anyway, and it turned out to be a great place!
-
The most significant knowledge that I gained from last week is that not all data can be trusted as reliable data and not all data correlates with each other. In some case studies a lot of information is collected but not all of the data is useful to solve the problem at hand. Signal also plays a key role in what data and reviews are valid, because we have to take into consideration who is delivering the data. Websites like Yelp and Amazon can sometimes give falsified data because someone’s bias could have a profound effect on their review of the product or service.
-
I believe that both signal problem and bias were two very interesting topics that we covered last week. It’s very evident that individuals can create a signal problem (especially with review based sites) as they are generally going to give an extremely bias review based on a possibly good or bad experience they may have had; if everyone instead were to review no matter their experiences we could have much more reliable data. To counteract issues like these we need to find a way to incentivize people giving their reviews regardless of experience or bias.
-
The most important takeaway from the discussion for me was that you can’t escape bias, and that it’s better to get used to learning how to identify untrustworthy data that might stem from bias. Many reviews you see, whether they are for restaurants, products, services, etc., are comprised of either glowing reviews or nasty ones. It can be difficult to determine which of the “glowing” reviews are trustworthy and which were written by the company owner’s sister-in-law or great-nephew. It can be equally tricky to identify which of the “nasty” reviews were written by people who have a flair for the dramatic. I recently purchased a new laptop, but beforehand I researched extensively on reviews, consumer reports, etc. I definitely had a bias towards Macs, however, and found myself comparing everything against the laptop I really wanted. While I still took the reviews and reports seriously, I will admit that I was trying to research others just to confirm my decision to buy my first-choice Mac. Counteracting bias in general requires that we take everything with a grain of salt, and really the only way to be sure about most things is to try them out for ourselves (restaurants, products, and services).
-
In my opinion, the most important thing I got from last week’s discussion on bias and the Signal Problem is that finding good-quality, reliable data relies first on good information and also, while not always necessary but greatly appreciated, on the time and effort an individual spends in making and delivering that data. Many times short, no-name reviews of restaurants, for example, don’t seem credible because of their illegitimate nature. However, when we see a full name and a carefully explained review noting positives and negatives, we tend to value that review over others, even though a bias will be present. A recent example of how I used data coupled with bias in my decision making was in choosing certain courses for the Spring semester. The data on sites like RateMyProfessors.com sometimes proves accurate when choosing courses that certain professors teach, while at times it might not. However, my bias comes into play when I see certain reviews I like, given what I mentioned above, where an individual takes time to give a detailed explanation of how good or bad a professor is. Sometimes I find the reviews completely contrary to what I read, causing a signal problem with the ratings or data that was accumulated for a professor. A way to counteract my bias in regards to this signal problem would be to give a detailed review myself or to purposely enter a class where the professor’s rating is low and see for myself if the data is an accurate representation.
-
During last week’s discussion on the bias and signal problem I think the most important takeaway was that most data is unreliable because of bias. Just from looking at the different reviews of universities in our area I found that most reviews were from current students or alumni with only positive comments, which is extremely biased. I have been noticing bias in reviews more after this discussion. My friends and I have been trying to find new restaurants to eat at, but when looking at reviews first we mostly find positive reviews featured first even if there are more negative reviews. This just makes me realize I must take everything into account because the internet filters my view of the world without me even knowing it.
-
The most important takeaway from last week’s discussion was how bias can make many reviews unreliable. The reviews are often unreliable because many people write a really positive review or a really negative review based on one experience. You can’t really decide how your experience at a restaurant will be based off another person’s only time at that restaurant. For example, over the weekend I decided to look at some reviews on Yelp for restaurants in Center City. Sure enough, there were numerous one or five star reviews about the restaurants. I chose to go to a restaurant that had a few one star reviews for different reasons to see if I experienced the same problems. I had no reason to complain because both the food and service were great. The best way to counteract a bias you may find on a review site like Yelp is to go to a place in person or ask someone who has been there multiple times.
-
I believe the biggest takeaway from last week’s discussion was the trust in data. Not all data received is necessarily correct and accurate because of bias, which leaves us, the consumers, to try and fact-check the data if we are not sure it is correct. A recent time I used data was when purchasing a MacBook Pro. I did a lot of research about the product and its protection against viruses, as my previous laptop succumbed to a violent virus, resulting in the loss of all my personal documents. I checked many different reliable sources to figure out if a Mac was reliable and, according to all my sources, the MacBook was far more reliable in terms of virus protection and performance in comparison to my previous Dell laptop.
-
The most important part of last week’s discussion was bias. While reading through reviews of places to eat the best cheese-steak, bias was very apparent. The reviews were very extreme, either very supportive of how good the sandwich was or how poor the sandwich was. You never really know how good the cheese-steak is until you experience it for yourself. The best way to counteract a bias is to read reviews from people with profile pictures and people that may give 3-4 star reviews. Reviews that may seem long also might cover every aspect of what is being reviewed so these are more trustworthy.
-
I think one of the most important takeaways is that the facts we are finding on the internet may not always be “fact.” Due to our locations, preferences, past searches, and other factors, the information that we are searching for is tailored to what we ‘want’ to see, versus what may be more true or something else. I have had this happen to me when searching information for a race/ethnicity class – I do not identify as a political party, as I have varied opinions on different topics (most would probably consider me to be more liberal) and, when searching for facts about welfare and who receives them, I would receive statistics that would upset most “republican” ideologies about welfare.
-
In my opinion, the most important takeaway was that it is extremely important to make sure you know where your data is coming from, so you can identify any potential biases that could be associated with it. I recently made a decision to avoid purchasing a video game that I had been looking forward to playing. I used a review from a site called IGN to help me make my decision. In this situation, IGN actually did a good job of avoiding biases by hiring someone unrelated to the company to do the review, because one of IGN’s main reviewers had been involved in the game as a voice actor. Had an IGN employee written the review, they may have had a bias towards the game and given it a positive review that it did not deserve in order to help out their co-worker and give the game positive publicity.
-
The most important takeaway for me from the conversation we had on data biases and signal problems is that you have to take everything with a grain of salt. Do not believe everything you see because more data are usually needed. For example, when looking at Yelp one has to account for the user who posted, the number of reviews the search has, and the content of the reviews. Recently, I used data from an annual report in order to formulate an idea on a company’s future progression. The bias here could be the fact that the company is the entity that wrote the annual report; therefore, I would have to take biases into consideration.
-
I think the most important takeaway from last week’s class was the biases we face when collecting data. Particularly, we face many biases with data retrieved from online sources such as Yelp. Yelp can be a very useful source of data. However, the data found on Yelp can be subject to many biases. We were talking in class about people’s emotions being a cause for bias on sites such as Yelp. If you go to a restaurant after having a bad day and you also have a bad experience at the restaurant, maybe this was just the end to a really horrible day and your emotions made you a little on edge and therefore made the experience worse than it really was. That would cause a bias in your comment that others read when looking for data on the particular restaurant you visited. This is why data that comes from sites like Yelp can’t ever be depended on completely.
-
Along with everyone else, I agree that bias is a topic that stuck with me from the lectures. Biases are everywhere and we cannot escape them for as long as we are living, but it’s biases that can lead to misleading or irrelevant information. As someone who is always on the move or always trying to look for new restaurants or articles to read, I subject myself to data and biases every day. From the lecture, biases have taught me a powerful lesson... question everything! I’m now looking back at all the times I would tell friends about new restaurants, places to hang, or articles I’ve read and how certain I was that the restaurant or place was so cool, or how much I learned from the articles, and it is probably a 50/50 chance that some of the information I interpreted was either false or not credible. This makes me aware of what I put out into the search engines and to look for more resources about similar topics if I want to really invest my time trying to go there or learn about it. Not that what I have read was pointless, but it skewed my judgment because I believed that was the only or popular source. Now I know it is better to have seven different perspectives than one.
-
I think the most important takeaway from last week’s discussion was that bias is everywhere in our lives. In last week’s class, we found bias on Yelp and in Google reviews, and we discussed how biased those reviews can be. In my opinion, bias are unavoidable, because in fact, reviews are very subjective ideas based on personal experiences. For example, I once went to a nice restaurant with a friend, but we argued during dinner, which really influenced my judgement, so if anyone asked me about that restaurant, I would give four stars though it might be worth five stars. Now I try to counteract the bias by only describing the facts. And although it is hard to eliminate or escape bias, I think as long as we are aware that most data are biased, we can drop the misleading data and get the “right” information.
-
The most important thing that I learned in last week’s discussion was that bias is opinion based. I know that sounds redundant but you have to remember that you are not the people reviewing this topic. Even though some people may feel very strongly one way or the other, when there is a multitude of opinions you get to see a general consensus about a certain restaurant, game or show. The problem is that you are not part of the general consensus and may have different tastes than the majority of the people reviewing the item. When I went to look up reviews for a new show that came out this season to see if it was worth picking up, the majority of reviews said the show was boring, slow and generic. Thankfully, the summary of the show was enough for me to give it a shot, and after watching the first three episodes I thought the show to be perfectly paced for what it was aiming for and unique when compared to other shows in its sub-genre. The way I personally counteract bias is by reaching out and getting opinions from people like me, and if I can’t find anyone like that to give me an opinion, I try it myself.
-
I think that one of the most important takeaways from our discussion last week was the Filter Bubble. I never knew that they could/would do that to us! A decision I made recently that I had to research for was the type of drum set I wanted to get. I looked at reviews on the website as well as videos on Youtube so I could see how they sounded and judge for myself. A bias that became present on Youtube was the publishers of the content I was watching about the drumsets. The people who made the drum sets were posting the videos, which only involved good things about the drums. When reading on the internet about them, from forums to places where they were purchased, people had other opinions about certain features that were highlighted in the videos. I began to not pay as much attention to the videos and started focusing more on the written reviews due to the signal problem at hand. The way I counteracted the bias and Signal Problem for my case was that I watched videos of people playing the drum sets to songs rather than just playing to review them. To steer away from user bias due to their drum skills I watched various players on the same sets to ensure I found the right ones. Through this experience I learned that the best data speaks for itself. I also learned that less bias in data leads to more confident predictions and outcomes.
-
I think the most important takeaway from last week’s discussion was to watch out for biased reviews from online websites. It is important to keep in mind that a review may just be based off of one really good or one really bad experience at a certain place. Also, I learned that I should not base a decision off of just one review; I should read multiple reviews. For example, a few days ago I wanted to buy a product from a seller on Amazon, but first I wanted to read reviews on the seller to make sure he or she had a credible reputation and would send me the real product I was looking to buy. Some reviews were good and some were bad, but there was definitely a bias in the responses because most of the reviews were based off of just one interaction with the seller.
-
I think that the most important takeaway from last week’s discussion was that bias is everywhere in our lives. In last week’s class we found that we can find reviews on either Yelp or Google reviews. In my opinion, biases are certain because reviews are very smart ideas based on personal experiences. For instance, before I go see a movie I like to read other people’s comments to see if the movie is good and if it is worth my money.
-
The most important take away I got from class last week was the idea that we must be more careful in looking for bias in the data that we choose to use. While looking at the reviews in class from Yelp and Google I came across a lot of sketchy reviews. When me and my friends were looking at reviews for this restaurant recently I was paying most attention to the negative reviews because I had previously had a bad experience myself which is showing bias. I could have had my friend look over the reviews knowing that she had never been to the establishment before.
-
The most important takeaway from last week’s class was that I got good insight into the filter bubble and how search engines and other popular websites track our activity to sell ads and make money. I also discovered that the enormous amount of bias on the reviewing websites creates signal problems. Before last week’s class I looked at review sites as completely positive and reliable. After actually analyzing popular sites like Google and Yelp I was able to figure out the real deal behind reviews. Now I have a good sense for identifying biased reviews. We recently googled a good place to get a haircut nearby as my friend wanted to get a haircut. After looking up some places nearby and reading reviews about them on Google, we were able to identify the false, misguiding and biased reviews. By analyzing little things like the total number of reviews and the patterns of the reviews, we were finally able to sort out the best option. This would have been impossible if we had trusted all the reviews out there.
-
The most important takeaway from last week’s discussion would undoubtedly be the issue of bias when it comes to data. With this I specifically thought of how bias plays a huge role in me choosing my classes for the semester. I usually finalize my decision on whether or not I’ll remain registered for a class based on what I read about the professor on a rating website. I can honestly say that I trust the opinions of my fellow students enough to change the time and even the day of my class if the professor does not get a good rating. I depend on the data provided from that website highly every semester and would feel genuinely anxious about how my classes would pan out without me knowing a bit about the professor beforehand. There have been times when I’ve tried to overlook certain biases that I see within the information given on a teacher just because a certain class may fit flawlessly into my schedule, yet if it ends up being as bad as stated, I move on to the next class.
-
I think the most important takeaway from class is the importance of online reviews. I grew up in a time with the internet and thus the ability to rate items online. An example is anytime I buy something online. I will always check reviews online. These biased, opinionated reviews will always skew my decision about making the purchase.
-
In my opinion, the most important takeaway from the class discussion is the topic of biases. Recently I have been trying to coordinate a birthday party for a close friend. I decided to use Yelp and other review sites to find a good restaurant to celebrate the occasion. After reading many reviews I started to notice a bias trend: people either loved or hated the places they went to. This caused a problem in my decision making because the data was skewed. Ultimately I chose the restaurant with the most star reviews.
-
What I took away was the importance of realizing that signal problems exist. I never really thought about it until you mentioned the example about the hurricane report. It makes absolute sense. Data can be easily skewed for many reasons, so knowing the source and analyzing it carefully is important. A recent example of how I’ve used data to make a decision is when I decided to go vegan. After watching a few documentaries on Netflix and being exposed to the data on the environmental impact our daily eating habits have, I made a decision to stop eating meat and dairy. The data they emphasized in Conspiracy may potentially be biased since they were trying to persuade their viewers in one form or another. Because the numbers were so large (e.g., they mentioned that it takes 1100 gallons of water to come up with the average portion of beef an American eats daily), I actually did my own research and found that they weren’t far off. This being said, I think bias and signal problems in data are a really important point to factor in when analyzing data from a large source.
-
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 6 months ago
As emailed, today’s session was recorded via Class Capture for anyone unable to make it to class due to snow, ice and travel conditions.
You can view the recording here.
To receive full credit for attendance […]
-
Shana Pote wrote a new post on the site MIS 0855: Data Science Spring 2016 9 years, 6 months ago
Some quick instructions:
You must complete the quiz by the start of class on February 1, 2016. The quiz is based on the readings for the whole week.
When you click on the link, you may see a Google sign in […]