-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here is the exercise
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here’s your reading for the week ahead:
Analytics Beautiful Game
Descriptive Predictive
Watching you at work -
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here is the exercise.
And here is the spreadsheet you’ll need for the exercise [In-Class Exercise 12.2 – Sentiment Analysis Tools.xlsx].
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Nathan’s office hours to help with any questions about the online class yesterday will be from 12:15 pm – 1:45 pm this Thursday (April 13) at Alter 236G. Time can also be used for any group trying to refine, nail […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Leave your response as a comment on this post by the beginning of class on April 17, 2017. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your o […]
-
A data-driven service I regularly use is Amazon.com. If you exported this data to a CSV, each row would represent a customer order. Examples of columns would be order ID, total, number of products, names of products, first name, last name, address 1, city, state, zip code, country, shipping method, shipping cost, order date, ship date, credit card number, and coupon/promo codes.
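As a rough illustration of the "one row per customer order" layout described here, the following minimal Python sketch writes a few orders to CSV text. The column names are a hypothetical subset of the ones listed, and the sample values are invented for the example; nothing here reflects Amazon's actual export format.

```python
import csv
import io

# Hypothetical subset of the columns named in the comment (assumed names).
ORDER_COLUMNS = [
    "order_id", "total", "number_of_products", "order_date", "shipping_method",
]


def orders_to_csv(orders):
    """Return CSV text with a header line and one row per customer order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=ORDER_COLUMNS)
    writer.writeheader()
    writer.writerows(orders)
    return buffer.getvalue()


# Invented sample data, for illustration only.
sample_orders = [
    {"order_id": 1001, "total": 42.50, "number_of_products": 2,
     "order_date": "2017-04-10", "shipping_method": "standard"},
    {"order_id": 1002, "total": 15.99, "number_of_products": 1,
     "order_date": "2017-04-11", "shipping_method": "two-day"},
]

csv_text = orders_to_csv(sample_orders)
print(csv_text)
```

Each dictionary becomes one row, and the header line fixes the column order, which is the same row/column split the comment describes.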
-
A data driven service I use regularly is the ESPN app to look up stats of specific NFL players. Those stats could be easily exported into an Excel spreadsheet, and examples of information that could go into columns would be player’s name, position, and team. From there, depending on the position, it could be tackles, sacks, receptions, rushing yards, touchdowns, and passing yards.
-
I use the ESPN fantasy app on a regular basis, especially during the NFL season. There I can see a variety of stats from selected players on my “Fantasy team”. There I could categorize each player into columns and then separate their production into rows. Production gives my team points, and classified production includes such things as TDs, receptions, interceptions, rushing yards and much more.
-
I use Amazon all the time to purchase everything from textbooks to shoes. A column could be an item, say a pair of shoes. There could even be multiple columns for each style–like a separate column for the same brand for sandals and one for sneakers. The rows could include styles, different vendors selling the same product, price, shipping options and reviews of the vendor selling the items.
-
Rate My Professor serves as a great example for the weekly discussion. If rendered in a database, the columns would include professor first/last name, location, school, would-take-again rating, level of difficulty, and hotness. The user information would be stored in a db as the user ID of who made the comment and their existing comments and ratings on the site.
-
A data-driven source that I use regularly is the bleacher report app. Similar to other sports apps, it compiles all stats and advanced analytical stats, especially basketball. Each row would represent each respective player in whatever league a consumer is looking for. Each column would represent each statistic – points/rebounds/assists/steals per game, win shares, player efficiency rating, etc.
-
A database that I use frequently is the online library catalogue. If I were to create a spreadsheet to compile the data, the columns would contain information on the book/CD/DVD titles, author name, locations where the item is found (i.e., which library or whether it is checked out), and a brief description of the item. The rows would include individual reviews of the items.
-
A data driven service that I use regularly is WordPress. If I were to measure the posts/articles in a spreadsheet, I would record the post date, time, title of article, category of topic, likes, and comments, and then I would have a rating for each of the articles as a personalized measurement (1-3). In the rows I would have the user name. This would allow me to see who is providing content on a consistent basis, how the market responds, and what frequency produces what type of results.
-
A data driven service that I use regularly is Google Maps. If I were to store the data into a spreadsheet, the rows would represent each person that uses Google Maps. The columns would be most visited places, their top rated places, lowest rated places, where they spend the longest amount of time, usage of the app (bike, public transit, car, or walking), and their preference of route options.
-
A data driven source I use regularly is the Forever 21 credit card collection. I work at Forever 21 and we have a weekly goal for signing people up for a credit card. To store the data in a spreadsheet, the rows would be the names of the cardholders, and the columns would include their credit limit, their monthly bill due date, the date they signed up, their address, telephone number, and email.
-
A data-driven service I use is online banking. Each column would be different accounts, and each row would represent detailed account information, such as account number, current balance, credit line, credit available, transaction history, and the information about each transaction, like transaction amount, posting date, description, transaction type, transaction status, and so on.
-
A data-driven service I use daily is Snapchat. If I wanted to store my usage data on a spreadsheet, the columns could be the name of the person I sent a snap to, whether it was a picture or video, time, date, and whether or not there was a geotag added on the picture. The rows would then be the detailed information for each category. The spreadsheet would be a daily chronological log of all of the snapchats I’ve sent.
-
A data-driven service that I use daily is Twitter. If the data from this service were transformed into a spreadsheet, it would be a representation of my Twitter activity and interactions. A row would be an individual tweet, and some columns would include the send date, the send time, how many impressions the tweet had, how many people liked it, how many people retweeted it, and how many people responded to it.
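To make the "one row per tweet" idea concrete, here is a small Python sketch that models a row as a record and sums its interactions. The field names, the `total_engagement` helper, and the sample numbers are all assumptions made for illustration; they are not Twitter's own schema or metrics.

```python
from dataclasses import dataclass


@dataclass
class TweetRow:
    """One spreadsheet row: a single tweet and its counts (assumed fields)."""
    send_date: str
    send_time: str
    impressions: int
    likes: int
    retweets: int
    replies: int


def total_engagement(row):
    """Sum of likes, retweets, and replies for one tweet (hypothetical metric)."""
    return row.likes + row.retweets + row.replies


# Invented sample rows, for illustration only.
rows = [
    TweetRow("2017-04-08", "09:15", 120, 4, 1, 0),
    TweetRow("2017-04-09", "18:40", 300, 10, 3, 2),
]

# The row with the highest combined interaction count.
most_engaging = max(rows, key=total_engagement)
print(most_engaging.send_date, total_engagement(most_engaging))
```

Keeping each tweet as its own row makes column-wise questions, such as which tweet drew the most interaction, a one-line lookup.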
-
A data-driven service I use irregularly is Amazon.com. If I store the data for the service in a spreadsheet, each row will represent an order. Columns will be order ID, item, categoryID, date, customerID, date/time, number of items, price, total price, payment method, coupon (yes/no), shipping method, and whether the customer is a member of Amazon (i.e., Amazon Prime).
-
A data-driven service I usually use is the New Yorker. If the data from the activity on this site were represented in a spreadsheet, the columns would be: name of user, date, duration on page (when you started on a page and when you left), articles visited, and advertisements visited. This would help the New Yorker better cater content to each individual user.
-
A data driven service I usually use is Blackboard. I guess the best way to summarize the spreadsheet in this case would be by sorting the columns into classes and then the rows into material for each respective class. This could include grades, updates, and assignments. The only difficult thing about this would be the vast variety in the classes. Submissions would be tough as well. However, it is definitely doable.
-
A data-driven service that I use regularly would be Amazon.com to purchase all types of items. Each row in the spreadsheet would represent a different customer order. Some of the data columns would be the order number, the date the order was placed, the total of the order, the name of the person ordering the item, the address where the order is being shipped, the name of the item, the number of items, the payment method and the estimated delivery date.
-
A data driven service that I use is Amazon. In Excel, each column would be a category related to the order, which would include data such as a unique identifier like “OrderID”, and then supplementary data such as “Price”, “Quantity”, “Order Total”, “CustomerName” and so on. Each row would fill in this necessary data.
-
Snapchat is a data-driven service most people use daily. If I were to collect the data, some columns I would include would be the type of message (picture, video, or chat message), who sent the message, who received the message, and the time and date it was sent, received, and opened. Additional columns would be an individual’s Snapchat username, email, snap score, and, if applicable, the user’s date of birth, whereas the rows would be each individual Snapchat account.
-
A data driven service that I use frequently is Mind Body; Mind Body is the system that is used at the climbing gym that I work at. If you were to pull the data from the service and insert it into a spreadsheet, the rows would contain the clients’ names. Some columns that would be included are number of customer visits, member/nonmember, status of client (active or archived), and number of purchases.
-
A data driven service I use frequently is Netflix. The rows would be the shows and movies I watch while the columns may be number of seasons watched (if a tv show), number of minutes spent watching the tv show or movie, how long I browsed before choosing that tv show or movie, whether or not I have watched the tv show or movie before, the way I watch (laptop/smart tv/cellphone).
-
Snapchat is a service I use regularly that is very data driven. There are a multitude of data types that could come from Snapchat. There would be location, snapchats sent, sender, receiver, time sent, camera used, times viewed a Snapchat story, times a Snapchat story has been viewed, and number of Snapchat articles read. Snapchat has a multitude of ways to store data, and that is why they are a thriving tech giant.
-
A data driven service that I use is Apple Music. There are many types of data in iTunes such as song name, artist, album, and number of plays for each song. You are able to view all of the properties of each song to find all of this data.
-
A data-driven service that I use is Yahoo Finance. It stores the stock’s price, % change in price, bid/ask price, etc. You are able to view the history of a stock from the day of its IPO, as well as the currencies of other countries.
-
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Some quick instructions:
You must complete the quiz by the start of class on April 17, 2017.
When you click on the link, you may see a Google sign in screen. Use your AccessNet ID and password to sign […] -
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here is the exercise.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here’s your reading for April 17:
Sentiment analysis
Unstructured data
Facebook no clue -
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here’s the video and WebEx walkthrough for the April 10 class. Apologies in advance for the production value, but you’ll get the gist and we can follow up on April 17. I kept it quick. Nathan’s in class exercise […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Some quick instructions:
You must complete the quiz by the start of class on April 10, 2017.
When you click on the link, you may see a Google sign in screen. Use your AccessNet ID and password to […] -
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Here is the exercise.
Here is the excel spreadsheet you will need to complete this exercise [In-Class Exercise 11.2 – NCAA 2013-2014 Player Stats]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years ago
Leave your response as a comment on this post by the beginning of class on April 10, 2017.
Leave a post about your group project:
What is the subject of your group project?
Which of your fellow […]-
Our group project will consist of the analysis of smartphone ownership (Pew Research) worldwide through the manipulation of survey data. We will leverage Excel, SPSS, and Tableau for data manipulation and visualization. My group mates are Craig Stetchak, Phillip Aaron, Izzy Sarlo, and Anthony Petrole.
-
Our group will consist of Fei Qing, Yaning Wang, Yixuan Zhou, and Man Shu Wong. We will be analyzing the 2016 presidential campaign fundraising. Our data will be downloaded from the Federal Election Commission. The goal of the assignment is to examine the financial strategy used during the campaign with data analysis tools such as Excel and Tableau.
-
Our group is myself, Patrick Granquist, George Raymond and Jillian Foster. We do not have an idea yet but we will after this class period.
-
Our group will be analyzing who has healthcare in the US. We will examine the data and make a heatmap of the country to show where the highest concentration of insured/uninsured Americans live.
-
-
Group Member: Imani West, Alex Fortebuono, Daishaun Grimes, Sai Gangisetty & Ben Bucceri
We will be analyzing the upcoming NFL Draft based on the statistics provided by the 2017 NFL Combine.
-
Subject: home court advantage in college basketball – is the home team more likely to win? Does playing at home significantly affect offensive or defensive statistics of either team?
Group: Ben Bucceri, Sai Gangisetty, Daishaun Grimes, Alex Fortebuono, Imani West
-
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Leave your response as a comment on this post by the beginning of class on April 10, 2017. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your o […]
-
I often wonder what happens to airplane waste and how much it weighs. According to the article, airline passengers generated 5.2m tonnes of waste in 2016. This is a significant amount of waste, and it bothers me because I travel a lot and consider myself one of the people who contribute to airline waste. The article states that the waste includes toilet waste, wine bottles, half-eaten lunch trays, unused toothbrushes, etc. I felt worse after reading that because I thought the food I did not touch would at least be recycled; I did not know it was thrown away. The article also states that this costs the industry $500M per year, a significant amount of money that could be reduced and put toward lowering ticket costs. I hope they solve this issue and recycle unused products instead of throwing them away, which could lower airfare for everyone.
https://www.theguardian.com/sustainable-business/2017/apr/01/airline-food-waste-landfill-incineration-airports-recycling-iberia-qantas-united-virgin -
Cybersecurity has become an increasingly prevalent problem in the past couple years, especially in an increasingly digitized society. Even so, Americans are still a little unsure of cybersecurity topics. For instance, only 54% of surveyed Americans are able to correctly identify a phishing attack, which could give up personal data. Additionally, only 46% understood that email is not encrypted by default, and 39% of users are “aware that internet service providers (ISPs) are able to see the sites their customers are visiting while utilizing the ‘private browsing mode.'” From a basic standpoint, adults are less likely to identify a secure password from a list, making their accounts even more vulnerable. With the vast amount of data on the internet and in the cloud, etc., this really poses a problem for our future security.
-
My article explains how analytics have taken over the game of basketball and are now relied on way too much. In the last 5-7 years, NBA GMs and scouts have started using advanced statistical analyses on potential draft prospects to try to predict whether they will pan out and contribute to their team. Mark Cuban is a notable owner and businessman, and he talks about how analytics are soon going to be inferior to AI. https://www.si.com/tech-media/2017/03/31/mark-cuban-analytics-overrated-technology-sports
-
http://usatodayhss.com/2017/a-few-surprises-in-the-data-behind-single-sport-and-multisport-athletes
I found this article quite interesting because my brother is actually going through the process of selecting colleges and he is an athlete. He is a soccer player; however, he was a multi-sport athlete for most of his life until high school. My mom wanted him to be a multi-sport athlete to appeal to college scouts more, but the data from this article shows that soccer is often a single-sport commitment. I think sports such as soccer and gymnastics have more single-sport athletes because they can be played year round. Football and lacrosse are seasonal sports, so participants are often encouraged to play another sport in their “off season” to stay in shape.
-
This article deals with how cybercriminals look to tax time and data obtainable on LinkedIn and Facebook to get information on employees. By getting information easily obtainable on the internet such as employer, position and 401ks from companies that post them online, cybercriminals are targeting company HR leaders, calling to get information such as social security number, salary amount, etc. They can then use this information to steal the employee’s identity and obtain credit in their name. HR people need to be more cautious of who they give information to; just because someone identifies themselves as an HR leader doesn’t mean they are being honest.
-
https://www.theguardian.com/technology/2017/apr/08/speed-reading-apps-can-you-really-read-novel-in-your-lunch-hour
This article talks about the history of speed reading and how it has become popular. It lists some presidents who practiced speed reading, including Kennedy, Nixon and Carter. The article also shows an example of a resource to help with speed reading. However, it costs $4.99; it is an application that uses RSVP (Rapid Serial Visual Presentation) to progressively feed text to the user. Lastly, the article found that attempting to read over 600 words per minute will lower the comprehension rate to below 75%. This article was interesting to me because, as a college student, I tend to try to speed read my readings. That may not be good because it might lower my comprehension of the reading.
-
This article is about jobs in the US. The US added 98,000 jobs in March, per the Bureau of Labor Statistics. This is a sharp decline from February, and the figure falls far below most economists’ expectations. This article, however, reassures us that one slow month of job growth doesn’t change the fact that for a long time, job creation in the US has been very strong. In fact, the unemployment rate currently sits at about 4.5%, which is the lowest it has been in over a decade. Additionally, the labor force participation rate increased. Hourly earnings have increased since 2012 as well. Overall, despite a slow month in job creation, the US labor market is as strong as it’s been in a very long time. It was interesting to me because a lot of people I know were troubled by March’s numbers, but this article reassured us that the labor market is doing just fine.
-
http://www.tomsitpro.com/articles/oracle-big-data-cloud-services,1-2981.html
I found this article quite interesting because I recently did some research on Amazon Redshift. It is a data warehouse like Oracle. As a competitor of Redshift, Oracle offers some new features, such as combining SQL and NoSQL and enhanced Natural Language Processing, which can reduce the amount of time spent preparing data for analysis.
-
http://www.nbc12.com/story/35113409/gamestop-investigates-possible-data-breach
This article details a recent data breach suffered by GameStop, a video game retailer. The company’s website was hacked and the credit card information of its customers was believed to be put up for sale on a website. The article discusses the different ways you can protect and monitor your financial data in the chance that your data was involved in the hack.
-
http://www.mastersindatascience.org/industry/health-care/
This article talks about data in the health care industry. Even though I am a business major, the health care industry has always been one of my main interests. There is always new and upcoming data in health care, and it is really important because it can impact people’s lives immensely. This article talks about prices, medicines, new devices and more that could impact the industry.
-
This article talked about the slowdown in the U.S. job growth rate in March, which fell short of expectations. However, the author explained that there was no need to worry too much about these data, and listed some possible reasons for the slowdown, such as weather and a change in the unemployment rate.
-
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Hadoop for non-geeks
A note about that first reading: It’s a bit dated and Hadoop has advanced since that article. Much of the focus in the open source community has been on side projects tied to Hadoop. […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Here is the exercise.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
This NYT article on how Google researched teams and their effectiveness is worth a read.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Here is the study guide for the second exam. Nathan will hold exam review at 8:30 am – 10:00 am on Friday (03/31/2017) at Breakout room Alter 236C
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Here are the assignment instructions. Groups MUST be 4 to 5 members. You may not do this assignment on your own or in groups smaller than 4. Note that the date on the assignment is incorrect.
The as […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Here is the exercise.
And here is the Excel workbook you’ll need [Pew Story Data (Jan – May 2012).xlsx]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2017 8 years, 1 month ago
Some quick instructions:
You must complete the quiz by the start of class on March 27, 2017.
When you click on the link, you may see a Google sign in screen. Use your AccessNet ID and password to sign […]