-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 7 months ago
Here are the assignment instructions. Groups MUST be 4 to 5 members. You may not do this assignment on your own or in smaller groups than 5. Note that the date on the assignment is incorrect.
Once we fo […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 7 months ago
Here is the study guide for the second exam. And here’s the more detailed version.
Agenda for the exam will be to:
–Re-form groups for last group project at the beginning.
–Take test
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the exercise.
And here is the Excel workbook you’ll need [Pew Story Data (Jan – May 2012).xlsx]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the exercise.
And here are the workbooks [2012 Presidential Election Results by District.xlsx and Portrait 113th Congress.xlsx]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Leave your response as a comment on this post by the beginning of class on March 26, 2018. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your o […]
-
A KPI I use daily would be my step tracker. It is specific and measurable by the number of steps I take daily. It is achievable because I am able to track the steps and set a goal for the number I want to reach by the end of the day. It is relevant to my life because I like to keep up with my health. Lastly, it is time-variant because I check my steps on a daily basis.
-
A KPI that I have daily access to is an app called “Quality Time” on my phone, which tracks data relevant to my phone usage (i.e. usage duration, number of times I unlock my screen, how much time I’m spending on specific apps, how many times I go to an app a day, etc.). With the app, I try to monitor and limit my excessive phone usage as much as possible. I have desired figures that I wish to achieve. It is specific and measurable, it is relevant to my daily life (and frankly, it’s relevant to everyone), and it is time-variant because I am able to track my daily usage, as well as hourly, weekly, monthly, etc.
-
Heart beat per minute monitor. Heart beats per minute are specific and they are measurable. It is achievable– I can monitor my fitness and decide to get into better shape to lower my heart rate. It is relevant– my heart rate is a direct correlation to my health and well being. It is time variant- I can look at changes in my heart rate over X amount of time.
-
A KPI that I use daily is my step metrics on my Apple Watch. It is specific and measurable because it tells me my exact number of calories burned, time exercised, time standing, and number of steps. It is relevant because activity is an important part of my health. It is achievable because it is based on my regular activity, and I can change it based on my current lifestyle. It is time-variant because I can look at how my activity changes over time.
-
A KPI that I use regularly is the activity tracker rings on my Apple Watch. I have to close three rings every day: one for calories burned, another for minutes exercised, and another for minutes standing. If I start to slack off, I get notifications with instructions on how to complete my rings for that day. The number of calories is specific and measurable, it is achievable because I can always work out more, it’s relevant because it directly impacts my health, and it’s time-variant because I can view my progress every day, week, or month.
-
The KPI I use regularly is an application on my phone called My Fitness Pal that tracks the calories and nutrition I take in daily. This app makes calorie counting convenient and makes it easier for me to stay on a healthy diet. The more I track and log my meals, the easier I find it to make healthier choices to reach my goals. It is specific and measurable because it tracks every meal I log. As I track my meals, it achieves a certain goal that benefits my health, and it is time-variant because I can see my progress at any time.
-
A KPI that I use on a daily basis is my GPA tracker. It is specific and measurable because it ranges up to 4.0. I need to work hard to achieve a higher GPA. It is relevant because I need to show my GPA to future employers, and it can help/hurt my chances of receiving a job/internship. Lastly, it is time sensitive because at the end of each semester grades are finalized, and you need to make sure you were on top of everything.
-
A KPI that I use every day would be the Activity tracker on my Apple Watch. It tracks my movement, exercise, and stand goals throughout the entire day. It gives me challenges to complete every day and allows me to keep track of my fitness. My health is very important to me and this app allows me to stay in check with my performance each and every day.
-
One KPI that I try to use is the customer satisfaction score. When I work at a restaurant as a server, I try to keep up with how much the customers are satisfied with the restaurant’s experience. It is very relevant to what I do. The higher the satisfaction is, the happier I am and the more tips I earn.
-
One KPI that I use everyday would be my GPA. This tells me my grades in school and how I am doing. It is measurable because it is a number that is the average of all my grades. Also, it is achievable because I can try to get a better GPA than what I already have. It is relevant because I am a student and need a good GPA to graduate and get a job. Lastly, it is time-variant because I can look and change it over a day, week, month, or semester.
-
A KPI that I use is my MIS Professional Achievement Points. It tells me my current points and how many I need to graduate. This is specific and measurable: the points are measurable. It is achievable: I can participate in more events to earn more points. It’s relevant: the PA points are needed to let me graduate. And it’s time-variant: I can look at PA points over months or semesters.
-
A KPI I recently started using is the fitness rings on my smart watch. It tells me the amount of calories I burn, the amount of minutes I exercised, and the amount of minutes I stand.
Specific & Measurable: it is an exact measure of activity.
Achievable: I can change the amount of goals I want to reach and it reminds me to achieve them.
Relevant: The rings determine how active I was and that directly correlates to my overall health.
Time: I can see all my rings from the past and determine whether I completed the daily goals and I can compare them with friends and family with a smart watch.
-
One of my daily tasks at work is to mail out acquisition letters to prospective sellers. I have a quota of 250 letters per work day. At the end of the day I divide the number of letters I sent out by 250 to calculate the percentage of my quota that I was able to achieve. When I mail out 250 letters I am at 100%; that is a KPI that I use. This KPI is critical for logging my efficiency as well as for transparency with my boss and co-workers. It is measurable in the sense that it calculates my performance in that specific daily task. I tend to average 250 letters, so it is definitely achievable. The KPI itself motivates success because other people have access to the calculation sheet for this daily KPI. It is also time-phased because it is recorded over the duration of one work day (9-5). If the length of the day is altered, then it’s recorded in the footnotes of my calculation sheet.
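The quota calculation described in this comment can be sketched in a few lines. This is only an illustration: the function name `quota_attainment` and the default of 250 are assumptions taken from the comment's description, not from any actual calculation sheet.

```python
def quota_attainment(letters_sent: int, daily_quota: int = 250) -> float:
    """Return the percentage of the daily quota achieved.

    daily_quota defaults to 250, the figure mentioned in the comment;
    the function name itself is hypothetical.
    """
    if daily_quota <= 0:
        raise ValueError("daily_quota must be positive")
    return 100.0 * letters_sent / daily_quota


# Mailing the full quota gives 100%; mailing half of it gives 50%.
full_day = quota_attainment(250)
half_day = quota_attainment(125)
```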
-
A KPI I use on a regular basis is an app I use at my job to record my hours called When I Work. It is specific and measurable; it saves and adds my weekly hours as I input them each day. It is achievable; I can monitor whether I need to work more or less to keep up with my budget and other financial obligations. It’s definitely relevant; I have to record my hours correctly so that I get paid the right amount and get paid on time. And it is time-variant, as I can see how many hours I’ve worked each day, week, month or year.
-
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Some quick instructions:
You must complete the quiz by the start of class on March 26, 2018.
When you click on the link, you may see a Google sign in screen. Use your AccessNet ID and password to sign […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Data integration
GOP Analytics
Dashboard best practices
The One Skill You Really Need
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the exercise.
And here is the spreadsheet to complete the exercise [In-Class Exercise 8.2 – OnTime Airline Stats [Jan 2014].xlsx].
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
I came across this story about someone on Reddit visualizing his Tinder experience, only to find another person did the same with their 500-day OKCupid outcomes.
Both of the data sets (along with a bunch of others) are on the […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the exercise.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Some quick instructions:
You must complete the quiz by the start of class on March 19, 2018.
When you click on the link, you may see a Google sign in screen. Use your AccessNet ID and password to […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Performance indicators
Tyranny of success: Non profits and metrics
Tracking health
Wearable tech
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Leave your response as a comment on this post by the beginning of class on March 19, 2018. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your o […]
-
Since I do not use these data systems very often, I cannot recall a particular time I made any of these mistakes. Although I have not made these mistakes myself, I think that the most important mistake to avoid would be opening a CSV file directly in Excel. This is very important because if that mistake is made, the whole data set is compromised, and you are forced to do a lot of cleanup.
-
It is vital to use the correct data type in order to avoid data corruption. I often accidentally use the wrong data type when using a type of software that relies on accurate inputs. It is an easy mistake to type a number that’s an integer as a string or vice versa. It is often a beginner’s mistake, but mistakes can still occur even if you’re an expert who’s working with a monumental amount of data. It is important to pay attention to the smallest details because a small error could have a devastating effect on an operation.
-
Unfortunately, I made the mistake of copying formulas that use relative coordinates during my first internship. My task was to replicate and improve spreadsheets, and since I had little prior experience with Excel, I didn’t see the problem with copying formulas to new sheets. This caused a lot of issues going forward because the formulas did not reference the same data in each sheet. I realized the calculations were off, and it took me hours to fix all of the sheets.
-
I have not worked with many data sets, so I have not made any of these mistakes yet; hopefully after reading this article I won’t make any of them either. I think the worst mistake you can make is number 3: start working on a database without doing a full backup first. This can become a big problem if you’re working on the data for a long time and the data does not save.
-
Even though I have very little experience with managing or utilizing data, I have made one mistake over and over: starting work on the database without doing a full backup first. I’m pretty sure that has happened to many people. Once the original data is damaged somehow, they can’t get it back since they didn’t make a backup first. I should be aware of this for the future.
-
I haven’t made one of these mistakes, but I think that the most important ones to avoid are number 6: Missing the data type, and number 3: start working on a database without doing a full backup first. I think these two are the most important to avoid because a missed data type could cause a mess that is frustrating to go back over and fix. Having a backup of the database first is also important: if something happens to your work, you have a backup to go back to.
-
In my past experience working with data, I have made several of the mistakes listed in the Stupid Data Corruption Tricks article. Most notably I have done 1, 3, 6, and 10 throughout most of the times I began working with Excel. I think the most important step someone can take is to always make sure you don’t commit #3. I have had a couple bad experiences where something went wrong and having a backup would have saved me a lot of time and stress. For assignment #3 I was sure to make duplicate copies of the spreadsheet after I had completed each part due to past experiences.
-
I have not made any of these mistakes, as I have little experience with data. However, I believe that number 3, Start working on the database without doing a full backup first, is the most important to avoid. From my work in media, I can vouch that backing up your work is very important. Saving work, and organizing it prior to it is key.
-
I haven’t made any mistakes so far listed in the article, “Stupid Data Corruption Tricks,” because I have not had much experience with Excel. I have made several miscalculation errors on certain projects but none listed in the article. However while reading the article, now I understand what I have to do to avoid certain problems.
-
During the Flash Research test last semester, I unfortunately made mistake number 9: copying formulas that use relative coordinates without fixing them. After I had copied the formula, I moved the table around, and I had to go back to the formula and waste time that I could have used to write the paper.
-
I have not used Excel too often and haven’t made any of the listed mistakes. However, I think the most important mistake to avoid is clicking “yes” without carefully evaluating the message that says “do you want to remove this from the server?”. This mistake is very costly because not only does it delete data, it also deletes metadata and configurations, two key aspects of being able to create good data.
-
Personally, I am not using data consistently enough in my life to be making one of the listed mistakes we have talked about. If I were to guess which mistake I would make though, I would say it is safe to assume I might make mistake number 1 and mistake number 6. I am absolutely one to blindly click “yes” on things without regard for the consequence (I never read the “terms and conditions”, but who does?) which can obviously be problematic, and I also feel like I would certainly miss the data type.
-
I have not used Excel too much except for some times in high school and in this class. I can see how all of the mistakes in “Stupid Data Corruption Tricks” can corrupt the data being used. I believe that the most common mistake would have to be opening a CSV file directly into Excel. By doing so, it can change the data and turn it into different values. This would take a lot of work to fix and to make sure everything is how it should be.
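The CSV problem mentioned here can be illustrated with a small sketch. It assumes made-up sample data (the `zipcode` and `order_id` columns and their values are hypothetical) and uses Python's standard `csv` module, which reads every field as text, to show what is lost when numeric-looking fields are converted to numbers, as a spreadsheet may do automatically when opening a file.

```python
import csv
import io

# Hypothetical sample data: ZIP codes and IDs with leading zeros.
raw = "zipcode,order_id\n08540,0012\n19122,0345\n"

# csv.DictReader keeps every field as a string, so nothing is altered.
rows = list(csv.DictReader(io.StringIO(raw)))
zip_as_text = [row["zipcode"] for row in rows]

# Converting the same fields to numbers drops the leading zero,
# which is the kind of silent change the comment describes.
zip_as_number = [int(row["zipcode"]) for row in rows]
```

Keeping identifier-like columns as text until you have decided on their types is one way to avoid this kind of cleanup later.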
-
I currently do not work with data outside of this class, so I haven’t made any of these mistakes to my knowledge. I feel like one of the worst, and likely the most frustrating, mistakes to make would be Number 3, beginning to work on a database without backing it up first. There is nothing more frustrating than neglecting to save your work as you go, only to lose everything if technology fails you. I’m sure this is especially true when it comes to data, working with large, detailed databases in which details would be hard to revisit, pinpoint and fix.
-
I am not entirely sure if I have made any of these mistakes, but chances are I probably have. I’m not all that familiar with Microsoft Excel; I have used it quite a bit, but mostly for simple work. I might have downloaded a CSV file or two in the past, but I did not even know what a CSV file was before last Monday’s class.
-
I use Excel every day at work. I can relate to many of these problems. The first one is Number 5: “Use a deduping tool with ‘loose’ criteria first”. When doing mail merges for marketing, I’ve learned (the hard way) the importance of having strong, obvious criteria first. Mistakes in this have led me to hours of manual correction. The next one, Number 3: “Start working on the database without doing a full backup first”, has kicked my ass in both work and this very class itself. Once again, I’ve learned this the hard way. I’m appreciative of making and learning from those mistakes now, when the stakes aren’t as high as they will be in the future.
-
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the exercise.
And here is the dataset you’ll need [Vandelay Orders by Zipcode.xlsx].
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here are the instructions (in Word) (and as a PDF). Make sure you read them carefully! This is an assignment that should be done individually.
And here is the data file you’ll need: VandelayOrders(Jan).xlsx.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the exercise.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Some quick instructions:
You must complete the quiz by the start of class on March 12, 2018.
When you click on the link, you may see a Google sign in screen. Use your AccessNet ID and password to […]
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Damn Excel (amen to that)
Data credibility
Data corruption tricks
Clean data top 10
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Here is the study guide for the first midterm exam. Here is also a more detailed version based on notes from class.
-
Lawrence Dignan wrote a new post on the site MIS 0855: Data Science Spring 2018 6 years, 8 months ago
Leave your response as a comment on this post by the beginning of class on March 12. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your opinions, n […]
-
https://www.theverge.com/2018/2/27/17058268/facebook-facial-recognition-notification-opt-out
This article is about Facebook’s new facial recognition feature. It enables Facebook to search for pictures you are not tagged in. I think this is interesting because that means that the company must have an ample amount of data on every single person that is registered with them. This amounts to almost 2 billion people.
-
Octo Telematics Transforms the Insurance Industry with Machine Learning and Analytics Platform
This article explains how Octo Telematics, a leader in telematics for insurance companies, is improving the insurance industry through machine learning. The article goes into detail about how Octo Telematics’ platform, using Cloudera Enterprise, works and how it can process previously impossible volumes of data, aggregating over 11 billion data points from 5.4 million cars every day. It also incorporates Apache Spark, which can analyze over 20 million miles of driving per minute and calculate location, acceleration, braking, idling, collisions, cornering, etc. This article is beneficial to me because as an actuarial science major, my job is all about risk management and calculating prices based on risk. With this new technology, I am able to present more accurate information and give more accurate prices to further minimize risk.
-
http://money.cnn.com/2018/02/22/technology/airbnb-property-types-new-experiences/index.html
This article is about Airbnb’s expansion effort. The company will start to offer more ways to search for properties, adding new tiers, called Airbnb Plus and Beyond by Airbnb, for a luxurious experience, and collections for family and work trips, and will expand to include weddings, honeymoons, and group getaways. With this new effort, Airbnb will expand into a new market, attracting more people into the Airbnb guest database.
-
For this week, I found an article about a data breach. Many companies have failed to protect their data from hackers and have paid significant ransoms. This time, it’s Uber. Uber was unable to protect its drivers’ information back in 2016 but stayed quiet. Pennsylvania sued Uber under a data breach notification law that requires companies to report a data breach within a specific time frame. The maximum penalty can be $13.5 million.
I think companies should, especially in these data-sensitive times, comply carefully with the laws that deal with data. Companies that deal with big data should put protective measures in place beforehand that can detect a breach of their consumers’ data, and should communicate with the public as soon as possible when events like this happen.
Source: https://www.insurancejournal.com/news/east/2018/03/07/482613.htm
-
https://www.economist.com/blogs/graphicdetail/2018/03/daily-chart-2
Since 1950, plastic that has not been recycled or burned has amounted to roughly 4.9 billion tons. Most of that plastic could have been dumped on land in an area the size of Manhattan. Instead, most of that plastic is in the ocean and is not easy to clear. Some computer models estimate that there could be up to 51 trillion microplastic particles floating around the world. Most of this waste comes from developing eastern Asian countries such as China, Indonesia, and the Philippines, mostly due to their limited laws and regulations with respect to waste removal. This is interesting to me because it is such a massive problem for every nation and it seems as if this problem is often overlooked.
-
https://www.forbes.com/sites/forbestechcouncil/2018/03/12/cyber-insurance-analysis-of-problems-related-to-it-risk-insurance/#730a9f6c49bd
This article discusses the challenges of insuring cyber risk. It has become necessary for companies to protect their data, but the insurance sector has not been developed enough to have the products that offer the most appropriate coverage. This is due to the inability to accurately assess cyber risk and the potential damages. The insurance industry can be slow to adapt, which is concerning because the need for data protection is growing exponentially.
-
Artificial Intelligence and Big Data Technologies Drive Business Innovation in 2018
This article talks about the progress of big data and artificial intelligence in the business world. This year might be the year that AI will gain meaningful traction within Fortune 1000 organizations. 93% of executives identify AI as a disruptive technology but there is an agreement that they should use cognitive technologies to stay ahead of their competitors. As an MIS major, I will certainly work with a lot of data in the future. If AI can make predicting outcomes easier, then it will also narrow the job pool for data analysts.
-
Opportunities for Big Data to Improve Energy Usage Across Industries
This article talks about how data can be used to improve energy usage all across the country. Today, there are tons of places where data comes from, and people take this data and see how they can use it. I believe that there is data about energy efficiency and it can be used to benefit our country. This can be done by analyzing the data, and then formulating a plan that improves energy usage. This would have a huge impact on industries, so they can operate the same or better while using less energy.
-
http://tcbmag.com/news/articles/2018/april/small-ball-meets-big-data-inside-the-twins-swing
The article chronicles the Twins’ endeavor into the field of using data analytics and sabermetrics to help them make baseball decisions (which was long overdue). The Twins were an organization that relied heavily on old-fashioned baseball decision-making processes with respect to scouting, free agent signings, trades, and player development, with Terry Ryan at the helm. The transition into analytically minded decision making is no fad; teams are becoming increasingly analytical, as each team tries to gain a competitive advantage in an unfair game. The article talks about new philosophical shifts within the Twins organization, specifically citing the swing changes that Twins’ coaches are teaching the players (in order to lift the ball in the air more, which in and of itself is an engaging topic), and the emphasis on framing pitches (catchers trying to make balls look like strikes), among others.
-
This article shows a data set for the usage of Snapchat in 2017. This interests me because of how often kids use social media nowadays. These numbers show the profit, too. This shows how they are able to gain monetary values through so many different ways.
-
I found this article interesting because when I think about data, I think about its quantity, and how hard it is to find quality data. According to the article, solving Data Quality issues to improve Data Analytics starts with having the business users articulate not what they want. It is interesting that technology isn’t really the problem in these cases; it’s the training provided to data scientists.
-
https://www.cnn.com/2018/03/07/us/applenews-march-madness-perfect-bracket/index.html
This article talks about the chances that someone will have a perfect March Madness bracket. Since there are so many teams and so many outcomes, it would be virtually impossible to predict every game correctly. In the past no one has ever had a perfect bracket, and I would not be surprised at all if I never see someone have one in my lifetime. This data about the NCAA tournament interests me because it is all based on statistics that can prove to be completely irrelevant depending on how two teams play against each other that day.
-
I found this article interesting because the author discusses ways to improve data literacy within organizations, which is a topic we’ve discussed in this course and also one that aligns with my personal career goals. I like the author’s simple approach to easing into becoming a “data organization,” as he says. He advises organizations to start small, with one person or team, and to teach them slowly to build data literacy over time. He also emphasizes the importance of trust and transparency, which I believe are important in dealing with data, because it’s important that people feel comfortable working with the data and presenting it to others.
-
This article is about the Special Election for Pennsylvania’s 18th Congressional District being held on Tuesday. I find it interesting for a few reasons. First, as a Pennsylvania resident this affects me. Second, this is the first time within eight elections that Representative Tim Murphy will not be running for office. Murphy was so well liked in the significantly Republican district that he had no Democratic opposition in the last two elections. The third and final reason I find this article interesting is that it uses collected data and recent polls to show just how close this race is, and what the outcome of the election means for the upcoming midterm elections. In a field as complicated and sometimes downright crazy as politics, it’s nice to have straightforward data to give an insight into the political sphere.
-
https://finance.yahoo.com/news/data-breach-victims-sue-yahoo-united-states-judge-143243547–finance.html
I found this article interesting because it talks about a breach in Yahoo’s data system which happened after the purchase from Verizon. The system was breached three times. This is similar to an article we read for class in the beginning of the semester.
-
I loved this article because it talked about the number of women in the data field and brought up the problem of women being underrepresented in the industry. The author suggests raising awareness among women about the opportunities in the tech field (which is not really female-friendly). For example, change the current situation in schools, where tech and IT classes are not meant for girls and girls are not encouraged to take them. However, according to statistics, women drastically surpass men in careers like data analyst.
-