Section 004, Instructor: Larry Dignan

Weekly Question #6: Complete by March 19

Leave your response as a comment on this post by the beginning of class on March 19, 2018. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your opinions, not so much particular “facts” from the class!

Answer one of these:

We spent a little time in class discussing the article “Stupid Data Corruption Tricks.”

  1. Have you ever made one of the mistakes listed in the article? Describe what happened.
  2. If you haven’t made one of those mistakes, which one of them do you think is the most important to avoid?

28 Responses to Weekly Question #6: Complete by March 19

  • Considering I do not use these data systems very often, I cannot recall a particular time I made any of these mistakes. Although I have not made them myself, I think the most important mistake to avoid is opening a CSV file directly in Excel. If that mistake is made, the whole data set is compromised, and you are forced to do a lot of cleanup.

  • It is vital to use the correct data type in order to avoid data corruption. I often accidentally use the wrong data type when using software that relies on accurate inputs. It is easy to enter an integer as a string or vice versa. It is often a beginner’s mistake, but it can still occur even if you’re an expert working with a monumental amount of data. It is important to pay attention to the smallest details because a small error could have a devastating effect on an operation.

  • I have opened CSV files directly into Excel and had phone numbers or tracking numbers get turned into scientific notation. It’s usually a simple fix, but for a more complex dataset than what I used I can see where there would be problems. Something else that’s similar is when I work on a document on the desktop version of Word or Excel, and then try to upload it to a cloud program like Sheets. Copying and pasting from one to the other in that case can result in formatting issues.

  • Unfortunately, I made the mistake of copying formulas that use relative coordinates during my first internship. My task was to replicate and improve spreadsheets, and since I had little prior experience with Excel, I didn’t see the problem with copying formulas to new sheets. This caused a lot of issues going forward because the formulas did not reference the same data in each sheet. I realized the calculations were off, and it took me hours to fix all of the sheets.

  • I have not worked with many data sets, so I have not made any of these mistakes yet. Hopefully, after reading this article, I won’t make any of them either. I think the worst mistake you can make is number 3: start working on a database without doing a full backup first. This can become a big problem if you’re working on the data for a long time and the data does not save.

  • I have not made any of the mistakes listed in the article, since I am not working with any data outside of this class. The mistake I feel should be most avoided is number 6, missing the data type. I chose this mistake because it is the one I would most likely make if I were working with data. It is important to know the data type so that when someone else is trying to make sense of your data, they are not confused by misinformation you are presenting to them.

  • Even though I have very little experience with managing or using data, I have made one mistake over and over: starting work on a database without doing a full backup first. I’m pretty sure that has happened to many people. Once the original data is damaged somehow, they can’t get it back because they didn’t make a backup first. I should be aware of this in the future.

  • I haven’t made any of these mistakes, but I think the most important ones to avoid are number 6, missing the data type, and number 3, starting work on a database without doing a full backup first. Missing a data type can create a mess that is frustrating to go back over and correct. Having a backup of the database is also important: if something happens to your work, you have a copy to go back to.

  • Overall I have had little experience working with data, and I have made mistake number 6, missing the data type, repeatedly. This has caused a lot of problems for me. Because I have done so little work with data, it is hard for me to catch the little things, and I am often unsure of what the mistake could be. The software will often return errors because it cannot read that data type for what I am asking it to do, which makes it hard for me to figure out what the problem is.

  • In my past experience working with data, I have made several of the mistakes listed in the Stupid Data Corruption Tricks article. Most notably, I made mistakes 1, 3, 6, and 10 many times when I began working with Excel. I think the most important step someone can take is to always make sure not to commit #3. I have had a couple of bad experiences where something went wrong and having a backup would have saved me a lot of time and stress. For assignment #3, I was sure to make duplicate copies of the spreadsheet after I had completed each part because of those past experiences.

  • Sadly, I am guilty of mistake number 3. I have made the mistake of working on a data set without doing a full backup or creating checkpoint saves. During my summer internship freshman year, one of the tasks I was assigned was creating an extensive clothing order for athletes we were sending overseas. I collected the data myself and wrote it down with pen and paper, but when I was approximately 90% done putting it into Excel, Excel quit unexpectedly and I had to re-enter every clothing item requested, sizes, mailing addresses, etc. before I could submit the order to the manufacturer. If I had created a checkpoint save, I would have saved myself a significant amount of time and stress.

  • I have not made any of these mistakes, as I have little experience with data. However, I believe that number 3, starting work on the database without doing a full backup first, is the most important to avoid. From my work in media, I can vouch that backing up your work is very important. Saving work, and organizing it beforehand, is key.

  • I haven’t made any mistakes so far listed in the article, “Stupid Data Corruption Tricks,” because I have not had much experience with Excel. I have made several miscalculation errors on certain projects but none listed in the article. However while reading the article, now I understand what I have to do to avoid certain problems.

  • During the Flash Research test last semester, I unfortunately made mistake number 9: copying formulas that use relative coordinates without fixing them. After copying the formula I moved the table around, so I had to go back and correct the formula, wasting time that I could have used to write the paper.

  • Through working with data or data sets in different courses I’ve taken, I have encountered a couple of the mistakes listed in this article. Number 6 and number 1 on the list are the most common mistakes I’ve dealt with before. Most commonly number 1 though, clicking yes on the dialog box without carefully evaluating the message, is a mistake I’ve made because I’m so used to hitting ‘Yes’ to save a file among other things, that sometimes I hit ‘Yes’ before fully reading the message.

  • I have not used Excel too often and haven’t made any of the listed mistakes. However, I think the most important mistake to avoid is clicking “yes” without carefully evaluating the message that says “do you want to remove this from the server?”. This mistake is very costly because not only does it delete data, it also deletes metadata and configurations, two key aspects of being able to create good data.

  • Personally, I am not using data consistently enough in my life to be making one of the listed mistakes we have talked about. If I were to guess which mistake I would make though, I would say it is safe to assume I might make mistake number 1 and mistake number 6. I am absolutely one to blindly click “yes” on things without regard for the consequence (I never read the “terms and conditions”, but who does?) which can obviously be problematic, and I also feel like I would certainly miss the data type.

  • I have not made any of these mistakes. I think the most important one to avoid is number 3, start working on the database without doing a full backup first. If you want to work on any data, you should do a full backup just in case something goes wrong. You never know what can happen: you can start working on something, all of a sudden lose everything because it was not saved, and have to start all over again.

  • I never considered the difference between a CSV and an Excel file. I used to just assume that I had to fix the numbers (dollars, phone numbers) every time because I wasn’t putting in raw data. However, the data had been corrupted all along. I think today people assume everything is easily connected, like how Apple’s data is seamless across its platforms. However, we never consider how the data is packaged, so we don’t realize that we need to double-check the data we are transferring and opening.

  • I use Excel every day at work. I can relate to many of these problems. The first one is Number 5: “Use a deduping tool with “loose” criteria first”. When doing mail merges for marketing, I’ve learned (the hard way) the importance of having strong, obvious criteria first. Mistakes in this have led me to hours of manual correction. The next one, Number 3: “Start working on the database without doing a full backup first” has kicked my ass in both work and this very class itself. Once again, I’ve learned this the hard way. I’m appreciative of making and learning from those mistakes now, when the stakes aren’t as high as they will be in the future.

  • I have not used Excel too much except for some times in high school and in this class. I can see how all of the mistakes in “Stupid Data Corruption Tricks” can corrupt the data being used. I believe that the most common mistake would have to be opening a CSV file directly into Excel. By doing so, it can change the data and turn it into different values. This would take a lot of work to fix and to make sure everything is how it should be.

  • I have not made any of the mistakes listed in the article, at least not that I know about so far. I can see why these mistakes could be made. Some are little errors that someone might not notice at first, but they all put the proper data at risk. In the future I hope not to make any of these mistakes now that I am more aware of them from the article.

  • I have committed a majority of the mistakes mentioned in the article. Missing the data type is my most common mistake. I remember that during a Finance exam, when faced with a question dealing with percentages, I wrote them as whole numbers. Excel treats Numbers and Percentages as different data types, so when I tried to find the answer, all the references were incorrect because I didn’t change the data type from ‘Number’ to ‘Percentage’. This also happens with date-time and integer values, as Excel usually rounds these by default, which leads to incorrect answers because most instructors want answers rounded to 2 decimal places.

  • I have not worked with data much in my college career or personal life, but I have made some of the mistakes listed in the article. Instead of making a full backup, I have always been inclined to start working on data without backing it up first. Without a proper backup, data can be misplaced or lost. These are minor errors that may lead to significant issues. Also, I realized that I need to create more space.

  • Over the summer, I had the opportunity to intern for Brinker Capital (a wealth management fund, headquartered outside of Philadelphia). As an intern, my whole day was spent working with Excel. The biggest issue I faced (I had very little Excel training at the time and was thrown right into the fire) was the conversion of files after being imported into Excel. So yes, I have come across the issue of the conversion, or lack thereof, of CSV files. While it was not overly detrimental to the work I was doing, it was certainly very annoying. While this definitely isn’t the only mistake I have made that is listed, it was absolutely one of the most irritating.

  • I currently do not work with data outside of this class, so I haven’t made any of these mistakes to my knowledge. I feel like one of the worst, and likely the most frustrating, mistakes to make would be Number 3, beginning to work on a database without backing it up first. There is nothing more frustrating than neglecting to save your work as you go, only to lose everything if technology fails you. I’m sure this is especially true when it comes to data, working with large, detailed databases in which details would be hard to revisit, pinpoint and fix.

  • I have not made any of the mistakes in the article, but one mistake that you should avoid making is missing the data type (Number 6). Careless mistakes like this can have a severe impact on your overall data. When the solution seems peculiar, depending on the amount of data, it’s hard to find the mistake you made: either you will have to go through everything to find that one small mistake, or you will believe the solution and continue with that data.

  • I am not entirely sure whether I have made any of these mistakes, but chances are I probably have. I’m not all that familiar with Microsoft Excel. I have used it quite a bit, but mostly for simple work. I might have downloaded a CSV file or two in the past, but I did not even know what a CSV file was before last Monday’s class.
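
Several responses above mention phone numbers or tracking numbers changing when a CSV file is opened directly in a spreadsheet. The sketch below is one rough way to see why, using only Python's standard library. The sample data and the helper names (SAMPLE_CSV, read_as_text, naive_numeric) are made up for illustration, and the float conversion is only an assumption standing in for an automatic "treat it as a number" step, not a description of what Excel does internally.

```python
import csv
import io

# Hypothetical sample data: one phone number and one long tracking number
# with leading zeros. Neither value comes from the article or the class.
SAMPLE_CSV = "name,phone,tracking\nAlice,2155550123,009876543210987654\n"


def read_as_text(csv_text):
    """Parse CSV text and keep every field as a string (no type guessing)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def naive_numeric(value):
    """Stand-in for an eager numeric conversion (an illustrative assumption)."""
    return float(value)


rows = read_as_text(SAMPLE_CSV)
tracking = rows[0]["tracking"]

# Kept as text, the identifier is unchanged, leading zeros included.
print(tracking)

# Converted to a number, the leading zeros are gone and the value is
# displayed differently, so it no longer matches the original identifier.
print(naive_numeric(tracking))
```

The point of the sketch is that identifiers such as phone and tracking numbers are safer when treated as text from the moment they are read, which matches the advice in the responses about checking the data type before opening or importing a file.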


Office Hours
Larry Dignan
Alter Hall 232
267.614.6467
Class time: 5:30-8pm, Mondays
Office hours: Monday hour before class, half hour after class or by appointment.