Section 003, Instructor: Laurel Miller

Weekly Question #6: Complete by October 19

Leave your response as a comment on this post by the beginning of class on October 19. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your opinions, not so much particular “facts” from the class!

Answer one of these:

We spent a little time in class discussing the article Stupid Data Corruption Tricks.

  1. Have you ever made one of the mistakes listed in the article? Describe what happened.
  2. If you haven’t made one of those mistakes, which one of them do you think is the most important to avoid?

35 Responses to Weekly Question #6: Complete by October 19

  • I believe mistake number 3 is the most important to avoid. Failing to back up the database could be a real nightmare if there are any accidents. Saving the progress made should be done frequently, not only for peace of mind but also to prevent all the work from being done for naught.

  • I’ve made the mistake of sorting a spreadsheet and not including all the columns (#4) before and it was the worst thing I could have done. I hadn’t noticed until after I did more parts to my assignment and ended up having to redo the entire thing because I would have had to undo everything I had spent hours doing. The names of organizations were sorted alphabetically, but I didn’t sort the addresses, phone numbers, and websites accordingly, so everything was a big mess.

  • All of the listed “data corruption tricks” can be detrimental to extracting information from the data within Excel; however, I believe sorting a spreadsheet and not including all of the columns, #4, is the most important to avoid. I believe this is the worst mistake because when it happens you may not realize it and may make invalid assumptions about the data set. When only one column or a portion of the columns is sorted, you may draw invalid conclusions about correlations with other columns in the data set without any way to tell.

  • The mistake listed in the article that I have made was #10, opening a CSV file directly in Excel. After making edits to the CSV file in Excel and trying to save it, I discovered that it did not save my edits.

  • I believe that #3 would be my most common mistake. I get into the habit of getting into “the zone” while doing work on any databases. I get so focused that I forget to save the file every 15 minutes or so, and then a technological error happens and I end up losing all my progress. The sad part is, it is a very preventable issue, but lots of people like myself tend to forget about it.

  • When I was creating an expense report at my internship this summer, I was copying information from Microsoft NAV and did not include all the columns. Because the information from NAV was sorted differently than the Excel sheet, I had to copy and paste the correct columns and rows into the correct columns and rows in the Excel file. When copying and pasting information, it is easy to leave out some columns or accidentally delete entire columns, which caused my expenses to be incorrect.

  • Although I can’t recall a time when I personally made any of these mistakes, I would have to say it would be most important to avoid number 3. Number 3 is that you start working on a database without doing a full backup first. This could be a very common mistake, especially if you have worked on a database for a long time without realizing you never saved your changes. When going in to do more work on the database, you might not realize that you never saved changes from last time and just continue working on it, missing a lot of information.

  • In my Digital Analytics and Reporting class, we spend a copious amount of time in Excel. One thing that I always mess up is forgetting to include a row or column in the selection when I’m sorting data. As a result, when answering follow-up questions on homework, I find that I get the answer wrong because the number in (for example) Column J no longer corresponds to Column A after the data has been sorted. It’s a simple mistake but it can be disastrous to your data!

  • Although I have not come across any of these mistakes, I think the most important to avoid would be #1. I find that even when not working with data, this happens in general where the system will try to correct an error by saying something very vague and because of that we tend to let it fix it for us without really knowing what was wrong in the first place. When working with data this is especially important to look out for because by agreeing to letting the server fix your “problem” you could potentially lose all of your data, or come out with incorrect data.

  • I have made the first mistake of clicking “yes” without carefully evaluating the message that says “do you want to remove this from the server?”. I can see why myself and others have made this mistake, and that is because no one actually reads through, instead they just click “yes”. So something has been removed and I had no idea it was from me clicking “yes”. Pop up boxes are annoying and most people, like myself, will click anything to get them to go away.

  • After looking over the “Stupid Data Corruption Tricks” article, I have realized that I have made the sixth mistake, “Miss the data type” many times. In high school I took a computer class which concentrated on applications like Excel, Word, and PowerPoint. During the Excel portion of the year, I made that mistake multiple times, skewing my data and documents. I am also in Statistics for Business this semester and I have an Excel project due every Friday. I also make this mistake in my Statistics class, but need to stop doing it.

  • In my opinion, the most important thing to remember from the article is number three. It is vital to always have a backup for every set of data. We never know whether our work with the data set will disrupt the data. If we do not have a backup, there is a possibility that we might have to start from scratch again, which means re-collecting the data.

  • I think mistake #6, “missing the data type,” is the most important to avoid. It is imperative to know what data you’re working with, and what unit it is measured in. If this mistake happens, all of the data can be skewed, and the analysis can be extremely wrong.

  • Number 3 is definitely the most important one to avoid, because if you do all this work but something happens and you forgot to save it at certain points, then it is as if you didn’t do all that work. This has happened to me, though not with a database. I was working on a 12-page research paper for my criminal justice class and I never saved it while I was typing. My roommate spilled his soda on my laptop and it froze, so all the work I did had to be redone. From that day on, I save my documents after every paragraph and email them to myself just in case something ever goes wrong like that again.

  • Mistake number three, in my opinion, is the most important mistake that should be avoided. Technology is very unpredictable; your computer can shut off, crash, or freeze on you at any moment. And if you do not save your work, you will lose all the progress that you made. Saving your work and keeping a back up seems like a simple mistake that many would not make, but it is definitely the most common and the most costly.

  • Although I have never personally made any of the mistakes listed in the article, I believe #4 would be most important to avoid. If you don’t use all the columns, you’re missing data in your analysis. If you don’t have all the data then all the work you do with it could essentially be useless.

  • I always commit the first mistake. During my summer internship, I once clicked yes on a dialog box that added an extra zero to thousands of transit numbers I had in a spreadsheet about accounts payable. This corrupted the transit number because its routing numbers that add a zero as well as the institution or bank number.

  • The one mistake that I have made many times in the past is #3, “Start working on the database without doing a full backup first.” This past summer I spent several weeks doing work in Excel for my internship. I automatically assumed that the computer systems would already have a backup in place in case I lost my work. However, when I was almost finished with my project I accidentally lost all of my work. When I went back in to get the backup, all the data was messed up and I had to start from the beginning again. Now, whenever I do work on the computer I always check to make sure I have a backup in case something goes wrong.

  • I believe that Number 4, sorting a spreadsheet but not including all of the columns, is the most important to avoid, because data will not be accurate if not everything is included. You may come to a conclusion that is incorrect because you corrupted your data.

  • I think that Number 3 is the most important mistake to avoid. Not backing up your data and then accidentally losing all of your work is one of the absolute worst things that can happen to you, especially when working with a deadline. I think the rule is most important because it applies across the board in a general sense too. If you don’t back up your work, especially when working with large amounts of data, it can be totally detrimental to your entire project. No one wants to have to do all of their work over again. It’s smart to always save a dataset in two or more places.

  • At my job I use Excel all of the time. I don’t necessarily use the formulas and functions; however, I do use the sort and filter features all of the time to sort contact information for our companies and contacts. There have been plenty of times when I thought I chose “select all” while sorting, but really only one column was selected and all of the data got mixed up. I think this is such an easy mistake, and sometimes I don’t even realize it has happened, but it really can mess up all of our data very quickly.

  • While all of the data corruption mistakes are important to avoid, number 3 can definitely cause the most damage. Even if you accidentally corrupt your data in some way, having a backup allows you to fix your mistakes. Not having a backup can result in ruining an entire project and countless hours of work wasted.

  • I have never used Excel before. My first time is for this class, so I have never made these mistakes before. But, I think sorting a spreadsheet, but not including all the columns (#4) is the most important to avoid. I think this because it can be very simple to forget to include a column. Forgetting a column will cause all the data to be skewed which is a very huge problem when cleaning data.

  • I have made mistake #8, which is “Accidentally use VLOOKUP’s fuzzy match,” when working with Excel. I was dealing with some spreadsheets for a class on analyzing data. I kept forgetting the “false” parameter, and I had trouble getting the right range of matches until I noticed what was wrong.

  • I have not made any of these mistakes before because I have never used these programs, and have rarely used Excel, if ever. This class is the first time I have used Excel for nearly anything. To me, however, the worst mistake to make would be number 2. This is the worst because it affects others. The example provided was how it affects clients A and B. To me, a mistake that affects clients is the worst one to make.

  • The only mistake I have made that was mentioned in the article was using the incorrect data type, specifically when entering a date. It is probably the easiest mistake to make out of all ten mentioned in the article. It is also a tough one to catch right away; it usually takes looking over your data to catch it.

  • I cannot say I have made any of these mistakes, not because I am that good at it, but because I do not use these tools or work with data enough to come across mistakes like these. However, after reading the article, I think the most important mistake to avoid is number 2: “What system am I logged into?”. Not realizing which system you are in and making a substantial number of changes can only lead to detrimental mistakes. Not to mention all the wasted time spent working on one system, leading to double the work down the line once you realize the huge error that was made.

  • The mistake I have made was #3. In high school I had to do a project, and I left some of the columns blank and told myself that I would come back to them. I never wrote it down, and in a week or so I finished the project with empty columns without knowing it. Since I left them blank, some of the data didn’t appear, making it very inaccurate and causing me to get points taken off. After making this mistake, I never leave anything blank, or when I have no choice but to leave it blank, I write myself a note.

  • I never made any of these mistakes in Excel or anything like that, however, number 3 is the most pertaining to me. Because, backing up a file or object is something I continually forget to do, via backing up my phone before it resets, or simply not saving an essay every few paragraphs, and then have the computer freeze and lose it all. So, while I never made the mistake in a data program, backing up a file is something I forget to do in other circumstances.

  • In the past I have made the mistake of trying to sort a spreadsheet but not including all of the columns. I took a 3-credit course on Microsoft Excel where we did a lot of data sorting, and it was easy to make simple mistakes like that one. In the end, not including all of the columns can really mess up the entire data set you are trying to work with, and it is important to catch your mistakes.

  • I have made mistake number 6 on the list. There have been a few times where I have transferred data over to Excel from external sources and forgot to format it correctly afterwards. This caused a huge mess in the spreadsheet, resulting in inaccurate data and unclear results.

  • I think the most important to avoid is starting work on the database without doing a full backup first. The first reason is that if you don’t have a complete backup, you may mess up the whole data file and not have a chance to redo it. The second is that the computer is not always stable, so when we work with data we need to think about emergency situations like a power outage or a crash.

  • I think the mistake I relate to the most would be #3, solely because that is something I always do. Whenever I work on anything, including a dataset, I completely forget to save it. I get so into what I am doing that I disregard backing it up or saving it, thinking that nothing major will happen. However, saving the dataset can be crucial in case something crashes.

  • I’ve made a mistake on #3 in the past and I believe it is the most important to avoid because of loss of productivity and time. I think in the past my computer implemented one of its scheduled restarts and I had a word document open. It turns out the backup save didn’t work and it gave me backup data from a different time period instead of the most recent time period so I lost some data. So all the data that I had recently vanished, which means productivity for that time frame was all in vain unless I could remember some concepts and the time I had invested was essentially all for nothing. It is hard to gain back lost time.

  • I have never made any of these mistakes; however, I believe that number 1 is probably the most important to avoid because you can remove large amounts of data and configurations with a simple click. It is very easy and tempting to just click yes. So I would say that this is the most important one to avoid.

Office Hours
Laurel Miller (instructor) 1:00-2:00pm, Tuesdays and Thursdays, Speakman Hall 207F or by appointment.
ITA information
Rebecca Jackson (ITA) By appointment only. Email: