Section 003, Instructor: Laurel Miller

Weekly Question #6: Complete by October 19

Leave your response as a comment on this post by the beginning of class on October 19. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your opinions, not so much particular “facts” from the class!

Answer one of these:

We spent a little time in class discussing the article Stupid Data Corruption Tricks.

  1. Have you ever made one of the mistakes listed in the article? Describe what happened.
  2. If you haven’t made one of those mistakes, which one of them do you think is the most important to avoid?

58 Responses to Weekly Question #6: Complete by October 19

  • I believe mistake number 3 is the most important to avoid. Failing to back up the database could be a real nightmare if there are any accidents. Progress should be saved frequently, not only for peace of mind but also to keep all the work from being for naught.

  • I’ve made the mistake of sorting a spreadsheet without including all the columns (#4) before, and it was the worst thing I could have done. I didn’t notice until after I had done more of my assignment, and I ended up redoing the entire thing because fixing it would have meant undoing hours of work. The names of organizations were sorted alphabetically, but the addresses, phone numbers, and websites were not sorted with them, so everything was a big mess.

  • All of the listed “data corruption tricks” can be detrimental to extracting information from the data within Excel; however, I believe sorting a spreadsheet without including all of the columns, #4, is the most important to avoid. I believe this is the worst mistake because when it happens you may not realize it and may draw invalid conclusions from the data set. When only one column or a portion of the columns is sorted, you may infer correlations with other columns that do not exist, without any way to tell.

  • The mistake I have made from the article is #10, opening a CSV file directly in Excel. After making edits to the CSV file in Excel and trying to save it, I discovered that it did not save my edits.

  • I believe that #3 would be my most common mistake. I get into the habit of getting into “the zone” while working on any database. I get so focused that I forget to save the file every 15 minutes or so, and then a technological error happens and I end up losing all my progress. The sad part is, it is a very preventable issue, but lots of people like myself tend to forget about it.

  • When I was creating an expense report at my internship this summer, I was copying information from Microsoft NAV and did not include all the columns. Because the information from NAV was sorted differently than the Excel sheet, I had to copy and paste the data into the correct columns and rows in the Excel file. When copying and pasting information, it is easy to leave out some columns or accidentally delete entire columns, which caused my expenses to be incorrect.

  • Although I can’t recall a time when I personally made any of these mistakes, I would have to say the most important one to avoid is number 3: starting to work on a database without doing a full backup first. This could be a very common mistake, especially if you have worked on a database for a long time without realizing you never saved your changes. When going in to do more work on the database, you might not realize that you never saved changes from last time and just continue working on it, missing a lot of information.

  • In my Digital Analytics and Reporting class, we spend a copious amount of time in Excel. One thing that I always mess up is forgetting to include a row or column in my selection when I’m sorting data. As a result, when answering follow-up questions on homework, I get the answer wrong because the number in (for example) Column J no longer corresponds to Column A after the data has been sorted. It’s a simple mistake but it can be disastrous to your data!

  • Although I have not come across any of these mistakes, I think the most important to avoid would be #1. I find that this happens in general, even when not working with data: the system tries to correct an error by saying something very vague, and because of that we tend to let it fix the error for us without really knowing what was wrong in the first place. When working with data this is especially important to look out for, because by agreeing to let the server fix your “problem” you could potentially lose all of your data or end up with incorrect data.

  • I have made the first mistake of clicking “yes” without carefully evaluating the message that says “do you want to remove this from the server?”. I can see why I and others have made this mistake: no one actually reads the message through; instead, they just click “yes”. So something was removed, and I had no idea it was because I clicked “yes”. Pop-up boxes are annoying, and most people, like myself, will click anything to get them to go away.

  • After looking over the “Stupid Data Corruption Tricks” article, I have realized that I have made the sixth mistake, “Miss the data type” many times. In high school I took a computer class which concentrated on applications like Excel, Word, and PowerPoint. During the Excel portion of the year, I made that mistake multiple times, skewing my data and documents. I am also in Statistics for Business this semester and I have an Excel project due every Friday. I also make this mistake in my Statistics class, but need to stop doing it.

  • In my opinion, the most important thing to remember from the article is number three. It is vital to always have a backup for every set of data. We never know whether our work with the data set might disrupt the data. If we do not have a backup, we might have to start from scratch again, which means collecting the data all over again.

  • I think mistake #6, “missing the data type,” is the most important to avoid. It is imperative to know what data you’re working with, and what unit it is measured in. If this mistake happens, all of the data can be skewed, and the analysis can be extremely wrong.

  • Number 3 is definitely the most important one to avoid, because if you do all this work and something happens before you have saved it, it is as if you never did the work. This has happened to me, though not with a database. I was working on a 12-page research paper for my criminal justice class and never saved it while I was typing. My roommate spilled his soda on my laptop and it froze, so all the work I did had to be redone. From that day on I save my documents after every paragraph and email them to myself in case something like that ever goes wrong again.

  • Mistake number three, in my opinion, is the most important mistake to avoid. Technology is very unpredictable; your computer can shut off, crash, or freeze on you at any moment. If you do not save your work, you will lose all the progress that you made. Failing to save your work and keep a backup seems like a simple mistake that few would make, but it is definitely the most common and the most costly.

  • Although I have never personally made any of the mistakes listed in the article, I believe #4 would be most important to avoid. If you don’t use all the columns, you’re missing data in your analysis. If you don’t have all the data then all the work you do with it could essentially be useless.

  • I always commit the first mistake. During my summer internship, I once clicked yes on a dialog box that added an extra zero to thousands of transit numbers I had in an accounts payable spreadsheet. This corrupted the transit numbers, because the zero was added to the routing number as well as to the institution or bank number.

  • The one mistake that I have made many times in the past is #3, starting to work on the database without doing a full backup first. This past summer I spent several weeks doing work in Excel for my internship. I automatically assumed that the computer systems would already have a backup in place in case I lost my work. However, when I was almost finished with my project, I accidentally lost all of my work. When I went back in to get the backup, all the data was messed up and I had to start from the beginning again. Now, whenever I do work on the computer, I always check to make sure I have a backup in case something goes wrong.

  • I believe that Number 4 is the most important to avoid. If you sort a spreadsheet but do not include all of the columns, the data will not be accurate because not everything was included. You may come to an incorrect conclusion because you corrupted your data.

  • I think that Number 3 is the most important mistake to avoid. Not backing up your data and then accidentally losing all of your work is one of the absolute worst things that can happen to you, especially when working with a deadline. I think the rule is most important because it applies across the board in a general sense too. If you don’t back up your work, especially when working with large amounts of data, it can be totally detrimental to your entire project. No one wants to have to do all of their work over again. It’s smart to always save a dataset in two or more places.

  • I’ve never made any of these mistakes, but number 3 is definitely the worst mistake to make. Anything out of your control can happen when you’re working, and you don’t want that external factor to be the reason you have to start your work over. You can also make a mistake yourself at any time, so it’s still very important to back up your work frequently.

  • At my job I use Excel all of the time. I don’t necessarily use the formulas and functions; however, I do use the sort and filter features all of the time to sort contact information for our companies and contacts. There have been plenty of times when I thought I chose “select all” before sorting, but really only one column was selected and all of the data got mixed up. I think this is such an easy mistake, and sometimes I don’t even realize it has happened, but it really can mess up all of our data very quickly.

  • While all of the data corruption mistakes are important to avoid, number 3 can definitely cause the most damage. Even if you accidentally corrupt your data in some way, having a backup allows you to fix your mistakes. Not having a backup can result in ruining an entire project and countless hours of work wasted.

  • I have never used Excel before. My first time is for this class, so I have never made these mistakes before. But I think sorting a spreadsheet without including all the columns (#4) is the most important to avoid. I think this because it can be very easy to forget to include a column. Forgetting a column will cause all the data to be skewed, which is a huge problem when cleaning data.

  • I have made mistake #8, which is “Accidentally use VLOOKUP’s fuzzy match,” when working with Excel. I was dealing with some spreadsheets for a class on analyzing data. I kept forgetting the “false” parameter, and I had trouble getting the right matches until I noticed what was wrong.

  • I have not made any of these mistakes before because I have never used these programs, and I have rarely used Excel, if ever. This class is the first time I have used Excel for nearly anything. To me, however, the worst mistake to make would be number 2. This is the worst because it affects others. The example provided was how it affects clients A and B. To me, a mistake that affects clients is the worst mistake to make.

  • The only mistake I have made that was mentioned in the article was using the incorrect data type, specifically when entering a date. It is probably the easiest mistake to make out of all ten mentioned in the article. It is also a tough one to catch right away; catching it usually takes looking over your data.

  • I cannot say I have made any of these mistakes, not because I am that good at it, but because I do not use these tools or work with data enough to come across mistakes like these. However, after reading the article, I think the most important mistake to avoid is number 2: “What system am I logged into?”. Not realizing which system you are in and making a substantial number of changes can only lead to detrimental mistakes. Not to mention all the time wasted working on the wrong system, which leads to double the work down the line once you realize the huge error that was made.

  • The mistake I have made was #3. In high school I had to do a project, and I left some of the columns blank and told myself that I would come back to them. I never wrote it down, and a week or so later I finished the project without knowing the columns were still empty. Because I left them blank, some of the data didn’t appear, which made the project very inaccurate and caused me to get points taken off. Since making this mistake I never leave anything blank, or when I have no choice but to leave it blank, I write myself a note.

  • I have never made any of these mistakes in Excel or anything like that; however, number 3 is the one that pertains to me most. Backing up a file or object is something I continually forget to do, whether it is backing up my phone before it resets or simply not saving an essay every few paragraphs and then having the computer freeze and losing it all. So, while I have never made the mistake in a data program, backing up a file is something I forget to do in other circumstances.

  • In the past I have made the mistake of trying to sort a spreadsheet without including all of the columns. I took a 3-credit course on Microsoft Excel where we did a lot of data sorting, and it was easy to make simple mistakes like that one. In the end, not including all of the columns can really mess up the entire data set you are trying to work with, and it is important to catch your mistakes.

  • I have made mistake number 6 on the list. There have been a few times where I transferred data over to Excel from external sources and forgot to format it correctly afterwards. This caused a huge mess in the spreadsheet, resulting in inaccurate data and unclear results.

  • I think the most important to avoid is starting to work on the database without doing a full backup first. The first reason is that if you don’t have a complete backup, you may mess up the whole data file and not have a chance to redo it. The second is that computers are not always stable, so when we work with data we need to think about emergencies like a power outage or a crash.

  • 1. I think the mistake I relate to the most would be #3, solely because that is something I always do. Whenever I work on anything, including a dataset, I completely forget to save it at any point. I get so into what I am doing that I disregard backing it up or saving it, thinking that nothing major will happen. However, saving the dataset can be crucial in case something crashes.

  • I’ve made mistake #3 in the past, and I believe it is the most important to avoid because of the lost productivity and time. I think my computer ran one of its scheduled restarts while I had a Word document open. It turned out the backup save didn’t work: it gave me backup data from an earlier point instead of the most recent one, so I lost some data. All my recent work vanished, which means the time I had invested was essentially for nothing unless I could remember some of the concepts. It is hard to gain back lost time.

  • I have never made any of these mistakes; however, I believe that number 1 is probably the most important to avoid, because you can remove large amounts of data and configurations with a single click. It is very easy and tempting to just click yes. So I would say that this is the most important one to avoid.

  • Working in payroll tracking logs, I’ve made the mistake of Number 7: Put values in fields that are supposed to be pointers or references. The references are used in another sheet to create charts and graphs that help summarize and visualize data in the first sheet. When I accidentally put a value in a reference cell, I didn’t understand later on why my charts seemed a bit off. It would take some time to find the problem and fix it.

  • None of the mentioned mistakes has happened to me. However, I believe that the most important mistake to avoid is starting to work on the database without doing a full backup first. Backing up is so important because things could go wrong. What if a device crashes? All the data will be gone. That is why backing up is really important.

  • None of the above has happened to me. The most dangerous, in my opinion, is missing columns in Excel. It is very hard to notice, which makes it a bigger problem. This seems like the issue that affects the most people.

  • I think number 3, not backing up your database, is the worst mistake to make. If you don’t back up your database and then make any of the other mistakes on the list, you are giving yourself more work. Without a backup you can’t go back to square one once you realize you have messed up your data. This could be very costly, depending on how much incorrect work you have done on your data.

  • Personally, I cannot recall a specific instance where I made one of these mistakes; however, I am more than willing to bet that I have at some point. The most detrimental mistake to one’s work here would likely be Number 3, as I think we can all relate to working very hard on a project or paper and losing it just because we neglected to save it or back it up. This has easily cost me several hours and can rob anybody of their time.

  • The most common, number 3, has happened to me both with smaller data sets that I was simply looking around in and with large data sets I was working on for a client project. Regardless of the magnitude of the error, I found it extremely frustrating after spending hours cleaning and recoding data. I learned to set up the autosave feature on my personal computer and to set up an autosave on my work computer that saves directly to a server or some type of cloud storage. Once it happened a second time, I learned to be cautious about preserving my work.

  • Yes, I have made one of the mistakes listed in the article. Normally the answer would be no, but since I’m taking the online Excel course I’ve been using Excel more often. The mistake I made was the one with the VLOOKUP function. During input I forgot to put false in, and it messed my work up. Thankfully there’s an undo button.

  • I have not personally made most of the mistakes listed in this article, but I believe number 3 is the one I can relate to the most. There is no greater pain than taking hours of your time researching, typing, and confidently progressing on a paper for class and then failing to save before your computer has to restart for some error or your battery runs out. Hours of your time and energy wasted, all you can do is rage, put your head down, and get back to work.

  • I think Number 3 is an important mistake to avoid, because a backup creates a checkpoint while you are working for hours. If there is a sudden shutdown or a mistake in your system, you can go back to the last point you were working on.

  • I think the most important thing to do to avoid data corruption is to back up your data before you start to sort it. I feel that data is usually found to be corrupt during the process of organizing it. If you have a backup of the data, it is easy to go back and look through it to figure out what part was changed incorrectly, whether data got deleted, miscategorized, etc. Having a backup is so important for so many reasons, and I think it is the number one step when looking at data of any kind.

  • Number 3: Start working on the database without doing a full backup first.

    Although Microsoft Office now makes periodic backups in case the app suddenly closes or the computer shuts down, it does not back up every second of what you did. I don’t know the time interval it uses, but I do know this mistake is always best to avoid, since you do not want all your hard work to go down the drain.

  • I think the most important mistake listed is #3, because even though this step might be one of the easiest, some people just always forget. Technology is so useful, but you never know when something can go wrong. If all your work suddenly vanishes and you didn’t save it, you’ll have to start all over again, running the risk of leaving certain things out.

  • As I see it, the most critical thing to remember from the article is number three. It is essential to always have a backup of every data set. We never know whether our work with the data set could disrupt the data. If we don’t have a backup, we may have to start from scratch again, which means collecting the data all over again.

  • The only one that I have committed is #6. This is a gen-ed class for me, and it’s the first time I have worked with data sets. During our assignment with the data about the cars, I misunderstood what the category was and used horsepower instead of liters when talking about the cars’ engines. Other than that, due to my inexperience with the topic, I can’t say I have done any of the other things on the list.

  • I have made a mistake listed in this article. I have made mistake number 6 multiple times. I mistype the data, and it always comes back to confuse me and wastes my time fixing it. I would think mistyping the data is the most important to avoid, because if you put the data in incorrect fields you have to do more work to fix it.

  • In my opinion, the number 1 mistake to avoid from this list is actually number 1, especially for me. The main problem with this error is the amount of data and configurations that could be lost. Knowing myself, I would be very tempted to just hit yes, or I might hit it by mistake.

  • I’ve honestly never made any of these mistakes. The one that I think is most important is Number 3: Start working on the database without doing a full backup first. You have to have multiple backups when you’re working on a database. There are so many different ways to back up a database that it doesn’t make sense not to. If you put 8 hours of your time into building a database and it suddenly crashes, that’s potentially millions of dollars wasted.

  • An error I have made before is #6. I missed the data type. When I was entering ZIP codes for a data set, I forgot to change the data type to ZIP code. This is very much like the class example, but it is a mistake I have made before. I noticed my mistake upon revision, and now I always double-check when using Excel.

  • One of the data mistakes that I have made before is starting to do work on a database without doing a full backup first. When I was interning with the school district, I did not realize that a data restore may not always work even if the CRM software has continuous backup. I would always do a checkpoint save when doing my work, but I did not realize I should have conducted a full backup when I started my work on a database.

  • The mistake I think is most important to avoid is number 3, starting to work on the database without doing a full backup first. There is nothing worse than having your computer shut down and losing all of your work, which is why I believe it is important to back up your data right after you complete something. Even though new software can restore some of your work when your computer restarts, it’s important to back up your work because you never know whether it can be fully recovered.

  • I haven’t made any of these mistakes but I think the worst one would be number 3. That is because it is important to do a full backup so you won’t lose any data.

  • Since I cannot recall a specific instance in which I have made one of these mistakes, I’ll say that the most important one to avoid is #3: Start working on the database without doing a full backup first. Not backing up an original version of a database before making edits can be catastrophic to its function and distort the information that is drawn from the database and used in analysis, even if the edits are apparently undone.
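Several of the responses above describe mistake #4, sorting one column while the rest stay in place. The following is only a minimal sketch in Python of why that scrambles the rows. The organization names, phone numbers, and function names are hypothetical and made up for illustration; they are not from the article or from any response.

```python
# Hypothetical contact list: each row pairs an organization with its phone number.
rows = [
    ("Zeta Co", "555-0103"),
    ("Alpha Inc", "555-0101"),
    ("Mid LLC", "555-0102"),
]


def sort_names_only(rows):
    """Mistake #4: sort the name column alone and leave the phone column where it was."""
    names = sorted(name for name, _ in rows)
    phones = [phone for _, phone in rows]  # still in the original order
    return list(zip(names, phones))


def sort_whole_rows(rows):
    """Sort entire rows by name so each phone number travels with its organization."""
    return sorted(rows, key=lambda row: row[0])


broken = sort_names_only(rows)
correct = sort_whole_rows(rows)

print(broken)   # names are alphabetical, but paired with the wrong phone numbers
print(correct)  # names are alphabetical and every original pairing is intact
```

In the broken result every value still looks plausible, which matches what several responses point out: nothing in the sheet signals that the rows no longer line up.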


Office Hours
Laurel Miller (instructor) 1:00-2:00pm, Tuesdays and Thursdays, Speakman Hall 207F or by appointment.
ITA information
Rebecca Jackson (ITA) By appointment only. Email: