Section 002, Instructor: Larry Dignan

Weekly Question #6: Complete by March 20, 2017

Leave your response as a comment on this post by the beginning of class on March 20, 2017. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your opinions, not so much particular “facts” from the class!

Answer one of these:

We spent a little time in class discussing the article Stupid Data Corruption Tricks.

  1. Have you ever made one of the mistakes listed in the article? Describe what happened.
  2. If you haven’t made one of those mistakes, which one of them do you think is the most important to avoid?

37 Responses to Weekly Question #6: Complete by March 20, 2017

  • I have not been in the position to make one of those mistakes, so I think the most important one to avoid is Number 3: Start working on the database without doing a full backup first. If there is a saved backup, then if something goes wrong there is at least a backup you can rely on. That way you do not lose something that you or others have put hours of work into. It may take an extra minute or so to do it, but it’s better to be safe than sorry.

  • When working for a company like Melmark Inc., a school for kids with special needs, we were collecting data on the kids’ daily behavior to implement plans that could be used for teaching each one of them. The database would then be read by a system that could auto-populate the rebate applications. For the system to read the data, though, it needed to be formatted perfectly. Before my first MIS course, I rarely worked with data and programs like SQL. But if I were to relate to one of these, it would be clicking “yes” without carefully evaluating the message that says “do you want to remove this from the server?” I have definitely made this mistake before in my data analytics class, and at my workplace I deleted things I did not mean to get rid of. I probably did not make this mistake in Excel because I knew how to use Excel. It was very difficult to determine the proper data type while collecting information from the database using the SELECT statement. We were also able to easily corrupt the data by uploading Excel docs twice or not converting them to CSV files. But I have made the mistake before and deleted important information by accident. I think it is very important to avoid clicking “yes” without completely evaluating what the message is saying, because you could lose data and information completely or create much more work for yourself.

  • The most common mistake that I make when manipulating data AND the most important mistake to avoid, in my opinion, is failing to back up data. On many occasions I have forgotten to back my data up and corrupted the file due to incorrect manipulations. Then I would have to restart from square one, when I could have backed up each step of the way to save time and precious insights. Backing up is a must every step of the way, especially when the file contains vulnerable data or the data does not belong to you. Like Anthony said up top, better safe than sorry. It only takes a minute, so make this step instinctual.

  • Fortunately, I have never had the opportunity to make these kinds of mistakes while working with data. However, I think the most important mistake to avoid is Number 1: Click “yes” without carefully evaluating the message that says “do you want to remove this from the server?” Although not with data, I have clicked “yes” on dialogue boxes that pop up on other sites without reading what they say, and it has caused major problems: deleting files, closing servers, etc. Without reading the message properly, files may become corrupted or altogether destroyed. Taking a bit more time in the short run, and being careful, will save a lot of time and effort in the long run. As Ben Franklin was fond of saying, “Haste makes waste.”

  • I think the most important mistake to avoid when dealing with data is number 3: Start working on the database without doing a full backup first. It’s safer and wiser to have a backup plan in case the file corrupts, so that you don’t lose any important data. This process doesn’t take much time either.

  • Many times I have been in a rush to close out a program, or my computer may have restarted. Just when I thought I had saved and backed up all of the data I’d been working on, a dialogue box pops up, and instead of reading all of the details, I skim over the words and choose the option that seems best depending on the situation. Several times I was wrong. Then I found myself not knowing how to make the box reappear, if all or some of the data hadn’t already been deleted. Personally, I believe this is the most “stupid way to corrupt your data” because no matter how many times you back up your data, a simple yes or no click could erase it from the whole system.

  • I have accidentally sorted the data in an Excel file and not included all of the columns. Luckily I noticed soon enough that I could use the back arrow to correct my mistake. Something like this is really easy to do when you’re quickly clicking around, so I always make a backup file.

  • I’ve never made any of the mentioned mistakes, but I think the most important to avoid is #6, missing the data type. This one seems to be the trickiest. You could make an entire spreadsheet, thousands of rows and columns deep, and everything seems fine, but using the wrong data type for just one cell could ruin the whole sheet. For example, using a string when it is meant to be a date/time.

  • The biggest mistake I have ever made was 100% not backing up my documents. My father owns his own IT company, and growing up he ALWAYS told me to save EVERYTHING to Dropbox. And what didn’t I do? Save my documents to Dropbox. The Friday before my 50-page research assignment for Microeconomics was due, my hard drive failed and I lost EVERYTHING. So I had until Monday at 11 AM to type up my 50 pages! I learned my lesson, and everything I do is always saved to Dropbox. I never would have thought my laptop would do this to me. Everyone should back up their documents because it does just happen, even to those who take care of their laptops, like myself.

  • I make the first “mistake” constantly; however, for the way I use this loophole, it is not actually a mistake for me. At my job in the Mayor’s Office, I frequently need to download different map data files and overlay, for example, a map of Philly’s zip codes with a map of Philly’s City Council districts. After I overlay these maps, my program returns the percentage of each City Council district that is made up of each zip code. The issue is that this program returns all this data in a CSV file. So I then need to open the CSV file in Excel so that I can view the data and bring it into whatever final data storage destination I need it to reside in. Many of the mistakes in this article are indeed mistakes; however, there are certainly instances where these mistakes can be helpful as well… such as when viewing my mapping overlay program results.

  • Sometimes I will forget to check the data type, and it is hard to find out what is wrong with the data. So I think it is important to avoid #6, missing the data type, especially with dates and integers, because it is hard to tell the difference and it can ruin the whole data set.

  • I have forgotten the data type before, and nothing happened: no data would sum! It was very annoying, and it took a great deal of time to figure out. I would argue, however, that keeping a clean backup at all times is the most important thing, because if you don’t have a lot of time to dedicate to this issue, at least you can start again from the beginning. Sometimes redoing it will reveal to you what you did wrong the first time.

  • I haven’t made these mistakes, and I think the third mistake, “start working on the database without doing a full backup first,” is the most important to avoid. If you made one of the other mistakes, you could waste some time fixing the problem, but if you started working on the database without doing a full backup, the mistake might cause irreparable loss, and you might not be able to start over again.

  • While I haven’t used Excel before this class, I have used various products and editors such as Adobe Illustrator and Premiere. The two mistakes, numbers three and one, have similar ways of occurring in these programs, and I have made them myself. Not saving frequently enough and not backing up video files can be costly when editing, and automatically clicking yes is a bad habit that has cost me much time in many areas of my online life.

  • I have done Number 10, “Open a CSV file directly into Excel,” many times without even knowing it could cause problems. I suppose in those prior situations the data I was working with wasn’t large enough for scientific notation to kick in, but it is a good tip to know for the future.

  • I have often opened a CSV file in Excel, and I was not even aware that doing so could cause serious problems. Since my past experience with Excel has mostly just been using it for small in-class assignments, I guess there was not much that could have gone wrong. Going forward I will need to be more careful about how this could affect my data.

  • I have used Excel many times, but I have been lucky to not have experienced any of these mistakes. I think the most important to avoid is Number 3: Start working on the database without doing a full backup first. If you forget to back up your work before using it, you risk losing it. You also have to be careful if you mess up because if it is not backed up, you might not be able to start over.

  • A common mistake I have made in the past is to not back up the data that I was working with or, at times, create different versions. This has particularly happened to me on Excel sheets for finance classes, such as intermediate corporate finance, in which I would work on a pro forma without saving my progress at various stages. Had I saved at each stage, then if I made a mistake after a certain stage, I could have accessed the last stage and begun from there.

  • After looking over the mistakes listed in the article, I concluded I have not put myself in a situation where I am vulnerable to those mistakes. That being said, I think the most important one is Number 6. Number 6 is “Miss the data type,” where people often confuse dates with integers. This can greatly affect the way you look at data sources and analyze them. It can result in a snowball effect where the mistakes just build off one another until everything is corrupt. You must pay attention to details and look closely at numbers to make sure you avoid this common mistake.

  • Number 3 is the most important mistake to avoid. Number 3 is to start working on the database without doing a full backup first. Having a full backup is always important because if something happens where the file doesn’t save, you will still have the file available to you. Although this never happened to me with an Excel file, it has happened to me in Word, and I had to start the file over again because I did not make a backup of the file.

  • I have not made any of these mistakes before, but I believe that #6 would have the most impact, since it is quite tricky to catch when dates are stored as integers or integers are stored as dates in a huge data set. You would have to implement clearly defined columns and know the data that you are working with very well in order to avoid this type of mistake.

  • A very common mistake I have made before is number 3: start working on the database without doing a full backup. Often I start work right after receiving the job, and I always assume autosave will cover any accident that happens while working on the database. However, sometimes the system does not save all of the data you have, and in order to ensure you do not lose your data you have to fully back up the database as well as do checkpoint saves along the way!

  • I have not run into any of these problems myself, but the one that seems like the biggest issue is missing the data type. Not having the correct data type can completely mess up an analysis. Since setting the type is one of the first parts of data analysis, mistyping data can be a crucial error.

  • I do not have much experience working with the kind of data mentioned in the article; therefore, I have not made one of the mistakes mentioned. However, I believe the worst mistake would be to not back up your workbook before making edits to it. If you make a monumental mistake in the beginning and don’t realize it until later, all of your work will be compromised because you didn’t save it. Not backing up your work is probably the biggest mistake you can make, whether you are working on something relating to data or not.

  • I’ve made mistakes similar to number 9. Over the years at Temple I’ve had many Excel assignments, and I have definitely copied a formula without properly adjusting the coordinates. This results in incorrect information and a frustrating experience trying to locate the source of the issue. I usually avoid this mistake now after having more experience with Excel.

  • Luckily I have never made any of the mistakes listed in the article, but I think the most serious mistake is #3, doing work without a full backup. I’ve never had this error occur with data, but with basic assignments using PowerPoint and Word there have been times that I haven’t saved a file, only for my computer to shut down or for me to exit out of the tab accidentally, ultimately losing everything I’d spent hours working on. While most of the mistakes have some solution or cleanup that can be applied after the incident occurs, losing material without a backup is essentially hitting the restart button on any work you’ve put time and effort into accomplishing.

  • I personally have not made one of these mistakes, but I feel like number 3 is one to never make. I think backing up is something you should always make sure you do, so as not to lose the work that you have done. Especially if you make substantial changes to your workbook, not saving your work could really put you behind.

  • I have not made any of these mistakes, but I think the most important one to avoid would be working on a database without a full backup. Without running constant backups, you can run the risk of losing hours of work. Not backing up a database can also cause it to be out of date with relevant information because it has not been backed up recently.

  • Of the common mistakes made in Excel, the one I have committed most often is number 9: copy formulas that use relative formulas. Though I do not use Excel that frequently, this is the problem that I run into the most. This problem occurs when I try to copy a formula into another cell and it doesn’t produce an actual value; instead, it gives an error in the cell or group of cells, or it gives a number that isn’t close to what I’m attempting to calculate. I often forget to use absolute formulas and have to go back after the error message occurs.

  • For number 3, I’ve done this a lot in the pre-autosave days. If I’m in a rush the last thing I will think about doing is setting up a backup to a database I’m working on. If it’s not a personal database, I can’t see anyone making this mistake. It’s just too risky not backing up all the data that’s being changed.

  • A few semesters ago, I took digital analytics, which is a required course for advertising majors on the account management and media planning track. Within the class we worked closely with Excel and Google AdWords. I found myself committing a few of the errors mentioned in the article. One that I would commit constantly was copying the wrong formulas, and in return I would get the wrong values. I’ve also accidentally deleted some of my data multiple times and either did not notice until it was too late or picked up on it and had to start all over.

  • I have been doing a lot of work with Excel over the past month, and I have found that it is easy to make mistakes. The problem is that if you do not go over your work with someone else, you may not always see the errors. I think that it is important to explain your analysis to another person in simple language. I have been using the COUNTIFS, SUMIFS, VLOOKUP, and IFS functions. These are so awesome, and they can be confusing when you overthink them. A specific challenge that I have had is that when I want to count groups that match specific criteria, it can be easy to overlook a cell or click on an adjacent cell. This was a lesson I had to learn. My solution is this: I do the analysis, set the project down, and then explain it to another person to help give me clarity while checking my work.

  • I think the most important error to avoid is just hitting yes instead of reading what the question asks. Too many times people are trying to figure out how to do something and just hit yes or allow without reading what they are agreeing to. This can lead to inaccurate calculations and can mess up your data. It is very easy to take the time and read what pop-ups say.

  • I have made the mistake of starting to work on a database before having a backup in place. This proved to be a time-consuming mistake, since I lost all my calculated data and analysis and had to start the project over again. This is a mistake that may be tough to deal with but will absolutely be a learning experience. Sometimes having to redo the calculations can lead to rushing your work and making more mistakes.

  • Working on something without creating a backup is one of the worst things you can do. There are so many factors that can play into something like that going wrong. It’s best to always create a backup.

  • I seldom do things with Excel, so I do not have experience doing something wrong, but I think the most important mistake to avoid is starting to work on the database without doing a full backup first. Having a full backup lets you fix your mistake easily because you can go back to the original data instead of messing all the data up. Once you make a mistake, you can use the original data and do it again, rather than losing the original data and not being able to do anything about it.

  • I’ve made the mistake of opening a .CSV file directly into Excel more than once… Both times it was when trying to consolidate and clean up contact lists. Most notably, things like phone numbers and zip codes were a hassle to find and correct. I’ve also had instances trying to fix exported contacts from Excel to .CSV and trying to import elsewhere (like into an email address book), when the fields weren’t properly labeled and everything just becomes a jumbled mess.
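Several responses above describe opening a CSV file directly in Excel and then having to repair zip codes and phone numbers. The following is only a minimal sketch of one way to look at such a file without any automatic type guessing, using Python's standard csv module; it is not taken from the article or from class. The sample contact list and the function name read_rows_as_text are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical contact list (an assumption for this sketch). The first zip
# code has a leading zero, which is lost if the column is treated as a number.
SAMPLE_CSV = """name,zip,phone
Ann,08540,2155550100
Bob,19122,2675550199
"""


def read_rows_as_text(csv_file):
    """Return each row as a dict whose values are all strings."""
    # csv.DictReader does no type conversion: every field stays text.
    return list(csv.DictReader(csv_file))


rows = read_rows_as_text(io.StringIO(SAMPLE_CSV))
for row in rows:
    # Zip codes and phone numbers print exactly as they appear in the file.
    print(row["name"], row["zip"], row["phone"])

# Converting the field to a number is the step that drops the leading zero.
zip_as_number = int(rows[0]["zip"])
print(zip_as_number)
```

Keeping identifier-like columns as text until a number is actually needed is the design choice here. Inside Excel, the comparable precaution is to import the file through its text or CSV import option and mark those columns as text, rather than double-clicking the file.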

Office Hours
Larry Dignan Alter Hall 232 267.614.6467 Class time: 5:30-8pm, Mondays Office hours: Monday half hour before class, half hour after class or by appointment. ITA: Nathan Pham. Contact via email at