Instructor: Jing Gong

R/RStudio Related Questions/Issues/Errors/Bugs

R/RStudio is powerful, but it also has a steep learning curve and requires a lot of trial and error. Throughout the rest of the semester, if you experience any difficulty with R/RStudio, get an error message, or are struggling to fix a bug, you can post a comment here to ask for help. Anyone who knows the solution (including me) can reply. This way everyone can benefit from the discussion.

48 Responses to R/RStudio Related Questions/Issues/Errors/Bugs

  • Let me start with a common question from students.

    Question: For assignment #7, when I try to open the OnTimeAirport-Jan14.csv file, R Studio would not allow me to open because the file is too big. Is there any reason for this?

    Answer: Yes. The file is too big, so RStudio will not let you open it directly. To get familiar with the file, you can open it in Excel or Notepad instead. Just make sure that you do not make any changes to the file.

    • I cannot get the script to run after trying every suggestion. I need a school computer to complete this assignment. Is RStudio available on any other computers on campus besides Fox? If not, can we get an extension on the assignment so we can use a Fox computer?

      • I am not sure if RStudio is available for other computers outside Fox. You still have until Monday before 3:00pm to work on the assignment.

  • When I go through the in-class exercise on my personal computer I get > setwd("C:/Users/Nick/Downloads") instead of setwd("~/My Data Folder /Temple Course Docs/MIS2502/R/RFiles/Test"). Then I get an error message and the script won’t run. Any help would be appreciated.

    • It is fine to have setwd("C:/Users/Nick/Downloads"). This shows the directory you were working in, probably because you put your R file in C:/Users/Nick/Downloads.
      You probably haven’t changed everything I mentioned in the email I just sent. Go through the checklist, make all the changes, and see if you still get an error. If you still get errors, post the first error message you got and I will take a look.

  • Ok guys. I have been trying to do assignment #7, and I am running into the same problem as described in the first question on this chat.
    Using a Mac Air (not sure if that matters), I opened the file from the community site and saved it as a CSV file to the R-files folder that I created. I am able to view the file in Excel as a CSV file. How do I get to work with the file in RStudio? This is not clear to me. Do I change the inputFilename in the TestScript.r file, where the variable value is "USBalanceOfTrade.csv"?
    Any help will be greatly appreciated.

    • First, you need to use Descriptives.r for Assignment #7 and modify it (instead of TestScript.r). Second, INPUT_FILENAME should be changed to "OnTimeAirport-Jan14.csv" so that the file is imported into RStudio.

      Line 36 is the command that actually imports the file: dataSet <- read.csv(INPUT_FILENAME);

      • I know that you said to use Excel or Notepad if the file is too large to open with R-Studio, but how should we use Excel/Notepad? Should we make pivot tables in Excel? Because of this I cannot open the file and am not sure of how to go about this. Any help would be much appreciated. Thank you.

        • You just cannot open the csv file directly through File\Open File in RStudio. That does not mean the dataset won’t be loaded into RStudio. The read.csv( ) function used in the R script imports the dataset into RStudio and makes it ready for analysis.

          • I tried this method, but after I replaced input_filename with 'ontimeairport-jan14.csv' and ran the script, it said there was an error and that the file was not found.
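
On the point made earlier in this thread that read.csv( ) loads the data even when File\Open File refuses to open it, here is a minimal sketch of inspecting a dataset from the console. It assumes a tiny made-up file written to a temporary folder (DemoFlights.csv is a hypothetical name, and the values are invented); only the read.csv( ) line matches the course script.

```r
# Hypothetical stand-in for the assignment data: a tiny CSV in a temp folder.
# Column names mirror ones mentioned in this thread; the values are made up.
INPUT_FILENAME <- file.path(tempdir(), "DemoFlights.csv")
writeLines(c("Origin,Cancelled,DepDelayMinutes",
             "PHL,0,12",
             "LAX,1,0",
             "PHL,0,45"),
           INPUT_FILENAME)

# Same import command the course script uses
dataSet <- read.csv(INPUT_FILENAME)

# Inspect the data from the console instead of opening the file itself
nrow(dataSet)     # number of rows (one row per flight in this demo)
names(dataSet)    # column names
head(dataSet, 2)  # first two rows
str(dataSet)      # type of each column
```

With the real assignment file, the same four inspection calls should work once INPUT_FILENAME points at it and the working directory is set to the folder that holds it.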

  • When we are solving for #6 to #9 on the answer sheet, are we picking the airport with the most cancellations? Or do we have to also find how many total flights were from that airport first?

    • This corresponds to item (6) on Page 2 – “Use describeBy() to compare the cancellation rates across origin airports” and item (7) – “Use describeBy() to compare the cancellation rates across airlines”.
      Essentially you need to figure out the percentage of cancelled flights for each airport/airline, and pick the airport(s)/airline(s) with the highest vs. the lowest cancellation rate.
      The mean gives you the percentage.
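
To see why the mean gives the percentage, here is a small sketch with made-up data. It assumes Cancelled is coded as 1 for a cancelled flight and 0 otherwise, and it uses base R's tapply( ) rather than describeBy( ) so that it runs without extra packages; the mean column that describeBy( ) reports for each group is the same quantity.

```r
# Made-up data: Cancelled is 1 for a cancelled flight and 0 otherwise (assumption)
flights <- data.frame(
  Origin    = c("PHL", "PHL", "PHL", "PHL", "LAX", "LAX"),
  Cancelled = c(1, 0, 0, 0, 1, 1)
)

# The mean of a 0/1 column within each group is the share of cancelled flights
cancelRate <- tapply(flights$Cancelled, flights$Origin, mean)
cancelRate

# which.max()/which.min() return only the first match if several groups tie
names(which.max(cancelRate))  # group with the highest cancellation rate
names(which.min(cancelRate))  # group with the lowest cancellation rate
```

In this toy data one of four PHL flights is cancelled, so the PHL mean is 0.25, i.e. 25%.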

  • If I want to count how many flights PHL has in total, should I count the flight dates or is there another way of doing this? I’m not sure how I should edit the code to answer those questions.

    • The summary( ) function does it. Read in-class exercise #10. Under “Look Through the R Script”, (12): you will see the output by typing summary(dataSet) and what the output means.
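
As a rough illustration of how summary( ) produces counts, here is a sketch on made-up data. It assumes the Origin column is stored as a factor. In R 4.0.0 and later, read.csv( ) keeps text columns as character by default, in which case summary( ) shows only the length and class of that column; table( ) counts the values either way.

```r
# Made-up data; stringsAsFactors = TRUE stores the text column as a factor
flights <- data.frame(
  Origin = c("PHL", "LAX", "PHL", "ORD", "PHL"),
  stringsAsFactors = TRUE
)

summary(flights)                      # for a factor column: a count per value
table(flights$Origin)                 # the same counts, works for character too
phlFlights <- sum(flights$Origin == "PHL")
phlFlights                            # number of rows whose Origin is PHL
```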

  • I know that you have said to use Excel or Notepad if the file was too large to open in R-studio, but could you please explain how we should complete the assignment using Excel or Notepad? Should we make pivot tables in excel? I cannot open the file and am not sure how to go about this. Thank you!

  • Error in file(file, "rt") : cannot open the connection
    In addition: Warning message:
    In file(file, "rt") :
    cannot open file ‘USBalanceOfTrade.csv’: No such file or directory

    I keep getting this error, and RStudio will not let me run the script even after I saved the file as a csv file. What do I need to do to fix this?

    • Chances are that you missed one of the following:
      1. Did you put all the files (both the r file and the csv file) in the same directory?
      2. Did you change the working directory to the source file location? How? Read in-class exercise #10.
      3. Under the VARIABLES section, did you change the following variable values accordingly for the new dataset?
      (1) INPUT_FILENAME
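
The checklist above can also be verified from the console. This is a minimal sketch, not part of the course script: it only reports where R is looking and whether it can see the file. The file name is the one from the error message; replace it with whatever your own script uses.

```r
# File name taken from the error message above; use the value from your script
INPUT_FILENAME <- "USBalanceOfTrade.csv"

getwd()                          # folder R is currently looking in
list.files(pattern = "\\.csv$")  # csv files R can see in that folder

fileFound <- file.exists(INPUT_FILENAME)
fileFound                        # must be TRUE before read.csv() can succeed

if (!fileFound) {
  message("Not found: ", INPUT_FILENAME, " in ", getwd())
}
```

If fileFound is FALSE, compare the name letter by letter (including the extension) with what list.files( ) prints, and check that getwd( ) is the folder that holds both files.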

  • How do you present the number of flights? I understand how to group, but I am not sure what to group by. I know the assignment says how many flights there are (84,656), but RStudio will not allow me to use the number. I’m now not sure which variable to use.

    • The summary( ) function does it. Read in-class exercise #10. Under “Look Through the R Script”, (12): you will see the output by typing summary(dataSet) and what the output means.

  • Could you please elaborate on how to use excel or notepad to open the file? Every time I try to open the file R studio comes back with an error message that the file is too large. I don’t understand how to get around this. Any help would be great.

  • I’m trying to get the summary statistics grouped by origin. I put Summary(dataset$origin), but I keep getting this message:

    Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘Summary’ for signature ‘"factor"’
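
One likely contributor to this error is capitalization: R is case-sensitive, and summary( ) with a lowercase "s" is the function the in-class exercise refers to. A short sketch on made-up data (the data frame below is a stand-in, not the assignment file):

```r
# Made-up stand-in for the assignment's data frame
dataSet <- data.frame(Origin = factor(c("PHL", "LAX", "PHL")))

# summary() with a lowercase "s" is the descriptive-statistics function
originSummary <- summary(dataSet$Origin)
originSummary

# Summary with a capital "S" is a different object, so this call fails
capitalResult <- try(Summary(dataSet$Origin), silent = TRUE)
inherits(capitalResult, "try-error")   # TRUE when the call raised an error
```

For statistics grouped by origin, the describeBy( ) calls discussed elsewhere in this thread are the tool the assignment points to.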

  • Professor,

    I keep receiving this error when pressing “run all”:

    Error in median.default(x, na.rm = na.rm) : need numeric data
    In addition: Warning message:
    In mean.default(x, na.rm = na.rm) :
    argument is not numeric or logical: returning NA

    The error appears to be caused somewhere after line 62. It is not allowing me to view my histogram or any data. Any ideas for a solution? Thank you.
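
One possible cause, offered as a guess since the script itself is not shown here: a text column (stored as a factor) is being passed to a function that needs numbers. The sketch below uses made-up data to reproduce the same "need numeric data" message and to show one way of checking which columns are numeric.

```r
# Made-up data: one numeric column and one text column stored as a factor
dataSet <- data.frame(
  DepDelayMinutes = c(5, 0, 42),
  Origin = factor(c("PHL", "LAX", "PHL"))
)

# Which columns are numeric?
numericCols <- sapply(dataSet, is.numeric)
numericCols

delayMedian <- median(dataSet$DepDelayMinutes)   # works: numeric column
delayMedian

# median() on a factor stops with "need numeric data"
factorResult <- try(median(dataSet$Origin), silent = TRUE)
inherits(factorResult, "try-error")
```

If the column you meant to analyze shows up as non-numeric, check the variable names set in the script and whether the right input file was loaded.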

  • Error in tapply(seq_len(0L), list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, :
    arguments must have same length

    I keep getting this error and I don’t know what it means.

  • Keep getting this message:

    Error in stats[1, 3] <- median(x, na.rm = na.rm) :
    number of items to replace is not a multiple of replacement length
    In addition: Warning messages:
    1: In read.table(file = file, header = header, sep = sep, quote = quote, :
    line 1 appears to contain embedded nulls
    2: In read.table(file = file, header = header, sep = sep, quote = quote, :
    incomplete final line found by readTableHeader on 'OnTimeAirport-Jan14.xlsx'
    3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
    4: In mean.default(x, na.rm = na.rm) :
    argument is not numeric or logical: returning NA
    5: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'

    • Did you use OnTimeAirport-Jan14.xlsx as the input file? The input file should be .csv. Use the original file I posted and do not convert it to an .xlsx file. Also, make sure you do not have any typos.

  • Error in t.test.formula(subset$DepDelayMinutes ~ subset$Origin) :
    grouping factor must have exactly 2 levels

    Not sure what this error message means.

    • Did you do Step 5 (2) as listed in my email (also as follows)?

      Make sure that you changed line 80:
      subset <- dataSet[ which(dataSet$Position=='PG' | dataSet$Position=='SF'), ];
      into:
      subset <- dataSet[ which(dataSet$Origin=='PHL' | dataSet$Origin=='LAX'), ];
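
To illustrate what the error means, here is a sketch with made-up delays for three airports. A two-sample t-test compares exactly two groups, so the grouping column must contain exactly two distinct values after subsetting. The name flightSubset is used here only for illustration; the course script calls it subset.

```r
# Made-up delays for three airports
dataSet <- data.frame(
  Origin = c("PHL", "PHL", "PHL", "LAX", "LAX", "LAX", "ORD", "ORD", "ORD"),
  DepDelayMinutes = c(10, 20, 30, 5, 15, 10, 40, 50, 45)
)

# Three groups: t.test() stops with "grouping factor must have exactly 2 levels"
allAirports <- try(t.test(DepDelayMinutes ~ Origin, data = dataSet), silent = TRUE)
inherits(allAirports, "try-error")

# Keep only the two airports being compared, as in the corrected line 80
flightSubset <- dataSet[which(dataSet$Origin == "PHL" | dataSet$Origin == "LAX"), ]
unique(flightSubset$Origin)

twoAirports <- t.test(DepDelayMinutes ~ Origin, data = flightSubset)
twoAirports$p.value
```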

  • describeBy(dataset$Cancelled,dataset$Origin);

    describeBy(dataset$Cancelled,dataset$AirlineFullName);

    > describeBy(dataset$Cancelled,dataset$Origin);
    Error in describeBy(dataset$Cancelled, dataset$Origin) :
    object ‘dataset’ not found
    > describeBy(dataset$Cancelled,dataset$AirlineFullName);
    Error in describeBy(dataset$Cancelled, dataset$AirlineFullName) :
    object ‘dataset’ not found

    I do not understand what I am doing wrong with these 2 lines, is it something minor?
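
One thing that stands out is the spelling of the object name. The import line quoted earlier in this thread assigns the data to dataSet with a capital S, and R treats dataSet and dataset as two different names. A small sketch with made-up data:

```r
# The course script stores the imported data in dataSet (capital S)
dataSet <- data.frame(Cancelled = c(0, 1, 0))

capitalFound   <- exists("dataSet")   # TRUE: this object was created above
lowercaseFound <- exists("dataset")   # TRUE only if an object with this exact spelling exists
capitalFound
lowercaseFound

mean(dataSet$Cancelled)               # works because the name matches exactly
```

Assuming the rest of the script is unchanged, writing dataSet$Cancelled and dataSet$Origin in the describeBy( ) calls would be the first thing to try.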

  • Hey guys this is for the decision tree in class assignment.
    Error message is:
    Error in eval(expr, envir, enclos) :
    could not find function “createDataPartition”

    Line 62: nothing was changed. I opened the file, walked through the reading, and the system returned an error:

    trainIndex <- createDataPartition(inputFile$ID, p=TRAINING_PART, list=FALSE, times=1);
    Any ideas to explain what is going on here will be very helpful. Do I have to change anything? I loaded my packages (rpart, caret, and rpart.plot).

    • I think the caret package was not installed properly. Try typing require("caret") directly in the console and see if you get an error. If the error suggests that you did not install the caret package, let me know.
      In rare cases, if you have a Mac, you might not be able to install the caret package properly, and you may need to find a Windows computer instead.
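
A slightly more explicit version of the check suggested above, as a sketch: requireNamespace( ) reports whether a package can be loaded without attaching it, which makes it easy to tell "not installed" apart from "installed but not loaded".

```r
# TRUE if caret is installed and loadable, FALSE otherwise
caretInstalled <- requireNamespace("caret", quietly = TRUE)
caretInstalled

if (caretInstalled) {
  library(caret)                       # attach caret for this session
  exists("createDataPartition")        # the function the script needs
} else {
  message("caret is not installed; install.packages(\"caret\") is the usual next step")
}
```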

  • Hey Everyone,
    For those that have a Mac as their personal computer and need a PC to do assignment 8 because the caret package doesn’t work for us, do not waste your time going to the tech. Unfortunately, they do not have R or RStudio installed on the computers there. However, the MIS computer labs (Alter 602 & 603) do have them installed so these places will help you get everything done. Good luck!

  • So I keep getting Error in eval(expr, envir, enclos) : object ‘Payback’ not found when I run all. I’ve typed payback every way possible and still keep getting the error. I’m working on a pc and all the packages have installed properly.

  • Hey guys. Doing assignment 8. I run the script and it appears to run without any issues. But I am getting two different decision trees: the one in RStudio, on the right side, gives me an exact %, but the output PDF that is saved to my Rfiles folder is not the same. In “classifies correctly % of the time” the number is a whole number. What should I check to make sure that I get the same output file as the one in the console?

    • It’s the same tree but with a different number of digits shown on the graph. To display the same number of digits on the PDF, you can change one of the arguments in the second prp( ) function from round(predRate,2)*100 to round(predRate,4)*100. This argument determines the number of digits displayed.
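
The effect of the two rounding calls can be seen with a made-up number. The value of predRate below is hypothetical and chosen only for illustration; in the script it comes from the fitted tree.

```r
predRate <- 0.87654   # hypothetical prediction rate, for illustration only

twoDigits  <- round(predRate, 2) * 100   # rounds to 0.88 first, then scales to a percentage
fourDigits <- round(predRate, 4) * 100   # rounds to 0.8765 first, then scales to a percentage

twoDigits    # a whole-number percentage
fourDigits   # a percentage with two decimal places
```

Rounding to 2 digits before multiplying by 100 always leaves a whole-number percentage, which is why the PDF shows fewer digits than the console.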

  • Hey everyone,
    For questions 12 and 13, the chi-squared statistics questions, what data, if any, did you use to make the expected table? Are we just supposed to use random numbers?

  • Can someone tell me if the following will affect my results negatively or is it just an issue of plotting aesthetics? If it’s a problem, anyone got a solution?

    “Warning message: labs do not fit even at cex 0.15, there may be some overplotting”
