Data Analytics – Section 1

To Do

Assignment #7 – Decision Trees in R

Here are the assignment and the Assignment #7 – Decision Trees in R ANSWER SHEET (in Word format).

Here is the data file you’ll need [BankLoan.csv].

Note: If you try to open this file in Excel, you’ll get two error dialogs. The file is fine. Just click “Yes” and “OK” and the file will open.

Another note: Make sure you’ve included ALL the attachments (check the assignment instructions) as separate files or you will not receive credit for the assignment. Please do not send me a ZIP or RAR file! 

This assignment is due on 4/11 by the start of class.

In Class Exercise #12 – Decision Trees in R

Here is In-Class Exercise #12 – Decision Tree Induction Using R.

And here are the supporting files. Remember, download them to your computer by right-clicking and selecting Save As:

  • The R script you’ll need: dTree.r
  • The data file you’ll need: OrganicsPurchase.csv

    • Note: If you try to open this file in Excel, you’ll get two error dialogs. The file is fine. Just click “Yes” and “OK” and the file will open.
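
For a feel of what decision tree induction looks like in R before you open the exercise, here is a minimal sketch. It assumes the rpart package and uses kyphosis, a small practice data set that comes with rpart. This is not the course script: dTree.r and OrganicsPurchase.csv may use different packages, columns, and settings.

```r
# Assumption: rpart is available (it is a recommended package that
# normally installs along with R).
library(rpart)

# kyphosis is a practice data set bundled with rpart, NOT the course data.
data(kyphosis)

# Grow a classification tree that predicts Kyphosis from the other columns.
tree <- rpart(Kyphosis ~ Age + Number + Start,
              data = kyphosis, method = "class")

# Text view of the splits the tree chose.
print(tree)

# Predict on the same rows and measure how often the tree is right.
# (Accuracy on the training rows is optimistic; it is shown only to
# illustrate the calls.)
pred <- predict(tree, kyphosis, type = "class")
accuracy <- mean(pred == kyphosis$Kyphosis)
print(accuracy)
```

To try the same idea on your own data, you would swap in a data frame loaded with read.csv() and change the formula to name your outcome column.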

Watch this R & R Studio Introduction before 3/28/16!!!

As we discussed, you are responsible for watching the Intro to R Mix that can be found here:  https://mix.office.com/watch/1vfw1it1t0rjc

You should watch this Mix and download R and RStudio to your PC before class on Monday, 3/28/16, as we will be jumping right into the In Class Exercise.

Here are the files I reference in the Mix, in case you want to do a little practicing: 2009BaseballTeamStats and BaseballAnalysis.r

In Class Exercise #10 – Getting Familiar with R & R Studio

Here is the exercise.

And here are the supporting files. When you download them to your computer, I suggest creating a special folder to hold your R files. The easiest way to download them is to right-click and select Save As.

Weekly Question

Leave your response as a comment on this post by the beginning of class on March 30, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your opinions, not so much particular “facts” from the class!  If you sign in using your AccessNet ID and password you won’t have to fill in the name, email and captcha fields when you leave your comment.

Here is the question:

Do a little bit of research and come up with an example of how R is used. You can describe a company using R and what they use it for, a news story about how companies are using it, or an interesting package and what it does. Write one to three sentences on what you found and include the URL where you found it.

(Hint: Just Google “companies using R” or “applications of R” or something like that. Examples aren’t too difficult to find.)

Intro to R and RStudio

The slide deck for Intro to R has been posted under Slide Decks.

Make sure you have this handy for class and take additional notes!

Here’s the BaseballAnalysis.r script and the 2009BaseballTeamStats.csv data file presented in the slides. To try it for yourself:

  1. Download both files to a folder on your computer.
  2. Open RStudio, then open BaseballAnalysis.r (File > Open File).
  3. From the Session menu, choose Set Working Directory > To Source File Location.
  4. From the Code menu, choose Run Region > Run All.
  5. Watch the magic happen!
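
The menu steps above also have console equivalents: setwd() sets the working directory and source() runs a whole script. Here is a minimal sketch of that idea. To make it runnable anywhere, it builds a throwaway folder with a made-up two-row data file and a two-line script; the names demo.csv and demo.r and their contents are placeholders, not the baseball files.

```r
# Stand-in folder so the sketch runs anywhere; in practice this would be
# the folder where you saved your R files.
r_folder <- file.path(tempdir(), "r_files")
dir.create(r_folder, showWarnings = FALSE)

# Placeholder data and script (hypothetical contents, NOT the course files).
write.csv(data.frame(Team = c("A", "B"), Wins = c(90, 72)),
          file.path(r_folder, "demo.csv"), row.names = FALSE)
writeLines(c('teams <- read.csv("demo.csv")',
             'mean_wins <- mean(teams$Wins)'),
           file.path(r_folder, "demo.r"))

# Same effect as Session > Set Working Directory: relative file names
# such as "demo.csv" are now looked up in r_folder.
old_wd <- setwd(r_folder)

# Same effect as Code > Run Region > Run All: run every line of the script.
source("demo.r")

# Put the working directory back the way it was.
setwd(old_wd)

print(mean_wins)
```

The working directory matters because a script that reads a file by name alone only finds it if R is looking in the right folder.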

Setting up R & RStudio

R is a widely used, open-source statistical analysis platform. RStudio is an integrated development environment for R – that means it makes using R easier!

You should install both software packages – R and RStudio! Don’t just install R or your life will be difficult!

We’ll be using this software to do some analytics! You can get a full copy of the software – PC or Mac – for free!

First download and install R:

  1. If you have Windows
    1. Download the installer file for R (this is a direct link to the install file).
    2. Make sure you keep track of where you save the installer.
    3. Double-click on the installer file and follow the instructions (just accept the default settings).
  2. If you have a Mac
    1. Download “R-3.2.1-snowleopard.pkg” if you have Mac OS X 10.6 or “R-3.2.4.pkg” if you have Mac OS X 10.9 or higher.
    2. Make sure you keep track of where you save the installer.
    3. Do whatever it is you “Mac people” do to install software! (I don’t have a Mac!)

Now download and install RStudio:

  1. Download the appropriate installer: Windows or Mac.
  2. Make sure you keep track of where you save the installer.
  3. Install the software! Just accept the default settings.

After both are installed, you’re always going to run RStudio, which will use R behind the scenes to give you a pleasing analytics experience!
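
A quick way to confirm that RStudio found your R installation is to type a couple of lines into the Console pane. This is only a suggested check, not part of the official install steps:

```r
# Show which version of R RStudio is running behind the scenes.
r_version <- R.version.string
print(r_version)

# A tiny calculation to confirm R is responding.
check <- sum(1:10)
print(check)
```

If both lines print a result without an error, R and RStudio are talking to each other.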