My introduction to R - step 1

You want to know how machine learning is actually done, or how those finance quants do their job, or you just want to add a valuable skill to your CV ...
In other words, you want to learn R (btw, what programming language do pirates use? Rrrrrrr).
But you want to learn in small steps that are easy to follow and still show visible results after only a few of them. Well, you have come to the right place, and as an additional benefit you can post questions and comments whenever you want or need to know more ...

Just one important disclaimer: I am just another user, perhaps with a bit more experience than you at the moment; but I still peek into my R for Dummies book every now and then.
I am certainly not an R guru.

The first thing we need to do is install R and RStudio.

Install R: Go to www.r-project.org, click on download R and select a mirror.
If you live in the UK, for example, you might select the one from Imperial College London.
There you select your operating system and click on install R for the first time (if you are on Windows) or save the R-3.6.1.pkg package (if you are on a Mac), or whatever the latest package is. Linux users need to select their distribution.

Install RStudio: Go to rstudio.com and click on the download RStudio button. Choose the free version of RStudio Desktop and select the installer for your operating system or Linux distribution.

If you have problems with the installation, please post a comment or ask Google for help.

But if all goes well, you should be able to start RStudio and it will look like this:



Click on File in the menu bar and select New File > R Script.
RStudio will now look like this:



It contains 4 sub-windows and I have scribbled numbers into the screenshot to better explain what they are.
1> top left: This window displays our R script and we will edit it there.
2> bottom left: This window shows executed commands, error messages, etc.
3> top right: This window shows all the different data and variables we will generate.
4> bottom right: This window is used for various displays, help text etc.

To really get started, we type our first R script in sub-window 1. It contains only one line:
the_answer = 7*6
After typing that, with the cursor still on the line, press Ctrl-Enter (Cmd-Enter on a Mac) to "execute" it.
Alternatively, we could have selected Code from the menu bar and then "Run Selected Line(s)".
The RStudio screen should now look like this:



In the top left window 1 we still see our R script, containing one line.
In the bottom left window 2 we see that R executed our line without error and
in the top right window 3 we see that R created a variable named the_answer with the value 42.
R actually did three things: it evaluated the arithmetic expression 7*6, created the variable the_answer, and assigned the result of the calculation to it.
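If you would rather see the value in window 2 than look for it in window 3, the script could be extended by one line. This is only a minimal sketch; the print() call is my addition and not part of the script above.

```r
# Evaluate 7*6 and store the result in the variable the_answer
the_answer = 7*6

# Show the stored value in the console (window 2)
print(the_answer)
```

Run both lines the same way as before, one after the other, and the console should show the value 42.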

I have read that some people who are learning to program find the = operator confusing. R has a solution for that: one can use the assignment operator <- instead of the equal sign. So we could have written
the_answer <- 7*6
with the exact same outcome. You will see this assignment operator in many texts and in most R style guides, and for a simple assignment like this one it makes no difference which of the two you use.
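If you want to convince yourself, you can let R compare the two results. The variable names answer_equals and answer_arrow below are made up for this sketch; identical() is a standard R function that checks whether two objects are exactly the same.

```r
# Assign the same calculation once with = and once with <-
answer_equals = 7*6
answer_arrow <- 7*6

# identical() returns TRUE when both objects are exactly the same
print(identical(answer_equals, answer_arrow))
```

Window 3 should now list both variables with the value 42.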

Now we want to save our script. I recommend that you create a folder somewhere on your PC to store the R scripts and data files we use in this tutorial; I named my folder myR and will refer to it by this name in future steps.
Once you have created the folder, select File on the menu bar, click on Save As..., navigate to myR and choose a name for your script, e.g. first_step.R
I recommend that your script file names end with .R
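By the way, the folder can also be created from within R instead of with your file manager. The sketch below assumes that myR should live directly in your home directory, which is only my guess at a sensible place; change my_folder if you prefer another location.

```r
# Assumed location: a folder named myR inside your home directory
my_folder <- file.path(path.expand("~"), "myR")

# Create the folder only if it does not exist yet
if (!dir.exists(my_folder)) {
  dir.create(my_folder)
}

# Build the full path of the script file (this does not create the file)
script_path <- file.path(my_folder, "first_step.R")
print(script_path)
```

The printed path shows where first_step.R would end up if you save it into that folder.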

This concludes the first step of my introduction.

Exercise: Click on Help in the menu bar, select R Help and browse e.g. "An Introduction to R" in window 4.

Just one more thing. At the end of your RStudio exercise, select File from the menu and click on Quit Session...; if you are prompted to save the workspace image, select No, so that the next time you start RStudio you start from a clean slate.
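If you want a clean slate for a single variable without quitting RStudio, you can also remove it by hand. This is just a small sketch using the standard functions exists() and rm(); it is not required for the tutorial.

```r
# Create the variable from our first script again
the_answer <- 7*6

# exists() tells us whether a variable with this name is known to R
print(exists("the_answer"))

# rm() removes the variable, so it also disappears from window 3
rm(the_answer)
print(exists("the_answer"))
```

The first print should show TRUE and the second FALSE.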
