Statistical Analysis Software

This guide serves as an introduction to statistical analysis software. It aims to help patrons become familiar with five of the most popular statistical software packages.

RStudio Tabs

We will refer to these tabs throughout the manual:

File; Edit; Code; View; Plots; Session; Build; Debug; Profile; Tools; Help

Useful Symbols & Commands

  • #: This symbol introduces a comment, which explains the code that follows. Good programs need good comments so that other people can read your code!
  • $: This symbol refers to a specific column of a dataframe, for example Data$height.
  • <- or =: These symbols store a dataframe, variable, value, etc. in the Global Environment so that you can use it later.
  • Run: To run a line, place your cursor on the line of interest and click Run or press Ctrl+Enter (Command+Enter on Mac computers). To run multiple lines, select the lines of interest and then click Run. In the next picture, Run is circled in red.
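The symbols above can be tried out with a short sketch. The dataframe name Data and its values are made up for illustration and are not the guide's actual data.

```r
# Build a small dataframe and store it with <- (values are made up)
Data <- data.frame(height = c(160, 172, 181),
                   weight = c(55, 68, 80))

# $ pulls a single column out of the dataframe
Data$height

# Store a computed value in the Global Environment for later use
average_height <- mean(Data$height)
average_height
```

After running these lines, Data and average_height should both appear in the Environment window.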

How to Import Files

Steps for Importing a File:

  1. Open RStudio: you will see an empty screen with three windows.
  • Console: displays the results of all your commands;
  • Environment/History/etc.: the Environment is useful for seeing all the variables stored while conducting the analysis;
  • Files/Plots/etc.: provides additional features.
  2. File -> New File -> R Script, or click on the button circled in red: we recommend that you type all your commands in an R Script. Scripts can be saved for later use.
  3. Session -> Set Working Directory -> Choose Directory: choose the location on your computer where your data files are stored.
  4. If you have a txt file, input: dataframe_name <- read.table("file_name.txt", header = TRUE). In our example in the picture, we use <- to assign our dataframe to the name Data. header = TRUE tells R to treat the first line of the file as the variable names; with header = FALSE, the first line is read as data.
  5. If you have a csv file, input: dataframe_name_of_your_choice <- read.csv("your_file_name.csv", header = TRUE). For explanations, refer to step 4.
  6. Run the previous line: in this step, you store the data in the Environment.
  7. Now you can start your analysis. In the picture above, we chose a dataframe of height vs. weight.
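As a minimal sketch of the import steps, the code below first writes a tiny csv file to a temporary folder so that it can run on its own. The file name example_heights.csv and its values are assumptions for illustration; with your own data you would skip the writeLines() line and pass your file name to read.csv().

```r
# Create a small example file (hypothetical name and made-up values)
example_file <- file.path(tempdir(), "example_heights.csv")
writeLines(c("height,weight", "160,55", "172,68", "181,80"), example_file)

# header = TRUE: the first line of the file becomes the variable names
Data <- read.csv(example_file, header = TRUE)

# Inspect what was stored in the Environment
str(Data)
head(Data)
```

For a txt file, read.table() is used in the same way; note that read.table() assumes whitespace-separated columns unless you set the sep argument.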

How to Graph

Steps for Creating a Scatterplot:

  1. Input: plot(x_variable, y_variable, main = "Scatterplot_Title", xlab = "x_label", ylab = "y_label"). In our example, we input: plot(Data$height, Data$weight, main = "University Height/Weight", xlab = "Height", ylab = "Weight")
  2. Run the previous line.
  3. The output is displayed in the Plots window.
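The scatterplot steps can be sketched as follows. The dataframe here uses made-up values so the example runs without importing a file; it is not the guide's dataset.

```r
# Made-up height/weight values standing in for the imported data
Data <- data.frame(height = c(160, 172, 181, 168),
                   weight = c(55, 68, 80, 62))

# First argument goes on the x-axis, second on the y-axis
plot(Data$height, Data$weight,
     main = "University Height/Weight",
     xlab = "Height",
     ylab = "Weight")
```

In RStudio the figure appears in the Plots window; when run outside RStudio, R typically draws to its default graphics device instead.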

How to Conduct Basic Statistical Analysis

Steps for Computing Summary Statistics:

  1. Input: summary(variable_name). In our example, we summarized the variable weight, which can be written as summary(Data$weight).
  2. Run the previous line.
  3. The output is displayed in the Console.
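A minimal sketch of the summary step, again assuming a made-up dataframe called Data:

```r
# Made-up values for illustration
Data <- data.frame(height = c(160, 172, 181, 168),
                   weight = c(55, 68, 80, 62))

# Minimum, quartiles, median, mean and maximum of one column
weight_summary <- summary(Data$weight)
weight_summary

# summary() also accepts a whole dataframe and summarizes every column
summary(Data)
```

Storing the result in weight_summary is optional; it is done here only so that individual values can be looked up later, for example weight_summary[["Mean"]].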

Steps for Fitting a Regression Model: 

  1. Input: model_name <- lm(y_variable ~ x_variable, data = dataframe_name). The ~ symbol separates the response (left) from the predictor (right). In our example, we input: model <- lm(weight ~ height, data = Data)
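  2. Run the previous line. This stores the fitted model in the Environment.
  3. Input and run model_name or summary(model_name) to display the output in the Console.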
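The regression steps can be sketched as below. The data values are made up for illustration, so the coefficients have no real-world meaning.

```r
# Made-up values for illustration
Data <- data.frame(height = c(160, 172, 181, 168),
                   weight = c(55, 68, 80, 62))

# Fit weight as a linear function of height
model <- lm(weight ~ height, data = Data)

# Print the coefficients, standard errors and R-squared in the Console
summary(model)

# Extract only the intercept and slope
coef(model)
```

Assigning the model with <- does not print anything by itself, which is why summary(model) is called on a separate line.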

How to Install and Use an R Package

In R, developers have created over 6000 packages containing commands that can make coding easier or more efficient, or that are simply more suitable for your analysis. Popular packages include dplyr, ggplot2, knitr, shiny, and devtools.

Steps for Installing and Using a Package:

  1. Input: install.packages("package_name"). In our example, we installed ggplot2.
  2. Run the previous line.
  3. Input: library("package_name").
  4. Run the previous line.
  5. You can now use the package. For further information regarding the package, you can input and run ?package_name or help(package = "package_name").
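One possible way to combine the install and load steps is sketched below. The helper load_package() is hypothetical (it is not part of R or of this guide); it installs a package only when it is missing, so re-running a script does not download the package again.

```r
# Hypothetical helper: install a package only if it is missing, then load it
load_package <- function(package_name) {
  if (!requireNamespace(package_name, quietly = TRUE)) {
    install.packages(package_name)
  }
  library(package_name, character.only = TRUE)
}

# "stats" ships with R, so this call runs without downloading anything
load_package("stats")

# For the ggplot2 example from the steps above, you would instead run:
# load_package("ggplot2")
# help(package = "ggplot2")
```

Installing a package requires an internet connection, and you only need to install it once per computer; library() must be run in every new R session.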

Additional Resources

For any additional help, you can refer to the following:

  • Help tab:
    • R Help: resources, manuals and references for consultation
    • RStudio Docs: documentation for RStudio and its services
    • Cheat sheets: compact 1-2 page cheat sheets for the most common packages