The purpose of this document is to give you some hints for getting started with the warmup assignment. It assumes you have decided to use R in RStudio as your development environment. If not, select the appropriate environment below:
- Python using Jupyter Notebooks (Virtual Lab or on your own machine)
- Python using Google Colab (cloud-based)
- (you can also run R in Google Colab)
Overview
You can run RStudio on the Virtual Lab or on your own machine. This hint covers both cases because they differ very little. See the R tutorial for more information on R and RStudio.
Steps
Ensure the Tidyverse library is installed
The Tidyverse library is already installed on the Virtual Lab instance of RStudio, so you can skip this step if you are using the Virtual Lab.
If you are working on your own machine, start RStudio and follow the instructions below to install the Tidyverse:
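As a rough sketch, one common way to handle the installation from the R console is shown below. The helper name `missing_packages` is made up for this illustration. The snippet installs a package only if R cannot already find it, and the `interactive()` check is an assumption that you are running it from RStudio rather than from a batch job.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The two packages this hint loads later on.
to_install <- missing_packages(c("tidyverse", "readxl"))

# Install only what is missing, and only in an interactive session
# (running code from RStudio counts as interactive).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

If `to_install` comes back empty, both packages are already available and nothing is downloaded.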
Create a new R script
- Create a new R script and save it with an appropriate name in an appropriate location. Creating a script file is better than just typing into the console because you can save the script file for later use.
- Add the following two lines to the top of your R script to load the Tidyverse and readxl libraries:
library(tidyverse)
library(readxl)
- Execute the two commands above by placing the cursor on each line and pressing Control-Enter. This will enable the “Import Dataset” button, which is useful when navigating to data files.
- Use the “Import Dataset” button to import the video game data file (Excel).
- Use the summary function to get a list of descriptive statistics in R’s console window.
- R uses the following convention to refer to a column in a data frame (a table of data):
<data frame name>$<column name>
- Some of the columns in the video game dataset contain special characters, such as parentheses “()”, which R uses for function arguments. These have to be wrapped in special quotation marks (called backticks: `) so R knows to treat the special characters as part of the column names rather than code. Fortunately, RStudio has an autocomplete feature that gives you the correct column names once you type the “$” sign. I strongly recommend you take advantage of the autocomplete feature when possible.
- You can often nest functions in R. For example, to make the console output look a bit better, you can put the summary function inside the round function:
round(summary(dat$`FirstYearSales (M)`),2)
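The steps above can be tried end to end with the short sketch below. The sales figures are made up so that the snippet runs without the course data file; only the data frame name `dat` and the column name `FirstYearSales (M)` come from the example above. The file name in the comment is hypothetical, and with the real data you would import the file instead of building `dat` by hand.

```r
# With the real data you would import the Excel file, for example:
#   library(readxl)
#   dat <- read_excel("video_games.xlsx")  # hypothetical file name
# Here a tiny made-up data frame stands in for it.
# check.names = FALSE keeps the space and parentheses in the column name.
dat <- data.frame(
  "FirstYearSales (M)" = c(1.25, 2.5, 4.75, 0.5),
  check.names = FALSE
)

# Backticks tell R that the special characters are part of the column name.
sales <- dat$`FirstYearSales (M)`

# Descriptive statistics, then the same statistics rounded to 2 decimals.
stats <- summary(sales)
rounded <- round(stats, 2)
print(rounded)
```

Without the backticks, R would read `(M)` as code rather than as part of the column name, and the line would fail.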
That is pretty much all you need to do in R to complete the warmup assignment. Ensure you save your script file so you can reuse it in subsequent assignments.