To practice what we just covered, here are a few tasks. We’ll start easy and work our way towards more complex problems. In most cases, there is a hint and the solution to the task available. However, try not to reach for the solution until you are well and truly stuck.

There are probably more tasks here than a smart novice can be expected to complete in 30 minutes, so if you don’t manage to get them all done before the next session starts, don’t feel discouraged. Just save the rest for homework :).

Open a new `R` script file.

Typing code *in the console*, calculate \(\sqrt{\frac{17\times 3}{5.3}} + 100\)

Hint: the basic arithmetic operations in `R` are:

- multiplication: `2*3`
- division: `10/5` (**never** `\`!)
- square root: `sqrt(2)`
- exponentiation: `2^10` (that’s \(2^{10}\))
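As a minimal sketch of how these operations combine, the expression from the task can be typed with explicit parentheses so that the multiplication and division happen inside the square root. The object name `result` is only an illustrative choice, not something the worksheet prescribes.

```r
# Square root of (17 * 3) / 5.3, with 100 added afterwards
result <- sqrt((17 * 3) / 5.3) + 100
result
```

The parentheses around `17 * 3` are not strictly needed, but they make the order of operations easy to read.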

Now type the same command *in the script* and run it from there.

Write and execute a command that calculates the square root of the numbers 9, 100, and 1024, all in one go.

Hint: use the `c()` function.
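One way this might look, as a sketch: `c()` combines the three numbers into a single vector and `sqrt()` is then applied to every element. The name `roots` is a hypothetical choice for illustration.

```r
# c() combines the three numbers into one vector;
# sqrt() is then applied to each element in turn
roots <- sqrt(c(9, 100, 1024))
roots
# [1]  3 10 32
```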

Store the results of each of the three commands you just wrote into objects `calc_1`, `calc_2`, and `calc_3`. If done right, the objects should appear in your Global Environment pane.

Hint: use the assignment operator `<-`.

Ask `R` to print the content of each of these objects.

Write code that takes the square of each element of `calc_2` but in a way that DOES NOT overwrite `calc_2`. Make sure it worked by running the command.

`R` never modifies objects unless you reassign them.

Now modify the line of code so that it DOES overwrite `calc_2`, storing in it the squares of the original values. Once again, double-check that it worked by printing out the contents of the object in the console.

Hint: assign the result back to `calc_2`.
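The difference between computing a result and storing it can be sketched with a small made-up vector `x` (a hypothetical stand-in, not one of the objects from the tasks).

```r
x <- c(2, 5, 10)  # hypothetical example vector

x^2               # prints the squares, but x itself is unchanged
x                 # still 2 5 10

x <- x^2          # reassigning with <- overwrites x
x                 # now 4 25 100
```

The same pattern should apply to any object: the values only change once the result is assigned back to the same name.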

Let’s say we want to calculate the Body Mass Index (BMI) of these five people:

- Amrita, 1.91 m, 87 kgs
- Bilal, 1.82 m, 91 kgs
- Jia, 1.68 m, 52 kgs
- Josiah, 1.74 m, 64 kgs
- Marios, 1.78 m, 83 kgs

BMI is calculated as \(\frac{\text{weight in kgs}}{\text{(height in m)}^2}\). Now, we *could* calculate each individual BMI but that’s cumbersome and gets progressively more so with increasing numbers. Instead, we can use vectorised operations.

Create an object `height_m` that stores the heights of our five people.

Hint: use `c()` to combine elements in a vector and `<-` to assign the output to an object. Individual elements must be separated by commas.

Next, create an object `weight_kg` that stores the weights. Make sure you enter the weights in the same order you entered the heights. No hints this time!

Finally, apply the BMI formula to our two objects and store the results in an object called `bmi`. Then have `R` print it out to see the results.

This way, you can just keep adding heights and weights to the respective vectors and then re-run the calculation.
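Put together, the vectorised BMI calculation might look like the sketch below. It uses the five people listed above; rounding to one decimal place with `round()` is an addition for readability and not part of the task.

```r
# Heights (m) and weights (kg), entered in the same order:
# Amrita, Bilal, Jia, Josiah, Marios
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

# The formula is applied element by element:
# first weight / first height squared, and so on
bmi <- weight_kg / height_m^2
round(bmi, 1)
# [1] 23.8 27.5 18.4 21.1 26.2
```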

Add a couple of heights and weights of your choice to `height_m` and `weight_kg` respectively and recalculate `bmi`.

Finally, let’s practice some ways of asking things about our data. This is a crucial skill for sanity checking your data and data processing and will come in especially handy in the early stage when you’re still not very confident in what you’re doing.

While your script should normally only include commands that impact data processing/visualisation/analysis, we recommend you complete the following tasks, especially those that ask you to create new objects, in your script file.

Without printing `calc_1`, ask `R` how many elements there are inside of it.

Hint: what is the `length()` of `calc_1`?

Let’s say we want to run some checks on our BMI data. To be able to calculate *meaningful* BMIs, the two objects, `height_m` and `weight_kg`, must meet several conditions:

- They must contain the same number of elements.
- They must only contain numbers.
- The values must be within reasonable ranges.

None of these are difficult to check by simply eyeballing the data when there are only a handful of values in each object, but as datasets get bigger, the ability to offload these kinds of checks onto the computer becomes invaluable.

Ask `R` whether or not the respective lengths of `height_m` and `weight_kg` are equal. Save the output of the command in a new object called `length_test`.

Hint: to test whether two values, `x` and `y`, are equal, use the `==` operator (not `=`!): `x == y`
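A minimal sketch of the length check, assuming `height_m` and `weight_kg` hold the five values from the BMI task:

```r
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

# Compare the two lengths; the result is a single TRUE or FALSE
length_test <- length(height_m) == length(weight_kg)
length_test
# [1] TRUE
```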

OK, let’s now check that the two objects only contain numbers.

Use the `is.numeric()` function to test whether or not an object is of class `numeric`. Let’s test both `height_m` and `weight_kg`.

Use the logical operator `&` to link the two expressions to test both with just one command. This time, store the output in `numeric_test`.
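As a sketch, again assuming the five values from the BMI task, the two `is.numeric()` tests can be linked with `&` like this:

```r
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

# Each is.numeric() call returns one TRUE or FALSE for the whole object;
# & is TRUE only if both sides are TRUE
numeric_test <- is.numeric(height_m) & is.numeric(weight_kg)
numeric_test
# [1] TRUE
```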

All good! Now, let’s see if the values are reasonable. Here, it’s up to you as the analyst to define what you deem reasonable. The computer can only tell you if your data meet your criteria, not what the criteria should be.

Let’s say one criterion is that the values of `height_m` must be smaller than their corresponding values of `weight_kg`. Admittedly, it’s not a very good criterion in this context but it might be in different contexts and it also makes for a good exercise so bear with us. Can you figure out how to ask `R` if this is true?

Hint: these comparison operators may help:

- `x < y`, is `x` less than `y`?
- `x <= y`, is `x` less than *or equal to* `y`?
- `x > y`, is `x` greater than `y`?
- `x >= y`, is `x` greater than *or equal to* `y`?
- `x != y`, are `x` and `y` *NOT* equal?

The result of the previous task is a separate test for every element pair. The `all()` function takes a logical vector and outputs `TRUE` if all its elements are `TRUE` and `FALSE` otherwise. Use it to see if the condition we’re investigating is met for all value pairs and save the output in `comparison_test`.

Hint: pass the comparison from the previous task to the `all()` function.

All looks *kosher* thus far.
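The element-wise comparison and its `all()` summary can be sketched as follows, assuming the five values from the BMI task:

```r
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

# One TRUE/FALSE per pair of elements
height_m < weight_kg
# [1] TRUE TRUE TRUE TRUE TRUE

# all() collapses the five results into a single value
comparison_test <- all(height_m < weight_kg)
comparison_test
# [1] TRUE
```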

Next let’s explore if the values have reasonable ranges. There are several ways of doing this, each with its pros and cons so let’s have a look at a few.

First of all, we can simply look at the minimum and maximum values of an object. The `range()` function returns this information. Let’s have a look at both `height_m` and `weight_kg`.

This is very useful information but it’s not the best way of sanity-checking our data as it still requires some eyeballing.
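For illustration, with the five values from the BMI task, `range()` returns a vector of two numbers, the minimum followed by the maximum:

```r
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

range(height_m)
# [1] 1.68 1.91

range(weight_kg)
# [1] 52 91
```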

Let’s say we think that all values of `height_m` should be between 1.2 and 2.3. Can you come up with a one-liner that tests for this criterion? If so, save the output of the command in `height_range_test`.

Hint: `height_m` should be larger than 1.2 and `height_m` should be smaller than 2.3. This should be true for all elements.

Alternatively, we can ask if the minimum (`min()`) of `weight_kg` is greater than 40 and at the same time its maximum (can you guess the function?) is less than 250. Try this without hints and save the output of the command in `weight_range_test`.
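Both styles of range check might be written as in the sketch below, assuming the five values from the BMI task. The first tests every element and summarises with `all()`; the second only looks at the two extremes.

```r
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

# Element-wise: is every height above 1.2 AND below 2.3?
height_range_test <- all(height_m > 1.2 & height_m < 2.3)

# Extremes only: is the smallest weight above 40 AND the largest below 250?
weight_range_test <- min(weight_kg) > 40 & max(weight_kg) < 250

height_range_test
# [1] TRUE
weight_range_test
# [1] TRUE
```

Both approaches give the same answer here; the second avoids creating a logical value for every element, at the cost of being slightly less explicit.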

Finally, let’s see if all of our tests returned true.

Hint: the `all()` function returns either `TRUE` or `FALSE`.
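A sketch of the final check, recreating the five test objects from the earlier tasks so that the block runs on its own. The name `all_tests` is a hypothetical choice; the worksheet does not prescribe one.

```r
height_m  <- c(1.91, 1.82, 1.68, 1.74, 1.78)
weight_kg <- c(87, 91, 52, 64, 83)

# The five checks from the tasks above
length_test       <- length(height_m) == length(weight_kg)
numeric_test      <- is.numeric(height_m) & is.numeric(weight_kg)
comparison_test   <- all(height_m < weight_kg)
height_range_test <- all(height_m > 1.2 & height_m < 2.3)
weight_range_test <- min(weight_kg) > 40 & max(weight_kg) < 250

# all() accepts several logical values and is TRUE only if every one is TRUE
all_tests <- all(length_test, numeric_test, comparison_test,
                 height_range_test, weight_range_test)
all_tests
# [1] TRUE
```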

Great! The values passed our five checks so we can have some confidence that our BMI calculation is meaningful.

First of all, well done! You managed to do quite a lot here and got to practise basic operations, assigning names to objects, and performing basic data-checking tests. But on top of all this, you also found out important things about `R`.

- Some operations are **vectorised**, meaning that they can be performed with vectors rather than just with a single value. This means that you can transform variables or calculate new ones and test your data efficiently.
- If you want to treat several values/elements as a whole you **must combine them in a vector** if they are not already in one.
- A function in `R` **never modifies its inputs**. If you want to modify an object you need to **reassign using `<-`**.
- There is no “undo” button but you can always start from the beginning and re-run your code.
- Storing objects under names/variables in your environment allows you to conveniently access them.
- There are powerful tools you can use to **sanity check your datasets and data-processing**. `R` cannot tell you what tests to design but, once you know what you want to test for, it will happily do it for you.