Data Types
Logis and Doubles and Strings, oh my!
When we were making some vectors in the last lesson, you might have noticed a word appear in the environment panel between the name of the vector and the values it contained.
That word is the type of data that makes up the vector. The data type determines the properties of the data and what kind of manipulations you can perform on it.
In R, all data objects have a data type. In this lesson we are going to have a look at five important data types, which from simplest to most complex are:
Logicals (logi)
Integers (int)
Doubles (num)
Complex Numbers (cplx)
and Strings (chr)
Let's go into a little detail about these different data types.
The simplest data types you can encounter in R are the logicals. Logical type data can take one of two values, either TRUE or FALSE.
We already encountered logicals in the last lesson, when performing comparisons.
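As a quick refresher, here is a minimal sketch of logicals in action. The variable names (`is_sunny`, `bigger`, `same`) are made up for illustration.

```r
# Logical values are written in capitals, without quotes
is_sunny <- TRUE
typeof(is_sunny)   # "logical"

# Comparisons hand back logicals
bigger <- 3 > 2    # TRUE
same <- 1 == 2     # FALSE
typeof(bigger)     # "logical"
```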
If you're feeling lazy, logicals can also be defined just by using T and F.
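The shorthand can be checked with a small sketch like the one below (variable names are illustrative). One caveat worth knowing: TRUE and FALSE are reserved words, whereas T and F are ordinary variables that can be reassigned, so the long forms are the safer habit.

```r
short_true <- T
short_false <- F
identical(short_true, TRUE)    # TRUE
identical(short_false, FALSE)  # TRUE

# T is only a variable, so it can be overwritten.
# local() keeps this experiment from touching the rest of the session.
overwritten <- local({
  T <- 0       # allowed, because T is not a reserved word
  isTRUE(T)    # FALSE: T no longer means TRUE in here
})
overwritten    # FALSE
```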
While it's a little unusual to assign logicals to variables yourself, it is quite common to receive them as outputs from tests or functions you run, and to use them in conditional if else type statements (more on this in a later lesson).
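To give a taste of what that looks like, here is a small sketch in which the result of a comparison drives an if else statement. The `temperature` value and the messages are invented for the example.

```r
temperature <- 31              # hypothetical reading
is_hot <- temperature > 25     # the comparison returns a logical

if (is_hot) {
  advice <- "Pack a hat"
} else {
  advice <- "Pack a jumper"
}
advice   # "Pack a hat"
```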
Moving one step up in complexity from logicals, we get to integers, which are used to represent, well, integers!
To differentiate themselves from the more common double type, integers in R are written as a number followed by a capital L. Let's explore this difference using the typeof() function, which will return the data type of an object.
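A minimal sketch of that comparison might look like this (the variable names are illustrative):

```r
whole_int <- 5L    # the L suffix asks for an integer
whole_dbl <- 5     # no suffix, so R stores a double

typeof(whole_int)  # "integer"
typeof(whole_dbl)  # "double"
```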
For the most part, R will automatically handle and swap between integers and doubles as necessary, and the difference is not something you generally need to worry about.
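The following sketch shows a few cases of this automatic handling, using made-up variable names:

```r
mixed_sum <- 5L + 2.5      # integer + double
typeof(mixed_sum)          # "double"

int_division <- 5L / 2L    # dividing two integers gives 2.5
typeof(int_division)       # "double"

int_sum <- 5L + 2L         # integer + integer stays an integer
typeof(int_sum)            # "integer"

same_value <- 5L == 5      # TRUE: the values compare equal across types
```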
That said, integers do serve an important purpose when R passes code to software written in C or Fortran, but this is well beyond the scope of this course. At the very least, now you know integers exist!
In most cases, your numeric R data is going to be stored using the double data type. Doubles are used to represent any of the real numbers you might want to use.
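As a small illustration (with invented variable names), any number typed without an L suffix is stored as a double, whether or not it has a decimal part. Note that typeof() reports "double" while class() reports "numeric", which is where the num label comes from.

```r
price <- 3.14
count <- 10          # a whole number, but still a double

typeof(price)        # "double"
typeof(count)        # "double"
class(price)         # "numeric"
is.numeric(count)    # TRUE
```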
The term double is short for double-precision floating-point number. In R a double occupies 64 bits, twice the 32 bits used for an integer. As well as storing decimal values, this means you can store much bigger numbers with doubles than with integers.
However, they also take up more space in storage than integers.
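Both points can be checked with a rough sketch like the one below. `.Machine` is a built-in list describing the numeric limits of your R installation, and `object.size()` estimates the memory an object uses. The exact byte counts depend on your platform, so none are shown here.

```r
max_int <- .Machine$integer.max    # 2147483647, the largest integer R can store
max_dbl <- .Machine$double.xmax    # roughly 1.8e308, the largest finite double

# Compare the memory used by 1000 integers and 1000 doubles
int_bytes <- as.numeric(object.size(integer(1000)))
dbl_bytes <- as.numeric(object.size(numeric(1000)))
dbl_bytes > int_bytes              # TRUE: the doubles take more room
```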
Moving up in complexity from our doubles, we get to imaginary and complex numbers.
In R, a complex number is written as <real> + <imaginary>i.
For some of us it would be pretty rare to need to delve into the world of complex numbers, but R has full support for complex algebra if that's what takes your fancy.
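For the curious, here is a brief sketch of complex numbers using a few of R's built-in helpers. The variable names are illustrative.

```r
z <- 3 + 4i
typeof(z)    # "complex"

Re(z)        # 3, the real part
Im(z)        # 4, the imaginary part
Mod(z)       # 5, the modulus (distance from zero)

# The square root of -1 only works if R is told to treat it as complex
root <- sqrt(as.complex(-1))   # 0+1i
```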
Characters are how R deals with data that comes in the shape of words, rather than numbers.
Characters can consist of any characters you want, as long as they're wrapped up inside a pair of double ("") or single ('') inverted commas.
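A short sketch, with made-up strings, shows both quoting styles producing the same data type. Using one style on the outside lets you include the other inside the text.

```r
greeting <- "Hello, world"
farewell <- 'See you later'

typeof(greeting)   # "character"
typeof(farewell)   # "character"

# Double quotes on the outside allow an apostrophe inside
quoted <- "It's a string"

nchar(greeting)    # 12, the number of characters in the string
```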
A common stumbling block for new R users is dealing with strings of numbers. While it's perfectly reasonable to do arithmetic on numbers, such as adding 1 and 2, trying to do the same thing with numbers stored as characters, such as "1" and "2", will land you in some strife!
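The sketch below is one way to see the problem for yourself; it is an illustration rather than the lesson's original example. Because adding two strings stops with an error, the attempt is wrapped in tryCatch() so the rest of the script keeps running.

```r
number_sum <- 1 + 2     # 3, as expected
typeof("1")             # "character": the quotes make it a string, not a number

# Adding two strings raises an error, which tryCatch() captures here
attempt <- tryCatch("1" + "2", error = function(e) e)
failed <- inherits(attempt, "error")   # TRUE
conditionMessage(attempt)              # the error message R would have shown
```

In an English-language R session the message reads "non-numeric argument to binary operator".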
Just because your data starts as one type, that doesn't mean it needs to stay that way! To change data from one type to another, we can use the inbuilt type coercion functions. To coerce your data into:
A logical, use as.logical()
An integer, use as.integer()
A double, use as.numeric()
A complex number, use as.complex()
Or a string, use as.character()
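Here is a minimal sketch of each coercion function from the list above, applied to some made-up values:

```r
as_lgl <- as.logical("TRUE")   # TRUE
as_int <- as.integer("42")     # 42L
as_dbl <- as.numeric("3.14")   # 3.14
as_cpl <- as.complex(2)        # 2+0i
as_chr <- as.character(10)     # "10"

# Once strings of numbers are coerced, arithmetic works again
string_sum <- as.numeric("1") + as.numeric("2")   # 3
```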
While you can always coerce a less complex data type to a more complex one, the reverse isn't always possible: information can be lost along the way, or the result can come back as a missing value (NA).
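The sketch below illustrates both directions with invented values. Coercing to a simpler type can silently drop information or return NA; suppressWarnings() is used only to keep the output tidy.

```r
# Going up in complexity always works
up_chr <- as.character(TRUE)   # "TRUE"
up_dbl <- as.numeric(TRUE)     # 1

# Going down can lose information...
down_int <- as.integer(3.9)    # 3: the decimal part is dropped, not rounded

# ...or fail outright, giving a missing value
down_dbl <- suppressWarnings(as.numeric("three"))   # NA
down_lgl <- as.logical("maybe")                     # NA
```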
Now that we've explored the different types of data in R, here are a few challenges to test what you've learned.