I'm starting to learn R for academic reasons, because I need to use it for statistics (unfortunately PHP can't quite keep up!).
Today something finally clicked for me-- at least enough to realize how bizarre R is. So I wanted to ask if anyone else has experience with R or similar languages and has any idea why this is the case.
(This post looks really long, but I promise, it's a quick read.)
In short, it looks like a single variable, while still acting as a single variable, can have multiple values.
Here are some code examples. Note that R is primarily a command-line language, so that's how it's presented here.
Code:
> 1:10
[1] 1 2 3 4 5 6 7 8 9 10
That prints 1 through 10. Instinctively I'd call this a list-- or more technically, an array. Right? No.
Well, we can store it in a variable:
Code:
> x=1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
But now what is x?
Code:
> class(x)
[1] "integer"
And now I get angry with the program and want to go back to PHP. It makes absolutely no sense to me. That must be wrong, right? No.
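As far as I can tell, the explanation is that R has no separate type for a lone number: a single value is just a vector of length one, and class() reports what the elements are rather than how many there are. A minimal sketch of that idea (the name `single` is only for illustration):

```r
x <- 1:10
single <- 5L              # the L suffix asks for an integer rather than a double

class(x)                  # "integer"
class(single)             # "integer" as well
length(x)                 # 10
length(single)            # 1
is.vector(single)         # TRUE: even one number is a vector
```

So "integer" here seems to mean "a vector whose elements are integers", with the length left as a separate question.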
What happens if we try to do something to this? Surely we can't add one to an array, right? Oh, but it's not an array. And indeed we can. What happens?
Code:
> x+1
[1] 2 3 4 5 6 7 8 9 10 11
> x-5
[1] -4 -3 -2 -1 0 1 2 3 4 5
So... as far as I can tell it is as if it's a single variable (let's say it has a value of 1) except that it also has a bunch of extra values-- it's like it's many variables in one, but still acting as a single variable. Huh.
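Put differently, arithmetic appears to be applied to every element in turn, and a shorter operand is reused ("recycled") until it matches the longer one. A small sketch, assuming the same `x` as above (the other names are made up for the example):

```r
x <- 1:10

# Arithmetic is applied to every element in turn.
doubled <- x * 2            # 2 4 6 8 10 12 14 16 18 20
summed  <- x + x            # same values as x * 2

# A shorter vector is recycled to match the longer one.
recycled <- x + c(0, 100)   # 1 102 3 104 5 106 7 108 9 110
```

Seen this way, x+1 is just the case where the shorter operand has length one.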
At the same time:
Strange. What about selecting an element?
Ah, so it is an array. Wait. No. (And it has also "fixed" the zero-based numbering we've all learned to love... or hate.)
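The console output for this step isn't shown here, so this is only a sketch of what selecting elements looks like, assuming the same `x` as before:

```r
x <- 1:10

first  <- x[1]         # 1: the first element, counting from 1
last   <- x[10]        # 10: the last element
none   <- x[0]         # integer(0): index 0 selects nothing rather than raising an error
picked <- x[c(2, 4)]   # 2 4: an index can itself be a vector
rest   <- x[-1]        # a negative index drops that element: 2 through 10
```

Note that the index is itself a vector, which fits the pattern of everything else here.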
So, it looks like it's at least one dimensional. So it has rows. Or columns. Right? No.
Code:
> nrow(x)
NULL
> ncol(x)
NULL
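My reading is that a plain vector simply carries no dimension information at all, so there is nothing for nrow() or ncol() to report. A quick sketch of how to check, including the capitalised variants that treat a vector as a single column:

```r
x <- 1:10

d <- dim(x)      # NULL: a plain vector carries no dimension attribute
n <- length(x)   # 10: the size you can always ask for

# The capitalised variants treat a plain vector as a one-column matrix.
rows <- NROW(x)  # 10
cols <- NCOL(x)  # 1
```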
Now, you can make it into a thing with rows or columns.
Code:
> cbind(x)
       x
 [1,]  1
 [2,]  2
 [3,]  3
 [4,]  4
 [5,]  5
 [6,]  6
 [7,]  7
 [8,]  8
 [9,]  9
[10,] 10
> rbind(x)
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    5    6    7    8    9    10
> y = rbind(x)
> class(y)
[1] "matrix"
So, aha, now we have a "matrix". That is... an array (with multiple dimensions). It almost makes sense. But it's more work.
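From what I can gather, a matrix is the same kind of vector with a dimension attribute attached, so you can also get one by setting dim() directly. A sketch under that assumption (`m` is just an illustrative name):

```r
x <- 1:10
m <- x                 # copy, so x itself stays a plain vector
dim(m) <- c(2, 5)      # 2 rows and 5 columns, filled column by column

is.matrix(x)           # FALSE
is.matrix(m)           # TRUE
nrow(m)                # 2
ncol(m)                # 5
m[2, 3]                # 6: row 2, column 3
typeof(m)              # "integer": the contents did not change type
```

I used is.matrix() rather than class() here because, on R 4.0 and later, class() on a matrix reports "array" alongside "matrix".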
And for some odd reason you can convert one class into another as you'd like. But you can't use them interchangeably without converting. So it seems to have, y'know, strict but unimportant typing:
Code:
> as.matrix(x)
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9
[10,]   10
> as.integer(y)
[1] 1 2 3 4 5 6 7 8 9 10
Ah, back to the thing that is so obviously an integer. Right? You remember from algebra class, right? An integer is a bunch of numbers that are real numbers without decimal values. Or, wait, wasn't it one number? Right. An integer is one number. But not in R. It's several!
And just one more weird thing about R: concatenation. What should it do? It should take, let's say, letters and join them into a string. Right? Nope.
Code:
> c(1,2,3)
[1] 1 2 3
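Ah, yes. If we concatenate 1, 2 and 3, we should get a new number with values 1, 2, 3... sure. (Strictly, class() calls this one "numeric" rather than "integer", since plain 1, 2, 3 are stored as doubles.) But ok, surely letters will work!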
Code:
> c('h','e','l','l','o')
[1] "h" "e" "l" "l" "o"
Or not.
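It seems c() is better read as "combine" than "concatenate": it builds a longer vector out of its arguments and never glues strings together. A sketch of the difference (all variable names here are made up for the example):

```r
nums <- c(1, 2, 3)
ints <- c(1L, 2L, 3L)
letters_vec <- c('h', 'e', 'l', 'l', 'o')   # made-up name; letters is already a built-in

class(nums)           # "numeric": plain 1, 2, 3 are stored as doubles
class(ints)           # "integer": the L suffix is needed for integers
class(letters_vec)    # "character"
length(letters_vec)   # 5: five separate one-letter strings
nchar('hello')        # 5: one string holding five characters
length('hello')       # 1: and that one string is a vector of length one
```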
We can do this if it's relevant, though:
Code:
> 'hello'
[1] "hello"
Or, if we're feeling fancy--
Code:
> paste('h','e','l','l','o')
[1] "h e l l o"
Hmm....
Almost.
But then, using another quirk of R, we can save it:
Code:
> paste('h','e','l','l','o',sep='')
[1] "hello"
What's sep='', you might ask? Well... it's the separator argument of course. Functions don't necessarily have set argument orders. Instead you can specify them by name! Also odd... (not that this part can't be useful).
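A few related variations, as far as I understand them: paste0() is paste() with an empty separator, and when the letters already live in one vector it is the collapse argument, not sep, that joins them. A sketch (`letters_vec` is a made-up name):

```r
letters_vec <- c('h', 'e', 'l', 'l', 'o')

a <- paste('h', 'e', 'l', 'l', 'o', sep = '')   # "hello"
b <- paste0('h', 'e', 'l', 'l', 'o')            # "hello": paste0 is paste with sep = ''

# With a single vector argument, sep has nothing to separate;
# collapse is what joins the elements of one vector.
still_five <- paste(letters_vec, sep = '')      # still five separate strings
joined     <- paste(letters_vec, collapse = '') # "hello"

# Named arguments can appear in any position.
dashed <- paste(sep = '-', 'a', 'b')            # "a-b"
```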
And by the way, I should have been using the "official" assignment operator <- (rather than the lazy =) the whole time.
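For comparison, here is a small sketch of the two operators side by side. At the top level they seem to behave the same; the difference I'm aware of is that inside a function call, = names an argument instead of assigning (the names `y` and `z` are just for illustration):

```r
x <- 1:10          # the conventional assignment arrow
y = 1:10           # works the same way at the top level
identical(x, y)    # TRUE

# Inside a function call, = names an argument instead of assigning.
z <- c(a = 1)      # a one-element vector whose element is named "a"
names(z)           # "a": no variable called a is created by this line
```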
Anyway, anyone else experienced similar oddities? Or, perhaps more productively (aside from my slight rant about this), any ideas why it does this? Of course R is primarily for statistics. Maybe there's some obvious reason due to that. Personally I just find it bizarre.