Why R can be hard to learn

Many of the analysts we speak to are being pushed over to R, primarily because it’s open source and therefore a free alternative to commercial data analytics packages, the costs of which can run into tens of thousands of pounds (or more). However, even experienced analysts often find that getting to grips with R is a difficult business, and R has a reputation for being hard to learn. There are a number of reasons why this is the case.

Lack of consistency

In some ways the open source nature of R is its biggest weakness as well as its core strength. Much of the real power of R comes not from its core functionality but from the vast number of user-contributed packages that provide its advanced functionality. However, because these packages are written independently by different users, they can lack consistency of approach, support for them can be extremely limited, and hunting through them to find one that does what you want can be a challenge. For users who are used to the kind of structured and consistent training manuals and support documentation that typically come with commercial analytics packages, the ‘wild west’ feel of R can be something of a culture shock.

Commercial software tends to be written in a more consistent style. If you’re used to using SPSS then something like Stata or even the SAS Enterprise Guide interface won’t come as a complete shock to you. Commercial software is written with user-friendliness and accessibility in mind (or at least it should be!). That means that the terminology tends to be consistent, names for functions tend to be intuitive, and icons and menu items are often standardised across packages.

This isn’t the case with R. As already mentioned, much of the power of R comes from the ways in which it is constantly being extended by its users. Everyone is free to contribute whatever code they want, and not everyone adheres to the same terminology and conventions. Similarly, there’s always more than one way of skinning a cat, so the code that one person writes for a particular task can look completely different from someone else’s code even though it does exactly the same thing.
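To make that concrete, here is a small sketch of the same calculation written three different ways. It uses only functions that ship with base R and the built-in mtcars example data set. The choice of calculation (average miles per gallon for each cylinder count) is ours, purely for illustration.

```r
# Mean miles per gallon (mpg) by number of cylinders (cyl),
# using the mtcars data set that ships with R.

# Style 1: tapply() returns a named array
by_tapply <- tapply(mtcars$mpg, mtcars$cyl, mean)

# Style 2: aggregate() with a formula returns a data frame
by_aggregate <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Style 3: split() followed by sapply() returns a named numeric vector
by_split <- sapply(split(mtcars$mpg, mtcars$cyl), mean)

# The numbers are the same, but each result comes back in a different shape
print(by_tapply)
print(by_aggregate)
print(by_split)
```

All three are perfectly reasonable R, and you are likely to meet each of them in other people’s code. Even the function names in base R follow several different conventions (compare read.csv, Sys.time, seq_len and colMeans), which gives a flavour of what newcomers are up against.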

More knowledge needed to get up and running

It’s not that R doesn’t come with help. It does, but it’s fair to say that some of its help can be a little on the opaque side and is perhaps not written with the needs of beginners in mind. The entry-level knowledge that you need to get going is much greater with R than with a commercial analytics package such as SPSS or SAS. If you’re new to R, it’s going to take you longer to get up and running: you won’t be able to dive straight in and get cracking with your analysis until you’ve got to grips with the R environment, which is unfamiliar when compared to other analytics packages.

An unfamiliar interface

The most obvious difference between R and the commercial analytics packages that our clients are typically used to is the lack of a point-and-click graphical user interface. With packages like SPSS and SAS Enterprise Guide you don’t have to be a programmer to be able to use them. Of course, if you are comfortable programming then you can unlock a deeper level of functionality, but many users get everything they need from SPSS and SAS without ever having to write a line of code. R is different. It doesn’t come with a menu-driven interface for running analyses. The only way you can interact with it in its native form is through code, which can make it seem completely off limits to analysts who are not also programmers.

It’s worth noting that there are graphical front ends that you can bolt onto R which give you the best of both worlds: the power and flexibility of the native R environment with the ease of use and familiarity of a graphical user interface. There are a number of these available, but the one we like best and recommend to our clients is BlueSky Statistics, which is particularly useful for analysts who are used to working with the SPSS interface.

A steeper learning curve

R does offer many benefits when compared to commercial software, not least the fact that it’s free. Some would also say that it’s far more flexible and powerful than any of the commercially available alternatives, and that this power and flexibility is only possible because R is not constrained by a particular commercial way of doing things. However, if you’re planning to make the switch from SPSS, SAS or another commercial analytics tool to R and you’re not a coder, then it’s important to understand that the learning curve is likely to be steeper than you may be used to.

Find out more about BlueSky Statistics here.

Find out about our training, support and consultancy services here.