I remember when I moved into our first house and had to do a bit of DIY in the bathroom to replace some silicone sealant. I removed the old stuff using a utility knife which, whilst it worked, involved a lot of messing about, a scratched bath panel, a cut finger and quite a bit of cursing. Several years on I had to do something similar; this time around, however, I bought myself one of these at our local store:
I never looked back. This small lightweight tool was not only incredibly fast, efficient and easy to use, it also produced a far superior finish with much less fuss. I had the job done in no time!
So why am I telling you this? Well, the exact same thing is true of software utilities. The right tool used on the wrong job is the wrong tool. On numerous occasions I’ve passed comment on the Oracle Developer Community forums where a user is writing a complex PL/SQL procedure to solve a problem that could be solved in a couple of lines of shell script. I’m of the opinion that one can never have too many tools in their toolkit, and for that reason I recently began to look into R as a tool for data analysis and graphing.
I started out with some very basic examples to familiarize myself with the syntax, data structures etc., and then thought I’d see how difficult it would be to produce the Mandelbrot Set graphic (I used a similar project to begin learning Python a while back).
For my first attempt, I followed the same structure that I had used in previous projects – a procedural approach. My final code looked like this:
width2) { return(i) } re

I was quite happy with it. It was fast (all things considered) and produced the set perfectly. Once I’d achieved what I’d set out to do, I began wondering whether what I’d done was a “good” way of doing it in R. After a bit of Google searching, I stumbled across a page which gave an example of producing the set in R: https://www.r-bloggers.com/the-mandelbrot-set-in-r/. I was genuinely stunned (I’m impressed by strange things! 🙂).
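To give a flavour of the set-based idea, here is a rough sketch of my own (it is not the code from the linked page). It keeps the whole complex plane in a matrix and, on each iteration, updates every point that has not yet escaped in a single vectorised step. The function name mandelbrot_counts, the grid size, the iteration cap and the palette are all illustrative assumptions.

```r
# Sketch only: names, grid size, iteration cap and palette are assumptions.
mandelbrot_counts <- function(re = seq(-2, 1, length.out = 300),
                              im = seq(-1.2, 1.2, length.out = 240),
                              max_iter = 50) {
  # Complex plane as a matrix: rows follow `re`, columns follow `im`.
  c_grid <- outer(re, im, function(x, y) complex(real = x, imaginary = y))
  z <- matrix(0+0i, nrow = length(re), ncol = length(im))
  counts <- matrix(0L, nrow = length(re), ncol = length(im))
  for (i in seq_len(max_iter)) {
    alive <- Mod(z) <= 2                    # points that have not escaped yet
    z[alive] <- z[alive]^2 + c_grid[alive]  # one vectorised update per iteration
    counts[alive] <- counts[alive] + 1L     # iterations survived so far
  }
  counts
}

counts <- mandelbrot_counts()
# Points that never escape (count == max_iter) form the set itself.
image(counts, col = heat.colors(50), axes = FALSE)
```

The only explicit loop is over iterations; everything per-pixel is handled by logical indexing on the matrices, which is where most of the speed comes from.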
The code on that page takes fewer than 25 lines to produce the graphic, and what an amazingly elegant solution! Not only that, it was quick! I'm guessing that is mainly down to the set-based approach (apart from the iterative part, of course), which makes great use of the standard R functions. This is what it seems R does best: it processes data quickly, with minimal input from the user, and allows you to rapidly visualize that data in a multitude of ways. To me, it's quite beautiful! Producing a sine wave plot is as easy as:
x <- seq(-pi, pi, 0.1)
y <- sin(x)
plot(x, y, type = "l")

Do you want to visualize multiple data sets in relation to each other? Sure. We’ll use one of the built-in data sets as an example (you can list them using the data() command): iris (Edgar Anderson’s Iris Data). We can use the pairs command to build a correlation plot of the first four columns (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width):
pairs(iris[1:4])

We can also very easily colour-code the points according to species using additional options to the pairs command:
pairs(iris[1:4], pch = 21, bg = c("red","green3","blue")[unclass(iris$Species)])

Here pch specifies the symbol to use, 21 being a filled circle (see ?points for the available symbols), and bg specifies a vector of fill colours. unclass strips the factor class from iris$Species, leaving its integer codes (1, 2, 3), which index the colour palette so that each species gets its own colour. The result is a very nice looking graphic.
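The indexing trick inside bg is easier to see on its own. The sketch below pulls it out of the pairs call; the variable names (species_palette, codes, point_cols) are just illustrative names of mine.

```r
# Sketch of the colour lookup used for bg; variable names are assumptions.
species_palette <- c("red", "green3", "blue")

# A factor stores integer codes plus level labels; unclass() exposes the codes.
codes <- unclass(iris$Species)

# Indexing the palette by those codes yields one colour per row of iris:
# setosa -> "red", versicolor -> "green3", virginica -> "blue".
point_cols <- species_palette[codes]

# Cross-tabulate to check that each species maps to exactly one colour.
print(table(iris$Species, point_cols))
```

as.integer(iris$Species) would give the same codes, so either call works as the index here.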
So nothing Oracle related – which is nice for a change – but something very powerful with great potential that I’m massively excited to learn more about!