Frequently, you want to plot multiple series on the same plot. Let’s try plotting daily observations along with a 30-day moving average.
To start, I have observations for YHOO stock from 12 April 1996 through 2 July 2009.
First, the data needs cleaning: I convert the column names to lower case with the tolower function, and I convert the dates, which are text formatted as yyyy-mm-dd and read in as factors, into Date objects with as.Date:
> yahoo <- read.csv(file='~/stuff/blog/YHOO stock prices [19960412, 20090702].csv', header=TRUE, sep=',')
> str(yahoo)
'data.frame': 3329 obs. of 7 variables:
$ Date : Factor w/ 3329 levels "1996-04-12","1996-04-15",..: 3329 3328 3327 3326 3325 3324 3323 3322 3321 3320 ...
$ Open : num 15.2 15.5 15.8 15.9 15.6 ...
$ High : num 15.3 15.7 15.9 16 15.8 ...
$ Low : num 14.9 15.3 15.3 15.6 15.5 ...
$ Close : num 15 15.4 15.7 15.9 15.7 ...
$ Volume : int 16919900 12716100 16033900 12312100 26449100 19827800 30979700 15866300 26488700 20323100 ...
$ Adj.Close: num 15 15.4 15.7 15.9 15.7 ...
>
> colnames(yahoo) <- tolower( colnames(yahoo) )
> yahoo$date <- as.Date( as.character( yahoo$date ) )
>
> # sort yahoo chronologically, the order in which we want to display it
> yahoo <- yahoo[ order(yahoo$date), ]
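If you don't have the CSV at hand, you can try the same cleaning steps on a few inline rows. This is only a minimal sketch: the three rows below are made-up values, not real YHOO prices, and the names csv_text and toy are hypothetical. Passing stringsAsFactors=FALSE (the default since R 4.0.0) reads the dates as character strings, so the as.character step isn't needed here.

```r
# Toy data: made-up values in the same layout as the YHOO file (newest row first)
csv_text <- "Date,Open,Close
2009-07-02,15.2,15.0
2009-07-01,15.5,15.4
2009-06-30,15.8,15.7"

toy <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

colnames(toy) <- tolower(colnames(toy))   # Date -> date, Open -> open, ...
toy$date <- as.Date(toy$date)             # character yyyy-mm-dd -> Date
toy <- toy[order(toy$date), ]             # oldest observation first

str(toy)
```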
Now, let's take a first pass at plotting:
> plot(x=yahoo$date, y=yahoo$close,
+ main='YHOO stock close', xlab='date', ylab='close ($)')
That isn't very pretty, not least because we're displaying too much data to be useful. Let's cut it down to just the data from January 1, 2008 onward:
> yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'), ]
> plot(x=yahoo2$date, y=yahoo2$close,
+ main='YHOO stock close', xlab='date', ylab='close ($)')
It's worth pointing out that R's plotting code will attempt to set the upper and lower y bounds to something reasonable based on the data you give it. However, sometimes, particularly to get a sense of scale, you really want to see the full range down to zero. You can accomplish this by explicitly setting the y axis limits with ylim. I also make the plot more presentable by drawing the series as a black line (type='l') instead of individual points.
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ col='black', type='l',
+ main='YHOO stock close', xlab='date', ylab='close ($)')
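As a rough illustration of the difference, the sketch below draws the same series twice, once with R's default limits and once with the explicit limits used above. The data are synthetic and every variable name is hypothetical. The point is that the default axis hugs the data, while ylim=c(0, 1.1*max(y)) anchors the axis at zero and leaves 10% of headroom.

```r
# Synthetic series standing in for closing prices (not real data)
set.seed(1)
dates <- seq(as.Date("2008-01-01"), by = "day", length.out = 100)
y <- 20 + cumsum(rnorm(100, sd = 0.3))

default_limits  <- range(y)               # what the default axis is based on
explicit_limits <- c(0, 1.1 * max(y))     # zero baseline plus 10% headroom

par(mfrow = c(1, 2))                      # two panels side by side
plot(dates, y, type = "l", main = "default limits")
plot(dates, y, type = "l", ylim = explicit_limits, main = "ylim from zero")
par(mfrow = c(1, 1))                      # restore the single-panel layout
```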
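Next, I want to plot the moving average, so I write the function ma30 to calculate it. For the first 29 observations, where fewer than 30 values are available, it averages whatever exists so far. I store the result in a new column, close30, computed over the whole data range so that the moving average is already correct at the beginning of our subset: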
> ma30 <- function( x, na.rm=FALSE ){
+   val <- numeric( length( x ) )
+   for( j in seq_along( x ) ){
+     # average the current value and up to 29 preceding ones
+     window <- x[ max( j - 29, 1 ):j ]
+     val[ j ] <- mean( window, na.rm=na.rm )
+   }
+   val
+ }
>
> yahoo$close30 <- ma30(yahoo$close)
> yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'), ]
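The loop is easy to read, but it re-adds up to 30 values at every position. One vectorized alternative, offered as a sketch and not as a drop-in replacement, uses cumulative sums: the sum of the last n values is the running total at position j minus the running total at position j - n. The function name ma_n and its n argument are hypothetical, and unlike ma30 it makes no attempt to handle NA values.

```r
# Trailing moving average via cumulative sums.
# Assumes x is numeric with no NA values; early positions average over
# however many observations are available, as ma30 does.
ma_n <- function(x, n = 30) {
  idx <- seq_along(x)
  cs <- cumsum(x)
  lagged <- c(rep(0, n), cs)[idx]   # running total n positions earlier, 0 before the start
  (cs - lagged) / pmin(idx, n)      # divide by the actual window width
}

x <- c(1, 2, 3, 4, 5)
ma_n(x, n = 3)
# 1.0 1.5 2.0 3.0 4.0
```

For a few thousand rows, as here, either version finishes instantly; the cumulative-sum form mainly pays off on much longer series.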
And finally, I replot the data, adding the moving average as a second series and making it slightly bolder (lwd=2) to emphasize the moving average over the daily observations:
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ col='black', type='l',
+ main='YHOO stock close', xlab='date', ylab='close ($)')
> lines(x=yahoo2$date, y=yahoo2$close30, col='red', lwd=2)
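With two series on one plot, a legend helps the reader tell them apart. The sketch below is self-contained and uses synthetic data in place of the YHOO prices (all variable names are hypothetical); legend() is part of base graphics. It draws the second series with lines(), which produces the same result as points() with type='l'.

```r
# Synthetic daily series (not real prices) and its trailing 30-observation average
set.seed(42)
dates <- seq(as.Date("2008-01-01"), by = "day", length.out = 200)
close <- 20 + cumsum(rnorm(200, sd = 0.4))
close30 <- sapply(seq_along(close), function(j) mean(close[max(j - 29, 1):j]))

plot(x = dates, y = close, ylim = c(0, 1.1 * max(close)),
     col = "black", type = "l",
     main = "synthetic close", xlab = "date", ylab = "close ($)")
lines(x = dates, y = close30, col = "red", lwd = 2)

# Match each legend entry to the colour and line width used above
legend("bottomright",
       legend = c("daily close", "30 day moving average"),
       col = c("black", "red"), lwd = c(1, 2))
```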



