<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Stochastic Nonsense &#187; plotting series</title>
	<atom:link href="http://blog.earlh.com/index.php/tag/plotting-series/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.earlh.com</link>
	<description></description>
	<lastBuildDate>Mon, 19 Sep 2011 03:30:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Plotting in Grids</title>
		<link>http://blog.earlh.com/index.php/2009/12/plotting-in-grid/</link>
		<comments>http://blog.earlh.com/index.php/2009/12/plotting-in-grid/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 04:03:20 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=474</guid>
		<description><![CDATA[This is post #12 in a running series about plotting in R. I regularly find myself wanting to show arrays or grids of plots in R. This is straightforward using par and mfrow as long as you want a symmetric, &#8230; <a href="http://blog.earlh.com/index.php/2009/12/plotting-in-grid/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #12 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>I regularly find myself wanting to show arrays or grids of plots in R.  This is straightforward using par and mfrow as long as you want a symmetric, evenly spaced grid of plots.  Unfortunately, this often is not what I want.  Even more unfortunately, this is a hard question to google for.  I&#8217;ve tried array of plots, grid of plots, matrix of plots, asymmetric grids of plots, asymmetric arrays, uneven grids of plots, uneven mfrow, uneven mfcol, etc, and nothing worked.  (Searches listed here in the hopes that other people with the same question will find the answer.)</p>
<p>I actually didn&#8217;t think this could be accomplished without using lattice or ggplot2, but I recently discovered that it can be done with R&#8217;s base plotting functions.  The function layout provides what we&#8217;re looking for.  It takes a matrix whose entries say which plot goes in each cell of the grid: plot 1 is drawn in the cells containing 1, plot 2 in the cells containing 2, and so on.  After creating your layout, you can use layout.show to preview where your plots will go.  Let&#8217;s take a look at some examples.</p>
<p>This creates a two by two grid, exactly as mfrow does.<br />
<code>
<pre class="brush: text;">
# 2 by 2 grid, the same as mfrow=c(2,2)
pp <- layout(matrix(c(1,2,3,4), 2, 2, byrow=T))
layout.show(pp)
</pre>
<p></code><br />
<center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.00.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.00-300x300.png" alt="plot12.00" title="plot12.00" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a><br />
</center></p>
<p>For comparison, this creates a 2 by 2 grid as mfcol does.  The only difference is the order of the plot numbers in the matrix.<br />
<code>
<pre class="brush: text;">
# 2 by 2 grid, the same as mfcol=c(2,2)
pp <- layout(matrix(c(1,2,3,4), 2, 2, byrow=F))
layout.show(pp)
</pre>
<p></code><br />
<center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.01-300x300.png" alt="plot12.01" title="plot12.01" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a><br />
</center></p>
<p>We can put 0 in any position in the matrix to not plot there.<br />
<code>
<pre class="brush: text;">
# no plotting in the first quadrant
pp <- layout(matrix(c(1,0,2,3), 2, 2, byrow=T))
layout.show(pp)
</pre>
<p></code><br />
<center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.02-300x300.png" alt="plot12.02" title="plot12.02" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a><br />
</center></p>
<p>Now, let's just have one plot use all of the left column.  The trick to spanning cells like this is to repeat the number of the plot that you want to span -- note that 1 occurs twice in the layout matrix, once for each row of the left column.<br />
<code>
<pre class="brush: text;">
# now a tall plot filling the left column and two small plots in the right column
pp <- layout(matrix(c(1, 1, 2, 3), 2, 2, byrow=F))
layout.show(pp)
</pre>
<p></code><br />
<center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.03-300x300.png" alt="plot12.03" title="plot12.03" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a><br />
</center></p>
<p>Finally, we can set widths for the columns (or for the rows -- just use heights instead of widths).<br />
<code>
<pre class="brush: text;">
# same as above, but with the left column having 3/4 of the width
pp <- layout(matrix(c(1, 1, 2, 3), 2, 2, byrow=F), widths=c(3,1))
layout.show(pp)
</pre>
<p></code><br />
<center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.04-300x300.png" alt="plot12.04" title="plot12.04" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a><br />
</center></p>
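<p>As a minimal sketch of the heights variant (the matrix and the ratios here are my own illustrative choices, not taken from the plots above), this gives the top row a quarter of the height and lets the bottom plot span both columns:</p>
<code>
<pre class="brush: text;">
# top row: two plots side by side; bottom row: plot 3 spans both columns
mat <- matrix(c(1, 2, 3, 3), 2, 2, byrow=TRUE)

# heights are relative: the top row gets 1/4 of the height, the bottom row 3/4
pp <- layout(mat, heights=c(1, 3))
layout.show(pp)
</pre>
</code>
<p>layout returns the number of figures it defined, which is why its result can be handed straight to layout.show.</p>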
<p>Now, let's show off what I originally wanted to do: display a plot of two dimensions of a distribution, along with the marginal distributions.  I'm wrapping the functionality up into a function so it's easy to reuse.  I use plot to show the sample and barplot to show the distribution as calculated by hist.</p>
<p><code>
<pre class="brush: text;">
# now let's demonstrate with a plot of the multivariate normal and histograms of the marginal distributions
# (the mvrnorm function used below comes from package MASS)

plotWithMarginals <- function(x, y){

	# find the largest absolute value across both dimensions,
	# then use one symmetric set of breaks so x and y share the same bins
	mm <- max(abs(range(x, y)))
	breaks <- seq(-mm, mm, by=(2*mm)/1000)

	hist0 <- hist(x, breaks=breaks, plot=F)
	hist1 <- hist(y, breaks=breaks, plot=F)

	# create a grid and check it out to make sure that it's what we want
	pp <- layout(matrix(c(2,0,1,3), 2, 2, byrow=T), c(3,1), c(1,3), T)
	layout.show(pp)

	rang <- c(-mm, mm)

	par(mar=c(3,3,1,1))
	plot(x, y, xlim=rang, ylim=rang, xlab='', ylab='')

	# now plot marginals
	top <- max(hist0$counts, hist1$counts)
	par(mar=c(0,3,1,1))
	barplot(hist0$counts, axes=F, ylim=c(0, top), space=0)

	par(mar=c(3,0,1,1))
	barplot(hist1$counts, axes=F, xlim=c(0,top), space=0, horiz=T)
}

# mvrnorm <-- sample from a multivariate normal distn
library(MASS)
</pre>
<p></code></p>
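<p>The function above leans on the fact that hist can bin data without drawing anything.  Here is a small sketch of that step on its own, with a made-up sample and only 10 bins (both are assumptions for illustration):</p>
<code>
<pre class="brush: text;">
set.seed(1)
x <- rnorm(200)

# symmetric breaks around zero that are guaranteed to cover the sample
mm <- max(abs(range(x)))
breaks <- seq(-mm, mm, length.out=11)

# plot=FALSE returns the binning instead of drawing it
h <- hist(x, breaks=breaks, plot=FALSE)
h$counts         # number of observations in each of the 10 bins
sum(h$counts)    # every observation lands in exactly one bin
</pre>
</code>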
<p>Now that all the prep is done, this shows a multivariate normal distribution with no correlation between the two variables.  Note the shape of the marginal distributions.<br />
<code>
<pre class="brush: text;">
eye2 <- matrix(c(1,0,0,1), 2, 2)
sample <- mvrnorm(n=10000, mu=c(0,0), Sigma=eye2)
plotWithMarginals(sample[,1], sample[,2])
</pre>
<p></code></p>
<p><center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.05.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.05-300x300.png" alt="plot12.05" title="plot12.05" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a><br />
</center></p>
<p>And finally, for contrast, a correlated multivariate normal.<br />
<code>
<pre class="brush: text;">
yescorr <- matrix(c(1, 0.9, 0.9, 1), 2, 2, byrow=T)
sample <- mvrnorm(n=10000, mu=c(0,0), Sigma=yescorr)
plotWithMarginals(sample[,1], sample[,2])
</pre>
<p></code></p>
<p><center><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.06.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/12/plot12.06-300x300.png" alt="plot12.06" title="plot12.06" width="300" height="300" class="aligncenter size-medium wp-image-485" /></a></center></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/12/plotting-in-grid/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shading Pieces of an R Plot</title>
		<link>http://blog.earlh.com/index.php/2009/08/shading-pieces-of-an-r-plot/</link>
		<comments>http://blog.earlh.com/index.php/2009/08/shading-pieces-of-an-r-plot/#comments</comments>
		<pubDate>Wed, 12 Aug 2009 01:25:58 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=400</guid>
		<description><![CDATA[This is post #11 in a running series about plotting in R. I often want to shade pieces of an R plot, in order to visually draw out some piece, such as weekends or recessions. Let&#8217;s look at how to &#8230; <a href="http://blog.earlh.com/index.php/2009/08/shading-pieces-of-an-r-plot/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #11 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>I often want to shade pieces of an R plot, in order to visually draw out some piece, such as weekends or recessions.  Let&#8217;s look at how to do that with the plain plotting tools.</p>
<p>First, I have some obscured data from work.  I&#8217;m going to take 3 series and turn them into stacked filled line plots.  But first, let me show you where we&#8217;re going to end up:</p>
<div id="attachment_401" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.01-300x200.png" alt="Spending Plot, Three Series, Weekends Shaded" title="plot11.01" width="300" height="200" class="size-medium wp-image-401" /></a><p class="wp-caption-text">Spending Plot, Three Series, Weekends Shaded</p></div>
<p>To begin, let&#8217;s grab some data: <a href='http://blog.earlh.com/wp-content/uploads/2009/08/post11.data.csv'>post11.data</a> and prep it, which basically involves making R understand that the day column is a date.<br />
<code>
<pre class="brush: text;">
m2 <- read.csv(file='post11.data.csv', header=T, sep=',')
m2$day <- as.Date(as.character(m2$day))
</pre>
<p></code></p>
<p>Now that that's over, let's just plot the 3 stacked series.  The basic technique, as mentioned in the last post, is constructing polygons with our desired boundaries.</p>
<p><code>
<pre class="brush: text;">
ylim <- c(0, 1.1*max(m2$src1 + m2$src2 +  m2$src3))
xx <- c(m2$day, rev(m2$day))
yysrc2 <- c(rep(0, nrow(m2)), rev(m2$src2))
plot(x=m2$day, y=m2$src2, ylim=ylim, col='red', type='l', xaxt='n',
	ylab='Dollars ($)', xlab='Date', main='Spending')
polygon(xx, yysrc2, col='red')

yysrc1 <- c(m2$src2, rev(m2$src2) + rev(m2$src1))
polygon(xx, yysrc1, col='blue')

yysrc3 <- c(m2$src2 + m2$src1, rev(m2$src2) + rev(m2$src1) + rev(m2$src3))
polygon(xx, yysrc3, col='green')
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.02-300x200.png" alt="plot11.02" title="plot11.02" width="300" height="200" class="aligncenter size-medium wp-image-404" /></a></p>
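<p>The polygon construction above can be hard to see with three series in play, so here is the same idea reduced to a single toy series (the values are made up):</p>
<code>
<pre class="brush: text;">
x <- 1:5
y <- c(2, 3, 1, 4, 2)

# walk left to right along the baseline, then right to left along the series
xx <- c(x, rev(x))
yy <- c(rep(0, length(x)), rev(y))

plot(x, y, type='l', ylim=c(0, 1.1*max(y)))
polygon(xx, yy, col='red')
</pre>
</code>
<p>Stacking a second series works the same way, except that the baseline becomes the top of the series underneath.</p>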
<p>And let's add some prettying up: a legend and X axis labels on the first of the month and the last data point present.  Note that this code is generic, so it will work on arbitrary date ranges, including spanning years, date ranges that don't end on the last day of a month, etc.</p>
<p><code>
<pre class="brush: text;">
# x axis labels
labdates <- as.Date('1970-01-01') + min(m2$day):max(m2$day)
labdates <- labdates[ format(labdates, '%d') == '01']
labdates <- unique(c(labdates, max(m2$day)))

labnames <- format(labdates, '%d %b %y')
axis(1, at=labdates, labels=labnames)

# black lines first day of month
for(a in labdates[ format(labdates, '%d') == '01']){
	abline(v=a)
}

legend(x=m2[1,]$day, y=ylim[2]+500, c('src1', 'src2', 'src3'), fill=c('blue', 'red', 'green'))
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.03-300x200.png" alt="plot11.03" title="plot11.03" width="300" height="200" class="aligncenter size-medium wp-image-405" /></a></p>
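<p>To check that the label selection really copes with ranges that start and end mid-month, here is a sketch on a made-up date range.  It uses seq with by='day' to build the run of days, which is my substitution for the arithmetic on 1970-01-01 above:</p>
<code>
<pre class="brush: text;">
# a made-up range that starts and ends mid-month
days <- seq(as.Date('2009-06-20'), as.Date('2009-08-10'), by='day')

# keep the first of each month, then append the last day present
firsts <- days[format(days, '%d') == '01']
labdates <- unique(c(firsts, max(days)))
format(labdates, '%Y-%m-%d')
# "2009-07-01" "2009-08-01" "2009-08-10"
</pre>
</code>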
<p>Now, let's shade the background.  What I'm going to do is draw semi-transparent grey boxes over Saturday and Sunday.  There are a couple of issues to be careful of: first, we don't want to do this over the legend, so we make the first couple of rectangles shorter.  Second, we find weekends by looking for Saturday, but the first day of data present could be a Sunday, so we have to check for that as well.  The rect command takes the left, right, bottom, and top edges of each box; since the days are stored as Dates, a-1 means one day before a.<br />
<code>
<pre class="brush: text;">
# make the first 2 rectangles shorter to avoid our legend
cnt <- 0

# set alpha = 80 so it's relatively transparent
color <- rgb(190, 190, 190, alpha=80, maxColorValue=255)

# check if the first data point is a Sunday
m2$dow <- format( m2$day, '%a' )
lhs <- m2[1,'dow']
if(lhs == 'Sun'){
	a <- m2[1,]$day
	rect(xleft=a-1, xright=a+2 - 1, ybottom=-1000, ytop=1.1*ylim[2] * ifelse(cnt < 2, 0.7, 1), density=100, col=color)
	cnt <- cnt + 1
}

# plot 2-day width rectangles on every Saturday
for( a in m2[ m2$dow == 'Sat', ]$day ){
	rect(xleft=a-1, xright=a+2 - 0.5, ybottom=-1000, ytop=1.1*ylim[2] * ifelse(cnt < 2, 0.7, 1), density=100, col=color)
	cnt <- cnt + 1
}
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/08/plot11.04-300x200.png" alt="plot11.04" title="plot11.04" width="300" height="200" class="aligncenter size-medium wp-image-414" /></a></p>
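<p>One caveat: matching on 'Sat' and 'Sun' relies on English day names, so it may not match in other locales.  A possible alternative, sketched here on a made-up week of dates, is to use the numeric weekday from the '%u' format instead:</p>
<code>
<pre class="brush: text;">
# one made-up week of dates starting on Saturday 2009-08-01
days <- seq(as.Date('2009-08-01'), by='day', length.out=7)

# '%u' is the weekday as a number, Monday = 1 through Sunday = 7
dow <- as.integer(format(days, '%u'))
weekend <- dow >= 6
days[weekend]
</pre>
</code>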
<p>Finally, I added a quick set of dashed (lty=3) horizontal rules.  This makes it much easier to read on projectors.  For results, see the first image in this post.<br />
<code>
<pre class="brush: text;">
# dashed grid
horguides <- c(5,10,15,20)* 1000
for(h in horguides){
	abline(h=h, col='gray60', lwd=0.5, lty=3)
}
</pre>
<p></code></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/08/shading-pieces-of-an-r-plot/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multiple Y Axes in R Plots &#8212; Part 9 in a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/multiple-y-axes-in-r-plots-part-9-in-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/multiple-y-axes-in-r-plots-part-9-in-a-series/#comments</comments>
		<pubDate>Mon, 20 Jul 2009 16:00:23 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=204</guid>
		<description><![CDATA[This is post #09 in a running series about plotting in R. Frequently, you want to plot data that is not at all on the same scale. In R, this is done via plotting a second graph on top of &#8230; <a href="http://blog.earlh.com/index.php/2009/07/multiple-y-axes-in-r-plots-part-9-in-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #09 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>Frequently, you want to plot data series that are not at all on the same scale.  In R, this is done by plotting a second graph on top of your first and building the axis labels by hand.  Here&#8217;s a rough outline:<br />
<code>
<pre class="brush:text;">
> plot  <-- first plot

> par(new=T)   <-- tell R to draw the next plot on top of the first instead of clearing it

> plot( ..., axes=F, ... )   <-- plot our second plot, but don't touch the axes
</pre>
<p></code></p>
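<p>Before getting to the stock data, here is a runnable sketch of that outline on two made-up series.  The series, colors, and margin sizes are all assumptions chosen for illustration:</p>
<code>
<pre class="brush: text;">
x <- 1:10
big <- 100 * x      # made-up series on a large scale
small <- sqrt(x)    # made-up series on a small scale

# leave room on the right for the second axis
par(mar=c(5, 4, 4, 4) + 0.1)
plot(x, big, type='l', col='blue', ylab='big')

# draw the next plot on top of the current one
par(new=TRUE)
plot(x, small, type='l', col='red', axes=FALSE, xlab='', ylab='')

# build the right hand axis and its label by hand
axis(4, col='red')
mtext('small', side=4, line=2.5)
</pre>
</code>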
<p>With that in mind, let's continue our long running example and plot both YHOO and GOOG stock prices on the same graph, along with moving averages for both.</p>
<p>Here, again, are both data series: <a href='http://blog.earlh.com/wp-content/uploads/2009/07/GOOG-stock-prices-19960412-20090702.csv'>google</a> and <a href='http://blog.earlh.com/wp-content/uploads/2009/07/YHOO-stock-prices-19960412-20090702.csv'>yahoo</a>.</p>
<p>First, let's prep our google data:<br />
<code>
<pre class='brush:text;'>
> goog <- read.csv(file='~/stuff/blog/GOOG stock prices [19960412, 20090702].csv', header=T, sep=',')
> colnames(goog) <- tolower( colnames(goog) )
>
> goog$date <- as.Date( as.character( goog$date ) )
> goog <- goog[order(goog$date),]
>
> # util functions
> summary30 <- function( x, FUN, na.rm=F ){
+		val <- rep( 0, length( x ) )
+		for( j in 1:length( x ) ){
+			val[ j ] <- FUN( x[ max( j - 29, 1 ):j ], na.rm=na.rm)
+		}
+		val
+	}
>
> # 30 day trailing moving average of the close
> goog$close30 <- summary30(goog$close, FUN=mean)
> goog2 <- goog[ goog$date >= as.Date('2008-01-01'),]
>
</pre>
<p></code><br />
The yahoo data, used below as yahoo2, was prepped in exactly the same way.</p>
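<p>To convince yourself that the trailing window behaves as expected at the start of the series, here is a sketch of the same idea as summary30 with the window length pulled out as a parameter.  The name trailing and the argument k are mine, purely for illustration:</p>
<code>
<pre class="brush: text;">
# apply FUN to the last k values up to and including each position;
# the window is simply shorter for the first k - 1 positions
trailing <- function(x, k, FUN){
	sapply(seq_along(x), function(j) FUN(x[max(j - k + 1, 1):j]))
}

ma3 <- trailing(c(1, 2, 3, 4, 5), k=3, FUN=mean)
ma3
# 1.0 1.5 2.0 3.0 4.0
</pre>
</code>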
<p>Initially, let's just try plotting both sets of series on the same scale and see what happens.<br />
<code>
<pre class='brush:text;'>
> plot(x=goog2$date, y=goog2$close, ylim=c(0,1.1*max(goog2$close)),
+ 	col='black', type='l',
+ 	main='goog stock close', xlab='date', ylab='close ($)',
+ 	xaxt='n')
>
> points(x=goog2$date, y=goog2$close30, col='green', type='l', lwd=2)
>
> points(x=yahoo2$date, y=yahoo2$close, col='black', type='l')
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l')
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.01-300x200.png" alt="plot09.01" title="plot09.01" width="300" height="200" class="aligncenter size-medium wp-image-241" /></a><br />
While we did get all four time series onto the same plot, the yahoo data is so squashed that you can't really tell what's going on.</p>
<p>So let's try the above approach and create independent scales / Y axes for the two sets of time series:<br />
<code>
<pre class="brush:text;">
> plot(x=goog2$date, y=goog2$close, ylim=c(0,1.1*max(goog2$close)),
+ 	col='black', type='l',
+ 	main='goog stock close', xlab='date', ylab='close ($)',
+ 	xaxt='n')
>
> points(x=goog2$date, y=goog2$close30, col='green', type='l', lwd=2)
>
> par(new=T)
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ 	col='black', type='l', lty=2,
+ 	xaxt='n', axes=F, ylab='')
>
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2, lty=2)
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.02-300x200.png" alt="plot09.02" title="plot09.02" width="300" height="200" class="aligncenter size-medium wp-image-243" /></a><br />
I attempted to use dashed (lty=2) instead of solid lines to differentiate the two sets of data, but it's clearly not a good outcome.  Instead, let's give both time series for a stock -- the daily observations and the moving average -- the same color, and rely on line width to differentiate within a stock.  I also switched Google from green to blue as it shows up much better.  You can pass a col parameter to the axis functions, so let's set the axis line and tick marks to the same color as our series to help associate the series with their proper scales.</p>
<p><code>
<pre class="brush:text;">
> plot(x=goog2$date, y=goog2$close, ylim=c(0,1.1*max(goog2$close)),
+ 	col='blue', type='l',
+ 	main='Google (GOOG) vs Yahoo (YHOO) stock close', xlab='date', ylab='close ($)',
+ 	xaxt='n', yaxt='n', lwd=0.75)
>
> points(x=goog2$date, y=goog2$close30, col='blue', type='l', lwd=2.5)
> axis(2, pretty(c(0, 1.1*max(goog2$close))), col='blue')
>
> par(new=T)
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ 	col='red', type='l', lwd=0.75,
+ 	xaxt='n', axes=F, ylab='')
>
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2.5)
>
> axis(4, pretty(c(0, 1.1*max(yahoo2$close))), col='red')
>
</pre>
<p></code></p>
<p><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.03-300x200.png" alt="plot09.03" title="plot09.03" width="300" height="200" class="aligncenter size-medium wp-image-242" /></a></p>
<p>This is definitely an improvement -- you can see the differences in the two stocks, and the lines can easily be visually distinguished.  Nonetheless, the blue color of the left axis is pretty faint, the red color of the right axis is less intuitive than I would have liked, and the tick marks not only are on different scales but occur with much different frequency.  The last is an unfortunate side effect of how pretty, the R function that tries to pick out nice round tick values, works.</p>
<p>So for our final plot, I decided to create nicer tick values by hand.  If you look at the maxima of the two series, you'll note that you can round them up a little and pick a set of numbers with a nice ratio: 700 for google and 35 for yahoo.  So I'll adjust the two scales so that each tick for google is 20 times the corresponding yahoo tick.<br />
<code>
<pre class="brush:text;">
> max(goog2$close)
[1] 685.33
> max(yahoo2$close)
[1] 29.98
> 700/35
[1] 20
>
</pre>
<p></code></p>
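<p>Spelled out, the arithmetic above yields two sets of tick locations that line up exactly.  This sketch uses seq with a hand-picked step of 5 rather than pretty, which is an assumption on my part so that the result doesn't depend on how pretty rounds:</p>
<code>
<pre class="brush: text;">
yahoo_top <- 35                    # rounded up from max(yahoo2$close) = 29.98
google_top <- 700                  # rounded up from max(goog2$close) = 685.33
ratio <- google_top / yahoo_top    # 20

yat <- seq(0, yahoo_top, by=5)     # yahoo tick locations
gat <- ratio * yat                 # matching google tick locations
rbind(yahoo=yat, google=gat)
</pre>
</code>
<p>Because every google tick is the yahoo tick times the same constant, the two axes share tick positions as long as the two ylim ranges keep the same ratio.</p>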
<p>I'm also going to stick both Y axes on the left to really help distinguish between the two stocks, and in doing so, I'll have to manually move the Y axis label out farther to accommodate.  This can be accomplished with the oma, or outer margin, parameter to par.  I'll also bring back the fancy X axis labels from part 6 of this series.<br />
<code>
<pre class="brush:text;">
> # create label locations for the yahoo data -- pretty values for the range [0, 35]
> yat <- pretty(c(0, 35))
>
> # add extra room to the left of the plot
> par(oma=c(0,2,0,0))
>
> # plot, but don't label any of the axes
> plot(x=goog2$date, y=goog2$close, ylim=c(0,700),
+ 	col='blue', type='l',
+ 	main='Google (GOOG) vs Yahoo (YHOO) stock close', xlab='date', ylab='',
+ 	xaxt='n', yaxt='n', lwd=0.75)
>
> points(x=goog2$date, y=goog2$close30, col='blue', type='l', lwd=2.5)
> # manually label axis 2, left, with the ratio calculated above times our manual label locations
> axis(2, col='blue', at=20*yat, labels=20*yat)
>
> # tell R to draw over the current plot with a new one
> par(new=T)
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,35),
+ 	col='red', type='l', lwd=0.75,
+ 	xaxt='n', axes=F, ylab='')
>
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2.5)
>
> # label the yahoo data
> axis(side=2, at=yat, labels=yat, col='red', line=2)
>
> # manually label, farther out than normal, the Y axis
> mtext(side=2, line=4, 'close ($)')
>
> # this code proceeds as in part 6 to neatly label the X axis
> # put X axis labels on first date present in each quarter
> locs <- tapply(X=yahoo2$date, FUN=min, INDEX=format(yahoo2$date, '%Y%m'))
>
> at = yahoo2$date %in% locs
>
> at = at &#038; format(yahoo2$date, '%m') %in% c('01', '04', '07', '10')
> axis(side=1, at=yahoo2$date[ at ], 	labels=format(yahoo2$date[at], '%b-%y'))
> abline(v=yahoo2$date[at], col='grey', lwd=0.5)
>
> legend(x=as.Date('2009-01-01'), y=35,
+ 	legend=c('GOOG daily close', 'GOOG 30 day MA', 'YHOO daily close', 'YHOO 30 day MA'),
+ 	col=c(rep('blue',2), rep('red', 2)), lwd=c(1.5, 3.5, 1.5, 3.5))
>
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot09.04-300x200.png" alt="plot09.04" title="plot09.04" width="300" height="200" class="aligncenter size-medium wp-image-244" /></a><br />
I think this plot came out much better.  On the left hand side, the contrast between the two scales is clear, and the tick marks neatly line up.  You can also clearly see that the percentage changes in the two stocks tracked each other.  My last nitpick is that an inch could be reclaimed from the bottom of the plot by not showing a data range where the stocks never venture, but I kind of like that you really get a good sense of the range of the data relative to the lower bound, zero.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/multiple-y-axes-in-r-plots-part-9-in-a-series/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Visualizing and Comparing Distributions &#8212; Part 8 of a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/visualizing-and-comparing-distributions-part-8-of-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/visualizing-and-comparing-distributions-part-8-of-a-series/#comments</comments>
		<pubDate>Mon, 13 Jul 2009 21:16:32 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=156</guid>
		<description><![CDATA[This is post #08 in a running series about plotting in R. Last time, I talked about visualizing the Uniform, Normal, Exponential, and Poisson Distributions. However, there are more useful methods than just plotting the density and distribution functions. Of &#8230; <a href="http://blog.earlh.com/index.php/2009/07/visualizing-and-comparing-distributions-part-8-of-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #08 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>Last time, I talked about <a href="http://blog.earlh.com/index.php/2009/07/multiple-plots-and-visualizing-distributions-part-7-in-a-series/"> visualizing the  Uniform, Normal, Exponential, and Poisson Distributions</a>.  However, there are more useful methods than just plotting the density and distribution functions.</p>
<p>Of course, you can always simply ask R to output summary statistics:<br />
<code>
<pre class="brush:text;">
> n <- 10000
> d <- data.frame(unif=runif(n=n), norm=rnorm(n=n), exp=rexp(n=n),
+   pois=rpois(n, lambda=1))
>
> summary(d)
      unif                norm                exp                 pois
 Min.   :2.449e-05   Min.   :-3.603527   Min.   :0.0001143   Min.   :0.000
 1st Qu.:2.489e-01   1st Qu.:-0.657498   1st Qu.:0.2963731   1st Qu.:0.000
 Median :4.991e-01   Median :-0.003342   Median :0.6927739   Median :1.000
 Mean   :5.001e-01   Mean   : 0.008527   Mean   :0.9958194   Mean   :1.006
 3rd Qu.:7.498e-01   3rd Qu.: 0.662084   3rd Qu.:1.3718103   3rd Qu.:2.000
 Max.   :1.000e+00   Max.   : 3.512077   Max.   :9.4726291   Max.   :6.000
>
</pre>
<p></code></p>
<p>Or we could go further and ask for finer-grained quantiles.  The quantile function will calculate quantiles at arbitrary probabilities:<br />
<code>
<pre class="brush:text;">
> quantile(d$unif, probs=seq(0,1,by=0.05))
          0%           5%          10%          15%          20%          25%          30%          35%          40%          45%          50%          55%
2.448517e-05 4.870828e-02 9.949762e-02 1.450521e-01 1.953495e-01 2.488873e-01 3.011948e-01 3.541679e-01 4.020943e-01 4.508098e-01 4.991203e-01 5.527021e-01
         60%          65%          70%          75%          80%          85%          90%          95%         100%
6.038723e-01 6.540685e-01 6.990784e-01 7.498041e-01 7.989884e-01 8.480684e-01 8.982745e-01 9.496064e-01 9.999980e-01
> t(t( quantile(d$unif, probs=seq(0,1,by=0.05)) ))
             [,1]
0%   2.448517e-05
5%   4.870828e-02
10%  9.949762e-02
15%  1.450521e-01
20%  1.953495e-01
25%  2.488873e-01
30%  3.011948e-01
35%  3.541679e-01
40%  4.020943e-01
45%  4.508098e-01
50%  4.991203e-01
55%  5.527021e-01
60%  6.038723e-01
65%  6.540685e-01
70%  6.990784e-01
75%  7.498041e-01
80%  7.989884e-01
85%  8.480684e-01
90%  8.982745e-01
95%  9.496064e-01
100% 9.999980e-01
>
> s <- seq(0,1,by=0.02)
> e <- sapply(d, FUN=function(x) quantile(x, probs=s))
> e <- as.data.frame(e)
> e[1:10,]
            unif       norm          exp pois
0%  2.448517e-05 -3.6035270 0.0001142514    0
2%  1.900289e-02 -1.9884255 0.0226651895    0
4%  3.844486e-02 -1.7161347 0.0440634106    0
6%  6.012384e-02 -1.5041045 0.0634013971    0
8%  8.089568e-02 -1.3566198 0.0899735826    0
10% 9.949762e-02 -1.2407524 0.1103189153    0
12% 1.191770e-01 -1.1392377 0.1342348062    0
14% 1.369209e-01 -1.0483029 0.1590086026    0
16% 1.555910e-01 -0.9618975 0.1829207834    0
18% 1.748496e-01 -0.8799168 0.2056880019    0
>
</pre>
<p></code></p>
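<p>If the output above feels abstract, a sketch on a vector where the answer is obvious may help.  The vector 0:100 is made up so that each quantile equals its percentage:</p>
<code>
<pre class="brush: text;">
x <- 0:100
q <- quantile(x, probs=c(0.25, 0.5, 0.75))
q
# 25% 50% 75%
#  25  50  75
</pre>
</code>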
<p>Plotting the quantile values is worth a shot, just to see what we get:<br />
<code>
<pre class="brush: text;">
>
> plot(x=s, y=e$unif, ylim=c(min(e), max(e)), type='b', pch='.', cex=3, lwd=2, col='black',
+ 	xlab='quantiles [0,1]', ylab='dist values', main='Quantile Plots, Four Distributions')
> for(j in 2:4){
+ 	points(x=s, y=e[,j], type='b', pch='.', cex=3, lwd=2, col=c('', 'red', 'blue', 'green')[j])
+ }
> legend(x=0, y=8, legend=colnames(e), col=c('black', 'red', 'blue', 'green'), lwd=3)
>
</pre>
<p></code><br />
<div id="attachment_164" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.04.png" alt="Quantile Plots, Four Distributions" title="plot08.04" width="480" height="480" class="size-full wp-image-164" /></a><p class="wp-caption-text">Quantile Plots, Four Distributions</p></div></p>
<p>Quantile-quantile (QQ) plots are typically used when you want to see whether a sample follows a particular distribution.  You plot the quantiles of the sample against the quantiles of the assumed distribution and compare; if they are the same distribution, the points should fall on a straight line at roughly a forty-five degree angle.<br />
<code>
<pre class="brush: text;">
> par(mfrow=c(1,2), oma=c(0,0,2,0))
>
> qqplot(x=d$exp, y=d$pois,
+ 	xlab='exponential', ylab='poisson', main = 'qq plot: different distributions')
> qqplot(x=d$norm, y=rnorm(n=n),
+ 	xlab='our normal sample in d', ylab='new normal sample', main = 'qq plot: same distribution, resampled')
>
> mtext('QQ plots -- different vs same distributions', side=3, outer=T)
</pre>
<p></code><br />
<div id="attachment_160" class="wp-caption aligncenter" style="width: 685px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.01.png" alt="QQ Plots - Different Distribution and Same Distribution" title="plot08.01" width="675" height="450" class="size-full wp-image-160" /></a><p class="wp-caption-text">QQ Plots - Different Distribution and Same Distribution</p></div></p>
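<p>To make the idea concrete, here is a minimal sketch that builds a QQ comparison by hand instead of calling qqplot.  The sample names, the seed, and the choice of quantile levels are my own assumptions for illustration; they are not taken from the data frame d used above.</p>
<pre class="brush: text;">
# Sketch: compare two samples quantile by quantile.
set.seed(42)                           # arbitrary seed, only for reproducibility
smp <- rnorm(1000)                     # stand-in for the sample under test
ref <- rnorm(1000)                     # fresh sample from the assumed distribution

probs <- seq(0.05, 0.95, by=0.05)      # quantile levels to compare
q_smp <- quantile(smp, probs)
q_ref <- quantile(ref, probs)

plot(x=q_smp, y=q_ref, xlab='sample quantiles', ylab='reference quantiles',
	main='QQ comparison by hand')
abline(a=0, b=1, col='red')            # the forty-five degree reference line

# when the distributions match, the fitted slope should be close to 1
fit <- lm(q_ref ~ q_smp)
coef(fit)
</pre>
<p>If you swap ref for something like rexp(1000), the points bend away from the red line, which is the same effect as in the left-hand panel above.</p>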
<p>Box and whisker plots summarize each distribution by its median, quartiles, and outlying points:<br />
<code>
<pre class="brush: text;">
> dev.set(which=1)
> boxplot(x=d, xlab='distributions', main='Distributions Visualization - boxplot')
</pre>
<p></code><br />
<div id="attachment_161" class="wp-caption aligncenter" style="width: 810px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.02.png" alt="Box and Whisker Plot, Four Distributions" title="plot08.02" width="800" height="900" class="size-full wp-image-161" /></a><p class="wp-caption-text">Box and Whisker Plot, Four Distributions</p></div></p>
<p>Finally, for univariate distributions, I prefer box-percentile plots.  These are similar to boxplots, but the width of each shape at a given height is proportional to the percentage of observations more extreme in that direction.  They are also marked at the 25th, 50th, and 75th percentiles.</p>
<p><code>
<pre class="brush: text;">
> library(Hmisc)
>
> dev.set(which=1)
> bpplot(d, xlab='distributions', main='Distributions Visualization - box percentile plot')
>
</pre>
<p></code><br />
<div id="attachment_162" class="wp-caption aligncenter" style="width: 810px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.03.png" alt="Box-Percentile Plot, 4 Distributions" title="plot08.03" width="800" height="900" class="size-full wp-image-162" /></a><p class="wp-caption-text">Box-Percentile Plot, 4 Distributions</p></div></p>
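<p>The width rule is easy to sketch in base R.  The snippet below is my own approximation of the idea, not the code Hmisc uses inside bpplot, and the sample is made up: at the observation with cumulative proportion p, the half-width of the shape is min(p, 1 - p), so the shape is widest at the median and tapers toward both tails.</p>
<pre class="brush: text;">
# Sketch: outline of a box-percentile shape for one sample.
set.seed(1)                            # arbitrary seed
x <- sort(rexp(200))                   # hypothetical sample, sorted
p <- ppoints(length(x))                # cumulative proportions in (0, 1)
w <- pmin(p, 1 - p)                    # half-width: share of points more extreme

# trace the left edge upward, then the right edge back down
plot(x=c(-w, rev(w)), y=c(x, rev(x)), type='l', xaxt='n',
	xlab='', ylab='value', main='box-percentile outline (sketch)')

# mark the 25th, 50th, and 75th percentiles
q <- quantile(x, c(0.25, 0.5, 0.75))
hw <- c(0.25, 0.5, 0.25)               # half-widths at those percentiles
segments(x0=-hw, y0=q, x1=hw, y1=q)
</pre>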
<p>To see how this looks when the distributions really do differ, let&#8217;s examine 5 univariate normal distributions: our already sampled N(0,1), a resampled N(0,1), N(1,1), N(0,3), and N(-2,0.5), where the second parameter is the standard deviation.  Side by side, the box percentile plot really draws out the differences in the distributions.<br />
<code>
<pre class="brush: text;">
>
> d2 = data.frame(norm=d$norm, resample=rnorm(n), mean1=rnorm(n=n, mean=1), sd3=rnorm(n=n, sd=3), sdhalf=rnorm(n=n, sd=0.5, mean=-2))
>
> bpplot(d2, xlab='distributions', main='Multiple Normal Distribution Visualization - box percentile plot')
>
</pre>
<p></code><br />
<div id="attachment_171" class="wp-caption aligncenter" style="width: 810px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.05.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot08.05.png" alt="Box Percentile Plot, Five Different Normal Distributions" title="plot08.05" width="800" height="900" class="size-full wp-image-171" /></a><p class="wp-caption-text">Box Percentile Plot, Five Different Normal Distributions</p></div></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/visualizing-and-comparing-distributions-part-8-of-a-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multiple Plots and Visualizing Distributions &#8211; Part 7 in a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/multiple-plots-and-visualizing-distributions-part-7-in-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/multiple-plots-and-visualizing-distributions-part-7-in-a-series/#comments</comments>
		<pubDate>Mon, 13 Jul 2009 02:50:18 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=145</guid>
		<description><![CDATA[This is post #07 in a running series about plotting in R. I was helping a friend plot some interesting distributions this weekend, so I decided to use distributions to demonstrate one of the neater bits of R&#8217;s basic plotting &#8230; <a href="http://blog.earlh.com/index.php/2009/07/multiple-plots-and-visualizing-distributions-part-7-in-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #07 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>I was helping a friend plot some interesting distributions this weekend, so I decided to use distributions to demonstrate one of the neater bits of R&#8217;s basic plotting tools: the ability to easily combine plots into a single plot.</p>
<p>par will allow you to manipulate all sorts of parameters to plots &#8212; you should eventually poke through the docs.  Nonetheless, one of the simplest things to do is to add plots to a matrix plot.  You can specify whether you want these to be row major or column major by using the mfrow or mfcol parameter respectively.  Let&#8217;s take a look, and play with the uniform distribution while we&#8217;re at it:<br />
<code>
<pre class="brush:text;">
> # create a new plotting window
> dev.set(which=1)
>
> # a 1x3 matrix of plots, row major
> par(mfrow=c(1,3))
>
> s01=seq(0,1,by=0.01)
> plot(x=s01, y=dunif(s01), type='l', main='density', ylab='dunif', xlab='[0,1]')
> plot(x=s01, y=punif(s01), type='l', ylim=c(0,1), main='distribution',
+ ylab='punif', xlab='[0,1]')
> plot(x=s01, y=qunif(s01), type='l', main='quantile', ylab='qunif', xlab='[0,1]')
>
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.01.png" alt="Uniform[0,1] Distribution Density, Distribution, and Quantile Plots Side-by-side" title="plot07.01" width="600" height="300" class="size-full wp-image-148" /></a></p>
<p>From left to right, we see the density (dunif), distribution (punif) and quantile (qunif) plots for the Uniform[0,1] distribution.  The first thing we might wish to do is to add a label to the left side of the combined plot saying what we&#8217;re plotting.  To do this, we&#8217;ll have to adjust the margins for the combined plot, again using par:<br />
<code>
<pre class="brush:text;">
> # oma sets the outer margins in terms of lines of text
> par(mfrow=c(1,3), oma=c(0,4,0,0))
>
> s=seq(-5,5,by=0.01)
> plot(x=s, y=dnorm(s), type='l', main='density', ylab='dnorm', xlab='[-5,5]')
> plot(x=s, y=pnorm(s), type='l', ylim=c(0,1), main='distribution', ylab='pnorm', xlab='[-5,5]')
> plot(x=s01, y=qnorm(s01), type='l', main='quantile', ylab='qnorm', xlab='[0,1]')
> mtext(text='Standard Normal', side=2, line=2, outer=T)
>
</pre>
<p></code><br />
<div id="attachment_149" class="wp-caption aligncenter" style="width: 610px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.02.png" alt="Standard Normal Distribution Density, Distribution, and Quantile Plots Side-by-side" title="plot07.02" width="600" height="300" class="size-full wp-image-149" /></a><p class="wp-caption-text">Standard Normal Distribution Density, Distribution, and Quantile Plots Side-by-side</p></div></p>
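<p>The examples so far use mfrow, which fills the grid row by row.  As a small sketch of the column major alternative, the loop below uses mfcol and records which grid cell each plot lands in by querying par('mfg').  The empty placeholder plots are just my stand-ins for real content.</p>
<pre class="brush: text;">
# Sketch: with mfcol, the grid is filled down each column first.
op <- par(mfcol=c(2,2))                # 2x2 grid, column major
pos <- matrix(NA, nrow=4, ncol=2, dimnames=list(NULL, c('row', 'col')))
for (i in 1:4) {
	plot(x=0, y=0, type='n', xlab='', ylab='', main=paste('plot', i))
	pos[i, ] <- par('mfg')[1:2]        # grid cell of the figure just drawn
}
par(op)                                # restore the previous layout
pos
</pre>
<p>Printing pos shows the second plot sitting below the first rather than beside it; with mfrow=c(2,2) it would sit to the right instead.</p>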
<p>Now, let&#8217;s suppose we wanted to use something like a LaTeX expression in our plots &#8212; we can do this using expression.  For a complete explanation of the syntax, see plotmath.<br />
<code>
<pre class="brush:text;">
> par(mfrow=c(1,3), oma=c(0,4,0,0))
> s <- seq(0,10, by=0.01)
> plot(x=s, y=dpois(s, lambda=1), type='l', main='density', ylab=expression(dpois(lambda==1)), xlab='[0,10]')
> plot(x=s, y=ppois(s, lambda=1), type='l', ylim=c(0,1),
+ 	main='distribution', ylab=expression(ppois(lambda==1)), xlab='[0,10]')
> plot(x=s01, y=qpois(s01, lambda=1), type='l',
+ 	main='quantile', ylab=expression(qpois(lambda==1)), xlab='[0,1]')
> # poisson with lambda = 1
> mtext(side=2, line=2, outer=T, text=expression(Poisson(lambda==1)))
</pre>
<p></code><br />
<div id="attachment_150" class="wp-caption aligncenter" style="width: 610px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.03.png" alt="Poisson (lambda=1) Distribution Density, Distribution, and Quantile Plots Side-by-side" title="plot07.03" width="600" height="300" class="size-full wp-image-150" /></a><p class="wp-caption-text">Poisson (lambda=1) Distribution Density, Distribution, and Quantile Plots Side-by-side</p></div></p>
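<p>One thing expression can't do is pick up the value of a variable.  A minimal sketch of how that might be handled is below, using bquote, which substitutes anything wrapped in .( ) before the label is drawn.  The variable names and the choice of lambda are mine.  I also evaluate dpois only at whole numbers here, since the Poisson mass function lives on the integers and dpois warns about non-integer x.</p>
<pre class="brush: text;">
# Sketch: a plotmath label that includes the value of a variable.
lam <- 2                               # hypothetical rate, change freely
k <- 0:10                              # integer support only
lab <- bquote(Poisson(lambda == .(lam)))

plot(x=k, y=dpois(k, lambda=lam), type='h', lwd=2,
	main=lab, xlab='k', ylab=expression(P(X == k)))
</pre>
<p>type='h' draws vertical spikes, which suits a discrete distribution better than a connected line.</p>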
<p>Now that we&#8217;ve looked at 3 distributions, say we wanted to look at Uniform[0,1], Standard Normal, Exponential(rate=1) and Poisson(lambda=1) all on the same plot.  par will allow us to do that:<br />
<code>
<pre class="brush:text;">
> par(mfrow=c(4,3), oma=c(0,4,0,0))
> s01=seq(0,1,by=0.01)
> plot(x=s01, y=dunif(s01), type='l', main='density', ylab='dunif', xlab='[0,1]')
> plot(x=s01, y=punif(s01), type='l', ylim=c(0,1), main='distribution', ylab='punif', xlab='[0,1]')
> plot(x=s01, y=qunif(s01), type='l', main='quantile', ylab='qunif', xlab='[0,1]')
> mtext(text=expression('Uniform'['[0,1]']), side=2, line=2, outer=T)
>
> s=seq(-5,5,by=0.01)
> plot(x=s, y=dnorm(s), type='l', main='density', ylab='dnorm', xlab='[-5,5]')
> plot(x=s, y=pnorm(s), type='l', ylim=c(0,1), main='distribution', ylab='pnorm', xlab='[-5,5]')
> plot(x=s01, y=qnorm(s01), type='l', main='quantile', ylab='qnorm', xlab='[0,1]')
> mtext(text='Standard Normal', side=2, line=2, outer=T)
>
> s <- seq(0,10, by=0.01)
> plot(x=s, y=dexp(s), type='l', main='density', ylab='dexp', xlab='[0,10]')
> plot(x=s, y=pexp(s), type='l', ylim=c(0,1), main='distribution', ylab='pexp', xlab='[0,10]')
> plot(x=s01, y=qexp(s01), type='l', main='quantile', ylab='qexp', xlab='[0,1]')
> mtext(text='Exponential', side=2, line=2, outer=T)
>
> s <- seq(0,10, by=0.01)
> plot(x=s, y=dpois(s, lambda=1), type='l', main='density', ylab=expression(dpois(lambda==1)), xlab='[0,10]')
> plot(x=s, y=ppois(s, lambda=1), type='l', ylim=c(0,1),
+ 	main='distribution', ylab=expression(ppois(lambda==1)), xlab='[0,10]')
> plot(x=s01, y=qpois(s01, lambda=1), type='l',
+ 	main='quantile', ylab=expression(qpois(lambda==1)), xlab='[0,1]')
> mtext(side=2, line=2, outer=T, text=expression(Poisson(lambda==1)))
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.04.png" alt="Uniform[0,1], Normal(0,1), Exponential(1), and Poisson(1) Distribution Plots" title="plot07.04" width="800" height="900" class="size-full wp-image-151" /></a></p>
<p>There are still a few issues with this plot: there is no main title, the labels we carefully applied with mtext are drawn on top of each other, and a lot of space is wasted on margins.  Let&#8217;s take a stab at fixing all of the above:<br />
<code>
<pre class="brush:text;">
> #dev.set(which=1)
> par(mfrow=c(4,3), oma=c(0,4,4,0), mar=par()$mar*0.4)
>
> s01=seq(0,1,by=0.01)
> plot(x=s01, y=dunif(s01), type='l', main='density', ylab='dunif', xlab='[0,1]')
> plot(x=s01, y=punif(s01), type='l', ylim=c(0,1), main='distribution', ylab='punif', xlab='[0,1]')
> plot(x=s01, y=qunif(s01), type='l', main='quantile', ylab='qunif', xlab='[0,1]')
> mtext(text=expression('Uniform'['[0,1]']), side=2, line=2, outer=T, at=0.88)
>
> s=seq(-5,5,by=0.01)
> plot(x=s, y=dnorm(s), type='l', main='density', ylab='dnorm', xlab='[-5,5]')
> plot(x=s, y=pnorm(s), type='l', ylim=c(0,1), main='distribution', ylab='pnorm', xlab='[-5,5]')
> plot(x=s01, y=qnorm(s01), type='l', main='quantile', ylab='qnorm', xlab='[0,1]')
> mtext(text='Standard Normal', side=2, line=2, outer=T, at=0.62)
>
>
>
> s <- seq(0,10, by=0.01)
> plot(x=s, y=dexp(s), type='l', main='density', ylab='dexp', xlab='[0,10]')
> plot(x=s, y=pexp(s), type='l', ylim=c(0,1), main='distribution', ylab='pexp', xlab='[0,10]')
> plot(x=s01, y=qexp(s01), type='l', main='quantile', ylab='qexp', xlab='[0,1]')
> mtext(text='Exponential', side=2, line=2, outer=T, at=0.38)
>
> s <- seq(0,10, by=0.01)
> plot(x=s, y=dpois(s, lambda=1), type='l', main='density', ylab=expression(dpois(lambda==1)), xlab='[0,10]')
There were 50 or more warnings (use warnings() to see the first 50)
> plot(x=s, y=ppois(s, lambda=1), type='l', ylim=c(0,1),
+ 	main='distribution', ylab=expression(ppois(lambda==1)), xlab='[0,10]')
> plot(x=s01, y=qpois(s01, lambda=1), type='l',
+ 	main='quantile', ylab=expression(qpois(lambda==1)), xlab='[0,1]')
> mtext(side=2, line=2, outer=T, text=expression(Poisson(lambda==1)), at=0.13)
>
> mtext(text='Interesting Distribution Functions', side=3, line=2, outer=T)
>
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.05.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot07.05.png" alt="Uniform[0,1], Normal(0,1), Exponential(1), and Poisson(1) Distribution Plots -- Nicely Formatted and Labeled" title="plot07.05" width="800" height="900" class="size-full wp-image-152" /></a></p>
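<p>The at values I passed to mtext were found by eye.  As a rough sketch, they could also be computed: with outer=T the position runs from 0 at the bottom of the grid to 1 at the top, so the middle of row i out of nr rows sits at about 1 - (i - 0.5)/nr.  The helper name row_centers and the placeholder plots are my own; treat this as an approximation, not a rule from the R docs.</p>
<pre class="brush: text;">
# Sketch: compute the vertical centre of each row instead of guessing.
row_centers <- function(nr) 1 - (seq_len(nr) - 0.5) / nr

labs <- c('Uniform', 'Normal', 'Exponential', 'Poisson')
centers <- row_centers(length(labs))   # 0.875 0.625 0.375 0.125

op <- par(mfrow=c(4,1), oma=c(0,4,4,0), mar=c(2,2,1,1))
for (i in seq_along(labs)) {
	plot(x=0:1, y=0:1, type='n', xlab='', ylab='')   # placeholder panel
	mtext(text=labs[i], side=2, line=2, outer=TRUE, at=centers[i])
}
mtext(text='Row labels placed by formula', side=3, line=2, outer=TRUE)
par(op)
</pre>
<p>These come out very close to the 0.88, 0.62, 0.38, and 0.13 used above, and the same function works if you later change the number of rows.</p>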
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/multiple-plots-and-visualizing-distributions-part-7-in-a-series/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Labeling Plots &#8211; Annotations, Legends, etc &#8212; Part 6 in a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/labeling-plots-annotations-legends-etc-part-6-in-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/labeling-plots-annotations-legends-etc-part-6-in-a-series/#comments</comments>
		<pubDate>Fri, 10 Jul 2009 23:44:16 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=118</guid>
		<description><![CDATA[This is post #06 in a running series about plotting in R. You regularly want to label pieces of a plot in order to point a particular feature out or answer a question that your audience will have. Let&#8217;s see &#8230; <a href="http://blog.earlh.com/index.php/2009/07/labeling-plots-annotations-legends-etc-part-6-in-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #06 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>You regularly want to label pieces of a plot in order to point a particular feature out or answer a question that your audience will have.  Let&#8217;s see how to do this in R.</p>
<p>First, let&#8217;s collapse all the R source we need to get to the plot we had at the end of <a href="http://blog.earlh.com/index.php/2009/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/">part 5 &#8211; axis labeling</a>.</p>
<p><code>
<pre class="brush:text;">
> # load data and prep
> yahoo <- read.csv(file='~/stuff/blog/YHOO stock prices [19960412, 20090702].csv', header=T, sep=',')
> colnames(yahoo) <- tolower( colnames(yahoo) )
>
> yahoo$date <- as.Date( as.character( yahoo$date ) )
> yahoo <- yahoo[order(yahoo$date),]
>
> # util functions
> summary30 <- function( x, FUN, na.rm=F ){
+ 		val <- rep( 0, length( x ) )
+ 		for( j in 1:length( x ) ){
+ 			val[ j ] <- FUN( x[ max( j - 29, 1 ):j ], na.rm=na.rm)		}
+ 		val
+ 	}
>
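> # 30 day moving average of the close, then keep 2008 onward for plotting
> yahoo$close30 <- summary30(yahoo$close, FUN=mean)
> yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'), ]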
>
> # create our plot
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ 	col='black', type='l',
+ 	main='YHOO stock close', xlab='date', ylab='close ($)',
+ 	xaxt='n')
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2)
>
> # put X axis labels on first date present in each quarter
> locs <- tapply(X=yahoo2$date, FUN=min, INDEX=format(yahoo2$date, '%Y%m'))
>
> at = yahoo2$date %in% locs
>
> at = at &#038; format(yahoo2$date, '%m') %in% c('01', '04', '07', '10')
> axis(side=1, at=yahoo2$date[ at ], 	labels=format(yahoo2$date[at], '%b-%y'))
> abline(v=yahoo2$date[at], col='grey', lwd=0.5)
>
</pre>
<p></code><br />
<div id="attachment_123" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.01.png" alt="YHOO close prices" title="plot06.01" width="480" height="480" class="size-full wp-image-123" /></a><p class="wp-caption-text">YHOO close prices</p></div></p>
<p>First, let&#8217;s look at the dramatic jump in the stock price on the first of February 2008 &#8212; Microsoft announced their takeover bid for Yahoo.  We can annotate our plot with that text, and even draw an arrow to our series.  Note that the x,y locations specified in all these functions are in whatever coordinate system you passed into the plot function.<br />
<code>
<pre class="brush:text;">
> text(x=as.Date('2008-03-01'), y=9, labels='MSFT offer', col='blue')
>
> # length slightly shrinks the size of the arrow head; lwd makes the line bolder
> arrows(x0=as.Date('2008-03-01'), y0=10, x1=as.Date('2008-02-01'), y1=20, col='blue', length=0.1, lwd=3)
</pre>
<p></code><br />
<div id="attachment_122" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.02.png" alt="YHOO close prices, with annotation" title="plot06.02" width="480" height="480" class="size-full wp-image-122" /></a><p class="wp-caption-text">YHOO close prices, with annotation</p></div></p>
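<p>The point about coordinate systems is worth a quick sketch.  When the x axis holds Dates, the underlying coordinate is the number of days since 1970-01-01, so a Date and its numeric value land on the same spot.  The little series below is made up purely for illustration.</p>
<pre class="brush: text;">
# Sketch: Date x coordinates are just day counts underneath.
d <- as.Date('2008-02-01')
as.numeric(d)                          # 13910 days since 1970-01-01

dates <- seq(as.Date('2008-01-01'), by='month', length.out=6)
vals <- c(23, 19, 28, 27, 26, 25)      # made-up values, not YHOO prices
plot(x=dates, y=vals, type='l', xlab='date', ylab='value')

usr <- par('usr')                      # plot limits: x1, x2, y1, y2 in user coordinates
usr[1:2]                               # the x limits are day counts too

# these two calls annotate the same x position
text(x=d, y=22, labels='via Date', col='blue', pos=3)
text(x=as.numeric(d), y=22, labels='via number', col='red', pos=1)
</pre>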
<p>Next, more for demonstration than anything else, let&#8217;s plot the 30 day moving min and max of the close price.  To distinguish these two series, I&#8217;ll use dashes instead of solid lines and make the lines very light.  First, I&#8217;ll create a function to calculate the respective series.  The lty param to points controls the line type, in this case dashed, and lwd less than one is a very narrow line.<br />
<code>
<pre class="brush:text;">
> # add some min / max info
> summary30 <- function( x, FUN, na.rm=F ){
+ 		val <- rep( 0, length( x ) )
+ 		for( j in 1:length( x ) ){
+ 			val[ j ] <- FUN( x[ max( j - 29, 1 ):j ], na.rm=na.rm)		}
+ 		val
+ 	}
>
> # create the series and pass in the function we want to use
> yahoo$minclose30 <- summary30(yahoo$close, FUN=min)
> yahoo$maxclose30 <- summary30(yahoo$close, FUN=max)
> yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'),]
> #
> points(x=yahoo2$date, y=yahoo2$minclose30, col='blue', type='l', lty=2, lwd=0.5)
> points(x=yahoo2$date, y=yahoo2$maxclose30, col='blue', type='l', lty=2, lwd=0.5)
>
</pre>
<p></code><br />
<div id="attachment_121" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.03.png" alt="YHOO close prices, min/max" title="plot06.03" width="480" height="480" class="size-full wp-image-121" /></a><p class="wp-caption-text">YHOO close prices, min/max</p></div></p>
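<p>To check what a trailing window function like summary30 actually computes, it helps to run the same idea on a vector small enough to verify by hand.  The sketch below is a compact variant I wrote for that purpose; the name rolling and the width argument are my own additions, not part of the code above.</p>
<pre class="brush: text;">
# Sketch: trailing-window summary with an adjustable window width.
rolling <- function(x, FUN, width=30) {
	vapply(seq_along(x),
		function(j) FUN(x[ max(j - width + 1, 1):j ]),
		numeric(1))
}

x <- c(3, 1, 4, 1, 5)
rolling(x, FUN=max, width=3)           # 3 3 4 4 5
rolling(x, FUN=min, width=3)           # 3 1 1 1 1
</pre>
<p>Note that the first few values use a shorter window, since there aren't yet enough earlier observations; summary30 behaves the same way for the first 29 days.</p>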
<p>And finally, we should add a legend just to make very clear what is going on in our plot.  Note that lty allows you to set the line type &#8212; normal or dashed &#8212; for each legend item.  I also used png with width=720 and height = 480 to stretch the plot out for better viewing.<br />
<code>
<pre class="brush:text;">
> legend(x=as.Date('2009-02-01'), y=30,
+ 	legend=c('daily close', '30 day MA', '30 day min/max'),
+ 	col=c('black', 'red', 'blue'), lwd=3, lty=c(1,1,2))
</pre>
<p></code><br />
<div id="attachment_120" class="wp-caption aligncenter" style="width: 730px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot06.04.png" alt="YHOO close prices, min/max, and legend" title="plot06.04" width="720" height="480" class="size-full wp-image-120" /></a><p class="wp-caption-text">YHOO close prices, min/max, and legend</p></div></p>
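<p>If you'd rather not work out x,y coordinates for the legend, legend also accepts a position keyword such as 'topleft' in place of x.  Here is a small sketch on made-up data; bty='n' drops the box around the legend, which I find less cluttered, but that is purely a matter of taste.</p>
<pre class="brush: text;">
# Sketch: place a legend by keyword instead of coordinates.
x <- 1:20
plot(x=x, y=x, type='l', col='black', xlab='x', ylab='y')
points(x=x, y=x/2, type='l', col='blue', lty=2)

lg <- legend('topleft', legend=c('solid series', 'dashed series'),
	col=c('black', 'blue'), lty=c(1,2), lwd=2, bty='n')
</pre>
<p>legend invisibly returns the rectangle and text positions it used, captured here in lg, which can be handy if you want to place further annotations relative to it.</p>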
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/labeling-plots-annotations-legends-etc-part-6-in-a-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Plotting With Custom X Axis Labels in R &#8212; Part 5 in a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 10:03:35 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Data Munging]]></category>
		<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=85</guid>
		<description><![CDATA[This is post #05 in a running series about plotting in R. There are a variety of ways to control how R creates x and y axis labels for plots. Let&#8217;s walk through the typical process of creating good labels &#8230; <a href="http://blog.earlh.com/index.php/2009/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #05 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>There are a variety of ways to control how R creates x and y axis labels for plots.  Let&#8217;s walk through the typical process of creating good labels for our YHOO stock price close plot (see <a href="http://blog.earlh.com/index.php/2009/07/plotting-multiple-series-in-r-part-4-in-a-series/">part 4</a>).</p>
<p>Reviewing our plot from last time, we left off with code that plots two line series in different colors and different line widths.<br />
<code>
<pre class="brush:text;">
plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
	col='black', type='l',
	main='YHOO stock close', xlab='date', ylab='close ($)')
points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2)
</pre>
<p></code><br />
<a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot05.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot05.01.png" alt="YHOO close price plot -- poor labelling" title="YHOO close price plot -- poor labelling" width="480" height="480" class="aligncenter size-full wp-image-88" /></a><br />
Unfortunately, while R understands our X axis data as dates, it doesn&#8217;t choose optimal labels for our purposes.</p>
<p>Instead, let&#8217;s try labeling the first day of the month in each business quarter.  To do this, we use the format function on dates to pick out the first (day 01) of every month and, as a first attempt, select months 1, 4, 9, and 12 (the final version further down uses 1, 4, 7, and 10, the months that actually begin calendar quarters).  Note that R allows us to use the
<pre>%in% </pre>
<p> operator to ask if a value is contained in a vector.  Further, note that
<pre>format</pre>
<p> produces text, not numeric values, so we have to match the results against an array of strings.<br />
<code>
<pre class="brush: text;">
> at = format(yahoo2$date, '%m') %in% c('01', '04', '09', '12') &#038; format(yahoo2$date, '%d') == '01'
>
> # the first of many months isn't in the data
> yahoo2$date[ at ]
[1] "2008-04-01" "2008-12-01" "2009-04-01"
</pre>
<p></code><br />
Which is a little disappointing &#8212; we&#8217;re only left with three data values.  Nonetheless, let&#8217;s see what it looks like:</p>
<p><code>
<pre class="brush: text;">
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+	col='black', type='l',
+	main='YHOO stock close', xlab='date', ylab='close ($)',
+	xaxt='n')
>
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2)
>
> # create labels at side 1 (bottom), at the dates we've selected, and with abbreviated month - year labels
> axis(side=1, at=yahoo2$date[ at ], labels=format(yahoo2$date[at], '%b-%y'))
>
</pre>
<p></code><br />
Which produces<br />
<div id="attachment_87" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot05.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot05.02.png" alt="YHOO Close plot -- weird x axis labeling" title="plot05.02" width="480" height="480" class="size-full wp-image-87" /></a><p class="wp-caption-text">YHOO Close plot -- weird x axis labeling</p></div></p>
<p>Walking through the code, in the plot call, we use
<pre> xaxt='n'</pre>
<p> to tell plot not to create X axis labels.  The first format call asks which dates fall in the months we selected (01, 04, 09, and 12), and the second format call asks which days are the first of the month.  You&#8217;ll note that we don&#8217;t have many labels on our graph.  That&#8217;s because the first day of the month often isn&#8217;t in our data!  Only 3 days in the data are both the first of the month and in one of the selected months.</p>
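<p>The two building blocks here, format and %in%, are easy to try in isolation.  The dates below are made up for the sketch.  It also shows why the months have to be matched against strings: format returns text like '04', which never equals the number 4.</p>
<pre class="brush: text;">
# Sketch: format() returns character, and %in% tests membership.
dates <- as.Date(c('2008-03-31', '2008-04-01', '2008-07-01', '2008-07-02'))

mm <- format(dates, '%m')
mm                                     # "03" "04" "07" "07"
mm %in% c('01', '04', '07', '10')      # FALSE TRUE TRUE TRUE
mm %in% c(1, 4, 7, 10)                 # all FALSE: '04' is not the same as 4

# month and day can also be tested in one go
first_of_quarter <- format(dates, '%m-%d') %in% c('01-01', '04-01', '07-01', '10-01')
dates[ first_of_quarter ]
</pre>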
<p>Instead, let&#8217;s just find the first day of each month that is present in the data:<br />
<code>
<pre class="brush: text;">
> locs <- tapply(X=yahoo2$date, FUN=min, INDEX=format(yahoo2$date, '%Y%m'))
> t(t(locs))
        [,1]
200801 13880
200802 13910
200803 13941
200804 13970
200805 14000
200806 14032
200807 14061
200808 14092
200809 14124
200810 14153
200811 14186
200812 14214
200901 14246
200902 14277
200903 14305
200904 14335
200905 14365
200906 14396
200907 14426
>
> # find the dates we have selected
> at = yahoo2$date %in% locs
> yahoo2$date[at]
 [1] "2008-01-02" "2008-02-01" "2008-03-03" "2008-04-01" "2008-05-01" "2008-06-02" "2008-07-01" "2008-08-01"
 [9] "2008-09-02" "2008-10-01" "2008-11-03" "2008-12-01" "2009-01-02" "2009-02-02" "2009-03-02" "2009-04-01"
[17] "2009-05-01" "2009-06-01" "2009-07-01"
>
</pre>
<p></code></p>
<p>tapply is an extraordinarily handy R function that runs a user supplied function, in this case min, on data, returning one value for each unique level of the factor supplied in INDEX.  When we print locs, the first column is our unique factor (a combination of year and month) and the second column is the minimum date for that factor, shown here as a number of days since 1970-01-01 rather than as a formatted date.  We then select the first date present in each month and further select just those months that begin new business quarters:<br />
<code>
<pre class="brush: text;">
> at = at &#038; format(yahoo2$date, '%m') %in% c('01', '04', '07', '10')
> yahoo2$date[at]
[1] "2008-01-02" "2008-04-01" "2008-07-01" "2008-10-01" "2009-01-02" "2009-04-01" "2009-07-01"
>
</pre>
<p></code></p>
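<p>Here is the same tapply step on four made-up dates, small enough to check by eye.  In this sketch I convert the result back to Date before matching; that is my own precaution so the comparison is Date against Date, and it isn't something the code above does.</p>
<pre class="brush: text;">
# Sketch: first date present in each month, on a tiny example.
dates <- as.Date(c('2008-01-02', '2008-01-03', '2008-02-01', '2008-02-04'))

firsts <- tapply(X=dates, INDEX=format(dates, '%Y%m'), FUN=min)
names(firsts)                          # "200801" "200802"

# turn the result back into Dates, then flag matching rows
first_dates <- as.Date(as.numeric(firsts), origin='1970-01-01')
is_first <- dates %in% first_dates
dates[ is_first ]                      # "2008-01-02" "2008-02-01"
</pre>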
<p>Finally, we bring this all together to plot the data and format the X axis to show the first date in each quarter, adding vertical lines to draw the eye to the quarter divisions.<br />
<code>
<pre class="brush:text;">
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ 	col='black', type='l',
+ 	main='YHOO stock close', xlab='date', ylab='close ($)',
+ 	xaxt='n')
>
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2)
>
> axis(side=1, at=yahoo2$date[ at ], 	labels=format(yahoo2$date[at], '%b-%y'))
> abline(v=yahoo2$date[at], col='grey', lwd=0.5)
>
</pre>
<p></code><br />
<div id="attachment_86" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot05.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot05.03.png" alt="YHOO close plot -- quarter labeling" title="plot05.03" width="480" height="480" class="size-full wp-image-86" /></a><p class="wp-caption-text">YHOO close plot -- quarter labeling</p></div></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Plotting Multiple Series in R &#8212; Part 4 in a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/plotting-multiple-series-in-r-part-4-in-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/plotting-multiple-series-in-r-part-4-in-a-series/#comments</comments>
		<pubDate>Sun, 05 Jul 2009 16:00:40 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Data Munging]]></category>
		<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=63</guid>
		<description><![CDATA[This is post #04 in a running series about plotting in R. Frequently, you want to simultaneously plot multiple series on the same plot. Let&#8217;s try plotting daily observations along with a 30 day moving average. To start, I have &#8230; <a href="http://blog.earlh.com/index.php/2009/07/plotting-multiple-series-in-r-part-4-in-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #04 in a running <a href="http://blog.earlh.com/index.php/plotting-in-r-a-series/"> series </a> about plotting in R.
</div>
<p>Frequently, you want to simultaneously plot multiple series on the same plot.  Let&#8217;s try plotting daily observations along with a 30 day moving average.</p>
<p>To start, I have <a href='http://blog.earlh.com/wp-content/uploads/2009/07/YHOO-stock-prices-19960412-20090702.csv'>observations for YHOO stock</a> from 12 April 1996 through 2 July 2009.</p>
<p>First, the data needs cleaning &#8212; I turn the column names into lower case for convenience with the tolower function and turn the text dates formatted as yyyy-mm-dd into dates instead of factors via the as.Date constructor for Date classes:<br />
<code>
<pre class="brush:text">
> yahoo <- read.csv(file='~/stuff/blog/YHOO stock prices [19960412, 20090702].csv', header=T, sep=',')
> str(yahoo)
'data.frame':	3329 obs. of  7 variables:
 $ Date     : Factor w/ 3329 levels "1996-04-12","1996-04-15",..: 3329 3328 3327 3326 3325 3324 3323 3322 3321 3320 ...
 $ Open     : num  15.2 15.5 15.8 15.9 15.6 ...
 $ High     : num  15.3 15.7 15.9 16 15.8 ...
 $ Low      : num  14.9 15.3 15.3 15.6 15.5 ...
 $ Close    : num  15 15.4 15.7 15.9 15.7 ...
 $ Volume   : int  16919900 12716100 16033900 12312100 26449100 19827800 30979700 15866300 26488700 20323100 ...
 $ Adj.Close: num  15 15.4 15.7 15.9 15.7 ...
>
> colnames(yahoo) <- tolower( colnames(yahoo) )
> yahoo$date <- as.Date( as.character( yahoo$date ) )
>
> # order yahoo into the same way we want to display it
> yahoo <- yahoo[ order(yahoo$date), ]
</pre>
<p></code></p>
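<p>The reason for going through as.character is easiest to see on a toy factor.  The sketch below uses three made-up date strings; a factor stores integer level codes underneath, and those codes are not dates.  (Recent versions of R read text columns as character by default, so you may not get a factor from read.csv at all; the conversion is harmless either way.)</p>
<pre class="brush: text;">
# Sketch: converting a factor of date strings to Date, then sorting.
raw <- factor(c('2009-07-02', '2009-07-01', '1996-04-12'))

as.integer(raw)                        # 3 2 1: level codes, not dates
d <- as.Date(as.character(raw))
class(d)                               # "Date"

d[ order(d) ]                          # oldest first
</pre>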
<p>Now, let's take a first pass at plotting:<br />
<code>
<pre class="brush:text">
> plot(x=yahoo$date, y=yahoo$close,
+ 	main='YHOO stock close', xlab='date', ylab='close ($)')
</pre>
<p></code></p>
<div id="attachment_67" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.01.png" alt="YHOO close prices, all time" title="plot04.01" width="480" height="480" class="size-full wp-image-67" /></a><p class="wp-caption-text">YHOO close prices, all time</p></div>
<p>That isn't very pretty, not least because we're displaying too much data to be useful.  Let's cut it down to just the data from January 1, 2008 onward:<br />
<code>
<pre class="brush:text">
> yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'), ]
> plot(x=yahoo2$date, y=yahoo2$close,
+ 	main='YHOO stock close', xlab='date', ylab='close ($)')
</pre>
<p></code></p>
<div id="attachment_66" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.02.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.02.png" alt="YHOO close prices, 2008 on" title="plot04.02" width="480" height="480" class="size-full wp-image-66" /></a><p class="wp-caption-text">YHOO close prices, 2008 on</p></div>
<p>It's worth pointing out that R's plotting code will attempt to set the upper and lower y bounds to something reasonable based on the data you give it.  However, sometimes, particularly to get a sense of scale, you really want to see the full range down to zero.  You can accomplish this by explicitly setting the y axis limits with ylim.  I also make the plot more presentable by drawing the series as a black line (type='l') instead of individual points.<br />
<code>
<pre class="brush:text">
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ 	col='black', type='l',
+ 	main='YHOO stock close', xlab='date', ylab='close ($)')
</pre>
<p></code></p>
<div id="attachment_65" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.03.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.03.png" alt="YHOO close data" title="plot04.03" width="480" height="480" class="size-full wp-image-65" /></a><p class="wp-caption-text">YHOO close data</p></div>
<p>Next, I want to plot the moving average, so I create the function ma30 to calculate it.  I compute it over the whole data range and store it as the column close30 before subsetting, so that the moving average is correct at the beginning of our subset:<br />
<code>
<pre class="brush:text">
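> # trailing mean over 30 observations; the window is shorter for the first 29 values
> ma30 <- function( x, na.rm=FALSE ){
+ 		val <- rep( 0, length( x ) )
+ 		for( j in seq_along( x ) ){
+ 			# mean divides by the number of values actually used, so na.rm=TRUE is not biased
+ 			val[ j ] <- mean( x[ max( j - 29, 1 ):j ], na.rm=na.rm )
+ 		}
+ 		val
+ 	}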
>
> yahoo$close30 <- ma30(yahoo$close)
> yahoo2 <- yahoo[ yahoo$date >= as.Date('2008-01-01'), ]
</pre>
<p></code></p>
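<p>The loop is fine for a few thousand rows.  For longer series, the same trailing average can be computed without a loop from cumulative sums.  What follows is a sketch rather than a drop-in replacement: the name ma_trailing and the window argument k are my own for illustration, and it assumes x contains no NA values.</p>
<pre class="brush:text">
# trailing moving average with a shrinking window at the start, like ma30
# (hypothetical helper; assumes x has no NA values)
ma_trailing <- function( x, k=30 ){
	n <- length( x )
	if( n == 0 ) return( numeric(0) )
	cs <- c( 0, cumsum( x ) )                    # cs[ j + 1 ] is the sum of x[ 1:j ]
	start <- pmax( seq_len( n ) - k + 1, 1 )     # first index of each window
	( cs[ seq_len( n ) + 1 ] - cs[ start ] ) / ( seq_len( n ) - start + 1 )
}

# the windows here are 1, 1:2, 1:3, 2:4 and 3:5
ma_trailing( c(1, 2, 3, 4, 5), k=3 )
</pre>
<p>Each window sum is the difference of two cumulative sums, so the work is linear in the length of x regardless of the window size.</p>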
<p>Finally, I replot the data and add the moving average as a second series with points, which draws on the existing plot instead of starting a new one.  I make the second series slightly bolder (lwd=2) to emphasize the moving average over the daily observations:<br />
<code>
<pre class="brush:text; toolbar: false; wrap-lines: false;">
> plot(x=yahoo2$date, y=yahoo2$close, ylim=c(0,1.1*max(yahoo2$close)),
+ 	col='black', type='l',
+ 	main='YHOO stock close', xlab='date', ylab='close ($)')
> points(x=yahoo2$date, y=yahoo2$close30, col='red', type='l', lwd=2)
</pre>
<p></code></p>
<div id="attachment_64" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.04.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot04.04.png" alt="YHOO close data plus 30 day moving average" title="plot04.04" width="480" height="480" class="size-full wp-image-64" /></a><p class="wp-caption-text">YHOO close data plus 30 day moving average</p></div>
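<p>Once a plot has two series, a legend helps.  The sketch below uses made-up data (the data frame d, its columns, and the 5 observation window are all hypothetical) so that it runs on its own.  It uses lines, the more conventional call for adding a line to an existing plot; points with type='l', as used above, draws the same thing.</p>
<pre class="brush:text">
# hypothetical stand-in data: 60 daily observations and a 5 observation trailing mean
set.seed(42)
d <- data.frame(date=as.Date('2008-01-01') + 0:59, close=50 + cumsum(rnorm(60)))
# sides=1 averages past values only, so the first 4 entries are NA
d$avg <- as.numeric( stats::filter(d$close, rep(1/5, 5), sides=1) )

plot(x=d$date, y=d$close, ylim=c(0, 1.1*max(d$close)),
	col='black', type='l',
	main='two series on one plot', xlab='date', ylab='value')
lines(x=d$date, y=d$avg, col='red', lwd=2)
legend('bottomright', legend=c('daily', '5 day average'),
	col=c('black', 'red'), lwd=c(1, 2))
</pre>
<p>The col and lwd vectors passed to legend repeat the settings used for each series, so the legend keys match the lines on the plot.</p>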
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/plotting-multiple-series-in-r-part-4-in-a-series/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Comparing Many Variables in R with Plots &#8212; Part 3 in a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/comparing-many-variables-in-r-with-plots-part-3-in-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/comparing-many-variables-in-r-with-plots-part-3-in-a-series/#comments</comments>
		<pubDate>Sat, 04 Jul 2009 16:00:32 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=47</guid>
		<description><![CDATA[This is post #03 in a running series about plotting in R. Say you have a data frame with a number of variables that you would like to compare against each other. While you could plot them all on the &#8230; <a href="http://blog.earlh.com/index.php/2009/07/comparing-many-variables-in-r-with-plots-part-3-in-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #03 in a running <a href="http://blog.earlh.com/index.php/tag/plotting-series/"> series </a> about plotting in R.
</div>
<p>Say you have a data frame with a number of variables that you would like to compare against each other.  While you could plot them all on the same graph, statisticians frequently want to look for visual evidence of correlation between different sets of observations, or of heteroskedasticity.  A simple way to do this for every variable in a data frame is to call plot on the data frame itself.</p>
<p>Say we have data such as this:<br />
<code>
<pre class="brush: text;">
> s <- data.frame(x=1:30, y=10*runif(n=30))
> s$z <- 10*runif(n=30)
# (a more common application is plotting the output of residuals or predict from a fitted model)
> plot(s)
>
</pre>
<p></code><br />
data: <a href='http://blog.earlh.com/wp-content/uploads/2009/07/plot03.csv'>plot03</a></p>
<div id="attachment_48" class="wp-caption aligncenter" style="width: 490px"><a href="http://blog.earlh.com/wp-content/uploads/2009/07/plot03.01.png"><img src="http://blog.earlh.com/wp-content/uploads/2009/07/plot03.01.png" alt="Pairs plot for a data frame" title="plot03.01" width="480" height="480" class="size-full wp-image-48" /></a><p class="wp-caption-text">Pairs plot for a data frame</p></div>
<p>This is an example of a pairs plot, which I&#8217;ll cover in more detail in the future.</p>
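<p>To make the comment in the code above concrete, here is a minimal sketch that plots fitted values and residuals from a model.  The linear model and the names fit and diag_df are hypothetical, chosen only for illustration.  For a data frame with more than two columns, plot draws the same grid that an explicit call to pairs does.</p>
<pre class="brush:text">
set.seed(1)
s <- data.frame(x=1:30, y=10*runif(n=30))
s$z <- 10*runif(n=30)
pairs(s)   # the explicit form of plot(s)

# hypothetical model, fit only to have residuals and predictions to look at
fit <- lm(y ~ x + z, data=s)
diag_df <- data.frame(fitted=predict(fit), resid=residuals(fit), x=s$x)
plot(diag_df)
</pre>
<p>In the resid against fitted panel, residuals that fan out as the fitted value grows are the usual visual hint of heteroskedasticity.</p>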
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/comparing-many-variables-in-r-with-plots-part-3-in-a-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Saving Plots in R &#8212; Part 2 of a Series</title>
		<link>http://blog.earlh.com/index.php/2009/07/saving-plots-in-r-part-2-of-a-series/</link>
		<comments>http://blog.earlh.com/index.php/2009/07/saving-plots-in-r-part-2-of-a-series/#comments</comments>
		<pubDate>Fri, 03 Jul 2009 16:00:19 +0000</pubDate>
		<dc:creator>earl</dc:creator>
				<category><![CDATA[Plotting]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[plotting series]]></category>

		<guid isPermaLink="false">http://blog.earlh.com/?p=43</guid>
		<description><![CDATA[This is post #02 in a running series about plotting in R. Though the docs are relatively clear on how to save an R plot to disk in a variety of formats, I had some trouble figuring out how because &#8230; <a href="http://blog.earlh.com/index.php/2009/07/saving-plots-in-r-part-2-of-a-series/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="border: 1px solid rgb(230, 219, 85); margin: 0px auto; padding: 10px; width: 70%; background-color: #F5F5F5; font-size: 0.9em; text-align: center;">
	This is post #02 in a running <a href="http://blog.earlh.com/index.php/tag/plotting-series/"> series </a> about plotting in R.
</div>
<p>Though the docs are relatively clear on how to save an R plot to disk in a variety of formats, I had some trouble figuring out how because I googled the wrong words.  In the hope that this can help someone in the future, here&#8217;s how I generated the plots for the last post and saved them to disk:<br />
<code>
<pre class="brush: text;">
> png(filename='plot01.04.png', type='quartz')
> plot(x=s$x, y=s$y, type='b', col='blue', xlim=c(20,30), ylim=c(6,10),
+	xlab='x in [20,30]', ylab='y in [6,10]', main='Basic Plotting Sample, Filtered')
>
> dev.off()
</pre>
<p></code> </p>
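<p>It&#8217;s really as simple as opening a device with png or jpeg, depending on your desired file type, plotting as usual, and then calling dev.off() to close the device and write the file.  Note that type='quartz' is specific to macOS; leave it out on other platforms.</p>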
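<p>The same open, plot, close pattern works for the other file devices.  Here is a small sketch using pdf with no platform-specific arguments; the data and the file name are made up for illustration, and the file goes into the session's temporary directory.</p>
<pre class="brush:text">
# hypothetical data standing in for s from the last post
s <- data.frame(x=1:30, y=seq(1, 10, length.out=30))

out <- file.path(tempdir(), 'plot-example.pdf')   # hypothetical file name
pdf(file=out, width=7, height=7)                  # width and height are in inches
plot(x=s$x, y=s$y, type='b', col='blue',
	xlab='x', ylab='y', main='Basic Plotting Sample')
dev.off()          # close the device so the file is completely written
file.exists(out)
</pre>
<p>Swapping pdf for png or jpeg changes the file type; for those devices width and height are in pixels by default.</p>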
]]></content:encoded>
			<wfw:commentRss>http://blog.earlh.com/index.php/2009/07/saving-plots-in-r-part-2-of-a-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

