Monthly Archives: July 2009

Visualizing and Comparing Distributions — Part 8 of a Series

This is post #08 in a running series about plotting in R. Last time, I talked about visualizing the Uniform, Normal, Exponential, and Poisson Distributions. However, there are more useful methods than just plotting the density and distribution functions. Of … Continue reading

Posted in Plotting, R, Statistics, Visualization

Multiple Plots and Visualizing Distributions – Part 7 in a Series

This is post #07 in a running series about plotting in R. I was helping a friend plot some interesting distributions this weekend, so I decided to use distributions to demonstrate one of the neater bits of R’s basic plotting … Continue reading

Posted in Plotting, R, Statistics, Visualization | 1 Comment

Removing Extra Column of Data from CSVs in R — R Tip

When R writes a csv file, you get an extra column of data, like so: > s > plot(x=s$x, y=s$y) > write.csv(x=s, file='s0.csv') When you peek in the csv file, you see this: blog earl$ head s0.csv … Continue reading

Posted in Data Munging, R, R Tip

Examining CSV Data Columns From a Shell

It’s very handy to be able to pop open a shell and peek in your csv files. awk is a command that will do just that — it divides each line into fields based either on a whitespace separator or … Continue reading

Posted in Data Munging, R, R Tip

Labeling Plots – Annotations, Legends, etc — Part 6 in a Series

This is post #06 in a running series about plotting in R. You regularly want to label pieces of a plot in order to point a particular feature out or answer a question that your audience will have. Let’s see … Continue reading

Posted in Plotting, R, Visualization

Removing Quotes From csv Files

Many programs, particularly Excel, have an annoying habit of dumping crap such as quotes or currency symbols into your csv files. I pointed out earlier a simple way to deal with this in R, but if you’re more comfortable with … Continue reading

Posted in Data Munging

Saving MySQL Query Results into csv

Say you have a MySQL query such as select start_date, count(*), sum(impressions) as impr, sum(revenue) as revenue from adsense_analytics_days group by start_date order by start_date desc and you want to save the results into a csv file. MySQL makes … Continue reading

Posted in Data Munging

Plotting With Custom X Axis Labels in R — Part 5 in a Series

This is post #05 in a running series about plotting in R. There are a variety of ways to control how R creates x and y axis labels for plots. Let’s walk through the typical process of creating good labels … Continue reading

Posted in Data Munging, Plotting, R, Visualization | 1 Comment

Plotting Multiple Series in R — Part 4 in a Series

This is post #04 in a running series about plotting in R. Frequently, you want to plot multiple series on the same plot. Let’s try plotting daily observations along with a 30-day moving average. To start, I have … Continue reading

Posted in Data Munging, Plotting, R | 2 Comments

Comparing Many Variables in R with Plots — Part 3 in a Series

This is post #03 in a running series about plotting in R. Say you have a data frame with a number of variables that you would like to compare against each other. While you could plot them all on the … Continue reading

Posted in Plotting, R, Visualization