Category Archive
The following is a list of all entries from the R Tip category.
Querying Postgres or Greenplum From R on a Mac, Installation Instructions
Filed in Data Munging, R, R Tip, January 21, 2010, 2:19 pm
NB: this works on 64-bit versions of R; I tested it with the R64 app with R version 2.10.1 on Snow Leopard.
Step by step instructions for talking to Postgres or Greenplum:
install macports
install postgres; I used 8.4
sudo port install postgresql84
in a shell, create an environment variable PG_CONFIG pointing to the pg_config binary installed [...]
Querying Databases From R on a Mac
Filed in Data Munging, R, R Tip, January 11, 2010, 5:28 pm
I use a Mac, currently running OS X 10.6 / Snow Leopard, and I’d like to query our Greenplum / Postgres database from R. This used to work with R 2.9, but I unfortunately had to upgrade R, and R 2.10 on the Mac is a 64-bit app. So, I want to use [...]
Querying Postgres or Greenplum from R on a Mac
Filed in Data Munging, R, R Tip, December 31, 2009, 9:00 am
So, I’m using Snow Leopard, and I want to query our Postgres / Greenplum database.
First things first: I’m familiar with the RODBC package on CRAN. This installs fine, since it’s a binary package. I also installed the ODBC Administrator app, which you have to download from Apple. Now all [...]
Querying Databases in R
Filed in Data Munging, R, R Tip, August 14, 2009, 9:00 am
One of the first things you’ll want to do in R is set it up to talk to databases. The easiest way to do this is using ODBC, via package RODBC.
To get the package, run
> install.packages("RODBC")
Once you have RODBC installed, you call it in R as follows. But it’s very simple: a bit [...]
R Dates – Recovering and Converting From Integers
Filed in R, R Tip, August 12, 2009, 9:00 am
One problem with R is that dates (class Date) are stored internally as the number of days elapsed since 1 January 1970, and R sometimes loses the dateness of a variable and treats it only as a number. So in the first line, we take the range of dates present in our data, [...]
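The excerpt is cut off before its code, so what follows is only a minimal sketch of the general idea, not necessarily the post's own approach. It assumes the values are plain day counts since 1 January 1970 and uses base R's as.Date with an explicit origin; the vectors dates, days and recovered are hypothetical.

```r
# Hypothetical data: two dates, then the same values with the Date class dropped
dates <- as.Date(c("2009-08-01", "2009-08-12"))
days  <- as.numeric(dates)   # plain day counts since 1970-01-01

# Restore the Date class by stating the origin explicitly
recovered <- as.Date(days, origin = "1970-01-01")

print(days)
print(recovered)
```

Stating the origin explicitly keeps the conversion unambiguous and works on older versions of R, where as.Date on a number requires it.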
Examining Data Frames — head and tail
Filed in Data Munging, R, R Tip, August 2, 2009, 12:30 am
head and tail, familiar to anyone who uses the Unix command line, are two very handy utilities for looking at data frames. Along with str, which displays the structure of a data frame, they help you look at your data:
> d d
> str(d)
'data.frame': 50 obs. of 2 variables:
$ mean: int 1 [...]
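Since the excerpt's own data frame is cut off, here is a small sketch of the three calls on a made-up data frame. The column names id and value are illustrative assumptions and are not taken from the post.

```r
# Hypothetical data frame; the column names are illustrative only
d <- data.frame(id = 1:50, value = (1:50) * 2)

first_rows <- head(d, n = 3)   # first 3 rows
last_rows  <- tail(d, n = 3)   # last 3 rows; the original row names are kept

print(first_rows)
print(last_rows)
str(d)                         # column types plus a preview of each column
```

Both head and tail default to six rows when n is not given.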
Removing Extra Column of Data from CSVs in R — R Tip
Filed in Data Munging, R, R Tip, July 11, 2009, 9:00 am
When R writes a csv file, you get an extra column of data, like so:
> s
> plot(x=s$x, y=s$y )
>
> write.csv(x=s, file='s0.csv')
When you peek in the csv file, you see this:
blog earl$ head s0.csv
"","x","y"
"1",1,8.29164186026901
"2",2,2.83956938423216
"3",3,7.43510165950283
"4",4,6.38210728997365
"5",5,9.29241271456704
"6",6,6.13102467032149
"7",7,5.03747826907784
"8",8,1.83257902506739
"9",9,9.62789378128946
blog earl$
What is that first column? It’s actually pretty obvious in this example, but if you’ve sorted your data frame [...]
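The unlabeled first column holds the data frame's row names, which write.csv writes by default. The excerpt ends before showing its fix, so this is one standard way to drop the column, not necessarily the post's own. The data frame and the temporary file path below are hypothetical stand-ins.

```r
# Hypothetical data frame standing in for `s`
s <- data.frame(x = 1:3, y = c(2.5, 4.0, 1.5))

out <- tempfile(fileext = ".csv")
write.csv(s, file = out, row.names = FALSE)  # omit the row-name column

header <- readLines(out, n = 1)              # first line of the file
print(header)
```

If the file has already been written with row names, reading it back with read.csv(file, row.names = 1) should treat that first column as row names instead of data.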
Examining CSV Data Columns From a Shell
Filed in Data Munging, R, R Tip, July 10, 2009, 8:08 pm
It’s very handy to be able to pop open a shell and peek in your csv files. awk is a command that will do just that: it divides each line into fields based either on a whitespace separator or on a separator specified by -F. Here’s a script that prints the 8th [...]