Column Names of R Data Frames

Say you read a data frame from a file but you don’t like the column names. Here’s how to relabel them as you like. Start with a simple CSV file:

col1, col2, col3
"1,233", "$12.79", "$1,333,233.17"
"470", "$1,113.22", "$0.12"

Load it, and see what we get:

> data <- read.csv(file='~/stuff/blog/dirty.csv', header=TRUE, sep=',')
> data
   col1       col2           col3
1 1,233     $12.79  $1,333,233.17
2   470  $1,113.22          $0.12
> str(data)
'data.frame':	2 obs. of  3 variables:
 $ col1: Factor w/ 2 levels "1,233","470": 1 2
 $ col2: Factor w/ 2 levels " $1,113.22"," $12.79": 2 1
 $ col3: Factor w/ 2 levels " $0.12"," $1,333,233.17": 2 1
>
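
If you already know the names you want, one option is to supply them while reading, so no renaming is needed afterwards. The sketch below assumes the same three-column file; it uses an inline copy of the data (via the text argument) so it runs without a file on disk, and the names count, price and total are made up for illustration. stringsAsFactors = FALSE is set so the columns come back as character vectors rather than factors.

```r
# Inline copy of the sample file, so the example needs no file on disk.
csv_text <- 'col1, col2, col3
"1,233", "$12.79", "$1,333,233.17"
"470", "$1,113.22", "$0.12"'

# With header = TRUE the first line is still consumed as a header,
# but col.names (illustrative names here) replaces what it contained.
data2 <- read.csv(text = csv_text, header = TRUE, sep = ',',
                  col.names = c('count', 'price', 'total'),
                  stringsAsFactors = FALSE)

colnames(data2)
```

To read from the original file instead, swap text = csv_text for the file argument used above.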

Now, let’s examine the column names, and check how many rows and columns there are:

> colnames( data )
[1] "col1" "col2" "col3"
> nrow(data)
[1] 2
> ncol(data)
[1] 3
>
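
A few equivalent ways to get the same information, sketched on a stand-in data frame built by hand from the sample values (the name df is just for this example):

```r
# Stand-in for the data frame read from the sample file.
df <- data.frame(col1 = c('1,233', '470'),
                 col2 = c('$12.79', '$1,113.22'),
                 col3 = c('$1,333,233.17', '$0.12'),
                 stringsAsFactors = FALSE)

dims <- dim(df)                  # number of rows, then number of columns
n_names <- length(colnames(df))  # one name per column, so this equals ncol(df)
same <- identical(names(df), colnames(df))  # names() gives the same vector for a data frame

dims
n_names
same
```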

R lets us modify the column names of a data frame by assigning to the character vector returned by colnames, either one element at a time or all at once:

> colnames(data)
[1] "col1" "col2" "col3"
>
> # set the name of column 2
> colnames(data)[2] <- 'column 2'
> colnames(data)
[1] "col1"     "column 2" "col3"
>
> # you can assign all of the columns at once, if you wish
> colnames(data) <- c( 'col 1', 'col 2', 'col 3')
> colnames(data)
[1] "col 1" "col 2" "col 3"
> str(data)
'data.frame':	2 obs. of  3 variables:
 $ col 1: Factor w/ 2 levels "1,233","470": 1 2
 $ col 2: Factor w/ 2 levels " $1,113.22"," $12.79": 2 1
 $ col 3: Factor w/ 2 levels " $0.12"," $1,333,233.17": 2 1
>
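
Two follow-ups that tend to come up once names contain spaces, shown as a rough sketch on a hand-built stand-in data frame (df is a name chosen for this example). First, a column can be renamed by matching its old name rather than counting positions. Second, a name with a space in it has to be wrapped in backticks when used with $, or passed as a string to [[ ]]; make.names() is one way to convert such names back into syntactically valid ones.

```r
# Stand-in for the data frame read from the sample file.
df <- data.frame(col1 = c('1,233', '470'),
                 col2 = c('$12.79', '$1,113.22'),
                 col3 = c('$1,333,233.17', '$0.12'),
                 stringsAsFactors = FALSE)

# Rename by matching the old name instead of relying on its position.
colnames(df)[colnames(df) == 'col2'] <- 'column 2'

# A name containing a space needs backticks with $ ...
by_dollar <- df$`column 2`
# ... or a quoted string with [[ ]].
by_bracket <- df[['column 2']]

# make.names() replaces invalid characters (such as spaces) with dots.
colnames(df) <- make.names(colnames(df))
colnames(df)
```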