Say you read a data frame from a file but you don’t like the column names. Here’s how to relabel them as you like. Start with a simple CSV file:
col1, col2, col3
"1,233", "$12.79", "$1,333,233.17"
"470", "$1,113.22", "$0.12"
Load it, and see what we get:
> data <- read.csv(file='~/stuff/blog/dirty.csv', header=T, sep=',')
> data
   col1       col2           col3
1 1,233     $12.79  $1,333,233.17
2   470  $1,113.22          $0.12
> str(data)
'data.frame': 2 obs. of 3 variables:
 $ col1: Factor w/ 2 levels "1,233","470": 1 2
 $ col2: Factor w/ 2 levels " $1,113.22"," $12.79": 2 1
 $ col3: Factor w/ 2 levels " $0.12"," $1,333,233.17": 2 1
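To follow along without creating a file, the same data can be read from an inline string. This is only a minimal sketch: `csv_text` is a name made up for illustration, and the `text=` argument of `read.csv` stands in for the file path used above. `stringsAsFactors=TRUE` is passed explicitly on the assumption that you want the factor columns shown in the transcript; since R 4.0.0 the default is `FALSE`, which gives character columns instead.

```r
# Hypothetical inline copy of dirty.csv (csv_text is an illustrative name)
csv_text <- 'col1, col2, col3
"1,233", "$12.79", "$1,333,233.17"
"470", "$1,113.22", "$0.12"'

# text= reads from the string instead of a file;
# stringsAsFactors = TRUE asks for factor columns, as in the transcript above
data <- read.csv(text = csv_text, header = TRUE, sep = ",",
                 stringsAsFactors = TRUE)
str(data)
```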
Now, let's examine the column names (and also see how many rows and columns there are):
> colnames( data )
[1] "col1" "col2" "col3"
> nrow(data)
[1] 2
> ncol(data)
[1] 3
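As a small aside, `dim` reports both counts in one call, and for a data frame `names` returns the same vector as `colnames`. The sketch below rebuilds a stand-in for the data frame by hand (as plain character columns, which is an assumption made only to keep the example self-contained):

```r
# Stand-in for the data frame read above (values typed in by hand)
data <- data.frame(col1 = c("1,233", "470"),
                   col2 = c("$12.79", "$1,113.22"),
                   col3 = c("$1,333,233.17", "$0.12"))

dim(data)     # number of rows, then number of columns
names(data)   # for a data frame, the same as colnames(data)
```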
R lets us modify the column names of a data frame by assigning to the character vector returned by colnames:
> colnames(data)
[1] "col1" "col2" "col3"
>
> # set the name of column 2
> colnames(data)[2] <- 'column 2'
> colnames(data)
[1] "col1"     "column 2" "col3"
>
> # you can assign all of the columns at once, if you wish
> colnames(data) <- c( 'col 1', 'col 2', 'col 3')
> colnames(data)
[1] "col 1" "col 2" "col 3"
> str(data)
'data.frame': 2 obs. of 3 variables:
 $ col 1: Factor w/ 2 levels "1,233","470": 1 2
 $ col 2: Factor w/ 2 levels " $1,113.22"," $12.79": 2 1
 $ col 3: Factor w/ 2 levels " $0.12"," $1,333,233.17": 2 1
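Two follow-ups that may be handy, offered as a sketch rather than the only way to do it: you can pick the column to rename by its current name instead of its position, and once a name contains a space you need backticks (or `[[ ]]` with a quoted string) to reach that column. The data frame here is again a hand-typed stand-in.

```r
# Stand-in for the data frame read above (values typed in by hand)
data <- data.frame(col1 = c("1,233", "470"),
                   col2 = c("$12.79", "$1,113.22"),
                   col3 = c("$1,333,233.17", "$0.12"))

# Rename by matching the old name, so the column's position does not matter
colnames(data)[colnames(data) == "col2"] <- "column 2"

# Names containing spaces need backticks with $, or a quoted string with [[ ]]
data$`column 2`
data[["column 2"]]
```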