Zipline Construction Video

Scribd building the zipline from Tim Morgan.


Picking Subsets of CSV/TSV Files With awk

Say you have a csv or tsv file, and you want to only select the bits where a particular column is not zero. Start with a csv like this:

earl$ head ttt
104834, 0, 206, 104578, false
104837, 4, 206, 103566, false
104854, 0, 193, 101063, false
104856, 0, 195, 101851, false
8469683, 0, 149, 50191, false
121867, 4, 207, 107816, [...]
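
A minimal sketch of the kind of filter described above, assuming the second column is the one that must be nonzero and that fields are separated by a comma plus a space, as in the sample. The scratch file and the variable names `sample` and `nonzero` are made up for illustration; this is one way to do it, not necessarily the approach the full post takes.

```shell
# Rebuild the first five sample rows shown above in a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
104834, 0, 206, 104578, false
104837, 4, 206, 103566, false
104854, 0, 193, 101063, false
104856, 0, 195, 101851, false
8469683, 0, 149, 50191, false
EOF

# -F', ' splits each row on comma-plus-space. A bare pattern with no
# action prints every row for which the pattern is true, so this keeps
# only the rows whose second field is numerically nonzero.
nonzero=$(awk -F', ' '$2 != 0' "$sample")
echo "$nonzero"

rm -f "$sample"
```

For a tsv file the same pattern should work with `-F'\t'` in place of `-F', '`.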


Filled Line Plots / Graphs in R — Part 10 in a Series

This is post #10 in a running series about plotting in R.

Otherwise known as filled curves.
Say you want to draw a filled curve instead of a single line. R’s basic plot doesn’t make this especially easy, though it can be made much easier with packages such as ggplot2 as we’ll see [...]


Building a Zip Line at Scribd

Scribd built a zip line! Chris Seifert and I, along with our coworkers’ help, built a zip line over 3 nights at Scribd. Which is why I haven’t been posting more.
Scribd has a long office space, with 6 pairs of 8-sided concrete columns running down the middle. Chris and I decided to [...]


Multiple Y Axes in R Plots — Part 9 in a Series

This is post #09 in a running series about plotting in R.

Frequently, you want to plot data that is not at all on the same scale. In R, this is done by plotting a second graph on top of your first and building the axis labels by hand. Here’s a rough [...]


In Which Lucy Goes to the Vet

The vet seemed to think she’s healthy and all, except she’s a 10 pound 2 oz cat in a 9 pound cat body. Unfortunately for her. So the diet will continue.


Howto Remove Tabs From CSV Files, A Second Method

As mentioned before, you regularly want to transform tsv files into csv files. While tr is a much less powerful program than sed or awk, it is much easier to use:

tr '\t' ',' < input_file > output_file
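
As a quick check of the tr approach, here is a small sketch run on a hypothetical two-row file; the file contents and the variable names `sample` and `converted` are invented for the example.

```shell
# Write a tiny tab-separated sample to a scratch file.
sample=$(mktemp)
printf 'a\tb\tc\n1\t2\t3\n' > "$sample"

# tr maps every tab character to a comma, one character at a time.
converted=$(tr '\t' ',' < "$sample")
echo "$converted"

rm -f "$sample"
```

Note that tr replaces every tab blindly and adds no quoting, so a field that already contains a comma will end up split across two columns in the result.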


Writing MySQL Query Results to Disk

Notes to myself: how to easily write query results to disk using mysql.

mysql -h main-backup.local -u earl -e "select count(*) from adsense_analytics_days;" -p collegelist_development > csvname.csv;

where -h specifies the name of the mysql server, -u the username, and -e the query; -p makes mysql prompt for the password, and the final argument (collegelist_development) is the database.
This will output a tsv file; to turn it into csv try using [...]


Howto Swap the Order of Columns in a CSV or TSV File – Use awk

Sample file: tab separated

col1 col2 col3
val11 val12 val13
val21 val22 val23
val31 val32 val33

blog earl$ awk 'BEGIN {FS="\t"; OFS=", "} {print $1,$3,$2}' < input.tsv

In this case, FS is the field separator for the input and OFS is the field separator for the output. Thus if we wanted to go from, e.g., tsv to tsv we would set both to "\t" (awk's defaults are any run of whitespace for FS and a single space for OFS); csv [...]
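
To see the column swap end to end, here is a small sketch run on the first two rows of the sample file. It sets FS and OFS in a BEGIN block so that they are already in effect when the first row is split; the scratch file and the variable names `sample` and `swapped` are assumptions made for the example.

```shell
# Recreate the first two rows of the tab-separated sample.
sample=$(mktemp)
printf 'col1\tcol2\tcol3\nval11\tval12\tval13\n' > "$sample"

# BEGIN runs before any input is read, so FS applies from the first
# record onward. print joins its comma-separated arguments with OFS.
swapped=$(awk 'BEGIN {FS="\t"; OFS=", "} {print $1, $3, $2}' "$sample")
echo "$swapped"

rm -f "$sample"
```

With this input the second and third columns trade places, giving `col1, col3, col2` on the first line and `val11, val13, val12` on the second.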


Howto Transform TSV to CSV, or Just Remove Tabs

Unfortunately, statistics and machine learning seem to degenerate into a giant mess of getting data from multiple sources, munging it together, transforming it, and formatting the output, even before you can get to the work proper. A common problem is taking tab-separated value (tsv) files, perhaps produced as the output of a mysql [...]