Suppose you have an array that you’d like to sort by another array. A common use case might be a set of arrays of somethings and for each something you generate a score in say [0,1]. Now you’d like to sort your somethings by their scores.
Concretely, say you have an array of scores:
scores: [0.3347867, 0.9069004, 0.4391635, 0.8376249, 0.7133011] indices: [0, 1, 2, 3, 4]
and you want the indices of the sorted scores, ie
scores_sorted: [0.3347867, 0.4391635, 0.7133011, 0.8376249, 0.9069004] indices_sorted: [0, 2, 4, 3, 1]
in R, you can always use order, as in
$ R
> scores <- runif(5)
> perm <- order(scores)
> data.frame(score=scores, order=perm)
score order
1 0.3347867 1
2 0.9069004 3
3 0.4391635 5
4 0.8376249 4
5 0.7133011 2
>
> # and just to check
> scores[ perm ]
[1] 0.3347867 0.4391635 0.7133011 0.8376249 0.9069004
You can do something similar in ruby:
irb > scores = [0.3347867, 0.9069004, 0.4391635, 0.8376249, 0.7133011]
=> [0.3347867, 0.9069004, 0.4391635, 0.8376249, 0.7133011]
irb > scores.zip( (1..scores.length).to_a )
=> [[0.3347867, 1], [0.9069004, 2], [0.4391635, 3], [0.8376249, 4], [0.7133011, 5]]
irb >
irb > scores.zip( (1..scores.length).to_a ).sort_by{ |e| e.first }
=> [[0.3347867, 1], [0.4391635, 3], [0.7133011, 5], [0.8376249, 4], [0.9069004, 2]]
irb >
irb > perm = scores.zip( (1..scores.length).to_a ).sort_by{ |e| e.first }.map{ |e| e[1] - 1 }
=> [0, 2, 4, 3, 1]
irb >
irb > scores.values_at(*perm)
=> [0.3347867, 0.4391635, 0.7133011, 0.8376249, 0.9069004]
And finally in C++, you can leverage qsort_r; this function was designed to be a reentrant / threadsafe qsort so you’re given a void* to pass a block of memory into your comparison function. You can use this to sort the indices array by the scores:
#include// [...] // utility fns that join an array of u (unsigned int) or f (double) into a string char* vsprintf_u(char* buff, unsigned int* array, unsigned int len){ char* orig = buff; buff += sprintf(buff, "["); for (int i=0; i < len; i++) buff += sprintf(buff, "%3u, ", array[i]); sprintf(buff, "]"); return orig; } char* vsprintf_f(char* buff, double* array, unsigned int len){ char* orig = buff; buff += sprintf(buff, "["); for (int i=0; i < len; i++) buff += sprintf(buff, "%1.4f, ", array[i]); sprintf(buff, "]"); return orig; } /** * qsort_r comparison fn: sort array indices by scores */ int score_comparator(void* scoresv, const void* leftv, const void* rightv){ unsigned int* left = (unsigned int*)leftv; unsigned int* right = (unsigned int*)rightv; double* scores = (double*)scoresv; if (scores[ *left ] < scores[ *right ]) return -1; else if (scores[ *left ] == scores[ *right ]) return 0; return 1; } // [...] printf("test code:\n"); double scores[5] = {0.3347867, 0.9069004, 0.4391635, 0.8376249, 0.7133011}; unsigned int perm[] = {0, 1, 2, 3, 4}; char buff[4192]; printf("presort: %s\n", vsprintf_f(buff, scores, 5)); printf("presort: %s\n", vsprintf_u(buff, perm, 5)); qsort_r(perm, 5, sizeof(unsigned int), scores, &score_comparator); printf("postsort: %s\n", vsprintf_u(buff, perm, 5)); printf("postsort: ["); for (unsigned int i=0; i < 5u; i++) printf("%1.4f, ", scores[ perm[ i ] ]); printf("]\n");
which produces when run
$ ./a.out presort: [0.3348, 0.9069, 0.4392, 0.8376, 0.7133, ] presort: [ 0, 1, 2, 3, 4, ] postsort: [ 0, 2, 4, 3, 1, ] postsort: [0.3348, 0.4392, 0.7133, 0.8376, 0.9069, ]
Note the canonical way to sort somethings by a float in c++ is to bang everything into a struct or class and leverage qsort on the structs/classes directly. However, this is often pretty inconvenient, and if you have a lot of whatever you want to sort, it's too memory intensive to put everything into structs/classes with the sole addition of your score field.
I think it's obvious why I prefer to program in R.
NB: I am developing for OS X; if you are targeting linux you'll have to figure out how to link qsort_r yourself. I think someone also decided to permute the argument order. Sigh.

