FinViz is a site that aggregates a lot of financial information and presents it in interesting and innovative ways. The centrepiece of the site is a very impressive “squarified treemap” (see here for an interesting article on the history of the treemap as a data visualization tool). The map is apparently built using the Google Maps API, and is densely packed with information.
A couple of snapshots I found particularly interesting… here is the 3-month performance treemap for the S&P 500:
And here is the 1-week performance treemap:
It is a pretty interesting comparison – check out the hammering the tech sector took in the 3-month view (AAPL and GOOG particularly).
As an aside – has anyone else seen any interesting examples of treemaps or Voronoi maps (apart from the usual disc space explorers)?
Here is an example of using the SVNKit API to crawl an SVN repository and pick up the commit sizes. It uses a very simple (and incorrect) heuristic for estimating the number of lines changed per commit – it just takes the absolute value of the difference between the number of lines added and the number of lines removed in each commit.
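To make the weakness of that heuristic concrete, here is a minimal sketch of it in isolation. This is not the code from this post: the class and method names are hypothetical, and the "churn" variant is only included as one possible alternative for comparison.

```java
// Hypothetical helper, for illustration only (not part of SVNKit or the original code).
final class CommitSizeEstimate {

    /** The heuristic described above: absolute net change, |added - removed|. */
    static int netChange(int linesAdded, int linesRemoved) {
        return Math.abs(linesAdded - linesRemoved);
    }

    /** One possible alternative (an assumption, not used in this post): total churn. */
    static int churn(int linesAdded, int linesRemoved) {
        return linesAdded + linesRemoved;
    }

    public static void main(String[] args) {
        // A commit that rewrites 10 lines in place: 10 added, 10 removed.
        System.out.println(netChange(10, 10)); // 0  -> the rewrite is invisible
        System.out.println(churn(10, 10));     // 20

        // A commit that mostly adds new code: the two measures are closer.
        System.out.println(netChange(120, 20)); // 100
        System.out.println(churn(120, 20));     // 140
    }
}
```

The first case is why the heuristic is "incorrect": any commit that replaces roughly as many lines as it removes is scored close to zero, however much it actually touched.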
The code below will produce a comma-separated values file containing the author, commit time, line change count estimate, and revision number.
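As a rough illustration of the output format, one row of that file could be assembled as in the sketch below. The class name, the ISO-8601 timestamp format, and the assumption that author names never contain commas are all mine; the original code may format the fields differently.

```java
import java.time.Instant;

// Hypothetical helper, for illustration only.
final class CommitCsvRow {

    /**
     * Builds one CSV row: author, commit time, line change estimate, revision.
     * Assumes the author name contains no commas or quotes, so no escaping is done.
     */
    static String format(String author, Instant commitTime, long linesChanged, long revision) {
        return String.join(",",
                author,
                commitTime.toString(),        // ISO-8601, e.g. 2009-01-15T10:30:00Z
                Long.toString(linesChanged),
                Long.toString(revision));
    }

    public static void main(String[] args) {
        // Made-up values, not taken from a real repository.
        System.out.println(format("user1", Instant.parse("2009-01-15T10:30:00Z"), 45, 1234));
        // prints: user1,2009-01-15T10:30:00Z,45,1234
    }
}
```

A file in this shape, with one row per commit, can be read straight into R with read.csv.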
Loading the resulting file into R allows us to apply some analysis. We can plot the total number of commits per committer:
Or look at the estimated number of lines changed in each commit:
And look at some summary stats (again, per author):
$user1
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
      1.0      1.0      5.0    439.3     45.5  45100.0
$user2
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
      1.0      3.0     26.0    294.9    105.5  62700.0
$user3
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.00     1.00     1.00    46.64     5.00 22300.00
$user4
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
      1.0      5.5     51.0    225.5    166.0   1882.0
$user5
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     39.0    108.0    267.0    231.4    298.0    445.0
$user6
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
      1.0      2.0      7.0    181.3     41.0  21170.0
$user7
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
      1.0      5.0     34.5    164.8    136.0   3066.0
You can see from the entries for the first couple of authors above that the mean is skewed by some very large commits (user1 has a mean of 439.3 against a median of 5.0, with a maximum of 45100), which makes the median a much more robust measure of the typical number of lines per commit.
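The effect is easy to reproduce with a handful of numbers. The sketch below uses made-up commit sizes (not the repository data above) and hypothetical helper names to show how a single very large commit drags the mean far from the median.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class CommitStats {

    /** Arithmetic mean; NaN for an empty array. */
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    /** Median of the values; NaN for an empty array. The input is not modified. */
    static double median(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        // Made-up commit sizes: four small commits and one huge one.
        double[] sizes = {1, 1, 5, 45, 45100};
        System.out.println(mean(sizes));   // 9030.4
        System.out.println(median(sizes)); // 5.0
    }
}
```

Removing the one outlier moves the mean from 9030.4 to 13.0, while the median only moves from 5.0 to 3.0.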
// Get svn log for entire repo history
long currentRev = repo.getLatestRevision();
ArrayList<SVNLogEntry> entries = new ArrayList<SVNLogEntry>(repo.log(new String[] {""}, null, 1, currentRev, true, true));

// Diff all subsequent revisions
for (int i = 1; i < entries.size(); ++i) {
    int changedThisCommit = 0;
    SVNLogEntry current = entries.get(i);
    SVNLogEntry prev = entries.get(i-1);