This project creates various types of statistics and graphs from Subversion repository log data.

News/Updates

New version 0.6 available (28 Mar 2010)
New version 0.5.14 available (4 Feb 2010) - binary files are now detected using a list of commonly used binary file extensions. Diff calculation is improved for large repositories that can be accessed as a 'file://' repository.
DO NOT USE 0.5.13. Version 0.5.13 has a bug in the line count computations. If you are using 0.5.13, please discard the repository stats database and regenerate it.
Steps to generate these statistics:
1. The Subversion log information is first converted into a SQLite database.
2. Various stats are then generated using SQL queries.
3. These stats are converted into graphs using the matplotlib package.
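
The steps above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the table name `svnlog`, its columns, the sample log entries and the function names are all assumptions made up for the example. The real scripts read the log through pysvn rather than from a hard-coded list.

```python
import sqlite3

# Hypothetical log entries: (revision, author, commit timestamp).
# The real tool would obtain these from the repository via pysvn.
SAMPLE_LOG = [
    (1, "alice", "2010-02-01 09:15:00"),
    (2, "bob", "2010-02-01 14:40:00"),
    (3, "alice", "2010-02-02 09:55:00"),
    (4, "alice", "2010-02-03 14:05:00"),
    (5, "bob", "2010-02-04 14:30:00"),
]


def build_database(entries):
    """Step 1: store the log entries in a SQLite database (in memory here)."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema, chosen only for this sketch.
    conn.execute(
        "CREATE TABLE svnlog ("
        "revno INTEGER PRIMARY KEY, author TEXT, commitdate TEXT)"
    )
    conn.executemany("INSERT INTO svnlog VALUES (?, ?, ?)", entries)
    conn.commit()
    return conn


def commits_by_hour(conn):
    """Step 2: use a SQL query to count commits for each hour of the day."""
    cursor = conn.execute(
        "SELECT CAST(strftime('%H', commitdate) AS INTEGER) AS hour, COUNT(*) "
        "FROM svnlog GROUP BY hour ORDER BY hour"
    )
    return cursor.fetchall()


def plot_commits_by_hour(rows, filename):
    """Step 3: turn the query result into a bar graph with matplotlib."""
    # Imported here so the first two steps work without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    hours = [hour for hour, _ in rows]
    counts = [count for _, count in rows]
    fig, ax = plt.subplots()
    ax.bar(hours, counts)
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Commits")
    fig.savefig(filename)
    plt.close(fig)


conn = build_database(SAMPLE_LOG)
hourly = commits_by_hour(conn)
print(hourly)
```

Calling `plot_commits_by_hour(hourly, "activity_by_hour.png")` (the file name is arbitrary) would then write the bar graph to disk, provided matplotlib is installed.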
The graphs are inspired by those generated by StatSVN/StatCVS.
Currently the following statistics and graphs are generated:
General Statistics
    Revision count
    Author count
    File count
    Head revision number
Top 10 Hot List
    Top 10 Active Authors
    Top 10 Active Files
LoC graphs
    Total LoC line graph (LoC vs dates)
    Average file size vs date line graph
    Contributed lines of code line graph (LoC vs dates), using a different coloured line for each developer
    LoC and churn graph (LoC vs date, churn vs date). Churn is the number of lines touched (i.e. lines added + lines deleted + lines modified)
File Count graphs
    File count vs dates line graph
    File type vs number of files horizontal bar chart
Directory size graphs
    Directory size vs date line graph, using a different coloured line for each directory
    Directory size pie chart (latest status)
    Directory file count pie chart (latest status)
Commit Activity Graphs
    Commit Activity Index
    Activity by hour of day bar graph (commits vs hour of day)
    Activity by day of week bar graph (commits vs day of week)
    NEW Author commit trend history (histogram of time between consecutive commits by same author)
    Author Activity horizontal bar graph (author vs adding+committing percentage)
    Commit activity for each developer - scatter plot (hour of day vs date)
Others
    Tag cloud of words from revision log messages
    Tag cloud of author names

These scripts depend on the following Python packages:
    pysvn - Python interface to Subversion
    sqlite3 - included by default in the Python distribution
    matplotlib - Python graph library

Currently I am experimenting with applying social network analysis to repositories. Check the preliminary results at Social Network Analysis of Rietveld Subversion Repository and Treemap of Commit count vs centrality for Rietveld repository.
I am a novice at Python, SQLite and matplotlib, so any suggestions for improvement are welcome.