TY - JOUR
T1 - Visual data mining and analysis of software repositories
AU - Voinea, Lucian
AU - Telea, Alexandru
N1 - Relation: http://www.rug.nl/informatica/organisatie/overorganisatie/iwi
Rights: University of Groningen. Research Institute for Mathematics and Computing Science (IWI)
PY - 2007
Y1 - 2007
N2 - In this article we describe an ongoing effort to integrate information visualization techniques into the process of configuration management for software systems. Our focus is to help software engineers manage the evolution of large and complex software systems by offering them effective and efficient ways to query and assess system properties using visual techniques. To this end, we combine several techniques from different domains, as follows. First, we construct an infrastructure that allows generic querying and data mining of different types of software repositories such as CVS and Subversion. Using this infrastructure, we construct several models of the software source code evolution at different levels of detail, ranging from project and package up to function and code line. Second, we describe a set of views that allow examining the code evolution models at different levels of detail and from different perspectives. We detail three views: the file view shows changes at line level across many versions of a single, or a few, files. The project view shows changes at file level across entire software projects. The decomposition view shows changes at subsystem level across entire projects. We illustrate how the proposed techniques, which we implemented in a fully operational toolset, have been used to answer non-trivial questions on several real-world, industry-size software projects. Our work is at the crossroads of applied software engineering (SE) and information visualization, as our toolset aims to tightly integrate the methods promoted by the InfoVis field into the SE practice.
AB - In this article we describe an ongoing effort to integrate information visualization techniques into the process of configuration management for software systems. Our focus is to help software engineers manage the evolution of large and complex software systems by offering them effective and efficient ways to query and assess system properties using visual techniques. To this end, we combine several techniques from different domains, as follows. First, we construct an infrastructure that allows generic querying and data mining of different types of software repositories such as CVS and Subversion. Using this infrastructure, we construct several models of the software source code evolution at different levels of detail, ranging from project and package up to function and code line. Second, we describe a set of views that allow examining the code evolution models at different levels of detail and from different perspectives. We detail three views: the file view shows changes at line level across many versions of a single, or a few, files. The project view shows changes at file level across entire software projects. The decomposition view shows changes at subsystem level across entire projects. We illustrate how the proposed techniques, which we implemented in a fully operational toolset, have been used to answer non-trivial questions on several real-world, industry-size software projects. Our work is at the crossroads of applied software engineering (SE) and information visualization, as our toolset aims to tightly integrate the methods promoted by the InfoVis field into the SE practice.
KW - Maintenance
KW - Software engineering
KW - Software visualization
KW - Software evolution
KW - Data mining
U2 - 10.1016/j.cag.2007.01.031
DO - 10.1016/j.cag.2007.01.031
M3 - Article
SN - 0097-8493
VL - 31
SP - 410
EP - 428
JO - Computers & Graphics
JF - Computers & Graphics
ER -