Empirical studies of software process assessment methods
Many methods now exist for assessing the maturity and capabilities of software engineering organizations. Assessment scores are used in contract award decisions by the U.S. Navy and Air Force, as well as by commercial organizations. Furthermore, conformance to process standards such as ISO 9001, as determined during an audit, is a necessity for doing business in many European countries. Software process assessments are also an essential element of the self-improvement cycle for many organizations.
However, there has been a relative dearth of empirical investigations of the core premises of most contemporary assessment methods and their underlying models. Software organizations are being required or pressured to conform to certain standards (e.g., to be at Level 3 on the CMM) without adequate empirical evidence supporting the assumptions made by these standards. The software community needs to be more confident that assessment results accurately reflect the capabilities of the organizations being assessed, not simply the idiosyncrasies of those doing the assessments. We need a solid basis to better understand assessment methods, evaluate their basic premises, and make decisions about their use and improvement.
In our review we address both the validity and the reliability of software process assessments. Validity of measurement is the extent to which a measurement procedure measures what it purports to measure. Reliability is the extent to which the same measurement procedure yields the same results on repeated trials.
Validity: More empirical evidence already exists than is sometimes realized, and our understanding of the effects of maturity is starting to improve. Based on the empirical evidence reviewed in this chapter, one can conclude that process maturity is generally associated with better performance in software organizations.
Reliability: The studies reviewed in this chapter represent much of the published research that examines the reliability of software process assessments. The number of studies on this topic is not large, but the cumulative evidence thus far suggests that assessments can in fact be done reliably.