Perhaps, suggested a panel of experts at the 2009 RSA Conference, the problem lies in their attempt to measure an abstraction such as effectiveness.
"The key is to try to get metrics to be less about quality and more about an activity you can actually observe," said Cigital Inc. CTO Gary McGraw during a panel discussion Wednesday. "We can observe, for example, whether an organization is doing code review -- that's easy. Whether they're doing it effectively is harder."
"Getting information is difficult," McGraw said. "We need a system where information comes to us and we don't have to chase it."
Having data that's been collected over a period of time enables a security program to create benchmarks for itself and observe trends. Nichols compared these observations to what is known as the treatment effect in medical circles.
"You can eventually compare yourself to yourself," she said. "If you spend $200,000 on a SIM, what is the treatment effect of that investment?"
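As a rough illustration of comparing yourself to yourself, the sketch below takes the difference in mean monthly incident counts before and after a purchase. The function name `treatment_effect` and all of the numbers are hypothetical, invented for this example rather than taken from the panel, and a simple before/after difference ignores everything else that changed over the same period.

```python
from statistics import mean


def treatment_effect(before, after):
    """Naive before/after estimate: mean(after) minus mean(before)."""
    return mean(after) - mean(before)


# Hypothetical monthly incident counts, invented for illustration only.
before_sim = [14, 12, 15, 13]
after_sim = [9, 10, 8, 9]

effect = treatment_effect(before_sim, after_sim)
print(f"Change in mean monthly incidents: {effect:+.1f}")
```

A negative value means fewer incidents per month after the investment. The comparison only becomes possible once the program has collected the same measurement over time.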
Microsoft's SDLC is one such established program with specific success criteria, which Shostack said developers and management measure against year after year. The prime metric is a continual reduction in the number of security issues shipping in production code, along with fewer security updates released. Shostack said Microsoft also measures bug counts, the rate at which bugs are found and the software development stage in which they're found.
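Counts like these are simple to tabulate once each bug is recorded with the release and the stage in which it was found. The sketch below is one way to do that. The record layout, stage names and data are assumptions made for illustration and do not describe Microsoft's actual tooling.

```python
from collections import Counter

# Hypothetical bug records: (release, development stage where the bug was found).
bugs = [
    ("v1", "design"), ("v1", "testing"),
    ("v1", "production"), ("v1", "production"),
    ("v2", "design"), ("v2", "code review"),
    ("v2", "testing"), ("v2", "production"),
]


def production_bugs_by_release(records):
    """Count bugs that shipped, that is, were found in production, per release."""
    return Counter(rel for rel, stage in records if stage == "production")


def stage_breakdown(records, release):
    """Count bugs per development stage for a single release."""
    return Counter(stage for rel, stage in records if rel == release)


shipped = production_bugs_by_release(bugs)
print(dict(shipped))
print(dict(stage_breakdown(bugs, "v2")))
```

Tracking the production count release over release gives the trend described above, and the per-stage breakdown shows whether bugs are being caught earlier in development.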
However, Blaze challenged the notion that success is measured by how often bugs are fixed, suggesting that there is an obvious way to skew that metric in Microsoft's favor. Shostack countered that this is impossible because the hacker community continually probes Microsoft products for critical, exploitable vulnerabilities.
"There are a set of people who can hold our feet to the fire," Shostack said. "They will shout 'Hey, look what I found and Microsoft hasn't fixed it yet.'"
BSIMM studied the software security metrics used by nine massive organizations, including Microsoft, Google Inc., Wells Fargo & Co., and the Depository Trust & Clearing Corporation (DTCC), and produced what McGraw called a software security yardstick based on observed activities.
"What works for one organization is unlikely to work for another, even if they're in the same vertical. Investment banks, retirement services firms and the DTCC are all regulated by the same regulators, but culturally, they're different enough so the metrics don't work," McGraw said. "That's why we look at observables."