This hacker is rating software security Consumer Reports-style

Mudge's Cyber Independent Testing Lab (CITL) is fuzzing binaries at scale and building a checklist of compile-time security best practices.

The poor security of much enterprise software can be dramatically improved at low cost with the compile-time equivalents of seatbelts and airbags. With that in mind, the Cyber Independent Testing Lab (CITL) is building a Consumer Reports-style rating system to grade the security of thousands of software binaries.

Founded by L0pht hacker and former head of cybersecurity research at DARPA Peiter "Mudge" Zatko, and bankrolled with seed funding from the US Air Force, the CITL presented its methodology and some preliminary results at the 34C3 hacker conference in Leipzig, Germany, a few weeks ago.

"It's ridiculous," Tim Carstens, acting director of the CITL, says. "At the enterprise scale, you've easily got a hundred thousand different binaries running in different places, and some infinitesimal fraction of that has the latest security features, and most of it isn't compiled in a way that enables those trivial defenses."

While basic compile-time security features like ASLR or DEP may not be silver bullets, they do make an attacker's job much more difficult. The vast amount of low-hanging fruit that attackers currently enjoy can be taken away from them, and at low cost to software vendors and enterprise security administrators. "At scale a lot of really basic defenses are not present," Carstens says. "Major vendors' software does not have stack execution prevention or heap prevention enabled, and this is software that has an attack surface."

To solve this problem, the CITL is building a checklist of compile-time security best practices. "For software vendors, the main question I would pose to them is, what is their pre-release process on their gold image?" Carstens says. "What is their prerelease checklist? Check for the presence of compiler hardening features like ASLR and DEP, things in that class."
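
As a rough illustration of what one such pre-release check might look like on Linux, the sketch below reads an ELF header and reports two properties: whether the binary is position-independent (a prerequisite for full ASLR of the executable image) and whether its stack is marked non-executable (the ELF counterpart of DEP). This is not the CITL's tooling; the function name check_hardening and the choice of these two checks are assumptions made for the example, and a shared library will also report "pie" as true because it uses the same ELF type.

```python
import struct

ET_DYN = 3                  # shared object or position-independent executable
PT_GNU_STACK = 0x6474E551   # program header that records stack permissions
PF_X = 0x1                  # "executable" segment flag


def check_hardening(data: bytes) -> dict:
    """Report two hardening properties of an ELF image held in memory.

    Hypothetical helper for illustration; not part of any CITL tool.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little-endian

    (e_type,) = struct.unpack_from(endian + "H", data, 16)
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
        flags_offset = 4    # p_flags follows p_type in 64-bit headers
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 42)
        flags_offset = 24   # p_flags sits near the end in 32-bit headers

    # Without a PT_GNU_STACK header we cannot show the stack is
    # non-executable, so this sketch conservatively reports False.
    nx_stack = False
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, base)
        if p_type == PT_GNU_STACK:
            (p_flags,) = struct.unpack_from(endian + "I", data, base + flags_offset)
            nx_stack = not (p_flags & PF_X)

    return {"pie": e_type == ET_DYN, "nx_stack": nx_stack}
```

A release script could call check_hardening(open(path, "rb").read()) on each shipped binary and fail the build if either value is False.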

To encourage vendors to prioritize security, the CITL is mass testing thousands of publicly available binaries against its checklist, and plans to publish Consumer Reports-style ratings. Enterprise security administrators will be able to use the CITL's ratings to identify weaknesses in their infrastructure and to demand more secure software from their suppliers.

"Do you know what software is running in your environment?" Carstens asks CSOs and CISOs. "How much do you know about that software? Have a process in place. That's a program you can spin up, and you're going to get some value you can show your board."

What does the CITL's checklist look like?

Measuring software security turns out to be a hard problem, and it begins with deciding how to define the word "security." At the CITL's talk in December at 34C3, Carstens compared measuring software security to prospecting for diamonds. Since diamonds are quite rare, prospectors look instead for minerals that are often found near diamonds--such as garnet, diopside, and chromite. In the same way, since measuring security in some absolute way may be impossible, the CITL takes a more pragmatic approach, and instead asks, "How difficult is it for an attacker to find a new exploit for this software?"

"Am I reaching towards the Platonic ideal of security? Absolutely not," Carstens says. "The things we're reporting on presently, we are measuring very conservative things. I've never met a dissenter who said you shouldn't look for ASLR, for example."

Using a custom fuzzer that is still under development, the CITL tests binaries and rates them based on their complexity, application armoring, and developer hygiene. The more complex the code, the more likely it is to contain security flaws. Developers who use the C strcpy and strcat functions, the CITL reasons, likely haven't given security much thought. Application armoring includes compile-time defenses like stack guards, ASLR, and code signing.
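
A crude version of the hygiene and armoring checks can be sketched as a search for telltale symbol names in a binary. The example below looks for the two functions named above, plus __stack_chk_fail, the symbol that GCC and Clang reference when stack protection is enabled. The function name scan_symbols and the byte-search approach are assumptions for illustration, not the CITL's method; a stripped, statically linked binary would show nothing, and a matching string elsewhere in the file could produce a false positive.

```python
# Functions the article names as signs of poor developer hygiene.
RISKY_FUNCTIONS = ("strcpy", "strcat")


def scan_symbols(data: bytes) -> dict:
    """Heuristically look for symbol names in a binary image.

    Hypothetical helper for illustration; a real tool would parse the
    symbol tables instead of searching raw bytes.
    """
    def has_symbol(name: str) -> bool:
        # Symbol names are stored as NUL-terminated strings, so requiring
        # a NUL on both sides avoids matching names such as "__strcpy_chk".
        return b"\x00" + name.encode() + b"\x00" in data

    return {
        "risky_calls": [n for n in RISKY_FUNCTIONS if has_symbol(n)],
        "stack_guard": has_symbol("__stack_chk_fail"),
    }
```

Requiring the surrounding NUL bytes is a deliberate choice: it keeps the fortified variants that hardened builds link against from being counted as risky calls.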

The CITL documentation compares these application armoring features to airbags and seatbelts in cars. "Modern compilers, linkers, and loaders come with lots of nifty safety features -- things that are proven to improve safety and whose use should be established by now as industry-standard. If your car doesn't have airbags, you're entitled to know that before you buy it."

To prevent software vendors from gaming the rating system, the CITL is publishing its checklist, but not the fuzzing tools it uses to rate the software itself. It hopes to see multiple implementations of its checklist and thus broader test coverage across the industry.

Inspired by the CITL, Fedora Linux is working to do just that.

Fedora Cyber Test Lab

Fedora Linux includes tens of thousands of packages, including many binaries. Jason Callaway, leader of the Fedora Red Team Special Interest Group (SIG), realized that ensuring those binaries meet minimum security requirements would significantly improve the security posture of Fedora and its downstream partners, Red Hat and CentOS.

"The goal of the SIG is to become Fedora's upstream cybersecurity community," he says. "What CITL is doing is analyzing the amount of effort it would take for a researcher to potentially find another zero-day in that given binary."

Callaway's first release target is an open source fuzzer that will crack open RPMs, scan them for ELF binaries, perform tests, and then report the results. The project, he admits, is "light-years" behind the CITL's efforts. "We're not really ready to talk about our findings," he says. "I don't trust my own data yet."
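
The "scan them for ELF binaries" step can be sketched in a few lines. The example below assumes the package has already been unpacked into a directory (one common way is rpm2cpio package.rpm | cpio -idm) and simply walks that tree looking for files that begin with the four-byte ELF magic number. The function name find_elf_files is hypothetical and this is not the Fedora project's code.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def find_elf_files(root: str) -> list:
    """Return sorted paths of regular files under root that start with the ELF magic.

    Hypothetical helper for illustration; assumes the RPM payload has
    already been extracted to `root`.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files so each binary is seen once.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    if fh.read(4) == ELF_MAGIC:
                        found.append(path)
            except OSError:
                continue  # unreadable file: ignore rather than abort the scan
    return sorted(found)
```

Each path returned could then be handed to whatever per-binary tests the scanner runs.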

Callaway's SIG is also working on a tool for Fedora/RHEL/CentOS admins called elem, short for the Enterprise Linux Exploit Mapper. The goal of elem, Callaway explains, is to make it easy for sysadmins to quickly assess an RPM-based Linux server against a database of known exploits.

"Red and blue teams are two sides of the same coin," Callaway says. "Even though [red team] is the name of the SIG, you can't really effectively do blue team stuff without also doing the red team stuff."

Callaway hopes that elem, combined with the still-nascent Fedora implementation of the CITL's work, will significantly improve the security posture of Fedora and its downstream distributions.

Not everyone agrees that the CITL model of rating software security is a workable idea, however.

Criticism of the CITL

Critics of the CITL argue that there is a category difference between defending against manufacturing defects and defending against sabotage. Underwriters Laboratories (now UL) does not consider malicious adversaries when it evaluates electrical appliances, nor does Consumer Reports rate the roadworthiness of cars based on how well they resist the mafia disabling your brakes. The threat models are different.

Security researcher Rob Graham went so far as to call the initiative a "dumb idea" back in 2015, writing: "UL is about accidental failures in electronics. CyberUL would be about intentional attacks against software. These are unrelated issues. Stopping accidental failures is a solved problem in many fields. Stopping attacks is something nobody has solved in any field." (Graham did not respond to our requests for comment for this article.)

Carstens acknowledges that Graham has a point, but suggests that, given the poor state of software security across the board, the CITL can nevertheless make a big difference. "You can rate certain defensive tools against an adversary," Carstens says. "There's a measurable difference between a steel wall and a cedar plank wall, for example. An adversary who shows up and is more opportunistic, without a tall enough ladder or a digging machine, then your steel wall is going to be pretty good."

"Hitting 100 out of 100 on my test does not mean your software is invincible," he adds. "It does not."

The Digital Standard

For decades, infosec Cassandras have warned about the catastrophic social, political, and economic consequences of rampant insecurity across the internet. Today, as the cyber and physical realms merge, we may have reached a tipping point. The CITL, in partnership with Consumer Reports, Disconnect, and Ranking Digital Rights, has proposed a Digital Standard: "The standard defines and reflects important consumer values that must be addressed in product development: electronics and software-based products should be secure, consumer information should be kept private, ownership rights of consumers should be maintained, and products should be designed to combat harassment and help protect freedom of expression."

Those values are founded on secure software design and implementation, and the CITL's push to rate software security at scale offers both a carrot and a stick to software vendors. Just as Consumer Reports uses a "name and shame" strategy when evaluating consumer appliances, the CITL's ratings mean software vendors can expect to be called out for failing to deploy a security-focused pre-release checklist.

This is also an opportunity, Carstens says. "It's such a cheap thing for a vendor to do, there's a guaranteed effect. $50,000 for a consultant can fix this for a product happens to be really cheap right now to lead the pack."

Carstens hopes that the transparency at scale created by the CITL's ratings database will give software vendors new incentives to get their security together. "Only when the cost of poor security outweighs the cost of other incentives will that be built into the process."

Stories by J.M. Porup
