In mid-2016, I unexpectedly found myself on the leadership team for the OWASP Top Ten. It is hard to believe that I have been working on the project for more than five years, but it has been a great experience. A lot happened during that time, culminating in today’s release of the 2021 OWASP Top Ten. Contrast Security asked me to pen a blog post on the process, data, and analysis behind the release.
TRANSITION AT OWASP IN 2016
2016 was a year of transition for the Open Web Application Security Project (OWASP), and for the Top Ten project itself, with new leadership appointed for both. For the Top Ten, the previous leadership asked Andrew van der Stock to serve as a co-lead for the project. He subsequently recruited Neil Smithline, Torsten Gigler, and me to assist with the effort.
Andrew thought of me because of my obsession with data analysis. Preliminary data for the 2017 release had already been published when the leadership change took place, and I stayed up far too late on far too many nights crunching those numbers as an interested observer. I published my findings in a couple of blog posts, and Andrew realized that my analysis was more extensive than what anyone else had done. So the team asked me to join them, and the four of us have managed the Top Ten ever since. Last year, Andrew became the executive director of OWASP while remaining one of the co-leads for the Top Ten and the Application Security Verification Standard (ASVS).
WORKING WITH NARROWER RESEARCH PARAMETERS IN 2017
When our team started its work, we had a short timeline to finish the 2017 Top Ten. We had the data from the 2016 data call, which asked respondents to send frequency-based telemetry data related to 35 specific Common Weakness Enumerations (CWEs). We decided to reopen the 2016 data call to try to obtain additional data on those 35 CWEs. We were excited that our updated 2017 dataset gave us telemetry from a total of 114,000 applications.
While we stuck with the prescribed CWEs in the second data call, we did change the type of data we collected. Specifically, rather than measuring how frequently each CWE occurred, we looked at the incidence rate. Frequency-based data collection methods can overplay the risk of vulnerabilities that are easy to test for and tend to occur frequently in a single application, such as cross-site scripting (XSS). As a result, we thought a metric that measures what percentage of applications are impacted by one or more instances of a vulnerability would be a better measure of risk in a larger population.
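To make the difference between the two metrics concrete, here is a minimal sketch in Python. The findings, application IDs, and function names are invented for illustration and are not the project's actual tooling or data. It only shows how one application with many XSS findings can dominate a frequency count while barely moving the incidence rate.

```python
from collections import Counter

# Hypothetical scan results: one (application id, CWE id) pair per finding.
findings = [
    ("app-1", "CWE-79"), ("app-1", "CWE-79"),
    ("app-1", "CWE-79"), ("app-1", "CWE-79"),  # four XSS findings in one app
    ("app-2", "CWE-89"),
    ("app-3", "CWE-89"),
    ("app-4", "CWE-89"),                       # one SQL injection in each of three apps
]

# Every application that was tested, including one with no findings at all.
tested_apps = {"app-1", "app-2", "app-3", "app-4", "app-5"}


def frequency(findings):
    """Count the total number of findings per CWE."""
    return Counter(cwe for _, cwe in findings)


def incidence_rate(findings, tested_apps):
    """Fraction of tested applications with at least one finding per CWE."""
    affected = {}
    for app, cwe in findings:
        affected.setdefault(cwe, set()).add(app)
    return {cwe: len(apps) / len(tested_apps) for cwe, apps in affected.items()}


if __name__ == "__main__":
    print(frequency(findings))
    print(incidence_rate(findings, tested_apps))
```

In this made-up sample, CWE-79 has the higher frequency (four findings against three), yet CWE-89 affects three of the five tested applications while CWE-79 affects only one.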
I believe we did a good job in formulating the 2017 Top Ten, with each category encompassing a small number of CWEs. Focusing on the incidence rate provided a more realistic picture of risk, and there is no doubt that the Top Ten included the worst application security problems we faced at the time. Of course, one problem was the narrow range of CWEs we analyzed. With a broader dataset, more vulnerability types would likely have been represented in the risk categories of the Top Ten.
Another problem with this relatively narrow approach was for corporate learning professionals who were putting together OWASP Top Ten training programs for developers and security team members. With so few CWEs per category, some programming languages had successfully eliminated some of the Top Ten categories within the language or framework—which was a good thing. But trainers still had to cover the entire Top Ten for compliance reasons, leaving some professionals sifting through generic content that was mostly irrelevant to the systems they used at work.
EMBARKING ON AN AMBITIOUS RESEARCH PROJECT
As we prepared for this release, we all agreed that the research should be based primarily on incidence rate and that our data call should request telemetry on any and all CWEs that each organization tracks. We also asked for data from any source, which contributing organizations could optionally label as being from a security tool, from human analysis, or some combination of the two.
But we knew that telemetry data alone would be insufficient to identify current and emerging threats. So while the data call was open, we simultaneously conducted a community survey of security and development professionals to get their perspective about what keeps them awake at night. Such a survey was also conducted for the 2017 release, which was especially valuable to us as we deliberated on emerging risks.
PUSHING THE RELEASE INTO 2021
When the COVID-19 pandemic struck, the world of work was transformed overnight, and we quickly realized that organizations contributing data had more pressing priorities than voluntarily compiling data for OWASP. Similarly, development and security teams that would use the Top Ten—and organizations that would create training curriculum on the new framework—may not have had the bandwidth to address a new Top Ten in 2020.
We understood this and made a deliberate decision to be patient: We would accept organizations’ data and survey responses whenever they had the bandwidth to provide them. We also informed stakeholders that the new release of the OWASP Top Ten would occur in 2021 rather than 2020, and began making plans to release it in alignment with OWASP’s 20th anniversary celebration.
When we finally closed the data call, we had collected more than 500,000 different application records from 13 contributing organizations, 12 of which are application security companies. We were surprised and gratified, as this was more than four times as many records as the 114,000 we received for the 2017 Top Ten. The dataset was also much broader, with close to 400 CWEs tracked compared with 35 for the previous release. We were also pleased that the community survey was completed by nearly 450 professionals, resulting in a broad picture of the state of application security today.
DELIBERATING ON CATEGORIZATIONS
The unprecedented breadth and depth of the data we received transformed the way we did our work. Crunching the telemetry data was a much more complex endeavor, and unlike in 2017, we knew that some of the CWEs would make it into the Top Ten and some would not. This made categorization considerably harder than it was for the prior release.
Another challenge (in a good way) was the diversity of the leadership team. Andrew, Neil, Torsten, and I have different backgrounds and views on application security. The result was seemingly endless debate about how different vulnerabilities should be categorized and which ones should be included in the Top Ten. This is one of the big challenges with a large, global dataset: individuals will likely have experiences and perspectives that do not directly align, because of the clients, technologies, languages, and other variables they encounter.
FOCUSING ON CORE PRINCIPLES
We did employ several principles as we deliberated on the data and survey findings. One principle was that we should not rename a category just for the sake of renaming it. Rather, the name of each category should reflect the breadth of what is included in it. In some cases, items that we added to a category made a renaming necessary, and readers will notice that many of the categories were renamed as their scope was adjusted.
We also agreed that we would make a concerted effort to categorize vulnerabilities according to the root of the problem rather than the effect, although we recognized this would be impossible in some cases. In healthcare parlance, it is the difference between “My arm hurts” and “My humerus is fractured.” However, in cases like Sensitive Data Exposure, the industry is still in the “My arm hurts” phase.
Finally, we agreed that the 2021 OWASP Top Ten would be data-driven, but not obsessively so. This is where the community survey comes in. For example, the survey findings tipped us off to the threat posed by Server-Side Request Forgery (SSRF), which is underrepresented in our data, possibly due to lagging tests or other factors. Further, when SSRF vulnerabilities are exploited, the risk is significant. As a result, we included SSRF in the Top Ten even though the telemetry data on its own did not justify that ranking. Security Logging and Monitoring Failures is in a similar boat: it is an area where the security community is in a position to tell us something the data may not be able to yet.
We knew we had to cut off the discussions at some point, and we were able to agree collaboratively on everything—even in cases where some of us disagreed with a specific action. In the end, the final product is better than any one of us could have done by ourselves.
DRIVING AWARENESS OF APPLICATION SECURITY RISK
I am proud of the 2021 OWASP Top Ten because it provides a broader and more complete picture of application security risk. On average, each category in the Top Ten now includes 20 CWEs, and the focus is on more strategic categories of risk rather than on individual vulnerabilities. Finally, our research is based more on the incidence rate than on frequency, which brings a clearer focus on the threats each application faces.
The intent is for the OWASP Top Ten to serve as an awareness document. One audience for our awareness efforts is the creators and maintainers of the programming languages and frameworks that developers use in their everyday work. Our greatest hope is that they will be motivated to solve as many of these problems as possible at their level by making it exceedingly difficult or impossible to introduce a specific vulnerability into an application. It would be great if some elements of the 2021 Top Ten are no longer a problem when the next iteration of the Top Ten is released. That is ultimately how we believe application security maturity is best advanced.