Re: What is the future of CVE - Scope, Volume & Quality?
To add to this a bit, I am including my response to the IVDA paper. It is a position piece I wrote for internal discussions and for the discussions a set of vendors is having about the effects the IVDA proposal could have.
There is an effort just underway to address some of the issues that were listed in the paper. More on that a bit later.
Director Content Strategy, Architecture and Standards
5000 Headquarters Dr.
Plano, Texas 75024
From: David Mann <firstname.lastname@example.org>
Date: Tue, 13 Sep 2011 14:33:51 -0500
To: cve-editorial-board-list <firstname.lastname@example.org>
Subject: What is the future of CVE - Scope, Volume & Quality?
WHAT IS THE FUTURE SCOPE OF CVE?
The CVE Team
In June 2011, a paper from China was circulated that calls for the establishment of an International Vulnerability Database Alliance ("IVDA: International Vulnerability Database Alliance," Chen Zheng et al). The MITRE CVE team has been giving this paper careful consideration and while the paper raises several issues with CVE, we believe it raises two important, high priority questions that the CVE Editorial Board should consider and respond to:
Q1: Can CVE effectively keep up in the face of an increasing volume of English-based disclosures?
Q2: What relationship should CVE have with any international effort (such as IVDA) to identify vulnerabilities disclosed in non-English based markets?
To be sure, there are other questions to be asked and answered, but we feel these two are the most pressing. And while these questions are related, we believe that the Editorial Board should consider them in the order above. Regardless of what happens internationally, CVE is confronted with real issues of scope, volume and response time. How the CVE Editorial Board decides to deal with the issues of scope, volume and response time will likely inform our position relative to efforts to further develop and operate CVE.
If you are confident that you understand how CVE is currently operating, please feel free to jump ahead to the questions and respond to them. Otherwise, we'll begin with a brief overview of our understanding of current CVE operations and scope, to ensure that everyone is starting with the same set of basic assumptions.
Please note, the CVE Team anticipates a day when CVE won't be able to remotely stay abreast of "all publicly disclosed" vulnerabilities. We take as an example of this how the tracking of malware samples has changed in the face of a changing malware threat. The basic thrust of the two questions is to help us focus our resources on the most important issues.
CVE CURRENT OPERATIONS - SHORT OVERVIEW
CVE aspires to assign identifiers to all publicly known vulnerabilities (where "publicly known" has traditionally been taken to mean "disclosed in English-language disclosures"). CVEs are based on 3 primary sources of information. First, the largest number of CVEs is produced based on information pulled from web sites, vulnerability databases and mailing lists. The vast majority of this information is gathered using web spidering capabilities and is generally done in coordination with the producers of the information. The second source of information is the CVE Candidate Naming Authorities (CNAs), who are trained in how to assign CVE IDs and how to include them in their advisories. It is important to note that in nearly all cases, the CVE team learns of CNA-issued CVE IDs in the same way that the rest of the world does -- we pull the information from the CNA's web site. In this way, CNA advisories are treated just like all of the other information sources we monitor. Lastly, there are occasions in which CVEs are assigned in a pre-disclosure context. While the CVE team is not an emergency response coordination center, it is sometimes the case that communities involved in pre-disclosure coordination benefit from using pre-disclosure CVE IDs.
In nearly all cases, new CVE information begins with the gathering of one or more related disclosures, which we call references. The first analytical question the CVE team asks is whether or not any of the references relate to a CVE that already exists in the CVE corpus. This analysis is based primarily on keyword searching, which means that the words that are chosen when writing CVE descriptions are vitally important. If the reference relates to an existing CVE, we add the new reference to the existing references associated with that CVE and, if needed, update the language of the description based on the new information. If we conclude that the reference relates to a new issue or issues, we first determine how many vulnerabilities are being discussed and then create new CVE entries for each. For each newly created CVE, careful consideration is given to writing the description to help ensure that the new CVE can be found when tossed into a haystack with approximately 50,000 other entries.
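As a rough illustration of why description wording matters for this keyword-based check, here is a minimal sketch in Python. It is not the CVE team's actual tooling; the function names, the stopword list, the overlap threshold and the sample entries (including the placeholder IDs) are all assumptions made up for the example.

```python
import re

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "the", "in", "of", "to", "and", "via", "with", "allows"}


def keywords(text):
    """Return the set of lowercase tokens in text, minus stopwords."""
    tokens = (t.strip(".") for t in re.findall(r"[a-z0-9_.]+", text.lower()))
    return {t for t in tokens if t and t not in STOPWORDS}


def candidate_matches(reference_text, corpus, min_overlap=3):
    """List (cve_id, shared_keyword_count) for existing entries that share
    at least min_overlap keywords with the reference, best match first."""
    ref = keywords(reference_text)
    scored = []
    for cve_id, description in corpus.items():
        overlap = len(ref & keywords(description))
        if overlap >= min_overlap:
            scored.append((cve_id, overlap))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up entries with placeholder IDs, for illustration only.
corpus = {
    "CVE-0000-0001": "Buffer overflow in ExampleFTP server 1.2 allows remote "
                     "attackers to execute arbitrary code via a long USER command.",
    "CVE-0000-0002": "Cross-site scripting in ExampleBlog 3.0 allows remote "
                     "attackers to inject script via the comment parameter.",
}
reference = "ExampleFTP 1.2 remote buffer overflow with long USER command"
matches = candidate_matches(reference, corpus)
```

An empty result would suggest the reference describes a new issue; a non-empty one gives an analyst a short list of existing entries to compare against by hand.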
When we launched CVE many years ago, Stephen Northcutt endorsed the effort saying something along the lines that it was a good step forward and would be really useful when it had 1000 entries. Stephen was right! CVE wasn't keeping up and we've been trying to catch up ever since that first day. While we aspire to cover all publicly available English-based disclosures, our best estimates are that, as of this writing, we end up covering about 35% of all references we monitor and between 65% and 85% of all "high priority" reference sources. This has varied over the years due to a number of factors, primarily the increasing volume and complexity of vulnerabilities and having to manage more "raw" information than in the past. To date, we have responded by instituting processes that attempt to prioritize the processing of disclosures deemed to be the most important.
The CVE team maintains a mostly automated "rolling to-do list" of disclosures to process, whereby the "most important" bubble to the top of the list and the "less important" bubble down. Priority is based on several factors, and some disclosure sources are given higher priority than others. For example, the rolling to-do list gives higher prioritization to references with reserved CVEs and references from major high-priority sources. Also, we work hard to achieve "reference completeness" for an exclusive set of providers, such as US-CERT Bulletins, but not for arbitrary posts to Bugtraq. In addition, disclosures about some software vendors (such as Microsoft) are given higher priority than a disclosure about a php.golf application written by an undergraduate student as a part of his programming class and posted to a blog. ("php.golf" has become something of an internal catchphrase for "stuff we don't care about.")
This basic 2-dimensional prioritization grid (disclosure source on one axis, affected vendor or product on the other) then gives us a basic framework for internal performance goals that breaks out into 4 basic response levels:
1. High priority issues: 2 to 3 days
2. Moderate priority issues: within 2 weeks
3. Low priority issues: as we can get to them
4. Lower than low priority issues - these roll off the list, but we keep
them for possible future use or reference
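The prioritization described above can be sketched as a small sort over disclosures. The four response levels come from the list above; everything else here (the 0/1/2 ranking scale, the rule that adds the two ranks, and the class and function names) is an assumption for illustration and not a description of the team's real system.

```python
from dataclasses import dataclass

# The four response levels listed above.
RESPONSE_LEVELS = {
    1: "high priority: 2 to 3 days",
    2: "moderate priority: within 2 weeks",
    3: "low priority: as we can get to them",
    4: "lower than low: rolls off the list, kept for reference",
}


@dataclass
class Disclosure:
    title: str
    source_rank: int   # assumed scale: 0 = must-have, 1 = nice-to-have, 2 = ignorable
    vendor_rank: int   # same assumed scale, applied to the vendor or product
    has_reserved_cve: bool = False


def response_level(d):
    """Map a cell of the source-by-vendor grid to a response level (assumed rule)."""
    if d.has_reserved_cve:
        return 1  # references with reserved CVEs are given higher priority
    score = d.source_rank + d.vendor_rank
    if score == 0:
        return 1
    if score <= 2:
        return 2
    if score == 3:
        return 3
    return 4


def rolling_todo(disclosures):
    """Most important items bubble to the top; ties keep their arrival order."""
    return sorted(disclosures, key=response_level)


items = [
    Disclosure("php.golf app on a student blog", source_rank=2, vendor_rank=2),
    Disclosure("Bugtraq post about a php.golf app", source_rank=1, vendor_rank=2),
    Disclosure("US-CERT Bulletin on a Microsoft issue", source_rank=0, vendor_rank=0),
]
ordered = rolling_todo(items)
```

How the two axes should actually combine is exactly what the Response Time question later in this message asks the Board to weigh in on, so the additive rule here is only a placeholder.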
DISCUSSION QUESTIONS REGARDING SCOPE, VOLUME AND RESPONSE TIME
As noted at the beginning, there are two questions that we would like the CVE Editorial Board to consider and respond to:
1. Can CVE effectively keep up in the face of an increasing volume of English-based disclosures?
KL: I think it has been shown that we cannot keep up without incrementally increasing the level of support CVE has. We are already restricting what we analyze and are not dealing with the full picture of English-based vulnerabilities. While we have seen a rise in the number of vulnerabilities / malware, has the team grown to keep pace? I don't think so. That leaves us in a situation where the CVE Team is analyzing a smaller and smaller share of vulnerabilities simply due to resource availability.
2. What relationship should CVE have with any international effort (such as IVDA) to identify vulnerabilities disclosed in non-English based markets?
KL: This is what is being developed now. There is a set of vendors putting together a counterproposal for managing vulnerability identification and disclosures from a global perspective. I won't reiterate what is in my opinion paper, but it is safe to say that CVE will need to play a part in that if we are to address the problem appropriately.
The first question is really about CVE's prioritization of issues and our response time. There are several questions we would like to discuss that are central to the "keep up" question.
KL: Personally I think these are the wrong questions to ask at this time. I think we can aggregate all the lists and see what is important to the respondents, and we still will not be solving the problem that needs to be discussed. The problem is: what is the future of CVE from a maturation perspective? How do we mature the effort so that we can put in place a useful vulnerability identification and analysis capability that will survive and continue to be a valuable resource for the next couple of decades? The key word here is analysis. The analytic aspect is an important part of what CVE provides today. It cannot be watered down into nothing more than a simple reporting mechanism for vendor-related disclosures.
KL: Yes, the questions below are important at some point, but we need to figure out what the confines and operating environment for CVE will be before dealing with these types of specifics. These appear to address the issue only in terms of what exists today. They also appear to be aimed at limiting the team to, or sharply focusing it on, a smaller subset of potentially vulnerable software. I do not believe this is the time for that type of thinking. There is movement in the US Federal government and in other governments to try to find a real solution. We need to lead those discussions, and what better place than here.
1. Sources
a. Which sources of vulnerability disclosures should be considered
"must haves" for which we provide "reference complete" coverage?
b. Which sources should be considered "nice-to-haves"?
c. Which sources should be considered "can be safely ignored"
(e.g. they just cause noise)?
2. Coverage
a. Which vendors and software products should we consider "must haves"
in that we will provide coverage for all reliable vulnerability
reports for them?
b. Which products or vendors should be considered "nice-to-haves"?
c. Which ones should be considered "can be safely ignored" (e.g. php.golf)?
3. Response Time
a. How should the answers to the Sources and Coverage questions be
combined to create a tiered priority list for response time?
b. For each tier, what is a reasonable response time?
4. Quality
a. What rate of duplicate CVE entries can be tolerated?
b. How consistent does CVE "counting" need to be relative to past
counting practices and content decisions? ("Counting" here means
the relationship between a given vulnerability and the number of
CVEs needed to correctly describe it and vice-versa. These may be
one-to-one, one-to-many, many-to-one, or many-to-many.)
We believe that questions 1 and 2 form the basis of any rational prioritization for new CVEs. We also believe that questions 3 and 4 need to be considered in tandem. The biggest delays in doing CVE analysis are in "getting things right," both in terms of counting CVEs and in terms of creating descriptions that give future queries a reasonable chance of finding the correct CVE entry and avoiding duplicate entries. We can produce CVE entries faster than the current rate, but one effect could be that we will assign CVE IDs in a less consistent manner and produce more duplicates.
The MITRE CVE team has formulated our own internal answers to these questions but we really need and solicit your input. We appreciate the time you may take to think about and respond to these questions, and thank you for your consideration of the above.
- The CVE Team (Steve Boyle, Steve Christey, Dave Mann)