MITRE Engenuity has released its 2023 ATT&CK evaluations, examining how top cybersecurity vendors detect and prevent sophisticated cyberthreats. This year, the evaluations focused on the techniques of Turla, a Russia-based threat group.
Turla uses a command-and-control network as well as open source tools, which are harder to defend against and easier to exploit because anyone can edit, and therefore abuse, the code.
This year’s MITRE analysis tested vendors’ ability to detect two scenarios called SNAKE and CARBON. MITRE used multiple offensive security tools, including Keylogger and Mimikatz, to launch attacks on vendors’ environments. The vendors were also tested on protection capabilities, undergoing 13 tests, some with many steps, to see at which step they could halt an attack.
MITRE’s detection and protection evaluations have usually attracted endpoint security vendors, with the detection evaluations best suited for endpoint detection and response (EDR) products and the protection tests focusing on the abilities of endpoint protection platforms (EPP). Vendors offering both EDR and EPP capabilities for Windows and Linux are able to participate in more steps of an evaluation than vendors with more limited offerings. Over time, security vendors whose primary strengths lie elsewhere have increasingly participated in the respected program.
We encourage security buyers to research these vendors, including their MITRE scores over time, before making a purchase. Our analysis offers one helpful angle on the MITRE evaluations, but as we noted in our analysis of last year’s results, your organization will need to test security products in your own infrastructure before you know whether they will work for you. Digging into the details of the MITRE tests can also tell you a lot about how a product might perform in your environment.
How to Analyze This Year’s MITRE Results
The MITRE results were separated into two categories: detection (SNAKE and CARBON scenarios) and protection (13 tests of a product’s ability to stop an attack).
The detection evaluation involved 143 total steps; for vendors that skipped the Linux tests, that number drops to 132. To calculate the detection scores, we divided the number of successfully detected steps by 143 (or 132).
The detection tests yield an analytics score, a telemetry score, and a visibility score. Analytics coverage not only detects the threat but also tags it with the standard MITRE ATT&CK identifier. Telemetry coverage is the collection of raw data about a threat event, not necessarily with context. Visibility coverage is the overall number of tested steps the vendor successfully detected. We cover only the visibility score in our analysis of MITRE testing.
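As a minimal sketch of that arithmetic (the function and example inputs here are ours, not MITRE’s; only the step totals of 143 and 132 come from the evaluation):

```python
def visibility_score(detected_steps: int, skipped_linux: bool = False) -> float:
    """Visibility score as described above: successfully detected steps
    divided by the steps in scope (143 overall, or 132 if the vendor
    skipped the Linux tests), expressed as a percentage."""
    total_steps = 132 if skipped_linux else 143
    return round(100 * detected_steps / total_steps, 2)

# A vendor detecting 131 of the full 143 steps scores 91.61%.
print(visibility_score(131))                       # 91.61
# A vendor that skipped Linux and detected 118 of 132 scores 89.39%.
print(visibility_score(118, skipped_linux=True))   # 89.39
```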
Cisco’s and Check Point’s detection and protection scores weren’t recorded due to technological issues, according to MITRE.
The protection component consists of 13 tests, evaluating which vendors can stop all 13 attack sequences and how quickly they can do so. Most of the tests had multiple attack steps. Protection tests were optional, and some vendors, including Rapid7 and WithSecure, did not participate.
On the protection side, this year’s evaluations revealed some common threads, with a few tests proving particularly hard for multiple vendors. Many vendors missed test three, including Fortinet, Bitdefender, and Sophos; Malwarebytes couldn’t complete it either.
Many struggled with test seven as well. Fortinet eventually completed it, but only after many steps, as did Bitdefender. VMware Carbon Black didn’t complete test seven, and Tehtris missed every step of it. A few vendors ran into trouble with test 13 too.
Palo Alto Networks had a perfect score, detecting all 143 detection steps and stopping all 13 protection tests on the first step. Three other vendors (Microsoft, CrowdStrike, and Cybereason) also detected all 143 steps and stopped all 13 protection attacks, but each missed a small number of steps before halting the threat.
A number of other vendors had strong showings, but the results for many vendors left room for improvement, especially on the protection end. Many high-profile security vendors failed to stop multiple protection tests. Product teams typically use the tests to improve their offerings, so participation is always a net positive.
Vendors have noted a number of caveats about the evaluations: the detection tests can potentially be gamed by setting detection sensitivity levels so high that they would flood real-world deployments with false alerts, and some vendors have said they had to disable key security features to participate. Those caveats make the protection tests the more important of the two, and many welcomed them when they were introduced two years ago.
Detection & Protection Results Were a Mixed Bag
Many of last year’s winners scored high for detection again this year, including Palo Alto, Microsoft, CrowdStrike, and Cybereason. Five vendors received perfect visibility scores in the detection evaluations, and Sophos staged a noteworthy comeback from a middling 2022 result, underscoring that vendors often use the results to improve their products.
The following table gives the overall visibility score for each vendor in the detection tests, from the highest score to the lowest.
| Vendor | Detection visibility score |
| --- | --- |
| CrowdStrike | 100% |
| Cybereason | 100% |
| Cynet | 100% |
| Microsoft | 100% |
| Palo Alto | 100% |
| Sophos | 98.6% |
| Fortinet | 97.9% |
| Bitdefender | 91.61% |
| Deep Instinct | 89.39% |
| SentinelOne | 88.11% |
| Trend Micro | 88.11% |
| Uptycs | 88.11% |
| HarfangLab | 87.41% |
| Malwarebytes | 82.52% |
| BlackBerry | 81.82% |
| Trellix | 81.12% |
| Elastic | 80.42% |
| WatchGuard | 78.79% |
| Qualys | 78.32% |
| Secureworks (Taegis) | 78.32% |
| ESET | 77.62% |
| Symantec | 75.52% |
| AhnLab | 74.24% |
| IBM Security | 72.03% |
| VMware Carbon Black | 72.03% |
| Rapid7 | 70.63% |
| Tehtris | 67.13% |
| WithSecure | 67.13% |
| Somma | 58.74% |
The protection test results this year displayed a wide performance range. Only seven vendors stopped all the tests they faced, the same number as last year, and many vendors missed multiple protection tests entirely. There were 13 tests in total, many with multiple steps. The table below compares three metrics:
- The number of tests, out of 13, that the vendor was able to stop
- The number of tests, out of 13, that the vendor stopped on the first step
- The number of total steps the vendor missed out of all 13 tests (for example, if they stopped one test on the fifth step, they missed four steps for that test alone)
These metrics show different facets of the protection tests rather than a single overall percentage; the short sketch below illustrates how they are tallied.
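As a minimal, hypothetical illustration (the per-test outcomes in the example are invented, not MITRE’s actual data):

```python
from typing import Optional, Sequence

def protection_metrics(stop_steps: Sequence[Optional[int]]) -> dict:
    """Tally the three table metrics from 13 per-test outcomes, where
    each entry is the (1-based) step at which the attack was stopped,
    or None if the vendor never stopped that test."""
    stopped = sum(1 for s in stop_steps if s is not None)
    first_step = sum(1 for s in stop_steps if s == 1)
    # A test stopped at step N means its first N - 1 steps were missed.
    # Unstopped tests would add all of that test's steps to the missed
    # count; we leave those out here because per-test step totals
    # aren't shown in the summary above.
    missed = sum(s - 1 for s in stop_steps if s is not None)
    return {"stopped": stopped, "first_step": first_step, "missed": missed}

# Hypothetical vendor: 11 tests stopped immediately, one at step 5,
# and one never stopped.
print(protection_metrics([1] * 11 + [5, None]))
# {'stopped': 12, 'first_step': 11, 'missed': 4}
```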
| Vendor | Tests stopped | Tests stopped on first step | Steps missed |
| --- | --- | --- | --- |
| Palo Alto | 13/13 | 13/13 | 0 |
| CrowdStrike | 13/13 | 12/13 | 4 |
| Cybereason | 13/13 | 12/13 | 1 |
| Microsoft | 13/13 | 12/13 | 2 |
| SentinelOne | 13/13 | 12/13 | 3 |
| Trend Micro | 13/13 | 10/13 | 3 |
| Symantec | 13/13 | 12/13 | 1 |
| Cynet | 12/13 | 9/13 | 7 |
| Fortinet | 12/13 | 8/13 | 43 |
| Bitdefender | 12/13 | 11/13 | 14 |
| Deep Instinct | 12/13 | 10/13 | 5 |
| AhnLab | 12/13 | 11/13 | 5 |
| Sophos | 11/13 | 11/13 | 10 |
| BlackBerry | 11/13 | 10/13 | 5 |
| Trellix | 11/13 | 11/13 | 10 |
| Elastic | 11/13 | 11/13 | 4 |
| ESET | 10/13 | 8/13 | 27 |
| VMware Carbon Black | 10/13 | 4/13 | 18 |
| WatchGuard | 9/13 | 3/13 | 25 |
| IBM Security | 9/13 | 1/13 | 40 |
| Uptycs | 8/13 | 3/13 | 41 |
| Malwarebytes | 7/13 | 2/13 | 38 |
| Tehtris | 7/13 | 5/13 | 46 |
Palo Alto stopped all the tests, once again earning our confidence as the top overall cybersecurity vendor. Symantec and Cybereason did particularly well here too. Malwarebytes and Tehtris were the lowest performers on the protection side, each stopping only 7 of 13 tests. And while Fortinet stopped 12 of the tests, it missed a sizable 43 attack steps along the way. The deeper intruders get into your environment, the more damage they can do.
Bottom Line: MITRE Scores Are Valuable, But Not Everything
MITRE evaluations are far from easy for security vendors, and that difficulty makes them particularly valuable in a market where buyers don’t have much visibility. Even vendors that choose to undergo only the detection tests, like Rapid7, deserve credit for pursuing excellence in a field, such as EDR, that they aren’t best known for.
We encourage you to study the MITRE results for yourself if you’re interested in knowing more or are considering making a purchase from one of these vendors. Our interpretation is just one method of looking at the data.
Threats only grow more sophisticated over time, and security providers face the difficult task of keeping up with threat actors’ ingenuity. That makes the MITRE evaluations one of the best available learning tools for both security buyers and vendors.