DDLC - Detection Development Life Cycle
Dr. Chuvakin has recently delivered another great blog post about "detection as code". I was glad to read it because it was the typical discussion we used to have in our brainstorming conversations at Gartner. It had a nice nostalgic feeling :-). But it also reminded me of my favorite paper from those times, "How To Develop and Maintain Security Monitoring Use Cases".
That paper describes a process framework for organizations to identify and develop use cases for security monitoring. It was intentionally designed to be tool neutral, so it could be used to develop SIEM rules, IDS signatures or any other type of content used by security monitoring tools. It was also built to mimic Agile development processes, to avoid the capital mistake of letting too much process kill the agility required to adapt to threats. I had fun discussions with great minds like Alex Sieira and Alex Teixeira (what's this with Alexes and security?) when developing some of the ideas for that paper.
Reading the philosophical musings from Anton on "detection as code" (DaaC?), I realized that most of threat detection is code already. All the "content" covered by our process framework is developed and maintained as code, so I believe we are quite close, from a technology perspective, to DaaC. What I think we really need is a DDLC - Detection Development Life Cycle. In retrospect I believe our paper would have been more popular if we had used that as a catchy title. Here's a free tip for the great analysts responsible for future updates ;-)
Anyway, I believe there are a few things missing to get to real DaaC and DDLC. Among them:
Testing and QA. We suck at effectively testing detection content. Most detection tools have no capabilities to help with it. Meanwhile, the software development world has robust processes and tools to test what is developed. There are, however, some interesting steps in that direction for detection content. BAS (breach and attack simulation) tools are becoming more popular and better integrated with detection tools, so the development of new content can be connected to testing scenarios performed by those tools. Just like automated test cases for apps, but for detection content. Proper staging of content from development to production must also be possible. Full UAT or QA environments are not very useful for threat detection, as it's very hard and expensive to replicate the telemetry flowing through production systems just for testing. But the production tools can have embedded testing environments for content. The Securonix platform, for example, has introduced the Analytics Sandbox, a great way to test content without messing with existing production alerts and queues.
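To make the "automated test cases, but for detection content" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration: the Event type, the brute_force_then_success rule, the thresholds and the scenarios. It is not the API of any BAS tool or SIEM; in a real pipeline the scenarios would more likely come from replayed telemetry or from a simulation tool than from hand-written events.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical, heavily simplified authentication event."""
    user: str
    action: str      # "login_failed" or "login_success"
    timestamp: int   # seconds since an arbitrary start


def brute_force_then_success(events: Iterable[Event],
                             threshold: int = 5,
                             window: int = 300) -> List[str]:
    """Illustrative rule: alert on a user who logs in successfully after at
    least `threshold` failed logins within the previous `window` seconds."""
    alerts: List[str] = []
    failures: Dict[str, List[int]] = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Keep only this user's failures that are still inside the window.
        recent = [t for t in failures.get(ev.user, [])
                  if ev.timestamp - t <= window]
        if ev.action == "login_failed":
            recent.append(ev.timestamp)
            failures[ev.user] = recent
        elif ev.action == "login_success":
            if len(recent) >= threshold:
                alerts.append(ev.user)
            failures[ev.user] = []   # a success resets the failure counter
    return alerts


def _failures(user: str, times: Iterable[int]) -> List[Event]:
    return [Event(user, "login_failed", t) for t in times]


# Each scenario: (name, synthetic events, alerts the rule is expected to raise).
SCENARIOS: List[Tuple[str, List[Event], List[str]]] = [
    ("fast brute force, then success",
     _failures("alice", range(0, 50, 10)) + [Event("alice", "login_success", 50)],
     ["alice"]),
    ("a couple of typos, then success",
     _failures("bob", [0, 10]) + [Event("bob", "login_success", 20)],
     []),
    ("failures spread outside the window",
     _failures("carol", range(0, 1001, 250)) + [Event("carol", "login_success", 1010)],
     []),
]


def run_scenarios() -> List[str]:
    """Return the names of scenarios where the rule did not behave as expected."""
    return [name for name, events, expected in SCENARIOS
            if brute_force_then_success(events) != expected]
```

Wired into a CI job, a non-empty result from run_scenarios() would block the content from being promoted, the same way a failing unit test blocks a software release.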
Effective requirements gathering processes. Software development is plagued by developers envisioning capabilities and driving the addition of new features. It's a well-known problem in that realm and they have developed roles and practices to properly move the gathering of requirements to the real users of the software. Does it work for detection content? I'm not sure. We see "SIEM specialists" writing rules, but are they writing rules that generate the alerts the SOC analysts are looking for? Or looking for the activities the red team has performed in their exercises? Security operations groups still operate with loosely defined roles and for many organizations the content developers are the same people looking at the alerts, so the problem may not be that evident for everyone. But as teams grow and roles become more distributed, it will become a big deal. This is also important when so much content is provided by the tool vendors or even content vendors. Some content does not need direct input from each individual organization; we do not have many opportunities to provide our requirements to OS developers, for example, but OS users' requirements are generic enough to work that way. Detection content for commodity threats is similar. But when dealing with threats more specific to the business, the right people to provide the requirements must be identified and connected to the process. Doing this continuously and efficiently is challenging and very few organizations have consistent practices to do it.
Finally, embedding the toolset and infrastructure into DDLC to make it really DaaC. Here's where my post is closely aligned with what Anton initially raised. Content for each tool is already code, but the setup and placement of the tools themselves is not. There's still a substantial amount of manual work to define and deploy log collection, network probes and endpoint agents. And that setup is usually brittle, static and detached from content development. Imagine you need to deploy some network-based detection content and find out there's no traffic capture setup for that network; someone will have to go there and add a tap, or configure something to start capturing the data you need for your content to work. With more traditional IT environments the challenge is still considerable, but as we move to cloud and DevOps-managed environments, these prerequisite settings can also be incorporated as code in the DDLC.
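One way to picture prerequisites living next to the content: each detection declares the telemetry it depends on, and a pipeline step compares that against what is actually being collected before the content ships. The sketch below is only an illustration under my own assumptions; the Detection type, the source names such as "netflow:dmz" and the missing_prerequisites function are all hypothetical, and in a cloud environment the "deployed" set would be derived from the infrastructure-as-code definitions rather than typed by hand.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection content plus the telemetry it needs to work."""
    name: str
    requires: FrozenSet[str]


def missing_prerequisites(detections: Iterable[Detection],
                          deployed: Set[str]) -> Dict[str, List[str]]:
    """Map each detection to the telemetry sources it needs but that are
    not currently collected. Detections with no gaps are left out."""
    gaps: Dict[str, List[str]] = {}
    for detection in detections:
        missing = sorted(detection.requires - deployed)
        if missing:
            gaps[detection.name] = missing
    return gaps


# Illustrative inventory: what is collected today (assumed names).
DEPLOYED = {"dns_logs", "endpoint:edr"}

# Illustrative content: what each detection says it needs.
CONTENT = [
    Detection("dns_tunneling", frozenset({"dns_logs", "netflow:dmz"})),
    Detection("suspicious_powershell", frozenset({"endpoint:edr"})),
]

GAPS = missing_prerequisites(CONTENT, DEPLOYED)
```

Here GAPS flags that dns_tunneling cannot work until traffic capture exists for the DMZ, which is exactly the "no tap on that network" surprise described above, caught at build time instead of after deployment.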
There's still a lot to do to make full DaaC and a comprehensive DDLC a reality. But there's a lot of interesting stuff going on in this direction, pushed by the need for security operations to align with the DevOps environments that need to be monitored and protected. Check the Analytics Sandbox as a good example. We'll certainly see more like this coming up as we move closer to the vision of threat detection becoming more like software development.