Monday, September 21, 2020

DDLC - Detection Development Life Cycle

Dr. Chuvakin has recently delivered another great blog post about "detection as code". I was glad to read it because it was the typical discussion we used to have in our brainstorming conversations at Gartner. It had a nice nostalgic feeling :-). But it also reminded me of my favorite paper from those times, "How To Develop and Maintain Security Monitoring Use Cases".

That paper describes a process framework for organizations to identify and develop use cases for security monitoring. It was intentionally designed to be tool neutral, so it could be used to develop SIEM rules, IDS signatures or any other type of content used by security monitoring tools. It was also built to mimic Agile development processes, to avoid the capital mistake of letting too much process kill the agility required to adapt to threats. I had fun discussions with great minds like Alex Sieira and Alex Teixeira (what's this with Alexes and security?) when developing some of the ideas for that paper.

Reading the philosophical musings from Anton on "detection as code" (DaaC?), I realized that most of threat detection is code already. All the "content" covered by our process framework is developed and maintained as code, so I believe we are quite close, from a technology perspective, to DaaC. What I think we really need is a DDLC - Detection Development Life Cycle. In retrospect I believe our paper would have been more popular if we had used that as a catchy title. Here's a free tip for the great analysts responsible for future updates ;-)
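To make that concrete, here is a minimal sketch of what detection content as code can look like: a rule expressed as a plain, reviewable artifact that lives in version control and moves through a pipeline like any other software. The rule format, field names and matching logic below are entirely made up for illustration; they are not tied to Securonix or any other product.

```python
# detection_rules/suspicious_service_logon.py
# Hypothetical example of detection content maintained as code: a rule defined
# as plain data plus a small evaluation function, so it can be version
# controlled, peer reviewed and promoted through a pipeline.

RULE = {
    "id": "DET-0042",
    "title": "Interactive logon by service account",
    "severity": "high",
    # Telemetry the rule depends on; used later to check prerequisites.
    "required_telemetry": ["windows_security_events"],
    "logic": {
        "event_id": 4624,         # Windows: successful logon
        "logon_type": 2,          # interactive logon
        "account_prefix": "svc_"  # naming convention for service accounts
    },
}


def matches(rule: dict, event: dict) -> bool:
    """Evaluate the rule logic against a single normalized event."""
    logic = rule["logic"]
    return (
        event.get("event_id") == logic["event_id"]
        and event.get("logon_type") == logic["logon_type"]
        and event.get("account", "").startswith(logic["account_prefix"])
    )
```

Whether the real format is a vendor DSL, Sigma-style YAML or something home grown, the point is the same: the artifact can be diffed, reviewed and rolled back.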

Anyway, I believe a few things are still missing before we get to real DaaC and a DDLC. Among them:
  • Testing and QA. We suck at effectively testing detection content, and most detection tools offer little help with it. Meanwhile, the software development world has robust processes and tools to test what is developed. There are, however, some interesting steps in that direction for detection content. BAS tools are becoming more popular and more integrated with detection tools, so the development of new content can be connected to testing scenarios performed by those tools, just like automated test cases for applications, but for detection content (a rough sketch of what such a test could look like follows this list). Proper staging of content from development to production must also be possible. Full UAT or QA environments are not very useful for threat detection, as it's very hard and expensive to replicate the telemetry flowing through production systems just for testing. But the production tools can have embedded testing environments for content. The Securonix platform, for example, has introduced the Analytics Sandbox, a great way to test content without messing with existing production alerts and queues.
  • Effective requirements gathering processes. Software development is plagued by developers envisioning capabilities and driving the addition of new features on their own. It's a well-known problem in that realm, and roles and practices have been developed to move the gathering of requirements to the real users of the software. Does it work for detection content? I'm not sure. We see "SIEM specialists" writing rules, but are they writing rules that generate the alerts the SOC analysts are looking for? Or looking for the activities the red team has performed in their exercises? Security operations groups still operate with loosely defined roles, and in many organizations the content developers are the same people looking at the alerts, so the problem may not be evident to everyone. But as teams grow and roles become more distributed, it will become a big deal. This is also important when so much content is provided by the tool vendors or even by dedicated content vendors. Some content does not need direct input from each individual organization; we do not have many opportunities to provide our requirements to OS developers, for example, but OS users' requirements are generic enough to work that way. Detection content for commodity threats is similar. But when dealing with threats more specific to the business, the right people to provide the requirements must be identified and connected to the process. Doing this continuously and efficiently is challenging, and very few organizations have consistent practices for it.
  • Finally, embedding the toolset and infrastructure into the DDLC to make it really DaaC. Here's where my post is very aligned with what Anton initially raised. Content for each tool is already code, but the setup and placement of the tools themselves is not. There's still a substantial amount of manual work to define and deploy log collection, network probes and endpoint agents. And that setup is usually brittle, static and detached from content development. Imagine you need to deploy some network-based detection content and find out there's no traffic capture set up for that network; someone will have to go there and add a tap, or configure something to start capturing the data you need for your content to work. In more traditional IT environments the challenge is still considerable, but as we move to cloud, DevOps-managed environments, these prerequisite settings can also be incorporated as code in the DDLC (the test sketch below includes a naive prerequisite check as an illustration).
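To illustrate the testing point from the first bullet, and the prerequisite point from the last one, here is a minimal pytest-style sketch that assumes the hypothetical rule module above. The simulated events stand in for what a BAS scenario would generate; in a real pipeline they would come from the BAS tool or a recorded event corpus, and the telemetry inventory would come from the deployment tooling rather than a hard-coded set.

```python
# tests/test_suspicious_service_logon.py
# Sketch of automated testing for detection content: fail the build if the
# rule stops firing on simulated malicious activity (regression), fires on
# benign activity (false positive), or depends on telemetry that is not
# actually being collected.

from detection_rules.suspicious_service_logon import RULE, matches

# Telemetry sources the deployment pipeline reports as provisioned (stubbed).
AVAILABLE_TELEMETRY = {"windows_security_events", "dns_logs"}

# What a BAS scenario would generate vs. normal user activity.
MALICIOUS_EVENT = {"event_id": 4624, "logon_type": 2, "account": "svc_backup"}
BENIGN_EVENT = {"event_id": 4624, "logon_type": 2, "account": "jdoe"}


def test_prerequisite_telemetry_is_available():
    # The "infrastructure as code" angle: catch a blind rule before deploying it.
    missing = set(RULE["required_telemetry"]) - AVAILABLE_TELEMETRY
    assert not missing, f"missing telemetry sources: {missing}"


def test_rule_detects_simulated_attack():
    assert matches(RULE, MALICIOUS_EVENT)


def test_rule_ignores_benign_activity():
    assert not matches(RULE, BENIGN_EVENT)
```
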
There's still a lot to be done before full DaaC and a comprehensive DDLC become a reality. But there's a lot of interesting work going on in this direction, pushed by the need for security operations to align with the DevOps environments they have to monitor and protect. Check the Analytics Sandbox as a good example. We'll certainly see more like this coming up as we move closer to the vision of threat detection becoming more like software development.

Friday, September 11, 2020

NG SIEM?

An interesting result from changing jobs is seeing how people interpret your decision and how they view the company you’re moving to. I was happy to hear good feedback from many people regarding Securonix, reinforcing my pick for the winning car in the SIEM race.

But there was a question that popped up a few times that indicates an interesting trend in the market: “A SIEM? Isn’t it old technology?”. No, it is not. It may be an old concept, but definitely not “old technology”.

Look at the two pictures below. What do they show?

[Images: two cars, a classic model and a Tesla]

Both show cars. But can we say the Tesla is “old technology”? Notice that the basic idea behind both is essentially the same: transportation. But that, and the fact they have four wheels, is probably all they have in common. The same is true for the many SIEMs we’ve seen in the market over the past twenty or so years.

Here is the barebones concept of a SIEM:

[Diagram: the barebones concept of a SIEM]

How this is accomplished, as well as the scale of things, has changed dramatically since the ArcSight, Intellitactics and netForensics days. Some of the main changes:
  • Architecture. Old SIEMs were traditional software stacks running on relational databases, with big, complex fat clients for the UI. Compare that with the modern, big-data-powered SaaS systems with sleek web interfaces. Wow!
  • Use cases. What were we doing with SIEMs in the past? Mostly reports, such as “top 10 failed connection attempts” and other compliance-driven reports. Many SIEMs had been deployed as an answer to SOX, HIPAA and PCI DSS requirements. Now, most SIEMs are used for threat detection. Reporting, although still a thing, is far less important than the ability to find the needle in the haystack and provide an alert about it.
  • Volume. SIEM sizing used to be an exercise measured in a handful of EPS and gigabytes. With the need to monitor chatty sources such as EDR, NDR and cloud applications, the measures are orders of magnitude higher. This changes the game in terms of architecture (cloud is the new normal) and also drives the need for better analytics; we can’t handle the old false positive rates with the current base rates of events.
  • Threats. It was so easy to detect threats in the past: it was common to find single events that could be used to detect malicious actions. But attacks have evolved to a point where multiple events must be assessed, both in isolation and together as a pattern, to determine the existence of malicious intent.
  • Analytics. Driven by the changes to threats, volume and use cases, the analytics capabilities of the SIEM have also changed dramatically. While old SIEMs would give us some regex capabilities and simple AND/OR correlation, modern solutions will do that and far, far more: enriched data is analyzed with modern statistics and ML algorithms, providing a way to identify the stealthiest threat actions (the toy contrast sketched below gives a feel for the difference).
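As a toy contrast between the two styles, the sketch below compares a single fixed threshold (old-school correlation) with a per-user statistical baseline. The data, field names and thresholds are invented for illustration and imply nothing about how any specific product works.

```python
# Toy illustration of the analytics shift: one static threshold for everyone
# vs. scoring each user's activity against that user's own history.

import statistics

# Daily MB uploaded per user: a short history plus today's observation.
HISTORY = {"alice": [20, 25, 22, 24], "bob": [300, 310, 295, 305]}
TODAY = {"alice": 180, "bob": 308}


def fixed_threshold_alerts(today, threshold_mb=400):
    """Classic correlation-style rule: alert if upload exceeds a fixed threshold."""
    return [user for user, mb in today.items() if mb > threshold_mb]


def baseline_alerts(history, today, z_cutoff=3.0):
    """Alert when today's value deviates strongly from the user's own baseline."""
    alerts = []
    for user, mb in today.items():
        mean = statistics.mean(history[user])
        stdev = statistics.pstdev(history[user]) or 1.0  # avoid divide-by-zero
        if (mb - mean) / stdev > z_cutoff:
            alerts.append(user)
    return alerts


print(fixed_threshold_alerts(TODAY))    # [] -> alice's 180 MB spike stays under the global threshold
print(baseline_alerts(HISTORY, TODAY))  # ['alice'] -> flagged against her own baseline; bob stays quiet
```

In this toy example the fixed threshold either misses alice's spike or, if lowered enough to catch it, drowns the analysts in alerts on bob's routine uploads; the per-user baseline flags the spike without touching bob.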

With all that in mind, does it still make sense to call these new Teslas of threat detection a “SIEM”? Well, if we still call a Tesla a car, why not keep the SIEM name?

However, differentiating between the old rusty SQL-based tools and the advanced analytics SaaS tools of modern days is also important. In my previous life as an analyst I would frequently laugh at the “Next Gen” fads created by vendors trying to differentiate. But I also have to say it was useful to provide a distinction between the old firewall and what we now call the NGFW. People know the implied difference in capabilities when we say NGFW. With that in mind, I believe saying NG-SIEM is not really a bad thing, if you consider all those differences I mentioned before. Sorry Gartner, I did it! :-)

So, old SIEM dead, long live the NG-SIEM? No, I don’t think we need to go that far. But in conversations where you need to highlight the newer capabilities and more modern architecture, it’s certainly worth throwing the NG in there.

Tesla owners can’t stop talking about how exciting their cars are. For us cybersecurity nerds, deploying and using a next-gen SIEM gives a similar thrill.