Friday, May 7, 2021

Professional Certifications, Reboot!

 


Two months and a few hundred dollars later, my most recent personal project is complete. Ten years after my TOGAF9 certification, I decided to play the test taker again and obtain a new batch of professional certifications: AWS Certified Cloud Practitioner, AWS Certified Security Specialty and Microsoft Certified: Azure Fundamentals.

 





I didn't need these certifications for my current job, and I'm not looking for a new one either, so this is not about job requirements or job hunting. So why did I do it?

 

I did it because I could do it :-) Well, ok, let me elaborate a bit. My career has been slowly moving away from more technical roles, and that's reducing my direct, hands-on contact with technology. I don't think this is a bad thing, but as someone with a technical background, I miss that deeper understanding of the things I need to talk and write about.

 

At the same time, I do not have the same drive to learn things that I do not have an immediate need for, learning just for the sake of learning. I still love learning new things, but that youthful drive to build lab after lab to learn things we may never touch as part of our job is just not there anymore. Putting a target like a certification in front of me (and paying for it) seems to be an effective way to trigger my brain into "I need to learn this" mode. I learned many things in the past while preparing for certs, so I thought I could use the same method again. It was nice, and it worked well for this purpose.

 

Cloud adoption keeps growing and cloud is directly affecting the work of anyone in security these days. It's not different for me: I work for a Cloud SIEM vendor, and we are bringing many innovations to the SIEM space that are directly related to cloud. Securonix recently announced "Bring Your Own Cloud", for example, and it is deeply rooted in AWS offerings, so it seemed natural to me that I should put a couple of AWS certs in my project. AWS Cloud Practitioner helped me learn more about the very broad range of offerings from AWS, and the security specialty was useful to provide more depth to my understanding of cloud threats and cloud security controls.

 

In addition to the AWS certs, I wanted to add the (ISC)2 CCSP to the mix as well. I checked the domain of knowledge of the cert, ran through a few practice exams and noticed I already had most of the skills and knowledge required to pass it. So why didn't I do it? Because it's freaking expensive! USD600 is beyond any reasonable justification for a simple multiple-choice exam. Maybe I would take it if I was looking for a job like cloud security architect, or even as a CISO for a company with a strong cloud presence, but just for the fun of doing it? No, I'm sorry, it doesn't make any sense. 

 

An Azure cert was a natural choice to complete my project. AWS and Azure are by far the most visible cloud providers (sorry Google!), so going through the process for both looked like the best choice.

 

There are a few things I noticed during this exercise that I think are worth sharing. First, it confirmed to me that test taking talent is really a thing. I'm a helluva test taker. I'm not bragging; apart from helping me pass tests and exams easily, it doesn't provide me with any real competitive advantage "on the job". I've always been like that and was happy to see I haven't lost it after so many years without sitting for a test. I didn’t spend more than a handful of hours reading for the full project. I don’t feel nervous and even have fun while taking the tests, so everything was a fun experience.

 

But when you are a hiring manager and you see those certs in a resume, it's always important to find out if those certs came from real experience or just from good test taking skills (or worse, memorizing those awful brain dumps).

 

Don't get me wrong, it doesn't mean that certs on a resume mean nothing. Remember, the main reason for me to do it was to force myself into learning something about those technologies. Even if I don't have the hands-on experience the test developers were trying to verify with these exams, I still had to at least read a bit and get a solid understanding of the basic concepts.

 

Talking about basic concepts...if you want to get certs, LEARN THE F* BASICS! You can't believe the number of questions I was able to answer because of basic stuff, not necessarily tied to those specific cloud providers. If you know how crypto works, for example, a lot of the AWS security specialty questions will be very easy to answer. Same thing for networking and network security. AWS security groups, network ACLs, Azure network security groups...those are straightforward to learn when you know those things well. I didn't take the real CCSP, but the practice questions I've done indicate that CISSP-level concepts would put more than half of the way behind you on that one.

 

Finally, some interesting bits about AWS and Azure I was able to notice:

 

·      Knowing the basics of one of those means you have almost all the basics of the other. Key concepts are virtually identical. 

·      The naming convention of Azure is AWESOME. It's very easy to know what products and services do just from their names. They may not sound as sexy as "Athena" or "Glacier", or as high tech as “S3” or “EC2”, but they tell you in a very simple manner what they are about.

·      Both services are evolving so fast it's hard to keep study material, documentation and questions aligned and up to date on the latest offerings. Don't be surprised to see questions about things you didn't see in your study material. Check some of the announcements and blog posts from the past year as part of your study work.

·      AWS Security Specialty is the one I see closest to being "hard". There's really a lot of stuff to cover, at a relatively deep level of detail: networking, crypto, IAM, logging, policy syntax and small idiosyncrasies. I can see how it really tries to assess real experience on AWS security.

 

 

Am I done with the certs now? Maybe, not sure. It may become an expensive hobby :-). Well...those Azure security certs do not look that hard, I still have that 50% off voucher for AWS exams and I really need to spend some time learning about Google cloud ;-)

 

P.S. As I’m doing this during the COVID-19 pandemic, I took these tests in the “online proctored” mode. THEY SUCK! I was expecting those VUE, PSI guys would have learned by now how to do it right. No, their tech and processes are horrible. I had problems during all 3 exams, one with PSI (AWS Practitioner), the worst one, and two with VUE. If you are a person that gets anxious or nervous during the exam, this is definitely not for you. Some of the issues I had to go through would drive many candidates out of their minds and strongly impair their ability to answer the questions.

Friday, April 16, 2021

The Bright Future of Cloud SIEM

TL;DR: People keep questioning SIEM value, but cloud SIEM makes SIEM so much better. SIEM is now capable of delivering a lot of security value with far less effort from security teams.


The SIEM market is a US$5B market with a two-digit annual growth rate. Still, we keep seeing multiple questions and discussions around SIEM’s role, future and value. Why?

 

There are many reasons, including:

  • The high importance of SIEM’s role for security operations: The SIEM is often the foundation of Security Operation Centers and has a critical role in their work. It is natural to see it being constantly evaluated and discussed as it has a role in almost all SOC processes.
  • Cost and budget share: SIEM is not cheap. It usually takes a big chunk of the security budget. Organizations will keep trying to reduce it as part of their cost optimization efforts, while vendors of other technologies will keep trying to sell their products as alternatives to tap into existing SIEM budgets.
  • Operational effort required: SIEM is definitely not a “set and forget” tool. This is not a deficiency per se, as other technologies, such as EDR, also require people to deliver value. But the concern about how much effort must be put into SIEM operations is a constant driver of discussions about improvements or even replacements of this technology.
  • Multitude of experiences: SIEM has been around for more than 20 years. Many professionals have gone through multiple implementations, sometimes with good experiences, sometimes not so much. I’ve seen many people with very strong opinions on SIEM based on their personal experiences with this type of tool, experiences that many times are not representative of how SIEMs can support security initiatives.
  • Evolution of other technologies and of the entire technology landscape: As other technologies evolve, it is inevitable to look at how they impact the role of SIEM. It happened with UEBA, it happened with SOAR, it is happening with XDR. The technology environments where these tools operate are also constantly evolving. Big SAN storage systems came up, virtualization became ubiquitous, big data spread out like wildfire. These changes affect the security tools we use to protect IT environments in multiple ways. Some increased the amount of data to be collected and processed, while others were used to evolve SIEM and make it more scalable and capable.

 

Nothing is more important to those discussions than Cloud SIEM. Not just “hosted” in the cloud, but as a native cloud offering. Why? Because now SIEM vendors can have some control over deployment success. What are you saying, Augusto? Didn’t they have control over the success of their own product before? That’s right, they didn’t!

As a traditional SIEM vendor, it is very hard for you to ensure the customer will be able to get all the benefits your product can provide. First, they may underestimate the required capacity for their environment. They will end up with a sluggish product, overflowing with data, having to deal with adding servers, memory, storage, or even stopping the deployment to rearchitect the whole solution before getting any value from it. I’ve seen countless SIEM deployments dying this way before generating any return on investment.

 

But it doesn’t stop there. They may get the sizing right but underestimate the effort to keep it running. They estimate the number of people to use the SIEM, but they forget that a traditional SIEM requires people not only to use it but also to keep it running. That means people will spend their time keeping servers running, applying patches (to operating systems, middleware and to the SIEM software too), troubleshooting log collection and ensuring storage doesn’t blow up, instead of paying attention to what the SIEM should actually be doing for them. The tool is up and running, but again, not providing any value.


We can see how much the vendor depends on the customer to provide value. And even if the customers do things properly, there are other challenges too. Traditional software allows for high variation of deployments: Customers running on different versions, with different hardware and architecture. How can a vendor distribute SIEM content (parsers, rules, machine learning models, etc) that works in a consistent manner to its customers in this scenario? It just can’t.

Considering these factors, I risk saying that offering a traditional SIEM solution is like the myth of Sisyphus. As much as the vendor tries to deliver value, the solution will eventually fail to achieve the customer objectives. As traditional software, SIEM was really destined to die.

How does the cloud SIEM change this?


First, many challenges in SIEM deployments are related to problems that are completely solved or minimized by the SaaS model. Cloud services are highly scalable and elastic, and SaaS practically eliminates the need to maintain the application and underlying components. Now you have a SIEM that finally scales and does not require an army to keep it running. You can focus on using it appropriately.

Second, a SaaS SIEM puts customers on highly standardized deployments. With most customers running on the same version, without capacity challenges, it’s far easier to deliver content that works for all of them. That makes a huge difference in perceived value. And it doesn’t stop there. With this scenario it becomes easier for the vendor to finally realize the benefits of the “wisdom of the crowds”. Developing more complex ML models for threat detection, for example, becomes easier and more effective. The vendor now has access to more data to train and tune the models. Even simple IOC match detection content can be quickly developed and delivered to all customers, allowing the SIEM vendor to provide detection of new, in-the-wild threats.
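
To make the "simple IOC match" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Event` fields, the `match_iocs` function and the indicator values are all hypothetical, and a real SIEM would do this at scale against normalized data rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A simplified, hypothetical log event."""
    host: str
    dest_ip: str
    file_hash: str


def match_iocs(events, bad_ips, bad_hashes):
    """Return (event, reason) pairs for events that touch a known-bad indicator."""
    hits = []
    for event in events:
        if event.dest_ip in bad_ips:
            hits.append((event, "ip"))
        elif event.file_hash in bad_hashes:
            hits.append((event, "hash"))
    return hits


# Indicator sets a vendor could push to every tenant at once (made-up values).
BAD_IPS = {"203.0.113.10"}
BAD_HASHES = {"deadbeef"}

events = [
    Event("ws-01", "198.51.100.7", "aaaa"),
    Event("ws-02", "203.0.113.10", "bbbb"),
    Event("srv-09", "198.51.100.8", "deadbeef"),
]

for event, reason in match_iocs(events, BAD_IPS, BAD_HASHES):
    print(f"{event.host}: matched {reason} indicator")
```

The point is not the matching logic, which is trivial, but that in a standardized SaaS deployment the same indicator sets and the same function can be shipped to every customer at the same time.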

Finally, delivering any software solution via SaaS gives the developer the opportunity to embrace more agile development practices. Upgrading a traditional SIEM deployment is so complex that vendors would naturally rely on traditional waterfall development practices, generating big releases with long times between them. SaaS SIEM can leverage agile development and CI/CD practices, so new features can be quickly added, and defects quickly fixed.

Cloud SIEM is in its infancy when you consider SIEM is just past its teenage years. But there are so many opportunities to explore with this model that I believe now we can say “Next-Gen SIEM” without feeling silly about it. Be careful with “SIEM is dead” claims. That sounds to me much like "I think there is a world market for maybe five computers", a quote attributed to Thomas Watson in 1943.

Friday, March 19, 2021

Some additional words on those SOC robots

The topic of SOC automation is really a fun one to think about, and even after putting my thoughts into words with my last post, I've kept thinking about it. Some additional considerations came to mind.

The simplistic question of "Will machines replace humans in a SOC" can be clearly answered with a NO, as I explained in my previous post. Just as human attackers are required to evolve the attacking robots, blue team people are required to update the automated defenses.

But things change if the question is asked with some additional nuance. If you ask "will defense actions be automated end to end, from detection to response actions?", it becomes a more interesting question to answer.

The scenario of automated threats that Anton described in his post will, IMO, require SOCs to put together some end-to-end automation. Having a human involved for every response will not scale to face those attacks. Humans will be responsible for creating those playbooks and monitoring their performance, but they cannot be involved in their execution. We need SOC automation that allows us to detect, investigate and initiate response without human intervention. This is challenging, but we must get there at some point.

Andre Gironda commented on the LinkedIn post pointing to my blog post that even with the appropriate tools he still can't fully automate simple phishing response. I could say he's probably being too perfectionist or doing something wrong, but I actually believe him. I believe automation can provide value by reducing human effort in the SOC right now, but full automation, even for some specific threats, is still challenging. But we'll have to get there if we want to stand a chance.





Tuesday, March 16, 2021

The Robots Are Coming!

The debate around SOC automation has been a fun one to follow. Allie Mellen wrote a short but spot-on piece about it, reaffirming what seems to be the commonsense opinion on this topic today: Automation is good, but to augment human capacity, not replace it.

 

After that Anton brought up a very interesting follow up, confirming that view but also pointing to a scary future scenario, where automation would be adopted so extensively by the attackers that it would force defense to do the same. Does this scenario make sense? 

 

I believe it does, and indeed it forces defense to adopt more automation. But even if Anton says the middle ground position is "cheating", I still think it is the most reasonable one. There will never be (until we reach the Singularity) a fully automated SOC, just as there will never be a fully automated attacker (until...you know). Why? Let's look at the scenario Anton painted for this evolved attacker:

 

 

• You face the attacker in possession of a machine that can auto-generate reliable zero day exploits and then use them (an upgraded version of what was the subject of 2016 DARPA Grand Challenge)
• You face the attackers who use worms for everything, and these are not the dumb 2003 worms, but these are coded by the best of the best of the offensive “community”
• Your threat assessment indicates that “your” attackers are adopting automation faster than you are and the delta is increasing (and the speed of increase is growing).

 

 

Even if it looks scary, this scenario is still limited in certain points. You may have malware capable of creating exploits by itself, but what will it exploit? What is this exploitation trying to accomplish? There is an abstract level of actions that is defined by the creator of the malware. Using MITRE ATT&CK language, the malware is capable of generating multiple instances of a selection of techniques, but a human must define the tactics and select the techniques to be used. Quoting Rumsfeld, there will be more known unknowns, but the unknown unknown is still the realm of humans.

 

A few years ago, I had a similar discussion with a vendor claiming that their deep learning-based technology would be able to detect "any malware". This is nonsense. Even the most advanced ML still needs to be pointed to some data to look at. If the signal required to detect something is not in that data, there's no miracle. Let's look at a simple example:

 

• A super network-based detection technology inspects ALL network traffic and can miraculously identify any attack.
• The attacker is on host A in this network, planning to attack host B, connected to the same network
• The attacker scans for Bluetooth devices from host A, finds host B, exploits host B via a Bluetooth exploit
• The super NDR/NIDS tool sits there patiently waiting to see an attack that never traverses the monitored network!
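
The same example can be written as a toy Python sketch. Everything here is hypothetical, including the idea of a "perfect" detector; the only point is that a detector cannot flag what never reaches its sensor.

```python
# Every action the attacker takes, tagged with the channel it travels over.
attack_steps = [
    {"step": "scan for nearby devices from host A", "channel": "bluetooth"},
    {"step": "exploit host B", "channel": "bluetooth"},
]


def perfect_network_detector(observed):
    """A hypothetical detector that flags every malicious action it is shown."""
    return [action["step"] for action in observed]


# The sensor only receives what actually crosses the monitored network.
visible = [a for a in attack_steps if a["channel"] == "network"]

print(perfect_network_detector(visible))       # nothing to flag
print(perfect_network_detector(attack_steps))  # what it would flag with full visibility
```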

 

You may claim this is an edge scenario, but I'm using an exaggerated situation on purpose. There are still many cases that we can relate to, such as breaches due to the use of shadow IT, cloud resources, etc. What I want to highlight is the type of lateral thinking very often employed by attackers in cybersecurity. And that lateral thinking is still exclusive to humans.

 

What I'm trying to say is that fully automated threats are scary, but they lack the main force that makes detecting threats challenging. Defense automation can evolve to match the same level, but both sides will still rely on humans to tip the scale when those machines reach a balance point in capabilities.

 

What we have today is similar to those battling robots TV shows. Machines operated by humans. If things evolve as Anton suggests we will move to what happens in "robot soccer": human created machines operating autonomously, but within a finite framework of capabilities.





Robot wars vs Robot Soccer

 

 

Threats and SOCs will become more automated for sure. As they automate, they become faster, so each side has to increase its own level of automation to keep up. But when automation limits are reached, the humans on the threat side must apply that lateral thinking to find other avenues to exploit. They need to take the Kirk approach to Kobayashi Maru. When this happens, the humans on the defense side become critical. They need to figure out what is happening and create new ways to fight against the new methods.

 



 

 

So, humans will still be necessary on both sides. Of course, the operational involvement will be greatly reduced, again, on both sides. But they will be there, waiting to react against the innovation introduced by their counterparts on the other side.

 

This may be an anticlimactic conclusion, and it is. But there are some interesting follow-up conversations to have. The number of humans required, their skills and how they are engaged will be different. What does it mean for outsourcing? Do end users still need people on their side? If solution providers engage this problem in a smart way, we may be able to remove, or greatly reduce, the need for humans on the end user organization side, for example. The remaining humans would be on the vendor side, adapting the tools to react against the latest attacks. For the end user organization, the result may look very similar to full automation, as they would not need to add their humans to the mix. Will we end up with the mythical "SOC in a box"? The future will tell.

 

Thursday, March 4, 2021

An Analysis of Past Mistakes

 As I was looking for an old email in my archives, I stumbled on discussions about a security incident that happened almost 13 years ago. That was that time when, well, there's no other way of saying it....I was hacked.

The good thing about looking at incidents like that one after a long time is that it helps us understand what really happened and also run a less passionate and less biased assessment of our own actions. I have to say this case is really enlightening, in many ways. There are good lessons to learn and mistakes to acknowledge from multiple perspectives: Technical, Managerial and even Political.
  
The year was 2008. I was part of the Board for the Brazil ISSA chapter. We were trying to push for a more inclusive posture of the association, promoting free monthly encounters and other initiatives. Our group took over the board when we felt there were too many security vendors dominating the association, many of them pulling things to where their business would benefit most. A group of friends and acquaintances discussed this and after some deliberation, I was chosen as the head of the ballot. It was an honor for me at that time, as each one in that group was capable of taking the central role. We won the election using our network and a popular email discussion board at that time to spread our word and our plans for the association.
 
So, back to the "breach". We had set up a portal for the association using an open source CMS, Joomla. Joomla was plagued by vulnerabilities at that time, and someone managed to access the user database and crack the passwords. The password for my test account there...well, I was using it in some other places. It was my old password from before I started working with security. I had replaced it almost everywhere, but it was still used in a few places I had forgotten about, like LinkedIn and a hotmail account I used to have so I could use MS Messenger. Well, those, and a couple of other services were quickly found by the attackers, and an embarrassing message with all that was posted in that popular email forum, and other places. In summary, an application breach on a website run by...security professionals, and some pretty lame secops practices by one of those guys exposed.
 
What have I been able to extract from that incident? A lot. Here it is. 

Technical lessons 

The easiest to mention. We were using a horrible tool from a security perspective (Joomla). We had been warned by some people, but some of our group believed we could run it securely by not using crappy plugins and keeping it always up to date. But we didn't have a dedicated security operations team to keep watching it. In addition, we knew there were technically competent people out there trying to hack us. So, the threat component was high. It was an explosive combination. In short, we should have made choices that would simplify the challenge of keeping the vulnerability profile low, as we didn't have time to protect it as it should have been.
 
Then, there was my own personal mistake, reusing a password. It is certainly something no one, especially a security professional, should do. Of course, I was already aware of that, and I was already using unique, different passwords on almost everything that mattered at that time. But this old password ("trustno1", if you really wanna know!) was something I started using long before getting involved with security. As I became more aware of the risks of password reuse I started changing it everywhere, but there were still a few places where I had forgotten to do it. To make things worse, I started using it as my "throwaway" password for testing needs. An account I had for testing on the ISSA chapter website was using that password. Bad secops…bang, they got me.

Management Lessons

This is where I think we can start getting good lessons from the incident. This is about our organization, the ISSA chapter. How could a security professionals' organization be hacked?
 
We fell for the same mistakes we see in many other organizations. First, the fact that we were all security people caused the "too many cooks in the kitchen" issue. Who was the "CISO" for our organization? That was never defined, so there weren't clear roles and responsibilities regarding our own security. I brought the site up and did some of the initial hardening, but at that time I was already moving those responsibilities to other people and completely focused on other issues (I was preparing to move to Canada at that time). People generally know about vulnerability management, but in that case, I believe no one was actually the owner of that process and consciously doing it for us.

Political, social and relationship lessons

Here's another point from where I extract a lot of personal lessons. When we took over the chapter, our group had as one of its objectives to close the gap between the "security professionals" community (the CISSPs :-)), in fact those dealing with risk management, security policies and other less technology oriented topics, and those with the technology background or IT security jobs. That should also include the "hacking" community (or "scene"). 
 
That divide between the "management people" and the "technical people" was also related to professionals in different stages in their careers. It was very hard to find technical individual contributors in a highly paid position in Brazil at that time. It wasn't interesting to make them part of ISSA for some of the previous directors because there was low value in junior people as potential customers to their products and services. Trying to be more inclusive of professionals with technical backgrounds was really the attempt to make the association useful for people in the early stages of their careers as well.
 
But although I have a technical background, I was never close to the underground scene in Brazil. I knew people who were, and some volunteers helping us during those days were very connected to that community. Still, I've never been a fan of some of the more juvenile aspects of hacking communities. The use of leetspeak, piercings, crazy haircuts...nothing against that, it's just not my thing. This, on top of my effort to make the technical professionals' voices heard in the community, made me adopt a gatekeeping position, as in my view they were not being helpful in solving the problem I wanted to solve. In more traditional environments, appearances matter a lot. At that time, it was hard to be taken seriously wearing shorts, a mohawk and writing “3 n0iZ M4n0!!”.
 
In the end, I believe we didn't do enough to reach out and include them, and they felt excluded. Our posture about a "professional organization", plus a growing number of charlatans in the market, fueled a "take down a whitehat" movement, which I ultimately fell victim to.
 
I had helped create the animosity against security professionals, then underestimated their abilities and their motivations against me. What a stupid combination, right? Yes, I know. Talk about not having control over the "Threat" component of the risk equation...
 
In summary, that was my collection of mistakes. Technical blunders, classical management mistakes and a dose of simple immaturity. For those also hurt in the process, I'm sorry. I hope I can keep learning from mistakes like those and make better decisions in the future. This is an extremely important part of working in security, knowing we'll never be able to reach perfection.   

Friday, October 9, 2020

Monitoring and Vulnerability Management

 (Cross posted from the Securonix Blog)

Vulnerability management is one of the most basic security hygiene practices organizations must have in place to avoid being hacked. However, even being a primary security control doesn't make it simple to successfully implement. I used to cover VM in my Gartner days, and it was sad to see how many organizations were not doing it properly.

Many security professionals see VM as a boring topic, usually seeing it simply as a "scan and patch" cycle. Although the bulk of a typical VM program may indeed be based on the processes of scanning for vulnerabilities and applying patches, there are many other things that need to be done so it can deliver the expected results.

One of the most important pieces of it is the prioritization of findings. It is clear to most organizations that patching every open vulnerability is just not feasible. If you can't patch everything, what should you patch first? There are many interesting advancements in this area. What used to be based only on the severity of the vulnerabilities (the old CVSS value) is now a more sophisticated process that leverages multiple data points, including threat intelligence. The EPSS research by Kenna Security is a great example of how evolved the practice of prioritizing vulnerabilities is now when compared to the old CVSS times.
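
As a rough illustration of prioritizing with more than severity alone, here is a minimal Python sketch. It is not the EPSS model or any vendor's scoring method: the `Finding` fields, the placeholder CVE names, the sample numbers and the weighting in `priority` are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve: str
    cvss: float           # base severity, 0.0 to 10.0
    exploit_prob: float   # estimated likelihood of exploitation, 0.0 to 1.0
    asset_critical: bool  # whether the affected asset is business critical


def priority(finding: Finding) -> float:
    """Toy score: severity scaled by exploitation likelihood, doubled for critical assets."""
    score = (finding.cvss / 10.0) * finding.exploit_prob
    return score * 2.0 if finding.asset_critical else score


findings = [
    Finding("CVE-A", cvss=9.8, exploit_prob=0.02, asset_critical=False),
    Finding("CVE-B", cvss=7.5, exploit_prob=0.60, asset_critical=True),
    Finding("CVE-C", cvss=5.0, exploit_prob=0.90, asset_critical=False),
]

ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f"{f.cve}: {priority(f):.3f}")
```

With these made-up numbers, the finding with the highest CVSS value ends up last, because it is the least likely to be exploited. That is the kind of reordering a severity-only approach cannot produce.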

But even when you are able to decide what to patch first, there are also cases where the remediation is not simply applying a patch. Some vulnerabilities involve not only a bug, but also other issues such as the existence of legacy software and protocols in the environment. These situations usually require a more complex approach, and that's where an additional component of the VM process, the compensating controls, becomes important.

Compensating controls are used to address the risk of a vulnerability while the full remediation cannot be applied. Using an IPS, for example, is a typical compensating control. You can use them when you cannot apply the remediation, such as when a patch is not available, or to mitigate the risk until you are comfortable enough (usually after testing is done, during a maintenance window) to apply it. We usually see some security controls that can avoid or reduce the impact of vulnerability exploitation as the ideal candidates for compensating risk, but there is something I always like to bring up during this discussion: Monitoring.

Think about it for a second. You have an open vulnerability that you still cannot patch. The exploit is available, as well as a lot of information about how it is used. Even if you cannot avoid it, you can use all this information to build a security monitoring use case focused on the exploitation of this specific vulnerability. You know it is there, and that there is a chance of it being exploited, so why not put something together to look for that exploitation? You can prioritize the alerts generated by this use case, as you know you are currently vulnerable to that type of attack.
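
A minimal Python sketch of that idea, assuming a hypothetical inventory of unpatched hosts (`UNPATCHED`), a placeholder CVE identifier and a simplified alert format. A real implementation would pull this context from the vulnerability scanner and apply it inside the monitoring tool.

```python
# Hypothetical inventory: vulnerability ID -> hosts where it is still unpatched.
UNPATCHED = {
    "CVE-0000-0001": {"app-01", "app-02"},
}


def alert_priority(alert: dict) -> str:
    """Raise priority when an exploit attempt targets a host known to be vulnerable."""
    vulnerable_hosts = UNPATCHED.get(alert["cve"], set())
    return "high" if alert["target"] in vulnerable_hosts else "low"


alerts = [
    {"cve": "CVE-0000-0001", "target": "app-01"},  # known vulnerable
    {"cve": "CVE-0000-0001", "target": "app-07"},  # already patched
]

for alert in alerts:
    print(alert["target"], alert_priority(alert))
```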

A great example of using security monitoring as part of the VM process is what is happening with the new Windows Zerologon EP (ZEP) vulnerability (CVE-2020-1472). The issue is complex and requires more than just applying a patch. Our VP of Threat Research, Oleg Kolesnikov, produced a great write-up about the details and also variants of exploitation and detection. In summary, Microsoft has provided a patch for the immediate problem, but some third-party systems may still use an older, vulnerable version of Netlogon secure channel connections. To avoid breaking functionality of existing systems, Microsoft has introduced new events in their logs to identify the use of these older versions, and signaled they will move to an enforcement mode that will not accept them anymore after February, 2021.

This is where aligning monitoring with the remediation process becomes so important. The new events added by Microsoft can help identify attack attempts and track other vulnerable systems on the network. A pre-established process to coordinate the use of monitoring tools and infrastructure as an additional compensating control for VM can help in situations like this, where the plan to handle a vulnerability also requires monitoring activities.
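As a rough illustration of how those events could feed both monitoring and remediation, here is a small Python sketch. To the best of my knowledge, Microsoft's guidance for CVE-2020-1472 lists Netlogon event IDs 5827 and 5828 (vulnerable connection denied), 5829 (vulnerable connection allowed) and 5830 and 5831 (allowed by group policy); please check the current Microsoft documentation before relying on them. The event dictionary layout, the `client` field and the bucket names are my own assumptions, not a real log schema.

```python
from collections import defaultdict

# Event IDs as I understand them from Microsoft's CVE-2020-1472 guidance.
DENIED = {5827, 5828}             # vulnerable connection attempt was refused
ALLOWED_VULNERABLE = {5829}       # vulnerable connection was still accepted
ALLOWED_BY_POLICY = {5830, 5831}  # accepted due to an explicit policy exception


def triage(events):
    """Group client systems by the follow-up each event type suggests."""
    buckets = defaultdict(set)
    for event in events:
        event_id = event["event_id"]
        if event_id in DENIED:
            # Could be an attack attempt or a non-compliant device.
            buckets["investigate"].add(event["client"])
        elif event_id in ALLOWED_VULNERABLE:
            # System still using the vulnerable channel: fix before enforcement.
            buckets["remediate"].add(event["client"])
        elif event_id in ALLOWED_BY_POLICY:
            # Known exception: keep it tracked so it does not become permanent.
            buckets["review_exception"].add(event["client"])
    return dict(buckets)


# Hypothetical, already-parsed events from domain controller logs.
sample = [
    {"event_id": 5829, "client": "legacy-nas"},
    {"event_id": 5827, "client": "unknown-host"},
    {"event_id": 5831, "client": "partner-trust"},
    {"event_id": 4624, "client": "workstation7"},  # unrelated, ignored
]
report = triage(sample)
```

The "remediate" bucket goes to the VM team and the "investigate" bucket goes to the SOC, which is exactly the kind of coordination a pre-established process should cover.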

Monday, September 21, 2020

DDLC - Detection Development Life Cycle

Dr. Chuvakin has recently delivered another great blog post about "detection as code". I was glad to read it because it was the typical discussion we used to have in our brainstorming conversations at Gartner. It had a nice nostalgic feeling :-). But it also reminded me of my favorite paper from those times, "How To Develop and Maintain Security Monitoring Use Cases".

That paper describes a process framework for organizations to identify and develop use cases for security monitoring. It was intentionally designed to be tool neutral, so it could be used to develop SIEM rules, IDS signatures or any other type of content used by security monitoring tools. It was also built to mimic Agile development processes, to avoid the capital mistake of killing the required agility to adapt to threats by too much process. I had fun discussions with great minds like Alex Sieira and Alex Teixeira (what's this with Alexes and security?) when developing some of the ideas for that paper.

Reading the philosophical musings from Anton on "detection as code" (DaaC?), I realized that most of threat detection is code already. All the "content" covered by our process framework is developed and maintained as code, so I believe we are quite close, from a technology perspective, to DaaC. What I think we really need is a DDLC - Detection Development Life Cycle. In retrospect I believe our paper would be more popular if we used that as a catchy title. Here's a free tip for the great analysts responsible for future updates ;-)

Anyway, I believe there are a few things missing to get to real DaaC and DDLC. Among them:
  • Testing and QA. We suck at effectively testing detection content. Most detection tools have no capabilities to help with it. Meanwhile, the software development world has robust processes and tools to test what is developed. There are, however, some interesting steps in that direction for detection content. BAS tools are becoming more popular and more integrated with detection tools, so the development of new content can be connected to testing scenarios performed by those tools. Just like automated test cases for apps, but for detection content. Proper staging of content from development to production must also be possible. Full UAT or QA environments are not very useful for threat detection, as it's very hard and expensive to replicate the telemetry flowing through production systems just for testing. But the production tools can have embedded testing environments for content. The Securonix platform, for example, has introduced the Analytics Sandbox, a great way to test content without messing with existing production alerts and queues.
  • Effective requirements gathering processes. Software development is plagued by developers envisioning capabilities and driving the addition of new features. It's a well-known problem in that realm and they have developed roles and practices to properly move the gathering of requirements to the real users of the software. Does it work for detection content? I'm not sure. We see "SIEM specialists" writing rules, but are they writing rules that generate the alerts the SOC analysts are looking for? Or looking for the activities the red team has performed in their exercises? Security operations groups still operate with loosely defined roles, and for many organizations the content developers are the same people looking at the alerts, so the problem may not be that evident for everyone. But as teams grow and roles become more distributed, it will become a big deal. This is also important when so much content is provided by the tool vendors or even content vendors. Some content does not need direct input from each individual organization; we do not have many opportunities to provide our requirements to OS developers, for example, but OS users' requirements are generic enough to work that way. Detection content for commodity threats is similar. But when dealing with threats more specific to the business, the right people to provide the requirements must be identified and connected to the process. Doing this continuously and efficiently is challenging and very few organizations have consistent practices to do it.
  • Finally, embedding the toolset and infrastructure into DDLC to make it really DaaC. Here's where my post is very aligned with what Anton initially raised. Content for each tool is already code, but the setup and placement of the tools themselves is not. There's still a substantial amount of manual work to define and deploy log collection, network probes and endpoint agents. And that setup is usually brittle, static and detached from content development. Imagine you need to deploy some network-based detection content and find out there's no traffic capture set up for that network; someone will have to go there and add a tap, or configure something to start capturing the data you need for your content to work. With more traditional IT environments the challenge is still considerable, but as we move to cloud, DevOps-managed environments, these prerequisite settings can also be incorporated as code in the DDLC.
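To illustrate the first and last points, here is a minimal Python sketch of detection content treated like software: the rule carries its own test cases and declares the telemetry it depends on, so a pipeline could run the tests and flag missing data sources before the content reaches production. The `Detection` structure, the rule logic, the `adm_` naming convention and the source names are all hypothetical; this is not how any specific product, including the Analytics Sandbox, works.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Detection:
    name: str
    requires: FrozenSet[str]          # telemetry sources the rule depends on
    matches: Callable[[dict], bool]   # the detection logic itself


# Hypothetical rule: failed logons for accounts following an "adm_" convention.
failed_admin_logon = Detection(
    name="failed_admin_logon",
    requires=frozenset({"windows_security_log"}),
    matches=lambda e: e.get("action") == "logon_failure"
    and e.get("user", "").lower().startswith("adm_"),
)


def run_test_cases(detection, cases):
    """Return the events where the rule's verdict differs from the expectation."""
    return [event for event, expected in cases if detection.matches(event) != expected]


def missing_sources(detection, deployed):
    """Return required telemetry sources that are not currently collected."""
    return set(detection.requires) - set(deployed)


# Test cases live next to the rule, like unit tests next to application code.
cases = [
    ({"action": "logon_failure", "user": "ADM_alice"}, True),
    ({"action": "logon_failure", "user": "bob"}, False),
    ({"action": "logon_success", "user": "adm_alice"}, False),
]

failures = run_test_cases(failed_admin_logon, cases)
gaps = missing_sources(failed_admin_logon, deployed={"netflow"})
```

An empty `failures` list means the rule behaves as expected on the sample events, while a non-empty `gaps` set is the "there's no capture set up for that network" problem surfacing at build time instead of after deployment.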
There's still a lot to do to make full DaaC and a comprehensive DDLC a reality. But there's a lot of interesting stuff going on in this direction, pushed by the need for security operations to align with the DevOps environments that need to be monitored and protected. Check the Analytics Sandbox as a good example. We'll certainly see more like this coming up as we move closer to the vision of threat detection becoming more like software development.