The Evolution of the Modern Security Data Platform
How Cloud-Native Next-Gen SIEMs, XDR platforms and Security data lakes are displacing legacy SIEM solutions
Excited to share this report on the evolution of the modern security data stack. Many would go as far as to say that cybersecurity is fundamentally a data problem. If you own the data, you own everything. This is why we’re seeing all the major public security companies, from Palo Alto Networks to CrowdStrike and Cisco, investing significantly to capture market share within this category.
Today’s report delves into the historical role of SIEMs like Splunk, the newer solutions like Anvilogic and what to expect over the next 10 years as SIEMs evolve. Due to the rapid proliferation of security tools and applications in the modern enterprise, the security data stack is currently one of the fastest-growing categories in cybersecurity. We’re seeing a convergence and intersection between cybersecurity and data infrastructure technologies, which we explore in this report.
This research was compiled by talking to founders, investors, operators, and leading vendors within this industry. It was great to collaborate with my friend, Shayan Shafii, an investor at Scale Venture Partners. Thank you to partners like Anvilogic for supporting this research.
Actionable Summary
Last month, Cisco completed one of the largest acquisitions in cybersecurity history, acquiring SIEM giant Splunk for $28 billion. Why did they make such a large acquisition when there are so many hot startups on the private markets? The answer lies in their understanding that security data is the pinnacle of the cybersecurity industry.
Security Information and Event Management (SIEM) has historically been the center of gravity for security operations center (SOC) teams. SIEM solutions were built to enable threat detection and security incident response through the collection, aggregation, and analysis of log and event data from a wide variety of security solutions.
Legacy SIEMs like Splunk created this category. They were built for on-prem enterprises. However, with the rise of the cloud and the explosion of security tools, these legacy vendors have been unable to meet the demands of modern security and cloud-native technology architecture.
The primary problem has been the cost of ingesting and storing data on these platforms. The second is the rising volume of alerts these solutions generate. The cost of managing security data on legacy platforms has grown exponentially, especially compared to the opportunity cost of utilizing cloud data platforms. The performance and advancements in cloud-native computing and storage on platforms like Snowflake, together with improvements in threat-hunting capabilities, have made it difficult to justify staying on legacy solutions.
A new generation of cloud-native data platforms has evolved to displace these legacy SIEMs that still have a large footprint on the market. This next evolution of data platforms incorporates security ETL data management and orchestrator platforms, as well as a new variety of cloud-native SIEMs built around low-cost storage (Multi-data SIEMs, next-gen SIEMs and cloud security data lake). Further right of the stack, we see an evolution of legacy SOAR moving into more native AI Response & Reporting and XDR platforms. We provide a detailed breakdown and analysis within the report.
The History of Security Information and Event Management (SIEMs)
What are SIEMs?
Over the past decade, the proliferation of security tools has led to an overwhelming influx of alerts for security operations teams. These alerts, often numbering in the thousands daily, can be time-consuming to resolve and may result in false positives or duplicates. To manage this, teams have turned to Security Information and Event Management (SIEM) solutions.
SIEM solutions support threat detection and incident response by collecting, aggregating, and analyzing event data from an array of sources, including security solutions, network infrastructure, endpoints, servers, databases, and cloud applications. This data, once primarily time-series-based log data, is structured into a central console that identifies, correlates, and categorizes it into meaningful alerts. SIEM tools have advanced significantly, now leveraging a wider set of data sources for real-time monitoring, threat discovery, incident investigation and response, policy enforcement, and compliance reporting.
SIEM capabilities include:
Log collection, Log analysis and Log correlation
IT Compliance, Application monitoring
Real-time alerting, Dashboard and reporting
Ingest a near-real-time feed of telemetry describing activity in the environment
Structure information, store it and generate statistics for baseline behaviour levels
Surface alerts (incidents) when a recent stream of data either deviates from the standard baseline or matches certain criteria that an enterprise wants to detect.
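The last two capabilities in the list above (baselining, then alerting on deviation or on a criteria match) can be illustrated with a small sketch. This is a toy model under stated assumptions, not how any particular SIEM implements it: the `Event` fields, the z-score threshold of 3, and the `blocked_actions` set are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Event:
    host: str
    action: str
    count: int  # hypothetical: occurrences of the action in one time window


def baseline(history):
    """Return (mean, population standard deviation) of past per-window counts."""
    return mean(history), pstdev(history)


def detect(event, history, blocked_actions, z_threshold=3.0):
    """Return alert strings for one event: criteria matches and baseline deviations."""
    alerts = []

    # Criteria match: the enterprise always wants to know about these actions.
    if event.action in blocked_actions:
        alerts.append(f"rule match: {event.action} on {event.host}")

    # Baseline deviation: flag counts far above the historical mean.
    mu, sigma = baseline(history)
    if sigma > 0 and (event.count - mu) / sigma > z_threshold:
        alerts.append(
            f"baseline deviation: {event.count} vs mean {mu:.1f} on {event.host}"
        )
    return alerts
```

A real deployment would compute baselines per entity and per time window over far more data, but the two alert paths (statistical deviation versus explicit criteria) are the same idea.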
They are designed to help SecOps teams understand when a breach or event has occurred. For every tool that security teams use to prevent breaches from occurring, there is almost always another tool further down the line intended to catch things that slip through the cracks of their preventive measures; this “2nd layer of defense” is the premise of much of the security industry - e.g. categories like EDR, NDR, and more recently, CDR. All of these categories broadly fall under the umbrella of “detection and response” tools, which in the case of endpoints, networks, and cloud infrastructure, can be serviced with dedicated point solutions. These tools offer a comprehensive understanding of cyber attacks, enhance threat management, and aid in meeting monitoring and compliance requirements.
While detection and response is inherently a reactionary problem, prior to SIEMs, SecOps teams were even more delayed in their response – they couldn’t even begin to investigate something until the information reached them through less automated means. And even once they kicked off an investigation, it would be a horribly cumbersome process of speculatively combing through logs in various domains to try and piece together a story. There’s a great article by Rak Garg with more detail on the advent of SIEMs here. In general, SIEMs represent a way to consolidate all of this telemetry into a single, massive data sink. By joining data from disparate domains, security teams can correlate and add context to siloed data, allowing for more sophisticated detections.
What Changed? 2002 - Present
The first generation of SIEMs (e.g. Splunk) was largely designed for on-prem, monolithic deployments. Since Splunk was first to market here, it could consolidate the four main components of a SIEM into a single software offering that could be dropped into customer data centers: a) a forwarder for ingesting data; b) an indexer for enabling search; c) a storage tier for retention; and d) a search head for queries. At the time, this was convenient for customers, as most of the telemetry they were ingesting and indexing was generated on-prem, and there was value in having everything handled by a single vendor. However, things changed.
Given the 22-year window here, the technology landscape has drastically changed, but a few main themes have primarily driven the evolution of this market:
Shift from On-Prem to Cloud: As application and IT infrastructure have become increasingly distributed, it has become more complicated to architect the plumbing of these SIEM systems. New connectors have to be built for emerging data sources, and monitoring also becomes a more nuanced challenge as you span hybrid environments.
Explosion in Data Volume: Legacy SIEM costs are largely indexed to data volume - meaning, the more stuff you ingest and index, the more you linearly pay. It’s now common knowledge that enterprises are accumulating data at record-setting speeds, meaning that SIEM costs are unfortunately also growing proportionally. In response, IT and security leaders have spent much of the last few years finding clever methods and tools to pre-process, reduce, and prioritize the data that they feed into these expensive systems.
Separation of Compute and Storage: While shifting to the cloud created a few complications (as described above), it also afforded new capabilities that lend themselves well to the problem space. By decoupling compute (which is independently expensive) and storage (which is independently cheap), IT leaders could begin to design more intelligent architectures that optimized performance and cost simultaneously. Companies like Snowflake are an example of how to build a cloud-native business that leverages these capabilities to disrupt legacy on-prem tools.
Advancements in Data Infrastructure: Cloud infrastructure has also made it easier to leverage advancements in storage to more efficiently store and query various data types. For example, columnar data stores are great for classic use cases around large-scale data aggregations.
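To make the pricing themes above concrete, here is a rough cost sketch comparing a volume-indexed model (every ingested GB billed at one rate) with a decoupled model (cheap storage for everything, compute paid only on the fraction of data actually queried). The function names and the formulas are simplifying assumptions for illustration; no vendor prices this exactly, and any numbers passed in are placeholders rather than real price lists.

```python
DAYS_PER_YEAR = 365


def volume_indexed_cost(gb_per_day, price_per_gb):
    """Annual cost when every ingested GB is billed at a single flat rate.

    Cost scales linearly with volume: doubling gb_per_day doubles the bill.
    """
    return gb_per_day * DAYS_PER_YEAR * price_per_gb


def decoupled_cost(gb_per_day, storage_price_per_gb, queried_fraction,
                   compute_price_per_gb):
    """Annual cost when storage and compute are billed separately.

    All data is stored at the (cheap) storage rate; compute is charged only
    on the fraction of stored data that is actually scanned by queries.
    """
    stored_gb = gb_per_day * DAYS_PER_YEAR
    storage = stored_gb * storage_price_per_gb
    compute = stored_gb * queried_fraction * compute_price_per_gb
    return storage + compute
```

With placeholder inputs, `volume_indexed_cost(100, 2.0)` and `decoupled_cost(100, 0.1, 0.2, 1.0)` show how the second model stays cheaper as long as only a small share of the data is queried; the gap narrows as `queried_fraction` approaches 1.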
The Challenge With Traditional SIEMs: Vendor Lock-In
Traditional SIEMs can be defined as those SIEMs that were built decades ago primarily for on-prem use cases. They have become the center of gravity within large enterprise SOC teams, where they are deeply entrenched and extremely sticky. The flagship company here is Splunk. It originated this market and architected the initial 4-step process of ingesting, storing, analyzing, and alerting on high-volume, real-time log data across domains.
It is worthwhile to mention that Splunk and a number of other vendors in this market, like Elastic, are still deeply entrenched within large enterprises today and hold the largest market share. Some of these vendors have cloud-native solutions. Splunk scaled into a massive independent company until its acquisition by Cisco for $28B, announced in 2023. Splunk ended Q4-FY24 with over $4B in ARR - a reminder that despite all of the innovation in this market, the world mostly runs on legacy software.
Many organizations have found themselves stuck and locked in with these legacy SIEMs. This vendor lock-in has become particularly strong for customers using platforms like Splunk. Organizations are currently contending with the following:
As the cloud has evolved, we’ve seen an explosion in the volume and variety of data that large enterprises have to manage. Since many of these traditional SIEMs are not capable of handling that data well, costs have risen with monolithic SIEMs that charge by the amount of data brought in. This is in addition to the compute these vendors require for detection, investigation processing, and the retrieval and rehydration of old logs from cold storage.
Since companies can’t send all their data to Splunk due to the cost, there have been more risks from uncollected security data not being used for threat detection purposes.
A painful and costly rip-and-replace process to move to a different SIEM requires organizations to rebuild all of their detection content and customizations again.
On the flip side, there have been significant improvements in the capacity of data lakes to mitigate detection gaps at lower costs, which increases the opportunity cost of staying on legacy SIEMs.
In general, SIEM vendor lock-in intensifies data management issues, creates a lack of correlation among siloed sources, and necessitates data rehydration for investigations. Currently, organizations are addressing SIEM lock-in with strategies like redirecting security data to more affordable storage, building their own Security Data Lakes, or replacing the SIEM altogether. Each approach comes with its unique set of challenges, such as having to redevelop years' worth of customization & detection content, managing data silos, and risking operational disruptions. In response to the challenges above, the SIEM market has seen a rapid evolution in both newer categories and newer architectural design. We elaborate on these evolutions below in more detail.
The Evolution of Cloud-Native SIEMs
As more companies migrated production and IT applications to the cloud, the need for more modern, cloud-scale SIEM systems subsequently followed with redesigned architectures leveraging cloud storage and compute. The evolution into the cloud and new data tools has created a new security data landscape, and vendors are providing solutions for different aspects of the ecosystem. There are six broad categories of cloud-native vendors that are re-shaping the new landscape:
ETL Data Orchestrators
Multi-data SIEMs
Next-Gen SIEM
Cloud Security Data Lake
AI Response & Reporting
XDR Platforms
SaaS Apps & Security Tools
It all starts here. The evolution of the cloud has led to the proliferation of more SaaS and security tools, which has in turn led to an explosion in the data generated within the enterprise over the past few years. Security tools are generating more alerts and, more importantly, non-security tools like Salesforce, Confluence, Workday, etc. all have relevant security data to monitor.
Data ETL & Orchestrators
Once all this data and these alerts are generated, security-focused ETL and orchestration tools come into play. In legacy environments, all of this data had to be normalized, parsed, and ingested into Splunk, at significant cost. However, companies like Cribl, Observo, and Monad have come in as data storage and management intermediaries. They act as an intelligent policy layer, absorbing, filtering, and cleansing data (logs and events) before routing it into these large SIEMs. These players integrate with various apps, data management systems, and storage systems, and aim to intelligently filter and manage data flow. This reduces unnecessary data replication and helps manage data storage costs.
Customers who have to deal with the large cost of managing Splunk have started to look for ways to reduce the amount of bloat in their instances. By being more selective about the type of data they feed into the SIEM, they are able to improve data quality and reduce cost. This helps customers on these large SIEMs save money because there is less data for the SIEM to parse. In turn, these orchestrators are able to capture a percentage of the savings they pass on to large SIEM customers (F500 Splunk instances can easily run to high 7 or 8 figures in cost).
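The filter-then-route pattern described above can be sketched in a few lines. This is a minimal illustration, not the behavior of Cribl, Observo, or Monad: the event fields (`source`, `level`, `message`), the `drop_sources` noise list, and the rule that only high-severity events go to the SIEM are all assumptions made for the example.

```python
def route(events, siem_levels=("error", "critical"), drop_sources=("healthcheck",)):
    """Split events into (siem, lake) lists, dropping noise and exact duplicates.

    - Events from a source in drop_sources are discarded entirely.
    - Events identical in source, level, and message are kept only once.
    - Events whose level is in siem_levels go to the (expensive) SIEM;
      everything else goes to low-cost data lake storage.
    """
    seen = set()
    siem, lake = [], []
    for event in events:
        if event["source"] in drop_sources:
            continue  # filter: known-noisy source
        key = (event["source"], event["level"], event["message"])
        if key in seen:
            continue  # cleanse: exact duplicate
        seen.add(key)
        if event["level"] in siem_levels:
            siem.append(event)
        else:
            lake.append(event)
    return siem, lake
```

The design choice worth noting is that nothing of value is thrown away: lower-priority events still land in cheap storage, so they remain available for later hunting, while the volume-billed SIEM only sees the subset selected by policy.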
Multi-Data Platform SIEMs
Anvilogic
Anvilogic is a multi-data platform SIEM aiming to enable Security Operations Centers (SOCs) to adopt scalable and cost-effective security data lakes without a rip-and-replace of existing tools like Splunk. Their strategy focuses on overcoming the "SIEM Lock-in" challenges by decoupling security analytics from the logging layer so that security teams can use Anvilogic to detect and respond to threats across the data platforms they choose without disrupting existing processes. The platform seeks to reduce risk by closing detection gaps across data platforms while cutting the SIEM costs associated with security data storage and ingestion. They leverage Snowflake for high-volume data sources and advanced analytics, supporting correlation across a range of data from both on-premises and cloud sources. It includes curated content designed and tested in Splunk & Snowflake and an AI copilot that speaks fluent English and is also an expert in the underlying platforms to help analysts build and deploy threat detection use cases across Splunk, Snowflake, and Azure. They initially started as an AI-based SOC platform to simplify entry into detection engineering and threat hunting, and have since evolved into a comprehensive threat detection and response solution for enterprise SOCs.
Anvilogic’s Differentiators
Simplifying Security Data Lake Adoption for SOCs
Anvilogic distinguishes itself through the flexibility of its platform, integrating with a variety of data sources and security solutions. The platform advocates for strategic data management, prioritizing efficient storage over centralization. They aim to mitigate risk and reduce data storage costs by reallocating high-volume use cases to more economically viable data platforms and enabling security teams to take advantage of security data lakes.
This approach seeks to alleviate vendor lock-in, allowing organizations to choose from multiple vendors and platforms for their security analytics needs. Anvilogic positions itself as a solution that facilitates analytics and detection across different data platforms, contrasting with traditional SIEM approaches that rely on upholding all the processing, storage and analytics under a monolithic architecture.
Custom Detection Content And Threat Hunting Capabilities
Anvilogic offers straightforward & analyst-friendly features for tailoring threat detection and hunting on top of a growing roster of supported data platforms. This approach is designed to streamline detection, investigation, and response to security threats, enabling organizations to optimize their SOC operations without being confined to a single data ecosystem. Anvilogic emphasizes the breakdown of silos between security data sources and tools, addressing the challenge of “dark data” that is stored but not actively used. Anvilogic supports languages including SPL, SQL and KQL, with a low-code detection engineering solution to help organizations achieve detection coverage across their data platforms of choice. The company heavily invests in threat detection content generation via its in-house Purple team who specialize in creating detection content aligned to MITRE techniques with a robust list of AWS, Azure & GCP Cloud threat detection use cases. Anvilogic stands out by skillfully developing and maintaining various versions of its threat detection rules, tailored to the specific platforms they support. They also make it easy for customers to customize these pre-built rules for their unique environment.
Optimizing Costs by Co-Existing SIEM & Data Lakes
Ripping and replacing Splunk is an extremely difficult task and often a non-starter for many organizations, so Anvilogic's core philosophy is to augment a customer’s existing Splunk deployment and adopt a security lake alongside it, with a detection engineering & threat hunting workbench that works across their customers’ tech stack. They refer to it as a "multi-data platform SIEM." This solution is adaptable to multi-cloud environments, scalable, and capable of addressing the evolving nature of cybersecurity threats. To help companies deal with the complexity of change and data storage, Anvilogic also provides AI-assisted features to prioritize use cases for migration to more cost-effective platforms and for adapting existing detection rules based on the customer’s platform. It also uses AI-provided insights to facilitate tuning recommendations for its detections in the underlying platforms it connects with. Overall, this strategy of co-existing with an organization’s SIEM helps companies handle the significant change management involved.
In summary, Anvilogic sets itself apart from competitors by offering a flexible, open platform that emphasizes cost efficiency and deployable threat detection customization across existing log repositories and data lakes. By enabling organizations to utilize more cost-effective data storage and ingestion options, Anvilogic helps reduce the overall cost of security operations while also reducing threat detection gaps. This approach addresses common challenges around traditional SIEM systems, such as vendor lock-in, high costs, and limited customization options.
Next-Gen SIEMs
Panther Labs: Detection Engineering
Panther Labs is also a next-gen SIEM solution that centralizes, normalizes, and analyzes security data to enable cybersecurity teams to detect, investigate, and monitor security threats and misconfigurations at scale. The company’s core product architecture is built on the storage layer of the three major cloud providers and Snowflake. Panther’s platform transforms terabytes of raw logs per day into a structured security data lake to power real-time detection, swift incident response, and thorough investigations. The ability to use Amazon S3 has also helped to lower the cost of storing large amounts of data.
Based on our conversations, Panther Labs' core capability within this market is detection-as-code (DAC) on cloud security data. While Splunk’s query language had become so complex that it had its own training course and certification, Panther offered analysts the ability to simply define detection logic in Python. Panther Labs addresses the difficulties enterprises face when transitioning from entrenched on-premises legacy systems to modern cloud environments, particularly around the integration and analysis of data. However, it is very difficult to get companies to fully transition away from or rip out their legacy SIEMs for Panther, so the company has worked around this by allowing customers to customize security detections on their existing data on cloud platforms. Jack, the Founder and CTO, has written extensively about the evolution of the detection engineer - part 1 and part 2 here. Francis has co-written extensively about Panther Labs; visit Contrary Research, Panther Labs.
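To give a feel for what "detection logic in Python" means in practice, here is a generic detection-as-code sketch. It is not taken from Panther's documentation or rule library: the event field names (`event_name`, `outcome`, `mfa_used`, `user`) are hypothetical, and the two-function shape (`rule` plus `title`) is simply one convenient way to structure such a detection.

```python
def rule(event: dict) -> bool:
    """Return True when the event should raise an alert.

    Hypothetical detection: a successful console login without MFA.
    """
    return (
        event.get("event_name") == "ConsoleLogin"
        and event.get("outcome") == "Success"
        and event.get("mfa_used") is False
    )


def title(event: dict) -> str:
    """Build a human-readable alert title for a matching event."""
    return f"Console login without MFA by {event.get('user', 'unknown')}"
```

Because the detection is ordinary code, it can be version-controlled, code-reviewed, and unit-tested against sample events before deployment, which is the main appeal of the approach over rules edited in a vendor console.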
Hunters AI
Hunters is another alternative to traditional platforms like Splunk, leveraging a Security Data Lake primarily based on Snowflake. Their platform supports both hosted and cloud-mode options within its multi-tenant environment. A key differentiator for Hunters relative to Panther Labs (which emphasizes detection-as-code) or Anvilogic (a multi-data platform) is that they prioritize automation capabilities tailored to reducing analyst workloads. This is achieved through their advanced integration with SIEMs, where they try to provide a more advanced SOAR earlier in the detection process in order to help automate investigation tasks. Based on my discussions, a core strength of theirs lies in response and reporting, even though they provide all the components of data and detection. Underlying all their AI and automation capabilities is their Axon Team, which constantly works on threat hunting.
Hunters' platform has over 300 data integrations and adopts the Open Cybersecurity Schema Framework (OCSF) standard to break data lock-in, enhancing search capabilities across various data forms. One of my core takeaways from these conversations is that, compared to traditional SIEMs that might take months to fully integrate, Hunters has a rapid out-of-the-box onboarding process where basic functionalities like alert firing and data onboarding can be set up in a matter of days. This quick deployment capability means that small and mid-sized organizations can start benefiting from the platform almost immediately. This one-stop solution is designed for teams seeking built-in detection mechanisms that require minimal tuning and a robust graph database for deep analysis and future threat detection. Their solution focuses on the mid-market segment, targeting smaller SOC teams, with deals often originating through MDR channel partners.
EDR To XDR Vendors
One major evolution within this market is that many of the traditional EDR vendors have expanded their platforms to provide Extended Detection and Response (XDR) capabilities. We observed that SMBs in particular generally lean toward XDR. While SIEMs were solidifying their presence in enterprise SOC teams, there was an adjacent evolution in the EDR market that also created a competitive alternative. XDR platforms differentiate themselves from SIEMs in two meaningful ways: first, they ingest a wider variety of data sources, and second, they handle more of the investigation and remediation workflows on-platform.
EDR vendors like SentinelOne and CrowdStrike started with agents that would block malware on endpoints. They gathered telemetry from wherever they could install agents, such as endpoints and workstations. Over time, however, they expanded the footprint of these agents to include other surface areas in the enterprise, such as cloud infrastructure, email, identity, and more, packaging the result as an XDR solution.
What is an XDR?
An XDR platform is built around a data lake and is able to ingest much more than telemetry data. It is able to ingest data from various security domains (endpoints, networks, cloud) through APIs and deliver native response capabilities. They leverage AI and behavioral analytics to prioritize high-risk alerts and offer automated remediation across multiple security tools. In general, this expanded the breadth of their telemetry beyond what was typically included in SIEMs. The EDR to XDR vendors also extended their core competency of creating investigation and remediation workflows for endpoint detections to these new, additional surface areas. As a result, XDR is now an emerging design pattern for enterprises with large EDR deployments from SentinelOne and CrowdStrike. In many other cases, many SMBs adopt these vendors.
💡 Quick Differences between a SIEM vs XDR? 💡
It’s important to note that XDR and SIEMs are fairly different. SIEM solutions are geared toward security analytics and data normalization, need careful manual tuning, and are often prone to false positive or immaterial incident alerts. They also require manual integration with SOAR solutions for incident response automation. Lastly, SIEM solutions address compliance and regulatory requirements through their reporting and log retention capabilities. Consequently, SIEM solutions are best suited for large enterprises with large security teams and log management and compliance use cases. In contrast, XDR solutions are best utilized as end-to-end detection and response platforms targeting small and midsized organizations.
CrowdStrike: XDR & Next-Gen SIEM
CrowdStrike completed its acquisition of Humio in 2021. This acquisition enhanced CrowdStrike’s XDR capabilities with real-time data ingestion and correlation. Humio provided cloud log management and real-time observability of data, metrics, and traces. Its technology has since been folded into several of CrowdStrike’s products, primarily Falcon LogScale and the Falcon Search Retention business.
By acquiring Humio, CrowdStrike significantly improved its data lake offerings, benefiting from Humio's capacity for large-scale logging with cost-effective cloud-based bucket storage. This integration gave CrowdStrike more comprehensive logging across diverse environments and facilitated real-time threat detection. It accommodated large-scale, multi-terabyte data ingestion and ensured unified visibility across this broader range of data. It also removed the boundaries between live data and archived data with cloud-based bucket storage. For DevOps, CrowdStrike gained a tool that ensures immediate data observability and uses a simplified query language to effectively aggregate live log data. Additional benefits include streamlined user management through role-based access control, free-text search, and advanced alerting mechanisms, which are critical for proactive infrastructure monitoring and strengthen CrowdStrike's position in preemptive security measures and operational control. The index-free design also allowed them to remove limits on searches, with no indexing and no pre-parsed fields, to boost flexibility and allow unrestricted data querying.
The combination of real-time analytics and smart filtering built into CrowdStrike’s Falcon Threat Graph and Humio’s live log management and index-free data ingestion accelerated CrowdStrike’s XDR capabilities. In recent months, CrowdStrike has expanded this XDR platform into a larger Next-Gen SIEM offering. Time will tell how well the solution does over the long run.
SentinelOne: XDR
Similarly, SentinelOne acquired Scalyr in 2021. Scalyr provided log analytics and observability SaaS in support of cloud applications. The company's platform offered cloud-native log management, visualization, and analysis tools that aggregated server logs and metrics into a real-time, centralized system.
This acquisition strengthened SentinelOne’s XDR portfolio, giving it a high-performance backend analytics engine that enhanced its ability to manage and analyze large-scale log data with real-time observability. A huge benefit was Scalyr’s NoSQL database, which allowed SentinelOne to offer streamlined, simplified query capabilities with PowerQueries. This extended the scalability of their Elastic search partnership and enriched their event data cloud service, thereby minimizing mean time to resolve (MTTR), boosting real-time querying, and enabling comprehensive troubleshooting on a single screen, all while fostering accelerated DevOps practices. As a result, SentinelOne is now able to support a backend with API-accessible storage, ensuring longer retention of event data at reduced costs, and enhancing the user management experience with role-based access control. All of these capabilities make it easier for SentinelOne to be used by a SIEM/SOAR that needs that event data rapidly and at scale.
In summary, Scalyr’s technology enhanced SentinelOne’s XDR capabilities. It allowed them to ingest more terabytes of data across multiple systems and make better correlations so security teams could autonomously detect, respond to, and mitigate threats. Relative to CrowdStrike, SentinelOne hasn’t pushed further into a next-gen SIEM solution. They have focused entirely on building out their Singularity Data Lake and Singularity XDR protection platforms. There are other solutions here, like Microsoft Sentinel, that we haven’t extensively discussed.
Palo Alto Networks' Cortex XSIAM
Cortex XSIAM is Palo Alto Networks’ comprehensive SIEM, XDR, ASM (Attack Surface Management), and SOAR security operations platform, merged into a single solution. Many of these distinct capabilities came from previous acquisitions of different vendors. The vision is to create an autonomous SOC platform that streamlines and enhances security operations.
XSIAM extends these functionalities by incorporating AI-driven analytics and automation to improve detection accuracy and response times, such as the time required for incident resolution. Compared with XDR platforms like CrowdStrike or SentinelOne, XSIAM offers a broader integration of security functions into one platform, thereby eliminating the need to switch between different security consoles, simplifying the user experience, and improving operational efficiency.
Some of our conversations in the field highlighted that Palo Alto Networks’ ecosystem is not as open as those of some other vendors, which could be a challenge for companies looking to avoid being tied to one company. The XSIAM platform, which utilizes Google’s BigQuery, does not give customers direct access to a data lake, which limits customers to what the product offers through its UI and APIs. This could be a limitation for customers who wish to have direct access to their data for analysis. However, a customer that already uses Palo Alto for its network or EDR platform would have less of a problem with the points highlighted above.
AI Response & Automation (Next-Gen SOAR)
While SIEMs are great for aggregating logs and producing detections, they traditionally lack capabilities around automating remediation workflows. For this reason, the adjacent category of SOAR was created: Security Orchestration, Automation & Response. Splunk developed their own native SOAR, but other companies like Tines, Torq, Demisto (acq. PANW), and Revelstoke (acq. Arctic Wolf) also emerged as leading independent solutions here.
How SOAR Worked
Orchestration: SOAR tools connect and integrate different security tools and systems, allowing them to work together seamlessly. This integration enables automated gathering of data from various sources, which can then be used for deeper analysis.
Automation: SOAR allows for the automation of common and repetitive tasks based on workflows and predefined rules. This automation can include actions like blocking an IP address at a firewall, suspending user accounts, or gathering additional data about suspicious events.
Response: While SIEM systems are adept at detection and alerting, they often generate more alerts than security teams can manually handle, and not all of these alerts are critical. SOAR helps prioritize and respond to these alerts more effectively. SOAR platforms can execute response actions automatically or guide security analysts through a structured response process using playbooks that provide step-by-step procedures for mitigating different types of security incidents.
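The three functions above can be sketched as a small playbook. This is a minimal illustration, not any vendor's API: the `THREAT_INTEL` feed, the in-memory blocklist and suspended-user sets, the 1-10 severity scale, and every function name are assumptions standing in for real integrations.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for integrated tools; a real SOAR would call vendor APIs.
THREAT_INTEL = {"203.0.113.7": "known-bad"}  # illustrative IP reputation feed
BLOCKED_IPS = set()                          # stand-in for a firewall blocklist
SUSPENDED_USERS = set()                      # stand-in for an identity provider


@dataclass
class Alert:
    alert_id: str
    src_ip: str
    user: str
    severity: int  # assumed scale: 1 (low) to 10 (critical)
    context: dict = field(default_factory=dict)
    actions: list = field(default_factory=list)


def orchestrate(alert):
    """Orchestration: enrich the alert with data from another tool."""
    alert.context["ip_reputation"] = THREAT_INTEL.get(alert.src_ip, "unknown")
    return alert


def automate(alert):
    """Automation: apply predefined rules to the enriched alert."""
    if alert.context.get("ip_reputation") == "known-bad":
        BLOCKED_IPS.add(alert.src_ip)
        alert.actions.append(f"blocked_ip:{alert.src_ip}")
        if alert.severity >= 8:
            SUSPENDED_USERS.add(alert.user)
            alert.actions.append(f"suspended_user:{alert.user}")
    return alert


def respond(alerts):
    """Response: run the playbook, then queue alerts highest severity first."""
    processed = [automate(orchestrate(a)) for a in alerts]
    return sorted(processed, key=lambda a: a.severity, reverse=True)


queue = respond([
    Alert("a1", "198.51.100.4", "alice", severity=3),
    Alert("a2", "203.0.113.7", "bob", severity=9),
])
for alert in queue:
    # Alerts with no automated action are left for an analyst to review.
    print(alert.alert_id, alert.severity, alert.actions or ["needs_analyst_review"])
```

Alerts that match a rule are handled automatically, while the rest are queued by severity for an analyst, which mirrors the split between automatic and guided response described above.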
Torq
Torq is a case study of the trends happening within this category. Torq refers to itself as a “hyper-automation” platform for security operations. Torq's platform is highly complementary to SIEMs, security data lakes and XDR solutions discussed above, functioning as an extensible information bus that provides observability and democratizes the automation of security processes. The traditional paradigm of detection, followed by logging in a SIEM, and then SOAR, is breaking down. Instead, there's a shift toward detection, then immediate automation for remediation.
Next-Gen providers in this category are leveraging AI to create higher-fidelity Response & Automation. These platforms use AI to automate responses to security incidents with a high degree of precision. They go beyond predefined workflows, using contextual intelligence to decide the best course of action in real time. There are a number of startups like Torq, Tines, Hunters, Dropzone AI, Prophet and Symbian that are primarily focused on this category.
In-House Cloud DIY Data Lake
Some companies with large, sophisticated SOC teams decide to build and manage their own security tooling. Many of them would prefer to build their whole SIEM on top of an existing Snowflake or Databricks data platform. They have to manually build their own detection rules and queries to cover both security data ingestion and detection. The more you do in-house, the more security talent you need. Snowflake has good hot data storage capabilities but doesn’t have a cold storage tier. This approach is more popular at the high end of the market, where security talent is available. Design patterns typically look as follows:
Forwarder: Leverage rich APIs from cloud providers and SaaS applications (alongside more real-time technologies like Kafka and stream processing) to gather telemetry. Tools like Cribl and Observo can be used to intercept and filter logs for cost-effective pre-processing.
Storage: S3/GCS provide cost-effective, scalable storage for long-term retention. Data warehouse platforms like Snowflake are leveraged for compute.
Indexer: Modern, composable SIEM platforms like Panther Labs and Anvilogic can be leveraged to build a query-able interface on top of your data lake.
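The forwarder, storage, and indexer stages above can be sketched end to end. This is a toy illustration under stated assumptions: an in-memory SQLite database stands in for the object store and warehouse, and the log fields, the `events` table, and all function names are hypothetical.

```python
import json
import sqlite3

# Illustrative raw telemetry; the field names are assumptions for this sketch.
RAW_LOGS = [
    '{"source": "okta", "level": "INFO", "event": "login_failed", "username": "alice"}',
    '{"source": "okta", "level": "DEBUG", "event": "heartbeat", "username": ""}',
    '{"source": "aws", "level": "INFO", "event": "login_failed", "username": "alice"}',
    '{"source": "aws", "level": "INFO", "event": "login_ok", "username": "bob"}',
]


def forward(raw_lines):
    """Forwarder: parse telemetry and drop low-value records before storage."""
    for line in raw_lines:
        record = json.loads(line)
        if record["level"] != "DEBUG":
            yield record


def store(conn, records):
    """Storage: persist filtered records (SQLite stands in for the data lake)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (source TEXT, event TEXT, username TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (:source, :event, :username)", list(records)
    )
    conn.commit()


def failed_logins(conn, threshold=2):
    """Indexer: a queryable detection over the stored events."""
    rows = conn.execute(
        "SELECT username, COUNT(*) FROM events "
        "WHERE event = 'login_failed' "
        "GROUP BY username HAVING COUNT(*) >= ? ORDER BY username",
        (threshold,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
store(conn, forward(RAW_LOGS))
print(failed_logins(conn))
```

Filtering in the forwarder before anything is written is what keeps storage and compute costs down, while the detection itself is an ordinary query against data the team owns.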
As buyer maturity has improved with time, security analysts and architects have more comprehensively internalized the underlying components of a SIEM, and have made progress towards architecting these solutions in-house. However, this option is not a popular one, given the huge talent shortage and the complexity involved in manually building your own SIEM.
Competitive Landscape Discussion
Splunk ES
Splunk is the dominant player in the SIEM market yet has struggled with its shift from on-premises to cloud. It is notorious for being expensive, and customers have raised concerns about its future direction following the Cisco acquisition. There are architectural and business reasons why Splunk is expensive, and with enterprises shifting more of their workloads to the cloud (where infrastructure requires massive scalability and data storage), the problem is only getting worse over time. Vendors like Anvilogic have differentiated themselves by separating security analytics from the logging layer, enabling teams to detect, investigate, and respond across the data platforms they choose. This strategy prevents vendor lock-in by enabling a broader use of data to support threat detection across diverse data platforms. It facilitates security analytics at significantly lower costs than Splunk (up to 90% in some cases); Splunk denies customers such flexibility because of the high costs associated with the data ingestion and compute required for detection and investigation processing.
Detection Analytics Layer: Panther Labs & Hunters AI
Panther and Hunters both require customers to rip-and-replace their existing tooling and deploy their cloud next-gen SIEMs. Other companies like Anvilogic help customers manage the change and its complexity more easily. Anvilogic’s platform can deploy detections within a customer’s existing Splunk deployment via the Anvilogic Splunk App and gradually move the high-volume data that is not currently processed in Splunk over to Snowflake to gain additional detection coverage. This transition is cost-effective and can be done at the customer's own pace. With Anvilogic, customers maintain complete control over their data pipelines. Its modular detection capabilities allow detections to run directly on the customer's data platforms (Splunk, Azure and Snowflake), supporting a multi-data platform strategy suited to enterprise SOCs. Conversely, Panther and Hunters process data in-stream before storing it long-term in Snowflake, requiring customers to send all their data to these vendors. This can have compliance, cost, and lock-in implications, leaving many organizations uncomfortable with where their data is being shipped. Lastly, Anvilogic also differentiates itself from Panther and Hunters with an integrated SOC Copilot trained on SOC expertise, the underlying data models, and Snowflake, Splunk and Azure schemas. The Copilot has access to external threat intelligence tools like VirusTotal, IPinfo, WHOIS, and many more, offering inline SOC assistance that reduces manual tasks across detection engineering, investigation, triage, and threat hunting.
Cloud Data Platforms / Next Gen SIEMs / XDR Platforms
Threat detection content efforts in Next-Gen SIEMs are often spread very thin, with efforts predominantly directed towards supporting connectors that funnel data into their platforms. Solutions like Palo Alto Networks XSIAM or IBM QRadar Cloud Native, despite offering extensive libraries of embedded threat detection content, tend to prioritize volume over quality, leading to an abundance of options that lack actionable rule sets and are not easily customizable by security teams. Vendor-provided content often lacks clear instructions on how organizations can put it into practice, leaving Detection Engineering (DE) teams to independently determine which log sources they need for their rule sets to fire. Anvilogic’s threat detection content is maintained by its in-house purple team, which periodically researches the threat landscape and turns threat actor techniques into tailored, curated sets of detections for Azure, Splunk, and Snowflake users. It also offers a low-code builder with a SOC Copilot that makes it easy to build custom detection content in minutes.
Also important to note here are the Cloud Service Providers (CSPs). CSP SIEMs, like Chronicle, Sentinel, and AWS Security Lake, can reduce SIEM licensing costs thanks to their subsidized pricing model for existing CSP customers. However, they come with their own challenges, such as a painful rip-and-replace process and limited capabilities outside of the CSP's own data sources, which makes them impractical for some enterprises compared to other SIEMs.
As for XDR platforms, they are beneficial for consolidation if you're already purchasing other products from the suite vendor. However, they have drawbacks: a painful rip-and-replace process, poor detection engineering and content due to their black-box nature, difficulty of customization, and limited capabilities outside of the vendor's own EDR data.
Future Directions of the Market
In tracing the evolution of the market and meeting with founders and investors in the space, a few main themes have emerged. If one thing is clear, it’s that the future of the SIEM industry will not look like the past. Trends in infrastructure, data volume, AI, and attack patterns have required SOC teams to keep adjusting their tooling. A few trends the industry is latching onto:
Multi-data cloud platforms: We will see more SIEM systems that support multiple data platforms, allowing enterprises to use different data storage solutions like Azure, Splunk, or Snowflake in a way that best suits their needs without being locked into a single vendor. Next-gen SIEMs will be capable of running detections in different clouds where the data is generated, thus accommodating the growing trend of multi-cloud environments in enterprises.
Distributed Data-Lakes: Security teams increasingly want to bring their own infrastructure and experience all the benefits of a SIEM on top. Security operations teams will increasingly adopt security data lakes without needing to replace existing SIEM solutions, allowing for better cost management and scalability. In this way, emerging SIEM vendors are meeting customers wherever they’re at on their cloud and data lake journey. By owning the underlying data lake, enterprises have (1) more transparency into what data is going into their detection & response systems, and (2) more flexibility to architect their infrastructure for contexts beyond detection & response - the same data lake can be used for other contexts, like ML engineering, and more.
Detection Engineering: Detection Engineering is an emerging discipline that is finally having its moment in the sun. Security teams are increasingly looking to adopt practices like Detection-as-Code, which allow SOC analysts to codify their detection logic and scale themselves more efficiently. By automating detection with simple (but effective!) Python scripts, SOC teams can focus their effort on higher value-adding activities, like tuning their detections for higher accuracy or triaging high-priority incidents.
Unlocking Data Volume: While many of these emerging vendors represent a massive improvement on Splunk, we’ve found that displacing Splunk can still be extraordinarily difficult. Splunk’s largest customers spend 7-8 figures per year, meaning the platform stores an incredibly high volume of some of the most sensitive data in the enterprise. Rather than tearing Splunk out immediately, security leaders are easing themselves off the platform over time, employing clever strategies like shifting high-volume use cases to more cost-effective logging solutions (e.g. Anvilogic), or pre-processing logs before they land in a sink of their choice (e.g. Observo).
AI SOC Analysts: We expect to see AI-driven capabilities such as "Security Copilots" enter the SOC. Dropzone AI and Prophet Security are two early examples of companies working towards this. This category extends the concept of an “AI Agent” to a SOC, meaning that the program goes beyond processing text and generating something new; it will actually perform the work of a SOC analyst, including prioritizing, triaging, and remediating critical alerts.
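To make the Detection-as-Code trend above more concrete, here is a minimal sketch of a detection written as a plain Python function with its own unit test, so it can live in version control and be reviewed like any other code. The rule name, the event fields, and the threshold are assumptions for illustration and do not reflect any particular vendor's rule format.

```python
from collections import Counter

RULE_ID = "brute_force_login"  # hypothetical rule name
FAILURE_THRESHOLD = 5          # assumed tuning value; adjust per environment


def rule(events):
    """Return the sorted users with at least FAILURE_THRESHOLD failed logins."""
    failures = Counter(
        e["user"] for e in events if e.get("action") == "login_failed"
    )
    return sorted(u for u, n in failures.items() if n >= FAILURE_THRESHOLD)


def test_rule():
    """Unit test that would run in CI before the detection is deployed."""
    noisy = [{"user": "mallory", "action": "login_failed"}] * 5
    quiet = [{"user": "alice", "action": "login_failed"}] * 2
    ok = [{"user": "bob", "action": "login_ok"}]
    assert rule(noisy + quiet + ok) == ["mallory"]
    assert rule([]) == []


test_rule()
print(f"{RULE_ID}: tests passed")
```

Because the threshold is an ordinary constant, tuning a detection for higher accuracy becomes a reviewed code change with a test attached rather than an edit made in a console.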
Summary
This is an important market to keep watching over the next few years. Our report has covered the limitations of traditional SIEMs (Security Information and Event Management systems), and the future of cloud-native SIEMs that leverage cloud storage and computing to manage larger data volumes more cost-effectively and flexibly. Our report also anticipates future market directions, highlighting the importance of multi-data cloud platforms, distributed data-lakes, and detection engineering, which are reshaping how organizations manage and analyze security data in a multi-cloud environment.
Another area to watch is the convergence of XDR and SOAR platforms into one AI-driven response and reporting tool. This market is moving very fast with advancements in AI/LLMs, and I will write more about it in future episodes. Thank you for reading if you made it this far.