SigmaGen: AI-Powered ATT&CK-Mapped Threat Detection with Sigma Rules

Lucas

Introduction

As cyber threats evolve, the ability to detect and respond to adversarial activity effectively depends on well-structured and frequently updated detection rules. The MITRE ATT&CK® framework provides a standardized way to classify adversarial tactics, techniques, and procedures (TTPs), but security teams still face challenges in operationalizing ATT&CK within their detection pipelines.

Manually creating Sigma rules, a widely adopted open format for SIEM detection rules, requires deep expertise, time, and continuous updates as new threats emerge. This process is slow, inconsistent, and prone to human error.

SigmaGen addresses these limitations by automating the generation of high-precision Sigma rules using fine-tuned large language models (LLMs). By ingesting security blogs, threat intelligence feeds, and attack reports, SigmaGen extracts MITRE ATT&CK techniques and maps them to structured Sigma detection rules, dramatically improving the speed, scalability, and accuracy of threat detection.


Challenges in Threat Detection

Despite advancements in security operations, organizations still face the following key issues when dealing with threat detection rules:

1. Manual and Error-Prone Rule Generation

Security analysts must manually extract attack patterns from unstructured threat intelligence (e.g., security blogs, malware analyses, and adversary reports) and then translate them into Sigma rules. This process is:

  • Slow: A single rule can take hours to research, write, and test.

  • Inconsistent: Different analysts may interpret threats differently, leading to detection gaps.

  • Reactive: New attack techniques are often detected too late due to slow rule deployment.

2. MITRE ATT&CK Mapping Is Not Standardized

While MITRE ATT&CK provides a structured knowledge base of adversary behaviors, manually mapping unstructured threat intelligence to ATT&CK techniques is challenging. Many security teams:

  • Struggle to extract the right ATT&CK techniques from reports.

  • Lack an automated way to enrich detection rules with ATT&CK metadata.

  • Have difficulty keeping Sigma rules aligned with the latest ATT&CK updates.

3. Detection Rules Are Not Updated Frequently

Threat actors continuously evolve their tactics, making old detection rules obsolete. Without automated rule updates, SOC teams risk:

  • Missing new adversary behaviors.

  • Relying on outdated detection logic.

  • Seeing more false positives and false negatives.


SigmaGen: AI-Powered ATT&CK Mapping & Sigma Rule Generation

SigmaGen is an LLM-driven system that automates the process of extracting ATT&CK techniques from unstructured threat intelligence and converting them into Sigma rules.

Key Capabilities:

  • Processes multiple data sources (security blogs, attack reports, threat intelligence feeds).

  • Extracts key threat data and maps it to ATT&CK techniques.

  • Generates highly accurate Sigma rules tailored to specific ATT&CK techniques.

  • Ensures continuous rule updates based on newly emerging threats.


SigmaGen’s Technical Architecture

SigmaGen follows a structured pipeline to automate rule generation while ensuring high accuracy and relevance:

1. Data Ingestion & Threat Intelligence Processing

  • Sources: SigmaGen processes raw data from multiple sources:

    • Security blogs and reports (e.g., The DFIR Report, Microsoft Threat Intelligence)

    • Adversary campaign documentation

    • Threat intelligence feeds

    • MITRE ATT&CK knowledge base

  • Preprocessing Steps:

    • Extract structured and unstructured data from text-based sources.

    • Normalize data using natural language processing (NLP) techniques.

    • Identify attack indicators, tactics, and techniques.
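
The preprocessing steps above can be pictured with a minimal sketch. It is not SigmaGen's actual implementation: the function names and regular expressions are hypothetical, and a regex pass only stands in for the NLP-based normalization described here. It normalizes raw report text, re-fangs common defanged indicators, and pulls out ATT&CK technique IDs, IPv4 addresses, and SHA-256 hashes.

```python
import re
import unicodedata

# Hypothetical patterns for illustration; a production pipeline would need broader coverage.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization, re-fang indicators, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = text.replace("[.]", ".").replace("hxxp", "http")
    return re.sub(r"\s+", " ", text).strip()


def extract_indicators(text: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted indicators found in the normalized text."""
    clean = normalize(text)
    return {
        "techniques": sorted(set(TECHNIQUE_ID.findall(clean))),
        "ipv4": sorted(set(IPV4.findall(clean))),
        "sha256": sorted(set(SHA256.findall(clean))),
    }


if __name__ == "__main__":
    report = "The actor ran ngrok (T1090) and connected to 10.0.0[.]5."
    print(extract_indicators(report))
```

Explicit technique IDs are the easy case. Reports that describe behavior without naming a technique are what the mapping stage in the next step is meant to handle.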

2. ATT&CK Technique Extraction & Mapping

SigmaGen applies advanced natural language processing (NLP) and deep learning techniques to automatically extract adversary behaviors from unstructured threat intelligence sources and map them to MITRE ATT&CK techniques. The process involves:

  • Named Entity Recognition (NER) and Contextual Analysis – SigmaGen uses a fine-tuned transformer-based language model to identify adversary tactics, techniques, and procedures (TTPs) within security blogs, research reports, and incident analyses.

  • Multi-Stage ATT&CK Mapping Pipeline – The system applies hierarchical classification models and semantic similarity scoring to map extracted behaviors to ATT&CK techniques and sub-techniques with high confidence.

  • Confidence Scoring & Threat Context Understanding – SigmaGen employs attention-based ranking models to assign a confidence score to each identified ATT&CK technique, ensuring that only the most relevant mappings are retained.

  • Cross-Referencing with ATT&CK Knowledge Base – The extracted techniques are validated against the latest MITRE ATT&CK framework using vectorized similarity search and graph-based embeddings, ensuring that the mappings remain accurate and up to date.

Example Mapping Output:

{
  "technique_id": "T1090",
  "technique_name": "Proxy",
  "description": "Attackers use proxies to obfuscate communication and evade detection.",
  "score": 0.95
}
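
To make the similarity scoring and confidence threshold more concrete, here is a minimal sketch that produces records shaped like the output above. It rests on loud assumptions: bag-of-words cosine similarity stands in for the learned embeddings and ranking models, the two-entry catalog is hypothetical with paraphrased descriptions, and the 0.2 threshold is arbitrary.

```python
import math
import re
from collections import Counter

# Tiny hypothetical catalog; descriptions are paraphrased for illustration only.
CATALOG = {
    "T1090": ("Proxy", "adversaries use a proxy to relay and obfuscate network traffic"),
    "T1059": (
        "Command and Scripting Interpreter",
        "adversaries abuse command and script interpreters to execute commands",
    ),
}


def bag_of_words(text):
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def map_to_techniques(behavior, threshold=0.2):
    """Score a behavior against every catalog entry and keep matches above the threshold."""
    query = bag_of_words(behavior)
    results = []
    for technique_id, (name, description) in CATALOG.items():
        score = cosine(query, bag_of_words(description))
        if score >= threshold:
            results.append(
                {"technique_id": technique_id, "technique_name": name, "score": round(score, 2)}
            )
    return sorted(results, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    print(map_to_techniques("attackers relay traffic through a proxy to obfuscate communication"))
```

The threshold plays the role of the confidence cut-off: mappings below it are dropped rather than passed on to rule generation.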

3. Sigma Rule Generation & Optimization

Once ATT&CK techniques have been extracted, SigmaGen translates them into high-precision Sigma detection rules through a structured AI-driven rule generation pipeline:

  • Adaptive Threat Rule Refinement – SigmaGen applies fine-tuning techniques to refine detection rules based on evolving threat intelligence, allowing dynamic rule updates without requiring full retraining.

  • Automated False Positive Reduction – A built-in pipeline compares newly generated Sigma rules against historical detection patterns, reducing redundant or overly broad detections.

  • Rule Verification Using Simulated Attack Scenarios – SigmaGen integrates automated adversarial simulation (e.g., Atomic Red Team test cases) to validate generated rules against known attack techniques, ensuring that detection logic is both actionable and robust. More details about the rule verification process, including adversarial testing methods and SIEM validation, will be published later.
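
One way to picture the redundancy check is a structural comparison between a candidate rule and an existing one. The sketch below is an assumption about how such a check could work, not a description of SigmaGen's pipeline. Rules are plain dictionaries with a single selection map, and the helper names are hypothetical. Because the entries of a Sigma selection map are AND-ed, a candidate with a strict subset of an existing rule's constraints matches a superset of events and is flagged as broader.

```python
def selection_items(rule):
    """Flatten a rule's single 'selection' map into a set of (field, value) pairs."""
    return {(key, str(value)) for key, value in rule["detection"]["selection"].items()}


def compare(candidate, existing):
    """Classify a candidate rule relative to an existing rule."""
    if candidate["logsource"] != existing["logsource"]:
        return "unrelated"
    new, old = selection_items(candidate), selection_items(existing)
    if new == old:
        return "duplicate"
    if new < old:
        return "broader"  # fewer AND-ed constraints, so it matches a superset of events
    if new > old:
        return "narrower"
    return "distinct"


if __name__ == "__main__":
    logsource = {"product": "windows", "category": "process_creation"}
    existing = {
        "logsource": logsource,
        "detection": {
            "selection": {"Image|endswith": "\\ngrok.exe", "CommandLine|contains": "tcp "},
            "condition": "selection",
        },
    }
    candidate = {
        "logsource": logsource,
        "detection": {"selection": {"Image|endswith": "\\ngrok.exe"}, "condition": "selection"},
    }
    print(compare(candidate, existing))
```

A check like this only catches structural overlap. Measuring real false-positive rates would still require replaying rules against historical log data.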

Example Sigma Rule Generated by SigmaGen:

title: Ngrok TCP Tunnel Execution
id: 8e0fbc04-6e4d-4c4b-9f88-d0bcfc5bc2e1
status: test
description: Detects execution of the Ngrok command-line client (ngrok.exe) to create a TCP tunnel.
logsource:
  product: windows
  category: process_creation
detection:
  selection:
    Image|endswith: '\ngrok.exe'
    CommandLine|contains: 'tcp '
  condition: selection
falsepositives:
  - Unknown
level: high
references:
  - https://thedfirreport.com/2024/02/06/real-intrusions-by-real-attackers-the-truth-behind-the-intrusion/
author: SigmaGen
tags:
  - attack.command_and_control
  - attack.t1090

This rule flags process-creation events in which ngrok.exe is launched with a tcp argument. Ngrok tunnels are commonly used for C2 (Command and Control) communications, so any such execution deserves review.
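
To show how the selection in this rule behaves, the following minimal sketch evaluates it against sample process-creation events. It is an illustrative evaluator written for this example, not a Sigma backend. It supports only the endswith and contains modifiers plus exact match, and it compares case-insensitively, which is Sigma's default for string values. The sample events are made up.

```python
def field_matches(key, expected, event):
    """Evaluate one 'Field|modifier: value' pair against an event, case-insensitively."""
    field, _, modifier = key.partition("|")
    actual = str(event.get(field, "")).lower()
    expected = expected.lower()
    if modifier == "endswith":
        return actual.endswith(expected)
    if modifier == "contains":
        return expected in actual
    if modifier == "":
        return actual == expected
    raise ValueError(f"unsupported modifier: {modifier}")


def selection_matches(selection, event):
    """All entries of a selection map must match (logical AND)."""
    return all(field_matches(key, value, event) for key, value in selection.items())


# The selection from the example rule, expressed as a Python dict.
SELECTION = {"Image|endswith": "\\ngrok.exe", "CommandLine|contains": "tcp "}

if __name__ == "__main__":
    events = [
        {"Image": "C:\\Tools\\ngrok.exe", "CommandLine": "ngrok.exe tcp 3389"},
        {"Image": "C:\\Tools\\ngrok.exe", "CommandLine": "ngrok.exe http 80"},
    ]
    for event in events:
        print(event["CommandLine"], "->", selection_matches(SELECTION, event))
```

Only the first event satisfies both conditions. An HTTP tunnel started from the same binary passes unnoticed, which shows how narrowly this rule is scoped.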

Operational Efficiency & Cost Optimization

SigmaGen has been designed for scalability and cost-efficiency, allowing for automated rule updates at a low operational cost.

At the time of cost calculation, the model in use was GPT-4o-mini, optimized for performance and affordability. The following estimates outline the token usage and projected expenses for SigmaGen's rule generation and hosting:

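| Category | Token Usage | Cost/Prompt ($) | Estimated Prompts/Month | Cost/Month ($) | Cost/Year ($) |
|---|---|---|---|---|---|
| Input | 13,359 | 0.0022 | 800 | 1.76 | 21.16 |
| Output | 370 | 0.00024 | 800 | 0.195 | 2.34 |
| Hosting | N/A | N/A | N/A | 27.2 | 326.4 |
| Total | - | - | - | $29 | $350 |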
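
The arithmetic behind these estimates can be reproduced with a short sketch. The constant and function names are hypothetical, and the per-prompt costs are the rounded figures listed above. Individual line items can therefore differ from the listed ones in the last digit, but the rounded totals agree.

```python
PROMPTS_PER_MONTH = 800
COST_PER_PROMPT = {"input": 0.0022, "output": 0.00024}  # USD, rounded figures from the estimates
HOSTING_PER_MONTH = 27.2  # USD


def monthly_cost():
    """LLM cost (per-prompt cost times prompt volume) plus fixed hosting."""
    llm = sum(cost * PROMPTS_PER_MONTH for cost in COST_PER_PROMPT.values())
    return llm + HOSTING_PER_MONTH


def yearly_cost():
    """Twelve months at the same prompt volume and hosting price."""
    return monthly_cost() * 12


if __name__ == "__main__":
    print(f"monthly: ${monthly_cost():.2f}, yearly: ${yearly_cost():.2f}")
```

Hosting accounts for most of the bill. At 800 prompts per month, model usage comes to roughly two dollars.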

Future Enhancements

SigmaGen is continuously evolving, with planned improvements including:

  • Improved Rule Creation Accuracy – Enhance the fine-tuned LLM to reduce false positives and false negatives, ensuring more precise Sigma rules.

  • Automated Validation & Deployment – Implement automated testing of generated Sigma rules against real-world attack datasets before deployment.

  • Media-to-Text Extraction – Introduce processing for images, screenshots, and embedded text in reports to extract richer threat intelligence.

  • Optimized Dataset Processing – Reduce computational costs while maintaining high accuracy in threat data extraction and Sigma rule generation.


Conclusion

SigmaGen transforms MITRE ATT&CK mapping and Sigma rule generation by automating, optimizing, and accelerating the process of operationalizing threat intelligence. By leveraging LLMs trained on real-world adversary behaviors, SigmaGen ensures high-fidelity detection rules that keep up with emerging attack techniques.

With precise ATT&CK mappings, automated rule updates, and SIEM-ready Sigma rules, SigmaGen empowers SOC teams, blue teams, and threat hunters to detect, track, and respond to adversaries faster than ever before.

🚀 SigmaGen is the future of AI-driven threat-informed defense.
