S4 LogForge

Realistic SIEM test log generator

13 parser-faithful formats
Replaces: Hand-rolled SIEM test data
Get it on AWS Marketplace

Generate realistic, parser-faithful security logs in 13 formats at a controlled rate: backfill 30 days in minutes or stream in realtime. Built for SIEM PoCs, detection-rule development, dashboards, capacity sizing, and load testing. Includes correlated, MITRE ATT&CK-tagged attack scenarios and deterministic, reproducible output.

S4 LogForge generates security logs that are field-faithful to real devices and SIEM schemas, for when you need production-like data for a SIEM project but cannot use production logs. Each of the 13 output formats is verified end-to-end against real parsers (Elasticsearch ingest pipelines, Elastic integrations, and Logstash grok / kv / xml / CEF codec): RFC 3164 and RFC 5424 syslog; CEF (ArcSight-style); LEEF 2.0 (QRadar-style); PAN-OS 10.2 CSV; ECS 8.11 JSON; XDR telemetry JSON; Windows Event Log XML and Winlogbeat; CloudTrail; VPC Flow; Zeek; Suricata.
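To make "field-faithful" concrete for one of the listed formats, here is a minimal sketch of an RFC 5424 syslog line built by hand. It is not LogForge's generator; the function name, host, app, and message are hypothetical. Only the header layout and the PRI rule (facility × 8 + severity) come from RFC 5424.

```python
from datetime import datetime, timezone


def rfc5424_line(facility: int, severity: int, ts: datetime, host: str,
                 app: str, procid: str, msg: str, msgid: str = "-") -> str:
    """Format one RFC 5424 syslog line (hypothetical helper, illustration only)."""
    pri = facility * 8 + severity  # PRI value as defined by RFC 5424
    # RFC 3339 timestamp in UTC with millisecond precision
    stamp = ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    # "1" is the protocol version; the second "-" is nil STRUCTURED-DATA
    return f"<{pri}>1 {stamp} {host} {app} {procid} {msgid} - {msg}"


line = rfc5424_line(
    facility=4,   # security/authorization messages
    severity=6,   # informational
    ts=datetime(2024, 1, 15, 8, 30, 0, 123000, tzinfo=timezone.utc),
    host="fw01",
    app="sshd",
    procid="4242",
    msg="Accepted publickey for alice from 10.0.0.5",
)
print(line)
# <38>1 2024-01-15T08:30:00.123Z fw01 sshd 4242 - - Accepted publickey for alice from 10.0.0.5
```

A parser that expects RFC 5424 rejects or mis-tokenizes lines that get any of these header fields wrong, which is the failure mode hand-faked logs tend to hit.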

The problem

Building or validating a SIEM requires realistic security logs, yet production logs are sensitive and unavailable, and hand-faked logs neither survive real parsers nor carry known ground truth. S4 LogForge generates field-faithful, parser-verified test logs with built-in ground truth, so you can move a SIEM project forward without touching production data.

How it works

  1. Choose formats and scenarios

    Select from 13 output formats and MITRE ATT&CK-tagged attack scenarios, and author your own with the TOML DSL when needed.

  2. Generate backfill or realtime

    Backfill 30 days in minutes, or stream a realtime diurnal curve to file, syslog, Elasticsearch, or Splunk HEC.

  3. Measure detections vs ground truth

    Because injected scenarios are known ground truth, you can score detection and false-positive rates against it.
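The last step can be sketched as a small scoring function. This is an illustration under stated assumptions, not a LogForge API: the scenario IDs are hypothetical, each alert is assumed to be already mapped to the scenario that triggered it (or to None for baseline noise), and "false-positive rate" is taken here to mean the share of alerts that fired on baseline noise. Other definitions exist.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Score:
    detection_rate: float       # detected scenarios / injected scenarios
    false_positive_rate: float  # alerts on baseline noise / all alerts


def score(injected: Set[str], alerts: List[Optional[str]]) -> Score:
    """Score SIEM alerts against injected ground truth.

    injected: IDs of the scenarios that were injected (hypothetical IDs).
    alerts:   one entry per alert; the scenario ID it maps to,
              or None when it fired on baseline noise.
    """
    detected = {a for a in alerts if a in injected}
    false_alerts = sum(1 for a in alerts if a not in injected)
    return Score(
        detection_rate=len(detected) / len(injected) if injected else 0.0,
        false_positive_rate=false_alerts / len(alerts) if alerts else 0.0,
    )


result = score(
    injected={"scn-001", "scn-002", "scn-003", "scn-004"},
    alerts=["scn-001", "scn-001", "scn-003", None, "scn-004", None],
)
print(f"detection rate: {result.detection_rate:.2f}")            # 0.75
print(f"false-positive rate: {result.false_positive_rate:.2f}")  # 0.33
```

Duplicate alerts for the same scenario count once toward detection, so a noisy rule does not inflate the detection rate.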

Highlights

13 parser-faithful formats — syslog 3164/5424, CEF, LEEF, PAN-OS CSV, ECS JSON, Windows Event/Winlogbeat, CloudTrail, VPC Flow, Zeek, Suricata, XDR telemetry — each verified against real parsers, not just 'looks like a log'.

Correlated, MITRE ATT&CK-tagged attack scenarios injected into realistic baseline noise, plus a TOML DSL to author your own — measure detection and false-positive rates against known ground truth.

Deterministic and rate-controlled: same seed reproduces byte-identical data; sustain 188k–1.6M events/sec, backfill 30 days in minutes, or stream a realtime diurnal curve to file, syslog, Elasticsearch or Splunk HEC.
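The source does not describe the shape of the diurnal curve. As one plausible reading, the sketch below models the target rate as a cosine over the hour of day; the function name, the 14:00 peak, and the 60% swing are assumptions chosen for illustration.

```python
import math


def diurnal_rate(hour: float, mean_eps: float,
                 swing: float = 0.6, peak_hour: float = 14.0) -> float:
    """Target events/sec at a given hour of day (0-24) on a cosine curve.

    swing is the fractional deviation from the mean at peak and trough;
    all parameter values here are illustrative assumptions.
    """
    phase = 2 * math.pi * (hour - peak_hour) / 24
    return mean_eps * (1 + swing * math.cos(phase))


for h in (2, 8, 14, 20):
    print(f"{h:02d}:00 -> {diurnal_rate(h, mean_eps=1000):.0f} events/sec")
```

With these defaults the rate peaks at 1.6 times the mean in mid-afternoon and drops to 0.4 times the mean twelve hours later, while the daily average stays at the mean.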

What's included

  • 13 parser-verified output formats (RFC 3164/5424 syslog, CEF, LEEF 2.0, PAN-OS 10.2 CSV, ECS 8.11 JSON, XDR telemetry JSON, Windows Event Log XML / Winlogbeat, CloudTrail, VPC Flow, Zeek, Suricata)
  • End-to-end verification against real parsers including Elasticsearch ingest, Elastic integration pipelines, and Logstash grok/kv/xml/CEF codecs
  • Correlated, MITRE ATT&CK-tagged attack scenarios injected into realistic baseline noise, with a TOML DSL to author your own
  • Deterministic seeded reproducibility: the same seed reproduces byte-identical data
  • Throughput of 188k–1.6M events/sec, with 30-day backfill generated in minutes
  • Output sinks: file, syslog, Elasticsearch, and Splunk HEC

Use cases

Run SIEM PoCs and evaluations without production logs

Develop and tune detection rules against known ground truth

Build and validate dashboards on representative data

Perform capacity sizing and load testing

FAQ

Are the logs realistic enough?

All 13 formats are verified end-to-end against real parsers such as Elasticsearch ingest, Elastic integration pipelines, and Logstash grok/kv/xml/CEF codecs. They are field-faithful to real devices and SIEM schemas, not merely something that looks like a log.

Can I reproduce runs?

Yes. Generation is deterministic: the same seed reproduces byte-identical data.
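The property is easy to check by hashing two runs. The toy generator below is not LogForge's; it is a minimal sketch of seeded generation, using a private random number generator and no wall-clock input, to show what "byte-identical" means in practice.

```python
import hashlib
import random


def generate(seed: int, count: int) -> bytes:
    """Toy seeded log generator (illustration only, not LogForge's)."""
    rng = random.Random(seed)  # private RNG: no global state, no clock
    actions = ["allow", "deny", "drop"]
    ports = [22, 53, 80, 443]
    lines = []
    for i in range(count):
        src = f"10.0.{rng.randrange(256)}.{rng.randrange(1, 255)}"
        lines.append(
            f"seq={i} src={src} dport={rng.choice(ports)} action={rng.choice(actions)}"
        )
    return ("\n".join(lines) + "\n").encode("utf-8")


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


same_a = digest(generate(seed=42, count=1000))
same_b = digest(generate(seed=42, count=1000))
other = digest(generate(seed=43, count=1000))

print(same_a == same_b)  # True: same seed, identical bytes
print(same_a == other)   # False: a different seed changes the output
```

Comparing digests of two output files produced with the same seed is a quick way to confirm a run was reproduced exactly.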

How do I measure detection quality?

Correlated, MITRE ATT&CK-tagged scenarios serve as known ground truth, so you can measure detection and false-positive rates against it.

What can it feed?

It can output to file, syslog, Elasticsearch, and Splunk HEC, for both backfill and realtime streaming.
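For orientation on the Splunk HEC sink, the sketch below builds (but does not send) a batched request the way a client would post events to Splunk's HTTP Event Collector itself. It does not show how LogForge's sink is implemented or configured; the host, token, and sourcetype are placeholders.

```python
import json
import urllib.request
from typing import Dict, List


def build_hec_request(base_url: str, token: str, events: List[Dict],
                      sourcetype: str) -> urllib.request.Request:
    """Build a POST to the HEC event endpoint (placeholder values throughout)."""
    # HEC accepts several events in one request as concatenated JSON objects.
    body = "".join(
        json.dumps({"event": e, "sourcetype": sourcetype}) for e in events
    )
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/services/collector/event",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_hec_request(
    base_url="https://splunk.example.com:8088",    # placeholder host
    token="00000000-0000-0000-0000-000000000000",  # placeholder token
    events=[{"action": "deny", "src": "10.0.0.5"},
            {"action": "allow", "src": "10.0.0.6"}],
    sourcetype="example:test",                     # placeholder sourcetype
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that call is left out so the example runs without a Splunk instance.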

How fast, and how much data?

It sustains 188k–1.6M events/sec and backfills 30 days of data in minutes.
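How long a backfill takes depends on how many events the 30 days contain, which the figures above do not specify. The arithmetic below shows how duration scales with the average event rate of the simulated environment; the 100 and 1,000 events/sec averages are assumed values for illustration.

```python
SECONDS_PER_DAY = 86_400


def backfill_seconds(avg_eps: float, days: int, gen_rate_eps: float) -> float:
    """Wall-clock seconds to generate `days` of logs.

    avg_eps:      average events/sec of the simulated environment (assumed)
    gen_rate_eps: sustained generation rate in events/sec
    """
    total_events = avg_eps * days * SECONDS_PER_DAY
    return total_events / gen_rate_eps


for avg_eps in (100, 1_000):
    slow = backfill_seconds(avg_eps, days=30, gen_rate_eps=188_000)
    fast = backfill_seconds(avg_eps, days=30, gen_rate_eps=1_600_000)
    print(f"{avg_eps} events/sec average: "
          f"{fast / 60:.1f} to {slow / 60:.1f} minutes")
```

The same function works for capacity sizing in the other direction: multiply the average rate by the retention period to get the event count a SIEM cluster has to hold.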

Pricing model

Hourly software fee plus EC2 instance charges (t3 class and up). The software fee is metered per instance type, and no license keys are required.
