FerroStash
Rust-native Logstash-compatible log pipeline
FerroStash is a Rust-native, Logstash-compatible log and event pipeline. It ingests, transforms, and routes events through the same input → filter → output model as Logstash, and it parses the Logstash pipeline.conf DSL (and an equivalent YAML form) natively, without a JVM. The single static binary (about 14 MB) starts in milliseconds and uses tens of MB of RAM, so you can pack far more shippers per host.
Where a typical Logstash pipeline holds about a gigabyte of JVM heap and takes tens of seconds to start, FerroStash runs as a single static binary. The v1.0 line implements the production-common subset of the Logstash 9.x plugin set: about 88 percent of the bundled plugins (98 of 111), weighted toward the parsing and filtering hot path.
- Inputs include beats, file, tcp, udp, http, syslog, kafka, redis, s3, sqs, jdbc, elasticsearch, and cloudwatch
- Filters include grok, dissect, kv, json, mutate, date, geoip, dns, csv, xml, useragent, cidr, fingerprint, translate, aggregate, and throttle, plus a native Painless-style script filter
- Outputs include elasticsearch / opensearch, kafka, s3, http, tcp, udp, file, redis, sqs, sns, cloudwatch, email, and datadog
- Codecs include json, json_lines, multiline, cef, netflow, avro, msgpack, and protobuf
For reliability it offers an optional on-disk persistent queue with at-least-once delivery and a dead-letter queue (opt-in fsync for power-loss durability) and a built-in monitoring API for node and pipeline stats.
Honest scope: it is Logstash config/pipeline compatible, not a byte-identical 100 percent drop-in. Coverage is plugin-level, a covered plugin may implement a subset of its options, and a config using a missing plugin fails fast at load.
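To make the input → filter → output model concrete, here is a minimal Rust sketch. It is an illustration only, not FerroStash's implementation: the `Event`, `Filter`, and `Pipeline` types and the `add_field` and `drop_debug` helpers are hypothetical names invented for this example, and the outputs are in-memory stand-ins.

```rust
use std::collections::BTreeMap;

/// An event is modeled as a flat map of string fields (a simplification).
type Event = BTreeMap<String, String>;

/// A filter transforms an event, or drops it by returning None.
type Filter = Box<dyn Fn(Event) -> Option<Event>>;

/// Hypothetical pipeline: filters run in order, then every surviving
/// event is copied to each output.
struct Pipeline {
    filters: Vec<Filter>,
    outputs: Vec<Vec<Event>>, // in-memory stand-ins for real outputs
}

impl Pipeline {
    fn run(&mut self, input: impl IntoIterator<Item = Event>) {
        for event in input {
            // Filter stage: stop early once any filter drops the event.
            let mut current = Some(event);
            for filter in &self.filters {
                current = current.and_then(|e| filter(e));
            }
            // Output stage: fan the event out to every output.
            if let Some(e) = current {
                for out in self.outputs.iter_mut() {
                    out.push(e.clone());
                }
            }
        }
    }
}

/// Adds a fixed field to every event (loosely like a mutate-style filter).
fn add_field(key: &str, value: &str) -> Filter {
    let (k, v) = (key.to_string(), value.to_string());
    Box::new(move |mut e: Event| {
        e.insert(k.clone(), v.clone());
        Some(e)
    })
}

/// Drops events whose `level` field equals "debug".
fn drop_debug() -> Filter {
    Box::new(|e: Event| {
        if e.get("level").map(String::as_str) == Some("debug") {
            None
        } else {
            Some(e)
        }
    })
}
```

Building a `Pipeline` with `vec![drop_debug(), add_field("env", "prod")]` and one output, then calling `run` on a batch of events, leaves only the non-debug events in the output, each carrying the added `env` field.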
The problem
Logstash is a flexible log and event pipeline, but the usual deployment holds about a gigabyte of JVM heap, takes tens of seconds to start, and runs a separate JVM agent runtime per shipper. That caps how many shippers fit on a host and weighs heavily on memory and startup time, especially in containers. The need is to keep existing pipeline.conf and plugin investments while shedding the JVM footprint and running lean.
How it works
1. Runs input → filter → output natively
FerroStash ingests, transforms, and routes events through the same input → filter → output model as Logstash. It parses the Logstash pipeline.conf DSL (and an equivalent YAML form) natively — without a JVM and without a separate agent runtime — and runs as a single static binary.
2. Implements the production-common plugins
The v1.0 line implements 98 of the 111 bundled Logstash 9.x plugins (about 88%), weighted toward the parsing and filtering hot path. It covers the major inputs, filters, outputs, and codecs and includes a native Painless-style script filter. A config that uses a missing plugin fails fast at load, so there is no silent drop.
3. Reliability and monitoring
An optional on-disk persistent queue provides at-least-once delivery (read/ack cursor separation, checkpoint-after-output-ack) and a dead-letter queue, with opt-in fsync for power-loss durability. A built-in monitoring API exposes node and pipeline stats.
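The at-least-once behavior described in the reliability step (read/ack cursor separation, checkpoint-after-output-ack) can be sketched as follows. This is a minimal in-memory model under stated assumptions, not FerroStash's on-disk format or API: the `Queue` type and its methods are hypothetical, and a real queue would persist both the entries and the checkpoint.

```rust
/// In-memory stand-in for a persistent queue with separate cursors.
struct Queue {
    entries: Vec<String>, // appended events; a real queue stores these on disk
    read: usize,          // next entry to hand to the pipeline
    checkpoint: usize,    // entries below this index are acknowledged by the output
}

impl Queue {
    fn new() -> Self {
        Queue { entries: Vec::new(), read: 0, checkpoint: 0 }
    }

    fn push(&mut self, event: &str) {
        self.entries.push(event.to_string());
    }

    /// Hands out the next unread entry with its sequence number.
    /// Only the read cursor advances; the checkpoint is untouched.
    fn read_next(&mut self) -> Option<(usize, String)> {
        let seq = self.read;
        let event = self.entries.get(seq)?.clone();
        self.read += 1;
        Some((seq, event))
    }

    /// Called after the output confirms delivery up to and including `seq`.
    /// The checkpoint moves only here, never at read time.
    fn ack(&mut self, seq: usize) {
        if seq < self.read && seq + 1 > self.checkpoint {
            self.checkpoint = seq + 1;
        }
    }

    /// Simulates a crash and restart: reading resumes from the checkpoint,
    /// so entries that were read but never acknowledged are delivered again.
    fn restart(&mut self) {
        self.read = self.checkpoint;
    }
}
```

Because the checkpoint trails the read cursor, a crash between read and ack causes redelivery rather than loss. That is why the guarantee is at-least-once: downstream consumers may see duplicates and should tolerate them.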
Highlights
Logstash config/pipeline compatible — parses the pipeline.conf DSL (and YAML) natively and implements about 88% of the bundled Logstash 9.x plugins (98 of 111).
Single static binary (~14 MB), no JVM; millisecond startup and tens of MB of RAM let you pack more shippers per host.
Reliability: an on-disk persistent queue (at-least-once + DLQ, opt-in fsync) and a built-in monitoring API. Honest scope — config-compatible, not a byte-identical drop-in.
What's included
- Hardened Amazon Linux 2023 AMI (arm64 / Graviton, runs on t4g / c7g / m7g / r7g class)
- A single static Rust binary (about 14 MB, no JVM, millisecond startup) implementing a Logstash-compatible log and event pipeline
- A native parser for the Logstash pipeline.conf DSL and an equivalent YAML form
- 98 of the 111 bundled Logstash 9.x plugins (about 88%): inputs (beats, file, tcp, udp, http, syslog, kafka, redis, s3, sqs, jdbc, elasticsearch, cloudwatch, and more), filters (grok, dissect, kv, json, mutate, date, geoip, and more, plus a Painless-style script filter), outputs (elasticsearch / opensearch, kafka, s3, http, file, and more), and codecs (json, json_lines, multiline, cef, netflow, avro, msgpack, protobuf)
- An optional on-disk persistent queue with at-least-once delivery and a dead-letter queue (opt-in fsync)
- A built-in monitoring API exposing node and pipeline stats
- No separate control plane, no telemetry home-call, and no license-key check; billing is per instance per hour through your AWS bill
Use cases
Keeping existing Logstash pipeline.conf and plugin investments while running lean without the JVM footprint
Container or edge environments where memory and startup time are constraints and you want more shippers per host
Pipelines that parse logs with grok / dissect / json and route to Elasticsearch / OpenSearch / Kafka / S3 and similar
Running a log pipeline entirely inside your own VPC, with no separate control plane and no telemetry home-call
FAQ
How compatible is it with Logstash?
FerroStash is Logstash config/pipeline compatible, not a byte-identical 100% drop-in. It parses the pipeline.conf DSL natively and implements 98 of the 111 bundled Logstash 9.x plugins (about 88%). Coverage is plugin-level, and a covered plugin may implement only a subset of its options. A config that uses a missing plugin fails fast at load.
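One way to picture the fail-fast behavior is a load-time check that compares every plugin a config references against the set of implemented plugins and refuses to start if any are missing. The Rust sketch below is an assumption about the general shape of such a check, not FerroStash's loader: `PluginRef`, `validate_plugins`, and the contents of the supported set are hypothetical.

```rust
use std::collections::HashSet;

/// A plugin reference extracted from a parsed config (hypothetical shape).
struct PluginRef {
    section: &'static str, // "input", "filter", or "output"
    name: String,
}

/// Returns Ok if every referenced plugin is supported; otherwise returns
/// the full list of missing plugins so the load can be rejected up front.
fn validate_plugins(
    refs: &[PluginRef],
    supported: &HashSet<&str>,
) -> Result<(), Vec<String>> {
    let missing: Vec<String> = refs
        .iter()
        .filter(|r| !supported.contains(r.name.as_str()))
        .map(|r| format!("{}/{}", r.section, r.name))
        .collect();
    if missing.is_empty() {
        Ok(())
    } else {
        Err(missing)
    }
}
```

Rejecting the whole config at load, rather than skipping the unknown stage, is what prevents events from silently bypassing a filter or output the operator expected to run.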
Does it need a JVM?
No. FerroStash runs as a single static Rust binary (about 14 MB) with no JVM and no separate agent runtime. It starts in milliseconds and holds tens of MB of RAM, so you can pack more shippers per host.
Are there delivery guarantees?
An optional on-disk persistent queue provides at-least-once delivery (read/ack cursor separation, checkpoint-after-output-ack) and a dead-letter queue. Opt-in fsync can be enabled for power-loss durability.
Is this a replacement for an AWS service?
No. Logstash is an open-source project from Elastic, not an AWS service. FerroStash is a self-managed product that runs a Logstash-compatible log pipeline on your own Amazon EC2 instances. It ships as a standard Amazon Linux 2023 AMI that you run inside your own VPC.
How is it billed?
An hourly AMI software fee plus the cost of the EC2 instance you choose (t4g / c7g / m7g / r7g class, Arm). Billing is metered per instance type, with no separate control plane, no telemetry home-call, and no license-key check.
Pricing model
Hourly software fee + EC2 (t4g / c7g / m7g / r7g class, Arm). Metered per instance type.