FerroDruid

Rust-native Apache-Druid-compatible OLAP

Sub-second boot, <200 MB RAM

AWS service it replaces: Apache Druid cluster
Get it on AWS Marketplace

A Rust-native, Apache-Druid-spec-compatible real-time OLAP database. It speaks the Druid REST API, native query JSON, and Druid SQL, and reads/writes Druid segment v9/v10 binaries — without a JVM, without ZooKeeper, and without a six-process control plane. The single binary boots in under a second on under 200 MB of RAM.

A classic Apache Druid cluster needs six or more JVM processes, ZooKeeper, an external metadata database, and 16 GB or more of RAM before it serves a single query. FerroDruid's single-binary mode replaces all of that with one process that ships as a self-contained AMI.

v0.2.0 serves all eight native query types (timeseries, topN, groupBy, scan, search, segmentMetadata, dataSourceMetadata, timeBoundary); runs Druid SQL (SELECT, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, 30+ functions, EXPLAIN PLAN FOR, an MSQ task endpoint, ~95% core SQL parity); exposes 40+ Druid-compatible REST endpoints; reads and writes Druid segment v9/v10; and ingests from Kafka and Kinesis supervisors and via native batch. Basic auth (Argon2id) and RBAC are on by default, with TLS via rustls and a unique random admin password generated on first boot.

The problem

Apache Druid is a powerful real-time OLAP engine, but a classic cluster needs six or more JVM processes plus ZooKeeper plus an external metadata database, and 16 GB or more of RAM, before it serves a single query. Standing up, operating, and monitoring that six-process control plane is heavy, and it is overkill for evaluation environments and smaller deployments. You want Druid's API and segment format, but not the burden of running a JVM and ZooKeeper fleet.

How it works

  1. Boots as a single binary

    Single-binary mode runs one process — no JVM, no ZooKeeper, no external metadata database — that starts in under a second and uses under 200 MB of RAM. It uses SQLite for metadata and the local filesystem for deep storage, and ships as a self-contained AMI.

  2. Speaks the Druid wire protocol

    It speaks the Druid REST API, native query JSON, and Druid SQL, and reads and writes Druid segment v9/v10 binary files. It serves all eight native query types and exposes more than 40 Druid-compatible REST endpoints, so you can point existing Druid clients or an Apache Superset connector straight at it.

  3. Starts locked down, password change on first login

    Basic auth (Argon2id) and RBAC are on by default, with TLS via rustls. On first boot it generates a new random admin password unique to that instance (never a default or shared one) and writes it once to the instance system log. The admin account is flagged must-change, so every API endpoint returns HTTP 403 until the operator POSTs a new password, enforcing a change on first login.
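As a rough illustration of step 2, the sketch below builds a native timeseries query and wraps it in an HTTP request. It assumes FerroDruid accepts native queries at `/druid/v2`, the path Apache Druid uses; the host name, datasource name, and helper function names are hypothetical, and authentication headers are omitted for brevity.

```python
import json
import urllib.request


def timeseries_query(datasource, interval, granularity="hour"):
    """Build a Druid native timeseries query that counts rows per time bucket."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "intervals": [interval],  # ISO-8601 interval, e.g. "2024-01-01/2024-01-02"
        "granularity": granularity,
        "aggregations": [{"type": "count", "name": "rows"}],
    }


def native_query_request(base_url, query):
    """Wrap a native query in a POST request.

    Assumption: the native query path is /druid/v2, as in Apache Druid.
    """
    return urllib.request.Request(
        base_url.rstrip("/") + "/druid/v2",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and datasource; the request is built but not sent.
query = timeseries_query("events", "2024-01-01/2024-01-02")
request = native_query_request("https://ferrodruid.example.internal", query)
```

Passing `request` to `urllib.request.urlopen` would send it. Because Basic auth is on by default, a real call would also need an `Authorization` header.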

Highlights

Druid-spec wire-compatible (REST + native JSON + Druid SQL, segment v9/v10) — existing Druid clients and queries work.

One binary, no JVM / ZooKeeper / six-process control plane; sub-second boot on under 200 MB RAM.

8 native query types + Druid SQL (~95% core parity) + Kafka / Kinesis ingest; auth + RBAC on by default.

What's included

  • Self-contained Amazon Linux 2023 AMI (Graviton / arm64, supporting t4g, c7g, m7g, and r7g class instances)
  • Single-binary mode — one process with no JVM, ZooKeeper, or external metadata database, booting in under a second on under 200 MB of RAM (SQLite metadata plus local-filesystem deep storage)
  • All eight Druid native query types (timeseries, topN, groupBy, scan, search, segmentMetadata, dataSourceMetadata, timeBoundary) and more than 40 Druid-compatible REST endpoints
  • Druid SQL (SELECT / WHERE / GROUP BY / HAVING / ORDER BY / LIMIT, more than 30 functions, EXPLAIN PLAN FOR, an MSQ task endpoint, and approximately 95% core SQL parity)
  • Reads and writes Druid segment v9/v10 binary files, with ingestion from Kafka and Kinesis supervisors and via native batch
  • Security on by default — Basic auth (Argon2id) plus RBAC, TLS via rustls, and a random per-instance admin password generated on first boot that must be changed on first login
  • CloudFormation template for deployment behind an ALB (marketplace/cloudformation/ami.yaml), with single-binary single-node as the supported topology (multi-node fails closed by default)

Use cases

Teams that want Druid-compatible real-time OLAP without operating a six-process JVM and ZooKeeper cluster

A backend for existing clients that use the Druid REST API, native query JSON, Druid SQL, or an Apache Superset connector

Evaluation and development environments for Druid features using a lightweight binary that boots in under a second on under 200 MB

Single-node streaming and time-series analytics that ingest from Kafka and Kinesis supervisors or via native batch
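For the streaming use case, a Kafka supervisor is typically described by a JSON spec. The sketch below follows the Apache Druid supervisor spec layout and submission path (`/druid/indexer/v1/supervisor`). Which fields FerroDruid honors is an assumption here, and the topic, broker address, column names, and helper names are hypothetical.

```python
import json
import urllib.request


def kafka_supervisor_spec(datasource, topic, bootstrap_servers, dimensions):
    """Build a minimal Kafka supervisor spec in the Apache Druid layout."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes each event carries an ISO-8601 "timestamp" field.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "granularitySpec": {
                    "segmentGranularity": "hour",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


def supervisor_request(base_url, spec):
    """Wrap the spec in a POST to the supervisor endpoint (Apache Druid path)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/druid/indexer/v1/supervisor",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values; the request is built but not sent.
spec = kafka_supervisor_spec("clicks", "clicks-topic", "broker-1:9092", ["user", "page"])
request = supervisor_request("https://ferrodruid.example.internal", spec)
```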

FAQ

How compatible is it with real Apache Druid?

FerroDruid speaks the Druid REST API, native query JSON, and Druid SQL, and reads and writes segment v9/v10. It covers all eight native query types and more than 40 Druid-compatible REST endpoints, with approximately 95% core Druid SQL parity (not 100%). Live wire deep-match testing scored 5 of 5 against Apache Druid 30.0.1 and 5 of 5 with an Apache Superset connector. Scope: live validation covers Druid 30.0.1 and single-binary mode only; Druid 31 through 36 are spec-driven design targets that have not yet been cross-validated against a running cluster.
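One way to probe SQL compatibility from a client is sketched below. It assumes the SQL endpoint is `/druid/v2/sql` and accepts a JSON body with `query` and `resultFormat` keys, as in Apache Druid; the host, table name, and helper name are hypothetical. Setting `explain=True` prefixes the statement with EXPLAIN PLAN FOR.

```python
import json
import urllib.request


def sql_request(base_url, sql, explain=False):
    """Build a POST request carrying a Druid SQL statement.

    Assumption: the endpoint and body shape match Apache Druid's SQL API.
    """
    statement = f"EXPLAIN PLAN FOR {sql}" if explain else sql
    payload = {"query": statement, "resultFormat": "object"}
    return urllib.request.Request(
        base_url.rstrip("/") + "/druid/v2/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical table and host; the requests are built but not sent.
SQL = "SELECT page, COUNT(*) AS hits FROM events GROUP BY page ORDER BY hits DESC LIMIT 10"
run = sql_request("https://ferrodruid.example.internal", SQL)
plan = sql_request("https://ferrodruid.example.internal", SQL, explain=True)
```

Running the same statements against an existing Druid 30.0.1 cluster and comparing the responses is one way to check the parity claim for your own workload.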

Do I need a JVM, ZooKeeper, or an external metadata database?

Not in single-binary mode. One process boots in under a second and uses under 200 MB of RAM — in contrast to a classic Druid cluster, which needs six or more JVM processes plus ZooKeeper plus an external metadata database and 16 GB or more of RAM. The supported single-binary path uses SQLite for metadata and the local filesystem for deep storage.

Can I run a multi-node configuration?

The supported topology is single-binary single-node; multi-node configurations fail closed by default. Live validation covers single-binary mode only, and FerroDruid has not yet been validated as a running multi-node cluster. See docs/KNOWN_LIMITATIONS.md for details.

How is security handled, and what is the first-login flow?

Basic auth (Argon2id) and RBAC are on by default, with TLS via rustls. On first boot the AMI generates a new random admin password unique to that instance (never a default or shared one) and writes it once to the instance system log. The admin account is flagged must-change, so every API endpoint returns HTTP 403 until the operator POSTs a new password to /druid-ext/basic-security/authentication/db/basic/users/admin/credential. The rotated credential is persisted and survives restarts.
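The first-login rotation described above could look roughly like the sketch below. The credential path is the one quoted in this answer. The `{"password": ...}` request body mirrors the Apache Druid basic-security extension and is an assumption for FerroDruid, as are the host name and the helper name.

```python
import base64
import json
import urllib.request

# Path as given in the FAQ answer above.
CREDENTIAL_PATH = (
    "/druid-ext/basic-security/authentication/db/basic/users/admin/credential"
)


def rotate_admin_password_request(base_url, boot_password, new_password):
    """Build the POST that replaces the generated admin password.

    boot_password is the one-time value written to the instance system log.
    Assumption: the body shape is {"password": "<new value>"}.
    """
    token = base64.b64encode(f"admin:{boot_password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + CREDENTIAL_PATH,
        data=json.dumps({"password": new_password}).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host and passwords; the request is built but not sent.
request = rotate_admin_password_request(
    "https://ferrodruid.example.internal", "generated-at-boot", "a-long-new-secret"
)
```

Until this call succeeds, other endpoints are expected to answer with HTTP 403, so a provisioning script would normally run it before any query or ingestion request.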

How do I deploy it, and how do licensing and billing work?

Deploy it with the provided CloudFormation template (marketplace/cloudformation/ami.yaml) behind an Application Load Balancer; terminate TLS at the ALB and do not expose the service port directly to the internet. Point your clients (REST API, native query JSON, Druid SQL, or an Apache Superset connector) at the load balancer endpoint. This listing sells a hardened, scanned, supported distribution built from the Apache-2.0 source at a pinned release version; the code itself remains Apache-2.0. The AMI is metered automatically by AWS per running instance-hour, with no metering code in the product.

Pricing model

Hourly software fee plus EC2 charges (t4g / c7g / m7g / r7g class, Arm). Metered per instance type.
