Surveilr as a Lightweight, SQL-Native Message Queue

This document describes how surveilr can function as a lightweight, SQL-native message queue and broker for compliance-as-code and general compliance workflows. The MQ design intentionally leverages surveilr's built-in ingestion and watch capabilities and its SQLite RSSD storage model, rather than introducing a standalone MQ cluster.

Goals and Positioning

The goal is **not** to compete with high-throughput, horizontally scalable MQ systems (Kafka, Pulsar, large RabbitMQ deployments). The goal is to provide a lightweight MQ that is:

SQL-first and queryable at rest
Durable by default (SQLite RSSDs)
Easy to deploy behind firewalls, on laptops, servers, or regulated enclaves
Naturally auditable and evidence-friendly
Compatible with ubiquitous payload formats and simple operational protocols

This fits compliance-as-code patterns particularly well: compliance events, control evidence, continuous checks, scan outputs, policy attestations, workflow state transitions, UAT artifacts, and other "small to medium volume but high value" messages.

Ubiquitous Payload Formats

Rather than inventing a bespoke message format, surveilr favors widely used payload shapes and envelopes:

MQTT-style payloads: topic + payload + timestamp + optional headers
SQS-style payloads: message body + attributes + receipt handles + visibility timeouts
CloudEvents-style metadata: id, source, type, subject, time, datacontenttype, data

The strategy is to accept payloads that resemble what teams already emit, then normalize into a consistent internal "message record" model in SQLite.
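
As a rough illustration of that normalization step, the sketch below maps the three payload shapes onto one record. The function name `normalize`, the output keys, and the way each shape is detected are assumptions made for this example; they are not part of surveilr.

```python
import hashlib
import json
from datetime import datetime, timezone


def normalize(raw: dict) -> dict:
    """Map an MQTT-, SQS-, or CloudEvents-shaped dict onto one record shape."""
    now = datetime.now(timezone.utc).isoformat()
    if "specversion" in raw:  # CloudEvents-style envelope
        return {
            "message_id": raw["id"],
            "topic": raw.get("subject") or raw["type"],
            "source": raw["source"],
            "content_type": raw.get("datacontenttype", "application/json"),
            "received_at": raw.get("time", now),
            "attributes": {"type": raw["type"]},
            "payload": raw.get("data"),
        }
    if "Body" in raw:  # SQS-style message
        return {
            "message_id": raw.get("MessageId"),
            "topic": None,
            "source": "sqs",
            "content_type": "text/plain",
            "received_at": now,
            "attributes": raw.get("MessageAttributes", {}),
            "payload": raw["Body"],
        }
    if "topic" in raw and "payload" in raw:  # MQTT-style publish
        body = json.dumps(raw["payload"], sort_keys=True)
        digest = hashlib.sha256((raw["topic"] + body).encode("utf-8")).hexdigest()
        return {
            "message_id": digest[:16],  # no id in the payload, so derive one
            "topic": raw["topic"],
            "source": "mqtt",
            "content_type": "application/json",
            "received_at": raw.get("timestamp", now),
            "attributes": raw.get("headers", {}),
            "payload": raw["payload"],
        }
    raise ValueError("unrecognized payload shape")
```

Deriving an id from the content hash for MQTT-style input is one possible choice; it makes re-ingesting the same publish idempotent, at the cost of collapsing genuinely repeated messages.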

Ingress: Files and Mail

Surveilr becomes a broker by treating inbound "messages" as immutable resources ingested into RSSDs.

File System Ingress

Producers write message payloads as files in a designated directory (local disk, mounted share, S3-mounted file system, SFTP drop, etc.). Payload formats should favor **JSONL** for streaming append patterns, but also accept JSON, XML, CSV, Markdown, and other common formats.
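
A producer for the file-drop path could look roughly like the sketch below. It writes a JSONL batch to a temporary file and renames it into place, so a watcher never sees a half-written batch under its final name. The function name `drop_messages`, the `.tmp-` prefix, and the assumption that the watcher is configured to ignore such temporary files are illustrative, not documented surveilr behavior.

```python
import json
import os
import tempfile
from pathlib import Path


def drop_messages(queue_dir, messages, name):
    """Write messages as one JSONL file and publish it with an atomic rename."""
    queue_dir = Path(queue_dir)
    queue_dir.mkdir(parents=True, exist_ok=True)
    final_path = queue_dir / name
    # Stage in the same directory so the rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=queue_dir, prefix=".tmp-", suffix=".part")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for msg in messages:
            f.write(json.dumps(msg, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before publishing
    os.replace(tmp_path, final_path)
    return final_path
```

For example, `drop_messages("drop/queue/scans", [{"id": 1}, {"id": 2}], "batch-001.jsonl")` would leave a single two-line file in the queue directory.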

IMAP Ingress

Producers send messages as emails (SMTP), where the mailbox becomes the inbound queue. Surveilr watches IMAP and ingests email bodies and attachments as resources. IMAP folders map naturally to queues (INBOX, subfolders like "queue/control-evidence", "queue/scans", "queue/alerts").
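
On the producer side, an email-based enqueue might be sketched as follows, using only Python's standard library. The `X-Queue` header, the subject convention, and the idea that a server-side mail rule files the message into the matching IMAP folder are all assumptions for illustration.

```python
import json
import smtplib
from email.message import EmailMessage


def build_queue_email(sender, recipient, queue, payload):
    """Wrap a JSON payload as an email attachment addressed to a queue."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{queue}] message"
    msg["X-Queue"] = queue  # hypothetical header a mail rule could route on
    msg.set_content("Machine-generated queue message; payload attached.")
    msg.add_attachment(
        json.dumps(payload, sort_keys=True).encode("utf-8"),
        maintype="application",
        subtype="json",
        filename="message.json",
    )
    return msg


def send(msg, host="localhost", port=25):
    """Hand the message to an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```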

This "files and mail" model is intentionally pragmatic. In regulated or enterprise environments, file drops and email are often the only universally permitted ingress paths.

Watch Mode as the Broker Engine

Surveilr already has watch mode for file systems and for IMAP. The MQ strategy treats watch mode as the "broker ingestion loop."

Watch Mode on File Systems

surveilr watches one or more directories
every new file is ingested as a `uniform_resource` row in the RSSD SQLite database
metadata (source path, timestamps, hashing, origin host, ingest run identifiers) is captured

Watch Mode on IMAP

surveilr watches one or more mailboxes/folders
each message (and optionally each attachment) is ingested as a `uniform_resource` row
metadata (from/to, subject, message-id, received time, folder name) becomes queryable

The key idea: each inbound "message" becomes a **durable resource** in `uniform_resource`. This gives you a system of record for the queue without needing an external broker log.

Storage Model: SQLite RSSDs as the Durable Queue Log

Once ingested, messages exist as rows and blobs/text within the RSSD. This enables:

Durability: SQLite persistence, with WAL journaling and a conservative `synchronous` setting (for example `FULL`)
Portability: an RSSD is a single file that can be moved, backed up, replicated, signed, or archived
Traceability: every message is a resource with provenance, timestamps, and optionally hashes
Auditability: queries and derived artifacts are reproducible
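
The durability settings mentioned above are ordinary SQLite pragmas. The sketch below shows how a consumer process might apply them when opening a database file; whether surveilr already configures its RSSDs this way is not stated here, so treat the helper `open_queue_db` and the chosen values as assumptions.

```python
import sqlite3


def open_queue_db(path):
    """Open a SQLite file with WAL journaling and full fsync on commit."""
    conn = sqlite3.connect(path)
    # journal_mode returns the mode actually in effect ("wal" on success).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=FULL")
    # Wait up to 5 seconds for a competing writer instead of failing at once.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn, mode
```

WAL mode is persistent in the database file, while `synchronous` and `busy_timeout` apply per connection, so every consumer would need to set them.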

For compliance workflows, this is a major advantage over transient broker queues: the "message history" **is** the evidence.

SQL Views: Turn Ingestions into Queues

The core of "surveilr as MQ" is not the ingestion itself, but the **queue semantics exposed through views** and simple state tables.

Canonical Message View

A view that projects `uniform_resource` into a canonical message shape:

`message_id` (resource id)
`queue_name` (derived from directory path or IMAP folder)
`topic` (optional, derived from filename, JSON field, or header)
`payload` (raw text/blob, plus parsed JSON where possible)
`content_type` / `format`
`received_at`, `source`, `correlation_id`, `attributes`
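
A minimal sketch of such a view follows. The `uniform_resource` table here is a simplified stand-in with assumed columns (`uri`, `nature`, `content`, `created_at`); the real RSSD schema has more columns and may name them differently. The view name `mq_message` and the JSON field names `topic` and `correlation_id` are also assumptions. The example relies on SQLite's JSON functions being available in the local build.

```python
import sqlite3

# Simplified stand-in for the RSSD table plus the canonical message view.
SETUP = """
CREATE TABLE uniform_resource (
    uniform_resource_id TEXT PRIMARY KEY,
    uri        TEXT NOT NULL,   -- e.g. 'queue/scans/msg-001.json'
    nature     TEXT,            -- format hint such as 'json'
    content    TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE VIEW mq_message AS
SELECT
    uniform_resource_id AS message_id,
    -- Everything before the last '/' in the URI is taken as the queue name.
    substr(uri, 1, length(rtrim(uri, replace(uri, '/', ''))) - 1) AS queue_name,
    CASE WHEN json_valid(content)
         THEN json_extract(content, '$.topic') END AS topic,
    CASE WHEN json_valid(content)
         THEN json_extract(content, '$.correlation_id') END AS correlation_id,
    content    AS payload,
    nature     AS format,
    created_at AS received_at,
    uri        AS source
FROM uniform_resource;
"""


def open_demo():
    """Create an in-memory database holding the stand-in table and the view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    return conn
```

Because the view is computed at query time, non-JSON payloads still appear as messages; they simply have no `topic` or `correlation_id`.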

State Tables

A small set of state tables manage:

Consumer leases (who is processing what, until when)
Acknowledgements
Retry counters and `next_visible_at`
DLQ markers and reasons

With these, you can implement SQS-like patterns entirely in SQLite:

ReceiveMessage: select visible messages not leased, lease them with a visibility timeout
DeleteMessage: acknowledge
ChangeMessageVisibility: extend lease
DLQ routing: after N receives or on explicit failures, mark as dead-letter
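
The four operations above can be sketched against SQLite as follows. This is one possible design under stated assumptions: the `message` table stands in for the canonical message view, the single `message_state` table and its column names are hypothetical, and time is passed in as a number so the behavior is easy to test.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE IF NOT EXISTS message (   -- stand-in for the canonical view
    message_id TEXT PRIMARY KEY, queue_name TEXT NOT NULL,
    payload TEXT NOT NULL, received_at REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS message_state (
    message_id      TEXT PRIMARY KEY REFERENCES message(message_id),
    receive_count   INTEGER NOT NULL DEFAULT 0,
    next_visible_at REAL NOT NULL DEFAULT 0,
    receipt_handle  TEXT,
    acked_at        REAL,
    dead_reason     TEXT
);
"""
PICK = """
SELECT m.message_id, m.payload, COALESCE(s.receive_count, 0)
FROM message m LEFT JOIN message_state s ON s.message_id = m.message_id
WHERE m.queue_name = ? AND s.acked_at IS NULL AND s.dead_reason IS NULL
  AND COALESCE(s.next_visible_at, 0) <= ?
ORDER BY m.received_at, m.message_id LIMIT 1
"""
LEASE = """
INSERT INTO message_state (message_id, receive_count, next_visible_at, receipt_handle)
VALUES (?, 1, ?, ?)
ON CONFLICT(message_id) DO UPDATE SET
    receive_count = receive_count + 1,
    next_visible_at = excluded.next_visible_at,
    receipt_handle = excluded.receipt_handle
"""


def open_queue(path=":memory:"):
    conn = sqlite3.connect(path, isolation_level=None)  # explicit transactions
    conn.executescript(SCHEMA)
    return conn


def receive_message(conn, queue, now, visibility_timeout=30.0, max_receives=3):
    """ReceiveMessage: lease the oldest visible message, or return None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        leased = None
        while leased is None:
            row = conn.execute(PICK, (queue, now)).fetchone()
            if row is None:
                break
            message_id, payload, receives = row
            if receives >= max_receives:  # DLQ routing after N receives
                conn.execute(
                    "UPDATE message_state SET dead_reason = ? WHERE message_id = ?",
                    ("max receives exceeded", message_id))
                continue
            handle = uuid.uuid4().hex
            conn.execute(LEASE, (message_id, now + visibility_timeout, handle))
            leased = (message_id, payload, handle)
        conn.execute("COMMIT")
        return leased
    except Exception:
        conn.execute("ROLLBACK")
        raise


def delete_message(conn, receipt_handle, now):
    """DeleteMessage: acknowledge, but only with the current receipt handle."""
    cur = conn.execute(
        "UPDATE message_state SET acked_at = ? "
        "WHERE receipt_handle = ? AND acked_at IS NULL", (now, receipt_handle))
    return cur.rowcount == 1


def change_visibility(conn, receipt_handle, now, visibility_timeout):
    """ChangeMessageVisibility: move the lease expiry for a held message."""
    cur = conn.execute(
        "UPDATE message_state SET next_visible_at = ? "
        "WHERE receipt_handle = ? AND acked_at IS NULL AND dead_reason IS NULL",
        (now + visibility_timeout, receipt_handle))
    return cur.rowcount == 1
```

`BEGIN IMMEDIATE` serializes competing consumers on the same database file, which is what keeps two workers from leasing the same message. Acknowledged and dead-lettered rows are kept rather than deleted, so the queue history stays queryable as evidence.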

Lightweight SQS-Compatible API Facade

To maximize interoperability, the recommended API facade is AWS SQS compatibility. This provides:

Mature SDKs in most languages (Rust, TypeScript, Java, Python)
Existing tooling and patterns
Well-known semantics (visibility timeout, receipt handle, long polling)

A minimal subset covers most compliance workflows:

`CreateQueue` / `ListQueues` (optional if queues are inferred)
`SendMessage` (optional if enqueue is file/IMAP-only)
`ReceiveMessage`
`DeleteMessage`
`ChangeMessageVisibility`
`GetQueueAttributes` (approximate counts)
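
As an example of how thin such a facade can be, the sketch below answers the count portion of `GetQueueAttributes` with one query. The attribute names mirror the ones SQS uses, and the values are returned as strings as SQS does. The `message` and `message_state` tables and their columns are hypothetical stand-ins for the canonical view and state tables.

```python
import sqlite3

# Hypothetical minimal schema so the example runs on its own.
DEMO_SCHEMA = """
CREATE TABLE message (
    message_id TEXT PRIMARY KEY, queue_name TEXT NOT NULL
);
CREATE TABLE message_state (
    message_id TEXT PRIMARY KEY, next_visible_at REAL NOT NULL DEFAULT 0,
    acked_at REAL, dead_reason TEXT
);
"""

COUNTS = """
SELECT
    COALESCE(SUM(COALESCE(s.next_visible_at, 0) <= :now), 0),
    COALESCE(SUM(COALESCE(s.next_visible_at, 0) >  :now), 0)
FROM message m LEFT JOIN message_state s ON s.message_id = m.message_id
WHERE m.queue_name = :queue AND s.acked_at IS NULL AND s.dead_reason IS NULL
"""


def get_queue_attributes(conn, queue, now):
    """Count visible and in-flight messages, skipping acked and dead ones."""
    visible, in_flight = conn.execute(COUNTS, {"now": now, "queue": queue}).fetchone()
    return {
        "ApproximateNumberOfMessages": str(visible),
        "ApproximateNumberOfMessagesNotVisible": str(in_flight),
    }
```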

Why SMTP/IMAP Are Still Useful

SMTP and IMAP are not modern MQ protocols, but they are **ubiquitous** and often permitted where other protocols are blocked:

Email is allowed when custom ports are not
Mail servers provide basic durability, routing, and authentication
IMAP folders naturally represent queues and subqueues
Attachments provide a simple way to move structured payloads

The strategy explicitly treats SMTP/IMAP as "MQ transport in a pinch," while surveilr turns that transport into a SQL-queryable durable queue log.

What This Is Good For (and Not)

Good For

Compliance evidence ingestion and routing
Continuous control checks and scan outputs
Workflow orchestration in regulated environments
"Queue with a database" patterns where SQL introspection is essential
Low to moderate throughput systems that prioritize durability and auditability
Environments where only file drops and email are reliably available

Not Intended For

Very high throughput event streaming at scale
Strict low-latency pub/sub at massive fanout
Large distributed consumer swarms needing broker-side partitioning
Cross-region replicated broker clusters

In those cases, surveilr still plays a role as the "evidence and query store" that mirrors messages from a real MQ into RSSDs for audit, analytics, and compliance.

Reference Architecture Summary

1. **Ingress**: File drops (preferred) and IMAP mailboxes/folders (universal fallback)

2. **Broker Core**: surveilr watch mode ingests into `uniform_resource` (RSSD SQLite)

3. **Queue Semantics**: SQL views normalize formats; SQL tables manage leases, visibility, retries, acks, and DLQ

4. **Client Interoperability**: Optional SQS-compatible API facade; optional MQTT bridge

The result is a **first-class, SQL-native, durable, easy-to-operate lightweight MQ** built from surveilr's existing ingestion and RSSD model. It is ideal for compliance-as-code workloads where evidence, provenance, and queryability matter more than raw scale.