Most logging setups fail for the same reason: they optimize for writing (an easy console.log) and punish querying (late-night manual searches through log files). The winning shift is to treat production telemetry as structured, queryable "business events" and emit one context-rich record per request per service hop ("Contextual Unified Activity Records"), then query it like data, not text.
**Why this matters for AI:** Because these activity records contain the complete business context in a structured, queryable format, they become immediately usable by AI systems. Traditional scattered log lines require extensive preprocessing before AI can extract meaning. Contextual Unified Activity Records are AI-ready from the moment they're created — enabling automated anomaly detection, intelligent root cause analysis, and predictive insights without manual data preparation.
Surveilr is a clean fit for this approach because it is a stateful, SQL-centric evidence warehouse that ingests raw artifacts (JSON, JSONL, text, CSV, XML, etc.) locally/at-the-edge, preserves provenance, and exposes everything through SQL-friendly views and tools.
What's Broken in Typical Logging
"String search is broken"
**Problem pattern:** The same "entity" appears in dozens of inconsistent representations (user-123, user_id=user-123, JSON blobs, bracket formats), so searches miss context and correlation becomes a multi-step detective game. Downstream services log different identifiers (order_id, request_id), forcing more searches and more guesswork.
**Surveilr answer:**
- Ingest logs as structured activity records (JSON/JSONL strongly preferred) and normalize them into consistent SQL views
- Use SQL views to canonicalize identity fields (user_id, order_id, request_id, trace_id) into stable columns, so you query once and join across services deterministically (see the sketch after this list)
- You stop "searching logs"; you run relational queries
- AI systems can directly query these structured records for pattern analysis
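A minimal sketch of what that canonicalization can look like, assuming the uniform_resource table and content_type filter shown later in the reference architecture; the alternate field names (uid, user.id) are hypothetical stand-ins for whatever your services actually emit:
-- canonical_identity: coalesce inconsistent identifier fields into stable columns
CREATE VIEW canonical_identity AS
SELECT
  COALESCE(
    json_extract(content, '$.user_id'),
    json_extract(content, '$.uid'),
    json_extract(content, '$.user.id')
  ) AS user_id,
  json_extract(content, '$.request_id') AS request_id,
  json_extract(content, '$.order_id') AS order_id,
  json_extract(content, '$.service') AS service
FROM uniform_resource
WHERE content_type = 'application/json';
-- One relational query replaces a chain of manual searches:
-- every service and order a given user touched
SELECT service, request_id, order_id
FROM canonical_identity
WHERE user_id = 'user-123';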
"Logs miss the one thing you need: context"
**Problem pattern:** Many lines per request, each line says a tiny thing, and none contain the full business context (tier, cart value, flags, payment attempt, etc.).
**Surveilr answer:**
- Make Contextual Unified Activity Records the primary telemetry artifact: one record per request per service hop, containing the full request + business context
- Store these records in surveilr as the system-of-record for debugging, audit, product analytics, and AI-powered insights (same data, different views)
"Data isn't usable by AI"
**Problem pattern:** Scattered, unstructured log text requires extensive preprocessing before AI can analyze it. Most organizations can't leverage AI for operations because their data isn't in a usable format.
**Surveilr answer:**
- Contextual Unified Activity Records are structured from birth — AI systems can immediately query and analyze them
- Full business context in each record means AI can understand relationships without guessing
- SQL-accessible format enables both human queries and AI-powered automated analysis
"High-cardinality data is 'too expensive'"
**Problem pattern:** Legacy log systems built for text search get slow/expensive when you log the fields that actually make debugging work.
**Surveilr answer:**
- Tail sampling rules (keep what matters; sample the rest)
- Local-first storage and governance: store close to the data, replicate or export only what's needed
- Views and partitions: keep raw immutable, curate subsets for hot queries, archive cold data (a curated-subset sketch follows this list)
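A sketch of what a curated hot subset can look like, assuming the activity_record_raw view defined later in the reference architecture; the error and latency thresholds are illustrative:
-- hot_signals: the small, high-value slice kept fast for incident queries,
-- while the complete raw records stay immutable underneath
CREATE VIEW hot_signals AS
SELECT
  time,
  service,
  request_id,
  json_extract(raw_record, '$.status_code') AS status_code,
  json_extract(raw_record, '$.duration_ms') AS duration_ms
FROM activity_record_raw
WHERE json_extract(raw_record, '$.status_code') >= 400
   OR json_extract(raw_record, '$.duration_ms') > 2000;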
Reference Architecture
A. Instrumentation: Emit Contextual Unified Activity Records
Emit exactly one "request outcome record" per service hop: start it when the request begins, enrich it throughout the request lifecycle, and emit it once at the end.
**Minimum recommended fields (baseline schema contract), illustrated by an example record after the list:**
- time: timestamp
- identity: request_id, trace_id, span_id (optional), user_id (if available), session_id (optional), order_id (optional)
- service: service name, version, deployment_id, region/zone
- request: method, route/path template (avoid raw PII in path), status_code, duration_ms
- outcome: success/error + error.type/code
- business context: subscription tier, feature_flags, key domain objects (cart totals, provider latency, etc.)
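A representative record, pretty-printed here for readability (on disk it is a single JSONL line); every value is illustrative:
{
  "time": "2024-05-01T14:32:07.412Z",
  "request_id": "req-8f3a2c",
  "trace_id": "trace-41d9",
  "user_id": "user-123",
  "service": "checkout",
  "version": "2.14.1",
  "deployment_id": "deploy-7781",
  "region": "eu-west-1",
  "method": "POST",
  "route": "/api/checkout",
  "status_code": 502,
  "duration_ms": 2843,
  "outcome": "error",
  "error_code": "PAYMENT_PROVIDER_TIMEOUT",
  "subscription": "premium",
  "feature_flags": { "new_checkout": true },
  "cart_total": 189.50,
  "payment_provider_latency_ms": 2500
}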
B. Collection: Land Records as JSONL
- Each service writes JSONL activity records to a local file (or stdout captured by an agent)
- Rotate by time/size; keep files immutable once sealed
C. Ingestion: Surveilr as the Stateful Store
- Surveilr ingests JSON/JSONL (and anything else you already have), preserves provenance, and stores content in its uniform resource database so it stays queryable and auditable
- Use Capturable Executables where records must be generated (pulling from APIs, running diagnostics, exporting from systems that don't "log to file")
D. Modeling: SQL Views Turn Telemetry into Answers
Create a small set of canonical views:
-- activity_record_raw: thin view over ingested JSON
CREATE VIEW activity_record_raw AS
SELECT
  json_extract(content, '$.time') as time,
  json_extract(content, '$.request_id') as request_id,
  json_extract(content, '$.trace_id') as trace_id,
  json_extract(content, '$.service') as service,
  content as raw_record
FROM uniform_resource
WHERE content_type = 'application/json';
-- activity_record: business-friendly typed view
CREATE VIEW activity_record AS
SELECT
  time,  -- carried through so incident queries can filter by recency
  request_id,
  trace_id,
  service,
  json_extract(raw_record, '$.status_code') as status_code,
  json_extract(raw_record, '$.duration_ms') as duration_ms,
  json_extract(raw_record, '$.user_id') as user_id,
  json_extract(raw_record, '$.subscription') as subscription,
  json_extract(raw_record, '$.error_code') as error_code,
  json_extract(raw_record, '$.feature_flags') as feature_flags
FROM activity_record_raw;
E. Querying: Replace Manual Searching with SQL
This is the core payoff: you can directly express the questions you actually ask during incidents:
-- Show all checkout failures for premium users
-- in the last hour with feature flag X enabled
SELECT
  request_id,
  user_id,
  error_code,
  duration_ms
FROM activity_record
WHERE service = 'checkout'
  AND status_code >= 400
  AND subscription = 'premium'
  AND json_extract(feature_flags, '$.new_checkout') = true
  AND time > datetime('now', '-1 hour')
ORDER BY duration_ms DESC;
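The same records answer release-correlation questions with no extra instrumentation. A sketch, assuming the deployment_id field from the schema contract and the activity_record_raw view above:
-- Error share per checkout deployment in the last 24 hours:
-- a regression shows up as one deployment_id carrying most of the failures
SELECT
  json_extract(raw_record, '$.deployment_id') AS deployment_id,
  COUNT(*) AS requests,
  SUM(CASE WHEN json_extract(raw_record, '$.status_code') >= 500 THEN 1 ELSE 0 END) AS errors
FROM activity_record_raw
WHERE service = 'checkout'
  AND time > datetime('now', '-24 hours')
GROUP BY deployment_id
ORDER BY errors DESC;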
F. AI Integration: Automated Analysis and Insights
Because your activity records are structured and contextual, AI systems can:
- Detect anomalies: identify unusual patterns across thousands of records instantly (a plain-SQL baseline is sketched after this list)
- Perform root cause analysis: trace issues across services by following the structured relationships
- Generate insights: surface trends and correlations humans might miss
- Predict issues: learn from historical patterns to anticipate problems before they occur
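Even before a dedicated AI system is wired in, the same structure supports simple statistical baselines in plain SQL. A sketch over the activity_record view above; what counts as "anomalous" is whatever deviates from the trailing baseline you compute on top of it:
-- Hour-by-hour error rate per service: a crude anomaly signal that an
-- automated analyzer (or a human) can compare against recent history
SELECT
  service,
  strftime('%Y-%m-%dT%H:00', time) AS hour,
  COUNT(*) AS requests,
  ROUND(100.0 * SUM(CASE WHEN status_code >= 400 THEN 1 ELSE 0 END) / COUNT(*), 2) AS error_pct
FROM activity_record
GROUP BY service, hour
ORDER BY hour DESC, error_pct DESC;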
G. Governance: Privacy, Minimization, Audit
- Edge processing lets you redact/anonymize before sharing upstream
- For regulated environments: keep raw locally, export only curated views (or hashed identifiers) to centralized analytics (see the sketch below)
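A minimal sketch of a curated export view; the user_id_hash field is an assumption (a pseudonymous identifier written by the emitting service or by an edge redaction step), since raw identifiers should never leave the local store:
-- curated_export: only fields cleared for upstream analytics;
-- raw identifiers stay local, the pre-hashed user_id_hash stands in for user_id
CREATE VIEW curated_export AS
SELECT
  time,
  service,
  json_extract(raw_record, '$.user_id_hash') AS user_id_hash,
  json_extract(raw_record, '$.status_code') AS status_code,
  json_extract(raw_record, '$.duration_ms') AS duration_ms,
  json_extract(raw_record, '$.subscription') AS subscription
FROM activity_record_raw;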
Implementation Plan (Practical, Staged)
Phase 0: Stop the Bleeding (1–2 weeks)
- Pick 1–2 critical user journeys (checkout, login, webhook ingestion)
- Add contextual record emission in just those paths
- Land JSONL locally; ingest into surveilr
- Build 2–3 "incident queries" as saved SQL views
Phase 1: Standardize the Contract (2–4 weeks)
- Publish a minimal activity record schema contract (field names + meanings)
- Implement a shared middleware/helper that starts a record at request start, exposes it to handlers for enrichment, and emits it once at request end
- Define a naming convention for IDs and feature flags so joins are predictable
Phase 2: Cost Control Without Losing Signal (2–4 weeks)
- Implement tail sampling rules (errors/slow/VIP/flags kept; sample rest)
- Store sampling decision fields alongside the record so queries remain honest (see the sketch after this list)
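A sketch of what "honest" means in practice, assuming each record carries hypothetical sampled and sample_rate fields written when the tail-sampling decision is made (records kept unconditionally, such as errors or VIP traffic, carry sample_rate = 1.0):
-- Reweight sampled records so counts reflect real traffic:
-- a record kept at 10% sampling (sample_rate = 0.1) represents 10 requests
SELECT
  service,
  COUNT(*) AS stored_records,
  ROUND(SUM(1.0 / json_extract(raw_record, '$.sample_rate'))) AS estimated_requests
FROM activity_record_raw
WHERE time > datetime('now', '-1 hour')
GROUP BY service;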
Phase 3: Make It Cross-Service (4–8 weeks)
- Ensure request_id/trace_id propagate across services consistently
- Add service/deploy metadata so regressions correlate with releases
- Build "trace-like" joins from activity records (flow + context)
Phase 4: Enable AI-Powered Operations (4–8 weeks)
- Connect AI systems to your structured activity records
- Build automated anomaly detection using the rich contextual data
- Create AI-assisted incident response that leverages cross-service correlation
- Enable predictive analytics based on historical patterns
Phase 5: Turn It into an Evidence Platform (Ongoing)
- Extend beyond app logs: CI/CD events, deploys, feature flag changes, database migrations, config drift, security and compliance evidence streams
- Surveilr becomes the common substrate for operational truth, not just "logs"
Key Design Decisions
1. **Contextual Unified Activity Records are the primary artifact** — Multi-line "debug diaries" are secondary and should be treated as optional, sampled, and scoped
2. **JSON/JSONL everywhere** — Text logs are tolerated only at the edges; the center of gravity is structured
3. **SQL views are the product** — Raw records are necessary for audit, but views are what make the system usable at speed
4. **AI-ready by design** — Structured, contextual records are immediately usable by AI systems without preprocessing
5. **Sampling is policy, not randomness** — Tail sampling is the default; document it; store the decision fields
6. **Keep data local unless you must centralize** — Surveilr's local-first model is a strategic advantage for cost, privacy, and ownership
Why This Positions Surveilr Well
The industry is converging on "one rich record per request" because it turns debugging into analytics. Surveilr's differentiator is that it's not "a log UI." It's a stateful, SQL-native evidence warehouse that can ingest anything, preserve provenance, and expose it through views that match how operators actually think.
Because these Contextual Unified Activity Records are structured and contain full business context, they're immediately usable by AI systems for automated analysis, pattern detection, and intelligent insights — without the preprocessing burden that makes most operational data unusable for AI.
That makes surveilr a natural home for contextual activity records, tail sampling metadata, and cross-system correlation (deploys, flags, incidents, tickets, audits) without forcing teams into heavyweight, centralized, hyperscale stacks.