$worker

iii-observability

v0.20.0

OpenTelemetry-based traces, metrics, logs, alerts, and sampling.

engine module

baked into the iii engine; no separate install required.

install

$ iii worker add iii-observability

configuration

iii-config.yaml

- enabled: true
  exporter: memory
  logs_console_output: true
  logs_enabled: true
  logs_exporter: memory
  logs_max_count: 1000
  logs_retention_seconds: 3600
  memory_max_spans: 1000000
  metrics_enabled: true
  metrics_exporter: memory
  metrics_max_count: 10000
  metrics_retention_seconds: 3600
  sampling_ratio: 1
  service_name: iii
  service_version: ${SERVICE_VERSION:__III_ENGINE_VERSION__}

readme

README.md

iii-observability

Full OpenTelemetry observability for the iii engine: distributed tracing, structured logs, performance metrics, alert rules, and trace sampling, all queryable via built-in functions.

Install

iii worker add iii-observability

Resolves from the worker registry at workers.iii.dev.

Skills

Install the iii-observability agent skill for Claude Code, Cursor, and 30+ other agents:

npx skills add iii-hq/iii --full-depth --skill iii-observability

Sample Configuration

- name: iii-observability
  config:
    enabled: true
    service_name: my-service
    service_version: 1.0.0
    exporter: memory
    metrics_enabled: true
    logs_enabled: true
    memory_max_spans: 1000
    sampling_ratio: 1.0
    alerts:
      - name: high-error-rate
        metric: iii.invocations.error
        threshold: 10
        operator: ">"
        window_seconds: 60
        action:
          type: log

Configure

The full configuration surface is registered with the built-in configuration worker under the id iii-observability. The stored entry is the runtime source of truth. The config.yaml block is seed-only: it populates the entry on the very first boot and is ignored afterwards. To change a setting after first boot, edit the entry (console, or configuration::set { "id": "iii-observability", "value": { ... } }); editing config.yaml alone no longer has any effect.

With the default file-backed adapter the entry persists at ./data/configuration/iii-observability.yaml and is read again at every engine start — before logging/tracing init — so even restart-tier fields edited at runtime apply on the next start. ${VAR:default} placeholders work in string fields and are expanded on read.
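To make the ${VAR:default} placeholder behaviour concrete, here is a minimal sketch of how such an expansion could work. It is not the engine's implementation: the function name expandPlaceholders, the regular expression, and the choice to leave a placeholder untouched when there is neither a value nor a default are all assumptions made for illustration.

```javascript
// Illustrative only: expands ${VAR:default} placeholders in a string field.
// The engine's actual parsing rules are not documented here; this regex is an assumption.
function expandPlaceholders(value, env) {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}/g,
    (match, name, fallback) => {
      if (env[name] !== undefined) return env[name];
      // No environment value: use the default if one was given,
      // otherwise leave the placeholder text unchanged (assumed behaviour).
      return fallback !== undefined ? fallback : match;
    }
  );
}

// The service_version value from the default configuration shown earlier.
const template = "${SERVICE_VERSION:__III_ENGINE_VERSION__}";
const expandedDefault = expandPlaceholders(template, {});
const expandedFromEnv = expandPlaceholders(template, { SERVICE_VERSION: "1.2.3" });
console.log(expandedDefault, expandedFromEnv);
```

Passing the environment in as a plain object, rather than reading process.env directly, keeps the sketch easy to test.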

Values are validated against the JSON schema at configuration::set time (unknown fields rejected, ratios bounded to 0..=1, counts ≥ 1). Two caveats:

Alert operator symbols (>, <, ...) are accepted in config.yaml only; remote edits must use the canonical names the schema advertises (greaterthan, lessthan, ...).
After a schema tightening, a previously-stored out-of-range value makes the boot-time schema refresh fail with SCHEMA_INVALID (warn-and-continue); reads still work and out-of-range values are clamped on read.
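The difference between rejecting a value at configuration::set time and clamping it on read can be sketched as two small functions. This is a hedged illustration, not the engine's validator: the field lists, the function names validateForSet and clampOnRead, and the assumption that counts are integers are all hypothetical, and the check for unknown fields is left out.

```javascript
// Assumed subsets of the ratio and count fields from the configuration table.
const RATIO_FIELDS = ["sampling_ratio", "logs_sampling_ratio"];
const COUNT_FIELDS = ["memory_max_spans", "logs_max_count", "metrics_max_count"];

// Set-time behaviour: report every out-of-range value instead of fixing it.
function validateForSet(value) {
  const errors = [];
  for (const field of RATIO_FIELDS) {
    const v = value[field];
    if (v !== undefined && !(typeof v === "number" && v >= 0 && v <= 1)) {
      errors.push(`${field} must be a number in 0..=1`);
    }
  }
  for (const field of COUNT_FIELDS) {
    const v = value[field];
    // Treating counts as integers is an assumption of this sketch.
    if (v !== undefined && !(Number.isInteger(v) && v >= 1)) {
      errors.push(`${field} must be an integer >= 1`);
    }
  }
  return errors;
}

// Read-time behaviour: pull a previously stored out-of-range value back into bounds.
function clampOnRead(value) {
  const out = { ...value };
  for (const field of RATIO_FIELDS) {
    if (typeof out[field] === "number") out[field] = Math.min(1, Math.max(0, out[field]));
  }
  for (const field of COUNT_FIELDS) {
    if (typeof out[field] === "number") out[field] = Math.max(1, out[field]);
  }
  return out;
}

const stored = { sampling_ratio: 1.5, logs_max_count: 0 };
console.log(validateForSet(stored)); // two error messages
console.log(clampOnRead(stored));    // { sampling_ratio: 1, logs_max_count: 1 }
```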

Hot Reload

configuration:updated events are applied per field tier:

Tier	Fields	Effect
Live	`logs_console_output`, `logs_sampling_ratio`, `logs_enabled` (ingest gate), `enabled` (ingest gate)	Immediate — read per use
Limits	`memory_max_spans`, `logs_max_count`, `metrics_max_count`, `metrics_retention_seconds`	Immediate — enforced on the next insert / 60s sweep
Swap	`sampling_ratio`, `sampling.*`, `alerts`, `collapse_spans`, `level`	Immediate — compiled artifact rebuilt and swapped (alert states of surviving rules keep cooldown/firing continuity)
Task rebuild	`logs_exporter`, `logs_batch_size`, `logs_flush_interval_ms`, `logs_retention_seconds`, and `logs_enabled` on a false→true transition	The background task restarts with the new settings. A `logs_enabled` false→true toggle revives the log store and respawns the log-trigger subscriber, OTLP logs exporter, and retention task — so the `log` trigger fan-out and OTLP log export reactivate without an engine restart
Restart-only	`exporter`, `endpoint` (trace and logs exporters), `service_name`/`service_version`/`service_namespace` (trace resource and logs exporter identity), `format`, `metrics_enabled`, `metrics_exporter`, `enabled` (pipeline construction)	Logged as a warning; applied at the next engine start via the persisted entry. `endpoint`/`service_name`/`service_version` are restart-tier for all signals so logs and traces always move to the new collector/identity together, never split mid-edit
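When scripting configuration changes, it can help to know in advance which edits will only take effect after a restart. The lookup below simply transcribes the tier table into data; the helper names tiersFor and restartRequired are hypothetical and are not part of iii.

```javascript
// Field-to-tier lookup transcribed from the Hot Reload table.
// `enabled` and `logs_enabled` appear in two tiers because different aspects
// of them are applied differently (ingest gate vs. pipeline/task rebuild).
const TIERS = {
  live: ["logs_console_output", "logs_sampling_ratio", "logs_enabled", "enabled"],
  limits: ["memory_max_spans", "logs_max_count", "metrics_max_count", "metrics_retention_seconds"],
  swap: ["sampling_ratio", "sampling.*", "alerts", "collapse_spans", "level"],
  task_rebuild: ["logs_exporter", "logs_batch_size", "logs_flush_interval_ms",
    "logs_retention_seconds", "logs_enabled"],
  restart_only: ["exporter", "endpoint", "service_name", "service_version",
    "service_namespace", "format", "metrics_enabled", "metrics_exporter", "enabled"],
};

// Returns every tier a field belongs to; nested sampling keys map to "sampling.*".
function tiersFor(field) {
  const key = field === "sampling" || field.startsWith("sampling.") ? "sampling.*" : field;
  return Object.keys(TIERS).filter((tier) => TIERS[tier].includes(key));
}

// Returns the subset of changed fields that need an engine restart to fully apply.
function restartRequired(changedFields) {
  return changedFields.filter((field) => TIERS.restart_only.includes(field));
}

console.log(tiersFor("enabled"));                    // [ 'live', 'restart_only' ]
console.log(restartRequired(["level", "endpoint"])); // [ 'endpoint' ]
```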

Known limitation: an engine config-file reload that destroys and recreates this worker shuts down the OTLP trace/metric providers without rebuilding them (they are process-global set-once state) — OTLP export then requires an engine restart. Memory-backed stores are unaffected.

Configuration

Field	Type	Description
`enabled`	boolean	Whether OpenTelemetry tracing export is enabled. Defaults to `false`. Env: `OTEL_ENABLED`.
`service_name`	string	Service name in traces and metrics. Defaults to `"iii"`. Env: `OTEL_SERVICE_NAME`.
`service_version`	string	Service version (`service.version` OTEL attribute). Env: `SERVICE_VERSION`.
`service_namespace`	string	Service namespace (`service.namespace` OTEL attribute). Env: `SERVICE_NAMESPACE`.
`exporter`	string	Trace exporter: `memory`, `otlp`, or `both`. Use `both` when traces should remain queryable in iii while also being exported. Defaults to `otlp`. Env: `OTEL_EXPORTER_TYPE`.
`endpoint`	string	OTLP collector base endpoint. Defaults to `"http://localhost:4317"`. Env: `OTEL_EXPORTER_OTLP_ENDPOINT`.
`sampling_ratio`	number	Global trace sampling ratio (`0.0`–`1.0`). Defaults to `1.0`. Env: `OTEL_TRACES_SAMPLER_ARG`.
`memory_max_spans`	number	Max spans to keep in memory. Defaults to `1000`. Env: `OTEL_MEMORY_MAX_SPANS`.
`metrics_enabled`	boolean	Whether metrics collection is enabled. Defaults to `false`. Env: `OTEL_METRICS_ENABLED`.
`metrics_exporter`	string	Metrics exporter: `memory` or `otlp`. Defaults to `memory`. Env: `OTEL_METRICS_EXPORTER`.
`metrics_retention_seconds`	number	How long to retain metrics in memory. Defaults to `3600`. Env: `OTEL_METRICS_RETENTION_SECONDS`.
`metrics_max_count`	number	Max metric data points in memory. Defaults to `10000`. Env: `OTEL_METRICS_MAX_COUNT`.
`logs_enabled`	boolean	Whether structured log storage is enabled.
`logs_exporter`	string	Logs exporter: `memory`, `otlp`, or `both`. Use `both` when logs should remain queryable in iii while also being exported. Defaults to `memory`. Env: `OTEL_LOGS_EXPORTER`.
`logs_max_count`	number	Max log entries in memory. Defaults to `1000`.
`logs_retention_seconds`	number	How long to retain logs in memory. Defaults to `3600`.
`logs_sampling_ratio`	number	Fraction of logs to retain (`0.0`–`1.0`). Defaults to `1.0`.
`logs_console_output`	boolean	Print ingested logs to the console. Defaults to `true`.
`level`	string	Minimum log level: `trace`, `debug`, `info`, `warn`, `error`. Defaults to `info`.
`format`	string	Log output format: `default` or `json`. Defaults to `default`.
`alerts`	AlertRule[]	Alert rules evaluated against metrics.

OTLP Transport

Traces and metrics export over OTLP/gRPC by default. https:// endpoints use TLS with system roots, while http:// endpoints use cleartext transport.

To export traces and metrics with OTLP/HTTP protobuf instead of gRPC, set the standard OpenTelemetry protocol environment variable before starting the engine:

export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

Signal-specific protocol variables override the global value:

export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_METRICS_PROTOCOL=http/protobuf

When HTTP/protobuf is selected, iii treats endpoint as the collector base URL and appends the signal path when needed:

traces: /v1/traces
metrics: /v1/metrics

The logs exporter sends OTLP logs over HTTP and posts to /v1/logs.
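The base-URL handling described above can be approximated as follows. This is a sketch under one assumption: "when needed" is taken to mean "when the endpoint does not already end with the signal path". The function name otlpHttpUrl and the collector host are hypothetical, and whether the logs exporter shares this exact logic is not stated in this README.

```javascript
// Signal paths as listed in this section.
const SIGNAL_PATHS = {
  traces: "/v1/traces",
  metrics: "/v1/metrics",
  logs: "/v1/logs",
};

// Builds the full OTLP/HTTP URL for a signal from a collector base endpoint.
function otlpHttpUrl(endpoint, signal) {
  const path = SIGNAL_PATHS[signal];
  if (!path) throw new Error(`unknown signal: ${signal}`);
  const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
  // Assumption: the path is appended only if it is not already present.
  return base.endsWith(path) ? base : base + path;
}

console.log(otlpHttpUrl("https://collector.example.com", "traces"));
console.log(otlpHttpUrl("https://collector.example.com/v1/metrics/", "metrics"));
```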

Collectors that require authentication or routing headers can use the standard OTLP headers environment variables:

export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer $OTLP_TOKEN"

Use signal-specific headers when one signal needs different values:

OTEL_EXPORTER_OTLP_TRACES_HEADERS
OTEL_EXPORTER_OTLP_METRICS_HEADERS
OTEL_EXPORTER_OTLP_LOGS_HEADERS

The logs exporter reads OTEL_EXPORTER_OTLP_LOGS_HEADERS first and falls back to OTEL_EXPORTER_OTLP_HEADERS. Keep tokens in environment variables or a secret manager; do not commit them to config files.
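The fallback order for log headers can be illustrated with a small parser for the comma-separated key=value format that the OTLP headers variables use. This is a sketch, not the exporter's code: the function names are hypothetical, the signal-specific variable is assumed to replace the generic one wholesale rather than merge with it, an empty value is treated as unset, and URL-decoding of values is omitted.

```javascript
// Parses "Key1=Value1,Key2=Value2" into an object. Malformed pairs are skipped.
function parseOtlpHeaders(raw) {
  const headers = {};
  for (const pair of (raw || "").split(",")) {
    const idx = pair.indexOf("=");
    if (idx <= 0) continue; // no key or no "=" at all
    headers[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
  }
  return headers;
}

// Logs headers: signal-specific variable first, then the generic one.
function resolveLogsHeaders(env) {
  const raw = env.OTEL_EXPORTER_OTLP_LOGS_HEADERS || env.OTEL_EXPORTER_OTLP_HEADERS;
  return parseOtlpHeaders(raw);
}

const genericOnly = resolveLogsHeaders({
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer abc",
});
const logsSpecific = resolveLogsHeaders({
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer abc",
  OTEL_EXPORTER_OTLP_LOGS_HEADERS: "Authorization=Bearer logs-only,X-Tenant=team-a",
});
console.log(genericOnly, logsSpecific);
```

In a real process the argument would be process.env; the token values here are placeholders.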

To keep the iii console useful while exporting to an external collector, use exporter: both for traces and logs_exporter: both for logs.

Alert Rule Fields

Field	Type	Description
`name`	string	Required. Unique alert rule name.
`metric`	string	Required. Metric name to monitor (e.g., `iii.invocations.error`).
`threshold`	number	Required. Threshold value.
`operator`	string	Comparison operator: `>`, `>=`, `<`, `<=`, `==`, `!=`. Defaults to `>`.
`window_seconds`	number	Time window in seconds for metric evaluation. Defaults to `60`.
`cooldown_seconds`	number	Minimum interval between alert fires. Defaults to `60`.
`enabled`	boolean	Whether the alert rule is active. Defaults to `true`.
`action`	AlertAction	`{ "type": "log" }`, `{ "type": "webhook", "url": "..." }`, or `{ "type": "function", "path": "..." }`.
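To show how the rule fields interact, the following sketch applies the documented defaults and decides whether a rule would fire for a given metric value. It is an assumption-laden illustration: the helper names are hypothetical, the value is taken as already aggregated over window_seconds (how the engine aggregates is not described here), and the cooldown is modelled as a simple minimum gap since the last fire.

```javascript
// Comparison operators as listed in the table.
const COMPARATORS = {
  ">": (value, threshold) => value > threshold,
  ">=": (value, threshold) => value >= threshold,
  "<": (value, threshold) => value < threshold,
  "<=": (value, threshold) => value <= threshold,
  "==": (value, threshold) => value === threshold,
  "!=": (value, threshold) => value !== threshold,
};

// Fills in the defaults documented for optional fields.
function withDefaults(rule) {
  return { operator: ">", window_seconds: 60, cooldown_seconds: 60, enabled: true, ...rule };
}

// value: metric value for the rule's window (aggregation is assumed to happen elsewhere).
// lastFiredSeconds: time of the previous fire, or undefined if the rule never fired.
function shouldFire(rule, value, nowSeconds, lastFiredSeconds) {
  const r = withDefaults(rule);
  if (!r.enabled) return false;
  const compare = COMPARATORS[r.operator];
  if (!compare) throw new Error(`unsupported operator: ${r.operator}`);
  if (!compare(value, r.threshold)) return false;
  return lastFiredSeconds === undefined || nowSeconds - lastFiredSeconds >= r.cooldown_seconds;
}

const rule = { name: "high-error-rate", metric: "iii.invocations.error", threshold: 10 };
console.log(shouldFire(rule, 11, 100));     // true: above threshold, never fired
console.log(shouldFire(rule, 11, 100, 70)); // false: still inside the 60s cooldown
```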

Advanced Sampling

sampling:
  default: 1.0
  parent_based: true
  rules:
    - operation: "api.*"
      rate: 0.1
  rate_limit:
    max_traces_per_second: 100
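One plausible reading of the rules list is "first matching rule wins, otherwise use default". The sketch below implements that reading only. It assumes the operation pattern is a glob in which * matches any run of characters (the README does not say whether patterns are globs or regular expressions), and it does not model parent_based or rate_limit. The function names are hypothetical.

```javascript
// Converts a glob such as "api.*" into an anchored RegExp.
// Every regex metacharacter except "*" is escaped; "*" becomes ".*".
function globToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Returns the sampling rate for an operation name: first matching rule, else default.
function rateFor(operation, sampling) {
  for (const rule of sampling.rules || []) {
    if (globToRegExp(rule.operation).test(operation)) return rule.rate;
  }
  return sampling.default;
}

// Same values as the YAML above.
const sampling = {
  default: 1.0,
  parent_based: true,
  rules: [{ operation: "api.*", rate: 0.1 }],
  rate_limit: { max_traces_per_second: 100 },
};

console.log(rateFor("api.users.list", sampling)); // 0.1
console.log(rateFor("db.query", sampling));       // 1
```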

Functions

Logging

Function	Description
`engine::log::info`	Log an informational message.
`engine::log::warn`	Log a warning message.
`engine::log::error`	Log an error message.
`engine::log::debug`	Log a debug message.
`engine::log::trace`	Log a trace-level message.

All logging functions accept: message (string, required), data (object), trace_id (string), span_id (string), service_name (string).

Logs API

Function	Description
`engine::logs::list`	Query stored log entries. Filters: `start_time`, `end_time`, `trace_id`, `span_id`, `severity_min`, `severity_text`, `offset`, `limit`.
`engine::logs::clear`	Clear all stored log entries from memory.

Traces API

Function	Description
`engine::traces::list`	List stored spans. Filters: `trace_id`, `service_name`, `name`, `status`, `min_duration_ms`, `max_duration_ms`, `start_time`, `end_time`, `sort_by`, `sort_order`, `attributes`, `include_internal`, `offset`, `limit`.
`engine::traces::tree`	Retrieve a trace as a hierarchical span tree. Parameters: `trace_id` (required).
`engine::traces::clear`	Clear all stored trace spans from memory.

Metrics API

Function	Description
`engine::metrics::list`	List metrics with aggregated statistics. Returns engine counters (invocations, workers, performance), SDK metrics, and optional time-bucketed aggregations.
`engine::rollups::list`	List metric rollup aggregations (1-minute, 5-minute, 1-hour windows).

Other APIs

Function	Description
`engine::baggage::get`	Get a baggage value from the current trace context.
`engine::baggage::set`	Set a baggage value in the current trace context.
`engine::baggage::get_all`	Get all baggage key-value pairs.
`engine::sampling::rules`	List all active sampling rules.
`engine::health::check`	Check engine health status. Returns `status`, `components`, `timestamp`, `version`.
`engine::alerts::list`	List all configured alert rules and their current state.
`engine::alerts::evaluate`	Manually trigger evaluation of all alert rules.

Trigger Type: `log`

Config Field	Type	Description
`level`	string	Log level to subscribe to: `info`, `warn`, `error`, `debug`, or `trace`. When omitted, fires for all levels.

Sample Code

// sendAlert is an application-defined helper, not part of iii.
const fn = iii.registerFunction("monitoring::onError", async (logEntry) => {
  await sendAlert({
    message: logEntry.body,
    severity: logEntry.severity_text,
    traceId: logEntry.trace_id,
  });
  return {};
});

iii.registerTrigger({
  type: "log",
  function_id: fn.id,
  config: { level: "error" },
});

Log entry payload fields: timestamp_unix_nano, observed_timestamp_unix_nano, severity_number, severity_text, body, attributes, trace_id, span_id, resource, service_name, instrumentation_scope_name, instrumentation_scope_version.

Trigger Type: `trace`

Register a function to react to span activity in the in-memory trace store, so any client — a worker or the web console — can refresh reactively instead of polling. The trigger is a coalesced "traces changed" tick, not a per-span feed: span activity is debounced (~300ms) and the handler receives the distinct affected trace ids for the window. Re-read details via engine::traces::list / engine::traces::tree. Requires the memory exporter (exporter: memory or both); with the OTLP-only exporter there is no in-memory store and trace triggers stay dormant.

Engine-internal spans and the trigger's own delivery spans do not fire the trigger. Delivering a trigger via engine.call is itself instrumented as a span, so without this exclusion the trigger would re-fire on its own output (an unbounded feedback loop). The log trigger needs no such exclusion, because its delivery produces spans, not logs.

Config Field	Type	Description
`service_name`	string	Only fire for activity from this service. When omitted, fires for any service. Compared case-insensitively.
`status`	string	Only fire when a span with this status (`ok`, `error`, or `unset`) landed in the window. When omitted, fires for any status. Compared case-insensitively.

Both filters are ANDed; omit both to fire on any span activity.
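The filter semantics can be sketched as a pure function over the spans that landed in one coalescing window. This is an interpretation, not the engine's logic: it assumes both filters are checked against the same span and that only trace ids of matching spans are reported, neither of which this README states explicitly. The span shape and function names are hypothetical.

```javascript
// True if a span passes both optional filters (case-insensitive, ANDed).
function spanMatches(span, config) {
  const same = (a, b) => String(a).toLowerCase() === String(b).toLowerCase();
  if (config.service_name !== undefined && !same(span.service_name, config.service_name)) return false;
  if (config.status !== undefined && !same(span.status, config.status)) return false;
  return true;
}

// Collapses one window of span activity into a single payload, or null if nothing matched.
function coalesceWindow(spans, config) {
  const ids = new Set(); // a Set keeps trace ids distinct, in first-seen order
  for (const span of spans) {
    if (spanMatches(span, config)) ids.add(span.trace_id);
  }
  return ids.size > 0 ? { trace_ids: [...ids] } : null;
}

const windowSpans = [
  { trace_id: "t1", service_name: "Checkout", status: "ok" },
  { trace_id: "t1", service_name: "Checkout", status: "error" },
  { trace_id: "t2", service_name: "search", status: "ok" },
];

console.log(coalesceWindow(windowSpans, {}));                  // { trace_ids: [ 't1', 't2' ] }
console.log(coalesceWindow(windowSpans, { status: "ERROR" })); // { trace_ids: [ 't1' ] }
```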

Sample Code

// refreshTraceViews is an application-defined helper, not part of iii.
const fn = iii.registerFunction("devtools::onTracesChanged", async ({ trace_ids }) => {
  // A "refetch soon" beat: re-read the traces you care about.
  await refreshTraceViews(trace_ids);
  return {};
});

iii.registerTrigger({
  type: "trace",
  function_id: fn.id,
  config: {}, // or { status: 'error' }
});

Handler payload: { "trace_ids": string[] } — the distinct trace ids with span activity in the coalesced window.
