ismailperim's comments

ismailperim · 2026-03-05T14:45:11 1772721911

Great questions - you're right on both fronts.

*Node metrics:* Currently we're container-only (docker stats/logs), so yes - we'd miss noisy neighbors or node-level memory pressure. Prometheus integration is on the roadmap to correlate container events with node/cluster metrics. Right now we catch the obvious cases: "this container OOMKilled at its 512MB limit."
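To make the "obvious case" concrete, here is a minimal sketch of that check, not OnCallMate's actual code. It assumes the agent already has the parsed JSON from a container inspect call and reads the `State.OOMKilled` and `HostConfig.Memory` fields that the Docker Engine API returns. The function name `describe_oom` and the sample data are hypothetical.

```python
from __future__ import annotations

from typing import Any, Optional


def describe_oom(inspect: dict[str, Any]) -> Optional[str]:
    """Return a one-line summary if the container was OOM-killed, else None."""
    state = inspect.get("State", {})
    if not state.get("OOMKilled", False):
        return None

    # HostConfig.Memory is the limit in bytes; 0 means no limit was set.
    limit_bytes = inspect.get("HostConfig", {}).get("Memory", 0)
    # Inspect output reports the name with a leading slash, e.g. "/nginx".
    name = inspect.get("Name", "<unknown>").lstrip("/")

    if limit_bytes > 0:
        limit_mb = limit_bytes // (1024 * 1024)
        return f"{name} OOMKilled at its {limit_mb}MB limit"
    return f"{name} OOMKilled (no memory limit set)"


# Hypothetical inspect payload, trimmed to the fields used above.
sample = {
    "Name": "/nginx",
    "State": {"OOMKilled": True, "ExitCode": 137},
    "HostConfig": {"Memory": 512 * 1024 * 1024},
}
print(describe_oom(sample))  # nginx OOMKilled at its 512MB limit
```

This only sees the container's own limit. Node-level memory pressure would not show up here, which is the gap the Prometheus integration is meant to close.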

*Permissions:* Funny story - I built this while working with OpenClaw (an AI assistant framework). OpenClaw has broad system access by design, but I wanted to explore: what if we made a micro-agent with the minimum permissions needed?

So OnCallMate offers two modes:

1. *Direct socket* (if you trust it / testing): bind /var/run/docker.sock

2. *docker-socket-proxy* (production): read-only layer, no exec/restart/POST

The proxy approach:

- Agent connects via TCP, never touches the socket directly

- Whitelist: containers, logs, stats, inspect (GET only)

- Blacklist: exec, restart, swarm, secrets

- Even if the AI hallucinates "docker restart nginx", the proxy rejects the request, so the restart can't happen
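The whitelist idea can be sketched as a small request filter. This is an illustration only, not how docker-socket-proxy is implemented and not OnCallMate's code. The route patterns follow the Docker Engine API paths for listing, inspecting, logs, and stats; `is_allowed` and `ALLOWED_GET` are hypothetical names.

```python
import re

# Optional API version prefix such as "/v1.43".
_VERSION_PREFIX = re.compile(r"^/v\d+\.\d+")

# Read-only container endpoints: list, inspect, logs, stats.
ALLOWED_GET = [
    re.compile(r"^/containers/json$"),
    re.compile(r"^/containers/[^/]+/(json|logs|stats)$"),
]


def is_allowed(method: str, path: str) -> bool:
    """Allow only GET requests to the whitelisted container endpoints."""
    if method.upper() != "GET":
        return False  # exec, restart, etc. are POST and rejected here
    path = path.split("?", 1)[0]                   # drop the query string
    path = _VERSION_PREFIX.sub("", path, count=1)  # drop "/v1.43" if present
    return any(pattern.match(path) for pattern in ALLOWED_GET)


print(is_allowed("GET", "/v1.43/containers/json?all=1"))  # True
print(is_allowed("POST", "/containers/nginx/restart"))    # False
print(is_allowed("GET", "/secrets"))                      # False
```

Deny-by-default is the point: anything not explicitly matched is refused, so swarm and secrets never need their own blacklist entries in this sketch.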

All tool calls are logged for audit trails.
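For anyone curious what that could look like, below is a rough sketch of tool-call auditing as a decorator that emits one JSON line per call. It is an assumption about the general pattern, not OnCallMate's actual logging code; `audited`, `get_logs`, and the record fields are hypothetical.

```python
import functools
import json
import time


def audited(sink):
    """Wrap a tool function so every call is written to `sink` as a JSON line."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record = {
                "ts": time.time(),
                "tool": fn.__name__,
                "args": list(args),
                "kwargs": kwargs,
            }
            try:
                result = fn(*args, **kwargs)
                record["ok"] = True
                return result
            except Exception as exc:
                record["ok"] = False
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so failed calls are logged too.
                sink(json.dumps(record, default=str))
        return wrapper
    return decorator


audit_lines = []  # stand-in sink; a real one might append to a file


@audited(audit_lines.append)
def get_logs(container: str, tail: int = 100) -> str:
    # Placeholder for a real read-only call through the proxy.
    return f"(last {tail} lines of {container})"


get_logs("nginx", tail=50)
print(audit_lines[-1])
```

Logging in `finally` means a rejected or failing call still leaves a record, which is usually what you want from an audit trail.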

You're right that we should emphasize this more in the README. Principle: treat AI agents like untrusted input.

Have you seen other patterns for safely exposing Docker APIs to automation?