Real-time monitoring of host metrics, container health, Claude sessions, and Docker state
Real-time monitoring dashboard for the Contabo Singapore server and its 5 LXC containers.
[collect-dashboard.sh] --30s--> /tmp/dashboard-status.json (fast metrics)
[collect-dashboard-slow.sh] --5m--> /tmp/dashboard-status-slow.json (slow metrics)
|
[browser] <-- [Cloudflare Access] <-- [nginx :443] <-- [Node.js :3580] <-- reads JSON
- Collectors run as root (they need lxc-attach) and write atomic JSON to /tmp/
- Node.js server runs as ubuntu, serves HTML + API, and reads the pre-collected JSON
- Zero npm dependencies — built-in Node modules only
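The write/read split above can be sketched in Node.js using only built-in modules. This is an illustration, not the actual collector or server code: the collectors are shell scripts, and the function names (`writeJsonAtomic`, `readMergedStatus`) and the shallow-merge behaviour are assumptions. The point is the pattern: write to a temporary file, rename it into place, and have the reader tolerate a missing file.

```javascript
const fs = require("node:fs");

// Write JSON atomically: write a temp file next to the target, then rename.
// rename() within one filesystem replaces the target in a single step, so a
// reader never sees a half-written file.
function writeJsonAtomic(targetPath, data) {
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data));
  fs.renameSync(tmpPath, targetPath);
}

// Read a JSON file; return {} if it is missing or not valid JSON
// (for example, before the first collector run).
function readJsonOrEmpty(filePath) {
  try {
    return JSON.parse(fs.readFileSync(filePath, "utf8"));
  } catch {
    return {};
  }
}

// Assumption: fast and slow payloads are merged shallowly, with the fast
// (fresher) values winning on key collisions.
function readMergedStatus(fastPath, slowPath) {
  return { ...readJsonOrEmpty(slowPath), ...readJsonOrEmpty(fastPath) };
}
```

With the paths from the diagram, a call would look like `readMergedStatus("/tmp/dashboard-status.json", "/tmp/dashboard-status-slow.json")`.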
| Metric | Source |
|---|---|
| CPU load (1/5/15) | /proc/loadavg |
| Memory used/available | free -b |
| Disk usage (/ and /data) | df |
| Services status | systemctl (nginx, claude-net-tunnel, fail2ban) |
| Fail2ban banned IPs | fail2ban-client status sshd |
| Metric | Source |
|---|---|
| State (running/stopped) | lxc-info |
| IP address | lxc-info |
| CPU load | lxc-attach -- cat /proc/loadavg |
| Memory | Host cgroup stats |
| Network TX/RX | lxc-attach -- cat /proc/net/dev |
| Uptime | lxc-attach -- cat /proc/uptime |
| Claude process | lxc-attach -- pgrep -f claude |
| tmux session | lxc-attach -- tmux list-sessions |
| Docker containers | lxc-attach -- docker ps |
| SSH sessions | lxc-attach -- last -10 |
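As an illustration of how the Network TX/RX values can be derived from the raw `/proc/net/dev` text, here is a minimal sketch. The function name `parseNetDev` and the choice to skip the loopback interface are assumptions; the real collector is a shell script and may compute this differently.

```javascript
// Sum received/transmitted bytes across interfaces from /proc/net/dev text.
// Each interface line is "name: <8 receive fields> <8 transmit fields>";
// receive bytes is field 0 and transmit bytes is field 8.
function parseNetDev(text) {
  const totals = { rxBytes: 0, txBytes: 0 };
  for (const line of text.split("\n")) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // header lines contain no colon
    const iface = line.slice(0, colon).trim();
    if (iface === "lo") continue; // assumption: loopback traffic is ignored
    const fields = line.slice(colon + 1).trim().split(/\s+/).map(Number);
    if (fields.length < 16 || fields.some(Number.isNaN)) continue;
    totals.rxBytes += fields[0];
    totals.txBytes += fields[8];
  }
  return totals;
}
```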
| Metric | Source |
|---|---|
| Disk usage (rootfs) | Host filesystem |
| Claude Code version | lxc-attach -- claude --version |
| Session tokens | Parsed from JSONL session files |
| Today’s total tokens | Aggregated from all sessions |
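Token counting from JSONL session files might look roughly like the sketch below. The field layout is an assumption: it looks for a `usage` object with `input_tokens` and `output_tokens`, either at the top level of each line or under `message`. The actual session file schema and the slow collector's parsing are not documented here, and date filtering for the "today" total is left out.

```javascript
// Sum token counts from one JSONL session file (one JSON object per line).
// Assumption: each relevant entry carries usage.input_tokens and
// usage.output_tokens, either top-level or nested under "message".
function sumTokensFromJsonl(jsonlText) {
  let total = 0;
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip truncated or corrupt lines rather than failing
    }
    const usage = entry?.message?.usage ?? entry?.usage;
    if (!usage) continue;
    total += (Number(usage.input_tokens) || 0) + (Number(usage.output_tokens) || 0);
  }
  return total;
}

// Aggregate across several session files (contents passed as strings).
function sumTokensAcrossSessions(sessionTexts) {
  return sessionTexts.reduce((sum, text) => sum + sumTokensFromJsonl(text), 0);
}
```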
Single page, dark theme, auto-refreshes every 30 seconds.
- Header: Title + last update time + refresh countdown bar
- Host bar: CPU, memory, disk (/), disk (/data), fail2ban count
- Services row: Green/red dots for nginx, claude-net-tunnel, fail2ban
- Tab bar: One tab per container with status indicators
- Container panel: Full-width detail view for selected container
Each container tab shows a pulsing dot:
| Condition | Level | Color |
|---|---|---|
| Container stopped | Critical | Red |
| Claude running but tmux gone | Critical | Red |
| Load > 4 | Critical | Red |
| Memory > 95% | Critical | Red |
| tmux not running (active container) | Warning | Yellow |
| Load > 2 | Warning | Yellow |
| Memory > 80% | Warning | Yellow |
| Disk > 50GB | Critical | Red |
| Idle container | OK | Green |
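The thresholds in the table can be expressed as a small classifier. This is a sketch under assumptions, not the dashboard's actual code: the input field names (`state`, `claudeRunning`, `tmuxRunning`, `active`, `load1`, `memPct`, `diskGb`) are hypothetical, "Load" is taken to mean the 1-minute load average, and all critical conditions are checked before any warning so the most severe level wins.

```javascript
const LEVEL_COLORS = { critical: "red", warning: "yellow", ok: "green" };

// Classify one container according to the alert table.
function alertLevel(c) {
  // Critical conditions first, so they take precedence over warnings.
  if (c.state !== "running") return "critical";
  if (c.claudeRunning && !c.tmuxRunning) return "critical";
  if (c.load1 > 4) return "critical";
  if (c.memPct > 95) return "critical";
  if (c.diskGb > 50) return "critical";

  // Warning conditions.
  if (c.active && !c.tmuxRunning) return "warning";
  if (c.load1 > 2) return "warning";
  if (c.memPct > 80) return "warning";

  return "ok";
}

// Map a container to the color of its pulsing dot.
function alertColor(c) {
  return LEVEL_COLORS[alertLevel(c)];
}
```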
- Status badges: Claude Running/No Claude, tmux Active/Dead/Idle, Docker count
- Metric cards: CPU Load, Memory, Uptime, Network, Disk Usage
- Today’s Usage: Token consumption with progress bar
- Docker containers: Table with name, image, status
- Recent Sessions: SSH login history
| Endpoint | Description |
|---|---|
| GET / | Dashboard HTML |
| GET /api/status | Merged fast+slow JSON |
| GET /api/logs/:container | Last 100 entries of latest session log |
| GET /api/logs/:container/list | List session log files |
| GET /api/fail2ban | Detailed ban list |
Container names are validated against an allowlist to prevent path traversal.
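A minimal sketch of allowlist validation for the log routes follows. The container names in `ALLOWED_CONTAINERS` are placeholders and `parseLogsRoute` is a hypothetical helper; the real server may structure its routing differently. The idea is that the `:container` segment is only accepted when it exactly matches a known name, so values such as `../../etc` never reach the filesystem.

```javascript
// Placeholder names: substitute the real container names.
const ALLOWED_CONTAINERS = new Set(["container-a", "container-b"]);

// Parse /api/logs/:container and /api/logs/:container/list.
// Returns { container, list } for a valid request, or null otherwise.
function parseLogsRoute(pathname, allowed = ALLOWED_CONTAINERS) {
  const match = /^\/api\/logs\/([^/]+)(\/list)?$/.exec(pathname);
  if (!match) return null;

  let name;
  try {
    name = decodeURIComponent(match[1]);
  } catch {
    return null; // malformed percent-encoding
  }

  // Exact membership check: anything not on the list is rejected,
  // including encoded traversal attempts like "..%2F..%2Fetc".
  if (!allowed.has(name)) return null;

  return { container: name, list: Boolean(match[2]) };
}
```

A handler would respond with 404 (or 400) when this returns `null` and only build a log file path from `container` otherwise.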
| File | Purpose |
|---|---|
| /opt/dashboard/server.js | Node.js HTTP server (127.0.0.1:3580) |
| /opt/dashboard/index.html | Single-page dashboard |
| /usr/local/bin/collect-dashboard.sh | Fast metrics collector |
| /usr/local/bin/collect-dashboard-slow.sh | Slow metrics collector |
| /etc/systemd/system/dashboard.service | Node.js server service |
| /etc/systemd/system/collect-dashboard.timer | Fast collector timer (30s) |
| /etc/systemd/system/collect-dashboard-slow.timer | Slow collector timer (5min) |
| /etc/nginx/sites-available/admin-dashboard | nginx reverse proxy config |
| /opt/dashboard/.env | Anthropic Admin API key |
The dashboard can query Anthropic’s Admin API for per-user Claude Code analytics:
| Endpoint | Purpose |
|---|---|
| GET /v1/organizations/usage_report/claude_code?starting_at=YYYY-MM-DD | Per-user daily metrics |
| GET /v1/organizations/usage_report/messages | General usage |
| GET /v1/organizations/users | List team members |
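A request for the per-user daily metrics could be assembled as sketched below. Several details are assumptions to verify against Anthropic's Admin API documentation: the base host `https://api.anthropic.com`, the `x-api-key` and `anthropic-version` headers, and the version string. The helper name `buildClaudeCodeUsageRequest` is hypothetical, and the key is passed in as a parameter because the variable name used in /opt/dashboard/.env is not documented here.

```javascript
// Build the URL and headers for the Claude Code usage report endpoint.
// Assumptions: base host, header names, and version value (see lead-in).
function buildClaudeCodeUsageRequest(adminApiKey, startingAt) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(startingAt)) {
    throw new Error("startingAt must be formatted as YYYY-MM-DD");
  }
  const url = new URL(
    "/v1/organizations/usage_report/claude_code",
    "https://api.anthropic.com"
  );
  url.searchParams.set("starting_at", startingAt);
  return {
    url: url.toString(),
    headers: {
      "x-api-key": adminApiKey,
      "anthropic-version": "2023-06-01",
    },
  };
}
```

On Node.js 18 or later, the result can be passed to the built-in `fetch`, for example `fetch(req.url, { headers: req.headers })`, which keeps the server free of npm dependencies.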
Organization: Omelas AI LLC
Status: API key configured, awaiting team onboarding (Sean and Jaz still need to run claude login against the org).
ssh ovh5 "systemctl status dashboard collect-dashboard.timer collect-dashboard-slow.timer"
ssh ovh5 "cat /tmp/dashboard-status.json | python3 -m json.tool | head -20"
ssh ovh5 "sudo systemctl restart dashboard"
ssh ovh5 "sudo /usr/local/bin/collect-dashboard.sh"
ssh ovh5 "sudo /usr/local/bin/collect-dashboard-slow.sh"
ssh ovh5 "journalctl -u dashboard -n 20"