
Real-time monitoring of host metrics, container health, Claude sessions, and Docker state

Monitoring Dashboard

Real-time monitoring dashboard for the Contabo Singapore server and its 5 LXC containers.

Access

Item	Value
URL	https://admin.ipnoelp.io
Auth	Cloudflare Access (Google OAuth, restricted to @omelasai.com accounts)

Architecture

[collect-dashboard.sh] --30s--> /tmp/dashboard-status.json      (fast metrics)
[collect-dashboard-slow.sh] --5m--> /tmp/dashboard-status-slow.json (slow metrics)
                                                                     |
[browser] <-- [Cloudflare Access] <-- [nginx :443] <-- [Node.js :3580] <-- reads JSON

Collectors run as root (required for lxc-attach) and write JSON atomically to /tmp/
Node.js server runs as ubuntu, serves the HTML and API, and only reads the pre-collected JSON
Zero npm dependencies: built-in Node modules only
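
The "atomic JSON" point above usually means write-then-rename: the new content goes to a temporary file, which is then renamed over the target so a reader never sees a half-written file. The collectors are shell scripts whose contents are not shown on this page, so the following is only a sketch of the pattern in Node.js. The function name `writeJsonAtomic` and the demo paths are assumptions, not the actual implementation.

```javascript
// Sketch of the write-then-rename pattern (illustrative; not the real collector).
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

function writeJsonAtomic(targetPath, data) {
  // Keep the temp file next to the target so rename() stays on one filesystem.
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data));
  // On POSIX filesystems rename() swaps the target in a single step, so a
  // reader sees either the old file or the new one, never a partial write.
  fs.renameSync(tmpPath, targetPath);
}

// Demo against a throwaway directory instead of /tmp/dashboard-status.json.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'dashboard-demo-'));
const demoFile = path.join(demoDir, 'dashboard-status.json');
writeJsonAtomic(demoFile, { host: { load1: 0.42 } });
const roundTrip = JSON.parse(fs.readFileSync(demoFile, 'utf8'));
console.log(roundTrip.host.load1);
```

Because the reader only ever opens the final path, the Node.js server can parse the file on every request without any locking.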

What Gets Monitored

Host Level (every 30 seconds)

Metric	Source
CPU load (1/5/15)	`/proc/loadavg`
Memory used/available	`free -b`
Disk usage (/ and /data)	`df`
Services status	`systemctl` (nginx, claude-net-tunnel, fail2ban)
Fail2ban banned IPs	`fail2ban-client status sshd`
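
As an illustration of the first row, `/proc/loadavg` holds the 1, 5 and 15 minute load averages as its first three whitespace-separated fields. The sketch below parses that text in Node.js. The real collector is a shell script, and the function name `parseLoadavg` and its return shape are assumptions made for this example.

```javascript
// Parse the first three fields of /proc/loadavg (hypothetical helper).
function parseLoadavg(text) {
  const [one, five, fifteen] = text.trim().split(/\s+/).slice(0, 3).map(Number);
  if ([one, five, fifteen].some((n) => !Number.isFinite(n))) {
    throw new Error(`unexpected /proc/loadavg content: ${text}`);
  }
  return { one, five, fifteen };
}

// Sample line in the format the kernel produces.
const sampleLoadavg = '0.52 0.58 0.59 1/389 12345\n';
console.log(parseLoadavg(sampleLoadavg));
```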

Per Container (every 30 seconds)

Metric	Source
State (running/stopped)	`lxc-info`
IP address	`lxc-info`
CPU load	`lxc-attach -- cat /proc/loadavg`
Memory	Host cgroup stats
Network TX/RX	`lxc-attach -- cat /proc/net/dev`
Uptime	`lxc-attach -- cat /proc/uptime`
Claude process	`lxc-attach -- pgrep -f claude`
tmux session	`lxc-attach -- tmux list-sessions`
Docker containers	`lxc-attach -- docker ps`
SSH sessions	`lxc-attach -- last -10`
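
For the Network TX/RX row, `/proc/net/dev` prints two header lines followed by one line per interface; after the interface name, the first counter is received bytes and the ninth is transmitted bytes. A minimal parsing sketch follows, with `parseNetDev` and the output field names being assumptions for illustration only.

```javascript
// Extract per-interface RX/TX byte counters from /proc/net/dev text
// (hypothetical helper; the real collector is a shell script).
function parseNetDev(text) {
  const totals = {};
  for (const line of text.split('\n').slice(2)) { // skip the two header lines
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    const iface = line.slice(0, colon).trim();
    const fields = line.slice(colon + 1).trim().split(/\s+/).map(Number);
    if (fields.length < 16) continue; // 8 receive + 8 transmit counters expected
    totals[iface] = { rxBytes: fields[0], txBytes: fields[8] };
  }
  return totals;
}

const sampleNetDev = [
  'Inter-|   Receive                                                |  Transmit',
  ' face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed',
  '    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0',
  '  eth0:  123456     200    0    0    0     0          0         0   654321     150    0    0    0     0       0          0',
].join('\n');
console.log(parseNetDev(sampleNetDev).eth0);
```

The counters are cumulative since the interface came up, so a rate would need the difference between two successive samples.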

Per Container — Slow (every 5 minutes)

Metric	Source
Disk usage (rootfs)	Host filesystem
Claude Code version	`lxc-attach -- claude --version`
Session tokens	Parsed from JSONL session files
Today’s total tokens	Aggregated from all sessions
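
The session token figures come from JSONL files, one JSON object per line. The schema of those files is not described on this page, so the sketch below assumes a hypothetical shape in which some entries carry a `usage` object with `input_tokens` and `output_tokens`; adjust the field names to match the real files. It also skips lines that fail to parse, since the last line of a live session file may be incomplete.

```javascript
// Sum token counts across a JSONL session log.
// ASSUMPTION: entries may contain { usage: { input_tokens, output_tokens } }.
function sumSessionTokens(jsonlText) {
  let input = 0;
  let output = 0;
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // tolerate a truncated or malformed line
    }
    const usage = entry && entry.usage;
    if (!usage) continue;
    input += Number(usage.input_tokens) || 0;
    output += Number(usage.output_tokens) || 0;
  }
  return { input, output, total: input + output };
}

const sampleJsonl = [
  '{"type":"assistant","usage":{"input_tokens":100,"output_tokens":20}}',
  '{"type":"user"}',
  '{"type":"assistant","usage":{"input_tokens":5,"output_tokens":7}}',
  '{"type":"assistant","usage":{"input_to', // truncated final line
].join('\n');
console.log(sumSessionTokens(sampleJsonl));
```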

Dashboard UI

Single page, dark theme, auto-refreshes every 30 seconds.

Layout

Header: Title + last update time + refresh countdown bar
Host bar: CPU, memory, disk (/), disk (/data), fail2ban count
Services row: Green/red dots for nginx, claude-net-tunnel, fail2ban
Tab bar: One tab per container with status indicators
Container panel: Full-width detail view for selected container

Status Indicators

Each container tab shows a pulsing dot:

Condition	Level	Color
Container stopped	Critical	Red
Claude running but tmux gone	Critical	Red
Load > 4	Critical	Red
Memory > 95%	Critical	Red
Disk > 50GB	Critical	Red
tmux not running (active container)	Warning	Yellow
Load > 2	Warning	Yellow
Memory > 80%	Warning	Yellow
Idle container	OK	Green
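
The table above can be read as a small decision function: any critical condition wins, then any warning condition, otherwise the container is OK. The sketch below encodes the thresholds from the table. The input field names (`state`, `claudeRunning`, `tmuxRunning`, `active`, `load1`, `memoryPercent`, `diskGb`) are assumptions, as is the use of the 1-minute load average; the dashboard's actual code may differ.

```javascript
// Map container metrics to a status level using the thresholds in the table.
// All field names are hypothetical.
function containerStatus(c) {
  const critical =
    c.state !== 'running' ||
    (c.claudeRunning && !c.tmuxRunning) ||
    c.load1 > 4 ||
    c.memoryPercent > 95 ||
    c.diskGb > 50;
  if (critical) return { level: 'critical', color: 'red' };

  const warning =
    (c.active && !c.tmuxRunning) ||
    c.load1 > 2 ||
    c.memoryPercent > 80;
  if (warning) return { level: 'warning', color: 'yellow' };

  return { level: 'ok', color: 'green' };
}

const idle = { state: 'running', claudeRunning: false, tmuxRunning: false,
  active: false, load1: 0.1, memoryPercent: 12, diskGb: 8 };
console.log(containerStatus(idle).level);
console.log(containerStatus({ ...idle, load1: 3 }).level);
console.log(containerStatus({ ...idle, state: 'stopped' }).level);
```

Checking critical conditions first means a container that is both over the warning and the critical threshold is always shown red.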

Container Panel

Status badges: Claude Running/No Claude, tmux Active/Dead/Idle, Docker count
Metric cards: CPU Load, Memory, Uptime, Network, Disk Usage
Today’s Usage: Token consumption with progress bar
Docker containers: Table with name, image, status
Recent Sessions: SSH login history

API Endpoints

Endpoint	Description
`GET /`	Dashboard HTML
`GET /api/status`	Merged fast+slow JSON
`GET /api/logs/:container`	Last 100 entries of latest session log
`GET /api/logs/:container/list`	List session log files
`GET /api/fail2ban`	Detailed ban list
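
`GET /api/status` returns the fast and slow collector output merged into one document. The layout of the two JSON files is not documented here, so the following is only one plausible way to combine them: a recursive merge of plain objects in which fast values win on conflict. The function names and the sample keys (`containers`, `dev1`, `load1`, `diskGb`) are made up for the example.

```javascript
// Recursively merge two plain-object trees; values from `fast` take priority.
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

function mergeStatus(fast, slow) {
  const out = { ...slow };
  for (const [key, value] of Object.entries(fast)) {
    out[key] = isPlainObject(value) && isPlainObject(slow[key])
      ? mergeStatus(value, slow[key])
      : value;
  }
  return out;
}

// Hypothetical shapes for the two collector files.
const fastSample = { collectedAt: 't1', containers: { dev1: { load1: 0.3 } } };
const slowSample = { slowCollectedAt: 't0', containers: { dev1: { diskGb: 12 } } };
console.log(JSON.stringify(mergeStatus(fastSample, slowSample)));
```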

Container names are validated against an allowlist to prevent path traversal.
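
A minimal sketch of that allowlist check: the `:container` path segment is accepted only if it exactly matches a known name, so values such as `../etc` never reach the filesystem. The container names and the log root below are placeholders, not the server's real values.

```javascript
// Allowlist lookup before building any filesystem path.
const path = require('node:path');

const ALLOWED_CONTAINERS = new Set(['dev1', 'dev2']); // hypothetical names
const LOG_ROOT = '/var/log/claude-sessions';          // hypothetical location

function resolveLogDir(containerName) {
  // Exact-match lookup: anything not in the set is rejected outright,
  // which rules out separators, "..", and empty strings in one step.
  if (!ALLOWED_CONTAINERS.has(containerName)) return null;
  return path.posix.join(LOG_ROOT, containerName);
}

console.log(resolveLogDir('dev1'));
console.log(resolveLogDir('../etc'));
```

An exact-match allowlist is simpler to reason about than sanitizing the input, because no user-supplied string is ever joined into a path unless it is already a known-good value.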

Files on Server

File	Purpose
`/opt/dashboard/server.js`	Node.js HTTP server (127.0.0.1:3580)
`/opt/dashboard/index.html`	Single-page dashboard
`/usr/local/bin/collect-dashboard.sh`	Fast metrics collector
`/usr/local/bin/collect-dashboard-slow.sh`	Slow metrics collector
`/etc/systemd/system/dashboard.service`	Node.js server service
`/etc/systemd/system/collect-dashboard.timer`	Fast collector timer (30s)
`/etc/systemd/system/collect-dashboard-slow.timer`	Slow collector timer (5min)
`/etc/nginx/sites-available/admin-dashboard`	nginx reverse proxy config
`/opt/dashboard/.env`	Anthropic Admin API key

Anthropic Admin API Integration

The dashboard can query Anthropic’s Admin API for per-user Claude Code analytics:

Endpoint	Purpose
`GET /v1/organizations/usage_report/claude_code?starting_at=YYYY-MM-DD`	Per-user daily metrics
`GET /v1/organizations/usage_report/messages`	General usage
`GET /v1/organizations/users`	List team members

Organization: Omelas AI LLC

Status: API key configured; awaiting team onboarding (Sean and Jaz still need to run `claude login` against the org).
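
As a rough illustration of the first endpoint, the sketch below builds the request for the per-user Claude Code report without sending it. The path and the `starting_at` parameter come from the table above; the host `api.anthropic.com` and the `x-api-key` and `anthropic-version` headers are assumptions based on Anthropic's general API conventions and should be checked against the Admin API documentation. The function name is hypothetical.

```javascript
// Build (but do not send) a request for the Claude Code usage report.
// ASSUMPTIONS: host name and header names; verify against the Admin API docs.
function buildClaudeCodeReportRequest(startingAt, adminApiKey) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(startingAt)) {
    throw new Error('starting_at must be formatted as YYYY-MM-DD');
  }
  const url = new URL('/v1/organizations/usage_report/claude_code', 'https://api.anthropic.com');
  url.searchParams.set('starting_at', startingAt);
  return {
    url: url.toString(),
    method: 'GET',
    headers: {
      'x-api-key': adminApiKey,
      'anthropic-version': '2023-06-01',
    },
  };
}

const exampleRequest = buildClaudeCodeReportRequest('2025-01-15', 'placeholder-key');
console.log(exampleRequest.url);
```

The returned object could be handed to an HTTP client of choice; keeping construction separate from sending makes the date validation easy to test without network access or a real key.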

Managing the Dashboard

# Check service status
ssh ovh5 "systemctl status dashboard collect-dashboard.timer collect-dashboard-slow.timer"

# View collector output
ssh ovh5 "python3 -m json.tool /tmp/dashboard-status.json | head -20"

# Restart dashboard server
ssh ovh5 "sudo systemctl restart dashboard"

# Force collector run
ssh ovh5 "sudo /usr/local/bin/collect-dashboard.sh"
ssh ovh5 "sudo /usr/local/bin/collect-dashboard-slow.sh"

# Check logs
ssh ovh5 "journalctl -u dashboard -n 20"