27 · Admin · Platform health
The platform-health pages give admins visibility into the running services and recent incidents.
Uptime
/uptime
The user-facing Uptime monitoring page. Anyone with the watchdog
feature can configure three kinds of monitor:
| Kind | What it checks | Notes |
|---|---|---|
| ⚙️ Service | Internal docker service URL | Supports auto-restart of the named container |
| 🌐 Website | External HTTP/HTTPS URL | No restart — pure status check |
| 🖥️ Device | RMM endpoint freshness | OK while the device's last-seen timestamp is within the chosen window (default 5 min) |
Common controls per target:
- Name + interval (poll cadence) + failure threshold
- Public flag → target appears on /status for customers
- Group + display order for the public-page layout
- ⟳ Trigger an ad-hoc check
- ⏻ Manually restart the named container (Service kind only)
- Edit / remove
Header stats: total targets, OK / FAIL counts, open incidents, restart attempts in the last 24 h, rolling 24 h uptime %.
📷 Screenshot placeholder: screenshots/uptime.png
Device-kind targets
When you pick Device, the form swaps the URL input for an RMM
device dropdown plus a freshness window (seconds). The watchdog
poller flips the target to FAIL when now - device.lastSeenAt
exceeds the window — which is what you want when you can't reach the
endpoint with HTTP but the agent is still expected to phone home.
⚠️ Caution. A FAIL on a device-kind target means the agent has gone silent, not necessarily that the box is off — flaky network can trip it. Pair with the device's RMM alerts for the full picture.
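The freshness rule above can be illustrated with a minimal sketch. The function and constant names are hypothetical, and treating an age exactly equal to the window as still OK is an assumption drawn from the word "exceeds"; the real poller may differ in detail.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_WINDOW_SECONDS = 300  # the 5 min default freshness window


def device_status(last_seen_at: datetime, now: datetime,
                  window_seconds: int = DEFAULT_WINDOW_SECONDS) -> str:
    """Return "OK" while the device checked in within the window, else "FAIL"."""
    age = now - last_seen_at
    # FAIL only when the age exceeds the window; exactly at the window is still OK.
    return "FAIL" if age > timedelta(seconds=window_seconds) else "OK"


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(device_status(now - timedelta(minutes=4), now))  # OK
print(device_status(now - timedelta(minutes=6), now))  # FAIL
```

Note that the rule only looks at the agent's last check-in, which is why a flaky network can trip it even when the box is running.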
/admin/health and /uptime render the same component against the
watchdog service. /uptime is that page surfaced outside the admin area
so it can sit in the main sidebar for day-to-day operations.
Health
/admin/health
The admin-side view of the watchdog. It shows the monitor targets you have configured (the Service / Website / Device kinds described above), not a fixed list of backend services — what appears here is whatever has been added as a target.
| Name | Kind | Status | Latency | Last checked |
|---|---|---|---|---|
| portal | ⚙ Service | ✓ OK | 14 ms | 12:04:31 |
| auth-service | ⚙ Service | ✓ OK | 12 ms | 12:04:31 |
| helpdesk-service | ⚙ Service | ✓ OK | 18 ms | 12:04:31 |
📷 Screenshot placeholder: screenshots/admin-health.png
Each target row carries its last status, last latency, last check time
and consecutive-failure count. You can add, edit and remove targets,
trigger an ad-hoc check, and (for Service-kind targets with a container
name) trigger a restart. Header stats mirror the /uptime page: total
targets, OK / FAIL counts, open incidents, restarts in the last 24 h
and the rolling 24 h uptime %.
A fresh seeded install ships three Service targets (portal, auth-service and helpdesk-service), each pointed at the service's /health endpoint.
Status page
/status is the public version of health — what your customers see
during incidents. It lists the targets you have flagged Public
(grouped by their Group name) plus any RMM devices flagged for the
public status page, each with a status pip and — for monitor targets —
a rolling check-history bar and a 7-day uptime %. No internal details
(no per-service host info, no credentials).
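As a rough illustration of what an uptime percentage over a window means, the sketch below computes the share of successful checks in a history. This is an assumption about the calculation, not the platform's confirmed formula: the real figure may be weighted by time rather than by check count, and the function name is hypothetical.

```python
from typing import List, Optional


def uptime_percent(checks: List[bool]) -> Optional[float]:
    """Share of successful checks as a percentage; None when there is no history."""
    if not checks:
        return None
    return round(100.0 * sum(checks) / len(checks), 2)


history = [True] * 197 + [False] * 3  # 200 checks in the window, 3 failures
print(uptime_percent(history))         # 98.5
```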
Watchdog
The watchdog-service runs as a separate process (port 3019). It
runs a poll tick every 10 s (POLL_TICK_SECONDS); on each tick it
checks any target whose own interval is due — the per-target
default is 30 s. Each check is one of:
- Service / Website — an HTTP probe of the target URL
- Device — a freshness check against an RMM device's last-seen timestamp
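The tick-versus-interval scheduling described above can be sketched as follows. Only POLL_TICK_SECONDS and the 30 s default come from this page; the Target class, its fields and due_targets are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

POLL_TICK_SECONDS = 10         # tick cadence
DEFAULT_INTERVAL_SECONDS = 30  # per-target default interval


@dataclass
class Target:
    name: str
    interval_seconds: int = DEFAULT_INTERVAL_SECONDS
    last_checked_at: Optional[float] = None  # seconds; None means never checked


def due_targets(targets: List[Target], now: float) -> List[Target]:
    """Return the targets whose own interval has elapsed at this tick."""
    return [
        t for t in targets
        if t.last_checked_at is None or now - t.last_checked_at >= t.interval_seconds
    ]


# Simulate one minute of ticks for a 30 s target and a 60 s target.
targets = [Target("portal"), Target("relay", interval_seconds=60)]
checks = []
for now in range(0, 61, POLL_TICK_SECONDS):
    for t in due_targets(targets, float(now)):
        t.last_checked_at = float(now)
        checks.append((now, t.name))
print(checks)
```

Under this sketch a target is only ever checked on a tick boundary, so an interval that is not a multiple of the tick is effectively rounded up to the next tick.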
When a target's consecutive failures reach its failure threshold, the watchdog:
- Opens a health incident and flips the target to FAIL on /admin/health and /uptime
- Shows an incident on /status if the target is Public
- Raises a CRITICAL RMM alert if an RMM device matches the target's name (so the org-wide alert queue stays the single source of truth)
- For a Service-kind target with auto-restart on and a container name set, attempts docker restart of that container (only when the service is built with ENABLE_DOCKER_RESTART=true)
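The threshold logic above can be sketched as a small decision function. All names here are hypothetical, and two behaviours are assumptions rather than documented facts: that the actions fire once, on the check that reaches the threshold, and that a successful check resets the counter and the status.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TargetState:
    kind: str                        # "service", "website" or "device"
    failure_threshold: int
    public: bool = False
    auto_restart: bool = False
    container_name: str = ""
    matches_rmm_device: bool = False
    consecutive_failures: int = 0
    status: str = "OK"


def record_check(state: TargetState, ok: bool,
                 docker_restart_enabled: bool = False) -> List[str]:
    """Update the failure counter and return the actions this check triggers."""
    if ok:
        state.consecutive_failures = 0  # assumption: success resets the streak
        state.status = "OK"
        return []
    state.consecutive_failures += 1
    # Assumption: actions fire only on the check that reaches the threshold.
    if state.consecutive_failures != state.failure_threshold:
        return []
    state.status = "FAIL"
    actions = ["open_incident"]
    if state.public:
        actions.append("show_on_status_page")
    if state.matches_rmm_device:
        actions.append("raise_critical_rmm_alert")
    if (state.kind == "service" and state.auto_restart
            and state.container_name and docker_restart_enabled):
        actions.append("docker_restart:" + state.container_name)
    return actions


state = TargetState(kind="service", failure_threshold=3, public=True,
                    auto_restart=True, container_name="portal")
results = [record_check(state, ok=False, docker_restart_enabled=True) for _ in range(3)]
print(results[-1])
```

The docker_restart_enabled argument stands in for the ENABLE_DOCKER_RESTART=true build setting: without it, the restart action is never produced even when auto-restart is on.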
The watchdog also writes a heartbeat file so an external monitor can detect a dead watchdog.
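An external monitor could consume the heartbeat along these lines. This is a sketch under assumptions: this page does not give the heartbeat file's path or format, so the code takes the path as a parameter and uses the file's modification time as the liveness signal. If the file instead contains a timestamp, the check would need to read it.

```python
import os
import time
from typing import Optional


def heartbeat_is_stale(path: str, max_age_seconds: float,
                       now: Optional[float] = None) -> bool:
    """Return True when the heartbeat file is missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # no file at all: treat the watchdog as dead
    return now - mtime > max_age_seconds
```

A max_age_seconds of a few poll ticks (for example 30 s against the 10 s tick) would tolerate a single slow tick without raising a false alarm; the right value depends on how often the watchdog actually rewrites the file.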
Messaging health
/admin/messaging
The chat / messaging service has its own health view:
- Connected users (live count)
- Messages/sec
- Channel count
- WebSocket reconnect rate (any sustained spike indicates trouble)
Meetings health
/admin/meetings
- Active meetings
- Average attendees per meeting
- Recording storage used
- Jitsi cluster status (if multi-host)
Studio service health
/admin/studio
Tied to the design-service that renders previews. Usually green; if this is red, branding edits won't preview but the platform still works.
API explorer
/api-explorer
OpenAPI documentation for every backend service. Use it to:
- Confirm a service is reachable from your browser
- Test specific endpoints with the Try it button (auth headers applied automatically from your session)
- Find request/response shapes when integrating
📷 Screenshot placeholder: screenshots/api-explorer.png
Logs
The platform writes structured JSON logs to stdout. On a docker-compose install, the runbook ../runbooks/portal.md shows where they go and how to tail them.
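Because the logs are one JSON object per line, they can be filtered with a few lines of script. The sketch below is illustrative only: the field names level and msg in the sample are hypothetical, since this page does not document the log schema, so the key and value to match are passed in by the caller.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def matching_records(lines: Iterable[str], key: str, value: Any) -> Iterator[Dict[str, Any]]:
    """Yield parsed JSON log records whose `key` equals `value`; skip non-JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stack trace or startup banner mixed into stdout
        if isinstance(record, dict) and record.get(key) == value:
            yield record


sample = [
    '{"level": "info", "msg": "listening"}',
    'not json',
    '{"level": "error", "msg": "upstream timeout"}',
]
print(list(matching_records(sample, "level", "error")))
```

To use it on live output, pass sys.stdin as lines and pipe the service's log stream into the script.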
Common workflows
Investigate a slow page
- /admin/health → check the latency column on each monitor target.
- Open the target → review its recent checks and incident history.
- Cross-reference with the service logs (see the portal runbook).
After an incident
- Open /admin/audit → filter to the incident window.
- Cross-reference with /admin/health restart history.
- Post a public update on /status (Admin → Status page → New incident).
Check that the relay is up
- Add a Service-kind target on /admin/health pointed at the relay's /health endpoint, or
- Hit the relay's /health endpoint directly; it returns { "status": "ok", "service": "relay-service", ... }.
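The direct check can also be scripted. The sketch below assumes only the response shape quoted above (a JSON object with status and service fields); the relay's host and port are not given on this page, so base_url is a value you supply, and both function names are hypothetical.

```python
import json
import urllib.request


def is_healthy(body: str, expected_service: str = "relay-service") -> bool:
    """Return True when body is a JSON object reporting status "ok" for the service."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return (isinstance(payload, dict)
            and payload.get("status") == "ok"
            and payload.get("service") == expected_service)


def check_relay(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/health and evaluate it; network errors count as unhealthy."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        return False  # connection refused, timeout, HTTP error, bad URL or encoding
```

Checking the service field as well as status guards against pointing the script at a different service's /health endpoint by mistake.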
Permissions
| Action | Role |
|---|---|
| View admin health | admin |
| Post status updates | admin |
| View public /status | anyone |
| API Explorer | admin (logged-in default; some endpoints public) |
See also
- Watchdog runbook
- Portal runbook — log locations
- Domain & TLS runbook — public endpoints