Admin · Platform health

The platform-health pages give admins visibility into the running services and recent incidents.

Mitch Wigham

Updated 24 June 2026

Uptime

/uptime

The user-facing Uptime monitoring page. Anyone with the watchdog feature can configure three kinds of monitor:

Kind	What it checks	Notes
⚙️ Service	Internal docker service URL	Supports auto-restart of the named container
🌐 Website	External HTTP/HTTPS URL	No restart — pure status check
🖥️ Device	RMM endpoint freshness	OK while the device's last-seen timestamp is within the chosen window (default 5 min)

Common controls per target:

Name + interval (poll cadence) + failure threshold
Public flag → target appears on /status for customers
Group + display order for the public-page layout
⟳ Trigger an ad-hoc check
⏻ Manually restart the named container (Service kind only)
Edit / remove

Header stats: total targets, OK / FAIL counts, open incidents, restart attempts in the last 24 h, rolling 24 h uptime %.
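The page does not spell out how the rolling 24 h uptime % is calculated. As a rough illustration only, the sketch below assumes it is the share of checks inside the window that came back OK; the function name, the (timestamp, ok) record shape and that definition are all assumptions, not the watchdog's documented behaviour.

```python
from typing import Iterable, Optional, Tuple

WINDOW_SECONDS = 24 * 3600  # the rolling 24 h window from the header stats


def rolling_uptime_percent(checks: Iterable[Tuple[float, bool]], now: float,
                           window_seconds: int = WINDOW_SECONDS) -> Optional[float]:
    """Percentage of checks inside the window that were OK (assumed definition).

    `checks` is an iterable of (timestamp_seconds, ok) pairs.
    Returns None when no check falls inside the window.
    """
    recent = [ok for ts, ok in checks if now - window_seconds <= ts <= now]
    if not recent:
        return None
    return 100.0 * sum(recent) / len(recent)


now = 100_000.0
sample_checks = [
    (now - 90_000, False),  # older than 24 h, ignored
    (now - 3_600, True),
    (now - 1_800, True),
    (now - 900, False),
    (now - 60, True),
]
uptime = rolling_uptime_percent(sample_checks, now)
print(uptime)  # 75.0
```

A time-weighted definition (seconds spent OK divided by seconds in the window) would give different numbers when intervals vary between targets, so treat this only as a way to reason about the stat.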

📷 Screenshot placeholder: screenshots/uptime.png

Device-kind targets

When you pick Device, the form swaps the URL input for an RMM device dropdown plus a freshness window (seconds). The watchdog poller flips the target to FAIL when now - device.lastSeenAt exceeds the window — which is what you want when you can't reach the endpoint with HTTP but the agent is still expected to phone home.

⚠️ Caution. A FAIL on a device-kind target means the agent has gone silent, not necessarily that the box is off — flaky network can trip it. Pair with the device's RMM alerts for the full picture.
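The freshness rule for Device-kind targets can be sketched in a few lines. This is a minimal illustration, assuming the comparison is a strict "exceeds"; the function and constant names are hypothetical and are not taken from the watchdog source.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_WINDOW_SECONDS = 300  # the 5 min default from the monitor table


def device_status(last_seen_at: datetime, now: datetime,
                  window_seconds: int = DEFAULT_WINDOW_SECONDS) -> str:
    """Return "FAIL" when now - last_seen_at exceeds the window, else "OK"."""
    age = now - last_seen_at
    return "FAIL" if age > timedelta(seconds=window_seconds) else "OK"


now = datetime(2026, 6, 24, 12, 0, 0, tzinfo=timezone.utc)
recent = device_status(now - timedelta(seconds=120), now)   # seen 2 min ago
silent = device_status(now - timedelta(seconds=301), now)   # just past the window
print(recent, silent)  # OK FAIL
```

Note that the result depends only on the agent's last report, which is why a flaky network can produce a FAIL on a machine that is still running.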

/admin/health and /uptime render the same component against the watchdog service. /uptime is that page surfaced outside the admin area, so it can sit in the main sidebar for day-to-day operations monitoring.

Health

/admin/health

The admin-side view of the watchdog. It shows the monitor targets you have configured (the Service / Website / Device kinds described above), not a fixed list of backend services — what appears here is whatever has been added as a target.

+----------------------------------------------------------------+
| Name              Kind      Status   Latency   Last checked     |
|----------------------------------------------------------------|
| portal            ⚙ Service  ✓ OK     14ms      12:04:31        |
| auth-service      ⚙ Service  ✓ OK     12ms      12:04:31        |
| helpdesk-service  ⚙ Service  ✓ OK     18ms      12:04:31        |
+----------------------------------------------------------------+

📷 Screenshot placeholder: screenshots/admin-health.png

Each target row carries its last status, last latency, last check time and consecutive-failure count. You can add, edit and remove targets, trigger an ad-hoc check, and (for Service-kind targets with a container name) trigger a restart. Header stats mirror the /uptime page: total targets, OK / FAIL counts, open incidents, restarts in the last 24 h and the rolling 24 h uptime %.

A fresh seeded install ships three Service targets — portal, auth-service and helpdesk-service — each pointed at the service's /health endpoint.

Status page

/status is the public version of health: what your customers see during incidents. It lists the targets you have flagged Public (grouped by their Group name) plus any RMM devices flagged for the public status page. Each entry has a status pip; monitor targets also show a rolling check-history bar and a 7-day uptime %. No internal details are exposed (no per-service host info, no credentials).

Watchdog

The watchdog-service runs as a separate process (port 3019). It runs a poll tick every 10 s (POLL_TICK_SECONDS); on each tick it checks any target whose own interval is due — the per-target default is 30 s. Each check is one of:

Service / Website — an HTTP probe of the target URL
Device — a freshness check against an RMM device's last-seen timestamp
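The relationship between the 10 s tick and the per-target interval can be sketched as follows. This is a simplified model rather than the watchdog's actual scheduler: the Target class, the due_targets helper and the slow-site target are hypothetical, and only POLL_TICK_SECONDS and the 30 s default come from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional

POLL_TICK_SECONDS = 10         # tick cadence named above
DEFAULT_INTERVAL_SECONDS = 30  # per-target default named above


@dataclass
class Target:
    name: str
    interval_seconds: int = DEFAULT_INTERVAL_SECONDS
    last_checked_at: Optional[float] = None  # None means never checked


def due_targets(targets: List[Target], now: float) -> List[Target]:
    """Return the targets whose own interval has elapsed at time `now`."""
    return [
        t for t in targets
        if t.last_checked_at is None
        or now - t.last_checked_at >= t.interval_seconds
    ]


targets = [Target("portal"), Target("slow-site", interval_seconds=60)]
checked = {}
# Simulate one minute of ticks: 0 s, 10 s, ... 60 s.
for now in range(0, 61, POLL_TICK_SECONDS):
    for t in due_targets(targets, now):
        t.last_checked_at = now
        checked.setdefault(t.name, []).append(now)
print(checked)  # {'portal': [0, 30, 60], 'slow-site': [0, 60]}
```

Under this model an interval that is not a multiple of the tick is effectively rounded up to the next tick, so a 25 s interval would behave like 30 s.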

When a target's consecutive failures reach its failure threshold, the watchdog:

Opens a health incident and flips the target to FAIL on /admin/health and /uptime
Shows an incident on /status if the target is Public
Raises a CRITICAL RMM alert if an RMM device matches the target's name (so the org-wide alert queue stays the single source of truth)
For a Service-kind target with auto-restart on and a container name set, attempts docker restart of that container (only when the service is built with ENABLE_DOCKER_RESTART=true)
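The threshold logic above can be modelled as a small state machine. The sketch below is illustrative only: the class and function names, the threshold of 3, and the assumption that a passing check resets the counter and resolves the incident are not stated in this page.

```python
from dataclasses import dataclass


@dataclass
class TargetState:
    failure_threshold: int = 3   # hypothetical value; configured per target
    consecutive_failures: int = 0
    status: str = "OK"
    incident_open: bool = False


def record_check(state: TargetState, ok: bool) -> bool:
    """Apply one check result. Return True only when this check opens an incident."""
    if ok:
        # Assumption: a passing check resets the counter and resolves the incident.
        state.consecutive_failures = 0
        state.status = "OK"
        state.incident_open = False
        return False
    state.consecutive_failures += 1
    reached = state.consecutive_failures >= state.failure_threshold
    if reached and not state.incident_open:
        state.status = "FAIL"
        state.incident_open = True
        return True  # the point where alerts and any restart would be triggered
    return False


state = TargetState()
results = [False, False, True, False, False, False, False]
opened = [record_check(state, ok) for ok in results]
print(opened)  # [False, False, False, False, False, True, False]
```

Two failures followed by a success do not open an incident, which is the purpose of the threshold: a single dropped probe should not page anyone or restart a container.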

The watchdog also writes a heartbeat file so an external monitor can detect a dead watchdog.
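An external monitor could treat the heartbeat as stale when the file is missing or has not been modified recently. The sketch below assumes the watchdog refreshes the file's modification time on each heartbeat; the file name, the 30 s limit and the helper itself are hypothetical, since this page does not document the heartbeat path or format.

```python
import os
import tempfile
import time
from typing import Optional


def heartbeat_is_stale(path: str, max_age_seconds: float,
                       now: Optional[float] = None) -> bool:
    """True when the heartbeat file is missing or older than max_age_seconds."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing or unreadable file counts as a dead watchdog
    current = time.time() if now is None else now
    return current - mtime > max_age_seconds


# Demonstration against a temporary file standing in for the real heartbeat.
with tempfile.TemporaryDirectory() as tmp:
    hb = os.path.join(tmp, "watchdog.heartbeat")  # hypothetical file name
    with open(hb, "w") as f:
        f.write("alive")
    written_at = os.path.getmtime(hb)
    fresh = heartbeat_is_stale(hb, 30, now=written_at + 5)
    stale = heartbeat_is_stale(hb, 30, now=written_at + 31)
    missing = heartbeat_is_stale(os.path.join(tmp, "absent"), 30)
print(fresh, stale, missing)  # False True True
```

The maximum age should be comfortably larger than the interval at which the watchdog writes the file, otherwise a slightly delayed write raises a false alarm.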

Messaging health

/admin/messaging

The chat / messaging service has its own health view:

Connected users (live count)
Messages/sec
Channel count
WebSocket reconnect rate (any sustained spike indicates trouble)

Meetings health

/admin/meetings

Active meetings
Average attendees per meeting
Recording storage used
Jitsi cluster status (if multi-host)

Studio service health

/admin/studio

This view tracks the design-service that renders previews. It is usually green; if it turns red, branding edits won't preview, but the rest of the platform still works.

API explorer

/api-explorer

OpenAPI documentation for every backend service. Use it to:

Confirm a service is reachable from your browser
Test specific endpoints with the Try it button (auth headers applied automatically from your session)
Find request/response shapes when integrating

📷 Screenshot placeholder: screenshots/api-explorer.png

Logs

The platform writes structured JSON logs to stdout. On a docker-compose install, the runbook ../runbooks/portal.md shows where they go and how to tail them.
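Because the logs are one JSON object per line, they are easy to filter once captured. The sketch below is an illustration only: the field names level, service and msg are assumptions, as the actual log schema is not described here, and lines that are not valid JSON are skipped.

```python
import json
from typing import Iterable, List


def filter_logs(lines: Iterable[str], level: str = "error") -> List[dict]:
    """Return parsed log records whose (assumed) "level" field matches `level`."""
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip plain-text lines mixed into the stream
        if isinstance(record, dict) and str(record.get("level", "")).lower() == level:
            matches.append(record)
    return matches


# Hypothetical sample lines; real field names may differ.
sample = [
    '{"level": "info", "service": "portal", "msg": "request served"}',
    'not json at all',
    '{"level": "error", "service": "auth-service", "msg": "token check failed"}',
]
errors = filter_logs(sample)
print([r["service"] for r in errors])  # ['auth-service']
```

To use it against live output, pass sys.stdin as the lines argument and pipe the service's log stream into the script.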

Common workflows

Investigate a slow page

/admin/health → check the latency column on each monitor target.
Open the target → review its recent checks and incident history.
Cross-reference with the service logs (see the portal runbook).

After an incident

Open /admin/audit → filter to the incident window.
Cross-reference with /admin/health restart history.
Post a public update on /status (Admin → Status page → New incident).

Check that the relay is up

Add a Service-kind target on /admin/health pointed at the relay's /health endpoint, or
Hit the relay's /health endpoint directly — it returns { "status": "ok", "service": "relay-service", ... }.
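The direct check can be scripted. The sketch below validates only the two fields shown in the response above and ignores the rest; the function names are hypothetical, the URL you pass in depends on your install, and treating HTTP 200 as the success status is an assumption.

```python
import json
import urllib.request


def is_healthy(body: str, expected_service: str = "relay-service") -> bool:
    """True when the body is JSON with status "ok" and the expected service name."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "ok" and payload.get("service") == expected_service


def check_relay(url: str, timeout: float = 5.0) -> bool:
    """Fetch the relay's /health URL and validate the response (assumes HTTP 200)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        return False  # unreachable host, HTTP error status, or malformed URL


good = is_healthy('{"status": "ok", "service": "relay-service", "uptime": 120}')
bad = is_healthy('{"status": "degraded", "service": "relay-service"}')
print(good, bad)  # True False
```

Call check_relay with the full address of the relay's /health endpoint; it returns False rather than raising when the relay cannot be reached.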

Permissions

Action	Role
View admin health	`admin`
Post status updates	`admin`
View public /status	anyone
API Explorer	`admin` (logged-in default; some endpoints public)