AgentSSH — Local-first AI SSH operator workspace

Why this exists

Remote work spreads across too many surfaces

Remote server work often spreads across too many surfaces: a terminal, notes, browser searches, documentation, and a separate AI chat. That split makes it easy to lose context, repeat commands, or accept advice without seeing exactly what will happen on the machine.

AgentSSH pulls those pieces into one local workspace. The terminal remains the source of truth, while the assistant becomes a guided layer beside it: proposing one step at a time, explaining why, waiting for approval when needed, and leaving behind a timeline of requests, decisions, output, errors, and cancellations.

The important idea is not "AI runs shell commands." It is that AI can help operate a machine only when the product gives the human clear brakes, context, visibility, and a reliable approval path.

What you can do

Features at a glance

Open a real SSH terminal in the browser

Connect to a Linux host through a local Node backend and interact with a full xterm.js terminal session that resizes with the workspace.

Ask the agent for operational help

Use natural language for common inspection tasks like checking disk usage, memory pressure, running services, listening ports, logs, system identity, or package state.

Review proposed actions before they run

Agent actions become approval cards with a command, risk label, and reason. The operator can approve, deny, or cancel the whole task.

Auto-run known read-only checks

For low-risk inspection commands, the app can skip repetitive approvals when the local policy allows it. Elevated or unknown command shapes stay gated.

Stop work in progress

The UI includes task cancellation and command interruption so a long-running or unwanted agent action can be stopped from the interface.

Search the web with controls

Optional DuckDuckGo search and public page fetching let the agent look up external documentation or version-specific context, with configurable limits and approvals.

Watch the audit trail

A live timeline records user requests, proposals, approvals, denials, cancellations, command output events, connection changes, search/fetch events, and agent messages.

Use local or OpenAI-compatible models

The backend speaks to OpenAI-compatible chat endpoints with two agent protocols: a plain-text protocol for llama.cpp-style local servers and optional native function calling for more capable backends. Without a model configured, it falls back to keyword-matched common admin tasks.

Architecture

How it works

AgentSSH is a two-process local web app: a Vite/React client for the operator interface and a Node/Express backend that owns SSH, agent orchestration, policy decisions, and WebSocket messaging.

flowchart TB
  operator["Operator"]

  subgraph browser["Browser - React + Vite"]
    direction LR
    term["xterm.js Terminal<br/>locked during agent commands,<br/>unlocked for password prompts"]
    chat["Agent Chat<br/>markdown answers"]
    cards["Approval Cards<br/>risk label + reason<br/>approve / deny / cancel"]
    think["Thinking Panel<br/>reasoning - planning - analysis"]
    meter["Status Bar<br/>connection state + context meter"]
    timeline["Audit Timeline view"]
    wsclient["WebSocket Client<br/>auto-reconnect with backoff,<br/>settings re-sync on reopen"]
  end

  operator -->|"terminal input, questions,<br/>approval decisions, settings"| browser
  term --- wsclient
  chat --- wsclient
  cards --- wsclient
  wsclient <-->|"origin-validated<br/>local WebSocket"| handler

  subgraph backend["Local Node Backend - Express + ws"]
    handler["WebSocket Handler<br/>message routing - approval queue -<br/>output buffering and throttling -<br/>bounded chat and audit history"]
    policy["Command Policy Classifier<br/>read-only / write / destructive<br/>auto-run / approval required / blocked"]
    sshsvc["SSH Service - ssh2<br/>persistent shell session -<br/>reliable completion and exit-status<br/>detection - output safety limits -<br/>command timeouts - interrupt path"]
    aisvc["AI Service<br/>one-action-at-a-time loop -<br/>text protocol or native tool calls -<br/>bounded task context - dedup of<br/>repeated actions - retries"]
    searchsvc["Search Service<br/>DuckDuckGo results - readable<br/>page extraction - private-network<br/>blocking - every redirect re-checked -<br/>rate limiting and caching"]
    auditlog["Audit Log<br/>requests - proposals - decisions -<br/>output events - errors - cancellations"]
  end

  handler -->|"user request +<br/>bounded history<br/>(never credentials)"| aisvc
  aisvc -->|"next action:<br/>command / search /<br/>fetch / final answer"| handler
  handler -->|"proposed command"| policy
  policy -->|"policy decision"| handler
  handler -->|"approved or auto-run<br/>commands only"| sshsvc
  handler -->|"approved searches<br/>and page fetches"| searchsvc
  searchsvc -->|"public results<br/>and excerpts"| handler
  handler --> auditlog
  sshsvc <-->|"encrypted SSH channel<br/>(credentials stay here,<br/>session-only)"| remote
  aisvc <-->|"chat completions<br/>(no credentials, no raw secrets)"| llm
  searchsvc -->|"public URLs only"| web
  handler -->|"terminal output, agent status,<br/>thoughts, proposals, command state,<br/>context estimates, audit events"| wsclient

  classDef browser fill:#2f8cff14,stroke:#2f8cff40,color:#f2f7ff,stroke-width:1px
  classDef transport fill:#67d9ff14,stroke:#67d9ff40,color:#f2f7ff,stroke-width:1px
  classDef backend fill:#2dd4bf14,stroke:#2dd4bf40,color:#f2f7ff,stroke-width:1px
  classDef external fill:#9fb2c814,stroke:#9fb2c840,color:#9fb2c8,stroke-width:1px
  classDef policy fill:#f59e0b14,stroke:#f59e0b40,color:#f2f7ff,stroke-width:1px
  class operator,term,chat,cards,think,meter,timeline browser
  class wsclient transport
  class handler,sshsvc,aisvc,searchsvc,auditlog backend
  class policy policy
  class remote,llm,web external

User input enters through the browser, but every execution decision happens in the local backend. The agent only ever sees task context, approved tool results, and bounded command history; SSH credentials stop at the SSH service, command proposals pass through the policy classifier and approval queue, and the same handler mirrors all state back to the UI over an origin-checked local WebSocket.

Safety model

Built around human approval

AgentSSH treats command execution as a product safety problem. The assistant does not receive SSH credentials, does not directly own the terminal, and does not silently run state-changing work. It proposes an action, labels the risk, gives a reason, and waits for the operator or the configured local policy.

Layer	What it protects	How it shows up in the UI
Token + origin-validated control channel	Stops other websites and stray local clients from talking to the local backend	Invisible when working normally; the app provisions the token itself and foreign pages cannot connect
Session-only credentials	Prevents secrets from becoming project data or model context	Password/key fields are used for connection setup and not displayed in logs
SSH host-key pinning	Detects a changed or forged server identity	First connect pins the key; a later mismatch blocks the connection with a clear explanation
Output redaction	Keeps likely secrets out of model context and history	Command and web output is scrubbed for credential-shaped values before the agent sees it
Sensitive-file gating	Stops secret files being read into the model unprompted	Reads of credential files are gated behind approval, like a state-changing command
Risk classification	Separates inspection from state-changing or dangerous operations	Approval cards show a risk label and reason
Elevated-command gating	Keeps privileged commands behind their own opt-in policy, even read-only ones	A separate settings toggle for elevated read-only auto-run
Approval gates	Keeps the human in control of meaningful changes	Approve, deny, and cancel actions are visible on each proposal
Honest execution results	Prevents the agent from reasoning on false success after a failed command	Command failures surface with their real exit state in status and timeline
Task cancellation and timeouts	Gives the operator an escape hatch and stops hung commands on their own	Stop Agent and Stop Command controls appear during active work; stalled commands are interrupted automatically
Audit timeline	Makes the workflow inspectable after the fact	Requests, proposals, decisions, output events, errors, and cancellations are timestamped
Web search controls	Prevents surprise external lookups, excessive page reads, or requests into the local network	Search can be disabled, capped, and gated before page fetches
Output limits	Prevents runaway output from overwhelming the interface	Large output is throttled, suppressed, or truncated with visible status

Engineering highlights

Under the hood

WebSocket-first interaction model

Terminal bytes, agent status, command proposals, approvals, audit events, and context estimates move through a single live channel that accepts connections only from the app's own browser origin and only with an auto-provisioned session token.
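A minimal sketch of what an origin-plus-token gate on the control channel could look like. The `UpgradeRequest` shape and the function names are hypothetical, chosen for illustration; they are not taken from the AgentSSH code, and how the real backend reads the token from the handshake is not described here.

```typescript
// Hypothetical view of a WebSocket upgrade request (names are assumptions).
interface UpgradeRequest {
  origin: string | undefined; // value of the Origin header, if any
  token: string | undefined;  // session token presented by the client
}

// Compare two strings without exiting early on the first differing character.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Accept a connection only when both the origin and the token match.
function isAllowedConnection(
  req: UpgradeRequest,
  allowedOrigin: string,
  sessionToken: string,
): boolean {
  if (req.origin === undefined || req.token === undefined) return false;
  if (req.origin !== allowedOrigin) return false;
  return constantTimeEqual(req.token, sessionToken);
}
```

Checking both values matters: the origin check turns away foreign web pages, while the token turns away local clients that can forge an Origin header.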

Resilient client connection

If the local link drops, the UI reconnects automatically with backoff and re-syncs the latest settings, so a refreshed or recovered session picks up where it left off.
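Reconnect-with-backoff can be sketched as a capped exponential delay. The base delay, the cap, and the function name below are illustrative assumptions, not AgentSSH's actual values.

```typescript
// Delay before reconnect attempt number `attempt` (0-based).
// baseMs and maxMs are assumed defaults for illustration only.
function reconnectDelayMs(attempt: number, baseMs = 500, maxMs = 15000): number {
  const exponential = baseMs * 2 ** Math.max(0, attempt);
  return Math.min(maxMs, exponential);
}
```

With these defaults the delays run 500 ms, 1 s, 2 s, 4 s, 8 s, and then hold at the 15 s cap. A real client would typically add random jitter and reset the attempt counter after a successful reopen.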

Session-scoped SSH handling

Credentials are accepted for connection setup but are not persisted, logged, or passed into the AI prompt.

Truthful command results

The backend detects real command completion and the actual exit status rather than guessing from prompt text, so failed commands are reported to the agent as failures instead of silent successes. Long-running commands also have wall-clock timeouts with a clean interrupt path.
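This page does not say how completion is detected. One common technique for a persistent shell, shown here purely as an assumption, is to append a unique sentinel that prints the exit status and then scan the output stream for it. The helper names and marker format are hypothetical.

```typescript
// Append a printf that emits "<marker>:<exit status>" on its own line.
// Assumes `marker` contains only letters, digits, and underscores.
function wrapWithSentinel(command: string, marker: string): string {
  return `${command}\nprintf '\\n${marker}:%s\\n' "$?"`;
}

interface CommandResult {
  output: string;
  exitCode: number;
}

// Return the result once the sentinel line has arrived, otherwise null.
// The echoed printf text itself never matches, because it contains
// "%s" where the real sentinel line contains digits.
function parseCompletion(buffer: string, marker: string): CommandResult | null {
  const pattern = new RegExp(`\\r?\\n${marker}:(\\d+)\\r?\\n`);
  const match = pattern.exec(buffer);
  if (match === null) return null;
  return { output: buffer.slice(0, match.index), exitCode: Number(match[1]) };
}
```

Because the exit status comes from the shell's own `$?`, a failed command reaches the agent as a failure even when its output looks harmless.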

Command policy layer

Proposed shell actions are classified into read-only, write, or destructive categories, with policy outcomes such as auto-run, approval-required, or blocked. Elevated read-only commands are gated behind their own opt-in policy, separate from plain read-only auto-run. The classifier is hardened to fail closed, so unknown command shapes stay gated rather than auto-running, and it is covered by an automated test suite that runs in CI.
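A fail-closed classifier can be sketched as a small allowlist where anything unrecognized falls through to approval. The command lists, the setting names, and the choice to map destructive commands to "blocked" are assumptions made for this example; the real classifier's rules are not listed on this page.

```typescript
type Risk = "read-only" | "write" | "destructive";
type Decision = "auto-run" | "approval-required" | "blocked";

// Hypothetical settings mirroring the two separate auto-run toggles.
interface PolicySettings {
  autoRunReadOnly: boolean;
  autoRunElevatedReadOnly: boolean;
}

interface Verdict {
  risk: Risk;
  decision: Decision;
  elevated: boolean;
}

// Illustrative lists only; a real policy would be far more thorough.
const READ_ONLY = new Set(["df", "free", "uptime", "uname", "whoami", "id"]);
const DESTRUCTIVE = new Set(["mkfs", "shred"]);

function classify(command: string, settings: PolicySettings): Verdict {
  const trimmed = command.trim();
  // Shell metacharacters make the command shape unknown: fail closed.
  if (trimmed === "" || /[;&|<>`$(){}\\\n]/.test(trimmed)) {
    return { risk: "write", decision: "approval-required", elevated: false };
  }
  const words = trimmed.split(/\s+/);
  const elevated = words[0] === "sudo";
  const program = (elevated ? words[1] : words[0]) ?? "";

  if (DESTRUCTIVE.has(program)) {
    return { risk: "destructive", decision: "blocked", elevated };
  }
  if (READ_ONLY.has(program)) {
    const allowed = elevated ? settings.autoRunElevatedReadOnly : settings.autoRunReadOnly;
    return { risk: "read-only", decision: allowed ? "auto-run" : "approval-required", elevated };
  }
  // Anything not explicitly recognized is treated as state-changing.
  return { risk: "write", decision: "approval-required", elevated };
}
```

The default branch is the important part: a command has to earn auto-run by matching a known shape, so a gap in the lists produces an extra approval prompt rather than an unreviewed execution.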

SSH host-key pinning

The first connection to a host pins its key on a trust-on-first-use basis. A later key mismatch blocks the connection with a clear explanation rather than silently reconnecting to a changed or forged server identity.
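The trust-on-first-use rule reduces to a three-way check against a store of known fingerprints. This sketch uses an in-memory `Map` and hypothetical names; where AgentSSH actually keeps its pins is not specified here.

```typescript
type PinResult = "pinned" | "match" | "mismatch";

// `pins` maps "host:port" to the fingerprint seen on first connect.
function checkHostKey(
  pins: Map<string, string>,
  host: string,
  port: number,
  fingerprint: string,
): PinResult {
  const id = `${host}:${port}`;
  const known = pins.get(id);
  if (known === undefined) {
    pins.set(id, fingerprint); // first contact: trust and remember
    return "pinned";
  }
  return known === fingerprint ? "match" : "mismatch";
}
```

A caller would proceed on "pinned" or "match" and refuse the connection on "mismatch", leaving the stored pin untouched until the operator decides what to do.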

Secret redaction and sensitive-file gating

Command and web output is scrubbed for credential-shaped values before it reaches the model, and reads of sensitive files are gated behind approval like a state-changing command, so secrets do not quietly become model context.
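Scrubbing credential-shaped values can be approximated with a few regular expressions applied before output enters model context. The two patterns below are illustrative assumptions and will miss many real secret formats; they are not AgentSSH's actual rules.

```typescript
interface RedactionRule {
  pattern: RegExp;
  replacement: string;
}

// Illustrative rules only: PEM private-key blocks and key=value secrets.
const RULES: RedactionRule[] = [
  {
    pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
    replacement: "[REDACTED]",
  },
  {
    // Keep the key name and separator, replace only the value.
    pattern: /(password|passwd|secret|token|api[_-]?key)(\s*[=:]\s*)(\S+)/gi,
    replacement: "$1$2[REDACTED]",
  },
];

function redact(text: string): string {
  return RULES.reduce((acc, rule) => acc.replace(rule.pattern, rule.replacement), text);
}
```

Keeping the key name visible lets the agent still reason about which setting exists without ever seeing its value.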

Agent file tools over SFTP

The agent can list and read files through SFTP, with writes gated behind approval, and the operator has a read-only file browser. File access follows the same approval and redaction rules as commands.

One-action-at-a-time agent loop

The model is asked to choose a single next action, then output is fed back into the next step. This keeps the workflow inspectable. The loop also deduplicates repeated actions and recovers from malformed model responses.
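Deduplicating repeated actions can be as simple as remembering a normalized key for each action already taken in the current task. The `AgentAction` shape and class name are hypothetical, and the real loop may compare actions differently.

```typescript
// Hypothetical action shape for illustration.
interface AgentAction {
  kind: "command" | "search" | "fetch";
  value: string;
}

class ActionDeduper {
  private readonly seen = new Set<string>();

  // Returns true the first time an action is offered, false for a repeat.
  accept(action: AgentAction): boolean {
    const normalized = action.value.trim().replace(/\s+/g, " ");
    const key = `${action.kind}:${normalized}`;
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}
```

When `accept` returns false, the loop can tell the model it already has that result instead of running the same step again.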

Two agent protocols

A plain-text command protocol keeps the loop working on minimal llama.cpp-style backends, while an optional native function-calling mode gives more reliable parsing on capable endpoints.

Cancellable long runs

Agent tasks have an action cap; once it is reached, the agent must ask before continuing. Users can cancel the whole task or interrupt a running command at any point.

Output safety limits

Command output is buffered, throttled, and truncated in several places so runaway commands do not overwhelm the UI or prompt context. Server-side chat and audit histories are bounded so long sessions do not grow without limit.
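One way to truncate with visible status is to keep the head and tail of the output and say how much was dropped. The function name and the head/tail split are assumptions for this sketch.

```typescript
interface Truncated {
  text: string;
  truncated: boolean;
}

// Keep roughly the first and last halves of the budget and mark the gap.
function truncateOutput(output: string, maxChars: number): Truncated {
  if (output.length <= maxChars) return { text: output, truncated: false };
  const head = Math.ceil(maxChars / 2);
  const tail = Math.floor(maxChars / 2);
  const omitted = output.length - maxChars;
  const text =
    output.slice(0, head) +
    `\n[... ${omitted} characters omitted ...]\n` +
    output.slice(output.length - tail);
  return { text, truncated: true };
}
```

Keeping the tail is deliberate: error messages and final status lines usually appear at the end of command output.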

Sensitive input handling

While an agent command is active, terminal input is locked unless the shell is waiting for a password or passphrase-style prompt.

Context awareness

The UI shows estimated context usage so long troubleshooting runs do not silently drift toward model limits.
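A context meter does not need an exact tokenizer to be useful. The sketch below uses the rough rule of thumb of about four characters per token, which is an assumption and not necessarily how AgentSSH estimates usage.

```typescript
interface ContextEstimate {
  tokens: number;   // estimated tokens used
  fraction: number; // share of the context window, clamped to 1
}

// Rough heuristic: about 4 characters per token (an assumption).
function estimateContextUsage(messages: string[], contextWindowTokens: number): ContextEstimate {
  const chars = messages.reduce((sum, m) => sum + m.length, 0);
  const tokens = Math.ceil(chars / 4);
  return { tokens, fraction: Math.min(1, tokens / contextWindowTokens) };
}
```

The estimate is only approximate, so a meter built on it is best read as an early warning rather than a precise count.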

Fallback mode

Without an AI endpoint, common server-inspection requests map to safe, predictable command suggestions.
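A keyword fallback of this kind can be sketched as a small lookup table. The keywords and the suggested commands below are illustrative assumptions; the actual mapping is not listed on this page.

```typescript
// Illustrative table: request keywords mapped to an inspection command.
const FALLBACKS: { keywords: string[]; command: string }[] = [
  { keywords: ["disk", "space", "storage"], command: "df -h" },
  { keywords: ["memory", "ram"], command: "free -h" },
  { keywords: ["port", "ports", "listening"], command: "ss -tln" },
  { keywords: ["uptime", "load"], command: "uptime" },
];

// Return the first suggestion whose keyword appears as a whole word.
function fallbackCommand(request: string): string | null {
  const words = new Set(request.toLowerCase().split(/[^a-z0-9]+/));
  for (const entry of FALLBACKS) {
    if (entry.keywords.some((k) => words.has(k))) return entry.command;
  }
  return null;
}
```

Matching whole words avoids accidental hits such as "report" triggering the "port" entry. A suggestion produced this way would still pass through the same policy and approval path as a model-proposed command.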

Web search boundaries

Public web search is opt-in, capped, cache-aware, and refuses local/private network URLs - including re-checking every redirect hop, not just the first URL.
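The per-hop check can be sketched as a host filter applied to every URL in a redirect chain. This example assumes hostnames have already been extracted from each URL and covers only literal IPv4 addresses and localhost names. A production check would also need to resolve DNS and handle IPv6. All names here are hypothetical.

```typescript
// Parse a dotted-quad IPv4 literal, or return null for anything else.
function parseIPv4(host: string): [number, number, number, number] | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return null;
  const a = Number(m[1]), b = Number(m[2]), c = Number(m[3]), d = Number(m[4]);
  if (a > 255 || b > 255 || c > 255 || d > 255) return null;
  return [a, b, c, d];
}

// True for loopback, private, link-local, and "this network" targets.
function isBlockedHost(host: string): boolean {
  const h = host.toLowerCase();
  if (h === "localhost" || h.endsWith(".localhost")) return true;
  const ip = parseIPv4(h);
  if (ip === null) return false; // plain hostname: real code would resolve it first
  const [a, b] = ip;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Every hop in a redirect chain must pass, not just the first one.
function allHopsPublic(hosts: string[]): boolean {
  return hosts.every((h) => !isBlockedHost(h));
}
```

Checking every hop closes the gap where a public URL answers with a redirect that points back into the local network.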

Accessible operator UI

Collapsible panels are keyboard-operable with proper ARIA state, and status indicators do not rely on color alone.

Walkthrough

A typical session

1. Connect

Open AgentSSH locally, enter a host, username, port, and either password or private-key credentials. The connection status turns green when the SSH shell is ready.

2. Ask

Ask a plain-language question: "check disk space", "summarize web server status", or "what changed in the logs."

3. Review

The agent proposes one action at a time. Each proposal includes the command or tool action, a short reason, and a risk label.

4. Run

Approved actions run through the SSH service. Output streams into the terminal and is summarized back into the agent loop.

5. Decide

Approve the next step, deny it, cancel the task, or use the terminal directly. The audit timeline keeps a record of the entire session.

Screenshots

Reserved for future capture

All screenshots require a redaction pass before public display. Captions describe what each frame will contain once cleared. Replace hostnames, usernames, IP addresses, and paths with generic demo values.

Screenshot slot 01 Connection modal — Session-only SSH setup with password or private-key authentication. Use redacted placeholder values.

Screenshot slot 02 Main workspace — A live browser terminal next to the agent chat, with status and context usage in the top bar.

Screenshot slot 03 Command approval card — The agent proposes a single action with a risk label, reason, and approve/deny/cancel controls.

Screenshot slot 04 Agent thinking panel — Reasoning, planning, and analysis events stream into a collapsible panel while the task runs.

Screenshot slot 05 Audit timeline — Requests, proposals, approvals, denials, cancellations, output events, and errors are recorded chronologically.

Screenshot slot 06 Settings modal — Local policy controls for AI endpoint, context window, native tool calling, read-only and elevated auto-run, approval memory, and web search limits.

Lessons

What stuck

The hardest part was not wiring a model to a terminal. It was designing the control surface around it, which was also the more interesting problem: when to ask, when to block, how to make progress visible, and how to stop cleanly when the user changes their mind.

Agentic tools need observable state. A spinner is not enough when a tool can touch a real machine.

Approval flows work best when they are part of the main workspace, not a last-second warning.

The terminal should remain the source of truth; the assistant should explain and propose around it.

Safety policy needs layers: credential boundaries, command classification, output limits, cancellation, and auditability all cover different failure modes.

Fallback behavior matters because a useful local tool should degrade gracefully when an AI endpoint is not configured.

Current status

What's built and what's next

AgentSSH is an MVP with a working React/Vite frontend, Node/Express backend, browser terminal, SSH session service, OpenAI-compatible agent loop, command approval cards, cancellable tasks, local policy settings, optional web search, context usage display, and a live audit timeline.

Built

React + Vite frontend
xterm.js terminal with fit addon and responsive resizing
Node.js + Express backend
WebSocket message routing for terminal, chat, status, approvals, and audit events
Token- and origin-validated WebSocket channel, with the token auto-provisioned by the app
Automatic client reconnection with backoff and settings re-sync
SSH connections through the ssh2 library
Password and private-key authentication support
Reliable command completion and real exit-status detection, so failures are reported honestly to the agent
Wall-clock command timeouts with automatic interrupt for stalled commands
OpenAI-compatible AI service with streaming responses
Optional native function-calling mode alongside the llama.cpp-friendly text protocol
Fallback keyword mapper for common server inspection tasks
Hardened, fail-closed command classifier with an automated test suite and CI
Approval, denial, cancellation, and remembered session approval paths
Optional auto-run for known read-only command shapes
Optional elevated read-only auto-run setting, gated separately from plain read-only auto-run
Agent task action cap with approval before continuation
Stop Agent and Stop Command controls
Web search and page fetch service with limits, private-network blocking, and per-redirect re-validation
Secret redaction of likely credentials before output reaches the model, plus approval gating for sensitive-file reads
SSH host-key pinning (trust-on-first-use) that blocks a later key mismatch
SFTP-backed agent file tools (list, read, approval-gated write) and a read-only file browser
Bounded server-side chat and audit histories for long sessions
Context usage estimator
Markdown rendering for assistant responses
Keyboard-accessible panel toggles and ARIA-labelled controls
Audit timeline for operator-visible history, with JSONL export
Saved connection profiles (non-secret fields only), reusable prompt playbooks, and light/dark/high-contrast themes
Error and success notifications, and keyboard shortcuts for approvals

Next steps

Multiple concurrent SSH sessions
Optional secure storage for connection secrets (profiles currently store non-secret fields only)
Filterable audit history (export is already supported)
Richer command result summaries
Per-host policy profiles