SSH Tunneling¶

Securely access remote services running on your GPU instance through SSH port forwarding.

Overview¶

SSH tunneling creates encrypted connections from your local machine to services running on the GPU instance:

Your Computer                GPU Instance
localhost:8000  <---SSH--->  remote:8000 (SGLang API)
localhost:5678  <---SSH--->  remote:5678 (n8n)
localhost:8080  <---SSH--->  remote:8080 (Status Daemon)

All traffic flows through a single encrypted SSH connection, eliminating the need to expose ports publicly.

Quick Start¶

Start a tunnel to your active instance:

soong tunnel start

This forwards three ports by default:

Local Port	Remote Service	Purpose
`8000`	SGLang API	Model inference endpoint
`5678`	n8n	Workflow automation UI
`8080`	Status Daemon	Instance monitoring API

Starting a Tunnel¶

Basic Usage¶

soong tunnel start

The tunnel runs in the background as a daemon process:

Starting SSH tunnel to 123.45.67.89...
SSH tunnel started (PID: 12345)
  localhost:8000 -> 123.45.67.89:8000
  localhost:5678 -> 123.45.67.89:5678
  localhost:8080 -> 123.45.67.89:8080

Background Process

The tunnel uses ssh -N -f to fork into the background. It persists until you stop it, the SSH connection drops, or your machine reboots.

Custom Ports¶

Override default ports:

soong tunnel start \
  --sglang-port 8001 \
  --n8n-port 5679 \
  --status-port 8081

This is useful when local ports are already in use.
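One way to pick a replacement port is to probe upward from the default until nothing is listening. The sketch below is only an illustration: `port_in_use` and `first_free_port` are hypothetical helper names, not part of soong, and it assumes `lsof` is installed locally.

```shell
port_in_use() {
  # Succeeds when some local process is listening on TCP port $1.
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

first_free_port() {
  # Print the first port at or above $1 that nothing is listening on.
  port=$1
  while port_in_use "$port"; do
    port=$((port + 1))
  done
  echo "$port"
}

# Example (not run here):
# soong tunnel start --sglang-port "$(first_free_port 8000)"
```

Another process could still claim the port between the check and the tunnel start, so treat the result as a good guess rather than a guarantee.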

Specify Instance¶

Connect to a specific instance (when running multiple):

soong tunnel start --instance-id a1b2c3d4

How It Works¶

sequenceDiagram
    participant User
    participant CLI
    participant SSH Client
    participant GPU Instance

    User->>CLI: tunnel start
    CLI->>CLI: Check if tunnel already running
    CLI->>SSH Client: Start background tunnel
    Note over SSH Client: ssh -N -f -L 8000:localhost:8000 ...
    SSH Client->>GPU Instance: Establish connection
    SSH Client->>CLI: Find process PID
    CLI->>CLI: Store PID to ~/.config/gpu-dashboard/tunnel.pid
    CLI->>User: Tunnel started (PID: 12345)

    Note over User,GPU Instance: Tunnel runs in background

    User->>User: Access http://localhost:8000
    Note over User,GPU Instance: Traffic flows through SSH tunnel

SSH Command Details¶

The tunnel uses these SSH flags:

ssh -N -f \
  -o StrictHostKeyChecking=no \
  -o UserKnownHostsFile=/dev/null \
  -o ServerAliveInterval=60 \
  -L 8000:localhost:8000 \
  -L 5678:localhost:5678 \
  -L 8080:localhost:8080 \
  -i ~/.ssh/id_rsa \
  ubuntu@123.45.67.89

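Flag	Purpose
`-N`	No remote command (just forwarding)
`-f`	Fork to background after authentication
`-o StrictHostKeyChecking=no`	Accept the instance's host key without prompting (skips host verification)
`-o UserKnownHostsFile=/dev/null`	Do not read or save host keys in `known_hosts`
`-o ServerAliveInterval=60`	Send a keepalive probe after 60 seconds without data from the server
`-L`	Local port forwarding, as `local_port:host:remote_port`
`-i`	SSH private key for authentication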
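To see how custom local ports would change the forwarding arguments, here is a minimal sketch. It assumes the port options change only the local side of each forward while the remote ports stay at their defaults; `forward_args` is a hypothetical helper, not part of the CLI.

```shell
forward_args() {
  # usage: forward_args [SGLANG_LOCAL] [N8N_LOCAL] [STATUS_LOCAL]
  # Remote ports are fixed; only the local side changes (assumption).
  printf '%s\n' "-L ${1:-8000}:localhost:8000 -L ${2:-5678}:localhost:5678 -L ${3:-8080}:localhost:8080"
}

# Example (not run here); the substitution is left unquoted on purpose
# so the shell splits it into separate ssh arguments:
# ssh -N -f $(forward_args 8001 5679 8081) -i ~/.ssh/id_rsa ubuntu@123.45.67.89
```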

PID Management¶

The tunnel process ID (PID) is stored in:

~/.config/gpu-dashboard/tunnel.pid

This allows the CLI to:

Check if a tunnel is already running
Stop the tunnel by PID
Prevent duplicate tunnels
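The check behind these behaviors can be sketched in a few lines of shell. This illustrates the general PID-file technique, not the CLI's actual code; `tunnel_alive` is a hypothetical name.

```shell
PID_FILE="${PID_FILE:-$HOME/.config/gpu-dashboard/tunnel.pid}"

tunnel_alive() {
  # Succeeds only if the PID file exists and names a live process.
  [ -f "$PID_FILE" ] || return 1
  pid=$(cat "$PID_FILE")
  # kill -0 sends no signal; it only tests whether the process exists.
  if kill -0 "$pid" 2>/dev/null; then
    return 0
  fi
  # The process is gone, so the PID file is stale: remove it.
  rm -f "$PID_FILE"
  return 1
}
```

A PID can be reused by an unrelated process after the tunnel exits, so a stricter check would also confirm that the process is actually `ssh`.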

Accessing Services¶

Once the tunnel is running, access services at localhost URLs:

SGLang API (Port 8000)¶

curl http://localhost:8000/v1/models

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder-32b",
    "prompt": "def fibonacci(n):",
    "max_tokens": 100
  }'
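A request can fail if the tunnel is up before the service on the instance is listening. The sketch below polls an endpoint until it answers; `wait_for_api` is a hypothetical helper, and the default attempt count and delay are arbitrary choices.

```shell
wait_for_api() {
  # usage: wait_for_api URL [ATTEMPTS] [DELAY_SECONDS]
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each probe.
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (not run here):
# wait_for_api http://localhost:8000/v1/models && echo "API is ready"
```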

n8n Workflow UI (Port 5678)¶

Open in browser:

http://localhost:5678

Status Daemon API (Port 8080)¶

# Get instance status
curl http://localhost:8080/status \
  -H "Authorization: Bearer YOUR_TOKEN"

# Extend lease
curl -X POST http://localhost:8080/extend \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d "hours=2"

Checking Tunnel Status¶

soong tunnel status

Output:

Tunnel is running

or

Tunnel is not running

Stopping a Tunnel¶

soong tunnel stop

This terminates the background SSH process:

Stopped tunnel (PID: 12345)

The PID file (~/.config/gpu-dashboard/tunnel.pid) is also removed.

Auto-Cleanup

If the tunnel process dies unexpectedly, the CLI detects the stale PID file and cleans it up automatically.

Direct SSH Access¶

For interactive shell access (not tunneling), use:

soong ssh

This opens an interactive SSH session:

ubuntu@gpu-instance:~$ nvidia-smi
ubuntu@gpu-instance:~$ cd ~/workspace
ubuntu@gpu-instance:~$ python train.py

Difference: Tunnel vs SSH

Tunnel: Background port forwarding, no shell
SSH: Interactive shell session

Troubleshooting¶

Tunnel Already Running¶

Tunnel already running. Stop it first.

Solution: Stop the existing tunnel:

soong tunnel stop
soong tunnel start

Port Already in Use¶

Error starting tunnel: bind: Address already in use

Solution: Either:

Stop the conflicting service on the local port, or
Use custom ports:

soong tunnel start --sglang-port 8001

Tunnel Process Not Found¶

Tunnel process not found (already stopped?)

This happens when the tunnel died unexpectedly. The CLI cleans up the stale PID file automatically.

Solution: Start a new tunnel:

soong tunnel start

SSH Connection Timeout¶

SSH tunnel command timed out

Possible causes:

Instance not ready yet:

soong status  # Check if instance is "active"

Network issues or firewall blocking SSH (port 22)

Incorrect SSH key:

soong configure  # Reconfigure SSH key path

Can't Access localhost:8000¶

Checklist:

Verify tunnel is running:
```
soong tunnel status
```

Check if service is actually running on the instance:

soong ssh
ubuntu@instance:~$ curl localhost:8000/v1/models

Verify that something is listening on the local port (it should be the ssh tunnel process, not another application):
```
# macOS/Linux
sudo lsof -i :8000
```

Advanced Usage¶

Multiple Instances¶

Each instance needs its own local port range so the forwards do not collide:

# Instance 1 (dev)
soong tunnel start \
  --instance-id a1b2c3d4 \
  --sglang-port 8000 \
  --n8n-port 5678 \
  --status-port 8080

# Instance 2 (staging)
soong tunnel start \
  --instance-id x9y8z7w6 \
  --sglang-port 8100 \
  --n8n-port 5778 \
  --status-port 8180

One Tunnel at a Time

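The current implementation stores only one PID, so soong tunnel can manage only one tunnel at a time, and a second soong tunnel start is expected to be rejected while the first tunnel is running. For multiple simultaneous tunnels, use raw SSH commands (see below).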

Manual SSH Tunneling¶

For custom configurations, use SSH directly:

ssh -N -f \
  -L 8000:localhost:8000 \
  -L 5678:localhost:5678 \
  -i ~/.ssh/id_rsa \
  ubuntu@123.45.67.89

To stop:

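# Find the PID of the tunnel process (dots escaped so they match literally)
pgrep -f "ssh.*123\.45\.67\.89"

# Kill by PID (replace 12345 with the PID printed above)
kill 12345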
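As an alternative to looking up PIDs, OpenSSH control sockets let each manual tunnel be stopped by name, which helps when several tunnels run at once. This is a sketch under assumptions: the helper names and the socket directory are hypothetical, and soong itself does not use this mechanism as far as this page describes.

```shell
CTL_DIR="${CTL_DIR:-$HOME/.ssh}"

ctl_socket() {
  # One control socket per instance address keeps tunnels independent.
  printf '%s/tunnel-%s.sock\n' "$CTL_DIR" "$1"
}

start_tunnel() {
  # usage: start_tunnel HOST -L local:localhost:remote [-L ...]
  host=$1
  shift
  # -M makes this connection the master for the control socket given by -S.
  ssh -M -S "$(ctl_socket "$host")" -N -f "$@" \
    -i ~/.ssh/id_rsa "ubuntu@$host"
}

stop_tunnel() {
  # Ask the master process to exit through its socket; no PID lookup needed.
  ssh -S "$(ctl_socket "$1")" -O exit "ubuntu@$1"
}

# Example (not run here):
# start_tunnel 123.45.67.89 -L 8100:localhost:8000 -L 5778:localhost:5678
# stop_tunnel 123.45.67.89
```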

Port Forwarding Patterns¶

Local forwarding (what we use):

-L local_port:remote_host:remote_port

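Forward a local port to remote_host:remote_port. The remote_host is resolved on the instance side, so localhost there refers to the instance itself.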

Remote forwarding:

-R remote_port:local_host:local_port

Expose a local service on the remote instance (less common).

Dynamic forwarding (SOCKS proxy):

-D local_port

Create a local SOCKS proxy. Only applications configured to use the proxy send their traffic through the instance.

Security Considerations¶

Encrypted Connection¶

All traffic through the tunnel is encrypted by SSH, even if the underlying service (like HTTP) is unencrypted.

Plaintext HTTP  →  SSH Encryption  →  SSH Decryption  →  Plaintext HTTP
(Your App)         (Tunnel)            (Tunnel)           (Remote Service)

Authentication¶

Tunnel authentication uses your SSH private key:

~/.ssh/id_rsa  # Default location

Protect Your Private Key

Never share your SSH private key. It grants full access to your instances.

Status Daemon Token¶

API calls to the status daemon (port 8080) require token authentication:

curl http://localhost:8080/status \
  -H "Authorization: Bearer YOUR_TOKEN"

The token is stored in ~/.config/soong/config.json and auto-generated during setup.

Best Practices¶

Always Use Tunnels¶

Instead of exposing services publicly, use SSH tunnels:

✅ http://localhost:8000  # Through tunnel
❌ http://123.45.67.89:8000  # Publicly exposed

Stop Tunnels When Done¶

Free up ports and resources:

soong tunnel stop

Check Status Before Starting¶

Avoid port conflicts:

soong tunnel status
# If running, stop first
soong tunnel stop
soong tunnel start
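The same sequence can be wrapped in a small function. This sketch assumes `soong tunnel status` prints the messages shown under Checking Tunnel Status; `restart_tunnel` is a hypothetical helper, and it parses the output text because the command's exit codes are not documented here.

```shell
restart_tunnel() {
  # Stop an existing tunnel (if any), then start a fresh one.
  # Extra arguments are passed through to `soong tunnel start`.
  case "$(soong tunnel status 2>&1)" in
    *"Tunnel is running"*) soong tunnel stop ;;
  esac
  soong tunnel start "$@"
}

# Example (not run here):
# restart_tunnel --instance-id a1b2c3d4
```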

Use Instance-Specific Tunnels¶

When running multiple instances, specify which one:

soong tunnel start --instance-id a1b2c3d4

Next Steps¶

Launch an instance to tunnel to
Manage leases to keep tunnels alive
Cost optimization to manage your GPU spending