Launching Instances¶
Launch Lambda Labs GPU instances with the optimal configuration for your workload.
Quick Start¶
Launch with default settings:
This uses your configured defaults for model, GPU type, region, and lease duration.
Customizing Launch Parameters¶
Selecting a Model¶
Override the default model with --model:
Available models:
| Model | Parameters | Use Case | Min GPU |
|---|---|---|---|
deepseek-r1-70b |
70B (INT4) | Complex reasoning | A100 80GB |
qwen2.5-coder-32b |
32B (FP16) | Fast coding, long context | A6000 48GB |
qwen2.5-coder-32b-int4 |
32B (INT4) | Budget coding | A10 24GB |
llama-3.1-70b |
70B (INT4) | General purpose | A100 80GB |
llama-3.1-8b |
8B (FP16) | Simple tasks | A10 24GB |
See Model Management for detailed comparisons.
Choosing a GPU Type¶
Override the default GPU with --gpu:
GPU Selection
The configure wizard recommends the cheapest GPU that fits your model. You can always select a larger GPU, but it will cost more.
Common GPU types:
# Budget
gpu_1x_a10 # 24GB VRAM - cheapest
gpu_1x_rtx6000 # 48GB VRAM - mid-range
# High Performance
gpu_1x_a100_sxm4_80gb # 80GB VRAM - large models
gpu_1x_h100_pcie # 80GB VRAM - fastest
Selecting a Region¶
Override the default region with --region:
Available regions depend on GPU availability. Common regions:
us-west-1(Northern California)us-west-2(Oregon)us-east-1(Virginia)us-south-1(Texas)
Region Availability
GPU availability varies by region. Use soong available to see current capacity.
Setting Lease Duration¶
Set lease duration (1-8 hours) with --hours:
Lease Limits
Maximum lease duration is 8 hours. After that, you must explicitly extend the lease or the instance will auto-terminate.
Naming Your Instance¶
Assign a custom name with --name:
Names help identify instances when running multiple sessions.
Launch Workflow¶
sequenceDiagram
participant User
participant CLI
participant Lambda API
participant Instance
User->>CLI: soong start
CLI->>CLI: Load configuration
CLI->>Lambda API: Get instance type pricing
CLI->>User: Show cost estimate
User->>CLI: Confirm (or --yes flag)
CLI->>Lambda API: Launch instance request
Lambda API-->>CLI: Instance ID
CLI->>Lambda API: Poll for status (if --wait)
Lambda API-->>CLI: Instance ready
CLI->>User: Instance IP and commands
Cost Confirmation¶
Before launching, soong shows an estimate:
┌─────────────────────────────────────┐
│ Launch Instance │
├─────────────────────────────────────┤
│ │
│ Cost Estimate │
│ │
│ GPU: 1x A6000 (48 GB) │
│ Rate: $0.80/hour │
│ Duration: 4 hours │
│ │
│ Estimated cost: $3.20 │
│ │
└─────────────────────────────────────┘
? Proceed with launch? (Y/n)
Skip confirmation with --yes:
Waiting for Instance Ready¶
By default, the CLI waits for the instance to be ready:
Skip waiting to return immediately:
When waiting is disabled, check status manually:
Examples¶
Scenario: Deep Reasoning Task¶
Launch the most capable reasoning model:
soong start \
--model deepseek-r1-70b \
--gpu gpu_1x_a100_sxm4_80gb \
--hours 4 \
--name "debug-complex-issue"
Cost: ~$12-16 for 4 hours (depending on region pricing)
Scenario: Fast Iteration Coding¶
Use a fast, budget-friendly coding model:
soong start \
--model qwen2.5-coder-32b-int4 \
--gpu gpu_1x_a10 \
--hours 2 \
--name "rapid-prototyping"
Cost: ~$1.20-1.60 for 2 hours
Scenario: Long Context Refactoring¶
Use Qwen with 32K context window:
soong start \
--model qwen2.5-coder-32b \
--gpu gpu_1x_a6000 \
--hours 6 \
--name "large-file-refactor"
Cost: ~$4.80-6.00 for 6 hours
After Launch¶
Once launched, the CLI displays:
Next steps:
- Start SSH tunnel to access services
- Check status to monitor lease time
- Extend lease if you need more time
Troubleshooting¶
No Capacity Available¶
Solutions:
-
Try a different region:
-
Check current availability:
-
Try a different GPU type (if model allows):
SSH Keys Not Found¶
Solution: Add your SSH public key to Lambda Labs:
- Go to https://cloud.lambdalabs.com/ssh-keys
- Click "Add SSH Key"
- Paste your public key (
~/.ssh/id_rsa.pub) - Try launching again
Instance Launch Timeout¶
Solutions:
-
Check instance status manually:
-
The instance may still be initializing. Wait 1-2 minutes and check again.
-
If stuck in "booting" state for >5 minutes, terminate and relaunch:
Cost Estimate Shows Different Price¶
Cost estimates are based on current Lambda Labs pricing, which can change. The estimate shown is for the lease duration, not the hourly rate.
Example:
Actual Costs
If you extend your lease or the instance runs beyond the lease (before auto-termination), costs will increase. Always monitor with soong status.
Best Practices¶
Choose the Right Model¶
- Complex reasoning: DeepSeek-R1 70B
- Fast coding: Qwen2.5-Coder 32B
- Budget coding: Qwen2.5-Coder 32B INT4
- Simple tasks: Llama 3.1 8B
See Model Management for detailed guidance.
Start with Shorter Leases¶
Begin with 2-4 hours and extend if needed. This prevents paying for unused time.
Use --yes for Automation¶
When scripting or automating launches:
Set Meaningful Names¶
Use descriptive names for multi-instance workflows:
soong start --name "frontend-dev-server"
soong start --name "backend-api-server"
soong start --name "ml-training-job"
Next Steps¶
- Set up SSH tunneling to access services
- Monitor your lease to avoid unexpected termination
- Optimize costs with smart GPU selection