Quick Start Guide
Get up and running with TerminalBench in just a few minutes. This guide covers everything you need to start creating tasks.
Environment Setup
Once you have the prerequisites taken care of, choose your setup path:
Option A: Quick Setup with uv (Recommended)
For the fastest setup experience, use uv, a modern Python package manager:
1. Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
2. Install Harbor (Python 3.12 and 3.13 are supported):
uv tool install harbor==0.1.25 --python 3.13
3. Configure your API keys:
Ensure the Cognyzer CLI is installed and your API key is generated before proceeding. Refer to the CLI User Guide for detailed setup instructions.
export OPENAI_API_KEY=<your-portkey-api-key>
export OPENAI_BASE_URL=https://api.portkey.ai/v1
Tip: Add these to your ~/.bashrc or ~/.zshrc for persistence.
4. You're ready to start working on and submitting your tasks!
Note: You still need Docker Desktop (v24.0.0+) installed and running.
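Before starting on tasks, it can help to confirm that the pieces from the steps above are in place. The sketch below is a hypothetical pre-flight check, not part of Harbor or the Cognyzer CLI. The function names require_tools and require_vars are made up for illustration. It only checks that commands are on the PATH and that variables are non-empty, not that the API key itself is valid.

```shell
# require_tools: succeed only if every named command is found on PATH.
require_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      status=1
    fi
  done
  return "$status"
}

# require_vars: succeed only if every named variable is set and non-empty.
require_vars() {
  status=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset variable: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, running require_tools uv harbor docker followed by require_vars OPENAI_API_KEY OPENAI_BASE_URL reports anything that is still missing on stderr.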
Option B: Manual Setup
If you prefer a traditional pip installation or need more control, follow these steps:
Windows Users: Install WSL2 First
If you're on Windows, you need to set up WSL2 before installing Docker.
1. Install WSL2
Open PowerShell as Administrator and run:
wsl --install
When prompted, choose Ubuntu 22.04 as your Linux distribution.
2. Install Docker Desktop with WSL2 Integration
- Download Docker Desktop
During installation, make sure to:
- Enable "Use WSL 2 based engine"
- Enable "Integrate with WSL" → check Ubuntu
3. Verify Docker in Ubuntu
Open your Ubuntu terminal and run:
docker ps
This should run without errors. If you see a permissions error, you may need to restart Docker Desktop or your WSL session.
Step 1: Install Docker Desktop
Docker is required to run task environments. You need version 24.0.0 or higher.
Install:
- Download Docker Desktop
Verify installation:
docker --version
# Docker version 24.0.0 or higher
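If you would rather have the shell compare the version for you than read it by eye, something like the following could work. It is a minimal sketch, and the helper names parse_docker_version, version_ge, and check_docker are hypothetical. It assumes the output of docker --version begins with "Docker version X.Y.Z", and it compares only the numeric dotted fields.

```shell
# parse_docker_version: extract "X.Y.Z" from a "Docker version X.Y.Z, ..." line.
parse_docker_version() {
  printf '%s\n' "$1" | sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeed if dotted version A is greater than or equal to B.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = x[i] + 0; yi = y[i] + 0   # missing fields count as 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# check_docker: report whether the installed Docker meets the 24.0.0 minimum.
check_docker() {
  installed=$(parse_docker_version "$(docker --version 2>/dev/null)")
  if [ -n "$installed" ] && version_ge "$installed" 24.0.0; then
    echo "Docker $installed meets the 24.0.0 minimum"
  else
    echo "Docker 24.0.0 or higher not found" >&2
    return 1
  fi
}
```

Calling check_docker after defining these functions prints a one-line result and returns a non-zero status if Docker is missing or too old.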
Special macOS Configuration:
- Open Docker Desktop → Settings → Advanced
- Enable: "Allow the default Docker socket to be used (requires password)"
- If needed, run:
sudo dscl . create /Groups/docker
sudo dseditgroup -o edit -a $USER -t user docker
Step 2: Install Harbor
Harbor is the main task validation and testing framework.
pip install harbor==0.1.25
Step 3: Configure Your API Keys
Ensure the Cognyzer CLI is installed and your API key is generated before proceeding. Refer to the CLI User Guide for detailed setup instructions.
Set environment variables:
export OPENAI_API_KEY=<your-portkey-api-key>
export OPENAI_BASE_URL=https://api.portkey.ai/v1
Tip: You can add these to your ~/.bashrc or ~/.zshrc for persistence.
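One way to follow the tip above without ending up with duplicate entries is to append each export line only if it is not already in the file. The helper below is an illustrative sketch. The name persist_line is hypothetical, and you should pick the startup file that matches your shell.

```shell
# persist_line FILE LINE: append LINE to FILE unless an identical line exists.
persist_line() {
  file=$1
  line=$2
  touch "$file"                          # create the file if it is missing
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

For example, persist_line ~/.bashrc 'export OPENAI_BASE_URL=https://api.portkey.ai/v1' adds the line once, and running it again leaves the file unchanged. Open a new terminal or source the file for the change to take effect.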
Step 4: Work On Your First Task
Follow the complete Platform Submission Guide for detailed step-by-step instructions on:
- Downloading the task skeleton template
- Writing task instructions and configuration
- Setting up the Docker environment
- Creating your solution and tests
- Running local validation
- Submitting via ZIP upload to the Cognyzer Expert Platform
We have found that the following VS Code extensions improve the experience of working with TerminalBench:
- Docker - For managing containers in VS Code
- Python - For syntax highlighting, linting, etc.
- Markdown - For editing Markdown files (instruction.md)
- TOML - For editing task.toml files
- GitLens - For improved Git integration
Troubleshooting
Docker Issues
"Cannot connect to Docker daemon"
- Ensure Docker Desktop is running
- On macOS, check the menu bar for the Docker icon
Permission denied
sudo chmod 666 /var/run/docker.sock
For more help, see the Troubleshooting Guide.
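When diagnosing the "Cannot connect to Docker daemon" case, it can be useful to check from the terminal whether the daemon is reachable, for instance right after starting Docker Desktop. The following is a rough sketch. The function names docker_daemon_up and wait_for_docker are hypothetical, and it assumes docker info exits with a non-zero status when the daemon cannot be reached.

```shell
# docker_daemon_up: succeed if the Docker CLI can reach a running daemon.
docker_daemon_up() {
  docker info >/dev/null 2>&1
}

# wait_for_docker [ATTEMPTS] [DELAY_SECONDS]: poll until the daemon responds.
wait_for_docker() {
  attempts=${1:-10}
  delay=${2:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if docker_daemon_up; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

With the defaults, wait_for_docker makes up to 10 attempts with a 2-second pause after each failed one before giving up.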
What's Next?
Now that you're set up:
- Read What Makes a Good Task for quality guidelines
- Check the Task Components reference
- Review the Submission Checklist before submitting your tasks