gstack Mastery Guide
A complete course to take you from "what's a slash command?" to running an entire virtual engineering team. No coding knowledge required.
What is gstack?
gstack is an open-source toolkit created by Garry Tan (CEO of Y Combinator) that turns Claude Code into a virtual software development team. Instead of one AI that does everything the same way, gstack gives you 28+ specialized slash commands that each activate a different "role" — like switching between a CEO, a designer, an engineer, a QA tester, and a release manager.
Think of it this way: without gstack, asking Claude to build something is like hiring one person and saying "do everything." With gstack, it's like having a full team where each person has a clear job and they hand work off to each other in a structured sequence.
gstack doesn't add new AI capabilities. It adds structure, roles, and discipline to what Claude Code already can do. The same way a great process makes a team 10x more effective than a group of talented individuals working chaotically.
The Mental Model
Before you touch any commands, internalize this: gstack follows the same cycle that real software teams use. Every feature, every bug fix, every product idea flows through these stages.
Fixing a small bug? You might jump straight to Build → Review → Ship. Building a new product from scratch? Start from Think. gstack is a toolkit, not a prison — use what fits.
Install gstack From Zero (Mac)
If you've never opened Terminal before, start here. Every step is a copy-paste. Total time: about 15 minutes, most of which is waiting for downloads.
A working gstack setup on your Mac: Claude Code installed, gstack's 28+ slash commands ready to use, and your first session running in under 20 minutes.
You will not need to know how to code. You will need to paste commands into a black window (called Terminal) and click "Allow" on a few system prompts.
Step 1 — Open Terminal
Terminal is a built-in app on every Mac. It's the black window where you type commands directly to your computer.
To open it: press Cmd + Space to open Spotlight, type Terminal, hit Enter. A black or white window will open. That's it — you're in.
Leave this window open for the rest of the install. Everything you do is pasting into this window.
Step 2 — Install Homebrew
Homebrew is a tool that lets you install other tools with a single command. Think of it as the App Store for developer software. Installing it once unlocks every other step.
Copy this entire line, paste it into Terminal, hit Enter:
It will ask for your Mac password. You won't see anything as you type — that's normal, it's a security feature. Type your password anyway and hit Enter.
Installation takes 2–5 minutes. When it finishes, you'll see "Installation successful!"
If Terminal prints "Next steps" with extra commands at the end of Homebrew's install (usually two lines starting with echo and eval), copy and run those too. They tell your computer where to find Homebrew.
Step 3 — Install Node.js
Node is the runtime Claude Code is built on. One command:
Takes about a minute. When your prompt comes back without errors, you're done.
Step 4 — Install Claude Code
This installs Claude Code globally on your machine. Another minute or so.
Step 5 — Install gstack
gstack isn't one "thing" — it's a bundle of skills you drop into Claude Code. Run gstack's installer:
This places gstack's slash commands into the right folder so Claude Code picks them up automatically on your next session.
Step 6 — Start your first gstack session
Pick a folder to work in. If you don't have one, make a new one on your Desktop:
cd ~/Desktop/my-first-gstack
Now launch Claude Code:
Claude Code opens inside Terminal. First-time only, it walks you through signing in with your Anthropic account. Follow the prompts.
Once you're in, try your first slash command:
If gstack loaded correctly, Claude will confirm safety mode is on. You're live.
"command not found: brew" — Homebrew's install finished but your shell doesn't know where it is yet. Close Terminal completely (Cmd + Q), reopen it, try again.
"permission denied" during npm install — macOS blocking the install. Prefix the command with sudo, like sudo npm install -g @anthropic-ai/claude-code. It'll ask for your Mac password.
Slash commands don't appear in Claude Code — gstack didn't install into the right directory. Check that ~/.claude/skills/gstack exists. If not, re-run Step 5.
From here, keep reading. The next section (Getting Started) shows you what to type once you're inside a session.
Getting Started
How to use slash commands
Every gstack skill is activated by typing a / followed by the command name inside Claude Code. You type these in the same place you'd type any message to Claude.
Claude will then "become" that role. It'll change how it thinks, what it focuses on, and what questions it asks you. When you're done with that role, just invoke the next one.
The two ways to run gstack
Call each skill manually in sequence. This gives you control and teaches you what each phase does.
/plan-ceo-review ← product strategy
/plan-eng-review ← technical architecture
/review ← code review
/qa ← test it
/ship ← deploy it
One command chains CEO → Design → Eng review automatically, pausing only for your approval at key decision points.
Phase 1: Think
This is where every new idea should start. gstack asks you 6 forcing questions designed to cut through vague thinking and expose whether your idea has real demand. It has two modes:
Startup mode — challenges you on demand reality, the status quo your users tolerate, how desperate users are for a solution, the narrowest possible wedge to start with, observations others have missed, and future-fit.
Builder mode — for side projects, hackathons, and learning. More exploratory, less pressure-testing.
You: "I want to build a tool that helps restaurants manage their waitlist via SMS"
gstack: [asks 6 hard questions about demand, alternatives, specificity...]
The biggest mistake beginners make is jumping straight to "build me X." The /office-hours questions force you to think clearly about what you're building and why before any code gets written. It saves hours of building the wrong thing.
Phase 2: Plan
Planning in gstack happens from two perspectives: the product leader and the engineer. This dual review catches problems that a single perspective would miss.
Thinks like a product leader: "What's the 10-star version hiding inside this request?" Has 4 modes:
Expansion — go bigger, find the hidden opportunity
Selective Expansion — expand some parts, hold others
Hold Scope — keep it exactly as specified
Reduction — cut it down to the absolute minimum viable version
Outputs a Markdown plan document that becomes the source of truth for what gets built.
gstack: "Let me review this plan as if I'm the CEO. Here's what I think the 10-star version looks like..."
Locks down the technical architecture: data models, API structure, file organization, edge cases, error handling, and what tests need to be written. Produces architecture diagrams and a technical spec.
This is what prevents the "I built it but it's a mess" problem. The eng review creates a blueprint that the build phase follows.
gstack: "Here's my architecture recommendation: data flow, component structure, edge cases to handle..."
Opens your live site and audits it the way a Stripe or Linear designer would — with immediate, opinionated visual reactions. Covers an 80-item checklist across 10 categories: typography, spacing, hierarchy, color, responsive design, interaction states, motion, content quality, and AI Slop Detection.
Outputs two headline scores: Design Score and AI Slop Score (letter grades A–F). Also extracts your Inferred Design System from the live CSS — fonts, colors, spacing scale — and can save it as DESIGN.md.
Report only — never touches your code. Use /qa-design-review when you want it to fix what it finds.
gstack: "Design Score: C | AI Slop Score: D — 'The site communicates generic SaaS energy.' Top issue: gradient hero + 3-column icon grid. [Full report with 12 findings]"
Reviews your plan through a developer experience lens: API ergonomics, onboarding friction, documentation needs, tooling setup, and local dev workflow. Has multiple interactive modes to target specific DX concerns.
Catches problems that will slow down future developers (including yourself) before you've built the wrong thing.
gstack: "Reviewing from a DX perspective... The current API design requires 4 steps to authenticate. Let's talk about what a zero-friction first call looks like..."
Chains CEO Review → Design Review → Eng Review automatically. Pauses at key decision points for your approval, so you stay in control without manually calling each command.
This is the "I trust the process, just run it" button for planning.
gstack: [runs CEO review, pauses for approval, runs design review, pauses, runs eng review]
Phase 3: Design
gstack's design skills are one of its most powerful differentiators. They don't just suggest colors — they build complete design systems and generate real, production-quality HTML.
Builds your design system from scratch: researches the competitive space, proposes a visual direction, defines typography, colors, spacing, component patterns, and creative risks. Outputs a DESIGN.md file that becomes your design source of truth.
gstack: "Let me research the space and propose a design system. What's the product and who's it for?"
Generates multiple visual variants side-by-side so you can compare directions. Instead of getting one design and tweaking it 50 times, you see 3-4 options at once and pick the best one.
You: "Show me 3 different hero section styles for a SaaS landing page"
Takes approved mockups, CEO plans, design reviews, or a from-scratch brief and generates production-quality Pretext-native HTML. Text reflows on resize, heights adjust to content, and the skill auto-routes to the right API per design type.
Includes framework detection — if your project is React, Svelte, or Vue, it emits idiomatic component code instead of raw HTML.
gstack: "Detected Next.js + Tailwind. Converting approved hero to a responsive React component..."
Live-site visual audit + fix loop. Runs the same 80-item audit as /plan-design-review (typography, spacing, hierarchy, color, AI Slop Detection), then enters a fix loop: locates the source file for each finding, makes the minimal CSS change, and commits with style(design): FINDING-NNN.
One commit per fix, before/after screenshots, fully bisectable. CSS-only changes get a free pass; component JSX edits count against a self-regulation risk budget (hard cap 30 fixes).
Note: the current upstream skill that fixes design issues. The older /qa-design-review name is the legacy alias — both folders may co-exist transitionally.
gstack: "Design Score C → B+ | AI Slop Score D → A. 9 fixes applied across 9 atomic commits. Before/after screenshots in .gstack/design-reports/."
Phase 4: Build
This is where Claude Code writes the actual code. In gstack's flow, building happens after the plan and design are locked. The plans from earlier phases become the instructions Claude follows.
You don't need a special slash command for building. Once your plans are set (from /plan-eng-review), you simply ask Claude to implement them:
"Build the landing page following DESIGN.md"
"Add the API endpoints from the architecture doc"
Claude will reference the plan files that /plan-eng-review and /design-consultation created, and build against those specs rather than improvising.
Even if you can't read code, the planning phase created documents in plain English that describe what should be built. Claude follows those documents. Your job during the build phase is to make sure Claude is sticking to the plan — and if something seems off, ask "does this match the engineering plan?"
Phase 5: Review
A thorough code review that looks for bugs, security issues, performance problems, edge cases, and style inconsistencies. It doesn't just point things out — it catches bugs that /ship then verifies are fixed.
Think of it as the "experienced developer looking over the junior developer's shoulder" step.
gstack: "I'll review the recent changes. Found 3 issues: a potential null pointer, a missing error handler, and an N+1 query..."
Runs a threat model on your code: injection attacks, authentication holes, data exposure, CSRF, XSS, and more. Thinks like a hacker trying to break your app.
gstack: "Running threat model... Found: API endpoint doesn't validate auth tokens, SQL injection possible in search..."
An overall code quality health check. Scans the codebase for technical debt, code smells, outdated dependencies, missing tests, and structural issues. Think of it as a routine checkup rather than a targeted review of recent changes.
gstack: "Code health report: 3 files with high complexity, 2 outdated dependencies, test coverage at 61%. Priority issues: [list]..."
A live developer experience audit — runs real tests of your local dev setup, CLI tools, build process, and onboarding flow. Unlike /plan-devex-review (which reviews plans), this reviews actual running code and tools.
Tests the "getting started" journey from scratch, identifies friction points a new developer would hit, and suggests concrete fixes.
gstack: "Running DX audit... Cloning repo... Running npm install... Found: 3 undocumented env variables are required before the app starts. Here's how to fix it..."
Phase 6: Test
The breakthrough skill. /qa opens a real Chromium browser, navigates to your app, takes screenshots, clicks buttons, fills forms, and verifies things actually work. When it finds a bug, it doesn't just report it — it fixes the code, writes a regression test, and verifies the fix.
This was Garry Tan's "massive unlock" — Claude saying "I SEE THE ISSUE" and then actually fixing it live.
gstack: "Opening browser... navigating to localhost:3000... Screenshot taken. Testing login flow... [clicks, types, verifies] Found: the signup button doesn't respond on mobile widths. Fixing..."
Same as /qa but only reports issues without fixing them. Use this when you want a clean bug report without Claude modifying your code.
gstack: "Found 5 issues: [list with screenshots and reproduction steps]"
Runs the same 80-item visual audit as /plan-design-review, then fixes what it finds. For each design finding it locates the source file, makes the minimal CSS/styling change, commits with style(design): FINDING-NNN, re-navigates to verify, and takes before/after screenshots. One atomic commit per fix — fully bisectable.
CSS-only changes get a free pass. Changes to component JSX/TSX count against a risk budget. Hard cap at 30 fixes; if risk score exceeds 20% it stops and asks.
gstack: "Design Score: C → B+ | AI Slop Score: D → A. 9 fixes applied (8 verified). style(design): FINDING-001 — replace 3-column icon grid with asymmetric layout..."
Performance testing — measures load times, response times, memory usage, and identifies bottlenecks.
gstack: "Running performance tests... API response time: 340ms (acceptable), but the dashboard page loads in 4.2s (too slow). The bottleneck is..."
Phase 7: Ship
Pushes your code as a pull request. Handles git operations, commit messages, PR descriptions. Verifies that review issues were actually fixed before allowing the push.
gstack: "Creating PR... All review issues resolved. PR created: #47 'Add user authentication'..."
Merges the PR and deploys to production. Handles the actual deployment pipeline.
gstack: "Merging PR #47... Deploying to production... Deploy successful."
Monitors the deployment for errors after it goes live. Watches logs, error rates, and performance metrics to catch problems early.
gstack: "Monitoring deployment... Error rate: 0.1% (normal). No new exceptions. Deploy looks healthy."
One-time deployment pipeline configuration. Sets up the connection between your repository and your hosting platform so that /ship and /land-and-deploy work correctly for your project.
Guides you through selecting your hosting provider, configuring environment variables, and verifying the deploy pipeline end-to-end.
gstack: "What hosting provider are you using? [Vercel / Railway / Fly / AWS / other]... Configuring deployment pipeline..."
Read-only snapshot of the workspace-aware ship queue. Shows which version slots are claimed and which sibling workspaces have WIP — critical when running 5–10 Conductor workspaces in parallel and you don't want two of them claiming the same version number.
Doesn't modify anything. Pure visibility into "who is shipping what right now" across the whole cluster.
gstack: "Ship queue: v2.4.1 claimed by workspace-alpha (45min ago, branch feature/billing). v2.4.2 free. workspace-beta has WIP on hotfix/auth. workspace-gamma idle."
Phase 8: Reflect
Runs a retrospective on what just happened: what went well, what didn't, what to change. Stores learnings so future sessions benefit.
gstack: "Retrospective: The auth feature shipped in 2 sessions. What went well: clear plan prevented scope creep. What to improve: QA found 3 bugs that review should have caught..."
Generates release notes, changelogs, and documentation for what was shipped. Keeps your project's documentation up to date automatically.
gstack: "Release v1.3.0 — Added: user authentication with OAuth, password reset flow. Fixed: mobile navigation bug..."
GBrain — Persistent Memory
GBrain is gstack's persistent knowledge base for AI agents. It stores what your agent learns, what you've decided, what worked, and what didn't — and lets the agent search all of it on demand. Without GBrain, every new session starts cold. With it, yesterday's learning on machine A surfaces in today's session on machine B.
GBrain sits underneath every other skill. /learn writes session-scoped notes; GBrain promotes them to durable, searchable memory. /retro pulls cross-machine history when sync is on. /health includes a GBrain dimension (doctor status, sync queue depth, last-push age) in its composite score.
Three deployment paths: Supabase (cloud, shareable across teammates and machines), auto-provisioned Supabase (one-command project creation), or PGLite (local-only, zero network, ~30s setup).
One command from zero to "GBrain is running and my agent can call it." Detects current state, asks at most three questions, walks through install, init, MCP registration for Claude Code, and per-repo trust policy.
Per-remote trust triad: every repo gets a policy — read-write (search + write back), read-only (search but never contaminate), or deny. Set once per repo, sticky across worktrees and branches. Critical for multi-client consultants who don't want Client A's code leaking into Client B's brain.
Sub-flags: --switch (migrate PGLite ↔ Supabase, lossless), --repo (re-prompt trust policy for current repo), --cleanup-orphans (delete abandoned Supabase projects), --resume-provision <ref> (recover from interrupted provision).
gstack: "Where should your brain live? 1) Supabase (paste URL) 2) Auto-provision Supabase 3) PGLite local. Pick one..."
Terminal-level controls for syncing your ~/.gstack/ state (learnings, plans, retros, design docs, developer profile) to a private git repo. Different from GBrain itself — this is the gstack memory layer that becomes indexable by GBrain when both are on.
gstack-brain-init — turns ~/.gstack/ into a git repo, pushes initial commit, writes the URL-only marker file (safe to copy between machines).
gstack-brain-restore — on a new machine, after copying ~/.gstack-brain-remote.txt over, this clones and rehydrates everything.
gstack-brain-sync --status — last successful push, pending queue depth, sync blocks, current privacy mode.
gstack-brain-uninstall — removes .git and .brain-* config; never touches your learnings, plans, or retros. Add --delete-remote to also delete the private GitHub repo.
Privacy modes: off, artifacts-only (plans + designs + retros + learnings, skip behavioral data), full (everything in the allowlist). Set with gstack-config set gbrain_sync_mode <mode>.
Secret protection: every commit is scanned before leaving your machine — AWS keys, GitHub tokens, OpenAI keys, PEM blocks, JWTs, and bearer tokens are all blocked. If a scan hits, sync stops and the queue is preserved.
last push: 2026-05-09 18:42 UTC (12h ago)
queue: 0 pending
mode: artifacts-only
status: clean
Refreshes GBrain against the current repo's code and teaches the agent when to prefer gbrain search / code-def over Grep. Idempotent — safe to re-run anytime.
Without this, GBrain's index drifts behind your code and the agent falls back to slow grep when a fast semantic search exists. Run after big merges, refactors, or whenever search results feel stale.
gstack: "Indexed 247 changed files. Updated 1,820 code defs. Agent will now prefer gbrain search for symbol lookups, code-def for definitions, Grep only for raw text patterns."
GBrain is a separate, actively-developed project with its own release cadence, schema migrations, and MCP surface. Bundling it would slow GBrain improvements from reaching users. GStack memory sync is gstack's own state — kept separate so you can run one without the other. They become powerful together: GBrain provides the indexable surface, gstack-brain provides the content to index.
Safety Guardrails
These commands protect you from accidents. Especially important when you're not a coder and can't easily spot if Claude is about to do something destructive.
Activates warnings before any destructive operation: rm -rf (delete everything), DROP TABLE (destroy database), git reset --hard (lose all changes), force-push (overwrite remote code). Claude will stop and ask you to confirm before running these.
gstack: "Safety mode ON. I'll warn before any destructive operations."
Restricts all file edits to a single directory. If Claude tries to edit files outside that boundary, it's blocked. Essential when debugging — prevents Claude from "fixing" unrelated code while investigating a specific module.
gstack: "Frozen. All edits restricted to src/auth/. Nothing else can be modified."
Combo command that turns on both safety guardrails at once. Maximum protection.
gstack: "Guard mode ON. Careful warnings active + edits frozen to src/payments/"
Removes the directory lock so Claude can edit files anywhere again.
gstack: "Directory lock removed. I can now edit files anywhere."
If you don't understand what Claude is about to do, ask it to explain in plain English before you approve. "Explain what this command does and what could go wrong" is a perfectly valid and smart thing to say. Never approve something you don't understand.
Browser & Browse
gstack includes a built-in headless browser — a real Chromium instance that Claude can control to visit websites, take screenshots, click buttons, and test your app.
Opens a persistent, high-performance Chromium session (~100-200ms per command). Claude can navigate pages, take screenshots, verify layouts, check console errors, and interact with web pages. Used by /qa under the hood.
gstack: "Opening browser... [screenshot] I can see the landing page. What should I check?"
Launches gstack's AI-controlled Chromium browser with a visible sidebar agent. Unlike /browse (which is headless), this opens a browser window you can see — with Claude's controls visible in a sidebar next to the page.
Supports anti-bot stealth mode and can import cookies from your existing browser for authenticated sessions.
gstack: "Launching Chromium with sidebar agent... Browser open. Where should I navigate?"
Auto-detects installed Chromium browsers (Chrome, Arc, Brave, Edge), decrypts cookies via the macOS Keychain, and loads them into the Playwright session. Interactive picker UI lets you choose which domains to import — cookie values never displayed. Or skip the UI by passing a domain directly.
gstack: "Imported 12 cookies for github.com from Comet. Session is ready."
Pull structured data from a web page. First call prototypes the extraction interactively via the $B browser primitive — Claude figures out the selectors and shape live. Subsequent calls matching the same intent run a codified browser-skill in ~200ms.
The two-phase flow turns one-off scraping into reusable, fast pipelines: prototype once, run forever.
gstack: "Prototyping... extracted 10 items. Want me to /skillify this for ~200ms reuse?"
Walks back through your conversation, finds the most recent /scrape prototype, and synthesizes a script + test + fixture. Runs the test to verify, then asks before committing.
This is how exploratory browser work becomes durable infrastructure. After /skillify, the same scrape runs in milliseconds without an LLM in the loop.
gstack: "Found /scrape prototype from earlier. Synthesizing script + test + fixture... Tests pass. Commit as scripts/scrape-hn-top.ts? [Y/n]"
Learning System
gstack gets smarter over time. It stores patterns, preferences, and pitfalls specific to your project so it doesn't repeat mistakes.
Manage what gstack has learned across sessions. You can:
Review — see all stored patterns
Search — find specific learnings
Prune — remove outdated or wrong learnings
Export — save your learnings to share with others
This is how gstack avoids making the same mistake twice. Every other skill auto-searches your learnings before recommending — when a past insight applies, it surfaces as "Prior learning applied". After /retro captures learnings, /learn lets you manage them.
gstack: "23 learnings (14 high, 6 medium, 3 low). [9/10] API responses always wrapped in { data, error } envelope. [8/10] Tests use factory helpers in test/support/factories.ts. 3 stale (referenced files deleted) — prune? [Y/n]"
Saves the working context — git state, decisions made, remaining work — so a future session can resume from the same point. Not just a TODO list: captures the why behind in-progress choices so the next session doesn't have to re-derive them.
gstack: "Saved. Branch: feature/upload-flow. Decisions: chose async enrichment over sync. Remaining: hook up Stripe billing + write 3 regression tests. Resume with /context-restore."
Resumes from a saved context — even across Conductor workspace handoffs. Pairs with /context-save to make multi-session work feel continuous instead of starting cold each time.
gstack: "Resuming feature/upload-flow. Last decision: async enrichment. Next 3 items: Stripe wiring → regression tests → /qa pass."
Self-tunes the sensitivity of AskUserQuestion prompts across all plan/review skills. Mark questions as never-ask (you've decided), always-ask (high-stakes, never auto-resolve), or only-for-one-way (only interrupt for irreversible decisions).
This is the lever for prompt fatigue. Run after /autoplan if it's stopping you on the same questions over and over.
gstack: "Last 10 plan runs hit 'pick database' 8 times — you always picked Postgres. Mark as never-ask? Also 'should we add auth?' triggered twice — mark as always-ask?"
Independent review from OpenAI Codex CLI. Runs against the same diff that /review saw, but with a different training distribution — catches what Claude misses. Three modes:
Review — codex review against current diff; classifies findings P1/P2/P3; any P1 = FAIL. Fully independent (Codex doesn't see Claude's review).
Challenge — adversarial mode at maximum reasoning effort (xhigh); pen-tests your logic.
Consult — open conversation with session continuity for follow-ups.
Cross-model analysis: when both /review and /codex have run, shows overlap (high confidence), unique-to-Codex, and unique-to-Claude — "two doctors, same patient."
gstack: "CODEX REVIEW: PASS (3 findings) [P2] race condition in payment handler — concurrent charges double-debit without advisory lock. Cross-model: OVERLAP race condition; UNIQUE TO CODEX token timing; UNIQUE TO CLAUDE N+1 in listing photos."
Upgrades gstack to the latest version. Fetches updates from the official repository, applies them to your local install, and reports what changed.
Run this periodically to get new skills, bug fixes, and improvements as the project evolves.
gstack: "Checking for updates... Found v1.4.2 (you have v1.3.1). Upgrading... Done. New skills added: /devex-review, /pair-agent."
Side-by-side cross-model benchmark for skills — Claude vs GPT vs Gemini on the same prompt. Measures latency, tokens, and cost; optionally adds an LLM-judged quality score.
Distinct from /benchmark (which measures site performance). This is for deciding which model to route a skill to, or for evaluating a new model release before swapping defaults.
gstack: "Sonnet 4.6: 12.4s, $0.18, P1=2 P2=4. GPT-5.1: 18.1s, $0.31, P1=1 P2=6. Gemini 3 Pro: 9.2s, $0.09, P1=1 P2=3. LLM-judge: Claude > Gemini > GPT on signal density."
Turn any markdown file into a publication-quality PDF — proper margins, page numbers, cover pages, clickable table of contents. Pairs naturally with the markdown artifacts gstack produces (retros, design docs, learning exports, release notes).
gstack: "Rendered to retros-2026-W18.pdf — 8 pages, cover, TOC, page numbers."
Parallel Work (Advanced)
Deep-dive investigation into a specific issue or module. Automatically freezes to the module being investigated so Claude doesn't accidentally change unrelated code while digging into a problem.
gstack: "Auto-freezing to src/webhooks... Investigating... Found: the Stripe signature validation is using the wrong key."
Saves the current state of your work so you can safely pause and resume later. Documents what's been completed, what's in progress, and what comes next — giving the next session full context to pick up exactly where you left off.
gstack: "Saving progress... Completed: auth UI. In progress: API integration (50% done). Next: wire up login form to /api/auth endpoint."
Enables cross-vendor browser sharing — lets multiple AI agents (Claude, GPT-4, Gemini, etc.) share the same browser session. One agent can hand off a browser context to another without losing state.
Powers advanced multi-agent workflows where different AI models collaborate on the same task, each contributing what they're best at.
gstack: "Sharing browser session... Agent handoff ready. Which agent should take over?"
Conductor is gstack's advanced orchestration tool that runs multiple Claude Code sessions in parallel, each in an isolated workspace. Imagine 4 developers working simultaneously:
Session 1: implementing Feature A
Session 2: running QA on Feature B
Session 3: doing a code review on Feature C
Session 4: investigating a production bug
This is power-user territory. Start with single sessions first and graduate to Conductor when you're comfortable with the core workflow.
Full Skill Reference
Every gstack command in one place. Click any to expand.
| /office-hours | 6 forcing questions to sharpen your idea |
| /plan-ceo-review | Product strategy review (4 scope modes) |
| /plan-design-review | UX and interaction review of plans |
| /plan-eng-review | Lock architecture, data flow, edge cases |
| /plan-devex-review | Review plan for developer experience and API ergonomics |
| /autoplan | Chain all 3 reviews automatically |
| /design-consultation | Build design system from scratch |
| /design-shotgun | Generate multiple visual variants to compare |
| /design-html | Convert designs to production HTML |
| /design-review | Catch "AI slop" and design inconsistencies |
| /review | Senior code review for bugs & issues |
| /cso | Security threat model |
| /qa | Full QA with real browser, auto-fixes bugs |
| /qa-only | QA report only, no auto-fixes |
| /qa-design-review | Visual design audit + auto-fixes (AI slop, typography, spacing) |
| /benchmark | Performance testing |
| /investigate | Deep-dive with auto-freeze |
| /health | Overall code quality health check |
| /devex-review | Live DX audit — tests real onboarding and dev workflow |
| /ship | Create PR and push |
| /land-and-deploy | Merge PR and deploy to production |
| /canary | Post-deploy monitoring |
| /setup-deploy | Configure deployment pipeline |
| /retro | Sprint retrospective |
| /document-release | Generate release notes & docs |
| /learn | Manage stored project learnings |
| /codex | Codebase reference system |
| /checkpoint | Save progress to resume later |
| /gstack-upgrade | Update gstack to latest version |
| /pair-agent | Share browser session across multiple AI agents |
| /open-gstack-browser | Launch visible Chromium with sidebar agent |
Leveling Up: Your Path to Pro
Level 1: Beginner (Week 1-2)
Goal: Understand the flow and get comfortable with the core commands.
Practice this loop:
Start with /careful at the beginning of every session. Don't skip it.
Focus on answering gstack's questions thoughtfully. The quality of what gets built depends on the quality of your answers in /office-hours and /plan-ceo-review.
Level 2: Intermediate (Week 3-4)
Goal: Add design and QA to your workflow.
Expanded loop:
Start using /autoplan instead of calling each plan review manually.
Add /qa to catch bugs Claude's code review missed.
Try /freeze when debugging — notice how it prevents Claude from wandering.
Level 3: Advanced (Month 2)
Goal: Use the full toolkit including security, performance, and learning.
Full workflow:
Add /cso for security reviews on anything user-facing.
Use /benchmark to catch performance issues before users do.
Run /retro after each sprint and /learn to manage accumulated knowledge.
Level 4: Pro (Month 3+)
Goal: Run parallel sessions and customize gstack to your workflow.
Pro patterns:
Use Conductor to run multiple features in parallel across isolated sessions.
Customize the learning system with /learn to encode your project's specific patterns.
Use /open-gstack-browser and /setup-browser-cookies to test authenticated flows.
Chain /canary after deploys to catch production issues instantly.
Run /investigate for surgical debugging without breaking unrelated code.
Use /design-shotgun as a rapid exploration tool, not just for final designs.
Prompting Tips for Non-Coders
Your skill as a gstack user comes down to how well you communicate with Claude. Here's how to get better results.
1. Be specific about the outcome, not the implementation
(You're prescribing solutions you don't understand)
waitlist. Customers join via a link, owners see the queue
in real time, and customers get an SMS when it's their turn."
(You're describing the problem and the user experience)
2. Answer gstack's questions with real-world context
When /office-hours asks "who is desperate for this?", don't say "restaurant owners." Say "small restaurant owners in busy neighborhoods who currently use a paper clipboard and lose 20% of their waitlist because people leave without being called." The more specific, the better the output.
3. Challenge Claude — don't just accept everything
If Claude proposes something that doesn't feel right, push back. Say "I don't think that's right because..." or "That seems overcomplicated, can we simplify?" Claude works best when you treat it like a smart colleague, not an oracle.
4. Use these magic phrases
"What could go wrong with this approach?"
"Is there a simpler way to do this?"
"Walk me through this like I'm not a developer"
"What assumptions are you making?"
"Before you write code, summarize the plan in bullet points"
5. Iterate in small steps
Don't ask for an entire app in one message. Build one piece, verify it works (/qa), then build the next piece. Small iterations catch problems early and prevent massive rework.
Common Mistakes
Skipping /office-hours
Jumping straight to "build me X" is the #1 mistake. You end up building the wrong thing faster. The 10 minutes you spend in /office-hours save hours of rework.
Approving things you don't understand
When Claude asks "should I proceed?", never say yes to something you can't explain in your own words. Ask "explain what this means" first. There's no shame in that — it's the smart move.
Skipping /review and /qa
Code that hasn't been reviewed and tested is code you'll regret shipping. These steps exist because AI-generated code frequently has subtle bugs. Always review, always test.
Not using /careful
Without /careful, Claude can run destructive commands without warning. One accidental rm -rf can delete your entire project. Always start with /careful.
Trying to do too much at once
Building an entire app in one Claude session leads to context overload. Break your project into small features, build each one through the full Think → Ship cycle, then move to the next.
Ignoring /retro and /learn
These aren't optional nice-to-haves. They're what make gstack get better over time. Skip them and you repeat the same mistakes. Use them and each session is better than the last.
Mastery Checklist
Track your progress. Click each item as you complete it.
Foundations
- Ran /office-hours and answered all 6 questions for a real idea
- Completed a full /plan-ceo-review cycle with at least one scope mode
- Ran /plan-eng-review and read the architecture output
- Used /careful at session start (and made it a habit)
- Had Claude build something and ran /review on it
- Used /ship to push a real PR
Intermediate
- Used /autoplan to run the full planning flow automatically
- Ran /design-consultation and got a DESIGN.md
- Used /qa and watched Claude test in a real browser
- Used /freeze during a debugging session
- Pushed back on Claude's suggestion and got a better result
- Completed a full Think → Ship cycle for one feature
Advanced
- Used /design-shotgun to compare multiple visual directions
- Ran /cso and fixed security issues it found
- Used /benchmark to find and fix performance issues
- Ran /retro and reviewed the learnings with /learn
- Used /open-gstack-browser or /setup-browser-cookies to test authenticated pages
- Used /investigate for a targeted debugging session
Pro
- Used Conductor to run parallel Claude sessions
- Shipped a multi-feature release using the full gstack workflow
- Curated /learn entries and exported project patterns
- Used /canary to monitor a production deploy
- Built and shipped something you couldn't have built without gstack
Prompt Builder
Describe what you want to do in plain English. Get back a ready-to-paste gstack prompt with the right slash commands in the right order.
This tool calls Claude directly from your browser using your Anthropic API key. Your key stays in this browser tab — it's not stored on any server and it disappears when you close the tab. Get a key at console.anthropic.com. A typical conversion costs less than a cent.
You've reached the end.
Now go use /office-hours on something real. That's where mastery begins.