AI Insights

How to Build an AI Engineering Team with GStack and Claude Code

Garry Tan rebuilt his startup Posterous in weeks using Claude Code and GStack. Here is the step-by-step workflow that turns a single engineer into a Level 7 software factory.

  • Garry Tan rebuilt a 2-year, $10M, 10-engineer startup in weeks using Claude Code and GStack
  • GStack amassed more GitHub stars than Ruby on Rails in just three weeks after launch
  • The /office hours skill distills thousands of hours of YC partner advising into six forcing questions before a line of code is written
  • A Level 7 software factory lets a single engineer evaluate and land 10 to 50 PRs in a single day across parallel branches
  • Structured adversarial review catches privacy gaps, missing failure handling, and 2FA edge cases before they reach production
By Rejith Krishnan · 10 min read

We are officially in a new era of software development. Not incrementally new. Fundamentally, structurally different.

Garry Tan, the President and CEO of Y Combinator, recently shared a workflow that allowed him to code more in two months than he did in all of 2013. Tan is not a casual tech commentator. He studied computer systems engineering at Stanford, was employee number 10 at Palantir (serving simultaneously as engineer, designer, and product manager), co-founded the microblogging platform Posterous (sold to Twitter), and built the first version of YC's internal platform, Bookface. After hearing industry leaders like Andrej Karpathy and Boris Cherny say they were no longer manually writing code, Tan started experimenting with Claude Code and got hooked. Using his new AI-driven workflow, he essentially rebuilt all of Posterous: a project that originally took two years, $10 million, and a team of 10 engineers to complete.

The productivity multiplier is not just about having a smarter AI model. Out of the box, AI models wander, lack context about your codebase, and inevitably guess, producing plausible-looking code that silently breaks. The real breakthrough is treating AI agents the way humans have always worked successfully: as a team with designated roles, defined processes, and rigorous review systems.

To encode this philosophy, Tan built an open-source repository called GStack, which turns Claude Code into a comprehensive AI engineering team. Built around a "thin harness, fat skills" approach, GStack provides lightweight scaffolding that directs already-intelligent models to do extraordinary work on your codebase. In just three weeks, GStack amassed more GitHub stars than Ruby on Rails.

Here is a step-by-step guide to leveraging GStack and Claude Code for your next project.


STEP 01
Initialize Your Project in Conductor

The best environment for running GStack is Conductor, which ships with GStack built in.

You begin by opening the Conductor quick start and simply clicking "GStack". Conductor acts as the primary interface for your AI team. One of its most useful settings is called Gary Mode. Activating Gary Mode surfaces the full reasoning traces of the model, giving you real-time insight into exactly what the AI is processing as it sets up your project context. You see every decision, every consideration, every branch in the model's thinking.

Instead of telling the AI to build a specific app and letting it blindly generate code, GStack channels that energy through a series of specialized "skills". The first thing you do is not write code. The first thing you do is think rigorously about the product.
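Claude Code skills of this kind are defined as a directory containing a SKILL.md file: YAML frontmatter naming and describing the skill, followed by markdown instructions the model loads when the skill is invoked. The fragment below is a hypothetical sketch in that convention; the skill name, description, and instructions are invented for illustration and are not GStack's actual files.

```markdown
---
name: office-hours
description: Ask forcing questions about a product idea before any code is written.
---

When the user proposes a product, do not generate code. Instead, walk them
through forcing questions one at a time: Who specifically wants this? What is
the strongest evidence of demand? Why have incumbents not solved it? Only
after the user has answered should you summarize the sharpened idea.
```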


STEP 02
Refine Your Startup Idea with /office hours

The first and perhaps most powerful skill in GStack is called Office Hours. This skill is a distillation, at perhaps 10% strength, of the thousands of hours that Y Combinator's 16 partners have spent refining startup advice over decades.

When you trigger /office hours, it initiates a conversation that asks six forcing questions designed to make you reframe your product before a single line of code is written. The goal is to stop you from building something nobody wants.

Consider a concrete example: you want to build a tax app that pulls 1099 forms from a user's Gmail account. A standard AI model would simply execute that command. GStack's Office Hours pushes back:

  • It asks for the strongest evidence that someone actually wants this.
  • It evaluates the pain point: hunting down forms is annoying, but the consequence is usually just friction, not penalties.
  • It examines competitors: TurboTax and H&R Block already have import features; Plaid connects to banks. Why aren't those solving the problem?

Through this back-and-forth, the AI helps you discover a "wedge strategy". Instead of building a simple document aggregator that might charge $2 to $5 per year, the agent might suggest positioning as a matchmaking and lead-generation funnel for tax preparers, taking a percentage of the final transaction for a 10x larger business model.

The agent will also help you architect a smarter technical approach: using AI browser automation to search the inbox, locate banks, ask the user for missing portals, and download PDFs directly in the visible browser (without needing to store credentials or integrate with Plaid). Sometimes you reach the end of Office Hours and realize an idea is not worth pursuing at all. That is an incredibly valuable outcome.
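The browser-automation approach described above can be sketched as a small plan: search the inbox, identify candidate bank portals, split them into ones the agent can visit now versus ones to confirm with the user, then download PDFs in the visible browser. Every function and data shape below is an illustrative stub, not a GStack API.

```python
# Sketch of the Office Hours-suggested flow, with invented stub functions.

def search_inbox(messages, keywords=("1099", "tax form")):
    """Return messages whose subject mentions a tax-form keyword."""
    return [m for m in messages
            if any(k in m["subject"].lower() for k in keywords)]

def identify_banks(hits):
    """Collect unique sender domains, treated as candidate bank portals."""
    return sorted({m["sender"].split("@")[-1] for m in hits})

def plan_downloads(banks, known_portals):
    """Split banks into portals to visit now vs. ones to ask the user about."""
    ready = [b for b in banks if b in known_portals]
    ask_user = [b for b in banks if b not in known_portals]
    return ready, ask_user

messages = [
    {"sender": "alerts@chase.com", "subject": "Your 1099-INT is ready"},
    {"sender": "news@retailer.com", "subject": "Spring sale"},
    {"sender": "docs@fidelity.com", "subject": "Tax form 1099-DIV available"},
]
hits = search_inbox(messages)
banks = identify_banks(hits)
ready, ask_user = plan_downloads(banks, known_portals={"chase.com"})
print(ready)     # portals the agent can open in the visible browser now
print(ask_user)  # portals to confirm with the user before visiting
```

The key design point survives the stubbing: no credentials are stored and no Plaid integration is needed, because the download happens in a browser session the user can see.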


STEP 03
Survive Adversarial Review and Finalize the Plan

Once you have a solid concept, GStack puts it through a multi-step adversarial review. The AI attacks your proposed design document for weaknesses.

It might discover that your plan has no failure handling, lacks a privacy section, or fails to address two-factor authentication (2FA) handoffs. GStack attempts to automatically fix these issues. In Tan's workflow, a document survived two rounds of adversarial review, catching and fixing 16 issues in the process. The design score improved from 6/10 to 8/10 without any manual intervention.
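The review loop described above can be sketched as a critic pass that flags weaknesses, an auto-fix pass that resolves them, and repetition until nothing remains. The checklist and scoring below are invented stand-ins; GStack's actual rubric is not published in this article.

```python
# Minimal adversarial-review loop with an invented checklist and scoring.

def critique(doc):
    """Return the checklist items the document is still missing."""
    required = {"failure handling", "privacy section", "2FA handoff"}
    return sorted(required - doc["sections"])

def auto_fix(doc, issues):
    """Patch the document by adding the missing sections and bumping the score."""
    doc["sections"].update(issues)
    doc["score"] = min(10, doc["score"] + len(issues))
    return doc

def adversarial_review(doc, max_rounds=3):
    """Alternate critique and auto-fix until a round finds no issues."""
    fixed_total = 0
    for _ in range(max_rounds):
        issues = critique(doc)
        if not issues:
            break
        doc = auto_fix(doc, issues)
        fixed_total += len(issues)
    return doc, fixed_total

doc = {"sections": {"overview", "architecture"}, "score": 6}
doc, fixed = adversarial_review(doc)
print(doc["score"], fixed)  # score rises as issues are caught and fixed
```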

After approving the design document, you choose how to proceed with sprint planning:

  • Plan CEO Review: A hands-on approach where you personally evaluate the specifics of the plan, adjust priorities, and challenge assumptions.
  • Auto Plan: If you do not want to be in the weeds, this automated skill runs the plan through CEO, engineering, design, and developer experience reviews using Garry Tan's preset recommendations.

The adversarial review step is what separates GStack from casual vibe coding. It introduces the same critical pressure that a good engineering lead or product manager would apply before greenlighting development.


STEP 04
Visual Brainstorming with Design Shotgun

With the plan locked, you move to the visual design phase using Design Shotgun: a brainstorming tool that uses image generation to create multiple AI-generated versions of your intended user interface in about 60 seconds.

When you run Design Shotgun on a feature (such as a checklist dashboard for the tax app), the agents return three distinct directional options:

  • Option A: A highly detailed, data-heavy command center suited for a technical power user.
  • Option B: A friendly, card-based approach with progress rings designed for everyday consumers.
  • Option C: An overcomplicated layout you can immediately reject.

You review these options, rate them, provide feedback, regenerate if needed, and lock in the variant that matches your product vision. This replaces the traditional design sprint, compressing days of wireframing and Figma iteration into a single afternoon.
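The review-rate-regenerate-lock loop can be sketched as a few lines of Python. Here `generate_variants` returns canned text descriptions standing in for generated images, and the ratings are supplied by the human reviewer; all names are illustrative, not GStack's.

```python
# Toy Design Shotgun loop: generate directions, rate them, lock in a winner.
import random

def generate_variants(feature, n=3, seed=0):
    """Stand-in for image generation: return n named design directions."""
    styles = ["data-heavy command center", "friendly card layout",
              "overcomplicated dashboard", "minimal single-column view"]
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [{"name": f"Option {chr(65 + i)}", "style": rng.choice(styles)}
            for i in range(n)]

def pick_winner(variants, ratings):
    """Lock in the highest-rated variant; the caller supplies the ratings."""
    return max(variants, key=lambda v: ratings[v["name"]])

variants = generate_variants("checklist dashboard")
ratings = {"Option A": 7, "Option B": 9, "Option C": 2}  # human feedback
winner = pick_winner(variants, ratings)
print(winner["name"])  # → Option B
```

In the real flow the rejection of a weak option and the regeneration step happen interactively; here they collapse into the ratings dictionary.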


STEP 05
Coding and Staff-Level Code Review

When you click approve on your plans, Claude Code begins building the software.

It helps to understand the underlying models. By default, Claude Code uses Claude Opus 4.6, which Tan describes as an "ADHD CEO": a model with enormous generative range that excels at brainstorming and prototyping. When the coding gets difficult and you are tracking down a complex bug, you can bring in OpenAI Codex, which Tan describes as an "autistic CTO" that excels at methodical, deep technical problem-solving.

Once the code is written, GStack offers a Review command. This acts as a staff-level code review service, conducting a thorough pass over the work to catch bugs that were not anticipated during the planning phase. Think of it as a senior engineer sitting beside you who will not let anything obviously broken land in the main branch.

The combination of Opus 4.6 for ideation and Codex for debugging reflects a broader truth about AI-native teams: no single model is best at everything, and GStack handles the routing for you.
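The routing idea can be made concrete with a toy dispatcher: open-ended ideation goes to one model family and methodical debugging to another. The model identifiers and routing table below are invented for illustration; GStack's actual routing logic is not shown in the article.

```python
# Toy model router reflecting the "no single model is best at everything" idea.

def route_task(task_type: str) -> str:
    """Pick a model family for a task type; defaults to the generative model."""
    routes = {
        "brainstorm": "claude-opus",   # broad generative range
        "prototype": "claude-opus",
        "deep-debug": "openai-codex",  # methodical technical digging
        "bug-triage": "openai-codex",
    }
    return routes.get(task_type, "claude-opus")

print(route_task("brainstorm"))  # → claude-opus
print(route_task("deep-debug"))  # → openai-codex
```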


STEP 06
Automating QA with the Playwright CLI Wrapper

One of the biggest bottlenecks in AI-assisted engineering is QA. The agent plans, designs, and codes the application. Then the human developer is left manually clicking through the browser, arguably the least satisfying part of the process.

Early attempts to use Claude in Chrome MCP for browser QA were frustrating: context bloat caused the model to think endlessly and take two to three seconds per action. Tan described it as one of the worst pieces of software he had used.

To solve this, Tan built a CLI wrapper around Playwright and Chromium directly into GStack. This provides a complete headed and headless browser to your AI agents. Using the /qa and /browse tools, your AI can now:

  • Take screenshots and diagnose real browser issues (JavaScript or CSS rendering failures).
  • Perform complex interactions: fill out forms, click on elements, navigate multi-step flows.
  • Download media and verify file outputs.
  • Run full regression test suites.
  • Update CSS automatically after identifying visual regressions.

This tool bypasses the context bloat problem entirely by giving the agent direct, low-latency browser control rather than routing through an intermediate MCP layer. Your AI team now handles QA rigorously, at scale, without you sitting there clicking refresh.
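The shape of such a wrapper is easy to see with stubs: each subcommand maps to exactly one browser action, so the agent issues short commands instead of carrying an MCP session in context. The real GStack wrapper drives Playwright; the actions below are stand-ins so the dispatcher structure is visible without a browser, and none of these names are GStack's.

```python
# Stubbed sketch of a thin CLI dispatcher over browser actions.

def make_cli(actions):
    """Build a dispatcher: argv -> action result, with unknown-command errors."""
    def run(argv):
        cmd, *args = argv
        if cmd not in actions:
            return f"error: unknown command '{cmd}'"
        return actions[cmd](*args)
    return run

# Stub actions; a real wrapper would call Playwright page APIs instead.
actions = {
    "screenshot": lambda path: f"saved screenshot to {path}",
    "click": lambda selector: f"clicked {selector}",
    "fill": lambda selector, text: f"filled {selector} with {text!r}",
}
cli = make_cli(actions)
print(cli(["screenshot", "out.png"]))
print(cli(["fill", "#ssn", "redacted"]))
print(cli(["hover", "#menu"]))  # unknown command -> explicit error, not a hang
```

Keeping each action small and synchronous is what delivers the low latency: the agent gets an immediate, compact result per command rather than a long reasoning round-trip.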


STEP 07
Finalize with the Ship Tool and Scale to a Level 7 Software Factory

Before any code lands on the main branch, you run the Ship tool: the final gate in the GStack process. This ensures everything that has been built, reviewed, and QA'd actually meets the bar for production.

Once you master this workflow, you reach what Tan calls a "Level 7 software factory". At this level, you are no longer coding linearly. You are managing multiple parallel workflows simultaneously:

  • You can run multiple Conductor windows across different projects, or three to four parallel sessions within the same project.
  • These sessions produce parallel pull requests with parallel branches and features that land more or less simultaneously.
  • There is no traditional to-do list. Every idea, bug report, or piece of user feedback from X (formerly Twitter) becomes a new work tree, opened with a single click in Conductor.

Each work item flows through the same assembly line: Office Hours, CEO Review, Adversarial Review, Code, QA. Because of this structure, a single engineer can evaluate and land 10, 15, 20, or sometimes up to 50 PRs in a single day, managing open-source projects with tens of thousands of stars and hundreds of pending pull requests.
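The assembly line above can be sketched as a pipeline: every work item passes through the same ordered stages, and an item that fails a gate stops there instead of landing. The stage checks are toy predicates standing in for the real skills.

```python
# Toy assembly line: each work item passes the same ordered gates.

STAGES = ["office_hours", "ceo_review", "adversarial_review", "code", "qa"]

def run_pipeline(item, checks):
    """Advance an item through each stage until a gate rejects it."""
    for stage in STAGES:
        if not checks[stage](item):
            return {"item": item["name"], "landed": False, "stopped_at": stage}
    return {"item": item["name"], "landed": True, "stopped_at": None}

checks = {
    "office_hours": lambda it: it["has_demand"],            # someone wants this
    "ceo_review": lambda it: True,                          # pass-through here
    "adversarial_review": lambda it: it["has_privacy_section"],
    "code": lambda it: True,
    "qa": lambda it: it["tests_pass"],
}

items = [
    {"name": "tax-import", "has_demand": True,
     "has_privacy_section": True, "tests_pass": True},
    {"name": "vanity-widget", "has_demand": False,
     "has_privacy_section": True, "tests_pass": True},
]
results = [run_pipeline(it, checks) for it in items]
for result in results:
    print(result)
```

Because every item goes through identical gates, parallel sessions stay reviewable: a PR that lands has, by construction, survived the same scrutiny as every other.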

There is a security benefit too. The structured, multi-stage review process in GStack creates a meaningful defense against supply chain attacks in AI-assisted coding. Every contribution, whether from a human or an agent, passes through the same adversarial scrutiny.


What This Means for Enterprise Engineering

GStack is available at github.com/gritan/GStack. The framework is open-source, Conductor is the recommended environment, and the skills library continues to grow.

By adopting the "team of specialists" model, you stop treating AI as an autocomplete tool and start treating it as a fully staffed engineering department. The barrier to building software has collapsed. The question is no longer whether AI can help you build. The question is whether your process is structured enough to direct that capability.

For enterprises, the GStack philosophy maps directly onto what governs large-scale AI adoption: structured roles, defined handoffs, human review at the right checkpoints, and auditability at every stage. The 20X company does not win through raw AI horsepower. It wins through disciplined orchestration of AI at every layer of the stack.


Building Governed Agentic Workflows at Enterprise Scale

GStack solves the developer productivity problem. But enterprises have an additional layer of requirements: privacy boundaries, cross-functional approvals, regulated workloads, and non-technical stakeholders who need visibility without needing a terminal.

lowtouch.ai addresses exactly this gap. Where GStack excels at accelerating individual and small-team engineering output, lowtouch.ai deploys private-by-architecture agentic workflows across IT operations, finance, customer success, and shared services, with Human-in-the-Loop (HITL) controls built in from the start.

If your organization is ready to move from AI experimentation to governed execution at scale, schedule a conversation with the lowtouch.ai team and we will map out what a structured AI operating layer looks like for your environment.

About the Author

Rejith Krishnan


Founder and CEO

Rejith Krishnan is the Founder and CEO of lowtouch.ai, a platform dedicated to empowering enterprises with private, no-code AI agents. With expertise in Site Reliability Engineering (SRE), Kubernetes, and AI systems architecture, he is passionate about simplifying the adoption of AI-driven automation to transform business operations.

Rejith specializes in deploying Large Language Models (LLMs) and building intelligent agents that automate workflows, enhance customer experiences, and optimize IT processes, all while ensuring data privacy and security. His mission is to help businesses unlock the full potential of enterprise AI with seamless, scalable, and secure solutions that fit their unique needs.

LinkedIn →