My Software Factory

Architecture Defense

The Factory Runs On Tickets

This project-template turns software work into durable, observable, reviewable units. Agents do not free-roam the repo. They pick up tickets, produce artifacts, open pull requests, and leave enough state for a human or another agent to understand exactly what happened.

Ticket: Scope, owner, deps.

Agent: Role, run, session.

PR: Branch, tests, review.

Sync: Merge reality back to state.

High-Level Diagram

Factory Flow

Intake

Human request, scheduled check, review finding, audit issue, or agent follow-up.

Ticket Backlog

Everything becomes a clear, owned ticket.

Docs audit

Webhook retry

Review fix

Agent Roster

Every agent can see the available specialist roles.

Frontend Builder

Backend Builder

Security Reviewer

1 Agent Per Ticket

One specialist owns the slice, leaves a clear trail, and drives it to an artifact.

Assigned to Backend Builder

Worktree + Pull Request

The agent works in isolation and opens a reviewable pull request.

Quality Gates / Evals

Tests, security scan, model review, score bar, and review artifact.

Human PR Gate

Human inspects the pull request, artifacts, and blockers. Human owns merge to main.

Merge Update

Observed merge reality updates ticket status, pull request links, and follow-up state.

Work uncovers more work

Build or review reveals another slice.

New Ticket

The slice gets an owner and clear scope.

Back to Backlog

It rejoins the same flow.

Core Component

Ticket Properties

A ticket is the durable work object. It is simple enough to inspect in a Markdown file, but structured enough for routing, dashboards, agent runs, PR linkage, handoffs, and recovery.

Identity

Name, source, and stable place in the backlog.

Scope

The smallest useful slice: clear enough to finish, narrow enough to review.

Kind + Urgency

Classifies the work and how quickly the factory should treat it.

Status

Tracks whether work is ready, active, blocked, deferred, or done.

Owner

Exactly one agent lane owns the slice.

Dependencies

Links ordered work without making one ticket own the whole project.

Run Trail

Connects the ticket to the agent session that touched it.

PR Links

Connects implementation output back to the control plane.

Artifacts

Review reports, request notes, logs, and follow-up evidence.
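The properties above could be captured in a small record type. This is a minimal sketch, not the project's actual schema: the field names, the `T-001` id format, and the `is_runnable` helper are assumptions made for illustration; only the status values and the property list come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    READY = "ready"
    ACTIVE = "active"
    BLOCKED = "blocked"
    DEFERRED = "deferred"
    DONE = "done"


@dataclass
class Ticket:
    # Identity: name, source, and a stable id in the backlog (id format is hypothetical).
    ticket_id: str
    name: str
    source: str
    # Scope: the smallest useful slice, stated in a sentence or two.
    scope: str
    kind: str
    urgency: str
    # Owner: exactly one agent lane.
    owner: str
    status: Status = Status.READY
    dependencies: list[str] = field(default_factory=list)  # ids that must be done first
    run_trail: list[str] = field(default_factory=list)     # agent session ids
    pr_links: list[str] = field(default_factory=list)
    artifacts: list[str] = field(default_factory=list)

    def is_runnable(self, done_ids: set[str]) -> bool:
        """True when the ticket is ready and every dependency is already done."""
        return self.status is Status.READY and all(
            dep in done_ids for dep in self.dependencies
        )


ticket = Ticket(
    ticket_id="T-001",
    name="Webhook retry",
    source="human",
    scope="Add retry behavior and tests for failed webhook delivery.",
    kind="feature",
    urgency="normal",
    owner="Backend Builder",
)
```

Keeping every field a plain string or list means the same record can be written into the front matter of a Markdown file and read back without a custom parser.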

Agent Layer

Agents Are Hyperspecific Roles

The factory does not depend on one generalist agent. It keeps a broad roster of narrow specialists, and tickets route to the role best shaped for that slice of work.

Ticket Triage: shape and route work

Feature Planner: turn intent into slices

Backend Builder: server changes

Frontend Builder: interface changes

API Designer: contracts and endpoints

Database Migrator: schema and data movement

Bug Reproducer: pin down failures

PR Fixer: narrow review fixes

Test Runner: verification passes

Integration Tester: cross-system checks

Code Reviewer: quality verdicts

Security Reviewer: risk and dependency checks

Performance Profiler: speed and bottlenecks

Accessibility Checker: usable interfaces

Design QA: visual polish

Docs Writer: runbooks and notes

Architecture Scribe: decisions and structure

Docs Auditor: stale or broken docs

Release Drafter: change summaries

Merge Watcher: sync merged reality

Incident Fixer: urgent repair work

CI Unblocker: pipeline failures

Refactor Scout: find cleanup slices

Knowledge Scout: focused context requests

Factory Runner: choose next runnable work
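Routing a ticket to a role can be as simple as a lookup table with a triage fallback. The sketch below is illustrative only: the `kind` strings are hypothetical labels, and the real routing rules are not described in this document. The role names are taken from the roster above.

```python
# Hypothetical ticket kinds mapped to roster roles.
ROUTES = {
    "server": "Backend Builder",
    "interface": "Frontend Builder",
    "schema": "Database Migrator",
    "review-fix": "PR Fixer",
    "docs": "Docs Writer",
    "ci-failure": "CI Unblocker",
}


def route(kind: str) -> str:
    """Return the role for a ticket kind; unknown kinds go to Ticket Triage."""
    return ROUTES.get(kind, "Ticket Triage")
```

Sending unrecognized kinds to Ticket Triage keeps the table honest: a missing route produces visible triage work instead of a silent misassignment.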

Ticket Contract

One Agent, One Slice

A ticket is not a vague backlog wish. It names one concrete slice and one agent responsible for moving that slice to a reviewable artifact. Related work becomes another ticket.

Webhook retry

Add retry behavior and tests for failed webhook delivery.

Backend Builder

Owns implementation and opens the pull request.

Review failure

Fix a correctness finding from code review.

PR Fixer

Owns the narrow fix and reruns the checks.

Architecture note

Capture a decision that should outlive the session.

Docs Writer

Owns the write-up and links it back to the work.

Build Loop

Worktree To Pull Request

Implementation happens away from the shared checkout. Agents work in isolated branches, open pull requests, and stop at the human gate: main and master merges are human-only.

Ticket

Agent picks one owned ticket and leaves a clear run trail.

Worktree

The change happens in an isolated branch checkout.

Pull Request

The output is reviewable, linked, and ready for gates.

Agents prepare the pull request. Humans decide when main or master changes.
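One way to give each ticket its own checkout is `git worktree add -b <branch> <path> <base>`. The sketch below builds that command and adds a guard for the human-only branches. The `agent/<ticket-id>` branch naming, the `../worktrees` directory, and both helper names are assumptions for illustration, not the template's actual layout.

```python
import subprocess

PROTECTED_BRANCHES = {"main", "master"}


def worktree_command(ticket_id: str, base: str = "main",
                     root: str = "../worktrees") -> list[str]:
    """Build the git command that creates an isolated branch checkout."""
    branch = f"agent/{ticket_id}"   # hypothetical naming scheme
    path = f"{root}/{ticket_id}"    # hypothetical worktree location
    return ["git", "worktree", "add", "-b", branch, path, base]


def assert_agent_may_merge(target_branch: str) -> None:
    """Agents stop at the human gate: refuse merges into protected branches."""
    if target_branch in PROTECTED_BRANCHES:
        raise PermissionError(f"merges into {target_branch!r} are human-only")


def create_worktree(ticket_id: str) -> None:
    """Run the command; raises CalledProcessError if git fails."""
    subprocess.run(worktree_command(ticket_id), check=True)
```

A guard in agent code is only a convenience. Branch protection on the hosting platform is what actually enforces the human gate.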

Code Review

Review has three layers: deterministic CI and tests, automated agent review, and human judgment before anything reaches main.

Automated Review

Agent reviewers run tests, inspect the change, score the result, and leave concrete findings.

Example verdict: 84/100

Correctness: 88

Security: 82

Architecture: 80

Quality: 86

Dependencies: 84
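The document does not say how the overall verdict is derived from the category scores. One aggregation that is consistent with the example numbers is an unweighted mean, sketched here as an assumption; the `verdict` function name is hypothetical.

```python
from statistics import mean

# Category scores from the example verdict above.
scores = {
    "Correctness": 88,
    "Security": 82,
    "Architecture": 80,
    "Quality": 86,
    "Dependencies": 84,
}


def verdict(category_scores: dict[str, int]) -> int:
    """Unweighted mean of the category scores, rounded to an integer."""
    return round(mean(category_scores.values()))


overall = verdict(scores)  # (88 + 82 + 80 + 86 + 84) / 5 = 84
```

A weighted mean or a minimum-score rule would be equally plausible designs. The mean is shown only because it reproduces the 84/100 in the example.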

CI and Tests

Deterministic checks run before the human is asked to trust the change.

Lint: pass

Typecheck: pass

Unit Tests: pass

Integration Tests: pass

Security Scan: clean

Human Review

The human sees the ticket, pull request, automated verdict, findings, and unresolved risks.

Inspect the evidence: Read the diff, tests, review findings, and linked ticket context.

Decide on risk: Approve, request changes, or send the work back to another agent.

Own the merge: Main and master change only after human approval.
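The three review layers can be read as an ordered gate: deterministic checks first, then the automated score, then the human decision. The sketch below assumes a score bar of 80 and invents the `GateInput` type and the status strings for illustration; none of these names come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInput:
    ci_checks: dict[str, bool]   # check name -> passed
    review_score: int            # automated verdict, 0..100
    human_approved: bool


def gate(inp: GateInput, score_bar: int = 80) -> str:
    """Evaluate the layers in order and report the first one that stops the change."""
    failed = sorted(name for name, ok in inp.ci_checks.items() if not ok)
    if failed:
        return "blocked: CI failed (" + ", ".join(failed) + ")"
    if inp.review_score < score_bar:
        return "blocked: review score below bar"
    if not inp.human_approved:
        return "waiting: human approval"
    return "mergeable by human"
```

Even the final state only says the human may merge. The function never performs the merge itself.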

Communication Layer

Inter-Agent Communication

Every agent can see the full roster, so it knows which specialist to ask or hand work to.

Knowledge Request

The owning agent keeps the ticket, but collaborates with a specialist through a focused artifact.

Owning Agent: Needs focused help

Shared Artifact: Question, answer, finding, or recommendation

Specialist Agent: Answers one narrow question

New Ticket Handoff

When work reveals a different slice, the agent does not silently expand scope. It creates the next ticket.

Completed Slice: Current work is done or reviewed

New Ticket: Clear owner, scope, and reason

Next Agent: Another specialist picks it up

Factory Continues: The backlog stays alive
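A handoff along these lines creates a new record instead of editing the current one. The sketch is a simplified assumption: the `Ticket` fields, the `parent` link, and the id counter are hypothetical, and a real implementation would write to the backlog store.

```python
from dataclasses import dataclass
from itertools import count

_next_id = count(1)  # hypothetical id source for the illustration


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    title: str
    owner: str
    scope: str
    reason: str = ""
    parent: str = ""   # id of the ticket whose work revealed this slice


def hand_off(current: Ticket, title: str, owner: str, scope: str, reason: str) -> Ticket:
    """File a follow-up ticket; the current ticket's scope is left untouched."""
    return Ticket(
        ticket_id=f"T-{next(_next_id):03d}",
        title=title,
        owner=owner,
        scope=scope,
        reason=reason,
        parent=current.ticket_id,
    )


current = Ticket("T-100", "Webhook retry", "Backend Builder",
                 "Add retry behavior and tests for failed webhook delivery.")
follow_up = hand_off(
    current,
    title="Document retry behavior",
    owner="Docs Writer",
    scope="Describe operational retry behavior for webhook delivery.",
    reason="Found while implementing retries.",
)
```

Making `Ticket` frozen means an agent cannot quietly widen the scope of the ticket it already owns; new work has to go through `hand_off`.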

Quality Layer

Per-Agent Evals

Hyperspecific agents make hyperspecific evaluation possible. Each role can be judged against the work it is actually supposed to do, instead of one generic quality checklist for every task.

Agent (focus): Eval Rubric

Backend Builder (server-side implementation): Tests pass, Failure paths, Service fit, Reviewable PR

Frontend Builder (user-facing interface work): Responsive, Accessible, Visual polish, Workflow fit

Security Reviewer (risk and abuse surfaces): Risk map, Dependencies, Secrets safe, Least privilege

Docs Writer (durable project memory): Accurate, Findable, Linked back, Future-useful

Code Reviewer (quality judgment): Diff-grounded, Severity clear, Actionable, Score stable

Factory Runner (autonomous coordination): Runnable work, Blockers respected, Handoffs filed, Human merge
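A per-role rubric can be stored as data and applied by one generic function. In this sketch the criteria are copied from the rubric above, while the `evaluate` function and its result shape are assumptions made for illustration.

```python
RUBRICS = {
    "Backend Builder": ["Tests pass", "Failure paths", "Service fit", "Reviewable PR"],
    "Frontend Builder": ["Responsive", "Accessible", "Visual polish", "Workflow fit"],
    "Security Reviewer": ["Risk map", "Dependencies", "Secrets safe", "Least privilege"],
    "Docs Writer": ["Accurate", "Findable", "Linked back", "Future-useful"],
    "Code Reviewer": ["Diff-grounded", "Severity clear", "Actionable", "Score stable"],
    "Factory Runner": ["Runnable work", "Blockers respected", "Handoffs filed", "Human merge"],
}


def evaluate(role: str, results: dict[str, bool]) -> dict:
    """Score one run against its role's rubric; unreported criteria count as missing."""
    criteria = RUBRICS[role]
    missing = [c for c in criteria if not results.get(c, False)]
    return {
        "role": role,
        "passed": len(criteria) - len(missing),
        "total": len(criteria),
        "missing": missing,
    }


report = evaluate("Backend Builder", {
    "Tests pass": True,
    "Failure paths": False,
    "Service fit": True,
    "Reviewable PR": True,
})
```

Treating an unreported criterion as missing errs on the side of asking for evidence, not assuming it.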

Ticket Sources

Who Creates Tickets?

Tickets enter the factory from humans, agents, and scheduled maintenance. The source can vary; the control plane stays the same.

Humans

Direct product intent, bugs, design changes, or architecture needs.

Human asks: Goal, bug, feature, or cleanup.

Ticket created: Clear owner, scope, and priority.

Agent picks it up: Work enters the normal factory flow.

Agents

Completed work often reveals the next slice for another specialist.

Agent finishes: Build, review, docs, or investigation.

New ticket filed: The next slice gets its own owner.

Another agent continues: Handoff becomes backlog state.

Schedules

Recurring factory checks create work before humans have to notice.

Morning sweeps: Security scan, test coverage, docs health.

Findings become tickets: Real work enters the backlog.

Factory picks them up: Runnable tickets eventually get assigned.

User Feedback

Support notes, beta feedback, and usage friction become factory input.

User signal: Complaint, confusion, request, or repeated friction.

Feedback triaged: Turn the signal into a concrete slice.

Ticket created: Product work enters the backlog.

Observability

Hyperspecific Observability

Because each agent is hyperspecific, the trace can be hyperspecific too: model, tools, outputs, ticket movement, and evidence are all shaped around the role.

Backend Builder Run

Model: GPT-5 Codex, high reasoning

Ticket: Webhook retry implementation

Tools: search code, edit files, run tests, open pull request

Output: Retry logic, regression tests, linked pull request

Follow-up: Filed docs ticket for operational retry behavior

Status: Ready for code review

Role-shaped logs

A builder log emphasizes files, tests, pull requests, and blockers.

Role-shaped tools

Each agent exposes the commands and artifacts that make sense for its job.

Role-shaped evidence

The reviewer sees the exact trail needed to evaluate that type of work.
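A role-shaped trace can be stored as one structured record per run. The sketch below serializes the Backend Builder run shown above to a JSON line. The `RunTrace` type and its field names are assumptions; the values are taken from the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RunTrace:
    role: str
    model: str
    ticket: str
    tools: list[str]
    outputs: list[str]
    follow_ups: list[str]
    status: str


trace = RunTrace(
    role="Backend Builder",
    model="GPT-5 Codex, high reasoning",
    ticket="Webhook retry implementation",
    tools=["search code", "edit files", "run tests", "open pull request"],
    outputs=["Retry logic", "regression tests", "linked pull request"],
    follow_ups=["Filed docs ticket for operational retry behavior"],
    status="Ready for code review",
)

# One JSON object per line is easy to append to a log and to filter by role.
line = json.dumps(asdict(trace), sort_keys=True)
```

A reviewer-facing tool could then select only the fields that matter for a given role, for example `tools` and `outputs` for a builder.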

Codebase Knowledge

Do Not Reinvent The Wheel

Agents run inside the repository, so the repo is the memory layer.

CLAUDE.md

AGENTS.md

Documentation

MCPs

Decisions

Source Code

Tests

Repo CLIs

Tickets

Reviews

Artifacts

Agent Skills

HITL

HITL Is First-Class

A blocked ticket that needs a human is a first-class citizen. The factory records the blocker, sends the request, waits cleanly, and resumes from the same ticket once the human decision is made.

Blocked Ticket Pauses

The agent keeps the state, evidence, and exact ask attached to the ticket.

Human Request

Telegram (now): Production deploy needs approval. Review the linked run, diff, and test output. Actions: Resolve, Tail run.

Email (queued): Security finding needs a human decision before the agent can continue.

Human Decides

Approve, reject, answer, or redirect. The factory resumes without losing the trail.
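The pause-and-resume behavior can be sketched as two transitions on the ticket itself. Everything here is a simplified assumption: the `PausableTicket` type, the string statuses, and the choice to return to `active` after any decision are illustrative, and delivery over Telegram or email is left out.

```python
from dataclasses import dataclass, field
from typing import Optional

DECISIONS = {"approve", "reject", "answer", "redirect"}


@dataclass
class PausableTicket:
    ticket_id: str
    status: str = "active"
    ask: Optional[str] = None                          # the exact question for the human
    evidence: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)   # the trail survives the pause


def block(ticket: PausableTicket, ask: str, evidence: list[str]) -> None:
    """Pause the ticket and attach the ask and its evidence."""
    ticket.status = "blocked"
    ticket.ask = ask
    ticket.evidence.extend(evidence)
    ticket.history.append(f"blocked: {ask}")


def resume(ticket: PausableTicket, decision: str, note: str = "") -> None:
    """Record the human decision and resume from the same ticket."""
    if ticket.status != "blocked":
        raise ValueError("ticket is not waiting on a human")
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    ticket.history.append(f"human {decision}: {note}".rstrip(": "))
    ticket.ask = None
    ticket.status = "active"


t = PausableTicket("T-042")
block(t, "Production deploy needs approval.", ["run log", "diff", "test output"])
resume(t, "approve")
```

Because the ask, evidence, and history live on the ticket, a different agent or a later session can pick up after the decision without replaying the conversation.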

Greenfield

Spec To Tickets

Greenfield planning is intentionally open-ended. Use manual planning, a compound engineering stack, a design doc, or any other planning phase. The factory boundary is where the spec becomes owned tickets.

Plan Any Way

The upstream method can change by team, product, or project shape.

Manual planning, compound engineering stack, design document, or AI planning pass.

Spec to Tickets

Convert intent into scoped, ordered, assignable slices.

Factory Backlog

Once work is ticketed, the same project-template loop takes over.

Migration

API

Frontend

Tests

Config

Review
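Turning a spec into ordered slices amounts to declaring dependencies and sorting them. The sketch below uses Python's standard `graphlib.TopologicalSorter` on the six example slices. The dependency edges are invented for illustration; the document does not state how these slices depend on each other.

```python
from graphlib import TopologicalSorter

# Each slice maps to the set of slices that must be done before it (hypothetical edges).
dependencies = {
    "Migration": set(),
    "Config": set(),
    "API": {"Migration"},
    "Frontend": {"API"},
    "Tests": {"API", "Frontend"},
    "Review": {"Tests", "Config"},
}

# static_order() yields every slice after all of its prerequisites.
order = list(TopologicalSorter(dependencies).static_order())
```

`TopologicalSorter` raises `graphlib.CycleError` if two slices depend on each other, which is a useful early signal that the spec was not cut into independent tickets.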