The Destination

Where We're Heading

Before the how, the why. AI isn't replacing developers — it's changing what developers do. The destination is a world where you define intent, and agents handle execution.

From Writing Code to Defining "Correct"

The core shift in AI-assisted development isn't about typing faster. It's about moving from maker to architect of intent. You stop writing every line and start defining what right looks like — then letting agents execute against that definition.

Today
You write code
Human reads ticket. Human researches codebase. Human writes implementation. Human writes tests. Human reviews their own work. AI occasionally autocompletes a line.
The Vision
You define what correct looks like
You set intent: goals, constraints, success criteria, structural enforcement. Agents research, plan, and implement. Harnesses verify. You review outcomes, not code.
Execution is becoming cheap. Clarity and verification are the new bottlenecks.

What You're Building Towards

In the AI-SDLC, every stage of the development lifecycle has agent participation — guided by intent, constrained by harnesses, connected by dense handoff artefacts. Here's what a single ticket looks like at maturity:

📄 Ticket Created
Research agent analyses the ticket, maps affected files, produces research.md
📐 Research Ready
Plan agent reads research.md, drafts implementation plan with risk register
☑ Human Review
You review the plan, adjust intent, approve or redirect
⚡ Plan Approved
Implement agent follows plan step-by-step, self-corrects from test output
🚀 PR Created
Review agent checks architecture, tests, style — you give final approval

Notice: you're involved at just two points: reviewing the plan and giving final approval on the PR. Everything else is agents operating within the guardrails you've built. That's the destination.

What Makes This Work

The vision rests on three ideas. Every tab on this site teaches you how to build one of them.

🎯

Intent Engineering

Communicate goals, constraints, and success criteria — not exact words. Define the shape of the output, not the output itself. Move from crafting prompts to architecting agent interactions.

🛡

Physics over Law

Don't tell agents what to do — make wrong things impossible. Tests, compilers, and linters are physics: structural constraints the agent can't ignore. Prompts are law: suggestions it might break.

📄

Dense Handoff Artefacts

Compress hours of exploration into structured documents that carry intent forward. research.md, plan.md, handoff.md — each phase produces an artefact that feeds the next, so no context is wasted.
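One way to make the artefact chain enforceable rather than conventional is to validate structure before handoff. A sketch; the section names are illustrative, following the research.md and plan.md scaffolds described on this site:

```python
# Sketch: reject a handoff artefact that is missing required sections.
# Section names are illustrative, not a fixed standard.
REQUIRED_SECTIONS = {
    "research.md": ["Entry Point", "Controller Layer", "Service Layer",
                    "Data Access", "Observations"],
    "plan.md": ["Ordered Steps", "Interface Definitions", "Risk Register",
                "Test Strategy"],
}

def missing_sections(artefact_name: str, text: str) -> list[str]:
    """Return the required section headings absent from the artefact text."""
    return [s for s in REQUIRED_SECTIONS[artefact_name] if s not in text]
```

A handoff proceeds only when missing_sections returns an empty list: the artefact's structure becomes a gate, not a suggestion.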

When You'll Know It's Working

There's a specific moment in the journey where everything comes together. It looks like this:

An agent is refactoring a service class. It introduces a subtle type mismatch. The compiler refuses to build. The agent reads the error, fixes the type. It continues. A few steps later, it accidentally changes calculation order. A unit test fails: "Expected 42.99, got 38.50." The agent diagnoses the issue and fixes it. Finally, a missing DI registration throws a 500 error — caught by Playwright clicking "Add to Cart." Fixed. All green. No human touched the code.

That's three layers of physics catching three different classes of error. The harness is the teacher, not you. That's what you're building towards.
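The unit-test layer of that harness can be tiny. A sketch with a hypothetical cart function and illustrative prices; the point is that the assertion, unlike a prompt instruction, cannot be ignored:

```python
# Hypothetical cart function; names and prices are illustrative.
def cart_total(prices: list[float], discount: float = 0.0) -> float:
    """Sum item prices and apply a fractional discount."""
    return round(sum(prices) * (1 - discount), 2)

# Law: "make sure the cart logic is correct" is ignorable.
# Physics: this test fails the build if the calculation ever changes.
def test_cart_total():
    assert cart_total([12.00, 30.99]) == 42.99
    assert cart_total([10.00, 30.00], discount=0.25) == 30.00
```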

How Your Role Changes

This isn't about AI taking your job. It's about your job becoming more impactful. The progression looks like this:

Where you start
Maker
You write code. AI suggests lines. You accept or reject. The human does all the thinking and all the typing.
Where you grow
Engineering Manager
You define the Definition of Done. You build the harnesses. Agents implement. You review outcomes, not individual lines.
Where you're heading
Architect of Intent
You design the system of systems. Intent flows through docs, skills, harnesses, and pipelines. Agents coordinate agents. You set direction.

How We Get There

The rest of this site is a progressive guide from where you are now to the vision above. Each tab builds on the last:

🗺
Roadmap
Your checklist
📚
Foundations
Core concepts
Prompt Tactics
Hands-on techniques
📄
Intent & Context
Capture & curate
🛡
Physics
Structural enforcement
🚀
Steel Thread
Prove it end-to-end
🌎
Scale
Team & org
🗺
SDLC Map
The big picture

Start with the Roadmap to see the full journey at a glance with interactive checklists, or jump to any tab that matches where you are right now. Every stage delivers value on its own — you don't need to reach Stage 5 to benefit.

Stage 1 — Foundations

The Building Blocks

From prompting to intent engineering to physics thinking — the three layers that make AI agents reliable. This is the conceptual foundation everything else builds on.

Intent Sits Above Everything Else

Prompting is the execution layer. Context engineering is about what the agent knows. Intent engineering is about how the goal is structured — workflows, boundaries, and destinations.

Intent Engineering

How the goal is structured. Workflows, boundaries, destinations. Architecting the machine interaction.

Context Engineering

What the agent knows. RAG, vector databases, memory. Supplying the raw material.

Prompting

Execution layer. The words you write. Necessary but not sufficient.

We must shift our mindset from text-crafters to systems architects.

Laws vs Physics

Laws are instructions — they can be broken. Physics are structural constraints — they can't. Intent engineering encodes your goals as physics.

Law (Prompt Instructions)

📝
"Don't break existing behaviour"
🎨
"Follow the existing coding style"
🧮
"Make sure the cart logic is correct"
🔤
"Don't introduce type errors"

Physics (Structural Enforcement)

🎭
Playwright E2E tests that fail if behaviour changes
🔧
Linter configured to enforce style rules
🧪
Unit tests that assert cart calculations
⚙️
C# compiler that refuses to build

The rule: If a human has to manually check something, the harness is incomplete. Encode every compliance check as an automated constraint.
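A minimal sketch of that rule in practice: every compliance check becomes a command with an exit code, so "manually verify X" turns into something the pipeline (or the agent) can run. The commands listed are illustrative; substitute your stack's tools:

```python
# Sketch: encode each compliance check as a command; any nonzero exit
# fails the gate. Commands are illustrative examples.
import subprocess

CHECKS = [
    ["dotnet", "build"],
    ["dotnet", "test"],
    ["dotnet", "format", "--verify-no-changes"],
]

def harness_green(checks=CHECKS) -> bool:
    """True only if every structural check exits 0."""
    return all(subprocess.run(cmd).returncode == 0 for cmd in checks)
```

If harness_green is False, the agent does not proceed. No human judgment is in the loop.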

Getting Started — Hands On

Prompt Tactics

Six techniques that make your interactions with AI agents dramatically more effective. Try them, then ask yourself: why does this work? Understanding the mechanism matters more than memorising the trick.

Your Prompting Toolkit

These aren't magic incantations — they're structural patterns that exploit how language models reason. Each one changes the shape of the agent's thinking, not just the words you feed it. Once you see the pattern, you'll start combining them instinctively.

📊

Visualisation

Agents can do more than text. Tables, Mermaid diagrams, flowcharts, and architecture maps are often clearer than paragraphs — and they force the agent to think structurally about relationships.
"Please create a Mermaid diagram to help me understand how the shopping cart flow works from UI to database."
Why it works: Generating a diagram requires the agent to identify entities, relationships, and flow direction — a deeper analysis than prose. The structured output format constrains the reasoning into something verifiable.

Prompt For Questions

When you're uncertain, don't guess at a prompt. Instead, ask the agent to interview you. It will surface requirements and edge cases you hadn't considered — before any work begins.
"I don't know how to get started on this project. Please ask me some questions to help clarify things."
Why it works: This flips the interaction from "you guess what the agent needs" to "the agent tells you what it needs." It's requirements gathering — the same thing a good consultant does in a first meeting.
🚀

Reflection

After an agent gives you a response, ask it to self-evaluate. If it rates itself below 10, ask it to do better. The gap between its first attempt and its "try harder" attempt is often surprising.
"Rate your response out of 10. If it isn't a 10, please generate an improved version."
Why it works: Self-evaluation activates a different reasoning mode. The agent compares its output against an implicit quality standard, identifies gaps, and produces a revised version that addresses them. It's a one-step feedback loop.
👥

Group Simulation

Explain a problem, then ask the agent to simulate a panel of experts. You'll get multiple perspectives, trade-offs, and dissenting views that a single-perspective answer would miss.
"Who would be a good group of people to ask about this and what would they say?"
Why it works: Assigning distinct personas forces the agent to consider the problem from multiple angles. A security engineer, a UX designer, and a product manager would each flag different risks. The simulation surfaces that diversity in a single response.
📖

Ask For References

After an agent responds, ask it to support its claims with documentation. This forces it to distinguish between confident knowledge and speculation — and makes hallucinations visible.
"Please explain your conclusion, referencing documentation to support your claims."
Why it works: Asking for evidence forces the agent to ground its reasoning. If it can't cite a source, that's a signal the claim needs verification. This is the provenance tagging concept in action — EXTRACTED (from docs) vs INFERRED (AI reasoning).
🧠

Synthesis & Selective Amnesia

After a long conversation, ask the agent to summarise the problem while deliberately excluding all solutions discussed. Copy that summary into a fresh session. The new agent attacks the problem without anchoring bias.
"Please thoroughly summarise the problem we've been discussing in such a way that a third party can take over. Do not include any solutions we've covered."
Why it works: This is information diets applied to conversations. A long session accumulates stale context and anchoring effects — the same "Dumb Zone" problem that plagues code context. A fresh agent with a clean problem statement reasons from first principles.

Why These Work: Structural Thinking

Notice the common thread across all six tactics. Each one changes the structure of the interaction, not just the words. They shape what the agent pays attention to, how it reasons, and what it produces.

Tactic | What It Changes | Intent Engineering Parallel
Visualisation | Output format → forces structural reasoning | Intent scaffolding — defining the shape
Questions | Direction of flow → agent leads discovery | Research phase — explore before acting
Reflection | Self-evaluation loop → iterative improvement | Feedback loops — the harness teaches
Group Simulation | Perspective diversity → surfaces trade-offs | Risk registers — predict what could go wrong
References | Evidence grounding → reveals hallucinations | Provenance tagging — EXTRACTED vs INFERRED
Selective Amnesia | Context reset → removes anchoring bias | Information diets — fresh context windows

This is the bridge to intent engineering. These tactics work because they change the structure of the interaction. Intent engineering takes the same idea further — encoding structure into workflows, harnesses, and artefacts rather than individual prompts. Start here, then level up on the next tabs.

Try It Right Now

Open Copilot Chat in your IDE and try this sequence. It combines three tactics in one interaction:

1

Start with Questions

"I need to add a caching layer to our API. Please ask me 5 questions to understand what I need before suggesting an approach."

Prompt For Questions
2

Visualise the Design

After answering the questions, ask: "Now create a Mermaid sequence diagram showing the cache hit and cache miss flows."

Visualisation
3

Simulate a Review

"If I showed this design to a senior backend engineer, a security engineer, and a DBA — what would each of them say?"

Group Simulation
4

Reflect and Improve

"Rate the design out of 10 based on their feedback. If it isn't a 10, revise it."

Reflection
Four tactics, one conversation, better outcome than any single prompt could produce.
This is prompt-level intent engineering. The next tabs show how to encode it into systems.

Structured Prompting Frameworks

The six tactics above are intuitive moves. These four frameworks give you repeatable scaffolding for prompts — each one builds on the last. Start with APE for quick tasks, graduate to COAST for complex scenarios. The 4S Framework (Single, Specific, Short, Surround) underpins all of them.

Zone 2 — Structured

A.P.E — Action, Purpose, Expectation

The simplest structured framework. Three components that turn a vague ask into a targeted prompt. Use this when you know what you want but need to be precise about it.

01 Action

State the specific task you need performed.

✗ "Help me with my code"
✓ "Write a Python function that validates email addresses using regex"
02 Purpose

Give context about the goal behind the action.

✗ "Make a chart"
✓ "Create a bar chart comparing Q1-Q4 revenue for the board deck"
03 Expectation

Define format, length, style, or constraints.

✗ "Give me some test cases"
✓ "Generate 5 pytest unit tests with edge cases, using AAA pattern"
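The "AAA pattern" in that example stands for Arrange-Act-Assert. A minimal sketch with a hypothetical email validator (deliberately simple, not RFC-complete):

```python
import re

def is_valid_email(address: str) -> bool:
    """Illustrative regex validator; intentionally simple, not RFC 5322."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", address) is not None

def test_rejects_address_without_at_sign():
    # Arrange: build the input
    address = "user.example.com"
    # Act: run the unit under test
    result = is_valid_email(address)
    # Assert: check the outcome
    assert result is False
```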
Zone 3 — Advanced

R.A.C.E — Role, Action, Context, Expectation

Adds a Role dimension — assigning an expertise persona activates domain-specific vocabulary and patterns. Use when the task benefits from a specialist’s perspective.

01 Role

Assign a specific expertise.

✗ "Help me review this code"
✓ "As a senior security engineer, review this auth module"
02 Action

Specify exactly what to produce.

✗ "Make this better"
✓ "Refactor to extract a reusable validation middleware"
03 Context

Tech stack, constraints, dependencies.

✗ "Fix the login bug"
✓ "Fix OAuth callback — Express 4, passport-google, session store in Redis"
04 Expectation

Quality standards and deliverables.

✗ "Write some tests"
✓ "Integration tests: 80% coverage, mock DB, test happy + error paths"
Zone 4 — Comprehensive

C.O.A.S.T — Context, Objective, Actions, Scenario, Task

Five components for complex, multi-step work. COAST forces you to think about edge cases and acceptance criteria up front — it maps directly to Agent Mode task decomposition.

Context
✗ "Help with my app"
✓ "React 18 + TypeScript, Next.js 14 App Router"
Objective
✗ "Improve performance"
✓ "Reduce initial bundle below 200KB with code splitting"
Actions
✗ "Set up the project"
✓ "1) Lazy routes 2) Split vendor chunks 3) Tree-shake utils"
Scenario
✗ "Handle errors"
✓ "If Redis disconnects, fall back to in-memory; alert via PagerDuty"
Task
✗ "Create a component"
✓ "DataGrid: sortable columns, virtual scroll, CSV export"
Zone 5 — Goal-Oriented

G.C.S.E — Goal, Context, Source, Expectations

The most intent-aligned framework. Start with the end state, provide the landscape, point to authoritative sources, and define quality standards. Maps directly to copilot-instructions.md and Copilot Workspace goals.

01 Goal

Start with the end state, not the process.

✗ "Help me with authentication"
✓ "Implement SSO so users log in once across all three microservices"
02 Context

Architecture, dependencies, constraints.

✗ "We use microservices"
✓ "3 Node.js services behind Kong, shared Redis session store"
03 Source

Reference authoritative material for grounding.

✗ "Follow best practices"
✓ "Follow OWASP Top 10 and our internal auth-sdk v3.2 API contract"
04 Expectations

Quality and format standards.

✗ "Make it production ready"
✓ "TypeScript strict, 90% coverage, OpenAPI 3.1 spec included"

Progression, not competition. APE → RACE → COAST → GCSE isn't about picking the "best" framework. It's about escalating precision as complexity increases. A quick utility function? APE. A cross-service architectural change? GCSE. Match the framework to the task.

Source Packs & Diagnosing Failures

Good frameworks get you close. These two operational patterns close the remaining gap — grounding the AI with curated inputs and systematically fixing broken outputs.

Source Packs

A short, explicit bundle of information given to an AI before asking it to summarise, plan, refactor, or make decisions. Reduce hallucinations by constraining what the AI works with.

A Source Pack contains:
• Source material (meeting notes, docs, code excerpts)
• Facts you're confident are true
• Open questions or unknowns
• Clear boundaries on what the AI may assume
The rule: "If something is not supported by the provided source material, omit it or label it as 'not found in source'."
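A Source Pack is easy to mechanise. This sketch (section titles are illustrative) assembles the four parts plus the closing rule into a single prompt:

```python
# Sketch: assemble a Source Pack prompt from curated inputs.
# Section titles are illustrative, mirroring the four parts above.
def build_source_pack(sources, facts, open_questions, boundaries):
    """Join the curated inputs into one grounded prompt."""
    sections = [
        ("SOURCE MATERIAL", sources),
        ("CONFIRMED FACTS", facts),
        ("OPEN QUESTIONS", open_questions),
        ("ASSUMPTION BOUNDARIES", boundaries),
    ]
    body = "\n\n".join(f"## {title}\n" + "\n".join(f"- {i}" for i in items)
                       for title, items in sections)
    rule = ("If something is not supported by the provided source material, "
            "omit it or label it as 'not found in source'.")
    return f"{body}\n\nRULE: {rule}"
```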

Diagnosing Failures

When AI output is poor, don't restart from scratch — diagnose the failure mode and apply the targeted fix.

Missing Information
The prompt lacks context. → Ask the AI to request missing info before answering.
Wrong Structure
Correct answer, unusable format. → Enforce a specific output structure (table, JSON, etc.).
Wrong Level or Tone
Too technical or too vague. → Specify audience explicitly ("explain for a senior PM").
Untrusted Claims
Claims can't be traced to a source. → Require a source quote per claim; use a Source Pack.

Trusting AI Output: From Hallucination to Verification

As models get smarter, they prioritise helpfulness (guessing) over truthfulness (admitting ignorance). The output is fluent, so we stop checking. This is automation bias — and it's the single biggest risk in AI-assisted development. These structural rules make hallucinations visible.

The Code Quality Loop

Explore → Plan → Code → Validate → Repeat

AI accelerates each step, but humans control correctness. Same loop good engineers already follow — AI changes the speed, not the discipline.

The Honesty Gap

For developers, hallucination manifests as code that looks idiomatic but calls non-existent APIs, uses deprecated patterns, or introduces subtle logic bugs. For product managers, it manifests as plausible-sounding but incorrect feature specs. It reads like expertise.

The risk: because the output is fluent, we stop checking. Trust is lost the moment a hallucination breaks a build or ships a wrong spec.

Three Structural Rules — Physics in Practice

Don't rely on the agent's "morals" or instructions — that's Law. Build the interaction's architecture so inaccuracy is structurally difficult — that's Physics.

Force Blanks

In your SKILL.md or Custom Instructions, require the agent to leave code comments or fields blank if logic is ambiguous — with a mandatory "Reason for Blank" field. This makes uncertainty visible instead of hiding it behind a confident guess.
copilot-instructions.md: "If any business rule is ambiguous, leave a TODO comment with the reasoning instead of guessing."

The 3× Penalty

Explicitly weight the distribution: "A wrong answer is 3× worse than a blank." This recalibrates the model's helpfulness bias toward caution. When in doubt, it leaves a blank rather than guessing — which is exactly what you want.
"When in doubt between guessing and leaving a TODO, always leave the TODO. A wrong answer costs 3× more than a blank."
🔖

Label Sources (Provenance)

Every output tagged with provenance: EXTRACTED (exact match from docs/codebase) or INFERRED (AI's derivation, with a one-sentence logic explanation). This makes the boundary between fact and guess instantly visible.
"Tag each claim: [EXTRACTED] with source reference, or [INFERRED] with your reasoning in one sentence."
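Because the tags have a fixed shape, compliance can be checked in code rather than trusted. A sketch using the tag syntax above:

```python
# Sketch: flag any output line that carries neither provenance tag.
PROVENANCE_TAGS = ("[EXTRACTED]", "[INFERRED]")

def untagged_claims(output: str) -> list[str]:
    """Return non-empty lines missing an EXTRACTED/INFERRED tag."""
    return [line for line in output.splitlines()
            if line.strip() and not line.lstrip().startswith(PROVENANCE_TAGS)]
```

An empty result means every claim is tagged; anything else is a claim whose provenance you cannot see.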
Watch Out

The Self-Rating Trap

Remember the Reflection tactic — "rate your response out of 10"? It works brilliantly for creative drafts where the rating triggers an improvement loop. The model can genuinely improve a poem on its second pass.

But for extraction, logic, and code correctness, the confidence score comes from the same process that produced the error. If a model hallucinates a variable name, it will confidently rate itself "9/10" on that answer.

Key insight: Self-assessment works for generation. It fails for verification. You cannot ask the AI to be its own independent auditor — the verification must come from outside the system. That's why we use Physics (tests, linters, compilers) not Law (instructions).
Key Insight

Inverted Causality — Why This Works

In humans, expertise produces expert language. In LLMs, expert language produces expertise. Forcing vocabulary like EXTRACTED and INFERRED isn't just metadata — it restricts predictions to expert regions of the training data, shifting the mean toward quality and reducing variance.

This is the fundamental mechanism behind all structural rules: by constraining the form of the output, you constrain the quality of the reasoning.

Safer Prompts for Common Tasks

Instead of “Refactor this code”
"Explain current behaviour, list risks, confirm tests exist, then refactor without changing behaviour."
Instead of “Write tests”
"List edge cases first, then write tests that would fail if the implementation is wrong."
Instead of “Fix this bug”
"Describe the root cause, show what the correct behaviour should be, then propose the smallest safe fix."
Your verification checklist:
☐ Can the agent say "I don't know" in your workflow?
☐ Does every output carry provenance (EXTRACTED / INFERRED)?
☐ Is a wrong answer penalised more than a blank?
☐ Are you building with Law or Physics?
Stages 2–3 — Capture Intent

Intent & Context

Document what exists. Reverse-engineer the intent that was never written down. Engineer the context so agents get curated knowledge, not raw dumps.

What's Actually Different?

Prompt engineering treats the prompt as the product. Intent engineering treats the entire system — prompts, structure, feedback loops, constraints — as the product.

Prompt Engineering

  • Focus on exact wording and formatting
  • Tricks, templates, and magic phrases
  • One prompt, one response
  • Hope the AI follows instructions
  • Retry with different words when it fails
  • All context in a single prompt

Intent Engineering

  • Focus on goals, constraints, and success criteria
  • Structural enforcement through tooling
  • Multi-phase workflows with handoff artefacts
  • Make wrong behaviour structurally impossible
  • Feedback loops that teach the AI what failed
  • Information diets — right context at right time

Intent Across Layers and Workstreams

Intent engineering applies at every level — from vision through to operations — and spans all your workstreams. Each layer encodes intent in a different way.

Layers: Vision → Strategy → Design → Implementation → Verification → Operations
Workstreams: Docs · Harnesses · Agent Skills · Pipelines

Core Benefits

Deep Context Alignment
Streamlined Execution
Continuous Improvement

Prompts Are Suggestions. Harnesses Are Guarantees.

A prompt says "please don't break the cart." A test suite says "the cart works or you don't proceed." Both communicate intent, but only one enforces it.

Think of it this way: Prompt engineering is writing a good brief. Intent engineering is writing a good brief, hiring a QA team, setting up CI/CD, and defining acceptance criteria — all encoded into the system the AI operates within.

The Dumb Zone & Why Tokens Matter

Past roughly 40% context usage, AI reasoning degrades sharply — the model becomes confident but wrong. Every token you waste on irrelevant context is a token the agent can't use for thinking. This is why curated context beats raw dumps.

Naive approach (dump everything): raw files 65%, prompts 20%, almost nothing left for reasoning.
Intent-engineered (curated context): a dense artefact plus prompts (20%), leaving 65% of the window for reasoning. This is what you want.

Use LSP over grep for precise queries. Compress findings into dense artefacts. Start each RPI phase in a fresh context window. Every token should earn its place.

LSP in Practice: Queries That Change Everything

Here's what "use LSP over grep" actually looks like. These are real queries from the nopCommerce steel thread — the same ones that found critical gaps grep missed.

Find References — Who calls this method?
// LSP: textDocument/references on AddToCartAsync
// Returns: 14 callers across 8 files, with exact locations

ShoppingCartController.cs:617  AddProductToCart_Catalog()
ShoppingCartController.cs:680  AddProductToCart_Details()
CheckoutController.cs:243     MigrateCart()
OrderProcessingService.cs:891 ReOrder()
// ... 10 more — grep found only 6 of these
Document Symbols — Full map of a class
// LSP: textDocument/documentSymbol on ShoppingCartService.cs
// Returns: 62 symbols — 31 fields, 1 constructor, 27 methods

Fields (31):  _catalogSettings, _aclService, _customerService ...
Methods (27): AddToCartAsync, GetShoppingCartAsync,
              DeleteShoppingCartItemAsync, FindShoppingCartItemInTheCartAsync ...
// One query = complete class anatomy. No file-reading tokens spent.
Type Hierarchy — What implements this interface?
// LSP: typeHierarchy/subtypes on IShoppingCartService
// Returns: implementation chain with exact locations

IShoppingCartService (interface, 22 methods)
  └─ ShoppingCartService (src/Libraries/Nop.Services/Orders/)
      // Confirms: single implementation, safe to extract subset

// Compare: grep for "IShoppingCartService" returns 47 matches
// including imports, comments, and XML docs — all noise

How to enable LSP: In VS Code, the language server runs automatically. In agent workflows (Claude Code, Copilot agent mode), use MCP servers like @anthropic/lsp-mcp to give agents LSP access. The key queries are find-references, document-symbols, and type-hierarchy — these three cover 90% of research needs.

Stage 4 — Steel Thread

The Steel Thread

Pick one vertical slice. Apply RPI: Research with LSP, Plan with risk registers, Implement with feedback loops. Prove the methodology on your codebase.

Three Phases, Three Fresh Contexts

AI agents have limited context windows. RPI splits work into phases, each operating in a fresh context with only the artefacts it needs.

R

Research

Explore the codebase freely. No implementation, no planning — pure discovery. Trace flows, find files, document connections.

📄 research.md
P

Plan

Read only research.md. Produce ordered implementation steps, interface definitions, risk areas, and test strategies.

📄 plan.md
I

Implement

Read only plan.md. Execute each step, running the full test suite after every change. Don't proceed until green.

✅ Working code

Information diets: Each phase gets only what it needs. The Plan agent never sees raw code — only compressed research. The Implement agent never sees research — only the ordered plan. This prevents context waste and forces density.
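The diet can be encoded as data, so a phase physically cannot see more than its inputs. A sketch; the phase names follow RPI, and the artefact store is a simple dict standing in for files on disk:

```python
# Sketch: the information diet as data. Each phase may read only
# its listed artefacts and nothing else.
PHASES = {
    "research":  {"reads": [],              "writes": "research.md"},
    "plan":      {"reads": ["research.md"], "writes": "plan.md"},
    "implement": {"reads": ["plan.md"],     "writes": "code"},
}

def context_for(phase: str, artefacts: dict[str, str]) -> str:
    """Assemble a fresh context from only the artefacts this phase may see."""
    return "\n\n".join(artefacts[name] for name in PHASES[phase]["reads"])
```

The Plan agent cannot see raw code by construction: it simply isn't in its reads list.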

What Each Agent Sees

Each agent receives progressively less information. Less is more — constrained input forces focused output.

Research: full codebase access → research.md
Plan: research.md → plan.md
Implement: plan.md → code

What Holds It Together

1

Decompose-Route-Recompose

Break complex work into phases, route each to a fresh context with exactly the artefacts it needs, then recompose the results. This works with context window limitations instead of fighting them.

2

Physics Thinking

Don't tell the agent what to do — make wrong things impossible. A failing test is worth a thousand prompt instructions. Structural constraints are your most reliable form of intent communication.

3

Harnesses Matter More Than Models

The test suite, the compiler, the linter — these are your real safety net. A mediocre model with a great harness outperforms a brilliant model with no guardrails. Invest in the harness first.

The nopCommerce Steel Thread

nopCommerce is a 200k+ line .NET e-commerce platform. The steel thread: the shopping cart flow from "Add to cart" through controllers and services to the database. One vertical slice through the entire architecture.

Brownfield Extraction: The Step-by-Step Process

Reverse-engineering documentation from an existing codebase is the most common starting point for teams adopting AI-SDLC. Here's the methodology, using the nopCommerce cart flow as a worked example.

1
Identify the Steel Thread
Pick the narrowest user flow that touches every layer. For nopCommerce: "click Add to Cart on a product page." This touches UI (Razor view) → JavaScript (AJAX handler) → Controller → Service → Repository → Database. Don't try to document the whole system — one thread is enough.
2
Trace with LSP, Not Grep
Use find-references on the controller action to discover every caller. Use document-symbols to map the service class (e.g., 62 symbols in ShoppingCartService.cs — 31 fields, 27 methods). Use type-hierarchy to find all implementations. Record exact file paths and line numbers.
3
Produce Dense Research
Compress the trace into a structured research.md: Entry Point → Controller Layer → Service Layer → Data Access → Observations. Include caller counts, dependency lists, and blocking conditions. The nopCommerce research.md maps the full add-to-cart flow in ~2k lines — down from 200k.
4
Plan the Extraction
Fresh agent reads only research.md. Produces ordered steps: create interface → create implementation (copy, don't move) → register in DI → delegate from original → update controller. Each step ends with a verification command (dotnet build, dotnet test, npx playwright test).
5
Implement with the Harness Running
Fresh agent follows the plan. After every change: dotnet format --verify-no-changes && dotnet test && npx playwright test. The harness catches type errors (compiler), logic regressions (unit tests), and broken UI flows (Playwright). The agent self-corrects from error messages. No human code review needed during implementation.
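Step 5's loop can be sketched as a small driver: run the harness, and if anything fails, hand the raw error output back to the agent as the next instruction. The commands are the ones from the step above; agent_fix is a placeholder for whatever agent API you use:

```python
# Sketch: run the harness, surface errors as the next instruction.
import subprocess

HARNESS = [
    "dotnet format --verify-no-changes",
    "dotnet test",
    "npx playwright test",
]

def run_harness(commands=HARNESS):
    """Run each check; return (True, '') if green, else (False, errors)."""
    for cmd in commands:
        result = subprocess.run(cmd.split(), capture_output=True, text=True)
        if result.returncode != 0:
            return False, result.stdout + result.stderr
    return True, ""

# Feedback loop: the error message is the remediation instruction.
# while True:
#     green, errors = run_harness()
#     if green:
#         break
#     agent_fix(errors)  # placeholder for your agent invocation
```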

Why this works for brownfields: You're not asking AI to understand your whole codebase. You're giving it a compressed, verified map of one narrow flow — and a harness that catches mistakes structurally. The steel thread proves the methodology; then you repeat it for the next flow.

Five Steps, Building on Each Other

0

Get Orientated

Understand the codebase, the steel thread, and the key files. Context-setting for humans before agents enter the picture.

1

Build the Harness

Write Playwright E2E tests and unit tests before any refactoring. The harness defines "correct" structurally. This is the most important step.

Harnesses > Models
2

Research the Steel Thread

Agent explores the add-to-cart flow end-to-end. Produces a dense research.md with every file, method, and connection documented.

RPI: Research
3

Plan the Refactoring

Fresh agent reads only research.md. Produces an ordered plan with interface definitions, DI registration, and a risk register.

RPI: Plan
4

Implement with Feedback Loops

Fresh agent follows the plan step-by-step, running the full test suite after each change. The harness catches mistakes; the agent learns from errors.

RPI: Implement + Physics

When Physics Thinking Pays Off

Three moments where you see the methodology working in real time:

🔴

Compiler Catches a Type Error

The agent introduces a subtle type mismatch. The C# compiler refuses to build. The agent reads the error, fixes the type, moves on. No human intervention.

Physics Thinking
🧪

Unit Test Catches Logic Regression

The agent refactors the cart service and accidentally changes calculation order. "Expected 42.99, got 38.50." Diagnosed and fixed.

Harnesses > Models
🎭

Playwright Catches UI Regression

The service works, but a missing DI registration throws a 500 error. Playwright catches it by clicking "Add to cart." End-to-end verification.

All Three Together

How We Know It Worked

 Research before code — agent explored before touching anything
 Plan before implementation — ordered steps with risk analysis
 Harnesses before refactoring — tests existed before changes
 Feedback loops — test suite ran after every change
 Dense handoffs — research.md and plan.md carried intent forward
 Physics enforcement — compiler + tests caught real errors
 No manual checking — if the harness is green, the refactoring is correct

The takeaway: The model doesn't matter as much as the methodology. A well-structured system — with phases, harnesses, and feedback loops — produces reliable results regardless of which AI you use.

Stage 3 — Build Physics

Physics & Harnesses

Stop relying on prompts. Build structural enforcement — tests, linters, compilers — that make wrong agent behaviour impossible. These are the patterns that encode intent as physics.

Intent Scaffolding

Define the shape of the output: required sections, detail level, format, and location. The agent fills in the scaffold.

Vague

  • "Research how the shopping cart works in this codebase"

Scaffolded

  • "Trace the add-to-cart flow from UI to database"
  • "For each layer, identify files, methods, and line numbers"
  • "Produce research.md with sections: Entry Point, Controller, Service, Data Access, Observations"
  • "Save to docs/adc/extract-cart-service/research.md"

Constraint Systems

Explicit negative instructions define the safe operating space. They prevent the most common failure modes.

Example
## Constraints
- Do NOT modify the database schema
- Do NOT change the plugin architecture
- Do NOT proceed to the next step until all tests pass
- Focus only on the shopping cart steel thread

Risk Registers as Intent

Predict what could go wrong and map each risk to the harness that catches it. This transforms "be careful" into "here's which test will fail."

Circular DI Dependency

Critical — App fails to start. All Playwright tests fail immediately.

Missing DI Registration

Critical — InvalidOperationException. Caught by any E2E test.

Cart Total Miscalculation

High — Unit test asserts "Expected 42.99, got 38.50."
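Inside a plan.md, that risk-to-harness mapping can be written down directly. This fragment is illustrative, reusing the three risks above:

```markdown
## Risk Register
| Risk                      | Severity | Caught by                                     |
|---------------------------|----------|-----------------------------------------------|
| Circular DI dependency    | Critical | App fails to start; all Playwright tests fail |
| Missing DI registration   | Critical | InvalidOperationException in any E2E test     |
| Cart total miscalculation | High     | Unit test: "Expected 42.99, got 38.50"        |
```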

Feedback Loops as Self-Correction

Build mandatory checkpoints into the workflow. The error message itself becomes a remediation instruction.

Why this works: "Expected 42.99, got 0" is infinitely more useful than "please make sure the cart logic is correct." The harness teaches the agent what's wrong in machine-readable terms.

Information Diets

Give each phase only what it needs. Constrained input forces focused output. See the Methodology tab for the full visualization.

🔎

Research Sees

Full codebase access. Explore freely, compress findings into research.md.

📐

Plan Sees

Only research.md. No raw code. Forces planning from compressed summary.

Implement Sees

Only plan.md + codebase. Follows the plan. Doesn't re-research or re-plan.
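One way to make the diet explicit is a small per-phase config. The schema here is invented for illustration, not a real tool's format:

```yaml
# Hypothetical phase-diet config: what each RPI phase may read.
phases:
  research:
    inputs: ["src/**", "tests/**"]          # full codebase access
    output: docs/adc/extract-cart-service/research.md
  plan:
    inputs: [docs/adc/extract-cart-service/research.md]  # no raw code
    output: docs/adc/extract-cart-service/plan.md
  implement:
    inputs: [docs/adc/extract-cart-service/plan.md, "src/**"]
    output: pull-request
```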

Enablement Roadmap

Your AI Enablement Journey

A progressive roadmap from first Copilot prompt to fully agentic workflows. Each stage builds on the last — check the boxes as you go.

Definition

The AI-SDLC is a software development lifecycle where AI agents participate at every stage — from discovery through deployment — guided by intent engineering, constrained by structural enforcement (physics), and connected by dense handoff artefacts.

It isn't a new process bolted onto what you already do. It's a progressive evolution of your existing SDLC: first AI assists you, then augments you, then orchestrates alongside you, and eventually runs autonomously with you setting intent and reviewing outcomes. The five stages below map this journey.

1
Wave 1 — Assistance

Foundations: Learn the Tools

Get comfortable with Copilot's core features. Build muscle memory for the basics before adding complexity. This is where everyone starts.

Copilot in your IDE — tab completions, inline chat (Ctrl+I), chat panel working
4S Framework — Single, Specific, Short, Surround applied to prompts
Slash commands — /explain, /fix, /tests, /doc used fluently
Context variables — #file, #selection, #codebase to focus Copilot
Chat participants — @workspace, @terminal, @github for domain context
Prompt patterns — zero-shot, few-shot, role prompts, chain-of-thought
Safety habits — always review output, run tests on generated code, never paste secrets
Stage 1 outcome

You can use Copilot as a fast, reliable code companion. You know how to ask good questions and you always verify the answers.
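Putting the Stage 1 pieces together, a focused chat request combines a slash command, a context variable, and a 4S-style prompt. The file and method names below are placeholders:

```
/tests #file:ShoppingCartService.cs
Generate xUnit tests for AddToCartAsync.
Cover: empty cart, quantity above maximum, discounted item.
Keep each test under 20 lines and reuse the existing test fixtures.
```

The /tests command and #file variable are the real Copilot affordances from the checklist; the rest is an illustrative Single, Specific, Short request.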

2
Wave 2 — Augmentation

Capture Intent: Document What Exists

Before AI can help you change a codebase, it needs to understand it. This stage is about reverse-engineering the intent that was never written down — architecture, conventions, business rules, decisions.

copilot-instructions.md — repo-level AI instructions in .github/ capturing coding standards
Architecture documented — use Copilot to reverse-engineer and document your system's structure
Agent Decision Context — start recording decisions: what, why, alternatives rejected, rollback strategy
Strategic vs Tactical docs — separate the Why (human-authored) from the How (agent-generated)
Source Packs — bundle context (docs, code excerpts, known facts) for reliable AI summaries
Provenance tagging — require AI output labelled EXTRACTED (from source) or INFERRED (AI derivation)
Agent Mode basics — use Copilot's agent mode for multi-file exploration and documentation tasks
Starter Template — copilot-instructions.md / CLAUDE.md
# Project: [Your Project Name]

## Architecture
- Framework: .NET 10 / React / [yours]
- Pattern: N-tier with service layer
- Key entry points: Controllers → Services → Repositories

## Coding Standards
- Async/await throughout; suffix Async on async methods
- Interfaces for all services; register in DI container
- No magic strings — use constants or enums

## Constraints
- Do NOT modify the database schema
- Do NOT change public API contracts
- Do NOT proceed until all tests pass

## Verification Commands
dotnet format --verify-no-changes    # Style
dotnet test                          # Unit + integration
npx playwright test                  # E2E

## Steel Thread
The add-to-cart flow: Browse → Product → Add → View Cart
Files: ShoppingCartController.cs, ShoppingCartService.cs
Save as .github/copilot-instructions.md for Copilot or CLAUDE.md at the repo root for Claude. This file is the single most impactful thing you can create — it turns every agent interaction from cold-start to context-aware.
Agent Decision Context — Template
docs/adc/
├── YYYY-MM-DD--decision-name.md     ← Decision record
└── YYYY-MM-DD--decision-name/      ← RPI artefacts
    ├── research.md                    ← Phase 1 output
    ├── plan.md                        ← Phase 2 output
    └── handoff.md                     ← Agent context for Phase 3

## Decision Record Template
Title:       Extract CartService from ShoppingCartService
Date:        2026-03-31
Status:      Proposed | Accepted | Implemented | Rejected
Motivation:  ShoppingCartService is 1,976 lines with 31 deps
Approach:    Extract steel thread methods into focused service
Rejected:    Split by CRUD (too granular), rewrite (too risky)
Rollback:    Delete CartService, revert delegation in original
Harness:     dotnet build + dotnet test + npx playwright test
Real example: The nopCommerce workshop uses this exact pattern. The research.md maps the full add-to-cart flow (UI → Controller → Service → Data) with precise file paths and line numbers. The plan.md contains 7 ordered steps, each with a verification command. The handoff.md gives the implement agent just enough context — steel thread scope, key selectors, and harness commands — without re-explaining the research.
Stage 2 outcome

Your codebase has agent-readable context. Copilot understands your conventions, architecture, and constraints. You're no longer starting from zero every prompt.

3
Wave 2 — Augmentation

Build Physics: Add Tests and Harnesses

This is where you stop relying on prompts and start building structural enforcement. Tests, linters, and CI pipelines become the physics that make wrong agent behaviour impossible. You're closing the gaps in code coverage that make AI unreliable.

Unit test coverage — use Copilot /tests to generate tests for critical paths you lack coverage on
E2E tests — Playwright or equivalent covering your core user flows (the steel thread)
Linter configured — style rules enforced automatically, not by convention
CI pipeline running — build + lint + test on every PR, failures block merge
The Code Quality Loop — Explore → Plan → Code → Validate → Repeat, with AI accelerating each step
Force Blanks rule — AI leaves gaps when unsure, with a "Reason for Blank" field
Verification mindset — every risk maps to a harness; if a human has to check, the harness is incomplete
Stage 3 outcome

Your codebase has physics. AI agents can make changes and get immediate, structural feedback. Wrong behaviour is caught automatically, not by code review.
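A minimal version of that CI physics, sketched as a GitHub Actions workflow. The dotnet and Playwright commands mirror the starter template; adjust to your stack, and pair this with branch protection so failures block merge:

```yaml
# .github/workflows/harness.yml — build + lint + test on every PR
name: harness
on: pull_request

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
      - run: dotnet format --verify-no-changes   # style physics
      - run: dotnet build
      - run: dotnet test                          # unit + integration
      - run: npx playwright install --with-deps
      - run: npx playwright test                  # E2E steel thread
```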

4
Wave 2–3 — Augmentation → Orchestration

Steel Thread: Prove It End-to-End

Pick one narrow, vertical slice through your entire architecture — a "steel thread." Apply the full RPI methodology: Research with LSP, Plan with risk registers, Implement with feedback loops. This is your proof-of-concept that the methodology works on your codebase.

Steel thread identified — one end-to-end user flow touching every layer (UI, controller, service, data)
Research phase (R) — agent explores the steel thread with LSP, produces dense research.md
Plan phase (P) — fresh agent reads only research.md, produces plan.md with risk register
Implement phase (I) — fresh agent follows plan step-by-step, all harnesses green after every change
ADC artefacts saved — research.md, plan.md, handoff.md stored in docs/adc/ for future reference
Custom Skills created — your proven patterns encoded as reusable .github/skills/ or .claude/skills/
MCP servers configured — GitHub, Playwright, or domain-specific servers connected to your agent workflow
Results demonstrated — show the team what worked, what the harnesses caught, what the agent self-corrected
Stage 4 outcome

You've proven the methodology works on your actual codebase. You have a repeatable playbook, reusable skills, and concrete results to share with your team.

5
Wave 3–4 — Orchestration → Autonomy

Scale: From You to the Organisation

Graduate your proven patterns from individual to team infrastructure. Shared skills repos, pipeline agents, PR review bots, automated research phases. The methodology that worked on your machine now runs in your CI/CD pipeline.

Shared skills repo — proven skills promoted from personal to team, versioned and reviewed like code
Team copilot-instructions.md — org conventions as agent-readable intent, maintained by the team
PR review agent — Copilot code review on every PR, checking architecture, style, and test coverage
Dependabot + Copilot — automated dependency self-healing: alert → analysis → fix PR → merge
Coding agent for issues — assign tickets to Copilot, it creates draft PRs asynchronously
Security integration — Defender/GHAS alerts trigger agent-generated remediation PRs
Modernization pipelines — Assess → Plan → Execute → Validate agent workflows for upgrades
Org-wide ADCs — decision contexts become institutional memory, searchable by future agents and humans
Stage 5 outcome

AI agents are team infrastructure. Your pipeline researches, plans, implements, and verifies — with humans setting intent and reviewing outcomes. The AI-SDLC is operational.
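The Dependabot half of that self-healing loop is plain configuration. A minimal sketch (the Copilot review and auto-merge policy is configured separately in your repo settings):

```yaml
# .github/dependabot.yml — the trigger side of dependency self-healing
version: 2
updates:
  - package-ecosystem: "nuget"
    directory: "/"
    schedule:
      interval: "weekly"
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```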

One Journey, Five Stages

Each stage unlocks the next. You can't build physics without capturing intent first. You can't run a steel thread without physics. And you can't scale what you haven't proven. The progression is deliberate.

📚
Learn
Tools & prompts
📄
Document
Capture intent
🛡
Enforce
Build physics
🚀
Prove
Steel thread
🌎
Scale
Team & org

Start where you are. Most teams are somewhere between Stage 1 and Stage 2. That's fine. The roadmap isn't a race — it's a progression. Each stage makes the next one possible, and each stage delivers value on its own.

Where AI Fits in the Development Lifecycle

By Stage 5, AI participates in every phase of the SDLC. But even at Stage 1, it's accelerating your work. The difference is scope and trust.

Discovery (Stage 2+): Reverse-engineer architecture, generate system docs, map dependencies with LSP
Planning (Stage 4+): Draft implementation plans, risk registers, migration strategies from research artefacts
Coding (Stage 1+): Completions, inline chat, agent-mode multi-file edits, vibe coding for prototypes
Testing (Stage 1+): Generate unit tests, E2E tests, edge cases. Copilot as the harness-builder, not just the coder
Code Review (Stage 5): Copilot PR reviews for architecture, style, security. Human final approval
Security (Stage 5): Defender → GHAS → agent remediation. Runtime threats trigger source-level fixes
Maintenance (Stage 5): Dependabot + Copilot self-healing, modernization agents for framework upgrades

From Maker to Architect of Intent

The deepest change isn't in tooling — it's in mindset. As you progress through the stages, your role evolves from writing code to defining what correct looks like.

Stages 1–2
Maker
You write code with AI assistance. Copilot suggests, you accept or reject. The human does the thinking.
Stages 3–4
Engineering Manager
You define the Definition of Done and the physics. Agents do the implementation. You review outcomes, not code.
Stage 5
Architect of Intent
You define the system of systems. Intent flows through docs, skills, harnesses, and pipelines. Agents coordinate agents.
Execution is now cheap. Verification and clarity are the new bottlenecks.
Not vibe coding. Intent specification.
Stage 5 — Scale

Scale to Your Organisation

Graduate your proven patterns from individual to team infrastructure. Shared skills, pipeline agents, and the AI-native SDLC.

From Assistance to Autonomy

AI adoption in software development follows a progression. Each wave changes the ratio of humans to agents — and the kind of intent engineering required.

Wave 1

Assistance

Human + Copilot
👤
+

AI as autocomplete. You drive, it suggests. Code completion, inline help, simple Q&A. The human does all the thinking.

Wave 2
Today

Augmentation

Human-directed agents
👤

You direct agents to do whole tasks. Local skills, RPI methodology, personal agent workflows. This is where intent engineering begins.

Wave 3

Orchestration

Team-wide agent workflows
👤
👤

Shared skills repos, pipeline agents, PR review bots. Agents become team infrastructure. Humans supervise.

Wave 4

Autonomy

Agent-orchestrated systems
👤

Agents coordinating agents at scale. Humans set intent and review outcomes. The SDLC runs itself.

We must shift our mindset from text-crafters to systems architects.

Getting There: Individual First, Then Team

You don't jump straight to pipeline agents. The adoption path starts with you — proving the patterns locally — then graduating them to shared infrastructure.

1. Get It Working Locally

Build your own skills, refine your own agent workflows, prove the RPI methodology on real tasks. This is your lab — experiment, iterate, learn what works. You're building muscle memory for intent engineering.

Personal CLAUDE.md Local skills RPI on your own PRs Custom agent prompts ADC decision records
👥

2. Share Patterns with Your Team

Once your skills and workflows are proven, promote them to shared infrastructure. A team skills repo means everyone benefits from your hard-won patterns. Shared CLAUDE.md files encode team conventions as agent-readable intent.

Shared skills repo Team CLAUDE.md Documented agent workflows Pair-programming with agents

3. Agents in the Pipeline

Agents move from your terminal into the CI/CD pipeline. PR review agents check for style, test coverage, and architectural compliance. Research agents pre-analyse tickets. Plan agents draft implementation approaches before a human even starts.

PR review agents CI/CD pipeline agents Ticket pre-analysis Automated research phase Architecture compliance
🌎

4. Agentic Workflows at Scale

The full AI-SDLC: agents that research, plan, implement, and verify — with humans setting intent and reviewing outcomes. The RPI methodology you proved locally is now an organisational capability. Custom Skills encode your org's patterns as portable, reusable knowledge.

Agent-to-agent handoffs Org-wide skills marketplace Autonomous RPI pipelines Intent-driven backlog

What Changes When You Go from You to Us

Individual (Waves 1–2)

📄
Personal CLAUDE.md — your coding conventions, preferences, and project-specific rules
Local skills — custom skills in your own .claude/skills/ folder, iterated until they work
🔎
Manual RPI — you trigger each phase, review each artefact, guide each agent
📋
ADC records — your own decision contexts, learning what to capture
🛠
Local harnesses — tests and linters on your machine, run by your agents

Team & Organisation (Waves 3–4)

📄
Shared CLAUDE.md — team conventions, architecture decisions, and coding standards as agent-readable intent
Shared skills repo — proven skills promoted from individual to team, versioned and reviewed like code
🔎
Pipeline RPI — agents triggered by events (new PR, ticket assignment), phased workflows automated
📋
Org-wide ADCs — decision contexts become institutional memory, searchable by future agents
🛠
CI/CD harnesses — the same physics-based enforcement, now running in pipelines with agent reviewers

Skills Are the Unit of Adoption

A skill is a self-contained folder — a SKILL.md plus any bundled scripts, references, and examples. Skills are how individual knowledge becomes team capability.
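As a concrete shape, a skill folder might look like this. The folder name and file contents are illustrative:

```
.claude/skills/rpi-research/
├── SKILL.md             # when to use the skill + step-by-step instructions
├── scripts/
│   └── map-callers.sh   # helper script the agent can run
└── examples/
    └── research.md      # a known-good output to imitate
```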

💡
Create
Build locally,
iterate until proven
🔍
Prove
Use on real tasks,
refine the SKILL.md
👥
Share
Promote to shared
skills repo via PR
🌎
Scale
Agents use skills
in pipelines at scale

The industry is converging. GitHub's Custom Skills, Copilot's Modernization Agent, the Agentic Context Framework — they all use the same pattern: self-contained skill folders with documentation, scripts, and examples. Your RPI methodology can be packaged as a skill. Your ADC pattern can be a skill. This is how methodology becomes infrastructure.

The AI-Native SDLC

In the AI-SDLC, every stage of the development lifecycle has agent participation — guided by intent, constrained by harnesses, connected by dense artefacts.

📄 Ticket Created
Research agent analyses the ticket, maps affected files, produces research.md
📐 Research Ready
Plan agent reads research.md, drafts implementation plan with risk register
☑ Human Review
Developer reviews the plan, adjusts intent, approves or redirects
⚡ Plan Approved
Implement agent follows plan step-by-step, self-corrects from harness output
🚀 PR Created
Review agent checks architecture, test coverage, style — human does final approval

The key insight: The same intent engineering patterns that make agents reliable on your machine — phased workflows, dense artefacts, physics-based enforcement — are exactly what make them reliable in a pipeline. The AI-SDLC isn't a different methodology. It's the same methodology, graduated from individual to infrastructure.

The Dumb Zone

AI agents don't degrade gracefully as context fills up. Past roughly 40% context usage, reasoning quality drops sharply — the model becomes confident but wrong. This is the Dumb Zone.

[Chart: reasoning quality vs. context window usage. Excellent below ~40%; degrading beyond that; the Dumb Zone past it.]

200k+ lines can't fit in a window. nopCommerce's ShoppingCartService alone is 1,976 lines with 31 dependencies. Dumping files into context doesn't work — you need curated context, not raw context.

Three Ways to Minimise Token Waste

🔎

1. Precise Tools

Use LSP, AST queries, and structured APIs instead of grep and file reads. Get exactly the data you need, not pages of approximate matches.
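For example, the LSP request behind "find every caller" is a single structured query. This is the standard textDocument/references shape from the LSP spec; the file path and position are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "textDocument/references",
  "params": {
    "textDocument": { "uri": "file:///src/Services/ShoppingCartService.cs" },
    "position": { "line": 412, "character": 28 },
    "context": { "includeDeclaration": false }
  }
}
```

The response is an exact list of caller locations, where grep would return every line containing the method name anywhere in the repo.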

📦

2. Dense Artefacts

Compress hours of exploration into structured documents. Handoff artefacts carry intent forward without carrying raw context.

3. Fresh Context Windows

Start each phase clean. Don't accumulate stale tokens from earlier exploration. The RPI methodology bakes this in.

LSP Changes Everything

The same research task — mapping the shopping cart steel thread — was run twice. Once with grep and manual file reads, once with Language Server Protocol. The difference isn't subtle.

Without LSP (grep)

📄
6 methods identified for extraction
10 risks in register
🔍
Caller counts estimated, not verified
📋
Reads many files to find partial matches
vs

With LSP

📄
10 methods identified for extraction
12 risks in register (+2 critical)
🔍
Precise caller counts: 30+ consumers by layer
📋
Structured queries return only what's needed

Three Critical Gaps Grep Missed

1

Test DI Registration

BaseNopTest.cs needs CartService registered too — not just NopStartup.cs. Without this, all unit tests fail.

Would break all tests
2

Bidirectional Coupling

Delete and Update call each other. Moving one without the other creates a circular dependency that the compiler rejects.

Would block compilation
3

Hidden External Callers

LSP found 8 callers of GetStandardWarningsAsync from outside the service. Grep found none of them, so the method was wrongly treated as internal-only.

Would break 8 callers

LSP-verified research is physics for planning. Any of these three gaps would have caused the implementation to fail. LSP answers precise questions with precise data — no wasted tokens on scanning files for approximate matches.

Compression, Not Accumulation

Each RPI phase compresses its findings into a dense handoff document. The next phase gets curated knowledge, not raw exploration. This is how you stay out of the Dumb Zone.

Full codebase (200k lines of code) → research.md (~2k lines) → plan.md (~350 lines) → focused action (6 steps)

Each handoff compresses the context by one to two orders of magnitude. The implementing agent works with a plan, not a codebase, and stays well under the 40% threshold where reasoning degrades.

Context Dumps: Curated, Not Complete

When you need to seed a fresh agent with project context, don't dump everything. Extract only the paths and artefacts relevant to the task at hand — a targeted context dump.

Targeted context dump
# Instead of giving the agent the entire repo...
/context-dump \
  src/Services/ShoppingCartService.cs  \
  src/Interfaces/ICartService.cs       \
  docs/adc/extract-cart-service/       \
  docs/01_strategy/                    \
  .ai/plans/current-plan.md

# The agent gets ~500 lines of curated context
# instead of 200k lines of raw codebase

Raw Context Dump

  • Entire repo loaded into context
  • Agent must sift through irrelevant files
  • Hits context window limits quickly
  • Falls into the Dumb Zone

Curated Context Dump

  • Specific paths and artefacts selected
  • Agent starts with focused, relevant context
  • Stays well under context limits
  • Maximum reasoning quality preserved

This applies beyond initial seeding. Any time you're passing context to an agent — whether through files, tool outputs, or system prompts — ask: is every token here earning its place?

Think in Token Budgets

Every token in the context window has an opportunity cost. Tokens spent on irrelevant context are tokens the agent can't use for reasoning about the actual problem.

Naive approach (dump everything): raw files 65%, prompts 20%, and reasoning gets what's left.
Intent-engineered (curated context): a dense artefact, prompts 20%, reasoning 65%. That last share is what you want.

The Token Minimisation Toolkit

LSP Over Grep

Structured queries return precise answers. find-references gives you exact callers; grep gives you every line containing a string. One is signal, the other is noise.

📦

Dense Handoff Artefacts

Compress exploration into structured documents: 200k lines → ~2k-line research.md → ~350-line plan.md. Each handoff shrinks the context by one to two orders of magnitude.

Targeted Context Dumps

Select specific paths and artefacts for fresh agents. Every token should earn its place in the context window.

🔄

Fresh Context Windows

Start each RPI phase clean. Don't accumulate stale exploration tokens. Dense artefacts bridge the gap between phases.

The principle: Legacy systems require curated context, not raw context dumps. Adding more files to context does not automatically solve the problem — it usually makes it worse.

Big Picture — Where AI Fits

AI Across the SDLC

The most common question: "Where does AI actually fit in our process?" This map shows every stage of the software development lifecycle with concrete AI capabilities you can use today.

Eight Stages, One Map

AI isn't just for coding. It fits across the entire lifecycle — from planning through operations. The maturity varies by stage: coding is ready now, operations is still emerging. Here's the big picture.

Ready = use today · Emerging = maturing fast · Preview = early access
01 Plan
Emerging
Spec Kit generates structured specs from prompts. Copilot Chat explores existing codebases with @workspace. Azure Boards sends work items directly to the coding agent.
02 Design
Emerging
Mermaid diagrams via Copilot for architecture visualisation. ADC records capture decisions. Group Simulation tactic for multi-perspective review.
03 Code
Ready
Inline completions, Agent Mode for multi-file changes, Coding Agent for async implementation, copilot-instructions.md, SKILL.md, MCP servers.
04 Test
Ready
/tests generates unit tests. Agent Mode does test-first workflows. CodeQL for SAST. Custom agents for DAST (StackHawk, Aikido).
05 Review
Ready
Copilot Code Review on PRs (comments, never approvals). Custom review agents (.agent.md) for accessibility, security, compliance checks.
06 Deploy
Emerging
GitHub Actions YAML generation and maintenance. Agent Mode for complex pipeline changes. IaC generation (Terraform, Bicep, K8s).
07 Operate
Preview
Azure SRE Agent for monitoring. Copilot SDK for custom ops agents. MCP connections to Datadog, Grafana, PagerDuty.
08 Maintain
Ready
Agent Mode refactoring across files. /explain for legacy code. Dependabot + Copilot review. Steel Thread extraction for brownfield.

Monday morning starter kit: You don't need all of this. Start with three things: (1) Create a .github/copilot-instructions.md with your coding standards. (2) Try agent mode on one real task. (3) Ask Copilot to generate tests for code you wrote this week. That's enough to change how your team works.

Quick Reference

One-Pagers

Focused, printable reference sheets distilled from the workshop material. Each answers a specific question you can hand to your team or your exec.

🎯

The Recommended Approach — Research → Plan → Implement → Review

START HERE — The single process for every non-trivial ticket

One timeboxed workflow for greenfield and brownfield work. Four steps, suggested durations, artefacts at each stage, PR checklist, quick commands. This is the "one-pager" your team asked for.

These are designed to be shared. Send a one-pager to your exec, print one for your squad, or use them as a reference during project time. Each stands alone — no prior context needed.
