Developer Onboarding
Core Practices

Agentic Workflows

When to use Agent mode vs. Chat: decision frameworks and guardrails for autonomous AI workflows

What Is Agent Mode?

Agent mode lets AI execute multi-step tasks autonomously: reading files, running commands, making edits, and testing changes. Unlike Chat, where you approve each step, an agent works independently toward a goal.

This playbook assumes you know how to activate Agent mode in Cursor. See Cursor Docs for feature tutorials.

Decision Matrix: Agent vs Chat

| Scenario | Use Agent Mode | Use Chat |
|---|---|---|
| Clear, well-defined task | ✅ "Add tests for UserService" | |
| Exploring unfamiliar code | | ✅ Need human judgment |
| Multiple file changes | ✅ Efficient for coordinated edits | ❌ Too much back-and-forth |
| Security-sensitive code | | ✅ Requires careful review |
| Debugging complex issue | ⚠️ Maybe, needs supervision | ✅ Better with human in loop |
| Architectural decisions | | ✅ Human must decide |
| Boilerplate generation | ✅ CRUD, tests, docs | |
| Production hotfix | | ✅ Too risky for automation |

Good Agent Tasks

Characteristics of tasks that work well:

  • Clear success criteria - "All tests pass" or "Linter errors gone"
  • Low blast radius - Changes to test files, new features, documentation
  • Reversible - Easy to undo if wrong
  • Well-scoped - 5-20 files, not entire codebase
  • Existing patterns to follow - Agent can reference similar code

Examples:

✅ "Generate unit tests for @src/orders/OrderService.ts
   - Follow patterns from @src/users/UserService.test.ts
   - Cover edge cases: empty cart, discounts, tax
   - Aim for 90%+ coverage"
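The first prompt might yield tests along these lines. This is only a sketch: `OrderService`, its `total` method, and the pricing rules are illustrative stand-ins, not the real API.

```typescript
// Sketch of agent-generated tests for an order service.
// OrderService and its shape are hypothetical stand-ins.
interface LineItem { price: number; qty: number; }

class OrderService {
  total(items: LineItem[], discount = 0, taxRate = 0): number {
    const subtotal = items.reduce((sum, i) => sum + i.price * i.qty, 0);
    const discounted = subtotal * (1 - discount);
    return Math.round(discounted * (1 + taxRate) * 100) / 100;
  }
}

// Edge cases called out in the prompt: empty cart, discounts, tax.
const svc = new OrderService();
console.assert(svc.total([]) === 0, "empty cart totals zero");
console.assert(svc.total([{ price: 10, qty: 2 }]) === 20, "no discount, no tax");
console.assert(svc.total([{ price: 100, qty: 1 }], 0.1) === 90, "10% discount");
console.assert(svc.total([{ price: 100, qty: 1 }], 0, 0.2) === 120, "20% tax");
```

Note how each assertion maps directly to a success criterion from the prompt, which is what makes the task verifiable.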

✅ "Refactor @src/payments/ to use new ErrorHandler pattern from @src/common/errors/
   - Update all try-catch blocks
   - Preserve existing behavior
   - Run tests after each file"
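To see what "update all try-catch blocks" means in practice, here is a hypothetical ErrorHandler pattern of the kind the second prompt refers to; the class name, `wrap` signature, and `chargeCard` function are all illustrative.

```typescript
// Hypothetical ErrorHandler pattern; the real one would live in src/common/errors/.
class ErrorHandler {
  static wrap<T>(op: () => T, context: string): T {
    try {
      return op();
    } catch (err) {
      // Centralized translation instead of ad-hoc try-catch in every function.
      throw new Error(`${context}: ${(err as Error).message}`);
    }
  }
}

// Before: scattered try-catch blocks. After: one wrap() call per operation.
function chargeCard(amount: number): string {
  return ErrorHandler.wrap(() => {
    if (amount <= 0) throw new Error("invalid amount");
    return "charged";
  }, "payments.chargeCard");
}

console.assert(chargeCard(10) === "charged");
```

Because the refactor preserves behavior (errors still propagate, just with context attached), running the tests after each file is a meaningful check.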

✅ "Add TypeScript strict mode to @src/legacy/
   - Fix type errors incrementally
   - Don't change runtime behavior
   - Add type annotations where needed"

Bad Agent Tasks

Characteristics that indicate a human should drive:

  • Ambiguous requirements - "Make it better" or "Fix the bug"
  • High risk - Authentication, payments, data migrations
  • Requires business judgment - "Should we cache this?" (depends on context)
  • Exploratory work - Understanding unfamiliar codebase
  • Multiple valid approaches - Architectural decisions

Examples:

❌ "Refactor the entire authentication system"
   → Too broad, too risky, needs architecture decisions

❌ "Fix the performance issue"
   → Needs profiling first, human must identify bottleneck

❌ "Make the UI look better"
   → Subjective, needs design decisions

❌ "Update all dependencies"
   → Breaking changes need human review

❌ "Implement the new feature from the spec"
   → Likely has ambiguities that need clarification

Supervision Levels

How closely to monitor agent work:

Level 1: Watch Every Step (High Risk)

When:

  • First time using agent on this codebase
  • Security-sensitive code (auth, payments, data access)
  • Modifying critical paths
  • Production hotfixes

How:

  • Review each file change before agent proceeds
  • Manually test after each major step
  • Keep tests running continuously

Level 2: Check-in Points (Medium Risk)

When:

  • Refactoring with good test coverage
  • Adding features to well-understood code
  • Multi-file changes with clear patterns

How:

  • Let agent work for 5-10 minutes
  • Review what changed, spot-check logic
  • Run tests, verify expected behavior
  • Course-correct if agent drifts off-track

Level 3: Autonomous (Low Risk)

When:

  • Test generation for existing code
  • Documentation updates
  • Code formatting/linting fixes
  • Boilerplate CRUD endpoints

How:

  • Define success criteria upfront
  • Let agent complete the task
  • Review final output before committing
  • Run full test suite

Golden rule: Start at Level 1, earn your way to Level 3.

Guardrails: What Agents Shouldn't Touch

Hard no:

## Agent No-Go Zones

- [ ] .env files or secrets
- [ ] Production database credentials
- [ ] Payment processing logic
- [ ] Authentication/authorization core
- [ ] Database migrations (unsupervised)
- [ ] Dependency updates (breaking changes)
- [ ] Git force push or history rewriting
- [ ] Deleting files without explicit permission

Add to your .cursor/rules/agent-guardrails.md:

# Agent Guardrails

## Before making changes:

- Explain your plan before executing
- Ask if touching auth, payments, or migrations
- Never modify .env or secrets files
- Never run destructive commands (rm -rf, DROP TABLE, etc.)
- Run tests after each significant change

## When uncertain:

- Ask for clarification
- Suggest options, let human decide
- Don't guess at business logic
- Don't make architectural decisions
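Rules files rely on the agent cooperating; the "never modify .env or secrets" items can also be enforced mechanically, for example in a pre-commit hook. A minimal sketch, assuming a deny list of path patterns (the patterns below are illustrative):

```typescript
// Sketch: block commits that touch no-go paths from the checklist above.
const NO_GO_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\..*)?$/, // .env, .env.local, ...
  /(^|\/)secrets?\//,    // secrets/ directories
  /migrations\//,        // DB migrations need human review
];

function violatesGuardrails(stagedPaths: string[]): string[] {
  return stagedPaths.filter((p) => NO_GO_PATTERNS.some((re) => re.test(p)));
}

// A real hook would feed in paths from `git diff --cached --name-only`.
const blocked = violatesGuardrails([".env.local", "src/app.ts", "db/migrations/001.sql"]);
console.assert(blocked.length === 2, "two paths should be blocked");
```

Belt and suspenders: the rules file steers the agent, and the hook catches anything that slips through.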

Anti-Patterns

Anti-Pattern 1: Fire and Forget

Mistake:

You: "Refactor the entire backend to use repositories"
[Walk away for 2 hours]
[Come back to 1000 lines changed, half broken]

Better:

You: "Refactor OrderService to use repository pattern
      - Follow example from @UserRepository
      - Tests must pass
      - Stop after OrderService, I'll review before continuing"

Anti-Pattern 2: Over-Trusting Agent Output

Mistake:

Agent: "I've added error handling everywhere"
You: [Commits without reviewing]
Reality: Agent wrapped everything in try-catch and swallowed the errors

Better: Always review agent changes for:

  • Logic correctness
  • Edge cases
  • Security implications
  • Performance impact

Anti-Pattern 3: Vague Instructions

Mistake:

"Fix all the bugs"
"Make it production ready"
"Improve the code quality"

Better:

"Fix the N+1 query in @ReportService.ts line 45
 - Use eager loading with joins
 - Add test to verify query count"

"Add production requirements:
 - Rate limiting on auth endpoints
 - Proper error logging (no stack traces to client)
 - Input validation on all POST/PUT endpoints"
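The "input validation" item is concrete enough that an agent can start from a shape like this framework-agnostic sketch; the field names and rules are made up for illustration.

```typescript
// Sketch: validate a request body before it reaches business logic.
// The fields and rules here are illustrative, not a real schema.
type ValidationResult = { ok: true } | { ok: false; errors: string[] };

function validateBody(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const b = body as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof b.key !== "string" || b.key.length === 0) {
    errors.push("key must be a non-empty string");
  }
  if (typeof b.value !== "string") {
    errors.push("value must be a string");
  }
  return errors.length === 0 ? { ok: true } : { ok: false, errors };
}

console.assert(validateBody({ key: "theme", value: "dark" }).ok);
console.assert(!validateBody({ key: "" }).ok);
```

Returning structured errors (rather than throwing) also satisfies the "no stack traces to client" requirement from the same prompt.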

Task Decomposition Pattern

Break large tasks into agent-sized chunks:

Bad (too big):

"Build a user preferences system"

Good (decomposed):

Step 1: "Create database schema for user_preferences table
         - user_id, key, value, created_at, updated_at
         - Generate migration file"
         [Review schema]

Step 2: "Create PreferenceRepository with CRUD operations
         - Follow patterns from @UserRepository
         - Include unit tests"
         [Review, test]

Step 3: "Create PreferenceService with business logic
         - Validation for known preference keys
         - Default values if not set
         - Include unit tests"
         [Review, test]

Step 4: "Create API endpoints
         - GET /preferences
         - PUT /preferences/:key
         - Follow patterns from @UserController"
         [Review, test, ship]
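Step 2 might land on a shape like this sketch. It is backed by a `Map` instead of a database so it stays self-contained; the interface and names are illustrative, and real code would follow @UserRepository.

```typescript
// In-memory sketch of the repository from Step 2.
interface Preference { userId: string; key: string; value: string; }

class PreferenceRepository {
  private store = new Map<string, Preference>();

  private id(userId: string, key: string): string {
    return `${userId}:${key}`;
  }

  upsert(pref: Preference): void {
    this.store.set(this.id(pref.userId, pref.key), pref);
  }

  get(userId: string, key: string): Preference | undefined {
    return this.store.get(this.id(userId, key));
  }

  listForUser(userId: string): Preference[] {
    return [...this.store.values()].filter((p) => p.userId === userId);
  }

  delete(userId: string, key: string): boolean {
    return this.store.delete(this.id(userId, key));
  }
}

const repo = new PreferenceRepository();
repo.upsert({ userId: "u1", key: "theme", value: "dark" });
console.assert(repo.get("u1", "theme")?.value === "dark");
console.assert(repo.listForUser("u1").length === 1);
```

Reviewing a small, testable unit like this at the end of Step 2 is exactly the check-in point the decomposition pattern is designed to create.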

Why this works:

  • Each step is verifiable
  • Easy to course-correct
  • Builds on verified previous step
  • Can ship incrementally

Remember: Agent mode is powerful but not magical. You're still responsible for the code. The agent is your assistant, not a replacement for thinking.