Context Engineering Fundamentals

Context is the complete state available to a language model at inference time. It includes everything the model can attend to when generating responses: system instructions, tool definitions, retrieved documents, message history, and tool outputs. Understanding these fundamentals is a prerequisite to effective context engineering.

When to Use

Use this skill when:

  • Designing new agent systems or modifying existing architectures
  • Debugging unexpected agent behavior that may relate to context
  • Optimizing context usage to reduce token costs or improve performance
  • Onboarding new team members to context engineering concepts
  • Reviewing context-related design decisions

Don't use this skill when:

  • Simple single-turn completions with minimal context
  • Tasks where context management is already mature and optimized
  • Non-agent applications without tool use

Core Concepts

Context comprises several distinct components, each with different characteristics and constraints. The attention mechanism creates a finite budget that constrains effective context usage. Progressive disclosure manages this constraint by loading information only as needed. The engineering discipline is curating the smallest high-signal token set that achieves desired outcomes.

The Anatomy of Context

System Prompts

System prompts establish the agent's core identity, constraints, and behavioral guidelines. They are loaded once at session start and typically persist throughout the conversation. System prompts should be extremely clear and use simple, direct language at the right altitude for the agent.

The right altitude balances two failure modes:

  • Too specific: Hardcoded brittle logic creates fragility and maintenance burden
  • Too vague: High-level guidance fails to give concrete signals for desired outputs

Organize prompts into distinct sections using XML tagging or Markdown headers to delineate background information, instructions, tool guidance, and output description.

Tool Definitions

Tool definitions specify the actions an agent can take. Each tool includes a name, description, parameters, and return format. Tool definitions live near the front of context after serialization.

Tool descriptions collectively steer agent behavior. Poor descriptions force agents to guess; optimized descriptions include usage context, examples, and defaults. The consolidation principle: if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better.
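A minimal sketch of what an optimized definition can look like, using a hypothetical `search_files` tool (the name, parameters, and defaults here are illustrative, not a specific provider's schema). Note how the description states when to use the tool, its defaults, and its return format:

```python
# Hypothetical tool definition with a high-signal description:
# it says WHEN to use the tool, what the defaults are, and what comes back.
search_files_tool = {
    "name": "search_files",
    "description": (
        "Search file contents by regex. Use this instead of reading files "
        "one by one when you need to locate a symbol or string. "
        "Returns up to max_results matches as 'path:line: text' strings."
    ),
    "parameters": {
        "pattern": {"type": "string", "description": "Regex to search for."},
        "path": {"type": "string", "description": "Directory root. Default: project root."},
        "max_results": {"type": "integer", "description": "Cap on matches. Default: 50."},
    },
}
```

A description like this removes the guesswork: an engineer reading it can say definitively when this tool applies, which is the bar the consolidation principle sets.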

Retrieved Documents

Retrieved documents provide domain-specific knowledge, reference materials, or task-relevant information. Agents use retrieval augmented generation (RAG) to pull relevant documents into context at runtime rather than pre-loading all possible information.

Just-in-time approach: Maintain lightweight identifiers (file paths, stored queries, web links) and use these references to load data into context dynamically.

Message History

Message history contains the conversation between the user and agent. For long-running tasks, message history can grow to dominate context usage. Message history serves as scratchpad memory where agents track progress, maintain task state, and preserve reasoning across turns.

Tool Outputs

Tool outputs are the results of agent actions: file contents, search results, command execution output, API responses. Research shows observations (tool outputs) can reach 83.9% of total context usage. Tool outputs consume context whether they are relevant to current decisions or not.
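One common mitigation is to clip long tool outputs before they enter context. A minimal sketch (the head-and-tail split and the 2,000-character default are assumptions to tune per task):

```python
def clip_tool_output(output: str, max_chars: int = 2000) -> str:
    """Keep the head and tail of a long tool output, eliding the middle.

    The head usually carries structure (headers, status) and the tail
    carries the final result, so those are the parts worth keeping.
    """
    if len(output) <= max_chars:
        return output
    half = max_chars // 2
    elided = len(output) - max_chars
    return output[:half] + f"\n...[{elided} chars elided]...\n" + output[-half:]
```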

Context Windows and Attention Mechanics

The Attention Budget Constraint

Language models process tokens through attention mechanisms that create pairwise relationships between all tokens. For n tokens, this creates n² relationships. As context length increases, the model's ability to capture these relationships gets stretched thin.

Models develop attention patterns from training data distributions where shorter sequences predominate. The result is an "attention budget" that depletes as context grows.
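The quadratic growth is easy to see concretely: doubling the token count quadruples the pairwise relationships the model must spread attention across.

```python
def attention_pairs(n_tokens: int) -> int:
    """Number of pairwise token relationships attention must model (n^2)."""
    return n_tokens * n_tokens
```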

The Progressive Disclosure Principle

Progressive disclosure manages context efficiently by loading information only as needed:

  • At startup: Load only skill names and descriptions
  • On activation: Load full content when a skill is activated
  • This keeps agents fast while giving access to more context on demand
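The two-stage loading above can be sketched as a small registry (the `SkillRegistry` class and its dict layout are illustrative, not a real library API):

```python
class SkillRegistry:
    """Two-stage loading: names/descriptions at startup, full body on activation."""

    def __init__(self, skills: dict[str, dict]):
        # skills maps name -> {"description": one-liner, "body": full content}
        self._skills = skills

    def startup_context(self) -> str:
        """Cheap index loaded at session start: names and descriptions only."""
        return "\n".join(
            f"{name}: {s['description']}" for name, s in self._skills.items()
        )

    def activate(self, name: str) -> str:
        """Full skill content, loaded only when the skill is actually invoked."""
        return self._skills[name]["body"]
```

The startup index costs a handful of tokens per skill; the full bodies never touch context unless activated.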

Practical Guidance

File-System-Based Access

Agents with filesystem access can use progressive disclosure naturally:

  • Store reference materials, documentation, and data externally
  • Load files only when needed using standard filesystem operations
  • File sizes suggest complexity; naming conventions hint at purpose
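A sketch of the survey step, assuming plain filesystem access (`survey_directory` is a hypothetical helper): list names and sizes as cheap relevance signals without loading any file contents into context.

```python
from pathlib import Path


def survey_directory(root: str) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under root.

    Names and sizes are cheap signals the agent can reason over before
    deciding which files are worth reading in full.
    """
    base = Path(root)
    return sorted(
        (str(p.relative_to(base)), p.stat().st_size)
        for p in base.rglob("*")
        if p.is_file()
    )
```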

Context Budgeting

Design with explicit context budgets in mind:

  • Know the effective context limit for your model and task
  • Monitor context usage during development
  • Implement compaction triggers at 70-80% utilization
  • Place critical information at attention-favored positions (beginning and end)
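The budgeting steps above can be sketched as a small tracker. The ~4-characters-per-token heuristic and the 0.75 trigger are assumptions to tune per model; real systems should use the model's own tokenizer or usage metadata.

```python
class ContextBudget:
    """Track estimated token usage and flag when compaction should run."""

    def __init__(self, limit_tokens: int, trigger: float = 0.75):
        self.limit = limit_tokens
        self.trigger = trigger  # compaction threshold in the 70-80% band
        self.used = 0

    def add(self, text: str) -> None:
        # Rough heuristic: ~4 characters per token (assumption; replace
        # with a real tokenizer count in production).
        self.used += max(1, len(text) // 4)

    @property
    def utilization(self) -> float:
        return self.used / self.limit

    def needs_compaction(self) -> bool:
        return self.utilization >= self.trigger
```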

Example: Organizing System Prompts

<BACKGROUND_INFORMATION>
You are a Python expert helping a development team.
Current project: Data processing pipeline in Python 3.9+
</BACKGROUND_INFORMATION>

<INSTRUCTIONS>
- Write clean, idiomatic Python code
- Include type hints for function signatures
- Add docstrings for public functions
</INSTRUCTIONS>

<TOOL_GUIDANCE>
Use bash for shell operations, python for code tasks.
File operations should use pathlib for cross-platform compatibility.
</TOOL_GUIDANCE>

Guidelines

  1. Treat context as a finite resource with diminishing returns
  2. Place critical information at attention-favored positions (beginning and end)
  3. Use progressive disclosure to defer loading until needed
  4. Organize system prompts with clear section boundaries
  5. Monitor context usage during development
  6. Implement compaction triggers at 70-80% utilization
  7. Prefer smaller high-signal context over larger low-signal context

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: context-fundamentals

Completed:
- [x] Core concepts understood (context components, attention mechanics)
- [x] Context anatomy reviewed (system prompts, tools, documents, history, outputs)
- [x] Attention budget constraint principles applied
- [x] Progressive disclosure strategy implemented
- [x] Practical guidance integrated into design

Key Insights Applied:
- Context components identified: System prompts, tool definitions, retrieved documents, message history, tool outputs
- Attention budget: n² relationships create finite resource
- Progressive disclosure: Load information only as needed
- Token budget monitoring: 70-80% utilization triggers compaction
- Critical placement: Beginning and end positions for important information

Outcomes:
- Agent architecture optimized for context efficiency
- Token usage reduced through progressive disclosure
- Context failures debugged and resolved

Completion Checklist

Before marking this skill as complete, verify:

  • All 5 context components understood (system prompts, tools, documents, history, outputs)
  • Attention budget constraint and n² relationships concept clear
  • Progressive disclosure principle applied to design
  • Context budgeting strategy defined (70-80% threshold)
  • File-system-based access patterns implemented where applicable
  • System prompts organized with clear section boundaries
  • Critical information placed at attention-favored positions
  • Context usage monitoring implemented

Failure Indicators

This skill has FAILED if:

  • ❌ Context components still confused or misunderstood
  • ❌ No awareness of attention budget constraints
  • ❌ Loading all information upfront instead of progressive disclosure
  • ❌ Context usage exceeds 95% without compaction
  • ❌ System prompts lack clear section organization
  • ❌ Critical information buried in middle of context
  • ❌ No monitoring of context usage during development
  • ❌ Applying techniques to non-agent applications unnecessarily

When NOT to Use

Do NOT use this skill when:

  • Working on simple single-turn completions with minimal context
  • Tasks where context management is already mature and optimized
  • Non-agent applications without tool use
  • Building non-LLM systems (traditional software without AI components)
  • Context size is negligible (<10% of window)
  • No iterative conversations or tool use involved

Use alternatives instead:

  • Basic prompt engineering for single-turn tasks
  • Standard documentation patterns for non-agent systems
  • Traditional software architecture for deterministic logic

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Loading entire knowledge base upfront | Context overflow, attention dilution | Use progressive disclosure, load on-demand |
| Ignoring attention budget | Context degradation, poor performance | Monitor usage, implement 70-80% compaction |
| Unstructured system prompts | Model confusion, inconsistent behavior | Use XML tags or Markdown headers for sections |
| Mixing altitude levels | Too specific (brittle) or too vague (ineffective) | Balance concrete signals with flexibility |
| Burying critical info in middle | Attention bias misses key instructions | Place at beginning or end of context |
| Pre-loading all possible docs | Token waste, slower performance | Use RAG with just-in-time retrieval |
| No context monitoring | Surprise failures at 100% usage | Implement tracking, dashboards, alerts |
| Treating context as infinite | Eventual failures as context grows | Design with explicit limits and compaction |

Principles

This skill embodies:

  • #3 Keep It Simple - Prefer smaller high-signal context over larger low-signal
  • #5 Eliminate Ambiguity - Clear section boundaries in system prompts
  • #8 No Assumptions - Monitor context usage, don't guess capacity
  • #10 Measure → Learn → Improve - Track context metrics during development
  • Progressive Disclosure - Load information only as needed, not all upfront
  • Attention Budget Awareness - Finite resource with diminishing returns

Context Engineering Principles:

  1. Treat context as finite resource with attention constraints
  2. Place critical information at beginning/end positions
  3. Use progressive disclosure to defer loading until needed
  4. Organize system prompts with clear boundaries
  5. Monitor usage and implement compaction triggers (70-80%)
  6. Prefer smaller high-signal context over larger low-signal

Full Standard: CODITECT-STANDARD-AUTOMATION.md


Skills

  • context-degradation - Understanding how context fails
  • context-optimization - Techniques for extending context capacity
  • context-compression - Compression strategies for long sessions
  • memory-systems - Persistent memory architectures

Agents

  • context-health-analyst - Monitor context health in real-time
  • compression-evaluator - Evaluate compression quality

Scripts

  • external/Agent-Skills-for-Context-Engineering/skills/context-fundamentals/scripts/context_manager.py - Context building utilities