Tool Design for Agents


Tools are the primary mechanism through which agents interact with the world. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix.

When to Use

Use this skill when:

  • Creating new tools for agent systems
  • Debugging tool-related failures or misuse
  • Optimizing existing tool sets for better performance
  • Designing tool APIs from scratch
  • Evaluating third-party tools for agent integration
  • Standardizing tool conventions across a codebase

Don't use this skill when:

  • Simple single-tool applications
  • Traditional developer-facing API design
  • Non-agent applications

Core Concepts

Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract. Agents must infer the contract from descriptions and generate matching calls.

Tool descriptions are prompt engineering that shapes how agents reason about tool use. The descriptions collectively steer agent behavior.

The Consolidation Principle

If a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better.

This leads to a preference for single comprehensive tools over multiple narrow tools.

Example: Scheduling

Before (Multiple narrow tools):

  • list_users - List all users
  • list_events - List all events
  • create_event - Create event
  • check_availability - Check user availability

After (Consolidated):

  • schedule_event - Find availability and schedule (handles full workflow internally)
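
A minimal sketch of what the consolidated tool might look like. The `Event` dataclass and the `_find_common_slot` helper are hypothetical; a real implementation would query each attendee's calendar rather than returning the window start.

```python
from dataclasses import dataclass

@dataclass
class Event:
    title: str
    attendees: list[str]
    start: str  # ISO-8601 start time, e.g. "2024-05-01T14:00"

def _find_common_slot(attendees: list[str], duration_minutes: int,
                      window_start: str, window_end: str) -> str:
    # Placeholder: a real implementation would check each attendee's availability.
    return window_start

def schedule_event(title: str, attendees: list[str], duration_minutes: int,
                   window_start: str, window_end: str) -> Event:
    """Find a slot when all attendees are free and create the event.

    Internally resolves users, checks availability, and creates the
    event, so the agent makes one call instead of orchestrating four.
    """
    slot = _find_common_slot(attendees, duration_minutes, window_start, window_end)
    return Event(title=title, attendees=attendees, start=slot)
```

The agent now reasons about one contract ("schedule this meeting") instead of sequencing four narrower calls correctly.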

Why Consolidation Works

| Benefit | Explanation |
| --- | --- |
| Reduced token consumption | Eliminates redundant descriptions |
| Eliminated ambiguity | One tool per workflow |
| Simpler selection | Smaller effective tool set |
| Better reliability | Less chance of wrong tool choice |

When NOT to Consolidate

  • Tools with fundamentally different behaviors
  • Tools used in different contexts
  • Tools that might be called independently

Architectural Reduction

The consolidation principle, taken to its logical extreme, leads to architectural reduction: removing most specialized tools in favor of primitive, general-purpose capabilities.

The File System Agent Pattern

Instead of custom tools for data exploration, schema lookup, and query validation, provide direct file system access through a single command execution tool:

# Instead of many specialized tools:
# - explore_database()
# - lookup_schema()
# - validate_query()
# - preview_results()

# Use one primitive:
execute_command("grep -r 'user_id' schema/")
execute_command("cat schema/users.sql")
execute_command("ls -la data/")
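
One possible shape for the `execute_command` primitive, sketched with `subprocess`. The timeout and truncation limit are illustrative choices, not requirements; bounding output protects the agent's context window, and returning the exit code lets the agent distinguish failure from empty output.

```python
import subprocess

def execute_command(command: str, timeout: int = 30, max_output: int = 10_000) -> dict:
    """Run a shell command and return a bounded, structured result."""
    proc = subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    output = proc.stdout + proc.stderr
    return {
        "exit_code": proc.returncode,
        "output": output[:max_output],      # keep the response context-friendly
        "truncated": len(output) > max_output,
    }
```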

Production Evidence: Vercel d0

| Metric | Before (17 tools) | After (2 tools) |
| --- | --- | --- |
| Success rate | 80% | 100% |
| Avg execution | 274s | 77s |
| Tools | 17 specialized | 2 primitives (bash + SQL) |

Key insight: The semantic layer was already good documentation. Claude just needed access to read files directly.

When Reduction Outperforms Complexity

Reduction works when:

  • Data layer is well-documented and consistent
  • Model has sufficient reasoning capability
  • Specialized tools were constraining, not enabling
  • Teams spend more time maintaining scaffolding than improving outcomes

Complexity is necessary when:

  • Underlying data is messy or poorly documented
  • Domain requires specialized knowledge the model lacks
  • Safety constraints require limiting agent capabilities
  • Operations truly benefit from structured workflows

Tool Description Engineering

Effective descriptions answer four questions:

| Question | What to Include |
| --- | --- |
| What does it do? | Clear, specific functionality. Avoid "helps with" or "can be used for" |
| When should it be used? | Specific triggers and contexts |
| What inputs? | Parameter types, constraints, defaults |
| What returns? | Output format, success/error examples |

Example: Well-Designed Tool

def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.

    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity

    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record

    Returns:
        Customer object with requested fields

    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """

Anti-Pattern: Poor Tool Design

def search(query):
    """Search the database."""  # ← Vague, missing everything

Problems: vague name, undocumented parameter, no return description, no usage context, no error handling.

Response Format Optimization

Tool response size significantly impacts context usage.

Format Options

def get_customer(customer_id: str, format: str = "concise"):
    record = fetch_customer_record(customer_id)  # full record from the data store (helper assumed)
    if format == "concise":
        return {"id": record["id"], "name": record["name"], "status": record["status"]}
    elif format == "detailed":
        return record

Include format guidance in tool descriptions so agents learn to select appropriately.

Error Message Design

Error messages serve two audiences: developers debugging and agents recovering.

Design for Recovery

# Bad: Generic error
raise ValueError("Invalid input")

# Good: Actionable for agents
raise ValueError(
    "INVALID_FORMAT: customer_id must match 'CUST-######'. "
    "Received: 'customer123'. "
    "Example valid ID: 'CUST-000001'"
)
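
A sketch of why the actionable form matters. The `get_customer` stub and `call_with_recovery` wrapper below are hypothetical, but they show how an error that names the expected pattern lets the caller correct the call instead of giving up.

```python
import re

CUSTOMER_ID_PATTERN = re.compile(r"^CUST-\d{6}$")

def get_customer(customer_id: str) -> dict:
    """Simplified stub of the customer lookup tool."""
    if not CUSTOMER_ID_PATTERN.match(customer_id):
        raise ValueError(
            f"INVALID_FORMAT: customer_id must match 'CUST-######'. "
            f"Received: '{customer_id}'. Example valid ID: 'CUST-000001'"
        )
    return {"id": customer_id, "status": "active"}  # placeholder record

def call_with_recovery(raw_id: str) -> dict:
    """Recover from the documented error by reformatting a bare numeric ID."""
    try:
        return get_customer(raw_id)
    except ValueError as err:
        if "INVALID_FORMAT" in str(err) and raw_id.isdigit():
            return get_customer(f"CUST-{int(raw_id):06d}")
        raise
```

An agent reading the error text performs the same recovery in its reasoning: the message supplies the pattern and a valid example, so the retry is informed rather than blind.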

MCP Tool Naming Requirements

When using MCP (Model Context Protocol) tools, always use fully qualified names:

# Correct: Fully qualified
"Use the BigQuery:bigquery_schema tool..."
"Use the GitHub:create_issue tool..."

# Incorrect: Unqualified (may fail with multiple servers)
"Use the bigquery_schema tool..."

Tool Collection Guidelines

Research shows tool description overlap causes model confusion. A reasonable guideline: 10-20 tools for most applications.

If more needed, use namespacing:

database:query
database:schema
web:search
web:fetch
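
One way to keep namespaced tools organized is a registry keyed by namespace, flattened into qualified names when the tool list is presented to the agent. The registry contents and the `qualified_tools` helper are illustrative.

```python
TOOL_REGISTRY: dict[str, dict[str, str]] = {
    "database": {
        "query": "Run a read-only SQL query against the primary database.",
        "schema": "Return the schema for a named table.",
    },
    "web": {
        "search": "Search the web and return ranked result snippets.",
        "fetch": "Fetch a URL and return its text content.",
    },
}

def qualified_tools(registry: dict[str, dict[str, str]]) -> list[str]:
    """Flatten a namespaced registry into 'namespace:tool' names."""
    return [f"{ns}:{name}" for ns, tools in registry.items() for name in tools]
```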

Using Agents to Optimize Tools

Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements.

def optimize_tool_description(tool_spec, failure_examples):
    prompt = f"""
    Analyze this tool specification and failures.

    Tool: {tool_spec}
    Failures: {failure_examples}

    Identify:
    1. Why agents are failing
    2. What information is missing
    3. What ambiguities cause incorrect usage

    Propose improved description.
    """
    return get_agent_response(prompt)

Result: improved descriptions cut task completion time by 40% by helping future agents avoid the same mistakes.

Guidelines

  1. Write descriptions that answer what, when, and what returns
  2. Use consolidation to reduce ambiguity
  3. Implement response format options for token efficiency
  4. Design error messages for agent recovery
  5. Establish and follow consistent naming conventions
  6. Limit tool count and use namespacing for organization
  7. Test tool designs with actual agent interactions
  8. Question whether each tool enables or constrains the model
  9. Prefer primitive, general-purpose tools over specialized wrappers
  10. Invest in documentation quality over tooling sophistication
  11. Build minimal architectures that benefit from model improvements

Skills

  • context-fundamentals - How tools interact with context
  • multi-agent-patterns - Specialized tools per agent
  • evaluation-framework - Evaluating tool effectiveness

Agents

  • tool-designer - Tool description generation and validation

Scripts

  • external/Agent-Skills-for-Context-Engineering/skills/tool-design/scripts/description_generator.py - Tool schema builder

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: tool-design

Completed:
- [x] Tool API defined with clear parameters and return types
- [x] Tool description answers: what, when, inputs, outputs
- [x] Consolidation analysis performed (reduced X tools to Y)
- [x] Response format options implemented (concise/detailed)
- [x] Error messages designed for agent recovery
- [x] Tool naming follows MCP qualification requirements
- [x] Agent-tested with actual invocations

Outputs:
- Tool schema with complete type definitions
- Tool description optimized for agent understanding
- Error message catalog with recovery guidance
- Usage examples with expected inputs/outputs
- Consolidation recommendation report (if applicable)
- Agent testing results with success/failure analysis

Completion Checklist

Before marking this skill as complete, verify:

  • Tool description explicitly states what the tool does
  • Tool description specifies when to use (and when NOT to use)
  • All parameters documented with types, constraints, defaults
  • Return values clearly specified with success/error formats
  • Error messages include actionable recovery guidance
  • Tool name is unique and descriptive (no generic names)
  • MCP tools use fully qualified names (Server:tool_name)
  • Response format options reduce token consumption where applicable
  • Agent tested tool with representative tasks
  • Tool performs better than replaced alternatives (if consolidating)

Failure Indicators

This skill has FAILED if:

  • ❌ Tool description vague ("helps with", "can be used for")
  • ❌ Parameters lack type definitions or constraints
  • ❌ Return format undocumented or inconsistent
  • ❌ Error messages generic ("Invalid input", "Error occurred")
  • ❌ Tool name ambiguous (conflicts with other tools)
  • ❌ MCP tools use unqualified names causing selection failures
  • ❌ No response format options for verbose operations
  • ❌ Agent repeatedly misuses tool in testing
  • ❌ Tool slower or less reliable than replaced alternatives
  • ❌ Consolidation created more confusion than it resolved

When NOT to Use

Do NOT use this skill when:

  • Designing human APIs - Different principles for developer-facing APIs
  • Single-use automation scripts - Tool overhead not justified
  • Well-established tool interfaces - Don't redesign working tools without reason
  • Non-agent systems - This skill is agent-specific
  • UI/UX design - Different design principles apply
  • Data schemas - Use data modeling skills instead
  • Simple function calls - Internal functions don't need tool-level design

Alternative approaches:

  • Use REST API design patterns for human-facing APIs
  • Use language idioms for internal function design
  • Use database design patterns for schema definition

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Generic tool names | "search", "process", "handle" - unclear | Specific names: "search_customers", "process_invoice" |
| Vague descriptions | "Helps with data" - ambiguous | "Retrieve customer by ID from PostgreSQL" |
| Missing when-to-use | Agent doesn't know appropriate context | Explicit triggers: "Use when user asks for customer details" |
| Undocumented parameters | Agent guesses parameter values | Document types, constraints, examples |
| Generic error messages | "Error occurred" - can't recover | "INVALID_FORMAT: ID must match CUST-######" |
| No response formats | Always returns verbose data | Offer concise/detailed options |
| Tool proliferation | 50+ similar tools confuse model | Consolidate overlapping functionality |
| Unqualified MCP names | Fails with multiple servers | Always use Server:tool_name format |
| No agent testing | Design assumptions untested | Test with real agent interactions |
| Over-consolidation | One tool does everything poorly | Balance consolidation with clarity |
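
Several of these anti-patterns can be caught mechanically before agent testing. A small linter sketch, with illustrative phrase and name lists chosen to match the table above:

```python
VAGUE_PHRASES = ("helps with", "can be used for", "handles")
GENERIC_NAMES = {"search", "process", "handle", "run", "do"}

def lint_tool(name: str, description: str) -> list[str]:
    """Flag generic names, vague phrasing, and missing when-to-use guidance."""
    problems = []
    if name.lower() in GENERIC_NAMES:
        problems.append(f"generic name: '{name}'")
    lowered = description.lower()
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            problems.append(f"vague description: contains '{phrase}'")
    if "use when" not in lowered:
        problems.append("missing when-to-use guidance")
    return problems
```

Running such a check over a tool set is cheap and catches the text-level anti-patterns; selection ambiguity and over-consolidation still require testing with real agent interactions.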

Principles

This skill embodies these CODITECT principles:

  • #2 First Principles - Tools as contracts between deterministic and non-deterministic systems
  • #5 Eliminate Ambiguity - Clear descriptions eliminate tool selection confusion
  • #6 Clear, Understandable, Explainable - Tool purpose and usage explicitly stated
  • #7 Error Handling Excellence - Error messages guide agent recovery
  • #10 Performance Matters - Response format options optimize token consumption
  • #12 Simplicity Scales - Consolidation principle reduces complexity
  • #13 Automate Repetitive Tasks - Agents can optimize their own tools
  • #16 Test Everything - Agent testing validates tool design

Full Principles: CODITECT-STANDARD-AUTOMATION.md