Tool Design for Agents


Tools are the primary mechanism through which agents interact with the world. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix.

When to Use

Use this skill when:

  • Creating new tools for agent systems
  • Debugging tool-related failures or misuse
  • Optimizing existing tool sets for better performance
  • Designing tool APIs from scratch
  • Evaluating third-party tools for agent integration
  • Standardizing tool conventions across a codebase

Don't use this skill when:

  • Simple single-tool applications
  • Traditional developer-facing API design
  • Non-agent applications

Core Concepts

Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract. Agents must infer the contract from descriptions and generate matching calls.

Tool descriptions are prompt engineering that shapes how agents reason about tool use. The descriptions collectively steer agent behavior.

The Consolidation Principle

If a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better.

This leads to a preference for single comprehensive tools over multiple narrow tools.

Example: Scheduling

Before (Multiple narrow tools):

  • list_users - List all users
  • list_events - List all events
  • create_event - Create event
  • check_availability - Check user availability

After (Consolidated):

  • schedule_event - Find availability and schedule (handles full workflow internally)
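
A minimal sketch of what the consolidated tool might look like. The `Event` dataclass and the `_find_common_slot` helper are hypothetical; a real implementation would query each attendee's calendar rather than returning the window start.

```python
from dataclasses import dataclass

@dataclass
class Event:
    title: str
    attendees: list[str]
    start: str  # ISO-8601 start time, e.g. "2024-05-01T14:00"

def _find_common_slot(attendees: list[str], duration_minutes: int,
                      window_start: str, window_end: str) -> str:
    # Placeholder: a real implementation would check each attendee's availability.
    return window_start

def schedule_event(title: str, attendees: list[str], duration_minutes: int,
                   window_start: str, window_end: str) -> Event:
    """Find a slot when all attendees are free and create the event.

    Internally resolves users, checks availability, and creates the
    event, so the agent makes one call instead of orchestrating four.
    """
    slot = _find_common_slot(attendees, duration_minutes, window_start, window_end)
    return Event(title=title, attendees=attendees, start=slot)
```

The agent now reasons about one contract ("schedule this meeting") instead of sequencing four narrower calls correctly.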

Why Consolidation Works

| Benefit | Explanation |
| --- | --- |
| Reduced token consumption | Eliminates redundant descriptions |
| Eliminated ambiguity | One tool per workflow |
| Simpler selection | Smaller effective tool set |
| Better reliability | Less chance of wrong tool choice |

When NOT to Consolidate

  • Tools with fundamentally different behaviors
  • Tools used in different contexts
  • Tools that might be called independently

Architectural Reduction

The consolidation principle, taken to its logical extreme, leads to architectural reduction: removing most specialized tools in favor of primitive, general-purpose capabilities.

The File System Agent Pattern

Instead of custom tools for data exploration, schema lookup, and query validation, provide direct file system access through a single command execution tool:

# Instead of many specialized tools:
# - explore_database()
# - lookup_schema()
# - validate_query()
# - preview_results()

# Use one primitive:
execute_command("grep -r 'user_id' schema/")
execute_command("cat schema/users.sql")
execute_command("ls -la data/")
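
One possible shape for the `execute_command` primitive, sketched with `subprocess`. The timeout and truncation limit are illustrative choices, not requirements; bounding output protects the agent's context window, and returning the exit code lets the agent distinguish failure from empty output.

```python
import subprocess

def execute_command(command: str, timeout: int = 30, max_output: int = 10_000) -> dict:
    """Run a shell command and return a bounded, structured result."""
    proc = subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    output = proc.stdout + proc.stderr
    return {
        "exit_code": proc.returncode,
        "output": output[:max_output],      # keep the response context-friendly
        "truncated": len(output) > max_output,
    }
```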

Production Evidence: Vercel d0

| Metric | Before (17 tools) | After (2 tools) |
| --- | --- | --- |
| Success rate | 80% | 100% |
| Avg execution | 274s | 77s |
| Tools | 17 specialized | 2 primitives (bash + SQL) |

Key insight: The semantic layer was already good documentation. Claude just needed access to read files directly.

When Reduction Outperforms Complexity

Reduction works when:

  • Data layer is well-documented and consistent
  • Model has sufficient reasoning capability
  • Specialized tools were constraining, not enabling
  • Teams spend more time maintaining scaffolding than improving outcomes

Complexity is necessary when:

  • Underlying data is messy or poorly documented
  • Domain requires specialized knowledge the model lacks
  • Safety constraints require limiting agent capabilities
  • Operations truly benefit from structured workflows

Tool Description Engineering

Effective descriptions answer four questions:

| Question | What to Include |
| --- | --- |
| What does it do? | Clear, specific functionality. Avoid "helps with" or "can be used for" |
| When should it be used? | Specific triggers and contexts |
| What inputs? | Parameter types, constraints, defaults |
| What returns? | Output format, success/error examples |

Example: Well-Designed Tool

def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.

    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity

    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record

    Returns:
        Customer object with requested fields

    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """

Anti-Pattern: Poor Tool Design

def search(query):
    """Search the database."""  # ← Vague, missing everything

Problems: vague name, undocumented parameter, no return description, no usage context, no error handling.

Response Format Optimization

Tool response size significantly impacts context usage.

Format Options

def get_customer(customer_id: str, format: str = "concise"):
    record = fetch_customer_record(customer_id)  # full record from the data store (helper assumed)
    if format == "concise":
        return {"id": record["id"], "name": record["name"], "status": record["status"]}
    elif format == "detailed":
        return record

Include format guidance in tool descriptions so agents learn to select appropriately.

Error Message Design

Error messages serve two audiences: developers debugging and agents recovering.

Design for Recovery

# Bad: Generic error
raise ValueError("Invalid input")

# Good: Actionable for agents
raise ValueError(
    "INVALID_FORMAT: customer_id must match 'CUST-######'. "
    "Received: 'customer123'. "
    "Example valid ID: 'CUST-000001'"
)
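
A sketch of why the actionable form matters. The `get_customer` stub and `call_with_recovery` wrapper below are hypothetical, but they show how an error that names the expected pattern lets the caller correct the call instead of giving up.

```python
import re

CUSTOMER_ID_PATTERN = re.compile(r"^CUST-\d{6}$")

def get_customer(customer_id: str) -> dict:
    """Simplified stub of the customer lookup tool."""
    if not CUSTOMER_ID_PATTERN.match(customer_id):
        raise ValueError(
            f"INVALID_FORMAT: customer_id must match 'CUST-######'. "
            f"Received: '{customer_id}'. Example valid ID: 'CUST-000001'"
        )
    return {"id": customer_id, "status": "active"}  # placeholder record

def call_with_recovery(raw_id: str) -> dict:
    """Recover from the documented error by reformatting a bare numeric ID."""
    try:
        return get_customer(raw_id)
    except ValueError as err:
        if "INVALID_FORMAT" in str(err) and raw_id.isdigit():
            return get_customer(f"CUST-{int(raw_id):06d}")
        raise
```

An agent reading the error text performs the same recovery in its reasoning: the message supplies the pattern and a valid example, so the retry is informed rather than blind.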

MCP Tool Naming Requirements

When using MCP (Model Context Protocol) tools, always use fully qualified names:

# Correct: Fully qualified
"Use the BigQuery:bigquery_schema tool..."
"Use the GitHub:create_issue tool..."

# Incorrect: Unqualified (may fail with multiple servers)
"Use the bigquery_schema tool..."

Tool Collection Guidelines

Research shows tool description overlap causes model confusion. A reasonable guideline: 10-20 tools for most applications.

If more needed, use namespacing:

database:query
database:schema
web:search
web:fetch
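
One way to keep namespaced tools organized is a registry keyed by namespace, flattened into qualified names when the tool list is presented to the agent. The registry contents and the `qualified_tools` helper are illustrative.

```python
TOOL_REGISTRY: dict[str, dict[str, str]] = {
    "database": {
        "query": "Run a read-only SQL query against the primary database.",
        "schema": "Return the schema for a named table.",
    },
    "web": {
        "search": "Search the web and return ranked result snippets.",
        "fetch": "Fetch a URL and return its text content.",
    },
}

def qualified_tools(registry: dict[str, dict[str, str]]) -> list[str]:
    """Flatten a namespaced registry into 'namespace:tool' names."""
    return [f"{ns}:{name}" for ns, tools in registry.items() for name in tools]
```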

Using Agents to Optimize Tools

Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements.

def optimize_tool_description(tool_spec, failure_examples):
    prompt = f"""
    Analyze this tool specification and failures.

    Tool: {tool_spec}
    Failures: {failure_examples}

    Identify:
    1. Why agents are failing
    2. What information is missing
    3. What ambiguities cause incorrect usage

    Propose improved description.
    """
    return get_agent_response(prompt)

Result: improved descriptions cut task completion time by 40% by helping future agents avoid the same mistakes.

Guidelines

  1. Write descriptions that answer what, when, and what returns
  2. Use consolidation to reduce ambiguity
  3. Implement response format options for token efficiency
  4. Design error messages for agent recovery
  5. Establish and follow consistent naming conventions
  6. Limit tool count and use namespacing for organization
  7. Test tool designs with actual agent interactions
  8. Question whether each tool enables or constrains the model
  9. Prefer primitive, general-purpose tools over specialized wrappers
  10. Invest in documentation quality over tooling sophistication
  11. Build minimal architectures that benefit from model improvements

Skills

  • context-fundamentals - How tools interact with context
  • multi-agent-patterns - Specialized tools per agent
  • evaluation-framework - Evaluating tool effectiveness

Agents

  • tool-designer - Tool description generation and validation

Scripts

  • external/Agent-Skills-for-Context-Engineering/skills/tool-design/scripts/description_generator.py - Tool schema builder

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: tool-design

Completed:
- [x] Tool API defined with clear parameters and return types
- [x] Tool description answers: what, when, inputs, outputs
- [x] Consolidation analysis performed (reduced X tools to Y)
- [x] Response format options implemented (concise/detailed)
- [x] Error messages designed for agent recovery
- [x] Tool naming follows MCP qualification requirements
- [x] Agent-tested with actual invocations

Outputs:
- Tool schema with complete type definitions
- Tool description optimized for agent understanding
- Error message catalog with recovery guidance
- Usage examples with expected inputs/outputs
- Consolidation recommendation report (if applicable)
- Agent testing results with success/failure analysis

Completion Checklist

Before marking this skill as complete, verify:

  • Tool description explicitly states what the tool does
  • Tool description specifies when to use (and when NOT to use)
  • All parameters documented with types, constraints, defaults
  • Return values clearly specified with success/error formats
  • Error messages include actionable recovery guidance
  • Tool name is unique and descriptive (no generic names)
  • MCP tools use fully qualified names (Server:tool_name)
  • Response format options reduce token consumption where applicable
  • Agent tested tool with representative tasks
  • Tool performs better than replaced alternatives (if consolidating)

Failure Indicators

This skill has FAILED if:

  • ❌ Tool description vague ("helps with", "can be used for")
  • ❌ Parameters lack type definitions or constraints
  • ❌ Return format undocumented or inconsistent
  • ❌ Error messages generic ("Invalid input", "Error occurred")
  • ❌ Tool name ambiguous (conflicts with other tools)
  • ❌ MCP tools use unqualified names causing selection failures
  • ❌ No response format options for verbose operations
  • ❌ Agent repeatedly misuses tool in testing
  • ❌ Tool slower or less reliable than replaced alternatives
  • ❌ Consolidation created more confusion than it resolved

When NOT to Use

Do NOT use this skill when:

  • Designing human APIs - Different principles for developer-facing APIs
  • Single-use automation scripts - Tool overhead not justified
  • Well-established tool interfaces - Don't redesign working tools without reason
  • Non-agent systems - This skill is agent-specific
  • UI/UX design - Different design principles apply
  • Data schemas - Use data modeling skills instead
  • Simple function calls - Internal functions don't need tool-level design

Alternative approaches:

  • Use REST API design patterns for human-facing APIs
  • Use language idioms for internal function design
  • Use database design patterns for schema definition

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Generic tool names | "search", "process", "handle" - unclear | Specific names: "search_customers", "process_invoice" |
| Vague descriptions | "Helps with data" - ambiguous | "Retrieve customer by ID from PostgreSQL" |
| Missing when-to-use | Agent doesn't know appropriate context | Explicit triggers: "Use when user asks for customer details" |
| Undocumented parameters | Agent guesses parameter values | Document types, constraints, examples |
| Generic error messages | "Error occurred" - can't recover | "INVALID_FORMAT: ID must match CUST-######" |
| No response formats | Always returns verbose data | Offer concise/detailed options |
| Tool proliferation | 50+ similar tools confuse model | Consolidate overlapping functionality |
| Unqualified MCP names | Fails with multiple servers | Always use Server:tool_name format |
| No agent testing | Design assumptions untested | Test with real agent interactions |
| Over-consolidation | One tool does everything poorly | Balance consolidation with clarity |
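
Several of these anti-patterns can be caught mechanically before agent testing. A small linter sketch, with illustrative phrase and name lists chosen to match the table above:

```python
VAGUE_PHRASES = ("helps with", "can be used for", "handles")
GENERIC_NAMES = {"search", "process", "handle", "run", "do"}

def lint_tool(name: str, description: str) -> list[str]:
    """Flag generic names, vague phrasing, and missing when-to-use guidance."""
    problems = []
    if name.lower() in GENERIC_NAMES:
        problems.append(f"generic name: '{name}'")
    lowered = description.lower()
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            problems.append(f"vague description: contains '{phrase}'")
    if "use when" not in lowered:
        problems.append("missing when-to-use guidance")
    return problems
```

Running such a check over a tool set is cheap and catches the text-level anti-patterns; selection ambiguity and over-consolidation still require testing with real agent interactions.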

Principles

This skill embodies these CODITECT principles:

  • #2 First Principles - Tools as contracts between deterministic and non-deterministic systems
  • #5 Eliminate Ambiguity - Clear descriptions eliminate tool selection confusion
  • #6 Clear, Understandable, Explainable - Tool purpose and usage explicitly stated
  • #7 Error Handling Excellence - Error messages guide agent recovery
  • #10 Performance Matters - Response format options optimize token consumption
  • #12 Simplicity Scales - Consolidation principle reduces complexity
  • #13 Automate Repetitive Tasks - Agents can optimize their own tools
  • #16 Test Everything - Agent testing validates tool design

Full Principles: CODITECT-STANDARD-AUTOMATION.md