Tool Design for Agents
Tools are the primary mechanism through which agents interact with the world. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix.
When to Use
✅ Use this skill when:
- Creating new tools for agent systems
- Debugging tool-related failures or misuse
- Optimizing existing tool sets for better performance
- Designing tool APIs from scratch
- Evaluating third-party tools for agent integration
- Standardizing tool conventions across a codebase
❌ Don't use this skill when:
- Simple single-tool applications
- Traditional developer-facing API design
- Non-agent applications
Core Concepts
Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract. Agents must infer the contract from descriptions and generate matching calls.
Tool descriptions are prompt engineering that shapes how agents reason about tool use. The descriptions collectively steer agent behavior.
The Consolidation Principle
If a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better.
This leads to a preference for single comprehensive tools over multiple narrow tools.
Example: Scheduling
Before (Multiple narrow tools):
- list_users - List all users
- list_events - List all events
- create_event - Create event
- check_availability - Check user availability
After (Consolidated):
- schedule_event - Find availability and schedule (handles the full workflow internally)
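A consolidated scheduling tool of this shape might look like the sketch below. It is a minimal illustration, not a production implementation: the in-memory `BUSY` calendar and the 15-minute search step are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Toy in-memory busy calendars (hypothetical data for the sketch).
BUSY = {
    "alice": [(datetime(2024, 1, 8, 9), datetime(2024, 1, 8, 11))],
    "bob": [(datetime(2024, 1, 8, 10), datetime(2024, 1, 8, 12))],
}
EVENTS = []

def schedule_event(attendees, duration_minutes, earliest, latest):
    """Find the first slot free for every attendee and book it.

    One workflow-level tool replaces list_users / check_availability /
    create_event, so the agent makes a single call.
    """
    dur = timedelta(minutes=duration_minutes)
    t = earliest
    while t + dur <= latest:
        # A slot conflicts if it overlaps any attendee's busy interval.
        conflict = any(
            t < end and start < t + dur
            for a in attendees
            for start, end in BUSY.get(a, [])
        )
        if not conflict:
            EVENTS.append({"attendees": attendees, "start": t})
            return {"status": "SCHEDULED", "start": t.isoformat()}
        t += timedelta(minutes=15)
    return {"status": "NO_SLOT"}
```

The key design choice is that the agent sees one call with one clear contract; the find-then-book logic stays inside the tool where it is deterministic.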
Why Consolidation Works
| Benefit | Explanation |
|---|---|
| Reduced token consumption | Eliminates redundant descriptions |
| Eliminated ambiguity | One tool per workflow |
| Simpler selection | Smaller effective tool set |
| Better reliability | Less chance of wrong tool choice |
When NOT to Consolidate
- Tools with fundamentally different behaviors
- Tools used in different contexts
- Tools that might be called independently
Architectural Reduction
The consolidation principle, taken to its logical extreme, leads to architectural reduction: removing most specialized tools in favor of primitive, general-purpose capabilities.
The File System Agent Pattern
Instead of custom tools for data exploration, schema lookup, and query validation, provide direct file system access through a single command execution tool:
# Instead of many specialized tools:
# - explore_database()
# - lookup_schema()
# - validate_query()
# - preview_results()
# Use one primitive:
execute_command("grep -r 'user_id' schema/")
execute_command("cat schema/users.sql")
execute_command("ls -la data/")
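A minimal sketch of such a primitive, assuming Python and `subprocess`; the output caps are illustrative, and a production version would add sandboxing (command allow-lists, working-directory restrictions, resource limits):

```python
import subprocess

def execute_command(command: str, timeout: int = 30) -> dict:
    """Run a shell command and return exit code, stdout, and stderr.

    One primitive replaces many specialized read-only tools. Output is
    truncated so a single call cannot flood the agent's context window.
    """
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return {
        "exit_code": result.returncode,
        "stdout": result.stdout[:10_000],  # cap to protect context
        "stderr": result.stderr[:2_000],
    }
```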
Production Evidence: Vercel d0
| Metric | Before (17 tools) | After (2 tools) |
|---|---|---|
| Success rate | 80% | 100% |
| Avg execution | 274s | 77s |
| Tools | 17 specialized | 2 primitives (bash + SQL) |
Key insight: The semantic layer was already good documentation. Claude just needed access to read files directly.
When Reduction Outperforms Complexity
✅ Reduction works when:
- Data layer is well-documented and consistent
- Model has sufficient reasoning capability
- Specialized tools were constraining, not enabling
- Teams spend more time maintaining scaffolding than improving outcomes
❌ Complexity is necessary when:
- Underlying data is messy or poorly documented
- Domain requires specialized knowledge the model lacks
- Safety constraints require limiting agent capabilities
- Operations truly benefit from structured workflows
Tool Description Engineering
Effective descriptions answer four questions:
| Question | What to Include |
|---|---|
| What does it do? | Clear, specific functionality. Avoid "helps with" or "can be used for" |
| When should it be used? | Specific triggers and contexts |
| What inputs? | Parameter types, constraints, defaults |
| What returns? | Output format, success/error examples |
Example: Well-Designed Tool
def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.

    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity

    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record

    Returns:
        Customer object with requested fields

    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """
Anti-Pattern: Poor Tool Design
def search(query):
    """Search the database."""  # ← Vague, missing everything
Problems: Vague name, missing parameters, no return description, no usage context, no error handling.
Response Format Optimization
Tool response size significantly impacts context usage.
Format Options
def get_customer(customer_id: str, format: str = "concise"):
    record = fetch_customer_record(customer_id)  # full record from the data store
    if format == "concise":
        return {k: record[k] for k in ("id", "name", "status")}
    elif format == "detailed":
        return record
Include format guidance in tool descriptions so agents learn to select appropriately.
Error Message Design
Error messages serve two audiences: developers debugging and agents recovering.
Design for Recovery
# Bad: Generic error
raise ValueError("Invalid input")

# Good: Actionable for agents
raise ValueError(
    "INVALID_FORMAT: customer_id must match 'CUST-######'. "
    "Received: 'customer123'. "
    "Example valid ID: 'CUST-000001'"
)
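A validation helper built on this pattern might look like the following sketch, using the CUST-###### format from the earlier example. The error message names the rule, echoes the bad input, and shows a valid example, which is exactly what an agent needs to self-correct:

```python
import re

CUSTOMER_ID_RE = re.compile(r"^CUST-\d{6}$")

def validate_customer_id(customer_id: str) -> None:
    """Raise an actionable error the agent can recover from."""
    if not CUSTOMER_ID_RE.match(customer_id):
        raise ValueError(
            f"INVALID_FORMAT: customer_id must match 'CUST-######'. "
            f"Received: '{customer_id}'. "
            f"Example valid ID: 'CUST-000001'"
        )
```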
MCP Tool Naming Requirements
When using MCP (Model Context Protocol) tools, always use fully qualified names:
# Correct: Fully qualified
"Use the BigQuery:bigquery_schema tool..."
"Use the GitHub:create_issue tool..."
# Incorrect: Unqualified (may fail with multiple servers)
"Use the bigquery_schema tool..."
Tool Collection Guidelines
Research shows tool description overlap causes model confusion. A reasonable guideline: 10-20 tools for most applications.
If more needed, use namespacing:
database:query
database:schema
web:search
web:fetch
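One way to enforce namespacing is a registry keyed by fully qualified names, as in this sketch (the `ToolRegistry` class is hypothetical, not part of any particular framework). Note that the lookup-failure message lists the available tools, applying the error-design guidance above:

```python
class ToolRegistry:
    """Registry keyed by 'namespace:tool' to keep names unambiguous."""

    def __init__(self):
        self._tools = {}

    def register(self, namespace: str, name: str, fn):
        self._tools[f"{namespace}:{name}"] = fn

    def call(self, qualified_name: str, *args, **kwargs):
        if qualified_name not in self._tools:
            # Actionable error: tell the agent what it can call instead.
            available = ", ".join(sorted(self._tools))
            raise KeyError(
                f"UNKNOWN_TOOL: '{qualified_name}'. Available: {available}"
            )
        return self._tools[qualified_name](*args, **kwargs)
```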
Using Agents to Optimize Tools
Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements.
def optimize_tool_description(tool_spec, failure_examples):
    prompt = f"""
    Analyze this tool specification and failures.

    Tool: {tool_spec}
    Failures: {failure_examples}

    Identify:
    1. Why agents are failing
    2. What information is missing
    3. What ambiguities cause incorrect usage

    Propose improved description.
    """
    return get_agent_response(prompt)
Result: 40% reduction in task completion time by helping future agents avoid mistakes.
Guidelines
- Write descriptions that answer what the tool does, when to use it, what inputs it takes, and what it returns
- Use consolidation to reduce ambiguity
- Implement response format options for token efficiency
- Design error messages for agent recovery
- Establish and follow consistent naming conventions
- Limit tool count and use namespacing for organization
- Test tool designs with actual agent interactions
- Question whether each tool enables or constrains the model
- Prefer primitive, general-purpose tools over specialized wrappers
- Invest in documentation quality over tooling sophistication
- Build minimal architectures that benefit from model improvements
Related Components
Skills
- context-fundamentals - How tools interact with context
- multi-agent-patterns - Specialized tools per agent
- evaluation-framework - Evaluating tool effectiveness
Agents
- tool-designer - Tool description generation and validation
Scripts
- external/Agent-Skills-for-Context-Engineering/skills/tool-design/scripts/description_generator.py - Tool schema builder
Success Output
When successful, this skill MUST output:
✅ SKILL COMPLETE: tool-design
Completed:
- [x] Tool API defined with clear parameters and return types
- [x] Tool description answers: what, when, inputs, outputs
- [x] Consolidation analysis performed (reduced X tools to Y)
- [x] Response format options implemented (concise/detailed)
- [x] Error messages designed for agent recovery
- [x] Tool naming follows MCP qualification requirements
- [x] Agent-tested with actual invocations
Outputs:
- Tool schema with complete type definitions
- Tool description optimized for agent understanding
- Error message catalog with recovery guidance
- Usage examples with expected inputs/outputs
- Consolidation recommendation report (if applicable)
- Agent testing results with success/failure analysis
Completion Checklist
Before marking this skill as complete, verify:
- Tool description explicitly states what the tool does
- Tool description specifies when to use (and when NOT to use)
- All parameters documented with types, constraints, defaults
- Return values clearly specified with success/error formats
- Error messages include actionable recovery guidance
- Tool name is unique and descriptive (no generic names)
- MCP tools use fully qualified names (Server:tool_name)
- Response format options reduce token consumption where applicable
- Agent tested tool with representative tasks
- Tool performs better than replaced alternatives (if consolidating)
Failure Indicators
This skill has FAILED if:
- ❌ Tool description vague ("helps with", "can be used for")
- ❌ Parameters lack type definitions or constraints
- ❌ Return format undocumented or inconsistent
- ❌ Error messages generic ("Invalid input", "Error occurred")
- ❌ Tool name ambiguous (conflicts with other tools)
- ❌ MCP tools use unqualified names causing selection failures
- ❌ No response format options for verbose operations
- ❌ Agent repeatedly misuses tool in testing
- ❌ Tool slower or less reliable than replaced alternatives
- ❌ Consolidation created more confusion than it resolved
When NOT to Use
Do NOT use this skill when:
- Designing human APIs - Different principles for developer-facing APIs
- Single-use automation scripts - Tool overhead not justified
- Well-established tool interfaces - Don't redesign working tools without reason
- Non-agent systems - This skill is agent-specific
- UI/UX design - Different design principles apply
- Data schemas - Use data modeling skills instead
- Simple function calls - Internal functions don't need tool-level design
Alternative approaches:
- Use REST API design patterns for human-facing APIs
- Use language idioms for internal function design
- Use database design patterns for schema definition
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Generic tool names | "search", "process", "handle" - unclear | Specific names: "search_customers", "process_invoice" |
| Vague descriptions | "Helps with data" - ambiguous | "Retrieve customer by ID from PostgreSQL" |
| Missing when-to-use | Agent doesn't know appropriate context | Explicit triggers: "Use when user asks for customer details" |
| Undocumented parameters | Agent guesses parameter values | Document types, constraints, examples |
| Generic error messages | "Error occurred" - can't recover | "INVALID_FORMAT: ID must match CUST-######" |
| No response formats | Always returns verbose data | Offer concise/detailed options |
| Tool proliferation | 50+ similar tools confuse model | Consolidate overlapping functionality |
| Unqualified MCP names | Fails with multiple servers | Always use Server:tool_name format |
| No agent testing | Design assumptions untested | Test with real agent interactions |
| Over-consolidation | One tool does everything poorly | Balance consolidation with clarity |
Principles
This skill embodies these CODITECT principles:
- #2 First Principles - Tools as contracts between deterministic and non-deterministic systems
- #5 Eliminate Ambiguity - Clear descriptions eliminate tool selection confusion
- #6 Clear, Understandable, Explainable - Tool purpose and usage explicitly stated
- #7 Error Handling Excellence - Error messages guide agent recovery
- #10 Performance Matters - Response format options optimize token consumption
- #12 Simplicity Scales - Consolidation principle reduces complexity
- #13 Automate Repetitive Tasks - Agents can optimize their own tools
- #16 Test Everything - Agent testing validates tool design
Full Principles: CODITECT-STANDARD-AUTOMATION.md