AI-Driven Testing & QA Automation

Overview

theia AI can automate testing and quality-assurance workflows by combining AI agents with browser automation and reporting tools.

App Tester Agent

Purpose

The App Tester agent is designed to:

  • Test browser-based applications autonomously
  • Generate test cases dynamically
  • Execute tests using Playwright
  • Detect and report bugs
  • Run comprehensive smoke tests
  • Provide detailed test results

How It Works

┌──────────────┐
│ User Request │
│ "Test my app"│
└──────┬───────┘
       │
       ▼
┌──────────────────┐
│    App Tester    │
│      Agent       │
└──────┬───────────┘
       │
       ▼
┌──────────────────┐          ┌──────────────┐
│  Playwright MCP  │◄────────►│   Browser    │
│      Server      │          │  Automation  │
└──────┬───────────┘          └──────────────┘
       │
       ▼
┌──────────────────┐
│   Test Results   │
│   Bug Reports    │
└──────────────────┘

Basic Usage

User: @AppTester test my calculator app running on localhost:3000

App Tester:
1. Opens browser to localhost:3000
2. Analyzes UI elements
3. Generates test cases
4. Executes each test
5. Reports findings

Example Session

User: @AppTester test my calculator app, focus on multiplication only. 
It is running on localhost:3000.

Agent Response:
Opening application at localhost:3000...

Test Case 1: Simple multiplication (5 × 6)
- Expected: 30
- Actual: 0.833333
- Status: ❌ FAILED

Issue detected: Multiplication button performs division instead

Test Case 2: Multiplication with zero (7 × 0)
- Expected: 0
- Actual: NaN
- Status: ❌ FAILED

Test Case 3: Negative numbers (-3 × 4)
- Expected: -12
- Actual: -0.75
- Status: ❌ FAILED

Summary:
- Total tests: 3
- Passed: 0
- Failed: 3
- Critical bug: Multiplication operation is incorrect

Setting Up Browser Testing

Prerequisites

  1. Playwright MCP Server

    • Installed via theia AI Configuration
    • Or via command line: npx @modelcontextprotocol/server-playwright
  2. Application to Test

    • Running locally or remotely
    • Accessible via URL
    • Browser-compatible

Installation Steps

# Option 1: First-time use
# App Tester will prompt to install Playwright MCP server

# Option 2: Manual installation
npm install -g @modelcontextprotocol/server-playwright

# Install browser binaries
npx playwright install

Configuration

// .theia/mcp-servers.json
{
  "servers": {
    "playwright": {
      "type": "executable",
      "command": "npx",
      "args": ["@modelcontextprotocol/server-playwright"],
      "settings": {
        "headless": false,  // Show browser during tests
        "slowMo": 100,      // Slow down for visibility
        "viewport": {
          "width": 1920,
          "height": 1080
        }
      }
    }
  }
}

Test Scenarios

Smoke Testing

User: @AppTester run smoke tests on my application at localhost:3000

App Tester will:
1. Navigate to the application
2. Test core workflows:
- User login/authentication
- Main navigation
- Key features
- Data submission
- Error handling
3. Report any issues found

Focused Testing

User: @AppTester test the user registration form at localhost:3000/register

App Tester will:
1. Focus on registration form
2. Test validation:
- Required fields
- Email format
- Password strength
- Matching password confirmation
3. Test submission
4. Verify success/error messages
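
The validation rules listed above can be expressed as a pure function the agent asserts against. This is an illustrative sketch, not a theia API; the field names and strength rules are assumptions:

```typescript
// Hypothetical registration checks mirroring the list above.
interface RegistrationInput {
  email: string;
  password: string;
  confirmPassword: string;
}

function validateRegistration(input: RegistrationInput): string[] {
  const errors: string[] = [];
  // Required fields and email format
  if (!input.email) errors.push("email is required");
  else if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input.email))
    errors.push("invalid email format");
  // Password strength: length, one uppercase letter, one digit (example policy)
  if (!input.password) errors.push("password is required");
  else if (
    input.password.length < 8 ||
    !/[A-Z]/.test(input.password) ||
    !/\d/.test(input.password)
  )
    errors.push("password too weak");
  // Matching confirmation
  if (input.password !== input.confirmPassword)
    errors.push("passwords do not match");
  return errors;
}
```

Each generated test case then reduces to one input object and an expected error list.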

Regression Testing

User: @AppTester I fixed the multiplication bug. 
Please verify the calculator still works correctly.

App Tester will:
1. Run comprehensive calculations
2. Verify multiplication works
3. Check other operations weren't broken
4. Confirm the UI is still responsive

Exploratory Testing

User: @AppTester explore my e-commerce app and find potential issues

App Tester will:
1. Navigate through different sections
2. Interact with various elements
3. Try edge cases
4. Look for UI glitches
5. Test error conditions
6. Report all findings

Advanced Testing Patterns

1. Test Case Generation

The agent can dynamically generate test cases based on:

  • Application structure
  • User stories
  • Requirements documents
  • Previous test results

Example:

User provides: User story document
Agent generates:
1. Happy path tests
2. Edge case tests
3. Error condition tests
4. Performance tests

2. Autonomous Testing Loops

Agent workflow:
1. Generate test case
2. Execute test
3. Record results
4. If bug found → report it
5. Generate next test case
6. Repeat until coverage complete

Stop conditions:
- All features tested
- Maximum iterations reached
- User interrupts
- Critical bug found
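
The loop and its stop conditions can be sketched as plain control flow. This is a simplified model, assuming hypothetical `generateNext` and `run` stand-ins for the agent's real test generation and Playwright execution:

```typescript
// Illustrative autonomous testing loop with the stop conditions listed above.
interface GeneratedTest { name: string; run: () => boolean }
interface LoopResult { executed: number; failures: string[]; stopped: string }

function testingLoop(
  generateNext: (iteration: number) => GeneratedTest | null, // null = coverage complete
  maxIterations = 50,
  stopOnFailure = false // treat any failure as critical
): LoopResult {
  const failures: string[] = [];
  let executed = 0;
  for (let i = 0; i < maxIterations; i++) {
    const test = generateNext(i);
    if (test === null) return { executed, failures, stopped: "coverage complete" };
    executed++;
    if (!test.run()) {
      failures.push(test.name); // bug found → report it here
      if (stopOnFailure) return { executed, failures, stopped: "critical bug" };
    }
  }
  return { executed, failures, stopped: "max iterations" };
}
```

User interruption would be a fourth exit path checked inside the loop.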

3. Visual Regression Testing

// App Tester can capture screenshots
Test Case: Homepage layout
1. Navigate to homepage
2. Capture screenshot
3. Compare with baseline
4. Report differences > threshold
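
The "differences > threshold" step boils down to comparing pixel buffers. A minimal sketch, assuming equal-size grayscale frames (real tools compare decoded RGBA screenshots):

```typescript
// Fraction of pixels that differ by more than `tolerance` between two frames.
function pixelDiffRatio(
  baseline: number[],
  current: number[],
  tolerance = 0
): number {
  if (baseline.length !== current.length)
    throw new Error("screenshots must have the same dimensions");
  let changed = 0;
  for (let i = 0; i < baseline.length; i++) {
    if (Math.abs(baseline[i] - current[i]) > tolerance) changed++;
  }
  return changed / baseline.length;
}

// A visual regression fails when the ratio exceeds the configured threshold:
const regressed = pixelDiffRatio([0, 0, 255, 255], [0, 64, 255, 255]) > 0.1;
```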

4. Accessibility Testing

User: @AppTester check accessibility of my app

Agent tests:
- Keyboard navigation
- Screen reader compatibility
- Color contrast
- ARIA labels
- Focus management
- Alt text for images
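
Of the checks above, color contrast is one the agent can compute directly rather than judge visually. This sketch follows the WCAG 2.x formula: relative luminance from linearized sRGB channels, then a ratio of (lighter + 0.05) / (darker + 0.05):

```typescript
// WCAG relative luminance for an sRGB color (channels 0-255).
function relativeLuminance(r: number, g: number, b: number): number {
  const lin = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

// Contrast ratio between foreground and background, from 1:1 up to 21:1.
function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const l1 = relativeLuminance(...fg);
  const l2 = relativeLuminance(...bg);
  const [light, dark] = l1 > l2 ? [l1, l2] : [l2, l1];
  return (light + 0.05) / (dark + 0.05);
}
```

WCAG AA requires at least 4.5:1 for normal text; black on white is the maximum, 21:1.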

Bug Reporting Integration

Automatic Bug Reporting

# Combined with bug reporter agent

User: @AppTester test my app and report any bugs to GitHub

Workflow:
1. App Tester finds bug
2. Delegates to Bug Reporter agent
3. Bug Reporter creates GitHub issue
4. App Tester continues testing

Result: Bugs automatically filed during testing

Bug Report Format

Title: [Critical] Multiplication button performs division

Severity: Critical
Component: Calculator
Steps to Reproduce:
1. Open calculator at localhost:3000
2. Click "7" button
3. Click "×" button
4. Click "6" button
5. Click "=" button

Expected: 42
Actual: 1.166667

Environment:
- Browser: Chrome 120
- OS: macOS
- Viewport: 1920x1080

Screenshot: [attached]
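
A report like the one above can be assembled from structured findings. The shape below is illustrative, not a fixed theia type:

```typescript
// Hypothetical finding shape; fields mirror the report format above.
interface BugFinding {
  severity: string;
  title: string;
  component: string;
  steps: string[];
  expected: string;
  actual: string;
}

function formatBugReport(bug: BugFinding): string {
  return [
    `Title: [${bug.severity}] ${bug.title}`,
    "",
    `Severity: ${bug.severity}`,
    `Component: ${bug.component}`,
    "Steps to Reproduce:",
    ...bug.steps.map((step, i) => `${i + 1}. ${step}`),
    "",
    `Expected: ${bug.expected}`,
    `Actual: ${bug.actual}`,
  ].join("\n");
}
```

Keeping the format mechanical makes reports consistent across agents and easy to post as issue bodies.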

Test Result Analysis

Success Metrics

Test Summary:
─────────────────────
Total Tests: 15
✅ Passed: 12 (80%)
❌ Failed: 3 (20%)

By Category:
- UI: 5/5 passed
- Forms: 4/5 passed (1 validation issue)
- Navigation: 3/3 passed
- Integration: 0/2 passed (2 API errors)

Critical Issues: 2
High Priority: 1
Medium Priority: 0

Detailed Reports

interface TestReport {
  timestamp: string;
  duration: number;
  testCases: TestCase[];
  summary: {
    total: number;
    passed: number;
    failed: number;
    skipped: number;
  };
  issues: Issue[];
  screenshots: string[];
  coverage: {
    features: string[];
    percentage: number;
  };
}
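
Given that shape, the summary block can be derived from the individual test cases rather than maintained by hand. A minimal sketch, assuming each case carries a status field:

```typescript
// Derive the summary and pass rate from raw test cases.
interface RecordedCase { name: string; status: "passed" | "failed" | "skipped" }

function summarize(testCases: RecordedCase[]) {
  const count = (status: RecordedCase["status"]) =>
    testCases.filter(tc => tc.status === status).length;
  const summary = {
    total: testCases.length,
    passed: count("passed"),
    failed: count("failed"),
    skipped: count("skipped"),
  };
  const passRate = summary.total === 0 ? 0 : summary.passed / summary.total;
  return { summary, passRate };
}
```

For the 15-test run shown earlier (12 passed, 3 failed), this yields a pass rate of 0.8.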

Providing Test Context

Application Description

# .theia/ai/app-description.md

Application: E-commerce Platform
URL: https://localhost:3000

Key Features:
- User registration/login
- Product browsing
- Shopping cart
- Checkout process
- Order history

Test Priorities:
1. Checkout flow (critical)
2. Cart operations (high)
3. Search functionality (medium)

Known Issues:
- Payment gateway is mocked in dev
- Some products have placeholder images

Test Scenarios File

# test-scenarios.md

## User Registration
1. Valid registration
- Email: test@example.com
- Password: Test123!

2. Invalid cases:
- Empty fields
- Invalid email format
- Weak password
- Mismatched passwords

## Shopping Cart
1. Add single item
2. Add multiple items
3. Update quantities
4. Remove items
5. Cart persistence

Playwright Integration Details

Available Playwright Functions

// Navigation
playwright_navigate(url: string)
playwright_go_back()
playwright_go_forward()
playwright_reload()

// Interaction
playwright_click(selector: string)
playwright_fill(selector: string, text: string)
playwright_press(selector: string, key: string)
playwright_check(selector: string)
playwright_uncheck(selector: string)
playwright_select_option(selector: string, value: string)

// Assertions
playwright_is_visible(selector: string)
playwright_is_enabled(selector: string)
playwright_get_text(selector: string)
playwright_get_attribute(selector: string, name: string)

// Screenshots
playwright_screenshot(path?: string, fullPage?: boolean)
playwright_screenshot_element(selector: string, path?: string)

// Waiting
playwright_wait_for_selector(selector: string, timeout?: number)
playwright_wait_for_url(url: string, timeout?: number)
playwright_wait_for_load_state(state: 'load' | 'domcontentloaded' | 'networkidle')

Selector Strategies

// By ID
'#submit-button'

// By class
'.btn-primary'

// By text
'text=Submit'

// By role
'role=button[name="Submit"]'

// By test ID
'[data-testid="login-form"]'

// Combined
'form >> button:has-text("Submit")'
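
When tests are generated programmatically, small helpers keep selectors consistent. These are illustrative conveniences, not part of the Playwright API:

```typescript
// Build the test-id and role selectors shown above from plain strings.
function byTestId(id: string): string {
  return `[data-testid="${id}"]`;
}

function byRole(role: string, name: string): string {
  return `role=${role}[name="${name}"]`;
}
```

A stable `data-testid` attribute is usually the most robust target for generated tests, since it survives styling and copy changes.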

Error Handling

Common Issues

  1. Element Not Found

    Error: Selector '#submit-btn' not found

    Agent response:
    - Waits for element (default 30s)
    - Tries alternative selectors
    - Reports if element never appears
  2. Timeout

    Error: Navigation timeout

    Agent response:
    - Increases timeout
    - Checks network issues
    - Verifies URL accessibility
  3. Interaction Blocked

    Error: Element is not clickable

    Agent response:
    - Scrolls element into view
    - Waits for animations
    - Checks for overlays

Retry Logic

// Retries a click up to maxRetries times, backing off 1s, 2s, ... between
// attempts. Assumes a Playwright `page` object is in scope.
async function robustClick(selector: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await page.waitForSelector(selector, { state: 'visible' });
      await page.click(selector);
      return;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await page.waitForTimeout(1000 * (i + 1));
    }
  }
}
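
The same backoff pattern applies to any flaky step, not just clicks. A generic version with no Playwright dependency might look like this (names are illustrative):

```typescript
// Retry an async action with linear backoff; rethrows the last error on failure.
async function withRetries<T>(
  action: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await action();
    } catch (error) {
      lastError = error;
      if (i < maxRetries - 1) {
        // Wait baseDelayMs, 2*baseDelayMs, ... before the next attempt
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * (i + 1)));
      }
    }
  }
  throw lastError;
}
```

Any Playwright call can then be wrapped, e.g. `withRetries(() => page.click(selector))`.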

Performance Considerations

Test Speed

Factors affecting speed:
- Network latency
- Application response time
- Number of test cases
- Screenshot capture
- Detailed logging

Optimization:
- Run tests in parallel (if supported)
- Skip non-critical tests
- Use headless mode
- Limit screenshots
- Increase timeouts only when needed

Resource Usage

Resources consumed:
- Browser instance (RAM: 200-500MB)
- MCP server process (RAM: 50-100MB)
- Agent tokens (varies by test complexity)

Management:
- Close browser after tests
- Limit concurrent test runs
- Monitor token usage
- Clean up screenshots

Best Practices

1. Clear Instructions

❌ "Test my app"
✅ "Test the checkout flow of my e-commerce app at localhost:3000,
focusing on cart operations and payment form validation"

2. Provide Context

Before testing:
- Application URL
- Login credentials (if needed)
- Test data requirements
- Known limitations
- Priority areas

3. Incremental Testing

1. First: Core functionality
2. Then: Edge cases
3. Finally: Error conditions

Don't try to test everything at once

4. Review Results

After testing:
1. Review failed tests
2. Verify bug reports
3. Check screenshots
4. Prioritize fixes
5. Re-test after fixes

5. Document Findings

Keep test reports:
- Create test-results.md
- Include timestamps
- Track bug status
- Note environment details

Integration Patterns

With CI/CD

# .github/workflows/ai-testing.yml
name: AI Testing

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Start application
        run: npm start &
      - name: Run AI tests
        run: |
          theia-cli ai test \
            --agent app-tester \
            --url localhost:3000 \
            --report test-results.json

With Issue Tracking

// Automatically create issues for failures
const testResults = await runTests();

for (const failure of testResults.failures) {
  await github.issues.create({
    title: `[Test Failure] ${failure.testCase}`,
    body: formatBugReport(failure),
    labels: ['bug', 'automated-test']
  });
}

With Monitoring

// Send test metrics to monitoring
await monitoring.recordMetric('test.duration', testResults.duration);
await monitoring.recordMetric('test.success_rate', testResults.passRate);

if (testResults.passRate < 0.8) {
  await alerts.send('Test success rate below threshold');
}

Troubleshooting

Agent Not Finding Elements

Problem: Agent reports elements not found

Solutions:

  1. Check selectors are correct
  2. Wait for page to load completely
  3. Check for dynamic content
  4. Verify no authentication required
  5. Look for shadow DOM

Tests Taking Too Long

Problem: Tests run very slowly

Solutions:

  1. Check network connectivity
  2. Reduce slowMo setting
  3. Use headless mode
  4. Optimize application performance
  5. Limit screenshot capture

Inconsistent Results

Problem: Tests pass sometimes, fail others

Solutions:

  1. Add explicit waits
  2. Check for race conditions
  3. Increase timeouts
  4. Look for animation issues
  5. Verify data consistency

Future Enhancements

Planned capabilities:

  • Cross-browser testing (Chrome, Firefox, Safari)
  • Mobile testing (iOS, Android simulators)
  • Performance testing (load time, metrics)
  • API testing (REST, GraphQL)
  • Database validation (check data integrity)
  • Visual diff (pixel-perfect comparison)

Summary

AI-driven testing in theia provides:

✅ Autonomous - Tests run without human intervention
✅ Intelligent - Generates test cases dynamically
✅ Comprehensive - Covers UI, functionality, errors
✅ Integrated - Works with bug tracking
✅ Fast - Parallel execution possible
✅ Detailed - Rich reports with screenshots

Key workflow:

  1. Start application
  2. Invoke App Tester agent
  3. Agent generates and runs tests
  4. Bugs automatically reported
  5. Review results
  6. Fix issues
  7. Re-test

This dramatically reduces manual QA effort while increasing test coverage and consistency.