AI-Driven Testing & QA Automation

Overview

theia AI can automate testing and quality-assurance workflows by combining AI agents with browser automation and reporting tools.

App Tester Agent

Purpose

The App Tester agent is designed to:

  • Test browser-based applications autonomously
  • Generate test cases dynamically
  • Execute tests using Playwright
  • Detect and report bugs
  • Run comprehensive smoke tests
  • Provide detailed test results

How It Works

┌──────────────┐
│ User Request │
│ "Test my app"│
└──────┬───────┘
       │
       ▼
┌──────────────────┐
│    App Tester    │
│      Agent       │
└──────┬───────────┘
       │
       ▼
┌──────────────────┐          ┌──────────────┐
│  Playwright MCP  │◄────────►│   Browser    │
│      Server      │          │  Automation  │
└──────┬───────────┘          └──────────────┘
       │
       ▼
┌──────────────────┐
│   Test Results   │
│   Bug Reports    │
└──────────────────┘

Basic Usage

User: @AppTester test my calculator app running on localhost:3000

App Tester:
1. Opens browser to localhost:3000
2. Analyzes UI elements
3. Generates test cases
4. Executes each test
5. Reports findings

Example Session

User: @AppTester test my calculator app, focus on multiplication only. 
It is running on localhost:3000.

Agent Response:
Opening application at localhost:3000...

Test Case 1: Simple multiplication (5 × 6)
- Expected: 30
- Actual: 0.833333
- Status: ❌ FAILED

Issue detected: Multiplication button performs division instead

Test Case 2: Multiplication with zero (7 × 0)
- Expected: 0
- Actual: NaN
- Status: ❌ FAILED

Test Case 3: Negative numbers (-3 × 4)
- Expected: -12
- Actual: -0.75
- Status: ❌ FAILED

Summary:
- Total tests: 3
- Passed: 0
- Failed: 3
- Critical bug: Multiplication operation is incorrect

Setting Up Browser Testing

Prerequisites

  1. Playwright MCP Server

    • Installed via theia AI Configuration
    • Or via command line: npx @modelcontextprotocol/server-playwright
  2. Application to Test

    • Running locally or remotely
    • Accessible via URL
    • Browser-compatible

Installation Steps

# Option 1: First-time use
# App Tester will prompt to install Playwright MCP server

# Option 2: Manual installation
npm install -g @modelcontextprotocol/server-playwright

# Install browser binaries
npx playwright install

Configuration

// .theia/mcp-servers.json
{
  "servers": {
    "playwright": {
      "type": "executable",
      "command": "npx",
      "args": ["@modelcontextprotocol/server-playwright"],
      "settings": {
        "headless": false,  // Show browser during tests
        "slowMo": 100,      // Slow down for visibility
        "viewport": {
          "width": 1920,
          "height": 1080
        }
      }
    }
  }
}

Test Scenarios

Smoke Testing

User: @AppTester run smoke tests on my application at localhost:3000

App Tester will:
1. Navigate to the application
2. Test core workflows:
- User login/authentication
- Main navigation
- Key features
- Data submission
- Error handling
3. Report any issues found

Focused Testing

User: @AppTester test the user registration form at localhost:3000/register

App Tester will:
1. Focus on registration form
2. Test validation:
- Required fields
- Email format
- Password strength
- Matching password confirmation
3. Test submission
4. Verify success/error messages
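
The validation rules listed above can be expressed as a pure function the agent asserts against. This is an illustrative sketch, not a theia API; the field names and strength rules are assumptions:

```typescript
// Hypothetical registration checks mirroring the list above.
interface RegistrationInput {
  email: string;
  password: string;
  confirmPassword: string;
}

function validateRegistration(input: RegistrationInput): string[] {
  const errors: string[] = [];
  // Required fields and email format
  if (!input.email) errors.push("email is required");
  else if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input.email))
    errors.push("invalid email format");
  // Password strength: length, one uppercase letter, one digit (example policy)
  if (!input.password) errors.push("password is required");
  else if (
    input.password.length < 8 ||
    !/[A-Z]/.test(input.password) ||
    !/\d/.test(input.password)
  )
    errors.push("password too weak");
  // Matching confirmation
  if (input.password !== input.confirmPassword)
    errors.push("passwords do not match");
  return errors;
}
```

Each generated test case then reduces to one input object and an expected error list.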

Regression Testing

User: @AppTester I fixed the multiplication bug. 
Please verify the calculator still works correctly.

App Tester will:
1. Run comprehensive calculations
2. Verify multiplication works
3. Check other operations weren't broken
4. Confirm the UI is still responsive

Exploratory Testing

User: @AppTester explore my e-commerce app and find potential issues

App Tester will:
1. Navigate through different sections
2. Interact with various elements
3. Try edge cases
4. Look for UI glitches
5. Test error conditions
6. Report all findings

Advanced Testing Patterns

1. Test Case Generation

The agent can dynamically generate test cases based on:

  • Application structure
  • User stories
  • Requirements documents
  • Previous test results

Example:

User provides: User story document
Agent generates:
1. Happy path tests
2. Edge case tests
3. Error condition tests
4. Performance tests

2. Autonomous Testing Loops

Agent workflow:
1. Generate test case
2. Execute test
3. Record results
4. If bug found → report it
5. Generate next test case
6. Repeat until coverage complete

Stop conditions:
- All features tested
- Maximum iterations reached
- User interrupts
- Critical bug found
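
The loop and its stop conditions can be sketched as plain control flow. This is a simplified model, assuming hypothetical `generateNext` and `run` stand-ins for the agent's real test generation and Playwright execution:

```typescript
// Illustrative autonomous testing loop with the stop conditions listed above.
interface GeneratedTest { name: string; run: () => boolean }
interface LoopResult { executed: number; failures: string[]; stopped: string }

function testingLoop(
  generateNext: (iteration: number) => GeneratedTest | null, // null = coverage complete
  maxIterations = 50,
  stopOnFailure = false // treat any failure as critical
): LoopResult {
  const failures: string[] = [];
  let executed = 0;
  for (let i = 0; i < maxIterations; i++) {
    const test = generateNext(i);
    if (test === null) return { executed, failures, stopped: "coverage complete" };
    executed++;
    if (!test.run()) {
      failures.push(test.name); // bug found → report it here
      if (stopOnFailure) return { executed, failures, stopped: "critical bug" };
    }
  }
  return { executed, failures, stopped: "max iterations" };
}
```

User interruption would be a fourth exit path checked inside the loop.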

3. Visual Regression Testing

// App Tester can capture screenshots
Test Case: Homepage layout
1. Navigate to homepage
2. Capture screenshot
3. Compare with baseline
4. Report differences > threshold
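
The "differences > threshold" step boils down to comparing pixel buffers. A minimal sketch, assuming equal-size grayscale frames (real tools compare decoded RGBA screenshots):

```typescript
// Fraction of pixels that differ by more than `tolerance` between two frames.
function pixelDiffRatio(
  baseline: number[],
  current: number[],
  tolerance = 0
): number {
  if (baseline.length !== current.length)
    throw new Error("screenshots must have the same dimensions");
  let changed = 0;
  for (let i = 0; i < baseline.length; i++) {
    if (Math.abs(baseline[i] - current[i]) > tolerance) changed++;
  }
  return changed / baseline.length;
}

// A visual regression fails when the ratio exceeds the configured threshold:
const regressed = pixelDiffRatio([0, 0, 255, 255], [0, 64, 255, 255]) > 0.1;
```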

4. Accessibility Testing

User: @AppTester check accessibility of my app

Agent tests:
- Keyboard navigation
- Screen reader compatibility
- Color contrast
- ARIA labels
- Focus management
- Alt text for images
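
Of the checks above, color contrast is one the agent can compute directly rather than judge visually. This sketch follows the WCAG 2.x formula: relative luminance from linearized sRGB channels, then a ratio of (lighter + 0.05) / (darker + 0.05):

```typescript
// WCAG relative luminance for an sRGB color (channels 0-255).
function relativeLuminance(r: number, g: number, b: number): number {
  const lin = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

// Contrast ratio between foreground and background, from 1:1 up to 21:1.
function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const l1 = relativeLuminance(...fg);
  const l2 = relativeLuminance(...bg);
  const [light, dark] = l1 > l2 ? [l1, l2] : [l2, l1];
  return (light + 0.05) / (dark + 0.05);
}
```

WCAG AA requires at least 4.5:1 for normal text; black on white is the maximum, 21:1.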

Bug Reporting Integration

Automatic Bug Reporting

# Combined with bug reporter agent

User: @AppTester test my app and report any bugs to GitHub

Workflow:
1. App Tester finds bug
2. Delegates to Bug Reporter agent
3. Bug Reporter creates GitHub issue
4. App Tester continues testing

Result: Bugs automatically filed during testing

Bug Report Format

Title: [Critical] Multiplication button performs division

Severity: Critical
Component: Calculator
Steps to Reproduce:
1. Open calculator at localhost:3000
2. Click "7" button
3. Click "×" button
4. Click "6" button
5. Click "=" button

Expected: 42
Actual: 1.166667

Environment:
- Browser: Chrome 120
- OS: macOS
- Viewport: 1920x1080

Screenshot: [attached]
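
A report like the one above can be assembled from structured findings. The shape below is illustrative, not a fixed theia type:

```typescript
// Hypothetical finding shape; fields mirror the report format above.
interface BugFinding {
  severity: string;
  title: string;
  component: string;
  steps: string[];
  expected: string;
  actual: string;
}

function formatBugReport(bug: BugFinding): string {
  return [
    `Title: [${bug.severity}] ${bug.title}`,
    "",
    `Severity: ${bug.severity}`,
    `Component: ${bug.component}`,
    "Steps to Reproduce:",
    ...bug.steps.map((step, i) => `${i + 1}. ${step}`),
    "",
    `Expected: ${bug.expected}`,
    `Actual: ${bug.actual}`,
  ].join("\n");
}
```

Keeping the format mechanical makes reports consistent across agents and easy to post as issue bodies.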

Test Result Analysis

Success Metrics

Test Summary:
─────────────────────
Total Tests: 15
✅ Passed: 12 (80%)
❌ Failed: 3 (20%)

By Category:
- UI: 5/5 passed
- Forms: 4/5 passed (1 validation issue)
- Navigation: 3/3 passed
- Integration: 0/2 passed (2 API errors)

Critical Issues: 2
High Priority: 1
Medium Priority: 0

Detailed Reports

interface TestReport {
  timestamp: string;
  duration: number;
  testCases: TestCase[];
  summary: {
    total: number;
    passed: number;
    failed: number;
    skipped: number;
  };
  issues: Issue[];
  screenshots: string[];
  coverage: {
    features: string[];
    percentage: number;
  };
}
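
Given that shape, the summary block can be derived from the individual test cases rather than maintained by hand. A minimal sketch, assuming each case carries a status field:

```typescript
// Derive the summary and pass rate from raw test cases.
interface RecordedCase { name: string; status: "passed" | "failed" | "skipped" }

function summarize(testCases: RecordedCase[]) {
  const count = (status: RecordedCase["status"]) =>
    testCases.filter(tc => tc.status === status).length;
  const summary = {
    total: testCases.length,
    passed: count("passed"),
    failed: count("failed"),
    skipped: count("skipped"),
  };
  const passRate = summary.total === 0 ? 0 : summary.passed / summary.total;
  return { summary, passRate };
}
```

For the 15-test run shown earlier (12 passed, 3 failed), this yields a pass rate of 0.8.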

Providing Test Context

Application Description

# .theia/ai/app-description.md

Application: E-commerce Platform
URL: https://localhost:3000

Key Features:
- User registration/login
- Product browsing
- Shopping cart
- Checkout process
- Order history

Test Priorities:
1. Checkout flow (critical)
2. Cart operations (high)
3. Search functionality (medium)

Known Issues:
- Payment gateway is mocked in dev
- Some products have placeholder images

Test Scenarios File

# test-scenarios.md

## User Registration
1. Valid registration
- Email: test@example.com
- Password: Test123!

2. Invalid cases:
- Empty fields
- Invalid email format
- Weak password
- Mismatched passwords

## Shopping Cart
1. Add single item
2. Add multiple items
3. Update quantities
4. Remove items
5. Cart persistence

Playwright Integration Details

Available Playwright Functions

// Navigation
playwright_navigate(url: string)
playwright_go_back()
playwright_go_forward()
playwright_reload()

// Interaction
playwright_click(selector: string)
playwright_fill(selector: string, text: string)
playwright_press(selector: string, key: string)
playwright_check(selector: string)
playwright_uncheck(selector: string)
playwright_select_option(selector: string, value: string)

// Assertions
playwright_is_visible(selector: string)
playwright_is_enabled(selector: string)
playwright_get_text(selector: string)
playwright_get_attribute(selector: string, name: string)

// Screenshots
playwright_screenshot(path?: string, fullPage?: boolean)
playwright_screenshot_element(selector: string, path?: string)

// Waiting
playwright_wait_for_selector(selector: string, timeout?: number)
playwright_wait_for_url(url: string, timeout?: number)
playwright_wait_for_load_state(state: 'load' | 'domcontentloaded' | 'networkidle')

Selector Strategies

// By ID
'#submit-button'

// By class
'.btn-primary'

// By text
'text=Submit'

// By role
'role=button[name="Submit"]'

// By test ID
'[data-testid="login-form"]'

// Combined
'form >> button:has-text("Submit")'
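
When tests are generated programmatically, small helpers keep selectors consistent. These are illustrative conveniences, not part of the Playwright API:

```typescript
// Build the test-id and role selectors shown above from plain strings.
function byTestId(id: string): string {
  return `[data-testid="${id}"]`;
}

function byRole(role: string, name: string): string {
  return `role=${role}[name="${name}"]`;
}
```

A stable `data-testid` attribute is usually the most robust target for generated tests, since it survives styling and copy changes.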

Error Handling

Common Issues

  1. Element Not Found

    Error: Selector '#submit-btn' not found

    Agent response:
    - Waits for element (default 30s)
    - Tries alternative selectors
    - Reports if element never appears
  2. Timeout

    Error: Navigation timeout

    Agent response:
    - Increases timeout
    - Checks network issues
    - Verifies URL accessibility
  3. Interaction Blocked

    Error: Element is not clickable

    Agent response:
    - Scrolls element into view
    - Waits for animations
    - Checks for overlays

Retry Logic

// Retries a click up to maxRetries times, backing off 1s, 2s, ... between
// attempts. Assumes a Playwright `page` object is in scope.
async function robustClick(selector: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await page.waitForSelector(selector, { state: 'visible' });
      await page.click(selector);
      return;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await page.waitForTimeout(1000 * (i + 1));
    }
  }
}
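
The same backoff pattern applies to any flaky step, not just clicks. A generic version with no Playwright dependency might look like this (names are illustrative):

```typescript
// Retry an async action with linear backoff; rethrows the last error on failure.
async function withRetries<T>(
  action: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await action();
    } catch (error) {
      lastError = error;
      if (i < maxRetries - 1) {
        // Wait baseDelayMs, 2*baseDelayMs, ... before the next attempt
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * (i + 1)));
      }
    }
  }
  throw lastError;
}
```

Any Playwright call can then be wrapped, e.g. `withRetries(() => page.click(selector))`.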

Performance Considerations

Test Speed

Factors affecting speed:
- Network latency
- Application response time
- Number of test cases
- Screenshot capture
- Detailed logging

Optimization:
- Run tests in parallel (if supported)
- Skip non-critical tests
- Use headless mode
- Limit screenshots
- Increase timeouts only when needed

Resource Usage

Resources consumed:
- Browser instance (RAM: 200-500MB)
- MCP server process (RAM: 50-100MB)
- Agent tokens (varies by test complexity)

Management:
- Close browser after tests
- Limit concurrent test runs
- Monitor token usage
- Clean up screenshots

Best Practices

1. Clear Instructions

❌ "Test my app"
✅ "Test the checkout flow of my e-commerce app at localhost:3000,
focusing on cart operations and payment form validation"

2. Provide Context

Before testing:
- Application URL
- Login credentials (if needed)
- Test data requirements
- Known limitations
- Priority areas

3. Incremental Testing

1. First: Core functionality
2. Then: Edge cases
3. Finally: Error conditions

Don't try to test everything at once

4. Review Results

After testing:
1. Review failed tests
2. Verify bug reports
3. Check screenshots
4. Prioritize fixes
5. Re-test after fixes

5. Document Findings

Keep test reports:
- Create test-results.md
- Include timestamps
- Track bug status
- Note environment details

Integration Patterns

With CI/CD

# .github/workflows/ai-testing.yml
name: AI Testing

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Start application
        run: npm start &
      - name: Run AI tests
        run: |
          theia-cli ai test \
            --agent app-tester \
            --url localhost:3000 \
            --report test-results.json

With Issue Tracking

// Automatically create issues for failures
const testResults = await runTests();

for (const failure of testResults.failures) {
  await github.issues.create({
    title: `[Test Failure] ${failure.testCase}`,
    body: formatBugReport(failure),
    labels: ['bug', 'automated-test']
  });
}

With Monitoring

// Send test metrics to monitoring
await monitoring.recordMetric('test.duration', testResults.duration);
await monitoring.recordMetric('test.success_rate', testResults.passRate);

if (testResults.passRate < 0.8) {
  await alerts.send('Test success rate below threshold');
}

Troubleshooting

Agent Not Finding Elements

Problem: Agent reports elements not found

Solutions:

  1. Check selectors are correct
  2. Wait for page to load completely
  3. Check for dynamic content
  4. Verify no authentication required
  5. Look for shadow DOM

Tests Taking Too Long

Problem: Tests run very slowly

Solutions:

  1. Check network connectivity
  2. Reduce slowMo setting
  3. Use headless mode
  4. Optimize application performance
  5. Limit screenshot capture

Inconsistent Results

Problem: Tests pass sometimes, fail others

Solutions:

  1. Add explicit waits
  2. Check for race conditions
  3. Increase timeouts
  4. Look for animation issues
  5. Verify data consistency

Future Enhancements

Planned capabilities:

  • Cross-browser testing (Chrome, Firefox, Safari)
  • Mobile testing (iOS, Android simulators)
  • Performance testing (load time, metrics)
  • API testing (REST, GraphQL)
  • Database validation (check data integrity)
  • Visual diff (pixel-perfect comparison)

Summary

AI-driven testing in theia provides:

✅ Autonomous - Tests run without human intervention
✅ Intelligent - Generates test cases dynamically
✅ Comprehensive - Covers UI, functionality, errors
✅ Integrated - Works with bug tracking
✅ Fast - Parallel execution possible
✅ Detailed - Rich reports with screenshots

Key workflow:

  1. Start application
  2. Invoke App Tester agent
  3. Agent generates and runs tests
  4. Bugs automatically reported
  5. Review results
  6. Fix issues
  7. Re-test

This dramatically reduces manual QA effort while increasing test coverage and consistency.