Resilience

Agent Stress Testing

Run load, spike, soak, chaos, fuzzing, and adversarial tests against your agents to verify resilience, identify breaking points, and validate recovery behavior.

Test Types

Each test type targets a different dimension of agent reliability. Combine multiple test types in a single test suite for comprehensive coverage.

Load Testing

Gradually increase concurrent requests to determine throughput limits, latency degradation curves, and breaking points.

Parameters: concurrency, rampUpSeconds, sustainSeconds, requestsPerSecond
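To make the interplay of these parameters concrete, here is a minimal sketch of a ramp schedule. It assumes concurrency rises linearly from zero over rampUpSeconds and then holds at the peak for sustainSeconds. The source does not specify the actual ramp shape, and `concurrencyAt` is a hypothetical helper, not part of the DRD SDK.

```typescript
// Hypothetical helper: the linear ramp is an assumption, not documented DRD behavior.
interface LoadConfig {
  concurrency: number;    // peak number of concurrent requests
  rampUpSeconds: number;  // time taken to reach the peak
  sustainSeconds: number; // time held at the peak
}

// Target concurrency at `t` seconds into the test; 0 outside the test window.
function concurrencyAt(cfg: LoadConfig, t: number): number {
  if (t < 0 || t > cfg.rampUpSeconds + cfg.sustainSeconds) return 0;
  if (cfg.rampUpSeconds > 0 && t < cfg.rampUpSeconds) {
    return Math.floor((cfg.concurrency * t) / cfg.rampUpSeconds);
  }
  return cfg.concurrency;
}

const loadCfg: LoadConfig = { concurrency: 100, rampUpSeconds: 60, sustainSeconds: 300 };
console.log(concurrencyAt(loadCfg, 30));  // 50, halfway through the ramp
console.log(concurrencyAt(loadCfg, 120)); // 100, during the sustain phase
```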

Spike Testing

Send sudden bursts of traffic to test auto-scaling behavior, queue management, and recovery after traffic spikes.

Parameters: baselineRps, spikeRps, spikeDurationSeconds, cooldownSeconds
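The phases of a spike test can be sketched as a simple traffic profile. The parameters do not say how long baseline traffic runs before the spike, so this sketch takes a hypothetical `warmupSeconds` argument; it also assumes traffic returns to baselineRps during the cooldown so recovery can be observed. Neither detail is confirmed by the source.

```typescript
interface SpikeConfig {
  baselineRps: number;
  spikeRps: number;
  spikeDurationSeconds: number;
  cooldownSeconds: number;
}

type SpikePhase = "baseline" | "spike" | "cooldown" | "done";

// Hypothetical helper: which phase the test is in at `t` seconds, and the target RPS.
// `warmupSeconds` is an assumed extra input, not one of the documented parameters.
function spikeProfileAt(
  cfg: SpikeConfig,
  t: number,
  warmupSeconds: number
): { phase: SpikePhase; rps: number } {
  const spikeEnd = warmupSeconds + cfg.spikeDurationSeconds;
  const cooldownEnd = spikeEnd + cfg.cooldownSeconds;
  if (t < warmupSeconds) return { phase: "baseline", rps: cfg.baselineRps };
  if (t < spikeEnd) return { phase: "spike", rps: cfg.spikeRps };
  if (t < cooldownEnd) return { phase: "cooldown", rps: cfg.baselineRps };
  return { phase: "done", rps: 0 };
}

const spikeCfg: SpikeConfig = {
  baselineRps: 50,
  spikeRps: 1000,
  spikeDurationSeconds: 30,
  cooldownSeconds: 120,
};
console.log(spikeProfileAt(spikeCfg, 75, 60)); // inside the spike window
```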

Chaos Engineering

Inject failures (network partitions, latency, CPU pressure, memory pressure) to test resilience and graceful degradation.

Parameters: faults[], duration, targetComponents

Fuzzing

Generate malformed, boundary-value, and edge-case inputs to discover parsing errors, crashes, and unexpected behavior.

Parameters: inputSchema, mutationRate, maxIterations, corpus
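As an illustration of what a mutation rate can mean, the sketch below treats mutationRate as the per-character probability of replacing a character with an edge-case token. This interpretation, the token pool, and the `mutate` helper are all assumptions made for the example; the source does not describe DRD's mutation strategy. A seeded generator keeps runs reproducible.

```typescript
// Small deterministic PRNG (mulberry32) so a failing input can be regenerated from its seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Illustrative edge-case tokens: NUL, zero-width space, RTL override, a format
// specifier, and the empty string (which deletes the character).
const EDGE_TOKENS: string[] = ["\u0000", "\u200b", "\u202e", "%s", ""];

// Hypothetical helper: each character is replaced with probability `mutationRate`.
function mutate(input: string, mutationRate: number, rand: () => number): string {
  return input
    .split("")
    .map((ch) =>
      rand() < mutationRate ? EDGE_TOKENS[Math.floor(rand() * EDGE_TOKENS.length)] : ch
    )
    .join("");
}

const seedInput = "scan this document";
console.log(JSON.stringify(mutate(seedInput, 0.15, mulberry32(42))));
```

Because the generator is seeded, the same seed and rate always produce the same mutated input, which makes any crash it triggers easy to reproduce.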

Adversarial Testing

Run prompt injection attacks, jailbreak attempts, and social engineering scenarios against AI agent guardrails.

Parameters: attackVectors[], targetPolicies[], escalationDepth

Creating a Stress Test

Define a stress test with one or more test types, target agents, and configuration parameters. Tests run in isolated sandbox environments to prevent impact on production systems.

POST /api/v1/stress-tests

{
  "name": "Content Scanner v2 - Full Suite",
  "targetAgents": ["019agent-scanner-..."],
  "sandboxConfig": {
    "isolation": "container",
    "resources": { "cpu": "4.0", "memoryMb": 8192 }
  },
  "tests": [
    {
      "type": "load",
      "config": {
        "concurrency": 100,
        "rampUpSeconds": 60,
        "sustainSeconds": 300,
        "requestsPerSecond": 500
      }
    },
    {
      "type": "adversarial",
      "config": {
        "attackVectors": ["prompt-injection", "jailbreak", "role-confusion", "encoding-bypass"],
        "targetPolicies": ["content-filter", "pii-detection"],
        "escalationDepth": 5
      }
    },
    {
      "type": "fuzzing",
      "config": {
        "inputSchema": { "type": "string", "maxLength": 10000 },
        "mutationRate": 0.15,
        "maxIterations": 10000
      }
    }
  ],
  "schedule": "immediate"
}

// Response
{
  "ok": true,
  "data": {
    "id": "019stress-abcd-1234-...",
    "name": "Content Scanner v2 - Full Suite",
    "status": "queued",
    "tests": [
      { "type": "load", "status": "pending" },
      { "type": "adversarial", "status": "pending" },
      { "type": "fuzzing", "status": "pending" }
    ],
    "estimatedDurationMinutes": 18,
    "createdAt": "2026-02-14T09:00:00Z"
  }
}
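For callers not using the SDK, a request like the one above could be assembled as follows. This is a sketch under stated assumptions: the base URL, the Bearer authorization scheme, and the `buildCreateRequest` helper are hypothetical and not confirmed by the source. Only the path and body fields are taken from the example.

```typescript
interface StressTestEntry {
  type: string;
  config?: Record<string, unknown>;
  scenario?: string;
}

interface CreateStressTestBody {
  name: string;
  targetAgents: string[];
  tests: StressTestEntry[];
  schedule?: string;
}

interface HttpRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request without sending it, so it can be inspected or tested.
function buildCreateRequest(
  baseUrl: string,
  apiKey: string,
  body: CreateStressTestBody
): HttpRequest {
  // A stress test is defined with one or more test types.
  if (body.tests.length === 0) throw new Error("at least one test is required");
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/stress-tests`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
    },
    body: JSON.stringify(body),
  };
}

const createReq = buildCreateRequest("https://drd.example/", "test-key", {
  name: "Content Scanner v2 - Full Suite",
  targetAgents: ["019agent-scanner-..."], // elided ID, as in the example above
  tests: [{ type: "load", config: { concurrency: 100, sustainSeconds: 300 } }],
  schedule: "immediate",
});
console.log(createReq.method, createReq.url);
```

The returned object can be passed to any HTTP client, for example `fetch(createReq.url, createReq)`.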

Scenario Library

DRD ships with a curated library of test scenarios based on real-world attack patterns and failure modes. Use built-in scenarios directly or as templates for custom tests.

Adversarial Prompts

owasp-llm-top10: All OWASP LLM Top 10 attack vectors

multilingual-injection: Prompt injection in 20+ languages

indirect-injection: Injection via tool outputs and context

system-prompt-extraction: Attempts to extract system prompts

role-hijack: Role confusion and persona override attacks

Chaos Faults

network-partition: Simulated network splits between services

latency-injection: Random latency spikes on API calls

dependency-failure: Upstream service timeout/failure

memory-pressure: Gradual memory exhaustion

clock-skew: Time drift to test temporal logic

Fuzzing Corpora

unicode-torture: Malformed Unicode, RTL overrides, zero-width chars

boundary-values: Integer overflow, empty strings, max-length inputs

format-strings: Injection via format string patterns

nested-structures: Deeply nested JSON/XML payloads
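A typo in a scenario name is easy to catch before a request is sent. The sketch below checks a test entry against the scenario names listed above. The names are copied from this page; the `chaos` type key and the `checkScenario` helper are assumptions, since the source shows request examples only for adversarial and fuzzing tests.

```typescript
// Scenario names copied from the lists above. The "chaos" key is an assumed type name.
const BUILT_IN_SCENARIOS: Record<string, string[]> = {
  adversarial: [
    "owasp-llm-top10",
    "multilingual-injection",
    "indirect-injection",
    "system-prompt-extraction",
    "role-hijack",
  ],
  chaos: [
    "network-partition",
    "latency-injection",
    "dependency-failure",
    "memory-pressure",
    "clock-skew",
  ],
  fuzzing: ["unicode-torture", "boundary-values", "format-strings", "nested-structures"],
};

interface ScenarioRef {
  type: string;
  scenario: string;
}

// Hypothetical helper: returns an error message, or null if the reference looks valid.
function checkScenario(ref: ScenarioRef): string | null {
  const known = BUILT_IN_SCENARIOS[ref.type];
  if (!known) return `unknown test type "${ref.type}"`;
  if (known.indexOf(ref.scenario) === -1) {
    return `"${ref.scenario}" is not a built-in ${ref.type} scenario`;
  }
  return null;
}

console.log(checkScenario({ type: "fuzzing", scenario: "unicode-torture" })); // null
console.log(checkScenario({ type: "fuzzing", scenario: "role-hijack" }));
```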

Using built-in scenarios

POST /api/v1/stress-tests

{
  "name": "OWASP LLM Top 10 Scan",
  "targetAgents": ["019agent-scanner-..."],
  "tests": [
    {
      "type": "adversarial",
      "scenario": "owasp-llm-top10",
      "overrides": {
        "escalationDepth": 8,
        "targetPolicies": ["content-filter", "pii-detection", "code-exec"]
      }
    },
    {
      "type": "fuzzing",
      "scenario": "unicode-torture",
      "overrides": { "maxIterations": 50000 }
    }
  ]
}
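One plausible reading of `overrides` is a shallow merge over the scenario's default configuration, sketched below. The merge semantics and the default values shown are assumptions for illustration; the source does not document either. Under a shallow merge, an overridden array such as targetPolicies replaces the default array rather than extending it.

```typescript
type ScenarioConfig = Record<string, unknown>;

// Hypothetical helper: overrides win key by key; arrays are replaced, not concatenated.
function resolveConfig(defaults: ScenarioConfig, overrides: ScenarioConfig = {}): ScenarioConfig {
  return { ...defaults, ...overrides };
}

// Illustrative defaults only; the real scenario defaults are not given in the source.
const assumedDefaults: ScenarioConfig = {
  escalationDepth: 5,
  targetPolicies: ["content-filter"],
};

const resolved = resolveConfig(assumedDefaults, {
  escalationDepth: 8,
  targetPolicies: ["content-filter", "pii-detection", "code-exec"],
});
console.log(resolved);
```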

Results Analysis

Every stress test produces a detailed results report with pass/fail determinations, performance metrics, discovered vulnerabilities, and recommendations.

GET /api/v1/stress-tests/:id/results

{
  "ok": true,
  "data": {
    "testId": "019stress-abcd-1234-...",
    "status": "completed",
    "duration": {
      "totalSeconds": 1080,
      "startedAt": "2026-02-14T09:01:00Z",
      "completedAt": "2026-02-14T09:19:00Z"
    },
    "results": {
      "load": {
        "passed": true,
        "metrics": {
          "maxRps": 487,
          "p50LatencyMs": 42,
          "p95LatencyMs": 189,
          "p99LatencyMs": 520,
          "errorRate": 0.002,
          "breakingPointRps": 612
        }
      },
      "adversarial": {
        "passed": false,
        "findings": [
          {
            "vector": "encoding-bypass",
            "severity": "high",
            "description": "UTF-16 encoded prompt bypasses content filter",
            "reproducible": true,
            "policyViolated": "content-filter"
          }
        ],
        "metrics": {
          "totalAttacks": 2400,
          "blocked": 2387,
          "bypassed": 13,
          "blockRate": 0.9946
        }
      },
      "fuzzing": {
        "passed": true,
        "metrics": {
          "totalInputs": 10000,
          "crashes": 0,
          "hangs": 2,
          "uniqueErrors": 4
        }
      }
    },
    "overallVerdict": "fail",
    "recommendations": [
      "Fix encoding-bypass vulnerability in content filter (HIGH priority)",
      "Investigate 2 hangs discovered during fuzzing (MEDIUM priority)",
      "Consider increasing rate limit -- current breaking point is 612 RPS"
    ]
  }
}
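The derived fields in this report can be cross-checked with a few lines of code. The sketch below recomputes blockRate from the attack counts and derives an overall verdict. The rule that a single failed test fails the whole suite matches the sample above but is an assumption about how DRD computes overallVerdict; both helpers are hypothetical.

```typescript
interface AdversarialMetrics {
  totalAttacks: number;
  blocked: number;
  bypassed: number;
}

// Fraction of attacks that were blocked; defined as 1 when no attacks ran.
function blockRate(m: AdversarialMetrics): number {
  return m.totalAttacks === 0 ? 1 : m.blocked / m.totalAttacks;
}

// Assumed rule: the suite fails if any individual test failed.
function deriveVerdict(results: Record<string, { passed: boolean }>): "pass" | "fail" {
  return Object.keys(results).every((k) => results[k].passed) ? "pass" : "fail";
}

const sampleMetrics: AdversarialMetrics = { totalAttacks: 2400, blocked: 2387, bypassed: 13 };
console.log(blockRate(sampleMetrics).toFixed(4)); // "0.9946"

const sampleResults = {
  load: { passed: true },
  adversarial: { passed: false },
  fuzzing: { passed: true },
};
console.log(deriveVerdict(sampleResults)); // "fail"
```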

Failed Tests

When a stress test fails, DRD can automatically create remediation tasks in your governance project board and, if configured, file bug bounty submissions for discovered vulnerabilities. Enable auto-remediation in your workspace settings.

SDK Usage

Run stress tests programmatically with the DRD TypeScript SDK. Ideal for CI/CD integration where tests gate deployments.

stress-test.ts

import { DRDClient } from "@drd.io/sdk";

const drd = new DRDClient({ apiKey: process.env.DRD_API_KEY! });

// Run a comprehensive stress test
const test = await drd.stressTests.create({
  name: "CI/CD Gate - Production Deploy",
  targetAgents: ["019agent-scanner-..."],
  tests: [
    { type: "load", config: { concurrency: 50, sustainSeconds: 120 } },
    { type: "adversarial", scenario: "owasp-llm-top10" },
    { type: "fuzzing", scenario: "unicode-torture" },
  ],
});

// Wait for completion (polls automatically)
const results = await drd.stressTests.waitForCompletion(test.id, {
  timeoutMs: 1800000, // 30 minutes
  pollIntervalMs: 5000,
});

// Gate deployment on results
if (results.overallVerdict === "fail") {
  console.error("Stress test failed! Blocking deployment.");
  console.error("Findings:", results.results.adversarial?.findings);
  process.exit(1);
}

console.log("All stress tests passed. Proceeding with deployment.");
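A deployment gate may need to be stricter than the overall verdict. In the sample results, for instance, the fuzzing test passes despite two hangs. The sketch below applies team-defined thresholds to the metrics shown in the results report. The threshold names and the `findViolations` helper are hypothetical; only the metric field names come from the sample response.

```typescript
interface GateInput {
  load?: { metrics: { p99LatencyMs: number; errorRate: number } };
  fuzzing?: { metrics: { crashes: number; hangs: number } };
}

// Hypothetical, team-defined limits; not part of the DRD API.
interface Thresholds {
  maxP99LatencyMs: number;
  maxErrorRate: number;
  maxHangs: number;
}

// Returns a human-readable list of threshold violations (empty means the gate passes).
function findViolations(r: GateInput, t: Thresholds): string[] {
  const out: string[] = [];
  if (r.load) {
    const m = r.load.metrics;
    if (m.p99LatencyMs > t.maxP99LatencyMs) {
      out.push(`p99 latency ${m.p99LatencyMs} ms exceeds ${t.maxP99LatencyMs} ms`);
    }
    if (m.errorRate > t.maxErrorRate) {
      out.push(`error rate ${m.errorRate} exceeds ${t.maxErrorRate}`);
    }
  }
  if (r.fuzzing) {
    const m = r.fuzzing.metrics;
    if (m.crashes > 0) out.push(`${m.crashes} crash(es) during fuzzing`);
    if (m.hangs > t.maxHangs) out.push(`${m.hangs} hang(s) exceeds limit of ${t.maxHangs}`);
  }
  return out;
}

// Metric values taken from the sample results report.
const gateInput: GateInput = {
  load: { metrics: { p99LatencyMs: 520, errorRate: 0.002 } },
  fuzzing: { metrics: { crashes: 0, hangs: 2 } },
};
const gateViolations = findViolations(gateInput, {
  maxP99LatencyMs: 500,
  maxErrorRate: 0.01,
  maxHangs: 0,
});
console.log(gateViolations);
```

In a CI script, a non-empty list would be logged and followed by a non-zero exit code, in the same way the verdict check above blocks the deployment.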

Scheduled & Recurring Tests

Schedule stress tests to run at specific times or on a recurring basis. Useful for regression testing and continuous security validation.

POST /api/v1/stress-tests/schedules

{
  "name": "Weekly Adversarial Scan",
  "cron": "0 2 * * MON",
  "timezone": "America/New_York",
  "testConfig": {
    "targetAgents": ["019agent-scanner-...", "019agent-assistant-..."],
    "tests": [
      { "type": "adversarial", "scenario": "owasp-llm-top10" },
      { "type": "load", "config": { "concurrency": 100, "sustainSeconds": 300 } }
    ]
  },
  "notifications": {
    "onFailure": ["slack", "pagerduty"],
    "onSuccess": ["slack"]
  }
}

// Response
{
  "ok": true,
  "data": {
    "scheduleId": "019sched-uvwx-...",
    "name": "Weekly Adversarial Scan",
    "cron": "0 2 * * MON",
    "nextRunAt": "2026-02-16T07:00:00Z",
    "enabled": true
  }
}
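If the cron expression is evaluated in the schedule's timezone, as the timezone field suggests, the UTC instant of each run shifts with daylight saving time. The sketch below uses the standard Intl.DateTimeFormat API to check which local weekday and hour a UTC timestamp falls on. The `localWeekdayAndHour` helper is hypothetical and not part of the DRD SDK.

```typescript
// Converts a UTC ISO timestamp into the weekday and hour observed in `timeZone`.
function localWeekdayAndHour(
  isoUtc: string,
  timeZone: string
): { weekday: string; hour: number } {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    weekday: "short",
    hour: "2-digit",
    hour12: false,
  }).formatToParts(new Date(isoUtc));
  const get = (type: string): string => {
    const part = parts.find((p) => p.type === type);
    if (!part) throw new Error(`missing ${type} part`);
    return part.value;
  };
  // Some runtimes report midnight as "24" when hour12 is false; normalize to 0-23.
  return { weekday: get("weekday"), hour: parseInt(get("hour"), 10) % 24 };
}

// "0 2 * * MON" in America/New_York: 02:00 local on a Monday.
console.log(localWeekdayAndHour("2026-02-16T07:00:00Z", "America/New_York")); // winter, UTC-5
console.log(localWeekdayAndHour("2026-07-13T06:00:00Z", "America/New_York")); // summer, UTC-4
```

Both timestamps correspond to 02:00 on a Monday in New York, even though they are an hour apart in UTC terms.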

Next Steps

Anomaly Detection: Predictive alert system

Monitoring: Real-time oversight

Carbon Tracking: Compute cost monitoring

System Health: Monitor system health metrics during stress tests
