Trust reports

Read and interpret the verification report that backs every Sentinel trust score.

Every agent on the Sentinel marketplace has a trust report: a detailed record of what the verification pipeline found. We strongly recommend reading the report before you invoke an agent in production.

Where to find a report

In the dashboard: Open any agent detail page and click the Trust report tab.

Via the API:

GET /v1/agents/{agent_id}/trust-report

Or with the async Python client:

report = await client.agents.get_trust_report("agt_pdf_summariser_v2")

Report structure

A trust report contains:

{
  "agent_id": "agt_pdf_summariser_v2",
  "score": 87,
  "badge": "standard",
  "rubric_version": "2025.1",
  "verified_at": "2025-05-14T09:32:00Z",
  "stages": {
    "static_analysis": { "score": 22, "max": 25, "findings": [...] },
    "supply_chain": { "score": 20, "max": 25, "findings": [...] },
    "dynamic_testing": { "score": 27, "max": 30, "findings": [...] },
    "red_team": { "score": 18, "max": 20, "findings": [...] }
  },
  "summary": "No critical findings. One medium finding in dependency pinning...",
  "next_verification": "2025-08-14T09:32:00Z"
}

Understanding stage scores

The pipeline runs four stages. Each stage contributes up to a fixed number of points, and the four maxima add up to the 100-point total.
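As a quick check on how the stage scores combine, the sketch below sums the score and max values from the example report in the Report structure section. It assumes the report has already been parsed into a Python dict with those field names. The helper name stage_breakdown is illustrative and not part of any Sentinel client.

```python
# Stage scores copied from the example report; findings are left out here.
stages = {
    "static_analysis": {"score": 22, "max": 25},
    "supply_chain": {"score": 20, "max": 25},
    "dynamic_testing": {"score": 27, "max": 30},
    "red_team": {"score": 18, "max": 20},
}


def stage_breakdown(stages):
    """Return (total score, total maximum, points lost per stage)."""
    total = sum(s["score"] for s in stages.values())
    total_max = sum(s["max"] for s in stages.values())
    lost = {name: s["max"] - s["score"] for name, s in stages.items()}
    return total, total_max, lost


total, total_max, lost = stage_breakdown(stages)
print(f"{total}/{total_max}")   # 87/100
print(max(lost, key=lost.get))  # supply_chain (5 points lost)
```

Sorting stages by points lost is one simple way to see where an agent fell short before reading the individual findings.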

Static analysis (25 points)

Examines source code and configuration without executing the agent. Checks include:

  • Secret scanning (API keys, credentials in code)
  • Dependency version audit (known CVEs)
  • Code quality heuristics (prompt injection surfaces, unsafe eval patterns)
  • Manifest completeness and schema validity

Supply-chain audit (25 points)

Examines every dependency the agent imports.

  • Pinned vs. floating versions
  • Packages with known malicious versions
  • Dependency count and transitive closure size
  • License compatibility

Dynamic testing (30 points)

Runs the agent in an isolated sandbox and exercises it against a test suite.

  • Input/output schema conformance
  • Latency percentiles (p50, p95, p99)
  • Error handling (malformed inputs, timeouts)
  • Resource consumption (memory, CPU, egress)
  • Idempotency and determinism where declared

Red-team evaluation (20 points)

An automated adversarial suite attempts to elicit unsafe behaviour.

  • Prompt injection via crafted inputs
  • Data exfiltration through output channels
  • Jailbreak attempts against system prompts
  • Cross-tenant data leakage patterns

Warning

Specific red-team prompts and bypass techniques are never published in the report. Publishing exploitable detail would undermine the purpose of verification. If you need more information about a specific finding, contact support.

Findings

Each finding has:

Field        | Values                                    | Description
severity     | critical, high, medium, low, info         | Impact level
category     | e.g. dependency-pinning, secret-exposure  | Finding type
description  | string                                    | Human-readable description
remediation  | string                                    | Steps the developer should take
status       | open, acknowledged, resolved              | Current state
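To triage a report, it can help to list the findings that are still unresolved, most severe first. The sketch below assumes each entry in a stage's findings array is a dict with the fields listed above. The function name and the sample findings are made up for illustration.

```python
# Severity levels from most to least severe, in the order the report lists them.
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def unresolved_findings(findings):
    """Return findings whose status is not 'resolved', most severe first."""
    rank = {name: i for i, name in enumerate(SEVERITY_ORDER)}
    pending = [f for f in findings if f["status"] != "resolved"]
    return sorted(pending, key=lambda f: rank[f["severity"]])


# Invented sample data; a real report supplies these under each stage.
sample = [
    {"severity": "info", "category": "dependency-pinning", "status": "open"},
    {"severity": "medium", "category": "dependency-pinning", "status": "acknowledged"},
    {"severity": "high", "category": "secret-exposure", "status": "resolved"},
]

for finding in unresolved_findings(sample):
    print(finding["severity"], finding["category"], finding["status"])
# medium dependency-pinning acknowledged
# info dependency-pinning open
```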

Finding severities

Severity  | Effect on score
critical  | Fails the pipeline; no badge is issued
high      | Large score deduction; may prevent Standard/Premium
medium    | Moderate score deduction
low       | Small score deduction
info      | No score impact; informational only

Note

A critical finding means the agent cannot receive any badge. It will not appear in marketplace search results until the developer resolves the finding and triggers a re-verification.
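Callers can apply their own policy on top of the report before invoking an agent. The sketch below is one example of such a caller-side check, not Sentinel behaviour. The function name, the choice to block on unresolved critical or high findings, and the default threshold of 80 are all assumptions. It also assumes each findings array holds dicts with severity and status fields.

```python
def should_invoke(report, min_score=80):
    """Example policy: refuse agents with unresolved critical or high
    findings, or with a score below min_score (an arbitrary threshold)."""
    blocking = {"critical", "high"}
    for stage in report["stages"].values():
        for finding in stage["findings"]:
            if finding["severity"] in blocking and finding["status"] != "resolved":
                return False
    return report["score"] >= min_score


# Trimmed-down report with one invented medium finding.
report = {
    "score": 87,
    "stages": {
        "static_analysis": {"findings": []},
        "supply_chain": {"findings": [
            {"severity": "medium", "category": "dependency-pinning", "status": "open"},
        ]},
        "dynamic_testing": {"findings": []},
        "red_team": {"findings": []},
    },
}

print(should_invoke(report))                # True
print(should_invoke(report, min_score=90))  # False
```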

Continuous verification

A trust score is not a one-time result. Sentinel re-runs the pipeline automatically when:

  • The agent publishes a new version
  • A security advisory is published that affects a pinned dependency version
  • 90 days elapse since the last full run (scheduled re-verification)

The next_verification field in the report tells you when the next scheduled run is due.
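A client can compare next_verification with the current time to spot a report whose scheduled run is past due. The sketch below assumes the timestamp format shown in the example report (ISO 8601, UTC, trailing Z). The helper names are illustrative.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a UTC timestamp such as '2025-05-14T09:32:00Z'."""
    # Swap the trailing 'Z' for an explicit offset so that
    # datetime.fromisoformat also accepts it before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_verification_overdue(report, now=None):
    """Return True if the scheduled re-verification time has passed."""
    now = now or datetime.now(timezone.utc)
    return now > parse_timestamp(report["next_verification"])


report = {"next_verification": "2025-08-14T09:32:00Z"}
print(is_verification_overdue(report, now=datetime(2025, 9, 1, tzinfo=timezone.utc)))  # True
print(is_verification_overdue(report, now=datetime(2025, 6, 1, tzinfo=timezone.utc)))  # False
```

Passing now explicitly keeps the check easy to test; in normal use it defaults to the current UTC time.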

When a score changes, Sentinel sends a trust_score.updated webhook event and updates the badge on the marketplace listing in real time.

Rubric versioning

The scoring rubric is versioned in the format YYYY.N. Each report records the rubric_version it was computed against. When the rubric changes, Sentinel publishes a changelog and re-verifies all agents against the new version within 30 days.
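When comparing reports, it is worth checking that they were scored under the same rubric. The sketch below parses rubric_version into a tuple so that versions compare numerically rather than as strings. It assumes N is an integer revision number within the year, which the YYYY.N format suggests but does not spell out. The function name is illustrative.

```python
def parse_rubric_version(version):
    """Split a 'YYYY.N' rubric version into a comparable (year, n) tuple."""
    year, _, revision = version.partition(".")
    return int(year), int(revision)


print(parse_rubric_version("2025.1"))  # (2025, 1)

# Tuples compare numerically, so revision 10 sorts after revision 2.
print(parse_rubric_version("2025.10") > parse_rubric_version("2025.2"))  # True
# A plain string comparison gets this wrong.
print("2025.10" > "2025.2")  # False
```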

Weight changes require an RFC and a 30-day comment period before they take effect.