Trust reports

Read and interpret the verification report that backs every Sentinel trust score.

Every agent on the Sentinel marketplace has a trust report: a detailed record of what the verification pipeline found. We strongly recommend reading the report before you invoke an agent in production.

Where to find a report

In the dashboard: Open any agent detail page and click the Trust report tab.

Via the API:

GET /v1/agents/{agent_id}/trust-report

report = await client.agents.get_trust_report("agt_pdf_summariser_v2")

Report structure

A trust report has the following structure. The findings arrays are abbreviated as [...] here:

{
  "agent_id": "agt_pdf_summariser_v2",
  "score": 87,
  "badge": "standard",
  "rubric_version": "2025.1",
  "verified_at": "2025-05-14T09:32:00Z",
  "stages": {
    "static_analysis": { "score": 22, "max": 25, "findings": [...] },
    "supply_chain": { "score": 20, "max": 25, "findings": [...] },
    "dynamic_testing": { "score": 27, "max": 30, "findings": [...] },
    "red_team": { "score": 18, "max": 20, "findings": [...] }
  },
  "summary": "No critical findings. One medium finding in dependency pinning...",
  "next_verification": "2025-08-14T09:32:00Z"
}

Understanding stage scores

The pipeline runs four stages. Each stage is worth a fixed maximum number of points, and the four maxima add up to 100.
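In the sample report shown earlier, the stage scores add up to the overall score (22 + 20 + 27 + 18 = 87). The sketch below checks that relationship on a report that has already been parsed into a dict. It assumes the overall score is the plain sum of the stage scores, which holds for the sample but is not stated as a general rule, and `stage_breakdown` is a hypothetical helper name, not part of any Sentinel SDK.

```python
# Values copied from the sample report; "findings" omitted for brevity.
report = {
    "score": 87,
    "stages": {
        "static_analysis": {"score": 22, "max": 25},
        "supply_chain": {"score": 20, "max": 25},
        "dynamic_testing": {"score": 27, "max": 30},
        "red_team": {"score": 18, "max": 20},
    },
}


def stage_breakdown(report: dict) -> dict:
    """Return points lost per stage, the summed score, and the summed maximum."""
    stages = report["stages"]
    return {
        "lost": {name: s["max"] - s["score"] for name, s in stages.items()},
        "total": sum(s["score"] for s in stages.values()),
        "max_total": sum(s["max"] for s in stages.values()),
    }


breakdown = stage_breakdown(report)
print(breakdown["total"], breakdown["max_total"])  # 87 100
print(breakdown["lost"])
```

The `lost` mapping is a quick way to see which stage cost the agent the most points (here, supply_chain with 5).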

Static analysis (25 points)

Examines source code and configuration without executing the agent. Checks include:

Secret scanning (API keys, credentials in code)
Dependency version audit (known CVEs)
Code quality heuristics (prompt injection surfaces, unsafe eval patterns)
Manifest completeness and schema validity

Supply-chain audit (25 points)

Examines every dependency the agent imports. Checks include:

Pinned vs. floating versions
Packages with known malicious versions
Dependency count and transitive closure size
License compatibility

Dynamic testing (30 points)

Runs the agent in an isolated sandbox and exercises it against a test suite. Checks include:

Input/output schema conformance
Latency percentiles (p50, p95, p99)
Error handling (malformed inputs, timeouts)
Resource consumption (memory, CPU, egress)
Idempotency and determinism where declared

Red-team evaluation (20 points)

An automated adversarial suite attempts to elicit unsafe behaviour. Scenarios include:

Prompt injection via crafted inputs
Data exfiltration through output channels
Jailbreak attempts against system prompts
Cross-tenant data leakage patterns

Warning

Specific red-team prompts and bypass techniques are never published in the report. Publishing exploitable detail would undermine the purpose of verification. If you need more information about a specific finding, contact support.

Findings

Each finding has:

Field	Values	Description
`severity`	`critical` `high` `medium` `low` `info`	Impact level
`category`	e.g. `dependency-pinning`, `secret-exposure`	Finding type
`description`	string	Human-readable description
`remediation`	string	Steps the developer should take
`status`	`open` `acknowledged` `resolved`	Current state
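When a report has many findings, it helps to look at the unresolved ones first, ordered by severity. The following is a minimal sketch that uses only the fields from the table above. The sample findings are invented for illustration, `unresolved_findings` is a hypothetical helper, and treating `acknowledged` as still unresolved is an assumption.

```python
# Severity values from the table, most severe first.
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def unresolved_findings(findings: list[dict]) -> list[dict]:
    """Return findings whose status is not 'resolved', most severe first."""
    pending = [f for f in findings if f["status"] != "resolved"]
    return sorted(pending, key=lambda f: SEVERITY_ORDER.index(f["severity"]))


# Hypothetical findings; descriptions and remediations left out for brevity.
sample_findings = [
    {"severity": "low", "category": "secret-exposure", "status": "open"},
    {"severity": "high", "category": "secret-exposure", "status": "resolved"},
    {"severity": "info", "category": "dependency-pinning", "status": "acknowledged"},
    {"severity": "medium", "category": "dependency-pinning", "status": "open"},
]

triaged = unresolved_findings(sample_findings)
print([f["severity"] for f in triaged])  # ['medium', 'low', 'info']
```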

Finding severities

Severity	Effect on score
`critical`	Fails the pipeline; no badge is issued
`high`	Large score deduction; may prevent the agent from receiving a Standard or Premium badge
`medium`	Moderate score deduction
`low`	Small score deduction
`info`	No score impact; informational only

Note

An agent with a critical finding cannot receive any badge. The agent will not appear in marketplace search results until the developer resolves the finding and triggers a re-verification.
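If you invoke agents programmatically, you may want to apply the same rule yourself before each production call. The sketch below is one way to do that against a parsed report. The `blocking_reasons` helper and the minimum score of 80 are hypothetical choices for illustration, not Sentinel recommendations, and the findings in the sample data are invented.

```python
import copy


def blocking_reasons(report: dict, min_score: int = 80) -> list[str]:
    """List the reasons not to invoke the agent; an empty list means no objection."""
    reasons = []
    if report["score"] < min_score:
        reasons.append(f"score {report['score']} is below {min_score}")
    for stage_name, stage in report["stages"].items():
        for finding in stage["findings"]:
            if finding["severity"] == "critical" and finding["status"] != "resolved":
                reasons.append(
                    f"unresolved critical finding in {stage_name}: {finding['category']}"
                )
    return reasons


# Hypothetical report with one medium finding.
report = {
    "score": 87,
    "stages": {
        "static_analysis": {"score": 22, "max": 25, "findings": []},
        "supply_chain": {"score": 20, "max": 25, "findings": [
            {"severity": "medium", "category": "dependency-pinning", "status": "open"},
        ]},
        "dynamic_testing": {"score": 27, "max": 30, "findings": []},
        "red_team": {"score": 18, "max": 20, "findings": []},
    },
}
print(blocking_reasons(report))  # []

# Same report with an invented critical finding added; the score is left
# unchanged because how a critical finding alters the number is not specified.
failing = copy.deepcopy(report)
failing["stages"]["static_analysis"]["findings"].append(
    {"severity": "critical", "category": "secret-exposure", "status": "open"}
)
print(blocking_reasons(failing))
```

Returning the reasons rather than a bare boolean makes it easier to log why a call was refused.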

Continuous verification

A trust score is not a one-time assessment. Sentinel re-runs the pipeline automatically when:

The agent publishes a new version
A security advisory is published that affects a pinned dependency version
90 days have passed since the last full run (scheduled re-verification)

The next_verification field in the report tells you when the next scheduled run is due.
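As a minimal sketch, the following computes how many days remain until the scheduled run from the timestamp format used in the sample report. `days_until_next_verification` is a hypothetical helper, and the `now` values are fixed only so that the example is reproducible.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2025-08-14T09:32:00Z'."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so replace it with an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def days_until_next_verification(report: dict, now: datetime) -> int:
    """Whole days (rounded down) until the scheduled run; negative if overdue."""
    return (parse_timestamp(report["next_verification"]) - now).days


report = {"next_verification": "2025-08-14T09:32:00Z"}

before = datetime(2025, 8, 1, 9, 32, tzinfo=timezone.utc)
after = datetime(2025, 8, 20, 9, 32, tzinfo=timezone.utc)
print(days_until_next_verification(report, before))  # 13
print(days_until_next_verification(report, after))   # -6
```

In real use, pass `datetime.now(timezone.utc)` as `now`.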

When a score changes, Sentinel sends a trust_score.updated webhook event and updates the badge on the marketplace listing in real time.

Rubric versioning

The scoring rubric is versioned (format: YYYY.N). Each report records the rubric_version it was computed against. When the rubric changes, Sentinel publishes a changelog and re-verifies every agent against the new version within 30 days.
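Because versions follow the YYYY.N format, compare them as pairs of integers rather than as strings; a string comparison would rank "2025.10" below "2025.9". The sketch below shows one way to tell whether a report predates the current rubric. `parse_rubric_version` and `is_on_older_rubric` are hypothetical helpers, and how you obtain the current rubric version is left to the caller.

```python
def parse_rubric_version(version: str) -> tuple[int, int]:
    """Split a 'YYYY.N' rubric version into (year, revision) integers."""
    year, revision = version.split(".")
    return int(year), int(revision)


def is_on_older_rubric(report: dict, current_version: str) -> bool:
    """True if the report was computed against a rubric older than current_version."""
    return parse_rubric_version(report["rubric_version"]) < parse_rubric_version(current_version)


report = {"rubric_version": "2025.1"}
print(is_on_older_rubric(report, "2025.1"))   # False
print(is_on_older_rubric(report, "2025.10"))  # True

# Integer comparison orders revisions correctly where string comparison does not.
print(parse_rubric_version("2025.10") > parse_rubric_version("2025.9"))  # True
print("2025.10" > "2025.9")                                              # False
```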

Weight changes require an RFC and a 30-day comment period before they take effect.