The Diligent Entities
API, composed by an agent.
166 tools across 27 categories. Bulk ingest a messy CSV into Diligent Entities, fuzzy-match countries and company types across 249 jurisdictions, detect duplicates with confidence scores, validate before commit, and roll back with audit proof — all from one agent conversation, through one protocol.
~/.local/share/diligent-entities-mcp
Per-user, no sudo
Re-run to update
Your Entities credentials never leave your laptop
Three layers, one conversation.
The server is designed so an LLM can navigate it without memorizing 166 tool names. A meta layer describes itself; a smart ingest layer handles the messy human work; a primitive layer exposes every GraphQL mutation the Diligent Entities API offers.
Meta & control plane
Eight tools the agent calls first. Health check, session metrics, capability discovery, schema introspection, auto-pagination, token refresh. The agent learns what it has before it guesses.
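The "meta layer first" pattern can be sketched with stubs. The stubbed tools below stand in for real MCP calls; `entities_health_check` is an illustrative name based on the health check described above, and the returned shapes are assumptions.

```javascript
// Record call order so we can see the agent orienting before it acts.
const calls = [];
const stub = (name, result) => async () => { calls.push(name); return result; };

// Hypothetical stand-ins for the real meta-layer tools.
const entities_health_check = stub('entities_health_check', { ok: true });
const entities_list_capabilities = stub('entities_list_capabilities', {
  categories: ['meta', 'smart-ingest', 'companies', 'individuals'],
});

async function orientFirst() {
  // 1. Is the tenant reachable at all?
  const health = await entities_health_check();
  if (!health.ok) throw new Error('tenant unreachable');
  // 2. What tools exist? The agent reads this before guessing names.
  const caps = await entities_list_capabilities();
  return caps.categories;
}
```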
Smart ingest layer
The secret sauce. Fuzzy matching, bulk operations with bounded concurrency, dry-run validation, duplicate detection with confidence scores, country-scoped type filtering, ordered-fallback hints.
Primitive layer
Raw CRUD for every entity type — companies, individuals, addresses, appointments, trusts, partners, committees, plus audit trail, security groups, users, and data library. Compose these when no smart tool fits.
Typed error model
Every error classified into validation, limit, schema, auth, network, http. Agents know exactly when to retry, refresh, or give up with a clean explanation.
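A minimal sketch of that classification. The six categories and the HC error codes come from this document; the field names (`status`, `code`) and the check order are assumptions, not the server's actual implementation.

```javascript
// Classify a raw error into the typed model: validation, limit, schema,
// auth, network, http. The retry hint tells the agent what to do next.
function classifyError(err) {
  if (err.status === 401) return { type: 'auth', retry: 'refresh-once' };
  if (['ECONNRESET', 'ETIMEDOUT'].includes(err.code)) {
    return { type: 'network', retry: 'backoff' };
  }
  if (['HC0047', 'HC0051'].includes(err.code)) return { type: 'limit', retry: 'no' };
  if (['HC0009', 'HC0011'].includes(err.code)) return { type: 'schema', retry: 'no' };
  if (err.status >= 500) return { type: 'http', retry: 'backoff' };
  return { type: 'validation', retry: 'no' }; // default: fix the payload
}
```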
Bounded concurrency
Bulk mutations run through a worker pool (default 4, max 10). Per-row error capture means one bad row never aborts the batch. Retry + backoff is consistent across all 166 tools.
Reversible by default
Every bulk create returns the entityReference of every row. Roll back the whole run with one follow-up. The audit trail proves what changed.
Architecture at a glance.
Claude talks to the MCP over stdio. The MCP wraps a hardened GraphQL client with retry, structured errors, and token refresh. Reference data is cached for 15 minutes so fuzzy matches don't thrash the API.
┌───────────────────────────────────────────────────────┐
│                 Claude (agent loop)                   │
└──────────────────────────┬────────────────────────────┘
                           │ MCP Protocol (stdio)
                           ▼
┌───────────────────────────────────────────────────────────────────┐
│                      Diligent Entities MCP                        │
│                                                                   │
│  Meta layer          Smart ingest       Primitive layer           │
│  self-discovery      fuzzy match        CRUD + getters            │
│  health & metrics    concurrency        audit trail               │
│  introspection       dry-run            security groups           │
│  query_all           duplicates         addresses & appointments  │
│                                                                   │
│  ─────────────────────────────────────────────────────            │
│                                                                   │
│                    Hardened GraphQL client                        │
│  · retry + exponential backoff  · typed error classification      │
│  · auth refresh hook            · in-memory session metrics       │
│  · reference-data cache (15m)   · auto-pagination                 │
└──────────────────────────────┬────────────────────────────────────┘
                               │ HTTPS + Bearer token
                               ▼
┌───────────────────────────────────────────────┐
│              Diligent Entities                │
│         GraphQL API · HotChocolate            │
└───────────────────────────────────────────────┘
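The 15-minute reference-data cache can be sketched as a plain in-memory map with a TTL. This is an illustration of the idea, not the server's actual cache; the injectable clock exists only to make the sketch testable.

```javascript
// 15-minute TTL, matching the reference-data cache described above.
const TTL_MS = 15 * 60 * 1000;

function makeRefCache(now = Date.now) {
  const store = new Map();
  return {
    get(key) {
      const hit = store.get(key);
      if (!hit || now() - hit.at > TTL_MS) return undefined; // missing or expired
      return hit.value;
    },
    set(key, value) {
      store.set(key, { value, at: now() });
    },
  };
}
```

The fuzzy matcher checks this cache before hitting the countries and company-types endpoints, which is what keeps repeated matches from thrashing the API.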
Six beats, thirty-seven seconds.
A realistic end-to-end ingestion flow: messy CSV in, clean data plus audit proof out. This is the exact sequence the agent runs during the demo, timed across 3 rehearsal runs against a live tenant.
Context
Agent calls entities_list_capabilities to discover what it has before doing anything. Meta layer first.
Map + dry-run
Parses the CSV, warms the reference cache, then calls entities_bulk_create_companies with dryRun: true. Zero writes. Returns 15 clean, 4 with warnings, 1 broken.
Duplicate detection
For the candidates, entities_find_duplicate_companies runs fuzzy name match + exact company-number match + country scoping. Catches pre-seeded traps and internal CSV dupes.
Commit
Real entities_bulk_create_companies with skipDuplicates: true, concurrency: 4. 17 created, 2 skipped as duplicates, 1 failed with a clean "Netherlands + Ltd" error.
Compose a report
"UK companies by type" — no dedicated tool exists. Agent composes it: entities_query_all + client-side group-by on companyType.name. Instant table.
Undo + audit
The agent remembered every entityReference. It deletes all 17, then queries entities_list_audit_trails as proof. 107,000+ audit entries; the deletes are in there.
The smart ingest layer.
Ingestion is the place where Diligent Entities, as an API, is at its most unforgiving: reference data, per-country type scoping, and duplicate semantics all have to be right before a single mutation is accepted. The smart ingest layer handles all of it so the agent can think in natural language.
Levenshtein + Jaccard + acronym prefix
Normalizes, tokenizes, compresses punctuation, applies alias maps, and handles leading-token and acronym matches, so BV resolves to B.V. (closed limited liability company) even when the raw strings share almost no characters.
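The normalization steps above can be sketched as follows. The alias table, the exact regexes, and the acronym check are all illustrative; the server's real dictionaries and scoring weights are richer.

```javascript
// Illustrative alias map for country name resolution.
const ALIASES = { holland: 'netherlands', usa: 'united states' };

// Lowercase, strip punctuation, collapse whitespace: "B.V." -> "bv".
function normalize(name) {
  return name
    .toLowerCase()
    .replace(/[.,'’]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

function resolveAlias(name) {
  const n = normalize(name);
  return ALIASES[n] || n;
}

// Acronym check: "bv" matches "besloten vennootschap" by word initials.
function acronymMatch(short, full) {
  const initials = normalize(full).split(' ').map(w => w[0]).join('');
  return normalize(short) === initials;
}
```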
Per-jurisdiction type filtering
Types are scoped with countryIds + isNotInCountries. Matches only consider types valid for the resolved country, so "Ltd" in Netherlands fails fast instead of creating the wrong entity.
Country-context ordered fallbacks
Each country has a preferred canonical name for common abbreviations with ordered fallbacks: Norway AS → [Aksjeselskap, Private Company]. Works across tenants with different type dictionaries.
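A sketch of ordered fallback resolution. Only the Norway AS example comes from this document; the fallback table, key format, and function shape are assumptions.

```javascript
// Ordered fallbacks keyed by country + abbreviation. Illustrative only.
const FALLBACKS = {
  'norway:as': ['Aksjeselskap', 'Private Company'],
};

// Walk the ordered list; the first name present in this tenant's type
// dictionary wins, which is what makes the mapping tenant-portable.
function resolveCompanyType(country, abbrev, tenantTypes) {
  const key = `${country.toLowerCase()}:${abbrev.toLowerCase()}`;
  const candidates = FALLBACKS[key] || [abbrev];
  return candidates.find(name => tenantTypes.includes(name)) || null;
}
```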
Pre-commit validation
Every bulk-create tool accepts dryRun: true. Runs the full validator, resolves all references, reports warnings per row, and never touches the database.
Duplicate detection
Three-layer match: exact company-number inside the country, fuzzy name match above a threshold, and internal CSV duplicate scan. Returns confidence scores, not booleans.
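The scoring idea can be sketched like this: an exact company-number match inside the same country is full confidence, otherwise a fuzzy name similarity becomes the score. The token-Jaccard similarity below is a stand-in for the real Levenshtein-plus-Jaccard mix, and the field names are assumptions.

```javascript
// Jaccard similarity over word tokens, in [0, 1].
function similarity(a, b) {
  const A = new Set(a.toLowerCase().split(/\s+/));
  const B = new Set(b.toLowerCase().split(/\s+/));
  const inter = [...A].filter(t => B.has(t)).length;
  return inter / (A.size + B.size - inter);
}

// Confidence score, not a boolean: the agent decides what to do with 0.67.
function duplicateConfidence(candidate, existing) {
  if (
    candidate.companyNumber &&
    candidate.companyNumber === existing.companyNumber &&
    candidate.countryName === existing.countryName
  ) {
    return 1.0; // exact registry match inside the same country
  }
  return similarity(candidate.entityName, existing.entityName);
}
```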
Bounded worker pool
Bulk creates run through runWithConcurrency with a hard max of 10. Per-row errors are captured structurally. One bad row never breaks the batch.
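A minimal sketch in the spirit of `runWithConcurrency`: at most `limit` tasks in flight, a hard cap of 10, and per-row errors captured rather than thrown. The real helper's signature and result shape may differ.

```javascript
// tasks: array of () => Promise. Returns per-row results in input order.
async function runWithConcurrency(tasks, limit = 4) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim the next row (safe: JS is single-threaded)
      try {
        results[i] = { status: 'ok', value: await tasks[i]() };
      } catch (err) {
        // One bad row becomes a structured error, not a dead batch.
        results[i] = { status: 'error', error: String(err) };
      }
    }
  }
  const workers = Math.min(limit, 10, tasks.length); // hard max of 10
  await Promise.all(Array.from({ length: workers }, worker));
  return results;
}
```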
// Map CSV rows to the MCP's free-text schema — no IDs required.
const payloads = rows.map(r => ({
  entityName: r.company_name,
  companyTypeName: r.company_type, // "BV" / "GmbH" / "SARL" — resolved by fuzzy match
  countryName: r.country,          // "Holland" / "USA" — resolved via alias table
  companyNumber: r.reg_number,
  incorporationDate: r.incorporated,
}));

// Step 1 — dry-run. Zero writes. Catches every issue.
const dry = await entities_bulk_create_companies({
  companies: payloads,
  dryRun: true,
});

// Step 2 — commit. Duplicates skipped. Concurrency 4.
const real = await entities_bulk_create_companies({
  companies: payloads,
  skipDuplicates: true,
  concurrency: 4,
});

// Step 3 — hold the references so you can undo.
const refs = real.results
  .filter(r => r.status === 'created')
  .map(r => r.created.entityReference);
// → ["BRTP2020", "COBT2011", "FRMR2013", ...]
Browse by capability area.
Click any category to filter the tool reference. The meta layer, smart match, and bulk ingest tools are the ones you'll want the agent to reach for first.
All 166 tools.
Each tool documents its name, one-line purpose, and full input schema. Type in the search box or click a category above to filter.
Pre-validated end-to-end scenarios.
Each pack is a full agent loop — tested against a live tenant, timed, and scored. Use them as reference workflows or as the starting point for a new one.
Bulk company ingestion
The canonical workflow. Messy CSV with typos, accented names, country aliases, and one internal duplicate. Validated → deduped → committed → rolled back cleanly.
Directors & officers
Individuals with nationality, date of birth, passport, and role. Handles "Surname, Forenames" comma-splits, honorifics, and missing DOBs as warnings rather than errors.
International address book
Addresses with international characters, postal codes, regions, and entity connections. References must be alphanumeric ≤12 chars — the ingest layer strips and validates automatically.
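The reference cleanup can be sketched as follows. The strip-and-cap behaviour is inferred from the constraint above; whether the real layer truncates or rejects overlong references is an assumption.

```javascript
// Enforce the reference constraint: alphanumeric only, max 12 characters.
function sanitizeReference(raw) {
  const cleaned = raw.replace(/[^A-Za-z0-9]/g, '').toUpperCase();
  if (cleaned.length === 0) throw new Error('reference has no usable characters');
  return cleaned.slice(0, 12); // assumption: truncate rather than reject
}
```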
The live demo flow
The full six-beat sequence: context → dry-run → dup-check → commit → compose a report → undo with audit proof. Designed for a 5-10 minute narration; runs in ~37 seconds of raw tool work.
What the agent does when there's no tool for that.
The MCP has 166 tools. Real questions are unbounded. When there's no dedicated tool, the agent composes primitives.
UK companies grouped by type
const all = await entities_query_all({
  collection: 'companies',
  maxRecords: 500,
});

const uk = all.items.filter(c => c.country?.name === 'United Kingdom');

const byType = uk.reduce((acc, c) => {
  const k = c.companyType?.name || '(unknown)';
  acc[k] = (acc[k] || 0) + 1;
  return acc;
}, {});
// → { 'Limited by Shares': 55, 'Limited by Guarantee': 1, 'Public Limited Company': 1 }
Individuals serving as directors of 3+ companies
const all = await entities_query_all({ collection: 'individuals', maxRecords: 1000 });

const results = [];
for (const ind of all.items) {
  const appts = await entities_list_appointments({
    entityId: ind.id,
    appointmentTypeName: 'Director',
  });
  if (appts.items.length >= 3) {
    results.push({ individual: ind, count: appts.items.length });
  }
}
Roll back an entire bulk import
// refs was collected from the previous bulk_create_companies call
for (const ref of refs) {
  const co = await entities_get_company_by_reference({ reference: ref });
  await entities_delete_company({ id: co.id });
}

const audit = await entities_list_audit_trails({ take: 50 });
// audit.items contains the 17 delete events, signed with user + timestamp
Error model.
Every error is classified. Validation and limit errors fail fast; network and 5xx errors are retried with exponential backoff; auth errors transparently refresh the token once.
| Type | Trigger | Retry | Agent action |
|---|---|---|---|
| validation | Bad field value, wrong date format, reference too long | No | Fix the payload, don't retry |
| limit | HC0047 / HC0051 — query cost exceeded, page size over 50 | No | Reduce take to ≤50, split batch |
| schema | HC0009 / HC0011 — unknown field | No | Verify shape via describe_type |
| auth | 401 Unauthorized | Once | Auto: refresh_api_token, retry in-place |
| network | ECONNRESET, timeout | 3× backoff | Already retried transparently |
| http | 5xx other than rate-limit classified cases | 3× backoff | Already retried transparently |
| NO_TYPES_FOR_COUNTRY | Country has zero valid company types on this tenant | No | Skip the row or create the type first |
| NO_TYPE_MATCH | Fuzzy match score below threshold | No | Ask the user or use smart_match_company_type to explore |
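The retry policy in the table can be sketched as a wrapper. Everything here is illustrative: the `type` field on errors, the `refresh` hook, and the exact backoff constants stand in for the real client's internals.

```javascript
// Up to `attempts` tries with exponential backoff for network/http errors,
// one transparent token refresh on auth, immediate throw for everything else.
async function withRetry(fn, refresh, { attempts = 3, baseMs = 200 } = {}) {
  let refreshed = false;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (err.type === 'auth' && !refreshed) {
        refreshed = true;
        await refresh(); // one transparent token refresh, then retry in-place
        i--;             // the refresh retry doesn't count against the budget
        continue;
      }
      const retryable = err.type === 'network' || err.type === 'http';
      if (!retryable || i === attempts - 1) throw err;
      // Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ...
      await new Promise(r => setTimeout(r, baseMs * 2 ** i));
    }
  }
}
```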
Retries are never attempted for validation, limit, schema, or auth errors (beyond the one transparent refresh). This keeps loops tight and surfaces actionable errors to the agent fast.
Getting started.
The server is a single Node.js process. Connect Claude to it via MCP stdio and you're in business.
# clone and install
git clone https://github.com/RiskaptureAI/diligent-entities-mcp.git
cd diligent-entities-mcp
npm install

# environment
export ENTITIES_API_URL="https://your-tenant.blueprintserver.com"
export ENTITIES_API_TOKEN="<your bearer token>"

# DXM fallback (optional — only for UI automation)
export ENTITIES_DXM_USERNAME="[email protected]"
export ENTITIES_DXM_PASSWORD="<password>"

# run
node src/index.js
Claude MCP config
{
"mcpServers": {
"diligent-entities": {
"command": "node",
"args": ["/path/to/diligent-entities-mcp/src/index.js"],
"env": {
"ENTITIES_API_URL": "https://your-tenant.blueprintserver.com",
"ENTITIES_API_TOKEN": "<your bearer token>"
}
}
}
}
First agent prompt
I've got a CSV of 20 companies from a messy handover. Pull them into Diligent Entities — but don't trust the data. Validate everything first, tell me what's clean and what's broken, then import the good stuff.
Production readiness.
Today this is a hardened pilot, not a production release. If you're shipping it to a real customer, read this section first.
What's solid today
- Retry with exponential backoff and structured error classification across all 166 tools.
- Bounded concurrency with per-row error capture — one bad row never aborts a batch.
- In-memory session metrics (call count, success rate, avg latency, retry count) exposed via get_session_metrics.
- Auto-pagination helper for collections up to 10,000 rows.
- Optional JSONL observability log via ENTITIES_LOG_FILE.
- Transparent token refresh via DXM Playwright fallback on 401.
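The auto-pagination helper listed above can be sketched as a cursor loop with a row cap. The `fetchPage` callback and its `{ items, nextCursor }` shape are assumptions, not the real client API; the 50-row page size matches the limit in the error-model table.

```javascript
// Page through a collection until exhausted or maxRecords is reached.
async function queryAll(fetchPage, { maxRecords = 10000, pageSize = 50 } = {}) {
  const items = [];
  let cursor = null;
  do {
    const page = await fetchPage({ take: pageSize, cursor });
    items.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor && items.length < maxRecords);
  return items.slice(0, maxRecords); // never hand the agent more than it asked for
}
```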
What still needs work
- Multi-tenancy: one token per server instance today.
- GraphQL queries use string interpolation, not variables — refactor before untrusted input gets anywhere near the MCP.
- Playwright browser doesn't yet close on SIGTERM — clean up resource leaks before long-running deployments.
- Test suite is manual scripts hitting live tenants — needs a mocked-client test framework + CI.
- No PII redaction at the logging layer yet; the only mitigation today is leaving file logging disabled.