
Bulk Retrieval

Mortgagees and agents often need hundreds or thousands of documents at once. Docyard’s bulk retrieval system processes these requests asynchronously and delivers results via time-limited signed URLs.

Mortgagee Bulk API Flow

In the most common bulk retrieval pattern, a mortgage lender pulls declaration pages for its entire loan portfolio via the API, authenticating with a shared passphrase and mutual TLS.

1. Create a Retrieval Job

curl -X POST https://api.docyard.io/v1/retrieval/jobs \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "dockId": "dock_01HQ3K...",
    "recipientId": "rcp_01HQ3N...",
    "metadata": {
      "document_type": "declaration-page"
    },
    "sinceTimestamp": "2025-01-01T00:00:00.000Z"
  }'
Response
{
  "id": "job_01HQ3Q...",
  "dockId": "dock_01HQ3K...",
  "status": "PENDING",
  "totalCount": 0,
  "readyCount": 0,
  "errorCount": 0,
  "createdAt": "2025-01-15T10:34:00.000Z"
}

2. Track Job Progress

Poll the status endpoint as the batch assembler processes artifacts:
curl https://api.docyard.io/v1/retrieval/jobs/job_01HQ3Q... \
  -H "Authorization: Bearer dk_live_a1b2c3d4..."
Response
{
  "id": "job_01HQ3Q...",
  "dockId": "dock_01HQ3K...",
  "status": "PROCESSING",
  "totalCount": 3952,
  "readyCount": 2847,
  "errorCount": 0,
  "createdAt": "2025-01-15T10:34:00.000Z"
}
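
As a rough illustration of how a client might turn these counters into a progress figure, the sketch below computes the fraction of artifacts that are ready. The helper name `job_progress` is hypothetical, and the guard assumes that totalCount can still be 0 while a job is PENDING, as in the creation response above.

```python
def job_progress(job: dict) -> float:
    """Return the fraction of artifacts ready, between 0.0 and 1.0."""
    total = job.get("totalCount", 0)
    if total <= 0:
        # totalCount is 0 in the PENDING example, so avoid dividing by zero.
        return 0.0
    return min(job.get("readyCount", 0) / total, 1.0)


# Counters copied from the PROCESSING response above.
status = {"status": "PROCESSING", "totalCount": 3952, "readyCount": 2847, "errorCount": 0}
print(f"{job_progress(status):.1%} ready")
```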

Job Statuses

Status      Description
PENDING     Queued; identity and policy verification in progress
PROCESSING  Artifacts being assembled; readyCount incrementing
COMPLETED   All artifacts ready for download
FAILED      Job failed; check errorCount and retry
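
The status values above suggest a simple polling loop that stops at COMPLETED or FAILED. The following is a minimal sketch, not a reference client: `wait_for_job`, the 5-second interval, and the poll limit are all assumptions, and the status fetcher is passed in as a function so the loop can be exercised without calling the API.

```python
import time
from typing import Callable

# Statuses after which the job will not change again, per the table above.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status, or give up."""
    if max_polls < 1:
        raise ValueError("max_polls must be at least 1")
    for attempt in range(max_polls):
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job still {job['status']} after {max_polls} polls")


# Simulated responses standing in for GET /v1/retrieval/jobs/{id}.
responses = iter([
    {"status": "PENDING", "readyCount": 0},
    {"status": "PROCESSING", "readyCount": 2847},
    {"status": "COMPLETED", "readyCount": 3952},
])
final = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

In a real integration, `fetch_status` would wrap the authenticated GET request shown in step 2, and a FAILED result would be handled by the caller.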

3. Download Results

Once the job status is COMPLETED, fetch the signed URLs:
curl https://api.docyard.io/v1/retrieval/jobs/job_01HQ3Q.../results \
  -H "Authorization: Bearer dk_live_a1b2c3d4..."
Response
{
  "status": "COMPLETED",
  "totalCount": 3952,
  "items": [
    {
      "artifactId": "art_01HQ3M...",
      "filename": "dec-page-POL-2025-88401.pdf",
      "url": "https://storage.docyard.io/signed/abc123...",
      "expiresAt": "2025-01-15T11:34:00.000Z"
    },
    {
      "artifactId": "art_01HQ3N...",
      "filename": "dec-page-POL-2025-88402.pdf",
      "url": "https://storage.docyard.io/signed/def456...",
      "expiresAt": "2025-01-15T11:34:00.000Z"
    }
  ]
}
Each signed URL expires after 1 hour (see its expiresAt field), and result sets expire after 24 hours, so download artifacts as soon as the job completes.
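
Because each item carries its own expiresAt value, a downloader can check that value before starting a transfer. The sketch below is one possible approach under stated assumptions: `parse_timestamp`, `split_by_expiry`, and the 5-minute safety margin are hypothetical choices, and what to do with expired items is not covered here.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse timestamps like 2025-01-15T11:34:00.000Z into aware UTC datetimes."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def split_by_expiry(items, now, safety_margin=timedelta(minutes=5)):
    """Separate items whose signed URL is still usable from those that are not."""
    usable, expired = [], []
    for item in items:
        # Treat a URL as expired slightly early so a download is not cut off.
        deadline = parse_timestamp(item["expiresAt"]) - safety_margin
        (usable if now < deadline else expired).append(item)
    return usable, expired


# Item shape copied from the results response above.
items = [
    {"artifactId": "art_01HQ3M...", "expiresAt": "2025-01-15T11:34:00.000Z"},
    {"artifactId": "art_01HQ3N...", "expiresAt": "2025-01-15T11:34:00.000Z"},
]
now = datetime(2025, 1, 15, 11, 0, tzinfo=timezone.utc)
usable, expired = split_by_expiry(items, now)
print(len(usable), "usable,", len(expired), "expired")
```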

Filtering Options

Narrow the retrieval scope with optional filters:
curl -X POST https://api.docyard.io/v1/retrieval/jobs \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "dockId": "dock_01HQ3K...",
    "recipientId": "rcp_01HQ3N...",
    "artifactIds": ["art_01HQ3M...", "art_01HQ3N..."],
    "sinceTimestamp": "2025-01-14T00:00:00.000Z",
    "metadata": {
      "document_type": "declaration-page",
      "state": "CT"
    }
  }'
Filter          Description                                   Common Use
artifactIds     Retrieve specific artifacts by ID             Targeted pulls
sinceTimestamp  Only artifacts uploaded after this timestamp  Daily incremental sync
metadata        Filter by artifact metadata key-value pairs   By doc type, state, LOB
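
Since all three filters are optional, a client may want to include only the ones it actually uses. The sketch below builds such a request body; `build_job_request` is a hypothetical helper, and leaving unused filters out of the body (rather than sending null values) is an assumption about what the API accepts.

```python
import json


def build_job_request(dock_id, recipient_id, artifact_ids=None,
                      since_timestamp=None, metadata=None) -> dict:
    """Build a retrieval job body containing only the filters that were given."""
    body = {"dockId": dock_id, "recipientId": recipient_id}
    if artifact_ids:
        body["artifactIds"] = list(artifact_ids)
    if since_timestamp:
        body["sinceTimestamp"] = since_timestamp
    if metadata:
        body["metadata"] = dict(metadata)
    return body


# Same filters as the curl example above, minus artifactIds.
body = build_job_request(
    "dock_01HQ3K...",
    "rcp_01HQ3N...",
    since_timestamp="2025-01-14T00:00:00.000Z",
    metadata={"document_type": "declaration-page", "state": "CT"},
)
print(json.dumps(body, indent=2))
```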

Integration Patterns by Persona

Mortgagee — Daily Incremental Sync

The typical mortgagee integration runs nightly, pulling declaration pages (dec pages) that are new or updated since the last sync:
1. Authenticate via shared passphrase + mutual TLS
2. Create retrieval job with sinceTimestamp = last_sync_date
3. Poll job status until COMPLETED
4. Download all artifacts via signed URLs
5. Update last_sync_date for next run
6. Repeat daily
# Nightly sync: get all new dec pages since yesterday
curl -X POST https://api.docyard.io/v1/retrieval/jobs \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "dockId": "dock_01HQ3K...",
    "recipientId": "rcp_01HQ3N...",
    "metadata": { "document_type": "declaration-page" },
    "sinceTimestamp": "2025-01-14T00:00:00.000Z"
  }'
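
Steps 2 and 5 of the flow above depend on carrying a checkpoint from one run to the next. The sketch below shows one way to do that, assuming the API expects UTC timestamps with millisecond precision as in the examples. `to_api_timestamp` and `plan_sync` are hypothetical names, and using the run start time as the next checkpoint is a design assumption intended to avoid missing artifacts uploaded while a job is running.

```python
from datetime import datetime, timezone
from typing import Tuple


def to_api_timestamp(moment: datetime) -> str:
    """Format an aware datetime as UTC with milliseconds, e.g. 2025-01-14T00:00:00.000Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def plan_sync(last_sync: datetime, run_started: datetime) -> Tuple[dict, datetime]:
    """Return the job filters for this run and the checkpoint to store afterwards."""
    filters = {
        "metadata": {"document_type": "declaration-page"},
        "sinceTimestamp": to_api_timestamp(last_sync),
    }
    # Store the start of this run, not its end, as the next checkpoint.
    return filters, run_started


last_sync = datetime(2025, 1, 14, tzinfo=timezone.utc)
run_started = datetime(2025, 1, 15, 2, 0, tzinfo=timezone.utc)
filters, next_checkpoint = plan_sync(last_sync, run_started)
print(filters["sinceTimestamp"], "->", to_api_timestamp(next_checkpoint))
```

The checkpoint would only be persisted after the job completes and all downloads succeed, so that a failed run is retried from the same point.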

Agent — Client Batch Download

Agents pull all documents for a specific client’s policy number, either through the portal’s bulk download feature or via the API:
curl -X POST https://api.docyard.io/v1/retrieval/jobs \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "dockId": "dock_01HQ3K...",
    "recipientId": "rcp_01HQ3S...",
    "metadata": { "policy_number": "POL-2025-88401" }
  }'

Policyholder — Single Retrieval

Policyholders don’t use bulk retrieval — they access individual documents through the portal after SMS OTP verification. The portal handles the retrieval internally.

Auditor — Scoped Audit Window

Auditors access documents through the portal only (read-only, no downloads) and do not use the bulk retrieval API. Their access is governed by a time-boxed policy: when the audit window closes, all access is automatically denied.

Listing Job History

Query retrieval jobs for monitoring and reconciliation:
# All completed jobs for a dock
curl "https://api.docyard.io/v1/retrieval/jobs?dockId=dock_01HQ3K...&status=COMPLETED&limit=20" \
  -H "Authorization: Bearer dk_live_a1b2c3d4..."

# All jobs (any status)
curl "https://api.docyard.io/v1/retrieval/jobs?dockId=dock_01HQ3K..." \
  -H "Authorization: Bearer dk_live_a1b2c3d4..."
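
For reconciliation scripts, the query string can be assembled programmatically instead of by hand. The sketch below uses only the three query parameters that appear in the examples above (dockId, status, limit); `list_jobs_url` is a hypothetical helper, and no other parameters are assumed to exist.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.docyard.io/v1/retrieval/jobs"


def list_jobs_url(dock_id, status=None, limit=None) -> str:
    """Build the job listing URL, adding optional filters only when given."""
    params = {"dockId": dock_id}
    if status is not None:
        params["status"] = status
    if limit is not None:
        params["limit"] = limit
    # urlencode percent-escapes any characters that are unsafe in a query string.
    return f"{BASE_URL}?{urlencode(params)}"


print(list_jobs_url("dock_01HQ3K...", status="COMPLETED", limit=20))
print(list_jobs_url("dock_01HQ3K..."))
```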

Audit Coverage

Every retrieval job generates immutable audit log entries:
  • Job created — who initiated, which recipient, what filters
  • Identity verified — passphrase check, TLS certificate validation
  • Policy evaluated — which recipe was applied, result (PASS/FAIL)
  • Per-artifact access — each signed URL generation is logged
  • Job completed/failed — final status with counts
# Query retrieval audit events
curl "https://api.docyard.io/v1/audit/logs?dockId=dock_01HQ3K...&entityType=retrieval_job" \
  -H "Authorization: Bearer dk_live_a1b2c3d4..."
