Artifact Ingestion

Artifacts are the documents you distribute through Docyard. This guide covers all ingestion methods, deduplication behavior, and metadata strategies.

Single File Upload

The most straightforward method is to upload one file at a time:
curl -X POST https://api.docyard.io/v1/docks/dock_01HQ3K.../artifacts/upload \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -F "<form-field>=@<path-to-file>"
Docyard computes a SHA-256 hash, detects the content type, and stores the file. The response includes the artifact ID, hash, and storage key.
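To check that an upload arrived intact, you can compute the same digest locally and compare it with the hash in the response. The following is a minimal sketch; it assumes the response hash is a lowercase hex string, which this guide does not specify, and `sha256_hex` is an illustrative helper name rather than part of any Docyard SDK.

```python
import hashlib


def sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are never fully loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `sha256_hex(local_path)` with the returned hash is only meaningful if the encoding assumption above holds.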

Batch Upload

Upload up to 100 files in a single request:
curl -X POST https://api.docyard.io/v1/docks/dock_01HQ3K.../artifacts/upload/batch \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -F "<form-field>=@<path-to-file-1>" \
  -F "<form-field>=@<path-to-file-2>" \
  -F "<form-field>=@<path-to-file-3>"
Each file is independently hashed and deduplicated. The response includes a batchId for tracking and an array of artifact results.
To ingest more than 100 files, split them across multiple batch calls. The Go-based batch assembler handles background processing for large-scale ingestion.
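Splitting a large file list into request-sized groups can be sketched as follows. `chunk_files` and `MAX_BATCH_SIZE` are illustrative names; only the 100-file limit comes from this guide.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # per-request file limit stated in this guide


def chunk_files(paths: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` paths, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])
```

Each yielded group would then be sent as one call to the batch upload endpoint. Because uploads are deduplicated by content hash, re-sending a group after a failure should not store anything twice.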

Metadata-Only Records

Register artifacts stored externally without uploading binary content:
curl -X POST https://api.docyard.io/v1/docks/dock_01HQ3K.../artifacts \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "filename": "policy-renewal-2025.pdf",
    "metadata": {
      "policy_number": "POL-2025-4821",
      "effective_date": "2025-01-01",
      "document_type": "renewal",
      "source_system": "legacy-dms"
    }
  }'
Batch metadata creation is also supported:
curl -X POST https://api.docyard.io/v1/docks/dock_01HQ3K.../artifacts/batch \
  -H "Authorization: Bearer dk_live_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      { "filename": "dec-001.pdf", "metadata": { "policy_number": "POL-001" } },
      { "filename": "dec-002.pdf", "metadata": { "policy_number": "POL-002" } }
    ]
  }'
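If the records come from an export of another system, the `items` payload can be assembled programmatically. The sketch below assumes a hypothetical flat input shape in which each record is a dict with a `filename` key and every other key is treated as metadata; only the output shape mirrors the request body shown above.

```python
import json
from typing import Dict, Iterable


def build_batch_payload(records: Iterable[Dict[str, str]]) -> str:
    """Serialize flat records into the {"items": [...]} body used above."""
    items = []
    for record in records:
        # A missing "filename" raises KeyError, so bad rows fail before any request is sent.
        filename = record["filename"]
        metadata = {key: value for key, value in record.items() if key != "filename"}
        items.append({"filename": filename, "metadata": metadata})
    return json.dumps({"items": items})
```

The returned string can be passed as the request body, for example with curl's `-d @-` reading from standard input.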

Deduplication

Docyard uses content-addressable storage. When you upload a file:
  1. A SHA-256 hash is computed from the file content.
  2. If the hash matches an existing artifact in the dock, the upload succeeds but consumes no new storage.
  3. In that case, the response includes isDuplicate: true and returns the existing artifact’s ID.
This means:
  • Re-uploading after a failed batch costs nothing
  • The same file uploaded from different sources is stored once
  • You pay for unique content, not upload volume
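The behavior described above can be illustrated with a toy in-memory model of a single dock. This is not Docyard's implementation: the `DockModel` class, the `art_N` ID format, and the `id` and `hash` response keys are assumptions made for the example. Only `isDuplicate` is taken from this guide.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DockModel:
    """Toy content-addressable store keyed by SHA-256 digest."""
    by_hash: Dict[str, str] = field(default_factory=dict)
    stored_bytes: int = 0

    def upload(self, content: bytes) -> dict:
        digest = hashlib.sha256(content).hexdigest()
        if digest in self.by_hash:
            # Same content already stored: return the existing ID, store nothing new.
            return {"id": self.by_hash[digest], "hash": digest, "isDuplicate": True}
        artifact_id = f"art_{len(self.by_hash) + 1}"  # hypothetical ID format
        self.by_hash[digest] = artifact_id
        self.stored_bytes += len(content)
        return {"id": artifact_id, "hash": digest, "isDuplicate": False}
```

Uploading the same bytes twice leaves `stored_bytes` unchanged after the first call, which is why retrying a failed batch adds no storage.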

Metadata Strategies

Artifact metadata powers policy routing and filtering. Design your metadata schema around your business operations:
{
  "policy_number": "POL-2025-4821",
  "effective_date": "2025-01-01",
  "expiration_date": "2026-01-01",
  "document_type": "declaration-page",
  "state": "CT",
  "line_of_business": "commercial-property"
}
If your dock has a metadata schema configured, artifact metadata is validated at upload time. Artifacts that fail validation are rejected with a 400 error detailing which fields are invalid.
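Checking metadata locally before upload can catch problems without a round trip. The sketch below uses a hypothetical rule set (three required fields and ISO 8601 dates); the format of a real dock schema is not described in this guide, so treat `REQUIRED_FIELDS`, `DATE_FIELDS`, and the error strings as placeholders.

```python
from datetime import date
from typing import Dict, List

# Hypothetical rules for illustration; a real dock schema may differ.
REQUIRED_FIELDS = ("policy_number", "effective_date", "document_type")
DATE_FIELDS = ("effective_date", "expiration_date")


def validate_metadata(metadata: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name in REQUIRED_FIELDS:
        if not metadata.get(name):
            errors.append(f"{name}: required")
    for name in DATE_FIELDS:
        value = metadata.get(name)
        if value is None:
            continue  # absence is handled by the required-field check
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            errors.append(f"{name}: expected YYYY-MM-DD")
    return errors
```

A local check like this complements server-side validation and does not replace it, since the dock's configured schema remains the source of truth.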

Listing Artifacts

Retrieve all artifacts in your dock, newest first:
curl https://api.docyard.io/v1/docks/dock_01HQ3K.../artifacts \
  -H "Authorization: Bearer dk_live_a1b2c3d4..."
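The same request can be prepared from Python using only the standard library. This sketch builds the request without sending it; `build_list_request` is an illustrative name, and the dock ID and API key are passed in as parameters because the values shown in this guide are truncated.

```python
import urllib.request

API_BASE = "https://api.docyard.io/v1"


def build_list_request(dock_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a dock's artifact list."""
    url = f"{API_BASE}/docks/{dock_id}/artifacts"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` sends the request. The response body is not parsed here because its shape is not documented in this section.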

Next Steps