Documentation

API reference, DBL language guide, and integration examples. Everything you need to automate benchmarks and integrate ARENA into your workflow.

API Reference

REST API for extractions, documents, datasets, benchmarks, comparisons, and analytics. OpenAPI 3.1.

View API Reference →

DBL Language Guide

Full specification of the Document Benchmark Language. Syntax, blocks, examples, and templates.

Read DBL Spec →

Getting Started

Upload your first document, run your first extraction, and create your first benchmark in under 10 minutes.

Quick Start Guide →

Changelog

What's new, what's fixed, and what's coming next.

View Changelog →

REST API (Professional & Enterprise)

Automate benchmarks, integrate extraction results into CI/CD pipelines, or build custom reporting on top of ARENA's data.

Method	Endpoint	Description
POST	/api/v1/documents/upload	Upload a document
POST	/api/v1/extractions	Run an extraction
GET	/api/v1/extractions/{id}	Get results + companion data
POST	/api/v1/dbl/scripts	Create a DBL script
POST	/api/v1/dbl/scripts/{id}/execute	Execute a benchmark
GET	/api/v1/analytics/accuracy	Query accuracy data
GET	/api/v1/analytics/cost	Query cost data
GET	/api/v1/comparisons/{id}	Get comparison results

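Authentication: OAuth (Google, Microsoft). API keys are supported for programmatic access. The examples below send the credential as a Bearer token in the Authorization header.
Response format: JSON. Every extraction returns structured JSON together with a companion document (the `companion` object in the response) containing timing, token usage, a cost estimate, and a confidence score.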
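
For clients that do not use curl, the same header convention can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `call` are hypothetical, the base URL and the Bearer header are copied from the curl examples on this page, and whether an API key is sent the same way as an OAuth token is an assumption.

```python
import json
import urllib.request

# Host taken from the curl examples on this page.
BASE_URL = "https://arena.docdigitizer.com"


def build_request(method, path, token, body=None):
    """Return an unsent Request with a Bearer token and an optional JSON body."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def call(request):
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building a request does not touch the network; pass it to call() to send it.
req = build_request("GET", "/api/v1/extractions/ext_xyz789", token="example-token")
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the helper easy to unit-test in CI without a live token.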

Run an Extraction via API

bash

# Upload a document
curl -X POST https://arena.docdigitizer.com/api/v1/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@invoice.pdf" \
  -F "dataset_id=ds_invoices"

bash

# Run schema-bounded extraction
curl -X POST https://arena.docdigitizer.com/api/v1/extractions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_abc123",
    "provider": "openai",
    "model": "gpt-4o",
    "mode": "vision",
    "strategy": "schema_bounded",
    "schema_id": "sch_invoice_v2"
  }'

Response:

json

{
  "id": "ext_xyz789",
  "status": "completed",
  "strategy": "schema_bounded",
  "provider": "openai",
  "model": "gpt-4o",
  "mode": "vision",
  "result": {
    "document_type": "invoice",
    "fields": {
      "invoice_number": "INV-2026-0142",
      "date": "2026-03-01",
      "total": 4250.00,
      "currency": "EUR",
      "vendor": "Acme Corp"
    }
  },
  "companion": {
    "duration_ms": 2847,
    "tokens": { "input": 1523, "output": 312 },
    "cost_usd": 0.0234,
    "confidence": 0.97
  }
}
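
The companion object is what makes a response useful for reporting and CI checks. As a rough illustration, the Python sketch below parses a trimmed copy of the sample response and derives a few figures from it. The function names `summarize` and `passes_gate` are hypothetical, the 0.9 confidence threshold is an arbitrary placeholder, and treating `"completed"` as the only successful status is an assumption based on this one sample.

```python
import json

# Trimmed copy of the sample response above.
SAMPLE = """
{
  "id": "ext_xyz789",
  "status": "completed",
  "companion": {
    "duration_ms": 2847,
    "tokens": { "input": 1523, "output": 312 },
    "cost_usd": 0.0234,
    "confidence": 0.97
  }
}
"""


def summarize(extraction):
    """Derive a few per-extraction figures from the companion object."""
    companion = extraction["companion"]
    total_tokens = companion["tokens"]["input"] + companion["tokens"]["output"]
    return {
        "id": extraction["id"],
        "seconds": companion["duration_ms"] / 1000,
        "total_tokens": total_tokens,
        "usd_per_1k_tokens": round(companion["cost_usd"] / total_tokens * 1000, 4),
        "confidence": companion["confidence"],
    }


def passes_gate(extraction, min_confidence=0.9):
    """Example CI check: the extraction completed and confidence meets the threshold."""
    return (
        extraction["status"] == "completed"
        and extraction["companion"]["confidence"] >= min_confidence
    )


extraction = json.loads(SAMPLE)
print(summarize(extraction))
print(passes_gate(extraction))
```

In a pipeline, a failing `passes_gate` could be turned into a non-zero exit code so the build stops on low-confidence extractions.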

DBL at a Glance

DBL scripts are declarative. Define what you want to benchmark, not how to execute it.

Block	Purpose	Example
`documents`	Select documents from a dataset	dataset "invoices" with where filters
`engine`	Define a provider + model + mode	provider "anthropic", model "claude-sonnet-4-20250514"
`run`	Execute extractions	engines [gpt4o, claude], repeat 3, parallel true
`analysis`	Compute metrics from results	metrics [accuracy, duration_ms, cost_usd]
`visualization`	Generate charts	type bar, x_axis engine, y_axis accuracy
`sweep`	Parameter exploration	Vary temperature, mode, or model across runs
`compare`	Cross-engine comparison	Side-by-side analysis of engines

Read Full DBL Specification →
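
To make the declarative idea concrete, the Python sketch below shows the kind of work a `run` block such as `engines [gpt4o, claude], repeat 3` might stand for. It is neither DBL syntax nor a description of how ARENA executes scripts. It assumes that `repeat` means every document and engine pair is extracted that many times; the `expand_run` helper, the engine mapping, and the document ID `doc_def456` are hypothetical.

```python
from itertools import product

# Hypothetical engine definitions, using the provider and model names shown on this page.
ENGINES = {
    "gpt4o": {"provider": "openai", "model": "gpt-4o"},
    "claude": {"provider": "anthropic", "model": "claude-sonnet-4-20250514"},
}


def expand_run(document_ids, engine_names, repeat):
    """List one extraction job per (document, engine, repetition) combination."""
    jobs = []
    for doc_id, name, attempt in product(
        document_ids, engine_names, range(1, repeat + 1)
    ):
        jobs.append(
            {"document_id": doc_id, "engine": name, "attempt": attempt, **ENGINES[name]}
        )
    return jobs


jobs = expand_run(["doc_abc123", "doc_def456"], ["gpt4o", "claude"], repeat=3)
print(len(jobs))  # 2 documents x 2 engines x 3 repeats = 12
```

Under that assumption, the number of extractions grows multiplicatively with documents, engines, and repeats, which is worth keeping in mind when estimating benchmark cost.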

SDKs & Integrations

Python and Node.js SDKs are on the roadmap. In the meantime, the REST API works with any HTTP client.

Python SDK

pip install arena-benchmark (Coming Soon)

Node.js SDK

npm install @arena/sdk (Coming Soon)

GitHub Actions

Run benchmarks in CI (Coming Soon)

Webhooks

Get notified when benchmarks complete (Coming Soon)

Start Building

Book a Demo →

View DBL Reference →