Documentation
API reference, DBL language guide, and integration examples. Everything you need to automate benchmarks and integrate ARENA into your workflow.
API Reference
REST API for extractions, documents, datasets, benchmarks, comparisons, and analytics. OpenAPI 3.1.
View API Reference →
DBL Language Guide
Full specification of the Document Benchmark Language. Syntax, blocks, examples, and templates.
Read DBL Spec →
Getting Started
Upload your first document, run your first extraction, and create your first benchmark in under 10 minutes.
Quick Start Guide →
REST API (Professional & Enterprise)
Automate benchmarks, integrate extraction results into CI/CD pipelines, or build custom reporting on top of ARENA's data.
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/documents/upload | Upload a document |
| POST | /api/v1/extractions | Run an extraction |
| GET | /api/v1/extractions/{id} | Get results + companion data |
| POST | /api/v1/dbl/scripts | Create a DBL script |
| POST | /api/v1/dbl/scripts/{id}/execute | Execute a benchmark |
| GET | /api/v1/analytics/accuracy | Query accuracy data |
| GET | /api/v1/analytics/cost | Query cost data |
| GET | /api/v1/comparisons/{id} | Get comparison results |
Authentication: OAuth (Google, Microsoft). API keys are supported for programmatic access.
Response format: JSON. All extractions return structured JSON with a companion document containing timing, token usage, cost estimates, and confidence scores.
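As a rough illustration of the companion data, the sketch below pulls the timing, token, cost, and confidence values out of a response body. The field names (`companion`, `duration_ms`, `tokens`, `cost_usd`, `confidence`) follow the sample response on this page. The `summarize_companion` helper is hypothetical and not part of any ARENA SDK.

```python
import json


def summarize_companion(response_body: str) -> dict:
    """Flatten the companion block of an extraction response.

    Hypothetical helper: field names are assumed to match the sample
    response shown on this page.
    """
    data = json.loads(response_body)
    companion = data.get("companion", {})
    tokens = companion.get("tokens", {})
    return {
        "extraction_id": data.get("id"),
        # Convert milliseconds to seconds for reporting.
        "duration_s": companion.get("duration_ms", 0) / 1000,
        "total_tokens": tokens.get("input", 0) + tokens.get("output", 0),
        "cost_usd": companion.get("cost_usd"),
        "confidence": companion.get("confidence"),
    }


# Trimmed copy of the sample response, used here as local test data.
sample = json.dumps({
    "id": "ext_xyz789",
    "companion": {
        "duration_ms": 2847,
        "tokens": {"input": 1523, "output": 312},
        "cost_usd": 0.0234,
        "confidence": 0.97,
    },
})

summary = summarize_companion(sample)
print(summary)
```

A flat summary like this is convenient for logging or for a CI check that compares `confidence` or `cost_usd` against a threshold you choose.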
Run an Extraction via API
# Upload a document
curl -X POST https://arena.docdigitizer.com/api/v1/documents/upload \
-H "Authorization: Bearer $TOKEN" \
-F "file=@invoice.pdf" \
-F "dataset_id=ds_invoices"

# Run schema-bounded extraction
curl -X POST https://arena.docdigitizer.com/api/v1/extractions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"document_id": "doc_abc123",
"provider": "openai",
"model": "gpt-4o",
"mode": "vision",
"strategy": "schema_bounded",
"schema_id": "sch_invoice_v2"
}'

Example response:
{
"id": "ext_xyz789",
"status": "completed",
"strategy": "schema_bounded",
"provider": "openai",
"model": "gpt-4o",
"mode": "vision",
"result": {
"document_type": "invoice",
"fields": {
"invoice_number": "INV-2026-0142",
"date": "2026-03-01",
"total": 4250.00,
"currency": "EUR",
"vendor": "Acme Corp"
}
},
"companion": {
"duration_ms": 2847,
"tokens": { "input": 1523, "output": 312 },
"cost_usd": 0.0234,
"confidence": 0.97
}
}

DBL at a Glance
DBL scripts are declarative. Define what you want to benchmark, not how to execute it.
| Block | Purpose | Example |
|---|---|---|
| documents | Select documents from a dataset | dataset "invoices" with where filters |
| engine | Define a provider + model + mode | provider "anthropic", model "claude-sonnet-4-20250514" |
| run | Execute extractions | engines [gpt4o, claude], repeat 3, parallel true |
| analysis | Compute metrics from results | metrics [accuracy, duration_ms, cost_usd] |
| visualization | Generate charts | type bar, x_axis engine, y_axis accuracy |
| sweep | Parameter exploration | Vary temperature, mode, or model across runs |
| compare | Cross-engine comparison | Side-by-side analysis of engines |
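To make the declarative idea concrete, here is a small Python sketch of the work a `run` block such as `engines [gpt4o, claude], repeat 3` might imply. It assumes that `repeat` applies to every document/engine pair, which the table above does not confirm. `expand_run` and the document names are hypothetical and only model the idea; this is not how ARENA schedules extractions.

```python
from itertools import product


def expand_run(documents, engines, repeat):
    """List every (document, engine, attempt) extraction a run could imply.

    Assumption: each document is extracted by each engine `repeat` times.
    """
    return [
        {"document": doc, "engine": engine, "attempt": attempt}
        for doc, engine, attempt in product(
            documents, engines, range(1, repeat + 1)
        )
    ]


# Two hypothetical documents, the two engines from the table, three repeats.
plan = expand_run(["doc_a", "doc_b"], ["gpt4o", "claude"], repeat=3)
print(len(plan))
print(plan[0])
```

Under that assumption, the script author states only the documents, engines, and repeat count; the execution order and parallelism are left to the platform.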
SDKs & Integrations
Python and Node.js SDKs are on the roadmap. In the meantime, the REST API works with any HTTP client.
Python SDK (Coming Soon)
pip install arena-benchmark
Node.js SDK (Coming Soon)
npm install @arena/sdk
GitHub Actions (Coming Soon)
Run benchmarks in CI
Webhooks (Coming Soon)
Get notified when benchmarks complete
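Until the SDKs ship, the extraction call shown with curl on this page can be made from Python's standard library. The sketch below only builds the request, so it can be inspected without network access. The URL, headers, and payload fields mirror the curl example; `build_extraction_request` and the token value are placeholders, not an official client.

```python
import json
import urllib.request

BASE_URL = "https://arena.docdigitizer.com/api/v1"


def build_extraction_request(
    token: str, document_id: str, schema_id: str
) -> urllib.request.Request:
    """Build (but do not send) the POST /extractions request.

    Payload fields are copied from the curl example on this page.
    """
    payload = {
        "document_id": document_id,
        "provider": "openai",
        "model": "gpt-4o",
        "mode": "vision",
        "strategy": "schema_bounded",
        "schema_id": schema_id,
    }
    return urllib.request.Request(
        f"{BASE_URL}/extractions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_extraction_request(
    "example-token", "doc_abc123", "sch_invoice_v2"
)
print(request.get_method(), request.full_url)

# To send it for real (requires a valid token):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes the call easy to unit test and to swap onto another HTTP client later.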