Documentation

API reference, DBL language guide, and integration examples. Everything you need to automate benchmarks and integrate ARENA into your workflow.

01

REST API (Professional & Enterprise)

Automate benchmarks, integrate extraction results into CI/CD pipelines, or build custom reporting on top of ARENA's data.

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/documents/upload` | Upload a document |
| POST | `/api/v1/extractions` | Run an extraction |
| GET | `/api/v1/extractions/{id}` | Get results + companion data |
| POST | `/api/v1/dbl/scripts` | Create a DBL script |
| POST | `/api/v1/dbl/scripts/{id}/execute` | Execute a benchmark |
| GET | `/api/v1/analytics/accuracy` | Query accuracy data |
| GET | `/api/v1/analytics/cost` | Query cost data |
| GET | `/api/v1/comparisons/{id}` | Get comparison results |

Authentication: OAuth (Google, Microsoft); API keys are supported for programmatic access.
Response format: JSON. All extractions return structured JSON with a companion document containing timing, token usage, cost estimates, and confidence scores.

02

Run an Extraction via API

```bash
# Upload a document
curl -X POST https://arena.docdigitizer.com/api/v1/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@invoice.pdf" \
  -F "dataset_id=ds_invoices"
```

```bash
# Run schema-bounded extraction
curl -X POST https://arena.docdigitizer.com/api/v1/extractions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_abc123",
    "provider": "openai",
    "model": "gpt-4o",
    "mode": "vision",
    "strategy": "schema_bounded",
    "schema_id": "sch_invoice_v2"
  }'
```
Response:

```json
{
  "id": "ext_xyz789",
  "status": "completed",
  "strategy": "schema_bounded",
  "provider": "openai",
  "model": "gpt-4o",
  "mode": "vision",
  "result": {
    "document_type": "invoice",
    "fields": {
      "invoice_number": "INV-2026-0142",
      "date": "2026-03-01",
      "total": 4250.00,
      "currency": "EUR",
      "vendor": "Acme Corp"
    }
  },
  "companion": {
    "duration_ms": 2847,
    "tokens": { "input": 1523, "output": 312 },
    "cost_usd": 0.0234,
    "confidence": 0.97
  }
}
```

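The companion block makes quick cost math straightforward. A small illustrative helper (not part of the API) applied to the sample values above:

```python
# Companion values copied verbatim from the sample response above.
companion = {
    "duration_ms": 2847,
    "tokens": {"input": 1523, "output": 312},
    "cost_usd": 0.0234,
    "confidence": 0.97,
}

def cost_per_1k_tokens(c):
    """Blended USD cost per 1,000 tokens (input + output combined)."""
    total_tokens = c["tokens"]["input"] + c["tokens"]["output"]
    return round(c["cost_usd"] / total_tokens * 1000, 4)

print(cost_per_1k_tokens(companion))  # 0.0128 USD per 1k tokens
```

The same arithmetic over many extractions is what the `/api/v1/analytics/cost` endpoint aggregates for you.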
03

DBL at a Glance

DBL scripts are declarative. Define what you want to benchmark, not how to execute it.

| Block | Purpose | Example |
|-------|---------|---------|
| `documents` | Select documents from a dataset | `dataset "invoices"` with `where` filters |
| `engine` | Define a provider + model + mode | `provider "anthropic"`, `model "claude-sonnet-4-20250514"` |
| `run` | Execute extractions | `engines [gpt4o, claude]`, `repeat 3`, `parallel true` |
| `analysis` | Compute metrics from results | `metrics [accuracy, duration_ms, cost_usd]` |
| `visualization` | Generate charts | `type bar`, `x_axis engine`, `y_axis accuracy` |
| `sweep` | Parameter exploration | Vary temperature, mode, or model across runs |
| `compare` | Cross-engine comparison | Side-by-side analysis of engines |

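Putting the blocks together, a complete script might look like the following. This is a sketch assembled only from the example fragments in the table above; the block layout and braces are assumptions, not verified DBL grammar.

```
# Benchmark two engines over the "invoices" dataset (illustrative sketch).
documents {
  dataset "invoices"
}

engine gpt4o {
  provider "openai"
  model "gpt-4o"
}

engine claude {
  provider "anthropic"
  model "claude-sonnet-4-20250514"
}

run {
  engines [gpt4o, claude]
  repeat 3
  parallel true
}

analysis {
  metrics [accuracy, duration_ms, cost_usd]
}

visualization {
  type bar
  x_axis engine
  y_axis accuracy
}
```

A script like this would be created via `POST /api/v1/dbl/scripts` and run via `POST /api/v1/dbl/scripts/{id}/execute`.
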
04

SDKs & Integrations

Python and Node.js SDKs are on the roadmap. In the meantime, the REST API works with any HTTP client.
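Until the SDKs land, a thin wrapper over a standard HTTP library covers the common calls. A hypothetical Python sketch (the `ArenaClient` name and structure are illustrative; only the endpoint paths come from the reference table above):

```python
import json
import urllib.request

API_BASE = "https://arena.docdigitizer.com/api/v1"

class ArenaClient:
    """Minimal illustrative client, not an official SDK.
    Methods build urllib Request objects; send one with
    urllib.request.urlopen(req)."""

    def __init__(self, token, base=API_BASE):
        self.token = token
        self.base = base

    def _build(self, method, path, payload=None):
        # Attach the bearer token and JSON headers to every request.
        data = json.dumps(payload).encode() if payload is not None else None
        return urllib.request.Request(
            self.base + path,
            data=data,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
            method=method,
        )

    def run_extraction(self, document_id, provider, model, **options):
        # POST /extractions -- body fields mirror the curl example in section 02.
        payload = {"document_id": document_id, "provider": provider,
                   "model": model, **options}
        return self._build("POST", "/extractions", payload)

    def get_extraction(self, extraction_id):
        # GET /extractions/{id} -- results + companion data.
        return self._build("GET", f"/extractions/{extraction_id}")

client = ArenaClient("tok_example")
req = client.get_extraction("ext_xyz789")
print(req.get_method(), req.full_url)
```

The same two calls map one-to-one onto the curl examples in section 02, so switching to an official SDK later should be mechanical.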

Python SDK

`pip install arena-benchmark` (Coming Soon)

Node.js SDK

`npm install @arena/sdk` (Coming Soon)

GitHub Actions

Run benchmarks in CI (Coming Soon)

Webhooks

Get notified when benchmarks complete (Coming Soon)