Introduction

DBL — DocDigitizer Benchmark Language

DBL is a declarative domain-specific language for defining and executing document extraction benchmarks. In a single script you declare which documents to process, which extraction engines to compare, how to run them, what metrics to compute, and how to visualize the results.

Your First Script

A DBL script is a benchmark block containing one or more inner blocks. In the simplest form, the two required blocks are documents and engines.

benchmark "My First Benchmark" {

    documents {
        from dataset "my-dataset"
        where document.file_type in ["pdf"]
        limit 20
    }

    engines {
        engine "openai" {
            model "gpt-4o"
            mode "vision"
        }
        engine "anthropic" {
            model "claude-sonnet-4-5"
            mode "text"
        }
    }

    execution {
        runs 3
        mode parallel
        max_concurrent 4
    }

    analysis {
        compute accuracy
        compute timing {
            percentiles [50, 90, 95, 99]
            group_by engine
        }
        compute cost {
            group_by engine
        }
    }

    output {
        graph bar {
            title "Accuracy by Provider"
            x: engine.name
            y: accuracy.score
        }
    }
}

How to Run

  1. Open the DBL Editor from the app sidebar.
  2. Write or paste your script, or choose a template from the Templates Gallery.
  3. Click Validate to check for syntax and semantic errors.
  4. Click Run to start the benchmark execution.
  5. Monitor progress in the Executions panel; results appear as interactive charts.

Language Reference

Block Structure

Every DBL script is a benchmark block with a name string, containing one or more of the following blocks in order:

Block | Required | Purpose
documents | Yes | Select which documents to process from a named dataset
engines | Yes* | Declare one or more extraction engines — required when not using run blocks
run | No | Named execution lane with its own engines and ordered passes (multi-pass mode)
execution | No | Control run count, parallelism, and concurrency limits
analysis | No | Specify which metrics to compute (accuracy, timing, cost)
output | No | Define charts to generate from analysis results
throttle | No | Simulate customer bandwidth constraints during document fetching
sweep | No | Run the benchmark across multiple values of a parameter to find optimal configurations
labels | No | Tag the benchmark (or a run) with metadata labels for filtering and grouping
include | No | Inline a named snippet at the current insertion point

Keywords

All reserved keywords, grouped by their primary block context:

Group | Keywords
Top-level | benchmark, documents, engines, execution, analysis, output
Documents | from, dataset, where, limit
Engines | engine, model, mode
Execution | runs, parallel, sequential, max_concurrent, max_per_engine
Analysis | compute, accuracy, timing, cost, percentiles, group_by
Output / Graph | graph, bar, pie, line, candlestick, line_regression, histogram, normalized_histogram, title, series, x, y, label, value, category, open, high, low, close, values, bins, group_by
Filter operators | in, contains, matches, and, or, not
Literals | true, false
Throttle | throttle, bandwidth
Sweep | sweep
Multi-pass / Run | run, pass, when, empty, include, labels
Prompts / Schemas | prompts, schemas, select, pattern, set, expand
Post-extraction | post_extraction, extract_schemas, validate_extraction, consolidate, schema
Compare | compare, one_to_many, many_to_many, reference, min_engines, normalization, confidence_threshold

Operators

Comparison Operators

Operator | Symbol | Applicable to
Greater than | > | integer, float, size, duration
Less than | < | integer, float, size, duration
Greater or equal | >= | integer, float, size, duration
Less or equal | <= | integer, float, size, duration
Equal | == | any type
Not equal | != | any type

Keyword Operators

Operator | Role | Example
in | Membership test against an array | document.file_type in ["pdf", "png"]
contains | Substring test (string fields) | document.file_name contains "invoice"
matches | Regex test (string fields) | document.file_name matches "^INV-[0-9]+"
and | Logical conjunction (multiple where clauses) | ... and document.file_size > 10kb
or | Logical disjunction | ... or document.category == "image"
not | Logical negation (prefix) | where not document.status == "error"

Literals

Type | Pattern | Examples
String | "..." (double-quoted; supports \\, \", \n, \t) | "invoices", "gpt-4o"
Integer | [0-9]+ | 3, 50, 100
Float | [0-9]+.[0-9]+ | 0.95, 1.0
Size | <int>(kb|mb|gb) | 100kb, 5mb, 2gb
Duration | <int>(s|m|h) | 30s, 5m, 2h
Boolean | true or false | true
Array | [ value, value, ... ] | ["pdf", "png"], [50, 90, 95, 99]

Comments

# This is a line comment

/*
  This is a block comment.
  It can span multiple lines.
*/

benchmark "Example" {
    documents {
        from dataset "my-dataset"  # inline comment
    }
}

Blocks Reference

Detailed syntax and semantics for each block.

documents

Selects which documents to include in the benchmark from a named dataset, with optional filtering and a row limit. This block is required.

documents {
    from dataset "dataset-name"
    where <field> <operator> <value>
    limit <number>
}

Available filter fields

Field | Type | Description
document.file_type | string | File extension, e.g. "pdf", "png", "docx"
document.file_size | size (bytes) | File size; supports size unit literals (100kb, 5mb)
document.file_name | string | Original file name including extension
document.category | string | Broad category: "pdf", "office", "image"
document.page_count | integer | Number of pages (where applicable)
document.status | string | Processing status, e.g. "ready", "error"

Filter examples

documents {
    from dataset "invoices"

    # Type filter with array membership
    where document.file_type in ["pdf", "png"]

    # Size range
    where document.file_size > 100kb
    where document.file_size < 10mb

    # File name substring match
    where document.file_name contains "invoice"

    # Regex pattern
    where document.file_name matches "^INV-[0-9]{4}"

    # Exact status match
    where document.status == "ready"

    # Page count range
    where document.page_count >= 1
    where document.page_count <= 20

    limit 50
}

engines

Declares one or more extraction engines. Each engine maps to a provider and optionally specifies a model ID and extraction mode. This block is required and must contain at least one engine declaration.

engines {
    engine "ProviderName" {
        model "model-id"
        mode "text" | "vision" | "default"
    }
}

Supported providers and models

Provider name | Example model IDs | Modes
"openai" | gpt-4o, gpt-4o-mini, o3-mini | text, vision
"anthropic" | claude-sonnet-4-5, claude-haiku-4-5 | text, vision
"google" | gemini-2.5-flash, gemini-2.5-pro | text, vision
"mistral" | mistral-large-latest, mistral-medium-latest | text, vision
"cohere" | command-r-plus, command-r | text only
"azure" | gpt-4o, gpt-4o-mini (deployment names) | text, vision
"bedrock" | anthropic.claude-3-sonnet, amazon.titan-text-express | text, vision
"docdigitizer" | (no model required) | default

engines {
    engine "openai" {
        model "gpt-4o"
        mode "vision"
    }
    engine "anthropic" {
        model "claude-sonnet-4-5"
        mode "text"
    }
    engine "docdigitizer" {
        mode "default"
    }
}

execution

Controls how many times each document/engine combination is run, and whether runs happen in parallel or sequentially. This block is optional; defaults are shown below.

execution {
    runs <number>                 # default: 1
    mode parallel | sequential    # default: sequential
    max_concurrent <number>       # default: 1 (parallel only)
    max_per_engine <number>       # default: unbounded
}

Property | Default | Description
runs | 1 | Number of times each document is run through each engine
mode | sequential | parallel runs engines concurrently; sequential runs them one at a time
max_concurrent | 1 | Maximum number of simultaneous workers (parallel mode only)
max_per_engine | unbounded | Maximum concurrent workers per engine; must be <= runs

execution {
    runs 5
    mode parallel
    max_concurrent 10
    max_per_engine 3
}

analysis

Specifies which metrics the platform should compute after all extraction runs complete. Results are used by the output block to generate charts. This block is optional.

analysis {
    compute accuracy

    compute timing {
        percentiles [50, 90, 95, 99]
        group_by engine, model
    }

    compute cost {
        group_by engine
    }
}

Metric | What it computes
compute accuracy | Compares extraction output against ground-truth schema; populates accuracy.* properties
compute timing | Aggregates wall-clock durations; supports custom percentile arrays and group-by; populates timing.*
compute cost | Aggregates token-based cost estimates; populates cost.*

output

Defines which charts to render. Each graph directive specifies a chart type and maps domain properties to visual axes. This block is optional.

output {
    graph <type> {
        title "Chart Title"
        <axis>: <domain.property>
    }
}

Property values use dot-access notation referencing domain objects. String values are used only for title.
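
For example, an output block combining the two chart types shown elsewhere in this reference:

output {
    graph bar {
        title "Accuracy by Provider"
        x: engine.name
        y: accuracy.score
    }
    graph pie {
        title "Cost Distribution"
        value: cost.total
        label: engine.name
    }
}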

throttle

Simulates customer bandwidth constraints by limiting the rate at which documents are fetched during a benchmark run. This is useful for testing how extraction engines behave under real-world network conditions. This block is optional; at most one throttle block is allowed per benchmark.

throttle {
    bandwidth 5mb
}

Property | Type | Description
bandwidth | size (bytes/sec) | Maximum download rate for document fetching. Use size unit literals: kb, mb, gb (e.g. 512kb, 5mb, 1gb). Plain integers (raw bytes/sec) are also accepted.

Examples

# Simulate a slow broadband connection (5 MB/s)
throttle {
    bandwidth 5mb
}

# Simulate a constrained mobile connection (512 KB/s)
throttle {
    bandwidth 512kb
}

# Use raw bytes/sec for precise control
throttle {
    bandwidth 1048576   # 1 MB/s
}

A throttle block with no bandwidth directive has no effect and produces a warning. Bandwidth values ≤ 0 are a semantic error (SEM013). Very low bandwidths (below 100 KB/s) produce a warning about potential request timeouts.
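
A sketch of these rules in practice (values illustrative):

# Warning — below 100 KB/s, requests may time out
throttle {
    bandwidth 50kb
}

# Semantic error SEM013 — bandwidth must be positive
throttle {
    bandwidth 0
}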

sweep

Runs the benchmark once for each value in a list, varying a named execution parameter across the runs. Each value produces a separate, independent benchmark execution. This is useful for finding the optimal configuration for concurrency, run counts, or bandwidth limits. This block is optional.

# Syntax
sweep <variable> [<value>, <value>, ...]

Valid variables

Variable | Value type | Description
bandwidth | size (e.g. 1mb) | Sweeps the throttle bandwidth limit — requires size literals
runs | integer | Sweeps the number of repetitions per document/engine pair
max_concurrent | integer | Sweeps the maximum number of simultaneous workers
max_per_engine | integer | Sweeps the per-engine concurrency limit

Examples

# Find the right bandwidth throttle for realistic testing
sweep bandwidth [1mb, 5mb, 10mb, 50mb]

# Measure how result quality changes with more repetitions
sweep runs [1, 3, 5, 10]

# Find the optimal concurrency level
sweep max_concurrent [1, 2, 4, 8, 16]

# Tune per-engine limits
sweep max_per_engine [1, 2, 3]

Each value in the sweep array produces a full, independent benchmark run. All values in a sweep array must be the same type — integers for runs, max_concurrent, and max_per_engine; size literals for bandwidth. Unknown variable names and empty value arrays are semantic errors (SEM014). Sweeps with more than 20 values produce a performance warning.
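
The validity rules can be sketched as follows (threads is an invented example of an unknown variable name):

sweep runs [1, 3, 5]          # valid — all integers
sweep bandwidth [1mb, 5mb]    # valid — all size literals
sweep runs [1, 5mb]           # SEM014 — mixed integer and size values
sweep threads [2, 4]          # SEM014 — unknown variable name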

Multi-Pass Execution

DBL v1.1 introduces run blocks — named execution lanes that contain their own engines and an ordered sequence of passes. Multi-pass benchmarks let you chain extraction steps so that later passes can branch based on the results of earlier ones.

Scripts that do not use run blocks continue to work unchanged — a flat engines block at benchmark level is treated as a single implicit run with one pass.

run

A run block declares a named execution lane. Each benchmark may contain multiple runs. Runs execute in declaration order. Each run specifies its own engines and an ordered list of passes.

run "run-name" {
    labels ["label1", "label2"]

    engine "provider" {
        model "model-id"
        mode "text" | "vision" | "default"
    }

    pass 1 "first-pass" {
        prompts { labels ["categorization"] }
    }

    pass 2 "second-pass" {
        when pass.1.category not empty
        prompts { labels ["schema_bounded"] }
    }
}

Engine selection inside a run

Engines may be declared explicitly, or selected from the model registry by labels:

run "explicit_engine" {
    engine "openai" { model "gpt-4o" mode "vision" }
    engine "anthropic" { model "claude-sonnet-4-20250514" }
}

run "label_based" {
    # Resolves to all active models with ALL listed labels
    engines { labels ["fast", "vision"] }
}

Semantic rules

Code | Rule
SEM021 | Run name must be a non-empty string
SEM021 | Run must contain at least one engine declaration or engine label selection
SEM021 | Duplicate run names within the same benchmark are an error

pass

A pass block is a numbered execution step within a run. Passes run in ascending number order. Each pass may specify its own prompts, schemas, and post_extraction directives. A pass with no when clause always executes. A pass whose when evaluates to false is silently skipped.

pass <number> ["optional-name"] {
    [when <boolean-expression>]
    [prompts { ... }]
    [schemas { ... }]
    [post_extraction { ... }]
}

Examples

# Pass 1 — no condition, always runs
pass 1 "categorize" {
    prompts { labels ["categorization"] }
}

# Pass 2 — runs only when pass 1 returned a category
pass 2 "extract" {
    when pass.1.category not empty and pass.1.confidence > 0.5
    prompts { labels ["schema_bounded"] }
    schemas { labels ["invoices"] expand true }
}

# Pass 3 — fallback when categorization produced no category
pass 3 "fallback" {
    when pass.1.category empty
    prompts { labels ["freewheel"] }
}

Semantic rules

Code | Rule
SEM022 | Pass number must be a positive integer >= 1
SEM022 | Pass numbers must be unique within the same run
SEM022 | A when clause may only reference passes with a lower number (no forward references)
SEM026 | Pass when clause references a pass number that is not lower than the current pass number

when expressions

The when clause is a boolean expression evaluated at runtime for each document, using the JSON result of prior passes as its context. It supports the same operators as where clauses, plus empty and not empty for presence testing.

Pass references

Syntax | Meaning
pass.1.field_name | Access field_name in the result JSON of pass number 1
pass.categorize.field_name | Access field_name in the result JSON of the pass named categorize
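
For instance, assuming pass 1 is named "categorize", these two conditions are equivalent:

when pass.1.category not empty
when pass.categorize.category not empty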

Operators

Operator | Meaning | Example
empty | Field is absent or null | pass.1.category empty
not empty | Field is present and non-null | pass.1.category not empty
== | Equality | pass.1.type == "invoice"
!= | Inequality | pass.1.status != "error"
> < >= <= | Numeric comparison | pass.1.confidence > 0.5
contains | Substring test | pass.1.notes contains "urgent"
matches | Regex test | pass.1.ref matches "^INV-"
in | Array membership | pass.1.type in ["invoice", "receipt"]
and | Logical conjunction | pass.1.category not empty and pass.1.confidence > 0.5
or | Logical disjunction | pass.1.type == "invoice" or pass.1.type == "receipt"
not | Logical negation (prefix) | not pass.1.verified == true

Operator precedence

Precedence from highest to lowest: not > and > or. Use parentheses to override:

# Compound condition
when pass.1.category not empty and pass.1.confidence > 0.5

# Type-based branching
when pass.1.type == "invoice" or pass.1.type == "receipt"

# Parenthesised override
when pass.1.category not empty and (pass.1.confidence > 0.8 or pass.1.manual_review == true)

# Fallback
when pass.1.category empty

Semantic rules

Code | Rule
SEM023 | when expression is invalid, references an undefined pass, or uses a type-incompatible operator

include & snippets

The include directive inserts the content of a named snippet at the current position. Snippets are reusable DBL fragments stored in the platform as scripts of type snippet; they may contain any valid block-level content.

include "snippet-name"

Contextual resolution

The snippet is pasted verbatim at the insertion point. What it may inject depends on where it appears:

benchmark "Multi-pass Test" {
    include "standard_analysis_output"      # injects analysis{} and output{} blocks

    documents { from dataset "Dataset1" }

    run "llm" {
        engine "openai" { model "gpt-4o" }
        include "two_pass_categorize"       # injects pass 1 and pass 2 definitions
    }

    analysis { compute accuracy }           # explicit block wins over the snippet's
}

Override rule

Context | Rule
Benchmark level | If the script and the snippet both declare the same block (e.g. analysis), the explicit block wins and the snippet's version is discarded.
Inside a run block | If both declare pass N for the same number, the explicit pass wins.

Semantic rules

Code | Rule
SEM024 | Snippet name must be a non-empty string; circular includes and depth > 5 are errors

labels & engine labels

labels directive

The labels directive tags the benchmark or a run with metadata strings. Labels are attached to every extraction result produced in that scope and are used for filtering and grouping in analysis and comparison.

benchmark "Invoice Pipeline" {
    labels ["experiment_q1", "invoices"]    # benchmark-level labels

    documents { from dataset "Dataset1" limit 100 }

    run "idp_baseline" {
        labels ["idp"]                      # run-level labels
        engine "docdigitizer" { mode "default" }
    }

    run "llm_two_pass" {
        labels ["llm", "multi_pass"]        # run-level labels
        engine "openai" { model "gpt-4o" mode "vision" }

        pass 1 "categorize" {
            prompts { labels ["categorization"] }
        }
    }
}

Engine selection by labels

Inside a run, engines can be selected from the model registry by label instead of declared explicitly. The executor resolves to all active models matching all listed labels.

run "fast_run" {
    # Selects all active models labelled "fast" AND "vision"
    engines { labels ["fast", "vision"] }

    pass 1 {
        prompts { labels ["categorization"] }
    }
}

Label-based engine selection requires models to be registered and marked active in the Model Registry. Explicit engine declarations and engines { labels [...] } may coexist in the same run.
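
A run combining both selection styles might look like this (label values illustrative):

run "mixed" {
    engine "docdigitizer" { mode "default" }
    engines { labels ["fast", "vision"] }

    pass 1 {
        prompts { labels ["categorization"] }
    }
}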

Full multi-pass LLM vs IDP benchmark example

benchmark "Invoice Pipeline — LLM vs IDP" {
    labels ["experiment_q1", "invoices"]

    documents {
        from dataset "Dataset1"
        where document.file_type in ["pdf"]
        limit 100
    }

    # Run 1: DocDigitizer IDP baseline (single pass)
    run "idp_baseline" {
        labels ["idp"]
        engine "docdigitizer" { mode "default" }
    }

    # Run 2: Two-pass LLM pipeline
    run "llm_two_pass" {
        labels ["llm", "multi_pass"]
        engine "openai" { model "gpt-4o" mode "vision" }
        engine "anthropic" { model "claude-sonnet-4-20250514" }

        pass 1 "categorize" {
            prompts { labels ["categorization"] }
        }

        pass 2 "extract" {
            when pass.1.category not empty and pass.1.confidence > 0.5
            prompts { labels ["schema_bounded"] }
            schemas { labels ["invoices"] expand true }
        }

        pass 3 "fallback" {
            when pass.1.category empty
            prompts { labels ["freewheel"] }
        }
    }

    execution {
        runs 2
        mode parallel
        max_concurrent 6
    }

    schemas {
        labels ["invoices"]
        expand true
    }

    analysis {
        compute accuracy
        compute timing { percentiles [50, 90, 95] group_by engine }
        compute cost { group_by engine }
    }

    output {
        graph bar { title "Accuracy by Engine" x: engine.name y: accuracy.score }
        graph pie { title "Cost Distribution" value: cost.total label: engine.name }
    }

    compare many_to_many { }
}

Domain Objects

These are the valid dot-access paths in filter expressions, group_by clauses, and graph property values. Any path not listed here is a semantic error.

document.*

Path | Type | Description
document.file_type | string | File extension (e.g. "pdf", "png", "docx")
document.file_size | size (bytes) | File size in bytes; supports size unit literals
document.file_name | string | Original file name including extension
document.category | string | Broad category: "pdf", "office", "image"
document.page_count | integer | Number of pages (where applicable)
document.status | string | Processing status string

engine.*

Path | Type | Description
engine.name | string | Provider name (e.g. "openai", "docdigitizer")
engine.model | string | Model identifier (e.g. "gpt-4o")
engine.mode | string | Processing mode: "text", "vision", "default"

extraction.*

Properties of a single extraction run (one document x one engine x one repetition).

Path | Type | Description
extraction.duration_ms | number | Wall-clock extraction time in milliseconds
extraction.input_tokens | integer | Number of input tokens consumed
extraction.output_tokens | integer | Number of output tokens produced
extraction.estimated_cost | number | Estimated monetary cost in USD
extraction.status | string | Result status: "success", "error", "timeout"

timing.* after compute timing

Path | Type | Description
timing.min | number | Minimum duration across all runs (ms)
timing.max | number | Maximum duration across all runs (ms)
timing.avg | number | Mean duration (ms)
timing.p50 | number | 50th percentile / median duration (ms)
timing.p90 | number | 90th percentile duration (ms)
timing.p95 | number | 95th percentile duration (ms)
timing.p99 | number | 99th percentile duration (ms)

Non-standard percentiles requested via percentiles [...] are accessible as timing.p{N} where N is the requested integer.
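
For example, a sketch requesting non-standard percentiles and charting one of them:

analysis {
    compute timing {
        percentiles [25, 75]
        group_by engine
    }
}

output {
    graph bar {
        title "75th Percentile Latency"
        x: engine.name
        y: timing.p75
    }
}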

accuracy.* after compute accuracy

Path | Type | Description
accuracy.score | number | Overall accuracy ratio, 0.0 to 1.0
accuracy.fields_matched | integer | Count of correctly extracted fields
accuracy.fields_total | integer | Total fields in the expected schema

cost.* after compute cost

Path | Type | Description
cost.total | number | Total cost across all runs (USD)
cost.average | number | Average cost per run (USD)
cost.per_document | number | Average cost per document (USD)

Visualization

Graph Types

Each graph type has specific required and optional axis properties. Missing required properties are reported as semantic errors (SEM005).

bar

Grouped bar chart. Best for comparing a metric across providers or models.

Property | Required
x | Yes — categorical axis (e.g. engine.name)
y | Yes — numeric value (e.g. accuracy.score)
title | No — chart title string
series | No — grouping dimension (e.g. engine.model)

graph bar {
    title "Accuracy by Provider"
    x: engine.name
    y: accuracy.score
    series: engine.model
}

pie

Pie/donut chart. Best for showing proportional distribution (e.g. cost share).

Property | Required
value | Yes — numeric slice size
label | Yes — slice label (e.g. engine.name)
title | No

graph pie {
    title "Cost Distribution"
    value: cost.total
    label: engine.name
}

line

Line chart. Best for time-series or ordered comparisons across multiple runs.

Property | Required
x | Yes — x-axis domain
y | Yes — y-axis metric
title | No
series | No — multi-line grouping

graph line {
    title "Latency over Runs"
    x: extraction.duration_ms
    y: timing.avg
    series: engine.name
}

candlestick

Candlestick / box chart. Shows timing distribution (min, p50, p95, max) per provider.

Property | Required
category | Yes — x-axis category
open | Yes — lower quartile / p50
high | Yes — high whisker / p95
low | Yes — low whisker / min
close | Yes — upper quartile / avg
title | No

graph candlestick {
    title "Time Distribution"
    category: engine.name
    open: timing.p50
    high: timing.p95
    low: timing.min
    close: timing.avg
}

line_regression

Scatter plot with regression line. Best for showing correlation (e.g. file size vs latency).

Property | Required
x | Yes — independent variable
y | Yes — dependent variable
series | Yes — color grouping
title | No

graph line_regression {
    title "Size vs Latency"
    x: document.file_size
    y: extraction.duration_ms
    series: engine.name
}

histogram

Vertical bar histogram showing frequency distribution of values. Supports multiple series for comparing distributions across engines or documents.

Property | Required
values | Yes — numeric field to bin (e.g. extraction.duration_ms)
series | No — grouping dimension (e.g. engine.name)
bins | No — integer bucket count (default 10)
title | No — chart title string

graph histogram {
    title "Extraction Time Distribution"
    values: extraction.duration_ms
    series: engine.name
    bins: 15
}

normalized_histogram

For each group (e.g. document), values are normalized to [0, 1] relative to the group's maximum. Shows relative performance independent of document complexity.

Property | Required
values | Yes — numeric field to bin (e.g. extraction.duration_ms)
group_by | Yes — normalization group (e.g. document.file_name)
series | No — comparison dimension (e.g. engine.name)
bins | No — integer bucket count (default 10)
title | No — chart title string

graph normalized_histogram {
    title "Normalized Time Distribution"
    values: extraction.duration_ms
    group_by: document.file_name
    series: engine.name
}

Templates

Templates Gallery

Ready-made scripts to get you started quickly. Click Copy to copy the script source to your clipboard, then paste it into the DBL Editor.

Templates are loaded from the DBL Editor.

Quick Start

Minimal benchmark with two providers and basic accuracy/timing analysis.

Invoice Extraction

PDF invoice extraction across OpenAI, Anthropic, and DocDigitizer.

Vision vs Text

Compare vision and text extraction modes on the same provider.

Cost Analysis

Cost breakdown across all configured providers with group_by engine and model.

Timing Deep Dive

Full percentile timing analysis (p50/p90/p95/p99) with candlestick chart.

Large File Study

Filter documents > 5mb and measure how file size correlates with latency.

Multi-Run Stability

Run each document 10 times to measure extraction consistency over repetitions.

Category Comparison

Separate accuracy benchmarks for PDF, image, and Office document categories.

Patterns

Common Snippets

Copy-paste patterns for the most frequent use cases.

Filter PDFs only

where document.file_type in ["pdf"]

Large files only (over 1 MB)

where document.file_size > 1mb

File name contains keyword

where document.file_name contains "invoice"

Parallel execution with limits

execution {
    runs 3
    mode parallel
    max_concurrent 5
    max_per_engine 2
}

Custom percentile array

compute timing {
    percentiles [25, 50, 75, 90, 95, 99]
    group_by engine
}

Exclude errored documents

where not document.status == "error"

Multi-page documents only

where document.page_count >= 2

File name regex match

where document.file_name matches "^INV-[0-9]+"

Throttle bandwidth (simulate 5 MB/s)

throttle {
    bandwidth 5mb
}

Sweep runs to measure consistency

sweep runs [1, 3, 5, 10]

Sweep bandwidth to find realistic throttle

sweep bandwidth [1mb, 5mb, 10mb, 50mb]

Sweep concurrency to tune parallel execution

sweep max_concurrent [1, 2, 4, 8, 16]

Errors

Error Reference

DBL reports errors with a code, category, line number, and description. The compiler collects all errors before stopping — you will see every issue in a single pass.
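
As an illustrative sketch, a script with several issues would have all of them reported in a single validation pass:

benchmark "Broken" {
    documents {
        from dataset "my-dataset"
        where document.file_name > 100kb   # SEM002 — operator/type mismatch
        limit 0                            # SEM008 — limit must be positive
    }
    # PAR003 — required engines block is missing
}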

Lexical Errors (LEX)

Code | Description
LEX001 | Unterminated string literal — missing closing "
LEX002 | Unknown escape sequence inside a string literal
LEX003 | Unterminated block comment — missing closing */
LEX004 | Unrecognized character in source input

Syntax Errors (PAR)

Code | Description
PAR001 | Expected token not found (e.g. expected { after block name)
PAR002 | Unexpected token inside a block
PAR003 | Missing required block — documents or engines not present
PAR004 | Empty array literal [] — arrays must contain at least one element

Semantic Errors (SEM)

Code | Description
SEM001 | Unknown dot-access path — the property does not exist in the domain model
SEM002 | Operator/type mismatch in where clause (e.g. > on a string field)
SEM003 | Duplicate engine name within the same engines block
SEM004 | Duplicate metric type in analysis (e.g. two compute timing directives)
SEM005 | Missing required graph property for the given graph type
SEM006 | Unknown graph property — not in the allowed set for this graph type
SEM007 | Invalid engine.mode value — must be "text", "vision", or "default"
SEM008 | limit must be a positive integer greater than zero
SEM009 | runs must be a positive integer >= 1
SEM010 | Duplicate property key inside an execution block
SEM011 | percentiles value out of range — each value must be between 1 and 99
SEM012 | Unknown group_by identifier — does not resolve to a valid domain property
SEM013 | bandwidth in a throttle block must be a positive value (> 0)
SEM014 | sweep has an unknown variable name, an empty values array, or mismatched value types
SEM015 | prompts block has an invalid select pattern or empty label values
SEM016 | schemas block has an invalid select/pattern or empty label values
SEM017 | post_extraction directive unknown, duplicate, or has an invalid property
SEM018 | compare block has an invalid mode, missing/invalid property, or reference engine not declared
SEM021 | run block has an empty name, no engine declarations, or a duplicate run name
SEM022 | pass block has an invalid number, a duplicate number within the run, or a forward pass reference in when
SEM023 | when expression is invalid, references an undefined pass, or uses a type-incompatible operator
SEM024 | include has an empty snippet name, a circular include chain, or exceeds maximum include depth (5)
SEM025 | Engine label selection (engines { labels [...] }) has an empty array or non-string values
SEM026 | when clause references a pass number that is not lower than the current pass number