Structures

A structure is a schema that defines what data to extract from a document. You define the fields (name, type, description), and Matil reads the document and fills them in. Structures are the foundation of everything in Matil: every deployment points to a published structure version.

How it works

1. Define a structure in the Matil Dashboard. For example, an invoice structure might have fields like invoice_number, invoice_date, total, and lines.
2. Publish a version. Structures use a draft/publish workflow: edit the draft, then publish when ready.
3. Create a deployment that points to that structure version.
4. Send documents via the API. Matil extracts the data and returns structured JSON.

Example

If your structure defines fields for an invoice, the extracted data in the response looks like:

{
  "invoice_number": "INV-2024-001",
  "invoice_date": "2024-01-15",
  "currency": "EUR",
  "lines": [
    {
      "description": "Consulting services",
      "quantity": 10,
      "unit_price": 125.00,
      "line_total": 1250.00
    }
  ],
  "subtotal": 1250.00,
  "tax_amount": 262.50,
  "total": 1512.50
}

Pricing and pages

Custom structures are billed at 0.10 € per page. What counts as a page depends on the document type:

Document type	What counts as 1 page
PDF	1 PDF page = 1 page
Image	1 image = 1 page
Text	Every 200 words = 1 page (minimum 1)
Spreadsheet	Every 1,500 cells = 1 page (minimum 1)

For example, a 3-page PDF costs 0.30 €. A spreadsheet with 4,500 cells counts as 3 pages (0.30 €). Marketplace structures have their own fixed price per page, shown on each structure’s page in the Marketplace.
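The page rules for custom structures can be sketched in a few lines of Python. This is an illustrative estimate, not Matil's billing code: the function names are hypothetical, and rounding partial pages up (for example, 201 words counting as 2 pages) is an assumption, since the table only states the per-page thresholds and the minimum of 1.

```python
from decimal import Decimal

# Price for custom structures, taken from the text above.
PRICE_PER_PAGE_EUR = Decimal("0.10")

# Units per billable page for document types that are not counted 1:1.
UNITS_PER_PAGE = {"text": 200, "spreadsheet": 1500}


def billable_pages(doc_type: str, units: int) -> int:
    """Estimate billable pages.

    `units` is PDF pages, images, words, or cells depending on `doc_type`.
    ASSUMPTION: partial pages are rounded up.
    """
    if doc_type in ("pdf", "image"):
        return units
    per_page = UNITS_PER_PAGE[doc_type]
    # Ceiling division on integers, with the documented minimum of 1 page.
    return max(1, -(-units // per_page))


def cost_eur(doc_type: str, units: int) -> Decimal:
    """Estimated cost in euros for a custom structure."""
    return PRICE_PER_PAGE_EUR * billable_pages(doc_type, units)


print(cost_eur("pdf", 3))             # 3-page PDF
print(cost_eur("spreadsheet", 4500))  # 4,500 cells
```

Both calls reproduce the 0.30 € figures from the examples above. Marketplace structures are not covered, because their per-page price varies by structure.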

Field types

Data fields

These are the fields that produce values in the output:

Type	Description	Output example
`text`	A single string value. Supports regex validation and allowed values (enum).	`"INV-2024-001"`
`number`	A single numeric value. Supports decimal precision and rounding.	`1250.00`
`boolean`	A true/false value.	`true`
`list_text`	A list of strings.	`["EUR", "USD"]`
`list_number`	A list of numbers.	`[10.0, 20.5]`
`object`	A nested object with its own subfields.	`{"street": "...", "city": "..."}`
`list_object`	A table — a list of rows, each with the same columns.	`[{"description": "...", "quantity": 10}]`

Structural fields

These organize extraction but don’t produce keys in the output:

Type	Description
`group`	Groups fields into an execution unit. Enables parallel extraction, conditional execution, and contextual instructions. Child fields are flattened to the parent namespace in the output.
`structure`	References another structure’s versioned definition. Its fields are flattened into the parent output.
`validation`	An inline rule that verifies extracted data and can trigger automatic corrections or LLM retries.

Groups

Groups let you organize fields into separate extraction units that can run in parallel, improving performance for complex structures. For example, a structure with two groups:

{
  "fields": [
    {
      "type": "group",
      "name": "header",
      "fields": [
        { "type": "text", "name": "invoice_number", "description": "..." },
        { "type": "text", "name": "currency", "description": "..." }
      ]
    },
    {
      "type": "group",
      "name": "line_items",
      "instruction": "Extract line items. Currency is {/currency}.",
      "fields": [
        { "type": "list_object", "name": "lines", "description": "...", "columns": [...] }
      ]
    }
  ]
}

This structure produces a flat output in which group names don’t appear:

{
  "invoice_number": "INV-001",
  "currency": "EUR",
  "lines": [...]
}
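To make the flattening rule concrete, here is a small Python sketch that walks a structure definition and lists the keys that would appear at the top level of the output. The helper `output_keys` is hypothetical and only mirrors the behavior described here; it is not part of any Matil SDK. Resolving `structure` references is left out because it would require the referenced definition.

```python
def output_keys(fields):
    """Return the top-level output keys for a list of field definitions."""
    keys = []
    for field in fields:
        kind = field["type"]
        if kind == "group":
            # Group names are dropped; child fields are lifted to the parent level.
            keys.extend(output_keys(field["fields"]))
        elif kind == "validation":
            # Validation rules verify data and produce no output key.
            continue
        elif kind == "structure":
            # Referenced structures are flattened too, but resolving them
            # needs the referenced definition, which this sketch does not have.
            raise NotImplementedError("structure references are not resolved here")
        else:
            keys.append(field["name"])
    return keys


# Same shape as the example above, with descriptions and columns omitted.
structure = {
    "fields": [
        {"type": "group", "name": "header", "fields": [
            {"type": "text", "name": "invoice_number"},
            {"type": "text", "name": "currency"},
        ]},
        {"type": "group", "name": "line_items", "fields": [
            {"type": "list_object", "name": "lines"},
        ]},
    ]
}

print(output_keys(structure["fields"]))
```

The group names `header` and `line_items` never appear in the result, which matches the flat output shown above.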

Groups can also be conditional (skipped if an expression evaluates to false) and can include instructions with {/path} placeholders that reference data from other groups.
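As a rough mental model of how a {/path} placeholder could be filled in, the sketch below substitutes values from already-extracted data into an instruction string. How Matil actually resolves placeholders is not specified here, so the slash-separated path semantics and the function names `resolve_path` and `render_instruction` are assumptions made for illustration.

```python
import re

# Matches placeholders such as {/currency} or {/customer/name}.
PLACEHOLDER = re.compile(r"\{(/[^{}]+)\}")


def resolve_path(data, path):
    """Walk a slash-separated path through nested dicts (ASSUMED semantics)."""
    value = data
    for part in path.strip("/").split("/"):
        value = value[part]
    return value


def render_instruction(template, data):
    """Replace each {/path} placeholder with the value found in `data`."""
    return PLACEHOLDER.sub(
        lambda match: str(resolve_path(data, match.group(1))),
        template,
    )


extracted = {"invoice_number": "INV-001", "currency": "EUR"}
print(render_instruction("Extract line items. Currency is {/currency}.", extracted))
```

With the header group's output as input, the line_items instruction from the example becomes "Extract line items. Currency is EUR.".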

Computed fields

Any field can be marked as is_computed: true. Instead of being extracted by the LLM, its value is calculated from an expression after extraction. This is useful for derived values such as line totals.
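The idea can be illustrated outside Matil with plain Python: the extracted values are taken as given, and the derived value is calculated afterwards rather than read from the document. Matil's expression syntax is not shown on this page, so the function below is only a stand-in for such an expression, and the name `compute_line_total` is hypothetical.

```python
from decimal import Decimal


def compute_line_total(line):
    """Derive line_total from already-extracted values (illustrative only)."""
    # Convert through str() so float inputs do not carry binary rounding noise.
    quantity = Decimal(str(line["quantity"]))
    unit_price = Decimal(str(line["unit_price"]))
    return quantity * unit_price


# Values taken from the invoice example earlier on this page.
line = {"description": "Consulting services", "quantity": 10, "unit_price": 125.00}
line["line_total"] = compute_line_total(line)
print(line["line_total"])
```

For the example line, 10 units at 125.00 give the same 1250.00 that appears in the sample output.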

Validations

Structures can include validation rules that verify extracted data. When a validation fails, it can trigger programmatic corrections (e.g., recalculating a field) or ask the LLM to re-extract specific fields.

When a field still fails validation, the response has the status completed_with_errors and includes an errors array:

{
  "errors": [
    {
      "path": "/total",
      "message": "Required field missing",
      "code": "REQUIRED_FIELD"
    }
  ],
  "status": "completed_with_errors"
}

Status	Meaning
`completed`	All fields extracted and validated successfully.
`completed_with_errors`	Extraction succeeded, but some fields failed validation. Partial data is available.
`failed`	Processing could not complete.
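On the client side, the three statuses can be handled with a small dispatch. The sketch below assumes the response has already been parsed into a Python dict with the `status` and `errors` keys shown above; the function name `summarize` and the shape of its return value are hypothetical.

```python
def summarize(response):
    """Decide whether extracted data is usable and which paths failed validation."""
    status = response.get("status")
    if status == "completed":
        return {"usable": True, "failed_paths": []}
    if status == "completed_with_errors":
        # Partial data is available; record which fields need attention.
        failed = [error["path"] for error in response.get("errors", [])]
        return {"usable": True, "failed_paths": failed}
    if status == "failed":
        return {"usable": False, "failed_paths": []}
    raise ValueError(f"unexpected status: {status!r}")


# The error response from the example above.
response = {
    "errors": [
        {"path": "/total", "message": "Required field missing", "code": "REQUIRED_FIELD"}
    ],
    "status": "completed_with_errors",
}

print(summarize(response))
```

Treating completed_with_errors as usable but flagged lets a caller keep the partial data while routing the listed paths to manual review.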

Versioning

Structures use a draft/publish workflow:

Draft — Your working copy. Edit freely without affecting live processing.
Published version — A snapshot that deployments can point to. Once published, a version is immutable.

You can publish as many versions as you want. Each deployment chooses which version to use, so you can test a new version on a staging deployment before rolling it to production.

Next steps

Deployments

Entries