Overview

Your app can now see.
Understand anything.
Ship faster.

Your users upload files — blurry scans, crumpled receipts, low-res screenshots, multi-page PDFs. You need structured data. Docex is the vision layer that handles the models, the OCR, the retries, and the schema validation. One API call. No pipeline to maintain.

01 · setup
02 · approve
03 · connect
04 · analyze
Infrastructure

You shouldn't need
a vision team.

Building image-to-data pipelines means evaluating models, handling OCR fallbacks, writing retry logic, and watching costs spiral. Or you lock into one provider and hope it works on every input. Docex is infrastructure — an orchestration engine with an expanding model library that routes each job to the right pipeline, handles failure automatically, and keeps you honest on cost.

Dynamic model selection · Automatic fallback · Structured output · 2× upstream billing · Schema validation · Provider abstraction
Engine

The right model.
Every time.

Docex maintains a catalog of vision and OCR providers. For each job, it selects the optimal pipeline based on your input, budget, and latency requirements. If a provider fails or returns low confidence, it falls back automatically. You describe what you need in plain text. Docex handles the rest.
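To make the routing idea above concrete, here is a minimal sketch of budget-aware selection with automatic fallback. It is not Docex's actual router: the catalog numbers, the cheapest-first ordering, the 0.8 confidence threshold, and the names `rankProviders`, `runWithFallback`, and `stubCall` are all assumptions made for illustration.

```javascript
// Hypothetical catalog. Names echo the diagram on this page; the cost and
// latency figures are invented for illustration.
const catalog = [
  { name: "claude-vision", costUsd: 0.015, latencyMs: 2400 },
  { name: "gpt-4o-ocr", costUsd: 0.01, latencyMs: 1800 },
  { name: "mistral-fallback", costUsd: 0.004, latencyMs: 900 },
];

// Keep providers that fit the budget and latency limits, cheapest first.
function rankProviders(providers, { maxCostUsd, maxLatencyMs }) {
  return providers
    .filter((p) => p.costUsd <= maxCostUsd && p.latencyMs <= maxLatencyMs)
    .sort((a, b) => a.costUsd - b.costUsd);
}

// Try each candidate in order. Move on when a provider throws or reports
// confidence below the threshold.
async function runWithFallback(candidates, callProvider, minConfidence = 0.8) {
  const attempts = [];
  for (const provider of candidates) {
    try {
      const result = await callProvider(provider);
      attempts.push({ provider: provider.name, confidence: result.confidence });
      if (result.confidence >= minConfidence) {
        return { provider: provider.name, data: result.data, attempts };
      }
    } catch (error) {
      attempts.push({ provider: provider.name, error: error.message });
    }
  }
  throw new Error(`All providers failed: ${JSON.stringify(attempts)}`);
}

// Stub standing in for real provider calls: the cheapest provider times out,
// the next one returns low confidence, so the third one is used.
async function stubCall(provider) {
  if (provider.name === "mistral-fallback") throw new Error("timeout");
  if (provider.name === "gpt-4o-ocr") return { confidence: 0.55, data: null };
  return { confidence: 0.93, data: { license_no: "1019388" } };
}

const candidates = rankProviders(catalog, { maxCostUsd: 0.02, maxLatencyMs: 3000 });
runWithFallback(candidates, stubCall).then((r) => console.log(r.provider));
// logs "claude-vision"
```

Cheapest-first is only one possible policy. A real router could just as well weigh latency or past accuracy per document type.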

Input → Router → [Claude Vision | GPT-4o OCR | Mistral Fallback] → JSON

Trade License · Mainland
Legal: ACME LOGISTICS L.L.C
License: 1019388
Issued: 14·03·2024
Expires: 13·03·2026
Activity: Freight forwarding
IMG_4128.HEIC · iPhone 15

BISTRO 42
Truffle Pasta: $24.00
Sparkling: $8.50
Tiramisu: $12.00
Total: $44.50
Thu 09·05·24 · Table 7
receipt.jpg · Pixel 7

await docex.run({
  file: "./uploaded-license.heic",
  prompt: "company name, number, expiry",
});

// → 200 OK · 2.4s · ~$0.03
{
  "legal_name": "ACME LOGISTICS L.L.C",
  "license_no": "1019388",
  "expires_on": "2026-03-13"
}
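
Schema validation is easier to picture with the response above in hand. The following is a minimal sketch of the kind of check that could run on such a payload, not Docex's own validator. The field rules (digits-only license number, `YYYY-MM-DD` expiry) and the names `licenseSchema`, `isIsoDate`, and `validate` are assumptions for illustration.

```javascript
// True for a real calendar date written as YYYY-MM-DD.
function isIsoDate(value) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // The round trip rejects dates that a parser would roll over (e.g. Feb 30).
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

// Hypothetical schema for the trade-license response: one predicate per field.
const licenseSchema = {
  legal_name: (v) => typeof v === "string" && v.trim().length > 0,
  license_no: (v) => typeof v === "string" && /^\d+$/.test(v),
  expires_on: (v) => typeof v === "string" && isIsoDate(v),
};

// Report every missing or invalid field instead of stopping at the first.
function validate(payload, schema) {
  const record = payload ?? {};
  const errors = [];
  for (const [field, check] of Object.entries(schema)) {
    if (!Object.prototype.hasOwnProperty.call(record, field)) {
      errors.push(`${field}: missing`);
    } else if (!check(record[field])) {
      errors.push(`${field}: invalid`);
    }
  }
  return { ok: errors.length === 0, errors };
}

const response = {
  legal_name: "ACME LOGISTICS L.L.C",
  license_no: "1019388",
  expires_on: "2026-03-13",
};
console.log(validate(response, licenseSchema));
// → { ok: true, errors: [] }
```

A failed check like this is the natural trigger for a retry or a fallback to another provider.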
Use Cases

Any image.
Any task.

Standard document extraction is where most tools stop. Docex starts there and keeps going. Same API. Same structured output. Any input you throw at it.

KYC

KYC & Onboarding

IDs, licenses, bank statements, proof-of-address

SEC

Security & Compliance

Email attachments, suspicious screenshots, scan reports

FIN

Finance & Expenses

Invoices, receipts, purchase orders — crumpled, low-light, any angle

MED

Media & Video

Low-res frame grabs, compressed thumbnails, screen captures

LGL

Legal & Contracts

Parties, clauses, signature detection, term extraction

OPS

Logistics & Operations

Shipping labels, waybills, manifests, damage photos

HLT

Healthcare & Forms

Medical records, insurance claims, lab results, handwritten notes

ANY

Your Workflow

The use cases above are the ones we've tested. Docex adapts to any other image-to-data task you have.

Pricing

Predictable cost.
No surprises.

We charge 2× what the upstream provider charges us. No markup games, no opaque credits. For most inputs, that means pennies per request. Dynamic routing helps — when a cheaper OCR model handles the job, you pay less. Drop in five dollars to start. Cancel anytime. No annual contract, no sales call.

  • Billed at 2× upstream cost
  • $5 minimum top-up
  • Mock provider for CI — $0
  • Cancel anytime, no lock-in
Wallet · acme-prod · ● Active
$3.18 ≈ 64 requests
$5.00 last top-up
$1.82 spent this week
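
The pricing arithmetic is simple enough to sketch. The snippet below applies the 2× rule to the figures shown on this page. The $0.015 upstream cost is inferred from the ~$0.03 sample response, the $0.05 average request price is an assumption chosen to match the wallet's "≈ 64 requests", and all function names are hypothetical.

```javascript
const MARKUP = 2; // stated on this page: billed at 2× upstream cost
const MICROS_PER_USD = 1_000_000; // integer micro-dollars avoid float drift

const billedMicros = (upstreamMicros) => upstreamMicros * MARKUP;
const walletBalance = (topUpMicros, spentMicros) => topUpMicros - spentMicros;
const estimateRequests = (balanceMicros, avgBilledMicros) =>
  Math.round(balanceMicros / avgBilledMicros);
const toUsd = (micros) => `$${(micros / MICROS_PER_USD).toFixed(2)}`;

// ~$0.03 billed in the sample response implies ~$0.015 upstream under the 2× rule.
console.log(toUsd(billedMicros(15_000))); // → $0.03

// $5.00 top-up minus $1.82 spent this week.
const balance = walletBalance(5_000_000, 1_820_000);
console.log(toUsd(balance)); // → $3.18

// Assumed average billed price of $0.05 per request (not stated on the page).
console.log(estimateRequests(balance, 50_000)); // → 64
```

Because the bill tracks upstream cost, routing a job to a cheaper model lowers the per-request price directly.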
Deploy

Ship vision.
In five minutes.

Paste this into your coding agent. It wires Docex into your product, scaffolds the endpoint for your stack, and runs a smoke test. You approve one link. Production-ready vision analysis without the production-ready team.

prompt.txt · copy & paste
Wire Docex into this project as the vision analysis layer for
[describe the use case — e.g., "reading trade licenses during user onboarding"].

Take me through the GitHub approval and the $5 wallet top-up, store the
API key in my env, scaffold a server-side analysis endpoint for my
stack, and run a smoke test to confirm the integration works end-to-end.
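
For a sense of what the scaffolded endpoint could look like, here is a minimal, framework-agnostic sketch. It is an illustration under stated assumptions, not the code your agent will generate: `createDocexClient`, `handleAnalyze`, the `DOCEX_API_KEY` variable name, and the mock behavior are all hypothetical. Only the `run({ file, prompt })` call shape comes from the sample on this page.

```javascript
// Hypothetical client factory. With no API key it returns a mock that never
// touches the network, in the spirit of the $0 mock provider for CI.
function createDocexClient(apiKey) {
  if (!apiKey) {
    return {
      mode: "mock",
      run: async ({ prompt }) => ({ mock: true, prompt }),
    };
  }
  return {
    mode: "live",
    run: async () => {
      // Placeholder: the real SDK call would go here.
      throw new Error("Live client not wired up in this sketch");
    },
  };
}

// Takes a parsed request body, returns a status and a JSON body.
// Adapt the signature to whatever framework your stack uses.
async function handleAnalyze(body, client) {
  if (!body || typeof body.file !== "string" || typeof body.prompt !== "string") {
    return { status: 400, json: { error: "file and prompt are required strings" } };
  }
  try {
    const data = await client.run({ file: body.file, prompt: body.prompt });
    return { status: 200, json: data };
  } catch (error) {
    return { status: 502, json: { error: error.message } };
  }
}

// Smoke test. The key is read server-side only; the variable name is assumed.
const client = createDocexClient(process.env.DOCEX_API_KEY);
handleAnalyze(
  { file: "./uploaded-license.heic", prompt: "company name, number, expiry" },
  client
).then((res) => console.log(client.mode, res.status));
```

Keeping the handler free of framework types makes the smoke test a plain function call, which is convenient in CI.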