Question 1

What document types does Anyrow extract from?

Accepted Answer

PDFs (native, scanned, password-protected), DOCX, images (JPEG, PNG, HEIC, WebP, TIFF), emails (EML, MSG, forwarded IMAP), audio (MP3, WAV, M4A, FLAC), video (MP4, MOV), spreadsheets (XLSX, CSV, TSV), JSON, XML, URLs, and ZIP archives. Deterministic parsing handles structured formats at zero LLM cost; the AI path handles everything else.
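
As a rough illustration of the two-path split described above, the sketch below routes a file by its extension. The answer does not say which formats count as "structured", so the extension set here is an assumption, and `choose_path` is a hypothetical helper, not part of any Anyrow SDK.

```python
from pathlib import Path

# Assumption: spreadsheets, JSON, and XML are the "structured" formats
# that can be parsed deterministically. The source does not list them.
STRUCTURED_EXTENSIONS = {".csv", ".tsv", ".xlsx", ".json", ".xml"}


def choose_path(filename: str) -> str:
    """Return "deterministic" for structured formats, "ai" for everything else."""
    extension = Path(filename).suffix.lower()
    return "deterministic" if extension in STRUCTURED_EXTENSIONS else "ai"
```

Under these assumptions, `choose_path("report.CSV")` selects the deterministic path and `choose_path("scan.pdf")` selects the AI path.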

Question 2

How does the schema builder handle typed columns?

Accepted Answer

Define columns as text, number, currency, date, boolean, email, URL, array, multi-select, or media-file. Anyrow enforces these types at extraction time, not just on display. Fields extracted with low confidence are routed to a review queue before they are published. Schema changes (add, rename, type-change, delete) run without table downtime; existing rows are backfilled in the background.
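
To make "enforced on extraction" concrete, here is a minimal sketch of coercing a raw extracted string to a column type and flagging it for review. Everything in it is hypothetical: the `enforce` function, the `FieldResult` shape, the 0.8 threshold, and the choice to send values that fail coercion to review are assumptions for illustration, not Anyrow's actual behavior. Only four of the listed column types are covered.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable


def _to_bool(raw: str) -> bool:
    """Parse a small set of boolean spellings; reject anything else."""
    value = raw.strip().lower()
    if value in {"true", "yes", "1"}:
        return True
    if value in {"false", "no", "0"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Hypothetical coercion table for a subset of the column types.
COERCERS: dict[str, Callable[[str], Any]] = {
    "text": str,
    "number": float,
    "date": date.fromisoformat,
    "boolean": _to_bool,
}

REVIEW_THRESHOLD = 0.8  # assumed cutoff; the real value is not documented here


@dataclass
class FieldResult:
    column: str
    value: Any
    needs_review: bool


def enforce(column: str, column_type: str, raw: str, confidence: float) -> FieldResult:
    """Coerce raw text to the column type and decide whether a human should check it."""
    try:
        value = COERCERS[column_type](raw)
    except ValueError:
        # Assumption: a value that cannot be coerced is held for review.
        return FieldResult(column, raw, needs_review=True)
    return FieldResult(column, value, needs_review=confidence < REVIEW_THRESHOLD)
```

The point of the design is that a bad value is caught before it lands in the table, rather than being stored as text and merely displayed as a number or date.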

Question 3

Is the API stable enough for production workloads?

Accepted Answer

Yes. The REST API and the TypeScript, Python, Go, and Rust SDKs follow an OpenAPI 3.1 spec with versioned endpoints. Webhooks cover the row.created, row.updated, row.deleted, batch.completed, and extraction.failed events, with retries and custom headers. We consider the core extraction and row CRUD endpoints stable; experimental endpoints are flagged in the reference docs.
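
Because webhook deliveries are retried, a receiver may see the same event more than once. The sketch below shows one way to handle that on the receiving side. The event names come from the answer above; the payload fields `id` and `type` and the `WebhookReceiver` class are assumptions, so check the reference docs for the real payload shape.

```python
import json

# Event names as listed in the answer above.
KNOWN_EVENTS = frozenset({
    "row.created",
    "row.updated",
    "row.deleted",
    "batch.completed",
    "extraction.failed",
})


class WebhookReceiver:
    """Hypothetical receiver that ignores unknown events and drops retried duplicates."""

    def __init__(self) -> None:
        # In-memory only; a real service would use durable storage.
        self._seen_ids: set[str] = set()

    def handle(self, body: bytes) -> str:
        event = json.loads(body)
        event_id = event.get("id")      # assumed field name
        event_type = event.get("type")  # assumed field name
        if event_type not in KNOWN_EVENTS:
            return "ignored"
        if event_id in self._seen_ids:
            return "duplicate"
        self._seen_ids.add(event_id)
        return "processed"
```

Returning early on unknown event types keeps the receiver tolerant of events added in later API versions.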

Question 4

Does Anyrow support batch processing of thousands of files?

Accepted Answer

Yes. The Batch API accepts up to 1,000 files per request at a 50% cost discount versus single-file extraction, with a 24-hour SLA. Processing runs in parallel on Cloudflare Workers globally. Live progress streams over SSE (Server-Sent Events), so you see per-file status (queued, extracting, extracted, failed) without polling. Scale and Enterprise plans have higher concurrent batch limits.
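
Two client-side pieces follow from the answer above: splitting a large file list into requests of at most 1,000 files, and reading per-file status from an SSE stream. The sketch below covers both without making any network calls. The helper names and the JSON payload fields (`file`, `status`) are assumptions; only the 1,000-file limit and the status values come from the answer.

```python
import json
from typing import Iterable, Iterator

MAX_FILES_PER_BATCH = 1000  # per-request limit stated above


def chunk_files(paths: list[str], size: int = MAX_FILES_PER_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` paths, one per batch request."""
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event.

    Handles only `data:` fields and blank-line event boundaries;
    comment lines and other fields are skipped.
    """
    data_lines: list[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("data:"):
            # The SSE format allows one optional space after the colon.
            data_lines.append(line[5:].removeprefix(" "))
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []
```

With 2,500 files, `chunk_files` produces three batches of 1,000, 1,000, and 500 paths. `parse_sse` can be fed the line iterator of any streaming HTTP response.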

Common questions about the Anyrow platform

Run your ops on one relational DB, not four tools.