Cartex
Cartex is a sub-item enrichment pipeline for construction document takeoffs.
It receives structured extraction results from Cato-v2 — tables, text notes, and image context extracted from architectural PDF pages — and enriches each schedule row with data pulled from auxiliary tables, general notes, legend diagrams, dimension cards, and multi-label resolution.
How it works
- Extract: Cato-v2 renders a PDF page to an image and sends it to Gemini vision AI, producing an ExtractionResult containing tables and contextual information.
- Enrich: Cartex routes the extraction result through specialist strategies that run concurrently, then merges their outputs into a single list[EnrichedRow].
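The enrich step can be pictured with a minimal sketch. This is not Cartex's actual code: the specialist functions, the `Mark` key, the dict-based rows, and the "later non-empty value wins" merge policy are all assumptions made for illustration. Only the overall shape (concurrent specialists, then one merged list of rows) comes from the description above.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-ins: a row is a plain dict keyed by column name.
Row = dict[str, str]
Specialist = Callable[[list[Row]], Awaitable[list[Row]]]


async def notes_specialist(rows: list[Row]) -> list[Row]:
    """Hypothetical strategy: attach a general note to every row."""
    await asyncio.sleep(0)  # stands in for real I/O or model calls
    return [{"Special Notes": "See general note 3"} for _ in rows]


async def dimension_specialist(rows: list[Row]) -> list[Row]:
    """Hypothetical strategy: look up dimensions by row mark."""
    await asyncio.sleep(0)
    sizes = {"W1": ("36 in", "60 in")}  # made-up auxiliary table
    patches = []
    for row in rows:
        width, height = sizes.get(row["Mark"], ("", ""))
        patches.append({"Width": width, "Height": height})
    return patches


def merge(base: list[Row], patches: list[list[Row]]) -> list[Row]:
    """Apply each specialist's per-row patch; non-empty values overwrite (assumed policy)."""
    merged = [dict(row) for row in base]
    for patch in patches:
        for target, update in zip(merged, patch):
            target.update({k: v for k, v in update.items() if v})
    return merged


async def enrich(rows: list[Row], specialists: list[Specialist]) -> list[Row]:
    # Run every specialist concurrently, then fold their outputs together.
    patches = await asyncio.gather(*(s(rows) for s in specialists))
    return merge(rows, list(patches))


rows = [{"Mark": "W1", "Product": "Casement window"}]
enriched = asyncio.run(enrich(rows, [notes_specialist, dimension_specialist]))
```

Each specialist in this sketch returns one patch per input row, which keeps the merge a simple positional zip. The real merge algorithm may resolve conflicts differently.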
The enriched rows map directly to the columns of a user-defined takeoff template (windows, doors, curtain walls, etc.).
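As a rough illustration of that mapping, the sketch below projects one enriched row onto a template's column order. It assumes a row can be treated as a dict keyed by column name, which may not match the real `EnrichedRow` model. `project_row` is a hypothetical helper, not part of Cartex.

```python
def project_row(row: dict[str, str], columns: list[str]) -> list[str]:
    """Return the row's values in template column order; missing columns become ''."""
    return [row.get(column, "") for column in columns]


columns = ["Product", "Width", "Height", "Quantity"]
row = {"Product": "Casement window", "Width": "36 in", "Height": "60 in"}
cells = project_row(row, columns)
```

Columns the enrichment could not fill stay empty in this sketch rather than raising, so a partially enriched row still lines up with the template.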
Quick start
from main import run
from src.models import UserTableSchema
from src.templates import TemplateType

# Describe the takeoff template and the columns each enriched row should fill.
schema = UserTableSchema(
    template=TemplateType.STANDARD_TAKEOFF,
    columns=["Product", "Operability", "Width", "Height",
             "Quantity", "Location", "Material",
             "Rough Opening Measurements", "Special Notes"],
)

# Run extraction and enrichment on the selected page(s) of the PDF.
result = run("path/to/document.pdf", page_numbers=[0], schema=schema)
Documentation
- Architecture overview — what Cartex is and where it fits
- Pipeline stages — extraction and enrichment in detail
- Specialist routing — how strategies are selected
- Data models — Pydantic models and enums
- Merge algorithm — how specialist outputs combine