Using the Gradio interface

The Gradio interface provides a browser-based way to run the Cartex pipeline against a PDF document and inspect enriched results without writing any code. It is intended for local developer testing and QA runs.

Starting the interface

Launch the interface from the repository root:

python ui/app.py

Gradio starts a local server on port 7860 by default and opens a browser tab automatically. If the browser does not open, navigate to http://localhost:7860.

Note

Run this command from the repository root. Do not set PYTHONPATH=src — Cartex uses src.-prefixed imports and setting that variable creates a double-module identity bug where enum comparisons silently fail.
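The failure mode behind this warning can be reproduced in isolation with a toy module (illustrative only, not Cartex code): when the same file is importable under two module names, Python executes it twice, producing two distinct enum classes whose members never compare equal.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# A module containing a single enum, written to a temp file for the demo.
SOURCE = "import enum\nclass Kind(enum.Enum):\n    TABLE = 'table'\n"

def load_as(name: str, path: Path):
    """Load the file at `path` as a module registered under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    mod_path = Path(tmp) / "kinds.py"
    mod_path.write_text(SOURCE)
    a = load_as("kinds", mod_path)      # imported under one name
    b = load_as("src.kinds", mod_path)  # same file under a second name

# Same source file, but two separate class objects: comparisons
# between "equal" members silently evaluate to False.
assert a.Kind is not b.Kind
assert a.Kind.TABLE != b.Kind.TABLE
```

This is why the interface must be launched from the repository root without PYTHONPATH=src: with both import paths available, one part of the pipeline can hold `src.module.Enum.MEMBER` while another holds `module.Enum.MEMBER`, and every comparison between them fails.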

Uploading a document

Use the Upload PDF file picker to select a PDF document. After upload, the Page Preview renders the first page in the selected page range. The preview updates automatically when the Page Numbers field changes.

Page numbers

The Page Numbers field accepts comma-separated page numbers, ranges, or a combination:

Input    Pages processed
1        Page 1 only
1,3,5    Pages 1, 3, and 5
1-5      Pages 1 through 5 inclusive
1,3-5    Pages 1, 3, 4, and 5

Page numbers are 1-indexed. When multiple pages are specified, the pipeline calls extract_pages(), which runs a multi-page extraction pass that deduplicates context items across pages and merges all detected tables into a single ExtractionResult.
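The accepted syntax can be sketched as a small parser. This is a hypothetical helper for illustration; the actual parsing lives inside the UI code.

```python
def parse_page_numbers(spec: str) -> list[int]:
    """Parse a comma-separated mix of 1-indexed pages and inclusive ranges,
    e.g. "1,3-5" -> [1, 3, 4, 5]. Duplicates are collapsed and the result
    is sorted."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-", 1)
            pages.update(range(int(start), int(end) + 1))  # ranges are inclusive
        elif part:
            pages.add(int(part))
    return sorted(pages)
```

For example, `parse_page_numbers("1,3-5")` yields `[1, 3, 4, 5]`, matching the table above.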

Template selection

The Template dropdown controls which column schema is applied during enrichment. Each option maps to a TemplateType enum value and a fixed base column list defined in src/templates.py.

Display name                TemplateType          Use when
Standard Takeoff            STANDARD_TAKEOFF      Standard window/door schedule with operability, material, and rough opening
Standard Takeoff + TDL/SDL  STANDARD_TAKEOFF_TDL  Standard schedule that also tracks divided light types (Dividers TDL Type, Dividers SDL Type)
Glass Schedule              GLASS_SCHEDULE        Dedicated glass schedules with layer, brand, arrangement, and spacer columns
Shop Details                SHOP_DETAILS          Shop drawing detail sheets with frame profile, hardware, finish, and installation columns
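The dropdown mapping amounts to the following sketch. The enum member names come from the table above; the string values are assumptions patterned on the `glass_schedule` value shown in the debug output, and the real definition lives in src/templates.py.

```python
from enum import Enum

class TemplateType(Enum):
    # Member names from the documented template table; string values
    # are assumed to follow the lowercase "glass_schedule" convention.
    STANDARD_TAKEOFF = "standard_takeoff"
    STANDARD_TAKEOFF_TDL = "standard_takeoff_tdl"
    GLASS_SCHEDULE = "glass_schedule"
    SHOP_DETAILS = "shop_details"

# Dropdown display name -> enum member.
DISPLAY_TO_TEMPLATE = {
    "Standard Takeoff": TemplateType.STANDARD_TAKEOFF,
    "Standard Takeoff + TDL/SDL": TemplateType.STANDARD_TAKEOFF_TDL,
    "Glass Schedule": TemplateType.GLASS_SCHEDULE,
    "Shop Details": TemplateType.SHOP_DETAILS,
}
```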

Additional columns

The Additional Columns checkbox group lets you append fields from FIELD_LIBRARY to the base template columns. FIELD_LIBRARY is the full set of known fields defined in src/templates.py.

Selected columns are appended after the template's default columns in the output. Columns already present in the selected template are silently deduplicated — selecting Special Notes when using Glass Schedule has no effect.
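The append-and-deduplicate behaviour described above can be sketched as follows (a hypothetical helper; the real logic lives in the UI/pipeline code):

```python
def combine_columns(base: list[str], additional: list[str]) -> list[str]:
    """Append additional columns after the template's base columns,
    silently dropping any selection the template already defines.
    Order is preserved: base columns first, then new additions in
    selection order."""
    seen = set(base)
    combined = list(base)
    for column in additional:
        if column not in seen:
            seen.add(column)
            combined.append(column)
    return combined
```

So a selection that overlaps the template, e.g. `combine_columns(["Mark", "Width"], ["Width", "Finish"])`, yields `["Mark", "Width", "Finish"]` with no duplicate.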

Running the pipeline

Click Run Pipeline to start processing. The Pipeline Log shows live progress across three stages:

  1. [1/3] Extraction — tables and context items are detected on the specified pages. The Page Preview updates with coloured bounding-box overlays: blue for main tables, green for auxiliary tables, and orange for context regions.
  2. [2/3] Routing — the router selects which specialist strategies apply to the detected content.
  3. [3/3] Enrichment — selected specialists run and the enriched row count is reported.

On success, the Enriched Table displays one row per extracted schedule row. In addition to the template columns, three diagnostic columns appear:

Column          Contents
_confidence     Numeric confidence score from the enricher
_reasoning      Free-text explanation of how the row was enriched
_field_sources  JSON object mapping each column name to the FieldSource that produced its value (e.g. auxiliary_table, text_rule, image_legend, dimension_card)

If the pipeline fails, an Error Traceback panel appears below the table with the full Python traceback.

Reading debug output

Every successful run writes a JSON file to debug/ in the repository root. The filename format is run_YYYYMMDD_HHMMSS.json.

The debug file has the following top-level fields:

Field       Type       Description
timestamp   string     ISO 8601 timestamp of the run
file        string     Absolute path to the PDF processed
pages       integer[]  1-indexed page numbers processed
template    string     TemplateType value used (e.g. glass_schedule)
columns     string[]   Full column list, including any additional columns
strategies  string[]   Specialist strategy names that fired, or ["monolithic"]
total_rows  integer    Number of enriched rows returned
context     object[]   Serialised TextContextModel and ImageContextModel items from extraction
rows        object[]   Serialised EnrichedRow objects

Each entry in rows contains data (the enriched column values), field_sources (per-column source attribution), confidence, and reasoning. field_sources and reasoning are the primary fields for diagnosing why a column received a particular value.
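A typical inspection workflow can be sketched like this, assuming only the run_*.json naming and the row schema documented above (the helper names are hypothetical):

```python
import json
from pathlib import Path

def latest_run(debug_dir: str = "debug") -> dict:
    """Load the most recent debug artifact; run_YYYYMMDD_HHMMSS.json
    filenames sort lexicographically, which is also chronological."""
    runs = sorted(Path(debug_dir).glob("run_*.json"))
    if not runs:
        raise FileNotFoundError(f"no run_*.json files in {debug_dir}/")
    return json.loads(runs[-1].read_text())

def explain_row(run: dict, index: int) -> list[str]:
    """Summarise why one enriched row got its values, using the
    confidence, reasoning, and field_sources keys of each row entry."""
    row = run["rows"][index]
    lines = [
        f"confidence: {row['confidence']}",
        f"reasoning: {row['reasoning']}",
    ]
    for column, source in sorted(row["field_sources"].items()):
        lines.append(f"{column} <- {source}")
    return lines
```

For example, `print("\n".join(explain_row(latest_run(), 0)))` shows the confidence, reasoning, and per-column provenance for the first enriched row of the most recent run.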

Tip

The debug JSON written to debug/ is the authoritative test artifact for each run. When reporting a result or filing a bug, attach the relevant run_*.json file — it captures the full extraction context, strategy selection, and per-field provenance for every row.