Limitations¶
Cartex has known constraints that affect enrichment coverage and integration readiness. This page groups them into current pipeline limitations and dependencies that must be resolved before porting to Cato-v2.
Current limitations¶
Gemini model dependency¶
Cartex's enrichment prompts target Google Gemini exclusively. All specialist prompts (AUXILIARY_TABLE_ENRICHMENT, LEGEND_ENRICHMENT, etc.) use Gemini's structured JSON output mode with Pydantic schema validation. Switching to a different model (e.g., GPT-4 Vision) would require prompt rewriting and output parsing changes.
Impact. Cartex cannot be deployed in environments where Gemini is unavailable. Cato-v2 deployments that use OpenAI for extraction would run enrichment on a different model than extraction.
Resolution. The enricher maintains its own model configuration independent of the extraction layer. Prompt portability to other models is out of scope for initial porting.
Router sensitivity to extraction quality¶
The Router bases strategy selection on a compact summary — table roles, headers, row counts, sample cell values, and context snippets. Misclassified table roles or missing context items cause the router to omit applicable strategies.
Impact. If extraction assigns OTHER to an auxiliary table, T1 (auxiliary table enrichment) will not fire. If image context lacks interpretation text, T3 and T4 are skipped.
Resolution. Improve extraction quality upstream. The monolithic ENRICHMENT fallback provides a safety net when routing fails, but at lower precision than specialist strategies.
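To make the sensitivity concrete, here is a minimal sketch of the kind of compact summary the Router might see per table. The field names and summary shape are assumptions for illustration, not the actual Router input format:

```python
# Hypothetical sketch of a per-table routing summary. Field names are
# illustrative; the real summary shape may differ.
def build_router_summary(table):
    return {
        "role": table["role"],                 # e.g. "AUXILIARY" or "OTHER"
        "headers": table["headers"],
        "row_count": len(table["rows"]),
        "sample_cells": table["rows"][0] if table["rows"] else [],
        "context_snippets": table.get("context", [])[:3],
    }

# If extraction misclassifies an auxiliary table as OTHER, the router never
# sees the signal it needs, and T1 is omitted from the strategy list.
summary = build_router_summary({
    "role": "OTHER",  # misclassified: should be AUXILIARY
    "headers": ["Mark", "Glazing"],
    "rows": [["W1", "Tempered"]],
})
```

The point is that the Router only ever sees this reduced view, so any upstream misclassification propagates directly into strategy selection.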
Fixed merge priority order¶
The merge algorithm applies a fixed priority: auxiliary_table > image_legend > text_rule > dimension_card > multi_label. The first non-empty value from the highest-priority specialist wins for each column.
Impact. When a lower-priority specialist has a more accurate value for a specific column, the merge cannot prefer it. For example, if T3 (legend) fills Operability correctly but T1 (auxiliary table) fills it incorrectly, T1 wins.
Resolution. Per-column priority overrides could be added to UserTableSchema, but this adds complexity. The current fixed order reflects the most reliable source hierarchy across tested documents.
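The first-non-empty-wins behavior can be sketched as follows. This is an illustrative reduction, not the actual implementation in `_merge_specialist_results()`:

```python
# Minimal sketch of the fixed-priority merge described above. The priority
# order comes from the text; the function shape is an assumption.
PRIORITY = ["auxiliary_table", "image_legend", "text_rule",
            "dimension_card", "multi_label"]

def merge_column(values_by_specialist):
    """Return the first non-empty value from the highest-priority specialist."""
    for specialist in PRIORITY:
        value = values_by_specialist.get(specialist)
        if value:  # first non-empty wins; lower-priority values are ignored
            return value
    return None

# T1 (auxiliary_table) wins even when T3 (image_legend) happens to be correct:
winner = merge_column({"auxiliary_table": "Fixed", "image_legend": "Casement"})
```

Because the loop short-circuits on the first hit, a lower-priority specialist can never override a higher-priority one for the same column, regardless of accuracy.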
Special Notes deduplication is substring-based¶
The merge deduplicates Special Notes fragments using case-insensitive substring matching. If fragment A is a substring of fragment B (or vice versa), the shorter one is dropped.
Impact. Distinct notes that happen to share substrings may be incorrectly deduplicated. For example, "IBC 2406.4" and "IBC 2406.4.1(2)" — the shorter string is a substring of the longer one, so it would be dropped even though both may be independently relevant.
Resolution. Switch to semantic or exact-match deduplication. This is a minor code change in _merge_specialist_results() but requires validation against the test set to avoid note bloat.
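The failure mode can be reproduced with a small sketch of case-insensitive substring deduplication. The function is illustrative; the real logic lives inside `_merge_specialist_results()`:

```python
# Sketch of substring-based deduplication: keep longest fragments first and
# drop any fragment that is a case-insensitive substring of a kept one.
def dedupe_notes(fragments):
    kept = []
    for frag in sorted(fragments, key=len, reverse=True):  # longest first
        low = frag.lower()
        if not any(low in k.lower() for k in kept):
            kept.append(frag)
    return kept

# "IBC 2406.4" is a substring of "IBC 2406.4.1(2)", so it is dropped even
# though both citations may be independently relevant.
notes = dedupe_notes(["IBC 2406.4", "IBC 2406.4.1(2)", "Safety glazing"])
```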
Index-based row IDs disable row recovery¶
When extraction fails to detect a primary_key_column, rows receive index-based IDs (e.g., table_0_0_row_0). The merge algorithm excludes these tables from row recovery to avoid false positives.
Impact. If primary key detection fails, rows that no specialist produces output for are silently absent from the result. The pipeline's guarantee of never dropping rows depends on reliable primary key detection.
Resolution. Improve primary key detection in the extraction stage. Alternatively, add a user-facing override for primary_key_column in UserTableSchema.
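The exclusion rule can be sketched as a simple eligibility check. The ID pattern follows the `table_0_0_row_0` example above; the function names and table shape are assumptions:

```python
# Illustrative eligibility check for row recovery: tables whose rows carry
# index-based IDs (no detected primary key) are skipped to avoid matching
# the wrong rows.
import re

def has_index_based_ids(row_ids):
    """True when every row ID follows the table_<i>_<j>_row_<k> pattern."""
    return all(re.fullmatch(r"table_\d+_\d+_row_\d+", rid) for rid in row_ids)

def eligible_for_row_recovery(table):
    # recovery matches rows by primary key; index IDs would cause false positives
    return (table.get("primary_key_column") is not None
            and not has_index_based_ids(table["row_ids"]))
```

Under this check, a table with `primary_key_column=None` and IDs like `table_0_0_row_0` is excluded, which is exactly the silent-absence case described above.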
Single-document scope¶
Cartex enriches one ExtractionResult at a time. It cannot cross-reference data across separate documents (e.g., a window schedule on sheet A5.1 referencing a glazing spec table on sheet A9.1 that was extracted in a different pipeline run).
Impact. Multi-sheet documents where auxiliary tables and main schedules appear on different pages must be extracted together using extract_pages() to produce a single merged ExtractionResult.
Resolution. The extract_pages() method already supports multi-page extraction with content deduplication. Callers must pass all relevant page numbers in a single run() call.
Operability enum coverage¶
The WindowOperabilityType and DoorOperabilityType enums define a fixed set of allowed values. Specialists must map all operability descriptions to one of these values.
Impact. Unusual or emerging product types (e.g., motorized louvre systems, automated sliding walls) have no matching enum value. The specialist prompt instructs Gemini to pick the closest match and note ambiguity in reasoning, but the output may be misleading.
Resolution. Extend the enums as new product types are encountered. The DoorOperabilityType.SECTIONAL value was added this way after the Fairway VTT test.
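A sketch of the fixed-enum constraint follows. Only `DoorOperabilityType.SECTIONAL` is confirmed by the text; the other members and the coercion helper are assumptions for illustration:

```python
# Illustrative sketch of mapping free-text descriptions onto a fixed enum,
# mirroring the "pick the closest match" instruction in the specialist prompt.
from enum import Enum

class DoorOperabilityType(Enum):
    SWING = "swing"          # assumed member
    SLIDING = "sliding"      # assumed member
    SECTIONAL = "sectional"  # added after the Fairway VTT test (per the text)

def coerce_operability(description, enum_cls, default):
    """Map a description onto the enum; fall back when nothing matches."""
    normalized = description.lower()
    for member in enum_cls:
        if member.value in normalized:
            return member
    return default  # unusual product types have no faithful value

# An emerging product type silently collapses onto the fallback value:
coerced = coerce_operability("motorized louvre system", DoorOperabilityType,
                             DoorOperabilityType.SWING)
```

The fallback keeps output schema-valid but can be misleading, which is why the prompt also asks Gemini to note the ambiguity in its reasoning.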
Porting dependencies¶
These items must be resolved before Cartex can operate within the Cato-v2 production pipeline. See the porting plan for full details.
P0 blocker: Image interpretation is missing
Cato-v2 stores cropped images in S3 but never generates text interpretations. Without ImageContextModel.interpretation, T3 (legend enrichment) and T4 (dimension enrichment) cannot function. A new Gemini vision call must be added for non-schedule evidences. See Image context gap.

P0 blocker: No MAIN vs AUXILIARY table classification
Cato-v2 treats all schedule-type evidences (Window Door Unit, Table) identically. Without role classification, the mapping layer cannot populate TableModel.role, and T1 (auxiliary table enrichment) will misfire or not fire at all. See Risk R1.

P0 blocker: Mapping layer does not exist
No converter exists to translate Cato-v2's TakeOffResult / Evidence structures into Cartex's ExtractionResult. The to_extraction_result() and build_user_table_schema() functions must be built. See Mapping layer design.
Enrichment metadata storage¶
Cato-v2's take_off_result_item table stores a flat JSON result dict per row. Cartex's EnrichedRow adds field_sources, confidence, and reasoning — fields with no current storage location.
Impact. Enrichment provenance and confidence data would be lost on save.
Resolution. Add a cartex_metadata JSON column to take_off_result_item, or store audit data in a linked table.
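A sketch of the audit payload that could go into the proposed cartex_metadata column. The column name comes from the text; the payload shape and field semantics are assumptions based on the EnrichedRow fields listed above:

```python
# Illustrative serialization of EnrichedRow provenance data for storage in
# a cartex_metadata JSON column (payload shape is an assumption).
import json

def build_cartex_metadata(enriched_row):
    return json.dumps({
        "field_sources": enriched_row["field_sources"],  # specialist per column
        "confidence": enriched_row["confidence"],
        "reasoning": enriched_row["reasoning"],
    })

payload = build_cartex_metadata({
    "field_sources": {"Operability": "auxiliary_table"},
    "confidence": 0.92,
    "reasoning": "Matched glazing spec row by Mark",
})
```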
Field name alignment¶
Cartex's UserTableSchema.columns must exactly match the keys in TableModel.rows. If the PromptTemplate defines fields that extraction does not populate (or vice versa), enrichment produces misaligned results.
Impact. Silent data loss — columns present in extraction but absent from the schema are ignored; columns in the schema but absent from extraction remain empty without warning.
Resolution. Add a validation step in the mapping layer that reconciles template field names against actual keys in TakeOffResultItem.result. Log warnings for unmatched fields.
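The proposed validation step could look like the following sketch. The function name and return shape are assumptions; the text only calls for reconciling template field names against actual result keys and logging warnings:

```python
# Sketch of a mapping-layer reconciliation step that surfaces the silent
# mismatches described above (names are illustrative).
import logging

logger = logging.getLogger("cartex.mapping")

def reconcile_fields(schema_columns, result_keys):
    """Compare UserTableSchema columns against actual result keys."""
    schema, keys = set(schema_columns), set(result_keys)
    ignored_by_enrichment = keys - schema   # extracted but ignored
    left_empty = schema - keys              # stay empty without warning
    for name in sorted(ignored_by_enrichment | left_empty):
        logger.warning("Unmatched field: %s", name)
    return ignored_by_enrichment, left_empty

ignored, empty = reconcile_fields(["Mark", "Width", "Operability"],
                                  ["Mark", "Width", "Glazing"])
```

Turning the silent mismatch into logged warnings makes the failure observable without changing enrichment behavior.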
Product Type field requirement¶
For elevation-sourced items, Product Type is derived from the detection model's label name. For schedule-sourced items, it depends on whether the active PromptTemplate includes a Product Type field with extraction rules.
Impact. Door rows may arrive at enrichment without Product Type, degrading enrichment quality for operability and configuration mapping.
Resolution. Ensure the default prompt template always includes Product Type with comprehensive extraction rules.
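The two Product Type sources described above can be sketched as a single lookup. The item shape and key names are assumptions for illustration:

```python
# Illustrative sketch of Product Type derivation per item source
# (item shape and key names are assumptions).
def product_type_for(item):
    if item["source"] == "elevation":
        # elevation-sourced: derived from the detection model's label name
        return item["detection_label"]
    # schedule-sourced: present only if the PromptTemplate extracted it
    return item["result"].get("Product Type")

# A schedule row extracted without a Product Type field arrives as None:
missing = product_type_for({"source": "schedule", "result": {"Mark": "D1"}})
```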