Limitations¶
Cartex has known constraints that affect enrichment coverage and integration readiness. This page groups them into current pipeline limitations and dependencies that must be resolved before porting to Cato-v2.
Current limitations¶
Gemini model dependency¶
Cartex's enrichment prompts target Google Gemini exclusively. All specialist prompts (AUXILIARY_TABLE_ENRICHMENT, LEGEND_ENRICHMENT, etc.) use Gemini's structured JSON output mode with Pydantic schema validation. Switching to a different model (e.g., GPT-4 Vision) would require prompt rewriting and output parsing changes.
Impact. Cartex cannot be deployed in environments where Gemini is unavailable. Cato-v2 deployments that use OpenAI for extraction would run enrichment on a different model than extraction.
Resolution. The enricher maintains its own model configuration independent of the extraction layer. Prompt portability to other models is out of scope for initial porting.
Router sensitivity to extraction quality¶
The Router bases strategy selection on a compact summary of the extraction: table roles, headers, row counts, sample cell values, and context snippets. Misclassified table roles or missing context items cause the Router to omit applicable strategies.
Impact. If extraction assigns OTHER to an auxiliary table, T1 (auxiliary table enrichment) will not fire. If image context lacks interpretation text, T3 and T4 are skipped.
Resolution. Improve extraction quality upstream. The monolithic ENRICHMENT fallback provides a safety net when routing fails, but at lower precision than the specialist strategies.
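The two failure modes above can be illustrated with a small rule-based stand-in for the Router. This is a sketch under stated assumptions, not Cartex's routing logic: `TableSummary`, `ContextSummary`, `select_strategies`, and the `kind` values are hypothetical names, and the fallback is modeled simply as "no specialist strategy selected".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TableSummary:
    role: str          # "MAIN", "AUXILIARY", or "OTHER", as assigned by extraction
    headers: list[str]
    row_count: int


@dataclass
class ContextSummary:
    kind: str                 # hypothetical values: "image" or "text"
    interpretation: str = ""  # interpretation text attached to an image context item


def select_strategies(tables, context_items):
    """Rule-based stand-in for the Router (an assumption, not Cartex's logic)."""
    strategies = []
    # T1 only fires when extraction has labelled at least one table AUXILIARY.
    if any(t.role == "AUXILIARY" for t in tables):
        strategies.append("T1")
    # T3 and T4 need image context that carries interpretation text.
    if any(c.kind == "image" and c.interpretation.strip() for c in context_items):
        strategies += ["T3", "T4"]
    # With nothing selected, fall back to the monolithic strategy.
    return strategies or ["ENRICHMENT"]


# An auxiliary table misclassified as OTHER never triggers T1.
misclassified = [TableSummary(role="OTHER", headers=["Glass Code", "Makeup"], row_count=4)]
print(select_strategies(misclassified, []))  # ['ENRICHMENT']
```

The point of the sketch is that the Router never sees the table itself, only its summary, so a wrong role label upstream is indistinguishable from a document that has no auxiliary table.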
Field-authority calibration risk¶
Cartex now resolves each field via FIELD_AUTHORITY_MATRIX (global rankings plus template overrides). This removes the earlier winner-takes-all behavior at the strategy level, but the quality of the matrix now directly controls outcomes.
Impact. If a field's authority order is mis-ranked for a document family, the resolver can consistently choose a plausible but suboptimal source.
Resolution. Maintain matrix calibration against ground truths per template and keep documented exceptions in the matrix file (src/pipeline/field_authority.yaml).
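A minimal sketch of per-field resolution with global rankings and template overrides follows. The dictionary layout, the source labels (`"T1"`, `"T3"`, `"extraction"`), the template name, and `resolve_field` are all assumptions made for illustration; the real matrix lives in the YAML file named above and may be structured differently.

```python
# Hypothetical in-memory form of the authority matrix (illustrative values only).
GLOBAL_RANKINGS = {
    "Glass Layer": ["T1", "T3", "extraction"],
}
TEMPLATE_OVERRIDES = {
    "example_template": {"Glass Layer": ["T3", "T1", "extraction"]},
}


def resolve_field(field_name, candidates, template=None):
    """Return (source, value) from the highest-ranked source with a non-empty value."""
    override = TEMPLATE_OVERRIDES.get(template, {}).get(field_name)
    order = override or GLOBAL_RANKINGS.get(field_name, [])
    for source in order:
        value = candidates.get(source)
        if value not in (None, ""):
            return source, value
    return None, None


candidates = {"T1": "double", "T3": "triple"}
print(resolve_field("Glass Layer", candidates))                      # ('T1', 'double')
print(resolve_field("Glass Layer", candidates, "example_template"))  # ('T3', 'triple')
```

Both candidate values are plausible, which is the calibration risk: the resolver picks one deterministically from the ranking, so a mis-ranked order yields the same wrong choice on every document in that family.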
Semantic notes synthesis recall drift¶
Special Notes is synthesized by SpecialNotesAdjudicator using batched fast-model semantic consolidation.
Impact. Semantic synthesis improves readability and deduplication but can under-represent low-frequency technical clauses (for example, specific code citations) when multiple observations compete.
Resolution. Keep the deterministic fallback, monitor clause-level recall on benchmark fixtures, and tighten the synthesis prompt constraints for must-preserve fact types.
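Clause-level recall monitoring could look roughly like the check below. It is a sketch, assuming that must-preserve facts can be recognized by regular expressions; the `MUST_PRESERVE` pattern (a few standards-body prefixes followed by a number) and the `clause_recall` helper are hypothetical and would need tuning against real fixtures.

```python
import re

# Hypothetical must-preserve pattern: code citations such as "AS 1288" or "EN 12150".
MUST_PRESERVE = [
    re.compile(r"\b(?:AS|ASTM|BS|EN|ISO)\s?\d+(?:\.\d+)?\b"),
]


def clause_recall(observations, synthesized, patterns=MUST_PRESERVE):
    """Return (recall, missing) for must-preserve facts found in the raw observations."""
    required = set()
    for text in observations:
        for pattern in patterns:
            required.update(pattern.findall(text))
    if not required:
        return 1.0, set()
    # A fact counts as preserved only if it appears verbatim in the synthesized notes.
    missing = {fact for fact in required if fact not in synthesized}
    return 1 - len(missing) / len(required), missing


observations = ["Glazing to comply with AS 1288.", "All frames powder coated."]
recall, missing = clause_recall(observations, "All frames powder coated; glazing per standard.")
print(recall, missing)  # 0.0 {'AS 1288'}
```

A fixture run that reports recall below 1.0 is a candidate for the deterministic fallback, since the synthesized text has dropped a citation that appeared in the source observations.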
Implicit product knowledge gap (brand/model inference)¶
Some field values follow from product knowledge rather than from anything written in the document. Example: inferring uPVC/vinyl from a specific branded system name.
Impact. The model may leave such fields blank or produce generic values because the evidence is not explicit in extracted tables/context.
Resolution. Treat this as a known platform limitation unless a controlled brand-knowledge layer is added (curated lookup table, documented provenance rules, and confidence tagging).
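If such a layer were added, it might take the shape of a curated lookup whose entries carry their own provenance and confidence. The sketch below is purely illustrative: the brand key, the `System` and `Frame Material` field names, and the `BrandFact` and `infer_from_brand` names are placeholders, not real product data or Cartex APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandFact:
    field_name: str
    value: str
    provenance: str  # where the curated fact came from
    confidence: str  # coarse tag, e.g. "low" / "medium" / "high"


# Placeholder entry; a real table would be curated and reviewed.
BRAND_KNOWLEDGE = {
    "examplebrand s70": [
        BrandFact("Frame Material", "uPVC", "curated lookup (placeholder entry)", "medium"),
    ],
}


def infer_from_brand(row):
    """Suggest values for blank fields only; explicit document evidence is never overwritten."""
    system = str(row.get("System", "")).strip().lower()
    return {
        fact.field_name: fact
        for fact in BRAND_KNOWLEDGE.get(system, [])
        if not row.get(fact.field_name)
    }


suggestions = infer_from_brand({"System": "ExampleBrand S70", "Frame Material": ""})
print(suggestions["Frame Material"].value)  # uPVC
```

Returning the whole `BrandFact` rather than a bare value keeps the provenance and confidence tag attached to every inferred field, so inferred values stay distinguishable from extracted ones.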
Field-contract ambiguity in template source¶
Current field contracts for some templates use overlapping semantics (for example, Glass Width and Glass Layer both referencing thickness-like guidance).
Impact. Specialists may map the same evidence into multiple fields, causing structurally consistent but semantically undesired outputs.
Resolution. Clarify field definitions in the template source of truth, then regenerate/inject updated contracts into prompts.
Obscured-row extraction artifacts¶
When schedule rows are visually occluded by overlaid notes or print artifacts, extraction can produce plausible but non-authoritative extra rows. In Kingsbrook-like layouts, this instability is concentrated in the last ~5 rows of the detected main table.
Impact. Additional rows can appear in the output (for example, one run produced W75B), but the hallucinated row IDs and values vary from run to run. This issue remains open.
Resolution. Maintain fixture-level suppression/ignore metadata for reporting while improving table-occlusion handling in extraction. Do not treat a single row ID as the full bug signature.
Index-based row IDs disable row recovery¶
When extraction fails to detect a primary_key_column, rows receive index-based IDs (e.g., table_0_0_row_0). The merge algorithm excludes these tables from row recovery to avoid false positives.
Impact. If primary key detection fails, rows that no specialist produces output for are silently absent from the result. The pipeline's guarantee of never dropping rows depends on reliable primary key detection.
Resolution. Improve primary key detection in the extraction stage. Alternatively, add a user-facing override for primary_key_column in UserTableSchema.
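The exclusion rule can be sketched as follows. The ID pattern is inferred from the single example `table_0_0_row_0`, and the helper names are hypothetical; the actual merge algorithm may detect index-based IDs differently.

```python
import re

# Assumed shape of index-based IDs, inferred from the example "table_0_0_row_0".
INDEX_ID = re.compile(r"^table_\d+_\d+_row_\d+$")


def has_index_based_ids(row_ids):
    return any(INDEX_ID.match(row_id) for row_id in row_ids)


def rows_to_recover(extracted_ids, enriched_ids):
    """Rows missing from specialist output that row recovery may restore."""
    if has_index_based_ids(extracted_ids):
        return []  # table excluded from recovery to avoid false positives
    enriched = set(enriched_ids)
    return [row_id for row_id in extracted_ids if row_id not in enriched]


def silently_absent(extracted_ids, enriched_ids):
    """Rows that would be dropped without recovery; useful for logging a warning."""
    if not has_index_based_ids(extracted_ids):
        return []
    enriched = set(enriched_ids)
    return [row_id for row_id in extracted_ids if row_id not in enriched]


print(rows_to_recover(["W01", "W02"], ["W01"]))                                   # ['W02']
print(silently_absent(["table_0_0_row_0", "table_0_0_row_1"], ["table_0_0_row_0"]))  # ['table_0_0_row_1']
```

A check like `silently_absent` does not restore the rows, but it would at least turn the silent drop into a visible warning until primary key detection improves.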
Single-document scope¶
Cartex enriches one ExtractionResult at a time. It cannot cross-reference data across separate documents (e.g., a window schedule on sheet A5.1 referencing a glazing spec table on sheet A9.1 that was extracted in a different pipeline run).
Impact. Multi-sheet documents where auxiliary tables and main schedules appear on different pages must be extracted together using extract_pages() to produce a single merged ExtractionResult.
Resolution. The extract_pages() method already supports multi-page extraction with content deduplication. Callers must pass all relevant page numbers in a single run() call.
Operability enum coverage¶
The WindowOperabilityType and DoorOperabilityType enums define a fixed set of allowed values. Specialists must map all operability descriptions to one of these values.
Impact. Unusual or emerging product types (e.g., motorized louvre systems, automated sliding walls) have no matching enum value. The specialist prompt instructs Gemini to pick the closest match and note ambiguity in reasoning, but the output may be misleading.
Resolution. Extend the enums as new product types are encountered. The DoorOperabilityType.SECTIONAL value was added this way after the Fairway VTT test.
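One way to find candidates for enum extension is to collect descriptions that match no existing value. In the sketch below only `SECTIONAL` is taken from the text above; the other members, the keyword matching, and `map_operability` are assumptions for illustration, not the real enum or the specialist's mapping logic.

```python
from enum import Enum


class DoorOperabilityType(Enum):
    SWING = "swing"          # hypothetical member
    SLIDING = "sliding"      # hypothetical member
    SECTIONAL = "sectional"  # the only member named in this page


def map_operability(description):
    """Return (member, needs_review); needs_review is True when nothing matches."""
    text = description.lower()
    for member in DoorOperabilityType:
        if member.value in text:
            return member, False
    return None, True


unmatched = [
    d for d in ["Sectional overhead door", "Motorized louvre system"]
    if map_operability(d)[1]
]
print(unmatched)  # ['Motorized louvre system']
```

Note that a naive keyword match would map "automated sliding wall" to the hypothetical `SLIDING` member without any review flag, which is the same misleading-closest-match problem described above in a simpler form.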
Porting dependencies¶
These items must be resolved before Cartex can operate within the Cato-v2 production pipeline. See the porting plan for full details.
P0 blocker: Context item supply depends on user tagging
Cato-v2 does not support automatic context item detection in the version
being integrated with Cartex. Users must manually draw boxes around context
regions (auxiliary tables, legend images, general notes, item card diagrams).
User-tagged evidences are intercepted after step 7 in DrawingAIService and
processed through a Gemini CONTEXT_EXTRACTION call before reaching Cartex.
The valid_types filter in drawing_ai.py:523 is left unchanged. See
Context item integration approach.
P0 blocker: No MAIN vs AUXILIARY table classification
Cato-v2 treats all schedule-type evidences (Window Door Unit, Table)
identically. Without role classification, the mapping layer cannot populate
TableModel.role, and T1 (auxiliary table enrichment) will misfire or not
fire at all. See Risk R1.
P0 blocker: Mapping layer does not exist
No converter exists to translate Cato-v2's TakeOffResult / Evidence
structures into Cartex's ExtractionResult. The to_extraction_result() and
build_user_table_schema() functions must be built. See
Mapping layer design.
Context item quality depends on user diligence¶
Context items are now available through the user tagging workflow described in Option A. The previous limitation — context items being entirely unavailable — is resolved. The remaining constraint is that context item quality depends on user diligence in tagging relevant regions.
Impact. If a user fails to tag an auxiliary table or legend image, that context is absent from the ExtractionResult and the corresponding specialist strategy will not fire. Enrichment coverage varies with the thoroughness of the user's tagging.
Resolution. This is a known and accepted constraint for the initial release. Future autonomous context detection (see Future: autonomous context detection) will remove this dependency on user diligence.
Enrichment metadata storage¶
Cato-v2's take_off_result_item table stores a flat JSON result dict per row. Cartex's EnrichedRow adds field_sources, confidence, and reasoning — fields with no current storage location.
Impact. Enrichment provenance and confidence data would be lost on save.
Resolution. Add a cartex_metadata JSON column to take_off_result_item, or store audit data in a linked table.
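The first option could be sketched as a split between the flat result dict and a serialized metadata payload. The `EnrichedRow` shape shown here is an assumption (only the names `field_sources`, `confidence`, and `reasoning` come from the text), and `to_storage` is a hypothetical helper.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EnrichedRow:
    # Assumed shape; the real EnrichedRow may differ.
    values: dict[str, Any]
    field_sources: dict[str, str] = field(default_factory=dict)
    confidence: dict[str, float] = field(default_factory=dict)
    reasoning: str = ""


def to_storage(row: EnrichedRow) -> dict[str, Any]:
    """Split a row into the existing flat result and a cartex_metadata JSON string."""
    metadata = {
        "field_sources": row.field_sources,
        "confidence": row.confidence,
        "reasoning": row.reasoning,
    }
    return {
        "result": row.values,  # unchanged flat dict, as stored today
        "cartex_metadata": json.dumps(metadata, sort_keys=True),
    }


row = EnrichedRow(
    values={"Mark": "W01", "Glass Layer": "double"},
    field_sources={"Glass Layer": "T1"},
    confidence={"Glass Layer": 0.8},
    reasoning="Glass code resolved from auxiliary table.",
)
stored = to_storage(row)
print(json.loads(stored["cartex_metadata"])["field_sources"])  # {'Glass Layer': 'T1'}
```

Keeping `result` untouched means existing readers of take_off_result_item would be unaffected, while the provenance travels in the extra column.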
Field name alignment¶
Cartex's UserTableSchema.columns must exactly match the keys in TableModel.rows. If the PromptTemplate defines fields that extraction does not populate (or vice versa), enrichment produces misaligned results.
Impact. Data is lost silently: columns present in extraction but absent from the schema are ignored, and columns in the schema but absent from extraction remain empty without warning.
Resolution. Add a validation step in the mapping layer that reconciles template field names against actual keys in TakeOffResultItem.result. Log warnings for unmatched fields.
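A minimal sketch of such a validation step, assuming the template field names and the result keys are both available as plain lists of strings; `reconcile_fields` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("cartex.mapping")  # hypothetical logger name


def reconcile_fields(template_fields, result_keys):
    """Compare template field names with result keys and warn about mismatches."""
    template_set, key_set = set(template_fields), set(result_keys)
    unpopulated = sorted(template_set - key_set)  # in schema, never filled by extraction
    ignored = sorted(key_set - template_set)      # in extraction, dropped by the schema
    for name in unpopulated:
        logger.warning("Template field %r has no matching key in the result", name)
    for name in ignored:
        logger.warning("Result key %r is not defined in the template", name)
    return unpopulated, ignored


unpopulated, ignored = reconcile_fields(
    ["Mark", "Glass Layer", "Product Type"],
    ["Mark", "Glass Layer", "Remarks"],
)
print(unpopulated, ignored)  # ['Product Type'] ['Remarks']
```

The comparison here is exact and case-sensitive, matching the requirement that schema columns exactly match row keys; a looser match would hide precisely the misalignments this step is meant to surface.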
Product Type field requirement¶
For elevation-sourced items, Product Type is derived from the detection model's label name. For schedule-sourced items, it depends on whether the active PromptTemplate includes a Product Type field with extraction rules.
Impact. Door rows may arrive at enrichment without Product Type, degrading enrichment quality for operability and configuration mapping.
Resolution. Ensure the default prompt template always includes Product Type with comprehensive extraction rules.