Data models
All data models are defined in src/models.py as Pydantic BaseModel subclasses with strict validation. Template definitions live in src/templates.py.
Core models
These models represent the primary data structures that flow through the pipeline.
ExtractionResult
The output of the extraction stage. Contains all tables and context items detected on one or more PDF pages.
| Field | Type | Description |
|---|---|---|
| tables | list[TableModel] | All detected tables (main, auxiliary, other) |
| context | list[TextContextModel \| ImageContextModel] | All non-table context items |
TableModel
A single table extracted from a document page.
| Field | Type | Description |
|---|---|---|
| table_id | str | Generated identifier (e.g., table_0_0) |
| role | TableRole | MAIN, AUXILIARY, or OTHER |
| page_number | int | Source page number |
| headers | list[str] | Column headers in order |
| rows | list[dict[str, str]] | Row data keyed by header name |
| confidence | float | Extraction confidence (0.0–1.0) |
| notes | str \| None | Extraction irregularities |
| bbox | BoundingBox \| None | Normalised bounding box on page |
| primary_key_column | str \| None | Header that uniquely identifies rows |
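Because rows are keyed by header name and primary_key_column names the identifying header, a table can be indexed by its key values. The following is a minimal sketch only: it uses a stdlib dataclass as a stand-in for the Pydantic model, and TableSketch, index_rows_by_key, and the sample column names are hypothetical, not part of src/models.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableSketch:
    """Stand-in for TableModel, limited to the fields used below."""
    table_id: str
    headers: list[str]
    rows: list[dict[str, str]]
    primary_key_column: Optional[str] = None


def index_rows_by_key(table: TableSketch) -> dict[str, dict[str, str]]:
    """Map each primary key value to its row (hypothetical helper)."""
    if table.primary_key_column is None:
        return {}  # no key column detected, so nothing to index
    if table.primary_key_column not in table.headers:
        raise ValueError("primary_key_column must be one of the headers")
    return {row[table.primary_key_column]: row for row in table.rows}


# Sample data is invented for illustration.
table = TableSketch(
    table_id="table_0_0",
    headers=["Mark", "Width", "Height"],
    rows=[
        {"Mark": "W1", "Width": "36", "Height": "60"},
        {"Mark": "W2", "Width": "48", "Height": "60"},
    ],
    primary_key_column="Mark",
)
by_mark = index_rows_by_key(table)
print(by_mark["W2"]["Width"])
```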
TextContextModel
A text-based context item (general note, code requirement, spec).
| Field | Type | Description |
|---|---|---|
| context_id | str | Generated identifier |
| type | ContextType | Always TEXT |
| page_number | int | Source page number |
| content | str | Full verbatim text |
| category | ContextCategory | Classification of the text block |
| category_detail | str \| None | Additional category information |
| scope | list[str] \| None | Row IDs this context applies to |
ImageContextModel
An image-based context item (legend diagram, item card, dimension drawing).
| Field | Type | Description |
|---|---|---|
| context_id | str | Generated identifier |
| type | ContextType | Always IMAGE |
| page_number | int | Source page number |
| content | str | File path or placeholder identifier |
| format | str | Image format (e.g., png) |
| dimensions | tuple[int, int] | Image dimensions |
| interpretation | str \| None | Detailed textual description of the image content |
EnrichedRow
The final output of the enrichment stage. One instance per main schedule row.
| Field | Type | Description |
|---|---|---|
| row_id | str | Primary key value from the main schedule |
| data | dict[str, str] | Column values keyed by schema column name |
| field_sources | dict[str, FieldSource] | Source of each populated field |
| confidence | float | Aggregate row confidence from merge winners (minimum winner confidence) |
| reasoning | str \| None | Concatenated reasoning from all specialists |
| validation_flags | list[str] | Post-merge validation/coercion flags (for example enum coercions) |
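The aggregate confidence rule (minimum winner confidence) can be sketched as follows. This is an illustration under assumptions: it uses a stdlib dataclass in place of the Pydantic model, build_row and the shape of the winners mapping are hypothetical, and the fallback of 0.0 for a row with no winners is a guess, not documented behaviour.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class FieldSource(str, Enum):
    MAIN_TABLE = "MAIN_TABLE"
    AUXILIARY_TABLE = "AUXILIARY_TABLE"
    TEXT_CONTEXT = "TEXT_CONTEXT"
    IMAGE_CONTEXT = "IMAGE_CONTEXT"
    UNRESOLVED = "UNRESOLVED"


@dataclass
class EnrichedRowSketch:
    """Stand-in for EnrichedRow with the documented fields."""
    row_id: str
    data: dict[str, str]
    field_sources: dict[str, FieldSource]
    confidence: float
    reasoning: Optional[str] = None
    validation_flags: list[str] = field(default_factory=list)


def build_row(
    row_id: str, winners: dict[str, tuple[str, FieldSource, float]]
) -> EnrichedRowSketch:
    """Assemble a row from per-column merge winners (hypothetical helper).

    winners maps column name to (value, source, confidence).
    """
    data = {col: value for col, (value, _, _) in winners.items()}
    sources = {col: source for col, (_, source, _) in winners.items()}
    # Row confidence is the weakest winner; 0.0 for no winners is an assumption.
    confidence = min((conf for _, _, conf in winners.values()), default=0.0)
    return EnrichedRowSketch(row_id, data, sources, confidence)


# Column names and values are invented for illustration.
row = build_row(
    "W1",
    {
        "Width": ("36", FieldSource.MAIN_TABLE, 0.95),
        "Glass Type": ("Tempered", FieldSource.AUXILIARY_TABLE, 0.72),
    },
)
print(row.confidence)
```

Taking the minimum means a single weakly supported field lowers the confidence of the whole row, which keeps uncertain rows visible for review.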
UserTableSchema
The user-defined output template that controls which columns the pipeline fills.
| Field | Type | Description |
|---|---|---|
| template | TemplateType | Template type (e.g., STANDARD_TAKEOFF) |
| columns | list[str] | Ordered list of target column names |
A model validator ensures Special Notes is always the last column, regardless of where the user places it in the list.
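The reordering rule can be sketched with a dataclass __post_init__ hook standing in for the Pydantic model validator. UserTableSchemaSketch is a hypothetical name, the template is held as a plain string, and the sketch leaves the list unchanged when Special Notes is absent, which is an assumption about behaviour the text does not specify.

```python
from dataclasses import dataclass, field

SPECIAL_NOTES = "Special Notes"


@dataclass
class UserTableSchemaSketch:
    """Stand-in for UserTableSchema; template is a plain string here."""
    template: str
    columns: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Move Special Notes to the end while preserving the order of the rest.
        if SPECIAL_NOTES in self.columns:
            others = [c for c in self.columns if c != SPECIAL_NOTES]
            self.columns = others + [SPECIAL_NOTES]


schema = UserTableSchemaSketch(
    template="STANDARD_TAKEOFF",
    columns=["Mark", "Special Notes", "Width"],  # invented column names
)
print(schema.columns)
```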
FieldContractSpec
Field contracts define per-column semantics and constraints injected into enrichment prompts at runtime.
| Field | Type | Description |
|---|---|---|
| label | str | Human-readable field name |
| definition | str | What the field represents (semantic meaning) |
| constraints | list[str] | Enforceable extraction constraints for that field |
| type_values | list[str] | Optional bounded value set used by enum-aware validation/selection |
| variables | list[str] | Optional template-controlled variables for prompt injection |
Contract injection is intentionally split:
- definition explains meaning and scope.
- constraints carries behavior rules.
This avoids semantic duplication in specialist prompts and keeps field behavior centralized in FIELD_CONTRACT_LIBRARY.
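As an illustration of the split, a contract could be rendered into a prompt fragment with the definition and the constraints kept in separate parts. The sketch below is hypothetical throughout: FieldContractSketch, render_contract_block, the output wording, and the sample contract are invented, and the real prompt format is not shown in this page.

```python
from dataclasses import dataclass, field


@dataclass
class FieldContractSketch:
    """Stand-in for FieldContractSpec (variables omitted for brevity)."""
    label: str
    definition: str
    constraints: list[str] = field(default_factory=list)
    type_values: list[str] = field(default_factory=list)


def render_contract_block(contract: FieldContractSketch) -> str:
    """Render one contract as prompt text (hypothetical format)."""
    # Meaning and scope come from the definition only.
    lines = [f"Field: {contract.label}", f"Definition: {contract.definition}"]
    # Behaviour rules come from the constraints only.
    if contract.constraints:
        lines.append("Constraints:")
        lines.extend(f"- {rule}" for rule in contract.constraints)
    return "\n".join(lines)


contract = FieldContractSketch(
    label="Operability",
    definition="How the unit opens, if at all.",
    constraints=["Use exactly one value.", "Leave blank if not stated."],
)
block = render_contract_block(contract)
print(block)
```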
Core model relationships
The following diagram shows how the core models relate to each other.
classDiagram
    class ExtractionResult {
        tables: list~TableModel~
        context: list~ContextModel~
    }
    class TableModel {
        table_id: str
        role: TableRole
        headers: list~str~
        rows: list~dict~
        primary_key_column: str
    }
    class TextContextModel {
        context_id: str
        type: ContextType
        content: str
        category: ContextCategory
    }
    class ImageContextModel {
        context_id: str
        type: ContextType
        content: str
        interpretation: str
    }
    class EnrichedRow {
        row_id: str
        data: dict~str, str~
        field_sources: dict~str, FieldSource~
        confidence: float
    }
    class UserTableSchema {
        template: TemplateType
        columns: list~str~
    }
    ExtractionResult "1" *-- "many" TableModel
    ExtractionResult "1" *-- "many" TextContextModel
    ExtractionResult "1" *-- "many" ImageContextModel
    ExtractionResult --> EnrichedRow : enriched into
    UserTableSchema --> EnrichedRow : defines columns
Enum types
Pipeline enums
| Enum | Values | Purpose |
|---|---|---|
| TableRole | MAIN, AUXILIARY, OTHER | Classifies table function in the document |
| ContextType | TEXT, IMAGE | Distinguishes text and image context items |
| ContextCategory | GENERAL_NOTE, PERFORMANCE_SPEC, MATERIAL_REQUIREMENT, STRUCTURAL_CRITERIA, CODE_REQUIREMENT, OTHER | Classifies text context content |
| FieldSource | MAIN_TABLE, AUXILIARY_TABLE, TEXT_CONTEXT, IMAGE_CONTEXT, UNRESOLVED | Tracks which source filled each field |
| StrategyType | AUXILIARY_TABLE, TEXT_RULE, IMAGE_LEGEND, DIMENSION_CARD, MULTI_LABEL | Identifies specialist enrichment strategies |
| MatchType | EXACT, FUZZY, RULE_BASED, UNMATCHED | Classification of match quality (used in legacy resolution) |
| ModelType | FAST, ADVANCED | Selects Gemini model tier |
| TemplateType | STANDARD_TAKEOFF, STANDARD_TAKEOFF_TDL, GLASS_SCHEDULE, SHOP_DETAILS | Predefined output templates |
Domain enums
| Enum | Example values | Purpose |
|---|---|---|
| WindowOperabilityType | Casement Single, Direct Set / Picture / Fixed, Awning, Sliding Window, Double Hung | Constrained operability values for windows |
| DoorOperabilityType | Swing Single, Swing Double, Sliding Door, Folding, Pivot | Constrained operability values for doors |
Cartex resolves enum domains through field-level policies (FIELD_ENUM_DOMAIN_LIBRARY) plus contract-defined type_values. Prompt constraints are injected via build_enum_constraints_block(), and merge/validation both consume the same resolved enum domains.
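One way to picture a single resolved domain feeding both prompt constraints and validation is sketched below. Everything in it is an assumption made for illustration: resolve_enum_domain, render_constraints, validate_value, the flag strings, the case-insensitive coercion rule, and the library contents are hypothetical and do not reflect the actual FIELD_ENUM_DOMAIN_LIBRARY or build_enum_constraints_block() signatures.

```python
def resolve_enum_domain(
    field_name: str,
    policy_library: dict[str, list[str]],
    type_values: list[str],
) -> list[str]:
    """Combine field-level policy values with contract type_values, in order."""
    domain = list(policy_library.get(field_name, []))
    for value in type_values:
        if value not in domain:
            domain.append(value)
    return domain


def render_constraints(field_name: str, domain: list[str]) -> str:
    """Prompt-side use of the resolved domain (hypothetical wording)."""
    return f"{field_name} must be one of: " + ", ".join(domain)


def validate_value(value: str, domain: list[str]) -> tuple[str, list[str]]:
    """Validation-side use: return the final value plus any flags."""
    if value in domain:
        return value, []
    # Assumed coercion rule: match ignoring case and surrounding whitespace.
    canonical = {allowed.lower(): allowed for allowed in domain}
    match = canonical.get(value.strip().lower())
    if match is not None:
        return match, [f"enum_coerced:{value}->{match}"]
    return value, [f"enum_out_of_domain:{value}"]


# Hypothetical library entry built from the example values listed above.
library = {"Operability": ["Casement Single", "Awning"]}
domain = resolve_enum_domain("Operability", library, ["Awning", "Double Hung"])
print(render_constraints("Operability", domain))
print(validate_value("awning", domain))
```

Resolving the domain once and passing the same list to both functions is what keeps the prompt and the validator from drifting apart.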
Gemini response models
These models define the structured JSON schemas that Gemini returns. Each maps to a specific pipeline step.
| Model | Used by | Returns |
|---|---|---|
| GeminiTableResult | TABLE_EXTRACTION prompt | list[GeminiTableModel]: detected tables |
| GeminiContextResult | CONTEXT_EXTRACTION prompt | list[GeminiTextContextModel \| GeminiImageContextModel]: context items |
| GeminiRoutingResult | ROUTER prompt | list[StrategyType] + execution_order + context_assignments + reasoning |
| GeminiEnrichedRowResult | Specialist/monolithic enrichment prompts | list[GeminiEnrichedRow]: enriched rows with field sources and optional per-field authority confidence |
| GeminiSpecialNotesSynthesisResult | SpecialNotesAdjudicator | rows[row_id, bullets[]]: semantic bullet synthesis for Special Notes |
| GeminiResolutionResult | FUZZY_MATCHING, SEMANTIC_MATCHING | list[GeminiResolutionModel]: match results (legacy) |
| GeminiCompoundResolutionResult | COMPOUND_RESOLUTION | list[GeminiCompoundResolution]: primary/secondary splits |
| GeminiRuleApplicationResult | RULE_APPLICATION | list[GeminiRuleApplication]: rules mapped to rows |
| GeminiColumnDetectionResult | COLUMN_DETECTION | list[GeminiColumnMatch]: column links between tables |
All Gemini response models inherit from GeminiResponse, which provides confidence (float, 0.0–1.0) and notes (optional string for irregularities).
GeminiEnrichedRow merge-related fields
| Field | Type | Description |
|---|---|---|
| field_sources | dict[str, FieldSource] | Specialist-provided provenance per field |
| field_claim_confidence | dict[str, float] | Optional per-field confidence that the specialist is authoritative for that field |
GeminiRoutingResult fields
| Field | Type | Description |
|---|---|---|
| strategies | list[StrategyType] | Selected enrichment strategies |
| execution_order | list[list[StrategyType]] | Staged execution plan; each inner list is a parallel stage |
| context_assignments | dict[str, list[str]] | Maps strategy name to list of relevant context_id values |
| reasoning | str | Explanation of strategy selection, execution order, and context assignments |
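The staged plan can be read as "run each inner list concurrently, and finish a stage before starting the next". The sketch below shows that reading with a thread pool. It is an assumption-laden illustration: run_staged and fake_specialist are hypothetical, strategies are plain strings rather than StrategyType members, the context_id values are invented, and the real pipeline may schedule work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_staged(
    execution_order: list[list[str]],
    context_assignments: dict[str, list[str]],
    run_strategy: Callable[[str, list[str]], str],
) -> dict[str, str]:
    """Run stages in order; strategies within a stage run in parallel."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor() as pool:
        for stage in execution_order:
            futures = {
                name: pool.submit(
                    run_strategy, name, context_assignments.get(name, [])
                )
                for name in stage
            }
            # Waiting on every future closes the stage before the next begins.
            for name, future in futures.items():
                results[name] = future.result()
    return results


def fake_specialist(name: str, context_ids: list[str]) -> str:
    """Placeholder for a specialist call; returns a summary string."""
    return f"{name} saw {len(context_ids)} context item(s)"


execution_order = [["AUXILIARY_TABLE", "TEXT_RULE"], ["IMAGE_LEGEND"]]
context_assignments = {
    "TEXT_RULE": ["ctx_1"],
    "IMAGE_LEGEND": ["ctx_2", "ctx_3"],
}
results = run_staged(execution_order, context_assignments, fake_specialist)
print(results["IMAGE_LEGEND"])
```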