AI Building-Code Compliance & Permit-Review SaaS
Live in production for a Florida building-code engineering firm. Full-stack, AI-powered platform that replaces days of manual construction-drawing review with supervised AI verification. Ingests construction PDFs, runs AWS Textract OCR + GPT-4o extraction, retrieves exact code passages via ChromaDB RAG, and produces audit-ready compliance reports with real FBC/IBC/NFPA citations across five disciplines.
GPT-4o + RAG — 200+ code checks across five disciplines, every finding cited to a real FBC/IBC/NFPA section. Days of manual review cut to supervised hours.
Permit review means reading hundreds of pages of construction drawings against hundreds of code rules by hand. For AEC firms and Florida code-engineering practices, that process takes days per project, requires deep discipline expertise across Architectural, Structural, Mechanical, Electrical, and Plumbing trades, and leaves no structured audit trail for the permit file.
This platform cuts that from days to hours — and, critically, makes the AI auditable. Every finding the system raises is grounded in a real code passage retrieved by ChromaDB vector search from indexed FBC, IBC, and NFPA standards. The AI cannot invent a rule: it cites, it does not guess. Reviewers verify reasoning instead of trusting a black box, and the output is a fully cited, permit-ready compliance report in both PDF and JSON.
Built on a FastAPI + Celery + PostgreSQL backend and a Next.js 15 frontend, with a six-stage transparent processing pipeline that shows live progress and per-stage cost and enforces a hard billing ceiling per document.
The platform ingests construction PDF drawings, runs AWS Textract OCR and GPT-4o per-page extraction, retrieves the exact relevant code passage via ChromaDB vector search (RAG), and runs 200+ checks across five trades (Architectural, Structural, Mechanical, Electrical, and Plumbing), plus NFPA fire and life-safety codes (NFPA 1, 70, 72, 101).
Every finding links to a real citation — for example, FBC Plumbing 2023, Section 712.1 — so the AI cannot hallucinate a rule. It must reference an indexed passage from the FBC, IBC, or NFPA standards stored in ChromaDB. The output is a fully cited, audit-ready compliance report in PDF for the permit file and JSON for BIM/PM tooling, with every page traceable to the source drawing.
The system is cost-aware by design: page, OCR, and LLM cost is estimated up front with a hard ceiling per document to prevent runaway spend. Reviewers operate in a full human-in-the-loop workflow — they can edit any extracted field and re-run checks before accepting the final report.
Permit Review Is Still a Manual, Unaudited, Days-Long Process
- Manual permit review takes days per project — reading hundreds of pages of construction drawings against hundreds of code rules by hand is the standard process at AEC firms and code-engineering practices. A single project review can take two to five days of senior engineer time, with no structured tracking of which rules were checked or why a finding was raised.
- No audit trail for the permit file — traditional review produces a marked-up PDF or a handwritten checklist, neither of which gives the permit office a structured, traceable record of which code sections were checked, which passed, and which failed. When a drawing is revised, the review process starts from scratch with no reference to what was previously verified.
- AI hallucination makes LLMs unusable for compliance without RAG — off-the-shelf LLMs cannot be trusted for code compliance because they generate plausible-sounding but unverified code citations. A building official cannot accept 'AI says this complies' without a reference to the actual code section. Without grounding every finding in a retrieved passage from an indexed code corpus, AI is a liability rather than an asset in permit review.
- Five disciplines require different expertise and rule sets — Architectural, Structural, Mechanical, Electrical, and Plumbing reviews each require a different body of knowledge, different sections of the FBC, IBC, and NFPA standards, and different extraction logic from construction drawings. A single generic AI cannot handle all five trades without discipline-specific prompts, rule sets, and structured output schemas.
- Large PDF processing is expensive and unpredictable — construction drawing sets can be hundreds of pages and hundreds of megabytes. Sending entire documents to an LLM is cost-prohibitive, technically unreliable, and produces unstructured outputs that cannot be audited. Without per-page processing, cost control, and structured JSON extraction, AI-assisted review is not viable at production scale.
- No human-in-the-loop means no professional accountability — code-compliance decisions carry professional and legal liability. A system that produces a report without allowing a licensed engineer to review, edit, and approve every extracted field and finding is not usable in a regulated permit-review workflow. Reviewers need to verify AI reasoning, not accept its output as final.
- Billing surprises kill production AI tool adoption — AEC firms and code-engineering practices cannot adopt an AI tool that runs unpredictable LLM costs per document. Without up-front cost estimates, live per-stage cost display, and a hard ceiling per document, the tool is a financial risk rather than a productivity gain.
How We Built a Zero-Hallucination AI Permit-Review Platform
The platform is built on an event-driven FastAPI + Celery + PostgreSQL backend with a six-stage document processing pipeline. Every component addresses a specific failure mode of off-the-shelf AI applied to code compliance. The result is a tool that a licensed engineer can stake their professional reputation on — because every output is traceable, cited, and human-approved.
Before
- Senior engineer reads hundreds of drawing pages by hand: 2 to 5 days per project.
- No structured audit trail: marked-up PDFs and handwritten checklists only.
- LLM output not usable for compliance: no code citations, hallucinations unacceptable.
- Different discipline reviewers needed for each trade: no unified workflow.
- No cost visibility: LLM fees unpredictable and potentially unbounded.
- Drawing resubmissions processed again from scratch with no deduplication.
After
- 200+ RAG-grounded checks across five disciplines in supervised hours, not days.
- Every finding cites a real FBC/IBC/NFPA section: permit-ready audit trail in PDF + JSON.
- ChromaDB retrieves the exact code passage before GPT-4o raises any flag: zero hallucination.
- Discipline-specific prompts, extraction logic, and rule sets for all five trades in one platform.
- Up-front cost estimate, live per-stage cost display, and hard ceiling per document.
- SHA-256 dedup on upload: duplicate drawings rejected before any processing begins.
The Platform Modules — Six-Stage Pipeline, Five Disciplines, Zero Hallucination
PDF Ingestion & Upload
Streaming upload handler for PDFs up to 500 MB. Files streamed to disk in 5 MB chunks with magic-byte validation, MIME sniffing, and SHA-256 content hashing for deduplication. Duplicate submissions rejected before any processing or billing begins.
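The ingestion steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the production handler: the function name `stream_to_disk`, the `UploadRejected` exception, and the in-memory `known_hashes` set (standing in for the database duplicate lookup) are all assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import BinaryIO

CHUNK_SIZE = 5 * 1024 * 1024   # 5 MB chunks, as described above
MAX_SIZE = 500 * 1024 * 1024   # 500 MB upload limit
PDF_MAGIC = b"%PDF-"           # leading bytes of a PDF file


class UploadRejected(Exception):
    """Hypothetical error raised when validation fails or the file is a duplicate."""


def stream_to_disk(source: BinaryIO, dest: Path, known_hashes: set[str]) -> str:
    """Copy `source` to `dest` chunk by chunk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    total = 0
    first = True
    try:
        with dest.open("wb") as out:
            while chunk := source.read(CHUNK_SIZE):
                if first:
                    # Check the magic bytes before trusting the file extension.
                    if not chunk.startswith(PDF_MAGIC):
                        raise UploadRejected("not a PDF (magic bytes mismatch)")
                    first = False
                total += len(chunk)
                if total > MAX_SIZE:
                    raise UploadRejected("file exceeds the 500 MB limit")
                digest.update(chunk)
                out.write(chunk)
        if first:
            raise UploadRejected("empty upload")
        content_hash = digest.hexdigest()
        # `known_hashes` stands in for the database lookup of earlier submissions.
        if content_hash in known_hashes:
            raise UploadRejected("duplicate of an existing submission")
    except UploadRejected:
        dest.unlink(missing_ok=True)  # never keep a rejected file on disk
        raise
    known_hashes.add(content_hash)
    return content_hash
```

Hashing while streaming means the file is read only once, and the duplicate check can run before any OCR or LLM cost is incurred.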
OCR Engine (AWS Textract)
AWS Textract processes each page and returns structured blocks (lines, words, tables, key-value pairs). Output stored as JSONB in PostgreSQL per page for downstream extraction. Textract cost tracked and displayed per page in the pipeline UI.
Schedule & Table Detection
Dedicated stage detects title blocks, revision schedules, and data tables in the OCR output. Schedule data is extracted into structured records for cross-referencing during code checks. Table rows are parsed into typed JSON for equipment and specification tables.
Equipment Parser
Parses mechanical schedules, panel schedules, fixture schedules, and structural member tables into typed equipment records. Each record includes extracted values, confidence scores, and the source page reference for traceability.
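As a rough illustration of what a typed equipment record might look like, the sketch below models extracted values, confidence scores, and the source page with plain dataclasses. The class names, field names, and the 0.80 review threshold are assumptions for the example; the source does not state the actual schema or cut-off.

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off, not taken from the source


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0


@dataclass
class EquipmentRecord:
    tag: str          # equipment tag read from a schedule row
    source_page: int  # drawing page the row came from, kept for traceability
    fields: list[ExtractedField] = field(default_factory=list)

    def needs_review(self, threshold: float = REVIEW_THRESHOLD) -> list[str]:
        """Return the names of fields whose confidence is below the threshold."""
        return [f.name for f in self.fields if f.confidence < threshold]
```

A record built this way can report which of its fields a reviewer should look at first, for example `record.needs_review()`.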
ChromaDB RAG Engine
Indexes FBC (Florida Building Code), IBC (International Building Code), and NFPA 1/70/72/101 standards as vector embeddings. At check time, the relevant code passage is retrieved before GPT-4o generates any finding — grounding every flag in a real cited text.
Compliance Check Engine
Runs 200+ discipline-specific checks across Architectural, Structural, Mechanical, Electrical, and Plumbing trades. Each check retrieves its code passage from ChromaDB, runs GPT-4o against the extracted drawing data, and records the finding with citation, confidence, and source reference.
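One way to picture the citation requirement is as a gate between the model output and the database: a proposed finding is kept only if its citation matches a passage that was actually retrieved for that check. The sketch below assumes hypothetical `Passage` and `Finding` types and an `accept_finding` helper; it does not call ChromaDB or GPT-4o and is not the platform's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    citation: str  # e.g. "FBC Plumbing 2023, Section 712.1"
    text: str


@dataclass(frozen=True)
class Finding:
    check_id: str
    citation: str
    passage_excerpt: str
    severity: str
    source_page: int


def accept_finding(
    check_id: str,
    proposed_citation: str,
    severity: str,
    source_page: int,
    retrieved: list[Passage],
) -> Optional[Finding]:
    """Return a Finding only when the citation matches a retrieved passage."""
    by_citation = {p.citation: p for p in retrieved}
    passage = by_citation.get(proposed_citation.strip())
    if passage is None:
        return None  # no citation, no finding
    return Finding(
        check_id=check_id,
        citation=passage.citation,
        passage_excerpt=passage.text[:200],
        severity=severity,
        source_page=source_page,
    )
```

Because the excerpt is copied from the retrieved passage rather than from the model output, the text shown to the reviewer always comes from the indexed corpus.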
Human-in-the-Loop Review
Reviewers inspect per-page extraction results, edit any field, override confidence scores, and approve or reject individual findings before the final report is generated. Every edit is logged with timestamp and reviewer identity for the audit trail.
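The review step can be thought of as an overlay of reviewer corrections on top of the raw extraction, with every change appended to an audit log. The following is a simplified sketch under that assumption; `PageExtraction` and `AuditEntry` are illustrative names, and a real system would persist these records rather than hold them in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    reviewer: str
    field_name: str
    old_value: str
    new_value: str
    at: datetime


class PageExtraction:
    """Raw AI output plus reviewer corrections, kept separately."""

    def __init__(self, raw: dict[str, str]) -> None:
        self._raw = dict(raw)
        self._corrections: dict[str, str] = {}
        self.audit_log: list[AuditEntry] = []

    def edit(self, reviewer: str, field_name: str, new_value: str) -> None:
        """Record a correction and log who changed what, and when."""
        old_value = self.approved().get(field_name, "")
        self._corrections[field_name] = new_value
        self.audit_log.append(
            AuditEntry(reviewer, field_name, old_value, new_value, datetime.now(timezone.utc))
        )

    def raw(self) -> dict[str, str]:
        """The untouched model output."""
        return dict(self._raw)

    def approved(self) -> dict[str, str]:
        """Raw values with reviewer corrections applied on top."""
        return {**self._raw, **self._corrections}
```

Checks would then read from `approved()` so that a human correction always takes precedence over the raw model output.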
Report Generator (PDF + JSON)
Generates a printable branded PDF compliance report for the permit file and a complete structured JSON export for BIM/PM tooling. Every finding links to its source page, extracted value, and cited code section. Built with ReportLab and run as a background Celery task.
Cost Tracker & Billing Controls
Estimates page count, OCR cost, and LLM token cost before processing begins. Tracks actual spend per stage in real time and enforces a configurable hard ceiling per document. Live cost display on the pipeline progress UI prevents billing surprises.
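A minimal sketch of the estimate-then-enforce idea follows. The per-page and per-token rates are placeholders chosen for the arithmetic, not real AWS Textract or OpenAI prices, and `estimate_cost`, `CostTracker`, and `CeilingReached` are hypothetical names. This version refuses any charge that would push spend past the ceiling, which is one possible way to implement a hard limit.

```python
from dataclasses import dataclass
from decimal import Decimal

# Placeholder rates for illustration only; not real vendor pricing.
OCR_RATE_PER_PAGE = Decimal("0.015")
LLM_RATE_PER_1K_TOKENS = Decimal("0.01")


def estimate_cost(pages: int, tokens_per_page: int) -> Decimal:
    """Up-front estimate: OCR cost per page plus LLM cost per 1,000 tokens."""
    ocr = OCR_RATE_PER_PAGE * pages
    llm = LLM_RATE_PER_1K_TOKENS * Decimal(pages * tokens_per_page) / Decimal(1000)
    return ocr + llm


class CeilingReached(Exception):
    """Raised when a charge would exceed the per-document ceiling."""


@dataclass
class CostTracker:
    ceiling: Decimal
    spent: Decimal = Decimal("0")

    def charge(self, stage: str, amount: Decimal) -> None:
        """Record spend for a stage, or halt if the ceiling would be exceeded."""
        if self.spent + amount > self.ceiling:
            raise CeilingReached(f"{stage}: ceiling of {self.ceiling} would be exceeded")
        self.spent += amount
```

With these placeholder rates, a 100-page set at 2,000 tokens per page comes to 1.50 for OCR plus 2.00 for the LLM, an estimate of 3.50. Using `Decimal` rather than floats avoids rounding drift when many small per-page charges are accumulated.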
Processing Pipeline UI
Six-stage pipeline progress view (Queued / OCR / Schedule Detection / Table Extraction / Equipment Parsing / Ready) with per-stage status, timing, and cost. Built on TanStack Query with polling for live updates during async Celery processing.
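The six stages listed above map naturally onto an ordered enumeration. The sketch below shows one way a backend might derive the per-stage status that a polling frontend renders; the `Stage` enum mirrors the stage names in the text, while `next_stage`, `progress_payload`, and the status strings are assumptions for illustration.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    QUEUED = "Queued"
    OCR = "OCR"
    SCHEDULE_DETECTION = "Schedule Detection"
    TABLE_EXTRACTION = "Table Extraction"
    EQUIPMENT_PARSING = "Equipment Parsing"
    READY = "Ready"


_ORDER = list(Stage)  # Enum members iterate in definition order


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after the last one."""
    i = _ORDER.index(current)
    return _ORDER[i + 1] if i + 1 < len(_ORDER) else None


def progress_payload(current: Stage) -> list[dict[str, str]]:
    """Build a status list of the kind a polling client could render."""
    idx = _ORDER.index(current)
    payload = []
    for i, stage in enumerate(_ORDER):
        if i < idx or current is Stage.READY:
            status = "done"
        elif i == idx:
            status = "active"
        else:
            status = "pending"
        payload.append({"stage": stage.value, "status": status})
    return payload
```

Keeping the order in one place means the progress view and the pipeline cannot disagree about which stage comes next.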
Designed for every screen
Web
- Upload construction PDFs up to 500 MB with real-time streaming progress and immediate magic-byte validation
- Six-stage pipeline progress view with per-stage status, timing, and live cost display
- Per-page extraction review: inspect GPT-4o extracted fields, confidence scores, and edit any value before checks run
- Discipline selector: choose which trades to check (Architectural, Structural, MEP, or all five)
- Compliance findings view: filterable by discipline, severity, and citation source (FBC / IBC / NFPA)
- Human-in-the-loop approval: accept, reject, or flag each finding with reviewer comments before finalising the report
- Up-front cost estimate before processing starts, with live cost tracking and configurable hard ceiling
- PDF compliance report download for the permit file, JSON export for BIM/PM tooling
- Document history: all past submissions with processing status, cost totals, and report access
- SHA-256 dedup alert: duplicate drawing submissions detected and shown before any processing charge
AI Layer
An AI processing engine embedded in the platform's pipeline (not a conversational chatbot).
- Primary Model: OpenAI GPT-4o (vision + structured output)
- Structured Outputs: PydanticAI validates every GPT-4o response before any database write
- RAG Engine: ChromaDB vector search over indexed FBC, IBC, and NFPA 1/70/72/101 standards
- Embeddings: OpenAI text-embedding-3-small for ChromaDB indexing
- OCR Layer: AWS Textract (per-page blocks, tables, key-value pairs)
- Hallucination Prevention: every finding requires a ChromaDB-retrieved citation before GPT-4o output is accepted
- Human-in-the-Loop: per-page extraction review and per-finding approval before report generation
System design
The rules that run the product
- Every compliance finding must have a ChromaDB-retrieved code citation before it is written to the database — no citation, no finding
- Document processing is gated on cost approval: the up-front estimate must be acknowledged before Celery jobs are queued
- A hard cost ceiling per document is enforced by the task queue: processing halts and alerts the reviewer if the ceiling is reached mid-pipeline
- SHA-256 content hash deduplication rejects any file already in the system before OCR begins, preventing double-billing on resubmitted drawings
- Per-page extraction results are locked after reviewer approval: re-running checks always uses the approved (human-corrected) data, not raw GPT-4o output
- All six pipeline stages must complete successfully before the report generation stage is unblocked
- Final reports are immutable once generated: any change to findings requires a new report version with a new audit record
- All reviewer edits, approvals, and rejections are logged with timestamp and user identity for the professional audit trail
- Upload & Dedup Workflow: File received → 5 MB chunk streaming to disk → magic-byte MIME validation → SHA-256 hash computed → duplicate check against database → if new: stored to S3 and processing record created → if duplicate: rejection response with original submission reference
- Pipeline Processing Workflow: Celery job queued → Textract OCR per page → JSONB blocks stored to PostgreSQL → Schedule Detection stage → Table Extraction stage → Equipment Parsing stage → all outputs stored as typed JSONB → status updated per stage → frontend polls via TanStack Query
- Compliance Check Workflow: Reviewer approves extraction → check engine runs per discipline → ChromaDB retrieves top-k code passages per check → GPT-4o runs check against extracted data + retrieved passages → PydanticAI validates output → findings written to database with citation, severity, source page
- Report Generation Workflow: Reviewer approves final findings → report generator triggered as Celery task → ReportLab renders branded PDF with all findings, citations, and source page references → JSON export created → both stored to S3 → signed download URLs returned to frontend
- Celery Beat runs nightly cost reconciliation across all documents processed in the billing period
- Textract completion webhook triggers the next pipeline stage automatically — no polling between stages
- Low-confidence extraction fields are automatically flagged for mandatory reviewer attention before checks run
- Cost ceiling alert fires a Sentry notification and halts the Celery pipeline if per-document spend reaches the configured threshold
- S3 signed URL expiry is automatically renewed on reviewer access to prevent expired download links mid-session
- Embeddings are refreshed whenever the ChromaDB index is updated, keeping code-standard citations current after each code cycle update
Everything inside
Document Ingestion & Deduplication
- Streaming upload handler for PDFs up to 500 MB — 5 MB chunk streaming to disk
- Magic-byte MIME type validation — rejects non-PDF files regardless of extension
- SHA-256 content hash deduplication — duplicate submissions rejected before any processing or billing
- S3 storage with signed URL access control — no public document access
- Cost estimate displayed and acknowledged before any Celery job is queued
OCR Pipeline (AWS Textract)
- Per-page Textract processing returning structured blocks: lines, words, tables, key-value pairs
- Block output stored as JSONB in PostgreSQL per page for downstream extraction stages
- Textract cost tracked per page and accumulated against the per-document hard ceiling
- Schedule and title block detection identifies drawing metadata for cross-referencing
- Table rows parsed into typed JSON for equipment schedules and specification tables
AI Extraction (GPT-4o + PydanticAI)
- Per-page structured extraction: typed JSON output with confidence scores for every field
- Discipline-specific prompts for Architectural, Structural, Mechanical, Electrical, and Plumbing pages
- Equipment parsing: mechanical schedules, panel schedules, fixture schedules, structural member tables
- PydanticAI validates every GPT-4o output before database write — no unstructured LLM responses stored
- Low-confidence fields automatically flagged for mandatory reviewer correction
RAG-Grounded Compliance Checks (ChromaDB)
- FBC, IBC, and NFPA 1/70/72/101 standards indexed as OpenAI vector embeddings in ChromaDB
- Top-k code passage retrieval before every GPT-4o compliance check — grounds every finding
- 200+ discipline-specific checks: Architectural (occupancy, egress, accessibility), Structural (loads, framing, foundations), Mechanical (HVAC, ventilation), Electrical (NEC panels, circuits), Plumbing (drainage, backflow, venting)
- NFPA fire and life-safety checks: NFPA 1 (Fire Code), NFPA 70 (NEC), NFPA 72 (Fire Alarm), NFPA 101 (Life Safety Code)
- Every finding includes: code citation, retrieved passage excerpt, extracted drawing value, severity, and source page
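To make the top-k retrieval step concrete, here is a toy stand-in that ranks passages by cosine similarity over bag-of-words vectors. It is not ChromaDB and does not use OpenAI embeddings; it only illustrates the ranking idea. The function names are hypothetical, and any corpus passed in for testing should be treated as placeholder text rather than real code language.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, corpus: dict[str, str], k: int = 3) -> list[tuple[str, float]]:
    """Return the k (citation, score) pairs most similar to the query."""
    q = embed(query)
    scored = [(citation, cosine(q, embed(text))) for citation, text in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a real deployment the vector store handles embedding and nearest-neighbour search; the point of the sketch is that each check starts from a ranked list of indexed passages, and only those passages are eligible to be cited.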
Human-in-the-Loop Review
- Per-page extraction review UI: inspect and edit any GPT-4o extracted field before checks run
- Per-finding approval: accept, reject, or flag each compliance finding with reviewer comments
- Reviewer corrections stored separately from raw AI output — checks always run against approved data
- Full audit log: every edit, approval, and rejection timestamped with reviewer identity
- Re-run checks on any corrected extraction without reprocessing the full pipeline
Cost Control & Observability
- Up-front cost estimate (page count + Textract + LLM tokens) before processing begins
- Live per-stage cost display on the pipeline progress UI
- Configurable hard cost ceiling per document — enforced by Celery task gating
- Cost ceiling alert via Sentry notification if threshold is reached mid-pipeline
- Sentry error tracking + structlog JSON logs with per-request trace IDs across all pipeline stages
Report Generation (PDF + JSON)
- Branded PDF compliance report rendered by ReportLab for the permit file
- Every finding links to: source drawing page, extracted field value, retrieved code citation, reviewer approval
- Complete JSON export for BIM and PM tooling with full finding metadata
- Both outputs stored to S3 with signed URL download links
- Immutable report versions: any post-approval change creates a new versioned report record
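Immutable report versions can be modelled with frozen records: a change never mutates an existing report, it produces a new one with the next version number. The sketch below is an assumption about how that could look; `Report`, `revise`, and the content hash are illustrative and not taken from the platform's code.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Report:
    document_id: str
    version: int
    findings: tuple[str, ...]  # finding identifiers; a tuple keeps the record immutable

    def content_hash(self) -> str:
        """Stable SHA-256 fingerprint of the report contents for the audit record."""
        payload = json.dumps(
            {
                "document_id": self.document_id,
                "version": self.version,
                "findings": list(self.findings),
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def revise(report: Report, findings: Iterable[str]) -> Report:
    """Create the next version instead of editing the existing report."""
    return Report(report.document_id, report.version + 1, tuple(findings))
```

Because the dataclass is frozen, an accidental in-place edit raises an error instead of silently changing an issued report.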
Processing Pipeline UI
- Six-stage progress view: Queued / OCR / Schedule Detection / Table Extraction / Equipment Parsing / Ready
- Per-stage status indicators with timing and cumulative cost display
- TanStack Query polling for live async Celery pipeline updates without page refresh
- Document history view: all past submissions with processing status, cost totals, and report access
- Duplicate submission alert: shows original submission date and cost if SHA-256 match detected
Outcomes that move the business
- Delivered a production AI permit-review platform (Next.js 15 SPA, FastAPI backend, six-stage Celery pipeline) for a live Florida code-engineering firm
- Zero-hallucination compliance findings: ChromaDB RAG retrieves real FBC/IBC/NFPA citations before every GPT-4o check — the AI cites, it does not guess
- 200+ discipline-specific checks across Architectural, Structural, MEP (Mechanical, Electrical, Plumbing), and NFPA fire/life-safety codes in one codebase
- Full human-in-the-loop workflow: per-page extraction review, per-finding approval, and immutable report versioning for professional accountability
- Cost-aware architecture: up-front estimates, live per-stage tracking, and configurable hard ceilings prevent billing surprises at production document volumes
- Streaming 500 MB upload with SHA-256 dedup and signed S3 URLs — production-grade data handling for large construction drawing sets
- End-to-end observability: Sentry + structlog across backend and frontend with per-request trace IDs and per-stage cost logging to PostgreSQL
- Horizontal Celery scaling with zero code changes — worker replicas add capacity without architectural changes
My Role & Deliverables
Role: Full-Stack AI Engineer · Construction Tech · Python, FastAPI, Next.js, GPT-4o, AWS, RAG
Project Description: Built a production AI SaaS that automates building-permit review for a Florida code-engineering firm. Ingests construction PDFs, runs AWS Textract OCR and GPT-4o per-page extraction, and grounds every finding in real code via ChromaDB RAG — so the AI cites FBC/IBC/NFPA sections, never hallucinates. 200+ checks across Architectural, Structural, and MEP disciplines. Stack: FastAPI, Celery/Redis, PydanticAI, PostgreSQL, Next.js 15. Streaming 500 MB uploads, cost ceilings, human-in-the-loop edits, audit-ready PDF/JSON reports. Cut review from days to supervised hours.
Skills & Deliverables:
- Full-Stack Development (FastAPI, Next.js 15, React 19, TypeScript)
- AI / LLM Integration (OpenAI GPT-4o, PydanticAI structured outputs)
- RAG Pipeline (ChromaDB, OpenAI Embeddings, FBC/IBC/NFPA indexing)
- OCR Integration (AWS Textract per-page block extraction)
- PDF Processing & Generation (streaming upload, ReportLab compliance report)
- Task Queue Architecture (Celery six-stage pipeline, Redis broker)
- Cost Control Systems (up-front estimates, live tracking, hard ceilings)
- Production Observability (Sentry, structlog, per-stage cost logging)
- Cloud & DevOps (Docker, GitHub Actions, Coolify, DigitalOcean, Traefik, Cloudflare)



