Database Schema Design
This document describes the full database schema for Epok Advisor. All tables live in the public schema in Supabase (Postgres 17). Supabase Auth handles authentication. Drizzle ORM manages schema and type-safe queries. pgvector extension is enabled for embedding storage.
Entity Relationship Overview
auth.users (Supabase Auth)
│
└── profiles (app-specific user data + Google OAuth tokens)
│
├── engagements (advisor client projects)
│ ├── engagement_members (access control)
│ ├── engagement_milestones (6-step progress tracker)
│ ├── content_folders → content_documents → content_versions (scoped content)
│ ├── questionnaire_responses (filled-out questionnaires)
│ └── analyses → analysis_inputs (AI analysis runs)
│
├── content_folders (hierarchical folder tree — KB or engagement-scoped)
│ └── content_documents (versioned markdown docs)
│ ├── content_versions (draft/published/archived)
│ ├── content_visibility (client portal visibility)
│ ├── content_document_tags (many-to-many tag assignment)
│ └── document_embeddings (pgvector chunks for RAG)
│
├── content_tags (classification labels)
│
├── questionnaire_templates → questionnaire_template_versions
│
├── chat_messages (AI chat history)
│
└── api_keys (external API authentication)
Tables
1. profiles
Extends Supabase Auth auth.users with application-specific data.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | References auth.users.id |
| text NOT NULL | ||
| full_name | text | |
| avatar_url | text | |
| role | enum: super_admin, associate, client_primary, client_member | System-wide role |
| google_access_token | text | OAuth token for Google Docs export |
| google_refresh_token | text | OAuth refresh token |
| google_token_expiry | timestamptz | When the access token expires |
| created_at | timestamptz | Default now() |
| updated_at | timestamptz | Default now() |
Notes:
- Created automatically via a database trigger when a new auth user signs up
roledefaults toclient_primary(safest default — most restrictive non-member role)- Super admins are seeded: ben.unsworth@epokadvice.com, kendra.thompson@epokadvice.com
- Google OAuth tokens are stored per-user for Google Docs export (not all users will have these)
2. engagements
The core business entity — one advisor going through the Epok process.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | Default gen_random_uuid() |
| advisor_name | text NOT NULL | |
| advisor_email | text NOT NULL | |
| firm_name | text | |
| aum_range | text | e.g. "$500M–$1B" |
| status | enum: intake, active, ongoing, complete, archived | Lifecycle state |
| notes | text | Internal notes (not visible to client) |
| created_by | uuid FK → profiles.id | Who created the engagement |
| created_at | timestamptz | |
| updated_at | timestamptz |
3. engagement_members
Junction table: who has access to an engagement and in what capacity.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| engagement_id | uuid FK → engagements.id | |
| user_id | uuid FK → profiles.id | |
| role | enum: admin, associate, client_primary, client_member | Role within this engagement |
| invited_at | timestamptz | When the invite was sent |
| joined_at | timestamptz | When they accepted / first logged in |
| invited_by | uuid FK → profiles.id | |
| created_at | timestamptz |
Unique constraint: (engagement_id, user_id)
4. engagement_milestones
Tracks progress through the structured engagement process.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| engagement_id | uuid FK → engagements.id | |
| step | integer NOT NULL | 1-6 |
| label | text NOT NULL | e.g. "Intake", "Discovery", "Workshop 1" |
| status | enum: upcoming, current, completed | |
| completed_at | timestamptz | When marked done |
| completed_by | uuid FK → profiles.id | Who marked it |
| notes | text | Optional notes |
| created_at | timestamptz |
Unique constraint: (engagement_id, step)
Milestone definitions (seeded when engagement is created):
| Step | Label | Corresponding Tag |
|---|---|---|
| 1 | Intake | MS: Intake |
| 2 | Discovery | MS: Discovery |
| 3 | Workshop 1 | MS: Workshop 1 |
| 4 | Workshop 2 | MS: Workshop 2 |
| 5 | Workshop 3 | MS: Workshop 3 |
| 6 | Complete | MS: Complete |
Each milestone has a corresponding "MS:" tag in content_tags. Engagement overview shows "View docs" links that filter the Documents tab by the milestone's tag.
5. content_folders
Hierarchical folder structure for organizing documents (both KB and engagement-scoped).
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| name | text NOT NULL | Folder name |
| parent_id | uuid FK → content_folders.id (self-ref) | NULL = root folder. CASCADE delete |
| engagement_id | uuid FK → engagements.id | NULL = KB folder; set = engagement folder |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz | |
| updated_at | timestamptz |
Unique constraint: (name, parent_id, engagement_id) — no duplicate folder names at the same level
Notable folders:
- "Inbox" (KB) — landing zone for auto-ingested content (Otter.ai transcripts, etc.)
- Folders can only be deleted when empty (no docs, no sub-folders)
6. content_documents
Metadata for any piece of content — both knowledge base (global) and engagement-scoped.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| title | text NOT NULL | Human-readable title |
| folder_id | uuid FK → content_folders.id | Which folder it lives in (NULL = unfiled) |
| engagement_id | uuid FK → engagements.id | NULL = KB (global); set = engagement-scoped |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz | |
| updated_at | timestamptz |
Notes:
- The old
pathandcontent_typecolumns have been replaced byfolder_id(organization) and tags viacontent_document_tags(classification) - Documents can have 0 or more tags
- "File to Engagement" action: sets
engagement_id, clearsfolder_id, optionally assigns a milestone tag
7. content_versions
Every edit to a document creates a new version.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| document_id | uuid FK → content_documents.id | |
| version | integer NOT NULL | Auto-incrementing per document |
| status | enum: draft, published, archived | Only one published per document |
| content | text NOT NULL | The markdown body |
| change_summary | text | Optional — what changed and why |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz |
Unique constraint: (document_id, version)
Rules:
- Publishing a new version automatically archives the previous published version
- Rolling back = creating a new version with old content (never mutating history)
- Publishing triggers auto-embedding for RAG search
8. content_visibility
Controls whether engagement-scoped content is visible to the client in their portal.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| document_id | uuid FK → content_documents.id | |
| visible_to_client | boolean DEFAULT false | |
| published_at | timestamptz | When it was made visible |
| published_by | uuid FK → profiles.id |
9. content_tags
Classification labels for documents. Tags replace the old hardcoded content_type enum.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| name | text NOT NULL UNIQUE | Tag display name |
| color | text NOT NULL | Color key for UI (e.g. "purple", "blue", "amber") |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz |
Seeded tags: Theme, Data Point, Report Template, Prompt, System Instruction, Internal Note, Intake Notes, Workshop Note, Transcript, Report Draft, Final Report, Questionnaire Response, Summary, Auto-Ingested, MS: Intake, MS: Discovery, MS: Workshop 1, MS: Workshop 2, MS: Workshop 3, MS: Complete, Analysis Output
10. content_document_tags
Many-to-many join between documents and tags.
| Column | Type | Notes |
|---|---|---|
| document_id | uuid FK → content_documents.id | Composite PK |
| tag_id | uuid FK → content_tags.id | Composite PK |
11. questionnaire_templates
Reusable questionnaire definitions.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| title | text NOT NULL | Template name |
| description | text | |
| questions | jsonb NOT NULL | Current question set (array of question objects) |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz | |
| updated_at | timestamptz |
Question object shape: { id, type, text, required, options?, helpText?, section? }
Question types: text, textarea, select, multi_select, scale, yes_no
12. questionnaire_template_versions
Versioned snapshots of questionnaire templates.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| template_id | uuid FK → questionnaire_templates.id | CASCADE delete |
| version | integer NOT NULL | |
| status | text NOT NULL | draft or published |
| questions | jsonb NOT NULL | Frozen question set for this version |
| change_summary | text | |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz |
Unique constraint: (template_id, version)
13. questionnaire_responses
Filled-out questionnaire tied to an engagement.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| template_id | uuid FK → questionnaire_templates.id | CASCADE delete |
| template_version_id | uuid FK → questionnaire_template_versions.id | Frozen version used |
| engagement_id | uuid FK → engagements.id | CASCADE delete |
| respondent_type | enum: interviewer, client, agent | Who filled it out |
| status | enum: draft, complete | |
| answers | jsonb NOT NULL | { questionId: answer } |
| completed_at | timestamptz | |
| completed_by | uuid FK → profiles.id | |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz | |
| updated_at | timestamptz |
14. analyses
AI analysis runs scoped to an engagement.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| engagement_id | uuid FK → engagements.id | CASCADE delete |
| title | text NOT NULL | |
| prompt | text NOT NULL | The prompt with @-mention document references |
| model | text NOT NULL | claude-sonnet, gpt-4o, gpt-4o-mini |
| status | enum: draft, running, complete, error | |
| error_message | text | If status = error |
| milestone_id | uuid FK → engagement_milestones.id | Optional milestone association |
| output_document_id | uuid FK → content_documents.id | Auto-created output doc tagged "Analysis Output" |
| created_by | uuid FK → profiles.id | |
| created_at | timestamptz | |
| updated_at | timestamptz |
15. analysis_inputs
Join table linking analyses to their input documents (referenced via @-mentions in the prompt).
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| analysis_id | uuid FK → analyses.id | CASCADE delete |
| document_id | uuid FK → content_documents.id | CASCADE delete |
Unique constraint: (analysis_id, document_id)
16. document_embeddings
Vector embeddings for RAG search. Created by api.embed.ts on document publish.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| document_id | uuid FK → content_documents.id | CASCADE delete |
| chunk_index | integer | Position within the document |
| content | text | The text chunk (~2000 chars with overlap) |
| embedding | vector(1536) | OpenAI text-embedding-3-small |
| created_at | timestamptz |
Index: HNSW index on embedding column for cosine similarity search
RPC: match_documents(query_embedding, match_threshold, match_count) returns top-N similar chunks
17. chat_messages
Persistent chat history for the AI chat panel.
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| session_id | text | Groups messages into conversations |
| role | text | user, assistant, system |
| content | text | Message content |
| model | text | Which LLM was used |
| user_id | uuid FK → profiles.id | |
| created_at | timestamptz |
18. api_keys
API keys for external access (ChatGPT Actions, Claude MCP, Zapier).
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| user_id | uuid FK → profiles.id | Key owner |
| name | text | Display name (e.g. "OAuth: client_abc123") |
| key_hash | text NOT NULL | SHA-256 hash of the full key |
| key_prefix | text NOT NULL | First 8 chars of key for identification |
| is_active | boolean DEFAULT true | |
| last_used_at | timestamptz | |
| created_at | timestamptz |
Key format: epok_sk_... (base64url random), shown once on creation
Lookup: validate_api_key() SECURITY DEFINER function for efficient hash-based lookup
RLS Policies Summary
All tables have Row Level Security enabled. Key policies:
Core tables
- profiles: Users read own profile; super admins read all; users update own (except role)
- engagements: Super admins full CRUD; members read their own
- engagement_members: Super admins full CRUD; members read own engagement's list
- engagement_milestones: Super admins full CRUD; members read
Content tables
- content_folders: Super admins full CRUD; engagement-scoped folders visible to members
- content_documents + content_versions:
- KB (engagement_id IS NULL): super admins full CRUD; associates read-only; clients no access
- Engagement-scoped: super admins full CRUD; associates CRUD on assigned; clients read published+visible only
- content_visibility: Super admins full CRUD; members read for their engagement
- content_tags / content_document_tags: Super admins full CRUD; all authenticated users can read
RLS recursion fix
content_documents ↔ content_visibility ↔ content_versions have circular policy dependencies. Broken with SECURITY DEFINER helper functions:
is_doc_visible_to_client(doc_id)can_view_document(doc_id)can_access_document_versions(doc_id)is_doc_published_and_visible(doc_id)
Other tables
- questionnaire_templates/versions: Super admins full CRUD; all authenticated read
- questionnaire_responses: Super admins full CRUD; engagement members read/write their own
- analyses / analysis_inputs: Super admins full CRUD; engagement members read
- document_embeddings: Super admins full CRUD (used server-side via service role)
- chat_messages: Users read/write their own messages
- api_keys: Users manage their own keys
Enums
Defined in app/db/schema/enums.ts:
| Enum | Values |
|---|---|
user_role | super_admin, associate, client_primary, client_member |
engagement_status | intake, active, ongoing, complete, archived |
milestone_status | upcoming, current, completed |
engagement_member_role | admin, associate, client_primary, client_member |
content_type | theme, theme_data_point, advisor_intake, workshop_note, workshop_transcript, report_draft, report_final, report_template, prompt, system_instruction, questionnaire_template, questionnaire_response, internal_note |
content_version_status | draft, published, archived |
analysis_status | draft, running, complete, error |
respondent_type | interviewer, client, agent |
questionnaire_status | draft, complete |
Note: content_type enum still exists in the database but is no longer used as a column on content_documents. Classification is now handled entirely by the content_tags / content_document_tags tables.
Seed Data
On initial setup:
- Profiles: Super admin roles for ben.unsworth@epokadvice.com and kendra.thompson@epokadvice.com (set via trigger on first login)
- Milestones: 6-step structure seeded per engagement on creation
- Content tags: ~20 tags seeded (Theme, Data Point, Report Template, Prompt, System Instruction, Internal Note, Intake Notes, Workshop Note, Transcript, Report Draft, Final Report, Questionnaire Response, Summary, Auto-Ingested, MS: Intake, MS: Discovery, MS: Workshop 1-3, MS: Complete, Analysis Output)
- KB folders: "Inbox" folder for auto-ingested content
Drizzle ORM Notes
- Schema defined in
app/db/schema/with one file per table, re-exported fromindex.ts - Schema changes applied via
drizzle-kit push(no migration files) - Type-safe queries in loaders, actions, and API routes
- Connection via direct Postgres connection string (IPv4 add-on, not pooler)