Database Schema Design

This document describes the full database schema for Epok Advisor. All tables live in the public schema in Supabase (Postgres 17). Supabase Auth handles authentication. Drizzle ORM manages schema and type-safe queries. pgvector extension is enabled for embedding storage.

Entity Relationship Overview

auth.users (Supabase Auth)
  │
  └── profiles (app-specific user data + Google OAuth tokens)
        │
        ├── engagements (advisor client projects)
        │     ├── engagement_members (access control)
        │     ├── engagement_milestones (6-step progress tracker)
        │     ├── content_folders → content_documents → content_versions (scoped content)
        │     ├── questionnaire_responses (filled-out questionnaires)
        │     └── analyses → analysis_inputs (AI analysis runs)
        │
        ├── content_folders (hierarchical folder tree — KB or engagement-scoped)
        │     └── content_documents (versioned markdown docs)
        │           ├── content_versions (draft/published/archived)
        │           ├── content_visibility (client portal visibility)
        │           ├── content_document_tags (many-to-many tag assignment)
        │           └── document_embeddings (pgvector chunks for RAG)
        │
        ├── content_tags (classification labels)
        │
        ├── questionnaire_templates → questionnaire_template_versions
        │
        ├── chat_messages (AI chat history)
        │
        └── api_keys (external API authentication)

Tables

1. `profiles`

Extends Supabase Auth auth.users with application-specific data.

Column	Type	Notes
id	uuid PK	References `auth.users.id`
email	text NOT NULL
full_name	text
avatar_url	text
role	enum: `super_admin`, `associate`, `client_primary`, `client_member`	System-wide role
google_access_token	text	OAuth token for Google Docs export
google_refresh_token	text	OAuth refresh token
google_token_expiry	timestamptz	When the access token expires
created_at	timestamptz	Default `now()`
updated_at	timestamptz	Default `now()`

Notes:

Created automatically via a database trigger when a new auth user signs up
role defaults to client_primary (safest default — most restrictive non-member role)
Super admins are seeded: ben.unsworth@epokadvice.com, kendra.thompson@epokadvice.com
Google OAuth tokens are stored per-user for Google Docs export (not all users will have these)

2. `engagements`

The core business entity — one advisor going through the Epok process.

Column	Type	Notes
id	uuid PK	Default `gen_random_uuid()`
advisor_name	text NOT NULL
advisor_email	text NOT NULL
firm_name	text
aum_range	text	e.g. "$500M–$1B"
status	enum: `intake`, `active`, `ongoing`, `complete`, `archived`	Lifecycle state
notes	text	Internal notes (not visible to client)
created_by	uuid FK → profiles.id	Who created the engagement
created_at	timestamptz
updated_at	timestamptz

3. `engagement_members`

Junction table: who has access to an engagement and in what capacity.

Column	Type	Notes
id	uuid PK
engagement_id	uuid FK → engagements.id
user_id	uuid FK → profiles.id
role	enum: `admin`, `associate`, `client_primary`, `client_member`	Role within this engagement
invited_at	timestamptz	When the invite was sent
joined_at	timestamptz	When they accepted / first logged in
invited_by	uuid FK → profiles.id
created_at	timestamptz

Unique constraint: (engagement_id, user_id)

4. `engagement_milestones`

Tracks progress through the structured engagement process.

Column	Type	Notes
id	uuid PK
engagement_id	uuid FK → engagements.id
step	integer NOT NULL	1-6
label	text NOT NULL	e.g. "Intake", "Discovery", "Workshop 1"
status	enum: `upcoming`, `current`, `completed`
completed_at	timestamptz	When marked done
completed_by	uuid FK → profiles.id	Who marked it
notes	text	Optional notes
created_at	timestamptz

Unique constraint: (engagement_id, step)

Milestone definitions (seeded when engagement is created):

Step	Label	Corresponding Tag
1	Intake	MS: Intake
2	Discovery	MS: Discovery
3	Workshop 1	MS: Workshop 1
4	Workshop 2	MS: Workshop 2
5	Workshop 3	MS: Workshop 3
6	Complete	MS: Complete

Each milestone has a corresponding "MS:" tag in content_tags. Engagement overview shows "View docs" links that filter the Documents tab by the milestone's tag.

5. `content_folders`

Hierarchical folder structure for organizing documents (both KB and engagement-scoped).

Column	Type	Notes
id	uuid PK
name	text NOT NULL	Folder name
parent_id	uuid FK → content_folders.id (self-ref)	NULL = root folder. CASCADE delete
engagement_id	uuid FK → engagements.id	NULL = KB folder; set = engagement folder
created_by	uuid FK → profiles.id
created_at	timestamptz
updated_at	timestamptz

Unique constraint: (name, parent_id, engagement_id) — no duplicate folder names at the same level

Notable folders:

"Inbox" (KB) — landing zone for auto-ingested content (Otter.ai transcripts, etc.)
Folders can only be deleted when empty (no docs, no sub-folders)

6. `content_documents`

Metadata for any piece of content — both knowledge base (global) and engagement-scoped.

Column	Type	Notes
id	uuid PK
title	text NOT NULL	Human-readable title
folder_id	uuid FK → content_folders.id	Which folder it lives in (NULL = unfiled)
engagement_id	uuid FK → engagements.id	NULL = KB (global); set = engagement-scoped
created_by	uuid FK → profiles.id
created_at	timestamptz
updated_at	timestamptz

Notes:

The old path and content_type columns have been replaced by folder_id (organization) and tags via content_document_tags (classification)
Documents can have 0 or more tags
"File to Engagement" action: sets engagement_id, clears folder_id, optionally assigns a milestone tag

7. `content_versions`

Every edit to a document creates a new version.

Column	Type	Notes
id	uuid PK
document_id	uuid FK → content_documents.id
version	integer NOT NULL	Auto-incrementing per document
status	enum: `draft`, `published`, `archived`	Only one `published` per document
content	text NOT NULL	The markdown body
change_summary	text	Optional — what changed and why
created_by	uuid FK → profiles.id
created_at	timestamptz

Unique constraint: (document_id, version)

Rules:

Publishing a new version automatically archives the previous published version
Rolling back = creating a new version with old content (never mutating history)
Publishing triggers auto-embedding for RAG search

8. `content_visibility`

Controls whether engagement-scoped content is visible to the client in their portal.

Column	Type	Notes
id	uuid PK
document_id	uuid FK → content_documents.id
visible_to_client	boolean DEFAULT false
published_at	timestamptz	When it was made visible
published_by	uuid FK → profiles.id

9. `content_tags`

Classification labels for documents. Tags replace the old hardcoded content_type enum.

Column	Type	Notes
id	uuid PK
name	text NOT NULL UNIQUE	Tag display name
color	text NOT NULL	Color key for UI (e.g. "purple", "blue", "amber")
created_by	uuid FK → profiles.id
created_at	timestamptz

Seeded tags: Theme, Data Point, Report Template, Prompt, System Instruction, Internal Note, Intake Notes, Workshop Note, Transcript, Report Draft, Final Report, Questionnaire Response, Summary, Auto-Ingested, MS: Intake, MS: Discovery, MS: Workshop 1, MS: Workshop 2, MS: Workshop 3, MS: Complete, Analysis Output

10. `content_document_tags`

Many-to-many join between documents and tags.

Column	Type	Notes
document_id	uuid FK → content_documents.id	Composite PK
tag_id	uuid FK → content_tags.id	Composite PK

11. `questionnaire_templates`

Reusable questionnaire definitions.

Column	Type	Notes
id	uuid PK
title	text NOT NULL	Template name
description	text
questions	jsonb NOT NULL	Current question set (array of question objects)
created_by	uuid FK → profiles.id
created_at	timestamptz
updated_at	timestamptz

Question object shape: { id, type, text, required, options?, helpText?, section? } Question types: text, textarea, select, multi_select, scale, yes_no

12. `questionnaire_template_versions`

Versioned snapshots of questionnaire templates.

Column	Type	Notes
id	uuid PK
template_id	uuid FK → questionnaire_templates.id	CASCADE delete
version	integer NOT NULL
status	text NOT NULL	`draft` or `published`
questions	jsonb NOT NULL	Frozen question set for this version
change_summary	text
created_by	uuid FK → profiles.id
created_at	timestamptz

Unique constraint: (template_id, version)

13. `questionnaire_responses`

Filled-out questionnaire tied to an engagement.

Column	Type	Notes
id	uuid PK
template_id	uuid FK → questionnaire_templates.id	CASCADE delete
template_version_id	uuid FK → questionnaire_template_versions.id	Frozen version used
engagement_id	uuid FK → engagements.id	CASCADE delete
respondent_type	enum: `interviewer`, `client`, `agent`	Who filled it out
status	enum: `draft`, `complete`
answers	jsonb NOT NULL	`{ questionId: answer }`
completed_at	timestamptz
completed_by	uuid FK → profiles.id
created_by	uuid FK → profiles.id
created_at	timestamptz
updated_at	timestamptz

14. `analyses`

AI analysis runs scoped to an engagement.

Column	Type	Notes
id	uuid PK
engagement_id	uuid FK → engagements.id	CASCADE delete
title	text NOT NULL
prompt	text NOT NULL	The prompt with @-mention document references
model	text NOT NULL	`claude-sonnet`, `gpt-4o`, `gpt-4o-mini`
status	enum: `draft`, `running`, `complete`, `error`
error_message	text	If status = error
milestone_id	uuid FK → engagement_milestones.id	Optional milestone association
output_document_id	uuid FK → content_documents.id	Auto-created output doc tagged "Analysis Output"
created_by	uuid FK → profiles.id
created_at	timestamptz
updated_at	timestamptz

15. `analysis_inputs`

Join table linking analyses to their input documents (referenced via @-mentions in the prompt).

Column	Type	Notes
id	uuid PK
analysis_id	uuid FK → analyses.id	CASCADE delete
document_id	uuid FK → content_documents.id	CASCADE delete

Unique constraint: (analysis_id, document_id)

16. `document_embeddings`

Vector embeddings for RAG search. Created by api.embed.ts on document publish.

Column	Type	Notes
id	uuid PK
document_id	uuid FK → content_documents.id	CASCADE delete
chunk_index	integer	Position within the document
content	text	The text chunk (~2000 chars with overlap)
embedding	vector(1536)	OpenAI text-embedding-3-small
created_at	timestamptz

Index: HNSW index on embedding column for cosine similarity search RPC: match_documents(query_embedding, match_threshold, match_count) returns top-N similar chunks

17. `chat_messages`

Persistent chat history for the AI chat panel.

Column	Type	Notes
id	uuid PK
session_id	text	Groups messages into conversations
role	text	`user`, `assistant`, `system`
content	text	Message content
model	text	Which LLM was used
user_id	uuid FK → profiles.id
created_at	timestamptz

18. `api_keys`

API keys for external access (ChatGPT Actions, Claude MCP, Zapier).

Column	Type	Notes
id	uuid PK
user_id	uuid FK → profiles.id	Key owner
name	text	Display name (e.g. "OAuth: client_abc123")
key_hash	text NOT NULL	SHA-256 hash of the full key
key_prefix	text NOT NULL	First 8 chars of key for identification
is_active	boolean DEFAULT true
last_used_at	timestamptz
created_at	timestamptz

Key format: epok_sk_... (base64url random), shown once on creation Lookup: validate_api_key() SECURITY DEFINER function for efficient hash-based lookup

RLS Policies Summary

All tables have Row Level Security enabled. Key policies:

Core tables

profiles: Users read own profile; super admins read all; users update own (except role)
engagements: Super admins full CRUD; members read their own
engagement_members: Super admins full CRUD; members read own engagement's list
engagement_milestones: Super admins full CRUD; members read

Content tables

content_folders: Super admins full CRUD; engagement-scoped folders visible to members
content_documents + content_versions:
- KB (engagement_id IS NULL): super admins full CRUD; associates read-only; clients no access
- Engagement-scoped: super admins full CRUD; associates CRUD on assigned; clients read published+visible only
content_visibility: Super admins full CRUD; members read for their engagement
content_tags / content_document_tags: Super admins full CRUD; all authenticated users can read

RLS recursion fix

content_documents ↔ content_visibility ↔ content_versions have circular policy dependencies. Broken with SECURITY DEFINER helper functions:

is_doc_visible_to_client(doc_id)
can_view_document(doc_id)
can_access_document_versions(doc_id)
is_doc_published_and_visible(doc_id)

Other tables

questionnaire_templates/versions: Super admins full CRUD; all authenticated read
questionnaire_responses: Super admins full CRUD; engagement members read/write their own
analyses / analysis_inputs: Super admins full CRUD; engagement members read
document_embeddings: Super admins full CRUD (used server-side via service role)
chat_messages: Users read/write their own messages
api_keys: Users manage their own keys

Enums

Defined in app/db/schema/enums.ts:

Enum	Values
`user_role`	super_admin, associate, client_primary, client_member
`engagement_status`	intake, active, ongoing, complete, archived
`milestone_status`	upcoming, current, completed
`engagement_member_role`	admin, associate, client_primary, client_member
`content_type`	theme, theme_data_point, advisor_intake, workshop_note, workshop_transcript, report_draft, report_final, report_template, prompt, system_instruction, questionnaire_template, questionnaire_response, internal_note
`content_version_status`	draft, published, archived
`analysis_status`	draft, running, complete, error
`respondent_type`	interviewer, client, agent
`questionnaire_status`	draft, complete

Note: content_type enum still exists in the database but is no longer used as a column on content_documents. Classification is now handled entirely by the content_tags / content_document_tags tables.

Seed Data

On initial setup:

Profiles: Super admin roles for ben.unsworth@epokadvice.com and kendra.thompson@epokadvice.com (set via trigger on first login)
Milestones: 6-step structure seeded per engagement on creation
Content tags: ~20 tags seeded (Theme, Data Point, Report Template, Prompt, System Instruction, Internal Note, Intake Notes, Workshop Note, Transcript, Report Draft, Final Report, Questionnaire Response, Summary, Auto-Ingested, MS: Intake, MS: Discovery, MS: Workshop 1-3, MS: Complete, Analysis Output)
KB folders: "Inbox" folder for auto-ingested content

Drizzle ORM Notes

Schema defined in app/db/schema/ with one file per table, re-exported from index.ts
Schema changes applied via drizzle-kit push (no migration files)
Type-safe queries in loaders, actions, and API routes
Connection via direct Postgres connection string (IPv4 add-on, not pooler)

Database Schema Design

Entity Relationship Overview

Tables

1. profiles

2. engagements

3. engagement_members

4. engagement_milestones

5. content_folders

6. content_documents

7. content_versions

8. content_visibility

9. content_tags

10. content_document_tags

11. questionnaire_templates

12. questionnaire_template_versions

13. questionnaire_responses

14. analyses

15. analysis_inputs

16. document_embeddings

17. chat_messages

18. api_keys