This is the full developer documentation for TypeGraph

# What is TypeGraph?

> A TypeScript-first embedded knowledge graph library

TypeGraph is a **TypeScript-first, embedded knowledge graph library** that brings property graph semantics and ontological reasoning to applications using standard relational databases. Rather than introducing a separate graph database, TypeGraph lives inside your application as a library, storing graph data in your existing SQLite or PostgreSQL database.

## Architecture

![TypeGraph Architecture: Your application imports TypeGraph as a library dependency. TypeGraph uses Drizzle ORM to store graph data (nodes, edges, schema, ontology) in your existing SQLite or PostgreSQL database. No separate graph database required.](../../assets/typegraph-architecture.svg)

## Core Capabilities

### 1. Type-Driven Schema Definition

Zod schemas are the single source of truth. From one schema definition, TypeGraph derives:

- Runtime validation rules
- TypeScript types (inferred, not duplicated)
- Database storage requirements
- Query builder type constraints

```typescript
const Person = defineNode("Person", {
  schema: z.object({
    fullName: z.string().min(1),
    email: z.string().email().optional(),
    dateOfBirth: z.date().optional(),
  }),
});
```

### 2. Semantic Layer with Ontological Reasoning

Type-level relationships enable sophisticated inference:

| Relationship | Meaning | Use Case |
| -------------- | ----------------------------------------- | ---------------------- |
| `subClassOf` | Instance inheritance (Podcast IS-A Media) | Query expansion |
| `broader` | Hierarchical concept (ML broader than DL) | Topic navigation |
| `equivalentTo` | Same concept, different name | Cross-system mapping |
| `disjointWith` | Cannot be both (Person ≠ Organization) | Constraint validation |
| `implies` | Edge entailment (marriedTo implies knows) | Relationship inference |
| `inverseOf` | Edge pairs (manages/managedBy) | Bidirectional queries |

### 3. Self-Describing Schema (Homoiconic)

The schema and ontology are stored in the database as data, enabling:

- Runtime schema introspection
- Versioned schema history
- Self-describing exports and backups
- Migration tooling

### 4. Type-Safe Query Compilation

Queries compile to an AST before targeting SQL:

- Consistent semantics across SQLite and PostgreSQL
- Type-checked at compile time
- Query results have inferred types

## Design Philosophy

### Embedded, Not External

TypeGraph is a library dependency, not a networked service. TypeGraph initializes with your application, uses your database connection, and requires no separate deployment.

### Schema-First, Type-Driven

Define your schemas once with Zod, and TypeGraph handles validation, type inference, and storage. No duplicate type definitions or manual synchronization.

### Explicit Over Implicit

TypeGraph favors explicit declarations:

- Relationships are declared, not inferred from foreign keys
- Semantic relationships are explicit in the ontology
- Cascade behavior is configured, not assumed

### Portable Abstractions

The query builder generates portable ASTs that can target different SQL dialects. The same query code works with SQLite and PostgreSQL.

## What TypeGraph Is Not

TypeGraph deliberately excludes:

- **Graph algorithms**: No built-in shortest path, PageRank, or community detection
- **Distributed storage**: Single-database deployment only

These exclusions keep TypeGraph focused and maintainable.

Note: TypeGraph **does support** semantic search via database vector extensions (pgvector for PostgreSQL, sqlite-vec for SQLite). See [Semantic Search](/semantic-search) for details.

Note: TypeGraph does support **variable-length paths** via `.recursive()` with configurable depth limits, optional path/depth projection, and explicit cycle policy. Cycle prevention is the default. See [Recursive Traversals](/queries/recursive) for details.

## Why TypeGraph?
### Compared to Graph Databases (Neo4j, Amazon Neptune)

Graph databases are powerful but come with operational overhead:

| Aspect | Graph Database | TypeGraph |
|--------|---------------|-----------|
| **Deployment** | Separate service to manage, scale, and monitor | Library in your app, uses existing database |
| **Network** | Additional latency for every query | In-process, no network hop |
| **Transactions** | Separate transaction scope from your SQL data | Same ACID transaction as your other data |
| **Learning curve** | New query language (Cypher, Gremlin) | TypeScript you already know |
| **Graph algorithms** | Built-in (PageRank, shortest path) | Not included |
| **Scale** | Optimized for billions of nodes | Best for thousands to millions |

**Choose TypeGraph** when your graph is part of your application domain (knowledge bases, org charts, content relationships) rather than a standalone analytical system.

### Compared to ORMs (Prisma, Drizzle, TypeORM)

ORMs model relations through foreign keys, which works well for simple associations but lacks graph semantics:

| Aspect | Traditional ORM | TypeGraph |
|--------|----------------|-----------|
| **Relationships** | Foreign keys, eager/lazy loading | First-class edges with properties |
| **Traversals** | Manual joins or N+1 queries | Fluent traversal API, compiled to efficient SQL |
| **Inheritance** | Table-per-class or single-table | Semantic `subClassOf` with query expansion |
| **Constraints** | Foreign key constraints | Disjointness, cardinality, implications |
| **Schema** | Migrations alter tables | Schema versioning, JSON properties |

**Choose TypeGraph** when you need to traverse relationships, model type hierarchies, or enforce semantic constraints beyond what foreign keys provide.
### Compared to Triple Stores (RDF, SPARQL)

Triple stores and RDF provide rich ontological modeling but have practical challenges:

| Aspect | Triple Store | TypeGraph |
|--------|-------------|-----------|
| **Type safety** | Runtime validation, stringly-typed | Full TypeScript inference |
| **Query language** | SPARQL (powerful but verbose) | TypeScript fluent API |
| **Schema** | OWL/RDFS (complex specification) | Zod schemas (familiar, composable) |
| **Integration** | Separate system, data sync required | Embedded in your app |
| **Inference** | Full reasoning engines available | Precomputed closures, practical subset |

**Choose TypeGraph** when you want ontological concepts (subclass, disjoint, implies) without the complexity of the full semantic web stack.

### The TypeGraph Sweet Spot

TypeGraph is designed for applications where:

1. **The graph is your domain model** — not a separate analytical system
2. **You already use SQL** — and don't want another database to manage
3. **Type safety matters** — you want compile-time checking, not runtime surprises
4. **Semantic relationships help** — inheritance, implications, constraints add value
5. **Scale is moderate** — thousands to millions of nodes, not billions

## When to Use TypeGraph

TypeGraph is ideal for:

- **Knowledge bases** with typed entities and relationships
- **Organizational structures** with hierarchies and roles
- **Content graphs** with topics, articles, and references
- **Domain models** requiring semantic constraints
- **RAG applications** combining graph traversal with vector search

TypeGraph is not ideal for:

- Large-scale graph analytics requiring distributed processing
- Social networks with billions of edges
- Real-time streaming graph data
- Applications requiring graph algorithms (use Neo4j or a graph library)

# Quick Start

> Set up TypeGraph and build your first knowledge graph

Get TypeGraph running in your project with this minimal example.

## 1. Install

```bash
npm install @nicia-ai/typegraph zod drizzle-orm better-sqlite3
npm install -D @types/better-sqlite3
```

> **Edge environments (Cloudflare Workers, etc.):** Skip `better-sqlite3` and use
> `@nicia-ai/typegraph/sqlite` with your edge-compatible driver (D1, libsql).
> See [Edge and Serverless](/integration#edge-and-serverless).

## 2. Create Your First Graph

```typescript
import { z } from "zod";
import { defineNode, defineEdge, defineGraph, createStore } from "@nicia-ai/typegraph";
import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local";

// Create an in-memory SQLite backend
const { backend } = createLocalSqliteBackend();

// Define your schema
const Person = defineNode("Person", {
  schema: z.object({ name: z.string(), role: z.string().optional() }),
});
const Project = defineNode("Project", {
  schema: z.object({ name: z.string(), status: z.enum(["active", "done"]) }),
});
const worksOn = defineEdge("worksOn");

const graph = defineGraph({
  id: "my_app",
  nodes: { Person: { type: Person }, Project: { type: Project } },
  edges: { worksOn: { type: worksOn, from: [Person], to: [Project] } },
});

// Create the store
const store = createStore(graph, backend);

// Use it!
const alice = await store.nodes.Person.create({ name: "Alice", role: "Engineer" });
const project = await store.nodes.Project.create({ name: "Website", status: "active" });
await store.edges.worksOn.create(alice, project, {});

// Query with full type safety
const results = await store
  .query()
  .from("Person", "p")
  .traverse("worksOn", "e")
  .to("Project", "proj")
  .select((ctx) => ({ person: ctx.p.name, project: ctx.proj.name }))
  .execute();

console.log(results); // [{ person: "Alice", project: "Website" }]
```

That's it! You have a working knowledge graph. Read on for the complete setup guide.

---

## Complete Setup Guide

This section covers production setup with SQLite and PostgreSQL in detail.
### Installation

```bash
npm install @nicia-ai/typegraph zod drizzle-orm better-sqlite3
npm install -D @types/better-sqlite3
```

> `better-sqlite3` is optional. For edge environments, use `@nicia-ai/typegraph/sqlite`
> with D1, libsql, or bun:sqlite instead.

### SQLite Setup

TypeGraph provides two ways to set up SQLite:

#### Quick Setup (Recommended for Development)

The simplest way to get started. Handles database creation and schema setup automatically.

> **Note:** `createLocalSqliteBackend` requires `better-sqlite3` and only works in Node.js.
> For edge environments, see [Manual Setup](#manual-setup-full-control) with `/sqlite`.

```typescript
import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local";

// In-memory database (data lost on restart)
const { backend } = createLocalSqliteBackend();

// File-based database (persistent)
const { backend, db } = createLocalSqliteBackend({ path: "./my-app.db" });
```

The function returns both the `backend` (for use with `createStore`) and `db` (the underlying Drizzle instance for direct SQL access if needed).
#### Manual Setup (Full Control)

For production deployments or when you need full control over the database configuration:

```typescript
import Database from "better-sqlite3";
import { drizzle } from "drizzle-orm/better-sqlite3";
import { createSqliteBackend, generateSqliteMigrationSQL } from "@nicia-ai/typegraph/sqlite";

// Create database connection
const sqlite = new Database("my-app.db");

// Run TypeGraph migrations (creates required tables)
sqlite.exec(generateSqliteMigrationSQL());

// Create Drizzle instance
const db = drizzle(sqlite);

// Create the backend
const backend = createSqliteBackend(db);
```

#### Edge-Compatible Setup (D1, libsql, bun:sqlite)

For Cloudflare Workers, Turso, or other edge environments, use the driver-agnostic backend:

```typescript
import { drizzle } from "drizzle-orm/d1"; // or libsql, bun-sqlite
import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite";

// D1 example
const db = drizzle(env.DB);
const backend = createSqliteBackend(db);
```

Use [drizzle-kit managed migrations](/integration#drizzle-kit-managed-migrations-recommended) to set up the schema.

#### Drizzle-Kit Managed Migrations

If you already use `drizzle-kit` for migrations, see [Drizzle-Kit Managed Migrations](/integration#drizzle-kit-managed-migrations-recommended) for how to import TypeGraph's schema into your `schema.ts` file.

## Defining Your Schema

### Step 1: Define Node Types

Nodes represent entities in your graph.
Each node type has a name and a Zod schema:

```typescript
import { z } from "zod";
import { defineNode } from "@nicia-ai/typegraph";

const Person = defineNode("Person", {
  schema: z.object({
    name: z.string().min(1),
    email: z.string().email().optional(),
    bio: z.string().optional(),
  }),
});

const Project = defineNode("Project", {
  schema: z.object({
    name: z.string(),
    description: z.string().optional(),
    status: z.enum(["planning", "active", "completed"]),
  }),
});

const Task = defineNode("Task", {
  schema: z.object({
    title: z.string(),
    priority: z.enum(["low", "medium", "high"]),
    completed: z.boolean().default(false),
  }),
});
```

### Step 2: Define Edge Types

Edges represent relationships between nodes:

```typescript
import { defineEdge } from "@nicia-ai/typegraph";

const worksOn = defineEdge("worksOn", {
  schema: z.object({
    role: z.string().optional(),
    since: z.string().optional(),
  }),
});

const hasTask = defineEdge("hasTask", {
  schema: z.object({}),
});

const assignedTo = defineEdge("assignedTo", {
  schema: z.object({
    assignedAt: z.string().optional(),
  }),
});

// Unconstrained edge — connects any node to any node
const related = defineEdge("related");
```

### Step 3: Create the Graph Definition

Combine nodes, edges, and ontology into a graph:

```typescript
import { defineGraph, disjointWith } from "@nicia-ai/typegraph";

const graph = defineGraph({
  id: "project_management",
  nodes: {
    Person: { type: Person },
    Project: { type: Project },
    Task: { type: Task },
  },
  edges: {
    worksOn: { type: worksOn, from: [Person], to: [Project] },
    hasTask: { type: hasTask, from: [Project], to: [Task] },
    assignedTo: { type: assignedTo, from: [Task], to: [Person] },
    related, // any→any
  },
  ontology: [
    // A Person cannot be a Project or Task
    disjointWith(Person, Project),
    disjointWith(Person, Task),
    disjointWith(Project, Task),
  ],
});
```

### Step 4: Create the Store

The store connects your graph definition to the database:

```typescript
import { createStore } from "@nicia-ai/typegraph";

const store = createStore(graph, backend);
```

#### Store Creation: Which Function to Use

| Function | Schema Handling | Use Case |
|----------|-----------------|----------|
| `createLocalSqliteBackend` | Automatic | Quick start, development, tests |
| `createStore` + manual migration | None | When you manage migrations externally |
| `createStoreWithSchema` | Validates & auto-migrates | **Recommended for production** |

For production, use `createStoreWithSchema` to validate and auto-apply safe schema changes:

```typescript
import { createStoreWithSchema } from "@nicia-ai/typegraph";

const [store, result] = await createStoreWithSchema(graph, backend);

if (result.status === "initialized") {
  console.log("Schema initialized at version", result.version);
} else if (result.status === "migrated") {
  console.log(`Migrated from v${result.fromVersion} to v${result.toVersion}`);
}
// Other statuses: "unchanged", "pending", "breaking"
// See Schema Migrations for full details
```

#### Graph ID

Every graph has a unique `id` that scopes its data:

```typescript
const graph = defineGraph({
  id: "my_app", // Scopes all nodes/edges to this graph
  // ...
});
```

**Key behaviors:**

- All nodes and edges are stored with this `graph_id` in the database
- Multiple graphs can share the same database tables (isolated by `graph_id`)
- Changing the ID creates a new, empty graph (existing data is orphaned)

See [Multiple Graphs](/multiple-graphs) for multi-graph deployments.
## Working with Data

### Creating Nodes

```typescript
const alice = await store.nodes.Person.create({
  name: "Alice Smith",
  email: "alice@example.com",
});

const project = await store.nodes.Project.create({
  name: "Website Redesign",
  status: "active",
});

const task = await store.nodes.Task.create({
  title: "Design mockups",
  priority: "high",
});
```

### Creating Edges

Pass node objects directly to create edges:

```typescript
await store.edges.worksOn.create(alice, project, { role: "Lead Designer" });
await store.edges.hasTask.create(project, task, {});
await store.edges.assignedTo.create(task, alice, { assignedAt: new Date().toISOString() });
```

### Retrieving Nodes

```typescript
const person = await store.nodes.Person.getById(alice.id);
console.log(person?.name); // "Alice Smith"
```

### Updating Nodes

```typescript
const updated = await store.nodes.Task.update(task.id, { completed: true });
```

### Deleting Nodes

```typescript
await store.nodes.Task.delete(task.id);
```

## Querying Data

TypeGraph provides a fluent query builder:

```typescript
// Find all active projects
const activeProjects = await store
  .query()
  .from("Project", "p")
  .whereNode("p", (p) => p.status.eq("active"))
  .select((ctx) => ctx.p)
  .execute();

// Find people working on a project
const teamMembers = await store
  .query()
  .from("Project", "p")
  .traverse("worksOn", "e", { direction: "in" })
  .to("Person", "person")
  .select((ctx) => ({
    project: ctx.p.name,
    person: ctx.person.name,
  }))
  .execute();

// Multi-hop traversal: find tasks for a person
const myTasks = await store
  .query()
  .from("Person", "person")
  .whereNode("person", (p) => p.name.eq("Alice Smith"))
  .traverse("worksOn", "e1")
  .to("Project", "project")
  .traverse("hasTask", "e2")
  .to("Task", "task")
  .select((ctx) => ({
    project: ctx.project.name,
    task: ctx.task.title,
    priority: ctx.task.priority,
  }))
  .execute();
```

## Transactions

Group operations in transactions for atomicity:

```typescript
await store.transaction(async (tx) => {
  const project = await tx.nodes.Project.create({
    name: "New Feature",
    status: "planning",
  });

  const task1 = await tx.nodes.Task.create({
    title: "Research",
    priority: "high",
  });

  const task2 = await tx.nodes.Task.create({
    title: "Implementation",
    priority: "medium",
  });

  await tx.edges.hasTask.create(project, task1, {});
  await tx.edges.hasTask.create(project, task2, {});
});
```

## Error Handling

TypeGraph provides specific error types:

```typescript
import {
  ValidationError,
  NodeNotFoundError,
  DisjointError,
  RestrictedDeleteError,
} from "@nicia-ai/typegraph";

try {
  await store.nodes.Person.create({ name: "" }); // Invalid: empty name
} catch (error) {
  if (error instanceof ValidationError) {
    console.log("Validation failed:", error.message);
  }
}

try {
  await store.nodes.Project.delete(project.id);
} catch (error) {
  if (error instanceof RestrictedDeleteError) {
    console.log("Cannot delete: edges exist");
  }
}
```

## PostgreSQL Setup

TypeGraph also supports PostgreSQL for production deployments with better concurrency and JSON support.

### Installation

```bash
npm install @nicia-ai/typegraph zod drizzle-orm pg
npm install -D @types/pg
```

### Database Setup

```typescript
import { Pool } from "pg";
import { drizzle } from "drizzle-orm/node-postgres";
import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres";

// Create connection pool
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // Connection pool size
});

// Run TypeGraph migrations
await pool.query(generatePostgresMigrationSQL());

// Create Drizzle instance and backend
const db = drizzle(pool);
const backend = createPostgresBackend(db);
```

If you use `drizzle-kit` for migrations, see [Drizzle-Kit Managed Migrations](/integration#drizzle-kit-managed-migrations-recommended).
### PostgreSQL Advantages

- **JSONB**: Native JSON type with efficient indexing
- **Connection pooling**: Better concurrency handling
- **Partial indexes**: More efficient uniqueness constraints
- **Full transactions**: ACID guarantees across operations

### Using with Connection Pools

For production, always use connection pooling:

```typescript
import { Pool } from "pg";

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Graceful shutdown
process.on("SIGTERM", async () => {
  await pool.end();
});
```

## Next Steps

- [Project Structure](/project-structure) - Organize your graph definitions as your project grows
- [Schemas & Types](/core-concepts) - Deep dive into nodes, edges, and schemas
- [Ontology](/ontology) - Learn about semantic relationships
- [Query Builder](/queries/overview) - Query patterns and traversals
- [Schemas & Stores](/schemas-stores) - Complete API documentation

# Schemas & Types

> Defining nodes, edges, and leveraging TypeScript inference

TypeGraph's power comes from its type system.
Define your schema once with Zod, and get:

- **Runtime validation** on every create and update
- **TypeScript types** inferred automatically (no duplication)
- **Query builder constraints** that prevent invalid queries at compile time

## Contents

- [Nodes](#nodes) — Entities with properties and metadata
  - [Defining Node Types](#defining-node-types)
  - [Schema Features](#schema-features)
  - [Node Operations](#node-operations)
- [Edges](#edges) — Relationships between nodes
  - [Defining Edge Types](#defining-edge-types) (domain/range constraints)
  - [Edge Constraints](#edge-constraints) (cardinality)
  - [Edge Operations](#edge-operations)
- [Graph Definition](#graph-definition) — Combining nodes, edges, and ontology
- [Delete Behaviors](#delete-behaviors) — Restrict, cascade, disconnect
- [Uniqueness Constraints](#uniqueness-constraints) — Enforcing unique values
- [Type Inference](#type-inference) — Extracting TypeScript types from schemas

## Nodes

Nodes represent entities in your graph. Each node has:

- **Type**: The type of node (e.g., "Person", "Company")
- **ID**: A unique identifier within the graph
- **Props**: Properties defined by a Zod schema
- **Metadata**: Version, timestamps, and soft-delete state

### Defining Node Types

```typescript
import { z } from "zod";
import { defineNode } from "@nicia-ai/typegraph";

const Person = defineNode("Person", {
  schema: z.object({
    fullName: z.string().min(1),
    email: z.string().email().optional(),
    dateOfBirth: z.string().optional(),
    tags: z.array(z.string()).default([]),
  }),
  description: "A person in the system", // Optional
});
```

### Schema Features

TypeGraph supports all Zod validation features:

```typescript
const Product = defineNode("Product", {
  schema: z.object({
    // Required string
    name: z.string().min(1).max(200),

    // Optional with default
    status: z.enum(["draft", "active", "archived"]).default("draft"),

    // Number with constraints
    price: z.number().positive(),

    // Array with items validation
    categories: z.array(z.string()).min(1),

    // Regex pattern
    sku: z.string().regex(/^[A-Z]{2,4}-\d{4,8}$/),

    // Nullable field
    description: z.string().nullable(),

    // Transform on validation
    slug: z.string().transform((s) => s.toLowerCase().replace(/\s+/g, "-")),
  }),
});
```

### Node Operations

```typescript
// Create with auto-generated ID
const node = await store.nodes.Person.create({ fullName: "Alice Smith" });

// Create with specific ID
const node = await store.nodes.Person.create({ fullName: "Alice Smith" }, { id: "person-alice" });

// Retrieve
const person = await store.nodes.Person.getById("person-alice");

// Update (partial)
const updated = await store.nodes.Person.update("person-alice", {
  email: "alice@example.com",
});

// Delete (soft delete by default)
await store.nodes.Person.delete("person-alice");

// Hard delete (permanent removal) - use carefully!
await store.nodes.Person.hardDelete("person-alice");
```

### Node Object Shape

A node returned from the store has this structure:

```typescript
const alice = await store.nodes.Person.create({ name: "Alice", email: "a@example.com" });

// alice = {
//   id: "01HX...",              // Generated ULID (or your custom ID)
//   kind: "Person",             // The node type name
//   name: "Alice",              // Schema property (flattened to top level)
//   email: "a@example.com",     // Schema property
//   meta: {
//     version: 1,
//     createdAt: "2024-01-15T10:30:00.000Z",
//     updatedAt: "2024-01-15T10:30:00.000Z",
//     deletedAt: undefined,
//     validFrom: undefined,
//     validTo: undefined,
//   }
// }
```

Schema properties are flattened to the top level for ergonomic access (`alice.name` instead of `alice.props.name`). System metadata lives under `meta`.
### Soft Delete vs Hard Delete

By default, `delete()` performs a **soft delete**—it sets the `deletedAt` timestamp but preserves the record:

```typescript
await store.nodes.Person.delete(alice.id); // Sets deletedAt, keeps the record
```

For permanent removal, use `hardDelete()`:

```typescript
await store.nodes.Person.hardDelete(alice.id); // Removes from database
```

**When to use each:**

| Method | Use Case |
|--------|----------|
| `delete()` | Standard deletions, audit trails, undo capability |
| `hardDelete()` | GDPR erasure, storage cleanup, removing test data |

**Warning:** `hardDelete()` is irreversible. It also removes associated uniqueness entries and embeddings. Consider using soft delete for most use cases.

## Edges

Edges represent relationships between nodes. Each edge has:

- **Type**: The type of relationship (e.g., "worksAt", "knows")
- **ID**: A unique identifier
- **From**: Source node (type + ID)
- **To**: Target node (type + ID)
- **Props**: Properties defined by a Zod schema

### Defining Edge Types

```typescript
import { defineEdge } from "@nicia-ai/typegraph";

// Edge with properties
const worksAt = defineEdge("worksAt", {
  schema: z.object({
    role: z.string(),
    startDate: z.string().optional(),
    isPrimary: z.boolean().default(true),
  }),
});

// Edge without properties
const knows = defineEdge("knows");
// Equivalent to: defineEdge("knows", { schema: z.object({}) })
```

#### Unconstrained Edges

Edges defined without `from` and `to` are **unconstrained** — they can connect any node type to any node type.
When used directly in `defineGraph`, they are automatically allowed for all node types in the graph:

```typescript
const sameAs = defineEdge("sameAs");
const related = defineEdge("related", {
  schema: z.object({ reason: z.string() }),
});

const graph = defineGraph({
  id: "my_graph",
  nodes: {
    Person: { type: Person },
    Company: { type: Company },
  },
  edges: {
    sameAs, // any→any (Person↔Person, Person↔Company, Company↔Company)
    related, // any→any, with properties
    worksAt: { type: worksAt, from: [Person], to: [Company] }, // constrained
  },
});

// All of these work:
await store.edges.sameAs.create(alice, bob, {}); // Person→Person
await store.edges.sameAs.create(alice, acme, {}); // Person→Company
await store.edges.sameAs.create(acme, alice, {}); // Company→Person
```

This is useful for semantic relationships like `sameAs`, `seeAlso`, `related`, or `tagged` that apply broadly across node types.

#### Domain and Range Constraints

Edges can include built-in domain (source types) and range (target types) constraints directly in their definition. This makes edge definitions self-contained and reusable:

```typescript
// Edge with built-in domain/range constraints
const worksAt = defineEdge("worksAt", {
  schema: z.object({
    role: z.string(),
    startDate: z.string().optional(),
  }),
  from: [Person], // Domain: only Person can be the source
  to: [Company], // Range: only Company can be the target
});

// Edge connecting multiple types
const mentions = defineEdge("mentions", {
  from: [Article, Comment],
  to: [Person, Company, Topic],
});
```

Any edge type can be used directly in `defineGraph` without an `EdgeRegistration` wrapper.
Constrained edges use their built-in `from`/`to`; unconstrained edges allow all node types:

```typescript
const graph = defineGraph({
  nodes: { Person: { type: Person }, Company: { type: Company } },
  edges: {
    worksAt, // Constrained - uses built-in from/to
    sameAs, // Unconstrained - connects any node to any node
  },
});
```

You can still use `EdgeRegistration` to narrow (but not widen) the constraints:

```typescript
const worksAt = defineEdge("worksAt", {
  from: [Person],
  to: [Company, Subsidiary], // Allows both Company and Subsidiary
});

const graph = defineGraph({
  edges: {
    // Narrow to only Subsidiary targets in this graph
    worksAt: { type: worksAt, from: [Person], to: [Subsidiary] },
  },
});
```

Attempting to widen beyond the edge's built-in constraints throws a `ValidationError`:

```typescript
const worksAt = defineEdge("worksAt", {
  from: [Person],
  to: [Company],
});

// This throws ValidationError - OtherEntity is not in the edge's range
defineGraph({
  edges: {
    worksAt: { type: worksAt, from: [Person], to: [OtherEntity] },
  },
});
```

### Edge Constraints

#### Cardinality

Control how many edges can exist:

```typescript
const graph = defineGraph({
  edges: {
    // Default: no limit
    knows: { type: knows, from: [Person], to: [Person], cardinality: "many" },

    // At most one edge of this type from any source node
    currentEmployer: {
      type: currentEmployer,
      from: [Person],
      to: [Company],
      cardinality: "one",
    },

    // At most one edge between any (source, target) pair
    rated: { type: rated, from: [Person], to: [Product], cardinality: "unique" },

    // At most one active edge (valid_to IS NULL) from any source
    currentRole: {
      type: currentRole,
      from: [Person],
      to: [Company],
      cardinality: "oneActive",
    },
  },
});
```

| Cardinality | Description |
|-------------|-------------|
| `"many"` | No limit (default) |
| `"one"` | At most one edge of this type from any source node |
| `"unique"` | At most one edge between any (source, target) pair |
| `"oneActive"` | At most one edge with `valid_to IS NULL` from any source |

#### Enforcement Timing

Cardinality constraints are checked at edge **creation time**, before the insert:

```typescript
// With cardinality: "one" on currentEmployer:
await store.edges.currentEmployer.create(alice, acme, {}); // OK
await store.edges.currentEmployer.create(alice, other, {}); // Throws CardinalityError
```

The check queries existing edges and throws `CardinalityError` if violated. For `oneActive`, only edges with `validTo` unset count toward the limit.

### Edge Operations

```typescript
// Create edge - pass nodes directly
const edge = await store.edges.worksAt.create(alice, acme, { role: "Engineer" });

// Retrieve edge
const e = await store.edges.worksAt.getById(edge.id);

// Delete edge
await store.edges.worksAt.delete(edge.id);
```

## Graph Definition

The graph definition combines all components:

```typescript
import { defineGraph } from "@nicia-ai/typegraph";

const graph = defineGraph({
  // Unique identifier for this graph
  id: "my_application",

  // Node registrations
  nodes: {
    Person: {
      type: Person,
      onDelete: "restrict", // Default behavior
    },
    Company: {
      type: Company,
      onDelete: "cascade",
    },
    Employment: {
      type: Employment,
      onDelete: "disconnect",
    },
  },

  // Edge registrations
  edges: {
    worksAt: {
      type: worksAt,
      from: [Person],
      to: [Company],
      cardinality: "many",
    },
    employedAt: {
      type: employedAt,
      from: [Company],
      to: [Employment],
      cardinality: "many",
    },
  },

  // Semantic relationships
  ontology: [subClassOf(Company, Organization), disjointWith(Person, Company)],
});
```

## Delete Behaviors

Control what happens when nodes are deleted:

### Restrict (Default)

Blocks deletion if any edges are connected:

```typescript
nodes: {
  Author: { type: Author }, // onDelete defaults to "restrict"
}

// This throws RestrictedDeleteError if Author has edges
await store.nodes.Author.delete(authorId);
```

### Cascade

Automatically deletes all connected edges:

```typescript
nodes: {
  Book: { type: Book, onDelete: "cascade" },
}

// Deletes the book and all edges connected to it
await store.nodes.Book.delete(bookId);
```

### Disconnect

Soft-deletes edges (preserves history):

```typescript
nodes: {
  Review: { type: Review, onDelete: "disconnect" },
}

// Marks connected edges as deleted (deleted_at is set)
await store.nodes.Review.delete(reviewId);
```

## Uniqueness Constraints

Ensure unique values within node types:

```typescript
const graph = defineGraph({
  nodes: {
    Person: {
      type: Person,
      unique: [
        {
          name: "person_email",
          fields: ["email"],
          where: (props) => props.email.isNotNull(),
          scope: "kind",
          collation: "caseInsensitive",
        },
      ],
    },
    Company: {
      type: Company,
      unique: [
        {
          name: "company_ticker",
          fields: ["ticker"],
          scope: "kind",
          collation: "binary",
        },
      ],
    },
  },
});
```

### Scope Options

- `"kind"`: Unique within this exact type only
- `"kindWithSubClasses"`: Unique across this type and all subclasses

### Collation Options

- `"binary"`: Case-sensitive comparison
- `"caseInsensitive"`: Case-insensitive comparison

## Type Inference

TypeGraph infers TypeScript types from Zod schemas—you never duplicate type definitions.

### Extracting Types from Definitions

```typescript
import { z } from "zod";
import { defineNode, type Node, type NodeProps, type NodeId } from "@nicia-ai/typegraph";

const Person = defineNode("Person", {
  schema: z.object({
    name: z.string(),
    email: z.string().email().optional(),
    age: z.number().optional(),
  }),
});

// For functions that work with full nodes (id, kind, metadata, props):
type PersonNode = Node<typeof Person>;
// { id: NodeId; kind: "Person"; name: string; email?: string; version: number; createdAt: Date; ... }

// For functions that only need the property data:
type PersonProps = NodeProps<typeof Person>;
// { name: string; email?: string; age?: number }

// For type-safe node IDs (prevents mixing IDs from different node types):
type PersonId = NodeId<typeof Person>;
// string & { readonly [__nodeId]: typeof Person }
```

Use `Node` when your function needs the full node with metadata.
Use `NodeProps` when you only care about the schema properties (e.g., for form validation or API payloads). ### Typed Store Operations ```typescript // Create returns a fully typed Node const alice: Node<typeof Person> = await store.nodes.Person.create({ name: "Alice", email: "alice@example.com", }); // TypeScript knows the structure alice.id; // NodeId - branded string alice.name; // string alice.email; // string | undefined alice.age; // number | undefined alice.version; // number alice.createdAt; // Date // Type errors caught at compile time await store.nodes.Person.create({ name: 123, // Error: Type 'number' is not assignable to type 'string' invalid: "field", // Error: Object literal may only specify known properties }); ``` ### Typed Query Results ```typescript // Result type is inferred from your select projection const results = await store .query() .from("Person", "p") .select((ctx) => ({ name: ctx.p.name, // TypeScript knows: string email: ctx.p.email, // TypeScript knows: string | undefined id: ctx.p.id, // TypeScript knows: NodeId })) .execute(); // results: Array<{ name: string; email: string | undefined; id: NodeId }> // Invalid property access is caught .select((ctx) => ({ invalid: ctx.p.nonexistent, // TypeScript error! })) ``` ### Typed Edge Operations Edge endpoints are constrained to valid node types: ```typescript // Edge definition: worksAt goes from Person → Company const graph = defineGraph({ // ... edges: { worksAt: { type: worksAt, from: [Person], to: [Company] }, }, }); // TypeScript enforces valid endpoints await store.edges.worksAt.create(alice, acmeCorp, { role: "Engineer" }); // OK await store.edges.worksAt.create(acmeCorp, alice, { role: "Engineer" }); // Error: Argument of type 'Node<typeof Company>' is not assignable to parameter of type 'Node<typeof Person>' ``` # Backend Setup > Configure SQLite and PostgreSQL backends for TypeGraph TypeGraph stores graph data in your existing relational database using Drizzle ORM adapters.
This guide covers setting up SQLite and PostgreSQL backends. :::note[Custom indexes] TypeGraph migrations create the core tables and built-in indexes. For application-specific indexes on JSON properties (and Drizzle/drizzle-kit integration), see [Indexes](/performance/indexes). ::: ## SQLite SQLite is ideal for development, testing, single-server deployments, and embedded applications. ### Quick Setup For development and testing, use the convenience function that handles everything: ```typescript import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local"; import { createStore } from "@nicia-ai/typegraph"; // In-memory database (resets on restart) const { backend } = createLocalSqliteBackend(); const store = createStore(graph, backend); // File-based database (persisted) const { backend, db } = createLocalSqliteBackend({ path: "./app.db" }); const store = createStore(graph, backend); ``` ### Manual Setup For full control over the database connection: ```typescript import Database from "better-sqlite3"; import { drizzle } from "drizzle-orm/better-sqlite3"; import { createSqliteBackend, generateSqliteMigrationSQL } from "@nicia-ai/typegraph/sqlite"; import { createStore } from "@nicia-ai/typegraph"; // Create and configure the database const sqlite = new Database("app.db"); sqlite.pragma("journal_mode = WAL"); // Recommended for performance sqlite.pragma("foreign_keys = ON"); // Run TypeGraph migrations sqlite.exec(generateSqliteMigrationSQL()); // Create Drizzle instance and backend const db = drizzle(sqlite); const backend = createSqliteBackend(db); const store = createStore(graph, backend); // Clean up when done process.on("exit", () => sqlite.close()); ``` ### SQLite with Vector Search For semantic search, use sqlite-vec: ```typescript import Database from "better-sqlite3"; import { drizzle } from "drizzle-orm/better-sqlite3"; import { createSqliteBackend, generateSqliteMigrationSQL } from "@nicia-ai/typegraph/sqlite"; const sqlite = new 
Database("app.db"); // Load sqlite-vec extension sqlite.loadExtension("vec0"); // Run migrations (includes vector index setup) sqlite.exec(generateSqliteMigrationSQL()); const db = drizzle(sqlite); const backend = createSqliteBackend(db); ``` See [Semantic Search](/semantic-search) for query examples. ### API Reference #### `createLocalSqliteBackend(options?)` Creates a SQLite backend with automatic database and schema setup. ```typescript function createLocalSqliteBackend(options?: { path?: string; // Database path, defaults to ":memory:" tables?: SqliteTables; }): { backend: GraphBackend; db: BetterSQLite3Database }; ``` #### `createSqliteBackend(db, options?)` Creates a SQLite backend from an existing Drizzle database instance. ```typescript function createSqliteBackend( db: BetterSQLite3Database, options?: { tables?: SqliteTables } ): GraphBackend; ``` #### `generateSqliteMigrationSQL()` Returns SQL for creating TypeGraph tables in SQLite. ```typescript function generateSqliteMigrationSQL(): string; ``` ## PostgreSQL PostgreSQL is recommended for production deployments with concurrent access, large datasets, or when you need advanced features like pgvector. 
### Basic Setup ```typescript import { Pool } from "pg"; import { drizzle } from "drizzle-orm/node-postgres"; import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; import { createStore } from "@nicia-ai/typegraph"; // Create connection pool const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20, // Maximum connections }); // Run TypeGraph migrations await pool.query(generatePostgresMigrationSQL()); // Create Drizzle instance and backend const db = drizzle(pool); const backend = createPostgresBackend(db); const store = createStore(graph, backend); ``` ### PostgreSQL with Vector Search For semantic search, enable pgvector: ```typescript import { Pool } from "pg"; import { drizzle } from "drizzle-orm/node-postgres"; import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; const pool = new Pool({ connectionString: process.env.DATABASE_URL }); // Migration SQL includes pgvector extension await pool.query(generatePostgresMigrationSQL()); // Runs: CREATE EXTENSION IF NOT EXISTS vector; const db = drizzle(pool); const backend = createPostgresBackend(db); ``` See [Semantic Search](/semantic-search) for query examples. ### Connection Pooling For production, always use connection pooling: ```typescript import { Pool } from "pg"; const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20, // Maximum pool size idleTimeoutMillis: 30000, // Close idle connections after 30s connectionTimeoutMillis: 2000, // Timeout for new connections }); // Handle pool errors pool.on("error", (err) => { console.error("Unexpected pool error", err); }); // Graceful shutdown process.on("SIGTERM", async () => { await pool.end(); process.exit(0); }); ``` ### API Reference #### `createPostgresBackend(db, options?)` Creates a PostgreSQL backend adapter. 
```typescript function createPostgresBackend( db: NodePgDatabase, options?: { tables?: PostgresTables } ): GraphBackend; ``` #### `generatePostgresMigrationSQL()` Returns SQL for creating TypeGraph tables in PostgreSQL, including the pgvector extension. ```typescript function generatePostgresMigrationSQL(): string; ``` #### `generatePostgresDDL(tables?)` Returns individual DDL statements (CREATE TABLE, CREATE INDEX) as an array. Useful when you need per-statement control, for example to execute them in separate transactions or log them individually. ```typescript function generatePostgresDDL(tables?: PostgresTables): string[]; ``` ## Drizzle Entrypoints TypeGraph exposes Drizzle adapters through two public entrypoints: - `@nicia-ai/typegraph/sqlite` - SQLite adapter exports - `@nicia-ai/typegraph/postgres` - PostgreSQL adapter exports Import from the entrypoint matching your database: ```typescript import { createSqliteBackend, tables } from "@nicia-ai/typegraph/sqlite"; ``` ```typescript import { createPostgresBackend, tables } from "@nicia-ai/typegraph/postgres"; ``` ## Cloudflare D1 TypeGraph supports Cloudflare D1 for edge deployments, with some limitations. ```typescript import { drizzle } from "drizzle-orm/d1"; import { createStore } from "@nicia-ai/typegraph"; import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite"; export default { async fetch(request: Request, env: Env) { const db = drizzle(env.DB); const backend = createSqliteBackend(db); const store = createStore(graph, backend); // Use store... }, }; ``` **Important:** D1 does not support transactions. See [Limitations](/limitations) for details. ## Backend Capabilities Check what features a backend supports: ```typescript const backend = createSqliteBackend(db); if (backend.capabilities.transactions) { await store.transaction(async (tx) => { /* ... 
*/ }); } else { // Handle non-transactional execution } if (backend.capabilities.vectorSearch) { // Vector similarity queries available } ``` ## Connection Management TypeGraph does not manage database connections. You are responsible for: 1. **Creating connections** with appropriate configuration 2. **Connection pooling** for production use 3. **Closing connections** on shutdown ```typescript // You create the connection const sqlite = new Database("app.db"); const db = drizzle(sqlite); const backend = createSqliteBackend(db); const store = createStore(graph, backend); // You close the connection process.on("exit", () => { sqlite.close(); }); ``` The `store.close()` method is a no-op—cleanup is your responsibility. ## Environment-Specific Setup ### Development ```typescript // In-memory for fast tests const { backend } = createLocalSqliteBackend(); // Or file-based for persistence during development const { backend } = createLocalSqliteBackend({ path: "./dev.db" }); ``` ### Testing ```typescript // Fresh in-memory database per test beforeEach(() => { const { backend } = createLocalSqliteBackend(); store = createStore(graph, backend); }); ``` ### Production ```typescript // PostgreSQL with pooling const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20, ssl: { rejectUnauthorized: false }, // For managed databases }); await pool.query(generatePostgresMigrationSQL()); const db = drizzle(pool); const backend = createPostgresBackend(db); const [store] = await createStoreWithSchema(graph, backend); ``` ## Next Steps - [Schemas & Types](/core-concepts) - Define your graph schema - [Semantic Search](/semantic-search) - Vector embeddings and similarity search - [Limitations](/limitations) - Backend-specific constraints # Query Builder Overview > A fluent, type-safe API for querying your graph TypeGraph provides a fluent, type-safe query builder for traversing and filtering your graph. 
This page introduces the query categories and how they compose together. ## Query Categories Every query builder method falls into one of these categories: | Category | Purpose | Key Methods | |----------|---------|-------------| | [Source](/queries/source) | Entry point - where to start | `from()` | | [Filter](/queries/filter) | Reduce the result set | `whereNode()`, `whereEdge()` | | [Traverse](/queries/traverse) | Navigate relationships | `traverse()`, `optionalTraverse()`, `to()` | | [Recursive](/queries/recursive) | Variable-length paths | `recursive()` | | [Shape](/queries/shape) | Transform output structure | `select()`, `aggregate()` | | [Aggregate](/queries/aggregate) | Summarize data | `groupBy()`, `count()`, `sum()`, `avg()` | | [Order](/queries/order) | Control result ordering/size | `orderBy()`, `limit()`, `offset()` | | [Temporal](/queries/temporal) | Time-based queries | `temporal()` | | [Compose](/queries/compose) | Reusable query parts | `pipe()`, `createFragment()` | | [Combine](/queries/combine) | Set operations | `union()`, `intersect()`, `except()` | | [Execute](/queries/execute) | Run and retrieve | `execute()`, `first()`, `count()`, `exists()`, `paginate()`, `stream()` | ## Query Flow A typical query follows this flow: ```text Source → Filter → Traverse → Filter → Shape → Order → Execute ↑__________________| (repeat as needed) ``` Each step is optional except Source and Execute. You can filter, traverse, and filter again as many times as needed before shaping and executing. 
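To see why stages can repeat freely, it helps to picture each fluent call as appending a step to an immutable plan that only runs at Execute. The toy builder below is a standalone illustration of that accumulation pattern, not TypeGraph's actual implementation (the names `PlanBuilder` and `Step` are invented for this sketch):

```typescript
// Toy sketch of a step-accumulating query plan (illustration only, not
// TypeGraph's real internals): each fluent call appends a step to an
// immutable list, and nothing runs until the plan is consumed.
type Step =
  | { kind: "source"; nodeKind: string; alias: string }
  | { kind: "filter"; alias: string }
  | { kind: "traverse"; edge: string; alias: string };

class PlanBuilder {
  private constructor(private readonly steps: readonly Step[]) {}

  static from(nodeKind: string, alias: string): PlanBuilder {
    return new PlanBuilder([{ kind: "source", nodeKind, alias }]);
  }

  whereNode(alias: string): PlanBuilder {
    return new PlanBuilder([...this.steps, { kind: "filter", alias }]);
  }

  traverse(edge: string, alias: string): PlanBuilder {
    return new PlanBuilder([...this.steps, { kind: "traverse", edge, alias }]);
  }

  plan(): readonly Step[] {
    return this.steps;
  }
}

// Filter, traverse, then filter again: the plan simply grows.
const plan = PlanBuilder.from("Person", "p")
  .whereNode("p")
  .traverse("worksAt", "e")
  .whereNode("e")
  .plan();

console.log(plan.map((s) => s.kind).join(" -> "));
// -> "source -> filter -> traverse -> filter"
```

Because every call returns a new builder holding the extended step list, any ordering of the optional stages yields a valid plan; the real query builder additionally type-checks aliases and properties, as the examples that follow show.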
## Basic Example ```typescript const results = await store .query() .from("Person", "p") // Source .whereNode("p", (p) => p.status.eq("active")) // Filter .traverse("worksAt", "e") // Traverse .to("Company", "c") // Traverse (target) .whereNode("c", (c) => c.industry.eq("Tech")) // Filter .select((ctx) => ({ // Shape person: ctx.p.name, company: ctx.c.name, role: ctx.e.role, })) .orderBy("p", "name", "asc") // Order .limit(50) // Order .execute(); // Execute ``` ## Type Safety The query builder is fully typed. TypeScript infers result types based on your schema and selection: ```typescript // TypeScript infers: Array<{ name: string; email: string | undefined }> const results = await store .query() .from("Person", "p") .select((ctx) => ({ name: ctx.p.name, // string (required in schema) email: ctx.p.email, // string | undefined (optional in schema) })) .execute(); // Invalid property access is caught at compile time: .select((ctx) => ({ invalid: ctx.p.nonexistent, // TypeScript error! })) ``` ## When to Use Queries vs Store API **Use the query builder** when you need: - Filtering based on node properties - Traversing relationships between nodes - Aggregating data across multiple nodes - Complex predicates with AND/OR logic **Use the [Store API](/schemas-stores#store-api)** for simple operations: - Get a node by ID - Create a new node - Update a node's properties - Delete a node ## Predicates Reference Predicates are the building blocks for filtering. 
Each data type has its own set of predicates: | Type | Documentation | |------|--------------| | String | [String Predicates](/queries/predicates/#string) | | Number | [Number Predicates](/queries/predicates/#number) | | Date | [Date Predicates](/queries/predicates/#date) | | Array | [Array Predicates](/queries/predicates/#array) | | Object | [Object Predicates](/queries/predicates/#object) | | Embedding | [Embedding Predicates](/queries/predicates/#embedding) | ## Performance Tips ### Filter Early Apply predicates as early as possible to reduce the working set: ```typescript // Good: Filter at source store .query() .from("Person", "p") .whereNode("p", (p) => p.active.eq(true)) .traverse("worksAt", "e") .to("Company", "c"); // Less efficient: Filter after traversal store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "c") .whereNode("p", (p) => p.active.eq(true)); ``` ### Be Specific with Kinds Unless you need subclass expansion, use exact kinds: ```typescript // More efficient: Exact kind .from("Podcast", "p") // Less efficient: Includes all subclasses .from("Media", "m", { includeSubClasses: true }) ``` ### Always Paginate Large Results ```typescript const page = await store .query() .from("Event", "e") .orderBy("e", "date", "desc") .limit(100) .execute(); ``` ## Next Steps Start with the fundamentals: 1. [Source](/queries/source) - Starting queries with `from()` 2. [Filter](/queries/filter) - Reducing results with predicates 3. [Traverse](/queries/traverse) - Navigating relationships 4. [Shape](/queries/shape) - Transforming output with `select()` # Troubleshooting > Solutions to common issues and frequently asked questions This guide covers common issues and their solutions when working with TypeGraph. ## Installation Issues ### "Cannot find module '@nicia-ai/typegraph'" **Cause:** Package not installed or using wrong package name. 
**Solution:** ```bash npm install @nicia-ai/typegraph zod drizzle-orm ``` ### "better-sqlite3 compilation failed" **Cause:** Native module compilation requires build tools. **Solutions:** **macOS:** ```bash xcode-select --install ``` **Ubuntu/Debian:** ```bash sudo apt-get install build-essential python3 ``` **Windows:** ```bash npm install --global windows-build-tools ``` **Alternative:** Use `sql.js` for pure JavaScript SQLite (no compilation needed). ### "Module not found: drizzle-orm/better-sqlite3" **Cause:** Drizzle ORM subpath exports require specific import syntax. **Solution:** Ensure correct imports: ```typescript // Correct import { drizzle } from "drizzle-orm/better-sqlite3"; // Incorrect import { drizzle } from "drizzle-orm"; ``` ## Schema Definition Errors ### "Node schema contains reserved property names" **Cause:** Using reserved keys (`id`, `kind`, `meta`) in your Zod schema. **Solution:** Rename your properties: ```typescript // Bad - 'id' is reserved const User = defineNode("User", { schema: z.object({ id: z.string(), // Error! name: z.string(), }), }); // Good - use a different name const User = defineNode("User", { schema: z.object({ externalId: z.string(), name: z.string(), }), }); ``` TypeGraph automatically provides `id`, `kind`, and `meta` on all nodes. ### "Edge type already has constraints defined" **Cause:** Defining `from`/`to` constraints on both the edge type and graph registration. 
**Solution:** Define constraints in one place only: ```typescript // Option 1: On the edge type (reusable across graphs) const worksAt = defineEdge("worksAt", { from: [Person], to: [Company], }); const graph = defineGraph({ edges: { worksAt: { type: worksAt }, // No from/to here }, }); // Option 2: On the graph (flexible per-graph) const worksAt = defineEdge("worksAt"); const graph = defineGraph({ edges: { worksAt: { type: worksAt, from: [Person], to: [Company] }, }, }); ``` ## Runtime Errors ### ValidationError: "Invalid input" **Cause:** Data doesn't match the Zod schema. **Solution:** Check the error details for specific issues: ```typescript try { await store.nodes.Person.create({ name: "" }); } catch (error) { if (error instanceof ValidationError) { console.log(error.details.issues); // Zod issues array } } ``` ### NodeNotFoundError **Cause:** Attempting to read/update/delete a non-existent node. **Solution:** Check if the node exists first or handle the error: ```typescript const node = await store.nodes.Person.getById(someId); if (!node) { // Handle missing node } // Or use error handling try { await store.nodes.Person.update(someId, { name: "New" }); } catch (error) { if (error instanceof NodeNotFoundError) { console.log(`Node ${error.details.id} not found`); } } ``` ### RestrictedDeleteError **Cause:** Attempting to delete a node that has edges, with `onDelete: "restrict"` (the default). **Solution:** Either delete the edges first or use a different delete behavior: ```typescript // Option 1: Delete edges first const edges = await store.edges.worksAt.findFrom(person); for (const edge of edges) { await store.edges.worksAt.delete(edge.id); } await store.nodes.Person.delete(person.id); // Option 2: Use cascade delete in schema const graph = defineGraph({ nodes: { Person: { type: Person, onDelete: "cascade" }, }, }); ``` ### DisjointError **Cause:** Creating a node with an ID that's already used by a disjoint type. 
**Solution:** Ensure IDs are unique across disjoint types or don't use explicit IDs: ```typescript // If Person and Organization are disjoint: // Bad - same ID for different types await store.nodes.Person.create({ name: "Alice" }, { id: "entity-1" }); await store.nodes.Organization.create({ name: "Acme" }, { id: "entity-1" }); // Error! // Good - let TypeGraph generate unique IDs await store.nodes.Person.create({ name: "Alice" }); await store.nodes.Organization.create({ name: "Acme" }); ``` ## Query Issues ### "Alias 'x' is already in use" **Cause:** Using the same alias twice in a query. **Solution:** Use unique aliases: ```typescript // Bad store.query() .from("Person", "p") .traverse("knows", "e") .to("Person", "p") // Error! 'p' already used // Good store.query() .from("Person", "p1") .traverse("knows", "e") .to("Person", "p2") ``` ### Empty results when expecting data **Causes and solutions:** 1. **Type mismatch:** Ensure you're querying the correct node type ```typescript // Check the node type name matches exactly .from("Person", "p") // Must match defineNode("Person", ...) ``` 2. **Missing includeSubClasses:** When querying a superclass ```typescript .from("Content", "c", { includeSubClasses: true }) ``` 3. **Strict predicate:** Check your filters aren't too restrictive ```typescript // Debug by removing filters temporarily const all = await store.query().from("Person", "p").select((c) => c.p).execute(); console.log(all.length); // How many total? ``` ### Slow queries **Solutions:** 1. **Use the query profiler:** ```typescript import { QueryProfiler } from "@nicia-ai/typegraph/profiler"; const profiler = new QueryProfiler(); profiler.attachToStore(store); // Run your queries... const report = profiler.getReport(); console.log(report.recommendations); ``` 2. **Add indexes** based on profiler recommendations: ```typescript import { defineNodeIndex } from "@nicia-ai/typegraph/indexes"; const nameIndex = defineNodeIndex("Person", ["name"]); ``` 3. 
**Limit results:** ```typescript .limit(100) // Or use pagination .paginate({ first: 20 }) ``` ## Database Connection Issues ### "Database is locked" (SQLite) **Cause:** Multiple processes accessing the same SQLite file without WAL mode. **Solution:** Enable WAL mode: ```typescript const sqlite = new Database("myapp.db"); sqlite.pragma("journal_mode = WAL"); ``` ### Connection pool exhausted (PostgreSQL) **Cause:** Too many concurrent connections. **Solution:** Configure pool limits: ```typescript import { Pool } from "pg"; const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20, // Adjust based on your needs idleTimeoutMillis: 30000, }); ``` ### "relation 'typegraph_nodes' does not exist" **Cause:** Migration not run. **Solution:** Run the migration SQL: ```typescript // PostgreSQL import { generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; await pool.query(generatePostgresMigrationSQL()); // SQLite import { generateSqliteMigrationSQL } from "@nicia-ai/typegraph/sqlite"; sqlite.exec(generateSqliteMigrationSQL()); ``` ## Semantic Search Issues ### "Extension not found" / "vector type not available" **Cause:** Vector extension not installed. **PostgreSQL:** ```sql CREATE EXTENSION IF NOT EXISTS vector; ``` **SQLite:** ```typescript import * as sqliteVec from "sqlite-vec"; sqliteVec.load(sqlite); // Must be called before creating backend ``` ### "Dimension mismatch" **Cause:** Query embedding has different dimension than stored embeddings. **Solution:** Use consistent embedding dimensions: ```typescript // Schema defines 1536 dimensions const Document = defineNode("Document", { schema: z.object({ embedding: embedding(1536), }), }); // Query embedding must also be 1536 const queryEmbedding = await generateEmbedding(text); console.log(queryEmbedding.length); // Should be 1536 ``` ### "Inner product not supported" (SQLite) **Cause:** sqlite-vec doesn't support inner product metric. 
**Solution:** Use cosine or L2: ```typescript // Instead of: d.embedding.similarTo(query, 10, { metric: "inner_product" }) // Use: d.embedding.similarTo(query, 10, { metric: "cosine" }) ``` ## TypeScript Issues ### "Property 'x' does not exist on type" **Cause:** Accessing a property not defined in your schema. **Solution:** Ensure the property is in your Zod schema: ```typescript const Person = defineNode("Person", { schema: z.object({ name: z.string(), email: z.string().optional(), }), }); // Now both properties are available with correct types const person = await store.nodes.Person.getById(id); person?.name; // string person?.email; // string | undefined ``` ### Type inference not working in select **Cause:** Complex generic inference limitations. **Solution:** Use explicit typing or simplify: ```typescript // If inference fails, be explicit .select((ctx) => ({ name: ctx.p.name as string, company: ctx.c.name as string, })) ``` ## Still Having Issues? 1. **Check the [Limitations](/limitations)** page for known constraints 2. **Review [Architecture](/architecture)** to understand how TypeGraph works 3. **Search [GitHub Issues](https://github.com/nicia-ai/typegraph/issues)** for similar problems 4. **Open a new issue** with a minimal reproduction case # Errors > Error types and handling in TypeGraph TypeGraph uses typed errors to communicate specific failure conditions. All errors extend the base `TypeGraphError` class and include categorization, contextual details, and actionable suggestions. 
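As a mental model before the reference sections, the shape all of these errors share can be sketched in plain TypeScript (illustration only: `SketchGraphError` is an invented name for this sketch, not the library's class):

```typescript
// Simplified standalone model of a TypeGraph-style error, mirroring the
// code/category/details/suggestion fields described in this section.
// Illustration only, not the library's actual implementation.
type ErrorCategory = "user" | "constraint" | "system";

class SketchGraphError extends Error {
  constructor(
    message: string,
    readonly code: string,
    readonly category: ErrorCategory,
    readonly details: Readonly<Record<string, unknown>> = {},
    readonly suggestion?: string,
  ) {
    super(message);
    this.name = "SketchGraphError";
  }

  // Human-readable message, with the suggestion appended when present.
  toUserMessage(): string {
    return this.suggestion
      ? `${this.message}\n\nSuggestion: ${this.suggestion}`
      : this.message;
  }

  // Machine-oriented string for logs: code, category, and details.
  toLogString(): string {
    return `[${this.code}] (${this.category}) ${this.message} details=${JSON.stringify(this.details)}`;
  }
}

const err = new SketchGraphError(
  "Validation failed for Person create",
  "VALIDATION_ERROR",
  "user",
  { kind: "Person", operation: "create" },
  "Check that the data matches the schema.",
);

console.log(err.toUserMessage());
```

The same structure (a stable `code` for machines, a `category` for routing, `details` for context, and a `suggestion` for users) is what the error classes documented below expose.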
## Error Categories Every error is categorized to help determine the appropriate response: | Category | Description | Typical Response | |----------|-------------|------------------| | `user` | Invalid input or misuse of API | Fix the input and retry | | `constraint` | Graph constraint violated | Handle as business logic violation | | `system` | Internal or infrastructure error | Log, alert, potentially retry | ```typescript import { isUserRecoverable, isConstraintError, isSystemError } from "@nicia-ai/typegraph"; try { await store.nodes.Person.create(data); } catch (error) { if (isUserRecoverable(error)) { // Show validation errors to user return { error: error.toUserMessage() }; } if (isConstraintError(error)) { // Handle business rule violation return { error: "This operation violates a constraint" }; } if (isSystemError(error)) { // Log and alert console.error(error.toLogString()); throw error; } } ``` ## Base Error ### `TypeGraphError` Base error class for all TypeGraph errors. ```typescript class TypeGraphError extends Error { readonly code: string; readonly category: ErrorCategory; readonly details: Readonly<Record<string, unknown>>; readonly suggestion?: string; // Format error for end users (includes suggestion if available) toUserMessage(): string; // Format error for logging (includes code, category, and details) toLogString(): string; } type ErrorCategory = "user" | "constraint" | "system"; ``` **Properties:** | Property | Type | Description | |----------|------|-------------| | `code` | `string` | Machine-readable error code | | `category` | `ErrorCategory` | Error classification for handling | | `details` | `Record<string, unknown>` | Additional context about the error | | `suggestion` | `string \| undefined` | Actionable guidance for resolution | **Methods:** | Method | Returns | Description | |--------|---------|-------------| | `toUserMessage()` | `string` | Human-readable message with suggestion | | `toLogString()` | `string` | Detailed string for logging/debugging | ## Validation Errors
### `ValidationError` Thrown when schema validation fails during node or edge creation/update. Includes structured issue details with context about which entity failed. ```typescript interface ValidationErrorDetails { readonly issues: readonly ValidationIssue[]; readonly entityType?: "node" | "edge"; readonly kind?: string; readonly operation?: "create" | "update"; readonly id?: string; } interface ValidationIssue { readonly path: string; readonly message: string; readonly code?: string; } ``` **Example:** ```typescript try { await store.nodes.Person.create({ name: "" }); // Empty name fails min(1) } catch (error) { if (error instanceof ValidationError) { console.log(error.category); // "user" console.log(error.details.kind); // "Person" console.log(error.details.operation); // "create" console.log(error.details.issues); // [{ path: "name", message: "String must contain at least 1 character(s)" }] console.log(error.toUserMessage()); // "Validation failed for Person create: name - String must contain at least 1 character(s) // // Suggestion: Check the data you're providing matches the schema..." } } ``` ### `DisjointError` Thrown when attempting to create a node that violates a disjointness constraint. ```typescript // If Person and Organization are disjoint: await store.nodes.Person.create({ name: "Alice" }, { id: "entity-1" }); try { // Same ID, different disjoint type await store.nodes.Organization.create({ name: "Acme" }, { id: "entity-1" }); } catch (error) { if (error instanceof DisjointError) { console.log(error.category); // "constraint" console.log(error.details); // { existingType: "Person", attemptedType: "Organization" } console.log(error.suggestion); // "Use a different ID for the new node, or delete the existing node first..." } } ``` ### `EndpointError` Thrown when an edge is created with invalid endpoint types. 
```typescript // If worksAt only allows Person -> Company: try { await store.edges.worksAt.create(company, person, {}); // Wrong direction } catch (error) { if (error instanceof EndpointError) { console.log(error.category); // "user" console.log(error.suggestion); // "Check the edge definition to see which node types are allowed..." } } ``` ### `CardinalityError` Thrown when a cardinality constraint is violated. ```typescript // If worksAt has cardinality: "one" (person can only work at one company): await store.edges.worksAt.create(alice, acme, { role: "Engineer" }); try { await store.edges.worksAt.create(alice, otherCompany, { role: "Consultant" }); } catch (error) { if (error instanceof CardinalityError) { console.log(error.category); // "constraint" console.log(error.details); // { edge: "worksAt", cardinality: "one" } console.log(error.suggestion); // "Remove the existing edge before creating a new one, or update the existing edge..." } } ``` ### `UniquenessError` Thrown when a uniqueness constraint is violated. ```typescript // If email has a unique constraint: await store.nodes.Person.create({ name: "Alice", email: "alice@example.com" }); try { await store.nodes.Person.create({ name: "Bob", email: "alice@example.com" }); } catch (error) { if (error instanceof UniquenessError) { console.log(error.category); // "constraint" console.log(error.details); // { field: "email", value: "alice@example.com" } console.log(error.suggestion); // "Use a different value for the unique field, or update the existing record..." } } ``` ## Not Found Errors ### `NodeNotFoundError` Thrown when a referenced node does not exist. 
```typescript try { await store.nodes.Person.update("nonexistent-id", { name: "New Name" }); } catch (error) { if (error instanceof NodeNotFoundError) { console.log(error.category); // "user" console.log(error.details); // { id: "nonexistent-id", type: "Person" } console.log(error.suggestion); // "Verify the node ID is correct and the node hasn't been deleted..." } } ``` ### `EdgeNotFoundError` Thrown when a referenced edge does not exist. ```typescript try { await store.edges.worksAt.update("nonexistent-edge", { role: "Manager" }); } catch (error) { if (error instanceof EdgeNotFoundError) { console.log(error.category); // "user" console.log(error.details); // { id: "nonexistent-edge" } console.log(error.suggestion); // "Verify the edge ID is correct and the edge hasn't been deleted..." } } ``` ### `KindNotFoundError` Thrown when referencing a node or edge type that doesn't exist in the graph definition. ```typescript try { await store.query().from("NonExistentType", "n").execute(); } catch (error) { if (error instanceof KindNotFoundError) { console.log(error.category); // "user" console.log(error.details); // { kind: "NonExistentType" } console.log(error.suggestion); // "Check the graph definition to see which node and edge types are available..." } } ``` ### `EndpointNotFoundError` Thrown when an edge references a node that doesn't exist. ```typescript try { await store.edges.worksAt.create( { kind: "Person", id: "nonexistent" }, company, { role: "Engineer" } ); } catch (error) { if (error instanceof EndpointNotFoundError) { console.log(error.category); // "user" console.log(error.details); // { kind: "Person", id: "nonexistent" } console.log(error.suggestion); // "Create the referenced node first, or verify the node ID is correct..." } } ``` ## Delete Errors ### `RestrictedDeleteError` Thrown when delete is blocked due to existing edges (when `onDelete: "restrict"`). 
```typescript // If Person has edges and onDelete is "restrict": try { await store.nodes.Person.delete(alice.id); } catch (error) { if (error instanceof RestrictedDeleteError) { console.log(error.category); // "constraint" console.log(error.details); // { nodeId: "...", edgeCount: 3 } console.log(error.suggestion); // "Delete all edges connected to this node first, or change the delete behavior..." } } ``` ## Configuration Errors ### `ConfigurationError` Thrown when the store, backend, or schema definition is misconfigured. ```typescript // Using transactions on D1 (which doesn't support them): try { await store.transaction(async (tx) => { // ... }); } catch (error) { if (error instanceof ConfigurationError) { console.log(error.category); // "system" console.log(error.suggestion); // "Check the backend documentation for supported features..." } } ``` ### `SchemaMismatchError` Thrown when the database schema doesn't match the expected graph definition. ```typescript try { const [store] = await createStoreWithSchema(graph, backend); } catch (error) { if (error instanceof SchemaMismatchError) { console.log(error.category); // "system" console.log(error.details); // { expected: "...", actual: "..." } console.log(error.suggestion); // "Run migrations to update the database schema..." } } ``` ### `MigrationError` Thrown when schema migration fails due to breaking changes that require manual intervention. ```typescript try { const [store] = await createStoreWithSchema(graph, backend); } catch (error) { if (error instanceof MigrationError) { console.log(error.category); // "system" console.log(error.details.breakingChanges); // ["Removed required field 'email' from Person"] console.log(error.suggestion); // "Review the breaking changes and perform manual migration if needed..." } } ``` ## Query Errors ### `UnsupportedPredicateError` Thrown when using a query predicate that isn't supported by the current backend. 
```typescript // Using vector similarity on a backend without vector support: try { await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryVector, 10)) .execute(); } catch (error) { if (error instanceof UnsupportedPredicateError) { console.log(error.category); // "system" console.log(error.suggestion); // "Use a backend that supports this predicate, or rewrite the query..." } } ``` ## Error Handling Patterns ### Using Error Utilities TypeGraph provides utility functions for common error handling patterns: ```typescript import { isTypeGraphError, isUserRecoverable, isConstraintError, isSystemError, getErrorSuggestion, } from "@nicia-ai/typegraph"; try { await store.nodes.Person.create(data); } catch (error) { if (!isTypeGraphError(error)) { // Not a TypeGraph error, handle differently throw error; } // Get suggestion regardless of error type const suggestion = getErrorSuggestion(error); if (isUserRecoverable(error)) { // User can fix this by providing different input return { error: error.toUserMessage(), suggestion, }; } if (isConstraintError(error)) { // Business rule violation return { error: "This operation violates a constraint", details: error.details, }; } if (isSystemError(error)) { // Infrastructure/configuration issue console.error(error.toLogString()); throw error; } } ``` ### Catch Specific Errors ```typescript import { ValidationError, NodeNotFoundError, DisjointError, } from "@nicia-ai/typegraph"; try { await store.nodes.Person.create(data); } catch (error) { if (error instanceof ValidationError) { // Handle validation failure with contextual details return { error: "Invalid data", issues: error.details.issues, entity: error.details.kind, }; } if (error instanceof DisjointError) { // Handle constraint violation return { error: "ID already used by different type" }; } throw error; // Re-throw unexpected errors } ``` ### Check Error Codes ```typescript try { await store.nodes.Person.update(id, data); } catch (error) 
{ if (error instanceof TypeGraphError) { switch (error.code) { case "NODE_NOT_FOUND": return { error: "Person not found" }; case "VALIDATION_ERROR": return { error: "Invalid data", issues: error.details.issues }; default: throw error; } } throw error; } ``` ### Transaction Error Handling ```typescript try { await store.transaction(async (tx) => { const person = await tx.nodes.Person.create({ name: "Alice" }); const company = await tx.nodes.Company.create({ name: "Acme" }); await tx.edges.worksAt.create(person, company, { role: "Engineer" }); }); } catch (error) { // Transaction is automatically rolled back on any error if (error instanceof ValidationError) { console.log("Validation failed, transaction rolled back"); console.log("Failed on:", error.details.kind, error.details.operation); } throw error; } ``` ## Contextual Validation Utilities For library authors or advanced use cases, validation utilities are available from the schema sub-export: ```typescript import { validateNodeProps, validateEdgeProps, wrapZodError, createValidationError, } from "@nicia-ai/typegraph/schema"; // Validate node properties with full context const validated = validateNodeProps(PersonSchema, inputData, { kind: "Person", operation: "create", }); // Wrap a Zod error with TypeGraph context try { schema.parse(data); } catch (zodError) { throw wrapZodError(zodError, { entityType: "node", kind: "Person", operation: "update", id: "person-123", }); } ``` ## Error Codes Reference | Code | Error Class | Category | Description | |------|-------------|----------|-------------| | `VALIDATION_ERROR` | `ValidationError` | user | Schema validation failed | | `DISJOINT_ERROR` | `DisjointError` | constraint | Disjointness constraint violated | | `ENDPOINT_ERROR` | `EndpointError` | user | Invalid edge endpoint types | | `CARDINALITY_ERROR` | `CardinalityError` | constraint | Cardinality constraint violated | | `UNIQUENESS_ERROR` | `UniquenessError` | constraint | Uniqueness constraint violated | | 
`NODE_NOT_FOUND` | `NodeNotFoundError` | user | Referenced node doesn't exist | | `EDGE_NOT_FOUND` | `EdgeNotFoundError` | user | Referenced edge doesn't exist | | `KIND_NOT_FOUND` | `KindNotFoundError` | user | Unknown node/edge type | | `ENDPOINT_NOT_FOUND` | `EndpointNotFoundError` | user | Edge endpoint node doesn't exist | | `RESTRICTED_DELETE` | `RestrictedDeleteError` | constraint | Delete blocked by existing edges | | `CONFIGURATION_ERROR` | `ConfigurationError` | system | Invalid configuration | | `SCHEMA_MISMATCH` | `SchemaMismatchError` | system | Database schema mismatch | | `MIGRATION_ERROR` | `MigrationError` | system | Migration failed | | `UNSUPPORTED_PREDICATE` | `UnsupportedPredicateError` | system | Predicate not supported | # Architecture > How TypeGraph works internally and the design decisions behind it This page explains how TypeGraph works under the hood, the design decisions that shaped it, and why certain tradeoffs were made. ## High-Level Architecture ```text ┌────────────────────────────────────────────────────────┐ │ Your Application │ │ │ │ ┌──────────────────────────────────────────────────┐ │ │ │ TypeGraph Library │ │ │ │ │ │ │ │ ┌────────────┐ ┌────────────┐ │ │ │ │ │ Schema │ │ Query │ │ │ │ │ │ DSL │ │ Builder │ │ │ │ │ └──────┬─────┘ └─────┬──────┘ │ │ │ │ │ │ │ │ │ │ └──────────────┴───────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌──────────────────┐ │ │ │ │ │ Ontology Layer │ │ │ │ │ └──────────────────┘ │ │ │ └─────────────────────────┬────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌──────────────┐ │ │ │ Drizzle │ │ │ │ ORM │ │ │ └──────┬───────┘ │ │ │ │ └───────────────────────────┼────────────────────────────┘ │ ▼ ┌─────────────────┐ │ Your Database │ └─────────────────┘ ``` TypeGraph is an **embedded library**, not a database. It runs in your application process, uses your existing database connection, and compiles queries to SQL. ## Core Design Principles ### 1. 
Embedded, Not External

**Decision**: TypeGraph is a library dependency, not a separate service.

**Why**: Graph databases like Neo4j require managing another piece of infrastructure. For many use cases—knowledge bases, organizational structures, content relationships—the graph is part of your application, not a standalone system.

**Tradeoff**: You don't get Neo4j's specialized graph algorithms (PageRank, community detection), but you avoid:

- Additional deployment complexity
- Network latency between app and graph
- Separate scaling and monitoring
- Data synchronization challenges

### 2. Schema-First, Type-Driven

**Decision**: Zod schemas are the single source of truth. TypeScript types are inferred, not duplicated.

**Why**: In many graph systems, you define types in one place, validation in another, and database schemas in a third. This leads to drift and bugs. With TypeGraph:

```typescript
const PersonSchema = z.object({
  name: z.string().min(1),
  email: z.string().email().optional(),
});

const Person = defineNode("Person", { schema: PersonSchema });

// TypeScript type is inferred automatically
type PersonProps = z.infer<typeof PersonSchema>; // { name: string; email?: string }
```

The schema drives:

- Runtime validation on create/update
- TypeScript types for compile-time safety
- Database storage format
- Query builder type constraints

### 3. SQL as the Execution Engine

**Decision**: Compile graph queries to SQL, don't implement a custom query engine.

**Why**: SQLite and PostgreSQL are battle-tested, highly optimized query engines.
Rather than building another one:

```typescript
// Your query
store.query()
  .from("Person", "p")
  .traverse("worksAt", "e")
  .to("Company", "c")
  .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name }))
```

```sql
-- Compiles to SQL with CTEs
WITH person_cte AS (
  SELECT * FROM typegraph_nodes WHERE kind = 'Person' AND deleted_at IS NULL
),
edge_cte AS (
  SELECT * FROM typegraph_edges WHERE kind = 'worksAt' AND deleted_at IS NULL
),
company_cte AS (
  SELECT * FROM typegraph_nodes WHERE kind = 'Company' AND deleted_at IS NULL
)
SELECT p.props->>'name' as person, c.props->>'name' as company
FROM person_cte p
JOIN edge_cte e ON e.from_id = p.id
JOIN company_cte c ON c.id = e.to_id
```

This means:

- You get database-level query optimization
- Indexes work as expected
- Transactions are ACID
- You can analyze queries with EXPLAIN

### 4. Precomputed Ontology

**Decision**: Compute transitive closures at store initialization, not query time.

**Why**: Semantic relationships like `subClassOf` and `implies` form hierarchies. Computing "all subclasses of Media" during every query would be expensive. Instead, when you create a store:

```typescript
const store = createStore(graph, backend);
// ↑ Computes:
// - subClassOf closure: Media → [Media, Podcast, Article, Video]
// - implies closure: marriedTo → [marriedTo, partneredWith, knows]
// - disjoint sets: Person ⊥ Organization ⊥ Product
```

These closures are stored in the `TypeRegistry` and used during query compilation:

```typescript
.from("Media", "m", { includeSubClasses: true })
```

```sql
-- At compile time, expands to:
WHERE kind IN ('Media', 'Podcast', 'Article', 'Video')
```

**Tradeoff**: Changing the ontology requires recreating the store. But ontologies typically change rarely compared to instance data.

### 5. Homoiconic Schema Storage

**Decision**: Store the graph schema and ontology as data in the database itself.

**Why**: Most ORMs and graph libraries define schemas only in application code.
The database stores data but has no record of what the data means. This creates problems: - You can't understand the database without reading the application source - Schema changes are invisible—no history, no diff, no audit trail - Exports require the application to interpret the data - Multiple applications can't share schema understanding TypeGraph takes a different approach: the schema is data. When you initialize a store, the complete schema (node types, edge types, property definitions, ontology relations, precomputed closures) is serialized to JSON and stored in `typegraph_schema_versions`: ```sql SELECT schema_doc FROM typegraph_schema_versions WHERE graph_id = 'my_graph' AND is_active = TRUE; ``` The stored schema includes everything needed to understand the graph: ```typescript { graphId: "my_graph", version: 3, nodes: { Person: { properties: { /* JSON Schema */ }, ... }, Company: { ... } }, edges: { worksAt: { fromKinds: ["Person"], toKinds: ["Company"], ... } }, ontology: { relations: [{ metaEdge: "subClassOf", from: "Engineer", to: "Person" }], closures: { subClassAncestors: { Engineer: ["Person"] }, // ... 
precomputed inference data } } } ``` This enables: | Capability | How It Works | |------------|--------------| | **Self-describing database** | Query the schema without application code—useful for debugging, admin tools, and data exploration | | **Schema versioning** | Every schema change creates a new version; previous versions are preserved for auditing | | **Change detection** | Compare stored schema to code schema to detect additions, removals, and breaking changes | | **Portable exports** | The [interchange format](/interchange) is self-contained—importers know what the data means | | **Runtime introspection** | Applications can query the schema at runtime for dynamic UI, validation, or documentation | ```typescript import { getActiveSchema, getSchemaChanges } from "@nicia-ai/typegraph/schema"; // Query the active schema at runtime const schema = await getActiveSchema(backend, "my_graph"); console.log("Node types:", Object.keys(schema.nodes)); console.log("Edge types:", Object.keys(schema.edges)); // Detect pending changes before deployment const diff = await getSchemaChanges(backend, graph); if (!diff.isBackwardsCompatible) { console.error("Breaking changes require migration"); } ``` **Tradeoff**: Schema storage adds a small amount of database overhead (one JSON document per version). The benefit is a database that explains itself. 
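The change-detection capability above amounts to a structural diff between the stored schema document and the one derived from code. A minimal illustration of that idea — the shapes and helper below are simplified stand-ins, not TypeGraph's actual internals:

```typescript
// Illustrative shapes only: the real schema_doc carries much more detail.
interface SchemaDoc {
  version: number;
  nodes: Record<string, { required: string[] }>;
}

interface SchemaDiff {
  addedNodes: string[];
  removedNodes: string[];
  isBackwardsCompatible: boolean;
}

// Compare a stored schema document to the schema derived from code.
function diffSchemas(stored: SchemaDoc, current: SchemaDoc): SchemaDiff {
  const storedKinds = Object.keys(stored.nodes);
  const currentKinds = Object.keys(current.nodes);

  const addedNodes = currentKinds.filter((k) => !storedKinds.includes(k));
  const removedNodes = storedKinds.filter((k) => !currentKinds.includes(k));

  // Removing a node type, or adding a required field to an existing type,
  // breaks data written under the old schema.
  const newRequiredField = currentKinds.some((k) => {
    const prev = stored.nodes[k];
    return (
      prev !== undefined &&
      current.nodes[k].required.some((f) => !prev.required.includes(f))
    );
  });

  return {
    addedNodes,
    removedNodes,
    isBackwardsCompatible: removedNodes.length === 0 && !newRequiredField,
  };
}
```

Adding a brand-new node type is backwards compatible; removing one, or tightening requirements on existing data, is not — which is exactly the distinction `getSchemaChanges` surfaces.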
## Data Model ### Storage Schema TypeGraph uses two core tables: ```sql -- Nodes table CREATE TABLE typegraph_nodes ( graph_id TEXT NOT NULL, kind TEXT NOT NULL, id TEXT NOT NULL, props JSON NOT NULL, -- Properties as JSON version INTEGER NOT NULL, -- Optimistic concurrency valid_from TEXT NOT NULL, -- Temporal validity valid_to TEXT, created_at TEXT NOT NULL, updated_at TEXT NOT NULL, deleted_at TEXT, -- Soft delete PRIMARY KEY (graph_id, kind, id, valid_from) ); -- Edges table CREATE TABLE typegraph_edges ( graph_id TEXT NOT NULL, kind TEXT NOT NULL, id TEXT NOT NULL, from_kind TEXT NOT NULL, from_id TEXT NOT NULL, to_kind TEXT NOT NULL, to_id TEXT NOT NULL, props JSON NOT NULL, version INTEGER NOT NULL, valid_from TEXT NOT NULL, valid_to TEXT, created_at TEXT NOT NULL, updated_at TEXT NOT NULL, deleted_at TEXT, PRIMARY KEY (graph_id, kind, id, valid_from) ); ``` ### Why JSON for Properties? **Decision**: Store node/edge properties as JSON, not as columns. **Why**: 1. **Schema flexibility**: Adding a property doesn't require ALTER TABLE 2. **Heterogeneous nodes**: Different node kinds have different schemas 3. **Query simplicity**: One table for all nodes, not one per kind Both SQLite (JSON1 extension) and PostgreSQL (JSONB) have efficient JSON operators: ```sql -- PostgreSQL SELECT props->>'name' FROM typegraph_nodes WHERE props->>'status' = 'active'; -- SQLite SELECT json_extract(props, '$.name') FROM typegraph_nodes WHERE json_extract(props, '$.status') = 'active'; ``` **Tradeoff**: You can't create a B-tree index on a JSON property as easily as a column. For high-cardinality filtering, consider: - PostgreSQL: Expression indexes on JSONB paths - SQLite: Expression indexes on `json_extract(...)` (or generated columns) See [Indexes](/performance/indexes) for TypeGraph utilities to define and create these indexes. 
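The `version` column in the tables above supports optimistic concurrency: an update only succeeds if the caller saw the latest version. A minimal in-memory sketch of that compare-and-swap rule (illustrative only — TypeGraph enforces this in SQL, roughly `UPDATE ... WHERE id = ? AND version = ?`):

```typescript
interface VersionedRow {
  id: string;
  props: Record<string, unknown>;
  version: number;
}

class VersionConflictError extends Error {}

// Apply a patch only if the caller's expected version matches the stored row,
// then bump the version so any concurrent writer holding the old version fails.
function optimisticUpdate(
  table: Map<string, VersionedRow>,
  id: string,
  expectedVersion: number,
  patch: Record<string, unknown>,
): VersionedRow {
  const row = table.get(id);
  if (row === undefined) throw new Error(`row ${id} not found`);
  if (row.version !== expectedVersion) {
    throw new VersionConflictError(
      `expected version ${expectedVersion}, found ${row.version}`,
    );
  }
  const next: VersionedRow = {
    id,
    props: { ...row.props, ...patch },
    version: row.version + 1,
  };
  table.set(id, next);
  return next;
}
```

A writer that loses the race gets a conflict instead of silently overwriting the other writer's change.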
### Temporal Model Every node and edge tracks temporal validity: ```text ┌──────────────────────────────────────────────────────────────┐ │ Node: Article#123 │ ├──────────────────────────────────────────────────────────────┤ │ Version 1: "Draft" │ valid_from: 2024-01-01 │ │ │ valid_to: 2024-01-15 │ ├─────────────────────────┼────────────────────────────────────┤ │ Version 2: "Published" │ valid_from: 2024-01-15 │ │ │ valid_to: NULL (current) │ └─────────────────────────┴────────────────────────────────────┘ ``` When you update a node: 1. The current row's `valid_to` is set to now 2. A new row is inserted with `valid_from = now`, `valid_to = NULL` This enables: - **Point-in-time queries**: "What did the graph look like on January 10th?" - **Audit trails**: "What were all the versions of this article?" - **Soft deletes**: `deleted_at` marks deletion without losing history ## Query Compilation ### The Query Pipeline ```text Query Builder API → Query AST → SQL Generator → Drizzle → Database ``` 1. **Query Builder**: Fluent API that constructs a typed AST 2. **Query AST**: A data structure representing the query (nodes, edges, predicates, projections) 3. **SQL Generator**: Transforms AST to SQL using CTEs for each step 4. 
**Drizzle**: Executes the SQL and returns typed results ### Common Table Expressions (CTEs) TypeGraph compiles traversals to CTEs, which databases optimize well: ```typescript store.query() .from("Person", "p") .traverse("authored", "e") .to("Document", "d") .whereNode("d", (d) => d.status.eq("published")) ``` Becomes: ```sql WITH step_0 AS ( -- Start: all Person nodes SELECT * FROM typegraph_nodes WHERE graph_id = $1 AND kind = 'Person' AND deleted_at IS NULL ), step_1 AS ( -- Traverse: follow 'authored' edges SELECT e.*, s.id as _from_step FROM typegraph_edges e JOIN step_0 s ON e.from_id = s.id WHERE e.kind = 'authored' AND e.deleted_at IS NULL ), step_2 AS ( -- Arrive: at Document nodes SELECT n.*, s.id as _edge_id FROM typegraph_nodes n JOIN step_1 s ON n.id = s.to_id WHERE n.kind = 'Document' AND n.deleted_at IS NULL ) SELECT step_0.props->>'name' as person, step_2.props->>'title' as document FROM step_0 JOIN step_1 ON step_1._from_step = step_0.id JOIN step_2 ON step_2._edge_id = step_1.id WHERE step_2.props->>'status' = 'published'; ``` ### Recursive CTEs for Variable-Length Paths For `recursive()` traversals with cycle prevention enabled (the default), TypeGraph generates recursive CTEs like: ```sql WITH RECURSIVE path AS ( -- Base case: starting nodes SELECT id, 1 as depth, ARRAY[id] as path FROM typegraph_nodes WHERE kind = 'Person' AND id = $1 UNION ALL -- Recursive case: follow edges SELECT n.id, p.depth + 1, p.path || n.id FROM path p JOIN typegraph_edges e ON e.from_id = p.id JOIN typegraph_nodes n ON n.id = e.to_id WHERE e.kind = 'reportsTo' AND p.depth < 100 -- Implicit cap for unbounded traversal AND NOT n.id = ANY(p.path) -- Cycle detection ) SELECT * FROM path; ``` When you opt into `cyclePolicy: "allow"` and do not project a path column, TypeGraph can use a lighter recursive shape without path-array state and cycle predicates. 
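In application-code terms, the recursive CTE above is a bounded walk that never revisits a node already on the current path. A small self-contained sketch of those semantics (illustrative, using a plain in-memory adjacency list rather than the database):

```typescript
type Edge = { from: string; to: string };

// Expand all paths from `start`, capping depth and skipping nodes already on
// the current path -- the same rules as the recursive CTE's depth guard and
// `NOT n.id = ANY(p.path)` cycle check.
function walk(edges: Edge[], start: string, maxDepth = 100): string[][] {
  const results: string[][] = [];
  const frontier: string[][] = [[start]];

  while (frontier.length > 0) {
    const path = frontier.pop()!;
    results.push(path);
    if (path.length > maxDepth) continue; // depth cap

    const tail = path[path.length - 1];
    for (const e of edges) {
      // Cycle prevention: never extend a path back onto itself.
      if (e.from === tail && !path.includes(e.to)) {
        frontier.push([...path, e.to]);
      }
    }
  }
  return results;
}
```

On a cyclic graph (a → b → c → a), the walk terminates with the three acyclic paths instead of looping forever — which is why cycle prevention is the default policy.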
## Vector Search Architecture For semantic search with embeddings: ### Storage ```sql -- PostgreSQL with pgvector CREATE TABLE typegraph_embeddings ( graph_id TEXT NOT NULL, node_kind TEXT NOT NULL, node_id TEXT NOT NULL, field TEXT NOT NULL, embedding vector(1536), -- pgvector type PRIMARY KEY (graph_id, node_kind, node_id, field) ); CREATE INDEX ON typegraph_embeddings USING hnsw (embedding vector_cosine_ops); -- HNSW index for fast similarity ``` ### Query Flow ```typescript .whereNode("d", (d) => d.embedding.similarTo(queryVector, 10)) ``` Compiles to: ```sql -- PostgreSQL SELECT * FROM typegraph_nodes n JOIN typegraph_embeddings e ON e.node_id = n.id ORDER BY e.embedding <=> $1 -- Cosine distance LIMIT 10; ``` The database's vector index (HNSW or IVFFlat) handles approximate nearest neighbor search efficiently. ## Performance Characteristics ### What's Fast - **Point lookups by ID**: O(1) with primary key index - **Traversals**: Single SQL query with JOINs, optimized by the database - **Ontology expansion**: Precomputed at initialization, O(1) at query time - **Semantic search**: HNSW indexes provide sub-linear search ### What's Slower - **Deep recursive traversals**: Recursive CTEs are more expensive than simple JOINs - **Large property filtering without indexes**: JSON extraction is slower than column access - **Cross-kind queries**: `includeSubClasses: true` increases the WHERE IN set ### Optimization Strategies 1. **Filter early**: Apply predicates as close to the source as possible 2. **Limit results**: Always paginate large result sets 3. **Use specific kinds**: Avoid `includeSubClasses` unless needed 4. **Index JSON paths**: For frequently-filtered properties, add expression indexes 5. **Batch writes**: Use transactions to reduce disk syncs and round-trips ## Why These Tradeoffs? ### Why Not a Native Graph Database? 
Native graph databases (Neo4j, Amazon Neptune) excel at: - Very deep traversals (10+ hops) - Graph algorithms (shortest path, PageRank) - Massive scale (billions of nodes) TypeGraph is designed for: - Knowledge bases with thousands to millions of nodes - Shallow to medium traversals (1-5 hops typically) - Applications that already use SQL databases - Teams that want one database to manage ### Why Drizzle ORM? 1. **Type safety**: Full TypeScript inference 2. **Multiple dialects**: Same API for SQLite and PostgreSQL 3. **Raw SQL access**: When needed for performance 4. **Active ecosystem**: Well-maintained, growing community ### Why Zod for Schemas? 1. **Runtime validation**: Not just types, but actual validation 2. **Inference**: `z.infer` eliminates type duplication 3. **Composition**: Build complex schemas from simple ones 4. **Ecosystem**: Widely used, lots of integrations ## Next Steps - [Performance](/performance/overview) - Benchmarks and optimization tips - [Schemas & Stores](/schemas-stores) - Complete function signatures - [Integration Patterns](/integration) - How to integrate with your stack # Data Sync Patterns > Strategies for synchronizing external data with your TypeGraph store When adding TypeGraph as a graph overlay to an existing application, you need to keep your graph data in sync with your source of truth. This guide covers practical patterns for syncing external data into TypeGraph using the bulk operations API. ## Overview Most applications adding TypeGraph will have existing data in relational tables, external APIs, or document stores. Rather than migrating this data, TypeGraph works alongside it as an overlay that provides: - Graph traversals across your existing entities - Semantic search via vector embeddings - Relationship discovery and inference The key challenge is keeping the graph in sync with your source data. 
We cover three approaches: | Approach | Best For | Complexity | |----------|----------|------------| | [On-demand sync](#on-demand-sync) | Low-volume, real-time needs | Low | | [Batch sync](#batch-sync) | Bulk imports, periodic refresh | Medium | | [Event-driven sync](#event-driven-sync) | High-volume, near-real-time | Higher | ## Bulk Operations API TypeGraph provides bulk operations for efficient sync workflows: ```typescript // Create or update a single node await store.nodes.Document.upsertById(id, props); // Create many nodes at once await store.nodes.Document.bulkCreate(items); // Insert many nodes without returning results (dedicated fast path) await store.nodes.Document.bulkInsert(items); // Create or update many nodes at once await store.nodes.Document.bulkUpsertById(items); // Delete many nodes at once await store.nodes.Document.bulkDelete(ids); ``` ### upsertById Creates a node if it doesn't exist, or updates it if it does. This includes "un-deleting" soft-deleted nodes: ```typescript // First call creates the node const doc1 = await store.nodes.Document.upsertById("doc_123", { title: "Original Title", content: "...", }); // Second call updates the existing node const doc2 = await store.nodes.Document.upsertById("doc_123", { title: "Updated Title", content: "...", }); // doc1.id === doc2.id - same node, updated in place ``` ### bulkCreate Efficiently creates multiple nodes in a single operation. Uses a single multi-row INSERT with RETURNING when the backend supports it: ```typescript const documents = await store.nodes.Document.bulkCreate([ { props: { title: "Doc 1", content: "..." } }, { props: { title: "Doc 2", content: "..." } }, { props: { title: "Doc 3", content: "..." }, id: "custom_id" }, ]); ``` If you only need write side effects and not created payloads, use `bulkInsert`. ### bulkInsert Inserts multiple nodes without returning results. 
This is the dedicated fast path for bulk ingestion — automatically wrapped in a transaction: ```typescript await store.nodes.Document.bulkInsert([ { props: { title: "Doc 1", content: "..." } }, { props: { title: "Doc 2", content: "..." } }, { props: { title: "Doc 3", content: "..." }, id: "custom_id" }, ]); ``` Prefer `bulkInsert` over `bulkCreate` when you don't need results. ### bulkUpsertById Creates or updates multiple nodes. Ideal for sync workflows where you don't know which records already exist: ```typescript // Sync a batch of external records const externalRecords = await fetchExternalData(); const synced = await store.nodes.Document.bulkUpsertById( externalRecords.map((record) => ({ id: record.id, // Use external ID as graph node ID props: { title: record.title, content: record.body, source: { table: "documents", id: record.id }, }, })) ); ``` ### bulkDelete Deletes multiple nodes by ID. Silently ignores IDs that don't exist: ```typescript // Remove nodes that no longer exist in source const deletedIds = await findDeletedRecords(); await store.nodes.Document.bulkDelete(deletedIds); ``` ### getOrCreate APIs Use get-or-create methods when your dedupe key is not a direct ID: ```typescript // Match by a named uniqueness constraint const byEmail = await store.nodes.User.getOrCreateByConstraint( "user_email", { email: "alice@example.com", name: "Alice" }, { ifExists: "update" } ); // byEmail.action: "created" | "found" | "updated" | "resurrected" // Match edges by endpoints (+ optional matchOn fields) const membership = await store.edges.memberOf.getOrCreateByEndpoints( user, org, { role: "admin", source: "sync" }, { matchOn: ["role"], ifExists: "update" } ); // membership.action: "created" | "found" | "updated" | "resurrected" ``` ### Edge Bulk Operations Edges also support bulk operations: ```typescript // Create many edges at once (returns created edges) const edges = await store.edges.relatedTo.bulkCreate([ { from: doc1, to: doc2, props: { confidence: 0.9 
} },
  { from: doc1, to: doc3, props: { confidence: 0.7 } },
  { from: doc2, to: doc3, props: { confidence: 0.8 } },
]);

// Insert many edges without returning results (fast path)
await store.edges.relatedTo.bulkInsert([
  { from: doc1, to: doc2, props: { confidence: 0.9 } },
  { from: doc1, to: doc3, props: { confidence: 0.7 } },
  { from: doc2, to: doc3, props: { confidence: 0.8 } },
]);

// Delete many edges at once
await store.edges.relatedTo.bulkDelete(edgeIds);
```

## On-Demand Sync

Sync individual records when they're accessed or modified. Best for low-volume scenarios where you want real-time consistency.

```typescript
import { type Store } from "@nicia-ai/typegraph";
import { eq } from "drizzle-orm";
import { db, documents } from "./drizzle-schema";

interface AppDocument {
  id: string;
  title: string;
  content: string;
  updatedAt: Date;
}

async function syncDocument(store: Store, doc: AppDocument) {
  // Generate embedding for semantic search
  const embedding = await generateEmbedding(doc.content);

  // Upsert ensures we create or update as needed
  return store.nodes.Document.upsertById(doc.id, {
    title: doc.title,
    content: doc.content,
    embedding,
    source: { table: "documents", id: doc.id },
  });
}

// Sync on read - ensure graph is current before querying
async function getRelatedDocuments(documentId: string) {
  // First, ensure the source document is synced
  const appDoc = await db.select().from(documents).where(eq(documents.id, documentId)).get();
  if (!appDoc) throw new Error("Document not found");
  await syncDocument(store, appDoc);

  // Now query the graph for relationships
  return store
    .query()
    .from("Document", "d")
    .whereNode("d", (d) => d.id.eq(documentId))
    .traverse("relatedTo", "r")
    .to("Document", "related")
    .select((ctx) => ({
      id: ctx.related.id,
      title: ctx.related.title,
      confidence: ctx.r.confidence,
    }))
    .execute();
}

// Sync on write - update graph when source changes
async function updateDocument(documentId: string, updates: Partial<AppDocument>) {
  // Update source of truth first
  const [updated] = await db
.update(documents)
    .set({ ...updates, updatedAt: new Date() })
    .where(eq(documents.id, documentId))
    .returning();

  // Then sync to graph
  await syncDocument(store, updated);

  return updated;
}
```

## Batch Sync

Process records in batches for bulk imports or periodic refresh. Best for large datasets or when you need to backfill data.

### Basic Batch Sync

```typescript
interface SyncOptions {
  batchSize?: number;
  onProgress?: (processed: number, total: number) => void;
}

async function syncAllDocuments(store: Store, options: SyncOptions = {}) {
  const { batchSize = 100, onProgress } = options;

  // Get total count for progress reporting
  const [{ count }] = await db.select({ count: sql<number>`count(*)` }).from(documents);

  let processed = 0;
  let offset = 0;

  while (offset < count) {
    // Fetch a batch from source
    const batch = await db.select().from(documents).limit(batchSize).offset(offset);
    if (batch.length === 0) break;

    // Generate embeddings in parallel (respecting API rate limits)
    const embeddings = await batchGenerateEmbeddings(batch.map((d) => d.content));

    // Bulk upsert the batch
    await store.nodes.Document.bulkUpsertById(
      batch.map((doc, i) => ({
        id: doc.id,
        props: {
          title: doc.title,
          content: doc.content,
          embedding: embeddings[i],
          source: { table: "documents", id: doc.id },
        },
      }))
    );

    processed += batch.length;
    offset += batchSize;
    onProgress?.(processed, count);
  }

  return { processed, total: count };
}

// Usage
await syncAllDocuments(store, {
  batchSize: 50,
  onProgress: (processed, total) => {
    console.log(`Synced ${processed}/${total} documents`);
  },
});
```

### Incremental Sync

Only sync records that have changed since the last sync:

```typescript
interface SyncState {
  lastSyncAt: Date;
}

async function incrementalSync(store: Store, state: SyncState): Promise<SyncState> {
  const since = state.lastSyncAt;
  const now = new Date();

  // Fetch only changed records
  const changed = await db
    .select()
    .from(documents)
    .where(gt(documents.updatedAt, since))
    .orderBy(documents.updatedAt);

  if
(changed.length > 0) { const embeddings = await batchGenerateEmbeddings(changed.map((d) => d.content)); await store.nodes.Document.bulkUpsertById( changed.map((doc, i) => ({ id: doc.id, props: { title: doc.title, content: doc.content, embedding: embeddings[i], source: { table: "documents", id: doc.id }, }, })) ); console.log(`Synced ${changed.length} changed documents`); } // Handle deletions (if your source tracks them) const deleted = await db .select({ id: documents.id }) .from(documents) .where(and(gt(documents.deletedAt, since), isNotNull(documents.deletedAt))); if (deleted.length > 0) { await store.nodes.Document.bulkDelete(deleted.map((d) => d.id)); console.log(`Removed ${deleted.length} deleted documents`); } return { lastSyncAt: now }; } ``` ### Scheduled Sync Job Run incremental sync on a schedule: ```typescript import { CronJob } from "cron"; // Store sync state (in production, persist this to a database) let syncState: SyncState = { lastSyncAt: new Date(0) }; // Run every 5 minutes const syncJob = new CronJob("*/5 * * * *", async () => { try { syncState = await incrementalSync(store, syncState); } catch (error) { console.error("Sync failed:", error); // Alert, retry, etc. } }); syncJob.start(); ``` ## Event-Driven Sync React to changes in your source data via events, webhooks, or database triggers. Best for high-volume scenarios requiring near-real-time sync. 
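Whatever the transport, high-volume event streams benefit from coalescing: collapse bursts of changes so a frequently-updated record is synced once per flush rather than once per change. A minimal sketch — the helper below is illustrative, not part of TypeGraph:

```typescript
interface ChangeEvent {
  entityType: string;
  entityId: string;
  type: "upsert" | "delete";
}

// Keep only the most recent event per entity. Later events win: an upsert
// followed by a delete collapses to a single delete job.
function coalesce(events: ChangeEvent[]): ChangeEvent[] {
  const latest = new Map<string, ChangeEvent>();
  for (const event of events) {
    latest.set(`${event.entityType}:${event.entityId}`, event);
  }
  return [...latest.values()];
}
```

Running `coalesce` over a buffered window of events before enqueueing keeps the sync worker from re-processing the same entity repeatedly.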
### Message Queue Pattern

```typescript
import { Queue, Worker } from "bullmq";

// Define sync job types
interface SyncJob {
  type: "upsert" | "delete";
  entityType: "Document" | "User";
  entityId: string;
}

// Producer: Enqueue sync jobs when source data changes
const syncQueue = new Queue<SyncJob>("sync");

async function onDocumentCreated(doc: AppDocument) {
  await syncQueue.add("sync", {
    type: "upsert",
    entityType: "Document",
    entityId: doc.id,
  });
}

async function onDocumentUpdated(doc: AppDocument) {
  await syncQueue.add("sync", {
    type: "upsert",
    entityType: "Document",
    entityId: doc.id,
  });
}

async function onDocumentDeleted(docId: string) {
  await syncQueue.add("sync", {
    type: "delete",
    entityType: "Document",
    entityId: docId,
  });
}

// Consumer: Process sync jobs
const syncWorker = new Worker<SyncJob>(
  "sync",
  async (job) => {
    const { type, entityType, entityId } = job.data;

    if (type === "delete") {
      await store.nodes[entityType].delete(entityId);
      return;
    }

    // Fetch current state from source
    const record = await fetchEntity(entityType, entityId);
    if (!record) {
      // Record was deleted between enqueue and processing
      await store.nodes[entityType].delete(entityId);
      return;
    }

    // Generate embedding if needed
    const embedding = await generateEmbedding(record.content);

    // Upsert to graph
    await store.nodes[entityType].upsertById(entityId, {
      ...record,
      embedding,
      source: { table: entityType.toLowerCase() + "s", id: entityId },
    });
  },
  {
    concurrency: 10,
    connection: redis,
  }
);
```

### Webhook Handler

Process webhooks from external systems:

```typescript
import { Hono } from "hono";

const app = new Hono();

app.post("/webhooks/documents", async (c) => {
  const event = await c.req.json<{
    type: "created" | "updated" | "deleted";
    data: AppDocument;
  }>();

  switch (event.type) {
    case "created":
    case "updated": {
      const embedding = await generateEmbedding(event.data.content);
      await store.nodes.Document.upsertById(event.data.id, {
        title: event.data.title,
        content: event.data.content,
        embedding,
        source: {
table: "documents", id: event.data.id }, }); break; } case "deleted": { await store.nodes.Document.delete(event.data.id); break; } } return c.json({ ok: true }); }); ``` ### Database Triggers (PostgreSQL) Use LISTEN/NOTIFY for real-time sync from PostgreSQL: ```typescript import { Client } from "pg"; // Set up listener const listener = new Client({ connectionString: process.env.DATABASE_URL }); await listener.connect(); await listener.query("LISTEN document_changes"); listener.on("notification", async (msg) => { if (msg.channel !== "document_changes") return; const payload = JSON.parse(msg.payload!); const { operation, id } = payload; if (operation === "DELETE") { await store.nodes.Document.delete(id); return; } // Fetch and sync the changed document const doc = await db.select().from(documents).where(eq(documents.id, id)).get(); if (doc) { const embedding = await generateEmbedding(doc.content); await store.nodes.Document.upsertById(id, { title: doc.title, content: doc.content, embedding, source: { table: "documents", id }, }); } }); ``` Corresponding PostgreSQL trigger: ```sql CREATE OR REPLACE FUNCTION notify_document_changes() RETURNS TRIGGER AS $$ BEGIN PERFORM pg_notify( 'document_changes', json_build_object( 'operation', TG_OP, 'id', COALESCE(NEW.id, OLD.id) )::text ); RETURN COALESCE(NEW, OLD); END; $$ LANGUAGE plpgsql; CREATE TRIGGER document_changes_trigger AFTER INSERT OR UPDATE OR DELETE ON documents FOR EACH ROW EXECUTE FUNCTION notify_document_changes(); ``` ## Syncing Relationships When syncing data that includes relationships, sync nodes first, then edges: ```typescript interface ExternalUser { id: string; name: string; email: string; managerId?: string; } async function syncUsers(users: ExternalUser[]) { // Step 1: Sync all user nodes first await store.nodes.User.bulkUpsertById( users.map((u) => ({ id: u.id, props: { name: u.name, email: u.email, source: { table: "users", id: u.id }, }, })) ); // Step 2: Sync manager relationships // First, remove 
all existing manages edges (clean slate approach) const existingEdges = await store.edges.manages.find(); if (existingEdges.length > 0) { await store.edges.manages.bulkDelete(existingEdges.map((e) => e.id)); } // Then create edges for users with managers const usersWithManagers = users.filter((u) => u.managerId); await store.edges.manages.bulkInsert( usersWithManagers.map((u) => ({ from: { kind: "User" as const, id: u.managerId! }, to: { kind: "User" as const, id: u.id }, })), ); } ``` ## Handling Sync Failures ### Retry with Exponential Backoff ```typescript async function syncWithRetry<T>( fn: () => Promise<T>, options: { maxRetries?: number; baseDelay?: number } = {} ): Promise<T> { const { maxRetries = 3, baseDelay = 1000 } = options; let lastError: Error | undefined; for (let attempt = 0; attempt <= maxRetries; attempt++) { try { return await fn(); } catch (error) { lastError = error as Error; if (attempt < maxRetries) { const delay = baseDelay * Math.pow(2, attempt); console.warn(`Sync attempt ${attempt + 1} failed, retrying in ${delay}ms`); await new Promise((resolve) => setTimeout(resolve, delay)); } } } throw lastError; } // Usage await syncWithRetry(() => store.nodes.Document.bulkUpsertById(items)); ``` ### Dead Letter Queue Track failed syncs for manual intervention: ```typescript interface FailedSync { entityType: string; entityId: string; error: string; failedAt: Date; attempts: number; } const failedSyncs: FailedSync[] = []; async function syncWithDLQ(entityType: string, entityId: string, syncFn: () => Promise<void>) { try { await syncWithRetry(syncFn); } catch (error) { failedSyncs.push({ entityType, entityId, error: (error as Error).message, failedAt: new Date(), attempts: 3, }); console.error(`Sync failed after retries: ${entityType}:${entityId}`); } } // Periodically retry or alert on failed syncs async function processFailedSyncs() { for (const failed of failedSyncs) { console.log(`Failed sync: ${failed.entityType}:${failed.entityId} - ${failed.error}`); //
Retry, alert, or log for manual intervention } } ``` ## Best Practices ### Use Consistent IDs Map external IDs to graph node IDs consistently: ```typescript // Good: Use external ID directly when it's unique and stable await store.nodes.Document.upsertById(externalDoc.id, { ... }); // Good: Namespace if IDs might collide across sources await store.nodes.Document.upsertById(`notion:${notionPage.id}`, { ... }); await store.nodes.Document.upsertById(`gdrive:${driveFile.id}`, { ... }); ``` ### Track Sync Metadata Store sync information for debugging and auditing: ```typescript const Document = defineNode("Document", { schema: z.object({ title: z.string(), content: z.string(), embedding: embedding(1536).optional(), source: externalRef("documents"), // Sync metadata lastSyncedAt: z.string().datetime().optional(), syncVersion: z.number().optional(), }), }); await store.nodes.Document.upsertById(doc.id, { ...props, lastSyncedAt: new Date().toISOString(), syncVersion: (existingNode?.syncVersion ?? 
0) + 1, }); ``` ### Validate Before Sync Validate external data before syncing to avoid corrupting your graph: ```typescript const ExternalDocumentSchema = z.object({ id: z.string().min(1), title: z.string().min(1), content: z.string(), }); async function syncDocument(rawDoc: unknown) { const result = ExternalDocumentSchema.safeParse(rawDoc); if (!result.success) { console.error("Invalid document data:", result.error); return; } await store.nodes.Document.upsertById(result.data.id, { title: result.data.title, content: result.data.content, }); } ``` ### Monitor Sync Health Track sync metrics for observability: ```typescript const syncMetrics = { successful: 0, failed: 0, lastSyncDuration: 0, lastSyncAt: null as Date | null, }; async function monitoredSync(fn: () => Promise<void>) { const start = Date.now(); try { await fn(); syncMetrics.successful++; } catch (error) { syncMetrics.failed++; throw error; } finally { syncMetrics.lastSyncDuration = Date.now() - start; syncMetrics.lastSyncAt = new Date(); } } // Expose metrics endpoint app.get("/metrics/sync", (c) => c.json(syncMetrics)); ``` ## Next Steps - [Integration Patterns](/integration) - Database setup and deployment patterns - [Semantic Search](/semantic-search) - Add vector embeddings during sync - [Query Builder](/queries/overview) - Query your synced graph data # Integration Patterns > Strategies for integrating TypeGraph into your application architecture This guide covers common integration patterns for adding TypeGraph to existing applications, from simple setups to production deployment strategies. ## Direct Drizzle Integration (Shared Database) If you're already using Drizzle ORM, TypeGraph can share your existing database connection. TypeGraph tables coexist alongside your application tables.
```typescript import { drizzle } from "drizzle-orm/node-postgres"; import { Pool } from "pg"; import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; import { createStore } from "@nicia-ai/typegraph"; // Your existing Drizzle setup const pool = new Pool({ connectionString: process.env.DATABASE_URL }); const db = drizzle(pool); // Add TypeGraph tables to your existing database await pool.query(generatePostgresMigrationSQL()); // Create TypeGraph backend using the same connection const backend = createPostgresBackend(db); const store = createStore(graph, backend); // For pure TypeGraph operations, use store.transaction() await store.transaction(async (tx) => { const person = await tx.nodes.Person.create({ name: "Alice" }); const company = await tx.nodes.Company.create({ name: "Acme" }); await tx.edges.worksAt.create(person, company, { role: "Engineer" }); }); ``` ### Mixed Drizzle + TypeGraph Transactions When combining TypeGraph operations with direct Drizzle queries in the same atomic transaction, create a temporary backend from the Drizzle transaction: ```typescript await db.transaction(async (tx) => { // Direct Drizzle operations await tx.insert(auditLog).values({ action: "user_created" }); // TypeGraph operations in the same transaction const txBackend = createPostgresBackend(tx); const txStore = createStore(graph, txBackend); await txStore.nodes.Person.create({ name: "Alice" }); }); ``` This pattern is only needed when you must combine both in one atomic transaction. 
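If the mixed pattern shows up at several call sites, the temporary-backend wiring can be factored into a small wrapper. The sketch below is plain TypeScript with illustrative parameter names (`runTransaction`, `makeStore`), not TypeGraph APIs: `runTransaction` stands in for `db.transaction`, and `makeStore` for `createStore(graph, createPostgresBackend(tx))`.

```typescript
// Sketch: a reusable wrapper for the mixed-transaction pattern above.
// Generic over the transaction handle (Tx) and the store type (S), so it
// carries no dependency on any particular driver or library.
type TransactionRunner<Tx> = <T>(fn: (tx: Tx) => Promise<T>) => Promise<T>;

async function withGraphTx<Tx, S, T>(
  runTransaction: TransactionRunner<Tx>,
  makeStore: (tx: Tx) => S,
  fn: (tx: Tx, store: S) => Promise<T>
): Promise<T> {
  // One transaction is opened; the graph store is bound to it for its lifetime,
  // so raw queries and graph operations commit or roll back together.
  return runTransaction(async (tx) => fn(tx, makeStore(tx)));
}
```

A call site would then read something like `withGraphTx((fn) => db.transaction(fn), (tx) => createStore(graph, createPostgresBackend(tx)), async (tx, txStore) => { ... })`, keeping the transaction-scoped store construction in one place.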
**When to use:** - You want a single database to manage - Your graph data relates to existing tables - You need cross-cutting transactions **Considerations:** - TypeGraph tables use the `typegraph_` prefix to avoid collisions - Run TypeGraph migrations alongside your application migrations - Connection pool is shared, so size accordingly ## Drizzle-Kit Managed Migrations (Recommended) If you use `drizzle-kit` to manage migrations, you can import TypeGraph's table definitions directly into your schema file. This lets drizzle-kit generate migrations for all tables—both yours and TypeGraph's—in one place. ### Setup **1. Import TypeGraph tables into your schema:** ```typescript // schema.ts import { sqliteTable, text, integer } from "drizzle-orm/sqlite-core"; // Import TypeGraph tables (these are standard Drizzle table definitions) export * from "@nicia-ai/typegraph/sqlite"; // Or for PostgreSQL: // export * from "@nicia-ai/typegraph/postgres"; // Your application tables export const users = sqliteTable("users", { id: text("id").primaryKey(), name: text("name").notNull(), email: text("email").notNull(), }); ``` **2. Generate migrations normally:** ```bash npx drizzle-kit generate ``` Drizzle-kit will now see all tables—TypeGraph's and yours—and generate migrations for them. **3. Apply migrations:** ```bash npx drizzle-kit migrate # Or for Cloudflare D1: wrangler d1 migrations apply your-database ``` **4. 
Create the backend:** ```typescript import { drizzle } from "drizzle-orm/better-sqlite3"; import Database from "better-sqlite3"; import { createSqliteBackend, tables } from "@nicia-ai/typegraph/sqlite"; import { createStore } from "@nicia-ai/typegraph"; const sqlite = new Database("app.db"); const db = drizzle(sqlite); // Use the same tables that drizzle-kit manages const backend = createSqliteBackend(db, { tables }); const store = createStore(graph, backend); ``` ### Custom Table Names To avoid conflicts or match your naming conventions, use the factory function: ```typescript // schema.ts import { createSqliteTables } from "@nicia-ai/typegraph/sqlite"; // Create tables with custom names export const typegraphTables = createSqliteTables({ nodes: "myapp_graph_nodes", edges: "myapp_graph_edges", uniques: "myapp_graph_uniques", schemaVersions: "myapp_graph_schema_versions", embeddings: "myapp_graph_embeddings", }); // Export individual tables for drizzle-kit export const { nodes: myappGraphNodes, edges: myappGraphEdges, uniques: myappGraphUniques, schemaVersions: myappGraphSchemaVersions, embeddings: myappGraphEmbeddings, } = typegraphTables; ``` Then pass the same tables to the backend: ```typescript import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite"; import { typegraphTables } from "./schema"; const backend = createSqliteBackend(db, { tables: typegraphTables }); ``` ### Adding TypeGraph Indexes The table factory functions also accept `indexes`, which drizzle-kit will include in migrations: ```ts // schema.ts import { createSqliteTables } from "@nicia-ai/typegraph/sqlite"; import { defineNodeIndex } from "@nicia-ai/typegraph/indexes"; import { Person } from "./graph"; const personEmail = defineNodeIndex(Person, { fields: ["email"] }); export const typegraphTables = createSqliteTables({}, { indexes: [personEmail] }); ``` For PostgreSQL, use `createPostgresTables` from `@nicia-ai/typegraph/postgres`. 
See [Indexes](/performance/indexes) for covering fields, partial indexes, and profiler integration. If you only need PostgreSQL adapter exports, import from `@nicia-ai/typegraph/postgres`: ```typescript import { createPostgresBackend, tables } from "@nicia-ai/typegraph/postgres"; ``` ### PostgreSQL with pgvector For PostgreSQL with vector search, ensure the pgvector extension is enabled before running migrations: ```sql CREATE EXTENSION IF NOT EXISTS vector; ``` Then in your schema: ```typescript // schema.ts export * from "@nicia-ai/typegraph/postgres"; export const users = pgTable("users", { ... }); ``` **When to use:** - You already use drizzle-kit for migrations - You want a single migration workflow for all tables - You need Cloudflare D1 or other platforms that require drizzle-kit migrations **Advantages over raw SQL migrations:** - Single source of truth for schema - Type-safe schema in TypeScript - Drizzle-kit handles migration diffs automatically - Works with all drizzle-kit supported platforms ## Separate Database Use a dedicated database when you want isolation between your application data and graph data. 
```typescript import { Pool } from "pg"; import { drizzle } from "drizzle-orm/node-postgres"; import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; // Application database (your existing setup) const appPool = new Pool({ connectionString: process.env.APP_DATABASE_URL }); const appDb = drizzle(appPool); // Dedicated TypeGraph database const graphPool = new Pool({ connectionString: process.env.GRAPH_DATABASE_URL }); const graphDb = drizzle(graphPool); await graphPool.query(generatePostgresMigrationSQL()); const backend = createPostgresBackend(graphDb); const store = createStore(graph, backend); ``` **When to use:** - Your primary database doesn't support required features (e.g., pgvector) - You want independent scaling for graph operations - Compliance requires data separation - You're adding graph capabilities to a legacy system **Considerations:** - No cross-database transactions (use eventual consistency patterns) - Sync data between databases via application logic or events - Separate backup/restore procedures ## In-Memory (Ephemeral Graphs) Use in-memory SQLite for temporary graphs, caching, or computation. 
```typescript import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local"; function createEphemeralStore(graph: GraphDef) { const { backend } = createLocalSqliteBackend(); return createStore(graph, backend); } // Use case: Build a temporary graph for computation async function computeRecommendations(userId: string) { const tempStore = createEphemeralStore(recommendationGraph); // Load relevant data into temporary graph const userData = await fetchUserData(userId); await populateGraph(tempStore, userData); // Run graph algorithms const results = await tempStore .query() .from("User", "u") .whereNode("u", (u) => u.id.eq(userId)) .traverse("similar", "s") .to("Product", "p") .select((ctx) => ctx.p) .execute(); return results; } ``` **When to use:** - Temporary computation graphs - Request-scoped graph state - Graph-based caching with expiration - Isolated test fixtures **Considerations:** - Data lost on process termination - Memory usage scales with graph size - No persistence—rebuild on restart ## Hybrid Overlay (Graph on Existing Data) Add graph relationships on top of existing relational data without migrating your data model. Your existing tables remain the source of truth; TypeGraph stores only the relationships and graph-specific metadata.
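In this pattern, each graph node carries a `{ table, id }` pointer back to its source row. As a plain-TypeScript sketch of that reference shape (a conceptual illustration, independent of TypeGraph's actual helpers):

```typescript
// Concept sketch only: the overlay's { table, id } back-reference in plain
// TypeScript. TypeGraph's real externalRef()/createExternalRef() helpers add
// schema validation on top of this shape.
interface ExternalRowRef<T extends string> {
  table: T;
  id: string;
}

// Fix the table name once so call sites only supply the row id
function makeRefFactory<T extends string>(table: T) {
  return (id: string): ExternalRowRef<T> => ({ table, id });
}

const documentRef = makeRefFactory("documents");
const ref = documentRef("doc_123");
// ref.table has the literal type "documents", so passing ref where an
// ExternalRowRef<"users"> is expected fails at compile time
```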
Use the `externalRef()` helper to create type-safe references to external tables: ```typescript import { createExternalRef, defineEdge, defineGraph, defineNode, embedding, externalRef, } from "@nicia-ai/typegraph"; import { z } from "zod"; // Define nodes that reference your existing tables const User = defineNode("User", { schema: z.object({ // Type-safe reference to your existing users table source: externalRef("users"), // Denormalized fields for graph queries (optional) displayName: z.string().optional(), }), }); const Document = defineNode("Document", { schema: z.object({ source: externalRef("documents"), embedding: embedding(1536).optional(), }), }); // Graph-only relationships not in your relational schema const relatedTo = defineEdge("relatedTo", { schema: z.object({ relationship: z.enum(["cites", "extends", "contradicts"]), confidence: z.number().min(0).max(1), }), }); const authored = defineEdge("authored"); const graph = defineGraph({ id: "document_graph", nodes: { User, Document }, edges: { relatedTo: { type: relatedTo, from: [Document], to: [Document] }, authored: { type: authored, from: [User], to: [Document] }, }, }); ``` The `externalRef()` helper validates that references include both the table name and ID, catching errors at insert time: ```typescript // Valid: includes table and id await store.nodes.Document.create({ source: { table: "documents", id: "doc_123" }, }); // Error: wrong table name (caught by TypeScript and runtime validation) await store.nodes.Document.create({ source: { table: "users", id: "doc_123" }, // Type error! 
}); // Use createExternalRef() for a cleaner API const docRef = createExternalRef("documents"); await store.nodes.Document.create({ source: docRef("doc_456"), }); ``` **Syncing with external data:** ```typescript // Sync helper: Create or update graph node from app data async function syncDocument(store: Store, appDocument: AppDocument) { const existing = await store .query() .from("Document", "d") .whereNode("d", (d) => d.source.get("id").eq(appDocument.id)) .select((ctx) => ctx.d) .first(); if (existing) { await store.nodes.Document.update(existing.id, { embedding: await generateEmbedding(appDocument.content), }); return existing; } return store.nodes.Document.create({ source: { table: "documents", id: appDocument.id }, embedding: await generateEmbedding(appDocument.content), }); } // Query combining graph traversal with app data hydration async function findRelatedDocuments(documentId: string) { // Get graph relationships const related = await store .query() .from("Document", "d") .whereNode("d", (d) => d.source.get("id").eq(documentId)) .traverse("relatedTo", "r") .to("Document", "related") .select((ctx) => ({ source: ctx.related.source, relationship: ctx.r.relationship, confidence: ctx.r.confidence, })) .execute(); // Hydrate with full data from app database const externalIds = related.map((r) => r.source.id); const fullDocuments = await appDb .select() .from(documents) .where(inArray(documents.id, externalIds)); return related.map((r) => ({ ...r, document: fullDocuments.find((d) => d.id === r.source.id), })); } ``` **When to use:** - Adding graph capabilities to an existing application - Semantic search over existing content - Relationship discovery without schema changes - Gradual migration from relational to graph thinking **Considerations:** - Maintain sync between app data and graph nodes - Decide what to denormalize (tradeoff: query speed vs. 
sync complexity) - The `table` field in `externalRef` enables referencing multiple external sources ## Background Embedding Workers Decouple embedding generation from request handling using background jobs. ```typescript // job-queue.ts - Define the embedding job interface EmbeddingJob { nodeType: string; nodeId: string; content: string; } // worker.ts - Process embedding jobs import { createStore } from "@nicia-ai/typegraph"; async function processEmbeddingJob(job: EmbeddingJob) { const { nodeType, nodeId, content } = job; // Generate embedding (expensive operation) const embedding = await openai.embeddings.create({ model: "text-embedding-ada-002", input: content, }); // Update the node const collection = store.nodes[nodeType as keyof typeof store.nodes]; await collection.update(nodeId, { embedding: embedding.data[0].embedding, }); } // api-handler.ts - Enqueue jobs on create/update async function createDocument(data: DocumentInput) { // Create node without embedding (fast) const doc = await store.nodes.Document.create({ title: data.title, content: data.content, // embedding: undefined - will be populated by worker }); // Enqueue embedding job (non-blocking) await jobQueue.add("generate-embedding", { nodeType: "Document", nodeId: doc.id, content: data.content, }); return doc; } ``` **Batch processing for bulk imports:** ```typescript async function backfillEmbeddings(batchSize = 100) { let processed = 0; while (true) { // Find nodes missing embeddings const nodes = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.isNull()) .select((ctx) => ({ id: ctx.d.id, content: ctx.d.content, })) .limit(batchSize) .execute(); if (nodes.length === 0) break; // Batch embed const embeddings = await openai.embeddings.create({ model: "text-embedding-ada-002", input: nodes.map((n) => n.content), }); // Batch update await store.transaction(async (tx) => { for (const [i, node] of nodes.entries()) { await tx.nodes.Document.update(node.id, { embedding: 
embeddings.data[i].embedding, }); } }); processed += nodes.length; console.log(`Processed ${processed} documents`); } } ``` **When to use:** - Embedding generation is slow (100-500ms per call) - You want fast API response times - Bulk importing existing content - Retry logic for API failures **Considerations:** - Handle job failures and retries - Consider rate limits on embedding APIs - Queries on `embedding` should handle null values during population ## Testing For test setup patterns, seed data strategies, and profiler-based index coverage checks, see the dedicated [Testing](/testing) guide. ## Deployment Patterns ### Edge and Serverless Deploy TypeGraph at the edge using SQLite-compatible runtimes. > **Note:** Edge environments cannot use `@nicia-ai/typegraph/sqlite/local` because it > depends on `better-sqlite3`, a native Node.js addon. Instead, use > `@nicia-ai/typegraph/sqlite`, which is driver-agnostic. **Cloudflare Workers with D1:** ```typescript // worker.ts import { drizzle } from "drizzle-orm/d1"; import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite"; export default { async fetch(request: Request, env: Env): Promise<Response> { const db = drizzle(env.DB); const backend = createSqliteBackend(db); const store = createStore(graph, backend); // Handle request with graph queries const results = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryEmbedding, 5)) .select((ctx) => ctx.d) .execute(); return Response.json(results); }, }; ``` **Turso (libSQL):** ```typescript import { createClient } from "@libsql/client"; import { drizzle } from "drizzle-orm/libsql"; import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite"; const client = createClient({ url: process.env.TURSO_DATABASE_URL!, authToken: process.env.TURSO_AUTH_TOKEN, }); const db = drizzle(client); const backend = createSqliteBackend(db); const store = createStore(graph, backend); ``` > For Turso and D1, use [drizzle-kit managed
migrations](#drizzle-kit-managed-migrations-recommended) > to set up the schema. **Bun with built-in SQLite:** Bun runs locally, so you can use the Node.js-compatible path with better-sqlite3, or use bun:sqlite with drizzle-kit managed migrations: ```typescript import { Database } from "bun:sqlite"; import { drizzle } from "drizzle-orm/bun-sqlite"; import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite"; const sqlite = new Database("app.db"); const db = drizzle(sqlite); const backend = createSqliteBackend(db); const store = createStore(graph, backend); ``` > Use [drizzle-kit managed migrations](#drizzle-kit-managed-migrations-recommended) > to set up the schema with bun:sqlite. **When to use:** - Low-latency requirements (data close to users) - Serverless functions with graph queries - Read-heavy workloads **Considerations:** - SQLite limitations (single-writer, no pgvector) - Cold start times include DB initialization - sqlite-vec for vector search (cosine/L2 only) ### Read Replica Separation Route heavy graph queries to read replicas while writes go to primary. 
```typescript import { Pool } from "pg"; import { drizzle } from "drizzle-orm/node-postgres"; import { createPostgresBackend } from "@nicia-ai/typegraph/postgres"; // Primary for writes const primaryPool = new Pool({ connectionString: process.env.PRIMARY_DATABASE_URL, max: 10, }); const primaryDb = drizzle(primaryPool); const primaryBackend = createPostgresBackend(primaryDb); const primaryStore = createStore(graph, primaryBackend); // Replica for reads const replicaPool = new Pool({ connectionString: process.env.REPLICA_DATABASE_URL, max: 50, // Higher pool for read-heavy workloads }); const replicaDb = drizzle(replicaPool); const replicaBackend = createPostgresBackend(replicaDb); const replicaStore = createStore(graph, replicaBackend); // Route based on operation export const stores = { write: primaryStore, read: replicaStore, }; // Usage async function searchDocuments(query: string) { // Read from replica return stores.read .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryEmbedding, 10)) .select((ctx) => ctx.d) .execute(); } async function createDocument(data: DocumentInput) { // Write to primary return stores.write.nodes.Document.create(data); } ``` **When to use:** - Heavy read workloads (semantic search, graph traversals) - Write/read ratio is heavily skewed toward reads - Need to scale read capacity independently **Considerations:** - Replication lag means reads may be slightly stale - Don't use replica for read-after-write scenarios - Monitor replication lag in production ### Multi-Tenant Architecture Three approaches for multi-tenant deployments, each with different tradeoffs. 
#### Option 1: Shared tables with tenant isolation (simplest) ```typescript import { defineNode, defineGraph } from "@nicia-ai/typegraph"; // Include tenantId in your node schemas const Document = defineNode("Document", { schema: z.object({ tenantId: z.string(), title: z.string(), content: z.string(), }), }); // Always filter by tenant in queries function createTenantQuery(store: Store, tenantId: string) { return { searchDocuments: (query: string) => store .query() .from("Document", "d") .whereNode("d", (d) => d.tenantId.eq(tenantId).and( d.embedding.similarTo(queryEmbedding, 10) ) ) .select((ctx) => ctx.d) .execute(), createDocument: (data: { title: string; content: string }) => store.nodes.Document.create({ ...data, tenantId }), }; } // Middleware extracts tenant and creates scoped API function withTenant(req: Request) { const tenantId = req.headers.get("x-tenant-id")!; return createTenantQuery(store, tenantId); } ``` #### Option 2: Schema per tenant (PostgreSQL) ```typescript import { sql } from "drizzle-orm"; async function createTenantStore(tenantId: string) { // Note: tenantId is interpolated into SQL below; validate it upstream const schemaName = `tenant_${tenantId}`; // Create schema if not exists await pool.query(`CREATE SCHEMA IF NOT EXISTS ${schemaName}`); // Run migrations in tenant schema await pool.query(`SET search_path TO ${schemaName}`); await pool.query(generatePostgresMigrationSQL()); await pool.query(`SET search_path TO public`); // Create Drizzle instance with schema const db = drizzle(pool, { schema: { schemaName } }); const backend = createPostgresBackend(db); return createStore(graph, backend); } // Cache tenant stores const tenantStores = new Map<string, Store>(); async function getTenantStore(tenantId: string): Promise<Store> { if (!tenantStores.has(tenantId)) { tenantStores.set(tenantId, await createTenantStore(tenantId)); } return tenantStores.get(tenantId)!; } ``` #### Option 3: Database per tenant (strongest isolation) ```typescript interface TenantConfig { id: string; databaseUrl: string; } async function createTenantStore(config: TenantConfig) { const
pool = new Pool({ connectionString: config.databaseUrl }); await pool.query(generatePostgresMigrationSQL()); const db = drizzle(pool); const backend = createPostgresBackend(db); return { store: createStore(graph, backend), close: () => pool.end(), }; } // Connection manager with LRU eviction class TenantConnectionManager { private stores = new Map<string, { store: Store; close: () => Promise<void> }>(); private maxConnections = 100; async getStore(tenantId: string): Promise<Store> { const cached = this.stores.get(tenantId); if (cached) { // Re-insert to mark as most recently used (Map preserves insertion order) this.stores.delete(tenantId); this.stores.set(tenantId, cached); return cached.store; } if (this.stores.size >= this.maxConnections) { await this.evictOldest(); } const config = await fetchTenantConfig(tenantId); const entry = await createTenantStore(config); this.stores.set(tenantId, entry); return entry.store; } private async evictOldest() { // First entry is the least recently used const [oldestId, oldest] = this.stores.entries().next().value; await oldest.close(); this.stores.delete(oldestId); } } ``` **Comparison:** | Approach | Isolation | Complexity | Scaling | Cost | |----------|-----------|------------|---------|------| | Shared tables | Low (row-level) | Low | Single DB | Lowest | | Schema per tenant | Medium | Medium | Single DB, separate schemas | Low | | Database per tenant | High | High | Independent DBs | Highest | **When to use each:** - **Shared tables**: SaaS with many small tenants, cost-sensitive - **Schema per tenant**: Moderate isolation needs, PostgreSQL only - **Database per tenant**: Enterprise customers requiring data isolation, compliance requirements ## Next Steps - [Quick Start](/getting-started) - Basic setup and first graph - [Semantic Search](/semantic-search) - Vector embeddings and similarity - [Performance](/performance/overview) - Optimization strategies # Graph Interchange > Import and export graph data for backups, migrations, and external integrations TypeGraph provides a standardized interchange format for importing and exporting graph data.
Use it for: - Importing data extracted by TypeGraph Cloud - Backing up and restoring graph data - Migrating data between environments - Exchanging data with external systems ## Quick Start ```typescript import { importGraph, exportGraph, GraphDataSchema } from "@nicia-ai/typegraph/interchange"; // Export your graph const backup = await exportGraph(store); // Import into another store const result = await importGraph(targetStore, backup, { onConflict: "update", onUnknownProperty: "strip", }); console.log(`Imported ${result.nodes.created} nodes, ${result.edges.created} edges`); ``` ## Interchange Format The interchange format is a JSON structure validated by Zod schemas. You can use `GraphDataSchema` to validate data before import, or export the schema as JSON Schema for API documentation. ```typescript import { GraphDataSchema } from "@nicia-ai/typegraph/interchange"; // Validate incoming data const validated = GraphDataSchema.parse(jsonData); // Export as JSON Schema for API docs import { toJSONSchema } from "zod"; const jsonSchema = toJSONSchema(GraphDataSchema); ``` ### Format Structure ```typescript interface GraphData { formatVersion: "1.0"; exportedAt: string; // ISO datetime source: { type: "typegraph-cloud" | "typegraph-export" | "external"; // Additional source-specific fields }; nodes: Array<{ kind: string; id: string; properties: Record<string, unknown>; validFrom?: string; validTo?: string; meta?: { version?: number; createdAt?: string; updatedAt?: string; }; }>; edges: Array<{ kind: string; id: string; from: { kind: string; id: string }; to: { kind: string; id: string }; properties: Record<string, unknown>; validFrom?: string; validTo?: string; meta?: { createdAt?: string; updatedAt?: string; }; }>; } ``` ## Exporting Data Use `exportGraph` to serialize your graph data: ```typescript import { exportGraph } from "@nicia-ai/typegraph/interchange"; // Export everything const fullExport = await exportGraph(store); // Export specific node kinds const peopleOnly = await exportGraph(store, {
nodeKinds: ["Person", "Organization"], }); // Export specific edge kinds const relationshipsOnly = await exportGraph(store, { edgeKinds: ["worksAt", "knows"], }); // Include metadata (version, timestamps) const withMeta = await exportGraph(store, { includeMeta: true, }); // Include temporal fields (validFrom, validTo) const withTemporal = await exportGraph(store, { includeTemporal: true, }); // Include soft-deleted records const withDeleted = await exportGraph(store, { includeDeleted: true, }); ``` ### Export Options | Option | Type | Default | Description | |--------|------|---------|-------------| | `nodeKinds` | `string[]` | all | Filter to specific node types | | `edgeKinds` | `string[]` | all | Filter to specific edge types | | `includeMeta` | `boolean` | `false` | Include version and timestamps | | `includeTemporal` | `boolean` | `false` | Include validFrom/validTo fields | | `includeDeleted` | `boolean` | `false` | Include soft-deleted records | ## Importing Data Use `importGraph` to load data into a store: ```typescript import { importGraph } from "@nicia-ai/typegraph/interchange"; const result = await importGraph(store, data, { onConflict: "update", onUnknownProperty: "strip", validateReferences: true, batchSize: 100, }); if (result.success) { console.log(`Created: ${result.nodes.created} nodes, ${result.edges.created} edges`); console.log(`Updated: ${result.nodes.updated} nodes, ${result.edges.updated} edges`); console.log(`Skipped: ${result.nodes.skipped} nodes, ${result.edges.skipped} edges`); } else { console.error("Import had errors:", result.errors); } ``` ### Import Options | Option | Type | Default | Description | |--------|------|---------|-------------| | `onConflict` | `"skip" \| "update" \| "error"` | required | How to handle existing entities | | `onUnknownProperty` | `"error" \| "strip" \| "allow"` | `"error"` | How to handle extra properties | | `validateReferences` | `boolean` | `true` | Verify edge endpoints exist | | `batchSize` | 
`number` | `100` | Batch size for database operations | ### Conflict Strategies **`skip`** - Keep existing data, ignore incoming: ```typescript // Useful for incremental imports where you don't want to overwrite await importGraph(store, data, { onConflict: "skip" }); ``` **`update`** - Merge incoming data into existing: ```typescript // Useful for syncing updates from an external source await importGraph(store, data, { onConflict: "update" }); ``` **`error`** - Fail if any entity already exists: ```typescript // Useful for initial imports where duplicates indicate a problem await importGraph(store, data, { onConflict: "error" }); ``` ### Unknown Property Handling When importing data that has properties not defined in your schema: **`error`** - Reject the import (default, safest): ```typescript await importGraph(store, data, { onUnknownProperty: "error" }); // Throws if data has { name: "Alice", unknownField: "value" } ``` **`strip`** - Remove unknown properties silently: ```typescript await importGraph(store, data, { onUnknownProperty: "strip" }); // { name: "Alice", unknownField: "value" } becomes { name: "Alice" } ``` **`allow`** - Pass through to storage: ```typescript await importGraph(store, data, { onUnknownProperty: "allow" }); // Behavior depends on your database and schema strictness ``` ## TypeGraph Cloud Integration When using TypeGraph Cloud for document extraction, the workflow is: 1. **Schema Discovery** (optional): Cloud analyzes your documents and proposes schemas 2. **Schema-Guided Extraction**: Cloud extracts entities/relationships matching your schema 3. 
**Import**: Use `importGraph` to load extracted data into your store ```typescript import { importGraph, GraphDataSchema } from "@nicia-ai/typegraph/interchange"; // Fetch extraction from Cloud API const response = await fetch("https://api.typegraph.cloud/extractions/abc123", { headers: { Authorization: `Bearer ${apiKey}` }, }); const cloudData = await response.json(); // Validate and import const validated = GraphDataSchema.parse(cloudData); const result = await importGraph(store, validated, { onConflict: "update", onUnknownProperty: "strip", // Cloud may include provenance fields }); ``` ### Cloud Data Sources Data from TypeGraph Cloud includes source metadata: ```typescript { formatVersion: "1.0", exportedAt: "2024-01-15T10:30:00Z", source: { type: "typegraph-cloud", extractionId: "ext_abc123", schemaId: "schema_xyz789", schemaVersion: 2, }, nodes: [...], edges: [...], } ``` ## Backup and Restore ### Creating Backups ```typescript import { exportGraph } from "@nicia-ai/typegraph/interchange"; import fs from "fs/promises"; async function createBackup(store: Store, backupDir: string) { const timestamp = new Date().toISOString().replace(/[:.]/g, "-"); const filename = `backup-${timestamp}.json`; const data = await exportGraph(store, { includeMeta: true, includeTemporal: true, }); await fs.writeFile( `${backupDir}/${filename}`, JSON.stringify(data, null, 2) ); return filename; } ``` ### Restoring from Backup ```typescript import { importGraph, GraphDataSchema } from "@nicia-ai/typegraph/interchange"; import fs from "fs/promises"; async function restoreBackup(store: Store, backupPath: string) { const json = await fs.readFile(backupPath, "utf-8"); const data = GraphDataSchema.parse(JSON.parse(json)); const result = await importGraph(store, data, { onConflict: "update", // or "error" for clean restore onUnknownProperty: "error", }); if (!result.success) { throw new Error(`Restore failed: ${result.errors.map(e => e.error).join(", ")}`); } return result; } ``` ## 
Migration Between Environments Move data from development to staging, or staging to production: ```typescript import { createStore } from "@nicia-ai/typegraph"; import { exportGraph, importGraph } from "@nicia-ai/typegraph/interchange"; import { graph } from "./schema"; async function migrateData( sourceBackend: GraphBackend, targetBackend: GraphBackend, ) { const sourceStore = createStore(graph, sourceBackend); const targetStore = createStore(graph, targetBackend); // Export from source const data = await exportGraph(sourceStore); // Import to target const result = await importGraph(targetStore, data, { onConflict: "error", // Ensure clean migration onUnknownProperty: "error", validateReferences: true, }); return result; } ``` ## Building Custom Import Pipelines For complex import scenarios, you can build pipelines using the Zod schemas: ```typescript import { GraphDataSchema, InterchangeNodeSchema, InterchangeEdgeSchema, type GraphData, } from "@nicia-ai/typegraph/interchange"; // Transform external data to interchange format function transformExternalData(externalRecords: ExternalRecord[]): GraphData { const nodes = externalRecords.map((record) => ({ kind: "Document", id: record.externalId, properties: { title: record.name, content: record.body, source: { system: "external", id: record.externalId }, }, })); // Validate each node const validatedNodes = nodes.map((node) => InterchangeNodeSchema.parse(node)); return { formatVersion: "1.0", exportedAt: new Date().toISOString(), source: { type: "external", description: "Imported from external CMS", }, nodes: validatedNodes, edges: [], }; } ``` ## Error Handling Import returns detailed error information for partial failures: ```typescript const result = await importGraph(store, data, { onConflict: "error" }); if (!result.success) { for (const error of result.errors) { console.error( `Failed to import ${error.entityType} ${error.kind}:${error.id}: ${error.error}` ); } // Decide how to handle partial import if 
(result.nodes.created > 0 || result.edges.created > 0) { console.log("Partial import completed, some entities were created"); } } ``` ## Best Practices ### Validate Before Import Always validate external data before importing: ```typescript import { GraphDataSchema } from "@nicia-ai/typegraph/interchange"; const result = GraphDataSchema.safeParse(untrustedData); if (!result.success) { console.error("Invalid data:", result.error.format()); return; } await importGraph(store, result.data, options); ``` ### Use Transactions for Consistency Import operations use transactions when the backend supports them. For backends without transaction support, consider smaller batch sizes to minimize partial failure impact. ### Test with `onConflict: "error"` First When setting up a new import pipeline, use `onConflict: "error"` to catch unexpected duplicates early: ```typescript // Development/testing await importGraph(store, data, { onConflict: "error" }); // Production (after validation) await importGraph(store, data, { onConflict: "update" }); ``` ### Monitor Import Results Log import statistics for observability: ```typescript const result = await importGraph(store, data, options); logger.info("Import completed", { success: result.success, nodesCreated: result.nodes.created, nodesUpdated: result.nodes.updated, nodesSkipped: result.nodes.skipped, edgesCreated: result.edges.created, edgesUpdated: result.edges.updated, edgesSkipped: result.edges.skipped, errorCount: result.errors.length, }); ``` ## Next Steps - [Data Sync](/data-sync) - Patterns for keeping external data in sync - [Schema Migrations](/schema-management) - Managing schema changes over time - [Integration Patterns](/integration) - Database setup and deployment # Limitations > Known constraints and backend-specific limitations This page documents TypeGraph's known limitations and constraints. ## Cloudflare D1 Transactions Cloudflare D1 does not support atomic transactions. 
When using D1, calling `store.transaction()` will throw a `ConfigurationError`. ```typescript // Throws ConfigurationError on D1 await store.transaction(async (tx) => { // ... }); ``` **Workaround:** Execute operations directly without a transaction wrapper. Operations execute sequentially but without atomicity guarantees. ```typescript // Alternative: execute operations directly (not atomic) const person = await store.nodes.Person.create({ name: "Alice" }); const company = await store.nodes.Company.create({ name: "Acme" }); await store.edges.worksAt.create(person, company, { role: "Engineer" }); ``` **Check support programmatically:** ```typescript if (backend.capabilities.transactions) { await store.transaction(async (tx) => { /* ... */ }); } else { // Handle non-transactional execution } ``` ## Recursive Traversal Depth Variable-length traversals are governed by two depth caps and a cycle policy: 1. Unbounded traversals (no `maxHops` option) are capped at 100 hops. 2. Explicit `maxHops` values are honored up to 1000 hops; values above 1000 throw. 3. Cycle prevention is on by default. To skip cycle checks for speed, opt into `cyclePolicy: "allow"` (which may revisit nodes across hops). These limits prevent runaway queries while still supporting deep, intentionally bounded traversals. 
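To make those semantics concrete, here is a minimal sketch of a hop-capped traversal with an optional cycle policy in plain TypeScript. This illustrates the behavior described above, not TypeGraph's internals; the `reachable` helper and adjacency-list shape are assumptions made for the sketch.

```typescript
// Illustrative only: models the hop cap and cycle policy, not TypeGraph's engine.
type AdjacencyList = Record<string, string[]>;

function reachable(
  graph: AdjacencyList,
  start: string,
  maxHops: number,
  cyclePolicy: "prevent" | "allow" = "prevent",
): string[] {
  // Mirrors the documented hard limit on explicit maxHops values.
  if (maxHops > 1000) throw new Error("maxHops exceeds hard limit of 1000");
  const visited = new Set<string>([start]);
  const result: string[] = [];
  let frontier = [start];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const target of graph[node] ?? []) {
        // With cycle prevention (the default), each node is expanded at most
        // once, so a cyclic graph terminates before the hop cap is reached.
        if (cyclePolicy === "prevent" && visited.has(target)) continue;
        visited.add(target);
        result.push(target);
        next.push(target);
      }
    }
    frontier = next;
  }
  return result;
}
```

With `cyclePolicy: "allow"`, a two-node cycle keeps producing results until the hop cap stops it, which is why cycle prevention is the safer default.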
```typescript // Implicitly limited to 100 hops store .query() .from("Person", "p") .traverse("reportsTo", "e") .recursive() .to("Person", "manager"); // Explicit limits up to 1000 are honored store .query() .from("Person", "p") .traverse("reportsTo", "e") .recursive({ maxHops: 200 }) // honored .to("Person", "manager"); // Explicit limits above 1000 throw store .query() .from("Person", "p") .traverse("reportsTo", "e") .recursive({ maxHops: 2000 }) // throws .to("Person", "manager"); ``` The unbounded-traversal limit is defined as `MAX_RECURSIVE_DEPTH`: ```typescript import { MAX_RECURSIVE_DEPTH } from "@nicia-ai/typegraph"; // MAX_RECURSIVE_DEPTH = 100 ``` ## Connection Management TypeGraph does not manage database connections. You are responsible for: 1. **Creating and configuring** the database connection 2. **Implementing connection pooling** for production use 3. **Closing connections** when done ```typescript import Database from "better-sqlite3"; import { drizzle } from "drizzle-orm/better-sqlite3"; import { createSqliteBackend, generateSqliteMigrationSQL } from "@nicia-ai/typegraph/sqlite"; // You manage the connection const sqlite = new Database("app.db"); sqlite.exec(generateSqliteMigrationSQL()); const db = drizzle(sqlite); const backend = createSqliteBackend(db); const store = createStore(graph, backend); // You close the connection sqlite.close(); ``` For production deployments, use connection pooling: ```typescript import { Pool } from "pg"; import { drizzle } from "drizzle-orm/node-postgres"; import { createPostgresBackend } from "@nicia-ai/typegraph/postgres"; const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20, // Maximum connections }); const db = drizzle(pool); const backend = createPostgresBackend(db); ``` The `store.close()` method is a no-op. Connection cleanup is your responsibility. ## Predicate Serialization `where` predicates in unique constraints cannot be serialized. 
If you use schema serialization for versioning or migration, predicates are stored as `"[predicate]"` and cannot be reconstructed. ```typescript // This predicate works at runtime... unique({ name: "email_unique_when_active", fields: ["email"], where: (props) => props.status.isNotNull(), }); // ...but serializes as: // { "where": "[predicate]" } ``` **Workaround:** For full schema serialization support, avoid predicates in unique constraints. Use application-level validation instead. ## Vector Search Backend Requirements Vector similarity search requires specific database extensions: | Backend | Requirement | |---------|-------------| | PostgreSQL | pgvector extension | | SQLite | sqlite-vec extension | | D1 | Not supported | Using vector predicates on unsupported backends throws `UnsupportedPredicateError`: ```typescript try { await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryVector, 10)) .execute(); } catch (error) { if (error instanceof UnsupportedPredicateError) { // Vector search not available on this backend } } ``` ## Query Builder Type Inference Complex query chains may occasionally require explicit type annotations when TypeScript cannot infer the full type. This is rare but can occur with deeply nested selects or unions. 
```typescript // If type inference fails, add explicit type const results = await store .query() .from("Person", "p") .select((ctx) => ({ name: ctx.p.name as string, // Explicit annotation })) .execute(); ``` ## Bulk Operation Limits Bulk operations (`bulkCreate`, `bulkInsert`, `bulkUpsertById`, `bulkDelete`) have practical limits based on your database: | Database | Recommended Batch Size | |----------|----------------------| | SQLite | 500-1000 items | | PostgreSQL | 1000-5000 items | For larger datasets, batch your operations: ```typescript const BATCH_SIZE = 1000; for (let i = 0; i < items.length; i += BATCH_SIZE) { const batch = items.slice(i, i + BATCH_SIZE); await store.nodes.Person.bulkCreate(batch); } ``` ## No Built-in Graph Algorithms TypeGraph deliberately excludes graph algorithms. The following are **not** provided: - Shortest path (Dijkstra, A*) - PageRank - Community detection - Centrality measures - Graph partitioning For these use cases, export your data to a specialized graph processing library or database. ## Single Database Deployment TypeGraph is designed for single-database deployments. It does not support: - Distributed storage across multiple databases - Sharding - Cross-database queries - Replication coordination For distributed graph workloads, consider a dedicated graph database. 
## Temporal Query Limitations Temporal queries (`asOf`, `includeEnded`) work correctly but have some constraints: - Point-in-time queries cannot be combined with streaming (`.stream()`) - Historical data is only available if temporal fields (`validFrom`, `validTo`) were populated at creation time - Clock skew between application servers can affect temporal accuracy ## Schema Migration Constraints Automatic migrations (`createStoreWithSchema`) only handle additive changes: | Change Type | Auto-Migrated | |-------------|---------------| | Add new node type | Yes | | Add new edge type | Yes | | Add optional property | Yes | | Add required property | No | | Remove property | No | | Rename type | No | | Change property type | No | Breaking changes throw `MigrationError` and require manual migration. # LLM Support > Machine-readable documentation for AI assistants and coding tools TypeGraph documentation is available in formats optimized for Large Language Models (LLMs) following the [llms.txt specification](https://llmstxt.org/). All files are generated from the same source docs as the website. ## Recommended Retrieval Order For coding agents, use these files progressively: 1. Start with [`/llms-small.txt`](/llms-small.txt) for implementation and debugging tasks. 2. Use [`/llms-full.txt`](/llms-full.txt) only when you need deep reference content. 3. Load [`/_llms-txt/examples.txt`](/_llms-txt/examples.txt) only when you need full end-to-end patterns. 
## Available Files | File | Purpose | Size | |------|---------|------| | [`/llms.txt`](/llms.txt) | Index with page titles, descriptions, and links | Small | | [`/llms-small.txt`](/llms-small.txt) | Core docs for implementation and debugging tasks | Medium | | [`/llms-full.txt`](/llms-full.txt) | Complete documentation in a single file | Large | | [`/_llms-txt/examples.txt`](/_llms-txt/examples.txt) | Complete application examples | Medium | ## Copy-Paste Agent Instructions Use this in repository-level agent instruction files (`AGENTS.md`, `CLAUDE.md`, etc.): ```md TypeGraph (`@nicia-ai/typegraph`) is a TypeScript-first embedded knowledge graph library with typed nodes, edges, queries, and schema management over SQLite and PostgreSQL backends. When working with TypeGraph code (graph definitions, node/edge schemas, store operations, query builder, backend setup, or migrations): 1. Load https://typegraph.dev/llms-small.txt first. 2. Use https://typegraph.dev/llms-full.txt only for deep API/reference lookup. 3. Load https://typegraph.dev/_llms-txt/examples.txt only for end-to-end implementation patterns. 4. Prefer current API docs over inferred behavior from old snippets. ``` # Multiple Graphs > Using separate graph definitions for different domains in the same application TypeGraph supports multiple graphs for applications that have distinct data domains that benefit from separate graph definitions. 
## When to Use Multiple Graphs Use separate graphs when you have: - **Distinct domains**: A RAG system for documents and a business network for suppliers have different node types, edge semantics, and query patterns - **Independent lifecycles**: One graph might evolve rapidly while another is stable - **Team ownership**: Different teams own different graphs, with separate schema review processes - **Different retention policies**: Document chunks might be ephemeral while business relationships are long-lived **Don't use multiple graphs** when: - You need cross-graph queries or traversals (use a single graph with ontology relations instead) - The domains are closely related (e.g., Users and Documents that Users author) - You're trying to solve multi-tenancy (use tenant isolation patterns instead) ## Example: Documents and Business Network A company needs two graphs: 1. **Documents graph**: Powers semantic search over internal documents 2. **Organization graph**: Tracks suppliers, partners, and contracts ### Defining the Graphs ```typescript // graphs/documents.ts import { z } from "zod"; import { defineNode, defineEdge, defineGraph, embedding } from "@nicia-ai/typegraph"; const Document = defineNode("Document", { schema: z.object({ title: z.string(), source: z.string(), createdAt: z.string().datetime(), }), }); const Chunk = defineNode("Chunk", { schema: z.object({ content: z.string(), embedding: embedding(1536), position: z.number().int(), }), }); const hasChunk = defineEdge("hasChunk"); export const documentsGraph = defineGraph({ id: "documents", nodes: { Document: { type: Document }, Chunk: { type: Chunk }, }, edges: { hasChunk: { type: hasChunk, from: [Document], to: [Chunk] }, }, }); ``` ```typescript // graphs/organization.ts import { z } from "zod"; import { defineNode, defineEdge, defineGraph, subClassOf } from "@nicia-ai/typegraph"; const Organization = defineNode("Organization", { schema: z.object({ name: z.string(), domain: z.string().optional(), }), }); 
const Supplier = defineNode("Supplier", { schema: z.object({ name: z.string(), domain: z.string().optional(), category: z.enum(["materials", "services", "logistics"]), }), }); const Partner = defineNode("Partner", { schema: z.object({ name: z.string(), domain: z.string().optional(), partnershipLevel: z.enum(["bronze", "silver", "gold"]), }), }); const Contract = defineNode("Contract", { schema: z.object({ title: z.string(), value: z.number(), startDate: z.string().datetime(), endDate: z.string().datetime().optional(), status: z.enum(["draft", "active", "expired"]).default("draft"), }), }); const supplies = defineEdge("supplies"); const hasContract = defineEdge("hasContract"); export const organizationGraph = defineGraph({ id: "organization", nodes: { Organization: { type: Organization }, Supplier: { type: Supplier }, Partner: { type: Partner }, Contract: { type: Contract }, }, edges: { supplies: { type: supplies, from: [Supplier], to: [Organization] }, hasContract: { type: hasContract, from: [Organization], to: [Contract] }, }, ontology: [ subClassOf(Supplier, Organization), subClassOf(Partner, Organization), ], }); ``` ### Creating Stores Both graphs can share the same database backend. Each graph's data is isolated by its `id`. 
```typescript // stores.ts import { createStore } from "@nicia-ai/typegraph"; import { createPostgresBackend } from "@nicia-ai/typegraph/postgres"; import { drizzle } from "drizzle-orm/node-postgres"; import { Pool } from "pg"; import { documentsGraph } from "./graphs/documents"; import { organizationGraph } from "./graphs/organization"; const pool = new Pool({ connectionString: process.env.DATABASE_URL }); const db = drizzle(pool); const backend = createPostgresBackend(db); // Same backend, different stores export const documentsStore = createStore(documentsGraph, backend); export const organizationStore = createStore(organizationGraph, backend); ``` ### Using the Stores Each store is fully independent with its own typed API: ```typescript // Semantic search in documents async function searchDocuments(query: string, embedding: number[]) { return documentsStore .query() .from("Chunk", "c") .whereNode("c", (c) => c.embedding.similarTo(embedding, 10)) .select((ctx) => ({ content: ctx.c.content, position: ctx.c.position, })) .execute(); } // Business queries in organization async function getActiveSuppliers(category: string) { return organizationStore .query() .from("Supplier", "s") .whereNode("s", (s) => s.category.eq(category)) .traverse("hasContract", "e") .to("Contract", "c") .whereNode("c", (c) => c.status.eq("active")) .select((ctx) => ({ supplier: ctx.s.name, contract: ctx.c.title, value: ctx.c.value, })) .execute(); } ``` ## Coordinating Across Graphs Since cross-graph queries aren't supported, coordinate at the application level. 
### Shared Identifiers Use consistent IDs when entities relate across graphs: ```typescript // When ingesting a supplier's documents, use the supplier ID as a reference async function ingestSupplierDocument( supplierId: string, title: string, content: string, embedding: number[] ) { // Store document with supplier reference in metadata const doc = await documentsStore.nodes.Document.create({ title, source: `supplier:${supplierId}`, createdAt: new Date().toISOString(), }); const chunk = await documentsStore.nodes.Chunk.create({ content, embedding, position: 0, }); await documentsStore.edges.hasChunk.create(doc, chunk, {}); return doc; } // Later, find documents for a supplier async function getSupplierDocuments(supplierId: string) { return documentsStore .query() .from("Document", "d") .whereNode("d", (d) => d.source.eq(`supplier:${supplierId}`)) .select((ctx) => ctx.d) .execute(); } ``` ### Application-Level Joins Combine results from multiple graphs in your application: ```typescript interface SupplierWithDocuments { supplier: { name: string; category: string }; documents: Array<{ title: string }>; } async function getSupplierOverview( supplierId: string ): Promise<SupplierWithDocuments> { // Parallel queries to both graphs const [supplier, documents] = await Promise.all([ organizationStore.nodes.Supplier.getById(supplierId), getSupplierDocuments(supplierId), ]); return { supplier: { name: supplier.name, category: supplier.category, }, documents: documents.map((d) => ({ title: d.title })), }; } ``` ### Event-Driven Sync For loose coupling, use events to keep graphs in sync: ```typescript // When a supplier is created, set up document ingestion eventBus.on("supplier.created", async (event) => { const { supplierId, name } = event.payload; // Create a placeholder document node for future ingestion await documentsStore.nodes.Document.create({ title: `${name} - Supplier Profile`, source: `supplier:${supplierId}`, createdAt: new Date().toISOString(), }); }); // When a supplier is deleted, 
clean up related documents eventBus.on("supplier.deleted", async (event) => { const { supplierId } = event.payload; const docs = await documentsStore .query() .from("Document", "d") .whereNode("d", (d) => d.source.eq(`supplier:${supplierId}`)) .select((ctx) => ctx.d.id) .execute(); for (const docId of docs) { await documentsStore.nodes.Document.delete(docId); } }); ``` ## Separate Backends For stronger isolation, use separate database connections: ```typescript // Documents in PostgreSQL with pgvector for embeddings const documentsPool = new Pool({ connectionString: process.env.DOCUMENTS_DATABASE_URL, }); const documentsBackend = createPostgresBackend(drizzle(documentsPool)); export const documentsStore = createStore(documentsGraph, documentsBackend); // Organization data in a separate database const orgPool = new Pool({ connectionString: process.env.ORG_DATABASE_URL, }); const orgBackend = createPostgresBackend(drizzle(orgPool)); export const organizationStore = createStore(organizationGraph, orgBackend); ``` **When to separate backends:** - Different performance profiles (vector search vs. relational queries) - Compliance requirements (PII in one database, analytics in another) - Independent scaling needs - Different backup/retention policies ## Schema Management Each graph has independent schema versioning: ```typescript import { createStoreWithSchema } from "@nicia-ai/typegraph"; // Each graph tracks its own schema version const [documentsStore, docsSchemaResult] = await createStoreWithSchema( documentsGraph, backend ); const [orgStore, orgSchemaResult] = await createStoreWithSchema( organizationGraph, backend ); // Check migration status independently if (docsSchemaResult.status === "migrated") { console.log("Documents schema was migrated"); } if (orgSchemaResult.status === "migrated") { console.log("Organization schema was migrated"); } ``` ## Caveats **No cross-graph queries**: You cannot traverse from a node in one graph to a node in another. 
If you need this, consider: - Merging the graphs into one with clear ontology separation - Using application-level joins as shown above **Separate ontology closures**: Each graph computes its own `subClassOf`, `implies`, etc. closures. Ontology relations don't span graphs. **Independent transactions**: A transaction in one store doesn't include the other. For cross-graph consistency, use sagas or eventual consistency patterns. **Shared tables**: When using the same backend, both graphs write to the same `typegraph_nodes` and `typegraph_edges` tables, differentiated by `graph_id`. This is fine for most cases but means a database-level issue affects both graphs. ## Next Steps - [Multi-Tenant SaaS](./examples/multi-tenant) - Isolating data by tenant within a single graph - [Schema Migrations](./schema-management) - Versioning and migrations - [Integration Patterns](./integration) - More deployment strategies # Ontology & Reasoning > Semantic relationships, type hierarchies, and inference ## When Do You Need an Ontology? An ontology captures **meaning** about your data—relationships that exist at the type level, not just instance level. You need ontology when: - **Type hierarchies**: "A Podcast is a type of Media" (query for Media, get Podcasts too) - **Concept relationships**: "Machine Learning is narrower than AI" (topic navigation) - **Constraints**: "A Person cannot also be an Organization" (prevent invalid data) - **Edge implications**: "If Alice is married to Bob, she also knows Bob" (inferred relationships) - **Bidirectional queries**: "manages and managedBy are inverses" (traverse in either direction) Without ontology, you'd implement these manually—if statements scattered throughout your code, hand-rolled validation, duplicate queries. Ontology centralizes this logic in your schema. 
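To make "implement these manually" concrete, here is a minimal sketch of the hand-rolled subclass bookkeeping an ontology replaces. It is plain TypeScript with illustrative names (`subClassPairs`, `expandSubClasses`), not TypeGraph APIs.

```typescript
// Hand-rolled subClassOf bookkeeping: the kind of scattered logic that an
// ontology centralizes in the schema. Names are illustrative only.
const subClassPairs: Array<[child: string, parent: string]> = [
  ["Podcast", "Media"],
  ["Article", "Media"],
  ["AudioDrama", "Podcast"],
];

// Transitive closure: a kind plus every kind that directly or indirectly
// declares it as a parent.
function expandSubClasses(kind: string): string[] {
  const result = new Set<string>([kind]);
  let changed = true;
  while (changed) {
    changed = false;
    for (const [child, parent] of subClassPairs) {
      if (result.has(parent) && !result.has(child)) {
        result.add(child);
        changed = true;
      }
    }
  }
  return [...result];
}
```

Every query that should "also match subclasses" would need to call something like this by hand; an ontology declares the relationship once and applies it consistently.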
## How It Works TypeGraph treats semantic relationships between types as **meta-edges**—edges at the type level rather than instance level: ```typescript // Instance edges: relationships between INSTANCES // "Alice knows Bob" const knows = defineEdge("knows"); // Meta-edges: relationships between TYPES // "Employee subClassOf Person" subClassOf(Employee, Person); ``` When you define an ontology, TypeGraph: 1. **Precomputes closures** at store initialization (not query time) 2. **Expands queries** automatically based on relationships 3. **Enforces constraints** when creating nodes and edges ## Core Meta-Edges TypeGraph provides a standard set of meta-edges: ```typescript import { subClassOf, broader, narrower, equivalentTo, sameAs, differentFrom, disjointWith, partOf, hasPart, relatedTo, inverseOf, implies } from "@nicia-ai/typegraph"; ``` ### Subsumption (Type Inheritance) **`subClassOf`**: Defines type inheritance where instances of the child are also instances of the parent. ```typescript subClassOf(Podcast, Media); subClassOf(Article, Media); subClassOf(Company, Organization); ``` **Query Behavior:** Subclass expansion is **opt-in** via `includeSubClasses: true`: ```typescript // Without expansion: returns only nodes with kind="Media" const mediaOnly = await store .query() .from("Media", "m") .select((ctx) => ctx.m) .execute(); // With expansion: returns Media, Podcast, AND Article nodes const allMedia = await store .query() .from("Media", "m", { includeSubClasses: true }) .select((ctx) => ctx.m) .execute(); // Results include nodes of kind "Media", "Podcast", and "Article" ``` This is a fundamental difference from traditional ORM inheritance—TypeGraph stores the concrete type (`kind: "Podcast"`) in the database, and expands at query time when requested. ### Hierarchical (Concept Hierarchy) **`broader`** and **`narrower`**: Define conceptual hierarchy without identity. 
```typescript broader(MachineLearning, ArtificialIntelligence); broader(DeepLearning, MachineLearning); broader(ArtificialIntelligence, Technology); ``` **Important**: This is different from `subClassOf`. A topic instance of "ML" is related to "AI", but is **not** an instance of "AI". ```typescript // Get all topics narrower than Technology const narrowerTopics = registry.expandNarrower("Technology"); // ["ArtificialIntelligence", "MachineLearning", "DeepLearning", ...] ``` ### Equivalence **`equivalentTo`**: Defines semantic equivalence between types or external IRIs. ```typescript equivalentTo(Person, "https://schema.org/Person"); equivalentTo(Organization, "https://schema.org/Organization"); ``` **`sameAs`**: Declares identity between individuals (for deduplication). **`differentFrom`**: Explicitly asserts non-identity. ### Constraints **`disjointWith`**: Declares that two types cannot share the same ID. ```typescript disjointWith(Person, Organization); disjointWith(Podcast, Article); ``` **Effect**: Attempting to create a node that violates disjointness throws `DisjointError`: ```typescript // Create a Person with ID "entity-1" await store.nodes.Person.create({ name: "Alice" }, { id: "entity-1" }); // Throws DisjointError: Person and Organization are disjoint await store.nodes.Organization.create({ name: "Acme" }, { id: "entity-1" }); ``` ### Composition **`partOf`** and **`hasPart`**: Define compositional relationships. ```typescript partOf(Chapter, Book); hasPart(Book, Chapter); partOf(Episode, Podcast); hasPart(Podcast, Episode); ``` ### Edge Relationships **`inverseOf`**: Declares two edge kinds as inverses of each other. 
```typescript inverseOf(manages, managedBy); inverseOf(cites, citedBy); inverseOf(follows, followedBy); ``` **Effect**: You can query in either direction using the registry: ```typescript const inverse = registry.getInverseEdge("manages"); // "managedBy" ``` You can also expand traversals to include inverse edge kinds at query time: ```typescript const relationships = await store .query() .from("Person", "p") .traverse("manages", "e", { expand: "inverse" }) .to("Person", "other") .select((ctx) => ({ other: ctx.other.name, via: ctx.e.kind, })) .execute(); ``` For symmetric relationships, declare an edge as its own inverse: ```typescript inverseOf(sameAs, sameAs); ``` **`implies`**: Declares that one edge kind implies another exists. ```typescript implies(marriedTo, knows); implies(bestFriends, friends); implies(friends, knows); ``` **Effect**: Query for `knows` can include `marriedTo`, `bestFriends`, and `friends` edges: ```typescript const connections = await store .query() .from("Person", "p") .traverse("knows", "e", { expand: "implying" }) .to("Person", "other") .select((ctx) => ctx.other) .execute(); ``` ## Using the Ontology ### In Graph Definition ```typescript const graph = defineGraph({ id: "knowledge_base", nodes: { ... }, edges: { ... 
}, ontology: [ // Type hierarchy subClassOf(Podcast, Media), subClassOf(Article, Media), subClassOf(Company, Organization), // Concept hierarchy broader(MachineLearning, ArtificialIntelligence), broader(DeepLearning, MachineLearning), // Constraints disjointWith(Person, Organization), disjointWith(Media, Person), // Composition partOf(Episode, Podcast), // Edge relationships inverseOf(cites, citedBy), implies(marriedTo, knows), ], }); ``` ### Registry Lookups The type registry (accessed via `store.registry`) provides methods to query the ontology: ```typescript const registry = store.registry; // Subsumption registry.isSubClassOf("Podcast", "Media"); // true registry.expandSubClasses("Media"); // ["Media", "Podcast", "Article"] // Hierarchy registry.expandNarrower("Technology"); // ["AI", "ML", "DL", ...] registry.expandBroader("DeepLearning"); // ["ML", "AI", "Technology"] // Constraints registry.areDisjoint("Person", "Organization"); // true registry.getDisjointKinds("Person"); // ["Organization", "Media", ...] 
// Edge relationships registry.getInverseEdge("cites"); // "citedBy" registry.getImpliedEdges("marriedTo"); // ["knows"] registry.getImplyingEdges("knows"); // ["marriedTo", "bestFriends", "friends"] ``` ## Custom Meta-Edges Define domain-specific meta-edges: ```typescript import { metaEdge } from "@nicia-ai/typegraph"; // Custom meta-edge for prerequisite relationships const prerequisiteOf = metaEdge("prerequisiteOf", { transitive: true, inference: "hierarchy", description: "Learning prerequisite (Calculus prerequisiteOf LinearAlgebra)", }); // Custom meta-edge for superseding relationships const supersedes = metaEdge("supersedes", { transitive: true, inference: "substitution", description: "Replacement relationship (v2 supersedes v1)", }); ``` ### Meta-Edge Properties Each meta-edge can be configured with these properties to control how TypeGraph computes closures and expands queries: | Property | Type | Description | | ------------ | --------------- | ------------------------- | | `transitive` | `boolean` | A→B, B→C implies A→C | | `symmetric` | `boolean` | A→B implies B→A | | `reflexive` | `boolean` | A→A is always true | | `inverse` | `string` | Name of inverse meta-edge | | `inference` | `InferenceType` | How this affects queries | ### Inference Types The `inference` property determines how the meta-edge affects query behavior: | Type | Description | | ---------------- | -------------------------------------------- | | `"subsumption"` | Query for X includes instances of subclasses | | `"hierarchy"` | Enables broader/narrower traversal | | `"substitution"` | Can substitute equivalent types | | `"constraint"` | Validation rules | | `"composition"` | Part-whole navigation | | `"association"` | Discovery/recommendation | | `"none"` | No automatic inference | ## Closure Computation TypeGraph precomputes transitive closures at store initialization: ```typescript // subClassOf closure // If: Podcast subClassOf Media, Episode subClassOf Media // Then: 
expandSubClasses("Media") = ["Media", "Podcast", "Episode"] // implies closure // If: marriedTo implies partneredWith, partneredWith implies knows // Then: getImpliedEdges("marriedTo") = ["partneredWith", "knows"] ``` This makes queries efficient—expansion happens at query compilation time, not execution time. ## Best Practices ### Separate `subClassOf` from `broader` These have different semantics: - `subClassOf`: Instance identity (a Podcast **is** a Media) - `broader`: Conceptual relation (ML **relates to** AI, but ML instance ≠ AI instance) ```typescript // CORRECT: Type hierarchy subClassOf(Podcast, Media); // CORRECT: Concept hierarchy broader(MachineLearning, ArtificialIntelligence); // WRONG: Don't mix them // subClassOf(MachineLearning, ArtificialIntelligence); ``` ### Use Disjoint Constraints Prevent impossible combinations: ```typescript // Good: Prevent ID conflicts disjointWith(Person, Organization); disjointWith(Person, Product); disjointWith(Organization, Product); ``` ### Model Edge Hierarchies with Implies ```typescript // Relationship hierarchy: specific → general implies(marriedTo, partneredWith); implies(partneredWith, knows); implies(parentOf, relatedTo); implies(siblingOf, relatedTo); implies(relatedTo, knows); ``` ### Use InverseOf for Bidirectional Queries ```typescript inverseOf(manages, managedBy); inverseOf(follows, followedBy); inverseOf(cites, citedBy); ``` This lets you query efficiently in either direction without duplicating edges. ## API Reference ### Ontology Functions #### `subClassOf(child, parent)` Declares type inheritance. ```typescript function subClassOf(child: NodeType, parent: NodeType): OntologyRelation; ``` #### `broader(narrower, broader)` Declares hierarchical relationship (narrower concept to broader concept). ```typescript function broader(narrower: NodeType, broader: NodeType): OntologyRelation; ``` #### `narrower(broader, narrower)` Declares hierarchical relationship (broader concept to narrower concept). 
```typescript function narrower(broader: NodeType, narrower: NodeType): OntologyRelation; ``` #### `equivalentTo(a, b)` Declares semantic equivalence between types or with external IRIs. ```typescript function equivalentTo( a: NodeType | string, b: NodeType | string ): OntologyRelation; ``` #### `sameAs(a, b)` Declares identity between individuals. ```typescript function sameAs(a: NodeType, b: NodeType): OntologyRelation; ``` #### `differentFrom(a, b)` Declares non-identity. ```typescript function differentFrom(a: NodeType, b: NodeType): OntologyRelation; ``` #### `disjointWith(a, b)` Declares mutual exclusion (types cannot share the same ID). ```typescript function disjointWith(a: NodeType, b: NodeType): OntologyRelation; ``` #### `partOf(part, whole)` Declares compositional relationship (part to whole). ```typescript function partOf(part: NodeType, whole: NodeType): OntologyRelation; ``` #### `hasPart(whole, part)` Declares compositional relationship (whole to part). ```typescript function hasPart(whole: NodeType, part: NodeType): OntologyRelation; ``` #### `relatedTo(a, b)` Declares association between types. ```typescript function relatedTo(a: NodeType, b: NodeType): OntologyRelation; ``` #### `inverseOf(edgeA, edgeB)` Declares edge types as inverses of each other. ```typescript function inverseOf(edgeA: EdgeType, edgeB: EdgeType): OntologyRelation; ``` #### `implies(edgeA, edgeB)` Declares that one edge type implies another exists. ```typescript function implies(edgeA: EdgeType, edgeB: EdgeType): OntologyRelation; ``` #### `metaEdge(name, options?)` Creates a custom meta-edge for domain-specific relationships. ```typescript function metaEdge( name: string, options?: { transitive?: boolean; symmetric?: boolean; reflexive?: boolean; inverse?: string; inference?: InferenceType; description?: string; }, ): MetaEdge; ``` ### Type Registry API The type registry is available via `store.registry` and provides methods to query the ontology at runtime. 
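The expansion methods below are backed by the precomputed closures described in the Closure Computation section. As a plain-TypeScript sketch (a simplified illustration, not TypeGraph's implementation), computing an expansion list is a breadth-first walk over the direct `subClassOf` pairs. The `AudioDrama` kind here is hypothetical sample data added to show a transitive step:

```typescript
// Direct relations as (child, parent) pairs; AudioDrama is hypothetical.
const subClassOfPairs: ReadonlyArray<readonly [string, string]> = [
  ["Podcast", "Media"],
  ["Article", "Media"],
  ["AudioDrama", "Podcast"], // nested: closure must reach Media -> AudioDrama
];

// The type itself plus every direct or transitive subclass, breadth-first.
function expandSubClasses(kind: string): string[] {
  const result = [kind];
  const queue = [kind];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const [child, parent] of subClassOfPairs) {
      if (parent === current && !result.includes(child)) {
        result.push(child);
        queue.push(child);
      }
    }
  }
  return result;
}

expandSubClasses("Media"); // ["Media", "Podcast", "Article", "AudioDrama"]
```

Because this walk runs once at store initialization, query compilation can substitute the finished list directly.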
#### `isSubClassOf(child, parent)` Checks if a type is a subclass of another. ```typescript registry.isSubClassOf(child: string, parent: string): boolean; registry.isSubClassOf("Podcast", "Media"); // true ``` #### `expandSubClasses(type)` Returns a type and all its subclasses. ```typescript registry.expandSubClasses(type: string): readonly string[]; registry.expandSubClasses("Media"); // ["Media", "Podcast", "Article"] ``` #### `areDisjoint(a, b)` Checks if two types are disjoint. ```typescript registry.areDisjoint(a: string, b: string): boolean; registry.areDisjoint("Person", "Organization"); // true ``` #### `getDisjointKinds(type)` Returns all types disjoint with the given type. ```typescript registry.getDisjointKinds(type: string): readonly string[]; registry.getDisjointKinds("Person"); // ["Organization", "Media", ...] ``` #### `expandNarrower(type)` Returns all types narrower than the given type (via `broader` relationships). ```typescript registry.expandNarrower(type: string): readonly string[]; registry.expandNarrower("Technology"); // ["AI", "ML", "DeepLearning", ...] ``` #### `expandBroader(type)` Returns all types broader than the given type. ```typescript registry.expandBroader(type: string): readonly string[]; registry.expandBroader("DeepLearning"); // ["MachineLearning", "AI", "Technology"] ``` #### `getInverseEdge(edgeType)` Returns the inverse of an edge type. ```typescript registry.getInverseEdge(edgeType: string): string | undefined; registry.getInverseEdge("manages"); // "managedBy" ``` #### `getImpliedEdges(edgeType)` Returns edges implied by an edge type. ```typescript registry.getImpliedEdges(edgeType: string): readonly string[]; registry.getImpliedEdges("marriedTo"); // ["knows"] ``` #### `getImplyingEdges(edgeType)` Returns edges that imply an edge type. 
```typescript registry.getImplyingEdges(edgeType: string): readonly string[]; registry.getImplyingEdges("knows"); // ["marriedTo", "bestFriends", "friends"] ``` #### `expandImplyingEdges(edgeType)` Returns an edge type and all edges that imply it. ```typescript registry.expandImplyingEdges(edgeType: string): readonly string[]; registry.expandImplyingEdges("knows"); // ["knows", "marriedTo", "bestFriends", "friends"] ``` # Indexes > Define and create indexes for TypeGraph queries TypeGraph stores node and edge properties in a JSON `props` column. When you filter or order by JSON properties at scale, you typically need **expression indexes** on those JSON paths. TypeGraph includes built-in indexes for common access patterns (lookups by ID, edge traversals, temporal filtering), but application-specific indexes are up to you. The `@nicia-ai/typegraph/indexes` entrypoint provides: - **Type-safe index definitions** for node and edge schemas - **Dialect-specific DDL generation** for PostgreSQL and SQLite - **Drizzle schema integration** so drizzle-kit can generate migrations - **Profiler integration** so recommendations account for indexes you already have ## Quick Start (Drizzle / drizzle-kit) Define your indexes once and pass them into the Drizzle schema factories: ```ts import { defineEdge, defineNode } from "@nicia-ai/typegraph"; import { createPostgresTables } from "@nicia-ai/typegraph/postgres"; import { andWhere, defineEdgeIndex, defineNodeIndex } from "@nicia-ai/typegraph/indexes"; import { z } from "zod"; const Person = defineNode("Person", { schema: z.object({ email: z.string().email(), name: z.string(), isActive: z.boolean().optional(), }), }); const worksAt = defineEdge("worksAt", { schema: z.object({ role: z.string(), }), }); export const personEmail = defineNodeIndex(Person, { fields: ["email"], unique: true, coveringFields: ["name"], where: (w) => andWhere(w.deletedAt.isNull(), w.isActive.eq(true)), }); export const worksAtRoleOut = defineEdgeIndex(worksAt, 
{ fields: ["role"], direction: "out", where: (w) => w.deletedAt.isNull(), }); // drizzle-kit will include these indexes in generated migrations export const typegraphTables = createPostgresTables({}, { indexes: [personEmail, worksAtRoleOut], }); ``` For SQLite, use `createSqliteTables`: ```ts import { createSqliteTables } from "@nicia-ai/typegraph/sqlite"; export const typegraphTables = createSqliteTables({}, { indexes: [personEmail, worksAtRoleOut], }); ``` ## Node Indexes `defineNodeIndex(nodeType, config)` creates an index definition for node properties. **Key options:** - `fields`: JSON property paths used for filtering/ordering (B-tree expression keys). - `coveringFields`: additional properties frequently selected with the same filters. These become additional index keys to enable index-only reads when combined with smart select. - `unique`: create a unique index. - `scope`: prefixes index keys with TypeGraph system columns (default is `"graphAndKind"`). - `where`: partial index predicate (portable DSL, compiled per dialect). :::note[Covering fields vs PostgreSQL INCLUDE] TypeGraph properties live inside `props`, so indexes are built on expressions. PostgreSQL `INCLUDE` does not support expressions, so `coveringFields` are implemented as additional index keys rather than an `INCLUDE (...)` clause. 
Because `coveringFields` become index keys, they: - Increase index size (more key data) - Affect index ordering (can help `ORDER BY`, but changes sort/range behavior) - Must be maintained on writes like any other key ::: ### Nested JSON Paths For top-level properties, use the field name: ```ts defineNodeIndex(Person, { fields: ["email"] }); ``` For nested properties inside `props`, use a JSON pointer: ```ts defineNodeIndex(Person, { fields: ["/metadata/priority"] }); ``` You can also pass pointer segments: ```ts defineNodeIndex(Person, { fields: [["metadata", "priority"] as const] }); ``` ### Index Scope Index `scope` controls which TypeGraph system columns are prefixed ahead of your JSON keys: - `"graphAndKind"` (default): prefixes with `(graph_id, kind)` to match most TypeGraph queries. - `"graph"`: prefixes with `graph_id` only (rare; useful for cross-kind queries within a graph). - `"none"`: no system prefix (rare; usually only correct for global queries). ## Edge Indexes `defineEdgeIndex(edgeType, config)` works the same way as node indexes, with one extra option: - `direction`: `"out" | "in" | "none"` (default `"none"`). When set, the index keys are prefixed with the join key used by traversal queries (`from_id` for `"out"`, `to_id` for `"in"`). This makes it easy to create indexes that match `.traverse()` patterns. **When to use `direction`:** - `"out"`: optimize outbound traversals that join on `from_id` (start node → edges). - `"in"`: optimize inbound traversals that join on `to_id` (end node → edges). - `"none"`: for edge queries not anchored by a traversal join key (less common). ## Partial Indexes (WHERE) Use `where` to create partial indexes with a small, typed predicate DSL. System columns are available (e.g. `deletedAt`, `createdAt`, `fromId`), as well as your schema properties (e.g. `email`, `role`). 
```ts import { andWhere, defineNodeIndex } from "@nicia-ai/typegraph/indexes"; const activeEmail = defineNodeIndex(Person, { fields: ["email"], where: (w) => andWhere(w.deletedAt.isNull(), w.isActive.eq(true)), }); ``` ## Covering Indexes To maximize the benefit of [smart select optimization](/performance/overview#smart-select), create indexes that include both the filter columns and selected columns. This enables index-only scans where the database satisfies the entire query from the index. ```ts // Index covers email filter AND name selection const personEmailWithName = defineNodeIndex(Person, { fields: ["email"], coveringFields: ["name"], where: (w) => w.deletedAt.isNull(), }); ``` **Generated PostgreSQL:** ```sql CREATE INDEX idx_person_email_name ON typegraph_nodes (graph_id, kind, ((props #>> ARRAY['email'])), ((props #>> ARRAY['name']))) WHERE deleted_at IS NULL; ``` **Generated SQLite:** ```sql CREATE INDEX idx_person_email_name ON typegraph_nodes (graph_id, kind, json_extract(props, '$.email'), json_extract(props, '$.name')) WHERE deleted_at IS NULL; ``` ## Choosing the Right Index Type TypeGraph's `defineNodeIndex` / `defineEdgeIndex` generate **B-tree expression indexes** — the right choice for scalar equality, range, and ordering queries. But JSON properties can also hold arrays and objects, which need different index strategies. | Data shape | Query pattern | Index type | TypeGraph utility? | |------------|--------------|------------|-------------------| | Scalar (`string`, `number`, `boolean`) | `eq()`, `gt()`, `in()`, `orderBy()` | B-tree expression | Yes — `defineNodeIndex` | | Array of scalars | `contains()`, `containsAll()`, `containsAny()` | GIN (PostgreSQL) | No — use raw SQL | | Nested object | `hasKey()`, `pathEquals()`, `pathContains()` | GIN or B-tree expression | Partially — B-tree on specific paths | ### B-tree expression indexes (scalar properties) Best for equality, range, sorting, and prefix matching on individual JSON fields. 
This is what `defineNodeIndex` and `defineEdgeIndex` generate. ```ts // Good for: .whereNode("p", (p) => p.email.eq("...")) defineNodeIndex(Person, { fields: ["email"] }); // Good for: .orderBy("p", "createdScore", "desc") defineNodeIndex(Person, { fields: ["createdScore"] }); ``` ### GIN indexes (array and object containment — PostgreSQL only) TypeGraph compiles array predicates to PostgreSQL's JSONB containment operator (`@>`), which is optimized by GIN indexes. If you filter on array properties, create a GIN index on the `props` column: ```sql -- Covers ALL array containment queries on Person nodes CREATE INDEX idx_person_props_gin ON typegraph_nodes USING GIN (props) WHERE graph_id = 'my_graph' AND kind = 'Person' AND deleted_at IS NULL; ``` This single GIN index accelerates all of the following predicates: ```typescript // contains: does the tags array include "typescript"? .whereNode("p", (p) => p.tags.contains("typescript")) // containsAll: does it include BOTH "typescript" AND "graphql"? .whereNode("p", (p) => p.tags.containsAll(["typescript", "graphql"])) // containsAny: does it include "typescript" OR "graphql"? .whereNode("p", (p) => p.tags.containsAny(["typescript", "graphql"])) ``` Unlike B-tree expression indexes, a single GIN index covers queries on **any** JSON path within `props` — you don't need a separate index per field. :::note[SQLite] SQLite has no GIN equivalent. Array containment queries on SQLite use `json_each()` scans, which can't be indexed. If array filtering is performance-critical, consider PostgreSQL. 
::: ### Combining B-tree and GIN For tables where you filter on both scalar fields (equality, range) and array fields (containment), use both index types: ```sql -- B-tree for scalar equality + ordering CREATE INDEX idx_person_email ON typegraph_nodes (graph_id, kind, ((props #>> ARRAY['email']))) WHERE deleted_at IS NULL; -- GIN for array containment CREATE INDEX idx_person_props_gin ON typegraph_nodes USING GIN (props) WHERE graph_id = 'my_graph' AND kind = 'Person' AND deleted_at IS NULL; ``` PostgreSQL's query planner can use both indexes together via a BitmapAnd scan when a query filters on both a scalar field and an array field. ## Generating SQL (No drizzle-kit) If you manage migrations yourself, generate DDL snippets: ```ts import { generateIndexDDL } from "@nicia-ai/typegraph/indexes"; const sql = generateIndexDDL(personEmail, "postgres"); // → CREATE INDEX ...; ``` ## Verifying Index Usage Use `EXPLAIN ANALYZE` to verify your indexes are being used: ```sql -- PostgreSQL EXPLAIN ANALYZE SELECT props #>> ARRAY['email'], props #>> ARRAY['name'] FROM typegraph_nodes WHERE graph_id = 'my_graph' AND kind = 'Person' AND deleted_at IS NULL AND (props #>> ARRAY['email']) = 'alice@example.com'; -- SQLite EXPLAIN QUERY PLAN SELECT json_extract(props, '$.email'), json_extract(props, '$.name') FROM typegraph_nodes WHERE graph_id = 'my_graph' AND kind = 'Person' AND deleted_at IS NULL AND json_extract(props, '$.email') = 'alice@example.com'; ``` Look for "Index Scan" or "Index Only Scan" (PostgreSQL) or "USING INDEX" (SQLite) in the output. 
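If you automate this check (for example in a regression test), a small helper can scan the plan rows for your index name. A minimal sketch; the plan text below is illustrative sample data, not captured database output:

```typescript
// A plan row as returned by EXPLAIN QUERY PLAN (SQLite) or EXPLAIN
// (PostgreSQL), reduced to the text we care about.
type PlanRow = { detail: string };

// True when any plan row mentions the given index name.
function usesIndex(plan: readonly PlanRow[], indexName: string): boolean {
  return plan.some((row) => row.detail.includes(indexName));
}

// Hypothetical SQLite plan output for the email lookup above.
const plan: PlanRow[] = [
  { detail: "SEARCH typegraph_nodes USING INDEX idx_person_email_name (<expr>=?)" },
];

usesIndex(plan, "idx_person_email_name"); // true
usesIndex(plan, "idx_missing"); // false
```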
## Profiler Integration Pass your existing indexes to the [Query Profiler](/performance/profiler) so recommendations focus on what you *don't* have: ```ts import { QueryProfiler } from "@nicia-ai/typegraph/profiler"; import { toDeclaredIndexes } from "@nicia-ai/typegraph/indexes"; const profiler = new QueryProfiler({ declaredIndexes: toDeclaredIndexes([personEmail, worksAtRoleOut]), }); ``` ## Limitations - `defineNodeIndex` / `defineEdgeIndex` generate B-tree expression indexes for **scalar** properties (`string`, `number`, `boolean`, `Date`). For array containment queries, create [GIN indexes](#gin-indexes-array-and-object-containment--postgresql-only) manually. - GIN indexes are PostgreSQL-only. SQLite has no equivalent for JSON containment acceleration. - Embedding fields are indexed via the embeddings table; see [Semantic Search](/semantic-search). ## Next Steps - [Performance Overview](/performance/overview) — Best practices, N+1 prevention, batch patterns - [Query Profiler](/performance/profiler) — Automatic index recommendations # Performance Overview > Understanding the performance characteristics of TypeGraph TypeGraph is designed to be a high-performance, low-overhead layer on top of your relational database. By leveraging the power of modern SQL engines (SQLite and PostgreSQL) and precomputing complex relationships, TypeGraph ensures that your knowledge graph scales with your application. ## Performance Philosophy 1. **One Query, One Statement**: Every query — including multi-hop traversals — compiles to a single SQL statement. No N+1 queries by design. 2. **Precomputed Ontology**: Transitive closures, subclass hierarchies, and edge implications are computed once at schema initialization, not during every query. 3. **Batching & Transactions**: Bulk collection APIs and transactions minimize round-trips for writes. 4. **Zero-Cost Abstractions**: Type safety and ontological reasoning add no measurable runtime overhead. 
## N+1 Prevention A common performance problem in ORMs is the N+1 query: you fetch N entities, then issue one query per entity to load related data. TypeGraph eliminates this structurally. Every query — regardless of how many traversals it chains — compiles to a **single SQL statement** using Common Table Expressions (CTEs). Each traversal step becomes a CTE that joins against the previous one: ```typescript // This compiles to ONE SQL statement, not 3 separate queries const results = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("worksAt", "employment") .to("Company", "c") .traverse("locatedIn", "location") .to("City", "city") .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, city: ctx.city.name, })) .execute(); ``` The generated SQL looks like: ```sql WITH cte_p AS ( SELECT ... FROM typegraph_nodes WHERE graph_id = ? AND kind IN ('Person') AND ... ), cte_employment AS ( SELECT ... FROM typegraph_edges e JOIN typegraph_nodes n ON ... WHERE e.graph_id = ? AND ... ), cte_location AS ( SELECT ... FROM typegraph_edges e JOIN typegraph_nodes n ON ... WHERE e.graph_id = ? AND ... ) SELECT ... FROM cte_p JOIN cte_employment ON ... JOIN cte_location ON ... ``` This holds for all query types: - Multi-hop traversals (N CTEs, 1 statement) - [Recursive traversals](/queries/recursive) (WITH RECURSIVE, 1 statement) - Aggregations with traversals (CTEs + GROUP BY, 1 statement) - [Set operations](/queries/combine) (UNION/INTERSECT/EXCEPT of CTEs, 1 statement) There is no dataloader or batching layer because there is nothing to batch — the database handles the entire join graph in a single execution. ## Batch Write Patterns ### Single vs bulk operations For small numbers of writes, individual `create()` calls inside a transaction are fine. For larger volumes, use the bulk collection APIs — they use multi-row INSERTs and handle parameter chunking internally. 
| Method | Returns results | Use case | |--------|----------------|----------| | `bulkCreate(items)` | Yes | Need created nodes back | | `bulkInsert(items)` | No | Maximum throughput ingestion | | `bulkUpsertById(items)` | Yes | Idempotent import (create or update by ID) | | `bulkDelete(ids)` | No | Mass soft-delete | ### PostgreSQL parameter limits PostgreSQL has a 65,535 bind parameter limit per statement. TypeGraph automatically chunks bulk operations to stay within this limit: - Node inserts: ~7,200 per chunk (9 params per node) - Edge inserts: ~5,400 per chunk (12 params per edge) You don't need to chunk manually — pass arrays of any size and TypeGraph handles the rest. ### Transaction wrapping Bulk operations are individually transactional (each chunk is atomic), but if you need the entire batch to be atomic, wrap it in a transaction: ```typescript // Atomic: all-or-nothing for the entire import await store.transaction(async (tx) => { await tx.nodes.Person.bulkCreate(people); await tx.nodes.Company.bulkCreate(companies); await tx.edges.worksAt.bulkCreate(employments); }); ``` Without the wrapping transaction, a failure partway through would leave partial data. ### Choosing the right pattern ```typescript // Small batch (< 100 items): individual creates in a transaction are fine await store.transaction(async (tx) => { for (const person of people) { await tx.nodes.Person.create(person); } }); // Medium batch (100–10,000 items): bulkCreate const created = await store.nodes.Person.bulkCreate(people); // Large batch (10,000+ items): bulkInsert (no result allocation) await store.nodes.Person.bulkInsert(people); // Idempotent import: bulkUpsertById (creates or updates by ID) await store.nodes.Person.bulkUpsertById(itemsWithIds); ``` ### Batch reads `getByIds()` on node and edge collections uses a single `SELECT ... WHERE id IN (...)` instead of N individual queries. Results are returned in input order with `undefined` for missing entries. 
```typescript const [alice, bob] = await store.nodes.Person.getByIds([aliceId, bobId]); ``` :::note[Operation hooks] Bulk operations (`bulkCreate`, `bulkInsert`, `bulkUpsertById`) skip per-item operation hooks for throughput. Query hooks still fire normally. See [Schemas & Stores](/schemas-stores#hooks) for details. ::: ## Connection Management TypeGraph does not manage database connections or pools — you bring your own and are responsible for lifecycle. See [Backend Setup](/backend-setup) for full setup guides. ### PostgreSQL pooling Always use a connection pool in production. TypeGraph issues one SQL statement per query, so pool utilization is straightforward — no long-held connections or multi-statement conversations. ```typescript import { Pool } from "pg"; const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20, // Size based on your concurrency needs idleTimeoutMillis: 30_000, connectionTimeoutMillis: 2_000, }); pool.on("error", (err) => { console.error("Unexpected pool error", err); }); ``` **Sizing guidance:** Each concurrent query uses one connection for the duration of that single SQL statement. A pool of 10–20 connections handles most workloads. If you're running bulk imports in parallel, size up accordingly. ### SQLite concurrency SQLite is single-writer. For best throughput: - Enable WAL mode: `sqlite.pragma("journal_mode = WAL")` — allows concurrent reads while writing - Batch writes in transactions rather than issuing many small commits - For read-heavy workloads, SQLite performs well without pooling since `better-sqlite3` is synchronous ### Transaction isolation PostgreSQL transactions accept an optional isolation level: ```typescript await store.transaction( async (tx) => { // Serializable isolation for strict consistency const snapshot = await tx.nodes.Account.getById(accountId); // ... 
}, { isolationLevel: "serializable" }, ); ``` Available levels: `read_uncommitted`, `read_committed` (default), `repeatable_read`, `serializable`. SQLite always operates at `serializable` isolation. ## Query Optimization Features ### Precomputed Closures When you define an ontology (e.g., `subClassOf`, `implies`), TypeGraph precomputes the full transitive closure at store initialization. Queries like `.from("Parent", "p", { includeSubClasses: true })` use a pre-calculated list of kinds rather than recursive lookups at runtime. ### Smart Select TypeGraph automatically optimizes queries based on which fields your `select()` callback accesses. When you select specific fields, TypeGraph generates SQL that only extracts those fields using `json_extract()` (SQLite) or JSONB path extraction (PostgreSQL), rather than fetching the entire `props` blob. ```typescript // Optimized: Only fetches email and name from the database const results = await store .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("alice@example.com")) .select((ctx) => ({ email: ctx.p.email, name: ctx.p.name, })) .execute(); // SQL: SELECT json_extract(props, '$.email'), json_extract(props, '$.name') ... ``` This optimization pairs well with [covering indexes](/performance/indexes#covering-indexes): if your index contains both the filter keys and the selected keys, the database can satisfy the query with an index-only scan. **When optimization applies:** | Pattern | Optimized? | Reason | |---------|-----------|--------| | `ctx => ({ email: ctx.p.email })` | Yes | Simple field extraction | | `ctx => [ctx.p.id, ctx.p.name]` | Yes | Multiple fields in array | | `ctx => ctx.p` | No | Whole node returned | | `ctx => ({ upper: ctx.p.email.toUpperCase() })` | Yes | Field extracted; method runs in JS | | `ctx => ({ ...ctx.p })` | No | Spread requires full node | The optimization is transparent — if your callback can't be optimized, TypeGraph automatically falls back to fetching the full node data. 
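One way to detect which fields a `select()` callback reads is to run it once against a recording Proxy and collect the property names it touches. This is a simplified sketch of the technique, not TypeGraph's implementation (which also handles method calls, spreads, and the full-node fallback):

```typescript
// Run a select-style callback against a Proxy that records property reads.
function collectAccessedFields<T>(
  callback: (node: Record<string, unknown>) => T,
): string[] {
  const accessed: string[] = [];
  const recorder = new Proxy({} as Record<string, unknown>, {
    get(_target, prop) {
      if (typeof prop === "string") accessed.push(prop);
      return undefined; // the value doesn't matter during planning
    },
  });
  callback(recorder);
  return accessed;
}

collectAccessedFields((p) => ({ email: p.email, name: p.name })); // ["email", "name"]
```

From the recorded field list, a query compiler can emit per-field `json_extract()` expressions instead of fetching the whole `props` blob.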
:::note[Select callback purity] Smart select applies to `.execute()`, `.paginate()`, and `.stream()`. The `select()` callback may be evaluated multiple times during planning/optimization, so it should be pure (no side effects). ::: :::note[Known limitations] Smart select is not currently applied to queries that include variable-length traversals (recursive CTEs), even when the select callback is otherwise optimizable. ::: ### Built-in Indexes The default TypeGraph schema includes optimized indexes for the most common access patterns: - **Graph + Kind + ID**: Primary key for node lookups - **Graph + From/To ID**: Optimized for edge traversals - **Temporal columns**: Indexes on `valid_from`, `valid_to`, and `deleted_at` For application-specific indexes on JSON properties, see [Indexes](/performance/indexes). ### Compilation Caching Each builder method (`.where()`, `.limit()`, `.orderBy()`, etc.) returns a new immutable instance. The compiled SQL for each instance is cached internally — repeated `.execute()` calls on the same builder skip recompilation entirely. This applies to standard queries, aggregate queries, and set-operation queries (`union`, `intersect`, `except`). This is transparent and requires no API changes. ```typescript const activeUsers = store .query() .from("User", "u") .whereNode("u", (u) => u.status.eq("active")) .select((ctx) => ctx.u); // First call: compiles AST → SQL → executes await activeUsers.execute(); // Second call: reuses cached SQL → executes await activeUsers.execute(); ``` ### Prepared Queries For hot paths that execute the same query shape with different values, `.prepare()` pre-compiles the entire query pipeline (AST build, SQL compilation, text extraction) once. Subsequent `.execute(bindings)` calls only substitute parameter values and execute. When `executeRaw` is available (both SQLite and PostgreSQL backends), the pre-compiled SQL text is sent directly to the driver — zero recompilation overhead. 
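The prepare-once, execute-many split can be illustrated with a generic sketch (plain TypeScript, not TypeGraph's `prepare()` API; real prepared statements bind parameters rather than interpolating strings):

```typescript
// Generic prepare-once/execute-many sketch (not TypeGraph's actual API).
// "Compilation" (splitting the template on placeholders) happens once.
function prepare(template: string) {
  const parts = template.split("?"); // the expensive step, done once
  // Each execution only interleaves values with the precompiled parts.
  // Unsafe string interpolation, fine for a sketch; real drivers bind values.
  return (params: readonly string[]): string =>
    parts.reduce(
      (sql, part, i) => sql + part + (i < params.length ? `'${params[i]}'` : ""),
      "",
    );
}

const findByEmail = prepare("SELECT name FROM people WHERE email = ?");
findByEmail(["alice@example.com"]); // "SELECT name FROM people WHERE email = 'alice@example.com'"
findByEmail(["bob@example.com"]); // reuses the precompiled parts
```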
Best for: API endpoints, hot loops, or any code path that runs the same query shape repeatedly. See [Prepared Queries](/queries/execute#prepared-queries) for usage details. ## Best Practices ### Filter early Apply `.whereNode()` predicates as early as possible in your query chain. TypeGraph moves these predicates into the initial CTEs, reducing the number of rows that need to be joined in subsequent steps. ### Select specific fields When you only need certain fields, select them explicitly rather than returning whole nodes. This triggers the [smart select optimization](#smart-select) and can enable index-only scans with properly configured indexes. ```typescript // Preferred: Only fetches what you need .select((ctx) => ({ name: ctx.p.name, email: ctx.p.email })) // Avoid when possible: Fetches entire props blob .select((ctx) => ctx.p) ``` ### Use specific kinds Unless you specifically need to query across a hierarchy, avoid `includeSubClasses: true`. Being specific about the node kind allows the SQL engine to use more restrictive index scans. ### Use cursor pagination For large datasets, prefer `.paginate()` over `.limit()` and `.offset()`. Keyset pagination (using cursors) avoids the `O(N)` cost of skipping rows in standard SQL offsets. ### Index your filter and sort properties TypeGraph's built-in indexes cover structural lookups (by ID, by edge endpoints). Properties you filter or sort on in `whereNode()`, `whereEdge()`, and `orderBy()` need application-specific [expression indexes](/performance/indexes). Use the [Query Profiler](/performance/profiler) to identify which properties need coverage. ## Profile Your Queries Use the [Query Profiler](/performance/profiler) to identify missing indexes and understand query patterns in your application. The profiler captures property access patterns and generates prioritized index recommendations. 
```typescript import { QueryProfiler } from "@nicia-ai/typegraph/profiler"; const profiler = new QueryProfiler(); const profiledStore = profiler.attachToStore(store); // Run your application or test suite... const report = profiler.getReport(); console.log(report.recommendations); ``` ## Benchmarks TypeGraph uses a deterministic performance sanity suite as its benchmark and regression gate. The suite seeds a realistic graph shape and measures end-to-end query latency across: - forward and reverse traversals - inverse/symmetric traversal (`expand: "inverse"` / `expand: "all"`) - 2-hop and 3-hop traversals - aggregate queries - cached execute vs prepared execute - deep traversals (`10`/`100`/`1000` hop recursive with `cyclePolicy: "allow"`) Guardrail thresholds enforce expected behavior in CI (for example, traversal latency caps and ratio checks such as reverse/forward and deep-hop scaling). Deep-recursive benchmark probes explicitly set `cyclePolicy: "allow"` to isolate recursive CTE expansion cost; the default `cyclePolicy: "prevent"` prioritizes cycle-safe semantics and is expected to be slower on long traversals. *Note: Real-world performance varies by hardware, database driver, network latency (for PostgreSQL), and schema/data shape.*
### Benchmark configuration and guardrails

Current suite configuration:

| Setting | Value |
|---------|-------|
| Seed users | 1200 |
| Follows per user | 10 |
| Posts per user | 5 |
| Batch size | 250 |
| Warmup iterations | 2 |
| Sample iterations (median reported) | 15 |

Default guardrails:

| Check | Threshold |
|-------|-----------|
| reverse/forward ratio | <= 6x |
| inverse traversal latency | <= 500ms |
| inverse/forward ratio | <= 10x |
| 3-hop latency | <= 500ms |
| 3-hop/2-hop ratio | <= 8x |
| aggregate latency | <= 500ms |
| aggregate distinct latency | <= 700ms |
| aggregateDistinct/aggregate ratio | <= 4x |
| cached execute latency | <= 500ms |
| prepared execute latency | <= 500ms |
| prepared/cached ratio | <= 2x |
| 10-hop recursive latency | <= 250ms |
| 100-hop recursive latency | <= 1000ms |
| 100-hop-recursive/10-hop-recursive ratio | <= 30x |
| 1000-hop recursive latency | <= 5000ms |
| 1000-hop-recursive/100-hop-recursive ratio | <= 20x |

Backend-specific overrides:

| Backend | Check | Threshold |
|---------|-------|-----------|
| SQLite | 1000-hop recursive latency | <= 7000ms |
| PostgreSQL | inverse traversal latency | <= 1000ms |
| PostgreSQL | inverse/forward ratio | <= 30x |
| PostgreSQL | 3-hop latency | <= 1000ms |
| PostgreSQL | aggregate distinct latency | <= 1200ms |
| PostgreSQL | prepared execute latency | <= 700ms |
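Each guardrail reduces to either an absolute cap on a reported median latency or a ratio between two probes. A minimal sketch of that logic (illustrative, not the benchmark suite's actual code):

```typescript
// Median of sampled latencies (ms); the suite reports the median of 15 samples.
function median(samples: readonly number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Absolute cap, e.g. "3-hop latency <= 500ms".
const withinCap = (medianMs: number, capMs: number) => medianMs <= capMs;

// Ratio check, e.g. "reverse/forward ratio <= 6x".
const withinRatio = (numeratorMs: number, denominatorMs: number, maxRatio: number) =>
  numeratorMs / denominatorMs <= maxRatio;

median([12, 9, 15]); // 12
withinCap(420, 500); // true
withinRatio(48, 10, 6); // 4.8 <= 6, so true
```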
### Running benchmarks locally ```bash pnpm bench ``` For guardrail mode (fails on regression thresholds): ```bash pnpm --filter @nicia-ai/typegraph-benchmarks perf:check ``` Run the same guardrailed suite against PostgreSQL: ```bash POSTGRES_URL=postgresql://typegraph:typegraph@127.0.0.1:5432/typegraph_test \ pnpm --filter @nicia-ai/typegraph-benchmarks perf:check:postgres ``` The benchmark source code is located in `packages/benchmarks/src/`. ## Next Steps - [Indexes](/performance/indexes) — Define custom indexes for your schema - [Query Profiler](/performance/profiler) — Identify missing indexes automatically - [Backend Setup](/backend-setup) — Connection setup, pooling, and lifecycle # Query Profiler > Capture query patterns and generate index recommendations The Query Profiler captures property access patterns from your queries and generates index recommendations. Use it during development or in test suites to identify missing indexes. ## Quick Start ```typescript import { QueryProfiler } from "@nicia-ai/typegraph/profiler"; // Create a profiler and attach it to your store const profiler = new QueryProfiler(); const profiledStore = profiler.attachToStore(store); // Run queries as normal - they're automatically tracked await profiledStore .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("alice@example.com")) .select((ctx) => ({ name: ctx.p.name })) .execute(); // Get recommendations const report = profiler.getReport(); for (const rec of report.recommendations) { console.log( `[${rec.priority}] ${rec.entityType}:${rec.kind} ${rec.fields.join(", ")}`, ); console.log(` ${rec.reason}`); } ``` ## How It Works The profiler uses JavaScript Proxy to transparently wrap your store and query builders. 
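The wrapping idea can be illustrated with a bare-bones Proxy (a sketch of the general technique, not TypeGraph's implementation):

```typescript
// Minimal illustration of proxy-based call observation (not TypeGraph's code).
// Method calls on the wrapped object are recorded, then delegated unchanged.
function observeCalls<T extends object>(target: T, onCall: (method: string) => void): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const value = Reflect.get(obj, prop, receiver);
      if (typeof value === "function") {
        return (...args: unknown[]) => {
          onCall(String(prop)); // record which method was invoked
          return value.apply(obj, args); // delegate to the original implementation
        };
      }
      return value;
    },
  });
}
```

TypeGraph's profiler applies the same pattern one level deeper, inspecting the query AST rather than just the method name.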
When queries execute, it extracts property access patterns from the query AST: - **Filter patterns**: Properties used in `.whereNode()` and `.whereEdge()` predicates - **Sort patterns**: Properties used in `.orderBy()` - **Select patterns**: Properties accessed in `.select()` callbacks - **Group patterns**: Properties used in `.groupBy()` The profiler then compares these patterns against your declared indexes and generates recommendations for missing coverage. ## Kinds and `includeSubClasses` When you query with `includeSubClasses: true`, a single alias can represent multiple kinds. When the profiler is attached to a store, it uses the graph schema to attribute a property access only to kinds where that JSON path exists. This avoids recommending indexes for unrelated subclasses. ## Attaching to a Store ```typescript const profiler = new QueryProfiler(); const profiledStore = profiler.attachToStore(store); // The profiled store behaves exactly like the original await profiledStore.nodes.Person.create({ email: "bob@example.com", name: "Bob" }); // Queries are tracked automatically await profiledStore.query().from("Person", "p").select((ctx) => ctx.p).execute(); // Access the profiler from the store profiledStore.profiler.getReport(); ``` The profiled store exposes a `profiler` property for convenient access. 
## Declaring Existing Indexes Pass your existing indexes so the profiler doesn't recommend indexes you already have: ```typescript import { QueryProfiler } from "@nicia-ai/typegraph/profiler"; import { toDeclaredIndexes } from "@nicia-ai/typegraph/indexes"; import { personEmail, worksAtRole } from "./indexes"; const profiler = new QueryProfiler({ declaredIndexes: toDeclaredIndexes([personEmail, worksAtRole]), }); ``` You can also declare indexes manually: ```typescript const profiler = new QueryProfiler({ declaredIndexes: [ { entityType: "node", kind: "Person", fields: ["/email"], unique: true, name: "idx_person_email", }, { entityType: "node", kind: "Person", fields: ["/name"], unique: false, name: "idx_person_name", }, ], }); ``` ## Understanding the Report ```typescript const report = profiler.getReport(); ``` The report contains: ### `recommendations` Prioritized index recommendations sorted by importance: ```typescript for (const rec of report.recommendations) { console.log( `[${rec.priority}] ${rec.entityType}:${rec.kind} ${rec.fields.join(", ")}`, ); console.log(` Reason: ${rec.reason}`); console.log(` Frequency: ${rec.frequency}`); } ``` **Priority levels:** - `high`: Property accessed 10+ times in filters/sorts (configurable) - `medium`: Property accessed 5-9 times (configurable) - `low`: Property accessed 3-4 times (configurable) ### `unindexedFilters` Properties used in filter predicates that lack index coverage: ```typescript for (const path of report.unindexedFilters) { const target = path.target.__type === "prop" ? 
path.target.pointer : path.target.field;
  console.log(`Unindexed filter: ${path.entityType}:${path.kind} ${target}`);
}
```

### `patterns`

Raw property access statistics:

```typescript
for (const [key, stats] of report.patterns) {
  console.log(`${key}: ${stats.count} accesses`);
  console.log(`  Contexts: ${[...stats.contexts].join(", ")}`);
  console.log(`  Predicates: ${[...stats.predicateTypes].join(", ")}`);
}
```

### `summary`

Session statistics:

```typescript
console.log(`Total queries: ${report.summary.totalQueries}`);
console.log(`Unique patterns: ${report.summary.uniquePatterns}`);
console.log(`Duration: ${report.summary.durationMs}ms`);
```

## Test Assertions

Use `assertIndexCoverage()` to fail tests when queries filter on unindexed properties:

```typescript
import { describe, it, beforeAll } from "vitest";
import { QueryProfiler, type ProfiledStore } from "@nicia-ai/typegraph/profiler";
import { toDeclaredIndexes } from "@nicia-ai/typegraph/indexes";
import { personEmail, personName } from "./indexes";

describe("Query Performance", () => {
  let profiler: QueryProfiler;
  let profiledStore: ProfiledStore;

  beforeAll(() => {
    profiler = new QueryProfiler({
      declaredIndexes: toDeclaredIndexes([personEmail, personName]),
    });
    profiledStore = profiler.attachToStore(store);
  });

  // Run your test suite against profiledStore...
it("all filtered properties should be indexed", () => { // Throws if any filter property lacks an index profiler.assertIndexCoverage(); }); }); ``` ## Configuration ```typescript const profiler = new QueryProfiler({ // Indexes you already have declaredIndexes: [...], // Minimum frequency to generate a recommendation (default: 3) minFrequencyForRecommendation: 5, // Optional priority thresholds (defaults: 5 and 10) mediumFrequencyThreshold: 8, highFrequencyThreshold: 20, }); ``` ## Lifecycle Methods ```typescript // Reset collected data (keeps configuration) profiler.reset(); // Detach from store (allows reattachment) profiler.detach(); // Check attachment status if (profiler.isAttached) { console.log("Profiler is attached to a store"); } ``` ## Manual Recording For custom integrations, record queries directly from their AST: ```typescript const query = store .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("test@example.com")) .select((ctx) => ctx.p); // Record without executing profiler.recordQuery(query.toAst()); ``` ## Composite Index Detection The profiler understands composite index prefix matching. If you have an index on `["email", "name"]`, queries filtering on just `email` are considered covered: ```typescript const profiler = new QueryProfiler({ declaredIndexes: [ { entityType: "node", kind: "Person", fields: ["/email", "/name"], unique: false, name: "idx_email_name", }, ], }); // This query IS covered (uses the email prefix of the composite index) await profiledStore .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("test@example.com")) .execute(); // No recommendation generated for email ``` ## Best Practices 1. **Profile realistic workloads**: Run your actual queries or test suite, not synthetic benchmarks. 2. **Profile before optimizing**: Don't guess which indexes you need - let the profiler tell you. 3. **Use in CI**: Add `assertIndexCoverage()` to your test suite to catch regressions. 4. 
**Declare all indexes**: Pass your existing indexes so recommendations are accurate. 5. **Review frequency**: High-frequency patterns are most important to index. ## Next Steps - [Indexes](/performance/indexes) - Create the indexes the profiler recommends - [Performance Overview](/performance/overview) - Best practices and smart select # Project Structure > Recommended patterns for organizing TypeGraph in your codebase How you organize your TypeGraph code depends on your project's size and complexity. This guide covers recommended patterns from simple single-file setups to large multi-domain graphs. ## Small Projects For projects with a handful of node and edge types, keep everything in two files: ```text src/ graph.ts # Node/edge definitions + graph graph-store.ts # Store instantiation ``` ### graph.ts Contains all definitions and exports the graph: ```typescript import { z } from "zod"; import { defineNode, defineEdge, defineGraph, disjointWith } from "@nicia-ai/typegraph"; // Node definitions export const Person = defineNode("Person", { schema: z.object({ name: z.string(), email: z.string().email().optional(), }), }); export const Company = defineNode("Company", { schema: z.object({ name: z.string(), industry: z.string().optional(), }), }); // Edge definitions export const worksAt = defineEdge("worksAt", { schema: z.object({ role: z.string().optional(), since: z.string().optional(), }), }); // Graph definition export const graph = defineGraph({ id: "my_app", nodes: { Person: { type: Person }, Company: { type: Company }, }, edges: { worksAt: { type: worksAt, from: [Person], to: [Company] }, }, ontology: [disjointWith(Person, Company)], }); ``` ### graph-store.ts Instantiates and exports the store: ```typescript import { createStore } from "@nicia-ai/typegraph"; import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local"; import { graph } from "./graph"; const { backend } = createLocalSqliteBackend({ path: "./data.db" }); export const store = 
createStore(graph, backend); ``` This separation keeps the schema definition (which is static) separate from store instantiation (which involves runtime configuration like database paths). ## Medium Projects When your graph grows to 10+ node types or you want better organization, split definitions into separate files: ```text src/graph/ index.ts # Re-exports + defineGraph nodes.ts # All node definitions edges.ts # All edge definitions ontology.ts # Ontological relations store.ts # Store instantiation ``` ### nodes.ts ```typescript import { z } from "zod"; import { defineNode } from "@nicia-ai/typegraph"; export const Person = defineNode("Person", { schema: z.object({ name: z.string(), email: z.string().email().optional(), role: z.string().optional(), }), }); export const Company = defineNode("Company", { schema: z.object({ name: z.string(), industry: z.string().optional(), founded: z.number().optional(), }), }); export const Project = defineNode("Project", { schema: z.object({ name: z.string(), status: z.enum(["planning", "active", "completed"]), }), }); // ... more node definitions ``` ### edges.ts ```typescript import { z } from "zod"; import { defineEdge } from "@nicia-ai/typegraph"; export const worksAt = defineEdge("worksAt", { schema: z.object({ role: z.string().optional(), since: z.string().optional(), }), }); export const manages = defineEdge("manages"); export const assignedTo = defineEdge("assignedTo", { schema: z.object({ assignedAt: z.string().optional(), }), }); // ... 
more edge definitions ``` ### ontology.ts ```typescript import { subClassOf, disjointWith, inverseOf } from "@nicia-ai/typegraph"; import { Person, Company, Project } from "./nodes"; import { manages } from "./edges"; export const ontology = [ disjointWith(Person, Company), disjointWith(Person, Project), disjointWith(Company, Project), // inverseOf(manages, reportsTo), ]; ``` ### index.ts Combines everything into the graph definition: ```typescript import { defineGraph } from "@nicia-ai/typegraph"; import { Person, Company, Project } from "./nodes"; import { worksAt, manages, assignedTo } from "./edges"; import { ontology } from "./ontology"; export const graph = defineGraph({ id: "my_app", nodes: { Person: { type: Person }, Company: { type: Company }, Project: { type: Project }, }, edges: { worksAt: { type: worksAt, from: [Person], to: [Company] }, manages: { type: manages, from: [Person], to: [Person] }, assignedTo: { type: assignedTo, from: [Project], to: [Person] }, }, ontology, }); // Re-export for convenience export * from "./nodes"; export * from "./edges"; export { store } from "./store"; ``` ### store.ts ```typescript import { createStore } from "@nicia-ai/typegraph"; import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local"; import { graph } from "./index"; const { backend } = createLocalSqliteBackend({ path: "./data.db" }); export const store = createStore(graph, backend); ``` ## Large Projects For large graphs with distinct domains, group related nodes and edges together: ```text src/graph/ index.ts # Combines all domains store.ts # Store instantiation domains/ users.ts # User, Profile, Team + related edges content.ts # Document, Comment, Tag + related edges projects.ts # Project, Task, Milestone + related edges ``` ### domains/users.ts ```typescript import { z } from "zod"; import { defineNode, defineEdge, subClassOf, disjointWith } from "@nicia-ai/typegraph"; // Nodes export const User = defineNode("User", { schema: z.object({ email: 
z.string().email(), name: z.string(), role: z.enum(["admin", "member", "guest"]), }), }); export const Profile = defineNode("Profile", { schema: z.object({ bio: z.string().optional(), avatarUrl: z.string().optional(), }), }); export const Team = defineNode("Team", { schema: z.object({ name: z.string(), description: z.string().optional(), }), }); // Edges export const hasProfile = defineEdge("hasProfile"); export const memberOf = defineEdge("memberOf", { schema: z.object({ joinedAt: z.string().optional() }), }); export const leads = defineEdge("leads"); // Domain-specific ontology export const usersOntology = [ disjointWith(User, Team), disjointWith(User, Profile), ]; // Export for graph assembly export const usersNodes = { User: { type: User }, Profile: { type: Profile }, Team: { type: Team }, }; export const usersEdges = { hasProfile: { type: hasProfile, from: [User], to: [Profile] }, memberOf: { type: memberOf, from: [User], to: [Team] }, leads: { type: leads, from: [User], to: [Team] }, }; ``` ### index.ts Assembles domains into the final graph: ```typescript import { defineGraph } from "@nicia-ai/typegraph"; import { usersNodes, usersEdges, usersOntology } from "./domains/users"; import { contentNodes, contentEdges, contentOntology } from "./domains/content"; import { projectsNodes, projectsEdges, projectsOntology } from "./domains/projects"; export const graph = defineGraph({ id: "my_app", nodes: { ...usersNodes, ...contentNodes, ...projectsNodes, }, edges: { ...usersEdges, ...contentEdges, ...projectsEdges, }, ontology: [ ...usersOntology, ...contentOntology, ...projectsOntology, ], }); // Re-export types for convenience export * from "./domains/users"; export * from "./domains/content"; export * from "./domains/projects"; export { store } from "./store"; ``` ## Cross-Domain Edges When edges connect nodes from different domains, define them at the graph level: ```typescript // index.ts import { defineEdge } from "@nicia-ai/typegraph"; import { User } from 
"./domains/users"; import { Document } from "./domains/content"; import { Project } from "./domains/projects"; // Cross-domain edges const authored = defineEdge("authored"); const assignedTo = defineEdge("assignedTo"); export const graph = defineGraph({ // ... edges: { ...usersEdges, ...contentEdges, ...projectsEdges, // Cross-domain authored: { type: authored, from: [User], to: [Document] }, assignedTo: { type: assignedTo, from: [User], to: [Project] }, }, }); ``` ## Naming Conventions | Element | Convention | Example | |---------|------------|---------| | Node definitions | PascalCase | `Person`, `Company` | | Edge definitions | camelCase | `worksAt`, `hasAuthor` | | Graph IDs | snake_case | `my_app`, `content_graph` | | Files | kebab-case | `graph-store.ts`, `project-structure.ts` | | Query aliases | short lowercase | `p`, `c`, `e1` | ## Type Exports Export types alongside definitions for use in your application: ```typescript // graph/nodes.ts import { type Node, type NodeProps, type NodeId } from "@nicia-ai/typegraph"; export const Person = defineNode("Person", { /* ... */ }); // Convenience type exports export type PersonNode = Node; export type PersonProps = NodeProps; export type PersonId = NodeId; ``` This lets consumers import types directly: ```typescript import { type PersonNode, type PersonProps } from "./graph"; function displayPerson(person: PersonNode) { console.log(person.name); } function validatePersonInput(data: unknown): PersonProps { return Person.schema.parse(data); } ``` ## Framework Integration ### Next.js / React Server Components Keep the store in a server-only module: ```text src/ graph/ index.ts store.server.ts # Server-only store ``` ```typescript // store.server.ts import "server-only"; import { createStore } from "@nicia-ai/typegraph"; import { graph } from "./index"; // ... 
``` ### Edge Runtimes (Cloudflare Workers, Vercel Edge) Use the Drizzle backend with edge-compatible drivers: ```typescript // graph/store.ts import { createStore } from "@nicia-ai/typegraph"; import { createSqliteBackend } from "@nicia-ai/typegraph/sqlite"; import { drizzle } from "drizzle-orm/d1"; import { graph } from "./index"; export function createGraphStore(env: { DB: D1Database }) { const db = drizzle(env.DB); const backend = createSqliteBackend(db); return createStore(graph, backend); } ``` ## Next Steps - [Getting Started](/getting-started) - Build your first graph - [Schemas & Types](/core-concepts) - Deep dive into node and edge definitions - [Integration](/integration) - Database setup and Drizzle integration # Subqueries > EXISTS, IN, and correlated subqueries for complex filtering Subqueries let you filter based on conditions that depend on related data—check if related records exist, or if values appear in another query's results. ## EXISTS Check if related records exist: ```typescript import { exists, fieldRef } from "@nicia-ai/typegraph"; // Find people who have authored at least one PR const authors = await store .query() .from("Person", "p") .whereNode("p", () => exists( store .query() .from("PullRequest", "pr") .traverse("author", "e", { direction: "in" }) .to("Person", "author") .whereNode("author", (a) => a.id.eq(fieldRef("p", ["id"]))) .select((ctx) => ({ id: ctx.pr.id })) .toAst() ) ) .select((ctx) => ctx.p) .execute(); ``` ## NOT EXISTS Find records without related records: ```typescript import { notExists, fieldRef } from "@nicia-ai/typegraph"; // Find people with no pull requests const nonContributors = await store .query() .from("Person", "p") .whereNode("p", () => notExists( store .query() .from("PullRequest", "pr") .traverse("author", "e", { direction: "in" }) .to("Person", "author") .whereNode("author", (a) => a.id.eq(fieldRef("p", ["id"]))) .select((ctx) => ({ id: ctx.pr.id })) .toAst() ) ) .select((ctx) => ctx.p) .execute(); ``` ## 
IN Check if a value is in a subquery result set: ```typescript import { inSubquery, fieldRef } from "@nicia-ai/typegraph"; // Find people who work at tech companies const techWorkers = await store .query() .from("Person", "p") .whereNode("p", () => inSubquery( fieldRef("p", ["companyId"]), store .query() .from("Company", "c") .whereNode("c", (c) => c.industry.eq("Technology")) .aggregate({ id: fieldRef("c", ["id"], { valueType: "string" }), }) .toAst() ) ) .select((ctx) => ctx.p) .execute(); ``` ## NOT IN Exclude values that appear in a subquery: ```typescript import { notInSubquery, fieldRef } from "@nicia-ai/typegraph"; // Find people not in the blocklist const allowedUsers = await store .query() .from("Person", "p") .whereNode("p", () => notInSubquery( fieldRef("p", ["id"]), store .query() .from("BlockedUser", "b") .aggregate({ userId: fieldRef("b", ["props", "userId"], { valueType: "string" }), }) .toAst() ) ) .select((ctx) => ctx.p) .execute(); ``` ## fieldRef() The `fieldRef()` function creates a reference to a field in the outer query for use in subquery predicates: ```typescript import { fieldRef } from "@nicia-ai/typegraph"; fieldRef("alias", ["field"]) // Reference a single field fieldRef("alias", ["nested", "path"]) // Reference a nested field ``` **Parameters:** | Parameter | Type | Description | |-----------|------|-------------| | `alias` | `string` | The alias of the node/edge in the outer query | | `path` | `string[]` | Path to the field (array for nested access) | ## Helpers Reference | Function | Description | |----------|-------------| | `exists(subqueryAst)` | True if subquery returns any rows | | `notExists(subqueryAst)` | True if subquery returns no rows | | `inSubquery(fieldRef, subqueryAst)` | True if field value is in subquery results | | `notInSubquery(fieldRef, subqueryAst)` | True if field value is not in subquery results | For `inSubquery()` and `notInSubquery()`, the subquery must project exactly one scalar column. 
Prefer `aggregate({ ... })` with a single field. ## Real-World Examples ### Users with Recent Activity ```typescript // Find users who logged in within the last 7 days const activeUsers = await store .query() .from("User", "u") .whereNode("u", () => exists( store .query() .from("LoginEvent", "e") .whereNode("e", (e) => e.userId.eq(fieldRef("u", ["id"])) .and(e.timestamp.gte(sevenDaysAgo)) ) .select((ctx) => ({ id: ctx.e.id })) .toAst() ) ) .select((ctx) => ctx.u) .execute(); ``` ### Products Not in Any Cart ```typescript // Find products that haven't been added to any cart const unpopularProducts = await store .query() .from("Product", "p") .whereNode("p", () => notExists( store .query() .from("CartItem", "ci") .whereNode("ci", (ci) => ci.productId.eq(fieldRef("p", ["id"]))) .select((ctx) => ({ id: ctx.ci.id })) .toAst() ) ) .select((ctx) => ctx.p) .execute(); ``` ### Users in Specific Teams ```typescript // Find users who are members of either the engineering or design team const targetTeamIds = ["team-eng", "team-design"]; const teamMembers = await store .query() .from("User", "u") .whereNode("u", () => inSubquery( fieldRef("u", ["id"]), store .query() .from("TeamMembership", "tm") .whereNode("tm", (tm) => tm.teamId.in(targetTeamIds)) .aggregate({ userId: fieldRef("tm", ["props", "userId"], { valueType: "string", }), }) .toAst() ) ) .select((ctx) => ctx.u) .execute(); ``` ## Query Debugging For debugging or advanced use cases, you can inspect the query AST or generated SQL. 
### View the AST ```typescript const query = store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ctx.p); const ast = query.toAst(); console.log(JSON.stringify(ast, null, 2)); ``` ### View Generated SQL `toSQL()` returns the SQL text and bound parameters for the current backend dialect: ```typescript const { sql, params } = query.toSQL(); console.log("SQL:", sql); console.log("Parameters:", params); ``` This is useful for: - Debugging query behavior - Understanding performance characteristics - Logging queries in production - Running the query with a custom executor ## Next Steps - [Filter](/queries/filter) - Basic filtering with predicates - [Combine](/queries/combine) - Set operations - [Execute](/queries/execute) - Running queries # Aggregate > GROUP BY, aggregate functions, and HAVING clauses TypeGraph supports SQL-style aggregations for analytics and reporting. Group nodes by properties, compute aggregates like COUNT and SUM, and filter groups with HAVING clauses. ## When to Use Aggregations Aggregations are useful for: - **Analytics dashboards**: Employee counts by department, revenue by region - **Reporting**: Average order value, total sales by product category - **Data exploration**: Find groups meeting certain criteria - **Metrics**: Count active users, sum transaction amounts ## Basic Aggregation Use `groupBy()` and `aggregate()` with aggregate helper functions: ```typescript import { count, field } from "@nicia-ai/typegraph"; const companySizes = await store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "c") .groupBy("c", "name") // Group by company name .aggregate({ companyName: field("c", "name"), // Include the grouped field employeeCount: count("p"), // Count people in each group }) .execute(); // Result: [{ companyName: "Acme Corp", employeeCount: 42 }, ...] 
``` ## Aggregate Functions Import aggregate functions from `@nicia-ai/typegraph`: ```typescript import { count, countDistinct, sum, avg, min, max, field } from "@nicia-ai/typegraph"; ``` ### count Count rows in each group: ```typescript count("p") // COUNT(p.id) - count all nodes count("p", "department") // COUNT(p.props.department) - count non-null values ``` ### countDistinct Count unique values: ```typescript countDistinct("p") // COUNT(DISTINCT p.id) countDistinct("p", "department") // COUNT(DISTINCT p.props.department) ``` ### sum Sum numeric values: ```typescript sum("p", "salary") // SUM(p.props.salary) ``` ### avg Average of numeric values: ```typescript avg("p", "age") // AVG(p.props.age) ``` ### min / max Minimum and maximum values: ```typescript min("p", "hireDate") // MIN(p.props.hireDate) max("p", "salary") // MAX(p.props.salary) ``` ### field Include a grouped field in the output: ```typescript field("p", "department") // The grouped field value field("c", "id") // Node ID field("c", "name") // Property value ``` ## Multiple Aggregations Combine multiple aggregates in one query: ```typescript import { count, countDistinct, sum, avg, min, max, field } from "@nicia-ai/typegraph"; const departmentStats = await store .query() .from("Employee", "e") .groupBy("e", "department") .aggregate({ department: field("e", "department"), headcount: count("e"), uniqueRoles: countDistinct("e", "role"), avgSalary: avg("e", "salary"), minSalary: min("e", "salary"), maxSalary: max("e", "salary"), totalPayroll: sum("e", "salary"), }) .execute(); ``` ## Grouping by Multiple Fields Chain `groupBy()` calls for multi-column grouping: ```typescript const breakdown = await store .query() .from("Employee", "e") .groupBy("e", "department") .groupBy("e", "level") .aggregate({ department: field("e", "department"), level: field("e", "level"), count: count("e"), avgSalary: avg("e", "salary"), }) .execute(); // Result: [ // { department: "Engineering", level: "Senior", count: 15, 
avgSalary: 150000 }, // { department: "Engineering", level: "Junior", count: 8, avgSalary: 80000 }, // { department: "Sales", level: "Senior", count: 5, avgSalary: 120000 }, // ... // ] ``` ## Grouping by Node Use `groupByNode()` to group by unique nodes (by ID): ```typescript const projectContributions = await store .query() .from("Commit", "c") .traverse("author", "e") .to("Developer", "d") .groupByNode("d") // Group by developer node .aggregate({ developerId: field("d", "id"), developerName: field("d", "name"), commitCount: count("c"), }) .execute(); ``` ## Filtering Groups with HAVING Use `having()` to filter groups based on aggregate values (SQL's HAVING clause): ```typescript import { count, havingGt } from "@nicia-ai/typegraph"; // Only departments with more than 5 employees const largeDepartments = await store .query() .from("Employee", "e") .groupBy("e", "department") .having(havingGt(count("e"), 5)) // HAVING COUNT(e) > 5 .aggregate({ department: field("e", "department"), headcount: count("e"), }) .execute(); ``` ### Available HAVING Helpers ```typescript import { having, havingGt, havingGte, havingLt, havingLte, havingEq, } from "@nicia-ai/typegraph"; // Comparison helpers havingGt(aggregate, value) // > havingGte(aggregate, value) // >= havingLt(aggregate, value) // < havingLte(aggregate, value) // <= havingEq(aggregate, value) // = // Generic comparison (for custom operators) having(aggregate, "gt", value) ``` ### Multiple HAVING Conditions Chain multiple having conditions: ```typescript const qualifiedDepartments = await store .query() .from("Employee", "e") .groupBy("e", "department") .having(havingGte(count("e"), 5)) // At least 5 employees .having(havingGte(avg("e", "salary"), 100000)) // Average salary >= 100k .aggregate({ department: field("e", "department"), headcount: count("e"), avgSalary: avg("e", "salary"), }) .execute(); ``` ## Aggregations with Traversals Combine graph traversals with aggregations: ```typescript const topContributors = 
await store .query() .from("PullRequest", "pr") .whereNode("pr", (pr) => pr.state.eq("merged")) .traverse("targetsRepo", "e1") .to("Repository", "repo") .traverse("author", "e2", { direction: "in" }) .to("Developer", "dev") .groupBy("repo", "name") .groupBy("dev", "name") .aggregate({ repository: field("repo", "name"), developer: field("dev", "name"), prCount: count("pr"), linesChanged: sum("pr", "linesAdded"), }) .limit(50) .execute(); ``` ## Ordering Aggregated Results Order by aggregate values: ```typescript const topDepartments = await store .query() .from("Employee", "e") .groupBy("e", "department") .aggregate({ department: field("e", "department"), headcount: count("e"), totalSalary: sum("e", "salary"), }) .orderBy((ctx) => ctx.totalSalary, "desc") .limit(10) .execute(); ``` ## Real-World Example: Team Analytics ```typescript import { count, countDistinct, sum, avg, field, havingGt } from "@nicia-ai/typegraph"; // 1. Productivity by department const departmentMetrics = await store .query() .from("Developer", "dev") .traverse("authored", "e") .to("PullRequest", "pr") .whereNode("pr", (pr) => pr.state.eq("merged")) .groupBy("dev", "department") .aggregate({ department: field("dev", "department"), developerCount: countDistinct("dev"), totalPRs: count("pr"), totalLinesAdded: sum("pr", "linesAdded"), avgLinesPerPR: avg("pr", "linesAdded"), }) .execute(); // 2. Active reviewers (reviewed > 10 PRs) const activeReviewers = await store .query() .from("Developer", "d") .traverse("reviewed", "r") .to("PullRequest", "pr") .groupByNode("d") .having(havingGt(count("pr"), 10)) .aggregate({ developer: field("d", "name"), reviewCount: count("pr"), }) .orderBy((ctx) => ctx.reviewCount, "desc") .execute(); // 3. 
Repository health const repoHealth = await store .query() .from("Repository", "r") .traverse("contains", "e") .to("PullRequest", "pr") .groupByNode("r") .aggregate({ repo: field("r", "name"), openPRs: count("pr"), avgAge: avg("pr", "daysOpen"), }) .execute(); ``` ## Next Steps - [Shape](/queries/shape) - Output transformation with `select()` - [Order](/queries/order) - Ordering and limiting results - [Traverse](/queries/traverse) - Graph traversals # Combine > Set operations with union(), intersect(), and except() Combine operations merge results from multiple queries using set operations. Use `union()` to combine results, `intersect()` to find common results, and `except()` to exclude results. ## Set Operations Overview | Operation | Description | Duplicates | |-----------|-------------|------------| | `union()` | Combine results from both queries | Removed | | `unionAll()` | Combine results from both queries | Kept | | `intersect()` | Results that appear in both queries | Removed | | `except()` | Results in first query but not second | Removed | ## union() Combine results from multiple queries, removing duplicates: ```typescript const activeOrAdmin = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ({ id: ctx.p.id, name: ctx.p.name })) .union( store .query() .from("Person", "p") .whereNode("p", (p) => p.role.eq("admin")) .select((ctx) => ({ id: ctx.p.id, name: ctx.p.name })) ) .execute(); ``` This returns all active users PLUS all admins, with duplicates removed (active admins appear once). 
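For intuition, the deduplication semantics in the overview table above can be sketched over plain arrays (illustrative only; the real operations run as SQL set operations in the database):

```typescript
// Plain-array sketch of UNION / UNION ALL / INTERSECT / EXCEPT semantics.
const union = <T>(a: T[], b: T[]): T[] => [...new Set([...a, ...b])]; // duplicates removed
const unionAll = <T>(a: T[], b: T[]): T[] => [...a, ...b]; // duplicates kept
const intersect = <T>(a: T[], b: T[]): T[] => {
  const inB = new Set(b);
  return [...new Set(a)].filter((x) => inB.has(x)); // in both, deduplicated
};
const except = <T>(a: T[], b: T[]): T[] => {
  const inB = new Set(b);
  return [...new Set(a)].filter((x) => !inB.has(x)); // in a but not b: order matters
};
```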
### Selection Shape Must Match Both queries must have the same selection shape: ```typescript // Valid: Same shape query1.select((ctx) => ({ id: ctx.p.id, name: ctx.p.name })) .union( query2.select((ctx) => ({ id: ctx.p.id, name: ctx.p.name })) ) // Invalid: Different shapes - will cause an error query1.select((ctx) => ({ id: ctx.p.id })) .union( query2.select((ctx) => ({ id: ctx.p.id, name: ctx.p.name })) ) ``` ## unionAll() Combine results keeping duplicates: ```typescript const allMentions = await store .query() .from("Comment", "c") .whereNode("c", (c) => c.mentions.contains(userId)) .select((ctx) => ({ id: ctx.c.id, text: ctx.c.text })) .unionAll( store .query() .from("Post", "p") .whereNode("p", (p) => p.mentions.contains(userId)) .select((ctx) => ({ id: ctx.p.id, text: ctx.p.content })) ) .execute(); ``` Use `unionAll()` when: - You want to preserve duplicates - Performance matters (no deduplication overhead) - You're counting occurrences ## intersect() Find results that appear in both queries: ```typescript const activeAdmins = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ({ id: ctx.p.id })) .intersect( store .query() .from("Person", "p") .whereNode("p", (p) => p.role.eq("admin")) .select((ctx) => ({ id: ctx.p.id })) ) .execute(); ``` This returns only users who are BOTH active AND admins. ### Equivalent to AND `intersect()` can often be replaced with combined predicates: ```typescript // Using intersect query1.intersect(query2) // Often equivalent to .whereNode("p", (p) => p.status.eq("active").and(p.role.eq("admin")) ) ``` Use `intersect()` when the queries are complex or involve different traversal paths. 
## except() Find results in the first query but not the second (set difference): ```typescript const nonAdminActive = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ({ id: ctx.p.id })) .except( store .query() .from("Person", "p") .whereNode("p", (p) => p.role.eq("admin")) .select((ctx) => ({ id: ctx.p.id })) ) .execute(); ``` This returns active users who are NOT admins. ### Order Matters Unlike `union()` and `intersect()`, the order of queries in `except()` matters: ```typescript // Active users who are NOT admins activeUsers.except(admins) // Admins who are NOT active (different result!) admins.except(activeUsers) ``` ## Chaining Set Operations Chain multiple set operations: ```typescript const complexSet = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ({ id: ctx.p.id })) .union( store.query() .from("Person", "p") .whereNode("p", (p) => p.role.eq("admin")) .select((ctx) => ({ id: ctx.p.id })) ) .except( store.query() .from("Person", "p") .whereNode("p", (p) => p.suspended.eq(true)) .select((ctx) => ({ id: ctx.p.id })) ) .execute(); // (active OR admin) AND NOT suspended ``` ## Ordering and Limiting Combined Results Apply ordering and limits after set operations: ```typescript const results = await query1 .union(query2) .orderBy("name", "asc") .limit(100) .execute(); ``` ## Real-World Examples ### Multi-Source Search Search across different node types: ```typescript async function globalSearch(term: string) { const people = store .query() .from("Person", "p") .whereNode("p", (p) => p.name.ilike(`%${term}%`)) .select((ctx) => ({ id: ctx.p.id, type: "person" as const, title: ctx.p.name, })); const companies = store .query() .from("Company", "c") .whereNode("c", (c) => c.name.ilike(`%${term}%`)) .select((ctx) => ({ id: ctx.c.id, type: "company" as const, title: ctx.c.name, })); return people .union(companies) .limit(20) .execute(); } ``` ### Exclude 
Blocklist ```typescript const eligibleUsers = await store .query() .from("User", "u") .whereNode("u", (u) => u.status.eq("active")) .select((ctx) => ({ id: ctx.u.id, email: ctx.u.email })) .except( store .query() .from("BlockedUser", "b") .traverse("blockedUser", "e") .to("User", "u") .select((ctx) => ({ id: ctx.u.id, email: ctx.u.email })) ) .execute(); ``` ### Find Common Connections ```typescript async function mutualFriends(userId1: string, userId2: string) { const user1Friends = store .query() .from("Person", "p") .whereNode("p", (p) => p.id.eq(userId1)) .traverse("follows", "e") .to("Person", "friend") .select((ctx) => ({ id: ctx.friend.id, name: ctx.friend.name })); const user2Friends = store .query() .from("Person", "p") .whereNode("p", (p) => p.id.eq(userId2)) .traverse("follows", "e") .to("Person", "friend") .select((ctx) => ({ id: ctx.friend.id, name: ctx.friend.name })); return user1Friends .intersect(user2Friends) .execute(); } ``` ### Deduplicate Recursive Results Remove duplicate nodes from recursive traversals: ```typescript // Get unique reachable nodes (recursive may return duplicates via different paths) const uniqueNodes = await store .query() .from("Node", "start") .traverse("linkedTo", "e") .recursive() .to("Node", "reachable") .select((ctx) => ({ id: ctx.reachable.id })) .union( // Union with empty set to deduplicate (hack) store .query() .from("Node", "n") .whereNode("n", (n) => n.id.eq("__nonexistent__")) .select((ctx) => ({ id: ctx.n.id })) ) .execute(); ``` ## Next Steps - [Advanced](/queries/advanced) - Subqueries with `exists()` and `inSubquery()` - [Execute](/queries/execute) - Running queries - [Compose](/queries/compose) - Reusable query fragments # Compose > Reusable query transformations with pipe() and fragment composition Compose operations let you create reusable query transformations. Use `pipe()` to apply transformations and `createFragment()` to build typed, composable query parts. 
## The pipe() Method Apply a transformation function to a query builder: ```typescript const results = await store .query() .from("User", "u") .pipe((q) => q.whereNode("u", ({ status }) => status.eq("active"))) .pipe((q) => q.orderBy("u", "createdAt", "desc")) .select((ctx) => ctx.u) .execute(); ``` Each `pipe()` receives the current builder and returns a modified builder, enabling chained transformations. ## Defining Reusable Fragments Extract common patterns into reusable functions: ```typescript // Define reusable fragments const activeOnly = (q) => q.whereNode("u", ({ status }) => status.eq("active")); const recentFirst = (q) => q.orderBy("u", "createdAt", "desc"); const first10 = (q) => q.limit(10); // Use in queries const results = await store .query() .from("User", "u") .pipe(activeOnly) .pipe(recentFirst) .pipe(first10) .select((ctx) => ctx.u) .execute(); ``` ## Typed Fragments with createFragment() For full type safety, use the `createFragment()` factory: ```typescript import { createFragment } from "@nicia-ai/typegraph"; // Create a typed fragment factory for your graph const fragment = createFragment(); // Define typed fragments const activeUsers = fragment((q) => q.whereNode("u", ({ status }) => status.eq("active")) ); const withRecentPosts = fragment((q) => q.traverse("authored", "a") .to("Post", "p") .whereNode("p", ({ createdAt }) => createdAt.gte("2024-01-01")) ); // Compose into queries const results = await store .query() .from("User", "u") .pipe(activeUsers) .pipe(withRecentPosts) .select((ctx) => ({ user: ctx.u, post: ctx.p, })) .execute(); ``` ## Composing Fragments Use `composeFragments()` to combine multiple fragments into one: ```typescript import { composeFragments, limitFragment, orderByFragment } from "@nicia-ai/typegraph"; // Compose multiple fragments into one const paginatedActiveUsers = composeFragments( (q) => q.whereNode("u", ({ status }) => status.eq("active")), (q) => q.orderBy("u", "createdAt", "desc"), (q) => q.limit(20) ); // 
Apply as a single transformation const results = await store .query() .from("User", "u") .pipe(paginatedActiveUsers) .select((ctx) => ctx.u) .execute(); ``` ## Helper Fragments TypeGraph provides pre-built helper fragments: ```typescript import { limitFragment, offsetFragment, orderByFragment, composeFragments } from "@nicia-ai/typegraph"; // Pre-built fragments const paginated = composeFragments( orderByFragment("u", "createdAt", "desc"), limitFragment(20), offsetFragment(40) ); const results = await store .query() .from("User", "u") .pipe(paginated) .select((ctx) => ctx.u) .execute(); ``` ### Available Helpers | Helper | Description | |--------|-------------| | `limitFragment(n)` | Limits results to n rows | | `offsetFragment(n)` | Skips the first n rows | | `orderByFragment(alias, field, direction)` | Orders by a field | ## Fragments with Traversals Fragments can include traversals: ```typescript // Fragment that adds a manager traversal const withManager = fragment((q) => q.traverse("reportsTo", "r").to("User", "manager") ); // Fragment that adds department info const withDepartment = fragment((q) => q.traverse("belongsTo", "b").to("Department", "dept") ); // Compose for a complete employee view const employeeDetails = composeFragments(withManager, withDepartment); const results = await store .query() .from("User", "u") .pipe(employeeDetails) .select((ctx) => ({ employee: ctx.u, manager: ctx.manager, department: ctx.dept, })) .execute(); ``` ## Post-Select Fragments `pipe()` is also available on `ExecutableQuery`: ```typescript // Define a pagination fragment for executable queries const paginate = (q) => q.orderBy("u", "name", "asc").limit(10).offset(20); const results = await store .query() .from("User", "u") .select((ctx) => ({ name: ctx.u.name, email: ctx.u.email })) .pipe(paginate) .execute(); ``` ## Real-World Patterns ### Search with Conditional Filters ```typescript function searchUsers(filters: { status?: string; role?: string; search?: string; }) { 
let query = store.query().from("User", "u"); // Apply filters conditionally using pipe if (filters.status) { query = query.pipe((q) => q.whereNode("u", ({ status }) => status.eq(filters.status)) ); } if (filters.role) { query = query.pipe((q) => q.whereNode("u", ({ role }) => role.eq(filters.role)) ); } if (filters.search) { query = query.pipe((q) => q.whereNode("u", ({ name }) => name.ilike(`%${filters.search}%`)) ); } return query.select((ctx) => ctx.u).execute(); } ``` ### Configurable Pagination ```typescript function createPaginationFragment(options: { sortField: string; sortDir: "asc" | "desc"; page: number; pageSize: number; }) { return composeFragments( orderByFragment("u", options.sortField, options.sortDir), limitFragment(options.pageSize), offsetFragment((options.page - 1) * options.pageSize) ); } // Use with any query const pagination = createPaginationFragment({ sortField: "createdAt", sortDir: "desc", page: 2, pageSize: 25, }); const results = await store .query() .from("User", "u") .pipe(pagination) .select((ctx) => ctx.u) .execute(); ``` ### Domain-Specific Query Helpers ```typescript // Create domain-specific query helpers const userQueries = { active: (q) => q.whereNode("u", ({ status }) => status.eq("active")), verified: (q) => q.whereNode("u", ({ emailVerified }) => emailVerified.eq(true)), withRole: (role: string) => (q) => q.whereNode("u", ({ role: r }) => r.eq(role)), withPosts: (q) => q.traverse("authored", "a").to("Post", "p"), recentlyActive: (q) => q.whereNode("u", ({ lastLogin }) => lastLogin.gte(new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString()) ), }; // Compose for specific use cases const activeAdmins = await store .query() .from("User", "u") .pipe(userQueries.active) .pipe(userQueries.verified) .pipe(userQueries.withRole("admin")) .select((ctx) => ctx.u) .execute(); ``` ## Type Definitions For advanced use cases, TypeGraph exports fragment type definitions: ```typescript import type { QueryFragment, FlexibleQueryFragment, 
TraversalFragment } from "@nicia-ai/typegraph"; ``` - **`QueryFragment`** - A typed fragment transformation - **`FlexibleQueryFragment`** - A fragment that works with any compatible builder - **`TraversalFragment`** - A fragment for transforming TraversalBuilder instances ## Next Steps - [Filter](/queries/filter) - Filtering with predicates - [Traverse](/queries/traverse) - Graph traversals - [Combine](/queries/combine) - Set operations # Execute > Running queries with execute(), paginate(), and stream() Execute operations run your query and retrieve results. Use `execute()` for simple queries, `paginate()` for cursor-based pagination, and `stream()` for processing large datasets. ## execute() Run the query and return all results: ```typescript const results = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ctx.p) .execute(); // results: readonly Person[] ``` ### Return Type Returns a readonly array of the selected type: ```typescript // TypeScript infers the shape from your selection const results = await store .query() .from("Person", "p") .select((ctx) => ({ name: ctx.p.name, email: ctx.p.email, })) .execute(); // results: readonly { name: string; email: string | undefined }[] ``` ## first() Get the first result or `undefined`: ```typescript const alice = await store .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("alice@example.com")) .select((ctx) => ctx.p) .first(); if (alice) { console.log(alice.name); } ``` ## count() Count matching results without fetching data: ```typescript const activeCount = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .count(); // activeCount: number ``` ## exists() Check if any results exist: ```typescript const hasActiveUsers = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .exists(); // hasActiveUsers: boolean ``` ## Cursor Pagination For large datasets, cursor-based pagination 
is more efficient than `limit`/`offset`. It uses keyset pagination which doesn't degrade as you go deeper. ### paginate() ```typescript const firstPage = await store .query() .from("Person", "p") .select((ctx) => ({ id: ctx.p.id, name: ctx.p.name, })) .orderBy("p", "name", "asc") // ORDER BY required .paginate({ first: 20 }); ``` ### Pagination Result Shape ```typescript { data: readonly T[], // The actual results hasNextPage: boolean, // More results available forward hasPrevPage: boolean, // More results available backward nextCursor: string | undefined, // Opaque cursor for next page prevCursor: string | undefined, // Opaque cursor for previous page } ``` ### Forward Pagination Use `first` and `after` to paginate forward: ```typescript // Get first page const page1 = await query.paginate({ first: 20 }); // Get next page using the cursor if (page1.hasNextPage && page1.nextCursor) { const page2 = await query.paginate({ first: 20, after: page1.nextCursor, }); } ``` ### Backward Pagination Use `last` and `before` to paginate backward: ```typescript // Get last page const lastPage = await query.paginate({ last: 20 }); // Get previous page if (lastPage.hasPrevPage && lastPage.prevCursor) { const prevPage = await query.paginate({ last: 20, before: lastPage.prevCursor, }); } ``` ### Pagination Parameters | Parameter | Type | Description | |-----------|------|-------------| | `first` | `number` | Number of results from the start | | `after` | `string` | Cursor to start after (forward pagination) | | `last` | `number` | Number of results from the end | | `before` | `string` | Cursor to start before (backward pagination) | ### Pagination with Traversals Pagination works with graph traversals: ```typescript const employeesPage = await store .query() .from("Company", "c") .whereNode("c", (c) => c.name.eq("Acme Corp")) .traverse("worksAt", "e", { direction: "in" }) .to("Person", "p") .select((ctx) => ({ id: ctx.p.id, name: ctx.p.name, role: ctx.e.role, })) .orderBy("p", 
"name", "asc") .paginate({ first: 50 }); ``` ## Streaming For very large datasets, use streaming to process results without loading everything into memory. ### stream() ```typescript const stream = store .query() .from("Event", "e") .select((ctx) => ctx.e) .orderBy("e", "createdAt", "desc") // ORDER BY required .stream({ batchSize: 1000 }); // Process results as they arrive for await (const event of stream) { console.log(event.title); await processEvent(event); } ``` ### Batch Size The `batchSize` option controls how many records are fetched per database query: ```typescript // Smaller batches: Lower memory usage, more database queries .stream({ batchSize: 100 }) // Larger batches: Higher memory usage, fewer database queries .stream({ batchSize: 5000 }) // Default is 1000 .stream() ``` ### Streaming with Processing ```typescript async function exportAllUsers(): Promise<void> { const stream = store .query() .from("User", "u") .whereNode("u", (u) => u.status.eq("active")) .select((ctx) => ({ id: ctx.u.id, email: ctx.u.email, name: ctx.u.name, })) .orderBy("u", "id", "asc") .stream({ batchSize: 500 }); let count = 0; for await (const user of stream) { await exportToExternalSystem(user); count++; if (count % 1000 === 0) { console.log(`Exported ${count} users...`); } } console.log(`Export complete: ${count} users`); } ``` ## Prepared Queries Prepared queries let you compile a query once and execute it many times with different parameter values. This eliminates recompilation overhead for repeated query shapes. ### `param(name)` Use `param()` to declare a named placeholder inside any predicate position: ```typescript import { param } from "@nicia-ai/typegraph"; ``` ### `prepare()` Call `.prepare()` on an executable query to pre-compile the AST and SQL. Returns a `PreparedQuery` that can be executed with different bindings.
```typescript const findByName = store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq(param("name"))) .select((ctx) => ctx.p) .prepare(); // Execute with different bindings — no recompilation const alices = await findByName.execute({ name: "Alice" }); const bobs = await findByName.execute({ name: "Bob" }); ``` ### Parameterized Bounds Parameters work anywhere a scalar value is accepted: ```typescript const findByAge = store .query() .from("Person", "p") .whereNode("p", (p) => p.age.between(param("minAge"), param("maxAge"))) .select((ctx) => ctx.p) .prepare(); const youngAdults = await findByAge.execute({ minAge: 18, maxAge: 25 }); const seniors = await findByAge.execute({ minAge: 65, maxAge: 120 }); ``` `prepared.execute(bindings)` validates bindings strictly: all declared parameters must be provided, and unknown binding keys are rejected. ### Supported Positions `param()` works with any scalar predicate: | Predicate | Example | |-----------|---------| | `eq` / `neq` | `p.name.eq(param("name"))` | | `gt` / `gte` / `lt` / `lte` | `p.age.gt(param("minAge"))` | | `between` | `p.age.between(param("lo"), param("hi"))` | | `contains` | `p.name.contains(param("substr"))` | | `startsWith` / `endsWith` | `p.name.startsWith(param("prefix"))` | | `like` / `ilike` | `p.email.like(param("pattern"))` | :::caution `param()` is **not** supported in `in()` / `notIn()` — the array length must be known at compile time. ::: ### Performance When the backend supports `executeRaw` (both SQLite and PostgreSQL backends do), the pre-compiled SQL text is sent directly to the database driver with substituted parameter values — zero recompilation overhead. When `executeRaw` is unavailable, the prepared query substitutes parameters into the AST and recompiles. 
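Because a `PreparedQuery` is a plain value, a common pattern is to build hot-path queries once at module scope and reuse them on every request. A sketch (the login handler and `User` node are illustrative):

```typescript
import { param } from "@nicia-ai/typegraph";

// Prepared once at module load: the AST and SQL are compiled up front.
const findUserByEmail = store
  .query()
  .from("User", "u")
  .whereNode("u", (u) => u.email.eq(param("email")))
  .select((ctx) => ctx.u)
  .prepare();

// Reused on every request with fresh bindings, with no recompilation.
async function handleLogin(email: string) {
  const users = await findUserByEmail.execute({ email });
  return users[0];
}
```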
## Query Debugging ### toAst() Get the query AST for inspection: ```typescript const builder = store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .select((ctx) => ctx.p); const ast = builder.toAst(); console.log(JSON.stringify(ast, null, 2)); ``` ### compile() Compile to SQL without executing: ```typescript const compiled = builder.compile(); console.log("SQL:", compiled.sql); console.log("Parameters:", compiled.params); ``` Useful for: - Debugging query behavior - Understanding performance characteristics - Building custom query executors ## Ordering Requirements Both `paginate()` and `stream()` require an `orderBy()` clause: ```typescript // Required for pagination .orderBy("p", "name", "asc") .paginate({ first: 20 }); // Required for streaming .orderBy("e", "createdAt", "desc") .stream(); ``` ### Stable Ordering For deterministic pagination, include a unique field in your ordering: ```typescript .orderBy("p", "name", "asc") .orderBy("p", "id", "asc") // Ensures stable ordering ``` ## Real-World Examples ### Paginated API Endpoint ```typescript async function listUsers(cursor?: string, limit = 20) { const query = store .query() .from("User", "u") .whereNode("u", (u) => u.status.eq("active")) .select((ctx) => ({ id: ctx.u.id, name: ctx.u.name, email: ctx.u.email, })) .orderBy("u", "createdAt", "desc") .orderBy("u", "id", "desc"); const result = cursor ? 
await query.paginate({ first: limit, after: cursor }) : await query.paginate({ first: limit }); return { users: result.data, nextCursor: result.nextCursor, hasMore: result.hasNextPage, }; } ``` ### Batch Processing ```typescript async function processAllOrders() { const stream = store .query() .from("Order", "o") .whereNode("o", (o) => o.status.eq("pending")) .select((ctx) => ctx.o) .orderBy("o", "createdAt", "asc") .stream({ batchSize: 100 }); for await (const order of stream) { try { await fulfillOrder(order); await store.update("Order", order.id, { status: "fulfilled" }); } catch (error) { console.error(`Failed to process order ${order.id}:`, error); } } } ``` ### Infinite Scroll ```typescript function useInfiniteUsers() { const [users, setUsers] = useState<User[]>([]); const [cursor, setCursor] = useState<string | undefined>(); const [hasMore, setHasMore] = useState(true); async function loadMore() { const result = await store .query() .from("User", "u") .select((ctx) => ctx.u) .orderBy("u", "name", "asc") .paginate({ first: 20, after: cursor }); setUsers((prev) => [...prev, ...result.data]); setCursor(result.nextCursor); setHasMore(result.hasNextPage); } return { users, loadMore, hasMore }; } ``` ## Next Steps - [Order](/queries/order) - Ordering and limiting results - [Shape](/queries/shape) - Output transformation - [Overview](/queries/overview) - Query categories reference # Filter > Reducing results with whereNode() and whereEdge() Filter operations reduce the result set based on property values. TypeGraph provides `whereNode()` for filtering nodes and `whereEdge()` for filtering edges during traversals.
## whereNode() Filter nodes based on their properties: ```typescript const engineers = await store .query() .from("Person", "p") .whereNode("p", (p) => p.role.eq("Engineer")) .select((ctx) => ctx.p) .execute(); ``` ### Parameters ```typescript .whereNode(alias, predicateFunction) ``` | Parameter | Type | Description | |-----------|------|-------------| | `alias` | `string` | The node alias to filter (must exist in query) | | `predicateFunction` | `(accessor) => Predicate` | Function that returns a predicate | The predicate function receives a typed accessor for the node's properties. ## whereEdge() Filter based on edge properties during traversals: ```typescript const highPaying = await store .query() .from("Person", "p") .traverse("worksAt", "e") .whereEdge("e", (e) => e.salary.gte(100000)) .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, salary: ctx.e.salary, })) .execute(); ``` ### Parameters ```typescript .whereEdge(alias, predicateFunction) ``` | Parameter | Type | Description | |-----------|------|-------------| | `alias` | `string` | The edge alias to filter (must exist in query) | | `predicateFunction` | `(accessor) => Predicate` | Function that returns a predicate | ## Combining Predicates ### AND Both conditions must be true: ```typescript .whereNode("p", (p) => p.status.eq("active").and(p.role.eq("admin")) ) ``` ### OR Either condition can be true: ```typescript .whereNode("p", (p) => p.role.eq("admin").or(p.role.eq("moderator")) ) ``` ### NOT Negate a condition: ```typescript .whereNode("p", (p) => p.status.eq("deleted").not() ) ``` ### Complex Combinations Build complex logic with parenthetical grouping: ```typescript .whereNode("p", (p) => p.status .eq("active") .and(p.role.eq("admin").or(p.role.eq("moderator"))) ) ``` This evaluates as: `status = 'active' AND (role = 'admin' OR role = 'moderator')` ## Multiple Filters Chain multiple `whereNode()` calls for AND logic: ```typescript const activeManagers = await store 
.query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .whereNode("p", (p) => p.role.eq("Manager")) .select((ctx) => ctx.p) .execute(); ``` This is equivalent to: ```typescript .whereNode("p", (p) => p.status.eq("active").and(p.role.eq("Manager")) ) ``` ## Filtering After Traversal Filter nodes at any point in the query: ```typescript const techCompanyEngineers = await store .query() .from("Person", "p") .whereNode("p", (p) => p.role.eq("Engineer")) .traverse("worksAt", "e") .to("Company", "c") .whereNode("c", (c) => c.industry.eq("Technology")) .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, })) .execute(); ``` ## Common Predicates Here are the most commonly used predicates. For complete reference, see [Predicates](/queries/predicates/). ### Equality ```typescript p.name.eq("Alice") // equals p.name.neq("Bob") // not equals ``` ### Comparison ```typescript p.age.gt(21) // greater than p.age.gte(21) // greater than or equal p.age.lt(65) // less than p.age.lte(65) // less than or equal p.age.between(18, 65) // inclusive range ``` ### String Matching ```typescript p.name.contains("ali") // substring match p.name.startsWith("A") // prefix match p.name.endsWith("ice") // suffix match p.email.like("%@example.com") // SQL LIKE pattern p.name.ilike("alice") // case-insensitive LIKE ``` ### Null Checks ```typescript p.deletedAt.isNull() // is null/undefined p.email.isNotNull() // is not null ``` ### List Membership ```typescript p.status.in(["active", "pending"]) p.status.notIn(["archived", "deleted"]) ``` ### Array Operations ```typescript p.tags.contains("typescript") p.tags.containsAll(["typescript", "nodejs"]) p.tags.containsAny(["typescript", "rust", "go"]) p.tags.isEmpty() p.tags.isNotEmpty() ``` ## Predicate Types by Field The available predicates depend on the field type: | Field Type | Key Predicates | |------------|----------------| | String | `eq`, `contains`, `startsWith`, `like`, `ilike` | | Number | `eq`, `gt`, `gte`, `lt`, 
`lte`, `between` | | Date | `eq`, `gt`, `gte`, `lt`, `lte`, `between` | | Array | `contains`, `containsAll`, `containsAny`, `isEmpty` | | Object | `get()`, `hasKey`, `pathEquals` | | Embedding | `similarTo()` | See [Predicates](/queries/predicates/) for complete documentation. ## Count and Existence Helpers ### Count Results ```typescript const count: number = await store .query() .from("Person", "p") .whereNode("p", (p) => p.status.eq("active")) .count(); ``` ### Check Existence ```typescript const exists: boolean = await store .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("alice@example.com")) .exists(); ``` ### Get First Result ```typescript const alice = await store .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("alice@example.com")) .select((ctx) => ctx.p) .first(); if (alice) { console.log(alice.name); } ``` ## Next Steps - [Predicates](/queries/predicates/) - Complete predicate reference - [Traverse](/queries/traverse) - Navigate relationships - [Advanced](/queries/advanced) - Subqueries with `exists()` and `inSubquery()` # Order > Control result ordering with orderBy(), limit(), and offset() Order operations control how results are sorted and how many are returned. Use `orderBy()` for sorting, `limit()` to cap results, and `offset()` for simple pagination. ## orderBy() Sort results by one or more fields: ```typescript const sorted = await store .query() .from("Person", "p") .select((ctx) => ctx.p) .orderBy((ctx) => ctx.p.name, "asc") .execute(); ``` ### Parameters ```typescript .orderBy(fieldSelector, direction?) .orderBy(alias, field, direction?) 
``` | Parameter | Type | Description | |-----------|------|-------------| | `fieldSelector` | `(ctx) => field` | Function that selects the field to sort by | | `alias` | `string` | Node/edge alias (alternative syntax) | | `field` | `string` | Field name (alternative syntax) | | `direction` | `"asc" \| "desc"` | Sort direction (default: `"asc"`) | ### Single Field ```typescript // Function syntax .orderBy((ctx) => ctx.p.name, "asc") // Alias syntax .orderBy("p", "name", "asc") ``` ### Multiple Fields Chain `orderBy()` for multi-field sorting: ```typescript const sorted = await store .query() .from("Task", "t") .select((ctx) => ctx.t) .orderBy("t", "priority", "desc") // Primary sort .orderBy("t", "createdAt", "asc") // Secondary sort .execute(); ``` Or use the array syntax: ```typescript .orderBy((ctx) => [ { field: ctx.t.priority, direction: "desc" }, { field: ctx.t.createdAt, direction: "asc" }, ]) ``` ### Null Handling Control where null values appear: ```typescript .orderBy((ctx) => ({ field: ctx.p.email, direction: "asc", nulls: "last", // or "first" })) ``` ### Ordering by Edge Properties Order by properties on traversed edges: ```typescript const employees = await store .query() .from("Company", "c") .traverse("worksAt", "e", { direction: "in" }) .to("Person", "p") .select((ctx) => ({ name: ctx.p.name, startDate: ctx.e.startDate, })) .orderBy("e", "startDate", "desc") // Most recent hires first .execute(); ``` ### Ordering Aggregated Results Order by aggregate values: ```typescript import { count, field } from "@nicia-ai/typegraph"; const topDepartments = await store .query() .from("Employee", "e") .groupBy("e", "department") .aggregate({ department: field("e", "department"), headcount: count("e"), }) .orderBy((ctx) => ctx.headcount, "desc") .execute(); ``` ## limit() Cap the number of results returned: ```typescript const top10 = await store .query() .from("Person", "p") .select((ctx) => ctx.p) .orderBy("p", "score", "desc") .limit(10) .execute(); ``` ### 
Parameters ```typescript .limit(n) ``` | Parameter | Type | Description | |-----------|------|-------------| | `n` | `number` | Maximum number of results to return | ## offset() Skip a number of results (useful for simple pagination): ```typescript const page2 = await store .query() .from("Person", "p") .select((ctx) => ctx.p) .orderBy("p", "name", "asc") .limit(10) .offset(10) // Skip first 10 results .execute(); ``` ### Parameters ```typescript .offset(n) ``` | Parameter | Type | Description | |-----------|------|-------------| | `n` | `number` | Number of results to skip | ## Simple Pagination with limit/offset ```typescript async function getPage(pageNumber: number, pageSize: number) { return store .query() .from("Person", "p") .select((ctx) => ctx.p) .orderBy("p", "name", "asc") .limit(pageSize) .offset((pageNumber - 1) * pageSize) .execute(); } // Usage const page1 = await getPage(1, 20); // Results 1-20 const page2 = await getPage(2, 20); // Results 21-40 ``` > **Note:** For large datasets, use [cursor pagination](/queries/execute#cursor-pagination) instead. > Offset-based pagination becomes slower as offset increases. 
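As a sketch, the same page-fetching helper can be rewritten with cursor-based `paginate()` (documented in [Execute](/queries/execute)) so that fetching a deep page costs roughly the same as fetching the first one. The function shape is illustrative:

```typescript
// Cursor-based replacement for getPage(): callers pass back the
// cursor from the previous page instead of a page number.
async function getPageByCursor(pageSize: number, after?: string) {
  const page = await store
    .query()
    .from("Person", "p")
    .select((ctx) => ctx.p)
    .orderBy("p", "name", "asc")
    .orderBy("p", "id", "asc") // tiebreaker for stable ordering
    .paginate({ first: pageSize, after });

  return { people: page.data, nextCursor: page.nextCursor };
}
```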
## Ordering Requirements ### For Pagination Both `paginate()` and `stream()` require an `orderBy()` clause: ```typescript // Required for pagination const page = await store .query() .from("Person", "p") .select((ctx) => ctx.p) .orderBy("p", "name", "asc") // Required .paginate({ first: 20 }); // Required for streaming const stream = store .query() .from("Event", "e") .select((ctx) => ctx.e) .orderBy("e", "createdAt", "desc") // Required .stream(); ``` ### Stable Ordering For deterministic pagination, include a unique field (like `id`) in your ordering: ```typescript .orderBy("p", "name", "asc") .orderBy("p", "id", "asc") // Ensures stable ordering when names are equal ``` ## Real-World Examples ### Leaderboard ```typescript const leaderboard = await store .query() .from("Player", "p") .select((ctx) => ({ name: ctx.p.name, score: ctx.p.score, })) .orderBy("p", "score", "desc") .limit(100) .execute(); ``` ### Recent Activity Feed ```typescript const feed = await store .query() .from("Activity", "a") .whereNode("a", (a) => a.userId.eq(currentUserId)) .select((ctx) => ctx.a) .orderBy("a", "createdAt", "desc") .limit(50) .execute(); ``` ### Paginated Search Results ```typescript async function searchProducts(query: string, page: number) { const pageSize = 20; return store .query() .from("Product", "p") .whereNode("p", (p) => p.name.ilike(`%${query}%`)) .select((ctx) => ({ id: ctx.p.id, name: ctx.p.name, price: ctx.p.price, })) .orderBy("p", "relevance", "desc") .orderBy("p", "id", "asc") .limit(pageSize) .offset((page - 1) * pageSize) .execute(); } ``` ## Next Steps - [Execute](/queries/execute) - Cursor pagination and streaming - [Shape](/queries/shape) - Output transformation - [Filter](/queries/filter) - Reducing results with predicates # Predicates > Complete reference for filtering predicates by data type Predicates are the building blocks for filtering in TypeGraph queries. Each data type has its own set of predicates optimized for that type. 
## How Predicates Work Predicates are accessed through property accessors in `whereNode()` and `whereEdge()`: ```typescript .whereNode("p", (p) => p.name.eq("Alice")) // ^accessor ^predicate ``` The accessor provides type-safe access to the field, and returns a predicate builder with methods appropriate for that field's type. Edge fields work the same way: ```typescript .whereEdge("e", (e) => e.role.eq("admin")) ``` ## Predicate Types | Type | Predicates | Section | |------|------------|---------| | All types | `eq`, `neq`, `in`, `notIn`, `isNull`, `isNotNull` | [Common](#common-predicates) | | String | `contains`, `startsWith`, `endsWith`, `like`, `ilike` | [String](#string) | | Number | `gt`, `gte`, `lt`, `lte`, `between` | [Number](#number) | | Boolean | *(common only)* | [Boolean](#boolean) | | Date | `gt`, `gte`, `lt`, `lte`, `between` | [Date](#date) | | Array | `contains`, `containsAll`, `containsAny`, `isEmpty`, `isNotEmpty`, `lengthEq/Gt/Gte/Lt/Lte` | [Array](#array) | | Object | `get`, `field`, `hasKey`, `hasPath`, `pathEquals`, `pathContains`, `pathIsNull`, `pathIsNotNull` | [Object](#object) | | Embedding | `similarTo` | [Embedding](#embedding) | | Subquery | `exists`, `notExists`, `inSubquery`, `notInSubquery` | [Subqueries](/queries/advanced) | ## Combining Predicates All predicates can be combined using logical operators: ### AND ```typescript p.status.eq("active").and(p.role.eq("admin")) ``` ### OR ```typescript p.role.eq("admin").or(p.role.eq("moderator")) ``` ### NOT ```typescript p.status.eq("deleted").not() ``` ### Complex Combinations ```typescript p.status .eq("active") .and(p.role.eq("admin").or(p.role.eq("moderator"))) ``` Parenthesization is handled automatically. Vector similarity predicates cannot be nested under `OR` or `NOT`. 
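For instance, combining a similarity predicate with a scalar filter works under `and`, but not under `or` or `not`. A sketch (the `d` alias, `embedding` field, and `queryVector` are assumptions; see the Embedding section for `similarTo()`):

```typescript
// Valid: similarity ANDed with a scalar filter
.whereNode("d", (d) =>
  d.embedding.similarTo(queryVector).and(d.status.eq("published"))
)

// Not supported: similarity nested under OR
.whereNode("d", (d) =>
  d.embedding.similarTo(queryVector).or(d.status.eq("featured"))
)
```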
--- ## Common Predicates These predicates are available on **all** field types: | Predicate | Description | SQL | |-----------|-------------|-----| | `eq(value)` | Equals | `= value` | | `neq(value)` | Not equals | `!= value` | | `in(values[])` | Value is in array | `IN (...)` | | `notIn(values[])` | Value is not in array | `NOT IN (...)` | | `isNull()` | Is null/undefined | `IS NULL` | | `isNotNull()` | Is not null | `IS NOT NULL` | `eq` and `neq` accept `param()` references for [prepared queries](/queries/execute#prepared-queries). `in` and `notIn` do **not** support `param()` because the array length must be known at compile time. --- ## String String predicates for text matching and pattern searches. ### Equality ```typescript p.name.eq("Alice") // Exact match p.name.neq("Bob") // Not equal ``` ### Substring Match ```typescript p.name.contains("ali") // Case-insensitive substring match ``` ### Prefix/Suffix ```typescript p.name.startsWith("A") // Case-insensitive prefix match p.name.endsWith("ice") // Case-insensitive suffix match ``` ### Pattern Matching ```typescript p.email.like("%@example.com") // SQL LIKE (case-sensitive) — % = any chars, _ = single char p.name.ilike("alice%") // Case-insensitive LIKE ``` ### List Membership ```typescript p.status.in(["active", "pending"]) p.status.notIn(["archived", "deleted"]) ``` ### Null Checks ```typescript p.email.isNull() p.email.isNotNull() ``` ### Reference | Predicate | Accepts | Description | SQL | Case | |-----------|---------|-------------|-----|------| | `eq(value)` | `string \| param()` | Exact match | `=` | sensitive | | `neq(value)` | `string \| param()` | Not equal | `!=` | sensitive | | `contains(str)` | `string \| param()` | Substring match | `ILIKE '%str%'` | insensitive | | `startsWith(str)` | `string \| param()` | Prefix match | `ILIKE 'str%'` | insensitive | | `endsWith(str)` | `string \| param()` | Suffix match | `ILIKE '%str'` | insensitive | | `like(pattern)` | `string \| param()` | SQL LIKE 
pattern | `LIKE` | sensitive | | `ilike(pattern)` | `string \| param()` | Case-insensitive LIKE | `ILIKE` | insensitive | | `in(values[])` | `string[]` | In array | `IN (...)` | sensitive | | `notIn(values[])` | `string[]` | Not in array | `NOT IN (...)` | sensitive | | `isNull()` | — | Is null | `IS NULL` | — | | `isNotNull()` | — | Is not null | `IS NOT NULL` | — | > **Wildcard escaping:** User input passed to `contains`, `startsWith`, and `endsWith` is > automatically escaped — `%` and `_` characters are treated as literals. Use `like` or `ilike` > when you need wildcard control. --- ## Number Number predicates for numeric comparisons and ranges. ### Equality ```typescript p.age.eq(30) p.age.neq(0) ``` ### Comparisons ```typescript p.salary.gt(50000) // Greater than p.salary.gte(50000) // Greater than or equal p.age.lt(65) // Less than p.age.lte(65) // Less than or equal ``` ### Range ```typescript p.age.between(18, 65) // Inclusive on both bounds ``` ### List Membership ```typescript p.priority.in([1, 2, 3]) p.priority.notIn([0]) ``` ### Null Checks ```typescript p.score.isNull() p.score.isNotNull() ``` ### Reference | Predicate | Accepts | Description | SQL | |-----------|---------|-------------|-----| | `eq(value)` | `number \| param()` | Equals | `=` | | `neq(value)` | `number \| param()` | Not equals | `!=` | | `gt(value)` | `number \| param()` | Greater than | `>` | | `gte(value)` | `number \| param()` | Greater than or equal | `>=` | | `lt(value)` | `number \| param()` | Less than | `<` | | `lte(value)` | `number \| param()` | Less than or equal | `<=` | | `between(lo, hi)` | `number \| param()` | Inclusive range | `BETWEEN lo AND hi` | | `in(values[])` | `number[]` | In array | `IN (...)` | | `notIn(values[])` | `number[]` | Not in array | `NOT IN (...)` | | `isNull()` | — | Is null | `IS NULL` | | `isNotNull()` | — | Is not null | `IS NOT NULL` | --- ## Boolean Boolean fields support only the [common predicates](#common-predicates): ```typescript 
p.isActive.eq(true) p.isActive.neq(false) p.isVerified.isNull() p.role.in(["admin", "moderator"]) // works on string enums too ``` No additional boolean-specific predicates are provided — `eq(true)` and `eq(false)` cover the typical cases. --- ## Date Date predicates for temporal comparisons. Accepts `Date` objects or ISO 8601 strings. ### Equality ```typescript p.createdAt.eq("2024-01-01") p.createdAt.neq(new Date("2024-01-01")) ``` ### Comparisons ```typescript p.createdAt.gt("2024-01-01") // After p.createdAt.gte("2024-01-01") // On or after p.createdAt.lt(new Date()) // Before now p.createdAt.lte("2024-12-31") // On or before ``` ### Range ```typescript p.createdAt.between("2024-01-01", "2024-12-31") ``` ### List Membership ```typescript p.birthday.in(["2024-01-01", "2024-07-04"]) ``` ### Null Checks ```typescript p.deletedAt.isNull() p.verifiedAt.isNotNull() ``` ### Reference | Predicate | Accepts | Description | SQL | |-----------|---------|-------------|-----| | `eq(value)` | `Date \| string \| param()` | Equals | `=` | | `neq(value)` | `Date \| string \| param()` | Not equals | `!=` | | `gt(value)` | `Date \| string \| param()` | After | `>` | | `gte(value)` | `Date \| string \| param()` | On or after | `>=` | | `lt(value)` | `Date \| string \| param()` | Before | `<` | | `lte(value)` | `Date \| string \| param()` | On or before | `<=` | | `between(lo, hi)` | `Date \| string \| param()` | Inclusive range | `BETWEEN lo AND hi` | | `in(values[])` | `(Date \| string)[]` | In array | `IN (...)` | | `notIn(values[])` | `(Date \| string)[]` | Not in array | `NOT IN (...)` | | `isNull()` | — | Is null | `IS NULL` | | `isNotNull()` | — | Is not null | `IS NOT NULL` | --- ## Array Array predicates for fields that contain arrays (e.g., `tags: z.array(z.string())`). 
### Containment

```typescript
p.tags.contains("typescript") // Has specific value
p.tags.containsAll(["typescript", "nodejs"]) // Has ALL values
p.tags.containsAny(["typescript", "rust"]) // Has ANY value
```

Containment predicates (`contains`, `containsAll`, `containsAny`) are only available when the array element type is a scalar — `string`, `number`, `boolean`, or `Date`. They will not type-check for arrays of objects or nested arrays.

### Empty Checks

```typescript
p.tags.isEmpty() // Empty array OR null
p.tags.isNotEmpty() // Has at least one element
```

### Length Predicates

```typescript
p.scores.lengthEq(3) // Exactly 3 elements
p.scores.lengthGt(0) // More than 0 elements
p.scores.lengthGte(3) // 3 or more elements
p.scores.lengthLt(10) // Fewer than 10 elements
p.scores.lengthLte(5) // 5 or fewer elements
```

### Reference

| Predicate | Accepts | Description | SQL |
|-----------|---------|-------------|-----|
| `contains(value)` | `T` | Has value | JSON array contains |
| `containsAll(values[])` | `T[]` | Has all values | AND of contains |
| `containsAny(values[])` | `T[]` | Has any value | OR of contains |
| `isEmpty()` | — | Empty or null | `IS NULL OR length = 0` |
| `isNotEmpty()` | — | Has elements | `IS NOT NULL AND length > 0` |
| `lengthEq(n)` | `number` | Exactly n elements | `json_array_length(col) = n` |
| `lengthGt(n)` | `number` | More than n | `json_array_length(col) > n` |
| `lengthGte(n)` | `number` | n or more | `json_array_length(col) >= n` |
| `lengthLt(n)` | `number` | Fewer than n | `json_array_length(col) < n` |
| `lengthLte(n)` | `number` | n or fewer | `json_array_length(col) <= n` |

> **Note:** `isEmpty()` matches both empty arrays (`[]`) and null/undefined values. Use `isNull()`
> to check specifically for null.

---

## Object

Object predicates for JSON/object fields. Supports both fluent chaining with `get()` and [JSON Pointer](https://www.rfc-editor.org/rfc/rfc6901) syntax for deep access.
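`field()`, `hasPath()`, and the `path*` predicates address values with JSON Pointer. A simplified resolver sketch, to show how a pointer like `/settings/theme` walks an object (error handling and array indices are glossed over; this is not TypeGraph's implementation):

```typescript
// Resolve an RFC 6901 pointer such as "/settings/theme" against a plain object.
function resolvePointer(doc: unknown, pointer: string): unknown {
  if (pointer === "") return doc; // empty pointer refers to the whole document
  return pointer
    .split("/")
    .slice(1) // the pointer starts with "/", so the first segment is empty
    .map((seg) => seg.replace(/~1/g, "/").replace(/~0/g, "~")) // unescape
    .reduce<unknown>(
      (cur, seg) =>
        cur !== null && typeof cur === "object"
          ? (cur as Record<string, unknown>)[seg]
          : undefined,
      doc,
    );
}

const config = { settings: { theme: "dark" }, "a/b": 1 };
console.log(resolvePointer(config, "/settings/theme")); // "dark"
console.log(resolvePointer(config, "/a~1b")); // 1
```

The array form (`["settings", "theme"]`) sidesteps the `~0`/`~1` escaping entirely, which makes it convenient for keys containing `/` or `~`.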
### Nested Access with `get()` Type-safe chaining through known keys: ```typescript p.metadata.get("theme").eq("dark") p.settings.get("notifications").get("email").eq(true) ``` `get()` returns a typed field builder — if the nested field is a string you get string predicates, if it's a number you get number predicates, and so on. ### Nested Access with `field()` Access nested fields by JSON Pointer path: ```typescript p.config.field("/settings/theme").eq("dark") p.config.field(["settings", "theme"]).eq("dark") // Array form ``` Like `get()`, `field()` returns a typed field builder for the resolved path. Use `field()` when you need to reach deeply nested paths in a single call. ### Key Existence ```typescript p.metadata.hasKey("theme") // Has top-level key ``` ### Path Operations ```typescript p.config.hasPath("/nested/key") // Has nested path p.config.pathEquals("/settings/theme", "dark") // Value at path equals scalar p.config.pathContains("/tags", "featured") // Array at path contains value p.config.pathIsNull("/optional") // Value at path is null p.config.pathIsNotNull("/required") // Value at path is not null ``` ### Reference | Predicate | Accepts | Description | |-----------|---------|-------------| | `get(key)` | `string` (key name) | Access nested field, returns typed field builder | | `field(pointer)` | `string \| string[]` (JSON Pointer) | Access field by path, returns typed field builder | | `hasKey(key)` | `string` | Has top-level key | | `hasPath(pointer)` | `string \| string[]` | Has nested path | | `pathEquals(pointer, value)` | pointer + `string \| number \| boolean \| Date` | Value at path equals scalar | | `pathContains(pointer, value)` | pointer + `string \| number \| boolean \| Date` | Array at path contains value | | `pathIsNull(pointer)` | `string \| string[]` | Value at path is null | | `pathIsNotNull(pointer)` | `string \| string[]` | Value at path is not null | > **JSON Pointer syntax:** Use `/key/nested/value` string form or `["key", 
"nested", "value"]` > array form. `pathEquals` only works on scalar values (not objects or arrays). `pathContains` > requires the path to point to an array. --- ## Embedding Embedding predicates for vector similarity search on embedding fields. ### similarTo() Find similar vectors using distance metrics: ```typescript p.embedding.similarTo(queryEmbedding, 10) // Top 10 similar (cosine) ``` ### With Options ```typescript p.embedding.similarTo(queryEmbedding, 10, { metric: "cosine", // "cosine" | "l2" | "inner_product" minScore: 0.8, // Minimum similarity threshold }) ``` ### Reference | Predicate | Accepts | Description | |-----------|---------|-------------| | `similarTo(embedding, k)` | `number[], number` | Top k most similar vectors (cosine) | | `similarTo(embedding, k, opts)` | `number[], number, SimilarToOptions` | Top k with custom metric and threshold | ### Distance Metrics | Metric | Description | Range | Default | Best For | |--------|-------------|-------|---------|----------| | `cosine` | Cosine similarity | 0–1 (1 = identical) | Yes | Normalized embeddings, semantic similarity | | `l2` | Euclidean distance | 0–∞ (0 = identical) | | Absolute distances, unnormalized vectors | | `inner_product` | Inner product (PostgreSQL only) | -∞ to ∞ | | Maximum Inner Product Search (MIPS) | ### Example: Semantic Search ```typescript const similar = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryEmbedding, 20, { metric: "cosine", minScore: 0.7, }) ) .select((ctx) => ({ id: ctx.d.id, title: ctx.d.title, content: ctx.d.content, })) .execute(); ``` > **Limitations:** Results are automatically ordered by similarity (most similar first). > `similarTo` cannot be nested under `OR` or `NOT`. SQLite does not support embeddings — > vector search requires PostgreSQL with pgvector. --- ## Parameterized Predicates Use `param(name)` to create a named placeholder for [prepared queries](/queries/execute#prepared-queries). 
```typescript import { param } from "@nicia-ai/typegraph"; const prepared = store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq(param("name"))) .select((ctx) => ctx.p) .prepare(); const results = await prepared.execute({ name: "Alice" }); ``` ### Supported Positions | Position | Supported | Example | |----------|-----------|---------| | Scalar comparisons (`eq`, `neq`, `gt`, `gte`, `lt`, `lte`) | Yes | `p.age.gt(param("minAge"))` | | `between` bounds | Yes | `p.age.between(param("lo"), param("hi"))` | | String operations (`contains`, `startsWith`, `endsWith`, `like`, `ilike`) | Yes | `p.name.contains(param("search"))` | | `in` / `notIn` | No | Array length must be known at compile time | | Array predicates | No | — | | Subquery predicates | No | — | See [Prepared Queries](/queries/execute#prepared-queries) for full usage and performance details. ## Next Steps - [Filter](/queries/filter) — Using predicates in queries - [Subqueries](/queries/advanced) — `exists()`, `notExists()`, `inSubquery()`, `notInSubquery()` - [Overview](/queries/overview) — Query builder categories # Recursive Traversals > Variable-length path traversals with recursive() Graph queries often need to follow edges to an unknown depth: find all ancestors in a hierarchy, all transitive dependencies of a package, or everyone reachable within six degrees of separation. In a relational database, each depth level requires another self-join — and you have to know the depth ahead of time. Recursive traversals solve this by walking edges until a stopping condition is met. TypeGraph compiles `.recursive()` into a SQL `WITH RECURSIVE` CTE. The database engine handles the iteration, so you get the full performance of native recursive SQL without writing it by hand. 
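What `WITH RECURSIVE` does can be pictured as repeated frontier expansion: seed with the source nodes, then join one more edge hop per iteration until no new rows appear. A plain-TypeScript analogue over a toy edge list (an illustration of the idea, not TypeGraph internals):

```typescript
type Edge = { from: string; to: string };

// reportsTo chain: Alice → Bob → Carol → Dana
const edges: Edge[] = [
  { from: "Alice", to: "Bob" },
  { from: "Bob", to: "Carol" },
  { from: "Carol", to: "Dana" },
];

function reachable(start: string, maxHops = 100): string[] {
  const found: string[] = [];
  let frontier = [start];
  // maxHops plays the same role here as the depth cap on .recursive()
  for (let hop = 1; hop <= maxHops && frontier.length > 0; hop++) {
    // One recursive step: follow every edge leaving the current frontier.
    frontier = frontier.flatMap((node) =>
      edges.filter((e) => e.from === node).map((e) => e.to),
    );
    found.push(...frontier);
  }
  return found;
}

console.log(reachable("Alice")); // ["Bob", "Carol", "Dana"]
```

The database runs this iteration natively inside the CTE, so the depth never needs to be known when the query is written.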
## How It Works A recursive traversal starts from a set of source nodes and repeatedly follows edges, accumulating results at each level: ```text Level 0: Alice │ reportsTo Level 1: Bob │ reportsTo Level 2: Carol │ reportsTo Level 3: Dana (CEO) ``` With `.recursive()`, a single query returns Bob, Carol, and Dana — regardless of how deep the chain goes. Without it, you'd need to know there are exactly 3 levels and chain 3 traversals manually. ## Basic Usage Add `.recursive()` between `.traverse()` and `.to()`: ```typescript const allManagers = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("reportsTo", "e") .recursive() .to("Person", "manager") .select((ctx) => ({ employee: ctx.p.name, manager: ctx.manager.name, })) .execute(); // Returns every manager above Alice, at any depth ``` ## Options Reference ```typescript .recursive(options?) ``` | Option | Type | Default | Description | |--------|------|---------|-------------| | `minHops` | `number` | `1` | Minimum traversal depth before including results | | `maxHops` | `number` | `100`* | Maximum traversal depth | | `cyclePolicy` | `"prevent" \| "allow"` | `"prevent"` | How to handle cycles | | `depth` | `boolean \| string` | — | Expose hop count in `select()` context | | `path` | `boolean \| string` | — | Expose node ID path in `select()` context | *When `maxHops` is omitted, an implicit cap of 100 is applied. See [Depth Limits](#depth-limits). ## Controlling Depth ### maxHops Cap the traversal depth: ```typescript const nearbyManagers = await store .query() .from("Person", "p") .traverse("reportsTo", "e") .recursive({ maxHops: 3 }) .to("Person", "manager") .select((ctx) => ({ employee: ctx.p.name, manager: ctx.manager.name, })) .execute(); ``` ### minHops Skip nearby results. 
With `minHops: 2`, direct connections (1 hop) are excluded: ```typescript const distantConnections = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("knows", "e") .recursive({ minHops: 2 }) .to("Person", "friend") .select((ctx) => ({ person: ctx.p.name, distantFriend: ctx.friend.name, })) .execute(); ``` ### Combining minHops and maxHops ```typescript // Friends-of-friends: 2–4 hops away .recursive({ minHops: 2, maxHops: 4 }) ``` `minHops` must be ≤ `maxHops` when both are specified. ## Tracking Depth and Path When `depth` or `path` are enabled, they become available as properties on the `select()` context. Pass a string to control the property name; pass `true` to use the default names (`"depth"` and `"path"`). ### depth Expose the hop count as a number in each result row: ```typescript const orgChart = await store .query() .from("Person", "ceo") .whereNode("ceo", (p) => p.role.eq("CEO")) .traverse("manages", "e") .recursive({ depth: "level" }) .to("Person", "employee") .select((ctx) => ({ ceo: ctx.ceo.name, employee: ctx.employee.name, level: ctx.level, // 1 = direct report, 2 = skip-level, etc. })) .execute(); ``` The string `"level"` passed to `depth` becomes `ctx.level` in the select callback — TypeScript infers this automatically, so `ctx.level` is fully typed. 
### path Expose the traversal path as an array of node IDs: ```typescript const pathsToRoot = await store .query() .from("Category", "cat") .whereNode("cat", (c) => c.name.eq("Electronics")) .traverse("parentCategory", "e") .recursive({ path: "trail" }) .to("Category", "ancestor") .select((ctx) => ({ category: ctx.cat.name, ancestor: ctx.ancestor.name, trail: ctx.trail, // Array of node IDs from start to ancestor })) .execute(); ``` ### Using both together ```typescript const networkAnalysis = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("knows", "e") .recursive({ maxHops: 6, depth: "distance", path: "route", }) .to("Person", "connection") .select((ctx) => ({ person: ctx.p.name, connection: ctx.connection.name, distance: ctx.distance, // number route: ctx.route, // string[] of node IDs })) .execute(); ``` ### Boolean shorthand Pass `true` instead of a string to use the default alias names: ```typescript .recursive({ depth: true, path: true }) // ctx.depth and ctx.path are available in select() ``` ## Cycle Detection Graphs often contain cycles: `A → B → C → A`. Without protection, a recursive traversal on this graph would loop forever. ### cyclePolicy: "prevent" (default) The default policy tracks visited nodes per path and stops when a node would be visited twice. This is safe for any graph topology: ```typescript // Safe even with circular relationships (A → B → C → A) const allReachable = await store .query() .from("Node", "start") .traverse("linkedTo", "e") .recursive() // cyclePolicy: "prevent" is the default .to("Node", "reachable") .select((ctx) => ctx.reachable.id) .execute(); ``` Under the hood, the compiled SQL maintains a path structure at each recursive step and checks whether the next node has already been visited. On PostgreSQL this uses `ARRAY` operations; on SQLite it uses string-delimited path tracking. ### cyclePolicy: "allow" Skips cycle checking entirely. 
The traversal relies solely on `maxHops` to terminate. Use this when: - You know your graph is acyclic (trees, DAGs) - You want maximum query performance and accept that nodes may appear multiple times - You're using a strict `maxHops` that prevents runaway recursion ```typescript // Tree structure — no cycles possible const ancestors = await store .query() .from("Category", "cat") .traverse("parentCategory", "e") .recursive({ maxHops: 20, cyclePolicy: "allow" }) .to("Category", "ancestor") .select((ctx) => ctx.ancestor.name) .execute(); ``` :::caution With `cyclePolicy: "allow"` on a cyclic graph, the traversal **will** revisit nodes until it hits `maxHops`. If `maxHops` is not set, the implicit cap of 100 prevents infinite recursion, but you may get many duplicate results. ::: ## Filtering During Recursion Predicates placed on the target node or edge apply **at every step** of the recursion — not just the final results. This lets you prune paths early: ```typescript // Only follow "active" edges and land on "active" nodes const activeNetwork = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("knows", "e") .whereEdge("e", (e) => e.status.eq("active")) .recursive({ maxHops: 5 }) .to("Person", "connection") .whereNode("connection", (c) => c.active.eq(true)) .select((ctx) => ctx.connection.name) .execute(); ``` Source node predicates (on `"p"` above) apply only to the starting set. Edge and target node predicates are included in the recursive CTE, so unreachable branches are pruned at each level rather than filtered after the fact. 
## Duplicate Results When a node is reachable via multiple paths, it appears once per path: ```typescript // Graph: A → B → D, A → C → D (D is reachable via two paths) const results = await store .query() .from("Node", "start") .whereNode("start", (n) => n.name.eq("A")) .traverse("linkedTo", "e") .recursive() .to("Node", "reachable") .select((ctx) => ctx.reachable.name) .execute(); // Returns: ["B", "D", "C", "D"] — D appears twice (once per path) ``` To get unique nodes, deduplicate in your application or use [set operations](/queries/combine). ## Depth Limits Two safety caps prevent runaway recursion: | Constant | Value | When it applies | |----------|-------|-----------------| | `MAX_RECURSIVE_DEPTH` | 100 | `maxHops` is omitted | | `MAX_EXPLICIT_RECURSIVE_DEPTH` | 1000 | Upper bound for explicit `maxHops` | ```typescript import { MAX_EXPLICIT_RECURSIVE_DEPTH, MAX_RECURSIVE_DEPTH, } from "@nicia-ai/typegraph"; .recursive() // Implicitly capped at 100 .recursive({ maxHops: 500 }) // Honored (≤ 1000) .recursive({ maxHops: 2000 }) // Throws UnsupportedPredicateError ``` ## Limitations - **One recursive traversal per query.** A query with multiple `.recursive()` calls throws `UnsupportedPredicateError`. If you need multiple recursive paths, run separate queries or use [set operations](/queries/combine) to merge results. - **Edge properties are not projected** in recursive results. You can filter on edge properties with `whereEdge()`, but the `select()` context only exposes the start node, target node, and any depth/path aliases. 
## Real-World Examples ### Organizational Hierarchy Find all reports (direct and indirect) under a manager: ```typescript const allReports = await store .query() .from("Person", "manager") .whereNode("manager", (p) => p.name.eq("VP Engineering")) .traverse("manages", "e") .recursive({ depth: "level" }) .to("Person", "report") .select((ctx) => ({ manager: ctx.manager.name, report: ctx.report.name, level: ctx.level, department: ctx.report.department, })) .orderBy("level", "asc") .execute(); ``` ### Dependency Graph Find all transitive dependencies of a package: ```typescript const dependencies = await store .query() .from("Package", "pkg") .whereNode("pkg", (p) => p.name.eq("my-app")) .traverse("dependsOn", "e") .recursive({ path: "chain", depth: "depth" }) .to("Package", "dep") .select((ctx) => ({ package: ctx.pkg.name, dependency: ctx.dep.name, version: ctx.dep.version, depth: ctx.depth, chain: ctx.chain, })) .orderBy("depth", "asc") .execute(); ``` ### Social Network — Friends of Friends ```typescript const recommendations = await store .query() .from("Person", "me") .whereNode("me", (p) => p.id.eq(currentUserId)) .traverse("follows", "e") .recursive({ minHops: 2, maxHops: 3 }) .to("Person", "suggestion") .select((ctx) => ({ id: ctx.suggestion.id, name: ctx.suggestion.name, })) .limit(20) .execute(); ``` ### Category Breadcrumbs ```typescript const breadcrumbs = await store .query() .from("Category", "current") .whereNode("current", (c) => c.slug.eq("smartphones")) .traverse("parentCategory", "e") .recursive({ path: "pathIds", depth: "depth" }) .to("Category", "ancestor") .select((ctx) => ({ name: ctx.ancestor.name, slug: ctx.ancestor.slug, depth: ctx.depth, })) .orderBy("depth", "desc") .execute(); // Returns: [{ name: "Root", depth: 3 }, { name: "Electronics", depth: 2 }, { name: "Phones", depth: 1 }] ``` ### Access Control — Permission Inheritance Check if a user has access through a group hierarchy: ```typescript const inheritedPermissions = await store 
.query() .from("Group", "group") .whereNode("group", (g) => g.name.eq("Engineering")) .traverse("parentGroup", "e") .recursive({ depth: "level", maxHops: 10 }) .to("Group", "ancestor") .select((ctx) => ({ group: ctx.ancestor.name, level: ctx.level, })) .execute(); // Returns: [{ group: "Product", level: 1 }, { group: "Company", level: 2 }] // Alice inherits permissions from Engineering → Product → Company ``` ## Next Steps - [Traverse](/queries/traverse) — Single-hop and multi-hop traversals - [Filter](/queries/filter) — Filter nodes and edges with predicates - [Shape](/queries/shape) — Transform output with `select()` - [Combine](/queries/combine) — Merge results from multiple queries # Shape > Transform output structure with select() and aggregate() Shape operations transform how results are returned. Use `select()` to define the output structure and `aggregate()` for grouped/aggregated results. ## select() The `select()` method defines what data to return: ```typescript const results = await store .query() .from("Person", "p") .select((ctx) => ({ name: ctx.p.name, email: ctx.p.email, })) .execute(); ``` ### Parameters ```typescript .select(selectFunction) ``` | Parameter | Type | Description | |-----------|------|-------------| | `selectFunction` | `(ctx) => T` | Function that receives a context and returns the output shape | The context provides typed access to all nodes and edges in the query via their aliases. ## Selection Patterns ### Full Node Return all properties as an object: ```typescript .select((ctx) => ctx.p) // Returns: { id, kind, name, email, ... } ``` ### Specific Fields Return only the fields you need: ```typescript .select((ctx) => ({ id: ctx.p.id, name: ctx.p.name, email: ctx.p.email, })) ``` :::tip[Performance] Selecting specific fields triggers TypeGraph's **smart select optimization**. Instead of fetching the entire `props` blob, TypeGraph generates SQL that extracts only the requested fields. 
This can significantly improve performance, especially with well-designed [indexes](/performance/indexes). ::: ### Node Metadata Include system metadata fields: ```typescript .select((ctx) => ({ id: ctx.p.id, kind: ctx.p.kind, // "Person" version: ctx.p.version, // Optimistic concurrency version createdAt: ctx.p.createdAt, updatedAt: ctx.p.updatedAt, })) ``` ### Multiple Nodes Select from multiple nodes in a traversal: ```typescript const results = await store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, role: ctx.e.role, // Edge property })) .execute(); ``` ### Nested Objects Structure output with nested objects: ```typescript .select((ctx) => ({ employee: { id: ctx.p.id, name: ctx.p.name, }, company: { id: ctx.c.id, name: ctx.c.name, }, employment: { role: ctx.e.role, startDate: ctx.e.startDate, }, })) ``` ### Renamed Fields Rename fields in the output: ```typescript .select((ctx) => ({ personName: ctx.p.name, // Renamed from 'name' companyName: ctx.c.name, // Renamed from 'name' jobTitle: ctx.e.role, // Renamed from 'role' })) ``` ## Type Inference TypeScript infers the result type from your selection: ```typescript // TypeScript infers: Array<{ name: string; email: string | undefined }> const results = await store .query() .from("Person", "p") .select((ctx) => ({ name: ctx.p.name, // string (required in schema) email: ctx.p.email, // string | undefined (optional in schema) })) .execute(); // Invalid property access caught at compile time: .select((ctx) => ({ invalid: ctx.p.nonexistent, // TypeScript error! 
})) ``` ## Optional Traversal Results When using `optionalTraverse()`, accessed nodes and edges may be `undefined`: ```typescript const results = await store .query() .from("Person", "p") .optionalTraverse("worksAt", "e") .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c?.name, // May be undefined role: ctx.e?.role, // May be undefined })) .execute(); ``` ## aggregate() Use `aggregate()` with aggregate functions for grouped queries: ```typescript import { count, sum, avg, field } from "@nicia-ai/typegraph"; const stats = await store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "c") .groupBy("c", "name") .aggregate({ companyName: field("c", "name"), employeeCount: count("p"), totalSalary: sum("e", "salary"), avgSalary: avg("e", "salary"), }) .execute(); ``` See [Aggregate](/queries/aggregate) for full aggregate documentation. ## Selecting Path Information With recursive traversals, include path and depth: ```typescript const results = await store .query() .from("Category", "cat") .traverse("parentCategory", "e") .recursive({ path: "pathIds", depth: "depth" }) .to("Category", "ancestor") .select((ctx) => ({ category: ctx.cat.name, ancestor: ctx.ancestor.name, path: ctx.pathIds, // Array of node IDs depth: ctx.depth, // Number of hops })) .execute(); ``` ## Temporal Metadata When using [temporal queries](/queries/temporal), access validity information: ```typescript const history = await store .query() .from("Article", "a") .temporal("includeEnded") .select((ctx) => ({ title: ctx.a.title, validFrom: ctx.a.validFrom, // When this version became valid validTo: ctx.a.validTo, // When superseded (undefined if current) version: ctx.a.version, // Version number })) .execute(); ``` ## Return Type `select()` returns an `ExecutableQuery` that provides: - `execute()` - Run the query and get results - `paginate()` - Cursor-based pagination - `stream()` - Stream results for large datasets - `first()` - Get the first result or 
undefined - `count()` - Count matching results - `exists()` - Check if any results exist - `toAst()` - Get the query AST - `compile()` - Compile to SQL ## Next Steps - [Aggregate](/queries/aggregate) - Grouping and aggregate functions - [Order](/queries/order) - Ordering and limiting results - [Execute](/queries/execute) - Running queries and pagination # Source > Starting queries with from() Every query starts with `from()`, which specifies the node kind to query and assigns an alias for referencing it throughout the query. ## Basic Usage ```typescript const results = await store .query() .from("Person", "p") // Start from Person nodes, alias as "p" .select((ctx) => ctx.p) .execute(); ``` ## Parameters ```typescript .from(kind, alias, options?) ``` | Parameter | Type | Description | |-----------|------|-------------| | `kind` | `string` | The node kind to query (must exist in your graph definition) | | `alias` | `string` | A unique identifier for referencing this node in the query | | `options.includeSubClasses` | `boolean` | Include nodes of subclass kinds (default: `false`) | ## Aliases The alias is used throughout the query to reference the node: ```typescript const results = await store .query() .from("Person", "person") .whereNode("person", (p) => p.status.eq("active")) // Reference in filter .orderBy("person", "name", "asc") // Reference in ordering .select((ctx) => ({ name: ctx.person.name, // Reference in selection email: ctx.person.email, })) .execute(); ``` Aliases must be unique within a query. 
TypeScript enforces this at compile time: ```typescript store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "p") // TypeScript error: alias "p" already in use ``` ## Subclass Expansion If your ontology defines subclass relationships, you can query a parent kind and include all subclasses: ```typescript // Graph definition with subclass relationships: // subClassOf(Podcast, Media) // subClassOf(Article, Media) // subClassOf(Video, Media) // Query only exact Media nodes (default behavior) const exactMedia = await store .query() .from("Media", "m") .select((ctx) => ctx.m) .execute(); // Query Media and all subclasses const allMedia = await store .query() .from("Media", "m", { includeSubClasses: true }) .select((ctx) => ({ kind: ctx.m.kind, // "Media" | "Podcast" | "Article" | "Video" title: ctx.m.title, })) .execute(); ``` When `includeSubClasses: true`: - Results include nodes of the specified kind AND all subclass kinds - The `kind` field in results reflects the actual node kind - All properties common to the parent kind are accessible ## Return Type `from()` returns a `QueryBuilder` that provides access to all query methods: - [Filter](/queries/filter) - `whereNode()`, `whereEdge()` - [Traverse](/queries/traverse) - `traverse()`, `optionalTraverse()` - [Shape](/queries/shape) - `select()`, `aggregate()` - [Order](/queries/order) - `orderBy()`, `limit()`, `offset()` - [Aggregate](/queries/aggregate) - `groupBy()`, `groupByNode()` - [Temporal](/queries/temporal) - `temporal()` - [Compose](/queries/compose) - `pipe()` ## Next Steps - [Filter](/queries/filter) - Reduce results with `whereNode()` - [Traverse](/queries/traverse) - Navigate to related nodes - [Shape](/queries/shape) - Define output with `select()` # Temporal > Time-based queries with temporal() TypeGraph tracks temporal validity for all nodes and edges. Use temporal queries to view the graph at a point in time, audit changes, or access historical data. 
## Temporal Modes The `temporal()` method controls which versions of data are returned: | Mode | Description | |------|-------------| | `"current"` | Only currently valid data (default behavior) | | `"asOf"` | Data as it existed at a specific timestamp | | `"includeEnded"` | All versions, including historical | | `"includeTombstones"` | All versions, including soft-deleted | ## Current State (Default) By default, queries return only currently valid, non-deleted data: ```typescript // Returns only current, non-deleted nodes const currentPeople = await store .query() .from("Person", "p") .select((ctx) => ctx.p) .execute(); ``` This is equivalent to: ```typescript .temporal("current") ``` ## Point-in-Time Queries (asOf) Query the graph as it existed at a specific moment: ```typescript const yesterday = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString(); const pastState = await store .query() .from("Article", "a") .temporal("asOf", yesterday) .whereNode("a", (a) => a.id.eq(articleId)) .select((ctx) => ctx.a) .execute(); ``` This returns nodes and edges that were valid at the specified timestamp, even if they've since been updated or deleted. ### Use Cases for asOf - **Auditing**: See what data looked like at a specific time - **Debugging**: Reproduce issues by querying historical state - **Compliance**: Generate point-in-time reports - **Recovery**: Find old values before an erroneous update ```typescript // What did the user's profile look like last week? 
const lastWeek = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString(); const historicalProfile = await store .query() .from("User", "u") .temporal("asOf", lastWeek) .whereNode("u", (u) => u.id.eq(userId)) .select((ctx) => ctx.u) .first(); ``` ## Including Historical Data (includeEnded) View all versions, including superseded records: ```typescript const history = await store .query() .from("Article", "a") .temporal("includeEnded") .whereNode("a", (a) => a.id.eq(articleId)) .orderBy((ctx) => ctx.a.validFrom, "desc") .select((ctx) => ({ title: ctx.a.title, validFrom: ctx.a.validFrom, validTo: ctx.a.validTo, version: ctx.a.version, })) .execute(); // Result shows all versions: // [ // { title: "Final Title", validFrom: "2024-03-01", validTo: undefined, version: 3 }, // { title: "Draft v2", validFrom: "2024-02-15", validTo: "2024-03-01", version: 2 }, // { title: "Initial Draft", validFrom: "2024-02-01", validTo: "2024-02-15", version: 1 }, // ] ``` ### Audit Trail Build a complete change history: ```typescript async function getAuditTrail(nodeId: string) { return store .query() .from("Document", "d") .temporal("includeEnded") .whereNode("d", (d) => d.id.eq(nodeId)) .select((ctx) => ({ version: ctx.d.version, title: ctx.d.title, status: ctx.d.status, validFrom: ctx.d.validFrom, validTo: ctx.d.validTo, updatedAt: ctx.d.updatedAt, })) .orderBy("d", "version", "asc") .execute(); } ``` ## Including Soft-Deleted Data (includeTombstones) Include records that have been soft-deleted: ```typescript const allIncludingDeleted = await store .query() .from("User", "u") .temporal("includeTombstones") .select((ctx) => ({ id: ctx.u.id, name: ctx.u.name, deletedAt: ctx.u.deletedAt, // Will have a value for deleted records })) .execute(); ``` ### Filtering Deleted Records ```typescript // Find only deleted records const deletedUsers = await store .query() .from("User", "u") .temporal("includeTombstones") .whereNode("u", (u) => u.deletedAt.isNotNull()) .select((ctx) => ({ id: 
ctx.u.id, name: ctx.u.name, deletedAt: ctx.u.deletedAt, })) .execute(); ``` ## Temporal Metadata Fields When querying with temporal context, these fields are available: | Field | Type | Description | |-------|------|-------------| | `validFrom` | `string \| undefined` | When this version became valid | | `validTo` | `string \| undefined` | When this version was superseded (undefined if current) | | `createdAt` | `string` | When the node was first created | | `updatedAt` | `string` | When this version was written | | `deletedAt` | `string \| undefined` | Soft-delete timestamp (undefined if not deleted) | | `version` | `number` | Optimistic concurrency version number | ```typescript .select((ctx) => ({ ...ctx.a, // All node properties validFrom: ctx.a.validFrom, validTo: ctx.a.validTo, createdAt: ctx.a.createdAt, updatedAt: ctx.a.updatedAt, deletedAt: ctx.a.deletedAt, version: ctx.a.version, })) ``` ## Temporal Traversals Temporal modes apply to traversals as well: ```typescript // See who worked at a company last year const lastYear = new Date("2023-01-01").toISOString(); const pastEmployees = await store .query() .from("Company", "c") .temporal("asOf", lastYear) .whereNode("c", (c) => c.name.eq("Acme Corp")) .traverse("worksAt", "e", { direction: "in" }) .to("Person", "p") .select((ctx) => ({ name: ctx.p.name, role: ctx.e.role, })) .execute(); ``` ## Real-World Examples ### Version Comparison Compare two versions of a document: ```typescript async function compareVersions(docId: string, v1: number, v2: number) { const versions = await store .query() .from("Document", "d") .temporal("includeEnded") .whereNode("d", (d) => d.id.eq(docId)) .select((ctx) => ctx.d) .execute(); const version1 = versions.find((v) => v.version === v1); const version2 = versions.find((v) => v.version === v2); return { version1, version2 }; } ``` ### Compliance Reporting Generate a report as of a specific date: ```typescript async function generateQuarterlyReport(quarterEnd: string) { const 
activeContracts = await store .query() .from("Contract", "c") .temporal("asOf", quarterEnd) .whereNode("c", (c) => c.status.eq("active")) .traverse("belongsTo", "e") .to("Customer", "cust") .select((ctx) => ({ contractId: ctx.c.id, value: ctx.c.value, customer: ctx.cust.name, })) .execute(); return { asOf: quarterEnd, totalContracts: activeContracts.length, totalValue: activeContracts.reduce((sum, c) => sum + c.value, 0), contracts: activeContracts, }; } ``` ### Undo/Recovery Find the previous value before an update: ```typescript async function getPreviousVersion(nodeId: string) { const versions = await store .query() .from("Document", "d") .temporal("includeEnded") .whereNode("d", (d) => d.id.eq(nodeId)) .select((ctx) => ctx.d) .orderBy("d", "version", "desc") .limit(2) .execute(); return { current: versions[0], previous: versions[1], }; } ``` ## Next Steps - [Filter](/queries/filter) - Filtering with predicates - [Traverse](/queries/traverse) - Graph traversals - [Execute](/queries/execute) - Running queries # Traverse > Navigate relationships with traverse() and optionalTraverse() Traversals let you navigate relationships in your graph. Instead of writing complex SQL joins, describe the path you want to follow. ## Single-Hop Traversal Follow one edge from a node to connected nodes: ```typescript const employments = await store .query() .from("Person", "p") .whereNode("p", (p) => p.id.eq("alice-123")) .traverse("worksAt", "e") // Follow worksAt edges .to("Company", "c") // Arrive at Company nodes .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, role: ctx.e.role, // Edge properties are accessible })) .execute(); ``` ## Parameters ### traverse() ```typescript .traverse(edgeKind, edgeAlias, options?) 
``` | Parameter | Type | Description | |-----------|------|-------------| | `edgeKind` | `string` | The edge kind to traverse | | `edgeAlias` | `string` | Unique alias for referencing this edge | | `options.direction` | `"out" \| "in"` | Traversal direction (default: `"out"`) | | `options.expand` | `"none" \| "implying" \| "inverse" \| "all"` | Ontology edge expansion mode (default: `"inverse"`) | | `options.from` | `string` | Fan-out from a different node alias | ### optionalTraverse() ```typescript .optionalTraverse(edgeKind, edgeAlias, options?) ``` Uses the same options as `traverse()`, but returns optional edge/node values in the result context. ### to() ```typescript .to(nodeKind, nodeAlias, options?) ``` | Parameter | Type | Description | |-----------|------|-------------| | `nodeKind` | `string` | The target node kind | | `nodeAlias` | `string` | Unique alias for referencing this node | | `options.includeSubClasses` | `boolean` | Include subclass kinds (default: `false`) | ## Direction By default, traversals follow edges in their defined direction (from → to). Use `direction: "in"` to traverse backwards: ```typescript // Edge definition: worksAt goes from Person → Company // Forward: Find companies where Alice works .from("Person", "p") .traverse("worksAt", "e") // Person → Company .to("Company", "c") // Backward: Find people who work at Acme .from("Company", "c") .whereNode("c", (c) => c.name.eq("Acme")) .traverse("worksAt", "e", { direction: "in" }) // Company ← Person .to("Person", "p") ``` ## Edge Properties Edges can carry properties. 
Access them through the edge alias: ```typescript const employments = await store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, role: ctx.e.role, // Edge property salary: ctx.e.salary, // Edge property startDate: ctx.e.startDate, // Edge property })) .execute(); ``` ### Edge Object Structure Each edge provides these fields: | Property | Type | Description | |----------|------|-------------| | `id` | `string` | Unique edge identifier | | `kind` | `string` | Edge type name | | `fromId` | `string` | ID of the source node | | `toId` | `string` | ID of the target node | | `meta.createdAt` | `string` | When the edge was created | | `meta.updatedAt` | `string` | When the edge was last updated | | `meta.deletedAt` | `string \| undefined` | Soft delete timestamp | | `meta.validFrom` | `string \| undefined` | Temporal validity start | | `meta.validTo` | `string \| undefined` | Temporal validity end | | *schema props* | varies | Properties defined in edge schema | ### Filtering on Edge Properties Use `whereEdge()` to filter based on edge values: ```typescript const highPaying = await store .query() .from("Person", "p") .traverse("worksAt", "e") .whereEdge("e", (e) => e.salary.gte(100000)) .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name, salary: ctx.e.salary, })) .execute(); ``` ## Multi-Hop Traversals Chain traversals to follow multiple relationships: ```typescript const projectTasks = await store .query() .from("Person", "person") .whereNode("person", (p) => p.name.eq("Alice")) .traverse("worksOn", "e1") .to("Project", "project") .traverse("hasTask", "e2") .to("Task", "task") .select((ctx) => ({ person: ctx.person.name, project: ctx.project.name, task: ctx.task.title, })) .execute(); ``` Each hop starts from the previous node set and arrives at new nodes. 
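That hop-by-hop semantics can be pictured as repeated joins over an edge list. The toy sketch below uses hypothetical in-memory data to show the idea; TypeGraph itself compiles the chain into SQL joins rather than looping in application code.

```typescript
interface EdgeRow {
  kind: string;
  fromId: string;
  toId: string;
}

// One hop: follow edges of a given kind out of the current node set.
function hop(startIds: string[], edges: EdgeRow[], kind: string): string[] {
  return edges
    .filter((e) => e.kind === kind && startIds.includes(e.fromId))
    .map((e) => e.toId);
}

// Hypothetical edge rows for the Person -> Project -> Task example.
const edges: EdgeRow[] = [
  { kind: "worksOn", fromId: "alice", toId: "apollo" },
  { kind: "hasTask", fromId: "apollo", toId: "task-1" },
  { kind: "hasTask", fromId: "apollo", toId: "task-2" },
  { kind: "hasTask", fromId: "gemini", toId: "task-3" },
];

// Mirrors .traverse("worksOn").to(...) followed by .traverse("hasTask").to(...)
const projects = hop(["alice"], edges, "worksOn"); // ["apollo"]
const tasks = hop(projects, edges, "hasTask"); // ["task-1", "task-2"]
```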
### Mixed Directions Combine forward and backward traversals: ```typescript const teamStructure = await store .query() .from("Person", "p") .traverse("worksAt", "e1") // Forward: Person → Company .to("Company", "c") .traverse("manages", "e2", { direction: "in" }) // Backward: Person ← manages .to("Person", "manager") .select((ctx) => ({ employee: ctx.p.name, company: ctx.c.name, manager: ctx.manager.name, })) .execute(); ``` ## Optional Traversals Use `optionalTraverse()` for LEFT JOIN semantics—include results even when the traversal has no matches: ```typescript const peopleWithOptionalEmployer = await store .query() .from("Person", "p") .optionalTraverse("worksAt", "e") .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c?.name, // May be undefined if no employer })) .execute(); // Includes all people, even those without a worksAt edge ``` ### Mixing Required and Optional ```typescript const employeesWithOptionalManager = await store .query() .from("Person", "p") .traverse("worksAt", "e1") // Required: must work at a company .to("Company", "c") .optionalTraverse("reportsTo", "e2") // Optional: might not have manager .to("Person", "manager") .select((ctx) => ({ employee: ctx.p.name, company: ctx.c.name, manager: ctx.manager?.name, // undefined for top-level employees })) .execute(); ``` ### Optional Edge Access With optional traversals, the edge may be `undefined`: ```typescript .select((ctx) => ({ person: ctx.p.name, company: ctx.c?.name, // Node may be undefined role: ctx.e?.role, // Edge may be undefined salary: ctx.e?.salary, })) ``` ## Ontology-Aware Traversals If your ontology defines edge implications, expand queries to include implying edges: ```typescript // Ontology: implies(marriedTo, knows), implies(bestFriends, knows) const connections = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("knows", "e", { expand: "implying" }) .to("Person", "other") .select((ctx) => ctx.other.name) 
.execute(); // Returns people connected via "knows", "marriedTo", or "bestFriends" ``` If your ontology defines inverse edge kinds, you can expand traversals to include inverse edges: ```typescript // Ontology: inverseOf(manages, managedBy) const relationships = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq("Alice")) .traverse("manages", "e", { expand: "inverse" }) .to("Person", "other") .select((ctx) => ({ name: ctx.other.name, viaEdgeKind: ctx.e.kind, })) .execute(); // Traverses both "manages" and "managedBy" ``` You can combine both options: ```typescript .traverse("knows", "e", { expand: "all" }) ``` :::note[Default expansion mode] The default expansion mode is `"inverse"`, meaning traversals automatically include inverse edge kinds from your ontology. To opt out for a single traversal, pass `expand: "none"`. To change the default for all traversals, set `queryDefaults.traversalExpansion` in `createStore` options. ::: ## Real-World Examples ### Organizational Hierarchy ```typescript const teamMembers = await store .query() .from("Person", "manager") .whereNode("manager", (p) => p.name.eq("VP Engineering")) .traverse("manages", "e") .to("Person", "report") .select((ctx) => ({ manager: ctx.manager.name, report: ctx.report.name, department: ctx.report.department, })) .execute(); ``` ### Social Graph ```typescript const friends = await store .query() .from("Person", "me") .whereNode("me", (p) => p.id.eq(currentUserId)) .traverse("follows", "e") .to("Person", "friend") .select((ctx) => ({ id: ctx.friend.id, name: ctx.friend.name, followedAt: ctx.e.createdAt, })) .orderBy("e", "createdAt", "desc") .limit(50) .execute(); ``` ### E-Commerce ```typescript const orderDetails = await store .query() .from("Order", "o") .whereNode("o", (o) => o.id.eq(orderId)) .traverse("contains", "e") .to("Product", "p") .select((ctx) => ({ product: ctx.p.name, quantity: ctx.e.quantity, unitPrice: ctx.e.unitPrice, })) .execute(); ``` ## Next Steps - 
[Recursive](/queries/recursive) - Variable-length paths with `recursive()` - [Filter](/queries/filter) - Filter nodes and edges with predicates - [Shape](/queries/shape) - Transform output with `select()` # Common Patterns > Short, focused patterns for common graph problems Recipes are **short, focused patterns** that solve a specific problem in a few code blocks. For complete, end-to-end implementations, see [Examples](/examples/document-management). | Pattern | Use Case | |---------|----------| | [RBAC](#role-based-access-control-rbac) | Permission checks through role hierarchies | | [Social Network](#social-network-followers--feeds) | Feeds, followers, friend recommendations | | [Content Versioning](#content-versioning-with-history) | Temporal queries and audit trails | | [Tagging System](#tagging-system) | Flexible categorization with tag clouds | | [Tree Navigation](#tree-navigation) | Hierarchical menus, org charts, file systems | | [Weighted Relationships](#weighted-relationships) | Scoring, relevance, confidence levels | | [Soft Deletes](#soft-deletes-with-cascade) | Safe deletion with relationship cleanup | | [Unique Constraints](#enforcing-unique-constraints) | Preventing duplicates | ## Role-Based Access Control (RBAC) TypeGraph's traversal capabilities make it excellent for modeling permission systems, where access can be inherited through roles or groups. ### Schema Definition ```typescript import { z } from "zod"; import { defineNode, defineEdge, defineGraph } from "@nicia-ai/typegraph"; // 1. Define Nodes const User = defineNode("User", { schema: z.object({ username: z.string() }), }); const Role = defineNode("Role", { schema: z.object({ name: z.string() }), }); const Permission = defineNode("Permission", { schema: z.object({ action: z.string(), resource: z.string() }), }); const Resource = defineNode("Resource", { schema: z.object({ type: z.string(), externalId: z.string() }), }); // 2. 
Define Edges
const hasRole = defineEdge("hasRole");
const hasPermission = defineEdge("hasPermission");
const appliesTo = defineEdge("appliesTo");

// 3. Define Graph (endpoints are specified here, not in defineEdge)
const rbacGraph = defineGraph({
  id: "rbac_system",
  nodes: {
    User: { type: User },
    Role: { type: Role },
    Permission: { type: Permission },
    Resource: { type: Resource },
  },
  edges: {
    hasRole: { type: hasRole, from: [User], to: [Role] },
    hasPermission: { type: hasPermission, from: [Role, User], to: [Permission] },
    appliesTo: { type: appliesTo, from: [Permission], to: [Resource] },
  },
});
```

### Checking Permissions

To check if a user has a specific permission, we can query for a path from the User to the Permission, either directly or through a Role, and confirm that the permission applies to the requested resource.

```typescript
async function checkPermission(userId: string, action: string, resourceId: string) {
  const result = await store
    .query()
    .from("User", "u")
    .whereNode("u", (p) => p.id.eq(userId))
    // Traverse optional roles
    .optionalTraverse("hasRole", "r_edge")
    .to("Role", "r")
    // From either User or Role, look for permissions
    .traverse("hasPermission", "p_edge")
    .to("Permission", "p")
    .whereNode("p", (p) => p.action.eq(action))
    // Confirm the permission applies to the requested resource
    .traverse("appliesTo", "a_edge")
    .to("Resource", "res")
    .whereNode("res", (r) => r.id.eq(resourceId))
    .select((ctx) => ctx.p.id)
    .execute();
  return result.length > 0;
}
```

## Social Network (Followers & Feeds)

Modeling social features requires efficient handling of relationships and recursive queries for recommendations.
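The friend-of-friends recommendation at the heart of this recipe is a two-hop expansion with an exclusion set: people my follows follow, minus myself and anyone I already follow. A plain in-memory sketch of that logic, using hypothetical adjacency data (the graph query version pushes the same hops into the database):

```typescript
// Hypothetical adjacency data: follower id -> followee ids.
const follows: Record<string, string[]> = {
  me: ["bob", "carol"],
  bob: ["carol", "dave"],
  carol: ["erin", "me"],
};

// Friends-of-friends, excluding myself and people I already follow.
function recommend(userId: string): string[] {
  const direct = follows[userId] ?? [];
  const out: string[] = [];
  for (const friend of direct) {
    for (const fof of follows[friend] ?? []) {
      if (fof !== userId && !direct.includes(fof) && !out.includes(fof)) {
        out.push(fof);
      }
    }
  }
  return out.sort();
}

const recs = recommend("me"); // ["dave", "erin"]
```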
### Schema Definition ```typescript const User = defineNode("User", { schema: z.object({ handle: z.string() }), }); const Post = defineNode("Post", { schema: z.object({ content: z.string(), timestamp: z.string() }), }); const follows = defineEdge("follows"); const authored = defineEdge("authored"); const socialGraph = defineGraph({ id: "social", nodes: { User: { type: User }, Post: { type: Post }, }, edges: { follows: { type: follows, from: [User], to: [User] }, authored: { type: authored, from: [User], to: [Post] }, }, }); ``` ### Generating a Feed Retrieve posts from users that the current user follows, ordered by time. ```typescript const feed = await store .query() .from("User", "me") .whereNode("me", (u) => u.id.eq(currentUserId)) .traverse("follows", "f") .to("User", "author") .traverse("authored", "p") .to("Post", "post") .select((ctx) => ({ author: ctx.author.handle, content: ctx.post.content, date: ctx.post.timestamp, })) .orderBy("post", "timestamp", "desc") .execute(); ``` ### Friend Recommendations Find "Friends of Friends" that the user doesn't follow yet. ```typescript const recommendations = await store .query() .from("User", "me") .whereNode("me", (u) => u.id.eq(currentUserId)) .traverse("follows", "f1") .to("User", "friend") .traverse("follows", "f2") .to("User", "fof") // Exclude people I already follow (simplified - in practice use EXCEPT or client filtering) .select((ctx) => ({ handle: ctx.fof.handle, })) .limit(10) .execute(); ``` ## Content Versioning with History TypeGraph has built-in support for temporal data. Every node and edge tracks `valid_from` and `valid_to` timestamps, allowing you to travel through time without complex schema changes. ### Enabling Temporal Mode Ensure your graph definition allows for history. By default, TypeGraph uses `temporalMode: "current"`, which only returns currently valid data. ```typescript const cmsGraph = defineGraph({ id: "cms", nodes: { /* ... */ }, edges: { /* ... 
*/ },
  defaults: {
    // History is always recorded; this sets the default read mode,
    // which individual queries can override with .temporal()
    temporalMode: "current",
  },
});
```

### Updating Content

When you update a node, TypeGraph automatically:

1. Marks the old row as valid until `now()`.
2. Inserts a new row valid from `now()`.

```typescript
// 1. Create initial version
const article = await store.nodes.Article.create({
  title: "Draft 1",
  content: "Work in progress...",
});

// 2. Update it (automatically versions)
await store.nodes.Article.update(article.id, {
  title: "Final Version",
  content: "Ready to publish!",
});
```

### Querying Past States

You can query the state of the graph as it existed at any point in time using `asOf`.

```typescript
// Get the current version (Final Version)
const current = await store
  .query()
  .from("Article", "a")
  .whereNode("a", (a) => a.id.eq(article.id))
  .select((ctx) => ctx.a)
  .execute();

// Get the version from 5 minutes ago (Draft 1)
const fiveMinutesAgo = new Date(Date.now() - 5 * 60 * 1000).toISOString();
const past = await store
  .query()
  .from("Article", "a")
  .temporal("asOf", fiveMinutesAgo)
  .whereNode("a", (a) => a.id.eq(article.id))
  .select((ctx) => ctx.a)
  .execute();
```

### Audit Logs

To see the full history of changes for a specific node, you can use `includeEnded`.

```typescript
const history = await store
  .query()
  .from("Article", "a")
  .temporal("includeEnded") // Include historical rows
  .whereNode("a", (a) => a.id.eq(article.id))
  .orderBy("a", "validFrom", "desc")
  .select((ctx) => ({
    title: ctx.a.title,
    validFrom: ctx.a.validFrom,
    validTo: ctx.a.validTo,
  }))
  .execute();
```

## Tagging System

A flexible tagging system where items can have multiple tags, and you can query by tag combinations.
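Querying by tag combinations is set logic: OR is a union over the requested tags, AND is an intersection. A minimal sketch of the AND case over hypothetical data, which the AND query further down expresses as graph traversals:

```typescript
// Hypothetical item id -> tag-name data.
const itemTags: Record<string, string[]> = {
  "guide-1": ["javascript", "tutorial"],
  "guide-2": ["javascript"],
  "guide-3": ["tutorial", "photoshop"],
};

// An item matches an AND query when it carries every requested tag.
function withAllTags(required: string[]): string[] {
  return Object.keys(itemTags).filter((id) =>
    required.every((tag) => itemTags[id].includes(tag)),
  );
}

const both = withAllTags(["javascript", "tutorial"]); // ["guide-1"]
```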
### Schema

```typescript
const Item = defineNode("Item", {
  schema: z.object({ title: z.string(), type: z.string() }),
});

const Tag = defineNode("Tag", {
  schema: z.object({ name: z.string(), color: z.string().optional() }),
});

const taggedWith = defineEdge("taggedWith");

const graph = defineGraph({
  id: "tagging",
  nodes: { Item, Tag },
  edges: { taggedWith: { type: taggedWith, from: [Item], to: [Tag] } },
});
```

### Find Items by Tag

```typescript
const photoshopItems = await store
  .query()
  .from("Tag", "t")
  .whereNode("t", (t) => t.name.eq("photoshop"))
  .traverse("taggedWith", "e", { direction: "in" })
  .to("Item", "i")
  .select((ctx) => ctx.i)
  .execute();
```

### Tag Cloud (Count Items per Tag)

```typescript
import { count, field } from "@nicia-ai/typegraph";

const tagCounts = await store
  .query()
  .from("Item", "i")
  .traverse("taggedWith", "e")
  .to("Tag", "t")
  .groupBy("t", "name")
  .aggregate({
    tag: field("t", "name"),
    count: count("i"),
  })
  .execute();

// Sort by count descending
const tagCloud = tagCounts.toSorted((a, b) => b.count - a.count);
```

### Items with Multiple Tags (AND)

```typescript
// Find items tagged with BOTH "javascript" AND "tutorial"
const jsTag = await store
  .query()
  .from("Tag", "t")
  .whereNode("t", (t) => t.name.eq("javascript"))
  .select((ctx) => ctx.t)
  .first();

const tutorialTag = await store
  .query()
  .from("Tag", "t")
  .whereNode("t", (t) => t.name.eq("tutorial"))
  .select((ctx) => ctx.t)
  .first();

if (!jsTag || !tutorialTag) {
  return []; // Tags don't exist
}

// Both hops fan out from the same item alias "i", so a row survives
// only when that item carries both tags
const items = await store
  .query()
  .from("Item", "i")
  .traverse("taggedWith", "e1")
  .to("Tag", "t1")
  .whereNode("t1", (t) => t.id.eq(jsTag.id))
  .traverse("taggedWith", "e2", { from: "i" })
  .to("Tag", "t2")
  .whereNode("t2", (t) => t.id.eq(tutorialTag.id))
  .select((ctx) => ctx.i)
  .execute();
```

## Tree Navigation

Hierarchical structures like menus, org charts, or file systems.
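A breadcrumb is just a walk up the parent links until the root. In memory it looks like the sketch below (hypothetical data); the recursive ancestor query in this recipe pushes the same walk into the database:

```typescript
// Hypothetical child -> parent links; undefined marks the root.
const parentIdOf: Record<string, string | undefined> = {
  iphone: "phones",
  phones: "electronics",
  electronics: undefined,
};

// Walk up the parent links, collecting ancestors in order.
function breadcrumbOf(start: string): string[] {
  const trail: string[] = [];
  let current = parentIdOf[start];
  while (current !== undefined) {
    trail.push(current);
    current = parentIdOf[current];
  }
  return trail;
}

const trail = breadcrumbOf("iphone"); // ["phones", "electronics"]
```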
### Schema

```typescript
const Category = defineNode("Category", {
  schema: z.object({
    name: z.string(),
    slug: z.string(),
    depth: z.number().default(0),
  }),
});

const parentOf = defineEdge("parentOf");

const graph = defineGraph({
  id: "categories",
  nodes: { Category },
  edges: { parentOf: { type: parentOf, from: [Category], to: [Category] } },
});
```

### Get All Ancestors (Breadcrumb)

```typescript
const breadcrumb = await store
  .query()
  .from("Category", "c")
  .whereNode("c", (c) => c.slug.eq("electronics/phones/iphone"))
  .traverse("parentOf", "e")
  .recursive({ path: "path" })
  .to("Category", "ancestor")
  .select((ctx) => ({
    name: ctx.ancestor.name,
    slug: ctx.ancestor.slug,
  }))
  .execute();
// Returns: [{ name: "Phones", slug: "..." }, { name: "Electronics", slug: "..." }, ...]
```

### Get All Descendants

```typescript
const allChildren = await store
  .query()
  .from("Category", "root")
  .whereNode("root", (c) => c.slug.eq("electronics"))
  .traverse("parentOf", "e", { direction: "in" })
  .recursive({ depth: "level" })
  .to("Category", "child")
  .select((ctx) => ({
    name: ctx.child.name,
    level: ctx.level,
  }))
  .orderBy((ctx) => ctx.level, "asc")
  .execute();
```

### Build a Tree Structure

```typescript
interface TreeNode {
  id: string;
  name: string;
  children: TreeNode[];
}

async function buildTree(rootSlug: string): Promise<TreeNode | undefined> {
  // Anchor the tree at the root node itself; it is not part of the
  // descendant results below
  const root = await store
    .query()
    .from("Category", "root")
    .whereNode("root", (c) => c.slug.eq(rootSlug))
    .select((ctx) => ({ id: ctx.root.id, name: ctx.root.name }))
    .first();
  if (!root) return undefined;

  const descendants = await store
    .query()
    .from("Category", "root")
    .whereNode("root", (c) => c.slug.eq(rootSlug))
    .traverse("parentOf", "e", { direction: "in" })
    .recursive({ maxHops: 10 })
    .to("Category", "child")
    .select((ctx) => ({
      id: ctx.child.id,
      name: ctx.child.name,
      parentId: ctx.e.toId, // the immediate parent on the last hop
    }))
    .execute();

  // Build tree in memory
  const nodeMap = new Map<string, TreeNode>();
  nodeMap.set(root.id, { id: root.id, name: root.name, children: [] });
  for (const d of descendants) {
    nodeMap.set(d.id, { id: d.id, name: d.name, children: [] });
  }
  for (const d of descendants) {
    nodeMap.get(d.parentId)?.children.push(nodeMap.get(d.id)!);
  }
  return nodeMap.get(root.id);
}
```

## Weighted Relationships

Edges with scores for relevance, confidence, or priority.
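Two operations dominate weighted-edge recipes: thresholding plus ranking, and aggregation. In plain TypeScript over hypothetical data they amount to the sketch below; the graph queries in this recipe perform the same filtering and averaging in SQL.

```typescript
interface ScoredRelation {
  toId: string;
  score: number; // 0..1, hypothetical relevance weights
}

const relations: ScoredRelation[] = [
  { toId: "doc-2", score: 0.92 },
  { toId: "doc-3", score: 0.45 },
  { toId: "doc-4", score: 0.81 },
];

// Keep only strong relations, highest score first.
const strong = relations
  .filter((r) => r.score >= 0.8)
  .sort((a, b) => b.score - a.score);

// Mean score across all relations (what an avg() aggregate computes).
const avgScore =
  relations.reduce((sum, r) => sum + r.score, 0) / relations.length;
```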
### Schema ```typescript const Document = defineNode("Document", { schema: z.object({ title: z.string() }), }); const relatedTo = defineEdge("relatedTo", { schema: z.object({ score: z.number().min(0).max(1), type: z.enum(["similar", "cites", "extends"]), }), }); const graph = defineGraph({ id: "documents", nodes: { Document }, edges: { relatedTo: { type: relatedTo, from: [Document], to: [Document] } }, }); ``` ### Find Highly Related Documents ```typescript const related = await store .query() .from("Document", "d") .whereNode("d", (d) => d.id.eq(documentId)) .traverse("relatedTo", "e") .to("Document", "r") .whereEdge("e", (e) => e.score.gte(0.8)) .select((ctx) => ({ title: ctx.r.title, score: ctx.e.score, type: ctx.e.type, })) .orderBy((ctx) => ctx.score, "desc") .execute(); ``` ### Aggregate Relationship Scores ```typescript import { avg, count, field } from "@nicia-ai/typegraph"; const docStats = await store .query() .from("Document", "d") .traverse("relatedTo", "e") .to("Document", "r") .groupByNode("d") .aggregate({ docId: field("d", "id"), title: field("d", "title"), relationCount: count("e"), avgScore: avg("e", "score"), }) .execute(); ``` ## Soft Deletes with Cascade Delete nodes while preserving relationships for undo capability. 
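The mechanics are simple: deleting stamps a tombstone, default reads filter tombstoned rows out, and restoring clears the stamp. The sketch below models that behavior in memory; TypeGraph applies the same rules at the storage layer.

```typescript
interface DocRow {
  id: string;
  deletedAt?: string; // tombstone timestamp; undefined = live
}

const docs: DocRow[] = [{ id: "doc-1" }, { id: "doc-2" }];

// Soft delete: stamp a tombstone instead of removing the row.
function softDelete(id: string, now: string): void {
  const row = docs.find((d) => d.id === id);
  if (row) row.deletedAt = now;
}

// Restore: clear the tombstone; the row's history is untouched.
function restore(id: string): void {
  const row = docs.find((d) => d.id === id);
  if (row) row.deletedAt = undefined;
}

// Default reads only see live rows.
const live = () => docs.filter((d) => d.deletedAt === undefined);

softDelete("doc-1", "2024-04-01T00:00:00Z");
const afterDelete = live(); // only doc-2 remains visible
restore("doc-1");
const afterRestore = live(); // both rows visible again
```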
### Mark as Deleted

```typescript
// TypeGraph uses soft deletes by default
await store.nodes.Document.delete(documentId);

// The node still exists but has deleted_at set
// Queries automatically filter it out
```

### Restore Deleted Nodes

```typescript
// upsertById "un-deletes" soft-deleted nodes
await store.nodes.Document.upsertById(documentId, {
  title: "Restored Document",
  content: "...",
});
```

### Find Deleted Nodes

```typescript
// Use temporal queries to see deleted nodes
const deletedDocs = await store
  .query()
  .from("Document", "d")
  .temporal("includeTombstones")
  .whereNode("d", (d) => d.deletedAt.isNotNull())
  .select((ctx) => ({
    id: ctx.d.id,
    title: ctx.d.title,
    deletedAt: ctx.d.deletedAt,
  }))
  .execute();
```

### Cascade Delete Pattern

```typescript
async function cascadeDelete(documentId: string): Promise<void> {
  await store.transaction(async (tx) => {
    // Find all related edges
    const edges = await tx
      .query()
      .from("Document", "d")
      .whereNode("d", (d) => d.id.eq(documentId))
      .traverse("relatedTo", "e")
      .to("Document", "r")
      .select((ctx) => ({ edgeId: ctx.e.id }))
      .execute();

    // Delete edges first
    for (const { edgeId } of edges) {
      await tx.edges.relatedTo.delete(edgeId);
    }

    // Then delete the node
    await tx.nodes.Document.delete(documentId);
  });
}
```

## Enforcing Unique Constraints

Prevent duplicate nodes or relationships.
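Uniqueness checks compare values under a collation: with a case-insensitive collation, `Bob@Example.com` and `bob@example.com` collide. A sketch of that comparison, where lowercasing stands in for a real collation (actual collations handle locale rules that `toLowerCase()` does not):

```typescript
// Existing values for a unique constraint (hypothetical data).
const existingEmails = ["alice@example.com", "Bob@Example.com"];

// Case-insensitive uniqueness: normalize both sides before comparing.
function violatesCaseInsensitiveUnique(
  candidate: string,
  values: string[],
): boolean {
  const normalized = candidate.toLowerCase();
  return values.some((v) => v.toLowerCase() === normalized);
}

const clash = violatesCaseInsensitiveUnique("BOB@example.com", existingEmails); // true
const ok = violatesCaseInsensitiveUnique("carol@example.com", existingEmails); // false
```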
### Schema-Level Uniqueness

```typescript
const User = defineNode("User", {
  schema: z.object({
    email: z.string().email(),
    username: z.string(),
  }),
});

const graph = defineGraph({
  id: "users",
  nodes: {
    User: {
      type: User,
      unique: [
        { name: "user_email", fields: ["email"], scope: "kind", collation: "caseInsensitive" },
        { name: "user_username", fields: ["username"], scope: "kind", collation: "caseSensitive" },
      ],
    },
  },
  edges: {},
});
```

### Use `getOrCreateByConstraint`

```typescript
async function createOrUpdateUserByEmail(
  email: string,
  username: string
): Promise<{
  user: Node;
  action: "created" | "found" | "updated" | "resurrected";
}> {
  return store.nodes.User.getOrCreateByConstraint(
    "user_email",
    { email, username },
    { ifExists: "update" }
  );
}
```

### Use `getOrCreateByEndpoints` for Edge Deduplication

```typescript
async function followUser(followerId: string, followeeId: string): Promise<void> {
  const follower = await store.nodes.User.getById(followerId);
  const followee = await store.nodes.User.getById(followeeId);

  if (!follower || !followee) {
    throw new Error("User not found");
  }

  await store.edges.follows.getOrCreateByEndpoints(
    follower,
    followee,
    {},
    { ifExists: "return" }
  );
}
```

## Next Steps

For complete, end-to-end implementations, see the [Examples](/examples/document-management) section:

- [Document Management](/examples/document-management) - CMS with semantic search
- [Product Catalog](/examples/product-catalog) - Categories, variants, inventory
- [Workflow Engine](/examples/workflow-engine) - State machines with approvals
- [Audit Trail](/examples/audit-trail) - Complete change tracking
- [Multi-Tenant SaaS](/examples/multi-tenant) - Tenant isolation patterns

# Evolving Schemas in Production

> Step-by-step guide for safely evolving your graph schema across deployments

Your graph schema will change as your application grows. This guide covers how to make those changes safely — from adding a field to renaming a node type.
For API reference, see [Schema Migrations](/schema-management). ## How Schema Evolution Works When you call `createStoreWithSchema()`, TypeGraph: 1. Serializes your current graph definition 2. Compares it against the stored schema (by hash, then by diff) 3. **Safe changes** — auto-migrates and bumps the version 4. **Breaking changes** — throws `MigrationError` (or returns `status: "breaking"`) The key insight: TypeGraph manages **schema metadata**, not data migration. When you add an optional field, TypeGraph records that the schema now includes it. It does not alter existing rows — Zod defaults handle that at read time. ## Safe Changes These changes are backwards compatible and auto-migrate without intervention: - Adding new node types - Adding new edge types - Adding optional properties (with defaults) - Adding ontology relations ### Adding an Optional Property ```typescript // Version 1 const Person = defineNode("Person", { schema: z.object({ name: z.string(), }), }); // Version 2 — safe, auto-migrates const Person = defineNode("Person", { schema: z.object({ name: z.string(), email: z.string().optional(), }), }); ``` On startup, `createStoreWithSchema()` returns `status: "migrated"`. Existing Person nodes return `email: undefined` — no data transformation needed. ### Adding a Node Type with Edges ```typescript // Version 2 — add Company and worksAt in one deploy const Company = defineNode("Company", { schema: z.object({ name: z.string() }), }); const worksAt = defineEdge("worksAt", { schema: z.object({ role: z.string() }), }); const graph = defineGraph({ id: "my_app", nodes: { Person: { type: Person }, Company: { type: Company }, }, edges: { worksAt: { type: worksAt, from: [Person], to: [Company] }, }, }); ``` This is a single safe migration. New node and edge types don't affect existing data. 
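The safe/breaking distinction can be pictured as a diff over a type's property map: removed properties and new required properties break old data, while new optional properties do not. The simplified classifier below captures those rules; it is an illustration, not TypeGraph's actual diff algorithm, which also covers cases like type changes and removed node kinds.

```typescript
interface PropSpec {
  optional: boolean;
}
type NodeSchema = Record<string, PropSpec>;

// Simplified classifier: removals and new required properties are
// breaking; anything else here is safe.
function classifyChange(
  before: NodeSchema,
  after: NodeSchema,
): "safe" | "breaking" {
  for (const name of Object.keys(before)) {
    if (after[name] === undefined) return "breaking"; // property removed
  }
  for (const name of Object.keys(after)) {
    if (before[name] === undefined && !after[name].optional) {
      return "breaking"; // new required property, old rows can't satisfy it
    }
  }
  return "safe";
}

const v1: NodeSchema = { name: { optional: false } };
const v2: NodeSchema = { name: { optional: false }, email: { optional: true } };
const v3: NodeSchema = { fullName: { optional: false } };

const addOptional = classifyChange(v1, v2); // "safe"
const rename = classifyChange(v2, v3); // "breaking"
```

Note that a rename classifies as breaking, which is exactly why it needs the expand-contract treatment described next.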
## Breaking Changes These require explicit handling: - Removing node or edge types - Removing properties - Adding required properties (no default) - Renaming types or properties TypeGraph will throw `MigrationError` by default. You have two options: fix the schema to be backwards compatible, or use the expand-contract pattern. ## The Expand-Contract Pattern For breaking changes, use a multi-deploy strategy. This is the same pattern used in relational database migrations — deploy in phases so there's never a moment where running code is incompatible with the schema. ### Renaming a Property Rename `name` to `fullName` on Person in three deploys: #### Deploy 1 — Expand: add the new property ```typescript const Person = defineNode("Person", { schema: z.object({ name: z.string(), fullName: z.string().optional(), // New property, optional for now }), }); ``` Safe migration. Then backfill existing data: ```typescript const [store] = await createStoreWithSchema(graph, backend); const people = await store.query(Person).execute(); for (const person of people) { if (!person.properties.fullName) { await store.nodes.Person.update(person.id, { fullName: person.properties.name, }); } } ``` #### Deploy 2 — Switch: use the new property everywhere Update all application code to read/write `fullName` instead of `name`. Both properties still exist, so this deploy is safe. #### Deploy 3 — Contract: remove the old property ```typescript const Person = defineNode("Person", { schema: z.object({ fullName: z.string(), }), }); ``` This is a breaking change (removing `name`). 
Use `migrateSchema()` to force it: ```typescript import { getSchemaChanges, migrateSchema } from "@nicia-ai/typegraph/schema"; const [store, result] = await createStoreWithSchema(graph, backend, { throwOnBreaking: false, }); if (result.status === "breaking") { // We've already backfilled — safe to force migrate const activeSchema = await backend.getActiveSchema(graph.id); await migrateSchema(backend, graph, activeSchema!.version); } ``` ### Removing a Node Type #### Deploy 1 — Stop creating new instances Update application code to stop creating the deprecated node type. Existing data remains. #### Deploy 2 — Clean up references Soft-delete the deprecated nodes; edges that reference them are cleaned up according to the configured `onDelete` behavior: ```typescript // Soft-delete all nodes of the deprecated type const deprecated = await store.query(OldNode).execute(); for (const node of deprecated) { await store.nodes.OldNode.delete(node.id); } ``` #### Deploy 3 — Remove from schema Remove the node type from `defineGraph()` and force migrate. ### Changing a Property Type Change `age` from `z.string()` to `z.number()`: #### Deploy 1 — Add the new property ```typescript const Person = defineNode("Person", { schema: z.object({ age: z.string(), ageNumeric: z.number().optional(), }), }); ``` #### Deploy 2 — Backfill and switch ```typescript const people = await store.query(Person).execute(); for (const person of people) { if (person.properties.ageNumeric === undefined) { await store.nodes.Person.update(person.id, { ageNumeric: parseInt(person.properties.age, 10), }); } } ``` #### Deploy 3 — Contract Remove `age`, rename `ageNumeric` to `age` with the new type, and force migrate. ## Pre-Deploy Schema Checks Use `getSchemaChanges()` in CI to catch breaking changes before they reach production.
### CI/CD Script ```typescript import { getSchemaChanges } from "@nicia-ai/typegraph/schema"; async function checkSchema(backend: GraphBackend, graph: GraphDef) { const diff = await getSchemaChanges(backend, graph); if (!diff) { console.log("No existing schema — first deploy"); return; } if (!diff.hasChanges) { console.log("Schema unchanged"); return; } console.log("Schema changes detected:"); console.log(diff.summary); for (const change of [...diff.nodes, ...diff.edges]) { const icon = change.severity === "safe" ? "[safe]" : change.severity === "warning" ? "[warn]" : "[BREAKING]"; console.log(` ${icon} ${change.details}`); } if (diff.hasBreakingChanges) { console.error("Breaking changes require migration before deploy."); process.exit(1); } } ``` ### Staging Validation Before deploying to production, run against a staging database that mirrors production schema state: ```typescript const [store, result] = await createStoreWithSchema(graph, stagingBackend); switch (result.status) { case "initialized": console.log("Staging DB was empty — initialized"); break; case "migrated": console.log( `Auto-migrated v${result.fromVersion} → v${result.toVersion}`, ); console.log("Changes:", result.diff.summary); break; case "breaking": console.error("Would break in production. 
Fix before deploying."); process.exit(1); break; } ``` ## Testing Schema Changes ### Unit Testing Migrations Test that your migration code handles existing data correctly: ```typescript import { createStoreWithSchema, defineGraph, defineNode } from "@nicia-ai/typegraph"; import { createTestBackend } from "./test-utils"; it("migrates name to fullName", async () => { const backend = createTestBackend(); // Set up v1 with data const graphV1 = defineGraph({ id: "test", nodes: { Person: { type: PersonV1 } }, edges: {}, }); const [storeV1] = await createStoreWithSchema(graphV1, backend); await storeV1.nodes.Person.create({ name: "Alice" }); // Migrate to v2 (expand phase) const graphV2 = defineGraph({ id: "test", nodes: { Person: { type: PersonV2WithBothFields } }, edges: {}, }); const [storeV2, result] = await createStoreWithSchema(graphV2, backend); expect(result.status).toBe("migrated"); // Run backfill const people = await storeV2.query(PersonV2WithBothFields).execute(); for (const person of people) { await storeV2.nodes.Person.update(person.id, { fullName: person.properties.name, }); } // Verify const updated = await storeV2.query(PersonV2WithBothFields).execute(); expect(updated[0].properties.fullName).toBe("Alice"); }); ``` ### Previewing Changes Without Applying Use `getSchemaChanges()` to see what would change without modifying the database: ```typescript import { getSchemaChanges } from "@nicia-ai/typegraph/schema"; const diff = await getSchemaChanges(backend, newGraph); if (diff?.hasChanges) { console.log("Pending changes:", diff.summary); console.log("Breaking:", diff.hasBreakingChanges); for (const change of diff.nodes) { console.log(` ${change.severity}: ${change.details}`); } } ``` ## Version History TypeGraph preserves all schema versions in the `typegraph_schema_versions` table. Only one version is active at a time. 
```text typegraph_schema_versions ├── version 1 (initial) ← inactive ├── version 2 (added email) ← inactive └── version 3 (added Company) ← active ``` Access version history through the backend: ```typescript // Get a specific version const v1 = await backend.getSchemaVersion("my_app", 1); console.log("V1 created at:", v1?.created_at); // Get the active version const active = await backend.getActiveSchema("my_app"); console.log("Current version:", active?.version); ``` ## Summary: Change Classification | Change | Classification | Auto-Migrated? | | ------------------------------ | -------------- | -------------- | | Add node type | Safe | Yes | | Add edge type | Safe | Yes | | Add optional property | Safe | Yes | | Add ontology relation | Safe | Yes | | Add required property | Breaking | No | | Remove property | Breaking | No | | Remove node/edge type | Breaking | No | | Rename node/edge type | Breaking | No | | Change property type | Breaking | No | | Change onDelete behavior | Warning | Yes | | Change unique constraints | Warning | Yes | | Change edge cardinality | Warning | Yes | | Change edge endpoint kinds | Warning | Yes | ## Rollback If a deployment goes wrong, you can switch back to a previous schema version. Version history is always preserved — `rollbackSchema()` simply changes which version is active. ```typescript import { rollbackSchema } from "@nicia-ai/typegraph/schema"; // Roll back to version 2 await rollbackSchema(backend, "my_app", 2); ``` This does not delete newer versions. You can migrate forward again later.
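The change classification table above can be mirrored in your own deployment tooling, for example to gate CI on anything non-safe. A minimal sketch — the `SchemaChange` union and `classifyChange` helper are hypothetical names for illustration, not TypeGraph exports:

```typescript
// Hypothetical helper mirroring TypeGraph's change classification table.
type Severity = "safe" | "warning" | "breaking";

type SchemaChange =
  | "addNodeType"
  | "addEdgeType"
  | "addOptionalProperty"
  | "addOntologyRelation"
  | "addRequiredProperty"
  | "removeProperty"
  | "removeType"
  | "renameType"
  | "changePropertyType"
  | "changeOnDelete"
  | "changeUniqueConstraints"
  | "changeCardinality"
  | "changeEndpointKinds";

function classifyChange(change: SchemaChange): Severity {
  switch (change) {
    case "addNodeType":
    case "addEdgeType":
    case "addOptionalProperty":
    case "addOntologyRelation":
      return "safe"; // auto-migrated on startup
    case "changeOnDelete":
    case "changeUniqueConstraints":
    case "changeCardinality":
    case "changeEndpointKinds":
      return "warning"; // auto-migrated, but review before deploy
    default:
      return "breaking"; // needs expand-contract or a forced migration
  }
}
```

A CI gate could then fail the build on any `"breaking"` result and print a reminder for `"warning"` results, matching the severity icons used in the pre-deploy script earlier in this guide.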
## Migration Hooks Use `onBeforeMigrate` and `onAfterMigrate` for observability — logging, metrics, and alerts during schema migrations: ```typescript const [store, result] = await createStoreWithSchema(graph, backend, { onBeforeMigrate: (context) => { console.log(`Migrating ${context.graphId} v${context.fromVersion} → v${context.toVersion}`); console.log("Changes:", context.diff.summary); }, onAfterMigrate: (context) => { console.log(`Migration complete: v${context.toVersion}`); metrics.increment("schema_migrations_total"); }, }); ``` For data transformations (backfill scripts), run them explicitly after store creation rather than inside hooks. This gives you control over retries and error handling: ```typescript const [store, result] = await createStoreWithSchema(graph, backend); if (result.status === "migrated" && result.toVersion === 3) { // Backfill fullName from name for the expand phase const people = await store.query(Person).execute(); for (const person of people) { if (!person.properties.fullName) { await store.nodes.Person.update(person.id, { fullName: person.properties.name, }); } } } ``` ## Current Limitations - **No automatic data transformation.** TypeGraph tracks schema metadata changes but does not transform existing rows. Use backfill scripts (or `onAfterMigrate` hooks) for data migration. - **No rename detection.** Renaming a property looks like a removal + addition. Use the expand-contract pattern instead. - **Schema-level only.** Migrations operate on the graph definition, not on underlying database tables. TypeGraph's storage tables are schema-agnostic (nodes and edges are stored as JSON properties), so "schema migration" means updating the metadata that TypeGraph tracks, not running `ALTER TABLE`. # Schema Migrations > Schema versioning, migration, and lifecycle management For a practical guide on evolving schemas across deployments, see [Evolving Schemas in Production](/schema-evolution). ## When Do You Need Schema Management? 
As your application evolves, your graph schema changes: - **Adding features**: New node types, new properties, new relationships - **Refactoring**: Renaming types, changing property formats - **Deploying safely**: Ensuring schema changes don't break running applications Without schema management, you'd face: - No way to know if the database matches your code - Silent failures when property names change - Manual migration scripts for every deployment TypeGraph's schema management: 1. **Stores the schema in the database** alongside your data 2. **Detects changes** between your code and the stored schema 3. **Auto-migrates safe changes** (adding types, optional properties) 4. **Blocks breaking changes** until you handle them explicitly ## How It Works TypeGraph stores your graph schema in the database, enabling version tracking, safe migrations, and runtime introspection. When you create a store with `createStoreWithSchema()`, TypeGraph: 1. Serializes your graph definition to JSON 2. Compares it with the stored schema (if any) 3. 
Returns the result so you can act on it ## Schema Lifecycle When you create a store, TypeGraph can automatically manage schema versions: ```typescript import { createStoreWithSchema } from "@nicia-ai/typegraph"; const [store, result] = await createStoreWithSchema(graph, backend); switch (result.status) { case "initialized": console.log(`Schema initialized at version ${result.version}`); break; case "unchanged": console.log(`Schema unchanged at version ${result.version}`); break; case "migrated": console.log(`Migrated from v${result.fromVersion} to v${result.toVersion}`); break; case "pending": console.log(`Safe changes pending at version ${result.version}`); break; case "breaking": console.log("Breaking changes detected:", result.actions); break; } ``` ## Basic vs Managed Store TypeGraph provides two ways to create a store: ### Basic Store (No Schema Management) Use `createStore()` when you manage schema versions yourself: ```typescript import { createStore } from "@nicia-ai/typegraph"; const store = createStore(graph, backend); // No schema versioning - you handle migrations manually ``` ### Managed Store (Automatic Schema Management) Use `createStoreWithSchema()` for automatic version tracking: ```typescript import { createStoreWithSchema } from "@nicia-ai/typegraph"; const [store, result] = await createStoreWithSchema(graph, backend, { autoMigrate: true, // Auto-apply safe changes (default: true) throwOnBreaking: true, // Throw on breaking changes (default: true) onBeforeMigrate: (context) => { console.log(`Migrating ${context.graphId} from v${context.fromVersion} to v${context.toVersion}`); }, onAfterMigrate: (context) => { console.log(`Migration complete: v${context.toVersion}`); }, }); ``` ## Schema Validation Results The validation result indicates what happened during store initialization: | Status | Meaning | | ------------- | -------------------------------------------------- | | `initialized` | First run - schema version 1 was created | | `unchanged` | 
Schema matches stored version - no changes | | `migrated` | Safe changes auto-applied, new version created | | `pending` | Safe changes detected but `autoMigrate` is `false` | | `breaking` | Breaking changes detected, action required | ## Safe vs Breaking Changes ### Safe Changes (Auto-Migrated) These changes are backwards compatible and can be auto-migrated: - Adding new node types - Adding new edge types - Adding optional properties with defaults - Adding new ontology relations ### Breaking Changes (Require Manual Action) These changes require manual migration: - Removing node or edge types - Renaming node or edge types - Changing property types - Removing properties - Changing cardinality constraints to be more restrictive ## Handling Breaking Changes When breaking changes are detected: ```typescript const [store, result] = await createStoreWithSchema(graph, backend, { throwOnBreaking: false, // Don't throw, inspect instead }); if (result.status === "breaking") { console.log("Breaking changes detected:"); console.log("Summary:", result.diff.summary); console.log("Required actions:"); for (const action of result.actions) { console.log(` - ${action}`); } // Option 1: Fix your schema to be backwards compatible // Option 2: Force migration (data loss possible!) 
// import { migrateSchema } from "@nicia-ai/typegraph/schema"; // await migrateSchema(backend, graph, currentVersion); } ``` ## Schema Introspection Query the stored schema at runtime: ```typescript import { getActiveSchema, isSchemaInitialized, getSchemaChanges } from "@nicia-ai/typegraph/schema"; // Check if schema exists const initialized = await isSchemaInitialized(backend, "my_graph"); // Get the current schema const schema = await getActiveSchema(backend, "my_graph"); if (schema) { console.log("Graph ID:", schema.graphId); console.log("Version:", schema.version); console.log("Nodes:", Object.keys(schema.nodes)); console.log("Edges:", Object.keys(schema.edges)); } // Preview changes without applying const diff = await getSchemaChanges(backend, graph); if (diff?.hasChanges) { console.log("Pending changes:", diff.summary); console.log("Is backwards compatible:", !diff.hasBreakingChanges); } ``` ## Manual Migration For full control over migrations: ```typescript import { initializeSchema, migrateSchema, rollbackSchema, ensureSchema, } from "@nicia-ai/typegraph/schema"; // Initialize schema (first run only) const row = await initializeSchema(backend, graph); console.log("Created version:", row.version); // Migrate to new version const newVersion = await migrateSchema(backend, graph, currentVersion); console.log("Migrated to version:", newVersion); // Rollback to a previous version await rollbackSchema(backend, "my_graph", 1); console.log("Rolled back to version 1"); // Or use ensureSchema for automatic handling const result = await ensureSchema(backend, graph, { autoMigrate: true, throwOnBreaking: true, }); ``` ## Schema Serialization Schemas are stored as JSON documents with computed hashes for fast comparison: ```typescript import { serializeSchema, computeSchemaHash } from "@nicia-ai/typegraph/schema"; // Serialize a graph definition const serialized = serializeSchema(graph, 1); // Compute hash for comparison const hash = computeSchemaHash(serialized); ``` The 
serialized schema includes: - Graph ID and version - All node types with their Zod schemas (as JSON Schema) - All edge types with endpoints and constraints - Complete ontology relations - Uniqueness constraints and delete behaviors ## Version History TypeGraph maintains a history of all schema versions: ```text typegraph_schema_versions ├── version 1 (initial) ├── version 2 (added User node) ├── version 3 (added email property) ← active └── ... ``` Only one version is marked as "active" at a time. Previous versions are preserved for auditing and potential rollback. ## Best Practices ### 1. Use Managed Stores in Production ```typescript // Production: Use schema management const [store, result] = await createStoreWithSchema(graph, backend); // Development: Basic store is fine for rapid iteration const store = createStore(graph, backend); ``` ### 2. Check Migration Status on Startup ```typescript async function initializeApp() { const [store, result] = await createStoreWithSchema(graph, backend); if (result.status === "breaking") { console.error("Database schema incompatible with application!"); console.error("Run migrations before deploying this version."); process.exit(1); } if (result.status === "migrated") { console.log(`Schema auto-migrated to v${result.toVersion}`); } return store; } ``` ### 3. Preview Changes Before Deployment ```typescript import { getSchemaChanges } from "@nicia-ai/typegraph/schema"; // In your CI/CD pipeline or migration script const diff = await getSchemaChanges(backend, graph); if (diff?.hasChanges) { console.log("Schema changes detected:"); console.log(diff.summary); if (diff.hasBreakingChanges) { console.error("Breaking changes require manual migration!"); process.exit(1); } } ``` ### 4.
Add Properties with Defaults When adding new properties, always provide defaults to ensure backwards compatibility: ```typescript // Good: Optional with default const User = defineNode("User", { schema: z.object({ name: z.string(), // New property with default - safe migration status: z.enum(["active", "inactive"]).default("active"), }), }); // Bad: Required without default - breaking change const User = defineNode("User", { schema: z.object({ name: z.string(), status: z.enum(["active", "inactive"]), // No default! }), }); ``` # Schemas & Stores > Schema definition functions and store API reference This reference documents the schema definition functions and store API for TypeGraph. ## Schema Definition ### `defineNode(name, options)` Creates a node type definition. ```typescript import { defineNode } from "@nicia-ai/typegraph"; function defineNode<K extends string, S extends z.ZodObject<z.ZodRawShape>>( name: K, options: { schema: S; description?: string; }, ): NodeType<K, S>; ``` **Parameters:** | Parameter | Type | Description | |-----------|------|-------------| | `name` | `string` | Unique name for this node type | | `options.schema` | `z.ZodObject` | Zod object schema for node properties | | `options.description` | `string` | Optional description | **Example:** ```typescript const Person = defineNode("Person", { schema: z.object({ name: z.string(), email: z.string().email().optional(), }), description: "A person in the system", }); ``` ### `defineEdge(name, options?)` Creates an edge type definition.
```typescript import { defineEdge } from "@nicia-ai/typegraph"; function defineEdge<K extends string, S extends z.ZodObject<z.ZodRawShape>>( name: K, options?: { schema?: S; description?: string; from?: NodeType[]; to?: NodeType[]; }, ): EdgeType<K, S>; ``` **Parameters:** | Parameter | Type | Description | |-----------|------|-------------| | `name` | `string` | Unique name for this edge type | | `options.schema` | `z.ZodObject` | Optional Zod object schema (defaults to empty object) | | `options.description` | `string` | Optional description | | `options.from` | `NodeType[]` | Optional domain constraint (valid source node types) | | `options.to` | `NodeType[]` | Optional range constraint (valid target node types) | **Example:** ```typescript const worksAt = defineEdge("worksAt", { schema: z.object({ role: z.string(), startDate: z.string().optional(), }), }); const knows = defineEdge("knows"); // No schema needed ``` **With Domain/Range Constraints:** When `from` and `to` are specified, the edge carries its endpoint constraints intrinsically: ```typescript const worksAt = defineEdge("worksAt", { schema: z.object({ role: z.string(), startDate: z.string().optional(), }), from: [Person], // Domain: only Person can be the source to: [Company], // Range: only Company can be the target }); ``` **Unconstrained Edges:** Edges without `from`/`to` are unconstrained — they can connect any node type to any node type: ```typescript const sameAs = defineEdge("sameAs"); const related = defineEdge("related", { schema: z.object({ reason: z.string() }), }); ``` **Direct use in defineGraph:** Any edge type can be used directly in `defineGraph` without an `EdgeRegistration` wrapper: ```typescript const graph = defineGraph({ id: "my_graph", nodes: { Person: { type: Person }, Company: { type: Company } }, edges: { worksAt, // Constrained — uses built-in from/to sameAs, // Unconstrained — connects any node to any node }, }); ``` See [Core Concepts](/core-concepts#domain-and-range-constraints) for detailed documentation on domain/range
constraints. ### `embedding(dimensions)` Creates a Zod schema for vector embeddings with dimension validation. ```typescript import { embedding } from "@nicia-ai/typegraph"; function embedding<D extends number>(dimensions: D): EmbeddingSchema<D>; ``` **Parameters:** | Parameter | Type | Description | |-----------|------|-------------| | `dimensions` | `number` | The number of dimensions (e.g., 384, 512, 768, 1536, 3072) | **Example:** ```typescript const Document = defineNode("Document", { schema: z.object({ title: z.string(), content: z.string(), embedding: embedding(1536), // OpenAI ada-002 }), }); // Optional embeddings const Article = defineNode("Article", { schema: z.object({ content: z.string(), embedding: embedding(1536).optional(), }), }); ``` See [Semantic Search](/semantic-search) for query usage. ### `externalRef(table)` Creates a Zod schema for referencing external data sources. Use this for hybrid overlay patterns where TypeGraph stores relationships while your existing tables remain the source of truth. ```typescript import { externalRef } from "@nicia-ai/typegraph"; function externalRef<T extends string>(table: T): ExternalRefSchema<T>; ``` **Parameters:** | Parameter | Type | Description | |-----------|------|-------------| | `table` | `string` | Identifier for the external table (e.g., "users", "documents") | **Example:** ```typescript const Document = defineNode("Document", { schema: z.object({ source: externalRef("documents"), embedding: embedding(1536).optional(), }), }); // Create with explicit table reference await store.nodes.Document.create({ source: { table: "documents", id: "doc_123" }, }); // Query the external reference const results = await store .query() .from("Document", "d") .select((ctx) => ctx.d.source) .execute(); // results[0].source = { table: "documents", id: "doc_123" } ``` ### `createExternalRef(table)` Factory helper to create external reference values without repeating the table name.
```typescript import { createExternalRef } from "@nicia-ai/typegraph"; function createExternalRef<T extends string>( table: T ): (id: string) => ExternalRefValue<T>; ``` **Example:** ```typescript const docRef = createExternalRef("documents"); await store.nodes.Document.create({ source: docRef("doc_123"), // { table: "documents", id: "doc_123" } }); ``` ### `defineGraph(config)` Creates a graph definition combining nodes, edges, and ontology. ```typescript import { defineGraph } from "@nicia-ai/typegraph"; function defineGraph<G extends GraphDef>(config: { id: string; nodes: Record<string, NodeRegistration>; edges: Record<string, EdgeRegistration | EdgeType>; ontology?: OntologyRelation[]; }): G; ``` **Parameters:** | Parameter | Type | Description | |-----------|------|-------------| | `id` | `string` | Unique identifier for this graph | | `nodes` | `Record<string, NodeRegistration>` | Node type registrations | | `edges` | `Record<string, EdgeRegistration \| EdgeType>` | Edge registrations or edge types directly | | `ontology` | `OntologyRelation[]` | Optional semantic relationships | Edge entries can be: - **`EdgeRegistration`** — explicit `{ type, from, to }` with optional `cardinality` - **`EdgeType` with `from`/`to`** — uses built-in constraints - **`EdgeType` without `from`/`to`** — unconstrained, connects any node to any node **Example:** ```typescript const graph = defineGraph({ id: "my_graph", nodes: { Person: { type: Person }, Company: { type: Company, onDelete: "cascade" }, }, edges: { worksAt: { type: worksAt, from: [Person], to: [Company], cardinality: "many", }, sameAs, // Unconstrained — any→any }, ontology: [disjointWith(Person, Company)], }); ``` ## Store Creation ### `createStore(graph, backend, options?)` Creates a store instance for a graph definition.
```typescript import { createStore } from "@nicia-ai/typegraph"; function createStore<G extends GraphDef>( graph: G, backend: GraphBackend, options?: StoreOptions ): Store<G>; ``` **Options:** | Option | Type | Description | |--------|------|-------------| | `hooks` | `StoreHooks` | Observability hooks for monitoring operations | | `schema` | `SqlSchema` | Custom table name configuration | | `queryDefaults.traversalExpansion` | `TraversalExpansion` | Default ontology expansion mode for traversals (default: `"inverse"`) | **Example:** ```typescript const store = createStore(graph, backend); ``` Override the default traversal expansion: ```typescript const store = createStore(graph, backend, { queryDefaults: { traversalExpansion: "none" }, }); ``` ### `createStoreWithSchema(graph, backend, options?)` Creates a store and ensures the database schema is initialized or migrated. This is the recommended factory for production use. ```typescript import { createStoreWithSchema } from "@nicia-ai/typegraph"; function createStoreWithSchema<G extends GraphDef>( graph: G, backend: GraphBackend, options?: StoreOptions & SchemaManagerOptions ): Promise<[Store<G>, SchemaValidationResult]>; ``` **Returns:** A tuple of `[store, validationResult]` The validation result indicates what happened: - `status: "initialized"` - Schema created for the first time - `status: "unchanged"` - Schema matches, no changes needed - `status: "migrated"` - Safe changes auto-applied (additive only) - `status: "pending"` - Safe changes detected but `autoMigrate` is `false` - `status: "breaking"` - Breaking changes detected, action required **Example:** ```typescript const [store, result] = await createStoreWithSchema(graph, backend); if (result.status === "initialized") { console.log("Schema initialized at version", result.version); } else if (result.status === "migrated") { console.log(`Migrated from v${result.fromVersion} to v${result.toVersion}`); } else if (result.status === "pending") { console.log(`Safe changes pending at version
${result.version}`); } ``` **Throws:** `MigrationError` if breaking changes are detected and `throwOnBreaking` is `true` (the default). ## Store API The store provides typed node and edge collections via `store.nodes.*` and `store.edges.*`. ### Node Collections Each node type has a collection with these methods: #### Naming Guidelines Method names reflect which identifier is used to match an existing record: | If you have... | Read-only | Get-or-create | |----------------|-----------|---------------| | ID | `getById` | `upsertById` | | Unique constraint name + props | `findByConstraint` | `getOrCreateByConstraint` | | Edge endpoints (`from`, `to`) + optional `matchOn` | `findByEndpoints` | `getOrCreateByEndpoints` | #### `create(props, options?)` Creates a new node. ```typescript store.nodes.Person.create( props: { name: string; email?: string }, options?: { id?: string; validFrom?: string; validTo?: string } ): Promise<Node>; ``` #### `getById(id)` Retrieves a node by ID. ```typescript store.nodes.Person.getById(id: NodeId): Promise<Node | undefined>; ``` #### `getByIds(ids)` Retrieves multiple nodes by ID in a single query. Returns results in input order, with `undefined` for missing IDs. ```typescript store.nodes.Person.getByIds( ids: readonly NodeId[], options?: QueryOptions ): Promise<(Node | undefined)[]>; ``` When the backend supports batch lookups (`getNodes`), this executes a single `SELECT ... WHERE id IN (...)` query. Otherwise it falls back to sequential lookups. ```typescript const [alice, bob, unknown] = await store.nodes.Person.getByIds([ aliceId, bobId, "nonexistent", ]); // alice: Node // bob: Node // unknown: undefined ``` #### `update(id, props)` Updates node properties. ```typescript store.nodes.Person.update( id: NodeId, props: Partial<{ name: string; email?: string }> ): Promise<Node>; ``` #### `delete(id)` Soft-deletes a node. ```typescript store.nodes.Person.delete(id: NodeId): Promise<void>; ``` #### `hardDelete(id)` Permanently deletes a node.
This is irreversible and should be used carefully. ```typescript store.nodes.Person.hardDelete(id: NodeId): Promise<void>; ``` #### `find(options?)` Finds nodes of this kind with optional filtering and pagination. ```typescript store.nodes.Person.find(options?: { where?: (accessor) => Predicate; limit?: number; offset?: number; }): Promise<Node[]>; ``` The optional `where` predicate uses the same accessor API as `whereNode()` in the query builder: ```typescript const activeUsers = await store.nodes.Person.find({ where: (p) => p.status.eq("active"), limit: 50, }); ``` #### `count()` Counts nodes of this kind (excluding soft-deleted nodes). ```typescript store.nodes.Person.count(): Promise<number>; ``` #### `upsertById(id, props, options?)` Creates or updates a node by ID. ```typescript store.nodes.Person.upsertById( id: string, props: { name: string; email?: string }, options?: { validFrom?: string; validTo?: string } ): Promise<Node>; ``` **Behavior:** - Creates a new node if no node with the ID exists - Updates the existing node if one exists - Un-deletes soft-deleted nodes (clears `deletedAt`) #### `bulkCreate(items)` Creates multiple nodes efficiently. Uses a single multi-row INSERT when the backend supports it. ```typescript store.nodes.Person.bulkCreate( items: readonly { props: { name: string; email?: string }; id?: string; validFrom?: string; validTo?: string; }[] ): Promise<Node[]>; ``` Use `bulkInsert` when you don't need the created nodes back: ```typescript await store.nodes.Person.bulkInsert(batch); ``` #### `bulkInsert(items)` Inserts multiple nodes without returning results. This is the dedicated fast path for bulk ingestion — wrapped in a transaction when the backend supports it. ```typescript store.nodes.Person.bulkInsert( items: readonly { props: { name: string; email?: string }; id?: string; validFrom?: string; validTo?: string; }[] ): Promise<void>; ``` #### `bulkUpsertById(items)` Creates or updates multiple nodes by ID.
```typescript store.nodes.Person.bulkUpsertById( items: readonly { id: string; props: { name: string; email?: string }; validFrom?: string; validTo?: string; }[] ): Promise<Node[]>; ``` #### `bulkDelete(ids)` Soft-deletes multiple nodes. ```typescript store.nodes.Person.bulkDelete( ids: readonly NodeId[] ): Promise<void>; ``` #### `getOrCreateByConstraint(constraintName, props, options?)` Looks up an existing node by a named uniqueness constraint. Returns the match if found, or creates a new node if not. ```typescript store.nodes.Person.getOrCreateByConstraint( constraintName: string, props: { name: string; email?: string }, options?: { ifExists?: "return" | "update" } // Default: "return" ): Promise<{ node: Node; action: "created" | "found" | "updated" | "resurrected"; }>; ``` #### `bulkGetOrCreateByConstraint(constraintName, items, options?)` Batch version of `getOrCreateByConstraint`. Returns results in input order. ```typescript store.nodes.Person.bulkGetOrCreateByConstraint( constraintName: string, items: readonly { props: { name: string; email?: string }; }[], options?: { ifExists?: "return" | "update" } ): Promise< { node: Node; action: "created" | "found" | "updated" | "resurrected"; }[] >; ``` #### `findByConstraint(constraintName, props)` Looks up a node by a named uniqueness constraint without creating. Returns the matching node or `undefined`. Soft-deleted nodes are excluded. ```typescript store.nodes.Person.findByConstraint( constraintName: string, props: { name: string; email?: string } ): Promise<Node | undefined>; ``` ```typescript const alice = await store.nodes.Person.findByConstraint("email", { email: "alice@example.com", name: "Alice", }); if (alice) { console.log(alice.id, alice.properties.name); } ``` Throws `NodeConstraintNotFoundError` if the constraint name is not defined on the node type. #### `bulkFindByConstraint(constraintName, items)` Batch version of `findByConstraint`. Returns results in input order, with `undefined` for non-matches.
Deduplicates within-batch lookups automatically. ```typescript store.nodes.Person.bulkFindByConstraint( constraintName: string, items: readonly { props: { name: string; email?: string } }[] ): Promise<(Node<typeof Person> | undefined)[]>; ``` ```typescript const results = await store.nodes.Person.bulkFindByConstraint("email", [ { props: { email: "alice@example.com", name: "Alice" } }, { props: { email: "nobody@example.com", name: "Nobody" } }, { props: { email: "bob@example.com", name: "Bob" } }, ]); // results[0]: Node (Alice) // results[1]: undefined // results[2]: Node (Bob) ``` ### Edge Collections Each edge type has a type-safe collection. The `from` and `to` parameters are constrained to only accept node types declared in the edge registration. #### `create(from, to, props)` Creates an edge. TypeScript enforces valid endpoint types. ```typescript // Given: worksAt: { type: worksAt, from: [Person], to: [Company] } store.edges.worksAt.create( from: NodeRef<typeof Person>, to: NodeRef<typeof Company>, props: { role: string } ): Promise<Edge<typeof worksAt>>; // Preferred: Pass node objects directly await store.edges.worksAt.create(alice, acme, { role: "Engineer" }); // Compile error - Company is not a valid 'from' type await store.edges.worksAt.create(acme, alice, { role: "Engineer" }); ``` #### Node References Both forms are **exactly equivalent**—TypeGraph extracts `kind` and `id` from either: ```typescript // Full node object (preferred - cleaner syntax) await store.edges.worksAt.create(alice, acme, { role: "Engineer" }); // Explicit reference (useful when you only have IDs) await store.edges.worksAt.create( { kind: "Person", id: aliceId }, { kind: "Company", id: acmeId }, { role: "Engineer" } ); ``` Use the explicit `{ kind, id }` form when you have IDs but not the full node objects (e.g., from a previous query or external input). #### `getById(id)` Retrieves an edge by ID. ```typescript store.edges.worksAt.getById(id: EdgeId<typeof worksAt>): Promise<Edge<typeof worksAt> | undefined>; ``` #### `getByIds(ids)` Retrieves multiple edges by ID in a single query.
Returns results in input order, with `undefined` for missing IDs. ```typescript store.edges.worksAt.getByIds( ids: readonly EdgeId<typeof worksAt>[], options?: QueryOptions ): Promise<(Edge<typeof worksAt> | undefined)[]>; ``` ```typescript const [edge1, edge2] = await store.edges.worksAt.getByIds([id1, id2]); ``` #### `update(id, props, options?)` Updates edge properties. ```typescript store.edges.worksAt.update( id: EdgeId<typeof worksAt>, props: Partial<{ role: string }>, options?: { validTo?: string } ): Promise<Edge<typeof worksAt>>; ``` #### `findFrom(from)` Finds edges from a node. ```typescript store.edges.worksAt.findFrom( from: NodeRef<typeof Person> ): Promise<Edge<typeof worksAt>[]>; ``` #### `findTo(to)` Finds edges to a node. ```typescript store.edges.worksAt.findTo( to: NodeRef<typeof Company> ): Promise<Edge<typeof worksAt>[]>; ``` #### `find(options?)` Finds edges with endpoint filtering. ```typescript store.edges.worksAt.find(options?: { from?: NodeRef<typeof Person>; to?: NodeRef<typeof Company>; limit?: number; offset?: number; }): Promise<Edge<typeof worksAt>[]>; ``` For edge property filters, use the query builder with `whereEdge(...)`. #### `count(options?)` Counts edges matching filters. ```typescript store.edges.worksAt.count(options?: { from?: NodeRef<typeof Person>; to?: NodeRef<typeof Company>; }): Promise<number>; ``` #### `delete(id)` Soft-deletes an edge. ```typescript store.edges.worksAt.delete(id: EdgeId<typeof worksAt>): Promise<void>; ``` #### `hardDelete(id)` Permanently deletes an edge. This is irreversible and should be used carefully. ```typescript store.edges.worksAt.hardDelete(id: EdgeId<typeof worksAt>): Promise<void>; ``` #### `bulkCreate(items)` Creates multiple edges efficiently. Uses a single multi-row INSERT when the backend supports it. ```typescript store.edges.worksAt.bulkCreate( items: readonly { from: NodeRef<typeof Person>; to: NodeRef<typeof Company>; props?: { role: string }; id?: string; validFrom?: string; validTo?: string; }[] ): Promise<Edge<typeof worksAt>[]>; ``` Use `bulkInsert` for high-volume edge ingestion when you do not need returned payloads: ```typescript await store.edges.worksAt.bulkInsert(edgeBatch); ``` #### `bulkInsert(items)` Inserts multiple edges without returning results.
This is the dedicated fast path for bulk ingestion — wrapped in a transaction when the backend supports it. ```typescript store.edges.worksAt.bulkInsert( items: readonly { from: NodeRef<typeof Person>; to: NodeRef<typeof Company>; props?: { role: string }; id?: string; validFrom?: string; validTo?: string; }[] ): Promise<void>; ``` #### `bulkDelete(ids)` Soft-deletes multiple edges. ```typescript store.edges.worksAt.bulkDelete( ids: readonly EdgeId<typeof worksAt>[] ): Promise<void>; ``` #### `bulkUpsertById(items)` Creates or updates multiple edges by ID. ```typescript store.edges.worksAt.bulkUpsertById( items: readonly { id: EdgeId<typeof worksAt>; from: NodeRef<typeof Person>; to: NodeRef<typeof Company>; props?: { role: string }; validFrom?: string; validTo?: string; }[] ): Promise<Edge<typeof worksAt>[]>; ``` #### `getOrCreateByEndpoints(from, to, props, options?)` Looks up an existing edge by endpoints (and optionally by property fields via `matchOn`). Returns the match if found, or creates a new edge if not. ```typescript store.edges.worksAt.getOrCreateByEndpoints( from: NodeRef<typeof Person>, to: NodeRef<typeof Company>, props: { role: string }, options?: { matchOn?: readonly ("role")[]; // Default: [] ifExists?: "return" | "update"; // Default: "return" } ): Promise<{ edge: Edge<typeof worksAt>; action: "created" | "found" | "updated" | "resurrected"; }>; ``` #### `bulkGetOrCreateByEndpoints(items, options?)` Batch version of `getOrCreateByEndpoints`. Returns results in input order. ```typescript store.edges.worksAt.bulkGetOrCreateByEndpoints( items: readonly { from: NodeRef<typeof Person>; to: NodeRef<typeof Company>; props: { role: string }; }[], options?: { matchOn?: readonly ("role")[]; ifExists?: "return" | "update"; } ): Promise< { edge: Edge<typeof worksAt>; action: "created" | "found" | "updated" | "resurrected"; }[] >; ``` #### `findByEndpoints(from, to, options?)` Looks up an edge by its endpoints without creating. Returns the matching edge or `undefined`. Soft-deleted edges are excluded. When `matchOn` is omitted, returns the first live edge between the two endpoints. When `matchOn` is provided, filters by the specified property fields.
```typescript store.edges.knows.findByEndpoints( from: NodeRef<typeof Person>, to: NodeRef<typeof Person>, options?: { matchOn?: readonly ("relationship" | "since")[]; props?: Partial<{ relationship: string; since: string }>; } ): Promise<Edge<typeof knows> | undefined>; ``` ```typescript // Find any edge between Alice and Bob const edge = await store.edges.knows.findByEndpoints(alice, bob); // Find the specific "colleague" edge between Alice and Bob const colleague = await store.edges.knows.findByEndpoints(alice, bob, { matchOn: ["relationship"], props: { relationship: "colleague" }, }); ``` ### Transactions #### `store.transaction(fn)` Executes a callback within an atomic transaction. All operations succeed together or are rolled back together. The transaction context (`tx`) provides the same `nodes.*` and `edges.*` collection API as the store itself. ```typescript await store.transaction(async (tx) => { const person = await tx.nodes.Person.create({ name: "Alice" }); const company = await tx.nodes.Company.create({ name: "Acme" }); await tx.edges.worksAt.create(person, company, { role: "Engineer" }); }); ``` #### Return values The callback's return value is forwarded to the caller: ```typescript const personId = await store.transaction(async (tx) => { const person = await tx.nodes.Person.create({ name: "Alice" }); return person.id; }); // personId is available here ``` #### Rollback and error propagation If the callback throws, the transaction is rolled back and the error re-throws to the caller. No partial writes are persisted. ```typescript try { await store.transaction(async (tx) => { await tx.nodes.Person.create({ name: "Alice" }); throw new Error("something went wrong"); // Alice is NOT persisted — the entire transaction is rolled back }); } catch (error) { // error.message === "something went wrong" } ``` #### Nesting Transactions do **not** nest. The transaction context intentionally omits the `transaction()` method, so attempting to start a transaction inside another transaction is a compile-time error.
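One way to structure this is to write helpers that accept the transaction context rather than the store. The sketch below is illustrative: `TxLike` and `seedPeople` are hypothetical names, and `TxLike` models only the small slice of the TypeGraph context the helper actually uses.

```typescript
// Hypothetical structural type covering just the slice of the transaction
// context this helper needs; real code can use the store's own types instead.
interface TxLike {
  nodes: {
    Person: {
      create(props: { name: string }): Promise<{ id: string; name: string }>;
    };
  };
}

// The helper never opens its own transaction. The caller decides the boundary:
//   await store.transaction(async (tx) => seedPeople(tx, ["Alice", "Bob"]));
async function seedPeople(tx: TxLike, names: string[]): Promise<string[]> {
  const ids: string[] = [];
  for (const name of names) {
    const person = await tx.nodes.Person.create({ name });
    ids.push(person.id);
  }
  return ids;
}
```

Because the store exposes the same collection API, the same helper also works outside a transaction when atomicity is not required.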
If you need to compose transactional operations, pass the `tx` context through your call chain. #### Backend support Not all backends support atomic transactions. Cloudflare D1, for example, does not — calling `store.transaction()` on a D1-backed store throws a `ConfigurationError`. Check support at runtime with: ```typescript if (backend.capabilities.transactions) { await store.transaction(async (tx) => { /* ... */ }); } else { // fall back to individual operations with manual error handling } ``` ### Clear #### `store.clear()` Hard-deletes all data for the current graph: nodes, edges, uniqueness entries, embeddings, and schema versions. Resets collection caches so the store is immediately reusable. ```typescript store.clear(): Promise<void>; ``` Wrapped in a transaction when the backend supports it. Does not affect other graphs sharing the same backend. ```typescript // Wipe all data and start fresh await store.clear(); // Store is immediately reusable const person = await store.nodes.Person.create({ name: "Alice" }); ``` ### Query Builder #### `store.query()` Creates a query builder. See [Query Builder](/queries/overview) for full documentation.
```typescript const results = await store .query() .from("Person", "p") .whereNode("p", (p) => p.name.startsWith("A")) .select((ctx) => ctx.p) .execute(); ``` **Execution methods** (see [Execute](/queries/execute) for details): | Method | Returns | Description | |--------|---------|-------------| | `execute()` | `Promise<Result[]>` | Run query, return all results | | `first()` | `Promise<Result \| undefined>` | Return first result or undefined | | `count()` | `Promise<number>` | Count matching results | | `exists()` | `Promise<boolean>` | Check if any results exist | | `paginate(options)` | `Promise<Page<Result>>` | Cursor-based pagination | | `stream(options?)` | `AsyncIterable<Result[]>` | Stream results in batches | | `prepare()` | `PreparedQuery` | Pre-compile query for repeated execution with parameters | ### Registry Access #### `store.registry` Access to the type registry for ontology lookups. The registry is an internal type; use `store.registry` directly without importing its type. See [Ontology](/ontology) for registry methods. ## Observability Hooks TypeGraph supports observability hooks for monitoring and logging store operations.
### `StoreHooks` Configuration for observability callbacks: ```typescript import type { HookContext, QueryHookContext, OperationHookContext, StoreHooks, } from "@nicia-ai/typegraph"; ``` ```typescript type StoreHooks = Readonly<{ onQueryStart?: (ctx: QueryHookContext) => void; onQueryEnd?: (ctx: QueryHookContext, result: { rowCount: number; durationMs: number }) => void; onOperationStart?: (ctx: OperationHookContext) => void; onOperationEnd?: (ctx: OperationHookContext, result: { durationMs: number }) => void; onError?: (ctx: HookContext, error: Error) => void; }>; type HookContext = Readonly<{ operationId: string; graphId: string; startedAt: Date; }>; type QueryHookContext = HookContext & Readonly<{ sql: string; params: readonly unknown[]; }>; type OperationHookContext = HookContext & Readonly<{ operation: "create" | "update" | "delete"; entity: "node" | "edge"; kind: string; id: string; }>; ``` > **Note:** Batch operations (`bulkCreate`, `bulkInsert`, `bulkUpsertById`) skip per-item operation hooks for throughput. Query hooks still fire normally. **Example:** ```typescript import { createStore, type StoreHooks } from "@nicia-ai/typegraph"; const hooks: StoreHooks = { onQueryStart: (ctx) => { console.log(`[${ctx.operationId}] SQL: ${ctx.sql}`); }, onQueryEnd: (ctx, result) => { console.log(`[${ctx.operationId}] ${result.rowCount} rows in ${result.durationMs}ms`); }, onOperationStart: (ctx) => { console.log(`[${ctx.operationId}] ${ctx.operation} ${ctx.entity}:${ctx.kind}`); }, onOperationEnd: (ctx, result) => { console.log(`[${ctx.operationId}] Completed in ${result.durationMs}ms`); }, onError: (ctx, error) => { console.error(`[${ctx.operationId}] Error:`, error.message); }, }; const store = createStore(graph, backend, { hooks }); // Operations now trigger hooks await store.nodes.Person.create({ name: "Alice" }); // Logs: // [op-abc123] create node:Person // [op-abc123] SQL: INSERT INTO ... 
// [op-abc123] 1 rows in 2ms // [op-abc123] Completed in 5ms ``` # Semantic Search > Vector embeddings and similarity search for AI-powered retrieval TypeGraph supports semantic search using vector embeddings, enabling you to find semantically similar content using embedding models like OpenAI, Sentence Transformers, CLIP, or any model that produces fixed-dimension vectors. ## Overview Traditional search relies on exact keyword matching. Semantic search understands meaning—"machine learning" matches documents about "neural networks" and "AI algorithms" even without those exact words. **Key capabilities:** - Store embeddings as node properties alongside your graph data - Find the k most similar nodes using cosine, L2, or inner product distance - Combine semantic similarity with graph traversals and standard predicates - Automatic vector indexing for fast approximate nearest neighbor search ## Use Cases ### Retrieval-Augmented Generation (RAG) Build context-aware AI applications by retrieving relevant documents before generating responses: ```typescript async function ragQuery(question: string): Promise<string> { const questionEmbedding = await embed(question); const context = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(questionEmbedding, 5, { metric: "cosine", minScore: 0.7, }) ) .select((ctx) => ({ title: ctx.d.title, content: ctx.d.content, })) .execute(); return await llm.chat({ messages: [ { role: "system", content: `Answer based on this context:\n${context.map((d) => d.content).join("\n\n")}`, }, { role: "user", content: question }, ], }); } ``` ### Semantic Document Search Find documents by meaning rather than keywords: ```typescript const results = await store .query() .from("Article", "a") .whereNode("a", (a) => a.embedding .similarTo(queryEmbedding, 20) .and(a.category.eq("technology")) ) .select((ctx) => ctx.a) .execute(); ``` ### Image Similarity Use CLIP or similar vision models for image search: ```typescript const
similarImages = await store .query() .from("Image", "i") .whereNode("i", (i) => i.clipEmbedding.similarTo(queryImageEmbedding, 10)) .select((ctx) => ({ url: ctx.i.url, caption: ctx.i.caption, })) .execute(); ``` ### Product Recommendations Recommend products based on embedding similarity: ```typescript const recommendations = await store .query() .from("Product", "p") .whereNode("p", (p) => p.embedding .similarTo(referenceProductEmbedding, 10) .and(p.inStock.eq(true)) ) .select((ctx) => ctx.p) .execute(); ``` ## Database Setup Vector search requires database-specific extensions for storing and querying high-dimensional vectors efficiently. ### PostgreSQL with pgvector [pgvector](https://github.com/pgvector/pgvector) is the recommended extension for PostgreSQL. It provides: - Native `vector` column type - HNSW and IVFFlat indexes for fast approximate nearest neighbor search - Support for cosine, L2, and inner product distance **Installation:** ```sql -- Install the extension (requires superuser or database owner) CREATE EXTENSION vector; ``` **Docker setup:** ```yaml services: postgres: image: pgvector/pgvector:pg16 environment: POSTGRES_PASSWORD: password POSTGRES_DB: myapp ports: - "5432:5432" ``` **TypeGraph migration includes vector support:** ```typescript import { generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; // Generates DDL including: // - CREATE EXTENSION IF NOT EXISTS vector; // - typegraph_embeddings table with vector column const migrationSQL = generatePostgresMigrationSQL(); ``` ### SQLite with sqlite-vec [sqlite-vec](https://github.com/asg017/sqlite-vec) provides vector search for SQLite. 
It offers: - `vec_f32` type for 32-bit float vectors - Cosine and L2 distance functions - Works with any SQLite database **Installation:** ```bash npm install sqlite-vec ``` **Loading the extension:** ```typescript import Database from "better-sqlite3"; import * as sqliteVec from "sqlite-vec"; const sqlite = new Database("myapp.db"); sqliteVec.load(sqlite); ``` **Limitations:** - sqlite-vec does not support inner product distance - Use `cosine` or `l2` metrics only ### Supported Distance Metrics | Metric | PostgreSQL | SQLite | Description | |--------|------------|--------|-------------| | `cosine` | `<=>` | `vec_distance_cosine` | Cosine distance (1 - similarity). Best for normalized embeddings. | | `l2` | `<->` | `vec_distance_l2` | Euclidean distance. Good for unnormalized vectors. | | `inner_product` | `<#>` | Not supported | Negative inner product. For maximum inner product search (MIPS). | ## Schema Design ### Defining Embedding Properties Use the `embedding()` function to define vector properties with a specific dimension: ```typescript import { defineNode, embedding } from "@nicia-ai/typegraph"; import { z } from "zod"; const Document = defineNode("Document", { schema: z.object({ title: z.string(), content: z.string(), embedding: embedding(1536), // OpenAI ada-002 dimension }), }); const Image = defineNode("Image", { schema: z.object({ url: z.string(), caption: z.string().optional(), clipEmbedding: embedding(512), // CLIP ViT-B/32 dimension }), }); ``` ### Common Embedding Dimensions | Model | Dimensions | Use Case | |-------|------------|----------| | all-MiniLM-L6-v2 | 384 | Fast, lightweight text embeddings | | CLIP ViT-B/32 | 512 | Image-text multimodal | | BERT base | 768 | General text embeddings | | OpenAI ada-002 | 1536 | High-quality text embeddings | | OpenAI text-embedding-3-small | 1536 | Efficient, high-quality | | OpenAI text-embedding-3-large | 3072 | Maximum quality | | Cohere embed-v3 | 1024 | Multilingual support | ### Optional Embeddings 
Embedding properties can be optional for gradual population: ```typescript const Article = defineNode("Article", { schema: z.object({ title: z.string(), content: z.string(), embedding: embedding(1536).optional(), }), }); // Create without embedding const article = await store.nodes.Article.create({ title: "Draft Article", content: "...", }); // Add embedding later via background job await store.nodes.Article.update(article.id, { embedding: await generateEmbedding(article.content), }); ``` ### Multiple Embeddings per Node Nodes can have multiple embedding fields for different purposes: ```typescript const Product = defineNode("Product", { schema: z.object({ name: z.string(), description: z.string(), imageUrl: z.string(), // Text embedding for description search textEmbedding: embedding(1536).optional(), // Image embedding for visual similarity imageEmbedding: embedding(512).optional(), }), }); ``` ## Storing Embeddings Embeddings are stored when creating or updating nodes: ```typescript // Using OpenAI import OpenAI from "openai"; const openai = new OpenAI(); async function generateEmbedding(text: string): Promise<number[]> { const response = await openai.embeddings.create({ model: "text-embedding-ada-002", input: text, }); return response.data[0].embedding; } // Store with embedding const embedding = await generateEmbedding("Machine learning fundamentals"); await store.nodes.Document.create({ title: "ML Guide", content: "Machine learning fundamentals...", embedding: embedding, }); ``` ### Batch Embedding For bulk operations, batch your embedding API calls: ```typescript async function batchEmbed(texts: string[]): Promise<number[][]> { const response = await openai.embeddings.create({ model: "text-embedding-ada-002", input: texts, }); return response.data.map((d) => d.embedding); } // Process in batches const documents = await fetchDocumentsWithoutEmbeddings(); const batchSize = 100; for (let i = 0; i < documents.length; i += batchSize) { const batch = documents.slice(i, i + batchSize);
const embeddings = await batchEmbed(batch.map((d) => d.content)); await store.transaction(async (tx) => { for (const [index, doc] of batch.entries()) { await tx.nodes.Document.update(doc.id, { embedding: embeddings[index], }); } }); } ``` ## Querying ### Basic Similarity Search Use `.similarTo()` to find the k most similar nodes: ```typescript const queryEmbedding = await generateEmbedding("neural networks"); const similar = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryEmbedding, 10) // Top 10 most similar ) .select((ctx) => ({ title: ctx.d.title, content: ctx.d.content, })) .execute(); ``` ### Choosing a Distance Metric ```typescript // Cosine similarity (default) - best for normalized embeddings d.embedding.similarTo(queryEmbedding, 10, { metric: "cosine" }) // L2 (Euclidean) distance - for unnormalized embeddings d.embedding.similarTo(queryEmbedding, 10, { metric: "l2" }) // Inner product - for maximum inner product search (PostgreSQL only) d.embedding.similarTo(queryEmbedding, 10, { metric: "inner_product" }) ``` **When to use each:** - **Cosine**: Most common choice. Works well with normalized embeddings (OpenAI, Sentence Transformers). Focuses on direction, not magnitude. - **L2**: Use when vector magnitude matters. Good for detecting exact duplicates. - **Inner product**: For MIPS (maximum inner product search). Useful when embeddings encode both relevance and importance in magnitude. ### Minimum Score Filtering Filter results below a similarity threshold: ```typescript const highQualityMatches = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryEmbedding, 100, { metric: "cosine", minScore: 0.8, // Only results with similarity >= 0.8 }) ) .select((ctx) => ctx.d) .execute(); ``` The `minScore` parameter filters results using **similarity** (not distance): - **Cosine**: 1.0 = identical, 0.0 = orthogonal. 
Typical thresholds: 0.7-0.9 - **L2**: Maximum distance to include (lower = more similar) - **Inner product**: Minimum inner product value :::note[Similarity vs Distance] While the underlying database operators use distance (where 0 = identical for cosine), `minScore` uses similarity semantics for intuitive usage. TypeGraph converts internally: `distance_threshold = 1 - minScore` for cosine. ::: ### Combining with Predicates Semantic search integrates with all standard query predicates: ```typescript const filteredSearch = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding .similarTo(queryEmbedding, 20) .and(d.category.eq("technology")) .and(d.publishedAt.gte("2024-01-01")) .and(d.status.eq("published")) ) .select((ctx) => ctx.d) .execute(); ``` ### Combining with Graph Traversals Search within graph relationships: ```typescript // Find similar documents by authors I follow const personalizedSearch = await store .query() .from("Person", "me") .whereNode("me", (p) => p.id.eq(currentUserId)) .traverse("follows", "f") .to("Person", "author") .traverse("authored", "a", { direction: "in" }) .to("Document", "d") .whereNode("d", (d) => d.embedding.similarTo(queryEmbedding, 10) ) .select((ctx) => ({ title: ctx.d.title, author: ctx.author.name, })) .execute(); ``` ## Best Practices ### Normalize Your Embeddings Most embedding models produce normalized vectors (unit length). 
If yours doesn't, normalize before storing: ```typescript function normalize(vector: number[]): number[] { const magnitude = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0)); return vector.map((v) => v / magnitude); } await store.nodes.Document.create({ title: "Example", content: "...", embedding: normalize(rawEmbedding), }); ``` ### Use Consistent Embedding Models Always use the same model for both storing and querying: ```typescript // Bad: Mixing models const docEmbedding = await embed("text-embedding-ada-002", content); const queryEmbedding = await embed("text-embedding-3-small", query); // Different! // Good: Same model throughout const MODEL = "text-embedding-ada-002"; const docEmbedding = await embed(MODEL, content); const queryEmbedding = await embed(MODEL, query); ``` ### Handle Missing Embeddings Not all nodes may have embeddings. Handle gracefully: ```typescript // Only search nodes with embeddings const results = await store .query() .from("Document", "d") .whereNode("d", (d) => d.embedding .isNotNull() .and(d.embedding.similarTo(queryEmbedding, 10)) ) .select((ctx) => ctx.d) .execute(); ``` ### Choose Appropriate k Values The `k` parameter (number of results) affects performance: ```typescript // For RAG: Small k (3-10) for focused context d.embedding.similarTo(query, 5) // For exploration: Larger k with pagination d.embedding.similarTo(query, 100) ``` ### Index Considerations Vector indexes (HNSW, IVFFlat) trade accuracy for speed: - **Small datasets (< 10K)**: Exact search is fast enough - **Medium datasets (10K-1M)**: HNSW provides good recall with fast queries - **Large datasets (> 1M)**: Consider IVFFlat with appropriate parameters TypeGraph creates HNSW indexes by default for optimal balance. 
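Several of the practices above (matching dimensions, normalized vectors) can be enforced with a small pre-flight check before calling `.similarTo()`. This is a sketch only; `checkQueryEmbedding` is a hypothetical helper, not part of TypeGraph:

```typescript
// Hypothetical pre-flight validation for a query embedding: verifies the
// vector matches the declared dimension and is approximately unit length
// (cosine similarity assumes normalized vectors). Not a TypeGraph API.
function checkQueryEmbedding(vector: number[], expectedDim: number): void {
  if (vector.length !== expectedDim) {
    throw new Error(
      `dimension mismatch: got ${vector.length}, expected ${expectedDim}`,
    );
  }
  const magnitude = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  if (Math.abs(magnitude - 1) > 1e-3) {
    throw new Error(`vector is not normalized (|v| = ${magnitude.toFixed(4)})`);
  }
}

// checkQueryEmbedding(queryEmbedding, 1536); // before running similarTo()
```

Failing fast here turns silent relevance problems (wrong model, unnormalized vectors) into explicit errors during development.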
## Troubleshooting ### "Extension not found" errors **PostgreSQL:** ```sql -- Check if pgvector is installed SELECT * FROM pg_extension WHERE extname = 'vector'; -- Install it CREATE EXTENSION vector; ``` **SQLite:** ```typescript // Ensure sqlite-vec is loaded before queries import * as sqliteVec from "sqlite-vec"; sqliteVec.load(sqlite); ``` ### "Inner product not supported" (SQLite) sqlite-vec only supports `cosine` and `l2` metrics. Use one of those instead: ```typescript // Instead of: d.embedding.similarTo(query, 10, { metric: "inner_product" }) // Use: d.embedding.similarTo(query, 10, { metric: "cosine" }) ``` ### Dimension mismatch errors Ensure query embedding has the same dimension as stored embeddings: ```typescript const Document = defineNode("Document", { schema: z.object({ embedding: embedding(1536), // 1536 dimensions }), }); // Query embedding must also be 1536 dimensions const queryEmbedding = await embed(text); // Verify this returns 1536-dim vector ``` ### Slow queries 1. **Check index creation**: Vector indexes may not exist 2. **Reduce k**: Smaller k = faster queries 3. **Add filters**: Pre-filter with standard predicates before similarity search 4. **Consider approximate search**: HNSW indexes sacrifice some accuracy for speed ## API Reference See the [Predicates documentation](/queries/predicates#embedding) for complete API reference of the `similarTo()` predicate and related options. # Testing > Patterns for testing code that uses TypeGraph TypeGraph's in-memory SQLite backend makes tests fast and isolated — each test gets a fresh database with zero setup cost. This guide covers test utilities, common patterns, and strategies for testing at different levels. ## Test Setup ### In-memory backend (recommended) `createLocalSqliteBackend()` creates an in-memory SQLite database with TypeGraph tables pre-configured. Each call returns a completely isolated database. 
```typescript import { beforeEach, describe, expect, it } from "vitest"; import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local"; import { createStore } from "@nicia-ai/typegraph"; import { graph } from "../src/graph"; // your graph definition describe("Person queries", () => { let store: ReturnType<typeof createStore<typeof graph>>; beforeEach(() => { const { backend } = createLocalSqliteBackend(); store = createStore(graph, backend); }); it("creates and retrieves a person", async () => { const alice = await store.nodes.Person.create({ name: "Alice", email: "alice@example.com", }); const found = await store.nodes.Person.getById(alice.id); expect(found?.name).toBe("Alice"); }); }); ``` No teardown is needed — the in-memory database is garbage collected when the backend goes out of scope. ### Shared test helper If many test files use the same setup, extract a helper: ```typescript // tests/test-helpers.ts import { createLocalSqliteBackend } from "@nicia-ai/typegraph/sqlite/local"; import { createStore } from "@nicia-ai/typegraph"; import { graph } from "../src/graph"; export function createTestStore() { const { backend } = createLocalSqliteBackend(); return createStore(graph, backend); } ``` ```typescript // tests/person.test.ts import { beforeEach, describe, expect, it } from "vitest"; import { createTestStore } from "./test-helpers"; describe("Person", () => { let store: ReturnType<typeof createTestStore>; beforeEach(() => { store = createTestStore(); }); // tests... }); ``` ### createStore vs createStoreWithSchema | Factory | Sync? | Schema management | Use for | |---------|-------|-------------------|---------| | `createStore(graph, backend)` | Yes | None | Tests, local dev | | `createStoreWithSchema(graph, backend)` | No | Auto-init, auto-migrate | Production, schema evolution tests | Use `createStore` for most tests — it's synchronous and avoids async setup.
Use `createStoreWithSchema` when you're specifically testing schema migrations or evolution: ```typescript import { createStoreWithSchema } from "@nicia-ai/typegraph"; it("migrates from v1 to v2", async () => { const { backend } = createLocalSqliteBackend(); // Initialize with v1 schema const [storeV1] = await createStoreWithSchema(graphV1, backend); await storeV1.nodes.Person.create({ name: "Alice" }); // Migrate to v2 schema const [storeV2, result] = await createStoreWithSchema(graphV2, backend); expect(result.status).toBe("migrated"); }); ``` ## Testing Queries ### Seed data, then query The typical pattern is: create data through the collection API, then verify queries return the expected results. ```typescript it("finds friends-of-friends", async () => { // Seed const alice = await store.nodes.Person.create({ name: "Alice" }); const bob = await store.nodes.Person.create({ name: "Bob" }); const carol = await store.nodes.Person.create({ name: "Carol" }); await store.edges.knows.create(alice, bob, {}); await store.edges.knows.create(bob, carol, {}); // Query const fof = await store .query() .from("Person", "p") .whereNode("p", (p) => p.id.eq(alice.id)) .traverse("knows", "e") .recursive({ minHops: 2, maxHops: 2 }) .to("Person", "friend") .select((ctx) => ctx.friend.name) .execute(); expect(fof).toEqual(["Carol"]); }); ``` ### Bulk seeding For tests that need a larger dataset, use `bulkCreate` for speed: ```typescript beforeEach(async () => { const people = Array.from({ length: 100 }, (_, i) => ({ props: { name: `Person ${i}`, email: `person${i}@example.com` }, })); await store.nodes.Person.bulkCreate(people); }); ``` ### Testing query shapes with toSQL() You can inspect the generated SQL without executing to verify query structure: ```typescript it("compiles a traversal to a single statement", () => { const query = store .query() .from("Person", "p") .traverse("worksAt", "e") .to("Company", "c") .select((ctx) => ({ person: ctx.p.name, company: ctx.c.name })); 
const { sql } = query.toSQL(); expect(sql).toContain("WITH"); expect(sql).not.toContain(";"); // single statement }); ``` ### Testing prepared queries ```typescript it("executes prepared queries with different bindings", async () => { await store.nodes.Person.create({ name: "Alice" }); await store.nodes.Person.create({ name: "Bob" }); const prepared = store .query() .from("Person", "p") .whereNode("p", (p) => p.name.eq(p.name.bind("targetName"))) .select((ctx) => ctx.p.name) .prepare(); const alice = await prepared.execute({ targetName: "Alice" }); const bob = await prepared.execute({ targetName: "Bob" }); expect(alice).toEqual(["Alice"]); expect(bob).toEqual(["Bob"]); }); ``` ## Testing Transactions Verify atomicity by asserting that failed transactions leave no partial data: ```typescript it("rolls back on error", async () => { try { await store.transaction(async (tx) => { await tx.nodes.Person.create({ name: "Alice" }); throw new Error("abort"); }); } catch { // expected } const all = await store .query() .from("Person", "p") .select((ctx) => ctx.p) .execute(); expect(all).toHaveLength(0); // Alice was rolled back }); ``` ## Testing with the Query Profiler Use the [Query Profiler](/performance/profiler) in tests to catch unindexed filter patterns before they reach production. 
```typescript import { QueryProfiler } from "@nicia-ai/typegraph/profiler"; import { toDeclaredIndexes } from "@nicia-ai/typegraph/indexes"; import { personEmail } from "../src/indexes"; describe("Index coverage", () => { it("all query filters have index coverage", async () => { const profiler = new QueryProfiler({ declaredIndexes: toDeclaredIndexes([personEmail]), }); const profiledStore = profiler.attachToStore(store); // Run representative queries await profiledStore .query() .from("Person", "p") .whereNode("p", (p) => p.email.eq("alice@example.com")) .select((ctx) => ctx.p.name) .execute(); // Fails if any filter property lacks an index profiler.assertIndexCoverage(); }); }); ``` This is particularly effective when run against your full test suite — it catches filter patterns across all tests, not just the ones you remember to check manually. ## PostgreSQL Integration Tests For tests that verify PostgreSQL-specific behavior (JSONB operators, GIN indexes, concurrent writes), connect to a real database: ```typescript import { Pool } from "pg"; import { drizzle } from "drizzle-orm/node-postgres"; import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres"; describe("PostgreSQL integration", () => { let pool: Pool; let store: ReturnType<typeof createStore<typeof graph>>; beforeAll(async () => { pool = new Pool({ connectionString: process.env.TEST_DATABASE_URL }); await pool.query(generatePostgresMigrationSQL()); const db = drizzle(pool); const backend = createPostgresBackend(db); store = createStore(graph, backend); }); afterAll(async () => { await pool.end(); }); beforeEach(async () => { await pool.query("TRUNCATE typegraph_nodes, typegraph_edges CASCADE"); }); it("handles concurrent writes", async () => { const creates = Array.from({ length: 100 }, (_, i) => store.nodes.Person.create({ name: `Person ${i}` }), ); await Promise.all(creates); const count = await store.nodes.Person.count(); expect(count).toBe(100); }); }); ``` ### Skipping when no database is
available Guard PostgreSQL tests so they're skipped in environments without a database: ```typescript const describePostgres = process.env.TEST_DATABASE_URL ? describe : describe.skip; describePostgres("PostgreSQL-specific", () => { // ... }); ``` ## Testing Pyramid | Level | Backend | Speed | Isolation | When to use | |-------|---------|-------|-----------|-------------| | Unit | In-memory SQLite | Fast (~1ms setup) | Full (fresh DB per test) | Collection API, query logic, business rules | | Integration | SQLite file or PostgreSQL | Medium | Shared (truncate between tests) | Concurrency, transactions, backend-specific behavior | | Profiler | In-memory SQLite | Fast | Full | Index coverage, query pattern verification | Most tests should be unit tests with in-memory SQLite. Reserve PostgreSQL integration tests for behavior that differs across backends (array containment, concurrent writes, isolation levels). ## Next Steps - [Backend Setup](/backend-setup) — Configure SQLite and PostgreSQL backends - [Query Profiler](/performance/profiler) — Automatic index recommendations - [Schemas & Stores](/schemas-stores) — Collection API reference # Types > TypeScript type definitions and utilities This reference documents TypeGraph's TypeScript types and utility functions. ## Node Types ### `Node` The full node type returned from store operations. ```typescript type Node = Readonly<{ id: NodeId; // Branded ID type kind: N["kind"]; // Node kind name meta: { version: number; // Monotonic version counter validFrom: string | undefined; // Temporal validity start (ISO string) validTo: string | undefined; // Temporal validity end (ISO string) createdAt: string; // Created timestamp (ISO string) updatedAt: string; // Updated timestamp (ISO string) deletedAt: string | undefined; // Soft delete timestamp (ISO string) }; }> & z.infer; // Schema properties are flattened ``` ### `NodeId` Branded string type for type-safe node IDs. 
Prevents accidentally mixing IDs from different node types.

```typescript
type NodeId<N extends NodeType = NodeType> = string & { readonly [__nodeId]: N };
```

**Example:**

```typescript
import { type NodeId } from "@nicia-ai/typegraph";

type PersonId = NodeId<typeof Person>;
type CompanyId = NodeId<typeof Company>;

function getPersonById(id: PersonId): Promise<Node<typeof Person> | undefined> {
  // TypeScript prevents passing a CompanyId here
  return store.nodes.Person.getById(id);
}
```

### `NodeProps`

Extracts just the property types from a node definition. Use this when you only need the schema data without node metadata.

```typescript
type NodeProps<N extends NodeType> = z.infer<N["schema"]>;
```

**Example:**

```typescript
import { type NodeProps } from "@nicia-ai/typegraph";

type PersonProps = NodeProps<typeof Person>;
// { name: string; email?: string; age?: number }

// Useful for form data, API payloads, or validation
function validatePersonData(data: PersonProps): boolean {
  return data.name.length > 0;
}
```

### `NodeRef`

Type-safe reference to a node of a specific kind. Used for edge collection methods to enforce that endpoints match the allowed node types. Defaults to `NodeType` when no type parameter is given.

```typescript
type NodeRef<N extends NodeType = NodeType> = Node<N> | Readonly<{ kind: N["kind"]; id: string }>;
```

Accepts either:

- A `Node` instance (e.g., the result of `store.nodes.Person.create()`)
- An explicit object with the correct type name and ID

### `SelectableNode`

The node type available in `select()` context. Properties are flattened (not nested under `props`).
```typescript
type SelectableNode<N extends NodeType = NodeType> = Readonly<{
  id: string;
  kind: N["kind"];
  meta: {
    version: number;
    validFrom: string | undefined;
    validTo: string | undefined;
    createdAt: string;
    updatedAt: string;
    deletedAt: string | undefined;
  };
}> & z.infer<N["schema"]>; // Properties are flattened
```

**Example:**

```typescript
// In select context, access properties directly
.select((ctx) => ({
  id: ctx.p.id,                  // string
  name: ctx.p.name,              // Direct property access (not ctx.p.props.name)
  email: ctx.p.email,
  created: ctx.p.meta.createdAt,
}))
```

## Edge Types

### `Edge`

The full edge type returned from store operations. The `From` and `To` type parameters carry compile-time node type information for the edge endpoints.

```typescript
type Edge<
  E extends EdgeType = EdgeType,
  From extends NodeType = NodeType,
  To extends NodeType = NodeType,
> = Readonly<{
  id: EdgeId<E>;                      // Branded ID type
  kind: E["kind"];
  fromKind: From["kind"];
  fromId: NodeId<From>;
  toKind: To["kind"];
  toId: NodeId<To>;
  meta: {
    validFrom: string | undefined;    // Temporal validity start (ISO string)
    validTo: string | undefined;      // Temporal validity end (ISO string)
    createdAt: string;                // Created timestamp (ISO string)
    updatedAt: string;                // Updated timestamp (ISO string)
    deletedAt: string | undefined;    // Soft delete timestamp (ISO string)
  };
}> & z.infer<E["schema"]>;            // Schema properties are flattened
```

### `EdgeId`

Branded string type for type-safe edge IDs. Prevents accidentally mixing IDs from different edge types.

```typescript
type EdgeId<E extends EdgeType = EdgeType> = string & { readonly [__edgeId]: E };
```

**Example:**

```typescript
import { type EdgeId } from "@nicia-ai/typegraph";

type WorksAtId = EdgeId<typeof worksAt>;

function getEdgeById(id: WorksAtId): Promise<Edge<typeof worksAt> | undefined> {
  return store.edges.worksAt.getById(id);
}
```

### `EdgeProps`

Extracts just the property types from an edge definition.
```typescript
type EdgeProps<E extends EdgeType> = z.infer<E["schema"]>;
```

**Example:**

```typescript
import { type EdgeProps } from "@nicia-ai/typegraph";

type WorksAtProps = EdgeProps<typeof worksAt>;
// { role: string; startDate?: string }
```

### `SelectableEdge`

The edge type available in `select()` context. Properties are flattened.

```typescript
type SelectableEdge<E extends EdgeType = EdgeType> = Readonly<{
  id: string;
  kind: E["kind"];
  fromId: string;
  toId: string;
  meta: {
    validFrom: string | undefined;
    validTo: string | undefined;
    createdAt: string;
    updatedAt: string;
    deletedAt: string | undefined;
  };
}> & z.infer<E["schema"]>; // Edge properties are flattened
```

**Example:**

```typescript
// Access edge properties in select context
.select((ctx) => ({
  role: ctx.e.role,              // Direct edge property access
  salary: ctx.e.salary,
  edgeId: ctx.e.id,
  startedAt: ctx.e.meta.createdAt,
}))
```

### `TypedEdgeCollection`

A type-safe edge collection with From/To types extracted from the edge registration. This is what `store.edges.*` returns.

```typescript
type TypedEdgeCollection<R extends EdgeRegistration> = EdgeCollection<
  R["type"],
  R["from"][number],  // Union of allowed 'from' node types
  R["to"][number]     // Union of allowed 'to' node types
>;
```

## Graph Configuration Types

### `DeleteBehavior`

Controls what happens to edges when a node is deleted.

```typescript
type DeleteBehavior = "restrict" | "cascade" | "disconnect";
```

| Value | Description |
|-------|-------------|
| `"restrict"` | Prevent deletion if edges exist |
| `"cascade"` | Delete connected edges |
| `"disconnect"` | Remove edges without error |

### `Cardinality`

Controls how many edges of a type can connect from/to a node.
```typescript
type Cardinality = "many" | "one" | "unique" | "oneActive";
```

| Value | Description |
|-------|-------------|
| `"many"` | No limit on edges |
| `"one"` | At most one edge per source node |
| `"unique"` | At most one edge per source-target pair |
| `"oneActive"` | At most one active edge (`validTo` is `undefined`) per source node |

### `InferenceType`

Controls how ontology relationships affect queries.

```typescript
type InferenceType =
  | "subsumption"   // Query for X includes subclass instances
  | "hierarchy"     // Enables broader/narrower traversal
  | "substitution"  // Can substitute equivalent types
  | "constraint"    // Validation rules
  | "composition"   // Part-whole navigation
  | "association"   // Discovery/recommendation
  | "none";         // No automatic inference
```

## Query Types

### `VariableLengthSpec`

Configuration for variable-length (recursive) traversals.

```typescript
type VariableLengthSpec = Readonly<{
  minDepth: number;                  // Minimum hops (default: 1)
  maxDepth: number;                  // Maximum hops (-1 = unlimited)
  cyclePolicy: "prevent" | "allow";  // Cycle handling mode
  pathAlias?: string;                // Column alias for projected path
  depthAlias?: string;               // Column alias for projected depth
}>;
```

### `SetOperationType`

Available set operations for combining queries.

```typescript
type SetOperationType = "union" | "unionAll" | "intersect" | "except";
```

### `PaginateOptions`

Options for cursor-based pagination.

```typescript
type PaginateOptions = Readonly<{
  first?: number;   // Items to fetch (forward)
  after?: string;   // Cursor to start after (forward)
  last?: number;    // Items to fetch (backward)
  before?: string;  // Cursor to start before (backward)
}>;
```

### `PaginatedResult`

Result of a paginated query.

```typescript
type PaginatedResult<R> = Readonly<{
  data: readonly R[];
  nextCursor: string | undefined;
  prevCursor: string | undefined;
  hasNextPage: boolean;
  hasPrevPage: boolean;
}>;
```

### `StreamOptions`

Options for streaming results.
```typescript type StreamOptions = Readonly<{ batchSize?: number; // Items per batch (default: 1000) }>; ``` ## Utility Functions ### `generateId()` Generates a unique ID using nanoid. ```typescript import { generateId } from "@nicia-ai/typegraph"; function generateId(): string; const id = generateId(); // "V1StGXR8_Z5jdHi6B-myT" ``` ## Constants ### `MAX_RECURSIVE_DEPTH` Maximum depth for unbounded recursive traversals (100). ```typescript import { MAX_RECURSIVE_DEPTH } from "@nicia-ai/typegraph"; // MAX_RECURSIVE_DEPTH = 100 ``` Recursive traversals are capped at this depth when no `maxHops` is specified in the `recursive()` options object. Explicit `maxHops` values are validated against `MAX_EXPLICIT_RECURSIVE_DEPTH` (1000). Cycle prevention is enabled by default. To allow revisits for maximum performance, use `cyclePolicy: "allow"`. ### `MAX_EXPLICIT_RECURSIVE_DEPTH` Maximum allowed value for the `maxHops` option in recursive traversals (1000). ```typescript import { MAX_EXPLICIT_RECURSIVE_DEPTH } from "@nicia-ai/typegraph"; // MAX_EXPLICIT_RECURSIVE_DEPTH = 1000 ``` # Ejecting > How to migrate away from TypeGraph if you need to TypeGraph is designed with zero lock-in. If you decide to move on, you're left with a clean, conventional database schema that works with any SQL tooling. ## What You're Left With When you eject TypeGraph, your database contains two well-structured tables: ```sql -- Your nodes SELECT * FROM typegraph_nodes; ┌──────────┬─────────┬──────────────────┬─────────────────────────────────┐ │ kind │ id │ props │ created_at │ ├──────────┼─────────┼──────────────────┼─────────────────────────────────┤ │ Person │ p-001 │ {"name": "Ada"} │ 2024-01-15T10:30:00Z │ │ Company │ c-001 │ {"name": "Acme"} │ 2024-01-15T10:30:00Z │ └──────────┴─────────┴──────────────────┴─────────────────────────────────┘ -- Your relationships SELECT * FROM typegraph_edges; ┌──────────┬──────────┬─────────┬──────────┬─────────┐ │ kind │ from_id │ to_id │ props │ ... 
│ ├──────────┼──────────┼─────────┼──────────┼─────────┤ │ worksAt │ p-001 │ c-001 │ {} │ ... │ └──────────┴──────────┴─────────┴──────────┴─────────┘ ``` This is exactly the schema you'd design yourself for a flexible entity-relationship system. ## Querying Without TypeGraph All your data is accessible with plain SQL. No special drivers, no proprietary formats. ### Find all people at a company ```sql SELECT n.props->>'name' as person_name FROM typegraph_nodes n JOIN typegraph_edges e ON e.from_id = n.id WHERE e.kind = 'worksAt' AND e.to_id = 'c-001' AND n.deleted_at IS NULL; ``` ### Traverse a relationship ```sql SELECT p.props->>'name' as person, c.props->>'name' as company FROM typegraph_nodes p JOIN typegraph_edges e ON e.from_id = p.id AND e.kind = 'worksAt' JOIN typegraph_nodes c ON c.id = e.to_id WHERE p.kind = 'Person' AND c.kind = 'Company' AND p.deleted_at IS NULL AND c.deleted_at IS NULL; ``` ### Point-in-time query ```sql SELECT * FROM typegraph_nodes WHERE kind = 'Article' AND valid_from <= '2024-06-01' AND (valid_to IS NULL OR valid_to > '2024-06-01'); ``` ## Using Your Own Tools The schema works with everything in the SQL ecosystem: - **ORMs**: Drizzle, Prisma, Knex, TypeORM, Sequelize - **Query builders**: Kysely, Slonik - **Raw SQL**: Any PostgreSQL or SQLite client - **BI tools**: Metabase, Superset, Tableau - **Migration tools**: dbmate, Flyway, Liquibase ### Example: Drizzle ORM ```typescript import { pgTable, text, jsonb, timestamp } from "drizzle-orm/pg-core"; const nodes = pgTable("typegraph_nodes", { graphId: text("graph_id").notNull(), kind: text("kind").notNull(), id: text("id").notNull(), props: jsonb("props").notNull(), createdAt: timestamp("created_at").notNull(), updatedAt: timestamp("updated_at").notNull(), deletedAt: timestamp("deleted_at"), }); // Query as usual const people = await db .select() .from(nodes) .where(eq(nodes.kind, "Person")); ``` ### Example: Prisma ```prisma model TypegraphNode { graphId String @map("graph_id") kind 
String id String props Json createdAt DateTime @map("created_at") deletedAt DateTime? @map("deleted_at") @@id([graphId, kind, id]) @@map("typegraph_nodes") } ``` ## Migration Strategies ### Option 1: Keep the schema as-is The TypeGraph schema is production-ready. Continue using it directly with your preferred SQL tools. ### Option 2: Normalize into separate tables If you want traditional per-entity tables: ```sql -- Create a typed table CREATE TABLE people AS SELECT id, props->>'name' as name, props->>'email' as email, created_at, updated_at FROM typegraph_nodes WHERE kind = 'Person' AND deleted_at IS NULL; -- Add constraints ALTER TABLE people ADD PRIMARY KEY (id); ``` ### Option 3: Create views for compatibility Keep the original tables but add typed views: ```sql CREATE VIEW people AS SELECT id, props->>'name' as name, props->>'email' as email, created_at FROM typegraph_nodes WHERE kind = 'Person' AND deleted_at IS NULL; ``` ## What About the Ontology? The ontology (type hierarchies, edge constraints) exists only in your TypeScript code. The database stores raw data without semantic constraints. After ejecting: - You lose automatic subclass queries (`includeSubClasses`) - You lose edge validation (ensuring valid from/to kinds) - You keep all your data exactly as stored If you need these features, you'll implement them in application code—which is what any alternative would require anyway. ## Summary TypeGraph adds a type-safe API layer over a conventional SQL schema. Remove the library and you still have: - Standard SQL tables - JSON properties (supported natively by SQLite and PostgreSQL) - Full temporal history - Soft deletes - No proprietary formats - No data migration required Your data is always yours. 
# Audit Trail > Complete change tracking with user attribution and diff generation This example shows how to build a comprehensive audit system that: - **Tracks all changes** using TypeGraph's temporal model - **Attributes changes** to users and sessions - **Generates diffs** between versions - **Supports compliance queries** (who changed what, when) - **Exports audit logs** for external systems ## How TypeGraph Enables Auditing TypeGraph's temporal model provides built-in auditing capabilities: 1. **Every update creates a new version** - Old data is preserved with `valid_to` timestamp 2. **Temporal queries** - Query any point in time with `asOf` or get full history with `includeEnded` 3. **Metadata fields** - `createdAt`, `updatedAt`, `version` are tracked automatically This example extends the built-in capabilities with: - User attribution (who made the change) - Change descriptions (why the change was made) - Structured diffs (what exactly changed) ## Schema Definition ```typescript import { z } from "zod"; import { defineNode, defineEdge, defineGraph } from "@nicia-ai/typegraph"; // Audited entity (example: Settings) const Setting = defineNode("Setting", { schema: z.object({ key: z.string(), value: z.string(), category: z.string(), description: z.string().optional(), }), }); // Users making changes const User = defineNode("User", { schema: z.object({ email: z.string().email(), name: z.string(), role: z.enum(["admin", "editor", "viewer"]), }), }); // Explicit audit log entries (for cross-cutting concerns) const AuditEntry = defineNode("AuditEntry", { schema: z.object({ entityType: z.string(), entityId: z.string(), action: z.enum(["create", "update", "delete", "restore"]), timestamp: z.string().datetime(), changes: z.record(z.object({ before: z.unknown().optional(), after: z.unknown().optional(), })).optional(), reason: z.string().optional(), ipAddress: z.string().optional(), userAgent: z.string().optional(), }), }); // Sessions for grouping changes const Session 
= defineNode("Session", {
  schema: z.object({
    startedAt: z.string().datetime(),
    endedAt: z.string().datetime().optional(),
    ipAddress: z.string().optional(),
    userAgent: z.string().optional(),
  }),
});

// Edges
const performedBy = defineEdge("performedBy");
const inSession = defineEdge("inSession");
const hasSession = defineEdge("hasSession");

const graph = defineGraph({
  id: "audit_trail",
  nodes: {
    Setting: { type: Setting },
    User: { type: User },
    AuditEntry: { type: AuditEntry },
    Session: { type: Session },
  },
  edges: {
    performedBy: { type: performedBy, from: [AuditEntry], to: [User] },
    inSession: { type: inSession, from: [AuditEntry], to: [Session] },
    hasSession: { type: hasSession, from: [User], to: [Session] },
  },
});
```

## Audit Context

Create a context object to track the current user and session:

```typescript
interface AuditContext {
  userId: string;
  sessionId?: string;
  ipAddress?: string;
  userAgent?: string;
  reason?: string;
}

// Thread-local storage for audit context (Node.js)
import { AsyncLocalStorage } from "async_hooks";

const auditContext = new AsyncLocalStorage<AuditContext>();

function withAuditContext<T>(context: AuditContext, fn: () => Promise<T>): Promise<T> {
  return auditContext.run(context, fn);
}

function getAuditContext(): AuditContext | undefined {
  return auditContext.getStore();
}
```

## Audited Operations

### Create with Audit

```typescript
async function createSetting(
  key: string,
  value: string,
  category: string
): Promise<Node<typeof Setting>> {
  const ctx = getAuditContext();
  if (!ctx) throw new Error("Audit context required");

  return store.transaction(async (tx) => {
    // Create the setting
    const setting = await tx.nodes.Setting.create({
      key,
      value,
      category,
    });

    // Create audit entry
    await createAuditEntry(tx, {
      entityType: "Setting",
      entityId: setting.id,
      action: "create",
      changes: {
        key: { after: key },
        value: { after: value },
        category: { after: category },
      },
    });

    return setting;
  });
}
```

### Update with Audit

```typescript
async function updateSetting(
  id: string,
  updates:
Partial<{ value: string; description: string }>
): Promise<Node<typeof Setting>> {
  const ctx = getAuditContext();
  if (!ctx) throw new Error("Audit context required");

  return store.transaction(async (tx) => {
    // Get current state
    const current = await tx.nodes.Setting.getById(id);
    if (!current) throw new Error(`Setting not found: ${id}`);

    // Calculate changes
    const changes: Record<string, { before?: unknown; after?: unknown }> = {};
    for (const [key, newValue] of Object.entries(updates)) {
      const oldValue = current[key as keyof typeof current];
      if (oldValue !== newValue) {
        changes[key] = { before: oldValue, after: newValue };
      }
    }

    // Skip if no actual changes
    if (Object.keys(changes).length === 0) {
      return current;
    }

    // Update the setting
    const updated = await tx.nodes.Setting.update(id, updates);

    // Create audit entry
    await createAuditEntry(tx, {
      entityType: "Setting",
      entityId: id,
      action: "update",
      changes,
    });

    return updated;
  });
}
```

### Delete with Audit

```typescript
async function deleteSetting(id: string): Promise<void> {
  const ctx = getAuditContext();
  if (!ctx) throw new Error("Audit context required");

  await store.transaction(async (tx) => {
    // Get current state for audit
    const current = await tx.nodes.Setting.getById(id);
    if (!current) throw new Error(`Setting not found: ${id}`);

    // Delete (soft delete)
    await tx.nodes.Setting.delete(id);

    // Create audit entry
    await createAuditEntry(tx, {
      entityType: "Setting",
      entityId: id,
      action: "delete",
      changes: {
        key: { before: current.key },
        value: { before: current.value },
        category: { before: current.category },
      },
    });
  });
}
```

### Create Audit Entry

```typescript
interface AuditEntryInput {
  entityType: string;
  entityId: string;
  action: "create" | "update" | "delete" | "restore";
  changes?: Record<string, { before?: unknown; after?: unknown }>;
}

async function createAuditEntry(
  tx: Transaction,
  input: AuditEntryInput
): Promise<Node<typeof AuditEntry>> {
  const ctx = getAuditContext()!;

  const entry = await tx.nodes.AuditEntry.create({
    entityType: input.entityType,
    entityId: input.entityId,
    action: input.action,
    timestamp: new Date().toISOString(),
    changes:
input.changes,
    reason: ctx.reason,
    ipAddress: ctx.ipAddress,
    userAgent: ctx.userAgent,
  });

  // Link to user
  const user = await tx.nodes.User.getById(ctx.userId);
  if (!user) throw new Error(`User not found: ${ctx.userId}`);
  await tx.edges.performedBy.create(entry, user, {});

  // Link to session if present
  if (ctx.sessionId) {
    const session = await tx.nodes.Session.getById(ctx.sessionId);
    if (!session) throw new Error(`Session not found: ${ctx.sessionId}`);
    await tx.edges.inSession.create(entry, session, {});
  }

  return entry;
}
```

## Querying Audit History

### Get Entity History

```typescript
interface HistoryEntry {
  version: number;
  timestamp: string;
  action: string;
  changes?: Record<string, { before?: unknown; after?: unknown }>;
  user: { name: string; email: string };
  reason?: string;
}

async function getEntityHistory(
  entityType: string,
  entityId: string
): Promise<HistoryEntry[]> {
  return store
    .query()
    .from("AuditEntry", "a")
    .whereNode("a", (a) =>
      a.entityType.eq(entityType).and(a.entityId.eq(entityId))
    )
    .traverse("performedBy", "e")
    .to("User", "u")
    .orderBy((ctx) => ctx.a.timestamp, "desc")
    .select((ctx) => ({
      version: ctx.a.version,
      timestamp: ctx.a.timestamp,
      action: ctx.a.action,
      changes: ctx.a.changes,
      user: {
        name: ctx.u.name,
        email: ctx.u.email,
      },
      reason: ctx.a.reason,
    }))
    .execute();
}
```

### Get User Activity

```typescript
interface UserActivity {
  timestamp: string;
  entityType: string;
  entityId: string;
  action: string;
}

async function getUserActivity(
  userId: string,
  options: { since?: Date; limit?: number } = {}
): Promise<UserActivity[]> {
  const { since, limit = 100 } = options;

  let query = store
    .query()
    .from("User", "u")
    .whereNode("u", (u) => u.id.eq(userId))
    .traverse("performedBy", "e", { direction: "in" })
    .to("AuditEntry", "a");

  if (since) {
    query = query.whereNode("a", (a) => a.timestamp.gte(since.toISOString()));
  }

  return query
    .orderBy((ctx) => ctx.a.timestamp, "desc")
    .limit(limit)
    .select((ctx) => ({
      timestamp: ctx.a.timestamp,
      entityType: ctx.a.entityType,
      entityId: ctx.a.entityId,
      action: ctx.a.action,
    }))
.execute();
}
```

### Changes in Time Range

```typescript
interface ChangeReport {
  entityType: string;
  entityId: string;
  changeCount: number;
  users: string[];
  lastChange: string;
}

async function getChangesInRange(
  startDate: Date,
  endDate: Date
): Promise<ChangeReport[]> {
  const entries = await store
    .query()
    .from("AuditEntry", "a")
    .whereNode("a", (a) =>
      a.timestamp.gte(startDate.toISOString()).and(
        a.timestamp.lte(endDate.toISOString())
      )
    )
    .traverse("performedBy", "e")
    .to("User", "u")
    .select((ctx) => ({
      entityType: ctx.a.entityType,
      entityId: ctx.a.entityId,
      timestamp: ctx.a.timestamp,
      userName: ctx.u.name,
    }))
    .execute();

  // Group by entity
  const grouped = new Map<string, ChangeReport>();
  for (const entry of entries) {
    const key = `${entry.entityType}:${entry.entityId}`;
    const existing = grouped.get(key);
    if (existing) {
      existing.changeCount++;
      if (!existing.users.includes(entry.userName)) {
        existing.users.push(entry.userName);
      }
      if (entry.timestamp > existing.lastChange) {
        existing.lastChange = entry.timestamp;
      }
    } else {
      grouped.set(key, {
        entityType: entry.entityType,
        entityId: entry.entityId,
        changeCount: 1,
        users: [entry.userName],
        lastChange: entry.timestamp,
      });
    }
  }

  return Array.from(grouped.values());
}
```

## Using TypeGraph's Built-in Temporal Features

### View Entity at Point in Time

```typescript
async function getSettingAsOf(
  id: string,
  timestamp: Date
): Promise<SelectableNode<typeof Setting> | undefined> {
  return store
    .query()
    .from("Setting", "s")
    .temporal("asOf", timestamp.toISOString())
    .whereNode("s", (s) => s.id.eq(id))
    .select((ctx) => ctx.s)
    .first();
}
```

### Get All Versions

```typescript
interface SettingVersion {
  props: SettingProps;
  validFrom: string;
  validTo: string | undefined;
  version: number;
}

async function getSettingVersions(id: string): Promise<SettingVersion[]> {
  return store
    .query()
    .from("Setting", "s")
    .temporal("includeEnded")
    .whereNode("s", (s) => s.id.eq(id))
    .orderBy((ctx) => ctx.s.validFrom, "desc")
    .select((ctx) => ({
      props: ctx.s,
      validFrom: ctx.s.validFrom,
      validTo: ctx.s.validTo,
      version:
ctx.s.version,
    }))
    .execute();
}
```

### Compare Versions

```typescript
interface VersionDiff {
  field: string;
  before: unknown;
  after: unknown;
}

async function compareVersions(
  id: string,
  version1: number,
  version2: number
): Promise<VersionDiff[]> {
  const versions = await store
    .query()
    .from("Setting", "s")
    .temporal("includeEnded")
    .whereNode("s", (s) => s.id.eq(id).and(s.version.in([version1, version2])))
    .orderBy((ctx) => ctx.s.version, "asc")
    .select((ctx) => ctx.s)
    .execute();

  if (versions.length !== 2) {
    throw new Error("Versions not found");
  }

  const [before, after] = versions;
  const diffs: VersionDiff[] = [];

  const allKeys = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const key of allKeys) {
    const beforeVal = before[key as keyof typeof before];
    const afterVal = after[key as keyof typeof after];
    if (JSON.stringify(beforeVal) !== JSON.stringify(afterVal)) {
      diffs.push({ field: key, before: beforeVal, after: afterVal });
    }
  }

  return diffs;
}
```

## Session Management

### Start Session

```typescript
async function startSession(
  userId: string,
  metadata: { ipAddress?: string; userAgent?: string }
): Promise<Node<typeof Session>> {
  return store.transaction(async (tx) => {
    const session = await tx.nodes.Session.create({
      startedAt: new Date().toISOString(),
      ipAddress: metadata.ipAddress,
      userAgent: metadata.userAgent,
    });

    const user = await tx.nodes.User.getById(userId);
    if (!user) throw new Error(`User not found: ${userId}`);
    await tx.edges.hasSession.create(user, session, {});

    return session;
  });
}
```

### End Session

```typescript
async function endSession(sessionId: string): Promise<void> {
  await store.nodes.Session.update(sessionId, {
    endedAt: new Date().toISOString(),
  });
}
```

### Get Session Activity

```typescript
async function getSessionActivity(
  sessionId: string
): Promise<Array<{ timestamp: string; action: string; entityType: string }>> {
  return store
    .query()
    .from("Session", "s")
    .whereNode("s", (s) => s.id.eq(sessionId))
    .traverse("inSession", "e", { direction: "in" })
    .to("AuditEntry", "a")
    .orderBy((ctx) => ctx.a.timestamp, "asc")
.select((ctx) => ({
      timestamp: ctx.a.timestamp,
      action: ctx.a.action,
      entityType: ctx.a.entityType,
    }))
    .execute();
}
```

## Compliance Queries

### Who Changed This?

```typescript
async function whoChanged(
  entityType: string,
  entityId: string,
  field: string
): Promise<Array<{ user: string; timestamp: string; before: unknown; after: unknown }>> {
  const entries = await store
    .query()
    .from("AuditEntry", "a")
    .whereNode("a", (a) =>
      a.entityType.eq(entityType).and(a.entityId.eq(entityId))
    )
    .traverse("performedBy", "e")
    .to("User", "u")
    .orderBy((ctx) => ctx.a.timestamp, "desc")
    .select((ctx) => ({
      changes: ctx.a.changes,
      user: ctx.u.name,
      timestamp: ctx.a.timestamp,
    }))
    .execute();

  return entries
    .filter((e) => e.changes && field in e.changes)
    .map((e) => ({
      user: e.user,
      timestamp: e.timestamp,
      before: e.changes![field].before,
      after: e.changes![field].after,
    }));
}
```

### When Was This Value Set?

```typescript
async function whenWasValueSet(
  entityType: string,
  entityId: string,
  field: string,
  value: unknown
): Promise<{ timestamp: string; user: string } | undefined> {
  const entries = await store
    .query()
    .from("AuditEntry", "a")
    .whereNode("a", (a) =>
      a.entityType.eq(entityType).and(a.entityId.eq(entityId))
    )
    .traverse("performedBy", "e")
    .to("User", "u")
    .orderBy((ctx) => ctx.a.timestamp, "asc")
    .select((ctx) => ({
      changes: ctx.a.changes,
      user: ctx.u.name,
      timestamp: ctx.a.timestamp,
    }))
    .execute();

  const entry = entries.find(
    (e) => e.changes && field in e.changes && e.changes[field].after === value
  );

  return entry ?
{ timestamp: entry.timestamp, user: entry.user } : undefined;
}
```

## Export Audit Logs

### Stream to External System

```typescript
async function* exportAuditLogs(
  since: Date,
  batchSize = 1000
): AsyncGenerator<AuditEntryProps[]> {
  const stream = store
    .query()
    .from("AuditEntry", "a")
    .whereNode("a", (a) => a.timestamp.gte(since.toISOString()))
    .traverse("performedBy", "e")
    .to("User", "u")
    .orderBy((ctx) => ctx.a.timestamp, "asc")
    .select((ctx) => ({
      ...ctx.a,
      performedBy: ctx.u.email,
    }))
    .stream({ batchSize });

  let batch: AuditEntryProps[] = [];
  for await (const entry of stream) {
    batch.push(entry);
    if (batch.length >= batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) {
    yield batch;
  }
}

// Usage
async function syncToExternalAuditSystem(since: Date): Promise<void> {
  for await (const batch of exportAuditLogs(since)) {
    await externalAuditApi.ingestBatch(batch);
  }
}
```

## Next Steps

- [Document Management](/examples/document-management) - CMS with semantic search
- [Product Catalog](/examples/product-catalog) - Categories, variants, inventory
- [Workflow Engine](/examples/workflow-engine) - State machines with approvals

# Document Management System

> A complete CMS example with semantic search, versioning, and access control

This example builds a document management system with:

- **Document hierarchy** (folders, documents, sections)
- **Semantic search** with vector embeddings
- **Version history** using temporal queries
- **Access control** with permission inheritance
- **Related documents** discovery

## Schema Definition

```typescript
import { z } from "zod";
import {
  defineNode,
  defineEdge,
  defineGraph,
  embedding,
  subClassOf,
  partOf,
  hasPart,
} from "@nicia-ai/typegraph";

// Base content type (abstract)
const Content = defineNode("Content", {
  schema: z.object({
    title: z.string(),
    createdBy: z.string(),
    status: z.enum(["draft", "published", "archived"]).default("draft"),
  }),
});

// Folder extends Content
const Folder = defineNode("Folder", {
  schema: z.object({
    title:
z.string(), createdBy: z.string(), status: z.enum(["draft", "published", "archived"]).default("draft"), path: z.string(), // /engineering/specs }), }); // Document extends Content const Document = defineNode("Document", { schema: z.object({ title: z.string(), content: z.string(), createdBy: z.string(), status: z.enum(["draft", "published", "archived"]).default("draft"), contentType: z.enum(["markdown", "html", "plaintext"]).default("markdown"), embedding: embedding(1536).optional(), }), }); // Users and permissions const User = defineNode("User", { schema: z.object({ email: z.string().email(), name: z.string(), role: z.enum(["admin", "editor", "viewer"]).default("viewer"), }), }); const Permission = defineNode("Permission", { schema: z.object({ level: z.enum(["read", "write", "admin"]), }), }); // Edges const contains = defineEdge("contains"); const relatedTo = defineEdge("relatedTo", { schema: z.object({ type: z.enum(["references", "supersedes", "related"]), confidence: z.number().min(0).max(1).optional(), }), }); const hasPermission = defineEdge("hasPermission"); const createdBy = defineEdge("createdBy"); // Graph definition const graph = defineGraph({ id: "document_management", nodes: { Content: { type: Content }, Folder: { type: Folder }, Document: { type: Document }, User: { type: User }, Permission: { type: Permission }, }, edges: { contains: { type: contains, from: [Folder], to: [Folder, Document] }, relatedTo: { type: relatedTo, from: [Document], to: [Document] }, hasPermission: { type: hasPermission, from: [User], to: [Content] }, createdBy: { type: createdBy, from: [Content], to: [User] }, }, ontology: [ // Type hierarchy subClassOf(Folder, Content), subClassOf(Document, Content), // Compositional relationships partOf(Document, Folder), hasPart(Folder, Document), ], }); ``` ## Database Setup ```typescript import Database from "better-sqlite3"; import { drizzle } from "drizzle-orm/better-sqlite3"; import * as sqliteVec from "sqlite-vec"; import { 
createSqliteBackend, generateSqliteMigrationSQL } from "@nicia-ai/typegraph/sqlite";
import { createStore } from "@nicia-ai/typegraph";

// Initialize database with vector extension
const sqlite = new Database("documents.db");
sqliteVec.load(sqlite);
sqlite.exec(generateSqliteMigrationSQL());

const db = drizzle(sqlite);
const backend = createSqliteBackend(db);
const store = createStore(graph, backend);
```

## Core Operations

### Creating Folder Structure

```typescript
async function createFolderPath(path: string, userId: string): Promise<Node<typeof Folder>> {
  const parts = path.split("/").filter(Boolean);
  let currentPath = "";
  let parentFolder: Node<typeof Folder> | undefined;

  for (const part of parts) {
    currentPath += `/${part}`;

    // Check if folder exists
    let folder = await store
      .query()
      .from("Folder", "f")
      .whereNode("f", (f) => f.path.eq(currentPath))
      .select((ctx) => ctx.f)
      .first();

    if (!folder) {
      folder = await store.nodes.Folder.create({
        title: part,
        path: currentPath,
        createdBy: userId,
        status: "published",
      });

      if (parentFolder) {
        await store.edges.contains.create(parentFolder, folder, {});
      }
    }

    parentFolder = folder;
  }

  return parentFolder!;
}
```

### Creating Documents with Embeddings

```typescript
import OpenAI from "openai";

const openai = new OpenAI();

async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return response.data[0].embedding;
}

async function createDocument(
  folderId: string,
  title: string,
  content: string,
  userId: string
): Promise<Node<typeof Document>> {
  const embedding = await generateEmbedding(`${title}\n\n${content}`);

  const document = await store.nodes.Document.create({
    title,
    content,
    createdBy: userId,
    status: "draft",
    contentType: "markdown",
    embedding,
  });

  // Link to folder
  const folder = await store.nodes.Folder.getById(folderId);
  if (!folder) throw new Error(`Folder not found: ${folderId}`);
  await store.edges.contains.create(folder, document, {});

  // Link to creator
  const user =
await store.nodes.User.getById(userId); if (!user) throw new Error(`User not found: ${userId}`); await store.edges.createdBy.create(document, user, {}); return document; } ``` ### Updating Documents (Versioned) ```typescript async function updateDocument( documentId: string, updates: { title?: string; content?: string; status?: "draft" | "published" | "archived" } ): Promise> { const current = await store.nodes.Document.getById(documentId); if (!current) throw new Error(`Document not found: ${documentId}`); // If content changed, regenerate embedding let embedding = current.embedding; if (updates.content && updates.content !== current.content) { const text = `${updates.title ?? current.title}\n\n${updates.content}`; embedding = await generateEmbedding(text); } // Update creates a new version automatically return store.nodes.Document.update(documentId, { ...updates, embedding, }); } ``` ## Searching Documents ### Semantic Search ```typescript async function searchDocuments( query: string, options: { folderId?: string; status?: "draft" | "published" | "archived"; limit?: number; minScore?: number; } = {} ): Promise> { const { folderId, status = "published", limit = 10, minScore = 0.7 } = options; const queryEmbedding = await generateEmbedding(query); let queryBuilder = store .query() .from("Document", "d") .whereNode("d", (d) => { let pred = d.embedding.similarTo(queryEmbedding, limit, { metric: "cosine", minScore, }); if (status) { pred = pred.and(d.status.eq(status)); } return pred; }); // If folderId specified, filter to folder descendants if (folderId) { queryBuilder = store .query() .from("Folder", "f") .whereNode("f", (f) => f.id.eq(folderId)) .traverse("contains", "e") .recursive() .to("Document", "d") .whereNode("d", (d) => d.embedding .similarTo(queryEmbedding, limit, { metric: "cosine", minScore }) .and(d.status.eq(status)) ); } return queryBuilder .select((ctx) => ({ document: ctx.d, score: ctx.d.embedding.similarity(queryEmbedding), })) .execute(); } ``` 
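The similarity scoring itself runs inside the database vector extension (pgvector or sqlite-vec), not in application code. For intuition about what `metric: "cosine"` and `minScore` mean, here is a self-contained sketch of the same math in plain TypeScript. This is illustrative only, not TypeGraph's implementation:

```typescript
// Cosine similarity: dot product of the vectors divided by the
// product of their magnitudes. Ranges from -1 to 1; identical
// directions score 1.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// minScore acts as a threshold filter over scored candidate rows.
function filterByMinScore<T>(
  rows: Array<T & { score: number }>,
  minScore: number
): Array<T & { score: number }> {
  return rows.filter((r) => r.score >= minScore);
}
```

The database evaluates the equivalent of these two steps in SQL, so only rows at or above `minScore` ever reach your application.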
### Find Related Documents

```typescript
async function findRelatedDocuments(
  documentId: string,
  limit = 5
): Promise<Array<{ document: Node<"Document">; relationship: string }>> {
  // First, get explicit relationships
  const explicit = await store
    .query()
    .from("Document", "d")
    .whereNode("d", (d) => d.id.eq(documentId))
    .traverse("relatedTo", "e")
    .to("Document", "related")
    .select((ctx) => ({
      document: ctx.related,
      relationship: ctx.e.type,
    }))
    .execute();

  // Then, find semantically similar documents
  const source = await store.nodes.Document.getById(documentId);
  if (!source) throw new Error(`Document not found: ${documentId}`);

  if (!source.embedding) {
    return explicit;
  }

  const similar = await store
    .query()
    .from("Document", "d")
    .whereNode("d", (d) =>
      d.embedding
        .similarTo(source.embedding!, limit * 2, { metric: "cosine", minScore: 0.8 })
        .and(d.id.neq(documentId))
    )
    .select((ctx) => ({
      document: ctx.d,
      relationship: "similar" as const,
    }))
    .limit(limit)
    .execute();

  return [...explicit, ...similar].slice(0, limit);
}
```

## Version History

### Get Document History

```typescript
interface DocumentVersion {
  title: string;
  content: string;
  status: string;
  validFrom: string;
  validTo: string | undefined;
  version: number;
}

async function getDocumentHistory(documentId: string): Promise<DocumentVersion[]> {
  return store
    .query()
    .from("Document", "d")
    .temporal("includeEnded")
    .whereNode("d", (d) => d.id.eq(documentId))
    .orderBy((ctx) => ctx.d.validFrom, "desc")
    .select((ctx) => ({
      title: ctx.d.title,
      content: ctx.d.content,
      status: ctx.d.status,
      validFrom: ctx.d.validFrom,
      validTo: ctx.d.validTo,
      version: ctx.d.version,
    }))
    .execute();
}
```

### View Document at Point in Time

```typescript
async function getDocumentAsOf(
  documentId: string,
  timestamp: Date
): Promise<DocumentProps | undefined> {
  return store
    .query()
    .from("Document", "d")
    .temporal("asOf", timestamp.toISOString())
    .whereNode("d", (d) => d.id.eq(documentId))
    .select((ctx) => ctx.d)
    .first();
}
```

### Compare Versions

```typescript
async function compareVersions(
  documentId: string,
  version1: number,
  version2: number
): Promise<{ before: DocumentProps; after: DocumentProps } | undefined> {
  const versions = await store
    .query()
    .from("Document", "d")
    .temporal("includeEnded")
    .whereNode("d", (d) =>
      d.id.eq(documentId).and(d.version.in([version1, version2]))
    )
    .orderBy((ctx) => ctx.d.version, "asc")
    .select((ctx) => ctx.d)
    .execute();

  if (versions.length !== 2) return undefined;

  return { before: versions[0], after: versions[1] };
}
```

## Access Control

### Check Read Permission

```typescript
async function canRead(userId: string, contentId: string): Promise<boolean> {
  // Check direct permission
  const directPermission = await store
    .query()
    .from("User", "u")
    .whereNode("u", (u) => u.id.eq(userId))
    .traverse("hasPermission", "p")
    .to("Content", "c", { includeSubClasses: true })
    .whereNode("c", (c) => c.id.eq(contentId))
    .first();

  if (directPermission) return true;

  // Check inherited permission (from parent folders)
  const content = await store.nodes.Content.getById(contentId);
  if (!content) return false;

  // Walk up the folder tree checking permissions
  const parentFolders = await store
    .query()
    .from("Folder", "f")
    .traverse("contains", "e")
    .recursive()
    .to("Content", "c", { includeSubClasses: true })
    .whereNode("c", (c) => c.id.eq(contentId))
    .select((ctx) => ctx.f.id)
    .execute();

  for (const folderId of parentFolders) {
    const folderPermission = await store
      .query()
      .from("User", "u")
      .whereNode("u", (u) => u.id.eq(userId))
      .traverse("hasPermission", "p")
      .to("Folder", "f")
      .whereNode("f", (f) => f.id.eq(folderId))
      .first();

    if (folderPermission) return true;
  }

  return false;
}
```

### Grant Permission

```typescript
async function grantPermission(
  userId: string,
  contentId: string,
  level: "read" | "write" | "admin"
): Promise<void> {
  const user = await store.nodes.User.getById(userId);
  if (!user) throw new Error(`User not found: ${userId}`);

  const content = await store.nodes.Content.getById(contentId);
  if (!content) throw new Error(`Content not found: ${contentId}`);

  // Record the permission level on the hasPermission edge
  await store.edges.hasPermission.create(user, content, { level });
}
```

## Folder Navigation

### Get Folder Contents

```typescript
interface FolderContents {
  folders: Array<{ id: string; title: string; path: string }>;
  documents: Array<{ id: string; title: string; status: string }>;
}

async function getFolderContents(folderId: string): Promise<FolderContents> {
  const folders = await store
    .query()
    .from("Folder", "parent")
    .whereNode("parent", (f) => f.id.eq(folderId))
    .traverse("contains", "e")
    .to("Folder", "child")
    .select((ctx) => ({
      id: ctx.child.id,
      title: ctx.child.title,
      path: ctx.child.path,
    }))
    .execute();

  const documents = await store
    .query()
    .from("Folder", "parent")
    .whereNode("parent", (f) => f.id.eq(folderId))
    .traverse("contains", "e")
    .to("Document", "doc")
    .select((ctx) => ({
      id: ctx.doc.id,
      title: ctx.doc.title,
      status: ctx.doc.status,
    }))
    .execute();

  return { folders, documents };
}
```

### Get Breadcrumb Path

```typescript
async function getBreadcrumb(
  contentId: string
): Promise<Array<{ id: string; title: string; path: string }>> {
  return store
    .query()
    .from("Content", "c", { includeSubClasses: true })
    .whereNode("c", (c) => c.id.eq(contentId))
    .traverse("contains", "e", { direction: "in" })
    .recursive()
    .to("Folder", "ancestor")
    .select((ctx) => ({
      id: ctx.ancestor.id,
      title: ctx.ancestor.title,
      path: ctx.ancestor.path,
    }))
    .execute();
}
```

## Bulk Operations

### Move Document to Folder

```typescript
async function moveDocument(documentId: string, targetFolderId: string): Promise<void> {
  await store.transaction(async (tx) => {
    // Remove from current folder
    const currentEdge = await tx
      .query()
      .from("Folder", "f")
      .traverse("contains", "e")
      .to("Document", "d")
      .whereNode("d", (d) => d.id.eq(documentId))
      .select((ctx) => ctx.e.id)
      .first();

    if (currentEdge) {
      await tx.edges.contains.delete(currentEdge);
    }

    // Add to new folder
    const document = await tx.nodes.Document.getById(documentId);
    if (!document) throw new Error(`Document not found: ${documentId}`);

    const targetFolder = await tx.nodes.Folder.getById(targetFolderId);
    if (!targetFolder) throw new Error(`Folder not found: ${targetFolderId}`);

    await tx.edges.contains.create(targetFolder, document, {});
  });
}
```

### Bulk Archive

```typescript
async function archiveFolder(folderId: string): Promise<number> {
  // Get all documents in folder and subfolders
  const documents = await store
    .query()
    .from("Folder", "f")
    .whereNode("f", (f) => f.id.eq(folderId))
    .traverse("contains", "e")
    .recursive()
    .to("Document", "d")
    .select((ctx) => ctx.d.id)
    .execute();

  // Archive each document
  await store.transaction(async (tx) => {
    for (const docId of documents) {
      await tx.nodes.Document.update(docId, { status: "archived" });
    }
  });

  return documents.length;
}
```

## Next Steps

- [Product Catalog](/examples/product-catalog) - Categories, variants, inventory
- [Workflow Engine](/examples/workflow-engine) - State machines with approvals
- [Audit Trail](/examples/audit-trail) - Complete change tracking

# Knowledge Graph for RAG

> Enhance retrieval with entity linking, relationship traversal, and multi-hop context

This example demonstrates how **graph structure enhances RAG** beyond vector similarity. While [Semantic Search](/semantic-search) covers embedding basics, this guide focuses on what graphs uniquely provide: entity disambiguation, relationship traversal, and structured context that flat retrieval cannot offer.

## What Graphs Add to RAG

| Flat RAG | Graph RAG |
|----------|-----------|
| Returns similar chunks | Traverses to related entities and facts |
| Treats "Apple" the same everywhere | Disambiguates Apple Inc. vs. apple fruit |
| Context is unstructured text | Context includes structured relationships |
| Single-hop retrieval | Multi-hop reasoning across connections |

**Example**: For "What companies has Elon Musk founded?", flat RAG returns chunks mentioning him.
Graph RAG traverses from the "Elon Musk" entity through "founded" edges to return structured company data—regardless of whether those facts appear in the same chunk.

## Schema

```typescript
import { z } from "zod";
import { defineNode, defineEdge, defineGraph, embedding, inverseOf } from "@nicia-ai/typegraph";

// Source documents
const Document = defineNode("Document", {
  schema: z.object({
    title: z.string(),
    source: z.string(),
  }),
});

// Text chunks with embeddings
const Chunk = defineNode("Chunk", {
  schema: z.object({
    text: z.string(),
    embedding: embedding(1536),
    position: z.number().int(),
  }),
});

// Extracted entities
const Entity = defineNode("Entity", {
  schema: z.object({
    name: z.string(),
    type: z.enum(["person", "organization", "location", "concept", "product", "event"]),
    description: z.string().optional(),
    embedding: embedding(1536).optional(),
  }),
});

// Edges
const containsChunk = defineEdge("containsChunk");
const nextChunk = defineEdge("nextChunk");
const prevChunk = defineEdge("prevChunk");
const mentions = defineEdge("mentions", {
  schema: z.object({
    confidence: z.number().min(0).max(1).optional(),
  }),
});
const relatesTo = defineEdge("relatesTo", {
  schema: z.object({
    relationship: z.string(), // "founded", "works_at", "located_in"
  }),
});

export const graph = defineGraph({
  id: "rag_graph",
  nodes: {
    Document: { type: Document },
    Chunk: { type: Chunk },
    Entity: {
      type: Entity,
      unique: [
        {
          name: "entity_name_type",
          fields: ["name", "type"],
          scope: "kind",
          collation: "caseInsensitive",
        },
      ],
    },
  },
  edges: {
    containsChunk: { type: containsChunk, from: [Document], to: [Chunk] },
    nextChunk: { type: nextChunk, from: [Chunk], to: [Chunk] },
    prevChunk: { type: prevChunk, from: [Chunk], to: [Chunk] },
    mentions: { type: mentions, from: [Chunk], to: [Entity] },
    relatesTo: { type: relatesTo, from: [Entity], to: [Entity] },
  },
  ontology: [inverseOf(nextChunk, prevChunk)],
});
```

## Embedding Setup

Using [Vercel AI SDK](https://ai-sdk.dev/docs/ai-sdk-core/embeddings):
```typescript
import { embed, embedMany } from "ai";
import { openai } from "@ai-sdk/openai";

const embeddingModel = openai.textEmbeddingModel("text-embedding-3-small");

async function generateEmbedding(text: string): Promise<number[]> {
  const { embedding } = await embed({ model: embeddingModel, value: text });
  return embedding;
}

async function generateEmbeddings(texts: string[]): Promise<number[][]> {
  const { embeddings } = await embedMany({ model: embeddingModel, values: texts });
  return embeddings;
}
```

## Ingestion with Entity Linking

The key graph RAG capability: linking chunks to disambiguated entities.

```typescript
import type { Node } from "@nicia-ai/typegraph";

interface ChunkData {
  text: string;
  entities: Array<{
    name: string;
    type: "person" | "organization" | "location" | "concept" | "product" | "event";
  }>;
}

async function ingestDocument(
  title: string,
  source: string,
  chunks: ChunkData[]
): Promise<void> {
  await store.transaction(async (tx) => {
    const doc = await tx.nodes.Document.create({ title, source });

    // Batch embed all chunks
    const chunkEmbeddings = await generateEmbeddings(chunks.map((c) => c.text));

    let prevChunk: Node | undefined;

    for (const [i, chunkData] of chunks.entries()) {
      const chunk = await tx.nodes.Chunk.create({
        text: chunkData.text,
        embedding: chunkEmbeddings[i],
        position: i,
      });

      await tx.edges.containsChunk.create(doc, chunk, {});

      if (prevChunk) {
        await tx.edges.nextChunk.create(prevChunk, chunk, {});
      }

      // Link to entities (dedupe by unique constraint)
      for (const entityData of chunkData.entities) {
        const entityResult = await tx.nodes.Entity.getOrCreateByConstraint(
          "entity_name_type",
          {
            name: entityData.name,
            type: entityData.type,
          }
        );

        // Compute expensive derived fields only for newly created entities
        if (entityResult.action === "created") {
          await tx.nodes.Entity.update(entityResult.node.id, {
            embedding: await generateEmbedding(entityData.name),
          });
        }

        await tx.edges.mentions.getOrCreateByEndpoints(
          chunk,
          entityResult.node,
          {},
          { ifExists: "return" }
        );
      }

      prevChunk = chunk;
    }
  });
}
```

## Graph-Specific Query Patterns

These patterns demonstrate capabilities that require graph structure—they cannot be replicated with flat vector search.

### Entity-Based Retrieval

Find all chunks mentioning a specific entity, regardless of how it's phrased:

```typescript
async function findChunksByEntity(entityName: string) {
  return store
    .query()
    .from("Entity", "e")
    .whereNode("e", (e) => e.name.eq(entityName))
    .traverse("mentions", "m", { direction: "in" })
    .to("Chunk", "c")
    .select((ctx) => ctx.c.text)
    .execute();
}
```

### Multi-Hop Entity Traversal

Find entities connected through relationships:

```typescript
async function findRelatedEntities(entityName: string, maxHops = 2) {
  const rows = await store
    .query()
    .from("Entity", "e")
    .whereNode("e", (e) => e.name.eq(entityName))
    .traverse("relatesTo", "r")
    .recursive({ maxHops, depth: "depth" })
    .to("Entity", "related")
    .select((ctx) => ({
      from: ctx.e.name,
      to: ctx.related.name,
      toId: ctx.related.id,
      depth: ctx.depth,
    }))
    .execute();

  // Distinct paths can reach the same target; dedupe by target
  const seen = new Set<string>();
  return rows
    .filter((row) => {
      if (seen.has(row.toId)) return false;
      seen.add(row.toId);
      return true;
    })
    .map((row) => ({
      from: row.from,
      to: row.to,
      depth: row.depth,
    }));
}
```

### Context Window Expansion

Get surrounding chunks for a match:

```typescript
async function getChunkWithContext(chunkId: string, windowSize = 1) {
  const [before, after] = await Promise.all([
    store
      .query()
      .from("Chunk", "c")
      .whereNode("c", (c) => c.id.eq(chunkId))
      .traverse("prevChunk", "e")
      .recursive({ maxHops: windowSize })
      .to("Chunk", "prev")
      .orderBy("prev", "position", "desc")
      .select((ctx) => ctx.prev.text)
      .execute(),
    store
      .query()
      .from("Chunk", "c")
      .whereNode("c", (c) => c.id.eq(chunkId))
      .traverse("nextChunk", "e")
      .recursive({ maxHops: windowSize })
      .to("Chunk", "next")
      .orderBy("next", "position", "asc")
      .select((ctx) => ctx.next.text)
      .execute(),
  ]);

  const chunk = await store.nodes.Chunk.getById(chunkId);

  return {
    before: before.toReversed(),
    chunk: chunk?.text ?? "",
    after,
  };
}
```

## Hybrid Retrieval: Vector + Graph

Combine vector similarity with graph traversal in a single query using the `from` option:

```typescript
async function hybridRetrieval(query: string, limit = 10) {
  const queryEmbedding = await generateEmbedding(query);

  // Single query: vector search + fan-out to entities AND document
  const results = await store
    .query()
    .from("Chunk", "c")
    .whereNode("c", (c) =>
      c.embedding.similarTo(queryEmbedding, limit, { metric: "cosine", minScore: 0.7 })
    )
    .traverse("mentions", "m")
    .to("Entity", "e")
    .traverse("containsChunk", "d_edge", { direction: "in", from: "c" }) // Fan-out from chunk
    .to("Document", "d")
    .select((ctx) => ({
      chunkId: ctx.c.id,
      text: ctx.c.text,
      score: ctx.c.embedding.similarity(queryEmbedding),
      source: ctx.d.title,
      entityName: ctx.e.name,
      entityType: ctx.e.type,
    }))
    .execute();

  // Group by chunk (one row per chunk-entity pair)
  const byChunk = new Map<
    string,
    (typeof results)[number] & { entities: Array<{ name: string; type: string }> }
  >();
  for (const row of results) {
    const existing = byChunk.get(row.chunkId);
    if (existing) {
      existing.entities.push({ name: row.entityName, type: row.entityType });
    } else {
      byChunk.set(row.chunkId, {
        ...row,
        entities: [{ name: row.entityName, type: row.entityType }],
      });
    }
  }

  return [...byChunk.values()];
}
```

The `from` option enables **fan-out patterns** where you traverse multiple relationships from the same node. Without `from`, traversals chain sequentially (A→B→C). With `from`, you can branch: traverse from chunk to entities, AND from chunk to document.
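Because the fan-out query returns one row per chunk–entity pair, the grouping step at the end is ordinary data-shaping that can be reasoned about in isolation. Here is the same reduction as a standalone sketch; the row shape is illustrative, trimmed to the fields the grouping touches:

```typescript
interface ChunkEntityRow {
  chunkId: string;
  text: string;
  entityName: string;
  entityType: string;
}

// Collapse chunk-entity rows into one record per chunk,
// accumulating each chunk's entities into an array.
function groupRowsByChunk(rows: ChunkEntityRow[]) {
  const byChunk = new Map<
    string,
    { chunkId: string; text: string; entities: Array<{ name: string; type: string }> }
  >();
  for (const row of rows) {
    const entity = { name: row.entityName, type: row.entityType };
    const existing = byChunk.get(row.chunkId);
    if (existing) {
      existing.entities.push(entity);
    } else {
      byChunk.set(row.chunkId, { chunkId: row.chunkId, text: row.text, entities: [entity] });
    }
  }
  return [...byChunk.values()];
}
```

A `Map` keyed by chunk id keeps the first-seen order of chunks, which preserves the similarity ranking produced by the query.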
## Building Structured Context

Format graph-enriched context for an LLM:

```typescript
async function buildGraphContext(query: string, extractedEntities: string[]) {
  const queryEmbedding = await generateEmbedding(query);

  // Get relevant chunks with sources
  const chunks = await store
    .query()
    .from("Chunk", "c")
    .whereNode("c", (c) =>
      c.embedding.similarTo(queryEmbedding, 5, { metric: "cosine", minScore: 0.7 })
    )
    .traverse("containsChunk", "e", { direction: "in" })
    .to("Document", "d")
    .select((ctx) => ({ text: ctx.c.text, source: ctx.d.title }))
    .execute();

  // Get entity relationships from graph
  const entityFacts = await Promise.all(
    extractedEntities.map(async (name) => {
      const relations = await store
        .query()
        .from("Entity", "e")
        .whereNode("e", (e) => e.name.eq(name))
        .traverse("relatesTo", "r")
        .to("Entity", "target")
        .select((ctx) => ctx.target.name)
        .execute();

      return relations.length > 0 ? { name, relatedTo: relations } : undefined;
    })
  );

  return { chunks, entityFacts: entityFacts.filter(Boolean) };
}

function formatForPrompt(context: Awaited<ReturnType<typeof buildGraphContext>>): string {
  let prompt = "## Relevant Passages\n\n";
  for (const chunk of context.chunks) {
    prompt += `**${chunk.source}**: ${chunk.text}\n\n`;
  }

  if (context.entityFacts.length > 0) {
    prompt += "## Entity Relationships\n\n";
    for (const entity of context.entityFacts) {
      if (entity) {
        prompt += `**${entity.name}** → ${entity.relatedTo.join(", ")}\n`;
      }
    }
  }

  return prompt;
}
```

## When to Use Graph RAG

**Use graph RAG when:**

- Queries require connecting facts across documents ("Who founded X and what else did they start?")
- Entity disambiguation matters (distinguishing "Apple" the company from "apple" the fruit)
- Relationship traversal provides value ("Find all companies in the same industry as X")
- You need structured facts alongside unstructured text

**Flat vector RAG may suffice when:**

- Simple "find similar content" queries
- No entity relationships to exploit
- Single-document question answering

## Next Steps

- [Semantic Search](/semantic-search) — Vector embedding fundamentals
- [Traversals](/queries/traverse) — Graph traversal patterns
- [Document Management](/examples/document-management) — Versioning and access control

# Multi-Tenant SaaS

> Complete multi-tenancy patterns with isolation, data partitioning, and tenant management

This example shows how to build a multi-tenant SaaS application with:

- **Three isolation strategies** (shared tables, schema per tenant, database per tenant)
- **Tenant-aware queries** that automatically filter data
- **Tenant provisioning** and lifecycle management
- **Cross-tenant analytics** for platform operators
- **Tenant migration** between isolation levels

## Choosing an Isolation Strategy

| Strategy | Isolation | Complexity | Cost | Best For |
|----------|-----------|------------|------|----------|
| Shared tables | Low | Low | Lowest | Many small tenants, B2C SaaS |
| Schema per tenant | Medium | Medium | Low | SMB customers, PostgreSQL only |
| Database per tenant | High | High | Highest | Enterprise, compliance requirements |

## Strategy 1: Shared Tables with Row-Level Isolation

All tenants share the same database tables, filtered by `tenantId`.
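Before committing to this strategy, the trade-offs in the comparison table above can be encoded as a rough default rule. A hypothetical helper; the thresholds and field names are illustrative, not prescriptive:

```typescript
type IsolationStrategy = "shared-tables" | "schema-per-tenant" | "database-per-tenant";

interface TenantProfile {
  complianceRequired: boolean;   // e.g. data-residency or audit mandates
  expectedTenantCount: number;
  usesPostgres: boolean;         // schema-per-tenant is PostgreSQL only
}

// Rough mapping of the comparison table to a default choice:
// compliance pushes toward full isolation; very large tenant
// counts push toward shared tables.
function recommendStrategy(p: TenantProfile): IsolationStrategy {
  if (p.complianceRequired) return "database-per-tenant";
  if (p.usesPostgres && p.expectedTenantCount <= 1000) return "schema-per-tenant";
  return "shared-tables";
}
```

Treat the output as a starting point; real deployments often mix strategies, keeping small tenants in shared tables and moving large ones out (see Tenant Migration below).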
### Schema Definition

```typescript
import { z } from "zod";
import { defineNode, defineEdge, defineGraph } from "@nicia-ai/typegraph";

// Tenant metadata
const Tenant = defineNode("Tenant", {
  schema: z.object({
    slug: z.string(),
    name: z.string(),
    plan: z.enum(["free", "starter", "pro", "enterprise"]),
    status: z.enum(["active", "suspended", "cancelled"]).default("active"),
    createdAt: z.string().datetime(),
    settings: z.record(z.unknown()).optional(),
  }),
});

// All entities include tenantId
const Project = defineNode("Project", {
  schema: z.object({
    tenantId: z.string(),
    name: z.string(),
    description: z.string().optional(),
    status: z.enum(["active", "archived"]).default("active"),
  }),
});

const Task = defineNode("Task", {
  schema: z.object({
    tenantId: z.string(),
    title: z.string(),
    status: z.enum(["todo", "in_progress", "done"]).default("todo"),
    priority: z.enum(["low", "medium", "high"]).default("medium"),
  }),
});

const User = defineNode("User", {
  schema: z.object({
    tenantId: z.string(),
    email: z.string().email(),
    name: z.string(),
    role: z.enum(["owner", "admin", "member", "guest"]).default("member"),
  }),
});

// Edges
const hasProject = defineEdge("hasProject");
const hasTask = defineEdge("hasTask");
const assignedTo = defineEdge("assignedTo");
const memberOf = defineEdge("memberOf");

const graph = defineGraph({
  id: "multi_tenant",
  nodes: {
    Tenant: { type: Tenant },
    Project: { type: Project },
    Task: { type: Task },
    User: { type: User },
  },
  edges: {
    hasProject: { type: hasProject, from: [Tenant], to: [Project] },
    hasTask: { type: hasTask, from: [Project], to: [Task] },
    assignedTo: { type: assignedTo, from: [Task], to: [User] },
    memberOf: { type: memberOf, from: [User], to: [Tenant] },
  },
});
```

### Tenant-Scoped Store

Create a wrapper that automatically filters by tenant:

```typescript
interface TenantContext {
  tenantId: string;
  userId: string;
  role: "owner" | "admin" | "member" | "guest";
}

function createTenantStore(store: Store, ctx: TenantContext) {
  const projects = {
    async list(options: { status?: string } = {}) {
      let query = store
        .query()
        .from("Project", "p")
        .whereNode("p", (p) => p.tenantId.eq(ctx.tenantId));

      if (options.status) {
        query = query.whereNode("p", (p) => p.status.eq(options.status));
      }

      return query.select((q) => q.p).execute();
    },

    async create(data: { name: string; description?: string }) {
      const project = await store.nodes.Project.create({
        ...data,
        tenantId: ctx.tenantId,
      });

      const tenant = await store.nodes.Tenant.getById(ctx.tenantId);
      if (!tenant) throw new Error(`Tenant not found: ${ctx.tenantId}`);
      await store.edges.hasProject.create(tenant, project, {});

      return project;
    },

    async get(projectId: string) {
      const project = await store.nodes.Project.getById(projectId);
      if (!project || project.tenantId !== ctx.tenantId) {
        throw new Error("Not found");
      }
      return project;
    },

    async update(
      projectId: string,
      updates: Partial<{ name: string; description: string; status: "active" | "archived" }>
    ) {
      await projects.get(projectId); // Verify access
      return store.nodes.Project.update(projectId, updates);
    },

    async delete(projectId: string) {
      await projects.get(projectId); // Verify access
      await store.nodes.Project.delete(projectId);
    },
  };

  const tasks = {
    async list(projectId: string) {
      await projects.get(projectId); // Verify access
      return store
        .query()
        .from("Project", "p")
        .whereNode("p", (p) => p.id.eq(projectId))
        .traverse("hasTask", "e")
        .to("Task", "t")
        .select((q) => q.t)
        .execute();
    },

    async create(projectId: string, data: { title: string; priority?: string }) {
      const project = await projects.get(projectId); // Verify access

      const task = await store.nodes.Task.create({
        ...data,
        tenantId: ctx.tenantId,
      });

      await store.edges.hasTask.create(project, task, {});
      return task;
    },
  };

  const users = {
    async list() {
      return store
        .query()
        .from("User", "u")
        .whereNode("u", (u) => u.tenantId.eq(ctx.tenantId))
        .select((q) => q.u)
        .execute();
    },

    async invite(email: string, name: string, role: string) {
      if (ctx.role !== "owner" && ctx.role !== "admin") {
        throw new Error("Insufficient permissions");
      }

      const user = await store.nodes.User.create({
        tenantId: ctx.tenantId,
        email,
        name,
        role,
      });

      const tenant = await store.nodes.Tenant.getById(ctx.tenantId);
      if (!tenant) throw new Error(`Tenant not found: ${ctx.tenantId}`);
      await store.edges.memberOf.create(user, tenant, {});

      return user;
    },
  };

  return { projects, tasks, users };
}

// Usage in API handler
async function handleRequest(req: Request) {
  const session = await getSession(req);
  const tenantStore = createTenantStore(store, {
    tenantId: session.tenantId,
    userId: session.userId,
    role: session.role,
  });

  // All queries are automatically tenant-scoped
  const projects = await tenantStore.projects.list();
}
```

### Tenant Provisioning

```typescript
async function provisionTenant(
  slug: string,
  name: string,
  ownerEmail: string,
  ownerName: string,
  plan: "free" | "starter" | "pro" | "enterprise" = "free"
): Promise<{ tenant: Node; owner: Node }> {
  return store.transaction(async (tx) => {
    // Check slug uniqueness
    const existing = await tx
      .query()
      .from("Tenant", "t")
      .whereNode("t", (t) => t.slug.eq(slug))
      .first();

    if (existing) {
      throw new Error("Tenant slug already exists");
    }

    // Create tenant
    const tenant = await tx.nodes.Tenant.create({
      slug,
      name,
      plan,
      status: "active",
      createdAt: new Date().toISOString(),
    });

    // Create owner user
    const owner = await tx.nodes.User.create({
      tenantId: tenant.id,
      email: ownerEmail,
      name: ownerName,
      role: "owner",
    });

    await tx.edges.memberOf.create(owner, tenant, {});

    return { tenant, owner };
  });
}
```

## Strategy 2: Schema Per Tenant (PostgreSQL)

Each tenant gets their own PostgreSQL schema within the same database.
### Setup

```typescript
import { Pool } from "pg";
import { drizzle } from "drizzle-orm/node-postgres";
import { sql } from "drizzle-orm";
import { createPostgresBackend, generatePostgresMigrationSQL } from "@nicia-ai/typegraph/postgres";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function createTenantSchema(tenantId: string): Promise<void> {
  const schemaName = `tenant_${tenantId}`;

  // Create schema
  await pool.query(`CREATE SCHEMA IF NOT EXISTS ${schemaName}`);

  // Run TypeGraph migrations in the tenant schema
  await pool.query(`SET search_path TO ${schemaName}`);
  await pool.query(generatePostgresMigrationSQL());
  await pool.query(`SET search_path TO public`);
}

async function getTenantStore(tenantId: string): Promise<Store> {
  const schemaName = `tenant_${tenantId}`;

  // Create connection with schema
  const client = await pool.connect();
  await client.query(`SET search_path TO ${schemaName}`);

  const db = drizzle(client);
  const backend = createPostgresBackend(db);
  return createStore(graph, backend);
}
```

### Tenant Store Cache

```typescript
class TenantStoreManager {
  private stores = new Map<string, { store: Store; lastUsed: Date }>();
  private maxCached = 100;

  async getStore(tenantId: string): Promise<Store> {
    const cached = this.stores.get(tenantId);
    if (cached) {
      cached.lastUsed = new Date();
      return cached.store;
    }

    // Evict oldest if at capacity
    if (this.stores.size >= this.maxCached) {
      this.evictOldest();
    }

    const store = await getTenantStore(tenantId);
    this.stores.set(tenantId, { store, lastUsed: new Date() });
    return store;
  }

  private evictOldest(): void {
    let oldest: { id: string; date: Date } | undefined;
    for (const [id, { lastUsed }] of this.stores) {
      if (!oldest || lastUsed < oldest.date) {
        oldest = { id, date: lastUsed };
      }
    }
    if (oldest) {
      this.stores.delete(oldest.id);
    }
  }
}

const tenantManager = new TenantStoreManager();
```

### Provisioning with Schema

```typescript
async function provisionTenantWithSchema(
  slug: string,
  name: string,
  ownerEmail: string
): Promise<{ tenantId: string }> {
  const tenantId = generateUUID();

  // Create schema and tables
  await createTenantSchema(tenantId);

  // Get tenant-specific store
  const tenantStore = await tenantManager.getStore(tenantId);

  // Create initial data
  await tenantStore.nodes.User.create({
    email: ownerEmail,
    name: name,
    role: "owner",
  });

  // Store tenant metadata in public schema
  const publicDb = drizzle(pool);
  await publicDb.insert(tenants).values({
    id: tenantId,
    slug,
    name,
    createdAt: new Date(),
  });

  return { tenantId };
}
```

## Strategy 3: Database Per Tenant

Each tenant gets their own database for maximum isolation.

### Tenant Database Manager

```typescript
interface TenantConfig {
  id: string;
  slug: string;
  databaseUrl: string;
  status: "active" | "suspended";
}

class TenantDatabaseManager {
  private connections = new Map<string, { pool: Pool; store: Store }>();
  private maxConnections = 50;

  async getStore(tenantId: string): Promise<Store> {
    const cached = this.connections.get(tenantId);
    if (cached) return cached.store;

    // Get tenant config from central registry
    const config = await this.getTenantConfig(tenantId);
    if (config.status !== "active") {
      throw new Error("Tenant is not active");
    }

    // Evict if at capacity
    if (this.connections.size >= this.maxConnections) {
      await this.evictLeastUsed();
    }

    // Create new connection
    const pool = new Pool({ connectionString: config.databaseUrl, max: 5 });
    const db = drizzle(pool);
    const backend = createPostgresBackend(db);
    const store = createStore(graph, backend);

    this.connections.set(tenantId, { pool, store });
    return store;
  }

  async closeConnection(tenantId: string): Promise<void> {
    const conn = this.connections.get(tenantId);
    if (conn) {
      await conn.pool.end();
      this.connections.delete(tenantId);
    }
  }

  private async getTenantConfig(tenantId: string): Promise<TenantConfig> {
    // Fetch from central tenant registry
    const result = await centralDb
      .select()
      .from(tenantConfigs)
      .where(eq(tenantConfigs.id, tenantId))
      .get();
    if (!result) throw new Error("Tenant not found");
    return result;
  }

  private async evictLeastUsed(): Promise<void> {
    // Simple LRU eviction
    const first = this.connections.keys().next().value;
    if (first) {
      await this.closeConnection(first);
    }
  }
}

const dbManager = new TenantDatabaseManager();
```

### Provisioning New Database

```typescript
async function provisionTenantDatabase(
  slug: string,
  name: string,
  ownerEmail: string
): Promise<{ tenantId: string; databaseUrl: string }> {
  const tenantId = generateUUID();
  const dbName = `tenant_${tenantId.replace(/-/g, "_")}`;

  // Create database (using admin connection)
  const adminPool = new Pool({ connectionString: process.env.ADMIN_DATABASE_URL });
  await adminPool.query(`CREATE DATABASE ${dbName}`);
  await adminPool.end();

  // Build connection URL
  const baseUrl = new URL(process.env.DATABASE_BASE_URL!);
  baseUrl.pathname = `/${dbName}`;
  const databaseUrl = baseUrl.toString();

  // Initialize TypeGraph tables
  const tenantPool = new Pool({ connectionString: databaseUrl });
  await tenantPool.query(generatePostgresMigrationSQL());

  // Create initial data
  const db = drizzle(tenantPool);
  const backend = createPostgresBackend(db);
  const store = createStore(graph, backend);

  await store.nodes.User.create({
    email: ownerEmail,
    name: name,
    role: "owner",
  });

  await tenantPool.end();

  // Register in central tenant registry
  await centralDb.insert(tenantConfigs).values({
    id: tenantId,
    slug,
    name,
    databaseUrl,
    status: "active",
    createdAt: new Date(),
  });

  return { tenantId, databaseUrl };
}
```

## Cross-Tenant Operations

For platform administrators who need to query across tenants.
### Aggregated Metrics (Shared Tables) ```typescript import { count, field } from "@nicia-ai/typegraph"; async function getTenantMetrics(): Promise< Array<{ tenantId: string; projectCount: number; taskCount: number; userCount: number }> > { // Projects by tenant const projectCounts = await store .query() .from("Project", "p") .groupBy("p", "tenantId") .aggregate({ tenantId: field("p", "tenantId"), projectCount: count("p"), }) .execute(); // Tasks by tenant const taskCounts = await store .query() .from("Task", "t") .groupBy("t", "tenantId") .aggregate({ tenantId: field("t", "tenantId"), taskCount: count("t"), }) .execute(); // Users by tenant const userCounts = await store .query() .from("User", "u") .groupBy("u", "tenantId") .aggregate({ tenantId: field("u", "tenantId"), userCount: count("u"), }) .execute(); // Merge results const metrics = new Map(); for (const p of projectCounts) { metrics.set(p.tenantId, { projectCount: p.projectCount, taskCount: 0, userCount: 0 }); } for (const t of taskCounts) { const existing = metrics.get(t.tenantId) || { projectCount: 0, taskCount: 0, userCount: 0 }; existing.taskCount = t.taskCount; metrics.set(t.tenantId, existing); } for (const u of userCounts) { const existing = metrics.get(u.tenantId) || { projectCount: 0, taskCount: 0, userCount: 0 }; existing.userCount = u.userCount; metrics.set(u.tenantId, existing); } return Array.from(metrics.entries()).map(([tenantId, counts]) => ({ tenantId, ...counts, })); } ``` ### Cross-Tenant Search (Database Per Tenant) ```typescript async function searchAcrossTenants( query: string, tenantIds: string[] ): Promise> { const results = await Promise.all( tenantIds.map(async (tenantId) => { try { const store = await dbManager.getStore(tenantId); const projects = await store .query() .from("Project", "p") .whereNode("p", (p) => p.name.contains(query)) .select((ctx) => ctx.p) .limit(10) .execute(); return { tenantId, results: projects }; } catch (error) { console.error(`Failed to search tenant 
${tenantId}:`, error); return { tenantId, results: [] }; } }) ); return results; } ``` ## Tenant Lifecycle ### Suspend Tenant ```typescript async function suspendTenant(tenantId: string, reason: string): Promise<void> { const current = await store.nodes.Tenant.getById(tenantId); if (!current) throw new Error(`Tenant not found: ${tenantId}`); await store.nodes.Tenant.update(tenantId, { status: "suspended", settings: { ...(current.settings || {}), suspendedAt: new Date().toISOString(), suspendReason: reason, }, }); } ``` ### Delete Tenant (Shared Tables) ```typescript async function deleteTenant(tenantId: string): Promise<void> { await store.transaction(async (tx) => { // Delete all tasks const tasks = await tx .query() .from("Task", "t") .whereNode("t", (t) => t.tenantId.eq(tenantId)) .select((ctx) => ctx.t.id) .execute(); for (const taskId of tasks) { await tx.nodes.Task.delete(taskId); } // Delete all projects const projects = await tx .query() .from("Project", "p") .whereNode("p", (p) => p.tenantId.eq(tenantId)) .select((ctx) => ctx.p.id) .execute(); for (const projectId of projects) { await tx.nodes.Project.delete(projectId); } // Delete all users const users = await tx .query() .from("User", "u") .whereNode("u", (u) => u.tenantId.eq(tenantId)) .select((ctx) => ctx.u.id) .execute(); for (const userId of users) { await tx.nodes.User.delete(userId); } // Delete tenant await tx.nodes.Tenant.delete(tenantId); }); } ``` ### Delete Tenant (Database Per Tenant) ```typescript async function deleteTenantDatabase(tenantId: string): Promise<void> { // Close active connection await dbManager.closeConnection(tenantId); // Get database name const config = await getTenantConfig(tenantId); const dbUrl = new URL(config.databaseUrl); const dbName = dbUrl.pathname.slice(1); // Drop database const adminPool = new Pool({ connectionString: process.env.ADMIN_DATABASE_URL }); await adminPool.query(`DROP DATABASE IF EXISTS ${dbName}`); await adminPool.end(); // Remove from registry await
centralDb.delete(tenantConfigs).where(eq(tenantConfigs.id, tenantId)); } ``` ## Tenant Migration Move a tenant between isolation strategies: ```typescript async function migrateTenantToSeparateDatabase(tenantId: string): Promise<string> { // 1. Create new database (provisioning assigns a new tenant ID) const { tenantId: newTenantId, databaseUrl } = await provisionTenantDatabase( `migrated_${tenantId}`, "Migrated Tenant", "placeholder@example.com" ); // 2. Get tenant data from shared tables const sharedStore = store; const projects = await sharedStore .query() .from("Project", "p") .whereNode("p", (p) => p.tenantId.eq(tenantId)) .select((ctx) => ctx.p) .execute(); const tasks = await sharedStore .query() .from("Task", "t") .whereNode("t", (t) => t.tenantId.eq(tenantId)) .select((ctx) => ctx.t) .execute(); const users = await sharedStore .query() .from("User", "u") .whereNode("u", (u) => u.tenantId.eq(tenantId)) .select((ctx) => ctx.u) .execute(); // 3. Insert into new database const newStore = await dbManager.getStore(newTenantId); await newStore.transaction(async (tx) => { for (const project of projects) { await tx.nodes.Project.create(project); } for (const task of tasks) { await tx.nodes.Task.create(task); } for (const user of users) { await tx.nodes.User.create(user); } }); // 4. Delete from shared tables await deleteTenant(tenantId); return databaseUrl; } ``` ## Next Steps - [Document Management](/examples/document-management) - CMS with semantic search - [Product Catalog](/examples/product-catalog) - Categories, variants, inventory - [Integration Patterns](/integration) - More deployment strategies # Product Catalog > E-commerce catalog with categories, variants, and inventory tracking This example builds a product catalog system with: - **Category hierarchy** with inheritance - **Product variants** (size, color, etc.)
- **Inventory tracking** across warehouses - **Product relationships** (bundles, accessories, alternatives) - **Price history** using temporal queries ## Schema Definition ```typescript import { z } from "zod"; import { defineNode, defineEdge, defineGraph, embedding, } from "@nicia-ai/typegraph"; // Category hierarchy const Category = defineNode("Category", { schema: z.object({ name: z.string(), slug: z.string(), description: z.string().optional(), imageUrl: z.string().url().optional(), displayOrder: z.number().default(0), isActive: z.boolean().default(true), }), }); // Products const Product = defineNode("Product", { schema: z.object({ sku: z.string(), name: z.string(), description: z.string(), basePrice: z.number().positive(), currency: z.string().default("USD"), status: z.enum(["draft", "active", "discontinued"]).default("draft"), embedding: embedding(1536).optional(), }), }); // Product variants (specific size/color combinations) const Variant = defineNode("Variant", { schema: z.object({ sku: z.string(), name: z.string(), // "Large / Blue" priceModifier: z.number().default(0), // Added to base price attributes: z.record(z.string()), // { size: "L", color: "blue" } isDefault: z.boolean().default(false), }), }); // Inventory const Warehouse = defineNode("Warehouse", { schema: z.object({ code: z.string(), name: z.string(), location: z.string(), isActive: z.boolean().default(true), }), }); const Inventory = defineNode("Inventory", { schema: z.object({ quantity: z.number().int().min(0), reservedQuantity: z.number().int().min(0).default(0), reorderPoint: z.number().int().min(0).default(10), lastCountedAt: z.string().datetime().optional(), }), }); // Edges const parentCategory = defineEdge("parentCategory"); const inCategory = defineEdge("inCategory", { schema: z.object({ isPrimary: z.boolean().default(false) }), }); const hasVariant = defineEdge("hasVariant"); const inventoryFor = defineEdge("inventoryFor"); const atWarehouse = defineEdge("atWarehouse"); const 
relatedProduct = defineEdge("relatedProduct", { schema: z.object({ type: z.enum(["accessory", "alternative", "bundled", "upsell"]), sortOrder: z.number().default(0), }), }); // Graph const graph = defineGraph({ id: "product_catalog", nodes: { Category: { type: Category }, Product: { type: Product }, Variant: { type: Variant }, Warehouse: { type: Warehouse }, Inventory: { type: Inventory }, }, edges: { parentCategory: { type: parentCategory, from: [Category], to: [Category] }, inCategory: { type: inCategory, from: [Product], to: [Category] }, hasVariant: { type: hasVariant, from: [Product], to: [Variant] }, inventoryFor: { type: inventoryFor, from: [Inventory], to: [Variant] }, atWarehouse: { type: atWarehouse, from: [Inventory], to: [Warehouse] }, relatedProduct: { type: relatedProduct, from: [Product], to: [Product] }, }, ontology: [ // Category hierarchy is modeled via the parentCategory edge, not ontology. // Use ontology for type-level constraints, e.g.: // disjointWith(Product, Category), ], }); ``` ## Category Management ### Create Category Tree ```typescript async function createCategory( name: string, slug: string, parentSlug?: string ) { const category = await store.nodes.Category.create({ name, slug, isActive: true, }); if (parentSlug) { const parent = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(parentSlug)) .select((ctx) => ctx.c) .first(); if (parent) { await store.edges.parentCategory.create(category, parent, {}); } } return category; } // Build initial category structure await createCategory("Electronics", "electronics"); await createCategory("Phones", "phones", "electronics"); await createCategory("Accessories", "accessories", "electronics"); await createCategory("Cases", "cases", "accessories"); await createCategory("Chargers", "chargers", "accessories"); ``` ### Get Category with Ancestors ```typescript interface CategoryWithPath { id: string; name: string; slug: string; path: Array<{ name: string; slug:
string }>; } async function getCategoryWithPath(slug: string): Promise<CategoryWithPath | undefined> { const category = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(slug)) .select((ctx) => ({ id: ctx.c.id, name: ctx.c.name, slug: ctx.c.slug, })) .first(); if (!category) return undefined; const ancestors = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(slug)) .traverse("parentCategory", "e") .recursive() .to("Category", "ancestor") .select((ctx) => ({ name: ctx.ancestor.name, slug: ctx.ancestor.slug, })) .execute(); return { ...category, path: ancestors.reverse(), // Root first }; } ``` ### Get Subcategories ```typescript async function getSubcategories( parentSlug: string, includeNested = false ): Promise<Array<{ id: string; name: string; slug: string; depth: number }>> { let query = store .query() .from("Category", "parent") .whereNode("parent", (c) => c.slug.eq(parentSlug)) .traverse("parentCategory", "e", { direction: "in" }); if (includeNested) { query = query.recursive({ depth: "depth" }); } return query .to("Category", "child") .whereNode("child", (c) => c.isActive.eq(true)) .select((ctx) => ({ id: ctx.child.id, name: ctx.child.name, slug: ctx.child.slug, depth: ctx.depth ??
1, })) .orderBy((ctx) => ctx.child.displayOrder, "asc") .execute(); } ``` ## Product Management ### Create Product with Variants ```typescript interface ProductInput { sku: string; name: string; description: string; basePrice: number; categorySlug: string; variants: Array<{ sku: string; name: string; priceModifier?: number; attributes: Record<string, string>; isDefault?: boolean; }>; } async function createProduct(input: ProductInput) { return store.transaction(async (tx) => { // Generate embedding for semantic search const embedding = await generateEmbedding(`${input.name} ${input.description}`); // Create product const product = await tx.nodes.Product.create({ sku: input.sku, name: input.name, description: input.description, basePrice: input.basePrice, status: "draft", embedding, }); // Link to category const category = await tx .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(input.categorySlug)) .select((ctx) => ctx.c) .first(); if (category) { await tx.edges.inCategory.create(product, category, { isPrimary: true }); } // Create variants for (const v of input.variants) { const variant = await tx.nodes.Variant.create({ sku: v.sku, name: v.name, priceModifier: v.priceModifier ?? 0, attributes: v.attributes, isDefault: v.isDefault ??
false, }); await tx.edges.hasVariant.create(product, variant, {}); } return product; }); } ``` ### Get Product Details ```typescript interface ProductDetails { id: string; sku: string; name: string; description: string; basePrice: number; status: string; categories: Array<{ name: string; slug: string; isPrimary: boolean }>; variants: Array<{ id: string; sku: string; name: string; price: number; attributes: Record<string, string>; inventory: number; }>; related: Array<{ id: string; name: string; type: string }>; } async function getProductDetails(sku: string): Promise<ProductDetails | undefined> { const product = await store .query() .from("Product", "p") .whereNode("p", (p) => p.sku.eq(sku)) .select((ctx) => ctx.p) .first(); if (!product) return undefined; // Get categories const categories = await store .query() .from("Product", "p") .whereNode("p", (p) => p.id.eq(product.id)) .traverse("inCategory", "e") .to("Category", "c") .select((ctx) => ({ name: ctx.c.name, slug: ctx.c.slug, isPrimary: ctx.e.isPrimary, })) .execute(); // Get variants with inventory const variants = await store .query() .from("Product", "p") .whereNode("p", (p) => p.id.eq(product.id)) .traverse("hasVariant", "e") .to("Variant", "v") .optionalTraverse("inventoryFor", "inv", { direction: "in" }) .to("Inventory", "i") .select((ctx) => ({ id: ctx.v.id, sku: ctx.v.sku, name: ctx.v.name, priceModifier: ctx.v.priceModifier, attributes: ctx.v.attributes, quantity: ctx.i?.quantity ?? 0, reservedQuantity: ctx.i?.reservedQuantity ??
0, })) .execute(); // Get related products const related = await store .query() .from("Product", "p") .whereNode("p", (p) => p.id.eq(product.id)) .traverse("relatedProduct", "e") .to("Product", "r") .select((ctx) => ({ id: ctx.r.id, name: ctx.r.name, type: ctx.e.type, })) .orderBy((ctx) => ctx.e.sortOrder, "asc") .execute(); return { id: product.id, sku: product.sku, name: product.name, description: product.description, basePrice: product.basePrice, status: product.status, categories, variants: variants.map((v) => ({ ...v, price: product.basePrice + v.priceModifier, inventory: v.quantity - v.reservedQuantity, })), related, }; } ``` ## Inventory Management ### Update Inventory ```typescript async function updateInventory( variantSku: string, warehouseCode: string, quantity: number ): Promise<void> { const variant = await store .query() .from("Variant", "v") .whereNode("v", (v) => v.sku.eq(variantSku)) .select((ctx) => ctx.v) .first(); const warehouse = await store .query() .from("Warehouse", "w") .whereNode("w", (w) => w.code.eq(warehouseCode)) .select((ctx) => ctx.w) .first(); if (!variant || !warehouse) { throw new Error("Variant or warehouse not found"); } // Find existing inventory record (same traversal shape as reserveInventory below: // variant -> incoming inventoryFor -> inventory -> atWarehouse -> warehouse) const existingInventory = await store .query() .from("Variant", "v") .whereNode("v", (v) => v.id.eq(variant.id)) .traverse("inventoryFor", "e1", { direction: "in" }) .to("Inventory", "i") .traverse("atWarehouse", "e2") .to("Warehouse", "w") .whereNode("w", (w) => w.id.eq(warehouse.id)) .select((ctx) => ctx.i) .first(); if (existingInventory) { await store.nodes.Inventory.update(existingInventory.id, { quantity, lastCountedAt: new Date().toISOString(), }); } else { const inventory = await store.nodes.Inventory.create({ quantity, reservedQuantity: 0, lastCountedAt: new Date().toISOString(), }); await store.edges.inventoryFor.create(inventory, variant, {}); await store.edges.atWarehouse.create(inventory, warehouse, {}); } } ``` ### Reserve Inventory ```typescript async function
reserveInventory( variantSku: string, quantity: number ): Promise<{ success: boolean; warehouseCode?: string }> { const inventories = await store .query() .from("Variant", "v") .whereNode("v", (v) => v.sku.eq(variantSku)) .traverse("inventoryFor", "e", { direction: "in" }) .to("Inventory", "i") .traverse("atWarehouse", "e2") .to("Warehouse", "w") .whereNode("w", (w) => w.isActive.eq(true)) .select((ctx) => ({ inventoryId: ctx.i.id, warehouseCode: ctx.w.code, available: ctx.i.quantity - ctx.i.reservedQuantity, reservedQuantity: ctx.i.reservedQuantity, })) .execute(); // Find warehouse with enough inventory const available = inventories.find((i) => i.available >= quantity); if (!available) { return { success: false }; } await store.nodes.Inventory.update(available.inventoryId, { reservedQuantity: available.reservedQuantity + quantity, }); return { success: true, warehouseCode: available.warehouseCode }; } ``` ### Low Stock Report ```typescript import { field, sum, havingLt } from "@nicia-ai/typegraph"; interface LowStockItem { productName: string; variantSku: string; variantName: string; totalQuantity: number; reorderPoint: number; } async function getLowStockItems(): Promise<LowStockItem[]> { return store .query() .from("Product", "p") .traverse("hasVariant", "e1") .to("Variant", "v") .traverse("inventoryFor", "e2", { direction: "in" }) .to("Inventory", "i") .groupByNode("v") .having(havingLt(sum("i", "quantity"), field("i", "reorderPoint"))) .aggregate({ productName: field("p", "name"), variantSku: field("v", "sku"), variantName: field("v", "name"), totalQuantity: sum("i", "quantity"), reorderPoint: field("i", "reorderPoint"), }) .execute(); } ``` ## Search and Discovery ### Semantic Product Search ```typescript async function searchProducts( query: string, options: { categorySlug?: string; minPrice?: number; maxPrice?: number; limit?: number; } = {} ): Promise<Array<{ product: ProductProps; score: number }>> { const { categorySlug, minPrice, maxPrice, limit = 20 } = options; const queryEmbedding = await
generateEmbedding(query); let queryBuilder = store .query() .from("Product", "p") .whereNode("p", (p) => { let pred = p.embedding .similarTo(queryEmbedding, limit, { metric: "cosine", minScore: 0.6 }) .and(p.status.eq("active")); if (minPrice !== undefined) { pred = pred.and(p.basePrice.gte(minPrice)); } if (maxPrice !== undefined) { pred = pred.and(p.basePrice.lte(maxPrice)); } return pred; }); // Filter by category if specified if (categorySlug) { // Get the category itself plus all subcategories (ids, not slugs) const rootId = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(categorySlug)) .select((ctx) => ctx.c.id) .first(); const subIds = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(categorySlug)) .traverse("parentCategory", "e", { direction: "in" }) .recursive() .to("Category", "sub") .select((ctx) => ctx.sub.id) .execute(); const categoryIds = rootId ? [rootId, ...subIds] : subIds; queryBuilder = queryBuilder .traverse("inCategory", "e") .to("Category", "c") .whereNode("c", (c) => c.id.in(categoryIds)); } return queryBuilder .select((ctx) => ({ product: ctx.p, score: ctx.p.embedding.similarity(queryEmbedding), })) .execute(); } ``` ### Get Products in Category ```typescript async function getProductsInCategory( categorySlug: string, options: { includeSubcategories?: boolean; page?: number; pageSize?: number; sortBy?: "name" | "price" | "newest"; } = {} ): Promise<{ products: ProductProps[]; total: number }> { const { includeSubcategories = true, page = 1, pageSize = 20, sortBy = "name" } = options; // Build category ID list let categoryIds: string[] = []; const rootCategory = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(categorySlug)) .select((ctx) => ctx.c.id) .first(); if (!rootCategory) return { products: [], total: 0 }; categoryIds.push(rootCategory); if (includeSubcategories) { const subIds = await store .query() .from("Category", "c") .whereNode("c", (c) => c.slug.eq(categorySlug)) .traverse("parentCategory", "e", { direction: "in" }) .recursive() .to("Category", "sub") .select((ctx) => ctx.sub.id) .execute(); categoryIds = [...categoryIds, ...subIds]; } const query
= store .query() .from("Product", "p") .whereNode("p", (p) => p.status.eq("active")) .traverse("inCategory", "e") .to("Category", "c") .whereNode("c", (c) => c.id.in(categoryIds)) .select((ctx) => ctx.p); // Apply sorting const sortedQuery = sortBy === "price" ? query.orderBy((ctx) => ctx.p.basePrice, "asc") : sortBy === "newest" ? query.orderBy((ctx) => ctx.p.createdAt, "desc") : query.orderBy((ctx) => ctx.p.name, "asc"); const products = await sortedQuery .limit(pageSize) .offset((page - 1) * pageSize) .execute(); const total = await store .query() .from("Product", "p") .whereNode("p", (p) => p.status.eq("active")) .traverse("inCategory", "e") .to("Category", "c") .whereNode("c", (c) => c.id.in(categoryIds)) .count(); return { products, total }; } ``` ## Price History ### Track Price Changes TypeGraph's temporal model automatically tracks all changes: ```typescript async function getPriceHistory(sku: string) { return store .query() .from("Product", "p") .temporal("includeEnded") .whereNode("p", (p) => p.sku.eq(sku)) .orderBy((ctx) => ctx.p.validFrom, "desc") .select((ctx) => ({ price: ctx.p.basePrice, validFrom: ctx.p.validFrom, validTo: ctx.p.validTo, })) .execute(); } ``` ### Price at a Point in Time ```typescript async function getPriceAsOf(sku: string, date: Date): Promise<number | undefined> { const product = await store .query() .from("Product", "p") .temporal("asOf", date.toISOString()) .whereNode("p", (p) => p.sku.eq(sku)) .select((ctx) => ctx.p.basePrice) .first(); return product; } ``` ## Next Steps - [Document Management](/examples/document-management) - CMS with semantic search - [Workflow Engine](/examples/workflow-engine) - State machines with approvals - [Audit Trail](/examples/audit-trail) - Complete change tracking # Workflow Engine > State machines with approvals, assignments, and escalations This example builds a workflow engine with: - **State machine definitions** as graph schemas - **Approval chains** with multiple approvers - **Task assignment** and
delegation - **Escalation rules** based on time - **Audit trail** of all state changes ## Schema Definition ```typescript import { z } from "zod"; import { defineNode, defineEdge, defineGraph, implies } from "@nicia-ai/typegraph"; // Workflow definition (template) const WorkflowDefinition = defineNode("WorkflowDefinition", { schema: z.object({ name: z.string(), description: z.string().optional(), version: z.number().int().positive(), isActive: z.boolean().default(true), }), }); // States within a workflow const State = defineNode("State", { schema: z.object({ name: z.string(), type: z.enum(["initial", "intermediate", "terminal", "approval"]), config: z.record(z.unknown()).optional(), // State-specific config }), }); // Transitions between states const Transition = defineNode("Transition", { schema: z.object({ name: z.string(), condition: z.string().optional(), // Expression to evaluate requiredRole: z.string().optional(), }), }); // Workflow instances const WorkflowInstance = defineNode("WorkflowInstance", { schema: z.object({ referenceId: z.string(), // ID of the entity being processed referenceType: z.string(), // Type of entity (e.g., "PurchaseOrder") status: z.enum(["active", "completed", "cancelled", "failed"]).default("active"), data: z.record(z.unknown()).optional(), // Instance-specific data createdAt: z.string().datetime(), completedAt: z.string().datetime().optional(), }), }); // Tasks assigned to users const Task = defineNode("Task", { schema: z.object({ title: z.string(), description: z.string().optional(), type: z.enum(["action", "approval", "review", "notification"]), status: z.enum(["pending", "in_progress", "completed", "rejected", "escalated"]).default("pending"), dueDate: z.string().datetime().optional(), priority: z.enum(["low", "medium", "high", "urgent"]).default("medium"), result: z.record(z.unknown()).optional(), completedAt: z.string().datetime().optional(), }), }); // Users const User = defineNode("User", { schema: z.object({ email: 
z.string().email(), name: z.string(), role: z.string(), department: z.string().optional(), }), }); // Comments on tasks const Comment = defineNode("Comment", { schema: z.object({ content: z.string(), createdAt: z.string().datetime(), }), }); // Edges const hasState = defineEdge("hasState"); const hasTransition = defineEdge("hasTransition"); const fromState = defineEdge("fromState"); const toState = defineEdge("toState"); const usesDefinition = defineEdge("usesDefinition"); const currentState = defineEdge("currentState"); const hasTask = defineEdge("hasTask"); const assignedTo = defineEdge("assignedTo"); const createdBy = defineEdge("createdBy"); const hasComment = defineEdge("hasComment"); const reportsTo = defineEdge("reportsTo"); // For escalation chain // Graph const graph = defineGraph({ id: "workflow_engine", nodes: { WorkflowDefinition: { type: WorkflowDefinition }, State: { type: State }, Transition: { type: Transition }, WorkflowInstance: { type: WorkflowInstance }, Task: { type: Task }, User: { type: User }, Comment: { type: Comment }, }, edges: { hasState: { type: hasState, from: [WorkflowDefinition], to: [State] }, hasTransition: { type: hasTransition, from: [WorkflowDefinition], to: [Transition] }, fromState: { type: fromState, from: [Transition], to: [State] }, toState: { type: toState, from: [Transition], to: [State] }, usesDefinition: { type: usesDefinition, from: [WorkflowInstance], to: [WorkflowDefinition] }, currentState: { type: currentState, from: [WorkflowInstance], to: [State] }, hasTask: { type: hasTask, from: [WorkflowInstance], to: [Task] }, assignedTo: { type: assignedTo, from: [Task], to: [User] }, createdBy: { type: createdBy, from: [Task, Comment, WorkflowInstance], to: [User] }, hasComment: { type: hasComment, from: [Task], to: [Comment] }, reportsTo: { type: reportsTo, from: [User], to: [User] }, }, ontology: [ // No implies() entailment here: implies relates edges with compatible endpoints // (e.g. marriedTo implies knows), but reportsTo (User -> User) and assignedTo // (Task -> User) have different domains. Escalation along the reportsTo chain // is handled procedurally instead (see escalateTask below). ], }); ``` ## Workflow Definition ### Create Approval
Workflow ```typescript async function createApprovalWorkflow() { return store.transaction(async (tx) => { // Create workflow definition const workflow = await tx.nodes.WorkflowDefinition.create({ name: "Purchase Order Approval", description: "Multi-level approval for purchase orders", version: 1, isActive: true, }); // Create states const states = { draft: await tx.nodes.State.create({ name: "Draft", type: "initial", }), pendingManagerApproval: await tx.nodes.State.create({ name: "Pending Manager Approval", type: "approval", config: { approverRole: "manager", timeout: "48h" }, }), pendingFinanceApproval: await tx.nodes.State.create({ name: "Pending Finance Approval", type: "approval", config: { approverRole: "finance", timeout: "24h" }, }), approved: await tx.nodes.State.create({ name: "Approved", type: "terminal", }), rejected: await tx.nodes.State.create({ name: "Rejected", type: "terminal", }), }; // Link states to workflow for (const state of Object.values(states)) { await tx.edges.hasState.create(workflow, state, {}); } // Create transitions const transitions = [ { from: states.draft, to: states.pendingManagerApproval, name: "Submit", requiredRole: "requester", }, { from: states.pendingManagerApproval, to: states.pendingFinanceApproval, name: "Approve", requiredRole: "manager", condition: "amount > 1000", }, { from: states.pendingManagerApproval, to: states.approved, name: "Approve", requiredRole: "manager", condition: "amount <= 1000", }, { from: states.pendingManagerApproval, to: states.rejected, name: "Reject", requiredRole: "manager", }, { from: states.pendingFinanceApproval, to: states.approved, name: "Approve", requiredRole: "finance", }, { from: states.pendingFinanceApproval, to: states.rejected, name: "Reject", requiredRole: "finance", }, ]; for (const t of transitions) { const transition = await tx.nodes.Transition.create({ name: t.name, requiredRole: t.requiredRole, condition: t.condition, }); await tx.edges.hasTransition.create(workflow,
transition, {}); await tx.edges.fromState.create(transition, t.from, {}); await tx.edges.toState.create(transition, t.to, {}); } return workflow; }); } ``` ## Workflow Instances ### Start Workflow ```typescript interface StartWorkflowInput { workflowName: string; referenceId: string; referenceType: string; data?: Record<string, unknown>; createdByUserId: string; } async function startWorkflow(input: StartWorkflowInput) { return store.transaction(async (tx) => { // Find workflow definition const workflow = await tx .query() .from("WorkflowDefinition", "w") .whereNode("w", (w) => w.name.eq(input.workflowName).and(w.isActive.eq(true))) .select((ctx) => ctx.w) .first(); if (!workflow) { throw new Error(`Workflow '${input.workflowName}' not found`); } // Find initial state const initialState = await tx .query() .from("WorkflowDefinition", "w") .whereNode("w", (w) => w.id.eq(workflow.id)) .traverse("hasState", "e") .to("State", "s") .whereNode("s", (s) => s.type.eq("initial")) .select((ctx) => ctx.s) .first(); if (!initialState) { throw new Error("Workflow has no initial state"); } // Create instance const instance = await tx.nodes.WorkflowInstance.create({ referenceId: input.referenceId, referenceType: input.referenceType, status: "active", data: input.data, createdAt: new Date().toISOString(), }); // Link to definition and state await tx.edges.usesDefinition.create(instance, workflow, {}); await tx.edges.currentState.create(instance, initialState, {}); // Link to creator const creator = await tx.nodes.User.getById(input.createdByUserId); if (!creator) throw new Error(`User not found: ${input.createdByUserId}`); await tx.edges.createdBy.create(instance, creator, {}); return instance; }); } ``` ### Get Available Transitions ```typescript interface AvailableTransition { id: string; name: string; targetState: string; requiredRole?: string; condition?: string; } async function getAvailableTransitions( instanceId: string, userId: string ): Promise<AvailableTransition[]> { // Get user's role const user =
await store.nodes.User.getById(userId); if (!user) throw new Error(`User not found: ${userId}`); const userRole = user.role; // Get current state const currentState = await store .query() .from("WorkflowInstance", "i") .whereNode("i", (i) => i.id.eq(instanceId)) .traverse("currentState", "e") .to("State", "s") .select((ctx) => ctx.s) .first(); if (!currentState) { throw new Error("Instance has no current state"); } // Get transitions from current state const transitions = await store .query() .from("State", "s") .whereNode("s", (s) => s.id.eq(currentState.id)) .traverse("fromState", "e1", { direction: "in" }) .to("Transition", "t") .traverse("toState", "e2") .to("State", "target") .select((ctx) => ({ id: ctx.t.id, name: ctx.t.name, targetState: ctx.target.name, requiredRole: ctx.t.requiredRole, condition: ctx.t.condition, })) .execute(); // Filter by role return transitions.filter( (t) => !t.requiredRole || t.requiredRole === userRole || userRole === "admin" ); } ``` ### Execute Transition ```typescript async function executeTransition( instanceId: string, transitionId: string, userId: string, result?: Record<string, unknown> ): Promise<void> { await store.transaction(async (tx) => { const instance = await tx.nodes.WorkflowInstance.getById(instanceId); if (!instance) throw new Error(`WorkflowInstance not found: ${instanceId}`); if (instance.status !== "active") { throw new Error("Workflow is not active"); } // Verify transition is valid const available = await getAvailableTransitions(instanceId, userId); const transition = available.find((t) => t.id === transitionId); if (!transition) { throw new Error("Transition not available"); } // Get target state const targetState = await tx .query() .from("Transition", "t") .whereNode("t", (t) => t.id.eq(transitionId)) .traverse("toState", "e") .to("State", "s") .select((ctx) => ctx.s) .first(); // Remove current state edge const currentStateEdge = await tx .query() .from("WorkflowInstance", "i") .whereNode("i", (i) => i.id.eq(instanceId))
.traverse("currentState", "e") .to("State", "s") .select((ctx) => ctx.e.id) .first(); if (currentStateEdge) { await tx.edges.currentState.delete(currentStateEdge); } // Add new state edge await tx.edges.currentState.create(instance, targetState!, {}); // Update instance data const updatedData = { ...instance.data, lastTransition: transition.name, ...result }; const updates: { data?: Record<string, unknown>; status?: "active" | "completed" | "cancelled" | "failed"; completedAt?: string } = { data: updatedData }; // Check if terminal state if (targetState!.type === "terminal") { updates.status = "completed"; updates.completedAt = new Date().toISOString(); } await tx.nodes.WorkflowInstance.update(instanceId, updates); // Complete any pending tasks const pendingTasks = await tx .query() .from("WorkflowInstance", "i") .whereNode("i", (i) => i.id.eq(instanceId)) .traverse("hasTask", "e") .to("Task", "t") .whereNode("t", (t) => t.status.in(["pending", "in_progress"])) .select((ctx) => ctx.t.id) .execute(); for (const taskId of pendingTasks) { await tx.nodes.Task.update(taskId, { status: "completed", completedAt: new Date().toISOString(), }); } // Create tasks for new state if needed if (targetState!.type === "approval") { await createApprovalTask(tx, instanceId, targetState!, userId); } }); } ``` ## Task Management ### Create Approval Task ```typescript async function createApprovalTask( tx: Transaction, instanceId: string, state: { name: string; config?: unknown }, requesterId: string ): Promise<void> { const config = state.config as { approverRole: string; timeout: string } | undefined; if (!config) return; // Find approver (first user with matching role, or requester's manager) let approver = await tx .query() .from("User", "u") .whereNode("u", (u) => u.role.eq(config.approverRole)) .select((ctx) => ctx.u) .first(); // If no direct match, find in reporting chain if (!approver) { approver = await tx .query() .from("User", "requester") .whereNode("requester", (u) => u.id.eq(requesterId)) .traverse("reportsTo", "e") .recursive() .to("User", "manager") .whereNode("manager", (u) => u.role.eq(config.approverRole))
.select((ctx) => ctx.manager) .first(); } if (!approver) { throw new Error(`No approver found with role '${config.approverRole}'`); } // Calculate due date const dueDate = calculateDueDate(config.timeout); // Create task const task = await tx.nodes.Task.create({ title: `Approval Required: ${state.name}`, description: `Please review and approve or reject.`, type: "approval", status: "pending", priority: "medium", dueDate: dueDate.toISOString(), }); // Link task to instance and approver const instance = await tx.nodes.WorkflowInstance.getById(instanceId); if (!instance) throw new Error(`WorkflowInstance not found: ${instanceId}`); await tx.edges.hasTask.create(instance, task, {}); await tx.edges.assignedTo.create(task, approver, {}); } function calculateDueDate(timeout: string): Date { const now = new Date(); const match = timeout.match(/^(\d+)(h|d)$/); if (!match) return new Date(now.getTime() + 24 * 60 * 60 * 1000); // Default 24h const value = parseInt(match[1], 10); const unit = match[2]; if (unit === "h") { return new Date(now.getTime() + value * 60 * 60 * 1000); } else { return new Date(now.getTime() + value * 24 * 60 * 60 * 1000); } } ``` ### Get User's Tasks ```typescript interface TaskWithContext { id: string; title: string; type: string; status: string; priority: string; dueDate?: string; workflowName: string; referenceId: string; referenceType: string; } async function getUserTasks( userId: string, status?: "pending" | "in_progress" ): Promise<TaskWithContext[]> { let query = store .query() .from("User", "u") .whereNode("u", (u) => u.id.eq(userId)) .traverse("assignedTo", "e", { direction: "in" }) .to("Task", "t"); if (status) { query = query.whereNode("t", (t) => t.status.eq(status)); } else { query = query.whereNode("t", (t) => t.status.in(["pending", "in_progress"])); } return query .traverse("hasTask", "e2", { direction: "in" }) .to("WorkflowInstance", "i") .traverse("usesDefinition", "e3") .to("WorkflowDefinition", "w") .select((ctx) => ({ id: ctx.t.id, title:
        ctx.t.title,
      type: ctx.t.type,
      status: ctx.t.status,
      priority: ctx.t.priority,
      dueDate: ctx.t.dueDate,
      workflowName: ctx.w.name,
      referenceId: ctx.i.referenceId,
      referenceType: ctx.i.referenceType,
    }))
    .orderBy((ctx) => ctx.t.dueDate, "asc")
    .execute();
}
```

### Complete Task

```typescript
async function completeTask(
  taskId: string,
  userId: string,
  decision: "approve" | "reject",
  comment?: string
): Promise<void> {
  await store.transaction(async (tx) => {
    const task = await tx.nodes.Task.getById(taskId);
    if (!task) throw new Error(`Task not found: ${taskId}`);

    // Verify user is assigned
    const assignee = await tx
      .query()
      .from("Task", "t")
      .whereNode("t", (t) => t.id.eq(taskId))
      .traverse("assignedTo", "e")
      .to("User", "u")
      .select((ctx) => ctx.u.id)
      .first();

    if (assignee !== userId) {
      throw new Error("User is not assigned to this task");
    }

    // Update task
    await tx.nodes.Task.update(taskId, {
      status: decision === "approve" ? "completed" : "rejected",
      completedAt: new Date().toISOString(),
      result: { decision },
    });

    // Add comment if provided
    if (comment) {
      const commentNode = await tx.nodes.Comment.create({
        content: comment,
        createdAt: new Date().toISOString(),
      });
      await tx.edges.hasComment.create(task, commentNode, {});
      const user = await tx.nodes.User.getById(userId);
      if (!user) throw new Error(`User not found: ${userId}`);
      await tx.edges.createdBy.create(commentNode, user, {});
    }

    // Get workflow instance
    const instance = await tx
      .query()
      .from("Task", "t")
      .whereNode("t", (t) => t.id.eq(taskId))
      .traverse("hasTask", "e", { direction: "in" })
      .to("WorkflowInstance", "i")
      .select((ctx) => ctx.i)
      .first();

    // Find and execute the appropriate transition
    const transitions = await getAvailableTransitions(instance!.id, userId);
    const transition = transitions.find((t) =>
      decision === "approve" ?
        t.name === "Approve" : t.name === "Reject"
    );
    if (transition) {
      await executeTransition(instance!.id, transition.id, userId, { decision });
    }
  });
}
```

## Escalation

### Check Overdue Tasks

```typescript
async function getOverdueTasks(): Promise<Array<{ task: Node; assignee: Node }>> {
  const now = new Date().toISOString();

  return store
    .query()
    .from("Task", "t")
    .whereNode("t", (t) =>
      t.status
        .in(["pending", "in_progress"])
        .and(t.dueDate.isNotNull())
        .and(t.dueDate.lt(now))
    )
    .traverse("assignedTo", "e")
    .to("User", "u")
    .select((ctx) => ({
      task: ctx.t,
      assignee: ctx.u,
    }))
    .execute();
}
```

### Escalate Task

```typescript
async function escalateTask(taskId: string): Promise<void> {
  await store.transaction(async (tx) => {
    // Get current assignee
    const currentAssignment = await tx
      .query()
      .from("Task", "t")
      .whereNode("t", (t) => t.id.eq(taskId))
      .traverse("assignedTo", "e")
      .to("User", "u")
      .select((ctx) => ({ edgeId: ctx.e.id, user: ctx.u }))
      .first();

    if (!currentAssignment) {
      throw new Error("Task has no assignee");
    }

    // Find manager in reporting chain
    const manager = await tx
      .query()
      .from("User", "u")
      .whereNode("u", (u) => u.id.eq(currentAssignment.user.id))
      .traverse("reportsTo", "e")
      .to("User", "manager")
      .select((ctx) => ctx.manager)
      .first();

    if (!manager) {
      throw new Error("No manager found for escalation");
    }

    // Update task
    await tx.nodes.Task.update(taskId, {
      status: "escalated",
      priority: "urgent",
    });

    // Reassign to manager
    await tx.edges.assignedTo.delete(currentAssignment.edgeId);
    const task = await tx.nodes.Task.getById(taskId);
    if (!task) throw new Error(`Task not found: ${taskId}`);
    await tx.edges.assignedTo.create(task, manager, {});

    // Add escalation comment
    const comment = await tx.nodes.Comment.create({
      content: `Task escalated from ${currentAssignment.user.name} due to timeout`,
      createdAt: new Date().toISOString(),
    });
    await tx.edges.hasComment.create(task, comment, {});
  });
}
```

### Run Escalation Job

```typescript
async function runEscalationJob(): Promise<{ escalated:
 number }> {
  const overdueTasks = await getOverdueTasks();
  let escalated = 0;

  for (const { task } of overdueTasks) {
    try {
      await escalateTask(task.id);
      escalated++;
    } catch (error) {
      console.error(`Failed to escalate task ${task.id}:`, error);
    }
  }

  return { escalated };
}
```

## Workflow History

### Get Instance Timeline

```typescript
interface TimelineEvent {
  timestamp: string;
  type: "state_change" | "task_created" | "task_completed" | "comment";
  description: string;
  actor?: string;
}

async function getInstanceTimeline(instanceId: string): Promise<TimelineEvent[]> {
  const events: TimelineEvent[] = [];

  // Get state change history using temporal queries
  const stateHistory = await store
    .query()
    .from("WorkflowInstance", "i")
    .temporal("includeEnded")
    .whereNode("i", (i) => i.id.eq(instanceId))
    .traverse("currentState", "e")
    .to("State", "s")
    .orderBy((ctx) => ctx.e.validFrom, "asc")
    .select((ctx) => ({
      stateName: ctx.s.name,
      timestamp: ctx.e.validFrom,
    }))
    .execute();

  for (const state of stateHistory) {
    events.push({
      timestamp: state.timestamp,
      type: "state_change",
      description: `Entered state: ${state.stateName}`,
    });
  }

  // Get task events
  const tasks = await store
    .query()
    .from("WorkflowInstance", "i")
    .whereNode("i", (i) => i.id.eq(instanceId))
    .traverse("hasTask", "e")
    .to("Task", "t")
    .optionalTraverse("assignedTo", "a")
    .to("User", "u")
    .select((ctx) => ({
      title: ctx.t.title,
      status: ctx.t.status,
      createdAt: ctx.t.createdAt,
      completedAt: ctx.t.completedAt,
      assignee: ctx.u?.name,
    }))
    .execute();

  for (const task of tasks) {
    events.push({
      timestamp: task.createdAt.toISOString(),
      type: "task_created",
      description: `Task created: ${task.title}`,
      actor: task.assignee,
    });
    if (task.completedAt) {
      events.push({
        timestamp: task.completedAt,
        type: "task_completed",
        description: `Task ${task.status}: ${task.title}`,
        actor: task.assignee,
      });
    }
  }

  // Sort by timestamp (ISO-8601 strings sort correctly lexicographically)
  return events.sort((a, b) => a.timestamp.localeCompare(b.timestamp));
}
```

## Next Steps

- [Document
Management](/examples/document-management) - CMS with semantic search
- [Product Catalog](/examples/product-catalog) - Categories, variants, inventory
- [Audit Trail](/examples/audit-trail) - Complete change tracking
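
---

As a closing note on the timeout strings used by the approval flow above: `calculateDueDate` accepts values like `"4h"` or `"2d"` and falls back to 24 hours for anything else. The same parsing rule can be sketched as a pure millisecond-offset helper, which makes the behavior easy to unit-test in isolation (the name `parseTimeoutMs` is illustrative, not part of TypeGraph):

```typescript
// Standalone sketch of the timeout parsing rule from calculateDueDate.
// Returns the offset in milliseconds that would be added to "now".
function parseTimeoutMs(timeout: string): number {
  const match = timeout.match(/^(\d+)(h|d)$/);
  if (!match) return 24 * 60 * 60 * 1000; // Malformed input: default to 24h

  const value = parseInt(match[1], 10);
  return match[2] === "h"
    ? value * 60 * 60 * 1000 // hours
    : value * 24 * 60 * 60 * 1000; // days
}
```

Falling back to a default rather than throwing keeps the escalation job resilient to malformed `timeout` values in workflow state configs.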