
Performance Overview

TypeGraph is designed to be a high-performance, low-overhead layer on top of your relational database. By leveraging the power of modern SQL engines (SQLite and PostgreSQL) and precomputing complex relationships, TypeGraph ensures that your knowledge graph scales with your application.

  1. One Query, One Statement: Every query — including multi-hop traversals — compiles to a single SQL statement. No N+1 queries by design.
  2. Precomputed Ontology: Transitive closures, subclass hierarchies, and edge implications are computed once at schema initialization, not during every query.
  3. Batching & Transactions: Bulk collection APIs and transactions minimize round-trips for writes.
  4. Zero-Cost Abstractions: Type safety and ontological reasoning add no measurable runtime overhead.

A common performance problem in ORMs is the N+1 query: you fetch N entities, then issue one query per entity to load related data. TypeGraph eliminates this structurally.

Every query — regardless of how many traversals it chains — compiles to a single SQL statement using Common Table Expressions (CTEs). Each traversal step becomes a CTE that joins against the previous one:

// This compiles to ONE SQL statement, not 3 separate queries
const results = await store
  .query()
  .from("Person", "p")
  .whereNode("p", (p) => p.name.eq("Alice"))
  .traverse("worksAt", "employment")
  .to("Company", "c")
  .traverse("locatedIn", "location")
  .to("City", "city")
  .select((ctx) => ({
    person: ctx.p.name,
    company: ctx.c.name,
    city: ctx.city.name,
  }))
  .execute();

The generated SQL looks like:

WITH cte_p AS (
  SELECT ... FROM typegraph_nodes
  WHERE graph_id = ? AND kind IN ('Person') AND ...
),
cte_employment AS (
  SELECT ... FROM typegraph_edges e
  JOIN typegraph_nodes n ON ...
  WHERE e.graph_id = ? AND ...
),
cte_location AS (
  SELECT ... FROM typegraph_edges e
  JOIN typegraph_nodes n ON ...
  WHERE e.graph_id = ? AND ...
)
SELECT ... FROM cte_p
JOIN cte_employment ON ...
JOIN cte_location ON ...

This holds for all query types:

  • Multi-hop traversals (N CTEs, 1 statement)
  • Recursive traversals (WITH RECURSIVE, 1 statement)
  • Aggregations with traversals (CTEs + GROUP BY, 1 statement)
  • Set operations (UNION/INTERSECT/EXCEPT of CTEs, 1 statement)

There is no dataloader or batching layer because there is nothing to batch — the database handles the entire join graph in a single execution.

For small numbers of writes, individual create() calls inside a transaction are fine. For larger volumes, use the bulk collection APIs — they use multi-row INSERTs and handle parameter chunking internally.

| Method | Returns results | Use case |
| --- | --- | --- |
| bulkCreate(items) | Yes | Need created nodes back |
| bulkInsert(items) | No | Maximum throughput ingestion |
| bulkUpsertById(items) | Yes | Idempotent import (create or update by ID) |
| bulkDelete(ids) | No | Mass soft-delete |

PostgreSQL has a 65,535 bind parameter limit per statement. TypeGraph automatically chunks bulk operations to stay within this limit:

  • Node inserts: ~7,200 per chunk (9 params per node)
  • Edge inserts: ~5,400 per chunk (12 params per edge)

You don’t need to chunk manually — pass arrays of any size and TypeGraph handles the rest.
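The chunk sizes above follow directly from the parameter budget. A minimal sketch of the arithmetic — the helper names here are illustrative, not TypeGraph's API, and TypeGraph's documented sizes (~7,200 and ~5,400) presumably round down from the raw quotient for headroom:

```typescript
// PostgreSQL allows at most 65,535 bind parameters per statement.
const PG_PARAM_LIMIT = 65_535;

// Maximum rows per multi-row INSERT, given bound columns per row.
function maxRowsPerChunk(paramsPerRow: number): number {
  return Math.floor(PG_PARAM_LIMIT / paramsPerRow);
}

// Split a large array into statement-sized chunks.
function chunk<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// 9 params per node → 7,281 rows fit per statement;
// 20,000 nodes become 3 statements instead of 20,000.
const nodeChunks = chunk(new Array(20_000).fill(null), maxRowsPerChunk(9));
```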

Bulk operations are individually transactional (each chunk is atomic), but if you need the entire batch to be atomic, wrap it in a transaction:

// Atomic: all-or-nothing for the entire import
await store.transaction(async (tx) => {
  await tx.nodes.Person.bulkCreate(people);
  await tx.nodes.Company.bulkCreate(companies);
  await tx.edges.worksAt.bulkCreate(employments);
});

Without the wrapping transaction, a failure partway through would leave partial data.

// Small batch (< 100 items): individual creates in a transaction are fine
await store.transaction(async (tx) => {
  for (const person of people) {
    await tx.nodes.Person.create(person);
  }
});

// Medium batch (100–10,000 items): bulkCreate
const created = await store.nodes.Person.bulkCreate(people);

// Large batch (10,000+ items): bulkInsert (no result allocation)
await store.nodes.Person.bulkInsert(people);

// Idempotent import: bulkUpsertById (creates or updates by ID)
await store.nodes.Person.bulkUpsertById(itemsWithIds);

getByIds() on node and edge collections uses a single SELECT ... WHERE id IN (...) instead of N individual queries. Results are returned in input order with undefined for missing entries.

const [alice, bob] = await store.nodes.Person.getByIds([aliceId, bobId]);
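The input-order guarantee means a missing ID shows up as an undefined hole rather than shifting later results. A plain-TypeScript sketch of that alignment step (illustrative, not TypeGraph internals):

```typescript
interface Row {
  id: string;
  name: string;
}

// Re-align rows from a single `WHERE id IN (...)` result set to the
// caller's input order, leaving `undefined` where an ID had no match.
function alignByIds(ids: string[], rows: Row[]): (Row | undefined)[] {
  const byId = new Map<string, Row>(rows.map((r) => [r.id, r] as const));
  return ids.map((id) => byId.get(id));
}

// The database may return rows in any order; callers still get input order.
const rows = [
  { id: "b", name: "Bob" },
  { id: "a", name: "Alice" },
];
const [alice, missing, bob] = alignByIds(["a", "x", "b"], rows);
// alice → Alice, missing → undefined, bob → Bob
```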

TypeGraph does not manage database connections or pools — you bring your own and are responsible for lifecycle. See Backend Setup for full setup guides.

Always use a connection pool in production. TypeGraph issues one SQL statement per query, so pool utilization is straightforward — no long-held connections or multi-statement conversations.

import { Pool } from "pg";

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // Size based on your concurrency needs
  idleTimeoutMillis: 30_000,
  connectionTimeoutMillis: 2_000,
});

pool.on("error", (err) => {
  console.error("Unexpected pool error", err);
});

Sizing guidance: Each concurrent query uses one connection for the duration of that single SQL statement. A pool of 10–20 connections handles most workloads. If you’re running bulk imports in parallel, size up accordingly.

SQLite is single-writer. For best throughput:

  • Enable WAL mode: sqlite.pragma("journal_mode = WAL") — allows concurrent reads while writing
  • Batch writes in transactions rather than issuing many small commits
  • For read-heavy workloads, SQLite performs well without pooling since better-sqlite3 is synchronous

PostgreSQL transactions accept an optional isolation level:

await store.transaction(
  async (tx) => {
    // Serializable isolation for strict consistency
    const snapshot = await tx.nodes.Account.getById(accountId);
    // ...
  },
  { isolationLevel: "serializable" },
);

Available levels: read_uncommitted, read_committed (default), repeatable_read, serializable.

SQLite always operates at serializable isolation.

When you define an ontology (e.g., subClassOf, implies), TypeGraph precomputes the full transitive closure at store initialization. Queries like .from("Parent", "p", { includeSubClasses: true }) use a pre-calculated list of kinds rather than recursive lookups at runtime.
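A simplified sketch of the kind of closure computed at initialization — the real implementation also covers edge implications, and the names here are illustrative:

```typescript
// Direct subClassOf declarations: child → parents.
const subClassOf: Record<string, string[]> = {
  Employee: ["Person"],
  Manager: ["Employee"],
};

// For each kind, compute the set of kinds that should match it when
// includeSubClasses is enabled (itself plus all transitive descendants).
function subclassClosure(direct: Record<string, string[]>): Map<string, Set<string>> {
  const closure = new Map<string, Set<string>>();
  const kinds = new Set<string>(Object.keys(direct));
  for (const parents of Object.values(direct)) parents.forEach((p) => kinds.add(p));
  for (const kind of kinds) closure.set(kind, new Set([kind]));
  // Propagate to a fixpoint: anything matching a child also matches its parent.
  let changed = true;
  while (changed) {
    changed = false;
    for (const [child, parents] of Object.entries(direct)) {
      for (const parent of parents) {
        for (const k of closure.get(child)!) {
          const set = closure.get(parent)!;
          if (!set.has(k)) {
            set.add(k);
            changed = true;
          }
        }
      }
    }
  }
  return closure;
}

// Querying Person with includeSubClasses expands to a precomputed IN list:
const personKinds = subclassClosure(subClassOf).get("Person")!;
// contains Person, Employee, Manager
```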

TypeGraph automatically optimizes queries based on which fields your select() callback accesses. When you select specific fields, TypeGraph generates SQL that only extracts those fields using json_extract() (SQLite) or JSONB path extraction (PostgreSQL), rather than fetching the entire props blob.

// Optimized: Only fetches email and name from the database
const results = await store
  .query()
  .from("Person", "p")
  .whereNode("p", (p) => p.email.eq("alice@example.com"))
  .select((ctx) => ({
    email: ctx.p.email,
    name: ctx.p.name,
  }))
  .execute();

// SQL: SELECT json_extract(props, '$.email'), json_extract(props, '$.name') ...

This optimization pairs well with covering indexes: if your index contains both the filter keys and the selected keys, the database can satisfy the query with an index-only scan.

When optimization applies:

| Pattern | Optimized? | Reason |
| --- | --- | --- |
| `ctx => ({ email: ctx.p.email })` | Yes | Simple field extraction |
| `ctx => [ctx.p.id, ctx.p.name]` | Yes | Multiple fields in array |
| `ctx => ctx.p` | No | Whole node returned |
| `ctx => ({ upper: ctx.p.email.toUpperCase() })` | Yes | Field extracted; method runs in JS |
| `ctx => ({ ...ctx.p })` | No | Spread requires full node |

The optimization is transparent — if your callback can’t be optimized, TypeGraph automatically falls back to fetching the full node data.
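This kind of callback analysis can be sketched with a recording Proxy — a general technique, not necessarily TypeGraph's actual implementation, and deliberately simplified (it does not handle the whole-node or spread cases):

```typescript
// Invoke a select callback once against a recording proxy and collect the
// property names it reads. A failed analysis returns null → fetch full node.
function trackedFields(callback: (node: any) => unknown): string[] | null {
  const accessed = new Set<string>();
  const recorder = new Proxy({}, {
    get(_target, prop) {
      if (typeof prop === "string") accessed.add(prop);
      return undefined; // values are irrelevant during analysis
    },
  });
  try {
    callback(recorder);
    return [...accessed];
  } catch {
    return null; // callback did something the analyzer can't follow
  }
}

const fields = trackedFields((p) => ({ email: p.email, name: p.name }));
// → ["email", "name"] → SQL extracts just those two JSON paths
```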

The default TypeGraph schema includes optimized indexes for the most common access patterns:

  • Graph + Kind + ID: Primary key for node lookups
  • Graph + From/To ID: Optimized for edge traversals
  • Temporal columns: Indexes on valid_from, valid_to, and deleted_at

For application-specific indexes on JSON properties, see Indexes.

Each builder method (.where(), .limit(), .orderBy(), etc.) returns a new immutable instance. The compiled SQL for each instance is cached internally — repeated .execute() calls on the same builder skip recompilation entirely. This applies to standard queries, aggregate queries, and set-operation queries (union, intersect, except). This is transparent and requires no API changes.

const activeUsers = store
  .query()
  .from("User", "u")
  .whereNode("u", (u) => u.status.eq("active"))
  .select((ctx) => ctx.u);

// First call: compiles AST → SQL → executes
await activeUsers.execute();

// Second call: reuses cached SQL → executes
await activeUsers.execute();

For hot paths that execute the same query shape with different values, .prepare() pre-compiles the entire query pipeline (AST build, SQL compilation, text extraction) once. Subsequent .execute(bindings) calls only substitute parameter values and execute.

When executeRaw is available (both SQLite and PostgreSQL backends), the pre-compiled SQL text is sent directly to the driver — zero recompilation overhead.

Best for: API endpoints, hot loops, or any code path that runs the same query shape repeatedly.

See Prepared Queries for usage details.

Apply .whereNode() predicates as early as possible in your query chain. TypeGraph moves these predicates into the initial CTEs, reducing the number of rows that need to be joined in subsequent steps.

When you only need certain fields, select them explicitly rather than returning whole nodes. This triggers the smart select optimization and can enable index-only scans with properly configured indexes.

// Preferred: Only fetches what you need
.select((ctx) => ({ name: ctx.p.name, email: ctx.p.email }))
// Avoid when possible: Fetches entire props blob
.select((ctx) => ctx.p)

Unless you specifically need to query across a hierarchy, avoid includeSubClasses: true. Being specific about the node kind allows the SQL engine to use more restrictive index scans.

For large datasets, prefer .paginate() over .limit() and .offset(). Keyset pagination (using cursors) avoids the O(N) cost of skipping rows in standard SQL offsets.
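The cost difference is easy to see in miniature. OFFSET must produce and discard every skipped row, while a keyset cursor seeks straight to the boundary (a plain-TypeScript analogy — not TypeGraph's cursor format):

```typescript
interface Post {
  id: number;
  createdAt: number;
}

// OFFSET-style paging: the engine still materializes and discards
// `offset` rows, so page N costs O(offset + limit).
function offsetPage(sorted: Post[], offset: number, limit: number): Post[] {
  return sorted.slice(offset, offset + limit);
}

// Keyset-style paging: resume strictly after the (createdAt, id) cursor.
// With an index on (createdAt, id) the database seeks here in O(log N);
// findIndex stands in for that index seek in this sketch.
function keysetPage(
  sorted: Post[],
  cursor: { createdAt: number; id: number } | null,
  limit: number,
): Post[] {
  if (!cursor) return sorted.slice(0, limit);
  const start = sorted.findIndex(
    (p) =>
      p.createdAt > cursor.createdAt ||
      (p.createdAt === cursor.createdAt && p.id > cursor.id),
  );
  return start === -1 ? [] : sorted.slice(start, start + limit);
}

const posts = Array.from({ length: 10 }, (_, i) => ({ id: i, createdAt: 100 + i }));
const page1 = keysetPage(posts, null, 3); // ids 0, 1, 2
const page2 = keysetPage(posts, { createdAt: 102, id: 2 }, 3); // ids 3, 4, 5
```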

TypeGraph’s built-in indexes cover structural lookups (by ID, by edge endpoints). Properties you filter or sort on in whereNode(), whereEdge(), and orderBy() need application-specific expression indexes. Use the Query Profiler to identify which properties need coverage.

Use the Query Profiler to identify missing indexes and understand query patterns in your application. The profiler captures property access patterns and generates prioritized index recommendations.

import { QueryProfiler } from "@nicia-ai/typegraph/profiler";
const profiler = new QueryProfiler();
const profiledStore = profiler.attachToStore(store);
// Run your application or test suite...
const report = profiler.getReport();
console.log(report.recommendations);

TypeGraph uses a deterministic performance sanity suite as its benchmark and regression gate. The suite seeds a realistic graph shape and measures end-to-end query latency across:

  • forward and reverse traversals
  • inverse/symmetric traversal (expand: "inverse" / expand: "all")
  • 2-hop and 3-hop traversals
  • aggregate queries
  • cached execute vs prepared execute
  • deep traversals (10/100/1000 hop recursive with cyclePolicy: "allow")

Guardrail thresholds enforce expected behavior in CI (for example, traversal latency caps and ratio checks such as reverse/forward and deep-hop scaling).
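A minimal sketch of how such a ratio guardrail might be expressed — illustrative only; the actual suite lives in packages/benchmarks:

```typescript
// Fail CI when a measured latency ratio exceeds its guardrail threshold,
// e.g. reverse/forward <= 6x or 3-hop/2-hop <= 8x.
function checkRatio(
  name: string,
  numeratorMs: number,
  denominatorMs: number,
  maxRatio: number,
): string {
  const ratio = numeratorMs / denominatorMs;
  if (ratio > maxRatio) {
    throw new Error(`${name}: ${ratio.toFixed(1)}x exceeds ${maxRatio}x cap`);
  }
  return `${name}: ${ratio.toFixed(1)}x (cap ${maxRatio}x)`;
}

// reverse at 12ms vs forward at 3ms → 4.0x, within the 6x guardrail
const ok = checkRatio("reverse/forward", 12, 3, 6);
```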

Deep-recursive benchmark probes explicitly set cyclePolicy: "allow" to isolate recursive CTE expansion cost; the default cyclePolicy: "prevent" prioritizes cycle-safe semantics and is expected to be slower on long traversals.

Note: Real-world performance varies by hardware, database driver, network latency (for PostgreSQL), and schema/data shape.

Benchmark configuration and guardrails

Current suite configuration:

| Setting | Value |
| --- | --- |
| Seed users | 1200 |
| Follows per user | 10 |
| Posts per user | 5 |
| Batch size | 250 |
| Warmup iterations | 2 |
| Sample iterations (median reported) | 15 |

Default guardrails:

| Check | Threshold |
| --- | --- |
| reverse/forward ratio | <= 6x |
| inverse traversal latency | <= 500ms |
| inverse/forward ratio | <= 10x |
| 3-hop latency | <= 500ms |
| 3-hop/2-hop ratio | <= 8x |
| aggregate latency | <= 500ms |
| aggregate distinct latency | <= 700ms |
| aggregateDistinct/aggregate ratio | <= 4x |
| cached execute latency | <= 500ms |
| prepared execute latency | <= 500ms |
| prepared/cached ratio | <= 2x |
| 10-hop recursive latency | <= 250ms |
| 100-hop recursive latency | <= 1000ms |
| 100-hop-recursive/10-hop-recursive ratio | <= 30x |
| 1000-hop recursive latency | <= 5000ms |
| 1000-hop-recursive/100-hop-recursive ratio | <= 20x |

Backend-specific overrides:

| Backend | Check | Threshold |
| --- | --- | --- |
| SQLite | 1000-hop recursive latency | <= 7000ms |
| PostgreSQL | inverse traversal latency | <= 1000ms |
| PostgreSQL | inverse/forward ratio | <= 30x |
| PostgreSQL | 3-hop latency | <= 1000ms |
| PostgreSQL | aggregate distinct latency | <= 1200ms |
| PostgreSQL | prepared execute latency | <= 700ms |

To run the benchmark suite:

pnpm bench

For guardrail mode (fails on regression thresholds):

pnpm --filter @nicia-ai/typegraph-benchmarks perf:check

Run the same guardrailed suite against PostgreSQL:

POSTGRES_URL=postgresql://typegraph:typegraph@127.0.0.1:5432/typegraph_test \
pnpm --filter @nicia-ai/typegraph-benchmarks perf:check:postgres

The benchmark source code is located in packages/benchmarks/src/.