Research Copilot
A single runnable example that exercises nearly every TypeGraph capability (typed schema, ontology, vector embeddings, recursive traversals, and the graph algorithms) against a corpus of landmark ML papers. It produces an explainable literature-review digest in one run against a single in-memory SQLite database, with zero external services.
What You Get
A natural-language query comes in. The copilot returns a ranked, chronological reading list with citation counts, authors, and topics, all computed against a single in-memory SQLite database:
```
Query: "contrastive self-supervised representation learning for vision"

Recommended reading order (chronological among top-ranked):

  2012  ImageNet Classification with Deep Convolutional Neural Networks  [3 citations]
        Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton
        topics: CNN, ComputerVision, DeepLearning
        why: semantic 0.449 · topic match: DeepLearning · 3 incoming citations
  2014  Adam: A Method for Stochastic Optimization  [1 citation]
        Diederik Kingma, Jimmy Ba
        topics: Optimization
        why: semantic 0.429 · 1 incoming citation
  2019  Momentum Contrast for Unsupervised Visual Representation Learning  [1 citation]
        Kaiming He, Haoqi Fan, Yuxin Wu, et al.
        topics: Contrastive, SelfSupervised, ComputerVision
        why: semantic 0.523 · topic match: SelfSupervised, Contrastive · 1 incoming citation
  2020  A Simple Framework for Contrastive Learning of Visual Representations  [1 citation]
        Ting Chen, Simon Kornblith, Mohammad Norouzi, et al.
        topics: Contrastive, SelfSupervised, ComputerVision
        why: semantic 0.436 · topic match: SelfSupervised, Contrastive · 1 incoming citation
  2020  An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale  [1 citation]
        Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, et al.
        topics: Transformer, ComputerVision, DeepLearning
        why: semantic 0.350 · topic match: DeepLearning · 1 incoming citation
```

Architecture
Each moving part maps to a single TypeGraph primitive:
| Feature | TypeGraph capability |
|---|---|
| Semantic paper retrieval | `embedding()` fields + cosine similarity |
| Topic hierarchy expansion | Ontology + `store.algorithms.reachable()` |
| Citation-authority ranking | `store.algorithms.degree()` over `cites` |
| Explainable paper lineage | `store.algorithms.shortestPath()` over `cites` |
| “Does this trace back to X?” | `store.algorithms.canReach()` |
| Co-author discovery (2-hop) | `store.algorithms.neighbors()` |
| Reading-list assembly | Query builder with typed traversals |
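As a rough mental model for the algorithm rows in the table, the sketch below implements reachability, shortest path, and in-degree over a plain adjacency list. It is not TypeGraph's implementation: the `Graph` type, the function signatures, and the three-paper toy graph (the SimCLR to AlexNet lineage shown later in this walkthrough, with short nicknames as IDs) are assumptions made for illustration.

```typescript
// Toy adjacency list: an edge A -> B means "A cites B".
// Hypothetical subset of the corpus, not the seeded data.
type Graph = ReadonlyMap<string, readonly string[]>;

const cites: Graph = new Map<string, readonly string[]>([
  ["SimCLR", ["ResNet"]],
  ["ResNet", ["AlexNet"]],
  ["AlexNet", []],
]);

// Breadth-first search up to maxHops, recording each visited node's parent.
function bfsParents(graph: Graph, source: string, maxHops: number): Map<string, string | null> {
  const parents = new Map<string, string | null>([[source, null]]);
  let frontier: string[] = [source];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of graph.get(node) ?? []) {
        if (!parents.has(neighbor)) {
          parents.set(neighbor, node);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return parents;
}

// Every node reachable from the source, excluding the source itself.
function reachable(graph: Graph, source: string, maxHops: number): string[] {
  return [...bfsParents(graph, source, maxHops).keys()].filter((node) => node !== source);
}

// Boolean form: is there any path within maxHops?
function canReach(graph: Graph, from: string, to: string, maxHops: number): boolean {
  return bfsParents(graph, from, maxHops).has(to);
}

// Walk the parent pointers back from the target to rebuild the path.
function shortestPath(graph: Graph, from: string, to: string, maxHops: number): string[] | undefined {
  const parents = bfsParents(graph, from, maxHops);
  if (!parents.has(to)) return undefined;
  const path: string[] = [];
  let node: string | null = to;
  while (node !== null) {
    path.unshift(node);
    node = parents.get(node) ?? null;
  }
  return path;
}

// In-degree: how many nodes have an edge pointing at the target.
function inDegree(graph: Graph, target: string): number {
  let count = 0;
  for (const targets of graph.values()) {
    if (targets.includes(target)) count++;
  }
  return count;
}

console.log(shortestPath(cites, "SimCLR", "AlexNet", 6)); // [ 'SimCLR', 'ResNet', 'AlexNet' ]
console.log(canReach(cites, "AlexNet", "SimCLR", 6)); // false
```

Because BFS visits nodes in order of hop count, the first parent recorded for a node lies on a shortest path, which is why one traversal can serve all three questions.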
Schema
Three node kinds and four edges model the citation graph plus a topic hierarchy that supports query expansion:
```typescript
const Paper = defineNode("Paper", {
  schema: z.object({
    // Title + abstract are `searchable()` so BM25 ranks papers by
    // keyword hits (rare technical terms, author surnames, dataset
    // names), exactly the queries where embeddings are least
    // discriminative.
    title: searchable({ language: "english" }),
    year: z.number().int(),
    abstract: searchable({ language: "english" }),
    embedding: embedding(128),
  }),
});

const Author = defineNode("Author", {
  schema: z.object({ name: z.string() }),
});

const Topic = defineNode("Topic", {
  schema: z.object({ name: z.string() }),
});

const cites = defineEdge("cites", { schema: z.object({}) });
const authoredBy = defineEdge("authored_by", { schema: z.object({}) });
const coversTopic = defineEdge("covers_topic", { schema: z.object({}) });

// Topic hierarchy: `CNN broader_than DL` reads "CNN is a more specific
// concept than DL". Recursive traversal expands narrow query terms into
// their ancestor concepts for higher recall.
const broaderThan = defineEdge("broader_than", { schema: z.object({}) });

const graph = defineGraph({
  id: "research_copilot",
  nodes: {
    Paper: { type: Paper },
    Author: { type: Author },
    Topic: { type: Topic },
  },
  edges: {
    cites: { type: cites, from: [Paper], to: [Paper] },
    authored_by: { type: authoredBy, from: [Paper], to: [Author] },
    covers_topic: { type: coversTopic, from: [Paper], to: [Topic] },
    broader_than: { type: broaderThan, from: [Topic], to: [Topic] },
  },
});
```

Scene by Scene
The example walks through six capabilities end-to-end. Each produces real console output against the seeded corpus of 18 landmark papers.
1. Semantic retrieval
Every paper has a 128-dimensional embedding. Rank the corpus against a query embedding and take the top hits:
```typescript
const queryEmbedding = mockEmbedding(query);
const allPapers = await store.nodes.Paper.find();
const ranked = allPapers
  .map((paper) => ({
    paper,
    similarity: cosine(queryEmbedding, paper.embedding),
  }))
  .sort((a, b) => b.similarity - a.similarity);
```

In production, swap the in-JS ranking for `p.embedding.similarTo(queryEmbedding, k)` in a query builder predicate, backed by pgvector or sqlite-vec, to do the scoring in SQL. See Semantic Search.
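The snippet above calls `cosine` and `mockEmbedding` without showing them. The sketch below is one plausible shape for such helpers, not the example's actual source: the FNV-1a hashing scheme, the `hashToken` helper, and the bag-of-words approach are all assumptions. A hashed bag-of-words vector only captures token overlap, so it stands in for a real embedding model purely to keep the example free of external services.

```typescript
// Hypothetical stand-ins for the example's `cosine` and `mockEmbedding`
// helpers; the real implementations may differ.
const DIMENSIONS = 128; // matches embedding(128) in the schema

// FNV-1a string hash, used only to pick a bucket for each token.
function hashToken(token: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < token.length; i++) {
    hash ^= token.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministic bag-of-words embedding: each token increments one bucket,
// then the vector is scaled to unit length.
function mockEmbedding(text: string, dimensions: number = DIMENSIONS): number[] {
  const vector = new Array<number>(dimensions).fill(0);
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const token of tokens) {
    const bucket = hashToken(token) % dimensions;
    vector[bucket] = (vector[bucket] ?? 0) + 1;
  }
  const norm = Math.hypot(...vector);
  return norm === 0 ? vector : vector.map((value) => value / norm);
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
function cosine(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}
```

Because every component is non-negative, two texts that share at least one token always score above zero, and identical texts score 1 up to floating-point error.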
`title` and `abstract` are declared `searchable()`, so the same corpus
is also indexed for BM25 via SQLite’s FTS5. The example runs a
rare-token query against the fulltext index to show where BM25 wins:
dataset names, method acronyms, and proper nouns, exactly the queries
embeddings smooth out:
```typescript
const fulltextHits = await store.search.fulltext("Paper", {
  query: "Dropout",
  limit: 3,
  includeSnippets: true,
});
```

```
─── Fulltext retrieval (BM25 via FTS5) for: "Dropout" ───
  2.619  Dropout: A Simple Way to Prevent Neural Networks from Overfitting
         <mark>Dropout</mark>: A Simple Way to Prevent Neural Networks from Overfitting Randomly zeroing unit activations during training prevents co-adaptation and…
```

In production you’d fuse the two via `store.search.hybrid()`, which
runs both retrievers and blends them with Reciprocal Rank Fusion at
the SQL layer:
```typescript
const hits = await store.search.hybrid("Paper", {
  limit: 10,
  vector: { fieldPath: "embedding", queryEmbedding, metric: "cosine" },
  fulltext: { query, includeSnippets: true },
  // Weight fulltext slightly higher for the entity-heavy queries
  // typical of literature search.
  fusion: { method: "rrf", k: 60, weights: { vector: 1, fulltext: 1.25 } },
});
```

See the Fulltext Search guide for tuning and Example 15 for an end-to-end hybrid walkthrough.
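To see what the `k` and `weights` settings do, here is a minimal in-memory sketch of weighted Reciprocal Rank Fusion. It uses the standard RRF formula, where each retriever contributes `weight / (k + rank)` with ranks starting at 1. Whether TypeGraph's SQL-level fusion applies weights in exactly this way is an assumption, and the `reciprocalRankFusion` function and the paper IDs are hypothetical.

```typescript
// Weighted Reciprocal Rank Fusion over two ranked ID lists (best hit first).
// score(id) = sum over retrievers of weight / (k + rank), with rank starting at 1.
type RankedIds = readonly string[];

interface FusionOptions {
  k: number;
  weights: { vector: number; fulltext: number };
}

function reciprocalRankFusion(
  vectorHits: RankedIds,
  fulltextHits: RankedIds,
  { k, weights }: FusionOptions,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();

  // Add one retriever's contribution for every ID it returned.
  const accumulate = (hits: RankedIds, weight: number): void => {
    hits.forEach((id, index) => {
      const contribution = weight / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  };

  accumulate(vectorHits, weights.vector);
  accumulate(fulltextHits, weights.fulltext);

  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical IDs: "simclr" appears in both lists, so it collects two contributions.
const fused = reciprocalRankFusion(
  ["moco", "simclr", "vit"], // vector ranking
  ["dropout", "simclr"], // fulltext ranking
  { k: 60, weights: { vector: 1, fulltext: 1.25 } },
);

console.log(fused.map((hit) => hit.id)); // [ 'simclr', 'dropout', 'moco', 'vit' ]
```

RRF uses only ranks, never raw scores, so cosine similarities and BM25 scores do not need to be normalized onto a common scale before fusing. A larger `k` flattens the difference between adjacent ranks.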
2. Ontology-expanded topic matching
A query for the narrow topic `Contrastive` should also return papers tagged
with its ancestors (`SelfSupervised`, `DeepLearning`). `reachable()` walks
the `broader_than` edge recursively and returns every ancestor topic:
```typescript
const topicAncestors = await store.algorithms.reachable(contrastiveTopic, {
  edges: ["broader_than"],
  maxHops: 10,
  excludeSource: true,
});
```

Then filter papers whose `covers_topic` edge lands in the expanded set (the query topic plus its ancestors):
```typescript
const topicMatches = await store
  .query()
  .from("Paper", "p")
  .traverse("covers_topic", "e")
  .to("Topic", "t")
  .whereNode("t", (t) => t.id.in([...expandedTopicIds]))
  .select((ctx) => ({ id: ctx.p.id, title: ctx.p.title, topic: ctx.t.name }))
  .execute();
```

Output:
```
Expanded set: {Contrastive, SelfSupervised, DeepLearning}
```

3. Citation-authority re-ranking
Pure vector similarity is noisy. Fuse it with in-degree on the `cites` edge
so highly-cited papers bubble up:
```typescript
const citationCount = await store.algorithms.degree(paperId, {
  edges: ["cites"],
  direction: "in",
});
const score = similarity + topicBonus + Math.log(citationCount + 1) / 10;
```

Output:
```
score = similarity + 0.05 * topicMatches + log(1 + citations) / 10

rank  score  sim    topic  cites  title
───────────────────────────────────────────────────────────────────
   1  0.692  0.523      2      1  Momentum Contrast for Unsupervised Visual Representation Learning
   2  0.638  0.449      1      3  ImageNet Classification with Deep Convolutional Neural Networks
   3  0.606  0.436      2      1  A Simple Framework for Contrastive Learning of Visual Representations
```

4. Explainable lineage
“You’ve read AlexNet. How does SimCLR trace back to it?” `shortestPath`
returns an ordered list of nodes, which the example formats as a tree:
```typescript
const lineage = await store.algorithms.shortestPath(simclr.id, alex.id, {
  edges: ["cites"],
  maxHops: 6,
});
```

```
2-hop citation lineage:

A Simple Framework for Contrastive Learning of Visual Representations
└─▶ Deep Residual Learning for Image Recognition
    └─▶ ImageNet Classification with Deep Convolutional Neural Networks
```

5. Heritage check
`canReach` is the boolean sibling of `shortestPath`, useful when you don’t
need the path, just the answer. Here: “which of these modern papers still
trace back to Rumelhart’s 1986 backprop paper?”
```typescript
const reaches = await store.algorithms.canReach(paper.id, backprop.id, {
  edges: ["cites"],
  maxHops: 10,
});
```

```
✓ "LLaMA: Open and Efficient Foundation Language Models" traces to Rumelhart 1986
✓ "Learning Transferable Visual Models From Natural Language" traces to Rumelhart 1986
✓ "Chain-of-Thought Prompting Elicits Reasoning in Large LMs" traces to Rumelhart 1986
✓ "A Simple Framework for Contrastive Learning of Visual Reps" traces to Rumelhart 1986
```

6. Collaborator discovery
`neighbors` returns the direct neighborhood of a node. Compose it (authors
of CLIP → their other papers → co-authors on those papers) to rank natural
collaborators by shared-paper count:
```typescript
const clipAuthors = await store.algorithms.neighbors(clip.id, {
  edges: ["authored_by"],
  depth: 1,
});

// For each CLIP author: walk authored_by backwards to all their papers,
// then forwards to all their co-authors.
const perAuthorPapers = await Promise.all(
  clipAuthors.map((author) =>
    store.algorithms.neighbors(author.id, {
      edges: ["authored_by"],
      direction: "in",
      depth: 1,
    }),
  ),
);
```

Issuing each level's lookups in parallel with `Promise.all` keeps wall-clock latency at O(depth) sequential round-trips
instead of O(authors × papers); the total number of queries is unchanged.
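The snippet above stops at each author's papers. The final hop and the tally could look like the sketch below, which swaps the store calls for a plain in-memory map so it runs on its own. The `rankCollaborators` function, the `authorsByPaper` map, and the placeholder paper and author names are hypothetical, and counting distinct shared papers (rather than author-paper pairs) is an assumption about how the example scores collaborators.

```typescript
// Hypothetical in-memory stand-in for authored_by edges: paper -> authors.
const authorsByPaper = new Map<string, readonly string[]>([
  ["paper-A", ["ada", "bo"]],
  ["paper-B", ["ada", "cy"]],
  ["paper-C", ["bo", "cy", "di"]],
  ["paper-D", ["di", "eve"]],
]);

interface Collaborator {
  author: string;
  sharedPapers: number;
}

// Rank authors outside the seed paper by how many other papers they
// share with at least one of the seed paper's authors.
function rankCollaborators(
  seedPaper: string,
  papers: ReadonlyMap<string, readonly string[]>,
): Collaborator[] {
  // Hop 1: the seed paper's own authors.
  const seedAuthors = new Set(papers.get(seedPaper) ?? []);
  const counts = new Map<string, number>();

  for (const [paper, authors] of papers) {
    if (paper === seedPaper) continue;
    // Hop 2: keep only papers written by at least one seed author.
    if (!authors.some((author) => seedAuthors.has(author))) continue;
    // Hop 3: credit every co-author who is not already a seed author.
    for (const author of authors) {
      if (!seedAuthors.has(author)) {
        counts.set(author, (counts.get(author) ?? 0) + 1);
      }
    }
  }

  // Most shared papers first; ties broken alphabetically for stable output.
  return [...counts.entries()]
    .map(([author, sharedPapers]) => ({ author, sharedPapers }))
    .sort((a, b) => b.sharedPapers - a.sharedPapers || a.author.localeCompare(b.author));
}

console.log(rankCollaborators("paper-A", authorsByPaper));
// [ { author: 'cy', sharedPapers: 2 }, { author: 'di', sharedPapers: 1 } ]
```

Run against the full seeded corpus with CLIP as the seed paper, the example itself reports: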
```
Seed paper authors: Ilya Sutskever, Jong Wook Kim, Aditya Ramesh, Alec Radford, Chris Hallacy

Nearby collaborators beyond the original CLIP paper:
  2× shared papers with CLIP authors  Alex Krizhevsky
  2× shared papers with CLIP authors  Geoffrey Hinton
  2× shared papers with CLIP authors  Rewon Child
  2× shared papers with CLIP authors  Jeffrey Wu
  1× shared papers with CLIP authors  Nitish Srivastava
```

Run It
The full source lives at
`packages/typegraph/examples/14-research-copilot.ts`.
From a checkout of the repository:
```
pnpm install
npx tsx packages/typegraph/examples/14-research-copilot.ts
```

The example builds the graph, runs every scene, and tears down, all
against an in-memory SQLite database. To persist it, point
`createExampleBackend()` at a file path. To run it on Postgres, swap the
import to `createPostgresBackend`; see Backend Setup.
Next Steps
- Graph Algorithms: the full API for `shortestPath`, `reachable`, `canReach`, `neighbors`, and `degree`
- Knowledge Graph for RAG: entity linking, chunk traversal, and hybrid vector + fulltext retrieval
- Ontology & Reasoning: inverse edges, subclass hierarchies, and other ontology primitives beyond `broader_than`
- Semantic Search: production vector search with pgvector and sqlite-vec