context graph vs knowledge graph
A knowledge graph is a persistent, global model of entities and relationships; a context graph is a small, dynamic subgraph (or overlay) that represents what is relevant “right now” for a specific task, agent, or decision.
Core definitions
- Knowledge graph: Long‑lived graph of entities, relations, and sometimes rules or constraints, typically represented as triples $(subject, predicate, object)$, used as a semantic backbone across systems and time.
- Context graph: Task‑ or query‑specific graph capturing current user intent, recent events, applicable policies, and the subset of facts relevant to the ongoing interaction.
Structural and lifecycle differences
- Scope & size
- Temporal behavior
Role in AI agent systems
- Knowledge graph → “semantic backbone” / long‑term memory
- Context graph → “working state of now” / control surface
Governance and decision semantics
Some authors distinguish knowledge graph (what exists, factual relations) from governed context graph (what is allowed, under which authority, in this situation).
- Knowledge graph:
- Governed context graph:
How they work together in an agentic stack
In a typical multi‑agent architecture:
- The knowledge graph sits under everything as the global world model and institutionally governed truth layer.
- A context engine (or context graph builder) pulls a query‑driven subgraph, scored by relevance and filtered by policy, and exposes it as a compact context graph for the current agent turn or workflow.
- This context graph is then serialized (often as structured text or JSON) into the model’s prompt, giving it focused, policy‑aware grounding while keeping tokens in check and reducing hallucinations.
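The serialization step in the last bullet can be sketched in a few lines. This is illustrative, not a fixed API: `serialize_context_graph`, the character cap, and the sample subgraph are assumptions layered on the node/edge shapes defined later in this document.

```python
import json

def serialize_context_graph(subgraph: dict, max_chars: int = 4000) -> str:
    """Render a context graph as compact JSON for prompt injection.

    `subgraph` follows a {"nodes": [...], "edges": [...]} shape; the
    whitespace-free separators and the character cap keep tokens in check.
    """
    text = json.dumps(subgraph, separators=(",", ":"), sort_keys=True)
    return text[:max_chars]

# Hypothetical example data in the node/edge shape used throughout this doc
subgraph = {
    "nodes": [{"id": "case:1", "type": "Case", "name": "Login failure"}],
    "edges": [{"id": "e1", "type": "RAISED", "from": "cust:9", "to": "case:1"}],
}
prompt_block = "CONTEXT_GRAPH_JSON:\n" + serialize_context_graph(subgraph)
```

A hard character cap is a blunt instrument; in practice you would trim whole nodes by relevance score before truncating text.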
If you want, next step I can sketch:
- a minimal typed schema (entities/edges) for a domain KG, and
- a LangGraph‑style context‑graph builder node that projects task‑specific subgraphs for your agents.
Below is a concrete, copy‑pasteable sketch for both parts: a minimal typed domain KG schema and a LangGraph node that builds a task‑specific context graph view from it.
1. Minimal typed schema for a domain knowledge graph
Assume an enterprise “Customer‑Case‑Content” domain (tickets, docs, agents). This is property‑graph style, but easy to map to RDF triples.
Entity types (nodes)
// Core node types
type NodeType =
  | "Customer"
  | "User"     // internal user/agent
  | "Case"     // ticket / incident
  | "Document" // markdown, spec, runbook
  | "Product"
  | "Feature"
  | "Tag"
  | "Policy";

// Base node shape
interface KGNode {
  id: string;         // global, stable ID
  type: NodeType;
  name: string;
  createdAt?: string; // ISO 8601
  updatedAt?: string;
  // Arbitrary properties per type in a `props` bag
  props?: Record<string, any>;
}
// Examples of props by type
interface CustomerNode extends KGNode {
  type: "Customer";
  props?: {
    externalId?: string;
    segment?: "SMB" | "MidMarket" | "Enterprise";
    region?: string;
    isRegulated?: boolean; // e.g. healthcare/fintech
  };
}

interface CaseNode extends KGNode {
  type: "Case";
  props?: {
    status: "open" | "in_progress" | "resolved";
    priority: "p0" | "p1" | "p2" | "p3";
    openedAt: string;
    closedAt?: string;
    channel: "email" | "chat" | "api";
  };
}

interface DocumentNode extends KGNode {
  type: "Document";
  props?: {
    path: string;         // repo path, URL, etc.
    mimeType: string;
    embeddingId?: string; // pointer into vector store
  };
}
Relationship types (edges)
type EdgeType =
  | "OWNS"               // Customer -> Product
  | "USES_FEATURE"       // Customer -> Feature
  | "RAISED"             // Customer -> Case
  | "ASSIGNED_TO"        // Case -> User
  | "RELATES_TO_CASE"    // Document -> Case
  | "RELATES_TO_FEATURE" // Document -> Feature
  | "HAS_TAG"            // Case/Document -> Tag
  | "GOVERNED_BY"        // Node -> Policy
  | "SIMILAR_TO";        // symmetric similarity

interface KGEdge {
  id: string;
  type: EdgeType;
  from: string; // KGNode.id
  to: string;   // KGNode.id
  props?: {
    score?: number;  // similarity/relevance score
    since?: string;  // ISO 8601
    source?: string; // "etl", "llm_extraction", etc.
  };
}
You can then package a subgraph like this:
interface KGSubgraph {
  nodes: KGNode[];
  edges: KGEdge[];
}
This is sufficient for:
- Entity‑centric reasoning: “Which features does this regulated enterprise customer use?”
- Graph‑RAG: “Which documents are most related to this case and its product/feature neighborhood?”
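As a quick illustration of the entity‑centric case, here is a sketch of a traversal over the KGSubgraph shape above (expressed in Python to match the rest of the document; the sample data is made up):

```python
def features_used_by(subgraph: dict, customer_id: str) -> list:
    """Collect Feature nodes a customer is linked to via USES_FEATURE edges."""
    nodes = {n["id"]: n for n in subgraph["nodes"]}
    return [
        nodes[e["to"]]
        for e in subgraph["edges"]
        if e["type"] == "USES_FEATURE"
        and e["from"] == customer_id
        and e["to"] in nodes
    ]

# Hypothetical subgraph for demonstration
sample = {
    "nodes": [
        {"id": "cust:1", "type": "Customer", "name": "Acme"},
        {"id": "feat:sso", "type": "Feature", "name": "SSO"},
    ],
    "edges": [
        {"id": "e1", "type": "USES_FEATURE", "from": "cust:1", "to": "feat:sso"},
    ],
}
```

The same edge-scan pattern generalizes to any of the typed relations; for large graphs you would push this filter down into the database instead.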
2. LangGraph‑style context‑graph builder node
Goal: a node that, given current task state (e.g. active case, query text), queries the KG + vector store, then writes a compact contextGraph into the LangGraph state.
Assume:
- You have a kgClient with methods like get_node(id), neighborhood(id, depth, ...), and semantic_docs_for_case(case_id, k).
- You use LangGraph Python with TypedDict state.
State schema with context graph field
from typing import List, Literal, TypedDict, Optional, Dict, Any

NodeType = Literal[
    "Customer", "User", "Case", "Document", "Product", "Feature", "Tag", "Policy"
]

EdgeType = Literal[
    "OWNS",
    "USES_FEATURE",
    "RAISED",
    "ASSIGNED_TO",
    "RELATES_TO_CASE",
    "RELATES_TO_FEATURE",
    "HAS_TAG",
    "GOVERNED_BY",
    "SIMILAR_TO",
]

class KGNode(TypedDict, total=False):
    id: str
    type: NodeType
    name: str
    createdAt: Optional[str]
    updatedAt: Optional[str]
    props: Dict[str, Any]

class KGEdge(TypedDict, total=False):
    id: str
    type: EdgeType
    from_: str  # `from` is reserved in Python
    to: str
    props: Dict[str, Any]

class KGSubgraph(TypedDict):
    nodes: List[KGNode]
    edges: List[KGEdge]

class AgentState(TypedDict):
    messages: list  # your normal conversation/messages state
    active_case_id: Optional[str]
    query: Optional[str]
    context_graph: Optional[KGSubgraph]
Context‑graph builder node
from langgraph.graph import StateGraph, START, END

# Pseudocode client; you'd back this with ArangoDB/Neo4j/etc.
class KGClient:
    def get_node(self, node_id: str) -> Optional[KGNode]: ...

    def neighborhood(
        self,
        node_id: str,
        depth: int = 2,
        max_nodes: int = 50,
        allowed_types: Optional[list[NodeType]] = None,
    ) -> KGSubgraph: ...

    def semantic_docs_for_case(
        self,
        case_id: str,
        k: int = 10,
    ) -> KGSubgraph: ...

kg_client = KGClient()
def build_context_graph(state: AgentState) -> AgentState:
    case_id = state.get("active_case_id")
    query = state.get("query")

    nodes: dict[str, KGNode] = {}
    edges: dict[str, KGEdge] = {}

    if case_id:
        # 1) Structural neighborhood around the active case
        case_ego = kg_client.neighborhood(
            node_id=case_id,
            depth=2,
            max_nodes=50,
            allowed_types=[
                "Case",
                "Customer",
                "Product",
                "Feature",
                "Document",
                "Policy",
                "Tag",
            ],
        )
        for n in case_ego["nodes"]:
            nodes[n["id"]] = n
        for e in case_ego["edges"]:
            edges[e["id"]] = e

        # 2) Semantic expansion: top-k related documents
        doc_subgraph = kg_client.semantic_docs_for_case(case_id=case_id, k=10)
        for n in doc_subgraph["nodes"]:
            nodes[n["id"]] = n
        for e in doc_subgraph["edges"]:
            edges[e["id"]] = e

    # Optional: query-driven expansion if no active case but a free-text query
    # e.g., vector search + expansion via KG neighborhood; omitted for brevity.

    state["context_graph"] = {
        "nodes": list(nodes.values()),
        "edges": list(edges.values()),
    }
    return state
Wiring it into a LangGraph
from langgraph.graph import StateGraph, START, END

builder = StateGraph(AgentState)
builder.add_node("build_context", build_context_graph)

# Example: always build context first, then call an LLM node that consumes it
def call_model(state: AgentState) -> AgentState:
    # You'd serialize state["context_graph"] into the prompt here
    # and append the model's reply to `messages`.
    ...
    return state

builder.add_node("model", call_model)
builder.add_edge(START, "build_context")
builder.add_edge("build_context", "model")
builder.add_edge("model", END)

graph = builder.compile()
This gives you:
- A global KG schema you can implement in ArangoDB/Neo4j/Relational‑KG, compatible with GraphRAG patterns.
- A LangGraph node that projects a task‑specific context graph into state, ready to be serialized into prompts or used for policy checks before tool calls.
If you tell me your preferred backing store (ArangoDB vs Neo4j vs relational) I can turn the KGClient pseudocode into concrete queries (AQL, Cypher, or SQL + edge tables) and add a prompt template that injects context_graph for multi‑step agents.
Examples of knowledge graphs in enterprise AI applications
When to use context graphs over knowledge graphs
How do context graphs integrate with RAG
Building a governed context graph step by step
Knowledge graphs vs context graphs for AI agents
How to implement the KG schema in Rel or Neo4j
Example code for LangGraph node projecting subgraphs
Task-specific subgraph projection techniques for agents
Integrating KG with LangGraph state and context_schema
Building multi-agent workflows using context graphs
ALL 3 preferred backing stores (ArangoDB vs Neo4j vs relational): turn the KGClient pseudocode into concrete queries (AQL, Cypher, or SQL + edge tables) and add a prompt template that injects context_graph for multi‑step agents.
Below is a compact design you can lift into a spec or codebase. I’ll group by what you asked for, with concrete patterns and code.
1. When to use context graphs vs knowledge graphs
- Use a knowledge graph when you need a durable, governed world model: shared semantics across teams, cross‑system joins, lineage, and slow‑changing truth (products, org, policies, long‑term customer history).
- Use a context graph when you need a small, task‑scoped, policy‑aware slice for a specific run/agent: “what matters for this incident, this user, under these constraints, right now.”
For agent stacks, a solid pattern is: KG as backbone, context graph as working set injected into prompts and tools, then logged for traceability.
2. How context graphs integrate with RAG
Typical pattern (KG + vector RAG + context graph):
- Retrieve candidate docs from vector store with the user query or task description.
- Map chunks to KG nodes (Document, Section, Feature, Policy) via stored IDs or edges (e.g., Document -> Feature, Document -> Policy).
- Expand to a context graph by pulling:
  - Local KG neighborhood (customer, product, feature, prior cases).
  - Governing policies or SLAs.
  - A small set of top‑K related docs (semantic + graph similarity).
- Serialize the context graph into the model prompt (or a tool schema) with explicit types and relationships, not just raw text.
- Use the context graph for control, e.g., filter tools by policy node, mask PII nodes, or veto actions that violate a GOVERNED_BY -> Policy constraint.
This is essentially “GraphRAG with a decision layer”: the KG provides structure, the retriever provides content, the context graph provides governed, task‑specific state.
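The “decision layer” part can be made concrete with a small tool filter. This is a sketch: the `deniedTools` list on Policy nodes is a hypothetical convention, not part of the schema above.

```python
def allowed_tools(subgraph: dict, candidate_tools: list) -> list:
    """Filter an agent's tool list using Policy nodes in the context graph.

    Assumes a hypothetical `deniedTools` list in each Policy node's props;
    any node governed by such a policy contributes its denials.
    """
    nodes = {n["id"]: n for n in subgraph["nodes"]}
    denied = set()
    for e in subgraph["edges"]:
        if e["type"] == "GOVERNED_BY":
            policy = nodes.get(e["to"], {})
            denied.update(policy.get("props", {}).get("deniedTools", []))
    return [t for t in candidate_tools if t not in denied]

# Hypothetical example: a case governed by a policy that bans CSV export
ctx = {
    "nodes": [
        {"id": "case:1", "type": "Case", "name": "Incident"},
        {"id": "pol:1", "type": "Policy", "name": "PII export ban",
         "props": {"deniedTools": ["export_csv"]}},
    ],
    "edges": [{"id": "e1", "type": "GOVERNED_BY", "from": "case:1", "to": "pol:1"}],
}
```

Running this check before each tool call gives you a veto point that is driven by the graph rather than hard-coded in the agent.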
3. Building a governed context graph (step by step)
You can implement this as a pipeline that runs per task/agent turn:
- Identify the anchor(s).
- Pull the structural neighborhood.
- Add relevant content nodes.
- Overlay governance:
  - Attach Policy nodes referenced by GOVERNED_BY edges from Case, Customer, User, or Feature.
  - Optionally, add Risk/Authority edges, e.g. User -[HAS_ROLE]-> Role -[ALLOWS]-> Action.
- Normalize & cap.
- Emit as context_graph:
  - Store into graph state (LangGraph) and/or sign/log it for audit.
  - Optionally store a hash / ID and reference it in your traces.
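For the hash/audit step listed above, one simple approach (a sketch, not a mandated scheme) is a canonical-JSON fingerprint stored alongside the run trace:

```python
import hashlib
import json

def context_graph_fingerprint(subgraph: dict) -> str:
    """SHA-256 over a canonical JSON form of the context graph.

    Stable across dict key ordering (node/edge *list* order still matters),
    so the same graph always yields the same audit ID.
    """
    canonical = json.dumps(subgraph, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Same content, different key order -> same fingerprint
g1 = {"nodes": [{"id": "a", "type": "Case"}], "edges": []}
g2 = {"edges": [], "nodes": [{"type": "Case", "id": "a"}]}
```

If node ordering is nondeterministic in your builder, sort the node and edge lists by id before hashing.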
4. Implementing the KG schema in three backends
4.1 Relational (PostgreSQL / “Rel‑style” schema)
A simple, explicit schema: one table per node type + a generic edges table.
-- Nodes
CREATE TABLE customers (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  external_id TEXT,
  segment TEXT,
  region TEXT,
  is_regulated BOOLEAN DEFAULT FALSE
);

CREATE TABLE cases (
  id TEXT PRIMARY KEY,
  customer_id TEXT REFERENCES customers(id),
  status TEXT,
  priority TEXT,
  opened_at TIMESTAMPTZ,
  closed_at TIMESTAMPTZ,
  channel TEXT
);

CREATE TABLE documents (
  id TEXT PRIMARY KEY,
  name TEXT,
  path TEXT,
  mime_type TEXT,
  embedding_id TEXT
);

CREATE TABLE policies (
  id TEXT PRIMARY KEY,
  name TEXT,
  path TEXT
);

CREATE TABLE tags (
  id TEXT PRIMARY KEY,
  name TEXT UNIQUE
);
-- Edges (typed, generic)
CREATE TABLE edges (
  id BIGSERIAL PRIMARY KEY,
  type TEXT NOT NULL, -- 'OWNS', 'RELATES_TO_FEATURE', ...
  from_id TEXT NOT NULL,
  to_id TEXT NOT NULL,
  since TIMESTAMPTZ,
  score DOUBLE PRECISION,
  source TEXT
  -- Optionally: foreign keys + CHECK constraints per type,
  -- or keep it soft-typed if you want flexibility
);

-- PostgreSQL has no inline INDEX clause in CREATE TABLE; create indexes separately
CREATE INDEX edges_from_idx ON edges (from_id);
CREATE INDEX edges_to_idx ON edges (to_id);
CREATE INDEX edges_type_from_idx ON edges (type, from_id);
CREATE INDEX edges_type_to_idx ON edges (type, to_id);
Example: task‑specific neighborhood query around a case:
-- 1) Get the case, then its direct edges
WITH anchor_case AS (
  SELECT * FROM cases WHERE id = $1
),
neighbors AS (
  -- direct edges from/to the case
  SELECT e.*
  FROM edges e
  WHERE e.from_id = $1 OR e.to_id = $1
)
SELECT *
FROM neighbors;
You can then join neighbors to customers, products, policies, documents by from_id / to_id and build a KGSubgraph in application code.
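That application-side assembly can look like the following sketch; the tuple shapes mirror the tables above, but your cursor layer (psycopg, SQLAlchemy) may hand you rows in a different form:

```python
def subgraph_from_rows(node_rows: list, edge_rows: list) -> dict:
    """Build a KGSubgraph dict from plain SQL result tuples.

    node_rows: (id, type, name) tuples, e.g. from per-type SELECTs.
    edge_rows: (id, type, from_id, to_id) tuples from the edges table.
    """
    return {
        "nodes": [
            {"id": i, "type": t, "name": name} for (i, t, name) in node_rows
        ],
        "edges": [
            {"id": str(i), "type": t, "from": f, "to": to}
            for (i, t, f, to) in edge_rows
        ],
    }

# Hypothetical query results
sg = subgraph_from_rows(
    node_rows=[("case:1", "Case", "Login failure"), ("cust:9", "Customer", "Acme")],
    edge_rows=[(1, "RAISED", "cust:9", "case:1")],
)
```

Keeping the conversion in one place means the LangGraph side never needs to know which backend produced the rows.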
4.2 Neo4j (Cypher)
Node labels and relationship types map directly to your schema.
// Example nodes
CREATE (c:Customer {id: $customerId, name: $name, segment: $segment});
CREATE (p:Product {id: $productId, name: $productName});
CREATE (k:Case {id: $caseId, status: $status, priority: $priority});
// Relationships
MATCH (c:Customer {id: $customerId}),
(p:Product {id: $productId})
MERGE (c)-[:OWNS]->(p);
MATCH (c:Customer {id: $customerId}),
(k:Case {id: $caseId})
MERGE (c)-[:RAISED]->(k);
Task‑specific subgraph around a case:
// ego network: Case + neighbors (depth <= 2), typed and capped
MATCH (case:Case {id: $caseId})
CALL apoc.path.subgraphNodes(case, {
  maxLevel: 2,
  limit: 200, // cap here; a LIMIT after the aggregating RETURN would be a no-op
  relationshipFilter: "RAISED|OWNS|USES_FEATURE|ASSIGNED_TO|HAS_TAG|GOVERNED_BY|RELATES_TO_CASE|RELATES_TO_FEATURE",
  labelFilter: "+Customer|+Product|+Feature|+User|+Document|+Policy|+Tag"
})
YIELD node
WITH collect(node) AS nodes
UNWIND nodes AS n
MATCH (n)-[r]->(m)
WHERE m IN nodes
RETURN collect(DISTINCT n) AS nodes, collect(DISTINCT r) AS rels;
You can add RAG overlays by adding SIMILAR_TO relationships or storing embedding_id on Document and querying external vector search.
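To run this from Python you would use the official neo4j driver (`session.run` with a `$caseId` parameter); the part worth sketching is the mapping from driver results onto the KGSubgraph shape. The helper below works on plain dicts so it can be tested without a database; the node/rel dict shapes are assumptions mirroring what the driver exposes, not the driver's actual objects.

```python
def neo4j_to_subgraph(nodes: list, rels: list) -> dict:
    """Map Neo4j-style node/relationship dicts onto the KGSubgraph shape.

    nodes: {"element_id": ..., "labels": [...], "properties": {...}}
    rels:  {"element_id": ..., "type": ..., "start": ..., "end": ...}
    (hypothetical plain-dict mirrors of driver objects)
    """
    return {
        "nodes": [
            {
                "id": n["properties"].get("id", n["element_id"]),
                "type": n["labels"][0] if n["labels"] else "Unknown",
                "name": n["properties"].get("name", ""),
                "props": n["properties"],
            }
            for n in nodes
        ],
        "edges": [
            {"id": r["element_id"], "type": r["type"],
             "from": r["start"], "to": r["end"], "props": {}}
            for r in rels
        ],
    }

# Hypothetical driver output, flattened to dicts
raw_nodes = [
    {"element_id": "4:x:1", "labels": ["Case"],
     "properties": {"id": "case:1", "name": "Login failure"}},
]
raw_rels = [
    {"element_id": "5:x:9", "type": "RAISED", "start": "cust:9", "end": "case:1"},
]
sg = neo4j_to_subgraph(raw_nodes, raw_rels)
```

Preferring the application-level `id` property over Neo4j's internal element id keeps the context graph stable across re-imports.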
4.3 ArangoDB (AQL)
Use one document collection per node type + edge collections.
// Example collections
// Customers, Cases, Documents, Policies, Tags (document collections)
// ownsEdges, raisedEdges, relatesToFeatureEdges, governedByEdges (edge collections)
Ego neighborhood around a Case:
LET caseDoc = DOCUMENT("Cases", @caseId)
FOR v, e, p IN 1..2 ANY caseDoc
GRAPH "EnterpriseGraph"
OPTIONS { bfs: true, uniqueVertices: "global" }
FILTER v.type IN ["Customer","Product","Feature","User","Document","Policy","Tag"]
LIMIT 200
RETURN { node: v, edge: e }
You can keep RAG integration by having a Documents collection with embedding_id and an external vector index; ArangoSearch can handle full‑text / metadata filters.
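The traversal above returns one {node, edge} pair per visited vertex, with possible repeats; folding those rows (e.g. from a python-arango cursor) into a de-duplicated KGSubgraph is a few lines. A sketch, assuming each document carries an application-level id field rather than Arango's _id:

```python
def aql_rows_to_subgraph(rows: list) -> dict:
    """De-duplicate {node, edge} rows from the AQL traversal into a KGSubgraph."""
    nodes, edges = {}, {}
    for row in rows:
        node = row.get("node")
        if node:
            nodes[node["id"]] = node
        edge = row.get("edge")
        if edge:  # the anchor vertex may arrive without an edge
            edges[edge["id"]] = edge
    return {"nodes": list(nodes.values()), "edges": list(edges.values())}

# Hypothetical cursor output with a duplicate visit
rows = [
    {"node": {"id": "cust:9", "type": "Customer"},
     "edge": {"id": "e1", "type": "RAISED"}},
    {"node": {"id": "cust:9", "type": "Customer"},
     "edge": {"id": "e1", "type": "RAISED"}},
]
sg = aql_rows_to_subgraph(rows)
```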
5. Example LangGraph node projecting subgraphs
This extends the earlier pseudocode and shows: projection, state integration, and context_schema usage.
State and context schemas
from typing import TypedDict, List, Literal, Optional, Dict, Any

from langgraph.graph import StateGraph, START, END
from langgraph.runtime import Runtime  # runtime context for context_schema
from langgraph.types import Command
NodeType = Literal[
    "Customer", "User", "Case", "Document", "Product", "Feature", "Tag", "Policy"
]

EdgeType = Literal[
    "OWNS",
    "USES_FEATURE",
    "RAISED",
    "ASSIGNED_TO",
    "RELATES_TO_CASE",
    "RELATES_TO_FEATURE",
    "HAS_TAG",
    "GOVERNED_BY",
    "SIMILAR_TO",
]

class KGNode(TypedDict, total=False):
    id: str
    type: NodeType
    name: str
    props: Dict[str, Any]

class KGEdge(TypedDict, total=False):
    id: str
    type: EdgeType
    from_: str
    to: str
    props: Dict[str, Any]

class KGSubgraph(TypedDict):
    nodes: List[KGNode]
    edges: List[KGEdge]

class AgentState(TypedDict):
    messages: list
    active_case_id: Optional[str]
    query: Optional[str]
    context_graph: Optional[KGSubgraph]

class Context(TypedDict):
    max_nodes: int
    max_depth: int
Subgraph‑projection node
class KGClient:
    def neighborhood(self, node_id: str, depth: int, max_nodes: int) -> KGSubgraph:
        """Backend-specific implementation (Arango, Neo4j, SQL)."""
        ...

    def semantic_docs_for_case(self, case_id: str, k: int) -> KGSubgraph:
        """Vector search + KG joins."""
        ...

kg_client = KGClient()

def build_context_graph(
    state: AgentState,
    runtime: Runtime[Context],
) -> AgentState:
    ctx = runtime.context
    max_nodes = ctx.get("max_nodes", 128)
    max_depth = ctx.get("max_depth", 2)

    case_id = state.get("active_case_id")
    query = state.get("query")

    nodes: Dict[str, KGNode] = {}
    edges: Dict[str, KGEdge] = {}

    if case_id:
        ego = kg_client.neighborhood(
            node_id=case_id,
            depth=max_depth,
            max_nodes=max_nodes,
        )
        for n in ego["nodes"]:
            nodes[n["id"]] = n
        for e in ego["edges"]:
            edges[e["id"]] = e

        docs = kg_client.semantic_docs_for_case(case_id=case_id, k=10)
        for n in docs["nodes"]:
            nodes[n["id"]] = n
        for e in docs["edges"]:
            edges[e["id"]] = e

    # TODO: query-based projection when no case_id

    state["context_graph"] = {
        "nodes": list(nodes.values())[:max_nodes],
        "edges": list(edges.values())[: max_nodes * 2],
    }
    return state
Wiring into a multi‑agent workflow
Supervisor routes to workers; all agents see the same context_graph on state.
from typing import Literal

def support_agent(state: AgentState) -> Command[Literal["security_agent", "__end__"]]:
    # serialize context graph into prompt
    ctx_graph = state.get("context_graph") or {"nodes": [], "edges": []}
    system_instructions = (
        "You are a support agent. Use the following context graph of entities and "
        "relations to answer. Do not invent entities not present in the graph.\n"
        f"CONTEXT_GRAPH_JSON:\n{ctx_graph}"
    )
    # call your model here; omitted
    ...
    return Command(
        goto="security_agent",
        update={"messages": state["messages"] + ["support reply"]},
    )

def security_agent(state: AgentState) -> Command[Literal["__end__"]]:
    # use context_graph to enforce policy: e.g., disallow PHI for non-compliant customers
    ctx_graph = state.get("context_graph") or {"nodes": [], "edges": []}
    # run checks over ctx_graph before finalizing response
    ...
    return Command(
        goto=END,
        update={"messages": state["messages"] + ["security checked"]},
    )
builder = StateGraph(AgentState, context_schema=Context)
builder.add_node("build_context", build_context_graph)
builder.add_node("support_agent", support_agent)
builder.add_node("security_agent", security_agent)

builder.add_edge(START, "build_context")
builder.add_edge("build_context", "support_agent")
# support_agent and security_agent route themselves via the Command they return,
# so no static edges are needed after "support_agent".

graph = builder.compile()

# Example invocation with runtime context (context_schema)
result = graph.invoke(
    {
        "messages": [],
        "active_case_id": "CASE-123",
        "query": "Customer cannot access feature X in region Y",
        "context_graph": None,
    },
    context={"max_nodes": 128, "max_depth": 2},
)
This uses context_schema for immutable run‑time parameters (budgets, backends, toggles) and state for evolving context_graph and messages.
6. Task‑specific subgraph projection techniques
You can define several projection modes and select them via context or supervisor policy:
- Anchor‑centric: ego network around Case, Customer, Product. Good for incident handling.
- Query‑semantic: vector search on docs, then expand to nearest entities in KG (lazy graph building / LazyGraphRAG style).
- Policy‑first: start from Policy and Role nodes, then intersect with customer/product neighborhood; use for strong governance.
- Historical trace: project only nodes/edges with since >= last_30_days to bias toward recent context.
Implementation‑wise you can expose a projection_mode in Context and branch inside build_context_graph.
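A minimal dispatch for that branching might look like this; the stubbed strategy functions stand in for real KG queries and are assumptions, not part of any library:

```python
from typing import Callable, Dict

def anchor_projection(state: dict) -> dict:
    # ego network around the active case (stub for a KG query)
    return {"mode": "anchor", "anchor": state.get("active_case_id")}

def semantic_projection(state: dict) -> dict:
    # vector search on the free-text query, then KG expansion (stub)
    return {"mode": "semantic", "query": state.get("query")}

def policy_first_projection(state: dict) -> dict:
    # start from Policy/Role nodes, intersect with the customer neighborhood (stub)
    return {"mode": "policy_first"}

PROJECTIONS: Dict[str, Callable[[dict], dict]] = {
    "anchor": anchor_projection,
    "semantic": semantic_projection,
    "policy_first": policy_first_projection,
}

def project(state: dict, projection_mode: str) -> dict:
    """Branch on projection_mode (as read from Context), defaulting to anchor."""
    return PROJECTIONS.get(projection_mode, anchor_projection)(state)
```

A registry dict keeps the modes open for extension (e.g. the historical-trace mode) without touching the builder node itself.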
7. Stronger follow‑up prompts you can reuse
You can paste any of these as next‑step questions:
- “Design a governed context graph specification for a healthcare SaaS (HIPAA), including node/edge types for PHI, policies, and audit trails, and show how to enforce it in a LangGraph multi‑agent workflow.”
- “Given a Neo4j‑backed enterprise KG, write complete Python code for a LangGraph node that runs a Cypher query to build a task‑specific context graph and injects it into an LLM prompt for RAG.”
- “For ArangoDB Community Edition, implement KGClient.neighborhood and KGClient.semantic_docs_for_case using AQL, including indices and collection design optimized for GraphRAG.”
- “Define a JSON serialization format and prompt template that lets an LLM reason reliably over a context_graph (nodes, edges, props) while minimizing tokens, and show an example multi‑turn dialogue.”
- “Extend the context‑graph pattern to a supervisor/worker multi‑agent design: specify how the supervisor modifies context_graph between agents to add tasks, constraints, and intermediate conclusions.”
LangGraph multi-agent collaboration examples
Code tutorial for StateGraph with checkpointer
Context graphs in production AI systems
Differences between LangGraph and Neo4j knowledge graphs
Scaling governed context graphs for enterprise
Here are tighter, higher‑leverage follow‑up prompts you can paste directly, tuned to your list and to enterprise / multi‑agent work.
Governance, KG vs context graph, production
1. Governed context graph for HIPAA SaaS
“Design a governed context graph specification for a HIPAA‑regulated healthcare SaaS: define concrete node/edge types for Patients, Encounters, PHI fields, Policies, Roles, and AuditEvents; show how this graph constrains a LangGraph multi‑agent workflow (intake agent, clinical‑support agent, compliance agent) so that PHI never leaves allowed channels and every decision is auditable.”
2. Context graphs in production AI systems
“Using real‑world patterns from Glean’s enterprise graph and modern ‘context layer’ architectures, outline how a governed context graph sits between raw data sources and AI agents in a large enterprise (identity, content, activity, policies). Specify required SLAs, observability, and governance controls to run this in production for search + RAG assistants across multiple business units.”
3. Scaling governed context graphs
“Propose an architecture to scale governed context graphs to 10k+ concurrent agent sessions in a large enterprise: detail how you separate the durable enterprise KG from per‑task context graphs, how you enforce row/edge‑level permissions, and how you index context graphs for post‑hoc decision forensics and policy‑violation detection.”
4. Knowledge graphs vs context graphs for AI agents
“Create a technical comparison (with tables) of knowledge graphs vs governed context graphs for AI agents, focusing on: schema design, update cadence, governance model, role in RAG, and impact on agent reliability. Include examples from enterprise tools like Glean and context‑layer architectures described by Atlan or similar vendors.”
LangGraph multi‑agent, context_graph, and checkpointer
5. Multi‑agent collaboration with context_graph
“Using LangGraph’s multi‑agent patterns (supervisor and peer collaboration), design a three‑agent system (research, reasoning, and policy‑check) that all read/write a shared context_graph field in state. Show the full Python StateGraph definition, the AgentState schema, and how the supervisor modifies context_graph between agents to add tasks, constraints, and intermediate conclusions.”
6. StateGraph with checkpointer for decision forensics
“Write a complete code tutorial that builds a LangGraph StateGraph with a SQLite or Couchbase checkpointer, focused on replaying and inspecting context_graph evolution over time. Demonstrate how to time‑travel to a specific step, dump the context graph at that step, and relate it to model actions for debugging non‑deterministic multi‑agent behaviour.”
7. LangGraph vs Neo4j knowledge graphs
“Explain the conceptual and practical differences between LangGraph (as an execution/state graph) and Neo4j (as a knowledge graph database). Then design an architecture where LangGraph nodes query Neo4j to build task‑specific context graphs, and describe how you’d trace from a LangGraph execution to the underlying Neo4j entities and relationships used.”
Backend‑specific KGClient implementations (ArangoDB, Neo4j, relational)
8. Neo4j‑backed KG + LangGraph node
“Given a Neo4j‑backed enterprise KG with labels Customer, Case, Document, Product, Feature, Policy, and Tag, write complete Python code for a KGClient plus a LangGraph node that: (1) runs Cypher to pull a 2‑hop ego network around a Case, (2) caps to N nodes, (3) returns a context_graph JSON object, and (4) injects that JSON into an LLM RAG prompt in a downstream node.”
9. ArangoDB Community Edition GraphRAG client
“For ArangoDB Community Edition, design collections and indices for Customers, Cases, Documents, Policies, Tags, and typed edge collections (e.g., raisedEdges, governedByEdges). Then implement KGClient.neighborhood and KGClient.semantic_docs_for_case in Python using AQL, optimized for GraphRAG patterns (including AQL examples and hints for performance).”
10. Relational implementation with edge tables
“Starting from a relational schema with node tables (customers, cases, documents, policies, tags) and a generic edges table, implement KGClient.neighborhood in Python using SQL (for PostgreSQL). Show how to: (1) run a bounded BFS from an anchor case, (2) assemble a KGSubgraph in memory, and (3) serialize it into a compact context_graph attached to LangGraph state.”
JSON serialization, prompts, and dialogue
11. JSON context_graph format + prompt template
“Define a compact JSON schema for context_graph (nodes, edges, props) tailored for LLM reasoning under tight token budgets. Then write a system prompt template that teaches the model how to use this graph for chain‑of‑thought reasoning without hallucinating entities. Finally, show a multi‑turn dialogue where the model updates its plan based on changes to context_graph between turns.”
12. Task‑specific subgraph projection techniques
“Enumerate and implement three task‑specific subgraph projection strategies (anchor‑centric, query‑semantic, and policy‑first) and show how each is exposed as a projection_mode parameter in LangGraph’s context_schema. Provide example Python code that switches projection logic inside build_context_graph based on this mode.”
Context graphs in enterprise AI
13. Enterprise AI examples using KGs and context graphs
“Survey 3–5 real or representative enterprise AI applications that use knowledge graphs (e.g., Glean, enterprise search, customer 360) and show how they could evolve to governed context graphs for agentic workflows. For each, outline entities, relationships, governance needs, and how a context graph would be constructed per request.”
If you tell me which of these you want to run first, I can expand it straight into code + diagrams tailored to your current stack (GCP + ArangoDB/Neo4j/Postgres + LangGraph).
Implement this context graph spec using LangGraph with checkpointer for persistence
Add multi-agent workflow for patient data queries in the healthcare graph
Compare governed context graph vs knowledge graph for HIPAA healthcare use
Example JSON schema for the healthcare context graph nodes and edges
Extend the spec with governance metrics and audit trail examples
YES TO ALL
Below is a compact “YES TO ALL” implementation sketch you can extend. I’ll keep it concrete and healthcare‑specific.
1) Healthcare context graph vs knowledge graph (HIPAA focus)
- Enterprise healthcare knowledge graph: durable, governed ontology and facts about Providers, Patients, Encounters, Diagnoses, Orders, Systems, Policies, etc.; supports lineage, semantic consistency, and compliance reporting.
- Governed context graph: small, per‑task graph to answer “what is relevant and allowed for this patient query, for this clinician, under this policy, right now?”, including PHI nodes and audit events tied to each agent step.
Key HIPAA‑specific differences:
- KG = long‑term semantic + data governance backbone (who the patient is, what happened, policy set and lineage).
- Context graph = short‑lived access + decision graph (who accessed what PHI, for what purpose, under which role/policy, and what the agent did with it).
2) JSON schema for the healthcare context graph
Minimal, PHI‑aware context_graph JSON:
{
  "nodes": [
    {
      "id": "patient:123",
      "type": "Patient",
      "name": "REDACTED",
      "props": {
        "mrn": "MRN-12345",
        "age": 54,
        "isPHI": true
      }
    },
    {
      "id": "encounter:789",
      "type": "Encounter",
      "name": "ED visit 2026-01-11",
      "props": {
        "encounterType": "ED",
        "start": "2026-01-11T12:34:00Z",
        "end": "2026-01-11T16:22:00Z",
        "isPHI": true
      }
    },
    {
      "id": "policy:hipaa-minimum-necessary",
      "type": "Policy",
      "name": "HIPAA Minimum Necessary",
      "props": {
        "reg": "45 CFR 164.514(d)",
        "link": "https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html",
        "isPHI": false
      }
    },
    {
      "id": "audit:step-5",
      "type": "AuditEvent",
      "name": "AgentAccess",
      "props": {
        "agentId": "clinical-assistant",
        "userId": "clinician:456",
        "timestamp": "2026-02-03T04:31:00Z",
        "action": "READ",
        "targetNodeIds": ["patient:123", "encounter:789"],
        "purposeOfUse": "TREATMENT",
        "isPHI": false
      }
    }
  ],
  "edges": [
    {
      "id": "e1",
      "type": "HAS_ENCOUNTER",
      "from": "patient:123",
      "to": "encounter:789",
      "props": {}
    },
    {
      "id": "e2",
      "type": "GOVERNED_BY",
      "from": "encounter:789",
      "to": "policy:hipaa-minimum-necessary",
      "props": {}
    },
    {
      "id": "e3",
      "type": "RECORDED_IN",
      "from": "audit:step-5",
      "to": "encounter:789",
      "props": {}
    }
  ]
}
Core node types: Patient, Encounter, Condition, Medication, Observation, Document, User, Role, Policy, AuditEvent.
Core edge types: HAS_ENCOUNTER, HAS_CONDITION, HAS_MEDICATION, HAS_OBSERVATION, HAS_DOCUMENT, HAS_ROLE, GOVERNED_BY, RECORDED_IN, PERMITTED_ACTION, DENIED_ACTION.
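Given the isPHI flags in the JSON above, a minimum-necessary mask can be applied before the context graph leaves the trust boundary. This is a sketch: real purpose-of-use rules would come from your Policy nodes, not a hard-coded "TREATMENT" check, and the sample patient data is made up.

```python
import copy

def mask_phi_nodes(subgraph: dict, purpose_of_use: str) -> dict:
    """Redact nodes flagged props.isPHI unless the purpose of use is TREATMENT.

    Returns a deep copy so the stored context graph stays intact for audit.
    """
    masked = copy.deepcopy(subgraph)
    if purpose_of_use == "TREATMENT":
        return masked
    for node in masked["nodes"]:
        if node.get("props", {}).get("isPHI"):
            node["name"] = "REDACTED"
            node["props"] = {"isPHI": True}  # drop MRN, dates, etc.
    return masked

# Hypothetical graph with one PHI node and one non-PHI node
phi_graph = {
    "nodes": [
        {"id": "patient:123", "type": "Patient", "name": "Jane Doe",
         "props": {"mrn": "MRN-12345", "isPHI": True}},
        {"id": "policy:1", "type": "Policy", "name": "HIPAA Minimum Necessary",
         "props": {"isPHI": False}},
    ],
    "edges": [],
}
```

Because the unmasked graph is kept for the audit trail, the mask runs at serialization time, not at build time.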
3) LangGraph implementation with checkpointer (Postgres)
Using a Postgres checkpointer for persistence and replay.
State + context schema
from typing import TypedDict, List, Literal, Optional, Dict, Any

from langgraph.graph import StateGraph, START, END
from langgraph.types import Command
from langgraph.runtime import Runtime  # runtime context for context_schema
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg_pool import ConnectionPool
NodeType = Literal[
"Patient", "Encounter", "Condition", "Medication",
"Observation", "Document", "User", "Role", "Policy", "AuditEvent"
]
EdgeType = Literal[
"HAS_ENCOUNTER", "HAS_CONDITION", "HAS_MEDICATION",
"HAS_OBSERVATION", "HAS_DOCUMENT", "HAS_ROLE",
"GOVERNED_BY", "RECORDED_IN", "PERMITTED_ACTION", "DENIED_ACTION"
]
class KGNode(TypedDict, total=False):
id: str
type: NodeType
name: str
props: Dict[str, Any]
# "from" is a Python keyword, so use the functional TypedDict syntax to keep
# the key aligned with the JSON payloads ("from", not "from_").
KGEdge = TypedDict(
    "KGEdge",
    {"id": str, "type": EdgeType, "from": str, "to": str, "props": Dict[str, Any]},
    total=False,
)
class KGSubgraph(TypedDict):
nodes: List[KGNode]
edges: List[KGEdge]
class AgentState(TypedDict):
messages: list
patient_id: Optional[str]
clinician_id: Optional[str]
query: Optional[str]
context_graph: Optional[KGSubgraph]
governance_metrics: Dict[str, Any]
class Context(TypedDict):
max_nodes: int
max_depth: int
projection_mode: Literal["anchor", "semantic", "policy_first"]
Checkpointer setup (Postgres)
DB_URI = "postgresql://user:pass@host:5432/langgraph?sslmode=require"
pool = ConnectionPool(conninfo=DB_URI, max_size=10)
# Pass the pool itself rather than a borrowed connection: a connection taken
# from `with pool.connection()` is returned to the pool when the block exits,
# which would leave the checkpointer holding a stale connection.
checkpointer = PostgresSaver(pool)
checkpointer.setup()
4) Multi‑agent workflow for patient data queries
Three agents: intake_agent, clinical_agent, compliance_agent. Supervisor is implicit via edges.919293
KG client (backend‑agnostic interface)
class HealthcareKGClient:
def neighborhood(
self,
patient_id: str,
depth: int,
max_nodes: int,
) -> KGSubgraph:
"""Backend-specific (ArangoDB / Neo4j / Postgres)."""
...
def semantic_docs_for_patient(
self, patient_id: str, k: int
) -> KGSubgraph:
"""RAG over guidelines, SOPs, patient-doc links."""
...
kg_client = HealthcareKGClient()
Context‑graph builder node (governed projection)
def build_context_graph(
state: AgentState,
runtime: Runtime[Context],
) -> AgentState:
ctx = runtime.context
patient_id = state.get("patient_id")
if not patient_id:
return state
max_nodes = ctx.get("max_nodes", 128)
max_depth = ctx.get("max_depth", 2)
nodes: Dict[str, KGNode] = {}
edges: Dict[str, KGEdge] = {}
# 1) Structural neighborhood for patient
ego = kg_client.neighborhood(patient_id, depth=max_depth, max_nodes=max_nodes)
for n in ego["nodes"]:
nodes[n["id"]] = n
for e in ego["edges"]:
edges[e["id"]] = e
# 2) Semantic expansion (guidelines, SOPs, etc.)
docs = kg_client.semantic_docs_for_patient(patient_id, k=10)
for n in docs["nodes"]:
nodes[n["id"]] = n
for e in docs["edges"]:
edges[e["id"]] = e
# Governance metrics (simple examples)
phi_nodes = [n for n in nodes.values() if n.get("props", {}).get("isPHI")]
policies = [n for n in nodes.values() if n["type"] == "Policy"]
state["context_graph"] = {
"nodes": list(nodes.values())[:max_nodes],
"edges": list(edges.values())[: max_nodes * 2],
}
state["governance_metrics"] = {
"phi_node_count": len(phi_nodes),
"policy_count": len(policies),
}
return state
Agents
def intake_agent(state: AgentState) -> AgentState:
# Clarify query, set patient_id / clinician_id, append to messages.
...
return state
def clinical_agent(state: AgentState) -> AgentState:
ctx_graph = state.get("context_graph") or {"nodes": [], "edges": []}
system_prompt = (
"You are a clinical decision-support assistant. "
"Use ONLY facts in CONTEXT_GRAPH to answer. Do NOT reveal direct identifiers.\n"
f"CONTEXT_GRAPH_JSON:\n{ctx_graph}"
)
# Call LLM with system_prompt + state["messages"]; update messages.
...
return state
def compliance_agent(state: AgentState) -> AgentState:
ctx_graph = state.get("context_graph") or {"nodes": [], "edges": []}
metrics = state.get("governance_metrics", {})
# Example: block if PHI nodes exist and no Policy nodes guarding them.
phi_cnt = metrics.get("phi_node_count", 0)
pol_cnt = metrics.get("policy_count", 0)
if phi_cnt > 0 and pol_cnt == 0:
state["messages"].append(
"Compliance check: response blocked due to missing policies for PHI access."
)
else:
state["messages"].append("Compliance check: response approved.")
# Optionally append an AuditEvent node to context_graph.
...
return state
Wiring with checkpointer
builder = StateGraph(AgentState, context_schema=Context)
builder.add_node("intake", intake_agent)
builder.add_node("build_context", build_context_graph)
builder.add_node("clinical", clinical_agent)
builder.add_node("compliance", compliance_agent)
builder.add_edge(START, "intake")
builder.add_edge("intake", "build_context")
builder.add_edge("build_context", "clinical")
builder.add_edge("clinical", "compliance")
builder.add_edge("compliance", END)
graph = builder.compile(checkpointer=checkpointer)
Invocation:
thread = {"configurable": {"thread_id": "pt-123-q-1"}}
result = graph.invoke(
{
"messages": [{"role": "user", "content": "Why was my ACE inhibitor stopped?"}],
"patient_id": "patient:123",
"clinician_id": "clinician:456",
"query": "medication change explanation",
"context_graph": None,
"governance_metrics": {},
},
config=thread,
context={"max_nodes": 128, "max_depth": 2, "projection_mode": "anchor"},
)
You can later replay / inspect the context_graph from checkpoints via the checkpointer API.94899088
5) Governance metrics and audit trail examples
Examples of metrics you can compute per run:86877985
- phi_node_count: number of PHI‑tagged nodes in the context graph.
- ungoverned_phi_nodes: PHI nodes without a GOVERNED_BY → Policy edge.
- agent_actions_per_policy: how many actions were taken under each policy (e.g., hipaa-minimum-necessary).
- access_log_coverage: ratio of PHI nodes in the context graph that also appear in AuditEvent.targetNodeIds.
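A sketch of how these metrics can be computed from a context_graph dict (pure Python, assuming the node/edge shapes from the JSON examples in this document; the function name is illustrative):

```python
def governance_metrics(subgraph: dict) -> dict:
    """Compute simple per-run governance metrics over a context graph."""
    nodes = subgraph.get("nodes", [])
    edges = subgraph.get("edges", [])
    # PHI nodes are tagged via props.isPHI.
    phi_ids = {n["id"] for n in nodes if n.get("props", {}).get("isPHI")}
    # A node is "governed" if it has an outgoing GOVERNED_BY edge.
    governed = {e["from"] for e in edges if e["type"] == "GOVERNED_BY"}
    # Audit coverage: PHI nodes referenced by some AuditEvent.targetNodeIds.
    audited = {
        t for n in nodes if n.get("type") == "AuditEvent"
        for t in n.get("props", {}).get("targetNodeIds", [])
    }
    return {
        "phi_node_count": len(phi_ids),
        "ungoverned_phi_nodes": sorted(phi_ids - governed),
        "access_log_coverage": (len(phi_ids & audited) / len(phi_ids)) if phi_ids else 1.0,
    }
```

Calling this inside `build_context_graph` (instead of the two ad-hoc counters there) makes the compliance agent's block/approve decision auditable from state alone.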
Audit trail pattern (conceptual):
- Every time an agent reads or writes PHI, add an AuditEvent node with edges RECORDED_IN (to Encounter or Document) and GOVERNED_BY (to Policy), containing agentId, userId, timestamp, action, purposeOfUse.878586
- Persist checkpoints so you can replay the evolution of context_graph and correlate it with AuditEvent nodes and external logs (EHR, FHIR server).
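The audit pattern can be sketched as a small helper that appends an AuditEvent node plus its RECORDED_IN edge to the current context_graph (field names follow the schema in this document; timestamps use UTC; the helper itself is an assumption, not a library API):

```python
import uuid
from datetime import datetime, timezone

def record_audit_event(subgraph, *, agent_id, user_id, action,
                       target_node_ids, purpose_of_use, encounter_id):
    """Append an AuditEvent node and RECORDED_IN edge to a context graph in place."""
    event_id = f"audit:{uuid.uuid4()}"
    subgraph["nodes"].append({
        "id": event_id,
        "type": "AuditEvent",
        "name": "AgentAccess",
        "props": {
            "agentId": agent_id,
            "userId": user_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "targetNodeIds": list(target_node_ids),
            "purposeOfUse": purpose_of_use,
            "isPHI": False,
        },
    })
    subgraph["edges"].append({
        "id": f"e-{event_id}",
        "type": "RECORDED_IN",
        "from": event_id,
        "to": encounter_id,
        "props": {},
    })
    return event_id
```

Each agent node can call this before returning state, so every checkpoint carries its own access trail.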
If you tell me which backend you want first (ArangoDB vs Neo4j vs Postgres) I can next give you a fully concrete HealthcareKGClient implementation (AQL, Cypher, or SQL) that populates this healthcare context_graph for the LangGraph workflow above.
Enterprise context graph schema for clinic patient records
An enterprise context graph schema for clinic patient records is a constrained, FHIR‑aligned slice of a broader clinical KG, focused on what a given agent/user is allowed to see and act on for one patient and episode of care.99100101
Core design principles
- Standards‑anchored: align node/edge types with FHIR resources (Patient, Encounter, Observation, MedicationStatement, Condition, Practitioner) and major vocabularies (SNOMED CT, LOINC, RxNorm, ICD‑10).10110299
- Patient‑episode centric: graph is anchored on a Patient and one or more Encounters/Visits, representing a clinical trajectory rather than the whole enterprise.100101
- Governance‑forward: embed Policy, Consent, and AuditEvent nodes, with GOVERNED_BY / RECORDED_IN edges, to support HIPAA auditability and minimum‑necessary access.10310499100
Node types
You can treat this as a minimal typed schema for the context graph (not the full enterprise KG).
- Patient (FHIR Patient)
  - Key props: mrn, age, sex, deidentifiedId, isPHI=true.
- Encounter (FHIR Encounter / Visit_occurrence in OMOP)100101
  - Props: encounterType, start, end, locationId, serviceType.
- Condition (FHIR Condition, ICD‑10/SNOMED‑coded)99101
  - Props: codeSystem, code, display, onsetDate, clinicalStatus.
- Medication (FHIR MedicationStatement / Drug_exposure, RxNorm‑coded)10210199
  - Props: rxnormCode, name, dose, route, start, end, status.
- Observation (FHIR Observation, LOINC‑coded)10110299
  - Props: loincCode, name, value, unit, referenceRange, effectiveTime.
- Procedure (FHIR Procedure)102101
  - Props: codeSystem, code, display, performedTime.
- Document (clinical note, guideline, SOP)
  - Props: docType, pathOrId, embeddingId, isPHI.
- Practitioner (FHIR Practitioner)101
  - Props: npi, role, department.
- User (application user or service principal)
  - Props: userType, externalId, associatedPractitionerId.
- Role (RBAC roles: “Attending”, “Resident”, “Billing”, “Researcher”)
  - Props: purposeOfUse, riskTier.
- Policy (HIPAA / local policies)10410399
  - Props: regCitation, name, link, scope (“TREATMENT”, “PAYMENT”, “OPERATIONS”).
- Consent (FHIR Consent)
  - Props: patientId, status, scope, categories, provision.
- AuditEvent (FHIR AuditEvent / HIPAA audit trail)105106107
  - Props: agentId, userId, timestamp, action, targets, purposeOfUse, outcome.
Each node includes generic properties like id, type, name, createdAt, updatedAt, and isPHI where relevant.99101
Edge types
Edges encode clinical relationships, user‑to‑data access, and governance.
- Patient‑centric clinical edges10099101
  - HAS_ENCOUNTER: Patient → Encounter
  - HAS_CONDITION: Encounter → Condition
  - HAS_MEDICATION: Encounter → Medication
  - HAS_OBSERVATION: Encounter → Observation
  - HAS_PROCEDURE: Encounter → Procedure
  - HAS_DOCUMENT: Encounter → Document or Patient → Document
- Provider and user relations99100101
  - PERFORMED_BY: Encounter/Procedure → Practitioner
  - AUTHORED_BY: Document/Observation → Practitioner
  - HAS_ROLE: User → Role
- Governance and consent10310499
  - GOVERNED_BY: Patient/Encounter/Document → Policy
  - PERMITTED_ACTION: Role → Policy (what actions the role may take under this policy)
  - DENIED_ACTION: Role → Policy (explicit prohibitions)
  - HAS_CONSENT: Patient → Consent
  - CONSENT_GOVERNED_BY: Consent → Policy
- Audit and access trails106107105
  - RECORDED_IN: AuditEvent → Encounter/Patient/Document (what the access relates to)
  - ACTED_BY: AuditEvent → User or Practitioner
You can optionally include SIMILAR_TO between Conditions, Medications, or Documents for GraphRAG‑style expansion.108109110
Example JSON snippet (context graph for one query)
{
"nodes": [
{
"id": "patient:123",
"type": "Patient",
"name": "REDACTED",
"props": {
"mrn": "MRN-12345",
"age": 54,
"sex": "female",
"isPHI": true
}
},
{
"id": "encounter:789",
"type": "Encounter",
"name": "Inpatient 2026-01-10",
"props": {
"encounterType": "inpatient",
"start": "2026-01-10T09:32:00Z",
"end": "2026-01-14T15:21:00Z",
"locationId": "ward:cardiology",
"isPHI": true
}
},
{
"id": "condition:ckd-3a",
"type": "Condition",
"name": "Chronic kidney disease, stage 3a",
"props": {
"codeSystem": "ICD-10",
"code": "N18.31",
"display": "Chronic kidney disease, stage 3a",
"onsetDate": "2024-11-02",
"clinicalStatus": "active",
"isPHI": true
}
},
{
"id": "med:ace-inhibitor",
"type": "Medication",
"name": "Lisinopril 10 mg",
"props": {
"rxnormCode": "29046",
"dose": "10 mg",
"route": "oral",
"start": "2024-12-01",
"end": "2026-01-10",
"status": "stopped",
"isPHI": true
}
},
{
"id": "policy:hipaa-minimum-necessary",
"type": "Policy",
"name": "HIPAA Minimum Necessary",
"props": {
"regCitation": "45 CFR 164.514(d)",
"scope": ["TREATMENT"],
"link": "https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html",
"isPHI": false
}
},
{
"id": "consent:de-identified-research",
"type": "Consent",
"name": "De-identified data for research",
"props": {
"status": "active",
"scope": "RESEARCH",
"categories": ["de-identified"],
"isPHI": false
}
},
{
"id": "audit:step-12",
"type": "AuditEvent",
"name": "ClinicalAgentRead",
"props": {
"agentId": "clinical-agent",
"userId": "user:clinician-456",
"timestamp": "2026-02-03T04:33:00Z",
"action": "READ",
"targetNodeIds": ["patient:123", "encounter:789", "med:ace-inhibitor"],
"purposeOfUse": "TREATMENT",
"outcome": "success",
"isPHI": false
}
}
],
"edges": [
{
"id": "e1",
"type": "HAS_ENCOUNTER",
"from": "patient:123",
"to": "encounter:789",
"props": {}
},
{
"id": "e2",
"type": "HAS_CONDITION",
"from": "encounter:789",
"to": "condition:ckd-3a",
"props": {}
},
{
"id": "e3",
"type": "HAS_MEDICATION",
"from": "encounter:789",
"to": "med:ace-inhibitor",
"props": {}
},
{
"id": "e4",
"type": "GOVERNED_BY",
"from": "encounter:789",
"to": "policy:hipaa-minimum-necessary",
"props": {}
},
{
"id": "e5",
"type": "HAS_CONSENT",
"from": "patient:123",
"to": "consent:de-identified-research",
"props": {}
},
{
"id": "e6",
"type": "RECORDED_IN",
"from": "audit:step-12",
"to": "encounter:789",
"props": {}
}
]
}
This is the context graph your agents would receive for a single patient query: small, patient‑episode‑scoped, PHI‑annotated, and explicitly governed by policies and consents.10010199
Suggested next step prompt
If you want to push this into implementation, you can paste:
“Implement this enterprise clinic context graph schema on top of a FHIR/OMOP‑aligned clinical knowledge graph:
- show how FHIR Patient/Encounter/Observation/MedicationStatement map into the node/edge types above,
- design AQL or Cypher queries that build a per‑patient, per‑encounter context graph, and
- integrate it into a LangGraph multi‑agent workflow with a checkpointer, including how AuditEvent nodes are appended at each agent step.”
Follow‑up topics covered below:
- Sample code for PostgresSaver checkpointer setup in LangGraph
- Multi‑agent roles for a healthcare query workflow
- HIPAA compliance checklist for patient data graphs
- Deploying a LangGraph healthcare agent with Redis checkpointing
- FHIR/OMOP mappings into the clinic context graph
- Example entities and edges for the clinic patient graph
- LangGraph implementation of the clinic context graph
- GraphRAG applications for patient journey analysis
Use Postgres first for the durable checkpointer and OMOP‑style tabular mapping, then layer ArangoDB/Neo4j if/when you want native graph ops. Below are targeted snippets for each item.
1) Sample PostgresSaver checkpointer setup (LangGraph)
from psycopg_pool import ConnectionPool
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import StateGraph, START, END
# 1) Connection + checkpointer
DB_URI = "postgresql://user:pass@host:5432/langgraph?sslmode=require"
pool = ConnectionPool(conninfo=DB_URI, max_size=10)
# Pass the pool itself: a connection borrowed via `with pool.connection()` is
# returned to the pool when the block exits, leaving the saver with a stale
# connection.
checkpointer = PostgresSaver(pool)
checkpointer.setup()  # creates tables if not exist
# 2) Define state
from typing import TypedDict, Optional, Dict, Any, List
class AgentState(TypedDict):
messages: List[Dict[str, Any]]
patient_id: Optional[str]
clinician_id: Optional[str]
query: Optional[str]
context_graph: Optional[Dict[str, Any]]
governance_metrics: Dict[str, Any]
# 3) Build graph and compile with checkpointer
def dummy_node(state: AgentState) -> AgentState:
return state
builder = StateGraph(AgentState)
builder.add_node("dummy", dummy_node)
builder.add_edge(START, "dummy")
builder.add_edge("dummy", END)
graph = builder.compile(checkpointer=checkpointer)
Checkpointing docs and patterns:117118119120
2) Multi‑agent roles for healthcare query workflow
Typical roles over the clinic context graph:121122123
- Intake agent
  - Normalizes the user question, identifies the patient/episode, detects intent (med change, lab interpretation, discharge question).
  - Sets patient_id, clinician_id, query in state.
- Context builder agent (tool node)
  - Calls HealthcareKGClient to build context_graph for that patient and encounter (per‑patient, per‑episode subgraph).
- Clinical reasoning agent
  - Uses context_graph to explain decisions (e.g., “Why was the ACE inhibitor stopped?”), referencing Conditions, Medications, Observations.
- Compliance / privacy agent
  - Checks governance_metrics and the context graph for ungoverned PHI before a response is released; records AuditEvent entries.
- Supervisor (optional)
  - Routes between the agents, retries failed steps, and decides when the workflow ends.
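As a sketch, an explicit supervisor can be a routing function that inspects state and returns the name of the next node, wired in via `add_conditional_edges`. The `clinical_answer` key is an assumption for illustration (it is not part of the AgentState defined earlier):

```python
def supervisor(state: dict) -> str:
    """Decide which agent runs next based on what the state already contains."""
    if not state.get("patient_id"):
        return "intake"           # still need to identify the patient/episode
    if not state.get("context_graph"):
        return "build_context"    # project the governed subgraph first
    if not state.get("clinical_answer"):
        return "clinical"         # hypothetical key set by the clinical agent
    return "compliance"           # final gate before the response is released
```

Wiring would look like `builder.add_conditional_edges("supervisor", supervisor, {"intake": "intake", "build_context": "build_context", "clinical": "clinical", "compliance": "compliance"})`, with each agent routing back to the supervisor.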
3) HIPAA compliance checklist for patient data graphs
Non‑exhaustive but practical list tied to graph design:122125126124
- Identify PHI and tag explicitly
  - Mark nodes and properties (isPHI, phiCategory) for all identifiers and quasi‑identifiers.
- Minimum necessary principle
  - Ensure per‑task context graphs only include nodes needed for that task (episode‑scoped, small neighborhoods).
- Access control and role‑based edges
  - Represent roles and policies explicitly (User –HAS_ROLE→ Role, Role –PERMITTED_ACTION→ Policy).
- Consent and purpose‑of‑use
  - Include Consent nodes and purposeOfUse fields in AuditEvent aligned with HIPAA TPO (Treatment/Payment/Operations).
- Audit trails
  - For each agent action touching PHI, create an AuditEvent node with ACTED_BY, RECORDED_IN, GOVERNED_BY edges.
- Data retention and deletion
  - Track createdAt, updatedAt, and retention‑policy indicators per node; support archival/purge workflows.
- De‑identification / pseudonymization
  - Maintain mappings between MRN and de‑identified IDs outside the graph used by agents, or segregated into a higher‑security KG/DB.
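The minimum‑necessary item can be enforced mechanically: filter a projected subgraph down to a per‑task allow‑list of node types before it reaches the prompt. A minimal sketch (the task→types mapping is illustrative, not a HIPAA‑complete policy):

```python
# Allowed node types per task; illustrative only.
TASK_ALLOWED_TYPES = {
    "medication_explanation": {"Patient", "Encounter", "Medication",
                               "Condition", "Policy", "AuditEvent"},
}

def minimum_necessary(subgraph: dict, task: str) -> dict:
    """Drop nodes outside the task's allow-list, plus edges that dangle as a result."""
    allowed = TASK_ALLOWED_TYPES.get(task, set())
    nodes = [n for n in subgraph["nodes"] if n["type"] in allowed]
    keep_ids = {n["id"] for n in nodes}
    edges = [e for e in subgraph["edges"]
             if e["from"] in keep_ids and e["to"] in keep_ids]
    return {"nodes": nodes, "edges": edges}
```

This slots naturally into `build_context_graph` as a final step, so no agent downstream ever sees out-of-scope node types.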
4) Deploying LangGraph healthcare agent with Redis checkpointing
For less heavy deployment (or dev/test) you can swap Postgres for Redis.120130
from redis import Redis
from langgraph.checkpoint.redis import RedisSaver  # pip install langgraph-checkpoint-redis
from langgraph.graph import StateGraph, START, END
redis_client = Redis(host="redis", port=6379, db=0)
# Constructor kwargs vary by langgraph-checkpoint-redis version; alternatively
# use RedisSaver.from_conn_string("redis://redis:6379/0").
checkpointer = RedisSaver(redis_client=redis_client)
checkpointer.setup()  # create the Redis indices used for checkpoints
builder = StateGraph(AgentState)
builder.add_node("intake", intake_agent)
builder.add_edge(START, "intake")
builder.add_edge("intake", END)
graph = builder.compile(checkpointer=checkpointer)
# Invoke with a thread id for replay
config = {"configurable": {"thread_id": "pt-123"}}
result = graph.invoke(
{
"messages": [{"role": "user", "content": "Why was my ACE inhibitor stopped?"}],
"patient_id": "patient:123",
"clinician_id": "clinician:456",
"query": "medication change explanation",
"context_graph": None,
"governance_metrics": {},
},
config=config,
)
Redis integration guidance:130120
5) FHIR / OMOP mappings into clinic context graph
High‑level mapping from FHIR/OMOP into the node types we defined:123131132133
- Patient node
  - FHIR: Patient (id, identifiers, birthDate, gender) → Patient node props: mrn, age, sex, deidentifiedId, isPHI.
  - OMOP: person (person_id, gender_concept_id, year_of_birth) → same node.
- Encounter node
  - FHIR: Encounter → Encounter node props: encounterType, start, end, locationId.
  - OMOP: visit_occurrence → same node; map visit_concept_id to type.
- Condition node
  - FHIR: Condition → Condition node props: codeSystem, code, display, clinicalStatus, onsetDate.
  - OMOP: condition_occurrence.
- Medication node
  - FHIR: MedicationStatement / MedicationRequest → Medication node props: rxnormCode, name, dose, route, status.
  - OMOP: drug_exposure.
- Observation node
  - FHIR: Observation → Observation node with loincCode, value, unit, effectiveTime.
  - OMOP: measurement.
Edges like HAS_ENCOUNTER, HAS_CONDITION, HAS_MEDICATION, HAS_OBSERVATION correspond to FHIR references (Encounter.subject, Condition.encounter, MedicationStatement.subject, Observation.encounter) or OMOP foreign keys (e.g., person_id, visit_occurrence_id).131132123
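As a sketch, the Patient mapping can be a small function from a FHIR Patient resource (as a parsed dict) into the node shape used in this document. Field paths follow the FHIR R4 Patient resource; the MRN extraction (any identifier typed "MR") is deliberately simplified:

```python
def fhir_patient_to_node(resource: dict) -> dict:
    """Map a FHIR R4 Patient resource (dict) to a Patient context-graph node."""
    mrn = None
    for ident in resource.get("identifier", []):
        # Simplified: treat any identifier whose type coding is "MR" as the MRN.
        codings = ident.get("type", {}).get("coding", [])
        if any(c.get("code") == "MR" for c in codings):
            mrn = ident.get("value")
    return {
        "id": f"patient:{resource['id']}",
        "type": "Patient",
        "name": "REDACTED",  # direct identifiers stay out of the context graph
        "props": {
            "mrn": mrn,
            "birthDate": resource.get("birthDate"),
            "sex": resource.get("gender"),
            "isPHI": True,
        },
    }
```

The other node types (Encounter, Condition, Medication, Observation) follow the same pattern: read the FHIR fields listed above, emit a node, and emit the corresponding HAS_* edge from the referenced parent.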
6) Example entities and edges for clinic patient graph
You can view this as the enterprise context graph schema you already started:121122123
- Entities: Patient, Encounter, Condition, Medication, Observation, Procedure, Document, Practitioner, User, Role, Policy, Consent, AuditEvent.
- Edges: HAS_ENCOUNTER, HAS_CONDITION, HAS_MEDICATION, HAS_OBSERVATION, HAS_PROCEDURE, HAS_DOCUMENT, PERFORMED_BY, AUTHORED_BY, HAS_ROLE, GOVERNED_BY, HAS_CONSENT, CONSENT_GOVERNED_BY, RECORDED_IN, ACTED_BY.
This schema is consistent with FHIR‑oriented clinical KGs and cohort‑building graphs in the literature.134123131121
7) LangGraph implementation of clinic context graph (Postgres backend)
A minimal HealthcareKGClient over a relational (OMOP‑ish) model that builds a per‑patient, per‑encounter graph (pseudo‑SQL, not optimized):
import psycopg
from typing import Dict, Any, List
class HealthcareKGClient:
def __init__(self, dsn: str):
self.dsn = dsn
def _fetchall(self, query: str, params: tuple) -> List[Dict[str, Any]]:
with psycopg.connect(self.dsn) as conn:
with conn.cursor() as cur:
cur.execute(query, params)
cols = [c.name for c in cur.description]
return [dict(zip(cols, row)) for row in cur.fetchall()]
def patient_context(self, patient_id: str, max_nodes: int = 128) -> Dict[str, Any]:
# Example: join person, visit_occurrence, condition_occurrence, drug_exposure, measurement
# Build nodes/edges from rows; omitted for brevity.
nodes: Dict[str, Dict[str, Any]] = {}
edges: List[Dict[str, Any]] = []
# 1) patient
person_rows = self._fetchall(
"""
SELECT person_id, year_of_birth, gender_concept_id
FROM person
WHERE person_id = %s
""",
(patient_id,),
)
# build Patient node, then add visits, conditions, etc.
return {"nodes": list(nodes.values())[:max_nodes], "edges": edges}
You then call this from build_context_graph in the LangGraph node as we sketched earlier.
8) GraphRAG applications for patient journey analysis
You can use the same context graph to support GraphRAG tasks such as:135136134
- “Summarize this patient’s journey around diagnosis X over the last N encounters.”
- “Find similar patients with comparable conditions, meds, and lab trajectories.”
- “Explain treatment changes in light of guidelines (documents) linked to the graph.”
This is where SIMILAR_TO edges (patients/encounters/documents) and cohort‑level KGs from EHR data become valuable.123134121
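A sketch of how SIMILAR_TO edges can be materialized from document embeddings, using plain cosine similarity; in practice you would delegate this to a vector index (pgvector, Neo4j vector index, ArangoSearch), but the edge shape is the same:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def similar_to_edges(doc_embeddings: dict, threshold: float = 0.85) -> list:
    """Emit SIMILAR_TO edges between documents whose embeddings are close."""
    edges = []
    ids = sorted(doc_embeddings)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine(doc_embeddings[a], doc_embeddings[b])
            if score >= threshold:
                edges.append({
                    "id": f"sim-{a}-{b}",
                    "type": "SIMILAR_TO",
                    "from": a,
                    "to": b,
                    "props": {"score": round(score, 3)},
                })
    return edges
```

These edges let the context builder hop from a patient's documents to similar guidelines or similar-patient documents during semantic expansion.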
If you want next, I can either:
- Write a concrete Postgres SQL schema + a full HealthcareKGClient.patient_context implementation, or
- Show a Cypher or AQL variant for the same context graph projection, using FHIR/OMOP mappings.
Postgres + OMOP is the most generally useful base, so I’ll give you a concrete SQL schema and a full HealthcareKGClient.patient_context for a clinic context graph. You can adapt this to your LangGraph node directly.
1) Concrete Postgres schema (OMOP‑aligned, graph‑friendly)
This is a minimal subset of OMOP CDM tables plus a generic edges table to make graph projection easy.137138139140141142
-- Core OMOP-like tables (highly simplified)
CREATE TABLE person (
person_id BIGINT PRIMARY KEY,
birth_datetime TIMESTAMPTZ,
gender_concept_id INTEGER,
race_concept_id INTEGER
);
CREATE TABLE visit_occurrence (
visit_occurrence_id BIGINT PRIMARY KEY,
person_id BIGINT NOT NULL REFERENCES person(person_id),
visit_concept_id INTEGER NOT NULL,
visit_start_datetime TIMESTAMPTZ NOT NULL,
visit_end_datetime TIMESTAMPTZ,
care_site_id BIGINT
);
CREATE TABLE condition_occurrence (
condition_occurrence_id BIGINT PRIMARY KEY,
person_id BIGINT NOT NULL REFERENCES person(person_id),
condition_concept_id INTEGER NOT NULL,
condition_start_datetime TIMESTAMPTZ,
visit_occurrence_id BIGINT REFERENCES visit_occurrence(visit_occurrence_id)
);
CREATE TABLE drug_exposure (
drug_exposure_id BIGINT PRIMARY KEY,
person_id BIGINT NOT NULL REFERENCES person(person_id),
drug_concept_id INTEGER NOT NULL,
drug_exposure_start_datetime TIMESTAMPTZ,
drug_exposure_end_datetime TIMESTAMPTZ,
visit_occurrence_id BIGINT REFERENCES visit_occurrence(visit_occurrence_id),
route_concept_id INTEGER,
dose_unit_concept_id INTEGER,
stop_reason VARCHAR(50)
);
CREATE TABLE measurement (
measurement_id BIGINT PRIMARY KEY,
person_id BIGINT NOT NULL REFERENCES person(person_id),
measurement_concept_id INTEGER NOT NULL,
measurement_datetime TIMESTAMPTZ,
value_as_number DOUBLE PRECISION,
unit_concept_id INTEGER,
visit_occurrence_id BIGINT REFERENCES visit_occurrence(visit_occurrence_id)
);
-- Clinical notes / documents
CREATE TABLE note (
note_id BIGINT PRIMARY KEY,
person_id BIGINT NOT NULL REFERENCES person(person_id),
visit_occurrence_id BIGINT REFERENCES visit_occurrence(visit_occurrence_id),
note_datetime TIMESTAMPTZ,
note_title TEXT,
note_text TEXT,
embedding_id TEXT
);
-- Governance tables
CREATE TABLE policy (
policy_id BIGINT PRIMARY KEY,
name TEXT NOT NULL,
reg_citation TEXT,
scope TEXT, -- e.g., 'TREATMENT', 'RESEARCH'
link TEXT
);
CREATE TABLE consent (
consent_id BIGINT PRIMARY KEY,
person_id BIGINT NOT NULL REFERENCES person(person_id),
status TEXT, -- 'active', 'revoked'
scope TEXT,
categories TEXT, -- JSON/text list
created_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ
);
CREATE TABLE app_user (
user_id BIGINT PRIMARY KEY,
external_id TEXT,
user_type TEXT, -- 'clinician', 'admin', 'service'
practitioner_id BIGINT -- nullable, if mapped to clinical provider
);
CREATE TABLE user_role (
role_id BIGINT PRIMARY KEY,
name TEXT NOT NULL,
purpose_of_use TEXT -- 'TREATMENT', 'PAYMENT', etc.
);
CREATE TABLE user_role_link (
user_id BIGINT NOT NULL REFERENCES app_user(user_id),
role_id BIGINT NOT NULL REFERENCES user_role(role_id),
PRIMARY KEY (user_id, role_id)
);
CREATE TABLE audit_event (
audit_event_id BIGSERIAL PRIMARY KEY,
person_id BIGINT REFERENCES person(person_id),
visit_occurrence_id BIGINT REFERENCES visit_occurrence(visit_occurrence_id),
user_id BIGINT REFERENCES app_user(user_id),
agent_id TEXT, -- agent name in LangGraph
action TEXT, -- 'READ', 'WRITE', 'DISCLOSE'
purpose_of_use TEXT,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
outcome TEXT -- 'success', 'denied'
);
-- Generic edges to model governance relations at KG level
CREATE TABLE policy_edge (
edge_id BIGSERIAL PRIMARY KEY,
from_type TEXT NOT NULL, -- 'Patient','Encounter','Document','Role','Consent'
from_id BIGINT NOT NULL,
to_type TEXT NOT NULL, -- 'Policy'
to_id BIGINT NOT NULL,
edge_type TEXT NOT NULL, -- 'GOVERNED_BY','CONSENT_GOVERNED_BY','PERMITTED_ACTION'
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_visit_person ON visit_occurrence(person_id);
CREATE INDEX idx_condition_person ON condition_occurrence(person_id);
CREATE INDEX idx_drug_person ON drug_exposure(person_id);
CREATE INDEX idx_measurement_person ON measurement(person_id);
CREATE INDEX idx_note_person ON note(person_id);
CREATE INDEX idx_policy_edge_from ON policy_edge(from_type, from_id, edge_type);
CREATE INDEX idx_policy_edge_to ON policy_edge(to_type, to_id, edge_type);
This is compatible with OMOP CDM semantics while being simple enough to project into a context graph.138139140137
2) Full HealthcareKGClient.patient_context implementation (Python)
This client queries Postgres and returns a context_graph JSON compatible with the earlier schema.
import psycopg
from typing import Dict, Any, List, Tuple
Node = Dict[str, Any]
Edge = Dict[str, Any]
Subgraph = Dict[str, List[Dict[str, Any]]]
class HealthcareKGClient:
def __init__(self, dsn: str):
self.dsn = dsn
# --- helpers ---------------------------------------------------------
def _fetchall(self, conn, query: str, params: Tuple[Any, ...]) -> List[Dict[str, Any]]:
with conn.cursor() as cur:
cur.execute(query, params)
cols = [c.name for c in cur.description]
return [dict(zip(cols, row)) for row in cur.fetchall()]
def _add_node(self, nodes: Dict[str, Node], node: Node) -> None:
nodes[node["id"]] = node
def _edge(self, id_: str, type_: str, from_id: str, to_id: str, props=None) -> Edge:
return {
"id": id_,
"type": type_,
"from": from_id,
"to": to_id,
"props": props or {},
}
# --- public API ------------------------------------------------------
def patient_context(
self,
person_id: int,
max_visits: int = 10,
max_conditions: int = 50,
max_drugs: int = 50,
max_measurements: int = 50,
include_notes: bool = True,
) -> Subgraph:
"""
Build a per-patient, per-encounter clinic context graph from OMOP-like tables.
"""
nodes: Dict[str, Node] = {}
edges: List[Edge] = []
with psycopg.connect(self.dsn) as conn:
# 1) Patient
person_rows = self._fetchall(
conn,
"""
SELECT person_id, birth_datetime, gender_concept_id
FROM person
WHERE person_id = %s
""",
(person_id,),
)
if not person_rows:
return {"nodes": [], "edges": []}
p = person_rows[0]
patient_node_id = f"patient:{p['person_id']}"
patient_node = {
"id": patient_node_id,
"type": "Patient",
"name": f"Patient {p['person_id']}",
"props": {
"personId": p["person_id"],
"birthDatetime": p["birth_datetime"].isoformat() if p["birth_datetime"] else None,
"genderConceptId": p["gender_concept_id"],
"isPHI": True,
},
}
self._add_node(nodes, patient_node)
# 2) Encounters (visits)
visit_rows = self._fetchall(
conn,
"""
SELECT visit_occurrence_id, visit_concept_id,
visit_start_datetime, visit_end_datetime, care_site_id
FROM visit_occurrence
WHERE person_id = %s
ORDER BY visit_start_datetime DESC
LIMIT %s
""",
(person_id, max_visits),
)
for v in visit_rows:
enc_id = f"encounter:{v['visit_occurrence_id']}"
enc_node = {
"id": enc_id,
"type": "Encounter",
"name": f"Encounter {v['visit_occurrence_id']}",
"props": {
"visitConceptId": v["visit_concept_id"],
"start": v["visit_start_datetime"].isoformat() if v["visit_start_datetime"] else None,
"end": v["visit_end_datetime"].isoformat() if v["visit_end_datetime"] else None,
"careSiteId": v["care_site_id"],
"isPHI": True,
},
}
self._add_node(nodes, enc_node)
edges.append(self._edge(f"e-patient-enc-{v['visit_occurrence_id']}",
"HAS_ENCOUNTER", patient_node_id, enc_id))
if not visit_rows:
# Even if no visits, you might still have conditions or drugs, but
# for a context graph we usually stop here.
return {"nodes": list(nodes.values()), "edges": edges}
visit_ids = [v["visit_occurrence_id"] for v in visit_rows]
# 3) Conditions
condition_rows = self._fetchall(
conn,
"""
SELECT condition_occurrence_id, condition_concept_id,
condition_start_datetime, visit_occurrence_id
FROM condition_occurrence
WHERE person_id = %s
AND visit_occurrence_id = ANY(%s)
LIMIT %s
""",
(person_id, visit_ids, max_conditions),
)
for c in condition_rows:
cond_id = f"condition:{c['condition_occurrence_id']}"
node = {
"id": cond_id,
"type": "Condition",
"name": f"Condition {c['condition_occurrence_id']}",
"props": {
"conditionConceptId": c["condition_concept_id"],
"onset": c["condition_start_datetime"].isoformat()
if c["condition_start_datetime"] else None,
"isPHI": True,
},
}
self._add_node(nodes, node)
enc_id = f"encounter:{c['visit_occurrence_id']}"
edges.append(self._edge(f"e-enc-cond-{c['condition_occurrence_id']}",
"HAS_CONDITION", enc_id, cond_id))
# 4) Medications
drug_rows = self._fetchall(
conn,
"""
SELECT drug_exposure_id, drug_concept_id,
drug_exposure_start_datetime, drug_exposure_end_datetime,
visit_occurrence_id, stop_reason
FROM drug_exposure
WHERE person_id = %s
AND visit_occurrence_id = ANY(%s)
LIMIT %s
""",
(person_id, visit_ids, max_drugs),
)
for d in drug_rows:
med_id = f"med:{d['drug_exposure_id']}"
node = {
"id": med_id,
"type": "Medication",
"name": f"Medication {d['drug_exposure_id']}",
"props": {
"drugConceptId": d["drug_concept_id"],
"start": d["drug_exposure_start_datetime"].isoformat()
if d["drug_exposure_start_datetime"] else None,
"end": d["drug_exposure_end_datetime"].isoformat()
if d["drug_exposure_end_datetime"] else None,
"stopReason": d["stop_reason"],
"isPHI": True,
},
}
self._add_node(nodes, node)
enc_id = f"encounter:{d['visit_occurrence_id']}"
edges.append(self._edge(f"e-enc-med-{d['drug_exposure_id']}",
"HAS_MEDICATION", enc_id, med_id))
# 5) Observations / measurements
meas_rows = self._fetchall(
conn,
"""
SELECT measurement_id, measurement_concept_id,
measurement_datetime, value_as_number,
unit_concept_id, visit_occurrence_id
FROM measurement
WHERE person_id = %s
AND visit_occurrence_id = ANY(%s)
LIMIT %s
""",
(person_id, visit_ids, max_measurements),
)
for m in meas_rows:
obs_id = f"obs:{m['measurement_id']}"
node = {
"id": obs_id,
"type": "Observation",
"name": f"Measurement {m['measurement_id']}",
"props": {
"measurementConceptId": m["measurement_concept_id"],
"time": m["measurement_datetime"].isoformat()
if m["measurement_datetime"] else None,
"value": m["value_as_number"],
"unitConceptId": m["unit_concept_id"],
"isPHI": True,
},
}
self._add_node(nodes, node)
enc_id = f"encounter:{m['visit_occurrence_id']}"
edges.append(self._edge(f"e-enc-obs-{m['measurement_id']}",
"HAS_OBSERVATION", enc_id, obs_id))
# 6) Notes / documents
if include_notes:
note_rows = self._fetchall(
conn,
"""
SELECT note_id, visit_occurrence_id, note_datetime,
note_title, embedding_id
FROM note
WHERE person_id = %s
AND visit_occurrence_id = ANY(%s)
""",
(person_id, visit_ids),
)
for n in note_rows:
doc_id = f"doc:{n['note_id']}"
node = {
"id": doc_id,
"type": "Document",
"name": n["note_title"] or f"Note {n['note_id']}",
"props": {
"noteId": n["note_id"],
"time": n["note_datetime"].isoformat()
if n["note_datetime"] else None,
"embeddingId": n["embedding_id"],
"isPHI": True,
},
}
self._add_node(nodes, node)
enc_id = f"encounter:{n['visit_occurrence_id']}"
edges.append(self._edge(f"e-enc-doc-{n['note_id']}",
"HAS_DOCUMENT", enc_id, doc_id))
# 7) Governance: policies and consents attached to the patient
policy_rows = self._fetchall(
conn,
"""
SELECT pe.edge_id, pe.from_type, pe.from_id, pe.to_type, pe.to_id,
pe.edge_type,
p.name, p.reg_citation, p.scope, p.link
FROM policy_edge pe
JOIN policy p ON p.policy_id = pe.to_id
WHERE pe.from_type = 'Patient'
AND pe.from_id = %s
""",
(person_id,),
)
for r in policy_rows:
pol_node_id = f"policy:{r['to_id']}"
if pol_node_id not in nodes:
policy_node = {
"id": pol_node_id,
"type": "Policy",
"name": r["name"],
"props": {
"regCitation": r["reg_citation"],
"scope": r["scope"],
"link": r["link"],
"isPHI": False,
},
}
self._add_node(nodes, policy_node)
edges.append(
self._edge(
f"e-pol-{r['edge_id']}",
r["edge_type"] or "GOVERNED_BY",
patient_node_id,
pol_node_id,
)
)
consent_rows = self._fetchall(
conn,
"""
SELECT consent_id, status, scope, categories,
created_at, updated_at
FROM consent
WHERE person_id = %s
""",
(person_id,),
)
for c in consent_rows:
cons_id = f"consent:{c['consent_id']}"
node = {
"id": cons_id,
"type": "Consent",
"name": f"Consent {c['consent_id']}",
"props": {
"status": c["status"],
"scope": c["scope"],
"categories": c["categories"],
"createdAt": c["created_at"].isoformat() if c["created_at"] else None,
"updatedAt": c["updated_at"].isoformat() if c["updated_at"] else None,
"isPHI": False,
},
}
self._add_node(nodes, node)
edges.append(
self._edge(
f"e-patient-cons-{c['consent_id']}",
"HAS_CONSENT",
patient_node_id,
cons_id,
)
)
return {"nodes": list(nodes.values()), "edges": edges}
This function gives you a per‑patient context graph aligned with OMOP tables and the clinic schema you defined.139140141142137138
You then wire it into your LangGraph build_context_graph node:
kg_client = HealthcareKGClient(dsn="postgresql://user:pass@host/db")

def build_context_graph(state: AgentState, runtime) -> AgentState:
    # e.g., state["patient_id"] == "patient:123"
    person_id = int(state["patient_id"].split(":")[1])
    subgraph = kg_client.patient_context(person_id=person_id)
    state["context_graph"] = subgraph
    return state
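Once the node has populated state["context_graph"], the subgraph still has to be serialized into the prompt. A minimal sketch of that step, assuming the {"nodes": [...], "edges": [...]} shape returned by patient_context; the max_chars budget and the tail-dropping truncation are illustrative choices, not a fixed API:

```python
import json

def serialize_context_graph(subgraph: dict, max_chars: int = 4000) -> str:
    """Serialize a context graph into compact JSON for a model prompt.

    If the payload exceeds the character budget, nodes are dropped from
    the tail until it fits -- a crude strategy; a real context engine
    would rank nodes by relevance score before truncating.
    """
    nodes = list(subgraph.get("nodes", []))
    edges = list(subgraph.get("edges", []))
    while True:
        payload = json.dumps({"nodes": nodes, "edges": edges},
                             separators=(",", ":"))
        if len(payload) <= max_chars or not nodes:
            return payload
        nodes.pop()  # drop the last (least-prioritized) node

g = {"nodes": [{"id": "patient:123", "type": "Patient"}],
     "edges": [{"type": "HAS_CONSENT", "from": "patient:123", "to": "consent:1"}]}
print(serialize_context_graph(g))
```

The compact separators matter in practice: whitespace in a large serialized graph is pure token waste.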
If you want, next I can add a Neo4j Cypher version of patient_context that builds the same context graph directly from a graph model, or provide AQL queries if you prefer an ArangoDB‑backed implementation.
RDF explain structured format
RDF (Resource Description Framework) is a data model for representing information as a directed labeled graph, using a very simple structured format: triples.
Core structured model
- Triple: the basic statement is
$\langle subject,\ predicate,\ object \rangle$
where:
- subject = the thing being described (usually an IRI/URI or blank node)
- predicate = the property/relationship (IRI)
- object = the value or related thing (IRI, blank node, or literal)
Example triple (conceptually):
- Subject: <https://example.org/patient/123>
- Predicate: <https://example.org/vocab/hasCondition>
- Object: <https://example.org/condition/CKD-3a>
Together, many triples form a graph: subjects/objects are nodes, predicates are edge labels.
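In code, such a graph is literally a set of 3-tuples, and the node/edge-label distinction falls out of the triple positions. A minimal illustration (IRIs abbreviated with an ex: stand-in prefix):

```python
# A tiny RDF-style graph as a set of (subject, predicate, object) triples.
triples = {
    ("ex:patient123", "ex:hasCondition", "ex:CKD_3a"),
    ("ex:patient123", "ex:hasEncounter", "ex:encounter789"),
    ("ex:encounter789", "ex:hasCondition", "ex:CKD_3a"),
}

# Subjects and objects become graph nodes; predicates become edge labels.
nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
edge_labels = {p for _, p, _ in triples}

print(sorted(nodes))        # three distinct nodes
print(sorted(edge_labels))  # two distinct edge labels
```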
Common structured serializations
These are different concrete syntaxes to write the same RDF graph:
- Turtle (.ttl)
- Human‑readable, compact.
- Uses prefixes and punctuation to group triples.
- Example:
@prefix ex: <https://example.org/> .
ex:patient123
ex:hasCondition ex:CKD_3a ;
ex:age 54 .
- N‑Triples (.nt)
- Line‑oriented, very simple, one triple per line.
- Easier for machines, verbose for humans.
- Example:
<https://example.org/patient/123> <https://example.org/vocab/hasCondition> <https://example.org/condition/CKD-3a> .
<https://example.org/patient/123> <https://example.org/vocab/age> "54"^^<http://www.w3.org/2001/XMLSchema#integer> .
- N‑Quads (.nq)
- Extends N‑Triples with a 4th element (graph name) for datasets (multiple named graphs).
- RDF/XML (.rdf)
- XML syntax; older and more verbose, but still widely supported.
- JSON‑LD (.jsonld)
- JSON representation of RDF, designed to embed linked data into JSON APIs.
- Uses an @context to map JSON keys to IRIs.
- Example:
{
"@context": {
"ex": "https://example.org/",
"hasCondition": "ex:hasCondition",
"age": "ex:age"
},
"@id": "ex:patient123",
"hasCondition": { "@id": "ex:CKD_3a" },
"age": 54
}
All these formats encode the same underlying structure: a set of triples in the RDF graph model. The choice is about readability (Turtle), simplicity (N‑Triples), compatibility (RDF/XML), or JSON integration (JSON‑LD).
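To make the "same underlying triples" point concrete, here is a deliberately simplified expansion of the JSON-LD example above into triples. It handles only this flat shape (string-valued @context entries, {"@id": ...} objects, plain literals), nothing like the full JSON-LD expansion algorithm; the function name is invented for illustration:

```python
def expand_simple_jsonld(doc: dict) -> set:
    """Tiny JSON-LD-to-triples sketch (flat, single-subject documents only)."""
    ctx = doc.get("@context", {})

    def to_iri(term):
        # Resolve a term through @context, then expand any prefix.
        term = ctx.get(term, term)
        if ":" in term:
            prefix, local = term.split(":", 1)
            if prefix in ctx:
                return ctx[prefix] + local
        return term

    subject = to_iri(doc["@id"])
    triples = set()
    for key, value in doc.items():
        if key.startswith("@"):       # skip @context / @id keywords
            continue
        pred = to_iri(key)
        if isinstance(value, dict) and "@id" in value:
            obj = to_iri(value["@id"])  # IRI object
        else:
            obj = value                 # literal object
        triples.add((subject, pred, obj))
    return triples

doc = {
    "@context": {"ex": "https://example.org/",
                 "hasCondition": "ex:hasCondition",
                 "age": "ex:age"},
    "@id": "ex:patient123",
    "hasCondition": {"@id": "ex:CKD_3a"},
    "age": 54,
}
print(expand_simple_jsonld(doc))
```

The output is the same pair of triples the Turtle and N-Triples examples express.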
- RDF Turtle syntax examples
- JSON-LD vs Turtle comparison
- N-Triples format details
- How to parse RDF serializations in Python
- RDF vs knowledge graph schemas like clinic patient
Below is a compact, implementation‑oriented rundown for all five items.
1) RDF Turtle syntax examples
Turtle is a compact textual syntax for RDF triples.162163164165
Basic example:
@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123
a ex:Patient ;
ex:hasCondition ex:CKD_3a ;
ex:age 54 ;
ex:birthDate "1971-03-15"^^xsd:date .
List and blank node example:163162
@prefix : <http://example.org/foo/> .
:subject :predicate ( :a :b :c ) .
:subject :predicate2 () .
Predicate list shorthand (multiple properties for same subject):164
<http://example.org/thing1>
a <http://example.org/Example> ;
<http://example.org/name> "First Thing" ;
<http://example.org/next> <http://example.org/thing2> .
2) JSON‑LD vs Turtle (in practice)
Both are RDF serializations (different notations for the same triple model).166167168169
- Turtle
- Textual, graph‑centric; closer to SPARQL; very readable for ontologists.
- Good for ontologies, schema files, data dumps.
- JSON‑LD
Key practical differences:168166167
- JSON‑LD works seamlessly with JSON types and web stacks; Turtle is more natural inside RDF/SPARQL ecosystems.
- Conversion between them is algorithmically defined and supported by libraries (e.g., json‑ld processors, RDFLib).170171166
- SPARQL works identically once data is in a triple store regardless of whether it came from Turtle or JSON‑LD.168
3) N‑Triples format details
N‑Triples is a very simple, line‑oriented RDF serialization.169172162
- One triple per line: <subject> <predicate> <object> .
- The subject is an IRI or blank node; the predicate is always an IRI; the object can be an IRI, blank node, or literal.
- No prefixes; every IRI is written out in full, which makes it verbose but trivial to parse.
- Example:
<http://example.org/patient/123> <http://example.org/vocab/hasCondition> <http://example.org/condition/CKD-3a> .
<http://example.org/patient/123> <http://example.org/vocab/age> "54"^^<http://www.w3.org/2001/XMLSchema#integer> .
N‑Quads extends N‑Triples with a 4th term (graph name) for datasets.
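Because each statement is one self-contained, whitespace-delimited line, a naive parser for the IRI-only case fits in a few lines. This sketch ignores literals with embedded spaces, escape sequences, and blank nodes, all of which a real N-Triples parser must handle:

```python
def parse_ntriples_line(line: str):
    """Parse one N-Triples line of the form '<s> <p> <o> .' (IRIs only)."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    assert line.endswith("."), "every N-Triples statement ends with a dot"
    # Drop the trailing dot, then split into exactly three terms.
    s, p, o = line[:-1].split(None, 2)
    unwrap = lambda t: t.strip().lstrip("<").rstrip(">")
    return unwrap(s), unwrap(p), unwrap(o)

triple = parse_ntriples_line(
    "<http://example.org/patient/123> "
    "<http://example.org/vocab/hasCondition> "
    "<http://example.org/condition/CKD-3a> ."
)
print(triple)
```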
4) Parsing RDF serializations in Python (RDFLib)
RDFLib supports Turtle, N‑Triples, RDF/XML, JSON‑LD, etc.173171174
Basic parsing:
from rdflib import Graph
g = Graph()
g.parse("example.ttl", format="turtle") # or "nt", "xml", "json-ld", ...
print(len(g)) # number of triples
for s, p, o in g:
print(s, p, o)
Parsing N‑Triples:174
from rdflib import Graph
g = Graph()
g.parse("demo.nt", format="nt")
for stmt in g:
print(stmt)
Parsing JSON‑LD:
from rdflib import Graph
g = Graph()
g.parse("example.jsonld", format="json-ld")
RDFLib also supports datasets and TriG (named graphs):171173
from rdflib import Dataset, RDF
ds = Dataset()
ds.parse("demo.trig", format="trig")
for s, p, o, g in ds.quads((None, RDF.type, None, None)):
print(s, g)
5) RDF vs clinic patient “knowledge graph” schema
RDF is an abstract graph data model + semantics; your clinic patient graph is a specific domain schema that can be expressed in RDF (or in other graph/relational stores).175176177178169
- RDF level
- Just triples: subject–predicate–object with IRIs and literals.
- Serializations: Turtle, JSON‑LD, N‑Triples, RDF/XML, etc.
- Ontology layers (RDFS/OWL) express class hierarchies, property domains/ranges, constraints.
- Clinic patient KG level
- Domain modeling: classes like Patient, Encounter, Condition, Medication, Observation, Policy, AuditEvent, and properties like hasEncounter, hasCondition, governedBy.
- You can define these as RDF classes/properties (e.g., in Turtle) and store patient data as RDF graphs, or you can implement the same schema in Neo4j, ArangoDB, or Postgres.176177178175
Mapping your schema to RDF is roughly:
- Node types → RDF classes (ex:Patient a rdfs:Class, etc.).
- Properties/edges → RDF properties (ex:hasEncounter a rdf:Property, etc.).
- Instance nodes → RDF resources (ex:patient123 a ex:Patient …).
You are free to choose RDF/Turtle/JSON‑LD if you want standards‑based interoperability and SPARQL, or stay in property‑graph / relational if you prioritize operational familiarity and performance. The structural ideas (entities + relationships) are the same; RDF just standardizes the representation and reasoning layer.177178169175176
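The mapping is mechanical enough to sketch: given property-graph style node/edge dicts (the shape the earlier patient_context code produces), emit N-Triples lines. The BASE namespace, the "from"/"to"/"type" edge keys, and the id-to-IRI convention are illustrative assumptions:

```python
BASE = "http://example.org/clinic/"  # hypothetical namespace choice
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def iri(local: str) -> str:
    # "patient:123" -> <http://example.org/clinic/patient/123>
    return f"<{BASE}{local.replace(':', '/')}>"

def to_ntriples(nodes: list, edges: list) -> list:
    lines = []
    for n in nodes:
        # Node type -> an rdf:type triple pointing at a class IRI.
        lines.append(f"{iri(n['id'])} <{RDF_TYPE}> <{BASE}{n['type']}> .")
    for e in edges:
        # Edge type -> predicate IRI; endpoints -> subject/object.
        lines.append(f"{iri(e['from'])} <{BASE}{e['type']}> {iri(e['to'])} .")
    return lines

lines = to_ntriples(
    nodes=[{"id": "patient:123", "type": "Patient"}],
    edges=[{"from": "patient:123", "type": "HAS_CONSENT", "to": "consent:1"}],
)
print("\n".join(lines))
```

Node props would map to literal-valued triples in the same way; they are omitted here to keep the sketch short.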
If you’d like, next I can:
- Show your clinic context graph encoded as Turtle and JSON‑LD side‑by‑side, or
- Sketch an OWL/RDFS ontology (in Turtle) for your healthcare context graph schema.
- RDFLib tutorial for querying RDF graphs
- Convert Turtle to JSON-LD in Python
- N-Quads vs N-Triples differences
- SPARQL queries on Turtle data examples
- Turtle syntax for blank nodes and collections
- Show your clinic context graph encoded as Turtle and JSON‑LD side‑by‑side
- Sketch an OWL/RDFS ontology (in Turtle) for your healthcare context graph schema
Here’s a concise, code‑oriented answer touching all the items, plus the clinic context graph encodings.
RDFLib: querying RDF graphs with SPARQL
Basic pattern to load Turtle and run SPARQL.183184185186187
from rdflib import Graph, Namespace
g = Graph()
g.parse("clinic.ttl", format="turtle")
EX = Namespace("http://example.org/clinic#")
q = """
PREFIX ex: <http://example.org/clinic#>
SELECT ?enc ?cond
WHERE {
?patient a ex:Patient ;
ex:hasEncounter ?enc .
?enc ex:hasCondition ?cond .
}
"""
for row in g.query(q):
enc_iri, cond_iri = row
print(enc_iri, cond_iri)
from rdflib.plugins.sparql import prepareQuery
q = prepareQuery(
"""
PREFIX ex: <http://example.org/clinic#>
SELECT ?obs ?value
WHERE {
?patient a ex:Patient ;
ex:hasEncounter ?enc .
?enc ex:hasObservation ?obs .
?obs ex:value ?value .
}
""",
)
for row in g.query(q):
print(row.obs, row.value)
Convert Turtle → JSON‑LD in Python (RDFLib)
from rdflib import Graph
g = Graph()
g.parse("clinic.ttl", format="turtle")
jsonld_str = g.serialize(format="json-ld", indent=2)
with open("clinic.jsonld", "w") as f:
f.write(jsonld_str)
Reverse (JSON‑LD → Turtle) is symmetric by swapping parse/format.188189187190
N‑Quads vs N‑Triples
- N‑Triples: 3 terms + dot per line (subject, predicate, object).191192
- N‑Quads: 4 terms + dot per line (subject, predicate, object, graph label), so it can represent named graphs / datasets.193191
Example N‑Triples:
<http://example.org/clinic/patient123> <http://example.org/clinic/hasEncounter> <http://example.org/clinic/encounter789> .
Same triple in a named graph with N‑Quads:
<http://example.org/clinic/patient123> <http://example.org/clinic/hasEncounter> <http://example.org/clinic/encounter789> <http://example.org/graphs/episode1> .
N‑Triples: only a single default graph; N‑Quads: multi‑graph datasets.193191
SPARQL queries on Turtle data (clinic example)
Assume clinic.ttl defines your context graph.
All encounters for a patient:
PREFIX ex: <http://example.org/clinic#>
SELECT ?enc
WHERE {
ex:patient123 a ex:Patient ;
ex:hasEncounter ?enc .
}
Conditions and medications per encounter:
PREFIX ex: <http://example.org/clinic#>
SELECT ?enc ?cond ?med
WHERE {
ex:patient123 ex:hasEncounter ?enc .
OPTIONAL { ?enc ex:hasCondition ?cond . }
OPTIONAL { ?enc ex:hasMedication ?med . }
}
These queries work identically whether data comes from Turtle, JSON‑LD, or N‑Triples once loaded into a triplestore or RDFLib graph.185194183
Turtle: blank nodes and collections
Blank nodes (anonymous resources):195196197198
@prefix ex: <http://example.org/clinic#> .
ex:patient123 ex:hasAddress [
a ex:Address ;
ex:street "123 Main St" ;
ex:city "Springfield"
] .
Collections (RDF lists):
@prefix ex: <http://example.org/clinic#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
ex:patient123 ex:hasAllergies ( ex:Penicillin ex:Peanuts ) .
This expands to rdf:first, rdf:rest chains under the hood.
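That expansion can be written out explicitly. This sketch generates the rdf:first/rdf:rest chain (terminated by rdf:nil) for a Python list of items, using blank-node labels _:b0, _:b1, …:

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def expand_collection(subject: str, predicate: str, items: list) -> list:
    """Expand an RDF collection into rdf:first/rdf:rest triples."""
    if not items:
        # An empty collection is just rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    cells = [f"_:b{i}" for i in range(len(items))]  # one list cell per item
    triples = [(subject, predicate, cells[0])]
    for i, item in enumerate(items):
        triples.append((cells[i], RDF + "first", item))
        rest = cells[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((cells[i], RDF + "rest", rest))
    return triples

triples = expand_collection("ex:patient123", "ex:hasAllergies",
                            ["ex:Penicillin", "ex:Peanuts"])
for t in triples:
    print(t)
```

A two-item list costs five triples, which is why Turtle's ( … ) shorthand exists.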
Clinic context graph: Turtle and JSON‑LD side‑by‑side
Minimal slice: Patient, one Encounter, one Condition, one Medication, one Policy.
Turtle
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123 a ex:Patient ;
ex:mrn "MRN-12345" ;
ex:age 54 ;
ex:sex "female" ;
ex:hasEncounter ex:encounter789 ;
ex:hasConsent ex:consent1 .
ex:encounter789 a ex:Encounter ;
ex:encounterType "inpatient" ;
ex:start "2026-01-10T09:32:00Z"^^xsd:dateTime ;
ex:end "2026-01-14T15:21:00Z"^^xsd:dateTime ;
ex:hasCondition ex:condition_ckd3a ;
ex:hasMedication ex:med_acei ;
ex:governedBy ex:policy_minimum_necessary .
ex:condition_ckd3a a ex:Condition ;
ex:codeSystem "ICD-10" ;
ex:code "N18.31" ;
ex:display "Chronic kidney disease, stage 3a" .
ex:med_acei a ex:Medication ;
ex:rxnormCode "29046" ;
ex:name "Lisinopril 10 mg" ;
ex:status "stopped" ;
ex:stopReason "hyperkalemia" .
ex:policy_minimum_necessary a ex:Policy ;
ex:name "HIPAA Minimum Necessary" ;
ex:regCitation "45 CFR 164.514(d)" .
ex:consent1 a ex:Consent ;
ex:status "active" ;
ex:scope "RESEARCH" .
ex:audit_step12 a ex:AuditEvent ;
ex:agentId "clinical-agent" ;
ex:userId "user:clinician-456" ;
ex:timestamp "2026-02-03T04:33:00Z"^^xsd:dateTime ;
ex:action "READ" ;
ex:purposeOfUse "TREATMENT" ;
ex:outcome "success" ;
ex:recordedIn ex:encounter789 .
JSON‑LD
{
"@context": {
"ex": "http://example.org/clinic#",
"mrn": "ex:mrn",
"age": "ex:age",
"sex": "ex:sex",
"hasEncounter": "ex:hasEncounter",
"hasCondition": "ex:hasCondition",
"hasMedication": "ex:hasMedication",
"hasConsent": "ex:hasConsent",
"governedBy": "ex:governedBy",
"codeSystem": "ex:codeSystem",
"code": "ex:code",
"display": "ex:display",
"rxnormCode": "ex:rxnormCode",
"name": "ex:name",
"status": "ex:status",
"stopReason": "ex:stopReason",
"regCitation": "ex:regCitation",
"scope": "ex:scope",
"agentId": "ex:agentId",
"userId": "ex:userId",
"timestamp": {
"@id": "ex:timestamp",
"@type": "http://www.w3.org/2001/XMLSchema#dateTime"
},
"action": "ex:action",
"purposeOfUse": "ex:purposeOfUse",
"outcome": "ex:outcome",
"recordedIn": "ex:recordedIn",
"Patient": "ex:Patient",
"Encounter": "ex:Encounter",
"Condition": "ex:Condition",
"Medication": "ex:Medication",
"Policy": "ex:Policy",
"Consent": "ex:Consent",
"AuditEvent": "ex:AuditEvent"
},
"@graph": [
{
"@id": "ex:patient123",
"@type": "Patient",
"mrn": "MRN-12345",
"age": 54,
"sex": "female",
"hasEncounter": { "@id": "ex:encounter789" },
"hasConsent": { "@id": "ex:consent1" }
},
{
"@id": "ex:encounter789",
"@type": "Encounter",
"encounterType": "inpatient",
"start": "2026-01-10T09:32:00Z",
"end": "2026-01-14T15:21:00Z",
"hasCondition": { "@id": "ex:condition_ckd3a" },
"hasMedication": { "@id": "ex:med_acei" },
"governedBy": { "@id": "ex:policy_minimum_necessary" }
},
{
"@id": "ex:condition_ckd3a",
"@type": "Condition",
"codeSystem": "ICD-10",
"code": "N18.31",
"display": "Chronic kidney disease, stage 3a"
},
{
"@id": "ex:med_acei",
"@type": "Medication",
"rxnormCode": "29046",
"name": "Lisinopril 10 mg",
"status": "stopped",
"stopReason": "hyperkalemia"
},
{
"@id": "ex:policy_minimum_necessary",
"@type": "Policy",
"name": "HIPAA Minimum Necessary",
"regCitation": "45 CFR 164.514(d)"
},
{
"@id": "ex:consent1",
"@type": "Consent",
"status": "active",
"scope": "RESEARCH"
},
{
"@id": "ex:audit_step12",
"@type": "AuditEvent",
"agentId": "clinical-agent",
"userId": "user:clinician-456",
"timestamp": "2026-02-03T04:33:00Z",
"action": "READ",
"purposeOfUse": "TREATMENT",
"outcome": "success",
"recordedIn": { "@id": "ex:encounter789" }
}
]
}
These two encodings are isomorphic RDF graphs; you can round‑trip between them with RDFLib.187190188
If you want the OWL/RDFS layer next, I can sketch an ontology in Turtle (classes + properties + domains/ranges) for this clinic context schema, suitable as a base for SHACL or reasoning.
Reification: statements about statements
In RDF, reification is the pattern of making a statement itself into a resource so you can make further statements about that statement (provenance, source, time, certainty, etc.).206207208209
Core idea
Normal RDF triple:
- $\langle S, P, O \rangle$ = subject, predicate, object (e.g., :alice :likes :RDF).
Reification turns that triple into a describable node R:
- R rdf:type rdf:Statement
- R rdf:subject S
- R rdf:predicate P
- R rdf:object O
Now you can say things like “who asserted this triple”, “when”, “with what confidence”.207210211206
Simple example
Base triple:
@prefix ex: <http://example.org/> .
ex:alice ex:likes ex:RDF .
Reified version (statements about that statement):212213206
@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# The original statement
ex:alice ex:likes ex:RDF .
# A resource representing that statement
ex:stmt1
a rdf:Statement ;
rdf:subject ex:alice ;
rdf:predicate ex:likes ;
rdf:object ex:RDF .
# Metadata about the statement
ex:stmt1
ex:assertedBy ex:wikipedia ;
ex:assertedOn "2024-01-01" ;
ex:certainty "0.8"^^xsd:decimal .
Here ex:stmt1 is not the subject of the original statement; it is a description of the triple (ex:alice, ex:likes, ex:RDF).206207
Why use reification?
- Provenance: source of a statement (ex:assertedBy, prov:wasDerivedFrom).
- Temporal context: when the statement was true or observed.
- Trust/quality: confidence scores, author, version.
- Access control: who is allowed to see/use a given triple (relevant to your governed context graph / HIPAA work).
In a clinic setting, you’d reify statements like “patient has condition CKD‑3a” to attach who recorded it, when, and with what evidence, separate from the core clinical fact.
Alternatives in practice
Standard reification (with rdf:Statement) is verbose, and many systems prefer alternatives:208209212207
- Named graphs / N‑Quads: put triples in a graph identified by a URI, then attach metadata to the graph (e.g., “graph G was asserted by source X at time T”).
- Singleton properties: create a unique predicate for each statement so you can annotate that predicate.209212208
- Custom patterns: e.g., an application‑level “Fact” resource that has fact:subject, fact:predicate, fact:object, plus metadata.
For a governed context graph, named graphs or custom “Fact/AuditEvent” resources often scale better than classic rdf:Statement reification.
Singleton property vs RDF reification for statements about statements
- Turtle syntax example for reifying a statement
- RDFLib Python code to create and query reified statements
- Use cases of reification in RDF knowledge graphs
- Named graphs alternative to RDF reification
RDF gives you several ways to make “statements about statements.” Below is a concise comparison and examples, with emphasis on patterns that fit a governed context graph.
1) Standard RDF reification vs singleton properties
Standard reification (rdf:Statement)216217218
- Represent each triple as a resource with rdf:subject, rdf:predicate, rdf:object.
- Pros: standardized, works with any RDF 1.1 system.
- Cons: verbose (4 triples + metadata per original triple), awkward to query at scale.
Example (Tolkien wrote LOTR):219220
@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
# Base triple
ex:Tolkien ex:wrote ex:LOTR .
# Reified statement
ex:s1 a rdf:Statement ;
rdf:subject ex:Tolkien ;
rdf:predicate ex:wrote ;
rdf:object ex:LOTR ;
ex:since "1954" ;
ex:source ex:someBibliography .
Singleton properties221220222223216
- Create a unique property for each triple and relate it to its generic property via rdf:singletonPropertyOf.
- Metadata is attached to the singleton property instead of a statement node.
Example (same semantics):220216221
@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
# A unique property for this specific “wrote” statement
ex:wrote_s1 rdf:singletonPropertyOf ex:wrote ;
ex:since "1954" ;
ex:source ex:someBibliography .
# Statement using the singleton property
ex:Tolkien ex:wrote_s1 ex:LOTR .
Pros/cons relative to classic reification:222224216221
- Pros: fewer auxiliary triples per statement, more natural SPARQL over properties.
- Cons: not part of the core RDF standard; reasoners and tools may need special handling.
For a governed context graph, singleton properties can be powerful if you care deeply about per‑edge metadata and are OK with non‑standard patterns.
2) RDFLib: creating and querying reified statements
Creating reified statements (classic style) in Python:225218
from rdflib import Graph, URIRef, BNode, Namespace, RDF, Literal
EX = Namespace("http://example.org/")
g = Graph()
subject = EX.Tolkien
predicate = EX.wrote
obj = EX.LOTR
# Base triple
g.add((subject, predicate, obj))
# Reification node
s1 = BNode()
g.add((s1, RDF.type, RDF.Statement))
g.add((s1, RDF.subject, subject))
g.add((s1, RDF.predicate, predicate))
g.add((s1, RDF.object, obj))
g.add((s1, EX.since, Literal("1954")))
g.add((s1, EX.source, EX.someBibliography))
print(g.serialize(format="turtle").decode())
Querying reified statements with SPARQL:218226227
from rdflib import Graph, Namespace, RDF
EX = Namespace("http://example.org/")
g = Graph()
g.parse("reified.ttl", format="turtle")
q = """
PREFIX ex: <http://example.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?sub ?obj ?since ?src
WHERE {
?stmt a rdf:Statement ;
rdf:subject ?sub ;
rdf:predicate ex:wrote ;
rdf:object ?obj ;
ex:since ?since ;
ex:source ?src .
}
"""
for row in g.query(q):
print(row.sub, row.obj, row.since, row.src)
Singleton properties in RDFLib are analogous: you create a unique predicate EX.wrote_s1, assert EX.wrote_s1 rdf:singletonPropertyOf EX.wrote, and attach metadata to EX.wrote_s1.216220
3) Use cases of reification / singleton properties in KGs
Common uses in large KGs:224228221216
- Provenance: who asserted a fact, which dataset or publication, with which version.
- Temporal validity: when a relation was observed or valid (e.g., drug exposure period, condition status intervals).
- Uncertainty and trust: probabilities, confidence scores, source reliability.
- Policy / access control: which roles or policies apply at the statement level (e.g., “clinician can see this link between patient and diagnosis but researcher cannot”).
- Biomedical and clinical graphs: multiple lab measurements, conflicting diagnoses, or different coding systems for the same concept often benefit from statement‑level metadata.221224216
For your clinic context graph, this aligns with representing “Patient has CKD‑3a according to clinician X at time T, for encounter E, under policy P, with certainty C.”
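Statement-level access control, in particular, reduces to a filter over per-statement annotations. A toy sketch of that idea (the fact-dict schema, role names, and visibleTo key are invented for illustration, not part of any RDF standard):

```python
# Each "fact" carries its triple plus statement-level metadata,
# including which roles may see it (hypothetical schema).
facts = [
    {"triple": ("ex:patient123", "ex:hasCondition", "ex:condition_ckd3a"),
     "assertedBy": "ex:clinician456", "visibleTo": {"clinician"}},
    {"triple": ("ex:patient123", "ex:hasConsent", "ex:consent1"),
     "assertedBy": "ex:system", "visibleTo": {"clinician", "researcher"}},
]

def visible_triples(facts: list, role: str) -> list:
    """Return only the triples whose statement-level metadata permits this role."""
    return [f["triple"] for f in facts if role in f["visibleTo"]]

# A researcher sees the consent link but not the diagnosis link.
print(visible_triples(facts, "researcher"))
```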
4) Named graphs as an alternative
Named graphs attach metadata to sets of triples (graphs) rather than individual triples.229230231232
- Data model: quads (subject, predicate, object, graphName).
- Implementation: TriG/N‑Quads syntax and quad stores.
- You then annotate the graph IRI with provenance, time, etc.
Example in N‑Quads:230231232229
<http://example.org/Tolkien> <http://example.org/wrote> <http://example.org/LOTR> <http://example.org/graph/assertion1> .
Metadata about the graph (in Turtle over the dataset vocabulary):
@prefix ex: <http://example.org/> .
<http://example.org/graph/assertion1>
ex:source ex:someBibliography ;
ex:since "1954" ;
ex:assertedBy ex:alice .
- Easier to manage large sets of provenance (per dataset / per context) than per triple.
- Works well with governed context graphs: you can use one named graph per agent step or per “context graph snapshot,” then attach audit metadata to the graph name.
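The quad-plus-graph-metadata model is small enough to sketch in pure Python (a toy, not a real quad store); each context-graph snapshot gets its own graph name, and audit metadata hangs off that name rather than off individual triples:

```python
from collections import defaultdict

class QuadStore:
    """Toy named-graph store: quads plus per-graph metadata."""

    def __init__(self):
        self.graphs = defaultdict(set)  # graph name -> set of (s, p, o) triples
        self.metadata = {}              # graph name -> annotation dict

    def add(self, s, p, o, graph):
        """Add one quad: a triple inside a named graph."""
        self.graphs[graph].add((s, p, o))

    def annotate(self, graph, **meta):
        """Attach provenance/audit metadata to the graph name itself."""
        self.metadata.setdefault(graph, {}).update(meta)

store = QuadStore()
g1 = "http://example.org/graph/assertion1"
store.add("ex:Tolkien", "ex:wrote", "ex:LOTR", g1)
store.annotate(g1, source="ex:someBibliography", since="1954",
               assertedBy="ex:alice")

print(store.metadata[g1]["since"])
```

One metadata dict covers every triple in the graph, which is exactly the per-snapshot (rather than per-triple) granularity described above.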
5) What to use in a governed context graph
Given your clinic/HIPAA context:
- Use named graphs for:
  - Per‑step context‑graph snapshots (one named graph per agent turn), with provenance and purpose‑of‑use metadata attached to the graph IRI.
- Use custom Fact/AuditEvent resources for:
- Application‑level audit trails (“agent X read PHI node Y at time T for purpose U”).
- This aligns with the AuditEvent node pattern you already have, without forcing RDF‑level reification.
- Use singleton properties or classic reification only where you truly need fine‑grained, statement‑level metadata and you’re willing to pay the complexity cost (e.g., for specific high‑value clinical facts or research‑grade provenance).224216221
If you want, next I can sketch an RDF‑star / Turtle‑star variant, which is the more modern RDF‑level solution for statements‑about‑statements and tends to be cleaner than classic reification or singleton properties.
RDF 1.2
RDF 1.2 is the latest revision of the RDF family of specs. Conceptually it keeps the same graph data model as RDF 1.1, but adds a few important capabilities and some versioning/clarifications.236237238
Key additions in RDF 1.2
- Triple terms (RDF‑star built in): RDF 1.2 allows an RDF triple to be used as a triple term in the object position of another triple.237236
- Directional language‑tagged strings: RDF 1.2 introduces language‑tagged strings that also carry a base direction (e.g., left‑to‑right or right‑to‑left), so user agents can render multilingual content correctly.237236
- Version signalling
RDF 1.2 defines a mechanism for indicating the RDF version used by a dataset or serialization (e.g., via media‑type parameters or in‑syntax directives).236
- This lets parsers detect RDF 1.2‑specific features (like triple terms) early and warn or error if unsupported.
Backwards compatibility with RDF 1.1
- Data valid under RDF 1.1 is also valid under RDF 1.2 (“1.1” and “1.2‑basic” profiles). The core graph model (IRIs, blank nodes, literals, triples, datasets) remains the same.238237236
- Existing entailment regimes and tools can continue to operate; RDF 1.2 mainly extends the model (e.g., triple terms) and clarifies semantics rather than breaking existing 1.1 data.239240237236
Why it matters for your use cases
For governed context graphs and “statements about statements” in healthcare:
- RDF 1.2’s triple terms offer a more natural alternative to classic rdf:Statement reification and singleton properties for attaching provenance, policy, or audit metadata directly to clinical facts.241237236
- You can still keep your clinic context graph schema (Patient, Encounter, Condition, Policy, AuditEvent) and, where needed, annotate particular statements using triple terms instead of heavy reification constructs.
If you want, next I can sketch how one of your clinic context graph facts (e.g., “patient has condition CKD‑3a”) would look in RDF 1.2 triple‑term style vs classic reification and singleton properties.
sketch an RDF‑star / Turtle‑star variant, which is the more modern RDF‑level solution for statements‑about‑statements and tends to be cleaner than classic reification or singleton properties.
Related:
- RDF‑star triple terms vs RDF reification and singleton properties
- Turtle syntax example using named graphs for metadata
- RDFLib code example for singleton properties
- Limitations and performance issues of RDF reification
- Alternatives to reification like RDF quads or N‑Triples
- Sketch how one of your clinic context graph facts (e.g., “patient has condition CKD‑3a”) would look in RDF 1.2 triple‑term style vs classic reification and singleton properties
Here’s a focused sketch of RDF‑star / Turtle‑star for your clinic graph, plus comparisons to classic reification and singleton properties, and a quick named‑graph example.
1) RDF‑star / Turtle‑star (RDF 1.2 triple terms)
RDF‑star lets you quote a triple and use it as subject or object of another triple, using << … >> in Turtle‑star.246247248249250
Clinic fact: “patient123 has condition CKD‑3a”.
Base triple (normal Turtle):
@prefix ex: <http://example.org/clinic#> .
ex:patient123 ex:hasCondition ex:condition_ckd3a .
RDF‑star: add metadata directly on that triple:
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123 ex:hasCondition ex:condition_ckd3a .
<< ex:patient123 ex:hasCondition ex:condition_ckd3a >>
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
Here the quoted triple << ex:patient123 ex:hasCondition ex:condition_ckd3a >> is a triple term that can be used just like a node.247248249250
This is the cleanest RDF‑level fit for your governed context graph (per‑fact provenance, audit, and policy) without four extra reification triples.
2) Same clinic fact in three styles
a) Classic RDF reification (rdf:Statement)
@prefix ex: <http://example.org/clinic#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123 ex:hasCondition ex:condition_ckd3a .
ex:s1 a rdf:Statement ;
rdf:subject ex:patient123 ;
rdf:predicate ex:hasCondition ;
rdf:object ex:condition_ckd3a ;
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
- Pros: standard RDF 1.1.250251252
- Cons: verbose; queries must go through the rdf:Statement structure; performance overhead at scale.253254255250
b) Singleton property
@prefix ex: <http://example.org/clinic#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Singleton property for this specific assertion
ex:hasCondition_s1 rdf:singletonPropertyOf ex:hasCondition ;
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
# Fact using the singleton property
ex:patient123 ex:hasCondition_s1 ex:condition_ckd3a .
- Pros: fewer extra triples; easier SPARQL over properties; good benchmarked characteristics.256257258259
- Cons: non‑standard; explosion of properties; some tools won’t “understand” singleton semantics out‑of‑the‑box.
c) RDF‑star triple term (RDF 1.2 / Turtle‑star)
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123 ex:hasCondition ex:condition_ckd3a .
<< ex:patient123 ex:hasCondition ex:condition_ckd3a >>
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
- Pros: compact, readable; good store performance; integrates with SPARQL‑star (query triple terms directly).248260261262247250
- Cons: requires RDF 1.2 / RDF‑star support (not universal yet, though major stacks like Jena, GraphDB, Blazegraph variants support it).263260249
3) Turtle‑star with named graphs (metadata per context graph)
You can combine RDF‑star with named graphs for your governed context graph snapshots.264265266267247
TriG‑star style:
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Named graph = one context graph snapshot for a LangGraph step
ex:ctxGraph_step12 {
ex:patient123 ex:hasCondition ex:condition_ckd3a .
ex:patient123 ex:hasEncounter ex:encounter789 .
<< ex:patient123 ex:hasCondition ex:condition_ckd3a >>
ex:assertedBy ex:clinicalAgent ;
ex:assertedAt "2026-02-03T04:33:00Z"^^xsd:dateTime ;
ex:purposeOfUse "TREATMENT" .
}
# Metadata about the context graph as a whole
ex:ctxGraph_step12
ex:belongsToThread "pt-123-q-1" ;
ex:agentStep "clinical_agent" ;
ex:user "clinician:456" .
Now you can track:
- Metadata on the fact (triple term annotations).
- Metadata on the whole context graph snapshot (named graph annotations).
4) RDFLib and RDF‑star / singleton properties
- Singleton properties work in RDFLib as plain RDF: just treat ex:hasCondition_s1 as a normal predicate and add rdf:singletonPropertyOf plus your metadata.268256
Example:
from rdflib import Graph, Namespace, RDF, Literal
from rdflib.namespace import XSD
EX = Namespace("http://example.org/clinic#")
g = Graph()
g.bind("ex", EX)
patient = EX.patient123
condition = EX.condition_ckd3a
base_pred = EX.hasCondition
single_pred = EX.hasCondition_s1
# Base relation using the singleton property
g.add((patient, single_pred, condition))
# Declare the singleton's relationship to the base property
g.add((single_pred, RDF.type, RDF.Property))
g.add((single_pred, EX.singletonPropertyOf, base_pred))
# Per-assertion metadata attached to the singleton property
g.add((single_pred, EX.assertedBy, EX.clinician456))
g.add((single_pred, EX.sourceEncounter, EX.encounter789))
g.add((single_pred, EX.confidence, Literal("0.92", datatype=XSD.decimal)))
# rdflib >= 6 returns a str from serialize(), so no .decode() is needed
print(g.serialize(format="turtle"))
- RDF‑star in RDFLib: core releases don’t fully support Turtle‑star yet, but there are forks/PRs and external modules that translate RDF‑star to standard RDF (e.g., to reification) and back.269270
5) Limitations and performance issues of classic reification
Empirically and per vendor docs:254255250253
- Storage blow‑up: 4+ extra triples per annotated triple (statement node + subject/predicate/object).
- Query complexity: queries become multi‑join patterns over rdf:Statement, which hurts readability and performance.
- Ambiguity: reification does not, by itself, assert the original triple; you must assert it separately if you want it in the base graph.251263250
This is why RDF‑star is considered the preferred modern solution for edge metadata, especially in large KGs.261262260250
6) Alternatives recap for your governed context graph
For a clinic / HIPAA context graph:
- RDF‑star (triple terms): best fit if your graph stack supports it; annotate clinically important edges (Patient–Condition, Encounter–Medication) with provenance, policy, and audit metadata.
- Named graphs (TriG / N‑Quads): use one named graph per context‑graph snapshot (per LangGraph step / agent turn) and attach metadata (thread, agent, user, purposeOfUse) to the graph IRI.265266267264
- Singleton properties: good fallback if you want per‑edge metadata in RDF 1.1 tools and can tolerate property proliferation.257258256
- Classic reification: standards‑pure but heavy; mainly useful if you must stay fully RDF 1.1 with no extensions and no custom patterns.250251254
If you tell me which stack you’re most likely to use for the clinic KG (Jena/GraphDB vs pure RDFLib), I can give you a SPARQL‑star example query over the RDF‑star clinic graph and a minimal migration pattern from your current JSON context_graph into Turtle‑star.
RDF-star vs classic RDF reification examples
SPARQL-star query syntax for nested triples
RDFLib code example parsing Turtle-star
Performance benchmarks RDF-star vs singleton properties
Convert Turtle-star to TriG-star format
RDF‑star lets you add metadata directly to triples; in your clinic context graph it’s a cleaner fit than classic reification. Below are minimal examples and patterns you can adapt.
RDF‑star vs classic reification (clinic fact)
Fact: “patient123 has condition CKD‑3a”.
RDF‑star (Turtle‑star)
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Base fact
ex:patient123 ex:hasCondition ex:condition_ckd3a .
# Metadata on that fact (triple term)
<< ex:patient123 ex:hasCondition ex:condition_ckd3a >>
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
Classic RDF reification
@prefix ex: <http://example.org/clinic#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123 ex:hasCondition ex:condition_ckd3a .
ex:s1 a rdf:Statement ;
rdf:subject ex:patient123 ;
rdf:predicate ex:hasCondition ;
rdf:object ex:condition_ckd3a ;
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
RDF‑star is more compact and keeps metadata close to the fact; reification requires a separate rdf:Statement node and more joins.274275276277278
SPARQL‑star query syntax for nested triples
SPARQL‑star extends SPARQL to bind triple terms using << … >>.277279280
Example: find all patient‑condition facts with confidence and encounter:
PREFIX ex: <http://example.org/clinic#>
SELECT ?patient ?cond ?enc ?conf
WHERE {
# Base fact
?patient ex:hasCondition ?cond .
# Metadata about that fact (quoted triple)
<< ?patient ex:hasCondition ?cond >>
ex:sourceEncounter ?enc ;
ex:confidence ?conf .
}
Example: filter by clinician:
PREFIX ex: <http://example.org/clinic#>
SELECT ?cond ?when
WHERE {
<< ex:patient123 ex:hasCondition ?cond >>
ex:assertedBy ex:clinician456 ;
ex:assertedAt ?when .
}
Most triple stores that support RDF‑star (Jena, GraphDB, etc.) also support SPARQL‑star.278281282283
RDFLib and Turtle‑star
RDFLib does not fully support RDF‑star natively in stable releases; current patterns are:284285
- Use a store or service that supports RDF‑star (e.g., Jena Fuseki, GraphDB), talk to it via HTTP/SPARQL‑star.
- Or, preprocess Turtle‑star → classic RDF (reification or custom encoding) before loading into RDFLib.
If you have a server with RDF‑star, you can still use RDFLib’s SPARQL wrapper to query it:
from SPARQLWrapper import SPARQLWrapper, JSON
endpoint = SPARQLWrapper("http://localhost:3030/clinic/sparql")
endpoint.setReturnFormat(JSON)
query = """
PREFIX ex: <http://example.org/clinic#>
SELECT ?cond ?when
WHERE {
<< ex:patient123 ex:hasCondition ?cond >>
ex:assertedBy ex:clinician456 ;
ex:assertedAt ?when .
}
"""
endpoint.setQuery(query)
results = endpoint.query().convert()
for row in results["results"]["bindings"]:
print(row["cond"]["value"], row["when"]["value"])
For pure local RDFLib today, you’d generally stick to standard RDF (no << >>) or encode RDF‑star triples into a custom vocabulary until RDF‑star support lands.285284
Performance: RDF‑star vs singleton properties (high‑level)
Benchmarks comparing reification, singleton properties, and RDF‑star show:282283286287288289278
- Classic reification: worst in storage and query performance due to 4+ metadata triples and complex patterns per statement.
- Singleton properties: better storage and query characteristics than reification, but explode the property space and require special semantics.
- RDF‑star: generally best trade‑off—metadata expressed with minimal overhead, simpler graph patterns, better performance with engines optimized for triple terms.
The exact numbers depend on the engine, but for large governed KGs, vendors now generally recommend RDF‑star where available.286289278282
Named graphs / TriG‑star for metadata per context graph
You can combine RDF‑star with named graphs (TriG‑star) to annotate whole context graphs.276290291292293
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Context graph for one LangGraph step
ex:ctxGraph_step12 {
ex:patient123 ex:hasCondition ex:condition_ckd3a .
<< ex:patient123 ex:hasCondition ex:condition_ckd3a >>
ex:assertedBy ex:clinicalAgent ;
ex:assertedAt "2026-02-03T04:33:00Z"^^xsd:dateTime ;
ex:purposeOfUse "TREATMENT" .
}
# Graph-level metadata
ex:ctxGraph_step12
ex:threadId "pt-123-q-1" ;
ex:agentStep "clinical_agent" ;
ex:user "clinician:456" .
Converting Turtle‑star (single default graph) → TriG‑star is mostly a matter of wrapping triples into a named graph block and adding graph‑level metadata.
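That wrapping step can be sketched as a small string-level helper; the function name and the way graph-level metadata is encoded are assumptions for illustration, and a real pipeline would use an RDF library's TriG serializer instead:

```python
def turtle_star_to_trig_star(turtle_body: str, graph_iri: str,
                             graph_metadata: dict) -> str:
    """Wrap a Turtle-star document in a TriG-star named-graph block.

    @prefix lines stay at the top level; all other lines are indented
    inside the named graph. Graph-level metadata (predicate -> serialized
    object) is appended as plain triples about the graph IRI.
    """
    prefixes, triples = [], []
    for line in turtle_body.splitlines():
        (prefixes if line.lstrip().startswith("@prefix") else triples).append(line)
    out = prefixes + [f"{graph_iri} {{"]
    out += ["    " + t for t in triples if t.strip()]
    out.append("}")
    for pred, obj in graph_metadata.items():
        out.append(f"{graph_iri} {pred} {obj} .")
    return "\n".join(out)
```

This only handles the simple case of a flat Turtle-star body with one target graph; multi-line literals or nested blocks would need a real parser.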
Minimal migration: JSON context_graph → Turtle‑star
Assume your current JSON context graph:
{
"nodes": [
{ "id": "patient:123", "type": "Patient" },
{ "id": "condition:ckd-3a", "type": "Condition" }
],
"edges": [
{
"id": "e1",
"type": "HAS_CONDITION",
"from": "patient:123",
"to": "condition:ckd-3a",
"props": {
"assertedBy": "clinician:456",
"assertedAt": "2026-01-10T10:00:00Z",
"sourceEncounter": "encounter:789",
"confidence": 0.92
}
}
]
}
You can emit Turtle‑star like:
@prefix ex: <http://example.org/clinic#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:patient123 a ex:Patient .
ex:condition_ckd3a a ex:Condition .
ex:patient123 ex:hasCondition ex:condition_ckd3a .
<< ex:patient123 ex:hasCondition ex:condition_ckd3a >>
ex:assertedBy ex:clinician456 ;
ex:assertedAt "2026-01-10T10:00:00Z"^^xsd:dateTime ;
ex:sourceEncounter ex:encounter789 ;
ex:confidence "0.92"^^xsd:decimal .
Mapping rules:
- JSON node IDs → IRIs: "patient:123" → ex:patient123; "condition:ckd-3a" → ex:condition_ckd3a.
- Edge types → properties: "HAS_CONDITION" → ex:hasCondition.
- Edge props → metadata on the quoted triple.
You can implement this as a simple export function in Python that:
- Normalizes IDs and labels.
- Writes base triples for each edge.
- Writes a quoted triple block per edge with non‑structural properties as metadata.
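A minimal sketch of such an exporter, under the assumptions that IDs follow the kind:local pattern shown above and that props values are either id-like strings or scalars; a production exporter would need proper IRI minting, escaping, and datatype mapping (e.g., xsd:dateTime for timestamps):

```python
import re


def to_iri(raw_id: str) -> str:
    """Normalize 'patient:123' -> 'ex:patient123', 'condition:ckd-3a' -> 'ex:condition_ckd3a'.

    Simplified IRI minting: drop hyphens, join kind and local part with an
    underscore only when the local part starts with a letter."""
    kind, _, local = raw_id.partition(":")
    local = local.replace("-", "")
    sep = "_" if local and local[0].isalpha() else ""
    return f"ex:{kind}{sep}{local}"


def edge_type_to_property(edge_type: str) -> str:
    """'HAS_CONDITION' -> 'ex:hasCondition'."""
    head, *rest = edge_type.lower().split("_")
    return "ex:" + head + "".join(w.capitalize() for w in rest)


def to_literal(value) -> str:
    """Serialize a props value; id-like strings become IRIs, numbers xsd:decimal."""
    if isinstance(value, bool):
        return f'"{str(value).lower()}"^^xsd:boolean'
    if isinstance(value, (int, float)):
        return f'"{value}"^^xsd:decimal'
    if isinstance(value, str) and re.fullmatch(r"[a-z]+:[\w-]+", value):
        return to_iri(value)  # treat "clinician:456"-style strings as IRIs
    return f'"{value}"'


def context_graph_to_turtle_star(cg: dict) -> str:
    lines = [
        "@prefix ex: <http://example.org/clinic#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
    ]
    for node in cg.get("nodes", []):
        lines.append(f"{to_iri(node['id'])} a ex:{node['type']} .")
    for edge in cg.get("edges", []):
        s = to_iri(edge["from"])
        p = edge_type_to_property(edge["type"])
        o = to_iri(edge["to"])
        lines.append(f"{s} {p} {o} .")  # base triple
        props = edge.get("props", {})
        if props:  # quoted-triple annotation block per edge
            annotations = [f"    ex:{k} {to_literal(v)}" for k, v in props.items()]
            lines.append(f"<< {s} {p} {o} >>")
            lines.append(" ;\n".join(annotations) + " .")
    return "\n".join(lines)
```

Calling it on the JSON above yields Turtle-star equivalent to the hand-written example.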
If you decide on RDF‑star‑capable infrastructure (e.g., Jena/GraphDB) I can next sketch a full export function (context_graph JSON → Turtle‑star string) and a corresponding SPARQL‑star query you’d run over that data.
LAYERS of a CONTEXT GRAPH: 1. Synthetic grounding - grounding data that's captured as system-generated 2. System of records - metadata and records ontology 3. Initial domain grounding - more traditional knowledge graph structures
You can think of a context graph as stratified into three governed layers, each feeding the next: synthetic grounding, systems of record, and domain KG. This matches how emerging “context layers” are being described for enterprise AI.294295296297298299
1) Synthetic grounding layer (L1)
Purpose: capture system‑generated signals and operational traces that give you “what is happening now” around agents and users.
- Content:
- Agent runs, tool calls, prompts/responses, LangGraph steps, error events.
- Usage telemetry (features clicked, workflows invoked), session context, UI state.
- Synthetic assertions from heuristics or models (e.g., intent labels, risk scores, triage priority).
- Shape in the graph:
- Nodes: AgentRun, ToolCall, Session, UIEvent, RiskScore, Intent.
- Edges: TRIGGERED_BY (ToolCall → AgentRun), OBSERVED_IN (RiskScore → Encounter), DERIVED_FROM (SyntheticFact → SourceFact).
- Role:
For your governed context graph, this layer is where you’d attach decision traces and agent‑level audit (e.g., LangGraph step → AuditEvent nodes).
2) System‑of‑record layer (L2)
Purpose: mirror and index core SoR systems and metadata, but in a graph‑friendly, ontology‑aware way.295296297304305294
- Content:
- Shape:
- Nodes: PatientSoR, EncounterSoR, FHIRResourceRef, Table, Column, Dataset, LineageEvent.
- Edges: RESOLVES_TO (PatientSoR → PatientDomain), HAS_LINEAGE, STORED_IN, HAS_SOR_RECORD.
- Role:
In your clinic setting, this is where FHIR/OMOP mappings live; the context graph pulls patient/encounter/condition/measurement slices from here.
3) Initial domain grounding layer (L3 – domain KG)
Purpose: encode domain semantics—entities, relationships, and policies—for reasoning and RAG.302308309310311306
- Content:
- Shape:
- Nodes: domain‑level Patient, Encounter, Condition, etc. (your clinic context graph schema).
- Edges: HAS_ENCOUNTER, HAS_CONDITION, HAS_MEDICATION, GOVERNED_BY, HAS_CONSENT, RECORDED_IN.
- Role:
How these layers combine into a context graph
For a given task (e.g., “Explain why ACE inhibitor was stopped for patient 123”):
- L2 → L3 projection
- L3 domain grounding
- L1 synthetic grounding overlay
The context graph you inject into the agent’s state is then:
- Mostly L3 (domain KG view of the patient episode),
- Filtered and sourced from L2 (SoR),
- Annotated by L1 (synthetic and decision trace).
You can explicitly tag each node/edge with a layer property ("synthetic" | "sor" | "domain") and use that in governance and debugging.
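A hypothetical helper along those lines, splitting a context_graph dict into per-layer views for governance and debugging (function name and the "domain" default are assumptions):

```python
from typing import Any, Dict, List


def split_by_layer(context_graph: Dict[str, List[Dict[str, Any]]]) -> Dict[str, Dict[str, list]]:
    """Group nodes and edges by their 'layer' tag ("synthetic" | "sor" | "domain")."""
    views: Dict[str, Dict[str, list]] = {}
    for kind in ("nodes", "edges"):
        for item in context_graph.get(kind, []):
            layer = item.get("layer", "domain")  # untagged items default to domain
            views.setdefault(layer, {"nodes": [], "edges": []})[kind].append(item)
    return views
```

A compliance check can then, for example, count synthetic nodes or verify that every SoR edge resolves to a domain node.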
If you want, next I can turn this into a typed schema (Layer enum + node/edge props) and a LangGraph build_context_graph node that composes all three layers for a given patient and clinician.
turn this into a typed schema (Layer enum + node/edge props) and a LangGraph build_context_graph node that composes all three layers for a given patient and clinician.
Examples of synthetic grounding data in enterprise context graphs
How to implement a system-of-record ontology in Neo4j, ArangoDB, and PostgreSQL
Relationship between initial domain grounding and RAG
What is a context layer in enterprise AI (e.g., Atlan)
Fourth layer or extensions beyond these three in context graphs
Below is a compact, typed schema plus a build_context_graph node that composes the three layers for a given patient and clinician, plus brief notes on your related questions.
1) Typed schema with Layer enum
from typing import TypedDict, Literal, Dict, Any, List, Optional
Layer = Literal["synthetic", "sor", "domain"]
class Node(TypedDict, total=False):
id: str # e.g., "patient:123"
type: str # "Patient","Encounter","Condition","AgentRun",...
layer: Layer # "synthetic" | "sor" | "domain"
name: str
props: Dict[str, Any]
class Edge(TypedDict, total=False):
id: str
type: str # "HAS_ENCOUNTER","HAS_CONDITION","DERIVED_FROM",...
from_: str
to: str
layer: Layer
props: Dict[str, Any]
class ContextGraph(TypedDict):
nodes: List[Node]
edges: List[Edge]
class AgentState(TypedDict):
messages: List[Dict[str, Any]]
patient_id: Optional[str] # "patient:123"
clinician_id: Optional[str] # "clinician:456"
query: Optional[str]
context_graph: Optional[ContextGraph]
governance_metrics: Dict[str, Any]
class ContextConfig(TypedDict):
max_nodes: int
max_depth: int
Layer semantics:
- domain: Patient, Encounter, Condition, Medication, Observation, Policy, Consent, AuditEvent (KG view).318319320321
- sor: raw SoR entities or references (OMOP/FHIR rows, table/column/dataset nodes, lineage entries).320321322323
- synthetic: AgentRun, ToolCall, RiskScore, Intent, plus synthetic facts derived by agents or models.324325326327328
2) LangGraph build_context_graph node (3‑layer composition)
Assume you have three backends:
- sor_client: queries SoR (Postgres/OMOP or FHIR) and returns SoR‑layer nodes/edges.
- domain_client: maps SoR entities into the domain KG and expands the clinical neighborhood.
- synthetic_client: returns recent agent runs, tool calls, and risk scores for this patient/clinician/thread.
from langgraph.graph import StateGraph, START, END
from langgraph.runtime import Runtime  # LangGraph >= 0.6
class SoRClient:
def patient_slice(self, patient_id: str, max_nodes: int) -> ContextGraph:
...
class DomainKGClient:
def patient_episode_graph(
self,
patient_id: str,
max_depth: int,
max_nodes: int,
) -> ContextGraph:
...
class SyntheticClient:
def recent_context(
self,
patient_id: str,
clinician_id: str,
thread_id: str,
max_events: int = 50,
) -> ContextGraph:
...
sor_client = SoRClient()
domain_client = DomainKGClient()
synthetic_client = SyntheticClient()
def merge_graphs(graphs: List[ContextGraph], max_nodes: int) -> ContextGraph:
nodes: Dict[str, Node] = {}
edges: Dict[str, Edge] = {}
for g in graphs:
for n in g["nodes"]:
nodes[n["id"]] = n
for e in g["edges"]:
edges[e["id"]] = e
# Simple cap
return {
"nodes": list(nodes.values())[:max_nodes],
"edges": list(edges.values())[: 2 * max_nodes],
}
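One caveat with the simple cap in merge_graphs: slicing the node list can leave edges whose endpoints were dropped. A small follow-up pass over the same ContextGraph dict shape can prune those dangling edges:

```python
def prune_dangling_edges(graph: dict) -> dict:
    """Drop edges whose endpoints are no longer present after capping nodes."""
    kept_ids = {n["id"] for n in graph["nodes"]}
    graph["edges"] = [
        e for e in graph["edges"]
        if e.get("from_") in kept_ids and e.get("to") in kept_ids
    ]
    return graph
```

You would call this right after merge_graphs, before computing governance metrics, so the metrics reflect a structurally consistent graph.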
def build_context_graph(
state: AgentState,
runtime: Runtime[ContextConfig],
) -> AgentState:
cfg = runtime.context
patient_id = state.get("patient_id")
clinician_id = state.get("clinician_id")
if not patient_id:
return state
max_nodes = cfg.get("max_nodes", 256)
max_depth = cfg.get("max_depth", 2)
# Assumes upstream code stashed thread_id into state["props"] (as the snapshot
# logger below does); langgraph's Runtime exposes context/store, not the
# RunnableConfig, so the thread id is not available via runtime here
thread_id = state.get("props", {}).get("thread_id", "default-thread")
# 1) SoR slice (layer = "sor")
sor_graph = sor_client.patient_slice(patient_id, max_nodes=max_nodes)
for n in sor_graph["nodes"]:
n.setdefault("layer", "sor")
for e in sor_graph["edges"]:
e.setdefault("layer", "sor")
# 2) Domain KG neighborhood (layer = "domain")
domain_graph = domain_client.patient_episode_graph(
patient_id=patient_id,
max_depth=max_depth,
max_nodes=max_nodes,
)
for n in domain_graph["nodes"]:
n.setdefault("layer", "domain")
for e in domain_graph["edges"]:
e.setdefault("layer", "domain")
# 3) Synthetic grounding (agent runs, tool calls) (layer = "synthetic")
synthetic_graph = synthetic_client.recent_context(
patient_id=patient_id,
clinician_id=clinician_id or "",
thread_id=thread_id,
max_events=50,
)
for n in synthetic_graph["nodes"]:
n.setdefault("layer", "synthetic")
for e in synthetic_graph["edges"]:
e.setdefault("layer", "synthetic")
ctx_graph = merge_graphs([sor_graph, domain_graph, synthetic_graph], max_nodes)
# Simple governance metrics
phi_nodes = [n for n in ctx_graph["nodes"] if n.get("props", {}).get("isPHI")]
policies = [n for n in ctx_graph["nodes"] if n["type"] == "Policy"]
synthetic_fraction = (
len([n for n in ctx_graph["nodes"] if n["layer"] == "synthetic"]) / max(len(ctx_graph["nodes"]), 1)
)
state["context_graph"] = ctx_graph
state["governance_metrics"] = {
"phi_node_count": len(phi_nodes),
"policy_count": len(policies),
"synthetic_node_fraction": synthetic_fraction,
}
return state
You wire this node as earlier, with StateGraph(AgentState, context_schema=ContextConfig) and a checkpointer.
3) Examples of synthetic grounding data
Concrete synthetic nodes/edges you’d add in SyntheticClient:325329326328330324
- Nodes:
  - AgentRun: LangGraph execution step, including agentName, timestamp, status.
  - ToolCall: invoked tool, inputs/outputs summary, latency, error flags.
  - RiskScore: output of a model estimating readmission risk, sepsis risk, etc.
  - Intent: classifier output for the current query (e.g., “medication_explanation”).
- Edges:
  - ABOUT_PATIENT: AgentRun → Patient.
  - USED_CONTEXT_FROM: AgentRun → Encounter/Document/Policy nodes.
  - TRIGGERED_TOOL: AgentRun → ToolCall.
  - DERIVED_FROM: RiskScore → (Patient, Encounter, Observations).
This is consistent with how “context layers” capture decision traces and operational signals for agents.331329330324325
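As a concrete sketch, a SyntheticClient might emit an AgentRun trace like this; the ids, props, and status values are illustrative assumptions, not a fixed schema:

```python
import uuid
from datetime import datetime, timezone


def agent_run_trace(patient_id: str, agent_name: str, tool_name: str) -> dict:
    """Build synthetic-layer nodes/edges for one agent step and its tool call."""
    run_id = f"agent_run:{uuid.uuid4().hex[:8]}"
    call_id = f"tool_call:{uuid.uuid4().hex[:8]}"
    now = datetime.now(timezone.utc).isoformat()
    nodes = [
        {"id": run_id, "type": "AgentRun", "layer": "synthetic",
         "props": {"agentName": agent_name, "timestamp": now, "status": "ok"}},
        {"id": call_id, "type": "ToolCall", "layer": "synthetic",
         "props": {"tool": tool_name}},
    ]
    edges = [
        {"id": f"edge:{run_id}-about", "type": "ABOUT_PATIENT",
         "from_": run_id, "to": patient_id, "layer": "synthetic", "props": {}},
        {"id": f"edge:{run_id}-tool", "type": "TRIGGERED_TOOL",
         "from_": run_id, "to": call_id, "layer": "synthetic", "props": {}},
    ]
    return {"nodes": nodes, "edges": edges}
```

The returned dict plugs straight into the merge step used by build_context_graph.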
4) System‑of‑record ontology in Neo4j / ArangoDB / Postgres
High‑level patterns:
- Neo4j
  - Labels: :PatientSoR, :VisitSoR, :Table, :Column, :Dataset.
  - Relationships: (:VisitSoR)-[:VISIT_OF]->(:PatientSoR), (:Column)-[:IN_TABLE]->(:Table), (:Column)-[:LINEAGE_FROM]->(:Column).
  - Map FHIR/OMOP IDs into node properties and connect them to domain Patient/Encounter via RESOLVES_TO.
- ArangoDB
  - Collections: PatientsSoR, VisitsSoR, Tables, Columns, plus edge collections patientVisitEdges, columnLineageEdges.
  - Use AQL graph traversals to link SoR records to domain nodes via RESOLVES_TO / HAS_SOR_RECORD edges.
- Postgres
  - Keep OMOP/FHIR tables as‑is; add “graph overlay” tables like entity_node(entity_id, entity_type, sor_table, sor_pk) and entity_edge(from_id, to_id, type).
  - Your SoRClient.patient_slice is essentially a query + adapter that builds Node(layer='sor') / Edge(layer='sor') objects.
This makes the SoR layer explicit and queryable, while the domain layer is your cleaned, ontology‑driven graph.321322323332320
5) Relationship between initial domain grounding and RAG
Domain grounding (layer 3) is the semantic backbone for GraphRAG:327333334318320321
- Embeddings retrieve documents/chunks.
- Domain KG connects patients, encounters, conditions, guidelines, and policies.
- Context graph builder:
- Takes retrieved docs + KG nodes,
- Expands along domain edges (e.g., Condition → Medication → Lab Observations),
- Produces a compact, role‑aware subgraph.
RAG then consumes that subgraph (plus text) as structured grounding instead of a flat docs list, improving precision and interpretability.
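One possible shape for that consumption step, sketched as a plain serializer from the ContextGraph dict into structured prompt text; the output format is an assumption for illustration, not a fixed API:

```python
def context_graph_to_prompt(graph: dict) -> str:
    """Serialize a context graph into compact structured text for a model prompt."""
    lines = ["## Context graph", "Entities:"]
    for n in graph.get("nodes", []):
        label = n.get("name") or n["id"]
        lines.append(f"- {label} ({n.get('type', '?')}, layer={n.get('layer', '?')})")
    lines.append("Relations:")
    for e in graph.get("edges", []):
        lines.append(f"- {e.get('from_')} -[{e.get('type')}]-> {e.get('to')}")
    return "\n".join(lines)
```

Keeping the serialization deterministic (stable ordering, one line per fact) makes the prompt diffable across agent turns, which helps when auditing what the model actually saw.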
6) What is a “context layer” in enterprise AI (Atlan‑style)
Vendors like Atlan define a context layer as a unified, governed system that captures and serves organizational knowledge (semantics, lineage, usage, governance, and decision traces) to AI systems in real time.335336337332325331
Typical components:
- Connectors to SoR and SaaS tools (unification).
- Metadata / lineage / business glossary (semantic and governance).
- Usage, collaboration, and quality signals (operational context).
- Activation interfaces (MCP, APIs) for agents and copilots.336337338339335
Your three layers map well to this: SoR + domain KG + synthetic grounding + activation.
7) A fourth layer beyond the three
A natural “L4” in your stack:
- Policy and simulation layer (or “governance / planning layer”)
- Encodes higher‑order rules, simulations, and hypothetical states.
- Nodes:
PolicyModel,Scenario,WhatIfPlan,EvaluationResult. - Edges:
APPLIES_TO(PolicyModel → Policy/Role),EVALUATED_ON(Scenario → ContextGraphSnapshot),RECOMMENDS_ACTION.
This layer lets you run what‑if analyses, policy simulations, and replay over stored context graphs and agent traces, then feed updated policies back down into the domain and synthetic layers as they evolve.340328324325
If you want, next I can expand the three‑layer schema into a four‑layer one (with a layer: "policy" and Scenario/ContextGraphSnapshot nodes) and show how to log and replay context graphs through LangGraph’s checkpointer to feed that simulation layer.
Expand the three‑layer schema into a four‑layer one (with a layer: "policy" and Scenario/ContextGraphSnapshot nodes) and show how to log and replay context graphs through LangGraph’s checkpointer to feed that simulation layer.
Example input data for build_context_graph with a sample patient
SPARQL query to traverse the three-layer context graph
Add compliance governance to Activation layer nodes
Healthcare examples of Unification layer connectors
Extend schema for multi-clinician shared context
Here’s a compact expansion to a four‑layer schema plus how to log and replay context graphs with LangGraph, then short answers to the related points.
1) Four‑layer schema (add policy layer, Scenario, Snapshot)
Extend Layer and node types:
from typing import TypedDict, Literal, Dict, Any, List, Optional
Layer = Literal["synthetic", "sor", "domain", "policy"]
class Node(TypedDict, total=False):
id: str
type: str
layer: Layer
name: str
props: Dict[str, Any]
class Edge(TypedDict, total=False):
id: str
type: str
from_: str
to: str
layer: Layer
props: Dict[str, Any]
class ContextGraph(TypedDict):
nodes: List[Node]
edges: List[Edge]
class AgentState(TypedDict):
messages: List[Dict[str, Any]]
patient_id: Optional[str]
clinician_id: Optional[str]
query: Optional[str]
context_graph: Optional[ContextGraph]
governance_metrics: Dict[str, Any]
current_scenario_id: Optional[str] # policy layer
class ContextConfig(TypedDict):
max_nodes: int
max_depth: int
New policy‑layer node types:
- Scenario (simulated or what‑if context; e.g., “remove ACEI, add ARB”).
- ContextGraphSnapshot (snapshot of a prior context_graph for replay/simulation).
- PolicyModel (e.g., sepsis policy engine, HIPAA policy set).
- EvaluationResult (result of applying a scenario/policy to a snapshot).
Each gets layer="policy" in Node.layer.
Example policy‑layer nodes:
scenario_node: Node = {
"id": "scenario:stop-acei",
"type": "Scenario",
"layer": "policy",
"name": "Stop ACE inhibitor",
"props": {"description": "Simulate stopping ACE inhibitor due to hyperkalemia"}
}
snapshot_node: Node = {
"id": "snapshot:thread-pt123-step12",
"type": "ContextGraphSnapshot",
"layer": "policy",
"name": "Context at step12",
"props": {
"threadId": "pt-123-q-1",
"checkpointId": "ckpt-uuid",
"createdAt": "2026-02-03T04:33:00Z"
}
}
Edges in policy layer (examples):
- SCENARIO_APPLIES_TO: Scenario → ContextGraphSnapshot.
- EVALUATES_POLICY: PolicyModel → Scenario.
- HAS_EVALUATION: Scenario → EvaluationResult.
All with layer="policy".
2) Logging and replaying context graphs via LangGraph checkpointer
LangGraph stores state per checkpoint (thread, step); you can read past states, including context_graph, and fork or replay from them.347348349350351352
Assume graph = builder.compile(checkpointer=postgres_or_redis_saver).
Log snapshots into the graph
A node to emit a ContextGraphSnapshot policy node:
import uuid
from datetime import datetime, timezone
def log_snapshot(state: AgentState) -> AgentState:
ctx_graph = state.get("context_graph")
if not ctx_graph:
return state
# Assumes upstream code stashed checkpoint/thread ids into state["props"]
checkpoint_id = state.get("props", {}).get("checkpoint_id", "unknown")
thread_id = state.get("props", {}).get("thread_id", "unknown")
snapshot_id = f"snapshot:{thread_id}:{checkpoint_id}"
snapshot_node: Node = {
"id": snapshot_id,
"type": "ContextGraphSnapshot",
"layer": "policy",
"name": f"Snapshot {checkpoint_id}",
"props": {
"threadId": thread_id,
"checkpointId": checkpoint_id,
"createdAt": datetime.now(timezone.utc).isoformat()
}
}
# Append snapshot node into context_graph
nodes = {n["id"]: n for n in ctx_graph["nodes"]}
nodes[snapshot_id] = snapshot_node
ctx_graph["nodes"] = list(nodes.values())
state["context_graph"] = ctx_graph
state["current_scenario_id"] = snapshot_id
return state
You can call this node after build_context_graph and before activation/clinical agents.
Replay from a snapshot (fork a scenario)
Using LangGraph’s time‑travel/forking API:348350352
# Get history for a thread
thread = {"configurable": {"thread_id": "pt-123-q-1"}}
history = list(graph.get_state_history(thread))
# Pick a checkpoint to simulate from (get_state_history returns newest-first)
snapshot_state = history[1]  # checkpoint just before the latest step, e.g., before the compliance agent
# Fork: new scenario thread id
scenario_thread = {
"configurable": {
"thread_id": "pt-123-q-1-sim-stop-acei",
"checkpoint_id": snapshot_state.config["configurable"]["checkpoint_id"],
}
}
# Resume from that checkpoint with different scenario in state
for event in graph.stream(
{"messages": [], "current_scenario_id": "scenario:stop-acei"},
config=scenario_thread,
stream_mode="values",
):
pass # process simulation events
LangGraph won’t re‑execute steps before the checkpoint; it will continue from that point and create new checkpoints on the forked branch.353352348
3) Example input data for build_context_graph
Example AgentState input for a clinical question:
state: AgentState = {
"messages": [
{"role": "user", "content": "Why was my ACE inhibitor stopped?"},
],
"patient_id": "patient:123",
"clinician_id": "clinician:456",
"query": "Explain ACE inhibitor discontinuation",
"context_graph": None,
"governance_metrics": {},
"current_scenario_id": None,
}
config = {"configurable": {"thread_id": "pt-123-q-1"}}
context = {"max_nodes": 256, "max_depth": 2}
build_context_graph then builds a multi‑layer context_graph with layer tags as above.
4) SPARQL query to traverse three‑layer context
Assuming you export your context graph to RDF with ex:layer as a property, and IRIs like ex:patient123, ex:encounter789, etc.
Example: find all domain conditions for a patient and their SoR origins and synthetic agent runs that used them:
PREFIX ex: <http://example.org/clinic#>
SELECT ?cond ?sorRec ?agentRun
WHERE {
# Domain layer condition
?patient a ex:Patient ;
ex:hasEncounter ?enc .
?enc ex:hasCondition ?cond .
?patient ex:patientId "123" . # or use IRI ex:patient123
?cond ex:layer "domain" .
# SoR records linked to this condition
OPTIONAL {
?sorRec ex:resolvesTo ?cond ;
ex:layer "sor" .
}
# Synthetic agent runs that used this condition in context
OPTIONAL {
?agentRun a ex:AgentRun ;
ex:usedContext ?cond ;
ex:layer "synthetic" .
}
}
This illustrates traversing domain → SoR → synthetic by filtering on layer.354355356357358
5) Add compliance governance to activation layer nodes
Activation layer = the nodes that represent agent actions or external side effects (e.g., writing to EHR, sending a message, executing a tool).359360361362363
Extend node types:
Activation: a concrete action (e.g., “send explanation message”, “propose medication change”).
Add props and edges:
activation_node: Node = {
"id": "activation:msg-1",
"type": "Activation",
"layer": "synthetic",
"name": "Send explanation to patient portal",
"props": {
"targetChannel": "patient_portal",
"containsPHI": True,
"governanceStatus": "pending_approval" # or 'allowed','blocked'
}
}
# Edges
# Activation governed by policies
# Activation reviewed/approved by compliance agent or human
Edges:
- GOVERNED_BY: Activation → Policy (HIPAA minimum necessary, local policy).
- HAS_REVIEW: Activation → AuditEvent or ComplianceDecision.
Compliance agent in LangGraph can then:
- Inspect context_graph for Activation nodes with containsPHI=true.
- Check for GOVERNED_BY edges to appropriate Policy and Consent nodes.
- Set governanceStatus to allowed or blocked and emit an AuditEvent node (layer domain or policy depending on modeling).360364365366359
6) Healthcare examples of Unification layer connectors
Unification (SoR) connectors bring clinical data into the system:367368369370
- FHIR servers (R4/R5) via REST APIs: Patient, Encounter, Observation, Condition, MedicationStatement.
- HL7 v2 feeds for ADT, ORU, ORM messages (ADTs to visits, ORUs to observations).
- Direct DB connectors into EHR databases or data warehouses (Epic Clarity, Cerner HealtheIntent, OMOP CDM).
- RIS/LIS/PACS systems for imaging and lab results.
- HIEs and registries for cross‑facility data sharing.
Each connector populates SoR nodes and edges (PatientSoR, VisitSoR, ObservationSoR) and links them to domain Patient / Encounter nodes via RESOLVES_TO / HAS_SOR_RECORD edges.
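As an illustrative sketch, a FHIR connector might adapt a Condition resource (parsed JSON) into SoR-layer nodes/edges like this; the field paths follow FHIR R4, but the node/edge shapes and id conventions are assumptions from the schema above:

```python
def fhir_condition_to_sor(resource: dict) -> dict:
    """Map a FHIR R4 Condition resource (as a dict) into SoR-layer nodes/edges.

    Sketch only: assumes 'subject.reference' like 'Patient/123' and a coded
    condition in 'code.coding[0]'; real FHIR data needs far more handling
    (missing codings, contained resources, absolute references, etc.).
    """
    cond_id = f"condition_sor:{resource['id']}"
    patient_ref = resource["subject"]["reference"]          # e.g. "Patient/123"
    patient_id = f"patient_sor:{patient_ref.split('/')[-1]}"
    coding = resource.get("code", {}).get("coding", [{}])[0]
    nodes = [{
        "id": cond_id, "type": "ConditionSoR", "layer": "sor",
        "props": {"system": coding.get("system"), "code": coding.get("code"),
                  "display": coding.get("display")},
    }]
    edges = [{
        "id": f"edge:{cond_id}-of", "type": "CONDITION_OF",
        "from_": cond_id, "to": patient_id, "layer": "sor", "props": {},
    }]
    return {"nodes": nodes, "edges": edges}
```

A separate resolution step would then add RESOLVES_TO edges from these SoR nodes to their domain-layer counterparts.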
7) Extend schema for multi‑clinician shared context
To support multi‑clinician collaboration on the same patient/thread:
- Add node types:
  - Clinician (domain) and User (app‑level).
  - SharedContextSession (collaborative session over one or more context graphs).
- Edges:
  - PARTICIPATES_IN: Clinician/User → SharedContextSession.
  - USES_SNAPSHOT: SharedContextSession → ContextGraphSnapshot.
  - HAS_NOTE: SharedContextSession → Document (collaborative notes).
Within context_graph:
- Allow multiple Clinician nodes (layer="domain") attached to the same patient/encounter.
- The synthetic layer can include AgentRun nodes tagged with clinicianId and sessionId, so multiple clinicians see the same synthetic grounding and can add their own “synthetic facts” (e.g., proposed plans) that are properly attributed.
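A hypothetical builder for such a session, reusing the Node/Edge dict shapes from earlier; tagging the session nodes with layer="domain" is an assumption, since the text leaves that open:

```python
def shared_session(session_id: str, clinician_ids: list, snapshot_id: str) -> dict:
    """Build nodes/edges for a multi-clinician shared context session."""
    nodes = [{"id": session_id, "type": "SharedContextSession",
              "layer": "domain", "props": {}}]
    edges = [{"id": f"edge:{session_id}-snap", "type": "USES_SNAPSHOT",
              "from_": session_id, "to": snapshot_id,
              "layer": "domain", "props": {}}]
    for cid in clinician_ids:
        nodes.append({"id": cid, "type": "Clinician", "layer": "domain", "props": {}})
        edges.append({"id": f"edge:{cid}-in-{session_id}", "type": "PARTICIPATES_IN",
                      "from_": cid, "to": session_id,
                      "layer": "domain", "props": {}})
    return {"nodes": nodes, "edges": edges}
```

Merging this fragment into each participant's context graph gives every clinician the same session view while keeping per-clinician attribution on the synthetic layer.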
If you want, next I can show a complete ContextGraph JSON example containing all four layers (domain, SoR, synthetic, policy) for a sample multi‑clinician scenario and a LangGraph compliance agent that uses governanceStatus on Activation nodes to gate real side‑effects.
Example code for policy layer nodes in four-layer schema
How to implement a checkpointer with Postgres for context graphs
Using LangGraph time-travel to fork policy scenarios
Integrate FHIR data into ContextGraphSnapshot nodes
Best practices for managing checkpoints in production LangGraph
Here’s a compact, code‑first answer for all five items.
1) Example code for policy‑layer nodes
Define a helper to append Scenario and ContextGraphSnapshot nodes to the current context_graph:
from datetime import datetime, timezone
def add_policy_nodes(
state: AgentState,
scenario_name: str,
scenario_description: str,
) -> AgentState:
ctx_graph = state.get("context_graph") or {"nodes": [], "edges": []}
nodes = {n["id"]: n for n in ctx_graph["nodes"]}
# Scenario node (policy layer)
scenario_id = f"scenario:{scenario_name}"
scenario_node: Node = {
"id": scenario_id,
"type": "Scenario",
"layer": "policy",
"name": scenario_name,
"props": {"description": scenario_description},
}
nodes[scenario_id] = scenario_node
# Snapshot node pointing to the current checkpoint/thread
props = state.get("props", {})
checkpoint_id = props.get("checkpoint_id", "unknown")
thread_id = props.get("thread_id", "unknown")
snapshot_id = f"snapshot:{thread_id}:{checkpoint_id}"
snapshot_node: Node = {
"id": snapshot_id,
"type": "ContextGraphSnapshot",
"layer": "policy",
"name": f"Snapshot {checkpoint_id}",
"props": {
"threadId": thread_id,
"checkpointId": checkpoint_id,
"createdAt": datetime.now(timezone.utc).isoformat(),
},
}
nodes[snapshot_id] = snapshot_node
# Link scenario to snapshot
edge_id = f"edge:scenario-applies-{scenario_id}-{snapshot_id}"
ctx_graph["edges"].append({
"id": edge_id,
"type": "SCENARIO_APPLIES_TO",
"from_": scenario_id,
"to": snapshot_id,
"layer": "policy",
"props": {},
})
ctx_graph["nodes"] = list(nodes.values())
state["context_graph"] = ctx_graph
state["current_scenario_id"] = scenario_id
return state
You call add_policy_nodes after building the context graph but before running a simulation branch.
2) Implement Postgres checkpointer for context graphs
Basic PostgresSaver setup for durable checkpoints (short‑term) with context graphs in state:373374375376
from psycopg_pool import ConnectionPool
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import StateGraph, START, END
DB_URI = "postgresql://user:pass@host:5432/langgraph?sslmode=require"
pool = ConnectionPool(conninfo=DB_URI, max_size=10)
# PostgresSaver accepts a Connection or a ConnectionPool; pass the pool so the
# checkpointer stays usable after setup (a `with pool.connection()` block would
# close the connection before the graph runs).
checkpointer = PostgresSaver(pool)
checkpointer.setup()  # creates checkpoint tables on first run
builder = StateGraph(AgentState, context_schema=ContextConfig)
builder.add_node("build_context", build_context_graph)
builder.add_edge(START, "build_context")
builder.add_edge("build_context", END)
graph = builder.compile(checkpointer=checkpointer)
Best practice: treat checkpoints as operational state, not long‑term memory; promote anything important (e.g., snapshots) into your own tables or a LangGraph Store before pruning.377378379375373
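That promotion step can be sketched as a small helper that copies a ContextGraphSnapshot node into your own context_snapshots table so it survives checkpoint GC. Shown with sqlite3 so it runs standalone; in production this would target the same Postgres instance, and the table shape is an assumption:

```python
# Sketch of "promote before pruning": persist a snapshot node into an
# application-owned table, independent of LangGraph's checkpoint tables.
import json
import sqlite3

def promote_snapshot(conn, snapshot_node: dict) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS context_snapshots (
               id TEXT PRIMARY KEY,
               thread_id TEXT,
               checkpoint_id TEXT,
               payload TEXT
           )"""
    )
    props = snapshot_node["props"]
    conn.execute(
        "INSERT OR REPLACE INTO context_snapshots VALUES (?, ?, ?, ?)",
        (snapshot_node["id"], props["threadId"], props["checkpointId"],
         json.dumps(snapshot_node)),  # full node kept as JSON payload
    )
    conn.commit()
```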
3) Using LangGraph time‑travel to fork policy scenarios
Pattern: get history for a thread, pick a checkpoint, then resume from it with a new current_scenario_id.380381382383
# Original thread config
base_config = {"configurable": {"thread_id": "pt-123-q-1"}}
# Run once
graph.invoke(initial_state, config=base_config)
# 1) Identify a checkpoint
states = list(graph.get_state_history(base_config)) # newest first
chosen = states[-2] # e.g., before compliance node
checkpoint_id = chosen.config["configurable"]["checkpoint_id"]
# 2) Fork a new scenario thread from that checkpoint
scenario_config = {
"configurable": {
"thread_id": "pt-123-q-1-scenario-stop-acei",
"checkpoint_id": checkpoint_id,
}
}
# 3) Resume with modified state (e.g., new scenario)
graph.update_state(
scenario_config,
{
"current_scenario_id": "scenario:stop-acei",
# optionally tweak context_graph/Activation nodes here
},
)
# 4) Resume execution (no new input needed)
for event in graph.stream(None, config=scenario_config, stream_mode="values"):
pass # collect simulation outputs
This gives you a forked policy scenario without rerunning the earlier steps.384382385
4) Integrate FHIR data into ContextGraphSnapshot nodes
Assuming you have FHIR resources in JSON and/or a FHIR→OMOP mapping:386387388
- When creating a ContextGraphSnapshot, embed lightweight FHIR references or keys in props, not full resources (to avoid bloat).
Example snapshot with FHIR references:
snapshot_node: Node = {
"id": "snapshot:pt-123-q-1:ckpt-abc",
"type": "ContextGraphSnapshot",
"layer": "policy",
"name": "Snapshot before medication change",
"props": {
"threadId": "pt-123-q-1",
"checkpointId": "ckpt-abc",
"createdAt": "2026-02-03T04:33:00Z",
"fhirPatientRef": "Patient/123",
"fhirEncounterRefs": ["Encounter/789"],
"fhirBundleIds": ["Bundle/xyz"] # e.g., cached in FHIR store
},
}
Your SoR/domain clients can then resolve these to OMOP or FHIR resources as needed (e.g., using FHIR Ontop OMOP or an OMOP‑on‑FHIR DB).387388386
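That resolution step can be sketched as pure URL construction over the stored references, following standard FHIR REST conventions (the base URL is yours; fetching and OMOP mapping are left to your clients):

```python
# Illustrative resolver for the FHIR references stored in snapshot props:
# turns "Patient/123" into a request URL against an assumed FHIR base URL.

def fhir_ref_to_url(base_url: str, ref: str) -> str:
    resource_type, resource_id = ref.split("/", 1)
    return f"{base_url.rstrip('/')}/{resource_type}/{resource_id}"

def snapshot_fhir_urls(snapshot_node: dict, base_url: str) -> list[str]:
    props = snapshot_node["props"]
    refs = [props.get("fhirPatientRef")] + props.get("fhirEncounterRefs", [])
    return [fhir_ref_to_url(base_url, r) for r in refs if r]
```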
5) Best practices for managing checkpoints in production
Patterns that show up repeatedly:378379375373377
- Use Postgres (or DynamoDB, etc.) for production checkpointers; InMemorySaver is for testing only.
- Treat thread_id as a first‑class key:
  - One user session / patient case / workflow = one thread_id.
  - Include thread_id in logs and Snapshot nodes for correlation.
- Garbage‑collect old checkpoints (e.g., keep last N per thread_id or a time window).
- Separate “memory” from checkpoints: promote durable facts into a Store or your own tables.
- Observe and cap:
  - Monitor checkpoint table growth and query latency.
  - Consider partitioning by date or thread_id for very high volumes.
- Time‑travel discipline: fork scenarios into new thread_ids rather than mutating existing history.
If you want, next I can show a concrete Postgres schema and GC job for pruning checkpoints (keeping last N per thread_id) and a small helper that creates ContextGraphSnapshot records in your own context_snapshots table alongside LangGraph’s internal checkpoint tables.
Best practices for pruning old Postgres checkpoints in LangGraph
LangGraph Store abstraction for long-term memory beyond checkpoints
Example of update_state for branching policy scenarios in LangGraph
FHIR Ontop OMOP setup tutorial with Postgres
How to add pgvector for vector search in LangGraph PostgresSaver
context drift??
Here’s a concise set of patterns you can reuse.
1) Pruning old Postgres checkpoints (LangGraph)
Checkpoints = operational log, not long‑term memory; prune by time/length per thread.391392393
Typical pattern:
- Keep last N checkpoints or T hours per thread_id.
- Nightly GC job that deletes old rows from checkpoints, checkpoint_writes, checkpoint_blobs.
Pseudo‑SQL sketch (you’ll adapt to actual table names):
-- Example: keep last 20 checkpoints per thread.
-- Note: a CTE is scoped to a single statement, so repeat it per DELETE
-- (or materialize the doomed ids into a temp table first).
WITH ranked AS (
  SELECT id, thread_id,
         ROW_NUMBER() OVER (PARTITION BY thread_id ORDER BY created_at DESC) AS rn
  FROM langgraph_checkpoints
)
DELETE FROM langgraph_checkpoint_blobs
WHERE checkpoint_id IN (SELECT id FROM ranked WHERE rn > 20);

WITH ranked AS (
  SELECT id, thread_id,
         ROW_NUMBER() OVER (PARTITION BY thread_id ORDER BY created_at DESC) AS rn
  FROM langgraph_checkpoints
)
DELETE FROM langgraph_checkpoint_writes
WHERE checkpoint_id IN (SELECT id FROM ranked WHERE rn > 20);

WITH ranked AS (
  SELECT id, thread_id,
         ROW_NUMBER() OVER (PARTITION BY thread_id ORDER BY created_at DESC) AS rn
  FROM langgraph_checkpoints
)
DELETE FROM langgraph_checkpoints
WHERE id IN (SELECT id FROM ranked WHERE rn > 20);
Wire that into a scheduled job (Cron, Cloud Scheduler) and monitor table size/latency.392394393391
2) LangGraph Store abstraction for long‑term memory
LangGraph distinguishes:395396397
- Short‑term: state + checkpoints (thread‑scoped).
- Long‑term: Stores (document/record stores, often with vector search).
Pattern:
- Use PostgresSaver only for checkpoints.
- Use a Store (or your own tables/vector DB) for durable facts/preferences.
- Nodes call Store tools to get/put/search memories, then merge into state.
Docs + Store interfaces cover both scalar + vector memory; on LangGraph Cloud, the Store can be Postgres with vector similarity built‑in.396398397395
3) update_state example for branching policy scenarios
Use update_state to branch from a checkpoint into a new scenario thread.399400401402
# Base config for original conversation
base_cfg = {"configurable": {"thread_id": "pt-123-q-1"}}
# Get history and pick a checkpoint to fork from
history = list(graph.get_state_history(base_cfg))
chosen = history[-2] # e.g., pre-compliance step
checkpoint_id = chosen.config["configurable"]["checkpoint_id"]
# New scenario thread config
scenario_cfg = {
"configurable": {
"thread_id": "pt-123-q-1-scenario-stop-acei",
"checkpoint_id": checkpoint_id,
}
}
# Branch state: set scenario id, tweak context_graph if desired
graph.update_state(
scenario_cfg,
{
"current_scenario_id": "scenario:stop-acei",
# optionally: "context_graph": modified_graph
},
)
# Resume execution on the fork
for event in graph.stream(None, config=scenario_cfg, stream_mode="values"):
pass
This pattern gives you explicit policy scenario branches without rerunning earlier nodes.401403399
4) FHIR Ontop OMOP setup with Postgres (pointer)
For exposing OMOP Postgres as a FHIR‑compatible virtual KG, FHIROntopOMOP shows a full setup:404405406
High‑level steps:
- OMOP on Postgres: deploy the OMOP schema + sample data (e.g., the omop_v5 image).405
- Configure Ontop to connect to Postgres:
  - Set listen_addresses='*' plus a proper pg_hba.conf so Docker/VM can reach Postgres.404
  - Provide JDBC URL/credentials.
- Provide mappings from OMOP tables to the FHIR model using Ontop (OBDA mappings or the FHIR‑OMOP IG).406404
- Run the Ontop server; query with SPARQL or FHIR‑like APIs over the virtual KG.
For your context graph, you can either:
- Pull FHIR resources via this virtual KG and project them into your domain graph, or
- Pull OMOP rows directly and store FHIR references in ContextGraphSnapshot props.
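The second option is a small projection. Here is a sketch assuming OMOP CDM column names (person_id, visit_occurrence_id); the mapping itself is an illustrative assumption:

```python
# Hypothetical projection: an OMOP row (as a dict) into the
# ContextGraphSnapshot props shape used above, storing FHIR-style
# references rather than full resources.

def omop_row_to_snapshot_props(row: dict, thread_id: str,
                               checkpoint_id: str) -> dict:
    return {
        "threadId": thread_id,
        "checkpointId": checkpoint_id,
        "fhirPatientRef": f"Patient/{row['person_id']}",
        "fhirEncounterRefs": [
            f"Encounter/{vid}" for vid in row.get("visit_occurrence_ids", [])
        ],
    }
```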
5) Adding pgvector for vector search alongside PostgresSaver
Best practice: keep pgvector separate from the checkpoint tables; use it for long‑term vector memory, not for the checkpointer itself.407398408395
Typical steps:
- Enable pgvector:
CREATE EXTENSION IF NOT EXISTS vector;
- Create a vector table:
CREATE TABLE memory_messages (
id BIGSERIAL PRIMARY KEY,
thread_id TEXT NOT NULL,
role TEXT NOT NULL, -- 'user','assistant',...
content TEXT NOT NULL,
embedding vector(1536), -- match your model
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX ON memory_messages USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
- In your LangGraph node, when you want to store long‑term memory:
import psycopg
from pgvector.psycopg import register_vector  # client-side adapter: pip install pgvector
def persist_memory(thread_id: str, role: str, content: str, embedding: list[float]):
    # DSN assumed defined elsewhere (e.g., the DB_URI used for the checkpointer)
    with psycopg.connect(DSN) as conn:
        register_vector(conn)  # lets psycopg bind the embedding to the vector column
        with conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO memory_messages (thread_id, role, content, embedding)
                VALUES (%s, %s, %s, %s)
                """,
                (thread_id, role, content, embedding),
            )
- For retrieval (vector search):
SELECT id, thread_id, role, content
FROM memory_messages
WHERE thread_id = $1
ORDER BY embedding <=> $2::vector  -- <=> is cosine distance, matching the vector_cosine_ops index
LIMIT 10;
This complements PostgresSaver: checkpoints hold short‑term execution state, while pgvector table holds long‑term semantic memory, avoiding context drift by pulling only the most relevant past items instead of entire history.397395407
If you want, next I can sketch a small LangGraph node pair: one that writes conversation turns to the pgvector memory table and one retriever node that injects the top‑k similar past turns into context_graph or messages before the main reasoning agent runs.
TrustGraph github
The main TrustGraph repo you’re looking for is:
- GitHub: https://github.com/trustgraph-ai/trustgraph
- Website/docs: https://trustgraph.ai
That repo is the open‑source “Context Graph Engine / Factory for AI” with GraphRAG, ontology‑RAG, and agent integration.
DSL
A DSL (domain‑specific language) is a small language tailored to a specific problem domain, rather than a general programming language meant for everything.421422423
- Examples: SQL for relational queries, HTML/CSS for web layout, regex syntax for pattern matching, Terraform’s HCL for infrastructure, Drools rules DSLs for business rules.422424425421
- DSLs can be external (their own parser/interpreter, like SQL or a custom config language) or embedded/internal (APIs or fluent builders inside a host language, like a testing DSL in Ruby or Kotlin).426421422
They’re useful when a recurring class of problems in your domain can be expressed more clearly, concisely, and safely with a specialized syntax/abstractions than with a general‑purpose language.427425421422
How to install and run TrustGraph locally
TrustGraph quickstart tutorial
Context Graphs vs traditional knowledge graphs
TrustGraph CLI examples for graph extraction
TrustGraph supported LLMs and integrations
DSL examples for graph databases
Internal DSL vs external DSL comparison
Tools for building DSLs like ANTLR
DSL for knowledge graph schemas
How to implement a DSL
Here’s a tight rundown you can drop into your notes.
TrustGraph: install, quickstart, CLI
- Repo: https://github.com/trustgraph-ai/trustgraph
- Docs: https://docs.trustgraph.ai
- Tutorial video (launch locally): “How to Launch TrustGraph Tutorial”431
Install and run locally (typical pattern)
From the quickstart / tutorial:431432433
- Install CLI:
pip install "trustgraph-cli==0.5.2"
# or latest version per README
- Use the Configuration Builder (web UI or provided configs) to download a deployment package with:
  - docker-compose.yaml
  - config/YAML (LLM, graph DB, vector DB, chunking, etc.)
  - optional sample data and prompts432433
- Launch via Docker:
# from the directory with docker-compose.yaml
docker compose up -d
# later:
docker compose down -v
The tutorial shows using tg CLI commands (e.g., tg load-sample-documents) and exploring the UI/Data Workbench.434435431
CLI examples for graph extraction
The CLI supports:
- Loading documents into the Librarian, building graph edges + vector embeddings as “Knowledge Packages”.
- Running GraphRAG queries and inspecting subgraphs.435436437434432
Concrete command names evolve, but patterns from their materials include:
tg load-sample-documents
tg upload-data my_docs/
tg run-flow my_flow.yaml
tg query --graph-rag "What obligations does this contract create?"
(Use tg --help and docs for the exact verbs in the version you install.)433431432
Context graphs vs traditional knowledge graphs (TrustGraph view)
TrustGraph frames context graphs as AI‑optimized, governed subgraphs for a specific task, built from a more general knowledge/metadata layer:438437439432
- Traditional KG: global, persistent, schema/ontology‑driven; good for enterprise semantics and lineage.
- Context graph: small, dynamic, relevance‑ranked subgraph (plus embeddings) tuned for LLM grounding and agent workflows (GraphRAG, tools, policy checks).
Their “Knowledge Core” = graph edges + mapped vector embeddings; context graphs are subgraphs over these cores used per query.436437432
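The per‑query subgraph idea can be illustrated with a projection function. The relevance score below is a stub (term overlap) standing in for TrustGraph's combination of vector similarity and graph traversal; node/edge shapes are the simple dicts used elsewhere here:

```python
# Sketch of "context graph = relevance-ranked subgraph over a knowledge core":
# keep nodes whose score passes a threshold, plus edges whose endpoints survive.

def project_context_graph(core: dict, query: str, threshold: float = 0.0) -> dict:
    terms = set(query.lower().split())

    def score(node: dict) -> float:
        # Stub scorer: count query terms appearing in the node's name/props.
        text = (node.get("name", "") + " " + str(node.get("props", ""))).lower()
        return sum(1.0 for t in terms if t in text)

    kept = {n["id"] for n in core["nodes"] if score(n) > threshold}
    return {
        "nodes": [n for n in core["nodes"] if n["id"] in kept],
        "edges": [e for e in core["edges"]
                  if e["from"] in kept and e["to"] in kept],
    }
```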
Supported LLMs & integrations (high‑level)
From the configuration builder / architecture:436432433
- LLMs: OpenAI, Anthropic, Google AI Studio, local models (via OpenAI‑compatible APIs) selectable in config.
- Graph stores: Neo4j, Memgraph, FalkorDB, plus others depending on version.437440432436
- Vector DBs: Qdrant, Pinecone, others as configured.432436
- Tools: MCP tools, custom agent tools, chunking/embedding pipelines configured via YAML.433432
DSLs for graph / KG work
DSL examples for graph databases
- Cypher (Neo4j): pattern‑matching DSL for graph queries.
MATCH (p:Patient)-[:HAS_CONDITION]->(c:Condition {code: "N18.31"})
RETURN p, c;
- Gremlin: traversal DSL for TinkerPop graphs.
- Graph DSL via ANTLR: e.g., an external DSL Graph { A -> B (10) } compiled to a Graph/Vertex/Edge model.441442
From the ANTLR graph DSL example:442441
grammar Graph;
graph : 'Graph' '{' edge+ '}' ;
vertex : ID ;
edge : vertex '->' vertex '(' NUM ')' ;
ID : [a-zA-Z]+ ;
NUM : [0-9]+ ;
WS : [ \t\r\n]+ -> skip ;
Input:
Graph {
A -> B (10)
B -> C (20)
}
This is an external DSL that the parser turns into a semantic model (Graph/Edge/Vertex objects).441442
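For the edge form shown above, the same parse can be done without ANTLR in a few lines of regex, which makes the "parser → semantic model" step concrete (this handles only VERTEX -> VERTEX (WEIGHT), nothing else):

```python
# Minimal regex-based parser for the Graph { A -> B (10) } DSL above,
# producing a dict-shaped semantic model instead of generated classes.
import re

EDGE_RE = re.compile(r"([A-Za-z]+)\s*->\s*([A-Za-z]+)\s*\(\s*(\d+)\s*\)")

def parse_graph_dsl(text: str) -> dict:
    vertices: set[str] = set()
    edges = []
    for src, dst, weight in EDGE_RE.findall(text):
        vertices.update((src, dst))
        edges.append({"from": src, "to": dst, "weight": int(weight)})
    return {"vertices": sorted(vertices), "edges": edges}
```

For anything beyond this toy grammar (nesting, error reporting), a real parser generator like ANTLR pays off quickly.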
Internal vs external DSL
- Internal DSL: embedded in a host language’s syntax (e.g., fluent builders in Kotlin/Scala/Ruby).
- Pros: no separate parser; leverage IDE, types, tooling.
- Example (Python-ish):
g = (
Graph()
.node("patient123").label("Patient")
.node("cond_ckd3a").label("Condition")
.edge("patient123", "HAS_CONDITION", "cond_ckd3a")
)
- External DSL: its own syntax and parser (ANTLR, PEG, etc.).
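The internal‑DSL pseudocode above can be made runnable with a minimal fluent builder; the method names mirror the sketch (node / label / edge) and the backing model is an assumption:

```python
# Minimal internal DSL: a fluent graph builder where each method returns
# self, so calls chain in the host language's own syntax.

class Graph:
    def __init__(self):
        self.nodes: dict[str, dict] = {}
        self.edges: list[dict] = []
        self._last: str | None = None  # most recently added node, for .label()

    def node(self, node_id: str) -> "Graph":
        self.nodes.setdefault(node_id, {"labels": []})
        self._last = node_id
        return self

    def label(self, label: str) -> "Graph":
        self.nodes[self._last]["labels"].append(label)
        return self

    def edge(self, src: str, rel: str, dst: str) -> "Graph":
        self.edges.append({"from": src, "type": rel, "to": dst})
        return self
```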
Tools for building DSLs (esp. for KGs)
- ANTLR: widely used parser generator for external DSLs; generates parsers in Java, Python, C#, etc.442441
- PEG/Parsers: Pegen, Lark, Parsec, etc.
- Xtext (Eclipse), JetBrains MPS: for full language workbenches.
- Martin Fowler’s DSL guide gives design patterns for internal/external DSLs and semantic models.444445446
DSL for knowledge graph schemas (how to implement)
You can design a small DSL to define KG schemas (entities, edges, constraints) and compile it to Neo4j/SQL/Arango/FHIR/RDF.
Example external DSL sketch:
schema Clinic {
entity Patient {
key id
prop mrn: string
prop isPHI: bool
}
entity Condition {
key id
prop code: string
prop system: string
}
relation HAS_CONDITION {
from Patient
to Condition
cardinality many
}
}
ANTLR‑style grammar fragment:
grammar ClinicSchema;
schema : 'schema' ID '{' (entity | relation)* '}' ;
entity : 'entity' ID '{' (prop | keyDecl)+ '}' ;
prop : 'prop' ID ':' ID ;
keyDecl : 'key' ID ;
relation : 'relation' ID '{'
'from' ID
'to' ID
'cardinality' ID
'}' ;
ID : [a-zA-Z_][a-zA-Z0-9_]* ;
WS : [ \t\r\n]+ -> skip ;
Compiler pipeline:
- Parse DSL → AST (ANTLR).
- Build semantic model:
Entity,Property,Relationclasses. - Generate:
- Cypher constraints and relationship types (Neo4j), or
- SQL table/edge definitions (Postgres/Arango), or
- RDF/OWL/Turtle schema.
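Steps 2–3 of this pipeline can be sketched with a tiny semantic model. Property is folded into a dict for brevity; the constraint emitter uses Neo4j 5 Cypher syntax (FOR … REQUIRE), and everything else is illustrative:

```python
# Minimal semantic model for the DSL pipeline: Entity/Relation classes plus
# one generator target (a Neo4j uniqueness constraint and an edge pattern).
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    key: str
    props: dict = field(default_factory=dict)  # prop name -> type name

    def to_cypher_constraint(self) -> str:
        return (f"CREATE CONSTRAINT {self.name.lower()}_key IF NOT EXISTS "
                f"FOR (n:{self.name}) REQUIRE n.{self.key} IS UNIQUE")

@dataclass
class Relation:
    name: str
    source: str
    target: str

    def to_cypher_pattern(self) -> str:
        return f"(:{self.source})-[:{self.name}]->(:{self.target})"
```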
For an internal DSL in Python for schema:
schema = (
Schema("Clinic")
.entity("Patient")
.key("id")
.prop("mrn", "string")
.prop("isPHI", "bool")
.entity("Condition")
.key("id")
.prop("code", "string")
.prop("system", "string")
.relation("HAS_CONDITION", "Patient", "Condition", cardinality="many")
)
Same semantic model, but you rely on Python’s syntax instead of a dedicated grammar.
If you’d like, next I can sketch a tiny ClinicSchema DSL→Cypher code generator: given the DSL above, emit CREATE CONSTRAINT and CREATE INDEX plus relationship patterns for Neo4j or Postgres edge tables.
TrustGraph enterprise licensing and pricing
TrustGraph’s software is open‑source under Apache 2.0; the enterprise cost model is infrastructure‑based rather than per‑query SaaS.
- The main repo trustgraph-ai/trustgraph is licensed under Apache License 2.0, so you can run and modify it in your own environment without license fees, subject to standard Apache terms (notice preservation, patent grant, etc.).451452453
- For enterprise deployments, their own comparison vs Graphlit describes the cost model as “Infrastructure‑Based”: you pay for your own graph DB, vector DB, compute, and storage (e.g., Neo4j cluster, Qdrant, K8s), plus optional enterprise support, with no per‑request charges from TrustGraph itself.454
- They contrast this with usage‑based SaaS (per document, per query, per GB) and position TrustGraph as better for high‑volume, cost‑sensitive workloads where you want predictable infra costs and full data sovereignty.455456454
They don’t publish a simple public price sheet for support / enterprise services; the model is effectively:
- Software: Apache‑2.0 OSS (no license fee).
- Costs: your infra + optional paid support/consulting (contact them via site for quotes).457456451454
For an accurate enterprise quote (support tiers, SLAs, onboarding), you’d need to reach out through their site or sales/contact channels.
TrustGraph Configuration UI tutorial
Common TrustGraph deployment errors and fixes
TrustGraph data workbench GraphRAG examples
Integrate TrustGraph with Qdrant vector DB
TrustGraph enterprise licensing and pricing
TrustGraph’s docs and videos cover all of these at a high level; here’s a concise map with where to look and what to expect.
Configuration UI tutorial
TrustGraph ships a configuration builder / UI that helps you generate a deployable package (Docker compose + configs):466467468469
- Walkthrough:
- TrustGraph YouTube: “How to Launch TrustGraph Tutorial” and “TrustGraph 101” show using the UI to generate and launch a deployment package.
- Docs hub:
- https://docs.trustgraph.ai – sections on “Get Started”, “Configuration”, “Deployment” cover the UI fields and how they map to config files.468472
Common deployment errors and fixes
From README/discussions + partner writeups:473474466
Typical issues:
- Container dependencies not reachable (graph DB, vector DB, LLM endpoint):
- Missing or incompatible graph DB / vector DB versions:
- LLM auth failures:
  - Fix: ensure API keys and base URLs are set in the config UI or .env file, and that the selected LLM provider matches the key (OpenAI vs Anthropic vs GCP).
- Resource limits (OOM during extraction / GraphRAG flows):
- Fix: lower parallelism / batch sizes in flows; adjust container memory/CPU in compose; prune sample data.
Troubleshooting aids:
- Logs from tg-api, graph DB, and vector DB containers (via docker compose logs), plus tests in the “How to Explore Knowledge with the Test Suite” video.471473474
Data Workbench / GraphRAG examples
TrustGraph’s Data Workbench and published case studies show GraphRAG patterns:473474466
- Workbench capabilities:
- Example scenarios:
- Memgraph/TrustGraph blog: complex industrial domains where GraphRAG is used to retrieve multi‑hop context (assets, components, incidents) for a query.473
- Qdrant case study: high‑volume agentic workflows where TrustGraph uses a combination of knowledge packages, vector search, and graph traversal for contextual grounding.474
Look for “GraphRAG”, “Knowledge Packages”, and “Data Workbench” sections in the docs/guides and videos.468472466474473
Integrating TrustGraph with Qdrant
TrustGraph explicitly supports Qdrant as a vector DB backend:466474
- Configuration UI:
- Choose Qdrant as the vector DB provider, specify host/port/API key.
- Docker‑based deployment:
- Flow:
For a concrete setup, follow the Qdrant case study which outlines the architecture and config blocks.474
Enterprise licensing and pricing
- License: core TrustGraph repo is Apache 2.0 OSS (no software license fee to run/modify).475476466
- Enterprise pricing model:
If you plan to run it as part of your own platform, the main cost drivers are: graph DB licenses (if using enterprise Neo4j etc.), managed vector DB (or self‑hosted Qdrant), and compute for extraction and serving.
If you want, next I can sketch a minimal docker‑compose + config layout for a local TrustGraph + Memgraph + Qdrant + OpenAI stack that mirrors your clinic context graph use case, including where you’d plug in your own extraction DSL or LangGraph‑based agents.
sketch a minimal docker‑compose + config layout for a local TrustGraph + Memgraph + Qdrant + OpenAI stack that mirrors your clinic context graph use case, including where you’d plug in your own extraction DSL or LangGraph‑based agents.
Below is a minimal, opinionated sketch you can adapt. It’s not full YAML from TrustGraph, but shows the structure, services, and where your clinic context graph + LangGraph agents plug in.
1) High-level architecture
Services:
- trustgraph-api: TrustGraph engine (GraphRAG & orchestration).
- memgraph: graph store for the knowledge/core graph.
- qdrant: vector DB for embeddings.
- langgraph-backend: your LangGraph multi‑agent pipeline for clinic context graphs.
- fhir-pg: Postgres/OMOP/FHIR DB (SoR) for patient data.
- nginx (optional): single entrypoint for UI/API.
Config:
- config/trustgraph.yaml: TrustGraph connections, LLM, chunking.
- config/langgraph.yaml: endpoints and keys for TrustGraph, FHIR DB, Qdrant (optional).
- flows/clinic_graphrag.yaml: TrustGraph flow using your extraction DSL.
2) docker-compose.yml (minimal sketch)
version: "3.9"
services:
memgraph:
image: memgraph/memgraph-mage:latest
container_name: memgraph
ports:
- "7687:7687" # Bolt
- "7444:7444" # Web interface
environment:
MEMGRAPH="--log-level=INFO"
volumes:
- memgraph_data:/var/lib/memgraph
qdrant:
image: qdrant/qdrant:latest
container_name: qdrant
ports:
- "6333:6333" # HTTP
- "6334:6334" # gRPC
volumes:
- qdrant_data:/qdrant/storage
fhir-pg:
image: postgres:16
container_name: fhir-pg
environment:
POSTGRES_DB: fhir_omop
POSTGRES_USER: fhiruser
POSTGRES_PASSWORD: fhirpass
ports:
- "5432:5432"
volumes:
- fhir_pg_data:/var/lib/postgresql/data
# seed with OMOP/FHIR schema + data via init scripts/SQL
trustgraph-api:
image: trustgraph/trustgraph-api:latest
container_name: trustgraph-api
depends_on:
- memgraph
- qdrant
environment:
TG_CONFIG_PATH: /app/config/trustgraph.yaml
OPENAI_API_KEY: ${OPENAI_API_KEY}
volumes:
- ./config:/app/config
- ./flows:/app/flows
- ./data:/app/data
ports:
- "8080:8080" # TrustGraph API / UI
langgraph-backend:
build: ./langgraph-backend
container_name: langgraph-backend
depends_on:
- fhir-pg
- trustgraph-api
environment:
FHIR_PG_DSN: "postgresql://fhiruser:fhirpass@fhir-pg:5432/fhir_omop"
TRUSTGRAPH_API_URL: "http://trustgraph-api:8080"
OPENAI_API_KEY: ${OPENAI_API_KEY}
ports:
- "8000:8000" # LangGraph HTTP API / SSE
volumes:
memgraph_data:
qdrant_data:
fhir_pg_data:
3) TrustGraph config (config/trustgraph.yaml)
Key ideas: point TrustGraph at Memgraph + Qdrant + OpenAI; define a clinic extraction flow that your DSL feeds.
# config/trustgraph.yaml
llm:
provider: openai
model: gpt-4.1-mini
api_key_env: OPENAI_API_KEY
graph:
type: memgraph
uri: bolt://memgraph:7687
username: ""
password: ""
vector_store:
type: qdrant
url: http://qdrant:6333
grpc_url: http://qdrant:6334
collection: clinic_docs
knowledge_packages:
- name: clinic_guidelines
source_path: /app/data/clinic_guidelines
chunk_size: 1024
chunk_overlap: 128
embedding_model: text-embedding-3-large
flows:
- id: clinic_graphrag
path: /app/flows/clinic_graphrag.yaml
Example flow (flows/clinic_graphrag.yaml), showing where your extraction DSL fits (as a custom step):
id: clinic_graphrag
steps:
- id: chunk_and_embed
type: embedding
input: clinic_guidelines
output_collection: clinic_docs
- id: extract_graph
type: custom
description: "Call clinic-DSL extractor to generate entities/edges"
handler: clinic_dsl_extractor # TrustGraph will call this (Python code in container)
input_collection: clinic_docs
- id: build_knowledge_package
type: graph_package
graph_db: memgraph
vector_db: qdrant
from: extract_graph
- id: graph_rag_query
type: graph_rag
graph_db: memgraph
vector_db: qdrant
llm: default
params:
max_hops: 2
top_k: 10
Your DSL extractor (e.g., Python using ANTLR or pattern rules) transforms text chunks into Patient, Condition, Medication, Policy nodes and their edges and writes them into Memgraph and Qdrant.
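A toy version of such an extractor, with regex rules standing in for a real DSL parser. The handler signature, condition patterns, and output shape are assumptions (not the TrustGraph handler API); writing to Memgraph/Qdrant is left out:

```python
# Sketch of a clinic_dsl_extractor handler: a rule-based pass turning text
# chunks into the node/edge dicts used elsewhere in this doc.
import re

# Illustrative condition vocabulary; a real extractor would use the DSL/ontology.
CONDITION_RE = re.compile(r"\b(CKD stage \d[a-b]?|hypertension|diabetes)\b",
                          re.IGNORECASE)

def clinic_dsl_extractor(chunks: list[str], patient_id: str) -> dict:
    nodes, edges = [], []
    for chunk in chunks:
        for match in CONDITION_RE.findall(chunk):
            cond_id = "cond:" + match.lower().replace(" ", "-")
            nodes.append({"id": cond_id, "type": "Condition",
                          "layer": "domain", "props": {"label": match}})
            edges.append({"id": f"edge:{patient_id}-{cond_id}",
                          "type": "HAS_CONDITION", "from": patient_id,
                          "to": cond_id, "layer": "domain", "props": {}})
    return {"nodes": nodes, "edges": edges}
```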
4) LangGraph backend: clinic context graph + TrustGraph integration
langgraph-backend/Dockerfile (sketch):
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
requirements.txt:
langgraph
langchain-openai
psycopg[binary]
httpx
pydantic
Example node that calls TrustGraph’s GraphRAG API to enrich the domain layer of your context_graph:
import httpx
from typing import Dict, Any
from langgraph.graph import StateGraph, START, END
TRUSTGRAPH_API_URL = "http://trustgraph-api:8080"
def enrich_with_trustgraph(state: AgentState) -> AgentState:
patient_id = state.get("patient_id")
if not patient_id:
return state
query = state.get("query") or "Explain recent medication changes"
payload = {
"flow_id": "clinic_graphrag",
"query": query,
"context": {"patient_id": patient_id},
}
with httpx.Client(timeout=30.0) as client:
resp = client.post(f"{TRUSTGRAPH_API_URL}/v1/graphrag/query", json=payload)
resp.raise_for_status()
data: Dict[str, Any] = resp.json()
# Assume TrustGraph returns nodes/edges (domain) in a simple schema
tg_graph = data.get("context_graph", {"nodes": [], "edges": []})
for n in tg_graph["nodes"]:
n.setdefault("layer", "domain")
for e in tg_graph["edges"]:
e.setdefault("layer", "domain")
ctx = state.get("context_graph") or {"nodes": [], "edges": []}
existing_nodes = {n["id"]: n for n in ctx["nodes"]}
existing_edges = {e["id"]: e for e in ctx["edges"]}
for n in tg_graph["nodes"]:
existing_nodes[n["id"]] = n
for e in tg_graph["edges"]:
existing_edges[e["id"]] = e
state["context_graph"] = {
"nodes": list(existing_nodes.values()),
"edges": list(existing_edges.values()),
}
return state
Your full LangGraph pipeline then:
- Builds the SoR + synthetic layers (SoRClient, SyntheticClient).
- Calls enrich_with_trustgraph to add domain‑layer graph context from TrustGraph.
- Runs clinical + compliance agents consuming context_graph.
5) Where the extraction DSL plugs in
You have two main options:
- Inside TrustGraph (preferred for KG build):
  - Provide a clinic_dsl_extractor step in the TrustGraph flow, implemented as a Python handler in the trustgraph-api image that:
    - Parses clinic documents with your DSL (ANTLR or internal DSL).
    - Writes entities/edges into Memgraph and embeddings into Qdrant.
- Pre‑processor feeding TrustGraph’s data directory:
  - Run a separate container / script that:
    - Reads raw clinical guidelines / SOPs.
    - Produces normalized JSON/CSV representing KG edges.
  - TrustGraph then ingests that as a data source for the graph_package step.
LangGraph stays focused on per‑patient context graphs and agent workflows; TrustGraph handles offline graph extraction + GraphRAG.
If you want, next I can:
- Sketch the clinic_dsl_extractor Python handler signature that TrustGraph would call, and how it writes to Memgraph/Qdrant, or
- Show a small end‑to‑end example query: FHIR/OMOP → LangGraph build_context_graph → TrustGraph GraphRAG → combined context_graph injected into the LLM.