Generate a system prompt that produces the list of regulations, US federal and otherwise, including ISO standards, policy guidance, and laws (state and federal), that govern a cloud-based Quality Management System platform for bioscience, medical device, pharmaceutical, food preparation, food manufacturing, and food and pharma distribution, so the platform can provide the software workflows and processes those industries need. Then generate a second prompt that does the same for other major markets (Canada, UK, Australia, EU, etc.); choose the top 25 markets globally.
Here are two focused system prompts you can drop into an orchestration pipeline. Each is phrased for an LLM that will output a structured list of governing regulations, standards, and guidance.
Prompt 1: US + cross‑industry (biosci, med device, pharma, food, distribution)
Title: system_regulations_us_quality_management_platform
You are a regulatory intelligence assistant for a cloud-based Quality Management System (QMS) SaaS platform that serves highly regulated life-science and food sectors in the United States.
The platform provides configurable digital workflows, records, and process controls for:
- Bioscience and biotechnology organizations
- Medical device manufacturers (including SaMD and SiMD)
- Pharmaceutical and biopharmaceutical manufacturers
- Food preparation and food service operations
- Food manufacturing and processing plants
- Pharmaceutical and food distribution, wholesale, and logistics providers
Your task:
- Identify and list all major binding regulatory frameworks and authoritative standards that govern quality management systems, manufacturing/processing controls, recordkeeping, and electronic systems for these industries in the United States.
- Include, at minimum:
- US federal statutes and regulations (U.S. Code, CFR titles and parts)
- FDA regulations and guidance documents relevant to QMS, GMP, GxP, CSV/CSA, data integrity, and electronic records/signatures
- USDA and other federal agency regulations for food, where applicable
- Key state-level laws or regulations that materially affect QMS operation or data handling (only if they impose additional or more stringent requirements, e.g., California, New York, Massachusetts, etc.)
- Cross-cutting federal laws impacting a cloud QMS platform: data protection, health information privacy, cybersecurity, and records retention
- International and US‑adopted standards (e.g., ISO, IEC) commonly used as de‑facto or incorporated-by-reference requirements for quality management, risk management, information security, and data integrity in these sectors.
- For each item, provide:
- Full name
- Citation (e.g., “21 CFR Part 820”, “21 CFR Part 11”, “ISO 13485:2016”)
- Issuing body (e.g., FDA, USDA, ISO, IEC, NIST, states)
- Sector(s) it primarily applies to (bioscience, medical devices, pharma, food, distribution, or cross‑sector)
- Short description of why it is relevant to a cloud-based QMS platform (e.g., electronic records, validation, CAPA, complaints, change control, supplier management, audit trails, data integrity, cybersecurity, hosting location, etc.)
- Whether it is a legal requirement, a strongly expected standard for market access, or a best-practice reference.
- Pay particular attention to US FDA medical device QMS harmonization with ISO 13485:2016 via the Quality Management System Regulation (QMSR) and its relationship to 21 CFR Part 820.
- Also consider relevant ISO and related standards used by medical device and pharma manufacturers for QMS and risk management, such as ISO 13485, ISO 14971, ISO 9001, and ISO 27001 for information security supporting regulated records.
Output format:
- Return a single JSON array.
- Each element must be an object with the following keys:
- name (string)
- citation_or_id (string)
- jurisdiction_or_scope (string, e.g., “US Federal”, “California”, “Voluntary international standard”)
- issuing_body (string)
- primary_sectors (array of strings chosen from: ["bioscience","medical_device","pharma","food_prep","food_manufacturing","distribution","cross_sector"])
- type (one of: "statute","regulation","guidance","standard","policy")
- qms_relevance (1–3 sentence string describing why it matters for a cloud QMS platform)
- binding_level (one of: "legally_required","market_access_expected","best_practice")
Only include items that are plausibly relevant to the design, operation, validation, security, or compliance posture of a cloud-hosted QMS platform supporting the above industries in the United States.
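The output contract above can be checked mechanically before a catalog is accepted. The following is a minimal sketch in plain Python, not part of the prompt itself; the function name validate_us_item and the choice to return a list of problem strings are assumptions made for illustration.

```python
from typing import Any

# Allowed values copied from the output format in Prompt 1.
SECTORS = {"bioscience", "medical_device", "pharma", "food_prep",
           "food_manufacturing", "distribution", "cross_sector"}
TYPES = {"statute", "regulation", "guidance", "standard", "policy"}
BINDING_LEVELS = {"legally_required", "market_access_expected", "best_practice"}
STRING_KEYS = ("name", "citation_or_id", "jurisdiction_or_scope",
               "issuing_body", "qms_relevance")


def _in_enum(value: Any, allowed: set[str]) -> bool:
    # Guard with isinstance so unhashable JSON values (lists, dicts) do not raise.
    return isinstance(value, str) and value in allowed


def validate_us_item(item: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means the item conforms)."""
    problems: list[str] = []
    for key in STRING_KEYS:
        value = item.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key}: expected a non-empty string")
    sectors = item.get("primary_sectors")
    if not isinstance(sectors, list) or not sectors:
        problems.append("primary_sectors: expected a non-empty array")
    elif not all(_in_enum(s, SECTORS) for s in sectors):
        problems.append("primary_sectors: contains a value outside the allowed set")
    if not _in_enum(item.get("type"), TYPES):
        problems.append("type: not one of the allowed values")
    if not _in_enum(item.get("binding_level"), BINDING_LEVELS):
        problems.append("binding_level: not one of the allowed values")
    return problems
```

Items that return a non-empty list could be sent back to the model for repair or dropped, depending on how strict the pipeline needs to be.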
Prompt 2: Top ~25 global markets outside US
This prompt expects the model to first choose the ~25 largest or most regulated life‑science/food markets outside the US (pharma/med device/food volume + regulatory maturity), then enumerate frameworks per jurisdiction.
Title: system_regulations_global_quality_management_platform
You are a regulatory intelligence assistant for a cloud-based Quality Management System (QMS) SaaS platform that serves highly regulated life-science and food sectors globally.
The platform provides configurable digital workflows, records, and process controls for:
- Bioscience and biotechnology organizations
- Medical device manufacturers (including SaMD and SiMD)
- Pharmaceutical and biopharmaceutical manufacturers
- Food preparation and food service operations
- Food manufacturing and processing plants
- Pharmaceutical and food distribution, wholesale, and logistics providers
Your task:
- Identify the top ~25 non‑US markets by a combination of pharmaceutical market size and regulatory maturity (e.g., EU Member States, UK, Canada, Japan, China, Australia, Brazil, etc.).
- For each selected country or region, list major binding regulatory frameworks and authoritative standards governing QMS, GMP/GxP, medical devices (including SaMD), pharmaceuticals, and food manufacturing/distribution, plus electronic records and information security where they affect a cloud QMS.
- Include, where applicable:
- Regional frameworks (e.g., EU regulations, directives, and guidelines)
- National acts, regulations, and agency guidance (e.g., EMA, MHRA, Health Canada, PMDA, NMPA, TGA, ANVISA, etc.)
- Recognized international standards adopted or referenced for quality management, risk management, and information security (e.g., ISO 13485, ISO 14971, ISO 9001, ISO 22000, ISO 27001)
- Any notable local data protection, data residency, or cybersecurity requirements that have direct implications for a cloud-hosted QMS (e.g., GDPR, LGPD, PIPEDA, POPIA).
- For each regulatory or standards item, provide:
- Full name
- Citation or identifier (e.g., “EU MDR 2017/745”, “ISO 13485:2016”)
- Issuing body (e.g., EMA, MHRA, Health Canada, ISO, IEC, national authority)
- Country or region
- Sector(s) it primarily applies to (bioscience, medical devices, pharma, food, distribution, or cross-sector)
- Short description of why it is relevant to a cloud QMS platform (e.g., QMS structure, GMP, CAPA, validation, SaMD lifecycle, audit trails, data integrity, hosting, cybersecurity).
- Whether it is a legal requirement, strongly expected for market access, or a best-practice reference.
- Assume the QMS platform will be validated against recognized global norms such as ISO 13485 and related standards, and harmonized programs like MDSAP; highlight when a jurisdiction directly leverages or aligns with these standards for QMS expectations.
Output format:
- Return a single JSON array.
- Each element must be an object with the following keys:
- name (string)
- citation_or_id (string)
- country_or_region (string)
- issuing_body (string)
- primary_sectors (array of strings chosen from: ["bioscience","medical_device","pharma","food_prep","food_manufacturing","distribution","cross_sector"])
- type (one of: "statute","regulation","guidance","standard","policy")
- qms_relevance (1–3 sentence string describing why it matters for a cloud QMS platform)
- binding_level (one of: "legally_required","market_access_expected","best_practice")
Focus on items that are material to the design, operation, validation, and security of a cloud-hosted QMS that will be used in life-science and food sectors in the top ~25 markets outside the United States.
If you want, I can next help you refine these into role‑separated (system/developer/user) prompt stacks or into LangGraph node specs for a “regulatory harvesting” agent.
refine these into role‑separated (system/developer/user) prompt stacks or into LangGraph node specs for a “regulatory harvesting” agent.
Here are two role‑separated stacks plus a LangGraph node spec you can drop in with minimal editing.
Stack A: US Regulatory Harvesting Agent
System message
You are a regulatory harvesting agent for a cloud-based Quality Management System (QMS) SaaS platform serving: bioscience/biotech, medical device (including SaMD/SiMD), pharmaceutical/biopharma, food preparation and manufacturing, and pharma/food distribution in the United States.
Your objective is to build a structured catalog of governing regulations, standards, and guidance that materially affect the design, operation, validation, and security of a cloud‑hosted QMS used by these sectors in the US.
Scope:
- Federal statutes and regulations (U.S. Code, CFR)
- FDA, USDA, and other federal agency regulations and guidance relevant to QMS, GMP/GxP, CSV/CSA, SaMD, data integrity, and electronic records/signatures
- State laws only where they impose materially stricter or additional requirements relevant to QMS or data handling (e.g., privacy, security, records, health information)
- Cross‑cutting federal laws (data protection, health information privacy, cybersecurity, records retention) that impact a cloud QMS
- International and US‑adopted standards commonly used or incorporated by reference (e.g., ISO 13485, ISO 14971, ISO 9001, ISO 27001) in these sectors.
Pay particular attention to:
- FDA medical device QMS harmonization with ISO 13485:2016 through the Quality Management System Regulation (QMSR) and its relationship to 21 CFR Part 820.
- Requirements for electronic records, audit trails, validation, CAPA, change control, complaints, supplier management, and data integrity as implemented in software.
Output contract (must always be respected):
- Return a single JSON array.
- Each element is an object with keys:
- name (string)
- citation_or_id (string)
- jurisdiction_or_scope (string; e.g., "US Federal", "California", "Voluntary international standard")
- issuing_body (string)
- primary_sectors (array of strings; subset of ["bioscience","medical_device","pharma","food_prep","food_manufacturing","distribution","cross_sector"])
- type (one of: "statute","regulation","guidance","standard","policy")
- qms_relevance (1–3 sentence string focusing on why it matters for a cloud QMS)
- binding_level (one of: "legally_required","market_access_expected","best_practice")
Do not generate prose explanations outside this JSON. Only include items plausibly relevant to a cloud-hosted QMS platform supporting the specified industries in the United States.
Developer message
Tools available: web search, retrieval over internal regulatory knowledge, and structured JSON validation.
Behavior:
- Decompose the task into sectors (bioscience, medical devices, pharma, food, distribution) and cross‑cutting regimes (privacy, cybersecurity, records).
- Deduplicate overlapping frameworks (e.g., HIPAA vs. HITECH, 21 CFR Part 11 vs. Part 820/QMSR).
- Prefer canonical citations (e.g., “21 CFR Part 11”, “ISO 13485:2016”) and official issuing bodies (e.g., “U.S. Food and Drug Administration (FDA)”).
- Err on the side of including fewer, more central regulations/standards rather than long tails of marginal items.
- Validate that the final JSON strictly conforms to the schema; if not, repair it before returning.
Optimization:
- Group federal items first, then add only a small subset of state‑level items that truly impact QMS or cloud hosting (e.g., state privacy/security laws affecting regulated health/food data).
- When in doubt, annotate in qms_relevance how a downstream policy‑mapping agent should interpret the item (e.g., “governs ePHI storage and transmission; apply to hosting, backup, and integrations”).
User message (template)
Build or update the US QMS regulatory catalog for our cloud QMS platform.
- Industries: bioscience, medical device, pharma, food prep, food manufacturing, pharma/food distribution.
- Jurisdiction: United States (federal plus only materially stricter state‑level items).
- Output: a JSON array exactly matching the schema in the system message.
- If you have a previous version of this catalog in context, merge and deduplicate entries instead of starting from scratch.
Stack B: Global (non‑US, ~25 markets) Regulatory Harvesting Agent
System message
You are a global regulatory harvesting agent for a cloud-based Quality Management System (QMS) SaaS platform serving bioscience/biotech, medical device (including SaMD/SiMD), pharmaceutical/biopharma, food preparation and manufacturing, and pharma/food distribution outside the United States.
Your objective is to build a structured catalog of governing regulations, standards, and guidance that materially affect the design, operation, validation, and security of a cloud‑hosted QMS for these sectors across roughly the 25 largest and most regulated non‑US markets.
Scope:
- Regional frameworks (e.g., EU MDR/IVDR, EU GMP, EMA/ICH guidelines, GDPR)
- National acts, regulations, and agency guidance relevant to QMS, GMP/GxP, SaMD, data integrity, and electronic records/signatures (e.g., EMA, MHRA, Health Canada, PMDA, NMPA, TGA, ANVISA, etc.)
- Recognized international standards (e.g., ISO 13485, ISO 14971, ISO 9001, ISO 22000, ISO 27001) and harmonization programs (e.g., MDSAP) used as QMS baselines.
- Data protection, data residency, and cybersecurity regimes that constrain cloud QMS deployments (e.g., GDPR, LGPD, PIPEDA, POPIA).
The ~25 non‑US markets should be drawn from the largest pharma/med‑device/healthcare/food markets and include at minimum: EU (as region), UK, Canada, Japan, China, Australia, Brazil, India, South Korea, plus other top markets by size/regulatory maturity.
Output contract (must always be respected):
- Return a single JSON array.
- Each element is an object with keys:
- name (string)
- citation_or_id (string)
- country_or_region (string; either specific country or region such as "EU" or "Nordic region")
- issuing_body (string)
- primary_sectors (array of strings; subset of ["bioscience","medical_device","pharma","food_prep","food_manufacturing","distribution","cross_sector"])
- type (one of: "statute","regulation","guidance","standard","policy")
- qms_relevance (1–3 sentence string focusing on why it matters for a cloud QMS)
- binding_level (one of: "legally_required","market_access_expected","best_practice")
Do not generate prose explanations outside this JSON. Focus on items that are material to life‑science and food‑sector QMS implementations and to the cloud‑hosting environment.
Developer message
Tools available: web search, retrieval over internal regulatory knowledge, and structured JSON validation.
Behavior:
- Step 1: Explicitly decide and internally enumerate the ~25 non‑US markets (countries/regions) based on pharmaceutical/med‑device/healthcare/food market size and regulatory maturity.
- Step 2: For each market, identify the smallest set of regulations/standards that govern QMS, GMP/GxP, medical devices (including SaMD), pharma, food manufacturing/distribution, and cloud‑relevant data protection/cybersecurity.
- Step 3: Normalize names, citations, and issuing bodies.
- Step 4: Deduplicate regional vs. national items (e.g., EU MDR vs. country‑level implementing legislation) and explain the relationship in qms_relevance rather than creating redundant entries.
- Step 5: Validate JSON against the contract before returning.
Optimization:
- Assume the platform will align with ISO 13485 and related standards globally; highlight when jurisdictions explicitly reference or rely on these (e.g., MDSAP alignment).
- Use binding_level to distinguish hard law from “needed for market access” norms (e.g., ISO 13485 certification, MDSAP participation).
User message (template)
Build or update the global (non‑US) QMS regulatory catalog for our cloud QMS platform.
- Industries: bioscience, medical device, pharma, food prep, food manufacturing, pharma/food distribution.
- Jurisdictions: top ~25 non‑US markets by size and regulatory maturity.
- Output: a JSON array exactly matching the schema in the system message.
- If you have a previous version of this catalog in context, merge and deduplicate entries instead of starting from scratch.
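Because Stack B names a minimum set of markets, a returned catalog can be checked for coverage before it is accepted. The sketch below assumes the model labels country_or_region exactly as written in the system message ("EU", "UK", and so on); the function name is hypothetical, and in practice an alias table (for example "European Union" to "EU") would probably be needed.

```python
# Minimum markets named in the Stack B system message.
REQUIRED_MARKETS = ["EU", "UK", "Canada", "Japan", "China",
                    "Australia", "Brazil", "India", "South Korea"]


def missing_required_markets(catalog: list[dict]) -> list[str]:
    """Hypothetical check: required markets with no catalog entry at all."""
    # Assumption: labels match the required names apart from case and whitespace.
    covered = {
        str(item.get("country_or_region") or "").strip().casefold()
        for item in catalog
    }
    return [m for m in REQUIRED_MARKETS if m.casefold() not in covered]
```

A non-empty result is a reasonable trigger for re-running the harvester with the missing markets listed explicitly in the user message.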
LangGraph Node Spec: regulatory_harvester
You can use essentially the same node with different mode flags ("US" vs "GLOBAL") and prompt stacks above.
import json
from typing import Literal, TypedDict, List
from langgraph.graph import StateGraph, END
from langchain_core.messages import SystemMessage, HumanMessage

# SYSTEM_US / USER_US and SYSTEM_GLOBAL / USER_GLOBAL are module-level strings
# holding the Stack A and Stack B prompts above.

class RegCatalogItem(TypedDict):
    name: str
    citation_or_id: str
    jurisdiction_or_scope: str | None  # US mode
    country_or_region: str | None      # Global mode
    issuing_body: str
    primary_sectors: List[str]
    type: Literal["statute", "regulation", "guidance", "standard", "policy"]
    qms_relevance: str
    binding_level: Literal["legally_required", "market_access_expected", "best_practice"]

class HarvesterState(TypedDict):
    mode: Literal["US", "GLOBAL"]
    prior_catalog: List[RegCatalogItem] | None
    result_catalog: List[RegCatalogItem] | None

def regulatory_harvester_node(state: HarvesterState, llm) -> HarvesterState:
    if state["mode"] == "US":
        system_prompt = SYSTEM_US       # from Stack A
        user_prompt = USER_US           # from Stack A
    else:
        system_prompt = SYSTEM_GLOBAL   # from Stack B
        user_prompt = USER_GLOBAL       # from Stack B

    messages = [SystemMessage(content=system_prompt)]
    if state.get("prior_catalog"):
        # Optionally pass prior catalog as JSON in the user message
        messages.append(HumanMessage(
            content=user_prompt
            + "\n\nHere is the existing catalog JSON to merge/deduplicate:\n"
            + json.dumps(state["prior_catalog"])
        ))
    else:
        messages.append(HumanMessage(content=user_prompt))

    raw = llm.invoke(messages).content
    catalog = json.loads(raw)  # raises if the model returns anything but JSON
    # optional: run JSON schema validation / normalization here
    state["result_catalog"] = catalog
    return state
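The json.loads call in the node assumes the model returns bare JSON. Models often wrap output in a Markdown code fence, so a tolerant parser may help. This is a minimal sketch; parse_catalog is a hypothetical helper, and stripping a single leading and trailing fence is an assumption about the most common failure mode, not a complete repair strategy.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable


def parse_catalog(raw: str) -> list[dict]:
    """Hypothetical helper: parse model output into a list of catalog objects."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]          # drop the opening fence line
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]                 # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)                    # raises ValueError on invalid JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array at the top level")
    if not all(isinstance(entry, dict) for entry in data):
        raise ValueError("every array element must be a JSON object")
    return data
```

Swapping this in for the bare json.loads call would let the node fail with a clear error instead of a parse exception when the model wraps its output or returns a single object.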
If you want, next step I can:
- Add a sibling “policy‑mapping” node that takes these catalog items and maps them into your framework/control matrix schema, or
- Turn this into a more opinionated multi‑node LangGraph (US node → Global node → dedupe/normalizer → writer into Postgres/graph).
Add a sibling “policy‑mapping” node that takes these catalog items and maps them into your framework/control matrix schema. Turn this into a more opinionated multi‑node LangGraph (US node → Global node → dedupe/normalizer → writer into Postgres/graph).
Below is a concrete “policy‑mapping” sibling node plus an opinionated multi‑node LangGraph design you can drop into your stack.
1. Policy‑mapping sibling node
Assume you already have a framework/control matrix schema along the lines of our previous work: a canonical control catalog plus mappings from each regulation/standard to one or more controls.
Control / mapping schemas
from typing import TypedDict, List, Literal

class RegCatalogItem(TypedDict):
    name: str
    citation_or_id: str
    jurisdiction_or_scope: str | None  # US mode
    country_or_region: str | None      # Global mode
    issuing_body: str
    primary_sectors: List[str]
    type: Literal["statute", "regulation", "guidance", "standard", "policy"]
    qms_relevance: str
    binding_level: Literal["legally_required", "market_access_expected", "best_practice"]

class Control(TypedDict):
    id: str           # e.g., "QMS-CHG-001"
    name: str         # e.g., "Change Control for GxP Records"
    domain: str       # e.g., "QMS", "InfoSec", "Privacy"
    description: str

class RegToControlMapping(TypedDict):
    regulation_id: str  # citation_or_id
    control_id: str     # Control.id
    mapping_strength: Literal["direct", "partial", "related"]
    rationale: str
Node role and behavior
Node name: policy_mapping_node
Inputs:
- catalog_items: list[RegCatalogItem] (from US/global harvester)
- control_catalog: list[Control] (your canonical QMS/GRC control set)
- Optional existing_mappings: list[RegToControlMapping] to extend/merge
Output:
- mappings: list[RegToControlMapping]
Pseudo‑implementation
from langchain_core.messages import SystemMessage, HumanMessage
import json
class PolicyMappingState(TypedDict):
    catalog_items: List[RegCatalogItem]
    control_catalog: List[Control]
    existing_mappings: List[RegToControlMapping] | None
    mappings: List[RegToControlMapping] | None
POLICY_MAPPING_SYSTEM = """
You are a policy-mapping assistant for a regulated life-science and food-sector QMS platform.
You receive:
- A catalog of regulations/standards (life-science QMS, GMP/GxP, SaMD, pharma, food, distribution).
- A canonical internal control catalog (QMS + InfoSec + Privacy).
Your task:
- For each regulation/standard, identify which internal controls it maps to.
- Prefer mapping at the level of high-level control objectives (e.g., change control, CAPA, document control, training, supplier oversight, data integrity, access control, logging, incident response), not individual line items.
- For each pair, specify mapping strength:
- "direct": the control clearly addresses a core requirement of the regulation.
- "partial": the control addresses part of the requirement; additional controls are needed.
- "related": the control supports compliance but is not a primary requirement.
Output contract:
- Return a single JSON array of objects with:
- regulation_id (string; use citation_or_id)
- control_id (string; Control.id)
- mapping_strength ("direct" | "partial" | "related")
- rationale (1–2 sentences; why this control supports this regulation/standard, in the context of a cloud QMS).
Do not output anything except this JSON array.
"""
POLICY_MAPPING_USER_TEMPLATE = """
You are given:
1) Regulation/standard catalog (JSON):
{catalog_json}
2) Internal control catalog (JSON):
{controls_json}
If there are existing mappings, extend them and avoid duplicates:
{existing_mappings_json}
"""
def policy_mapping_node(state: PolicyMappingState, llm) -> PolicyMappingState:
    catalog_json = json.dumps(state["catalog_items"])
    controls_json = json.dumps(state["control_catalog"])
    existing = state.get("existing_mappings") or []
    existing_json = json.dumps(existing)

    user = POLICY_MAPPING_USER_TEMPLATE.format(
        catalog_json=catalog_json,
        controls_json=controls_json,
        existing_mappings_json=existing_json,
    )
    messages = [
        SystemMessage(content=POLICY_MAPPING_SYSTEM),
        HumanMessage(content=user),
    ]
    raw = llm.invoke(messages).content
    mappings = json.loads(raw)
    # Optionally merge with existing_mappings and de-duplicate here
    state["mappings"] = mappings
    return state
This node is effectively your “regulation → internal control set” crosswalk generator, aligned with common GRC practices for common control frameworks.
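The node leaves merging with existing_mappings as an optional step. One way to do it is sketched below, keyed on the (regulation_id, control_id) pair. The name merge_mappings and the "newer entry wins" rule are assumptions; a pipeline that wants to preserve human-reviewed mappings would probably invert that rule.

```python
def merge_mappings(existing: list[dict], new: list[dict]) -> list[dict]:
    """Hypothetical helper: one mapping per (regulation_id, control_id) pair.

    Assumption: when both lists contain the same pair, the entry from `new`
    replaces the one from `existing`.
    """
    merged: dict[tuple[str, str], dict] = {}
    for mapping in [*existing, *new]:
        key = (mapping["regulation_id"], mapping["control_id"])
        merged[key] = mapping  # later entries overwrite earlier ones
    return list(merged.values())
```

Inside policy_mapping_node this could replace the placeholder comment, for example by assigning merge_mappings(existing, mappings) to state["mappings"].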
2. Opinionated multi‑node LangGraph: US → Global → Dedupe/Normalize → Policy Map → Writer
You can treat this as a single StateGraph with a shared state type.
Shared state
from typing import TypedDict, List
class GraphState(TypedDict):
    mode: str  # "US", "GLOBAL", etc., used per-node
    us_catalog: List[RegCatalogItem] | None
    global_catalog: List[RegCatalogItem] | None
    merged_catalog: List[RegCatalogItem] | None
    control_catalog: List[Control] | None
    mappings: List[RegToControlMapping] | None
Nodes
- us_harvester_node – implements Stack A (US system/dev/user)
- global_harvester_node – implements Stack B (global system/dev/user)
- dedupe_normalizer_node – merges us_catalog and global_catalog into merged_catalog
- policy_mapping_node – as above
- writer_node – persists merged_catalog + mappings into Postgres and/or your graph store
2.1. US harvester node
Reuse regulatory_harvester_node from the previous step, called with mode="US". That node writes to result_catalog, so copy its result into state["us_catalog"].
2.2. Global harvester node
Same node with mode="GLOBAL", with its result_catalog copied into state["global_catalog"].
2.3. Dedupe / normalizer node
You’ll likely want opinionated rules: same citation_or_id across US/global, or same name+issuing_body+region, etc.
import itertools
def dedupe_normalizer_node(state: GraphState) -> GraphState:
    us = state.get("us_catalog") or []
    gl = state.get("global_catalog") or []

    # Simple example: key by citation_or_id + issuing_body
    def key(item: RegCatalogItem) -> str:
        return f'{item["citation_or_id"]}::{item["issuing_body"]}'.lower()

    merged = {}
    for item in itertools.chain(us, gl):
        k = key(item)
        if k not in merged:
            merged[k] = item
        else:
            # Merge primary_sectors, jurisdictions, etc.
            existing = merged[k]
            ps = list(sorted(set(existing["primary_sectors"] + item["primary_sectors"])))
            existing["primary_sectors"] = ps
            # Optionally merge jurisdiction_or_scope / country_or_region
            if item.get("jurisdiction_or_scope"):
                existing["jurisdiction_or_scope"] = item["jurisdiction_or_scope"]
            if item.get("country_or_region"):
                existing["country_or_region"] = item["country_or_region"]
            merged[k] = existing

    state["merged_catalog"] = list(merged.values())
    return state
You can extend this with more nuanced normalization (e.g., unify ISO standards by stripping year suffix, unify EU MDR across countries, etc.).
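As one illustration of that normalization, the sketch below collapses whitespace, lowercases, and strips a trailing four-digit edition year from ISO/IEC citations. The function name and the regular expression are assumptions. Note that stripping the year treats all editions of a standard as one entry, which may not be wanted where a jurisdiction recognizes a specific edition.

```python
import re

# Matches e.g. "ISO 13485:2016" or "ISO/IEC 27001:2022" and captures the part before the year.
_ISO_WITH_YEAR = re.compile(
    r"^((?:ISO|IEC|ISO/IEC)\s+\d+(?:-\d+)*):\d{4}$", re.IGNORECASE
)


def normalize_citation(citation: str) -> str:
    """Hypothetical dedupe key: whitespace-collapsed, lowercased, ISO year removed."""
    collapsed = " ".join(citation.split())
    match = _ISO_WITH_YEAR.match(collapsed)
    if match:
        collapsed = match.group(1)
    return collapsed.lower()
```

The result could stand in for citation_or_id inside the key function of dedupe_normalizer_node, while the original citation string is kept on the stored item.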
2.4. Policy mapping node
Use state["merged_catalog"] and state["control_catalog"] as inputs; set state["mappings"] as output.
def policy_mapping_node_wrapper(state: GraphState, llm) -> GraphState:
    pm_state: PolicyMappingState = {
        "catalog_items": state["merged_catalog"] or [],
        "control_catalog": state["control_catalog"] or [],
        "existing_mappings": None,
        "mappings": None,
    }
    pm_state = policy_mapping_node(pm_state, llm)
    state["mappings"] = pm_state["mappings"]
    return state
2.5. Writer node (Postgres/graph)
Illustrative sketch for Postgres (normalized tables) and a graph (e.g., Neo4j / TrustGraph‑like):
import json
import psycopg2
from psycopg2.extras import execute_batch

def writer_node(state: GraphState, pg_conn_str: str) -> GraphState:
    merged = state["merged_catalog"] or []
    mappings = state["mappings"] or []

    conn = psycopg2.connect(pg_conn_str)
    cur = conn.cursor()

    # Example tables:
    # regulations(id serial, citation text, name text, issuing_body text, scope jsonb, sectors jsonb, type text, binding_level text, qms_relevance text)
    # reg_control_mappings(id serial, regulation_citation text, control_id text, mapping_strength text, rationale text)
    # The ON CONFLICT clauses below need unique constraints on
    # regulations(citation) and reg_control_mappings(regulation_citation, control_id).
    regs_rows = [
        (
            item["citation_or_id"],
            item["name"],
            item["issuing_body"],
            json.dumps({
                "jurisdiction_or_scope": item.get("jurisdiction_or_scope"),
                "country_or_region": item.get("country_or_region"),
            }),
            json.dumps(item["primary_sectors"]),
            item["type"],
            item["binding_level"],
            item["qms_relevance"],
        )
        for item in merged
    ]
    execute_batch(
        cur,
        """
        INSERT INTO regulations
            (citation, name, issuing_body, scope, sectors, type, binding_level, qms_relevance)
        VALUES (%s,%s,%s,%s,%s,%s,%s,%s)
        ON CONFLICT (citation) DO UPDATE
        SET name = EXCLUDED.name,
            issuing_body = EXCLUDED.issuing_body,
            scope = EXCLUDED.scope,
            sectors = EXCLUDED.sectors,
            type = EXCLUDED.type,
            binding_level = EXCLUDED.binding_level,
            qms_relevance = EXCLUDED.qms_relevance;
        """,
        regs_rows,
    )

    map_rows = [
        (
            m["regulation_id"],
            m["control_id"],
            m["mapping_strength"],
            m["rationale"],
        )
        for m in mappings
    ]
    execute_batch(
        cur,
        """
        INSERT INTO reg_control_mappings
            (regulation_citation, control_id, mapping_strength, rationale)
        VALUES (%s,%s,%s,%s)
        ON CONFLICT (regulation_citation, control_id) DO UPDATE
        SET mapping_strength = EXCLUDED.mapping_strength,
            rationale = EXCLUDED.rationale;
        """,
        map_rows,
    )

    conn.commit()
    cur.close()
    conn.close()
    return state
You can mirror the same relationships into a graph DB with (:Regulation)-[:REQUIRES]->(:Control) edges to support graph queries and RAG over your “compliance universe” later.
3. Putting it together: Graph wiring
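from langgraph.graph import StateGraph, END

def run_harvester(state: GraphState, llm, mode: str, target_key: str) -> dict:
    # Helper: regulatory_harvester_node writes to "result_catalog"; copy that
    # into the shared-state key that dedupe_normalizer_node reads.
    out = regulatory_harvester_node(
        {"mode": mode, "prior_catalog": state.get(target_key), "result_catalog": None},
        llm,
    )
    return {target_key: out["result_catalog"]}

def build_regulatory_graph(llm, pg_conn_str: str) -> StateGraph:
    g = StateGraph(GraphState)

    # Add nodes
    g.add_node("us_harvester", lambda s: run_harvester(s, llm, "US", "us_catalog"))
    g.add_node("global_harvester", lambda s: run_harvester(s, llm, "GLOBAL", "global_catalog"))
    g.add_node("dedupe_normalizer", dedupe_normalizer_node)
    g.add_node("policy_mapping", lambda s: policy_mapping_node_wrapper(s, llm))
    g.add_node("writer", lambda s: writer_node(s, pg_conn_str))

    # Edges: the harvesters run sequentially (US, then Global) before the merge
    g.set_entry_point("us_harvester")
    g.add_edge("us_harvester", "global_harvester")
    g.add_edge("global_harvester", "dedupe_normalizer")
    g.add_edge("dedupe_normalizer", "policy_mapping")
    g.add_edge("policy_mapping", "writer")
    g.add_edge("writer", END)

    return g  # call g.compile() to obtain a runnable graph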
If you want next, I can:
- Add a reader / query graph that takes an industry + region and returns applicable regulations + mapped controls, or
- Extend the writer to also emit a CSV/Parquet extract for your existing spreadsheet‑based control matrix.
Add a reader / query graph that takes an industry + region and returns applicable regulations + mapped controls, or extend the writer to also emit a CSV/Parquet extract for your existing spreadsheet‑based control matrix.
- Provide LangChain code example for the policy-mapping node
- Define the control matrix schema structure with sample data
- How to integrate RACI matrix into policy mappings
- Add a validation node after Global node for compliance checks
- Compare US FDA 21 CFR Part 820 vs EU MDR in the graph flow
Here’s a compact extension of the design covering all your asks, keeping things copy‑pasteable.
1) Control matrix schema + sample data (with RACI)
Control catalog table
class Control(TypedDict):
    id: str                       # "QMS-CHG-001"
    name: str                     # "Change Control for GxP Records"
    domain: str                   # "QMS", "InfoSec", "Privacy"
    description: str
    objective: str                # control objective text
    industries: list[str]         # ["bioscience","medical_device",...]
    evidence_examples: list[str]  # ["SOP-001", "Change log export", ...]
    # RACI for this control as applied globally
    raci: dict[str, str]          # {"R": "QA Manager", "A": "VP Quality", "C": "IT Security", "I": "Reg Affairs"}
RACI at the control level is standard GRC practice for accountability and ownership.
Mapping table (regulation → control + RACI override if needed)
class RegToControlMapping(TypedDict):
    regulation_id: str  # citation_or_id
    control_id: str     # Control.id
    mapping_strength: Literal["direct", "partial", "related"]
    rationale: str
    # Optional RACI overrides specific to this regulation↔control relationship
    raci_override: dict[str, str] | None  # e.g., {"R": "Reg Affairs", "A": "VP Quality"}
Sample rows
control_catalog: list[Control] = [
    {
        "id": "QMS-CHG-001",
        "name": "Change Control for GxP Records",
        "domain": "QMS",
        "description": "All changes to GxP-relevant processes, systems, and documents are formally assessed, approved, implemented, and documented.",
        "objective": "Ensure controlled, documented changes for regulated processes and systems (including cloud QMS) to maintain compliance and data integrity.",
        "industries": ["bioscience", "medical_device", "pharma", "food_manufacturing", "distribution"],
        "evidence_examples": ["Change Control SOP", "Change tickets", "Impact assessment records"],
        "raci": {"R": "QA Manager", "A": "Head of Quality", "C": "IT Lead", "I": "Regulatory Affairs"},
    },
    {
        "id": "QMS-VAL-001",
        "name": "Computer System Validation / CSA",
        "domain": "QMS",
        "description": "Validation of GxP-relevant computerized systems proportional to risk, including cloud-hosted QMS.",
        "objective": "Demonstrate that cloud QMS and integrated systems are fit for intended use and remain in a validated state.",
        "industries": ["bioscience", "medical_device", "pharma", "food_manufacturing"],
        "evidence_examples": ["Validation plan/report", "IQ/OQ/PQ docs", "Test scripts", "Traceability matrix"],
        "raci": {"R": "CSV Lead", "A": "Head of Quality", "C": "Vendor", "I": "IT Security"},
    },
]
Example mapping rows for 21 CFR Part 820 and EU MDR:
mappings: list[RegToControlMapping] = [
    {
        "regulation_id": "21 CFR Part 820",
        "control_id": "QMS-CHG-001",
        "mapping_strength": "direct",
        "rationale": "Change control is a core requirement of the FDA QSR/QMSR for medical devices; cloud QMS workflows must support formal change approval and documentation.",
        "raci_override": None,
    },
    {
        "regulation_id": "EU MDR 2017/745",
        "control_id": "QMS-VAL-001",
        "mapping_strength": "direct",
        "rationale": "EU MDR expects manufacturers to operate a QMS that ensures validated processes and software used in QMS activities; cloud QMS validation supports this requirement.",
        "raci_override": {"R": "Regulatory Affairs", "A": "Head of Quality"},
    },
]
2) LangChain code example for the policy‑mapping node
This is the same logical node as earlier, but as a straight LangChain Runnable.
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_core.runnables import RunnableLambda
import json
POLICY_MAPPING_SYSTEM = """
You are a policy-mapping assistant for a regulated life-science and food-sector QMS platform.
You receive:
- A catalog of regulations/standards (life-science QMS, GMP/GxP, SaMD, pharma, food, distribution).
- A canonical internal control catalog (QMS + InfoSec + Privacy).
Your task:
- For each regulation/standard, identify which internal controls it maps to.
- Map at the level of control objectives (change control, CAPA, document control, training, supplier oversight, data integrity, access control, logging, incident response, etc.).
- For each regulation→control pair, specify mapping strength:
- "direct": the control clearly addresses a core requirement.
- "partial": the control addresses part of the requirement, other controls fill the gap.
- "related": the control supports compliance but is not primary.
Output contract:
- Return a single JSON array of objects:
- regulation_id (string; use citation_or_id)
- control_id (string; Control.id)
- mapping_strength ("direct" | "partial" | "related")
- rationale (1–2 sentences, focused on cloud QMS context)
- raci_override (object with optional R, A, C, I keys, or null)
Do not output anything except this JSON array.
"""
POLICY_MAPPING_USER_TMPL = """
Regulation/standard catalog (JSON):
{catalog_json}
Internal control catalog (JSON):
{controls_json}
If there are existing mappings, extend them and avoid duplicates:
{existing_mappings_json}
"""
def build_policy_mapping_runnable(llm):
    def _invoke(inputs):
        catalog_items = inputs["catalog_items"]
        control_catalog = inputs["control_catalog"]
        existing_mappings = inputs.get("existing_mappings") or []
        # Fill the user template with the three JSON payloads
        user = POLICY_MAPPING_USER_TMPL.format(
            catalog_json=json.dumps(catalog_items),
            controls_json=json.dumps(control_catalog),
            existing_mappings_json=json.dumps(existing_mappings),
        )
        messages = [
            SystemMessage(content=POLICY_MAPPING_SYSTEM),
            HumanMessage(content=user),
        ]
        raw = llm.invoke(messages).content
        # The output contract asks for a bare JSON array, so parse it directly
        mappings = json.loads(raw)
        return {"mappings": mappings}
    return RunnableLambda(_invoke)
You can plug this runnable into LangGraph via a node wrapper, as in the previous message.
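The runnable above hands the model's raw text straight to json.loads, which raises if the model wraps the array in a Markdown fence, and it accepts any mapping_strength value. A minimal sketch of a more defensive parser, assuming the output contract in POLICY_MAPPING_SYSTEM; the name parse_mappings and the choice to silently drop invalid or duplicate rows are illustrative assumptions, not part of the original node.

```python
import json

ALLOWED_STRENGTHS = {"direct", "partial", "related"}
FENCE = "`" * 3  # Markdown code-fence marker some models add around JSON


def parse_mappings(raw: str) -> list:
    """Parse the model reply into mapping dicts, dropping invalid or duplicate rows."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Remove the surrounding backticks and an optional "json" language tag
        text = text.strip("`")
        if text.lower().startswith("json"):
            text = text[4:]
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of mappings")

    seen = set()
    cleaned = []
    for row in data:
        if not isinstance(row, dict):
            continue
        key = (row.get("regulation_id"), row.get("control_id"))
        # Skip rows with missing ids or an unknown strength label
        if None in key or row.get("mapping_strength") not in ALLOWED_STRENGTHS:
            continue
        # Keep only the first mapping for each regulation/control pair
        if key in seen:
            continue
        seen.add(key)
        row.setdefault("raci_override", None)
        cleaned.append(row)
    return cleaned
```

Inside _invoke, this would replace the bare json.loads(raw) call. Whether to drop bad rows or fail the run is a policy decision; for a compliance pipeline you may prefer to log or raise instead.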
3) Integrating RACI into the mappings
Best practice: store baseline RACI on the control, and allow regulation‑specific overrides when accountability differs for a particular regime or industry.
- Control.raci = “default” ownership (who generally owns this control).
- RegToControlMapping.raci_override = optional override when the regulation demands a different accountable function (e.g., EU MDR may put more emphasis on Regulatory Affairs for some controls than US QSR).
In your DB, add columns:
- controls table: raci jsonb
- reg_control_mappings table: raci_override jsonb
Your writer node simply persists these JSON fields; your reader APIs can merge them at query time:
effective_raci = mapping.raci_override or control.raci
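Note that the `or` expression swaps in the whole override object. With a partial override such as {"R": "Regulatory Affairs", "A": "Head of Quality"}, the baseline C and I entries would be lost. If a key-by-key override is what you intend (an assumption, the original does not say), a minimal sketch looks like this; merge_raci is a hypothetical helper name.

```python
from typing import Optional


def merge_raci(baseline: Optional[dict], override: Optional[dict]) -> dict:
    """Overlay non-empty override roles on top of the control's baseline RACI."""
    merged = dict(baseline or {})  # copy so the stored baseline is not mutated
    for role, owner in (override or {}).items():
        if owner:  # ignore empty strings / None so they cannot blank out a role
            merged[role] = owner
    return merged


baseline = {"R": "CSV Lead", "A": "Head of Quality", "C": "Vendor", "I": "IT Security"}
override = {"R": "Regulatory Affairs", "A": "Head of Quality"}
effective = merge_raci(baseline, override)
```

Here effective keeps "Vendor" and "IT Security" for C and I while R comes from the override.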
4) Validation node after Global node (compliance checks)
This node checks that global harvesting obeyed some minimal conditions: required regions present, key med device/pharma/food frameworks found, etc.
class ValidationResult(TypedDict):
    ok: bool
    errors: list[str]
    warnings: list[str]
class GraphState(TypedDict):
    # ... previous fields ...
    global_catalog: list[RegCatalogItem] | None
    validation: ValidationResult | None
REQUIRED_REGIONS = ["EU", "United Kingdom", "Canada", "Japan", "China", "Australia", "Brazil", "India", "South Korea"]
def validation_node(state: GraphState) -> GraphState:
    catalog = state.get("global_catalog") or []
    errors: list[str] = []
    warnings: list[str] = []
    # Check that mandatory regions appear
    regions_present = {item.get("country_or_region") for item in catalog if item.get("country_or_region")}
    for r in REQUIRED_REGIONS:
        if r not in regions_present:
            errors.append(f"Missing required region in global catalog: {r}")
    # Check that at least one MDR and one major QMS standard exists
    # (.get(...) or "" guards against missing or null fields)
    if not any("MDR" in (item.get("name") or "") for item in catalog):
        warnings.append("EU MDR not detected in global catalog.")
    if not any("ISO 13485" in (item.get("citation_or_id") or "") for item in catalog):
        warnings.append("ISO 13485 not detected in global catalog.")
    state["validation"] = {"ok": len(errors) == 0, "errors": errors, "warnings": warnings}
    return state
In the graph, you’d insert this node after global_harvester (or after dedupe_normalizer if you want to validate the merged catalog) and short‑circuit or flag runs when ok is false.
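One way to express the short-circuit is a small routing function that reads the validation result and returns a branch label. This is a sketch only: the labels "halt", "flag", and "continue" are hypothetical, and the function is plain Python so it can be unit-tested without the graph. In LangGraph it could serve as the path function for a conditional edge out of the validation node, with the labels mapped to your own node names.

```python
def route_after_validation(state: dict) -> str:
    """Return a branch label based on the validation result stored in state."""
    validation = state.get("validation") or {}
    if not validation.get("ok", False):
        # Hard errors (e.g. a required region is missing): stop the batch run
        return "halt"
    if validation.get("warnings"):
        # Passed, but with warnings: continue while flagging the run for review
        return "flag"
    return "continue"
```

Treating a missing validation result as a failure is a deliberately conservative default; relax it if the node is optional in your pipeline.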
5) Reader / query graph: industry + region → regs + controls
API-level behavior
Input:
- industry: str – one of your sectors (e.g., "medical_device", "pharma")
- region: str – country/region name (e.g., "United States", "EU", "Canada")
Output:
- Applicable regulations/standards
- For each, mapped controls with effective RACI
Reader node example (Postgres)
import psycopg2  # json and TypedDict are imported earlier
class QueryState(TypedDict):
    industry: str
    region: str
    results: dict | None
def reader_node(state: QueryState, pg_conn_str: str) -> QueryState:
    industry = state["industry"]
    region = state["region"]
    conn = psycopg2.connect(pg_conn_str)
    cur = conn.cursor()
    # 1) Find regulations for the region (industry is filtered in step 2)
    cur.execute(
        """
        SELECT citation, name, issuing_body, scope, sectors, type, binding_level, qms_relevance
        FROM regulations
        WHERE (scope->>'country_or_region' = %s OR scope->>'jurisdiction_or_scope' = %s)
        """,
        (region, region),
    )
    regs = cur.fetchall()
    # 2) Filter by industry in sectors JSON
    regs_filtered = []
    for row in regs:
        # psycopg2 already decodes json/jsonb columns; parse only if sectors is stored as text
        sectors = json.loads(row[4]) if isinstance(row[4], str) else (row[4] or [])
        if industry in sectors or "cross_sector" in sectors:
            regs_filtered.append(row)
    # 3) For each regulation, get mapped controls + effective RACI
    results = []
    for row in regs_filtered:
        citation = row[0]
        cur.execute(
            """
            SELECT m.control_id, m.mapping_strength, m.rationale, m.raci_override,
                   c.name, c.domain, c.description, c.raci
            FROM reg_control_mappings m
            JOIN controls c ON m.control_id = c.id
            WHERE m.regulation_citation = %s
            """,
            (citation,),
        )
        mappings = []
        for c_row in cur.fetchall():
            control_id, strength, rationale, raci_override, c_name, c_domain, c_desc, c_raci = c_row
            # The override, when present, replaces the control's baseline RACI
            effective_raci = raci_override or c_raci
            mappings.append({
                "control_id": control_id,
                "control_name": c_name,
                "control_domain": c_domain,
                "control_description": c_desc,
                "mapping_strength": strength,
                "rationale": rationale,
                "effective_raci": effective_raci,
            })
        results.append({
            "regulation_citation": citation,
            "regulation_name": row[1],
            "issuing_body": row[2],
            "binding_level": row[6],
            "qms_relevance": row[7],
            "controls": mappings,
        })
    cur.close()
    conn.close()
    state["results"] = {
        "industry": industry,
        "region": region,
        "items": results,
    }
    return state
You can wrap this in a small LangGraph just for query flows (entry → reader_node → END).
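To exercise the reader logic without a Postgres instance, the same region filter, sector filter, and effective-RACI join can be tried against an in-memory SQLite database. This is a sketch only: the tables are cut down to a few columns, region is a plain column instead of a key inside scope, JSON is stored as text, and the seeded rows and the demo_reader name are illustrative assumptions.

```python
import json
import sqlite3


def demo_reader(industry: str, region: str) -> list:
    """Run the reader's filter-and-join logic against a tiny seeded SQLite DB."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE regulations (citation TEXT PRIMARY KEY, name TEXT, region TEXT, sectors TEXT);
        CREATE TABLE controls (id TEXT PRIMARY KEY, name TEXT, raci TEXT);
        CREATE TABLE reg_control_mappings (regulation_citation TEXT, control_id TEXT, raci_override TEXT);
    """)
    cur.execute("INSERT INTO regulations VALUES (?, ?, ?, ?)",
                ("21 CFR Part 820", "Quality System Regulation", "United States",
                 json.dumps(["medical_device"])))
    cur.execute("INSERT INTO controls VALUES (?, ?, ?)",
                ("QMS-CHG-001", "Change Control for GxP Records",
                 json.dumps({"R": "QA Manager", "A": "Head of Quality"})))
    cur.execute("INSERT INTO reg_control_mappings VALUES (?, ?, ?)",
                ("21 CFR Part 820", "QMS-CHG-001", None))

    # 1) Regulations for the region
    cur.execute("SELECT citation, sectors FROM regulations WHERE region = ?", (region,))
    results = []
    for citation, sectors_json in cur.fetchall():
        # 2) Sector filter, same rule as reader_node
        sectors = json.loads(sectors_json)
        if industry not in sectors and "cross_sector" not in sectors:
            continue
        # 3) Mapped controls with effective RACI (override wins when present)
        cur.execute(
            """SELECT m.control_id, m.raci_override, c.raci
               FROM reg_control_mappings m JOIN controls c ON m.control_id = c.id
               WHERE m.regulation_citation = ?""",
            (citation,),
        )
        controls = [
            {"control_id": cid, "effective_raci": json.loads(override or raci)}
            for cid, override, raci in cur.fetchall()
        ]
        results.append({"regulation_citation": citation, "controls": controls})
    conn.close()
    return results
```

Calling demo_reader("medical_device", "United States") returns the single seeded regulation with its mapped control; other industry or region values return an empty list.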
6) Compare US FDA 21 CFR Part 820 vs EU MDR in the graph flow
You can handle this in two ways:
- Static comparison as data in your catalog/mappings (preferred)
- On‑demand comparison via an analysis node that consumes the stored data and generates a summary
Static representation
When you harvest 21 CFR Part 820 and EU MDR 2017/745, ensure:
- Both link to overlapping control IDs (e.g., QMS-CHG-001, QMS-VAL-001, CAPA, complaint handling, PMS, etc.).
- Their qms_relevance describes the differences: US QSR/QMSR vs MDR emphasis on UDI, clinical evaluation, PMS, etc.
Example extra fields you might add to RegCatalogItem:
class RegCatalogItem(TypedDict):
    # existing fields...
    lifecycle_coverage: list[str]  # ["design", "manufacturing", "postmarket"]
    has_udi_requirements: bool
    has_clinical_evaluation_emphasis: bool
Populate:
- 21 CFR 820/QMSR: has_udi_requirements=False, has_clinical_evaluation_emphasis=True but less granular than MDR.
- EU MDR: has_udi_requirements=True, has_clinical_evaluation_emphasis=True, plus stricter PMS/PMCF.
Analysis node: “compare_820_vs_mdr_node”
class CompareState(TypedDict):
    merged_catalog: list[RegCatalogItem]
    comparison_summary: str | None
def compare_820_vs_mdr_node(state: CompareState, llm) -> CompareState:
    catalog = state["merged_catalog"]
    regs = {item["citation_or_id"]: item for item in catalog}
    # (x or "") guards the substring checks against null citations or names
    part820 = next((v for k, v in regs.items() if "21 CFR Part 820" in (k or "") or "21 CFR 820" in (v.get("name") or "")), None)
    mdr = next((v for k, v in regs.items() if "2017/745" in (k or "") or "MDR" in (v.get("name") or "")), None)
    if not part820 or not mdr:
        state["comparison_summary"] = "One or both of 21 CFR Part 820 and EU MDR 2017/745 not found in catalog."
        return state
    system = """
You are a regulatory analyst comparing US FDA 21 CFR Part 820/QMSR and EU MDR 2017/745 from the perspective of a cloud QMS platform.
Using only the JSON below, summarize:
- Overlapping QMS expectations.
- Key differences (e.g., UDI, clinical evidence, post-market surveillance).
- Implications for control design (where a single control set can cover both vs where region-specific controls or configurations are needed).
Keep it under 300 words.
"""
    user = f"""
21 CFR Part 820/QMSR JSON:
{json.dumps(part820)}
EU MDR 2017/745 JSON:
{json.dumps(mdr)}
"""
    msg = [
        SystemMessage(content=system),
        HumanMessage(content=user),
    ]
    summary = llm.invoke(msg).content
    state["comparison_summary"] = summary
    return state
Wire this node anywhere after dedupe_normalizer (since it uses merged_catalog).
7) Updated graph wiring (high‑level)
- us_harvester
- global_harvester
- validation (checks global)
- dedupe_normalizer
- policy_mapping
- writer
- compare_820_vs_mdr (optional side branch)
- reader (separate graph for query‑time)
You can keep harvesting + mapping as a batch pipeline graph and reader as a lightweight query graph that hits the DB.
If you’d like, next I can give you:
- A Postgres DDL snippet for all tables (regulations, controls, reg_control_mappings, maybe racis), and
- A small FastAPI endpoint that calls the reader node for GET /regulations/{industry}/{region}.
1) Tech stack for industry–region regulation query graph
For your use case (LangGraph + agentic GRC + query over regs/controls), a pragmatic stack:
- Core runtime
  - Python 3.x
  - LangGraph for orchestration (harvest → normalize → map → write; reader graph for queries)
  - LangChain for LLM abstractions/tools
- Persistence
- Query / API
  - FastAPI or Starlette for:
    - GET /regulations/{industry}/{region}
    - GET /controls/{control_id}/regs
    - GET /matrix/export/{version_id}
  - SQLAlchemy / psycopg3 for DB access
- Data lake / artifacts
  - Object storage (S3/GCS/Azure Blob/MinIO) to hold:
    - Versioned CSV/Parquet exports of the compliance matrix
  - Pandas / PyArrow for CSV ↔ Parquet conversion
- Front‑end / UX
  - React/Next.js (or your existing CopilotKit / LangGraph UI) for:
    - Regulatory catalog browser
    - Control matrix + RACI visualization (heatmaps, filters)
This gives you a “compliance graph” over structured tables with optional graph queries, matching what current “compliance graph” articles describe.
2) Version control for spreadsheet‑based GRC extracts
Approach: treat spreadsheets as build artifacts, not the source of truth.
- Authoritative source: Postgres tables (regulations, controls, reg_control_mappings, raci) and possibly a graph DB.
- Extract pipeline:
  - Batch job/graph node that:
    - Pulls current data from Postgres
    - Converts to CSV and Parquet
    - Writes to object storage at a versioned path, e.g.:
      - s3://grc-matrix/exports/compliance_matrix_v00023.csv
      - .../compliance_matrix_v00023.parquet
- Metadata table (in Postgres):
CREATE TABLE matrix_export_versions (
id bigserial PRIMARY KEY,
version_tag text UNIQUE, -- e.g. "v00023"
created_at timestamptz NOT NULL DEFAULT now(),
git_sha text, -- optional, link to infra repo
description text,
csv_path text NOT NULL, -- e.g. s3 uri
parquet_path text NOT NULL
);
- Versioning strategy:
  - Increment semantic/int-style version (v00023) on each export
  - Optionally tie to Git commit SHA that changed mapping logic
  - Expose GET /matrix/export/latest and GET /matrix/export/{version_tag}
- Git vs storage:
  - Store large CSV/Parquet in object storage, not Git.
  - Commit:
    - The schema
    - The export script
    - The small metadata entries or a manifest
This aligns with how modern GRC/ISMS RACI templates and ISO control matrices are maintained (source DB + exported, versioned spreadsheets).
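The version bump and path construction can be sketched as two small helpers. Assumptions to note: the tag format is "v" plus a zero-padded integer as in v00023, both files share the same bucket prefix (the original elides the Parquet prefix, so the base default here is a guess you should replace), and the helper names are hypothetical.

```python
import re


def next_version_tag(current=None, width=5):
    """Return the tag after `current`, e.g. 'v00023' -> 'v00024'."""
    if current is None:
        return "v" + "1".zfill(width)  # first export
    match = re.fullmatch(r"v(\d+)", current)
    if not match:
        raise ValueError(f"unrecognized version tag: {current!r}")
    return "v" + str(int(match.group(1)) + 1).zfill(width)


def export_paths(version_tag, base="s3://grc-matrix/exports"):
    """Build csv_path / parquet_path values for the matrix_export_versions row."""
    stem = f"{base}/compliance_matrix_{version_tag}"  # base prefix is an assumption
    return {"csv_path": stem + ".csv", "parquet_path": stem + ".parquet"}
```

In practice the current tag would come from the latest row in matrix_export_versions, and the UNIQUE constraint on version_tag protects against two exports racing to the same number.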
3) RACI matrix template for mapped compliance controls
You can maintain a dedicated RACI matrix view for your GRC audience that joins controls + mappings + roles.
Minimal columns (spreadsheet template):
| Control ID | Control Name | Domain | Regulation/Standard | Region | Industry | R (Responsible) | A (Accountable) | C (Consulted) | I (Informed) |
|---|---|---|---|---|---|---|---|---|---|
Populate as:
- Control ID: e.g., QMS-VAL-001
- Control Name: “Computer System Validation / CSA”
- Domain: “QMS”
- Regulation/Standard: e.g., “21 CFR Part 11”, “EU MDR 2017/745”
- Region: “US”, “EU”
- Industry: “medical_device”, “pharma”, etc.
- RACI: merged from controls.raci and reg_control_mappings.raci_override as discussed earlier.
Programmatically, the writer node can generate a RACI matrix CSV in parallel to the main matrix:
raci_rows = []
for m in mappings:
    control = controls_by_id[m["control_id"]]
    reg = regs_by_citation[m["regulation_id"]]
    effective_raci = m["raci_override"] or control["raci"]
    raci_rows.append([
        control["id"],
        control["name"],
        control["domain"],
        reg["citation_or_id"],
        reg.get("country_or_region") or reg.get("jurisdiction_or_scope"),
        ",".join(control["industries"]),
        effective_raci.get("R", ""),
        effective_raci.get("A", ""),
        effective_raci.get("C", ""),
        effective_raci.get("I", ""),
    ])
# write raci_rows to CSV / Parquet and version like other extracts
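For the CSV half of that last step, the standard library is enough. A minimal sketch, assuming the column order of the spreadsheet template above; raci_rows_to_csv is a hypothetical helper, and it returns the CSV text so the caller decides where to upload it.

```python
import csv
import io

RACI_HEADER = [
    "Control ID", "Control Name", "Domain", "Regulation/Standard", "Region",
    "Industry", "R (Responsible)", "A (Accountable)", "C (Consulted)", "I (Informed)",
]


def raci_rows_to_csv(raci_rows) -> str:
    """Serialize RACI rows (lists in template column order) to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)  # quotes fields containing commas, e.g. the industries list
    writer.writerow(RACI_HEADER)
    writer.writerows(raci_rows)
    return buf.getvalue()
```

Because the Industry cell is itself a comma-joined list, relying on csv.writer for quoting is safer than joining fields by hand.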
4) Example compliance matrix for healthcare EU vs US
Below is a simplified, copy‑pasteable compliance matrix snippet for healthcare / medical devices focusing on US FDA vs EU MDR (high‑level).
Healthcare / med device compliance matrix (excerpt)
| Area | US (FDA, healthcare/med device) | EU (MDR, healthcare/med device) | Example Controls (IDs) |
|---|---|---|---|
| QMS baseline | 21 CFR Part 820 / QMSR, aligned with ISO 13485; design controls, CAPA, production, process validation, complaint handling, document control. | EU MDR 2017/745 requires manufacturers to have a QMS aligned with ISO 13485, covering design, manufacturing, PMS, vigilance, and clinical evaluation. | QMS-QMS-001 (Quality Manual), QMS-DOC-001 (Document Control), QMS-CAPA-001 (CAPA), QMS-COM-001 (Complaints) |
| Risk management | FDA expects risk management aligned with ISO 14971 across device lifecycle; integrated with design controls and post-market data. | MDR mandates continuous risk management throughout lifecycle, tightly integrated with clinical evaluation and PMS/PMCF; ISO 14971 conformity expected. | QMS-RISK-001 (Risk Management Process), QMS-RISK-TRACE-001 (Risk Traceability) |
| Clinical evidence | Premarket submissions (510(k), De Novo, PMA) rely on clinical data proportional to risk; predicate devices used for many 510(k). | Stricter clinical data requirements, more extensive clinical evaluation and PMCF, especially for higher-risk devices. | REG-CEVAL-001 (Clinical Evaluation Records), REG-PMCF-001 (PMCF Plan/Reports) |
| Post-market surveillance | FDA requires complaint handling, MDR/MDV reporting, recalls, CAPA, and vigilance activities; PMS programs vary with device class. | MDR requires structured PMS plans, PSURs, trend reporting, and vigilance; PMS integrated into QMS and risk management. | QMS-PMS-001 (PMS Plan & Reports), QMS-VIG-001 (Vigilance Reporting), QMS-RECALL-001 (Recall Management) |
| Data protection & health data | HIPAA/HITECH for PHI, plus 21 CFR Part 11 for electronic records/signatures for regulated data; cloud QMS must ensure confidentiality, integrity, availability, and audit trails. | GDPR for personal data, plus MDR expectations around protection of patient data in clinical investigations and PMS; cloud QMS must respect lawful basis, minimization, and cross-border transfer rules. | SEC-AC-001 (Access Control), SEC-AUD-001 (Audit Logging), PRIV-DP-001 (Data Protection Policy), PRIV-XB-001 (Cross-Border Transfers) |
| Device/software classification | FDA device classes I–III; SaMD regulated via guidance and specific pathways, with cybersecurity and software lifecycle expectations. | MDR device classes I, IIa, IIb, III, with clearer rules for software; more explicit requirements on safety, performance, and post-market monitoring for software devices. | REG-CLASS-001 (Classification & Intended Use), SW-LC-001 (Software Lifecycle & Cybersecurity) |
This table is exactly what your reader / query graph should be able to support:
- Query: industry=medical_device, region=US → fetch regs/controls to fill “US (FDA)” column.
- Query: industry=medical_device, region=EU → fetch regs/controls to fill “EU (MDR)” column.
- Optional analysis node builds this kind of summary matrix on demand from the stored data, instead of hardcoding it.
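The on-demand matrix can start as a simple pivot of two reader results. A minimal sketch, assuming the result shape produced by reader_node (an "items" list whose entries carry "regulation_citation" and "controls"); build_side_by_side and the "us"/"eu" keys are hypothetical names.

```python
def build_side_by_side(us_result: dict, eu_result: dict) -> list:
    """Pivot two reader results into one row per control with US and EU citations."""

    def index_by_control(result: dict) -> dict:
        # control_id -> list of regulation citations that map to it
        by_control = {}
        for item in result.get("items", []):
            for ctrl in item.get("controls", []):
                by_control.setdefault(ctrl["control_id"], []).append(item["regulation_citation"])
        return by_control

    us = index_by_control(us_result)
    eu = index_by_control(eu_result)
    rows = []
    for control_id in sorted(set(us) | set(eu)):
        rows.append({
            "control_id": control_id,
            "us": us.get(control_id, []),  # empty list marks a region-specific gap
            "eu": eu.get(control_id, []),
        })
    return rows
```

Rows where one side is empty point to controls that only one regime drives, which is where region-specific configuration is likely needed.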
If you want, as a next step I can give you:
- Postgres DDL for all involved tables (regs, controls, mappings, RACI, exports)
- A FastAPI router exposing:
  - /query/{industry}/{region} → JSON of regs+controls+RACI
  - /matrix/export/{version} → signed URL to CSV/Parquet