Motivation / problem
The v4.7 Tabular Review engine turned a set of documents into a spreadsheet: rows are documents, columns are questions, and each cell is a RAG-grounded extraction with a flag (green/yellow/red) and cited chunks. It answers “what does each document say about X?” well. But every column was the same kind of question — a single-shot LLM extraction — so two things were out of reach:- Governance questions that the graph already knows the answer to. “Is this doc canonical? Orphaned in the graph? Superseded? What’s its evidence tier?” are deterministic facts in the canonical layer, not LLM judgments. Asking an LLM to guess them is slower, costs money, and can hallucinate.
- Anti-hallucination on the extraction itself. A green flag means the model was confident — not that the value is actually supported by the cited evidence.
agent dimension orthogonal to its format, so a
single report can mix RAG extraction, deterministic graph governance, and a verified
anti-hallucination pass — and ship as a one-click, per-cell-cited, exportable audit
matrix.
Theory & background
The unit of the engine is still a cell = (document row × column). What changes is that a column declares how its cell is produced, via theagent kind
(App\Support\TabularReview\AgentKind):
extract— today’s behaviour, unchanged and the default. One RAG single-shot per (doc, column): retrieve, ask the LLM, flag + cite. Every pre-v8.19 review is byte-identical (the field is absent →extract).graph— a deterministic, LLM-free governance metric resolved from the canonical graph + document columns. No model call, no cost, no hallucination — the value is a fact (is_canonical = true,incoming_edges = 4, …). Thegraphagent wins over thejson_pathshortcut when both are set.verify— a bounded anti-hallucination second pass. After anextractproduces a value, a verify call re-checks it against the document’s cited evidence and can only downgrade the flag (green → yellow / else → red) when the value isn’t supported. It is never worse than extract: a verify-call failure keeps the original cell (R14).
graph) → grounded
(extract) → grounded-and-checked (verify) — expressed as a per-column choice
rather than a whole-report mode.
The governance metrics come from one place — App\Services\TabularReview\GovernanceColumnResolver
— which reads the real taxonomies (EvidenceTier, CanonicalStatus enums) and
the canonical graph (kb_edges), tenant-scoped (R30). It exposes 10 metrics:
| Metric | Meaning |
|---|---|
evidence_tier | The document’s evidence tier (guideline / peer_reviewed / official / … / unverified), ranked; low-confidence tiers flag amber/red. |
frontmatter_completeness | How complete the canonical frontmatter is. |
canonical_status | draft / review / accepted / superseded / deprecated / archived. |
is_canonical | Whether the doc is canonical at all. |
incoming_edges / outgoing_edges | Degree in the canonical graph. |
graph_connectivity | Whether the doc is wired into the graph at all. |
is_orphan | Canonical but with zero edges — a governance smell. |
supersession_status | Whether the doc declares a superseded_by (outgoing InvalidatedBy edge). |
staleness_days | Days since the doc was last updated. |
Design
The router lives inApp\Services\TabularReview\TabularReviewExtractor, which now
takes the GovernanceColumnResolver as a constructor dependency. Per column, it picks
a path by agent (graph wins over json_path, both LLM-free; otherwise extract,
optionally followed by verify):
The verify pass is bounded and order-stable: it ksorts the columns it checks and
only ever downgrades, so a report’s verified run is deterministic and can never invent
a better flag than the extraction earned.
The flagship “Canonical KB Governance Audit” preset (#16 in the seeded library)
is the headline application: rows = canonical docs, columns = 8 graph governance
auditors + 1 verify contradiction check. Running it turns the whole KB into a
per-cell-cited, exportable governance matrix — “which canonical docs are orphaned,
stale, superseded, low-evidence-tier, or self-contradictory?” answered in one grid.
The FE (built on the existing accessible DOM matrix) adds three surfaces: an agentic
column editor (the governance metric picker appears only for graph columns; submit
is gated so a graph-without-metric can’t 422), a per-cell evidence side-panel
(summary + flag + reasoning + cited chunks), and a one-click template gallery of
the built-in system reports.
Data model / contract
The engine reuses the v4.7 tables — no new tables for the agentic upgrade. The agentic dimension is additive on the column config:| Table | Holds |
|---|---|
tabular_reviews | One report: title, project, and the columns_config JSON array. Each column now carries the additive agent (extract/graph/verify) + metric (for graph) keys, alongside the existing name / prompt / format / enum_values / json_path. |
tabular_cells | One cell per (review, document, column): content (summary + flag + reasoning + citations), upserted on the composite UNIQUE (tenant_id, review_id, document_id, column_index) so concurrent generate/regenerate never duplicates a row. |
workflows (type = tabular, is_system = true) —
BuiltInWorkflowSeeder now mints 16 system templates (the +1 is the governance
preset), each with a sensible columns_config. Both agent and metric are validated
on Store/UpdateTabularReviewRequest (metric is required_if the agent is graph
and must be one of the 10 GovernanceColumnResolver::METRICS).
The capability is reachable on all three R44 surfaces over one shared core:
- PHP —
TabularReviewExtractor+GovernanceColumnResolver+ theBuiltInWorkflowSeederlibrary. - HTTP — the existing
api/admin/tabular-reviews/*(list / show / create / generate), now accepting agentic columns. - MCP —
App\Mcp\Tools\KbRunReportToolon theenterprise-kbserver (read a saved report’s matrix, tenant-scoped R30, OFF-path safe R43), bounded byCOUNT(DISTINCT) + LIMITso an agent can’t pull an unbounded matrix.
Security & flags (R32 / R30 / R43)
- RBAC. Every tabular-review route is behind
can:viewTabularReviews/can:manageTabularReviews, R32-matrix-locked. The agentic upgrade adds no new route group — it extends the existing controllers’ request contracts. - Tenant isolation (R30).
GovernanceColumnResolverscopes everykb_edges/ document query to the active tenant;KbRunReportToolfilters the report + cells by tenant. A client cannot read another tenant’s report or widen scope through a filter. - OFF-path safe (R43).
KbRunReportToolreturns a well-formed empty payload when the report/cells are absent rather than throwing;verifyfailures andgraphunknowns degrade to a grey/red cell (R14) instead of erroring the whole generate.
Decision rationale (ADR-style)
- A per-column
agentdimension, not a report-level mode. Governance audits mix deterministic facts and grounded judgments in the same grid; a whole-report mode would force a false choice. Makingagenta column property lets one report carrygraphfacts next toextract/verifyanswers. See architecture decisions. - Deterministic graph metrics over LLM guesses. Canonical status, edge degree,
orphan-ness and supersession are facts in the graph. Resolving them with
GovernanceColumnResolver(no model call) is faster, free, and cannot hallucinate — and it reuses the realEvidenceTier/CanonicalStatusenums so the taxonomy never drifts (R9). verifycan only downgrade. An anti-hallucination pass that could raise a flag would itself be a hallucination risk. Constraining it to downgrade-only (and keeping the original cell on failure) makes the verified run monotonic and never worse than the extraction (R14).- Reuse
workflowsfor the library. The ready-made “precotte” templates are just system-owned tabular workflows — no new model, no new admin surface, and they show up in the existing workflow gallery for free. - Glide canvas grid deferred. The accessible DOM matrix (per-cell testids + ARIA, R11/R15) ships now; the canvas grid + SSE progressive-paint are documented v8.19.x follow-ups because canvas cells can’t carry per-cell testids/ARIA — testability and a11y come first.
Worked example
Create a governance report from the flagship preset, then read its matrix over MCP. Agraph column is computed with no LLM call:
verify column downgrades a too-confident extraction:
enterprise-kb
server calls KbRunReportTool and gets the tenant’s report matrix, bounded by
COUNT(DISTINCT) + LIMIT:
/app/admin/tabular-reviews, “From template” opens the gallery,
picks “Canonical KB Governance Audit”, and pre-fills the create dialog; clicking a
populated cell opens the evidence side-panel with the summary, flag, reasoning and
the cited KB chunks.
Gotchas & operations
graphneeds ametric. Agraphcolumn without a governance metric is a misconfiguration — the FE gates submit on it and the BE validator rejects it (required_if), so a bad column can’t reach generation and 422.graphwins overjson_path. If a column sets both, the deterministic graph resolver runs;json_pathis the LLM-free shortcut forextract-family columns.verifyadds a bounded second call. It is the one agent kind that costs extra per cell (one verify call after the extract). Use it for the columns where anti-hallucination matters, not blanket-on every column.- Existing reviews are unchanged. A pre-v8.19 review has no
agenton its columns → every column isextract→ byte-identical output. The agentic upgrade is purely additive (R27). - Re-run after canonical changes.
graphcells reflect the canonical graph at generate time; promote/supersede/delete a doc and re-generate to refresh the governance matrix.
Canonical & promotion
The canonical graph (kb_edges, status, evidence tiers) the graph columns read.
Grounding & evidence tiers
The EvidenceTier taxonomy a governance audit ranks each doc by.
Multi-tenant isolation
The TenantContext every report, cell and graph query is scoped to (R30).
MCP server
The KbRunReportTool reader on the enterprise-kb server.