Myria Memory System Design

Link to implementation repo.

Inspired by PageIndex.

Myria is a production memory system for long-running, session-less conversational agents.

It is designed to solve one core problem:

How do you give agents stable, queryable, long-term memory over very long chats without forcing clients to read giant transcripts?

For v1, the design is fixed around four implementation choices:

  • a standalone Go service
  • a PostgreSQL persistence layer
  • an internal structured tool-calling LLM for memory-building workflows that need guided interaction with SQL
  • an MCP server surface as the public interface

Myria does this by combining:

  • an append-only event log as ground truth
  • a versioned semantic index built in the background
  • a stable active snapshot for live retrieval
  • a staging snapshot for async rebuilds
  • topic-indexed fresh events to bridge snapshot lag
  • an MCP server interface so any client can use it as infrastructure


1. Core model

Myria has three data layers.

1.1 Event Log

The source of truth.

Properties:

  • append-only
  • immutable
  • replayable
  • fully reconstructible
  • event-native

Stores:

  • raw user messages
  • assistant messages
  • tool results
  • task updates
  • topic markers
  • explicit memory markers
  • emitter-supplied topic hints at ingest time

1.2 Active Memory

The currently served semantic memory snapshot.

Properties:

  • read-only to live clients
  • stable during a request
  • used for retrieval
  • versioned

This contains:

  • topic nodes
  • topic summaries
  • edges / parent-child relationships
  • event references
  • cached traversal metadata

1.3 Staging Memory

A background copy under construction.

Properties:

  • built asynchronously
  • invisible to live clients
  • validated before activation
  • atomically swapped into active

This supports:

  • nightly rebuilds
  • merge / compaction
  • snapshot refinement
  • rollback-safe deployment

2. System responsibilities

Myria is responsible for:

  • ingesting canonical event JSON
  • storing the append-only event log
  • storing emitter-supplied topic hints on events
  • serving exact event lookups
  • serving TopicIndex retrieval
  • assembling bounded rooted query trees
  • building new semantic snapshots in the background
  • atomically swapping the active snapshot
  • preserving provenance from semantic nodes back to events

Myria is not responsible for:

  • frontend transport handling
  • chat orchestration
  • the application’s general-purpose tool loop
  • the application’s general-purpose reasoning policy
  • account mapping policy beyond what it is explicitly given

However, Myria does own its internal memory workflows, including topic classification, summarization, and constrained LLM-assisted planning when those workflows are part of ingest, query enrichment, or snapshot building.


3. Architectural decomposition

Myria is one standalone service with five major internal subsystems.

3.1 Ingest subsystem

Handles incoming events.

Responsibilities:

  • validate event schema
  • deduplicate on a caller-supplied idempotency key when present
  • store emitter-supplied topic hints
  • append to event log
  • update topic metadata
  • update global rebuild-trigger state

3.2 Query subsystem

Handles live reads.

Responsibilities:

  • exact lookup by event / topic / node / snapshot id
  • TopicIndex traversal on active snapshot
  • reference resolution
  • fresh-event retrieval from raw event log
  • rooted query tree assembly
  • masked tree-walker orchestration for query_references

3.3 Builder subsystem

Handles async memory construction.

Responsibilities:

  • read new events since last snapshot
  • capture an event-log high-water mark when a build starts
  • update staging structures
  • summarize internally derived topics
  • create / update topic hierarchies
  • merge topic subtrees into larger abstractions
  • rebuild indices and caches
  • validate candidate snapshot
  • publish for blue/green swap

3.4 Snapshot registry

Handles snapshot lifecycle.

Responsibilities:

  • track active snapshot
  • track staging snapshot
  • version snapshots
  • support rollback
  • expose build status

3.5 LLM orchestration

Handles constrained LLM-assisted memory workflows.

Responsibilities:

  • manage model configuration and prompt templates
  • expose typed SQL tools to the model
  • validate model tool arguments before execution
  • bound tool-call rounds, time, and token budgets
  • reject unsafe or out-of-policy SQL actions
  • produce structured outputs for builder plans, summaries, merge/split proposals, and tree-walk actions
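The bounding responsibilities above can be sketched as a deterministic pre-model gate. This is an illustrative sketch, not the shipped orchestrator; the budget shape, function names, and cursor convention are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// inspectBudget bounds every LLM-visible inspection (see Invariant 10).
type inspectBudget struct {
	MaxDepth   int
	MaxBreadth int
	MaxBytes   int
}

// boundChildren returns a deterministic, truncated child listing plus a
// continuation cursor when results are cut (see Invariant 11). Ordering
// is by node id, so pagination stays stable for a pinned snapshot.
func boundChildren(childIDs []string, b inspectBudget, cursor int) (page []string, next int) {
	sorted := append([]string(nil), childIDs...)
	sort.Strings(sorted)
	end := cursor + b.MaxBreadth
	if end >= len(sorted) {
		return sorted[cursor:], -1 // -1: nothing left, no continuation cursor
	}
	return sorted[cursor:end], end
}

func main() {
	kids := []string{"node_c", "node_a", "node_b"}
	page, next := boundChildren(kids, inspectBudget{MaxBreadth: 2}, 0)
	fmt.Println(page, next) // [node_a node_b] 2
}
```

The model never sees the unsorted, untruncated listing; it only ever receives the page plus the cursor.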

4. Fundamental design invariants

These are non-negotiable.

Invariant 1

The event log is the only source of truth.

Everything in active or staging memory must be reconstructible from events.


Invariant 2

Active memory is immutable during live serving.

Live clients never see partial or in-place semantic mutation.


Invariant 3

Staging memory is invisible to live clients.

No live request can read staging.


Invariant 4

Snapshot swap is atomic.

A request sees exactly one snapshot for its entire lifetime.


Invariant 5

Every semantic node must reference event ids.

No orphan summaries.


Invariant 6

Topic hints are hints, not truth.

They support freshness and retrieval fallback, but do not replace semantic structure.


Invariant 7

Working memory is a rooted query tree, never storage.

Myria stores truth and structure. Clients consume bounded rooted trees.


Invariant 8

Participant visibility is monotonic down the tree.

If a child node changes participant scope, it only narrows it. Formally, child.participants ⊆ parent.participants.


Invariant 9

Participant scope may never widen in place.

If an index node must expand its participant list, the affected structure must be rebuilt rather than mutated in place.


Invariant 10

Every LLM-visible inspection must be bounded.

Any subtree, ref, or event inspection shown to an internal LLM must be bounded by depth, breadth, and byte or token budget.


Invariant 11

Bounded inspections must be deterministically ordered.

If a result is truncated, ordering and pagination must remain stable for the lifetime of the pinned snapshot or build task.


Invariant 12

Raw event payloads are not part of structural inspection by default.

Structural inspection exposes summaries, counts, and metadata first. Full event payloads require a separate explicit expansion step.


Invariant 13

Every LLM workflow turn is snapshot-pinned and participant-masked.

The model only inspects data that has already been bounded and masked for the current workflow scope.


5. Event model

All writes enter Myria as canonical events.

5.1 Event shape

Each event minimally contains:

{
  "timestamp": "2026-03-19T20:00:00Z",
  "channel": "telegram",
  "participants": ["user_42", "agent_1"],
  "source_event_key": "telegram:chat_7:msg_99102",
  "context_id": "ctx_group_7",
  "type": "message",
  "payload": {
    "text": "Let's design the memory system"
  },
  "topic_hints": [
    {"hint": "memory_system", "confidence": 0.92},
    {"hint": "system_design", "confidence": 0.61}
  ],
  "internal": false
}

5.2 Event requirements

  • event_id is generated by Myria at ingest time and must be globally unique
  • ingestion idempotency must be keyed by a separate caller-supplied dedupe key such as source_event_key
  • when source_event_key is present, idempotency scope is (channel, source_event_key)
  • if no dedupe key is supplied, retries can append distinct events with distinct event_ids
  • Myria also assigns a monotonically increasing event_seq used for build boundaries and stable append ordering
  • payload is opaque to storage, interpreted by consumers
  • channel identifies where the event came from, or where an agent emitted it
  • participants is the immutable visibility scope for the event
  • callers can supply an internal hint, but the persisted internal field is the service-normalized effective visibility flag after ingest policy is applied
  • source_event_key, context_id, type, topic_hints, and other semantic annotations are emitter-supplied hints rather than trusted facts except where explicit ingest policy promotes them to idempotency inputs
  • each topic hint contains a string label plus a confidence in the inclusive range [0.0, 1.0]
  • multiple topic hints can be attached to the same event
  • event order is canonicalized through timestamp + event_seq

Trusted event facts are limited to:

  • participants
  • channel
  • timestamp
  • payload

All other event metadata is treated as provisional emitter guesses unless produced or validated by Myria itself.

The frontend or client layer is expected to map external <channel, account> tuples onto internal user identities before ingestion. Myria stores the participant identities it is given; it does not own cross-channel account resolution.


6. Topic model

Myria supports long chats by partitioning events by logical topic.

A topic is the only first-class partition over events.

There are no session partitions in Myria. There is no parallel conversation-boundary partition model.

6.1 Derived topic properties

An internally derived topic contains:

  • system-assigned topic_id
  • optional context_id hint carried forward from ingest
  • participant scope
  • state: active / dormant / archived
  • derived label / summary
  • optional parent topic
  • event membership

6.2 Topic formation boundaries

Derived topics are created or forked by:

  • logical topic shift
  • explicit task boundary
  • context change
  • explicit marker event

6.3 Topic purpose

Derived topics exist to:

  • reduce retrieval noise
  • create summarization boundaries
  • define local conversation context
  • serve as the base unit of memory compilation

Derived topics are the primary logical partitioning model. Higher-order topic hierarchies can still be built above them in snapshots.


7. TopicIndex model

Myria’s semantic memory layer is called TopicIndex.

TopicIndex is:

a versioned, hierarchical semantic index over the event log

It is built from:

  • derived topic summaries
  • topic nodes
  • topic-hierarchy merges
  • event references

7.1 Core idea

Topics start small and local, then expand by merging topic nodes.

So the structure is not “one global graph first.”
It is:

  • local derived topics
  • topic hierarchies
  • parent topic nodes created over time
  • eventually, an evolving forest of mergeable hierarchies

7.2 TopicIndex node types

TopicIndex does not use a rich semantic node taxonomy.

Tasks, concepts, planning structure, and conversational themes are all represented as topics through summary and metadata, not through separate node kinds.

For the shipped system, every index node is one of exactly two storage node types:

  • internal_topic: an internal topic node
  • leaf_topic: a leaf topic node

Rules:

  • internal topic nodes can have child index nodes, but never directly reference events
  • leaf topic nodes can reference events, but never own child index partitions beneath them
  • leaf topic nodes contain at least two event references whenever the eligible event set permits it
  • singleton leaves are allowed only when no valid merge alternative exists at build time

7.3 TopicIndex node fields

Each node contains:

  • node_id
  • node_type
  • is_leaf
  • participants
  • summary
  • metadata
  • refs -> event_ids
  • timestamps

Snapshot membership, parent/child edges, and traversal caches are stored alongside nodes rather than inside the canonical node body.

Field constraints:

  • node_id is a SHA-256 hash over the canonical identity body with tree pointers excluded
  • internal nodes must have zero direct event refs
  • leaf nodes are the only nodes allowed to carry direct event refs
  • parent ids, child ids, edge positions, traversal caches, and other tree-pointer fields must not participate in the node_id hash
  • summary text and presentation-oriented metadata do not participate in the node_id hash and can be refreshed without changing node identity
  • any change to hashed identity fields creates a new node rather than mutating the old node in place
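A minimal sketch of the content-addressing rule, assuming JSON over a fixed-order struct as the canonical serialization (the real encoding may differ). Tree pointers, summary text, and presentation metadata never enter the hash, so input ordering must be normalized before hashing.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// identityBody is the hashed portion of a node. Parent/child pointers,
// summary text, and presentation metadata are deliberately absent.
type identityBody struct {
	SchemaVersion int      `json:"schema_version"`
	NodeType      string   `json:"node_type"` // "internal_topic" | "leaf_topic"
	Participants  []string `json:"participants"`
	Refs          []string `json:"refs"` // event_ids (leaf) or descendant leaf node_ids (internal)
}

// nodeID hashes a canonical serialization: fixed field order, with
// member lists sorted so input ordering cannot change identity.
func nodeID(nodeType string, participants, refs []string) string {
	b := identityBody{SchemaVersion: 1, NodeType: nodeType}
	b.Participants = append([]string(nil), participants...)
	b.Refs = append([]string(nil), refs...)
	sort.Strings(b.Participants)
	sort.Strings(b.Refs)
	enc, _ := json.Marshal(b) // struct field order is deterministic in Go
	sum := sha256.Sum256(enc)
	return hex.EncodeToString(sum[:])
}

func main() {
	a := nodeID("leaf_topic", []string{"user_42", "agent_1"}, []string{"evt_2", "evt_1"})
	b := nodeID("leaf_topic", []string{"agent_1", "user_42"}, []string{"evt_1", "evt_2"})
	fmt.Println(a == b) // true: same identity regardless of input order
}
```

Because summaries are outside the hash, a summary refresh keeps the same node_id; changing refs or participants yields a new node, matching the no-in-place-mutation rule.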

7.4 Event references

Every node carries provenance:

  • a node references one or more event_ids
  • merged parent nodes usually reference the union of child refs
  • summary quality can later be revalidated against source refs
  • if a parent and child differ in participant scope, the child scope must be a subset of the parent scope

This is what keeps Myria honest.


8. Freshness bridge: topic-indexed event log

Snapshots are intentionally stale compared to real-time events.

To bridge this lag, Myria also retrieves directly from the event log using participant scope, payload filters, and optional emitter-supplied weighted topic hints.

This creates a dual retrieval path:

  • structured path: active TopicIndex snapshot
  • fresh path: recent raw events since snapshot cutoff

8.1 Why this exists

Without it, a new event would be invisible until the next rebuild.

That would make the client “forget” recently heard information.


8.2 Retrieval rule

For a memory query:

  1. query active TopicIndex snapshot
  2. query event log for events newer than the snapshot cutoff using query filters plus optional weighted topic hints
  3. merge both into one rooted result tree
  4. dedupe by event_id
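The four-step rule can be sketched as a merge where snapshot results take precedence and the fresh tail fills in anything newer. Field names are illustrative.

```go
package main

import "fmt"

// resultEvent is one retrieved event, from either retrieval path.
type resultEvent struct {
	EventID string
	Fresh   bool // true when it came from the raw event-log tail
}

// mergeDual combines the structured path and the fresh path:
// snapshot results first, then fresh-tail events, deduped by event_id.
func mergeDual(snapshot, freshTail []resultEvent) []resultEvent {
	seen := make(map[string]bool)
	var out []resultEvent
	for _, e := range append(append([]resultEvent(nil), snapshot...), freshTail...) {
		if seen[e.EventID] {
			continue // already covered, usually by the snapshot copy
		}
		seen[e.EventID] = true
		out = append(out, e)
	}
	return out
}

func main() {
	snap := []resultEvent{{EventID: "evt_1"}, {EventID: "evt_2"}}
	tail := []resultEvent{{EventID: "evt_2", Fresh: true}, {EventID: "evt_3", Fresh: true}}
	fmt.Println(len(mergeDual(snap, tail))) // 3
}
```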

9. Snapshot lifecycle

Myria memory is not updated in place.
It is recompiled asynchronously into snapshots.


9.1 Snapshot states

A snapshot can be:

  • staging
  • active
  • archived
  • failed

9.2 Snapshot metadata

Each snapshot stores:

  • snapshot_id
  • parent_snapshot_id
  • event range covered
  • event-log high-water mark captured at build start
  • builder version
  • summarizer version
  • merge policy version
  • build start/end time
  • validation status

9.3 Swap model

Myria uses blue/green deployment semantics.

  • active snapshot = blue
  • newly built snapshot = green
  • after validation, pointer flips atomically
  • old active is kept for rollback
  • each request pins one snapshot for its full lifetime
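The flip can be sketched in-memory. In the real service it is a single-row transactional pointer update in PostgreSQL; this sketch only shows the semantics (validated green promotes, old blue retained for rollback), and the type names are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// registry models the snapshot pointer: one active id, one rollback id.
type registry struct {
	mu       sync.Mutex
	active   string
	rollback string
	valid    map[string]bool // validation status per snapshot
}

// promote atomically flips the active pointer to a validated green
// snapshot and keeps the old blue snapshot for rollback.
func (r *registry) promote(green string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !r.valid[green] {
		return errors.New("snapshot failed validation; pointer not moved")
	}
	r.rollback = r.active
	r.active = green
	return nil
}

func main() {
	r := &registry{active: "mem_v41", valid: map[string]bool{"mem_v42": true}}
	if err := r.promote("mem_v42"); err != nil {
		panic(err)
	}
	fmt.Println(r.active, r.rollback) // mem_v42 mem_v41
}
```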

10. Build plane vs serving plane

Myria memory is split into two planes.

10.1 Serving plane

Used by live clients.

Properties:

  • low latency
  • stable
  • read-only
  • deterministic

10.2 Build plane

Used by async builder.

Properties:

  • can be slower
  • can restructure
  • can resummarize
  • can merge trees
  • can rebuild traversal metadata

This supports “sleep-phase” consolidation without destabilizing live chat behavior.


11. Builder trigger model

TopicIndex rebuild scheduling is global.

Topic tagging does not control build triggering or build boundaries. Triggering depends only on trusted global event-log facts.

11.1 Trigger paths

A rebuild starts through exactly three paths:

  • manual admin-triggered rebuild through MCP
  • global inactivity timeout after no new events for N seconds
  • global tail-threshold trigger after N non-indexed events accumulate past the active snapshot cutoff

11.2 Build boundary

When a rebuild starts, the builder captures the current event-log high-water mark.

That captured high-water mark is the maximum committed event_seq at build start.

Rules:

  • only events at or before the captured event_seq high-water mark participate in the current rebuild
  • newer events appended during the rebuild do not participate in the in-flight staging snapshot
  • newer events remain in the fresh tail and can still be returned through read queries
  • the published snapshot records the captured event_seq high-water mark it covers

This is a logical end-pointer lock, not a write lock on event ingest.


11.3 Build pipeline

After the high-water mark is captured, the builder constructs the next snapshot from all eligible events up to that boundary.

Stages:

  • load the active snapshot baseline
  • load the bounded event-log slice from the previous event_seq cutoff up to the new high-water mark
  • deterministically partition the eligible input by exact participants set
  • mask each build task to only the events and nodes visible to that exact participant set
  • invoke the builder LLM on each masked build task to propose topic grouping, summaries, node splits, node merges, and leaf ref placement
  • deterministically validate the returned build plan
  • regenerate topic summaries
  • refresh refs
  • derive or refresh internal topic assignments where needed
  • attach topic summaries into topic hierarchies within the same exact participant set
  • split or merge leaf topic nodes so event refs live only in leaves and singleton leaves exist only when unavoidable
  • rebuild traversal metadata
  • validate the staging snapshot
  • publish atomically if validation passes and publish policy allows it

In v1, the builder only creates subtrees whose nodes all share the same exact participant set.

That means:

  • no broad-scope parent is built over narrower child scopes
  • no cross-scope aggregation is allowed inside one subtree
  • any future cross-scope abstraction would require a separate design with deterministic redaction rules

11.4 Publish

Write the staging snapshot to registry and atomically promote it.

The active snapshot pointer moves only after validation succeeds. Events appended after the captured high-water mark are excluded from that published snapshot and remain available through the fresh-event path until a later rebuild covers them.


12. Validation rules

A snapshot cannot be activated until it passes validation.

Validation checks:

Structural

  • all refs resolve
  • no orphan child nodes
  • no invalid parent relationships
  • no missing roots
  • internal nodes contain no direct event refs
  • event refs appear only in leaf nodes
  • leaf nodes contain at least two event refs whenever a non-singleton layout is possible

Provenance

  • all summaries have refs
  • merged nodes preserve child coverage

Scope

  • no cross-user leakage
  • no private refs in shared-only nodes unless policy allows it

Quality

  • no collapse into one giant root
  • no explosion into too many useless leaves
  • summary size and depth remain bounded
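Two of the structural checks can be sketched as follows. This is a partial sketch under assumed type names; real validation covers all four categories above.

```go
package main

import "fmt"

// vNode is the minimal node view the structural validator needs.
type vNode struct {
	ID       string
	IsLeaf   bool
	Children []string // snapshot-scoped child node ids
	Refs     []string // direct event refs
}

// checkStructure enforces two structural rules: internal nodes carry no
// direct event refs, and leaf nodes own no child partitions.
func checkStructure(nodes []vNode) []string {
	var violations []string
	for _, n := range nodes {
		if !n.IsLeaf && len(n.Refs) > 0 {
			violations = append(violations, n.ID+": internal node has direct event refs")
		}
		if n.IsLeaf && len(n.Children) > 0 {
			violations = append(violations, n.ID+": leaf node owns child partitions")
		}
	}
	return violations
}

func main() {
	nodes := []vNode{
		{ID: "n1", IsLeaf: false, Children: []string{"n2"}},
		{ID: "n2", IsLeaf: true, Refs: []string{"evt_1", "evt_2"}},
		{ID: "n3", IsLeaf: false, Refs: []string{"evt_9"}}, // invalid
	}
	fmt.Println(len(checkStructure(nodes))) // 1
}
```

A snapshot with any violation stays in the failed state and the active pointer does not move.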

13. SQL-backed storage model

Everything in Myria can be backed by SQL.

That includes:

  • append-only event log
  • topic metadata
  • topic-indexed events
  • TopicIndex nodes
  • node-child relations
  • node-event refs
  • snapshots
  • snapshot registry

SQL is the persistence substrate.
Traversal and meaning remain in the service layer.

For v1, the SQL target is PostgreSQL. Other databases are out of scope.

PostgreSQL is used for:

  • transactional ingest
  • snapshot metadata and atomic active-pointer updates
  • JSONB-backed event payloads and metadata
  • relational joins for refs, topics, and nodes
  • indexed fresh-event lookups

There is no premature cross-database abstraction in v1. The storage layer can hide driver details, but the schema and query design remain intentionally PostgreSQL-native.


13.1 PostgreSQL notes for v1

The v1 PostgreSQL layout leans on native features instead of simulating a lowest-common-denominator SQL subset.

PostgreSQL features used in v1:

  • JSONB for event payloads, node metadata, and weighted topic-hint arrays
  • transactional INSERT ... ON CONFLICT for idempotent ingest on a partial unique key over (channel, source_event_key) when source_event_key is not null
  • partial and composite indexes for active-snapshot and fresh-event reads
  • advisory locking or equivalent serialized publish coordination for snapshot promotion
  • schema metadata with an explicit version check at startup

PostgreSQL features not required in v1:

  • pgvector
  • stored procedures as the primary application logic layer
  • cross-database compatibility promises
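The idempotent-ingest semantics can be sketched in-memory. The SQL in the comment is one possible PostgreSQL shape for the partial unique key, not the shipped schema; the Go names are illustrative.

```go
package main

import "fmt"

// One possible PostgreSQL shape for the dedupe key:
//
//   CREATE UNIQUE INDEX events_dedupe
//     ON events (channel, source_event_key)
//     WHERE source_event_key IS NOT NULL;
//
//   INSERT INTO events (...) VALUES (...)
//   ON CONFLICT (channel, source_event_key)
//     WHERE source_event_key IS NOT NULL
//   DO NOTHING;
//
// This in-memory sketch mirrors those semantics.
type ingestLog struct {
	nextSeq int64
	seen    map[[2]string]int64 // (channel, source_event_key) -> event_seq
	events  []int64
}

// append returns the event_seq and whether a new event was stored.
// Events without a dedupe key always append, so retries may duplicate.
func (l *ingestLog) append(channel, sourceEventKey string) (int64, bool) {
	if sourceEventKey != "" {
		if seq, ok := l.seen[[2]string{channel, sourceEventKey}]; ok {
			return seq, false // duplicate delivery, no new row
		}
	}
	l.nextSeq++
	l.events = append(l.events, l.nextSeq)
	if sourceEventKey != "" {
		l.seen[[2]string{channel, sourceEventKey}] = l.nextSeq
	}
	return l.nextSeq, true
}

func main() {
	l := &ingestLog{seen: make(map[[2]string]int64)}
	l.append("telegram", "telegram:chat_7:msg_99102")
	_, fresh := l.append("telegram", "telegram:chat_7:msg_99102") // retry
	fmt.Println(fresh, len(l.events)) // false 1
}
```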

13.2 Core tables

events

Stores raw truth.

Fields:

  • event_seq
  • event_id
  • source_event_key
  • timestamp
  • context_id
  • channel
  • participants
  • internal
  • payload
  • topic_hints

topics

Stores internally derived topic metadata.

Fields:

  • topic_id
  • context_id
  • participants
  • state
  • label
  • created_at
  • last_active_at
  • parent_topic_id

snapshots

Stores snapshot registry.

Fields:

  • snapshot_id
  • parent_id
  • event_max_seq
  • status
  • builder_version
  • merge_policy_version
  • created_at

nodes

Stores global content-addressed TopicIndex node bodies.

Fields:

  • node_id
  • node_type
  • is_leaf
  • participants
  • summary
  • metadata
  • created_at

node_id is derived from a canonical serialization of the node identity body, such as:

  • node_type
  • sorted participants
  • sorted direct event_ids for leaf nodes
  • sorted descendant leaf node_ids for internal nodes
  • an explicit node-schema version tag

and must exclude:

  • summary text
  • presentation-only metadata
  • parent ids
  • child ids
  • edge positions
  • snapshot-local traversal caches

snapshot_nodes

Stores snapshot membership for global nodes.

Fields:

  • snapshot_id
  • node_id

node_children

Stores snapshot-scoped tree edges between global nodes.

Fields:

  • snapshot_id
  • parent_id
  • child_id
  • position

node_refs

Stores provenance.

Only leaf nodes may appear here.

Fields:

  • node_id
  • event_id
  • weight

14. Retrieval model

For normal conversations, the agent is expected to use exactly three Myria tools:

  • append_event
  • query_event_nodes
  • query_references

The write path appends canonical events. The two read tools expose raw-event retrieval and reference-tree retrieval.


14.1 query_event_nodes

Used for:

  • fetching raw event nodes for a request
  • returning direct event data in a rooted tree
  • returning fresh or indexed event leaves without requiring summary traversal

This is the direct event-data tool in normal conversation.


14.2 query_references

Used for:

  • returning a rooted tree of references to event nodes
  • attaching summaries and topic structure above those references
  • exposing semantic organization for a subtree or root selection

Each query_references call carries a participants list that defines the visibility scope of the caller.

This is the summary/reference-tree tool, not the direct raw-event retrieval tool.


14.3 Rooted tree response contract

All read queries return one rooted multi-child tree.

Rules:

  • a shallow list is represented as a synthetic root node with direct children
  • a deep semantic result is represented as a normal tree
  • a hybrid response can mix semantic branches and direct event children under one root
  • the response shape is stable regardless of whether results came from TopicIndex, fresh events, or both
  • every returned node must carry its own participants list
  • if scope changes while walking downward, it only narrows
  • a query only reads nodes whose participant set is a superset of the query participant set

At minimum, tree nodes declare a kind such as:

  • root
  • internal_topic
  • leaf_topic
  • event_ref
  • event

Visibility rule:

  • a node is visible only if query.participants ⊆ node.participants
  • broader queries do not see narrower-scope nodes
  • because child scope only stays the same or narrows, invisible subtrees can be pruned immediately
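The visibility rule and early pruning can be sketched as one recursive walk; because scope is monotonic down the tree, a failed check hides the whole subtree. Type names are illustrative.

```go
package main

import "fmt"

// subset reports whether every member of a appears in b.
func subset(a, b []string) bool {
	set := make(map[string]bool, len(b))
	for _, p := range b {
		set[p] = true
	}
	for _, p := range a {
		if !set[p] {
			return false
		}
	}
	return true
}

// treeNode is the minimal shape needed for masking.
type treeNode struct {
	Participants []string
	Children     []*treeNode
}

// prune drops invisible subtrees. Because child scope only stays the
// same or narrows (Invariant 8), an invisible node implies an entirely
// invisible subtree, so the walk stops at the first failing check.
func prune(n *treeNode, query []string) *treeNode {
	if !subset(query, n.Participants) {
		return nil // query.participants is not a subset of node.participants
	}
	kept := n.Children[:0:0] // fresh slice; avoids aliasing the input
	for _, c := range n.Children {
		if v := prune(c, query); v != nil {
			kept = append(kept, v)
		}
	}
	n.Children = kept
	return n
}

func main() {
	root := &treeNode{
		Participants: []string{"user_42", "agent_1"},
		Children: []*treeNode{
			{Participants: []string{"user_42", "agent_1"}},
			{Participants: []string{"user_42"}}, // narrower: hidden from broader queries
		},
	}
	masked := prune(root, []string{"user_42", "agent_1"})
	fmt.Println(len(masked.Children)) // 1
}
```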

14.4 Deterministic serving vs LLM-assisted workflows

The live serving path must remain understandable and bounded.

For the shipped system:

  • snapshot selection, participant masking, raw-event retrieval, rooted-tree assembly, SQL execution, validation, and snapshot publish are deterministic Go code
  • the internal LLM assists topic grouping, topic summarization, subtree shaping, reference-walk branch selection, and offline analysis
  • the internal LLM queries PostgreSQL only through service-managed typed tools
  • all internal inspection tools must enforce depth, breadth, and byte or token limits before data reaches the model
  • if an LLM workflow fails, times out, or exhausts budget, serving must fall back to a deterministic degraded mode
  • the public MCP contract must not require clients to orchestrate the LLM themselves

This keeps the public behavior stable while still allowing model-driven memory construction where it is useful.


15. Query pipeline

For a normal conversation read:

Step 1

Resolve active snapshot.

Step 2

Resolve query selectors:

  • participant scope
  • topic hint
  • optional context_id hint
  • payload / text filters
  • internal-visibility policy

Step 3

Choose read mode:

  • query_event_nodes for raw event-node retrieval
  • query_references for LLM-guided reference-tree retrieval with attached summaries

Step 4

Load candidate data:

  • fetch event nodes directly from the event log when serving query_event_nodes
  • load the visible snapshot roots, visible node summaries, and fresh-tail metadata when serving query_references
  • include fresh tail events newer than the snapshot cutoff when policy allows

Step 5

Apply filters:

  • internal topic selectors
  • optional weighted topic hints
  • query participants
  • internal-visibility policy

Step 6

Execute read strategy:

  • query_event_nodes stays fully deterministic and fetches matching raw events directly
  • query_references invokes a tree-walker LLM that can only see already-masked node summaries, metadata, and fresh-tail candidates
  • the tree-walker requests bounded expansions such as child inspection, leaf ref loading, and fresh-tail event fetches
  • every expansion is re-masked and revalidated by deterministic Go code before the next LLM turn
  • subtree inspection must use deterministic ordering plus continuation cursors when results are truncated
  • if the tree-walker fails or exhausts budget, return a deterministic bounded tree built from the currently loaded visible frontier plus fresh-tail matches

Step 7

Assemble rooted result tree:

  • one synthetic or semantic root
  • summary/reference nodes where applicable
  • dedupe by event_id or node_id as appropriate
  • prune invisible branches before return
  • raw event nodes where applicable
  • per-node participants
  • constraints / scope

16. Query tree model

Myria exposes a first-class rooted query tree.

A rooted query tree is the client-facing read package.

Example:

{
  "snapshot_id": "mem_v42",
  "root": {
    "kind": "root",
    "label": "query_result",
    "participants": ["user_42", "agent_1"],
    "children": [
      {
        "kind": "leaf_topic",
        "node_id": "node_topic_memory_system",
        "label": "memory_system",
        "participants": ["user_42", "agent_1"],
        "children": [
          {"kind": "event_ref", "event_id": "evt_000001", "participants": ["user_42", "agent_1"]},
          {"kind": "event_ref", "event_id": "evt_000002", "participants": ["user_42"]}
        ]
      },
      {
        "kind": "event",
        "event_id": "evt_fresh_001",
        "participants": ["user_42", "agent_1"],
        "internal": false
      }
    ]
  },
  "constraints": {
    "participants": ["user_42"]
  }
}

Clients use this instead of manually stitching together low-level reads.


17. Topic hint model

Emitters can provide best-effort topic_hints at ingest time. Myria treats them as weak signals rather than canonical structure.

17.1 Purpose

  • support fresh retrieval before snapshot rebuild
  • provide cheap early indexing
  • support fresh-tail grouping before snapshot rebuild
  • aid later merge planning

17.2 Properties

  • optional
  • zero or more hints can be attached to a single event
  • each hint contains a string label plus a confidence in [0.0, 1.0]
  • fast to compute
  • not authoritative

17.3 Evolution

  • shipped system: weighted hint arrays
  • v2: normalized internally derived topic ids
  • v3: alignment with topic tree nodes

18. Scope and masking

Myria uses participant lists as its primary visibility scope.

Rules:

  • every event has a participants list
  • every index node has a participants list
  • every query_references call has a participants list
  • a node is visible only if query.participants ⊆ node.participants
  • broader queries cannot see narrower-scope nodes

Tree constraints:

  • if a child differs from its parent in participant scope, then child.participants ⊆ parent.participants
  • if a node must expand its participant scope, it must be rebuilt rather than widened in place
  • in v1, builder-created subtrees must use one exact participant set throughout, so cross-scope parent/child structure is disallowed

Retrieval must apply participant masking before rooted-tree assembly.

LLM workflows must also apply participant masking before every model turn.

The builder LLM and the tree-walker LLM only receive masked inputs whose participants sets are compatible with the current workflow scope.

This prevents:

  • narrower-scope one-on-one memory from leaking into broader multi-user queries
  • silent widening of visibility in existing TopicIndex structure

19. Client interaction model

Myria is a standalone memory service, not the application’s orchestrator.

The client application:

  • ingests canonical events into Myria
  • queries Myria through query_event_nodes and query_references
  • optionally uses exact lookup or observability tools outside normal conversation
  • never reads staging
  • never mutates active snapshot directly
  • maps external <channel, account> identities to internal participant identities before ingest when needed

20. MCP exposure

Myria is exposed as an MCP service.

That gives:

  • client independence
  • language independence
  • deployment as a standalone memory server
  • consistent contracts

For v1, MCP is the primary public interface. A private in-process API may exist inside the Go codebase, but external consumers integrate through MCP tools rather than a bespoke RPC contract.


20.1 Agent-facing MCP tools

These are the normal-conversation tools:

myria.append_event

Append one canonical event.

myria.query_event_nodes

Return a rooted multi-child tree of raw event nodes filtered by the caller’s participant scope.

myria.query_references

Return a rooted multi-child tree of references to event nodes, with attached summaries and topic structure when applicable, filtered by the caller’s participant scope.


20.2 Auxiliary MCP tools

These are optional tools for exact lookup, observability, or admin:

myria.get_event

Fetch exact event.

myria.get_topic

Fetch exact internally derived topic metadata.

myria.get_active_snapshot

Observability.

myria.list_snapshots

Observability / admin.

myria.get_snapshot_status

Observability / admin.


20.3 MCP contract expectations

The MCP surface follows a strict typed-tool style.

Rules:

  • each tool accepts a structured JSON input object
  • each tool returns a structured JSON result object
  • errors are machine-readable and include stable codes
  • snapshot-stable reads return the pinned snapshot_id
  • admin-only tools remain separable from read/write tools by policy
  • all read tools return one rooted multi-child tree
  • internal LLM-facing inspection tools must expose deterministic truncation metadata and continuation cursors
  • degraded results are explicitly flagged in structured response metadata when deterministic fallback was used

The main contract is:

  • append canonical events
  • query raw event-node trees
  • query summary/reference trees
  • observe exact entities and snapshot lifecycle when needed

21. Concurrency model

Myria itself may be concurrent internally, but serving semantics must remain snapshot-stable.

Rules:

  • queries pin one active snapshot id at start
  • staging writes never affect in-flight queries
  • event ingestion is idempotent
  • snapshot swap is atomic
  • final publish is serialized
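Snapshot pinning can be sketched with an atomic pointer: a request loads the active id exactly once and keeps using that value, so a publish mid-request never mixes two snapshots. This is illustrative; the shipped service may pin differently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// activePtr holds the current active snapshot id. Publish replaces it
// atomically; in-flight requests keep their previously loaded value.
var activePtr atomic.Value

// pinSnapshot is called once at the start of a request.
func pinSnapshot() string {
	return activePtr.Load().(string)
}

func main() {
	activePtr.Store("mem_v41")
	pinned := pinSnapshot()    // request begins: pin once
	activePtr.Store("mem_v42") // publish happens mid-request
	// The in-flight request keeps reading mem_v41; new requests see mem_v42.
	fmt.Println(pinned, pinSnapshot()) // mem_v41 mem_v42
}
```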

22. Implementation Language

The production service is implemented in a natively compiled language.

Language: Go

Reason:

  • great for service orchestration
  • easy MCP server implementation
  • good concurrency primitives
  • simple deployment
  • fast enough for this workload

Myria ships as its own Go project and executable, with its MCP server, PostgreSQL integration, builder workers, and LLM orchestration living in the same codebase.


22.1 Configuration model

Myria configuration must include:

  • service identity and logging
  • stdio MCP server settings
  • PostgreSQL connection settings
  • a path to the structured tool-calling LLM request template JSON
  • builder and snapshot-publish policy

Configuration shape:

service:
  name: myria
  log_level: info
  log_file: ./myria.log

mcp:
  transport: stdio

postgres:
  dsn: postgres://myria:secret@127.0.0.1:5432/myria?sslmode=disable
  schema: myria
  max_open_conns: 20
  max_idle_conns: 5
  migrate_on_start: false

llm:
  provider: openrouter
  request_template_path: ./config/openrouter-llm.template.json
  timeout: 30s
  max_tool_rounds: 8
  temperature: 1.0
  top_p: 0.95
  structured_tool_calling_required: true

builder:
  inactivity_timeout: 30s
  max_unindexed_events: 100
  auto_publish: true
  allow_manual_trigger: true

The configured model must support structured tool calls with typed arguments. Plain text-only models are out of scope.

For v1, the target platform is OpenRouter.

For v1, the builder LLM and tree-walker LLM target nvidia/nemotron-3-super-120b-a12b through OpenRouter.

For local testing, the current configured test target is nvidia/nemotron-3-super-120b-a12b:free.

The LLM call configuration lives in a JSON template on disk rather than environment variables. That template defines the outbound request shape used by Myria when calling OpenRouter.

Template:

{
  "model": "nvidia/nemotron-3-super-120b-a12b:free",
  "provider": {
    "sort": "throughput"
  },
  "temperature": 1.0,
  "top_p": 0.95,
  "response_format": {
    "type": "json_object"
  },
  "tools": [],
  "tool_choice": "auto",
  "usage": {
    "include": true
  },
  "headers": {
    "Authorization": "Bearer <openrouter-api-key>",
    "HTTP-Referer": "http://localhost",
    "X-Title": "myria"
  }
}

Myria injects dynamic fields such as messages, live tool schemas, and bounded execution settings at call time, but the provider target and base request template come from this JSON file.


22.2 LLM execution model

The configured LLM does not receive direct database credentials and does not execute arbitrary SQL on its own.

Instead:

  • the Go service opens the PostgreSQL connection
  • the service exposes typed SQL tools to the LLM
  • the LLM emits structured tool calls
  • the service validates and executes those tool calls
  • results are fed back to the LLM for the next step

This is a planner/executor split. The model proposes bounded actions; the service enforces policy, schema, and execution limits.
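The planner/executor split can be sketched as a bounded loop. The Planner interface and executor callback here are hypothetical stand-ins for the LLM client and the typed SQL tools; the essential properties shown are that the service executes every proposed call itself and enforces the round budget.

```go
package main

import "fmt"

// ToolCall is one structured action proposed by the model.
type ToolCall struct{ Name, Args string }

// Planner abstracts the LLM: given the previous round's results, it
// proposes the next tool calls, or none when it decides to stop.
type Planner interface {
	NextCalls(results []string) []ToolCall
}

// runWorkflow drives one planner/executor loop under a round budget.
// exec stands in for service-side validation plus SQL execution.
func runWorkflow(p Planner, exec func(ToolCall) (string, error), maxRounds int) error {
	var results []string
	for round := 0; round < maxRounds; round++ {
		calls := p.NextCalls(results)
		if len(calls) == 0 {
			return nil // model decided to stop within budget
		}
		results = results[:0]
		for _, c := range calls {
			out, err := exec(c)
			if err != nil {
				return err
			}
			results = append(results, out)
		}
	}
	return fmt.Errorf("tool-round budget exhausted after %d rounds", maxRounds)
}

// twoRoundPlanner is a toy planner that stops after two rounds.
type twoRoundPlanner struct{ n int }

func (p *twoRoundPlanner) NextCalls(_ []string) []ToolCall {
	if p.n >= 2 {
		return nil
	}
	p.n++
	return []ToolCall{{Name: "inspect_subtree", Args: `{"root":"r"}`}}
}

func main() {
	err := runWorkflow(&twoRoundPlanner{}, func(c ToolCall) (string, error) {
		return "ok:" + c.Name, nil
	}, 8)
	fmt.Println(err) // <nil>
}
```

Exhausting the budget returns an error rather than looping, which is what triggers the deterministic fallbacks described below in this section.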

For v1, there are two internal LLM-controlled workflows:

  • builder LLM operates on one exact-participant-set build task at a time using nvidia/nemotron-3-super-120b-a12b through OpenRouter and proposes topic grouping, summaries, node merges, node splits, and leaf ref placement
  • tree-walker LLM operates on one masked read request at a time using nvidia/nemotron-3-super-120b-a12b through OpenRouter and proposes which visible branches, leaves, or fresh-tail candidates to expand next

Deterministic responsibilities:

  • snapshot pinning and cutoff capture
  • participant masking before each LLM turn
  • typed tool exposure and SQL execution
  • structural validation of returned plans
  • enforcement of leaf-only refs and exact-participant-set subtree rules
  • deterministic fallback when builder or tree-walker workflows fail
  • final rooted-tree assembly and snapshot publish

LLM-controlled responsibilities:

  • topic grouping within a masked build task
  • summary and label generation
  • merge and split proposals inside one exact participant set
  • branch selection during query_references
  • stop / continue decisions under deterministic budget limits

Required internal LLM inspection tools:

  • inspect_subtree(root, depth, max_nodes, summaries_only, cursor) returns a deterministically ordered masked view of a subtree rooted at root
  • inspect_leaf_refs(root, max_refs, cursor) returns deterministically ordered event refs for one visible leaf topic node
  • fetch_events(event_ids, max_bytes) returns masked raw event payloads only for explicitly selected event ids

Inspection tool rules:

  • inspect_subtree must bound depth, breadth, and estimated bytes or tokens
  • inspect_subtree returns summaries, metadata, child counts, ref counts, truncation state, and continuation cursor before any raw payloads
  • inspect_leaf_refs must use stable ordering and stable pagination within the pinned snapshot or build task
  • fetch_events is a separate explicit payload expansion step and must not be implicit in structural inspection
  • builder-side inspection must also enforce deterministic limits on loaded events, loaded nodes, and tool-call rounds per build task
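A sketch of how the truncation-and-cursor rules could look in Go. The types and pagination scheme are illustrative assumptions; the invariant they demonstrate is that structural metadata, truncation state, and a stable continuation cursor come back before any raw payloads.

```go
package main

import "fmt"

// SubtreeView is the structural metadata returned before any payloads.
type SubtreeView struct {
	NodeID     string
	Summary    string
	ChildCount int
	RefCount   int
}

// InspectResult carries deterministic truncation metadata.
type InspectResult struct {
	Nodes      []SubtreeView
	Truncated  bool   // deterministic truncation state
	NextCursor string // stable continuation within the pinned snapshot
}

// inspectSubtree pages over a deterministically ordered node list,
// bounding the number of nodes returned per call.
func inspectSubtree(all []SubtreeView, maxNodes, cursor int) InspectResult {
	end := cursor + maxNodes
	if end >= len(all) {
		return InspectResult{Nodes: all[cursor:]}
	}
	return InspectResult{
		Nodes:      all[cursor:end],
		Truncated:  true,
		NextCursor: fmt.Sprintf("%d", end),
	}
}

func main() {
	all := make([]SubtreeView, 5)
	for i := range all {
		all[i].NodeID = fmt.Sprintf("n%d", i)
	}
	r := inspectSubtree(all, 2, 0)
	fmt.Println(len(r.Nodes), r.Truncated, r.NextCursor) // 2 true 2
}
```

Because the ordering and cursor are deterministic within the pinned snapshot, replaying the same calls yields the same pages, which keeps audit logs reproducible.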

Fallback rules:

  • if a builder task fails validation, times out, or exhausts budget, Myria falls back to a deterministic leaf-only build for that exact participant set
  • the deterministic leaf-only build groups events by exact participant set plus weighted topic-hint overlap when available, otherwise by bounded time windows
  • fallback leaf nodes use template-generated summaries rather than model-generated summaries
  • if a query_references walk fails validation, times out, or exhausts budget, Myria returns a deterministic bounded tree from the already-loaded visible frontier and fresh tail rather than failing the request outright

Required safety constraints:

  • max tool-call rounds per workflow
  • max rows and bytes returned per SQL tool call
  • explicit read-only vs write-capable tool separation
  • query allowlisting or shape validation for write paths
  • full audit logging of prompts, tool calls, and SQL statements
  • masking audit records for data shown to each LLM workflow

23. Internal module layout

Module split:

ingest

  • schema validation
  • topic classification
  • event append
  • dirty marker update

query

  • exact lookup
  • rooted query tree building
  • fresh-event query
  • ref resolution
  • deterministic serving logic for MCP reads
  • masked tree-walker orchestration for query_references

topicindex

  • node loading
  • traversal
  • adjacency caches
  • anchor resolution

builder

  • dirty scan
  • exact-participant-set build-task generation
  • masked builder LLM orchestration
  • summary generation
  • merge planning
  • snapshot assembly
  • validation

store

  • PostgreSQL connection management
  • migrations
  • transaction boundaries
  • query helpers and row decoding

llm

  • prompt construction
  • tool schema registration
  • tool-call loop execution
  • structured output validation
  • audit logging for model-assisted workflows

registry

  • snapshot state
  • active pointer
  • rollback

mcp

  • public tool surface
  • request/response schema validation
  • MCP auth/policy hooks if deployment requires them

24. Suggested package / binary layout

V1 ships as one standalone Go service binary:

  • myria — MCP server, PostgreSQL-backed core, builder workers, and LLM orchestration

Supporting binaries may be added later:

  • myria-cli — admin and debug utility
  • myria-migrate — explicit schema migration runner

The important v1 property is operational simplicity: one deployable service, one PostgreSQL database, one MCP surface.


25. Main failure modes

Over-merge

Everything collapses into one generic tree.

Fix:

  • stricter thresholds
  • fan-in limits
  • coverage checks

Under-merge

Too many tiny trees.

Fix:

  • consolidation pass
  • co-occurrence heuristics

Summary drift

Semantic nodes no longer match events.

Fix:

  • always keep refs
  • periodic full rebuilds

LLM workflow exhaustion

Builder or tree-walker exceeds time, round, or context budgets.

Fix:

  • deterministic bounded inspection tools
  • deterministic leaf-only builder fallback
  • deterministic bounded query fallback with explicit degraded result metadata

Snapshot lag

Fresh information unavailable.

Fix:

  • topic-indexed event log fallback

Privacy bleed

Private memory leaks into shared retrieval.

Fix:

  • scope masking in rooted-tree assembly

26. Why this design is strong

Myria avoids the main failure modes of naïve chat memory systems:

Instead of:

  • giant transcript stuffing
  • vector-only recall
  • in-place summary mutation
  • per-session silos

it gives you:

  • event truth
  • bounded rooted trees
  • stable active memory
  • async semantic recompilation
  • exact provenance
  • replayability
  • rollback safety

27. Canonical system statement

The best single-sentence definition is:

Myria is an MCP-exposed, SQL-backed, event-sourced memory system that compiles long conversational histories into versioned semantic TopicIndex snapshots while preserving fresh recall through topic-indexed event-log queries.

In v1, it ships as a standalone Go service backed by PostgreSQL and uses a structured tool-calling LLM for internal memory workflows.


28. Final compressed blueprint

Inputs

  • canonical event JSON
  • exact lookup query
  • query_event_nodes request
  • query_references request
  • service configuration including PostgreSQL and LLM settings

Internal layers

  • event log
  • active snapshot
  • staging snapshot

Internal processes

  • ingest
  • query
  • builder
  • registry
  • LLM-assisted memory workflows

Outputs

  • event acknowledgement
  • exact event / topic / snapshot data
  • semantic nodes
  • resolved refs
  • bounded rooted query tree
  • MCP tool results

Core philosophy

The event log is truth.
TopicIndex is compiled memory.
Active memory is stable.
Fresh recall comes from topic-indexed events.
Clients consume rooted trees, not transcripts.
