Link to implementation repo.
Inspired by PageIndex.
Myria is a production memory system for long-running, session-less conversational agents.
It is designed to solve one core problem:
How can agents be given stable, queryable, long-term memory over very long chats without forcing clients to read giant transcripts?
For v1, the design is fixed around four implementation choices:
- a standalone Go service
- a PostgreSQL persistence layer
- an internal structured tool-calling LLM for memory-building workflows that need guided interaction with SQL
- an MCP server surface as the public interface
Myria does this by combining:
- an append-only event log as ground truth
- a versioned semantic index built in the background
- a stable active snapshot for live retrieval
- a staging snapshot for async rebuilds
- topic-indexed fresh events to bridge snapshot lag
- an MCP server interface so any client can use it as infrastructure
Table of Contents
- 1. Core model
- 2. System responsibilities
- 3. Architectural decomposition
- 4. Fundamental design invariants
- 5. Event model
- 6. Topic model
- 7. TopicIndex model
- 8. Freshness bridge: topic-indexed event log
- 9. Snapshot lifecycle
- 10. Build plane vs serving plane
- 11. Builder trigger model
- 12. Validation rules
- 13. SQL-backed storage model
- 14. Retrieval model
- 15. Query pipeline
- 16. Query tree model
- 17. Topic hint model
- 18. Scope and masking
- 19. Client interaction model
- 20. MCP exposure
- 21. Concurrency model
- 22. Implementation Language
- 23. Internal module layout
- 24. Suggested package / binary layout
- 25. Main failure modes
- 26. Why this design is strong
- 27. Canonical system statement
- 28. Final compressed blueprint
1. Core model
Myria has three data layers.
1.1 Event Log
The source of truth.
Properties:
- append-only
- immutable
- replayable
- fully reconstructible
- event-native
Stores:
- raw user messages
- assistant messages
- tool results
- task updates
- topic markers
- explicit memory markers
- emitter-supplied topic hints at ingest time
1.2 Active Memory
The currently served semantic memory snapshot.
Properties:
- read-only to live clients
- stable during a request
- used for retrieval
- versioned
This contains:
- topic nodes
- topic summaries
- edges / parent-child relationships
- event references
- cached traversal metadata
1.3 Staging Memory
A background copy under construction.
Properties:
- built asynchronously
- invisible to live clients
- validated before activation
- atomically swapped into active
This supports:
- nightly rebuilds
- merge / compaction
- snapshot refinement
- rollback-safe deployment
2. System responsibilities
Myria is responsible for:
- ingesting canonical event JSON
- storing the append-only event log
- storing emitter-supplied topic hints on events
- serving exact event lookups
- serving TopicIndex retrieval
- assembling bounded rooted query trees
- building new semantic snapshots in the background
- atomically swapping the active snapshot
- preserving provenance from semantic nodes back to events
Myria is not responsible for:
- frontend transport handling
- chat orchestration
- the application’s general-purpose tool loop
- the application’s general-purpose reasoning policy
- account mapping policy beyond what it is explicitly given
However, Myria does own its internal memory workflows, including topic classification, summarization, and constrained LLM-assisted planning when those workflows are part of ingest, query enrichment, or snapshot building.
3. Architectural decomposition
Myria is one standalone service with five major internal subsystems.
3.1 Ingest subsystem
Handles incoming events.
Responsibilities:
- validate event schema
- deduplicate on a caller-supplied idempotency key when present
- store emitter-supplied topic hints
- append to event log
- update topic metadata
- update global rebuild-trigger state
3.2 Query subsystem
Handles live reads.
Responsibilities:
- exact lookup by event / topic / node / snapshot id
- TopicIndex traversal on active snapshot
- reference resolution
- fresh-event retrieval from raw event log
- rooted query tree assembly
- masked tree-walker orchestration for `query_references`
3.3 Builder subsystem
Handles async memory construction.
Responsibilities:
- read new events since last snapshot
- capture an event-log high-water mark when a build starts
- update staging structures
- summarize internally derived topics
- create / update topic hierarchies
- merge topic subtrees into larger abstractions
- rebuild indices and caches
- validate candidate snapshot
- publish for blue/green swap
3.4 Snapshot registry
Handles snapshot lifecycle.
Responsibilities:
- track active snapshot
- track staging snapshot
- version snapshots
- support rollback
- expose build status
3.5 LLM orchestration
Handles constrained LLM-assisted memory workflows.
Responsibilities:
- manage model configuration and prompt templates
- expose typed SQL tools to the model
- validate model tool arguments before execution
- bound tool-call rounds, time, and token budgets
- reject unsafe or out-of-policy SQL actions
- produce structured outputs for builder plans, summaries, merge/split proposals, and tree-walk actions
4. Fundamental design invariants
These are non-negotiable.
Invariant 1
The event log is the only source of truth.
Everything in active or staging memory must be reconstructible from events.
Invariant 2
Active memory is immutable during live serving.
Live clients never see partial or in-place semantic mutation.
Invariant 3
Staging memory is invisible to live clients.
No live request can read staging.
Invariant 4
Snapshot swap is atomic.
A request sees exactly one snapshot for its entire lifetime.
Invariant 5
Every semantic node must reference event ids.
No orphan summaries.
Invariant 6
Topic hints are hints, not truth.
They support freshness and retrieval fallback, but do not replace semantic structure.
Invariant 7
Working memory is a rooted query tree, never storage.
Myria stores truth and structure. Clients consume bounded rooted trees.
Invariant 8
Participant visibility is monotonic down the tree.
If a child node changes participant scope, it only narrows it.
Formally, child.participants ⊆ parent.participants.
Invariant 9
Participant scope may never widen in place.
If an index node must expand its participant list, the affected structure must be rebuilt rather than mutated in place.
Invariant 10
Every LLM-visible inspection must be bounded.
Any subtree, ref, or event inspection shown to an internal LLM must be bounded by depth, breadth, and byte or token budget.
Invariant 11
Bounded inspections must be deterministically ordered.
If a result is truncated, ordering and pagination must remain stable for the lifetime of the pinned snapshot or build task.
Invariant 12
Raw event payloads are not part of structural inspection by default.
Structural inspection exposes summaries, counts, and metadata first. Full event payloads require a separate explicit expansion step.
Invariant 13
Every LLM workflow turn is snapshot-pinned and participant-masked.
The model only inspects data that has already been bounded and masked for the current workflow scope.
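Invariants 10-12 can be made concrete with a small pager. The sketch below is illustrative rather than the service's actual API: `InspectPage`, `boundedChildren`, and the budget parameters are hypothetical names, but the mechanics (deterministic ordering, breadth and byte budgets, a stable continuation cursor) are exactly what the invariants require.

```go
package main

import (
	"fmt"
	"sort"
)

// InspectPage is a hypothetical bounded view of a node's children:
// deterministically ordered, truncated to a budget, with a cursor
// that stays stable for the lifetime of the pinned snapshot.
type InspectPage struct {
	Children []string
	Cursor   string // empty when the listing is complete
}

// boundedChildren enforces a breadth limit and a byte budget over a
// deterministic ordering, per invariants 10-12.
func boundedChildren(childIDs []string, after string, maxChildren, byteBudget int) InspectPage {
	ids := append([]string(nil), childIDs...)
	sort.Strings(ids) // deterministic ordering regardless of storage order

	var out []string
	used := 0
	for _, id := range ids {
		if after != "" && id <= after {
			continue // resume strictly after the continuation cursor
		}
		if len(out) == maxChildren || used+len(id) > byteBudget {
			cursor := after
			if len(out) > 0 {
				cursor = out[len(out)-1]
			}
			return InspectPage{Children: out, Cursor: cursor}
		}
		out = append(out, id)
		used += len(id)
	}
	return InspectPage{Children: out} // complete: no cursor
}

func main() {
	ids := []string{"node_c", "node_a", "node_b", "node_d"}
	p1 := boundedChildren(ids, "", 2, 1024)
	fmt.Println(p1.Children, p1.Cursor) // [node_a node_b] node_b
	p2 := boundedChildren(ids, p1.Cursor, 2, 1024)
	fmt.Println(p2.Children) // [node_c node_d]
}
```

Because the ordering is a pure sort over ids, the same cursor always resumes at the same position for a pinned snapshot, which is what keeps truncated inspections stable across LLM turns.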
5. Event model
All writes enter Myria as canonical events.
5.1 Event shape
Each event minimally contains:
```json
{
  "timestamp": "2026-03-19T20:00:00Z",
  "channel": "telegram",
  "participants": ["user_42", "agent_1"],
  "source_event_key": "telegram:chat_7:msg_99102",
  "context_id": "ctx_group_7",
  "type": "message",
  "payload": {
    "text": "Let's design the memory system"
  },
  "topic_hints": [
    {"hint": "memory_system", "confidence": 0.92},
    {"hint": "system_design", "confidence": 0.61}
  ],
  "internal": false
}
```

5.2 Event requirements
- `event_id` is generated by Myria at ingest time and must be globally unique
- ingestion idempotency must be keyed by a separate caller-supplied dedupe key such as `source_event_key`
- when `source_event_key` is present, idempotency scope is `(channel, source_event_key)`
- if no dedupe key is supplied, retries can append distinct events with distinct `event_id`s
- Myria also assigns a monotonically increasing `event_seq` used for build boundaries and stable append ordering
- `payload` is opaque to storage, interpreted by consumers
- `channel` identifies where the event came from, or where an agent emitted it
- `participants` is the immutable visibility scope for the event
- callers can supply an `internal` hint, but the persisted `internal` field is the service-normalized effective visibility flag after ingest policy is applied
- `source_event_key`, `context_id`, `type`, `topic_hints`, and other semantic annotations are emitter-supplied hints rather than trusted facts except where explicit ingest policy promotes them to idempotency inputs
- each topic hint contains a string label plus a confidence in the inclusive range [0.0, 1.0]
- multiple topic hints can be attached to the same event
- event order is canonicalized through timestamp + `event_seq`
Trusted event facts are limited to:
- `participants`
- `channel`
- `timestamp`
- `payload`
All other event metadata is treated as provisional emitter guesses unless produced or validated by Myria itself.
The frontend or client layer is expected to map external
<channel, account> tuples onto internal user
identities before ingestion. Myria stores the participant identities it
is given; it does not own cross-channel account resolution.
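The ingest-side rules above can be sketched in Go (the v1 implementation language). The `Event` struct here carries only the fields being checked, and `validateIngest` and `dedupeKey` are hypothetical names for illustration; only the rules themselves come from this section.

```go
package main

import (
	"errors"
	"fmt"
)

// TopicHint mirrors the ingest shape: a label plus a confidence in [0.0, 1.0].
type TopicHint struct {
	Hint       string
	Confidence float64
}

// Event carries only the fields checked here; the full canonical
// shape is shown in section 5.1.
type Event struct {
	Channel        string
	Participants   []string
	SourceEventKey string
	TopicHints     []TopicHint
}

// validateIngest applies the section-5 requirements that can be
// checked deterministically at ingest time.
func validateIngest(e Event) error {
	if e.Channel == "" {
		return errors.New("channel is required")
	}
	if len(e.Participants) == 0 {
		return errors.New("participants is the visibility scope and must be non-empty")
	}
	for _, h := range e.TopicHints {
		if h.Confidence < 0.0 || h.Confidence > 1.0 {
			return fmt.Errorf("topic hint %q: confidence %v outside [0.0, 1.0]", h.Hint, h.Confidence)
		}
	}
	return nil
}

// dedupeKey returns the idempotency scope (channel, source_event_key),
// or ok=false when no caller-supplied dedupe key exists and retries
// may append distinct events.
func dedupeKey(e Event) (string, bool) {
	if e.SourceEventKey == "" {
		return "", false
	}
	return e.Channel + ":" + e.SourceEventKey, true
}

func main() {
	e := Event{
		Channel:        "telegram",
		Participants:   []string{"user_42", "agent_1"},
		SourceEventKey: "chat_7:msg_99102",
		TopicHints:     []TopicHint{{Hint: "memory_system", Confidence: 0.92}},
	}
	fmt.Println(validateIngest(e)) // <nil>
	if k, ok := dedupeKey(e); ok {
		fmt.Println(k) // telegram:chat_7:msg_99102
	}
}
```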
6. Topic model
Myria supports long chats by partitioning events by logical topic.
A topic is the only first-class partition over events.
There are no session partitions in Myria. There is no parallel conversation-boundary partition model.
6.1 Derived topic properties
An internally derived topic contains:
- system-assigned `topic_id`
- optional `context_id` hint carried forward from ingest
- participant scope
- state: active / dormant / archived
- derived label / summary
- optional parent topic
- event membership
6.2 Topic formation boundaries
Derived topics are created or forked by:
- logical topic shift
- explicit task boundary
- context change
- explicit marker event
6.3 Topic purpose
Derived topics exist to:
- reduce retrieval noise
- create summarization boundaries
- define local conversation context
- serve as the base unit of memory compilation
Derived topics are the primary logical partitioning model. Higher-order topic hierarchies can still be built above them in snapshots.
7. TopicIndex model
Myria’s semantic memory layer is called TopicIndex.
TopicIndex is:
a versioned, hierarchical semantic index over the event log
It is built from:
- derived topic summaries
- topic nodes
- topic-hierarchy merges
- event references
7.1 Core idea
Topics start small and local, then expand by merging topic nodes.
So the structure is not “one global graph first.”
It is:
- local derived topics
- topic hierarchies
- parent topic nodes created over time
- eventually, an evolving forest of mergeable hierarchies
7.2 TopicIndex node types
TopicIndex does not use a rich semantic node taxonomy.
Tasks, concepts, planning structure, and conversational themes are all represented as topics through summary and metadata, not through separate node kinds.
For the shipped system, the only storage node types are:
- `internal_topic`
- `leaf_topic`
Every index node is one of two structural forms:
- internal topic node
- leaf topic node
Rules:
- internal topic nodes can have child index nodes, but never directly reference events
- leaf topic nodes can reference events, but never own child index partitions beneath them
- leaf topic nodes contain at least two event references whenever the eligible event set permits it
- singleton leaves are allowed only when no valid merge alternative exists at build time
7.3 TopicIndex node fields
Each node contains:
- `node_id`
- `node_type`
- `is_leaf`
- `participants`
- `summary`
- `metadata`
- `refs` -> `event_ids`
- timestamps
Snapshot membership, parent/child edges, and traversal caches are stored alongside nodes rather than inside the canonical node body.
Field constraints:
- `node_id` is a SHA-256 hash over the canonical identity body with tree pointers excluded
- internal nodes must have zero direct event refs
- leaf nodes are the only nodes allowed to carry direct event refs
- parent ids, child ids, edge positions, traversal caches, and other tree-pointer fields must not participate in the `node_id` hash
- summary text and presentation-oriented metadata do not participate in the `node_id` hash and can be refreshed without changing node identity
- any change to hashed identity fields creates a new node rather than mutating the old node in place
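The identity-hash rule can be sketched as follows. `identityBody` and `nodeID` are hypothetical names, and the pipe-joined canonical serialization is an illustrative choice, not the shipped wire format; the point is that sorted identity fields go into the hash while summaries and tree pointers stay out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// identityBody is the hashed portion of a node: tree pointers,
// summary text, and presentation metadata are deliberately absent.
type identityBody struct {
	SchemaVersion string
	NodeType      string   // "internal_topic" or "leaf_topic"
	Participants  []string
	Refs          []string // event_ids for leaves, descendant leaf node_ids for internals
}

// nodeID hashes a canonical serialization of the identity body.
// Sorting makes the id independent of input ordering, so the same
// logical node always hashes to the same id.
func nodeID(b identityBody) string {
	parts := append([]string(nil), b.Participants...)
	refs := append([]string(nil), b.Refs...)
	sort.Strings(parts)
	sort.Strings(refs)
	canonical := strings.Join([]string{
		b.SchemaVersion,
		b.NodeType,
		strings.Join(parts, ","),
		strings.Join(refs, ","),
	}, "|")
	sum := sha256.Sum256([]byte(canonical))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := nodeID(identityBody{"v1", "leaf_topic", []string{"user_42", "agent_1"}, []string{"evt_2", "evt_1"}})
	b := nodeID(identityBody{"v1", "leaf_topic", []string{"agent_1", "user_42"}, []string{"evt_1", "evt_2"}})
	fmt.Println(a == b) // true: ordering does not change identity
}
```

Refreshing a summary leaves `node_id` untouched, while changing refs or participants yields a new id and therefore a new node, matching the "rebuild rather than mutate" rule.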
7.4 Event references
Every node carries provenance:
- a node references one or more `event_ids`
- merged parent nodes usually reference the union of child refs
- summary quality can later be revalidated against source refs
- if a parent and child differ in participant scope, the child scope must be a subset of the parent scope
This is what keeps Myria honest.
8. Freshness bridge: topic-indexed event log
Snapshots are intentionally stale compared to real-time events.
To bridge this lag, Myria also retrieves directly from the event log using participant scope, payload filters, and optional emitter-supplied weighted topic hints.
This creates a dual retrieval path:
- structured path: active TopicIndex snapshot
- fresh path: recent raw events since snapshot cutoff
8.1 Why this exists
Without it, a new event would be invisible until the next rebuild.
That would make the client “forget” recently heard information.
8.2 Retrieval rule
For a memory query:
- query active TopicIndex snapshot
- query event log for events newer than the snapshot cutoff using query filters plus optional weighted topic hints
- merge both into one rooted result tree
- dedupe by `event_id`
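The merge step of this rule reduces to a small dedupe. The sketch below treats event ids as plain strings and uses a hypothetical `mergeFreshAndIndexed` helper; the indexed copy takes precedence when an event is both indexed and still in the fresh tail.

```go
package main

import "fmt"

// mergeFreshAndIndexed combines snapshot-derived event refs with
// fresh-tail events newer than the snapshot cutoff, deduping by
// event_id with the indexed copy taking precedence.
func mergeFreshAndIndexed(indexed, fresh []string) []string {
	seen := make(map[string]bool, len(indexed))
	out := make([]string, 0, len(indexed)+len(fresh))
	for _, id := range indexed {
		if !seen[id] {
			seen[id] = true
			out = append(out, id)
		}
	}
	for _, id := range fresh {
		if !seen[id] {
			seen[id] = true
			out = append(out, id)
		}
	}
	return out
}

func main() {
	indexed := []string{"evt_1", "evt_2"}
	fresh := []string{"evt_2", "evt_3"} // evt_2 is indexed and also still in the tail
	fmt.Println(mergeFreshAndIndexed(indexed, fresh)) // [evt_1 evt_2 evt_3]
}
```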
9. Snapshot lifecycle
Myria memory is not updated in place.
It is recompiled asynchronously into snapshots.
9.1 Snapshot states
A snapshot can be:
- `staging`
- `active`
- `archived`
- `failed`
9.2 Snapshot metadata
Each snapshot stores:
- `snapshot_id`
- `parent_snapshot_id`
- event range covered
- event-log high-water mark captured at build start
- builder version
- summarizer version
- merge policy version
- build start/end time
- validation status
9.3 Swap model
Myria uses blue/green deployment semantics.
- active snapshot = blue
- newly built snapshot = green
- after validation, pointer flips atomically
- old active is kept for rollback
- each request pins one snapshot for its full lifetime
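The swap model maps naturally onto an atomic pointer. A minimal sketch, assuming a hypothetical `Registry` type and Go 1.19+ `atomic.Pointer`; requests pin once at start and keep that pointer for their whole lifetime:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Snapshot stands in for a fully built, validated TopicIndex snapshot.
type Snapshot struct{ ID string }

// Registry holds the active-snapshot pointer. atomic.Pointer gives the
// atomic flip; request handlers pin the snapshot once at request start.
type Registry struct{ active atomic.Pointer[Snapshot] }

// Pin returns the snapshot a request will use for its entire lifetime.
func (r *Registry) Pin() *Snapshot { return r.active.Load() }

// Promote atomically flips the active pointer to a validated green
// snapshot and returns the old blue snapshot for rollback.
func (r *Registry) Promote(green *Snapshot) (oldBlue *Snapshot) {
	return r.active.Swap(green)
}

func main() {
	var reg Registry
	reg.Promote(&Snapshot{ID: "mem_v41"})

	pinned := reg.Pin()                         // a request pins blue
	old := reg.Promote(&Snapshot{ID: "mem_v42"}) // green goes live

	fmt.Println(pinned.ID, reg.Pin().ID, old.ID) // mem_v41 mem_v42 mem_v41
}
```

The in-flight request keeps serving from `mem_v41` even after the flip, which is exactly the snapshot-stability guarantee of invariant 4.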
10. Build plane vs serving plane
Myria memory is split into two planes.
10.1 Serving plane
Used by live clients.
Properties:
- low latency
- stable
- read-only
- deterministic
10.2 Build plane
Used by async builder.
Properties:
- can be slower
- can restructure
- can resummarize
- can merge trees
- can rebuild traversal metadata
This supports “sleep-phase” consolidation without destabilizing live chat behavior.
11. Builder trigger model
TopicIndex rebuild scheduling is global.
Topic tagging does not control build triggering or build boundaries. Triggering depends only on trusted global event-log facts.
11.1 Trigger paths
A rebuild starts through exactly three paths:
- manual admin-triggered rebuild through MCP
- global inactivity timeout after no new events for `N` seconds
- global tail-threshold trigger after `N` non-indexed events accumulate past the active snapshot cutoff
11.2 Build boundary
When a rebuild starts, the builder captures the current event-log high-water mark.
That captured high-water mark is the maximum committed
event_seq at build start.
Rules:
- only events at or before the captured `event_seq` high-water mark participate in the current rebuild
- newer events appended during the rebuild do not participate in the in-flight staging snapshot
- newer events remain in the fresh tail and can still be returned through read queries
- the published snapshot records the captured `event_seq` high-water mark it covers
This is a logical end-pointer lock, not a write lock on event ingest.
11.3 Build pipeline
After the high-water mark is captured, the builder constructs the next snapshot from all eligible events up to that boundary.
Stages:
- load the active snapshot baseline
- load the bounded event-log slice from the previous `event_seq` cutoff up to the new high-water mark
- deterministically partition the eligible input by exact `participants` set
- mask each build task to only the events and nodes visible to that exact participant set
- invoke the builder LLM on each masked build task to propose topic grouping, summaries, node splits, node merges, and leaf ref placement
- deterministically validate the returned build plan
- regenerate topic summaries
- refresh refs
- derive or refresh internal topic assignments where needed
- attach topic summaries into topic hierarchies within the same exact participant set
- split or merge leaf topic nodes so event refs live only in leaves and singleton leaves exist only when unavoidable
- rebuild traversal metadata
- validate the staging snapshot
- publish atomically if validation passes and publish policy allows it
In v1, the builder only creates subtrees whose nodes all share the same exact participant set.
That means:
- no broad-scope parent is built over narrower child scopes
- no cross-scope aggregation is allowed inside one subtree
- any future cross-scope abstraction would require a separate design with deterministic redaction rules
11.4 Publish
Write the staging snapshot to registry and atomically promote it.
The active snapshot pointer moves only after validation succeeds. Events appended after the captured high-water mark are excluded from that published snapshot and remain available through the fresh-event path until a later rebuild covers them.
12. Validation rules
A snapshot cannot be activated until it passes validation.
Validation checks:
Structural
- all refs resolve
- no orphan child nodes
- no invalid parent relationships
- no missing roots
- internal nodes contain no direct event refs
- event refs appear only in leaf nodes
- leaf nodes contain at least two event refs whenever a non-singleton layout is possible
Provenance
- all summaries have refs
- merged nodes preserve child coverage
Scope
- no cross-user leakage
- no private refs in shared-only nodes unless policy allows it
Quality
- no collapse into one giant root
- no explosion into too many useless leaves
- summary size and depth remain bounded
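A few of these checks in sketch form; the `Node` shape and `validate` function are illustrative, not the service's actual validator, but the rules they enforce (refs only in leaves, no childless internal nodes, child scope a subset of parent scope) come straight from this section.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a minimal candidate-snapshot node for validation purposes.
type Node struct {
	ID           string
	IsLeaf       bool
	Participants []string
	Refs         []string // direct event refs
	Children     []*Node
}

// subset reports whether every element of a appears in b.
func subset(a, b []string) bool {
	set := make(map[string]bool, len(b))
	for _, p := range b {
		set[p] = true
	}
	for _, p := range a {
		if !set[p] {
			return false
		}
	}
	return true
}

// validate walks one subtree and enforces structural and scope rules.
func validate(n *Node) error {
	if !n.IsLeaf && len(n.Refs) > 0 {
		return fmt.Errorf("internal node %s has direct event refs", n.ID)
	}
	if n.IsLeaf && len(n.Children) > 0 {
		return fmt.Errorf("leaf node %s owns child partitions", n.ID)
	}
	if !n.IsLeaf && len(n.Children) == 0 {
		return errors.New("internal node " + n.ID + " has no children")
	}
	for _, c := range n.Children {
		if !subset(c.Participants, n.Participants) {
			return fmt.Errorf("child %s widens participant scope", c.ID)
		}
		if err := validate(c); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	leaf := &Node{ID: "n2", IsLeaf: true, Participants: []string{"user_42"}, Refs: []string{"evt_1", "evt_2"}}
	root := &Node{ID: "n1", Participants: []string{"user_42", "agent_1"}, Children: []*Node{leaf}}
	fmt.Println(validate(root)) // <nil>
}
```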
13. SQL-backed storage model
Everything in Myria can be backed by SQL.
That includes:
- append-only event log
- topic metadata
- topic-indexed events
- TopicIndex nodes
- node-child relations
- node-event refs
- snapshots
- snapshot registry
SQL is the persistence substrate.
Traversal and meaning remain in the service layer.
For v1, the SQL target is PostgreSQL. Other databases are out of scope.
PostgreSQL is used for:
- transactional ingest
- snapshot metadata and atomic active-pointer updates
- JSONB-backed event payloads and metadata
- relational joins for refs, topics, and nodes
- indexed fresh-event lookups
There is no premature cross-database abstraction in v1. The storage layer can hide driver details, but the schema and query design remain intentionally PostgreSQL-native.
13.1 PostgreSQL notes for v1
The v1 PostgreSQL layout leans on native features instead of simulating a lowest-common-denominator SQL subset.
PostgreSQL features used in v1:
- `JSONB` for event payloads, node metadata, and weighted topic-hint arrays
- transactional `INSERT ... ON CONFLICT` for idempotent ingest on a partial unique key over `(channel, source_event_key)` when `source_event_key` is not null
- partial and composite indexes for active-snapshot and fresh-event reads
- advisory locking or equivalent serialized publish coordination for snapshot promotion
- schema metadata with an explicit version check at startup
PostgreSQL features not required in v1:
- `pgvector`
- stored procedures as the primary application logic layer
- cross-database compatibility promises
13.2 Core tables
events
Stores raw truth.
Fields:
- `event_seq`
- `event_id`
- `source_event_key`
- `timestamp`
- `context_id`
- `channel`
- `participants`
- `internal`
- `payload`
- `topic_hints`
topics
Stores internally derived topic metadata.
Fields:
- `topic_id`
- `context_id`
- `participants`
- `state`
- `label`
- `created_at`
- `last_active_at`
- `parent_topic_id`
snapshots
Stores snapshot registry.
Fields:
- `snapshot_id`
- `parent_id`
- `event_max_seq`
- `status`
- `builder_version`
- `merge_policy_version`
- `created_at`
nodes
Stores global content-addressed TopicIndex node bodies.
Fields:
- `node_id`
- `node_type`
- `is_leaf`
- `participants`
- `summary`
- `metadata`
- `created_at`
node_id is derived from a canonical serialization of the
node identity body, such as:
- `node_type`
- sorted `participants`
- sorted direct `event_ids` for leaf nodes
- sorted descendant leaf `node_ids` for internal nodes
- an explicit node-schema version tag
and must exclude:
- summary text
- presentation-only metadata
- parent ids
- child ids
- edge positions
- snapshot-local traversal caches
snapshot_nodes
Stores snapshot membership for global nodes.
Fields:
- `snapshot_id`
- `node_id`
node_children
Stores snapshot-scoped tree edges between global nodes.
Fields:
- `snapshot_id`
- `parent_id`
- `child_id`
- `position`
node_refs
Stores provenance.
Only leaf nodes may appear here.
Fields:
- `node_id`
- `event_id`
- `weight`
14. Retrieval model
For normal conversations, the agent is expected to use exactly three Myria tools:
- `append_event`
- `query_event_nodes`
- `query_references`
The write path appends canonical events. The two read tools expose raw-event retrieval and reference-tree retrieval.
14.1 query_event_nodes
Used for:
- fetching raw event nodes for a request
- returning direct event data in a rooted tree
- returning fresh or indexed event leaves without requiring summary traversal
This is the direct event-data tool in normal conversation.
14.2 query_references
Used for:
- returning a rooted tree of references to event nodes
- attaching summaries and topic structure above those references
- exposing semantic organization for a subtree or root selection
Each `query_references` call carries a `participants` list that defines the visibility scope of the caller.
This is the summary/reference-tree tool, not the direct raw-event retrieval tool.
14.3 Rooted tree response contract
All read queries return one rooted multi-child tree.
Rules:
- a shallow list is represented as a synthetic root node with direct children
- a deep semantic result is represented as a normal tree
- a hybrid response can mix semantic branches and direct event children under one root
- the response shape is stable regardless of whether results came from TopicIndex, fresh events, or both
- every returned node must carry its own `participants` list
- if scope changes while walking downward, it only narrows
- a query only reads nodes whose participant set is a superset of the query participant set
At minimum, tree nodes declare a kind such as:
- `root`
- `internal_topic`
- `leaf_topic`
- `event_ref`
- `event`
Visibility rule:
- a node is visible only if `query.participants ⊆ node.participants`
- broader queries do not see narrower-scope nodes
- because child scope only stays the same or narrows, invisible subtrees can be pruned immediately
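The visibility rule and early pruning can be sketched as follows; `TreeNode` and `prune` are hypothetical names, but the subset test and whole-subtree pruning are exactly what the rule permits.

```go
package main

import "fmt"

// TreeNode is a minimal result-tree node for visibility purposes.
type TreeNode struct {
	Kind         string
	Participants []string
	Children     []*TreeNode
}

// subset reports whether every element of a appears in b.
func subset(a, b []string) bool {
	set := make(map[string]bool, len(b))
	for _, p := range b {
		set[p] = true
	}
	for _, p := range a {
		if !set[p] {
			return false
		}
	}
	return true
}

// prune returns the visible portion of a tree for a query scope.
// A node is visible only if query.participants ⊆ node.participants;
// because scope only narrows downward, an invisible node's whole
// subtree can be dropped without descending into it.
func prune(n *TreeNode, query []string) *TreeNode {
	if !subset(query, n.Participants) {
		return nil
	}
	out := &TreeNode{Kind: n.Kind, Participants: n.Participants}
	for _, c := range n.Children {
		if kept := prune(c, query); kept != nil {
			out.Children = append(out.Children, kept)
		}
	}
	return out
}

func main() {
	tree := &TreeNode{Kind: "root", Participants: []string{"user_42", "agent_1"},
		Children: []*TreeNode{
			{Kind: "leaf_topic", Participants: []string{"user_42", "agent_1"}},
			{Kind: "leaf_topic", Participants: []string{"user_42"}}, // narrower scope
		}}
	visible := prune(tree, []string{"user_42", "agent_1"})
	fmt.Println(len(visible.Children)) // 1: the narrower-scope leaf is pruned
}
```

The broader two-party query cannot see the one-on-one leaf, which is the anti-leakage property section 18 formalizes.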
14.4 Deterministic serving vs LLM-assisted workflows
The live serving path must remain understandable and bounded.
For the shipped system:
- snapshot selection, participant masking, raw-event retrieval, rooted-tree assembly, SQL execution, validation, and snapshot publish are deterministic Go code
- the internal LLM assists topic grouping, topic summarization, subtree shaping, reference-walk branch selection, and offline analysis
- the internal LLM queries PostgreSQL only through service-managed typed tools
- all internal inspection tools must enforce depth, breadth, and byte or token limits before data reaches the model
- if an LLM workflow fails, times out, or exhausts budget, serving must fall back to a deterministic degraded mode
- the public MCP contract must not require clients to orchestrate the LLM themselves
This keeps the public behavior stable while still allowing model-driven memory construction where it is useful.
15. Query pipeline
For a normal conversation read:
Step 1
Resolve active snapshot.
Step 2
Resolve query selectors:
- participant scope
- topic hint
- optional `context_id` hint
- payload / text filters
- internal-visibility policy
Step 3
Choose read mode:
- `query_event_nodes` for raw event-node retrieval
- `query_references` for LLM-guided reference-tree retrieval with attached summaries
Step 4
Load candidate data:
- fetch event nodes directly from the event log when serving `query_event_nodes`
- load the visible snapshot roots, visible node summaries, and fresh-tail metadata when serving `query_references`
- include fresh tail events newer than the snapshot cutoff when policy allows
Step 5
Apply filters:
- internal topic selectors
- optional weighted topic hints
- query `participants`
- internal-visibility policy
Step 6
Execute read strategy:
- `query_event_nodes` stays fully deterministic and fetches matching raw events directly
- `query_references` invokes a tree-walker LLM that can only see already-masked node summaries, metadata, and fresh-tail candidates
- every expansion is re-masked and revalidated by deterministic Go code before the next LLM turn
- subtree inspection must use deterministic ordering plus continuation cursors when results are truncated
- if the tree-walker fails or exhausts budget, return a deterministic bounded tree built from the currently loaded visible frontier plus fresh-tail matches
Step 7
Assemble rooted result tree:
- one synthetic or semantic root
- summary/reference nodes where applicable
- dedupe by `event_id` or `node_id` as appropriate
- prune invisible branches before return
- raw event nodes where applicable
- per-node `participants`
- constraints / scope
16. Query tree model
Myria exposes a first-class rooted query tree.
A rooted query tree is the client-facing read package.
Example:
```json
{
  "snapshot_id": "mem_v42",
  "root": {
    "kind": "root",
    "label": "query_result",
    "participants": ["user_42", "agent_1"],
    "children": [
      {
        "kind": "leaf_topic",
        "node_id": "node_topic_memory_system",
        "label": "memory_system",
        "participants": ["user_42", "agent_1"],
        "children": [
          {"kind": "event_ref", "event_id": "evt_000001", "participants": ["user_42", "agent_1"]},
          {"kind": "event_ref", "event_id": "evt_000002", "participants": ["user_42"]}
        ]
      },
      {
        "kind": "event",
        "event_id": "evt_fresh_001",
        "participants": ["user_42", "agent_1"],
        "internal": false
      }
    ]
  },
  "constraints": {
    "participants": ["user_42"]
  }
}
```

Clients use this instead of manually stitching together low-level reads.
17. Topic hint model
Emitters can provide best-effort `topic_hints` at ingest time. Myria treats them as weak signals rather than canonical structure.
17.1 Purpose
- support fresh retrieval before snapshot rebuild
- provide cheap early indexing
- support fresh-tail grouping before snapshot rebuild
- aid later merge planning
17.2 Properties
- optional
- zero or more hints can be attached to a single event
- each hint contains a string label plus a confidence in [0.0, 1.0]
- fast to compute
- not authoritative
17.3 Evolution
- shipped system: weighted hint arrays
- v2: normalized internally derived topic ids
- v3: alignment with topic tree nodes
18. Scope and masking
Myria uses participant lists as its primary visibility scope.
Rules:
- every event has a `participants` list
- every index node has a `participants` list
- every `query_references` call has a `participants` list
- a node is visible only if `query.participants ⊆ node.participants`
- broader queries cannot see narrower-scope nodes
Tree constraints:
- if a child differs from its parent in participant scope, then `child.participants ⊆ parent.participants`
- if a node must expand its participant scope, it must be rebuilt rather than widened in place
- in v1, builder-created subtrees must use one exact participant set throughout, so cross-scope parent/child structure is disallowed
Retrieval must apply participant masking before rooted-tree assembly.
LLM workflows must also apply participant masking before every model turn.
The builder LLM and the tree-walker LLM only receive masked inputs whose `participants` sets are compatible with the current workflow scope.
This prevents:
- narrower-scope one-on-one memory from leaking into broader multi-user queries
- silent widening of visibility in existing TopicIndex structure
19. Client interaction model
Myria is a standalone memory service, not the application’s orchestrator.
The client application:
- ingests canonical events into Myria
- queries Myria through `query_event_nodes` and `query_references`
- optionally uses exact lookup or observability tools outside normal conversation
- never reads staging
- never mutates active snapshot directly
- maps external `<channel, account>` identities to internal participant identities before ingest when needed
20. MCP exposure
Myria is exposed as an MCP service.
That gives:
- client independence
- language independence
- deployment as a standalone memory server
- consistent contracts
For v1, MCP is the primary public interface. A private in-process API may exist inside the Go codebase, but external consumers integrate through MCP tools rather than a bespoke RPC contract.
20.1 Agent-facing MCP tools
These are the normal-conversation tools:
myria.append_event
Append one canonical event.
myria.query_event_nodes
Return a rooted multi-child tree of raw event nodes filtered by the caller’s participant scope.
myria.query_references
Return a rooted multi-child tree of references to event nodes, with attached summaries and topic structure when applicable, filtered by the caller’s participant scope.
20.2 Auxiliary MCP tools
These are optional tools for exact lookup, observability, or admin:
myria.get_event
Fetch exact event.
myria.get_topic
Fetch exact internally derived topic metadata.
myria.get_active_snapshot
Observability.
myria.list_snapshots
Observability / admin.
myria.get_snapshot_status
Observability / admin.
20.3 MCP contract expectations
The MCP surface follows a strict typed-tool style.
Rules:
- each tool accepts a structured JSON input object
- each tool returns a structured JSON result object
- errors are machine-readable and include stable codes
- snapshot-stable reads return the pinned
snapshot_id - admin-only tools remain separable from read/write tools by policy
- all read tools return one rooted multi-child tree
- internal LLM-facing inspection tools must expose deterministic truncation metadata and continuation cursors
- degraded results are explicitly flagged in structured response metadata when deterministic fallback was used
The main contract is:
- append canonical events
- query raw event-node trees
- query summary/reference trees
- observe exact entities and snapshot lifecycle when needed
21. Concurrency model
Myria itself may be concurrent internally, but serving semantics must remain snapshot-stable.
Rules:
- queries pin one active snapshot id at start
- staging writes never affect in-flight queries
- event ingestion is idempotent
- snapshot swap is atomic
- final publish is serialized
22. Implementation Language
The production service is implemented in a natively compiled language.
Language: Go
Reason:
- great for service orchestration
- easy MCP server implementation
- good concurrency primitives
- simple deployment
- fast enough for this workload
Myria ships as its own Go project and executable, with its MCP server, PostgreSQL integration, builder workers, and LLM orchestration living in the same codebase.
22.1 Configuration model
Myria configuration must include:
- service identity and logging
- stdio MCP server settings
- PostgreSQL connection settings
- a path to the structured tool-calling LLM request template JSON
- builder and snapshot-publish policy
Configuration shape:
```yaml
service:
  name: myria
  log_level: info
  log_file: ./myria.log
mcp:
  transport: stdio
postgres:
  dsn: postgres://myria:secret@127.0.0.1:5432/myria?sslmode=disable
  schema: myria
  max_open_conns: 20
  max_idle_conns: 5
  migrate_on_start: false
llm:
  provider: openrouter
  request_template_path: ./config/openrouter-llm.template.json
  timeout: 30s
  max_tool_rounds: 8
  temperature: 1.0
  top_p: 0.95
  structured_tool_calling_required: true
builder:
  inactivity_timeout: 30s
  max_unindexed_events: 100
  auto_publish: true
  allow_manual_trigger: true
```

The configured model must support structured tool calls with typed arguments. Plain text-only models are out of scope.
For v1, the target platform is OpenRouter.
For v1, the builder LLM and tree-walker LLM target `nvidia/nemotron-3-super-120b-a12b` through OpenRouter.
For local testing, the current configured test target is `nvidia/nemotron-3-super-120b-a12b:free`.
The LLM call configuration lives in a JSON template on disk rather than environment variables. That template defines the outbound request shape used by Myria when calling OpenRouter.
Template:

```json
{
  "model": "nvidia/nemotron-3-super-120b-a12b:free",
  "provider": {
    "sort": "throughput"
  },
  "temperature": 1.0,
  "top_p": 0.95,
  "response_format": {
    "type": "json_object"
  },
  "tools": [],
  "tool_choice": "auto",
  "usage": {
    "include": true
  },
  "headers": {
    "Authorization": "Bearer <openrouter-api-key>",
    "HTTP-Referer": "http://localhost",
    "X-Title": "myria"
  }
}
```

Myria injects dynamic fields such as messages, live tool schemas, and bounded execution settings at call time, but the provider target and base request template come from this JSON file.
22.2 LLM execution model
The configured LLM does not receive direct database credentials and does not execute arbitrary SQL on its own.
Instead:
- the Go service opens the PostgreSQL connection
- the service exposes typed SQL tools to the LLM
- the LLM emits structured tool calls
- the service validates and executes those tool calls
- results are fed back to the LLM for the next step
This is a planner/executor split. The model proposes bounded actions; the service enforces policy, schema, and execution limits.
For v1, there are two internal LLM-controlled workflows:
- the builder LLM operates on one exact-participant-set build task at a time using `nvidia/nemotron-3-super-120b-a12b` through OpenRouter and proposes topic grouping, summaries, node merges, node splits, and leaf ref placement
- the tree-walker LLM operates on one masked read request at a time using `nvidia/nemotron-3-super-120b-a12b` through OpenRouter and proposes which visible branches, leaves, or fresh-tail candidates to expand next
Deterministic responsibilities:
- snapshot pinning and cutoff capture
- participant masking before each LLM turn
- typed tool exposure and SQL execution
- structural validation of returned plans
- enforcement of leaf-only refs and exact-participant-set subtree rules
- deterministic fallback when builder or tree-walker workflows fail
- final rooted-tree assembly and snapshot publish
LLM-controlled responsibilities:
- topic grouping within a masked build task
- summary and label generation
- merge and split proposals inside one exact participant set
- branch selection during `query_references`
- stop / continue decisions under deterministic budget limits
Required internal LLM inspection tools:
- `inspect_subtree(root, depth, max_nodes, summaries_only, cursor)` returns a deterministically ordered masked view of a subtree rooted at `root`
- `inspect_leaf_refs(root, max_refs, cursor)` returns deterministically ordered event refs for one visible leaf topic node
- `fetch_events(event_ids, max_bytes)` returns masked raw event payloads only for explicitly selected event ids
Inspection tool rules:
- `inspect_subtree` must bound depth, breadth, and estimated bytes or tokens
- `inspect_subtree` returns summaries, metadata, child counts, ref counts, truncation state, and continuation cursor before any raw payloads
- `inspect_leaf_refs` must use stable ordering and stable pagination within the pinned snapshot or build task
- `fetch_events` is a separate explicit payload expansion step and must not be implicit in structural inspection
- builder-side inspection must also enforce deterministic limits on loaded events, loaded nodes, and tool-call rounds per build task
Fallback rules:
- if a builder task fails validation, times out, or exhausts budget, Myria falls back to a deterministic leaf-only build for that exact participant set
- the deterministic leaf-only build groups events by exact participant set plus weighted topic-hint overlap when available, otherwise by bounded time windows
- fallback leaf nodes use template-generated summaries rather than model-generated summaries
- if a `query_references` walk fails validation, times out, or exhausts budget, Myria returns a deterministic bounded tree from the already-loaded visible frontier and fresh tail instead of failing the request outright
Required safety constraints:
- max tool-call rounds per workflow
- max rows and bytes returned per SQL tool call
- explicit read-only vs write-capable tool separation
- query allowlisting or shape validation for write paths
- full audit logging of prompts, tool calls, and SQL statements
- masking audit records for data shown to each LLM workflow
23. Internal module layout
Module split:
ingest
- schema validation
- topic classification
- event append
- dirty marker update
query
- exact lookup
- rooted query tree building
- fresh-event query
- ref resolution
- deterministic serving logic for MCP reads
- masked tree-walker orchestration for `query_references`
topicindex
- node loading
- traversal
- adjacency caches
- anchor resolution
builder
- dirty scan
- exact-participant-set build-task generation
- masked builder LLM orchestration
- summary generation
- merge planning
- snapshot assembly
- validation
store
- PostgreSQL connection management
- migrations
- transaction boundaries
- query helpers and row decoding
llm
- prompt construction
- tool schema registration
- tool-call loop execution
- structured output validation
- audit logging for model-assisted workflows
registry
- snapshot state
- active pointer
- rollback
mcp
- public tool surface
- request/response schema validation
- MCP auth/policy hooks if deployment requires them
24. Suggested package / binary layout
V1 ships as one standalone Go service binary:
- `myria` — MCP server, PostgreSQL-backed core, builder workers, and LLM orchestration
Supporting binaries may be added later:
- `myria-cli` — admin and debug utility
- `myria-migrate` — explicit schema migration runner
The important v1 property is operational simplicity: one deployable service, one PostgreSQL database, one MCP surface.
25. Main failure modes
Over-merge
Everything collapses into one generic tree.
Fix:
- stricter thresholds
- fan-in limits
- coverage checks
Under-merge
Too many tiny trees.
Fix:
- consolidation pass
- co-occurrence heuristics
Summary drift
Semantic nodes no longer match events.
Fix:
- always keep refs
- periodic full rebuilds
LLM workflow exhaustion
Builder or tree-walker exceeds time, round, or context budgets.
Fix:
- deterministic bounded inspection tools
- deterministic leaf-only builder fallback
- deterministic bounded query fallback with explicit degraded result metadata
Snapshot lag
Fresh information unavailable.
Fix:
- topic-indexed event log fallback
Privacy bleed
Private memory leaks into shared retrieval.
Fix:
- scope masking in rooted-tree assembly
26. Why this design is strong
Myria avoids the main failure modes of naïve chat memory systems:
Instead of:
- giant transcript stuffing
- vector-only recall
- in-place summary mutation
- per-session silos
it gives you:
- event truth
- bounded rooted trees
- stable active memory
- async semantic recompilation
- exact provenance
- replayability
- rollback safety
27. Canonical system statement
The best single-sentence definition is:
Myria is an MCP-exposed, SQL-backed, event-sourced memory system that compiles long conversational histories into versioned semantic TopicIndex snapshots while preserving fresh recall through topic-indexed event-log queries.
In v1, it ships as a standalone Go service backed by PostgreSQL and uses a structured tool-calling LLM for internal memory workflows.
28. Final compressed blueprint
Inputs
- canonical event JSON
- exact lookup query
- `query_event_nodes` request
- `query_references` request
- service configuration including PostgreSQL and LLM settings
Internal layers
- event log
- active snapshot
- staging snapshot
Internal processes
- ingest
- query
- builder
- registry
- LLM-assisted memory workflows
Outputs
- event acknowledgement
- exact event / topic / snapshot data
- semantic nodes
- resolved refs
- bounded rooted query tree
- MCP tool results
Core philosophy
The event log is truth.
TopicIndex is compiled memory.
Active memory is stable.
Fresh recall comes from topic-indexed events.
Clients consume rooted trees, not transcripts.
