Agents That Remember

Shared memory for autonomous agent teams

The way most of us use AI to write code is still single-player. One agent, one chat window, one task — and the moment the session ends, it forgets everything it figured out. Next time, you start over.

That is fine for a snippet. It is not how real software gets built. Real engineering is a team sport: people working in parallel, sharing what they know, talking as they go. The frontier in AI coding is not a smarter solo agent — it is the autonomous agent team. That is what we are building with baro, on our open-source engine, Mozaik.

From one agent to many

Tell baro “add a pricing page in Next.js + Tailwind” and it does not open one chat and start typing. It plans the goal into a DAG and dispatches an agent team to work at once — S1 scaffolds the route, S2 models the plan data, S3 builds the component, S4 wires the responsive grid, S5 writes the tests — each in its own branch, in parallel. You watch them light up in the run view, tokens ticking, and a build-verified pull request comes out the other end.

That branch-per-agent, plan-into-a-DAG orchestration is Mozaik doing the work — and it is open source.

But parallelism alone does not make a team. A team shares what it learns and communicates while the work is happening. Strip those two away and you do not have a team — you have strangers editing the same repo at the same time. Two problems stand between a parallel run and a real agent team. We just solved the first.

Problem 1: memory across time

Agents are amnesiac. Between runs, everything evaporates — the team that chose a repository pattern and settled a trade-off yesterday shows up today with no memory of any of it. So it re-derives the same decisions, re-reads the same code, re-asks you the same questions. You become the team’s memory.

That is the first thing we fixed. baro now has cross-run memory: every run’s key decisions are captured, distilled, and persisted. Start the next run — same repo or a different one — and baro recalls the relevant decisions into context instead of re-deciding.

memory.ts
// Every run's decisions become memory the team can recall next time.
await memory.remember(run.ownerId, run.goal, run.decisionDocument)
// The next run same repo or a different one starts with what the
// team already decided, instead of re-deriving it from scratch.
const context = await memory.recall(run.ownerId, goal)
// "## Decisions from your past baro runs (reuse these; don't re-decide)
// - Data access via a repository pattern
// - Validation with zod at the API boundary
// - Auth: short-lived JWT + refresh, no server sessions"

Under the hood it runs on mem0, backed by Postgres with pgvector and an OpenAI model that distills each run and embeds it for retrieval. When a run finishes, baro takes its decision record, distills it into a handful of durable facts, and stores them keyed to you. On the next run it embeds the new goal, semantically searches your past decisions, and drops the relevant ones straight into the architect’s context — so the team plans with what it already knows, instead of starting from zero.

We dogfood it: when we asked baro to extend a module it had built in an earlier run, it recalled the decisions from that run and built on them — no re-explaining. Memory across time is table stakes for autonomy. And it is the easy half.

Problem 2: communication in the moment

Here is the harder, more exciting problem. Within a single run, those parallel agents work blind to each other. S2 decides to model User one way; S3, working at the same moment on a different story, makes an assumption that quietly contradicts it. Neither knows. You find out at the end — in the diff, or worse, in a bug.

Human teams do not work like that. You do not wait until the pull request to mention you changed the auth flow — you say it in the moment, and everyone adjusts. Continuous communication is what keeps a team coherent while moving fast. Today’s agent teams parallelize the work but not the talking.

The vision: memory and communication as runtime

This is where Mozaik comes in. Mozaik is not a sequential pipeline — agent 1, then 2, then 3. It is event-driven and reactive: agents are participants that emit and react to semantic events. Exactly what an autonomous agent team needs. (We went deep on this in Move From Autonomous Agents to Autonomous Teams.)

Our vision is to make memory and communication runtime. The moment an agent updates shared memory — or simply has something important to say — it broadcasts a semantic event, and every other agent currently working is notified and reacts, right then. Not “merge everyone’s work at the end and hope it fits.” Live coordination: an agent learns the schema changed and adapts mid-task; an agent makes a decision and the others fall in line instead of fighting it.

That is the difference between agents running in parallel and an autonomous agent team. One is concurrency. The other is collaboration. And because Mozaik is open, you can build that team on it yourself.

Where we are

Cross-run memory is live today — your agents remember across runs. Runtime shared memory and agent-to-agent communication within a run is what we are building next on Mozaik. The march toward autonomous agent teams is incremental, and this is the next step on it.

This is where AI coding goes: not a bigger model babysat by one person, but an autonomous agent team that remembers, talks, and coordinates — and ships the pull request while you watch.

Miodrag Todorović

Miodrag Todorović

Co-founder @ JigJoy

My passion is to tell the world the stories about the beautiful stuff we have built.

Different is better than better.

Newsletter

Get the next JigJoy deep dive

Notes on reactive agents, Mozaik, Baro, and the engineering behind autonomous software teams.

We're organizing hackathon