Build Your Own CLI Agent: A Step-by-Step Guide

I love JavaScript. Even though I have learned several languages, JavaScript has become my first choice when building new projects. I don't even remember when this love arose. In this post I'll walk through a small terminal chat app where a language model can run shell commands for you—something you can actually run locally. At the end there's a GitHub template if you'd rather clone than copy.

Two libraries do most of the heavy lifting: Ink, which is what the user sees, and Mozaik, which coordinates the agent behind the scenes.

What we are building

One Node command that:

  1. Shows a simple chat UI in the terminal.
  2. Forwards what you type to a hosted language model.
  3. Lets the model run terminal commands through a single tool (run_command) when it needs to inspect the machine or run a build.
  4. Prints the model's replies (and optional "calling a tool…" hints) in that same UI—without pushing API calls into every component.

The goal is a clean split: the terminal view stays simple; the agent and tools stay in one place and are easier to test or swap later.

How the pieces fit together

Mozaik runs everything on a shared environment: when the model speaks, asks for a tool, or a tool returns a result, every piece that has joined that environment can hear about it. For this CLI, two kinds of participant carry the story:

Role | Typical base class | Responsibility
Agent | BaseAgentParticipant | Remembers the conversation, asks the model for the next step, and runs tools (like run_command) when the model asks for them.
Observer / UI bridge | BaseObserverParticipant | Listens for assistant text and tool activity from the agent and forwards it into Ink through small callbacks, so the screen updates when the model speaks or starts a tool, without owning the agent loop.

In this project, typing in Ink calls session.send, which hands your text to the agent. You could add other participant types later (for example streaming stdin), but a straight send-to-agent path keeps the tutorial easy to follow. The full pattern is in terminal/agent.ts below if you want to see how Mozaik's base class is extended.
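
To make the shared-environment idea concrete, here is a minimal sketch in plain TypeScript. It does not use Mozaik at all: MiniEnvironment, MiniParticipant, and the event shape are hypothetical names invented for illustration, and Mozaik's real classes may behave differently. The sketch only shows the broadcast pattern, where one participant produces events and every other participant hears them.

```typescript
// Hypothetical stand-ins for illustration only; this is not Mozaik's API.
type AgentEvent =
  | { kind: "model_message"; text: string }
  | { kind: "function_call"; name: string };

interface MiniParticipant {
  onEvent(source: MiniParticipant, event: AgentEvent): void;
}

class MiniEnvironment {
  private participants = new Set<MiniParticipant>();

  join(participant: MiniParticipant): void {
    this.participants.add(participant);
  }

  // Deliver an event to every participant except the one that produced it.
  broadcast(source: MiniParticipant, event: AgentEvent): void {
    for (const participant of this.participants) {
      if (participant !== source) participant.onEvent(source, event);
    }
  }
}

// An observer that only records what it hears, much like a UI bridge would.
class RecordingObserver implements MiniParticipant {
  readonly heard: string[] = [];

  onEvent(_source: MiniParticipant, event: AgentEvent): void {
    this.heard.push(
      event.kind === "model_message" ? event.text : `calling ${event.name}`,
    );
  }
}

// A participant that produces events but ignores incoming ones.
class SilentAgent implements MiniParticipant {
  onEvent(): void {}
}

const miniEnvironment = new MiniEnvironment();
const silentAgent = new SilentAgent();
const recordingObserver = new RecordingObserver();
miniEnvironment.join(silentAgent);
miniEnvironment.join(recordingObserver);

miniEnvironment.broadcast(silentAgent, { kind: "function_call", name: "run_command" });
miniEnvironment.broadcast(silentAgent, { kind: "model_message", text: "Build passed." });

console.log(recordingObserver.heard);
```

The observer never calls the agent directly; it only reacts to what the environment delivers, which is the same separation the real project relies on.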

Step 1 — Bootstrap the runtime (cli.tsx)

The entry file is intentionally tiny: load environment variables, maybe print a one-line usage hint, then hand off to Ink with render(<App />). Keep agent logic out of here—only bootstrapping.

cli.tsx
#!/usr/bin/env node
import React from "react";
import { render } from "ink";
import meow from "meow";
import dotenv from "dotenv";
import path from "node:path";
import { fileURLToPath } from "node:url";
import App from "./app.js";

const here = path.dirname(fileURLToPath(import.meta.url));
dotenv.config({
  quiet: true,
  path: [
    path.resolve(process.cwd(), ".env"),
    path.resolve(here, "..", "..", ".env"),
    path.resolve(here, "..", "..", "..", ".env"),
  ],
});

meow(
  `
  Usage
    $ your-cli

  Starts an interactive chat with the agent.
`,
  { importMeta: import.meta },
);

render(<App />);

Step 2 — Compose the session (session.ts)

This file is the "control room." You create the model, the conversation memory, the thing that runs tools, and the shared environment, then plug in your agent and your UI helper. Everything with awkward names (OpenAIInferenceRunner, ModelContext, and so on) lives here so the React side stays small. When you are done, the UI only needs one method: something like send(message), forwarded to the agent's onMessage.

The snippet below shows the full wiring; use it as a checklist if you build from scratch.

session.ts
import {
  AgenticEnvironment,
  Gpt54,
  ModelContext,
  OpenAIInferenceRunner,
  DefaultFunctionCallRunner,
} from "@mozaik-ai/core";
import { terminalTools } from "./terminal/tools.js";
import { TerminalAgent } from "./terminal/agent.js";
import { UIUpdater } from "./ui-updater.js";

export type AgentSession = {
  send: (message: string) => void;
};

export type AgentListeners = {
  onAssistantText: (text: string) => void;
  onFunctionCall?: (name: string) => void;
};

export function createAgentSession(listeners: AgentListeners): AgentSession {
  const functionCallRunner = new DefaultFunctionCallRunner([...terminalTools]);
  const inferenceRunner = new OpenAIInferenceRunner();

  const context = ModelContext.create("cli-agent");
  const model = new Gpt54();
  model.setTools([...terminalTools]);

  const environment = new AgenticEnvironment();
  const agent = new TerminalAgent(
    inferenceRunner,
    functionCallRunner,
    environment,
    context,
    model,
  );
  const uiUpdater = new UIUpdater(listeners);

  agent.join(environment);
  uiUpdater.join(environment);
  environment.start();

  return {
    send: (message: string) => agent.onMessage(message),
  };
}
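
Because the UI depends only on send plus two callbacks, you can swap in a stand-in session while working on layout or writing tests. The sketch below repeats the two types from session.ts so it is self-contained; createFakeSession is a hypothetical helper, not part of Mozaik or the template, and it replies synchronously without calling any model.

```typescript
// Same shapes as in session.ts, repeated so the sketch is self-contained.
type AgentSession = { send: (message: string) => void };
type AgentListeners = {
  onAssistantText: (text: string) => void;
  onFunctionCall?: (name: string) => void;
};

// Hypothetical helper: no model, no network, replies synchronously.
function createFakeSession(listeners: AgentListeners): AgentSession {
  return {
    send: (message: string) => {
      // Pretend the agent called a tool, then answered.
      listeners.onFunctionCall?.("run_command");
      listeners.onAssistantText(`echo: ${message}`);
    },
  };
}

const transcript: string[] = [];
const fakeSession = createFakeSession({
  onAssistantText: (text) => transcript.push(`assistant: ${text}`),
  onFunctionCall: (name) => transcript.push(`system: calling tool: ${name}`),
});

fakeSession.send("list files");
console.log(transcript);
```

Anything that accepts an AgentSession can take either the real session or this fake one, which is the practical payoff of keeping the wiring in a single file.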

Step 3 — The agent loop (terminal/agent.ts)

This is the heart of the app, but the story is simple: when someone sends a message, you record it, let the model think, and if it wants to run a tool you run it and feed the result back—then the model gets another turn until it answers in plain language. Mozaik spells that out as a small class that extends BaseAgentParticipant; you are not reinventing a scheduler, you are filling in how each beat of that loop updates memory and calls back into the framework.

In practice that means: on new user text, add it to the conversation and ask for the next model response; when the model asks for a tool, remember that call is in flight, run it through Mozaik's runner, and store the outcome; when every outstanding tool has finished, ask the model again so it can either reply to the user or request another step. The code block below is the authoritative version—use the prose here to read it, not to memorize API names.

There is no Ink or terminal drawing in this file—only memory and orchestration—so you can change the UI later without touching the agent.

terminal/agent.ts
import {
  BaseAgentParticipant,
  UserMessageItem,
  FunctionCallItem,
  AgenticEnvironment,
  ModelContext,
  GenerativeModel,
  InputStream,
  InferenceRunner,
  FunctionCallRunner,
  FunctionCallOutputItem,
  DeveloperMessageItem,
} from "@mozaik-ai/core";

// Input arrives through onMessage, so the stream the base class expects is empty.
const programmaticAgentInputStub: InputStream = {
  async *stream() {},
};

export class TerminalAgent extends BaseAgentParticipant {
  // Call IDs of tool invocations that have started but not yet returned.
  private pendingCalls = new Set<string>();
  // Ensures the developer prompt is added to the context only once.
  private primed = false;

  constructor(
    inferenceRunner: InferenceRunner,
    functionCallRunner: FunctionCallRunner,
    private readonly environment: AgenticEnvironment,
    private readonly context: ModelContext,
    private readonly model: GenerativeModel,
  ) {
    super(programmaticAgentInputStub, inferenceRunner, functionCallRunner);
  }

  override onMessage(message: string): void {
    // Add the developer prompt on the first message only, so it is not
    // repeated in the context on every turn.
    if (!this.primed) {
      this.context.addContextItem(
        DeveloperMessageItem.create(
          `You are a terminal agent. You can run commands in the terminal to help the user with their request.`,
        ),
      );
      this.primed = true;
    }

    this.context.addContextItem(UserMessageItem.create(message));
    this.runInference(this.environment, this.context, this.model);
  }

  override onFunctionCall(item: FunctionCallItem) {
    this.pendingCalls.add(item.callId);
    this.context.addContextItem(item);
    this.executeFunctionCall(this.environment, item);
  }

  override onFunctionCallOutput(item: FunctionCallOutputItem) {
    this.context.addContextItem(item);
    this.pendingCalls.delete(item.callId);
    // Ask the model again only after every outstanding tool call has returned.
    if (this.pendingCalls.size === 0) {
      this.runInference(this.environment, this.context, this.model);
    }
  }
}
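
The pending-call bookkeeping is the one subtle part of the loop, so it helps to look at it in isolation. The sketch below is plain TypeScript with no Mozaik imports; PendingCallTracker is a hypothetical helper that mirrors the Set-based logic in TerminalAgent, and the call IDs are made up.

```typescript
// Hypothetical helper that mirrors the Set-based bookkeeping in TerminalAgent.
class PendingCallTracker {
  private pending = new Set<string>();

  start(callId: string): void {
    this.pending.add(callId);
  }

  // Returns true when no calls remain outstanding after this result.
  finish(callId: string): boolean {
    this.pending.delete(callId);
    return this.pending.size === 0;
  }
}

const tracker = new PendingCallTracker();
let inferenceRuns = 0;

// The model requested two tools in a single turn.
tracker.start("call_1");
tracker.start("call_2");

// Results can arrive in any order; inference resumes once, after the last one.
for (const callId of ["call_2", "call_1"]) {
  if (tracker.finish(callId)) inferenceRuns += 1;
}

console.log(inferenceRuns); // 1
```

Without the size check, the agent would ask the model for a new turn after each individual result, before the context contains every tool output the model asked for.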

Step 4 — Tools the model can use (terminal/tools.ts)

Tools are how you tell the model what it is allowed to do outside of chat text. Each tool has a name, a short description the model can read, argument shapes, and an invoke function that runs on your machine. Here we expose one tool: run_command, which executes a shell command and returns output so the model can use it on its next turn.

terminal/tools.ts
import { Tool } from "@mozaik-ai/core";
import { Terminal } from "./terminal.js";

const terminal = new Terminal();

export const terminalTools: Tool[] = [
  {
    name: "run_command",
    description: "Run a command in the terminal.",
    parameters: {
      type: "object",
      properties: {
        command: {
          type: "string",
          description: "The command to run in the terminal.",
        },
        cwd: { type: "string", description: "The current working directory." },
      },
      required: ["command", "cwd"],
    },
    strict: true,
    type: "function",
    invoke: async (args: { command: string; cwd: string }) => {
      const result = await terminal.runCommand(args.command, args.cwd);
      return result;
    },
  },
];
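
The Terminal class imported from ./terminal.js is not shown in this article. As a rough idea of what it could look like, here is a minimal sketch built on Node's child_process.exec. Only the runCommand(command, cwd) call shape comes from the tool above; the string return value, the 30-second timeout, and the output limit are assumptions, and the template's actual implementation may differ.

```typescript
import { exec } from "node:child_process";
import { promisify } from "node:util";

const execAsync = promisify(exec);

// Hypothetical implementation; only the runCommand(command, cwd) call shape
// is taken from tools.ts.
export class Terminal {
  async runCommand(command: string, cwd: string): Promise<string> {
    try {
      const { stdout, stderr } = await execAsync(command, {
        cwd,
        timeout: 30_000, // assumed limit so a hung command cannot block the agent
        maxBuffer: 1024 * 1024, // assumed cap of 1 MiB on captured output
      });
      return [stdout, stderr].filter(Boolean).join("\n").trim() || "(no output)";
    } catch (error) {
      // exec rejects on a non-zero exit code; report it to the model as text
      // instead of throwing, so the model can react on its next turn.
      const failure = error as {
        code?: number | string | null;
        stdout?: string;
        stderr?: string;
        message?: string;
      };
      const details =
        [failure.stdout, failure.stderr].filter(Boolean).join("\n").trim() ||
        failure.message ||
        "";
      return `Command failed (exit ${failure.code ?? "unknown"}):\n${details}`;
    }
  }
}
```

Returning failures as text rather than throwing is a design choice: a failed build or a mistyped command is useful information for the model, not an error in the agent itself. Keep in mind that this runs whatever the model asks for with your user's permissions.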

Step 5 — Feed the terminal UI (ui-updater.ts)

The observer sits between Mozaik and Ink. When the agent produces text the user should see, this class forwards it through a callback; when a tool starts, it can add a small status line (for example "calling run_command"). You stay subscribed to external events so you hear what the agent is doing, not duplicate the agent's own work—that keeps one clear owner of the loop and one clear owner of the display.

ui-updater.ts
import {
  Participant,
  FunctionCallItem,
  ModelMessageItem,
  BaseObserverParticipant,
} from "@mozaik-ai/core";

type Listeners = {
  onAssistantText: (text: string) => void;
  onFunctionCall?: (name: string) => void;
};

export class UIUpdater extends BaseObserverParticipant {
  constructor(private readonly listeners: Listeners) {
    super();
  }

  override onFunctionCall(item: FunctionCallItem) {
    this.listeners.onFunctionCall?.(item.toJSON()?.name ?? "tool");
  }

  override onExternalFunctionCall(
    _source: Participant,
    item: FunctionCallItem,
  ) {
    this.listeners.onFunctionCall?.(item.toJSON()?.name ?? "tool");
  }

  override onExternalModelMessage(_source: Participant, item: ModelMessageItem) {
    const text = item.content?.text ?? "";
    if (text) this.listeners.onAssistantText(text);
  }
}

Step 6 — Wire Ink (app.tsx)

The Ink layer holds chat history in normal React state, builds the session once so you do not reconnect on every render, and on submit appends the user message then calls session.send. Anything the observer hears arrives through the callbacks you passed in when creating the session. The pattern below is intentionally minimal so you can focus on layout and input UX in your own fork.

app.tsx
import React, { useMemo, useRef, useState } from "react";
import { Box, Text, useApp } from "ink";
// Assumption: the input field comes from the ink-text-input package.
import TextInput from "ink-text-input";
import { createAgentSession } from "./session.js";

type ChatMessage = {
  id: number;
  role: "user" | "assistant" | "system";
  content: string;
};

export default function App() {
  const { exit } = useApp();
  const [messages, setMessages] = useState<ChatMessage[]>([]);
  const [input, setInput] = useState("");
  const nextId = useRef(0);

  const appendMessage = (role: ChatMessage["role"], content: string) => {
    setMessages((previous) => [
      ...previous,
      { id: nextId.current++, role, content },
    ]);
  };

  // Build the session once; the empty dependency list keeps it across renders.
  const session = useMemo(
    () =>
      createAgentSession({
        onAssistantText: (text: string) => {
          appendMessage("assistant", text);
        },
        onFunctionCall: (name: string) => {
          appendMessage("system", `calling tool: ${name}`);
        },
      }),
    [],
  );

  const handleSubmit = (value: string) => {
    const trimmed = value.trim();
    if (!trimmed) return;
    setInput("");
    appendMessage("user", trimmed);
    session.send(trimmed);
  };

  // Minimal rendering: one line per message, then the input field.
  return (
    <Box flexDirection="column">
      {messages.map((message) => (
        <Text key={message.id}>
          {message.role}: {message.content}
        </Text>
      ))}
      <TextInput value={input} onChange={setInput} onSubmit={handleSubmit} />
    </Box>
  );
}
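
The role field on each message makes it easy to present user, assistant, and system lines differently. Below is a small sketch of one way to do that with a pure function; formatMessage and the prefixes are hypothetical choices for illustration, and the ChatMessage type is repeated from app.tsx so the block stands alone.

```typescript
// Same shape as in app.tsx, repeated so the sketch is self-contained.
type ChatMessage = {
  id: number;
  role: "user" | "assistant" | "system";
  content: string;
};

// Hypothetical prefixes; pick whatever suits your UI.
const prefixes: Record<ChatMessage["role"], string> = {
  user: "you >",
  assistant: "agent >",
  system: "  *",
};

// Turn a message into the single line of text the terminal should show.
function formatMessage(message: ChatMessage): string {
  return `${prefixes[message.role]} ${message.content}`;
}

console.log(formatMessage({ id: 0, role: "user", content: "run the tests" }));
console.log(formatMessage({ id: 1, role: "system", content: "calling tool: run_command" }));
```

Keeping the formatting in a plain function means you can unit test it without rendering any Ink components.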

Step 7 — Credentials and build

Put your API key in .env (the template expects something like OPENAI_API_KEY; check the Mozaik documentation if you change model or provider). Then install, build, and run the compiled CLI (or use npm link if you want a global command).
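
A missing key usually surfaces as an opaque error on the first model call, so it can help to fail fast at startup. The sketch below is one way to do that; requireEnv is a hypothetical helper, not part of Mozaik or the template, and the key value shown is a placeholder.

```typescript
// Hypothetical helper: read a required variable or stop with a clear message.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}. Add it to .env or export it in your shell.`);
  }
  return value;
}

// Demonstration with an explicit object instead of the real environment.
const demoKey = requireEnv("OPENAI_API_KEY", { OPENAI_API_KEY: " placeholder-key " });
console.log(demoKey.length > 0);
```

In cli.tsx you would call requireEnv("OPENAI_API_KEY") after dotenv.config and before render, so the user sees the message before the chat UI starts.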

Starter repository

Prefer a working tree over copy-paste? There is a template repo with the same layout this article walks through—agent, observer, tools, and Ink UI already split into files.

Scaffold a fresh project without copying the template's full git history:

terminal
npx degit jigjoy-ai/cli-agent-starter my-cli-agent
cd my-cli-agent
git init
git add .
git commit -m "Initial commit"
npm install

Replace jigjoy-ai/cli-agent-starter with your fork or canonical URL if it moves; replace my-cli-agent with your package name. Then edit package.json (name, bin), tweak source/cli.tsx / source/app.tsx branding, and start adding participants and tools.

GitHub alternative: enable Template repository in the repo settings and use Use this template — you get a first commit snapshot with a clean history for a new repo.

Where to go next

You now have a concise path from "blank Node project" to "Ink front end + Mozaik event bus + tool-running agent" — with a degit-friendly repo to hit the ground running.

Miodrag Vilotijević

Co-founder @ JigJoy

Building the future of agentic systems
