Growing It

The system described on the features page did not arrive as a package. It started with a CLAUDE.md file and a few sessions of describing what was needed. Each session added something. Skills, agents, memory, integrations. The growth was continuous and compounding. What follows is how it happened.


Adding Skills

The rule: if you do the same multi-step thing twice, make it a skill.

The process is not write-it-yourself. Tell the assistant what you need and it builds the workflow, writes the file in skills/, and adds the command to CLAUDE.md. The trigger question is: "Is this something you're going to do regularly?" When the answer is yes, everything gets built. Your job is to recognize the pattern and say so.

That is how every skill on the features page was created. That is also how the agents were built -- the same question, a more complex spec.

The harder part is recognizing when something has become repeatable. If you find yourself typing the same setup paragraph before a task, that is a skill waiting to be built. If you keep correcting the same behavior, that is either a memory update or a skill for how to avoid it.

A detection log automates part of this recognition. The assistant records multi-step task patterns as they happen. When the same pattern appears twice, it surfaces at session start: this task has occurred twice, formalize it as a workflow? You decide. The assistant builds the skill file when you say yes. It does not build speculatively.

The skill does not need to be perfect on the first run. It gets better the same way everything else does: use and correction.


Running /distill

At the end of any session where you learned something, run /distill.

Claude reviews the conversation. It identifies stable patterns worth keeping and calls memory_save to persist each learning to the MCP memory server. The server handles deduplication automatically; if a memory already exists with high similarity, it is skipped. You approve or reject each proposed memory. What you approve gets saved. What you reject is dropped.
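The deduplication step can be sketched in a few lines. This is an illustrative stand-in, not the memory server's actual logic: a real server would likely compare embeddings, and the 0.9 threshold is a guess.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough text similarity in [0, 1]; a real server would use embeddings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def save_if_new(memory: str, existing: list[str], threshold: float = 0.9) -> bool:
    """Skip the save when a near-duplicate memory already exists."""
    if any(similarity(memory, m) >= threshold for m in existing):
        return False
    existing.append(memory)
    return True
```

Whatever the comparison method, the shape is the same: the server checks similarity before writing, so approving the same learning twice costs nothing.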

The quality bar matters. Not every correction is worth keeping. A one-time exception is not a pattern. /distill is for stable, repeatable knowledge about how you work. It is not a transcript of the conversation.

If you skip /distill, the patterns evaporate with the session. The assistant will make the same mistakes and you will correct them again. Run it when the session produced something worth keeping.

Quick mode, which is the default for /save-state, skips /distill entirely for speed (under 30 seconds). Only the --full flag runs distill first. If you want distillation handled at save time, use /save-state --full. Quick mode is the right default for mid-session checkpoints; full mode is for end-of-day closes when you want everything captured.


Evolving CLAUDE.md

CLAUDE.md is a living document. You own it. Claude reads it as instructions.

When your priorities change, update it. When you add a new tool, add it. When a section is wrong, fix it. When something in the system is no longer accurate, remove it.

A separate changelog file is worth maintaining alongside CLAUDE.md. The log records what changed and when, which matters when you want to understand why the assistant behaves a certain way or when you want to track what improved. Each significant session gets a dated entry. Keeping the changelog in its own file rather than appending to CLAUDE.md keeps the operating brief lean and the history findable without parsing a long file every session.

There is no correct final form for CLAUDE.md. The file will look different for a journalist than for a developer than for a researcher. The right spec is the one you write for yourself.

One addition worth making once the system is running: a ## Compact Instructions section. Claude Code recognizes this section and uses it to guide what gets preserved when the context window fills and compaction runs. Without it, compaction is lossy. With it, you control what survives. List what matters: active projects, pending decisions, files modified in the session, any task in progress. Claude will protect those items even when the rest of the context is compressed.
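A sketch of what such a section might contain, drawn from the list above. The exact wording is illustrative; the section name is what Claude Code recognizes.

```markdown
## Compact Instructions
When compacting, always preserve:
- Active projects and their current status
- Pending decisions awaiting my input
- Files created or modified this session
- Any task currently in progress, including its next step
```

Short and concrete works better than exhaustive. The section is a priority list for compaction, not a second copy of CLAUDE.md.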


Useful Built-In Commands

Four Claude Code commands that are non-obvious but useful once you know they exist:


Refining Your Voice Profile

voice_profile.md starts as a behavioral analysis of your interview answers. Better than self-description -- but it has not seen you work yet.

The interview gives the assistant a first read: how you shift between audiences, what punctuation you avoid, where you hedge and where you assert. That data is real. It is also from one hour of your life. The assistant has not seen you push back on a bad draft, correct the same thing three times, or write under pressure.

After 10 to 15 sessions of real use, run a refinement interview. Say "run a refinement interview" and the assistant rebuilds voice_profile.md using evidence from actual corrections. Every time you said "no, shorter" or "drop the hedging" or "that is not how I talk" -- that is calibration data. The refinement interview captures those patterns instead of what you said about yourself at session one.

The difference between a first-session voice profile and a refined one is roughly the difference between meeting someone for an hour and working with them for months. The interview gets you to functional. Corrections and refinement get you to calibrated.

Two things accelerate it:

The goal is a voice profile specific enough to function as a writing instruction, not a description. That takes use. The system is designed to get there.


Voice Evolution

The refinement interview is a full recalibration. It runs once, maybe twice a year. Between interviews, the assistant still makes small voice errors. A banned word slips through. Paragraphs get longer. The hedging creeps back. None of it is catastrophic. It accumulates.

The voice drift system closes that gap. It runs automatically as part of /distill, which means it fires every time you save state at the end of a session. No extra commands. No separate workflow.

How it works

When /distill scans the conversation for patterns worth keeping, it also scans for voice corrections: moments where you edited the assistant's writing because of how it sounded, not what it said. Word replacements. Sentence restructuring. Tone pushback. "I wouldn't say it that way."

Those corrections are saved to the MCP memory server with voice-drift tags. No approval needed at this stage. Each correction gets a count. When the same correction appears three or more times across sessions, it gets flagged as a confirmed pattern. The old flat file (voice-drift.md) is kept as a read-only archive; all new corrections go to the memory server.
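The counting logic behind confirmation can be sketched directly. This is an illustrative model, not the memory server's code; the three-sighting threshold and the summary format come from this section.

```python
from collections import defaultdict

class VoiceDrift:
    """Track voice corrections; repeated sightings confirm a pattern."""

    def __init__(self, confirm_at: int = 3):
        self.confirm_at = confirm_at
        self.counts: dict[str, int] = defaultdict(int)

    def log(self, correction: str) -> None:
        """Record one voice correction; no approval needed at this stage."""
        self.counts[correction] += 1

    def confirmed(self) -> list[str]:
        """Corrections seen at least `confirm_at` times across sessions."""
        return [c for c, n in self.counts.items() if n >= self.confirm_at]

    def summary(self) -> str:
        """The one-line status shown by the startup hook."""
        total = sum(self.counts.values())
        pending = len(self.confirmed())
        return (f"Voice drift: {total} corrections logged, "
                f"{pending} confirmed patterns pending review.")
```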

At the start of each session, the startup hook checks for voice drift entries. If there are confirmed patterns pending review, a one-line summary appears alongside your calendar and deadlines:

Voice drift: 6 corrections logged, 2 confirmed patterns pending review.

When you are ready, say "show me the voice drift." Review the confirmed patterns. Approve the ones that belong in your voice profile. Reject the noise. Ten seconds. The approved patterns get written into voice_profile.md and the drift entries are cleared.

Why thread history matters

Every correction you have ever made to the assistant's output is sitting in your conversation history. That is a dataset. Do not delete threads. The voice drift system works best when it has a full record of how you corrected writing over weeks and months, not fragments from surviving sessions.

Keep threads focused on one project or task. Running several intense projects in a single thread forces compaction, which compresses the conversation and degrades the correction data. Short focused threads preserve full context and produce cleaner voice data.

Two layers, different jobs

The refinement interview recalibrates the standard. It asks you to articulate what has changed about how you write. It catches evolution the assistant cannot observe from output alone.

Voice drift catches deviations from the standard. It runs continuously. It handles the maintenance between recalibrations.

Both layers persist. Neither requires you to remember what went wrong or when. The data is already in the files.


Building Agents

An agent is a specialized persona with a specific task type. When a workflow becomes common enough to deserve its own instructions, its own output format, and its own quality standards, it becomes an agent.

Agents are markdown files in agents/. Each one includes:

The current roster is 26 agents across Opus, Sonnet, and Haiku tiers. That number is not a goal. It is a count of how many task types became common enough to deserve their own spec. Here is what some of them actually do:

The Orchestrator assigns work to the right model tier: Haiku for high-volume extraction, Sonnet for judgment and synthesis, Opus for complex editorial work that justifies the cost. It coordinates agents and reviews their output. It does not do the grunt work itself.

Sonnet agents handle judgment and synthesis: research, writing, OSINT, financial analysis, complex editorial decisions. They take more time and tokens and return better output on tasks that require thinking.

Haiku agents handle volume and structure: agenda parsing, file management, archive maintenance, data extraction. Faster, cheaper, appropriate when the task is well-defined and the output is structured.

Agents hand work to each other. A signals sweep identifies leads, then routes them to the OSINT agent for people research, the Finance agent for contract analysis, the Palm Bayer agent for article drafting. The Orchestrator coordinates the sequence but does not execute the tasks. Each agent writes its output to a shared task folder. The Orchestrator reads the results, flags gaps, requests revisions, and assembles the final deliverable.
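The Orchestrator's tier assignment reduces to a lookup table. The task-type names below are illustrative stand-ins, but the mapping follows the tier descriptions above; defaulting unknown work to the judgment tier is a design choice, not a documented rule.

```python
# Hypothetical task-type -> model-tier table; names are illustrative.
TIER_BY_TASK = {
    "extraction": "haiku",
    "file-management": "haiku",
    "research": "sonnet",
    "osint": "sonnet",
    "financial-analysis": "sonnet",
    "editorial": "opus",
}

def route(task_type: str) -> str:
    """Assign a model tier; unknown work defaults to the judgment tier
    rather than the cheap one, so nothing complex lands on Haiku."""
    return TIER_BY_TASK.get(task_type, "sonnet")
```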

You do not need agents on day one. The skills and CLAUDE.md carry most of the load. Agents make sense when you have a task type that runs frequently, has enough complexity to deserve its own spec, and produces output that needs consistent formatting. The roster started with none. Each agent was added when a task earned it.


Adding Mobile Access via Telegram

Once the assistant is running reliably at the terminal, an optional mobile-access tier extends it to your phone. A Telegram channel plugin runs as an MCP server directly inside your Claude Code session. Messages arrive as channel events. Replies go out through MCP tool calls. No separate bridge process is needed.

The practical benefit: you can assign task lists from your phone, receive results as messages, and send follow-up instructions without sitting at a desk. The same assistant handles both the terminal session and the Telegram channel. It is one system, not two.

Setup requires a Telegram bot token, the channel plugin configured as an MCP server in your Claude Code settings, and a bot added to your chat. Once wired up, inbound messages show up in session just like any other channel event. The away-mode dispatch-wait loop uses this channel to block on a reply before proceeding to the next task in the queue.
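The dispatch-wait loop is, at its core, a poll with a deadline. A minimal sketch, assuming a `poll` callable that stands in for whatever MCP tool call fetches new channel events (the real tool names depend on the plugin):

```python
import time

def wait_for_reply(poll, timeout_s: float = 600.0, interval_s: float = 5.0):
    """Block until `poll()` returns a message or the timeout expires.

    `poll` is a placeholder for the plugin's actual fetch call; the
    timeout and interval values are illustrative defaults.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        msg = poll()
        if msg is not None:
            return msg
        time.sleep(interval_s)
    return None
```

The timeout matters: a queue should move on to the next task when no reply arrives, not hang indefinitely waiting for a phone that may be in a pocket.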

This layer is worth adding after the core system is stable. It is not a day-one requirement, and the assistant works completely without it. But it changes how you can use the system: tasks that previously required sitting at a terminal can be directed from anywhere.


Multi-AI Routing via Context Folders

The assistant is not the only AI in the system. Specific task types can be routed to other command-line AI tools by building purpose-built context folders for them.

The pattern: create a directory for each AI CLI you want to use. Inside it, add an instruction file that the CLI reads automatically on startup. Each folder becomes a context-loaded dispatch point for a specific task domain. When the orchestrator needs that type of work done, it dispatches a task to the right folder rather than handling it inline.

Example task domains this works well for:

The result is a routing layer where tasks go to the most capable tool for the job rather than everything defaulting to a single model. Research that benefits from live web access goes to the web-grounded tool. High-volume extraction goes to the fastest available model. Editorial judgment stays with the primary assistant. The system routes rather than generalizes, and the context folders make that routing explicit and repeatable.
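The dispatch mechanics can be sketched as running each CLI from inside its context folder, so it loads that folder's instruction file on startup. The tool names, folder paths, and command-line shape below are all placeholders:

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from task domain to an AI CLI and its context folder.
DISPATCH = {
    "web-research": {"cwd": Path("tools/web-cli"), "cmd": ["web-cli", "run"]},
    "bulk-extract": {"cwd": Path("tools/fast-cli"), "cmd": ["fast-cli", "run"]},
}

def dispatch(domain: str, prompt: str) -> str:
    """Run the CLI with `cwd` set to its context folder so it picks up
    that folder's instruction file; command names are placeholders."""
    spec = DISPATCH[domain]
    result = subprocess.run(
        spec["cmd"] + [prompt], cwd=spec["cwd"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Setting the working directory is what makes the routing explicit: the folder, not the prompt, carries the domain context.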


Scaling CLAUDE.md

CLAUDE.md grows. Every new project, tool, agent, and priority adds lines. At some point the file gets heavy enough to slow down session starts and eat context window space that should be available for actual work. This is not a failure of the approach. It is a sign the system is working. The fix is not to stop adding things. The fix is lazy loading.

Heavy sections that do not change often get extracted to subfiles. CLAUDE.md keeps a one-line pointer. Agents read the subfile on demand via the Read tool when they actually need it. A 130-line agent architecture section became a one-line import reference. A routing table, maintenance config, and URL registry followed. The main file shrank 37%. The content is still there. It just loads when needed, not on every session start.

The principle: CLAUDE.md should contain what every session needs. Everything else should be one read away. Identity, priorities, output rules, recognized commands, and current projects stay in the main file. Reference material, detailed configs, and agent specs live in subfiles that load on demand. The file stays fast. The system stays complete.
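The one-line pointer pattern can be sketched with a hypothetical pointer syntax; the `@see` convention below is an assumption for illustration, not the system's actual notation:

```python
from pathlib import Path

def load_section(pointer: str) -> str:
    """Resolve a one-line pointer like '@see docs/agents.md' on demand.

    Any convention works, as long as CLAUDE.md stays short and the
    subfile is exactly one Read away.
    """
    prefix = "@see "
    if not pointer.startswith(prefix):
        raise ValueError("not a pointer line")
    return Path(pointer[len(prefix):]).read_text()
```

The main file pays one line per extracted section; the full content loads only in sessions that actually need it.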

Beyond lazy loading, the files themselves can be compressed for token efficiency. System instruction files are formatted for humans, but the AI does not need that formatting. Stripping markdown syntax, collapsing lists, and using shorthand notation cuts token consumption by 40-70% with no behavior changes. The Token Trim page covers the method and results.
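A minimal sketch of the stripping step, to make the idea concrete. This is a toy version handling a few common markdown decorations, not the Token Trim method itself:

```python
import re

def trim(text: str) -> str:
    """Strip markdown decoration the model does not need to follow
    the instructions: heading hashes, bold markers, list bullets,
    and blank-line padding."""
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.M)   # heading hashes
    text = re.sub(r"\*{1,2}", "", text)                  # bold/italic markers
    text = re.sub(r"^\s*[-*]\s+", "", text, flags=re.M)  # list bullets
    text = re.sub(r"\n{2,}", "\n", text)                 # blank-line padding
    return text.strip()
```

The behavior-preservation claim is the important part: the words the model reads are unchanged; only the formatting tokens around them are gone.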