Take this document, feed it to your AI, and let it build your entire agent architecture. Identity. Authority. Memory. Automation. Communication. The complete setup guide behind a $20M+ company running 15+ AI agents, 200K+ contacts, and 100+ automated processes.
This is not a guide you read. It is a setup wizard you run. Copy the prompt below, paste it into your AI agent, and it will analyse your entire current system against this blueprint, identify every gap, and build the missing pieces for you.
Browse the 5 chapters to understand the architecture. Each chapter covers one pillar: Identity, Authority, Memory, Automation, and Communication. Together they form the complete operating system for an AI agent squad.
Copy the Master Prompt below and paste it into your AI agent (OpenClaw, Claude, ChatGPT, or any system that accepts long context). Feed this entire document as context. Your agent will do the rest: gap analysis, file creation, implementation roadmap.
Copy this prompt and paste it into your AI agent along with the link to this page. The agent will read the full document, analyse your current setup, and build everything you are missing.
Copy the prompt from the box above. That is the only thing you need to copy manually. The prompt tells your agent exactly what to do with this document.
Open a conversation with your AI agent (OpenClaw, Claude, ChatGPT, Gemini, or similar). Paste the Master Prompt, then send the link to this page: openclawsetupwizard.com. The agent will read the entire document itself.
The agent will ask you about your company, your team, your systems. Answer honestly. The more context you give, the better the files it generates. This is a conversation, not a one-shot command.
Take every file the agent generates and save it to your agent's workspace directory. SOUL.md, IDENTITY.md, USER.md, ROLE.md, HEARTBEAT.md. Your agent is now operating with the full architecture.
By the end of this setup wizard, your agent will have all of the following. Each one maps to a chapter in this document.
SOUL.md (company DNA), IDENTITY.md (agent personality), USER.md (human profiles). Your agent knows who it is, what it sounds like, and exactly how every human on the team wants to be served.
Chapter 1: Identity
Traffic light decision system (GREEN/YELLOW/RED), escalation rules, protected file tiers, sub-agent security model. Your agent knows what it can do autonomously and when to stop and ask.
Chapter 2: Authority & Safety
Three-layer memory system (structured data, contextual knowledge, daily journals), source of truth hierarchy, regression engine. Your agent never forgets and never acts on stale information.
Chapter 3: Memory
Boot sequence, heartbeat cycles, cron architecture, dreaming cycles, load shedding. Your agent works while you sleep. Morning briefs, channel monitoring, data syncs, system health checks, all automated.
Chapter 4: Automation
Channel routing, platform formatting rules, voice and tone guidelines, group chat behaviour, squad communication patterns. Your agent says the right thing, on the right channel, in the right format.
Chapter 5: Communication
One database to rule them all. Instead of paying API tolls to 6 different systems every time you need data, sync everything to one place and query it for free. Cheaper, faster, more reliable.
Chapter 3: Memory (The Toll Bridge)
Without identity files, every agent is a generic AI assistant. With them, each agent has a personality, a voice, a domain, and a mandate. Two agents can read the same company DNA but behave completely differently.
Every agent sounds the same. Generic responses, no domain expertise, no personality. Ask the operations agent about sales pipeline and it gives the same answer as the sales agent. No specialisation. No voice. No ownership. You might as well have one generic chatbot pretending to be ten different people.
Jarvis speaks in metrics and ships systems. Cipher thinks in funnels and ad spend. Barbie tracks every dollar to the cent. Same company rules, completely different behaviour. Each agent owns its domain and communicates in a voice the team recognises and trusts. New agents come online in under an hour with full personality and context.
Identity is not cosmetic. It determines how the agent thinks, what it prioritises, how it communicates, and what it considers its responsibility. An agent with a strong IDENTITY.md will proactively pick up work in its domain. An agent without one waits to be told what to do. The difference between an autonomous operator and a passive assistant is three markdown files.
Every agent's identity is built from three files. One defines the company. One defines the agent. One defines the humans it serves. Together, they turn a blank AI session into a fully contextualised team member.
The company operating system. Vision, values, voice rules, culture alignment principles. Every agent in the squad reads this file on boot and immediately understands the company's DNA.
Contains: vision statement, core values (ownership over permission, momentum is sacred, truth over comfort), communication standards (no fluff, brevity by default, humour allowed), platform formatting rules, group chat behaviour guidelines, and the executive team's authority model.
This is what makes every agent feel like part of the same team. Change SOUL.md once and every agent in the squad gets the update on their next boot.
The individual personality. Name, emoji, role title, personality traits, voice modifiers, one-liner mandate. This is what makes Jarvis sound like Jarvis and not like a generic chatbot.
Contains: agent name, emoji identifier, role title, personality description ("precise, data-driven, execution-focused"), voice modifiers ("no corporate praise", "numbers first, narrative second", "humour allowed"), and a mandate summary that defines what happens if this agent goes offline.
Two agents reading the same SOUL.md will behave completely differently based on their IDENTITY.md. That is by design.
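As a concrete illustration, a minimal IDENTITY.md might look like the sketch below. The agent name and voice modifiers echo the examples above; every other detail is invented for illustration, not a prescribed template.

```markdown
# IDENTITY.md — Jarvis (illustrative example)

**Name:** Jarvis
**Emoji:** 🤖
**Role:** Head of Operations

**Personality:** Precise, data-driven, execution-focused.

**Voice modifiers:**
- No corporate praise
- Numbers first, narrative second
- Humour allowed

**Mandate:** Ships systems, owns operational metrics, maintains automation
health. If Jarvis goes offline, morning briefs and data syncs stop:
escalate to the ops exec within one heartbeat cycle.
```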
The humans. Full profiles of every person the agent serves. How they think, how they want information delivered, their timezone, preferred channels, and success metrics for serving them well.
The agent reads this and immediately knows: "Jaydyn wants bullets not paragraphs." "Calvin commits hard but changes course with better data." "Ash needs financial clarity, numbers always current." No guessing. No asking. The agent already knows.
Includes operating assumptions (travel schedules, bandwidth constraints), communication preferences (Telegram for urgent, Slack for team), and failure modes to watch for (Calvin's "vision jump" pattern).
Agents drift. Over hundreds of sessions, subtle shifts in tone, priority, and behaviour accumulate until the agent no longer matches its documented personality. Identity calibration catches that drift before it becomes a problem.
Every quarter, test the agent's actual behaviour against its IDENTITY.md. Is the agent still using the voice modifiers it was given? Is it prioritising the right domain? Is it communicating in the style its USER.md profiles expect? If actual behaviour does not match documented personality, recalibrate.
Identity drift is not cosmetic. If an agent that should be sharp and data-driven starts giving fluffy, vague answers, that is a regression. Log it in regressions.md with the same severity as any operational failure. The fix: update the identity files, add explicit voice modifiers, and test again.
Start with the company, then the agent, then the humans. This order matters. The company DNA comes first because it sets the constraints and culture that every individual agent must operate within.
Define the company's vision, core values, communication standards, and authority model. This file is shared by every agent, so write it as the company's operating system, not for any single agent.
Define who this specific agent is. Give it a name, a personality, voice modifiers, and a clear mandate. Be specific. "Friendly and helpful" is useless. "Precise, data-driven, speaks in metrics, no corporate praise" is an identity.
Profile every person the agent serves. Document how they think, how they want information, and what success looks like for them. The agent should never have to guess how to communicate with a human.
AI agents with access to live business systems and 200K+ contacts need clear boundaries. Without them, one bad session can send emails to your entire database, overwrite financial records, or leak credentials. Guardrails are not limitations. They are what make autonomous operation possible.
Your agents have API access to CRMs with hundreds of thousands of contacts, payment systems processing real transactions, communication tools connected to clients and partners, and databases storing customer data. One misconfigured automation or one overly aggressive sub-agent can cause damage that takes days to undo. Some actions are irreversible entirely.
Every action is classified by risk before execution. Safe, reversible actions happen automatically. Cross-team changes get flagged. Customer-facing, financial, or irreversible actions require explicit human approval. This is not a suggestion or a best practice. It is the operating model that makes it safe to give agents real authority.
Without guardrails, humans have to review every action before an agent takes it. That defeats the purpose of having agents. With a clear risk classification system, agents handle 80% of work autonomously (GREEN) while humans only intervene on the 20% that actually needs their judgment (YELLOW/RED). The guardrails are what give agents permission to move fast.
Every action an agent takes is classified by risk. Green means act immediately. Yellow means flag and recommend. Red means stop and wait for human approval. This is not a suggestion. It is how the system prevents AI from making irreversible mistakes with live business data.
Safe, reversible, sandboxed actions. Act immediately, inform after.
Cross-team impact or process changes. Present recommendation, wait for approval.
Customer-facing, financial, or irreversible. Never proceed without explicit exec approval.
Before classifying any action as GREEN, run the blast radius check. If the action affects live customer data, billing, auth, routing, notifications, or multiple teams, it is not GREEN. Full stop. This check prevents the most common guardrail failure: an agent classifying something as safe because the individual action looks small, while missing that its effects cascade across the entire operation.
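The traffic-light model with the blast radius check can be sketched in code. This is a minimal illustration, not the actual implementation: the `Action` fields, the domain list, and the exact GREEN/YELLOW/RED mapping are assumptions you would tune to your own stack.

```python
from dataclasses import dataclass, field

# Domains that disqualify an action from GREEN under the blast radius check.
# Illustrative list; extend it to match your own systems.
BLAST_RADIUS_DOMAINS = {"customer_data", "billing", "auth", "routing", "notifications"}

@dataclass
class Action:
    name: str
    reversible: bool
    touches: set = field(default_factory=set)  # domains the action affects
    teams_affected: int = 1

def classify(action: Action) -> str:
    """Return GREEN, YELLOW, or RED for a proposed action."""
    # Blast radius check runs first: anything touching live customer data,
    # billing, auth, routing, notifications, or multiple teams is never GREEN.
    wide_blast = bool(action.touches & BLAST_RADIUS_DOMAINS) or action.teams_affected > 1
    if not wide_blast and action.reversible:
        return "GREEN"   # act immediately, inform after
    if action.reversible:
        return "YELLOW"  # present recommendation, wait for approval
    return "RED"         # stop: explicit human approval required

print(classify(Action("archive draft doc", reversible=True)))                      # GREEN
print(classify(Action("edit shared workflow", reversible=True, teams_affected=2)))  # YELLOW
print(classify(Action("bulk email 200K contacts", reversible=False,
                      touches={"customer_data", "notifications"})))                 # RED
```

Note the ordering: the blast radius check is evaluated before anything else, so a "small" reversible action that touches billing can never slip through as GREEN.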
ESCALATION.md defines who can approve what. Not every human has the same authority. Not every issue requires the same response speed. The escalation system matches the right decision-maker to the right risk level.
Executives have full authority within their domain. The CEO has final call on strategic contradictions. Each exec can direct agents and approve YELLOW and RED items within their area. Leadership team members (managers, leads) sit at the request tier: they can ask agents for information and flag issues, but they cannot direct agents or approve escalations.
Agents always need explicit human approval for RED items. No exceptions. No "proceeding with best judgment." No "it seemed urgent so I went ahead." RED means stop. Wait. Get a human to say yes. The cost of waiting is always less than the cost of an irreversible mistake at scale.
Not every file in the workspace has the same protection level. Some files can only be edited by the system owner. Some allow any agent to add entries. Some are fully controlled by the owning agent. Three tiers, clearly defined.
Sub-agents are workers, not full agents. They are spawned for specific tasks and terminated after completion. Their permissions are locked down by default. These rules are non-negotiable.
Sub-agents must not send messages to Slack, WhatsApp, email, or any external channel directly. All external communication routes through the main agent session. This prevents rogue sub-agents from contacting clients, posting in team channels, or leaking internal reasoning.
Sub-agents cannot write to Tier 1 or Tier 2 protected files. They cannot edit SOUL.md, decisions.md, regressions.md, or any universal file. Their file access is limited to the specific task they were spawned for.
Sub-agents do not inherit credentials from the parent agent. They do not access the credential store directly. If a sub-agent needs API access, the parent provides a scoped token or makes the API call on the sub-agent's behalf.
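The three sub-agent rules above can be enforced structurally rather than by convention. A hedged sketch, assuming hypothetical `ParentAgent` and `SubAgent` classes: the point is that the sub-agent object simply has no path to channels, protected files, or the credential store.

```python
TIER_PROTECTED = {"SOUL.md", "decisions.md", "regressions.md"}  # Tier 1/2 files (illustrative)

class ParentAgent:
    """Holds credentials and owns all external channels."""
    def __init__(self, credentials: dict):
        self._credentials = credentials  # sub-agents never read this store

    def scoped_token(self, service: str) -> str:
        # Hand out a narrow, task-scoped token rather than raw credentials.
        return f"scoped::{service}"

    def send_external(self, channel: str, text: str) -> str:
        # Every outbound message crosses through the main session.
        return f"[{channel}] {text}"

class SubAgent:
    """A task-scoped worker: no channels, no credentials, narrow file access."""
    def __init__(self, parent: ParentAgent, task: str, writable: set):
        self.parent, self.task, self.writable = parent, task, writable
        self.files: dict = {}

    def write_file(self, path: str, content: str) -> None:
        # Locked out of Tier 1/2 files and anything outside its task scope.
        if path in TIER_PROTECTED or path not in self.writable:
            raise PermissionError(f"sub-agent may not write {path}")
        self.files[path] = content

    def send_message(self, channel: str, text: str) -> str:
        # No direct Slack/WhatsApp/email access: delegate to the parent.
        return self.parent.send_external(channel, f"({self.task}) {text}")
```

With this shape, a rogue sub-agent cannot message a client or edit SOUL.md even if its prompt tells it to; the capability simply is not there.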
AI agents forget everything between sessions. Without a memory system, every conversation starts from zero. Decisions get remade. Work gets repeated. Context gets lost. At our scale, that is not an inconvenience. It is an operational failure.
Every session starts blank. Agents re-ask the same questions. Decisions made yesterday are invisible today. Sub-agents run the same task twice because nobody logged the first attempt. Financial data gets reported with stale numbers. A single error cascades across the entire operation.
Every agent boots with full context in under 60 seconds. Decisions are immutable and searchable. Mistakes become permanent rules that prevent recurrence. New agents onboard by reading files, not asking humans. The exec team gets accurate data because the system self-verifies.
Every agent workspace has two categories of files: 9 Universal Files that make the agent part of the company, and 5 Editable Files that make it a specific agent. This split is what makes scaling from 1 agent to 20 possible without retraining a single one.
These files are shared across every agent in the squad. When you spin up a new OpenClaw entity or sub-agent, it reads these 9 files on boot and immediately understands the company: who we are, how we operate, what decisions have been made, what mistakes to avoid, and who it serves.
No human briefing required. No onboarding calls. The new agent reads these files and is operationally aligned in under 60 seconds. Change one universal file and every agent in the squad gets the update on their next boot.
These files are unique to each agent. They define who this specific agent is, what it is responsible for, what it is currently working on, what it should be monitoring, and what it remembers from recent work. Two agents can share the same 9 universal files but behave completely differently based on their 5 editable files.
This is how you create a Head of Operations that thinks in systems, a Head of Sales that thinks in pipeline, and a Head of Finance that thinks in numbers, all operating under the same company rules.
These files should live in a centralised data store like Supabase. That way, when you update a universal file, every agent in the squad reads the new version on their next boot. If each agent kept their own local copy, you would have to manually update every single agent whenever a company-wide rule changed. Centralisation eliminates that problem entirely.
Memory is not a single system. It is three systems working together, each handling a different type of information. Mixing them causes the same problems as storing your spreadsheets in your email inbox. Structure matters.
Structured data. Contacts, metrics, subscriptions, transactions, sync records. Think of Supabase as a library where every book has a catalogue number, a shelf location, and a standardised format. You can query it, join it, aggregate it, and report on it with precision.
This is where the WHAT lives. 200K+ contacts. Subscription statuses. Transaction histories. Churn metrics. If you can put it in a row and column, it goes here.
Context and reasoning. Decisions, lessons, institutional knowledge, cross-agent learnings. Supermemory is the librarian who knows why each book is on its shelf, what it connects to, and what the reader should know before opening it.
This is where the WHY lives. Why we chose GHL over HubSpot. Why the onboarding flow was restructured. Why a particular automation was built the way it was. Searchable across all agents in the squad.
Linked knowledge vault. Daily logs, session notes, reference material, bi-directionally linked notes. Obsidian is the personal journal: detailed, unstructured, richly connected. You write in it every day and the connections emerge over time.
This is where the HISTORY lives. 75+ bi-directionally linked notes. What happened on any given day. Who said what. What was tried. What failed. The raw record that feeds both the library and the librarian.
Never dump structured data into Supermemory. Never treat Obsidian as source of truth for current state. Never store raw daily notes in Supabase. Each system has a purpose. Mixing them degrades all three. When an agent creates new information, it routes to the correct layer based on type, not convenience.
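The routing rule can be made explicit as a lookup table. A minimal sketch, assuming an illustrative set of information types; the layer names follow the three-system split above.

```python
# Each piece of new information routes to a layer by type, never by convenience.
ROUTES = {
    "contact":      "supabase",     # structured rows: the WHAT
    "metric":       "supabase",
    "transaction":  "supabase",
    "decision":     "supermemory",  # context and reasoning: the WHY
    "lesson":       "supermemory",
    "daily_log":    "obsidian",     # raw record: the HISTORY
    "session_note": "obsidian",
}

def route(kind: str) -> str:
    """Return the memory layer for a given information type."""
    try:
        return ROUTES[kind]
    except KeyError:
        # Unknown types are flagged, not silently dumped into the nearest system.
        raise ValueError(f"no routing rule for information type: {kind}")
```

The `ValueError` branch matters: an unrouted type is a gap in the architecture, and the agent should surface it rather than improvise.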
Every API call is a toll bridge. You pay the price every single time you cross it. If your agent needs data from Circle, GHL, Stripe, Xero, Google Sheets, and your Calendar, that is six toll bridges. Run those calls 10 times a day and you are crossing 60 toll bridges. Every day. It is slow, expensive, and fragile.
Every time your agent needs data, it makes a live API call. Need customer status? Call GHL. Need payment history? Call Stripe. Need community activity? Call Circle. Need schedule? Call Google Calendar.
Each call costs money, takes time (500ms to 3 seconds per call), can fail due to rate limits, and returns data in different formats. Your agent spends more time collecting information than acting on it.
6 APIs x 10 calls/day = 60 toll crossings
At ~$0.01/call avg = $0.60/day = $219/year
+ latency + rate limits + failure handling
Sync all your external systems to one centralised database (Supabase, Postgres, or similar) on a schedule. Once a day, once an hour, whatever cadence matters. Then every time your agent needs data, it queries the local database. One toll bridge instead of six.
Local database queries are instant (under 50ms), free, and never rate-limited. Your agent gets data from one place, in one format, with one connection. Everything else is a background sync job.
6 syncs x 1/day = 6 toll crossings
All agent queries = local (free, instant)
90% reduction in API costs and latency
Building the centralised data hub comes down to three moves: pick one database, schedule one sync job per external system, and point every agent query at the local copy.
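A minimal sketch of the sync side, using SQLite as a stand-in for Supabase/Postgres. The fetcher functions are hypothetical placeholders for real API clients (GHL, Stripe, Circle, and so on); the table shape is invented for illustration.

```python
import sqlite3
import time

# Illustrative fetchers standing in for real API clients. Each one is the
# single paid "toll crossing" for its system.
def fetch_stripe():
    return [("cus_1", "active"), ("cus_2", "churned")]

def fetch_ghl():
    return [("cus_1", "lead"), ("cus_3", "customer")]

SOURCES = {"stripe": fetch_stripe, "ghl": fetch_ghl}

def sync_all(db: sqlite3.Connection) -> int:
    """Cross each toll bridge once; land everything in one local table."""
    db.execute("CREATE TABLE IF NOT EXISTS contacts "
               "(source TEXT, id TEXT, status TEXT, synced_at REAL)")
    rows = 0
    for name, fetch in SOURCES.items():
        for contact_id, status in fetch():
            db.execute("INSERT INTO contacts VALUES (?, ?, ?, ?)",
                       (name, contact_id, status, time.time()))
            rows += 1
    db.commit()
    return rows

# Agents then query locally: instant, free, never rate-limited.
db = sqlite3.connect(":memory:")
sync_all(db)
active = db.execute("SELECT id FROM contacts WHERE status = 'active'").fetchall()
```

Run `sync_all` on whatever cadence the data demands; every agent query after that is a local read instead of an API call.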
When two sources of information conflict, which one wins? Without a clear hierarchy, agents spend time reconciling contradictions instead of executing. The hierarchy eliminates ambiguity. Higher tiers always override lower tiers.
Every mistake becomes a permanent rule. regressions.md is loaded on every boot by every agent. A mistake made once is a learning opportunity. A mistake made twice is a system failure. The regression engine ensures it never happens a second time.
When something goes wrong, the responsible agent writes a regression entry immediately. Not at end of day. Not when it is convenient. Immediately. The entry includes what happened, why it happened, the rule that prevents recurrence, and a severity level. That rule is then loaded on every boot by every agent in the squad.
Regressions are not just notes. They are testable rules. Periodically, the squad lead (or an automated process) runs regression tests: does the current system state still honour every rule in regressions.md? If an agent's workspace file contains plaintext credentials, regression SW-03 has failed. If a sub-agent is sending messages directly to Slack, the sub-agent security regression has failed.
Think of it like unit tests for operational behaviour. The regression file is the test suite. The agents' actions are the code under test. Failures get flagged immediately.
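The unit-test analogy translates almost directly into code. A sketch under stated assumptions: the rule IDs follow the SW-03 example above, but the check functions and the `state` dictionary shape are invented for illustration.

```python
import re

# Each regression is a permanent, testable rule: a predicate over system state.
# Check bodies are illustrative; real checks would scan actual workspace files.
REGRESSION_CHECKS = {
    "SW-03":  lambda state: not re.search(r"(api_key|password)\s*=", state["workspace"]),
    "SUB-01": lambda state: not state["subagent_sent_direct_message"],
}

def run_regression_suite(state: dict) -> list:
    """Return the IDs of every regression rule the current state violates."""
    return [rule_id for rule_id, check in REGRESSION_CHECKS.items()
            if not check(state)]

failures = run_regression_suite({
    "workspace": "notes about onboarding\napi_key = sk-live-1234",  # plaintext credential!
    "subagent_sent_direct_message": False,
})
# failures -> ["SW-03"]
```

An empty failure list means the suite passes; anything else gets flagged immediately, exactly as a failing unit test would.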
Memory is not a "set it and forget it" system. Without active monitoring, files go stale, layers drift out of sync, and agents make decisions on outdated information. Three mechanisms keep the system healthy.
A periodic audit of the entire memory system. How current are the state files? When was regressions.md last updated? Are all three layers (Supabase, Supermemory, Obsidian) in sync? The maturity score tracks completeness, freshness, and consistency across all memory layers. A healthy system scores 80%+ across all dimensions.
A scheduled process where agents review their accumulated raw memory and distill it into higher-level knowledge. Like the way sleep consolidates human memory, dreaming turns daily logs and session notes into curated MEMORY.md entries, Supermemory entries, and updated state files.
During dreaming: raw daily notes are reviewed, key learnings are extracted, MEMORY.md is updated, stale entries are pruned, and Supermemory is synced. This prevents memory bloat while preserving the important context.
Every agent session gets a unique identifier. This creates an audit trail: which session made which decision, updated which file, ran which automation. When something goes wrong, session IDs let you trace the exact chain of events that led to the problem. Without them, debugging across 15+ agents is guesswork.
Session IDs also prevent duplicate work. If two sessions spawn from the same trigger, the second one can check whether the first already completed the task.
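Both properties, traceability and deduplication, fall out of a simple scheme. A minimal sketch; the ID format and the in-memory claim registry are assumptions (a real deployment would claim triggers in a shared store so all agents see them).

```python
import datetime
import uuid

_completed: dict = {}  # trigger -> session id that claimed it (shared store in production)

def new_session_id(agent: str) -> str:
    # Timestamp prefix keeps IDs sortable in logs; random suffix keeps them
    # unique even when two sessions start in the same second.
    ts = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%S")
    return f"{agent}-{ts}-{uuid.uuid4().hex[:8]}"

def claim(trigger: str, session_id: str) -> bool:
    """First session to claim a trigger wins; duplicates back off."""
    if trigger in _completed:
        return False
    _completed[trigger] = session_id
    return True
```

If two sessions spawn from the same cron trigger, the second `claim` call returns `False` and that session exits instead of repeating the work.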
| Cadence | Tasks |
|---|---|
| Daily (EOD) | Write Obsidian daily log. Sync key learnings to Supermemory. Refresh state files with current data. |
| Every 3 Days | Review memory/ files. Distill important context into MEMORY.md. Prune stale entries that no longer reflect reality. |
| Weekly | Full MEMORY.md review. Update all state/ files. Verify Supermemory is current. Run memory maturity score. |
| Quarterly | Full regression test. Identity calibration. Archive old daily notes. Audit cross-agent consistency. |
Seven rules that govern how agents interact with the memory system. Break any one of them and the whole architecture degrades. Follow all of them and you get agents that remember, learn, and improve permanently.
If it is not written down, it does not exist. Agent memory is volatile. Session context disappears. The only durable knowledge is what lives in files and databases. When in doubt, check the file. When the file is wrong, fix the file. Never trust in-session reasoning over documented state.
Structured data goes to Supabase. Context and reasoning goes to Supermemory. Daily logs go to Obsidian. Current state goes to state/ files. The temptation is to dump everything into the nearest available system. Resist it. Wrong-layer storage creates search problems, sync conflicts, and stale data.
Decisions and regressions are written the moment they happen. Not later. Not at end of day. Immediately. General learnings and session notes can wait until end of session. This two-speed approach ensures critical information is never lost while avoiding constant file-writing overhead for routine observations.
Regressions are never deleted. Decisions are never overwritten. Completed tasks are moved to completed status with dates, not removed. The historical record matters. What was tried and failed is as valuable as what succeeded. Deletion destroys institutional knowledge.
When two sources of information conflict, do not silently pick one. Flag the discrepancy. Note which sources disagree, when each was last updated, and recommend which one should be authoritative. Silent conflict resolution leads to invisible errors that compound over time.
Memory bloat is as dangerous as memory loss. An agent loading 50 pages of stale context on every boot wastes tokens, slows response time, and introduces noise. The dreaming cycle exists specifically to prune what is no longer relevant while preserving what still matters.
When one agent learns something that matters to the squad, it goes to Supermemory. Not copied into every agent's local files. Not repeated in every MEMORY.md. One entry, shared container, searchable by all. This is how 15 agents stay aligned without exponential duplication.
Memory systems fail. Databases go down. Files get corrupted. Syncs break. Without a backup strategy, a single failure can erase weeks of institutional knowledge. The system is designed with redundancy at every layer.
The three-layer architecture is itself a form of backup. Supabase holds the structured data. Supermemory holds the context. Obsidian holds the daily logs. If any single layer goes down, the other two contain enough information to reconstruct what was lost. This is not accidental. It is by design.
When a memory system is unreachable, agents do not stall. They fall back to the next available source and continue operating. The boot sequence defines explicit fallbacks: if the centralised store is unreachable, fall back to local cached copies. If Supermemory is unreachable, fall back to local notes. Log the gap. Continue working.
If a full recovery is needed, prioritise in this order: (1) regressions.md and decisions.md, because they are Tier 1 authority and losing them means losing operational constraints; (2) state/ files, because they define current reality; (3) MEMORY.md, because it is the most active working memory; (4) Supermemory entries, because they provide cross-agent context; (5) daily notes, because they are the raw record but lowest priority for operational continuity.
An agent that only responds when you talk to it is an expensive chatbot. The real value of AI agents comes from automation: boot sequences that load context without human intervention, heartbeats that monitor systems continuously, cron jobs that run operational tasks on schedule, and dreaming cycles that maintain memory health. Automation turns agents from reactive assistants into autonomous operators.
Every morning, you manually brief your agent. You remind it about yesterday's decisions. You tell it what to check. You ask it to run the same reports it ran yesterday. You are the scheduler, the memory system, and the task queue. The agent waits passively until you type something. Every session starts from zero context and zero initiative.
Your agent boots with full context in under 60 seconds. It checks all monitored systems on schedule. It sends you a morning briefing before you are awake. It detects anomalies and flags them before they become problems. It maintains its own memory, prunes stale data, and syncs across layers. You interact with an agent that already knows what is happening, not one waiting to be told.
Every agent session starts with a five-phase boot sequence. The agent loads its identity, context, rules, state, and situational awareness in a specific order. This order matters. Identity comes first because everything else is interpreted through the lens of who the agent is.
If an agent loads rules before identity, it reads the rules as a generic assistant. If it loads state before context, it has no framework for interpreting the state. The boot sequence is designed so each phase builds on the previous one. By Phase 5, the agent has a complete picture: who it is, who it serves, what constraints apply, what is currently happening, and what requires attention.
Pull SOUL.md from the centralised store. Read IDENTITY.md and ROLE.md from local workspace. After this phase, the agent knows: the company's DNA, its own personality, voice modifiers, domain responsibilities, access levels, and key workflows.
Pull USER.md and ESCALATION.md from the centralised store. After this phase, the agent knows every executive's communication preferences, thinking style, authority level, and how to route issues based on domain and severity.
Pull COMMS.md, regressions.md, and decisions.md from the centralised store. After this phase, the agent knows every communication protocol, every permanent rule from past mistakes, and every canonical decision that has been made.
Read state.md and TASKQUEUE.md from local workspace. After this phase, the agent knows its current status, active focus areas, running systems, and every task in the work queue with status, owner, and deadline.
Scan HEARTBEAT.md for monitoring checklist. Search Supermemory for relevant cross-agent context. Query domain-specific memory banks based on current task. Read today's and yesterday's daily notes. After this phase, the agent has full situational awareness and is ready to operate.
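The five phases can be captured as an ordered data structure so the sequence is enforced, not remembered. A sketch with a stub loader; in a real workspace each source would be read from the centralised store or local disk.

```python
# The five boot phases in their required order. Identity comes first because
# everything after it is interpreted through who the agent is.
BOOT_PHASES = [
    ("identity",  ["SOUL.md", "IDENTITY.md", "ROLE.md"]),
    ("humans",    ["USER.md", "ESCALATION.md"]),
    ("rules",     ["COMMS.md", "regressions.md", "decisions.md"]),
    ("state",     ["state.md", "TASKQUEUE.md"]),
    ("awareness", ["HEARTBEAT.md", "supermemory", "daily notes"]),
]

def boot(load=lambda source: f"<{source}>") -> dict:
    """Run all phases in order; each phase builds on the context before it."""
    context = {}
    for phase, sources in BOOT_PHASES:
        context[phase] = [load(s) for s in sources]
    return context

ctx = boot()  # by the final phase, the agent has full situational awareness
```

Because `BOOT_PHASES` is a list rather than a set, no refactor can accidentally load rules before identity or state before context.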
HEARTBEAT.md is a monitoring checklist. It defines what systems the agent should check, how often, and what "healthy" looks like for each one. Without it, agents only react to problems after humans notice them. With it, agents detect anomalies proactively and flag them before they escalate.
Each entry in HEARTBEAT.md defines: the system to check, the check frequency, what a healthy state looks like, and what action to take when something is wrong. Checks range from simple (is the API responding?) to complex (has churn rate exceeded threshold for 3 consecutive days?).
Not all checks are equal. Critical system checks (API health, data sync status) run multiple times per day. Analytical checks (metric trends, churn patterns) run daily. Strategic checks (programme health, team capacity) run weekly. The frequency matches the speed at which problems in that area can escalate.
| Tier | Examples | Frequency |
|---|---|---|
| Critical | API health, sync status, error rates | Every 4-6 hours |
| Analytical | Metric trends, churn, engagement | Daily |
| Strategic | Programme health, capacity, roadmap | Weekly |
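A HEARTBEAT.md entry maps naturally onto a small record: system, frequency, healthy-state predicate, failure action. The sketch below is illustrative; the thresholds, metric names, and actions are assumptions, not the real checklist.

```python
# One entry per monitored system, mirroring the tier table above.
HEARTBEAT = [
    {"system": "api_health",  "every_hours": 4,
     "healthy": lambda m: m["error_rate"] < 0.01,
     "on_fail": "page on-call exec"},
    {"system": "churn_trend", "every_hours": 24,
     "healthy": lambda m: m["churn_3d"] < 0.05,
     "on_fail": "flag YELLOW to sales channel"},
]

def run_due_checks(metrics: dict, last_run: dict, now: float) -> list:
    """Run every check whose interval has elapsed; return actions for failures."""
    actions = []
    for entry in HEARTBEAT:
        due = now - last_run.get(entry["system"], 0) >= entry["every_hours"] * 3600
        if due and not entry["healthy"](metrics[entry["system"]]):
            actions.append((entry["system"], entry["on_fail"]))
    return actions

metrics = {"api_health": {"error_rate": 0.03}, "churn_trend": {"churn_3d": 0.02}}
actions = run_due_checks(metrics, last_run={}, now=10**9)
# actions -> [("api_health", "page on-call exec")]
```

Note that the check frequency lives with the check itself, so critical and strategic checks can share one runner while escalating at different speeds.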
Cron jobs are the backbone of autonomous operation. They run scheduled tasks without human initiation: morning briefings, data syncs, report generation, memory maintenance, system health checks. At scale, a single agent can run 50+ scheduled tasks. Managing them requires structure.
Every repeatable task that runs on a schedule should be a cron job. If a human has to remember to ask an agent to do something every day, that is a missing cron. The goal is zero manual triggers for routine operational work. Humans should only interact with agents for decisions, strategy, and novel problems.
| Time | Task | Description |
|---|---|---|
| 06:00 | Morning Brief | Compile overnight events, metrics, and priorities. Deliver to exec team channels. |
| 08:00 | Data Sync | Pull latest from GHL, Stripe, Circle. Update CDP tables. Flag anomalies. |
| 12:00 | Midday Check | Run critical heartbeat checks. Flag any issues that appeared since morning. |
| 17:00 | EOD Digest | Summarise the day. Update state files. Write Obsidian daily log. |
| 22:00 | Dreaming Cycle | Memory consolidation. Prune stale entries. Sync to Supermemory. Run maturity score. |
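The daily schedule above is just data, which is worth encoding as such. A minimal dispatcher sketch; in production you would hand this to OS cron or a proper scheduler rather than polling, and the job names here simply mirror the table.

```python
# Time of day -> job, mirroring the example cron table.
SCHEDULE = {
    "06:00": "morning_brief",
    "08:00": "data_sync",
    "12:00": "midday_check",
    "17:00": "eod_digest",
    "22:00": "dreaming_cycle",
}

def jobs_due(hhmm: str) -> list:
    """Return the jobs scheduled for this exact HH:MM tick."""
    return [job for at, job in SCHEDULE.items() if at == hhmm]
```

Keeping the schedule declarative means "is there a missing cron?" becomes a review of one table instead of an archaeology dig through scripts.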
Just as human sleep consolidates memories and prunes unnecessary neural connections, the dreaming cycle consolidates agent memory and prunes stale information. It runs during off-peak hours and keeps the entire memory architecture healthy.
The agent reads all daily notes since the last dreaming cycle. It identifies key learnings, decisions made, regressions logged, and context that matters for future sessions. Important information gets promoted: learnings go to MEMORY.md, cross-agent knowledge goes to Supermemory, decision context goes to the appropriate state file.
At the same time, the agent identifies stale entries. A note about a bug that was fixed three weeks ago does not need to stay in active memory. Old task context for completed projects gets archived. The goal is a lean, current MEMORY.md that loads fast and contains only what the agent needs.
At the end of each dreaming cycle, the agent runs a self-assessment. How complete is the memory system? How fresh are the files? Are all three layers in sync? The result is a memory maturity score that tracks the health of the entire knowledge architecture over time.
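One way to sketch the freshness dimension of that score: grade each memory file by how recently it was updated and average the result. The one-week window and equal weighting are assumptions for illustration; a fuller score would also weigh completeness and cross-layer consistency.

```python
import time

def maturity_score(files: dict, now: float) -> float:
    """Average freshness across memory files.

    1.0 means updated just now; 0.0 means a week or more stale. A healthy
    system lands at 0.8+, matching the 80% threshold above.
    """
    week = 7 * 24 * 3600
    scores = [max(0.0, 1.0 - (now - updated_at) / week)
              for updated_at in files.values()]
    return sum(scores) / len(scores)

now = time.time()
score = maturity_score({
    "MEMORY.md":      now - 3600,        # updated an hour ago: near-perfect
    "regressions.md": now - 2 * 86400,   # two days old: still fine
    "state.md":       now - 6 * 86400,   # six days stale: drags the score down
}, now)
```

A single stale state file pulls the whole score below the healthy threshold, which is exactly the signal the dreaming cycle needs to prioritise that file.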
Token budgets are finite. Context windows have limits. When an agent is running at capacity and needs to free up resources, it sheds load in a specific order. The least critical tasks get dropped first. The most critical tasks are never dropped. This order is defined in advance, not decided in the moment.
Under pressure, agents make bad priority decisions. If an agent is running out of context window mid-task, it will drop whatever seems least important in that moment, which might be exactly the wrong thing. Pre-defined shedding order removes the guesswork. When resources are tight, the agent follows the list. No judgment calls under pressure.
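The pre-defined order is, again, just a list. A sketch with invented task names: the shedding order and the protected set are illustrative, and the mechanism is what matters, namely that the agent walks the list instead of judging under pressure.

```python
# Least critical first. When resources are tight the agent drops from the
# front of this list; protected tasks at the tail are never dropped.
SHED_ORDER = [
    "verbose_logging",
    "speculative_research",
    "digest_batching",
    "analytical_checks",
    "critical_heartbeats",   # protected
    "active_human_request",  # protected
]
PROTECTED = {"critical_heartbeats", "active_human_request"}

def shed(active: list, needed_slots: int) -> list:
    """Drop sheddable tasks in the predefined order until enough slots free up."""
    kept = list(active)
    for task in SHED_ORDER:
        if len(kept) <= len(active) - needed_slots:
            break
        if task in kept and task not in PROTECTED:
            kept.remove(task)
    return kept
```

Even when the agent cannot free enough capacity, the protected tasks survive; running degraded beats dropping a critical heartbeat.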
AI agents that can write code but cannot communicate properly are useless in a team. At our scale, agents interact with executives, team members, and each other across multiple platforms. Without communication architecture, agents send the wrong format to the wrong platform, overshare in group chats, and flood channels with noise.
Agents send markdown tables to WhatsApp (which does not render them). They respond to every message in group chats, even casual banter. They send long-form reports to Slack DMs when the exec wanted a one-liner. They CC everyone on everything. Urgent issues get batched with routine updates. The noise-to-signal ratio makes the agent's output worthless.
Agents format output correctly for each platform. They know when to speak and when to stay silent in group chats. Urgent issues get immediate, dedicated notifications. Routine updates get batched into digests. The exec team trusts agent output because it arrives in the right format, at the right time, in the right channel.
Different channels serve different purposes. Routing the right information to the right channel is as important as the information itself. An urgent issue posted to Slack's general channel is noise; a casual update sent to the CEO's WhatsApp is overreach. The routing rules eliminate both.
- **Slack** — Primary team communication. Full markdown supported. Use threads for discussions. Channel-specific topics. Team-visible work and decisions.
- **WhatsApp** — Urgent and personal communication. Direct to exec. No markdown tables. Text and files must be sent as separate messages. Keep it brief.
- **Email** — External communication and formal updates. Longer form acceptable. Attachments supported. Use for client-facing or cross-company comms.
- **Telegram** — Agent-to-human operational channel. Quick commands, status checks, task updates. Supports inline buttons for approvals and actions.
| Scenario | Channel | Format |
|---|---|---|
| RED issue | WhatsApp (direct to relevant exec) | Short, clear, actionable. No batching. |
| YELLOW issue | Slack (relevant channel or DM) | Recommendation + context. Thread for discussion. |
| GREEN FYI | Slack (channel) or daily digest | Brief update. Batch with other FYIs when possible. |
| Morning briefing | WhatsApp (voice note) + Slack (written) | Voice note summary + full written brief. |
| Team update | Slack (team channel) | Threaded, formatted, tagged for relevant people. |
| External partner | Email | Professional, complete, attachments as needed. |
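The severity rows of the routing table reduce to a small lookup. This is a simplified sketch: the channel strings and the default fallback are assumptions, while the RED/YELLOW/GREEN labels come from the table itself.

```python
# Severity-to-channel routing, following the table above.
ROUTES = {
    "RED":    ("whatsapp_direct", "short, actionable, no batching"),
    "YELLOW": ("slack_channel",   "recommendation + context, threaded"),
    "GREEN":  ("slack_or_digest", "brief, batched with other FYIs"),
}

def route(severity: str) -> tuple[str, str]:
    """Return (channel, format) for an issue; unknown severities batch safely."""
    return ROUTES.get(severity, ("slack_or_digest", "brief, batched"))
```

The useful property is the safe default: anything the agent cannot classify lands in a batched digest rather than pinging an exec directly.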
Each platform has formatting quirks that agents must respect. Sending markdown tables to WhatsApp produces unreadable garbage. Using headers in a platform that does not render them creates visual noise. These rules are non-negotiable.
If you are unsure whether a format renders correctly on a platform, default to plain text with bullet points. Bullets render correctly everywhere. Tables, headers, and code blocks do not. When in doubt, simplify.
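The "default to bullets" rule can be automated. Below is an illustrative fallback that flattens a markdown table into bullet lines for platforms, like WhatsApp, that do not render tables; the parsing is deliberately minimal and assumes a well-formed pipe table.

```python
def table_to_bullets(md_table: str) -> str:
    """Flatten a markdown pipe table into plain bullet lines."""
    # Keep rows with real content; drop separator rows like |---|---|.
    rows = [r for r in md_table.strip().splitlines()
            if not set(r.replace("|", "").strip()) <= {"-"}]
    header, *body = [[c.strip() for c in r.strip("|").split("|")] for r in rows]
    lines = []
    for row in body:
        pairs = ", ".join(f"{h}: {v}" for h, v in zip(header, row))
        lines.append(f"- {pairs}")
    return "\n".join(lines)
```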
SOUL.md defines the company voice. IDENTITY.md adds the individual agent's personality on top. Together, they create communication that is consistent across the squad but distinct per agent. The voice rules are non-negotiable. The tone modifiers are what make each agent unique.
These rules apply to every agent in the squad. They define the floor, not the ceiling.
These modifiers are specific to each agent, layered on top of the company voice.
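The layering can be pictured as a dictionary merge where the company voice always wins on conflict. The keys and values below are invented for illustration; the point is the precedence, which matches the text: company rules are the non-negotiable floor, modifiers only add on top.

```python
# Hypothetical SOUL.md voice rules; every key here is non-negotiable.
COMPANY_VOICE = {"tone": "direct", "hedging": "minimal", "emoji": "never"}

def effective_voice(agent_modifiers: dict) -> dict:
    """Layer IDENTITY.md modifiers under SOUL.md rules.

    Company rules override any conflicting modifier; modifiers only
    contribute traits the company voice leaves unset.
    """
    return {**agent_modifiers, **COMPANY_VOICE}
```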
Agents in group chats are the most common source of communication noise. Without rules, agents respond to every message, repeat what others already said, and flood the chat with unnecessary commentary. The rules below determine when an agent speaks and when it stays silent.
Agents have access to executive-level information: financials, strategy discussions, personnel decisions, and sensitive operational data. That access does not mean that information should be shared in group chats. Even if someone asks a question that the agent could answer with confidential data, the agent stays silent or provides only the publicly appropriate portion. When in doubt, respond privately to the person who asked.
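The disclosure rule above amounts to a small gate. The sensitivity labels and return values are assumptions for illustration; the invariant is the one the text states: confidential material never lands in a group chat.

```python
def disclosure_action(sensitivity: str, in_group: bool) -> str:
    """Decide where an answer may go; confidential data never hits a group."""
    if sensitivity == "public":
        return "answer_in_channel"
    # When in doubt, respond privately to the person who asked.
    return "reply_privately" if in_group else "answer_direct"
```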
Agents do not operate in isolation. They hand off work, share context, and coordinate on cross-domain tasks. Without communication protocols between agents, handoffs lose context, work gets duplicated, and domains get confused. Squad communication rules ensure clean handoffs and clear ownership.
When one agent routes work to another, the handoff must include: what the task is, why it is being routed, all relevant context (do not assume the receiving agent has it), what "done" looks like, and any constraints or regressions that apply. A handoff without context is a handoff that gets done wrong.
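The required handoff fields map naturally onto a record type. This is a sketch, not a real squad protocol schema; the field names mirror the five requirements listed above.

```python
from dataclasses import dataclass, field

@dataclass
class Handoff:
    task: str            # what the task is
    reason: str          # why it is being routed
    context: str         # everything the receiver needs; assume nothing
    done_criteria: str   # what "done" looks like
    constraints: list[str] = field(default_factory=list)  # regressions, limits

    def is_complete(self) -> bool:
        """A handoff missing any required field gets done wrong."""
        return all([self.task, self.reason, self.context, self.done_criteria])
```

A routing agent would refuse to send any handoff where `is_complete()` is false.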
Squad communication is a loop, not a broadcast. Agent A sends a handoff. Agent B acknowledges receipt and confirms understanding. Agent B completes the work and reports back. Agent A verifies the output. If the output does not meet the brief, Agent A sends corrections and Agent B iterates. This loop prevents the "fire and forget" pattern where work gets routed and nobody checks if it landed correctly.
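The acknowledge-work-verify loop can be sketched as a bounded iteration. `do_work` and `verify` stand in for the receiving and sending agents; the retry cap and the escalation path are assumptions the document does not specify.

```python
def handoff_loop(brief: str, do_work, verify, max_rounds: int = 3):
    """Route work, verify the output, iterate until it meets the brief."""
    feedback = None
    for _ in range(max_rounds):
        output = do_work(brief, feedback)   # Agent B completes the work
        ok, feedback = verify(output)       # Agent A checks it landed right
        if ok:
            return output                   # loop closed, not fire-and-forget
    raise RuntimeError("handoff did not converge; escalate")
```

The `raise` at the bottom is the opposite of fire-and-forget: if iteration fails, the failure is loud instead of silent.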