Architecture

How Cotask's Agent Works

One agentic loop. Five context layers. Multi-LLM. Self-host or hosted.

Self-Host or Hosted

Cotask runs in two configurations. The hosted version is multi-tenant, audited, and production-hardened. The open-source version on GitHub is single-tenant by design and runs on your own infrastructure.

git clone https://github.com/cotask
cd cotask && docker compose up
open http://localhost:3000

The OSS harness is the same agent loop, the same tools, the same skill format you see described below. The hosted product adds multi-tenant deployment, per-user audit, billing infrastructure, and a production-tuned skill catalog.

Multi-LLM by Design

Bring Any Model

The agent loop calls a provider-neutral interface. Switch models per workspace, per skill, or per request.

Provider	Routing	Notes
Kimi, GLM, DeepSeek, Qwen	OpenRouter	Pin specific provider + quantization for routing control
Anthropic Claude	Direct API or OpenRouter	Compatible with Anthropic's published skills
OpenAI GPT	Direct API or OpenRouter	Tool-calling supported
Google Gemini	Direct API or OpenRouter	Long-context redlining
Llama (open-weights)	OpenRouter or self-hosted	Deployable on your own infrastructure
Local (Ollama)	OpenAI-compatible API	Air-gapped deployment supported

The Agentic Loop

Unlike simple chatbots that respond in one shot, Cotask runs an agentic loop. The LLM receives your message along with a set of tools and autonomously decides which tools to call, in what order, and how many times. It keeps working until the task is complete.

User Message
    ↓
┌───────────────────────────────────────┐
│  System Prompt (5 context layers)     │
│  + Available Tools (filtered by skill)│
│  + Conversation History               │
└───────────────────────────────────────┘
    ↓
┌───────────────────────────────────────┐
│  LLM decides: respond or use a tool  │
│                                       │
│  → If tool call: execute tool,        │
│    feed result back, loop again       │
│  → If text: stream response to user   │
└───────────────────────────────────────┘
    ↓
Workspace Mode: Editor updates + side chat
Chat Mode: Full-width chat + artifact cards

The agent might read a document, search the web, build a spreadsheet, and draft a slide deck, all from a single user message. You see each step as numbered progress items in the chat.

Progressive Disclosure

Five Context Layers

The system prompt is assembled from five layers, each adding more context. You control layers 1–3. The platform handles the rest.

Layer	Name	Source	Who Controls It
L0	Platform Prompt	`system_prompt.md`	Cotask (hardcoded)
L1	agents.md	Your `agents.md` file	You
L2	Skill Instructions	`SKILL.md` body	You (customizable)
L3	Playbook	`playbook.md`	You
L4	Session Context	Documents, active file, conversation	Automatic

agents.md: Your Agent Configuration

The agents.mdfile is your personal configuration for the AI agent. It sits at the top of the workspace file tree and is injected into every conversation as Layer 1 context. Think of it as “who the agent is” for your workspace.

# Agent Configuration

## Identity
You are a writing and research assistant for our product team.

## Preferences
- Default language: English
- Tone: concise, professional, no filler
- Citation format: [[N]](URL) with source name
- Use our brand colors in slide decks

## House Style
- Headings in sentence case
- Prefer short paragraphs and bullet lists
- Spell out acronyms on first use

## Defaults
- Spreadsheets: include a summary row
- Decks: title slide + agenda + one idea per slide

Edit this file anytime from the workspace sidebar. Changes take effect on the next message. The agent will follow your preferences, match your house style, and focus on the things you care about.

How the Layers Interplay

When you type a message, the system assembles the prompt from all applicable layers:

Example: User types "/deck from this report"

System prompt assembled:
├── L0: Platform prompt (anti-hallucination rules,
│       citation format, tool usage patterns)
├── L1: agents.md ("concise tone, use brand colors,
│       one idea per slide")
├── L2: deck SKILL.md body ("read the source, outline
│       the slides, generate the presentation...")
├── L3: playbook.md ("Decks: title + agenda first,
│       summary slide last, max 6 bullets per slide")
└── L4: Session context
    ├── Documents: Q2_Report.docx (DOCX, 24KB)
    ├── Active document: Q2_Report.docx
    └── Conversation history

→ The LLM now knows your STYLE (L1),
  HOW to build the deck (L2), WHAT conventions to
  follow (L3), and WHICH document to read (L4).

Why this matters: Every layer is a file you can read, edit, and version-control. No hidden prompts, no opaque configuration databases, no vendor lock-in. If you switch platforms, your agents.md, SKILL.md files, and playbook.md come with you.

Playbook: Your Conventions

The playbook (playbook.md) is where you define your team's or organization's standard conventions for common deliverables. The agent loads it as Layer 3 context when drafting, building, or editing.

# Standard Conventions

## Documents
- Preferred: Sentence-case headings, short paragraphs
- Acceptable: Numbered sections for long reports
- Avoid: Walls of text, undefined acronyms

## Spreadsheets
- Preferred: A summary row at the top, live formulas
- Acceptable: Pivot tables for breakdowns
- Avoid: Hard-coded values where a formula fits

## Slide Decks
- Preferred: Title + agenda first, summary last
- Acceptable: Up to 6 bullets per slide
- Avoid: More than one main idea per slide

This is a plain markdown file. Define it once, and every draft, spreadsheet, and deck will follow your conventions consistently. Share it across your team by copying a file.

Open Source Tools

The tools the agent uses (document reading and editing, spreadsheets, presentations, image generation, web search) are all open source on GitHub. You can audit exactly how your files are processed.

Category	Tools
Document Management	`list_documents`, `read_document`, `create_document`, `edit_document`
Spreadsheets & Decks	`create_spreadsheet`, `edit_spreadsheet`, `create_presentation`, `edit_presentation`
Images	`generate_image`, `edit_image`
Web Research	`web_search`, `web_fetch`
DOCX Track Changes	`accept_revisions`, `reject_revisions`, `get_revision_stats`, `export_docx`

Skills define which tools the agent can access. The /draft skill gets document tools; the /research skill gets web tools. This keeps the agent focused and efficient.

See It in Action

Start a free trial and explore the workspace file tree. Edit agents.md, customize your playbook, activate skills.

Start Free Trial View Skills →