Architecture

How Cotask's Agent Works

One agentic loop. Five context layers. Multi-LLM. Self-host or hosted.

Self-Host or Hosted

Cotask runs in two configurations. The hosted version is multi-tenant, audited, and production-hardened. The open-source version on GitHub is single-tenant by design and runs on your own infrastructure.

git clone https://github.com/cotask cd cotask && docker compose up open http://localhost:3000

The OSS harness is the same agent loop, the same tools, the same skill format you see described below. The hosted product adds multi-tenant deployment, per-user audit, billing infrastructure, and a production-tuned skill catalog.

Multi-LLM by Design

Bring Any Model

The agent loop calls a provider-neutral interface. Switch models per workspace, per skill, or per request.

ProviderRoutingNotes
Kimi, GLM, DeepSeek, QwenOpenRouterPin specific provider + quantization for routing control
Anthropic ClaudeDirect API or OpenRouterCompatible with Anthropic's published skills
OpenAI GPTDirect API or OpenRouterTool-calling supported
Google GeminiDirect API or OpenRouterLong-context redlining
Llama (open-weights)OpenRouter or self-hostedDeployable on your own infrastructure
Local (Ollama)OpenAI-compatible APIAir-gapped deployment supported

The Agentic Loop

Unlike simple chatbots that respond in one shot, Cotask runs an agentic loop. The LLM receives your message along with a set of tools and autonomously decides which tools to call, in what order, and how many times. It keeps working until the task is complete.

User Message ↓ ┌───────────────────────────────────────┐ │ System Prompt (5 context layers) │ │ + Available Tools (filtered by skill)│ │ + Conversation History │ └───────────────────────────────────────┘ ↓ ┌───────────────────────────────────────┐ │ LLM decides: respond or use a tool │ │ │ │ → If tool call: execute tool, │ │ feed result back, loop again │ │ → If text: stream response to user │ └───────────────────────────────────────┘ ↓ Workspace Mode: Editor updates + side chat Chat Mode: Full-width chat + artifact cards

The agent might read a document, search the web, build a spreadsheet, and draft a slide deck, all from a single user message. You see each step as numbered progress items in the chat.

Progressive Disclosure

Five Context Layers

The system prompt is assembled from five layers, each adding more context. You control layers 1–3. The platform handles the rest.

LayerNameSourceWho Controls It
L0Platform Promptsystem_prompt.mdCotask (hardcoded)
L1agents.mdYour agents.md fileYou
L2Skill InstructionsSKILL.md bodyYou (customizable)
L3Playbookplaybook.mdYou
L4Session ContextDocuments, active file, conversationAutomatic

agents.md: Your Agent Configuration

The agents.mdfile is your personal configuration for the AI agent. It sits at the top of the workspace file tree and is injected into every conversation as Layer 1 context. Think of it as “who the agent is” for your workspace.

# Agent Configuration ## Identity You are a writing and research assistant for our product team. ## Preferences - Default language: English - Tone: concise, professional, no filler - Citation format: [[N]](URL) with source name - Use our brand colors in slide decks ## House Style - Headings in sentence case - Prefer short paragraphs and bullet lists - Spell out acronyms on first use ## Defaults - Spreadsheets: include a summary row - Decks: title slide + agenda + one idea per slide

Edit this file anytime from the workspace sidebar. Changes take effect on the next message. The agent will follow your preferences, match your house style, and focus on the things you care about.

How the Layers Interplay

When you type a message, the system assembles the prompt from all applicable layers:

Example: User types "/deck from this report" System prompt assembled: ├── L0: Platform prompt (anti-hallucination rules, │ citation format, tool usage patterns) ├── L1: agents.md ("concise tone, use brand colors, │ one idea per slide") ├── L2: deck SKILL.md body ("read the source, outline │ the slides, generate the presentation...") ├── L3: playbook.md ("Decks: title + agenda first, │ summary slide last, max 6 bullets per slide") └── L4: Session context ├── Documents: Q2_Report.docx (DOCX, 24KB) ├── Active document: Q2_Report.docx └── Conversation history → The LLM now knows your STYLE (L1), HOW to build the deck (L2), WHAT conventions to follow (L3), and WHICH document to read (L4).

Why this matters: Every layer is a file you can read, edit, and version-control. No hidden prompts, no opaque configuration databases, no vendor lock-in. If you switch platforms, your agents.md, SKILL.md files, and playbook.md come with you.

Playbook: Your Conventions

The playbook (playbook.md) is where you define your team's or organization's standard conventions for common deliverables. The agent loads it as Layer 3 context when drafting, building, or editing.

# Standard Conventions ## Documents - Preferred: Sentence-case headings, short paragraphs - Acceptable: Numbered sections for long reports - Avoid: Walls of text, undefined acronyms ## Spreadsheets - Preferred: A summary row at the top, live formulas - Acceptable: Pivot tables for breakdowns - Avoid: Hard-coded values where a formula fits ## Slide Decks - Preferred: Title + agenda first, summary last - Acceptable: Up to 6 bullets per slide - Avoid: More than one main idea per slide

This is a plain markdown file. Define it once, and every draft, spreadsheet, and deck will follow your conventions consistently. Share it across your team by copying a file.

Open Source Tools

The tools the agent uses (document reading and editing, spreadsheets, presentations, image generation, web search) are all open source on GitHub. You can audit exactly how your files are processed.

CategoryTools
Document Managementlist_documents, read_document, create_document, edit_document
Spreadsheets & Deckscreate_spreadsheet, edit_spreadsheet, create_presentation, edit_presentation
Imagesgenerate_image, edit_image
Web Researchweb_search, web_fetch
DOCX Track Changesaccept_revisions, reject_revisions, get_revision_stats, export_docx

Skills define which tools the agent can access. The /draft skill gets document tools; the /research skill gets web tools. This keeps the agent focused and efficient.

See It in Action

Start a free trial and explore the workspace file tree. Edit agents.md, customize your playbook, activate skills.