How Canoryn Works

Canoryn is built as a three-stage pipeline that connects system inputs, AI reasoning, and native action execution.

1. Input & Context (Triggers)

Canoryn uses multimodal input to understand context:

Vision Loop: Periodically captures screen state (text, UI elements, images).
Audio Stream: Listens for wake words ("Hey Canoryn") and voice commands.
System Events: Monitors file changes, app launches, and hotkeys.
Clipboard: Watches for copied text or images.
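As a rough illustration of how these input sources could be normalized into a single trigger type, the sketch below models the four sources and pulls a command out of a transcript that contains the wake word. The names `TriggerSource`, `Trigger`, `extract_command`, and `audio_trigger` are hypothetical and do not come from Canoryn itself; the real wake-word detection presumably works on audio rather than on text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TriggerSource(Enum):
    """The four input sources listed above (names are illustrative)."""
    VISION = "vision"
    AUDIO = "audio"
    SYSTEM = "system"
    CLIPBOARD = "clipboard"


@dataclass(frozen=True)
class Trigger:
    source: TriggerSource
    payload: str


def extract_command(transcript: str, wake_word: str = "hey canoryn") -> Optional[str]:
    """Return the text spoken after the wake word, or None if it is absent."""
    index = transcript.lower().find(wake_word)
    if index == -1:
        return None
    # Drop the wake word itself plus any surrounding punctuation.
    command = transcript[index + len(wake_word):].strip(" ,.!?")
    return command or None


def audio_trigger(transcript: str) -> Optional[Trigger]:
    """Wrap a recognized voice command in a Trigger, if there is one."""
    command = extract_command(transcript)
    if command is None:
        return None
    return Trigger(TriggerSource.AUDIO, command)
```

A transcript without the wake word, or with nothing after it, yields no trigger, so downstream stages only see actionable input.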

How Canoryn processes workflows:

┌─────────────────────────────────────────────────────────┐
│  INPUTS & TRIGGERS    AI REASONING      SYSTEM ACTIONS  │
│  ─────────────────    ────────────      ──────────────  │
│  See & Hear           Think & Decide    Do & Execute    │
└─────────────────────────────────────────────────────────┘

1. Inputs & Context

How Canoryn captures system events and user intent:

Input Type	Implementation
Vision	Screen analysis, UI element detection
Hearing	Voice commands, audio context
Context	Active app, time, location
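The Context row above (active app, time, location) could be carried alongside each input as a small immutable record. This is only a sketch: `ContextSnapshot` and its fields are assumed names, and how Canoryn actually represents context is not documented here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ContextSnapshot:
    """Hypothetical bundle of the context fields from the table above."""
    active_app: str
    timestamp: datetime
    location: Optional[str] = None  # location may be unavailable

    def describe(self) -> str:
        """Render the snapshot as a short string, e.g. for an LLM prompt."""
        parts = [f"app={self.active_app}", f"time={self.timestamp:%H:%M}"]
        if self.location:
            parts.append(f"location={self.location}")
        return ", ".join(parts)
```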

2. Reasoning & Logic (AI Processing)

How Canoryn processes instructions and decides next steps:

Component	Function
LLM Reasoning	Natural language understanding (NLU)
Blueprint Logic	Visual workflow execution
Memory Database	Retrieval of relevant past context
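To make the Memory Database row concrete, here is a minimal sketch of storing and recalling facts with SQLite. It makes several assumptions: the table layout and the function names `open_memory`, `remember`, and `recall` are invented for illustration, exact topic matching stands in for whatever retrieval Canoryn really uses, and Python's standard `sqlite3` module does not encrypt data.

```python
import sqlite3
from typing import List


def open_memory() -> sqlite3.Connection:
    """Create an in-memory database with a single, assumed table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memory (id INTEGER PRIMARY KEY, topic TEXT, fact TEXT)"
    )
    return conn


def remember(conn: sqlite3.Connection, topic: str, fact: str) -> None:
    """Store one fact under a topic."""
    conn.execute("INSERT INTO memory (topic, fact) VALUES (?, ?)", (topic, fact))


def recall(conn: sqlite3.Connection, topic: str, limit: int = 3) -> List[str]:
    """Return the most recent facts for a topic, newest first."""
    rows = conn.execute(
        "SELECT fact FROM memory WHERE topic = ? ORDER BY id DESC LIMIT ?",
        (topic, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Facts recalled this way could then be appended to the LLM prompt before it decides on the next step.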

3. Actions & Integrations (Execution)

How Canoryn runs workflows and controls system elements:

System	Capabilities
macOS APIs	App control, file management
Accessibility	UI automation, clicks, typing
Integrations	Spotify, Calendar, etc.
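One common way to route a decided action to the right capability is a registry of named handlers. The sketch below shows that pattern under stated assumptions: `ActionRegistry`, the action names `spotify.play` and `app.open`, and the handlers are all hypothetical, and the handlers only return descriptive strings instead of calling macOS APIs or Spotify.

```python
from typing import Callable, Dict

# A handler takes one string argument and returns a description of what it did.
ActionHandler = Callable[[str], str]


class ActionRegistry:
    """Maps action names to handlers (an illustrative pattern, not Canoryn's API)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, ActionHandler] = {}

    def register(self, name: str, handler: ActionHandler) -> None:
        self._handlers[name] = handler

    def run(self, name: str, argument: str) -> str:
        handler = self._handlers.get(name)
        if handler is None:
            raise KeyError(f"No handler registered for action '{name}'")
        return handler(argument)


registry = ActionRegistry()
# Stub handlers: a real integration would call the system or a service here.
registry.register("spotify.play", lambda playlist: f"Spotify: Play {playlist} playlist")
registry.register("app.open", lambda app: f"Open {app}")
```

Keeping handlers behind names means the reasoning stage only has to emit an action name and an argument, not know how each integration works.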

Data Flow

User speaks "Play something chill"
        │
        ▼
   ┌─────────┐
   │ TRIGGER │ ← Voice transcription
   └────┬────┘
        │
        ▼
   ┌─────────┐
   │ PROCESS │ ← LLM understands intent
   └────┬────┘   Memory recalls: "User likes Lo-fi"
        │
        ▼
   ┌─────────┐
   │ ACTION  │ ← Spotify: Play Lo-fi playlist
   └─────────┘
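The flow in the diagram can be traced end to end with a toy pipeline. Everything here is a stand-in: the `process` step uses a keyword rule in place of the LLM, memory is a plain dictionary with an assumed `music_preference` key, and `act` returns a description instead of controlling Spotify.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Intent:
    action: str
    argument: str


def trigger(transcript: str) -> str:
    """TRIGGER: normalize the transcribed voice command."""
    return transcript.strip().lower()


def process(command: str, memory: Dict[str, str]) -> Intent:
    """PROCESS: keyword rule standing in for the LLM, plus a memory lookup."""
    if "play" in command:
        genre = memory.get("music_preference", "Default")
        return Intent("spotify.play", genre)
    return Intent("none", "")


def act(intent: Intent) -> str:
    """ACTION: describe the action rather than calling a real integration."""
    if intent.action == "spotify.play":
        return f"Spotify: Play {intent.argument} playlist"
    return "No action"


def run_pipeline(transcript: str, memory: Dict[str, str]) -> str:
    """Chain the three stages in the order shown in the diagram."""
    return act(process(trigger(transcript), memory))
```

Calling `run_pipeline("Play something chill", {"music_preference": "Lo-fi"})` walks the same three boxes as the diagram and ends at the Lo-fi playlist action.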

Local-First Architecture

By default, everything runs on your Mac; cloud LLM providers are optional:

LLM: Ollama (local) or cloud providers (your choice)
Memory: SQLite database, encrypted
Processing: Native Swift, no web views
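The choice between a local and a cloud LLM could be expressed as a small configuration record with a check for whether requests stay on the device. This is a sketch under assumptions: `LLMConfig` and `stays_on_device` are hypothetical names, Canoryn's actual settings format is not shown in this document, and the default URL is simply Ollama's standard local address.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hostnames treated as "this machine" for the purpose of the check.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical provider settings; defaults point at a local Ollama server."""
    provider: str = "ollama"
    base_url: str = "http://localhost:11434"  # Ollama's default local address


def stays_on_device(config: LLMConfig) -> bool:
    """Return True when the configured endpoint resolves to the local machine."""
    host = urlparse(config.base_url).hostname
    return host in LOCAL_HOSTS
```

A check like this would let the app warn the user before any prompt is sent to a remote provider.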
