System Overview

GetThatQuick is a self-hosted, Docker-packaged desktop AI assistant that combines prompt templates, multi-provider LLM chat, and offline speech-to-text in a single container.

Architecture Diagram

┌─────────────────── Docker Container ───────────────────┐
│                                                        │
│  ┌───────────── Bun Server (Hono) ──────────────┐      │
│  │                                              │      │
│  │   REST API      WebSocket     Static SPA     │      │
│  │   /api/*        /ws/stt       React + Vite   │      │
│  │      │              │                        │      │
│  │      ▼              ▼                        │      │
│  │   Services      Vosk FFI                     │      │
│  │   (LLM, etc.)   (bun:ffi)                    │      │
│  │                                              │      │
│  └──────┬──────────────┬────────────────────────┘      │
│         │              │                               │
│         ▼              ▼                               │
│   ~/getthatquick/     libvosk.so + models              │
│   (bind mount)                                         │
│                                                        │
└───────────────────────────┬────────────────────────────┘
                            │
                ┌───────────┴───────────┐
                ▼                       ▼
          Browser (SPA)          LLM Providers
                                 (OpenAI, Ollama, etc.)

UI Model

GetThatQuick uses a ChatGPT-like single-page layout consisting of:

  • Left sidebar — session list, new chat button, navigation
  • Chat area — message thread with streaming responses, input bar at the bottom
  • Right sidebar — template editor, model/config panels

The interface is designed to feel instantly familiar to anyone who has used ChatGPT, while adding template-driven prompt engineering on top.

Routes

Route        Component    Purpose
/            Dashboard    Main chat interface with sidebar + chat area
/setup       Onboarding   First-run setup wizard (API keys, model selection)
/settings    Settings     Configuration overlay for providers, STT, UI
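
The route table above can be sketched as a typed map. The component names and purposes come from the table; the lookup helper and its fallback-to-dashboard behavior are illustrative assumptions, not the app's actual router code:

```typescript
// Sketch only: the route table as data. Component names match the table
// above; the resolver and its fallback are hypothetical.
type RouteEntry = { component: string; purpose: string };

const routes: Record<string, RouteEntry> = {
  "/": { component: "Dashboard", purpose: "Main chat interface" },
  "/setup": { component: "Onboarding", purpose: "First-run setup wizard" },
  "/settings": { component: "Settings", purpose: "Configuration overlay" },
};

// Resolve a path to its component, falling back to the dashboard.
function resolveRoute(path: string): string {
  return (routes[path] ?? routes["/"]).component;
}
```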

Data Storage

All data is filesystem-based — no database required. Everything lives under ~/getthatquick/:

Data            Format                        Location
Chat sessions   JSON                          ~/getthatquick/sessions/
Templates       Markdown + YAML frontmatter   ~/getthatquick/templates/
Settings        JSON                          ~/getthatquick/settings.json
Vosk models     Binary                        ~/getthatquick/models/vosk/
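
Since templates are Markdown files with YAML frontmatter, their on-disk layout can be illustrated with a minimal splitter. The `name` field and the parser itself are hypothetical (the actual template schema isn't documented here), and a real implementation would use a proper YAML parser:

```typescript
// Minimal frontmatter splitter: returns the header fields and Markdown body.
// Sketch only; handles flat `key: value` lines, which is enough to show the
// file layout used under ~/getthatquick/templates/.
function splitFrontmatter(text: string): { meta: Record<string, string>; body: string } {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text);
  if (!match) return { meta: {}, body: text };
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, body: match[2] };
}

// Hypothetical template file as it might live on disk:
const example = `---
name: Code Reviewer
---
You are a meticulous code reviewer.`;
```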

Single Container Architecture

The entire application ships as a single Docker container:

  • The Bun server handles REST API routes, WebSocket connections, and serves the pre-built static SPA — all from one process.
  • Vosk's native libvosk.so is loaded via bun:ffi directly in the server process — no sidecar or microservice needed.
  • A single bind mount at ~/getthatquick/ provides persistent storage for sessions, templates, settings, and STT models.
  • The browser connects to the container on a single port for everything: API calls, WebSocket audio streaming, and the UI itself.
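
The single-port design above can be sketched as a path classifier. The `/api/*` and `/ws/stt` paths come from the architecture diagram; the dispatch function itself is an illustrative assumption, not the actual server code:

```typescript
type Handler = "rest" | "websocket" | "static";

// Classify a request path the way the single Bun/Hono process routes it:
// /api/* goes to REST handlers, /ws/stt upgrades to a WebSocket for audio
// streaming, and everything else falls through to the built SPA.
function classify(path: string): Handler {
  if (path.startsWith("/api/")) return "rest";
  if (path === "/ws/stt") return "websocket";
  return "static";
}
```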

Key Architectural Decisions

#    Decision                                Rationale
1    Single Docker container                 Simplest possible deployment for a desktop tool — one docker run command
2    ChatGPT-like UI                         Familiar mental model reduces onboarding friction
3    ~/getthatquick/ data path               Predictable, user-accessible location outside the container
4    Templates = system prompts              Templates map directly to the system prompt role in LLM APIs
5    JSON sessions                           Human-readable, no ORM, trivially portable
6    Markdown + YAML frontmatter templates   Authorable in any text editor, version-controllable
7    Bun runtime                             Fast startup, native TypeScript, built-in FFI support
8    bun:ffi for Vosk                        Zero-overhead native calls without a C++ addon build step
9    Session-scoped model loading            Vosk recognizers are created per WebSocket session, freeing memory on disconnect
10   Bind mounts                             User data survives container rebuilds; editable from the host
11   Server-side LLM proxy                   API keys stay on the server; client never touches provider APIs directly
12   Hono framework                          Lightweight, Bun-native, supports REST + WebSocket in one router
13   Multi-arch Docker                       Supports both amd64 and arm64 for broad desktop compatibility
14   Filesystem storage over DB              No migrations, no connection strings — just files
15   OpenAI SDK as universal client          Works with any OpenAI-compatible API (Ollama, OpenRouter, LM Studio)
16   SSE streaming                           Token-by-token delivery for responsive chat UX
17   Shared types via monorepo               Single source of truth for API contracts between client and server
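
The SSE streaming decision can be illustrated with a sketch that extracts token deltas from `data:` lines. The `choices[0].delta.content` shape and the `[DONE]` sentinel follow the OpenAI streaming convention the app targets; the helper is an assumption for illustration, not the app's actual parser:

```typescript
// Pull token text out of a chunk of SSE lines in the OpenAI-compatible
// streaming format. Each event is a `data: {...}` line; the stream ends
// with `data: [DONE]`.
function extractTokens(sseChunk: string): string[] {
  const tokens: string[] = [];
  for (const line of sseChunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") break;
    const delta = JSON.parse(payload)?.choices?.[0]?.delta?.content;
    if (typeof delta === "string") tokens.push(delta);
  }
  return tokens;
}
```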