Cue

Under the Hood

How Cue is built — the tech choices, architecture decisions, and workflows behind a full-stack app developed almost entirely with AI.

The big picture

Cue is an event scheduling app that uses AI to help groups of friends coordinate plans through natural language. Someone proposes a few times, friends reply however they want — “Wednesday works but I’d prefer Thursday” — and Gemini figures out the best option.

Behind that simple idea sits a monorepo with six modules, written in four languages, running on personal servers connected to the world through Cloudflare Tunnels. No Kubernetes. No GitHub Actions. No cloud functions. Just Docker containers, a load balancer, and a Fish shell script.

If you read Norwegian, there’s a companion piece about the personal journey of building Cue: the Cue story.

Six modules, one repo

Everything lives in a single monorepo. This was a deliberate choice — it lets the AI agent see the full picture when working across boundaries, and keeps documentation, deploy scripts, and configuration in one place.

cue-core (Kotlin)

The brain. All business logic, database access, AI integration, authentication, push notifications, and the SSE broadcaster. Spring Boot 4 on JDK 25 with virtual threads.

cue-front-end (TypeScript)

The web client. Next.js 16 with App Router, React Server Components, Tailwind CSS 4, and a custom SSE hook for real-time updates. View layer only — no business logic.

cue-admin (TypeScript)

Admin dashboard for monitoring users, events, AI usage, and costs. Built with the same stack as the frontend, plus Recharts for analytics. Google OAuth only.

cue-ios (Swift)

Native iOS app built with SwiftUI. Distributed through the App Store. Supports push notifications, calendar sync, home screen widgets, and quick actions.

cue-cli (Go)

Command-line client built with Cobra and Charmbracelet for beautiful terminal UIs. Distributed via Homebrew with GoReleaser. Supports SSE streaming.

cue-core-api-test (Kotlin)

Integration test suite that runs against the deployed server. Verifies the API contract after every deploy — the safety net that catches regressions.

Four languages

Each module uses the language that fits its domain best. No universal compromise — just the right tool for each job.

Language     Used in          Lines
Kotlin       Backend, tests   ~10,800
TypeScript   Web, admin       ~8,400
Swift        iOS app          ~7,100
Go           CLI              ~3,900

On top of that: ~210 lines of SQL across 23 Flyway migrations, ~235 lines of Fish for deploy automation, and a pile of YAML/TOML for Docker Compose and GoReleaser configuration.

How requests flow

Every request to Cue travels the same path: from the client, through a Cloudflare Tunnel, into an nginx load balancer, and finally to one of two backend instances.

Request flow:
Client ─▸ Cloudflare Tunnel ─▸ nginx
|
├─cue-core-1:8080
└─cue-core-2:8080
|
├─PostgreSQL:5432
└─Valkey:6379

The nginx load balancer distributes traffic between two cue-core instances with automatic failover. If one goes down, the other picks up immediately. SSE connections get a 24-hour read timeout so they can stay alive for as long as the user has the app open.
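The setup described above maps onto a short nginx config. This is a sketch under assumptions: the upstream name and location paths are illustrative, and only the directives implied by the article (two backends with failover, a 24-hour read timeout and unbuffered proxying for SSE) are shown.

```nginx
# Illustrative config: hostnames match the service map, paths are assumed.
upstream cue_core {
    server cue-core-1:8080 max_fails=3 fail_timeout=10s;
    server cue-core-2:8080 max_fails=3 fail_timeout=10s;
}

server {
    listen 8095;

    location /api/sse/ {
        proxy_pass http://cue_core;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;        # stream events to the client immediately
        proxy_read_timeout 24h;     # keep long-lived SSE connections open
    }

    location / {
        proxy_pass http://cue_core; # a failed upstream is retried on the other instance
    }
}
```

With `max_fails` and `fail_timeout` set, nginx marks an unresponsive instance as down and routes everything to the healthy one, which is the automatic failover the article describes.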

The full service map

In total, eight containers run in Docker Compose. Here’s the complete picture:

Container      Port   Role
cue-core-1     8080   Backend instance 1
cue-core-2     8080   Backend instance 2
cue-lb         8095   nginx load balancer
cue-front-end  3000   Next.js web app
cue-admin      3016   Admin dashboard
postgres       5432   PostgreSQL 18
valkey         6379   Cache & SSE relay
cloudflared    -      Cloudflare Tunnel

Real-time with Server-Sent Events

Cue needs to feel instant. When someone responds to an event, everyone else should see it immediately — on web, iOS, and CLI. Instead of WebSockets or polling, Cue uses Server-Sent Events (SSE), a simple, HTTP-native protocol for one-way real-time streaming.

How it works

The client opens a long-lived HTTP connection to /api/sse/stream. The server keeps that connection open and pushes events down whenever something happens. It’s beautifully simple.

Signal-based, not payload-based

This is a key design choice. SSE events in Cue are signals, not data payloads. When the backend sends an event-updated signal, it doesn’t include the event data. Instead, the client refetches the relevant data through the normal REST API. This keeps authorization logic centralized in one place and the SSE layer paper-thin.
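In practice that means the client only needs a mapping from signal names to REST endpoints. A sketch in Go; the signal names and paths here are assumptions, not Cue's actual API:

```go
package main

import "fmt"

// refetchPath maps an SSE signal to the REST endpoint the client should
// refetch. Signal and path names are illustrative.
func refetchPath(signal, eventID string) (string, bool) {
	switch signal {
	case "event-updated", "response-added":
		return "/api/events/" + eventID, true
	case "resolution-updated":
		return "/api/events/" + eventID + "/resolution", true
	default:
		return "", false // unknown signals are ignored
	}
}

func main() {
	path, _ := refetchPath("event-updated", "42")
	fmt.Println(path) // the client GETs this path; no payload came over SSE
}
```

Because the refetch goes through the normal REST API, the server's authorization checks run on every read, and a signal never leaks data to a client who shouldn't see it.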

Multi-instance broadcasting

With two backend instances behind a load balancer, a signal produced by instance 1 needs to reach clients connected to instance 2. Cue solves this with a Valkey pub/sub relay. When a service publishes a signal, it goes to Valkey first, and every backend instance picks it up and broadcasts to its connected clients.

SSE broadcast flow:
EventService ─ after TX commit ─▸ Valkey pub/sub
|
├─ core-1 ─▸ connected clients
└─ core-2 ─▸ connected clients
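Each instance's fan-out step can be sketched as a registry of client channels. This is a Go analogue of the per-instance broadcaster, not Cue's Kotlin implementation; type and method names are invented:

```go
package main

import (
	"fmt"
	"sync"
)

// Broadcaster fans a signal from the pub/sub relay out to every
// connected SSE client on this instance.
type Broadcaster struct {
	mu      sync.Mutex
	clients map[chan string]struct{}
}

func NewBroadcaster() *Broadcaster {
	return &Broadcaster{clients: make(map[chan string]struct{})}
}

// Subscribe registers a new client connection.
func (b *Broadcaster) Subscribe() chan string {
	ch := make(chan string, 8)
	b.mu.Lock()
	b.clients[ch] = struct{}{}
	b.mu.Unlock()
	return ch
}

// Unsubscribe removes a disconnected client.
func (b *Broadcaster) Unsubscribe(ch chan string) {
	b.mu.Lock()
	delete(b.clients, ch)
	b.mu.Unlock()
	close(ch)
}

// Broadcast runs whenever a signal arrives from the Valkey relay.
func (b *Broadcaster) Broadcast(signal string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for ch := range b.clients {
		select {
		case ch <- signal: // deliver
		default: // drop for a slow client rather than block the relay
		}
	}
}

func main() {
	b := NewBroadcaster()
	c1, c2 := b.Subscribe(), b.Subscribe()
	b.Broadcast("event-updated")
	fmt.Println(<-c1, <-c2) // → event-updated event-updated
}
```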

Transactional safety

Signals are only sent after the database transaction commits. This is enforced through Spring’s TransactionSynchronization.afterCommit(). If a transaction rolls back, no signal is ever sent. No phantom updates. No race conditions.
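The shape of the guarantee can be illustrated in Go, as an analogue of Spring's `TransactionSynchronization.afterCommit()`: signals queued during the transaction are published only on commit and discarded on rollback. The `Tx` type here is purely illustrative:

```go
package main

import "fmt"

// Tx queues signals during a transaction and only publishes them
// after a successful commit.
type Tx struct {
	pending []string
	publish func(string)
}

func (t *Tx) QueueSignal(s string) { t.pending = append(t.pending, s) }

// Commit flushes the queue, mimicking an afterCommit callback.
func (t *Tx) Commit() {
	for _, s := range t.pending {
		t.publish(s)
	}
	t.pending = nil
}

// Rollback discards the queue: no signal ever leaves.
func (t *Tx) Rollback() { t.pending = nil }

func main() {
	var sent []string
	tx := &Tx{publish: func(s string) { sent = append(sent, s) }}

	tx.QueueSignal("event-updated")
	tx.Rollback()
	fmt.Println(len(sent)) // → 0: rolled-back work notifies nobody

	tx.QueueSignal("event-updated")
	tx.Commit()
	fmt.Println(sent) // → [event-updated]
}
```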

Heartbeats and reconnection

A heartbeat is sent every 20 seconds to keep connections alive and detect dead ones. On the client side, if the connection drops, an exponential backoff kicks in — starting at 1 second and capping at 30. A 401 response stops the retry loop entirely and redirects to login instead of hammering the server.

AI at the core

AI isn’t just a development tool for Cue — it’s embedded in the product itself. When users respond to an event, their natural language responses are analyzed by Gemini 2.5 Pro through Google’s Vertex AI.

AI resolution flow:
User responds ─▸ EventService
└─▸ Spring Event (async)
└─▸ ResolutionService ─▸ Vertex AI
└─▸ Valkey SSE relay ─▸ all clients

The resolution runs asynchronously through Spring’s ApplicationEvent system. The user sees their response saved instantly, and moments later, an updated AI recommendation appears in real-time via SSE. Gemini returns structured JSON with per-time-slot scores and a human-readable summary.

All AI calls are logged with token counts for cost monitoring. The admin dashboard tracks daily, weekly, and monthly usage against configurable limits.

Authentication

Cue supports three sign-in methods: Google, Apple, and email OTP. The frontend handles OAuth flows via NextAuth v5, while the backend validates JWT tokens from all three providers on every request.

Google

Offline access with refresh tokens. Sessions auto-renew silently every 10 minutes and whenever the tab regains focus, so the access token is fresh before a request needs it.

Apple

Apple doesn't provide refresh tokens in the same way. Users re-authenticate when the session expires — a deliberate trade-off for App Store presence.

Email

6-digit OTP sent via Brevo SMTP. Rate-limited to 3 codes per 10 minutes per address. Backend issues a 30-day JWT after verification.

Users can link multiple providers to one account through verified email matching — no account fragmentation, no “which provider did I use?” moments.

Database design

The database is PostgreSQL 18 managed exclusively through Flyway SQL migrations. Hibernate runs in validate mode — it checks that entities match the schema but never touches the DDL. Schema changes go through migration files, reviewed and intentional.

Some notable choices: status fields use VARCHAR with CHECK constraints instead of PostgreSQL enums (easier to evolve), AI resolution scores and weather reports are stored as JSONB (flexible structure), and profile pictures live in a separate table to keep the users table lean. There are 23 migrations spanning 15 tables.
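A Flyway-style migration showing those choices might look like the sketch below. The file name, table, and column names are hypothetical, not Cue's real schema:

```sql
-- V24__add_event_status.sql (illustrative; names are assumed)

-- VARCHAR + CHECK instead of a PostgreSQL enum: adding a new status later
-- is a one-line constraint change rather than an ALTER TYPE dance.
ALTER TABLE events
    ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'OPEN'
        CHECK (status IN ('OPEN', 'FINALIZED', 'CANCELLED'));

-- JSONB keeps the AI output flexible without schema churn.
ALTER TABLE events
    ADD COLUMN resolution JSONB;
```

Because Hibernate runs in validate mode, a migration like this is the only path a schema change can take, and the entity classes must be updated to match before the app will even start.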

Hosting: personal servers, no cloud

Cue deliberately avoids the big cloud providers. No AWS, no GCP compute, no Azure. Instead, it runs on physical and rented servers — a conscious choice to keep things simple, affordable, and European.

Development: Homelab

An Ubuntu Server sitting under my desk at home. Accessible via SSH through a Cloudflare Zero Trust tunnel. The database runs directly on the host. It’s fast, it’s free, and I can hear it humming when a deploy is running.

Production: Hetzner

A Hetzner CAX ARM64 instance in Europe. Affordable, fast, and the data stays on European soil. PostgreSQL runs in Docker here, and deploys are zero-downtime with rolling container updates.

No Kubernetes, by choice

Kubernetes is incredible technology, but it’s designed for problems Cue doesn’t have. Two backend instances behind nginx, managed by Docker Compose, is all the orchestration needed. The entire infrastructure fits in a single compose file. If traffic ever outgrows this setup, scaling up is a matter of adding another instance and a line in the nginx config.

Cloudflare Tunnels

Both environments are exposed to the internet through Cloudflare Zero Trust tunnels. No open ports, no public IPs, no traditional reverse proxy chain. The cloudflared daemon runs as a Docker container and connects outbound to Cloudflare’s edge, which then routes traffic to the local services. It’s elegant and secure.

Deploy: from commit to running in production

Cue doesn’t use GitHub Actions or any CI/CD platform. Instead, the entire build and deploy pipeline is a Fish shell script called omni-deploy. It’s faster, simpler, and gives complete control.

The deploy pipeline:
1. Git sync: stage all changes, commit with timestamp, push to GitHub
2. SSH to server: pull the latest code on the target machine
3. Build images: build cue-core, cue-front-end, and cue-admin Docker images in parallel
4. Sync config: copy nginx.conf and compose files to the server
5. Restart containers: bring up all services with the new images
6. Health check: poll /actuator/health for up to 60 seconds until the backend is ready
7. Verify: run the API test suite against the deployed server

Zero-downtime in production

The production deploy script goes further. Instead of restarting everything at once, it updates containers one by one — waiting for each to pass its health check before moving to the next. Nginx is reloaded after each backend instance updates, so there’s always at least one healthy instance serving traffic. Users never see downtime.

Why not GitHub Actions?

Building locally on the server skips the overhead of cloning, caching, and transferring artifacts. The Fish script SSHes directly into the server, pulls the latest code (which is already mostly there from the previous deploy), and builds. It’s significantly faster and there’s zero vendor lock-in. The whole pipeline is a single readable script.

Docker build pipeline

Each module has a multi-stage Dockerfile optimized for fast rebuilds and minimal runtime images.

cue-core
eclipse-temurin:25-jdk → eclipse-temurin:25-jre

Gradle compiles the Kotlin source into a fat JAR. The runtime image only contains the JRE and the JAR — no build tools, no source code.

cue-front-end
node:20-alpine → distroless/nodejs20

npm ci installs deps, Next.js builds the standalone output, and the final image is Google's distroless — no shell, no package manager, minimal attack surface.

cue-admin
node:20-alpine → distroless/nodejs20

Same pattern as the frontend. Identical build pipeline, different app.

All three images build in parallel during deploys. Docker layer caching means that if only the backend code changed, the frontend images rebuild in seconds, since every one of their layers is already cached.

The iOS story

The iOS app is a native SwiftUI application distributed through the App Store. This has a profound impact on the backend architecture: since users can’t be force-updated, every API endpoint the iOS app touches is treated as a stable public API.

Adding an optional response field? Safe. Removing a field or renaming an endpoint? Forbidden without a migration plan. This discipline keeps the system honest.

The app supports push notifications through Apple Push Notification service (APNs), calendar sync for finalized events, home screen widgets in three sizes, and quick actions from the home screen. Device tokens are registered on login and stale tokens are automatically cleaned up when APNs reports them as invalid.

Universal Links tie the web and the app together: tapping a /events/<id> link on iOS opens the event directly in the native app if installed, and falls back to the web view otherwise. This is powered by an apple-app-site-association file served by the Next.js frontend with separate app IDs for dev and prod builds.
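The apple-app-site-association file that powers this is small. A sketch of the classic format, served from `/.well-known/apple-app-site-association`; the team ID and bundle ID below are placeholders, and the path pattern is inferred from the `/events/<id>` route mentioned above:

```json
{
  "applinks": {
    "apps": [],
    "details": [
      {
        "appID": "TEAMID123.com.example.cue",
        "paths": ["/events/*"]
      }
    ]
  }
}
```

iOS fetches this file from the domain when the app is installed; any link matching `paths` then opens in the native app instead of Safari.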

The Go CLI

The CLI was the last module added, and it’s a great example of how AI accelerates development. With the full project context already established — API contracts, auth flows, SSE protocol — Claude Code generated the entire CLI from a single prompt: “Build a CLI for the app in Go, look at the other modules for functionality.”

It’s built with Cobra for command structure and the Charmbracelet suite for beautiful terminal UIs. Authentication tokens are stored securely in the system keyring. Distribution happens through GoReleaser, which builds binaries for macOS and Linux (both AMD64 and ARM64), generates shell completions, creates a GitHub release, and updates the Homebrew formula — all from a single make release.

Built with Claude Code

Perhaps the most interesting part of Cue’s architecture is how it was built. The entire project was developed using Claude Code — Anthropic’s CLI tool for agentic software development. Claude doesn’t just write code; it SSHes into servers, runs deploys, verifies health checks, and runs test suites.

CLAUDE.md: the project brain

At the root of the repo sits a CLAUDE.md file that acts as persistent instructions for Claude. It documents the tech stack, coding standards, deployment shortcuts, and critical rules like the iOS API stability contract. Every Claude Code session starts by reading this file, so the agent always has the right context.

Rules: domain-specific guardrails

The .claude/rules/ directory contains focused rule files for specific domains. Infrastructure rules document the two environments, SSH hostnames, and URLs. Admin rules define constraints for the dashboard. These files are loaded contextually — Claude reads the relevant rules when working in that area of the codebase.

Skills: complex workflows as commands

The real power comes from custom skills — parameterized workflows defined in .claude/skills/. Each skill is a markdown file that describes a multi-step operation Claude can execute as a slash command.

/cue-deploy-verify: commit, push, deploy to dev, run API tests, verify container health
/deploy-prod: zero-downtime production deploy with rolling updates
/api-test: run the integration test suite against the deployed server
/deploy-ios: build and install the debug app to a connected iPhone
/ios-release: close release notes and bump the App Store version
/deploy-cli: build, test, and release the Go CLI via GoReleaser + Homebrew
/reset-dev-data: seed the dev database with four realistic test users and events
/clear-dev-data: wipe all user data from the dev database for a clean slate
/tag: create and push a semver git tag

A typical development session might look like: write a feature, run /cue-deploy-verify to deploy and test it, iterate on feedback, then /deploy-prod when it’s ready. The entire cycle — from code change to verified production deploy — happens without leaving the terminal.

Memory: learning across sessions

Claude Code has a persistent memory system that carries knowledge between conversations. Feedback about preferred approaches, project context, and workflow preferences are stored and recalled automatically. The agent gets better at working with this specific project over time.

Interesting tech choices

Virtual threads (JDK 25)

Spring Boot 4 with virtual threads means each request gets its own lightweight thread. No reactive programming complexity, no callback hell — just straightforward blocking code that scales.
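In Spring Boot (3.2 and later) enabling this is a single property; assuming Cue uses the standard switch rather than custom executor wiring:

```yaml
# application.yaml: every request handler runs on a virtual thread
spring:
  threads:
    virtual:
      enabled: true
```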

Valkey for pub/sub and rate limiting

Valkey handles three jobs: the SSE pub/sub relay between backend instances, atomic sliding-window rate limiting (via a small Lua script over a sorted set), and general caching. Fast, in-memory, and a few dozen lines of Kotlin.

Distroless containers

The Next.js frontends run in Google's distroless images — no shell, no package manager, nothing except Node.js and the app. Smaller images, smaller attack surface.

Flyway over Hibernate DDL

Hibernate can generate schemas, but in production you want explicit, reviewed SQL migrations. Hibernate validates that entities match the schema — it never modifies it.

Signal-based SSE

SSE events carry signal types, not data. Clients refetch through REST, keeping authorization logic in one place. The SSE layer stays trivially simple.

Sliding-window rate limiting in Lua

Rate limits (OTPs, writes, event creation, IP ceilings) run as a single atomic Lua script against a Valkey sorted set. No race conditions, no external service — and if Valkey is unreachable, the limiter fails open so traffic keeps flowing.

Transactional events for async work

AI resolution, AI comments, and weather reports fire via Spring ApplicationEvents with @TransactionalEventListener(AFTER_COMMIT). Background work only runs once the write is durable — no phantom jobs triggered by rolled-back transactions.

No GitHub Actions

Building directly on the server is faster than any CI/CD platform. No artifact transfer, no cache warming, no YAML debugging. A Fish shell script does everything.

Fish shell for automation

Fish has cleaner syntax than Bash, better error handling, and readable scripts. The entire dev and prod deploy pipelines fit in ~100 lines of Fish each.

The philosophy

Use the simplest thing that works. Build on personal servers, not cloud abstractions. Let the AI agent handle the toil. Write code in the language that fits best, not the one you’re most comfortable with. And when eight Docker containers behind nginx can do the job — you don’t need Kubernetes.

See it in action

Built by one person and a very capable AI.