Project Aeon: Local-First Assistant Platform

An assistant platform where your data never leaves your machine. Vector search over local documents, a local LLM runtime through Ollama, and clean boundaries between the API, vector store, and UI.

  • FastAPI
  • ChromaDB
  • Vue 3
  • TypeScript
  • Naive UI
  • Ollama

Role: Personal project · Year: 2025 · Status: shipped

tldr: A local-first RAG platform so your documents never leave the machine. FastAPI orchestrates, ChromaDB stores and searches, Ollama runs the model. The hard part isn’t plumbing; it’s getting retrieval quality right with models that can’t brute-force relevance.

Project Aeon started from a specific problem: cloud-based AI assistants require sending your documents to external servers. The goal was RAG-powered conversations over local files, with everything running on your own machine. No API keys, no data leaving the host.

Architecture

The system has three backend layers and a thin frontend.

  • FastAPI serves the backend API and orchestrates requests: taking a user query, hitting the vector store, assembling the prompt, and forwarding the generation request to the local model.
  • ChromaDB handles vector storage and similarity search over document embeddings.
  • Ollama runs the LLM locally. Model choice is a deployment concern, not a code change.
  • The Vue 3 frontend with Naive UI provides a clean chat interface, but most of the work is behind the API.
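The request flow described above (take a query, hit the vector store, assemble the prompt, forward to the model) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Orchestrator`, `build_prompt`, `Retriever`, and `Generator` are hypothetical names, the prompt wording is invented, and the two callables stand in for a ChromaDB query and a call to the local Ollama server.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins: in a real deployment these would wrap a ChromaDB
# collection query and a request to the local Ollama server.
Retriever = Callable[[str, int], List[str]]  # (query, top_k) -> text chunks
Generator = Callable[[str], str]             # prompt -> model completion


def build_prompt(question: str, chunks: List[str]) -> str:
    """Assemble a prompt from numbered context chunks followed by the question."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


@dataclass
class Orchestrator:
    """Coordinates retrieval and generation; knows nothing about either backend."""
    retrieve: Retriever
    generate: Generator
    top_k: int = 3

    def answer(self, question: str) -> str:
        chunks = self.retrieve(question, self.top_k)  # vector store lookup
        prompt = build_prompt(question, chunks)       # prompt assembly
        return self.generate(prompt)                  # local model call


if __name__ == "__main__":
    # Stub backends so the sketch runs without ChromaDB or Ollama installed.
    docs = ["Embeddings live in the vector store.", "The model runs locally."]
    orch = Orchestrator(
        retrieve=lambda query, k: docs[:k],
        generate=lambda prompt: f"(completion for a {len(prompt)}-char prompt)",
    )
    print(orch.answer("Where do embeddings live?"))
```

Passing the retriever and generator in as plain callables is one way to keep the boundaries clean: the orchestration logic can be exercised with stubs, and swapping the model or the vector store does not touch it.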

Retrieval quality

The interesting part was getting retrieval quality right with local models. Cloud APIs can brute-force relevance with bigger models and longer context windows; locally, you have to be smarter about chunking, embedding quality, and context window management. The chunker ended up being one of the load-bearing decisions: too small and retrieval fragments related content, too large and the context window fills with noise.
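To make the chunk-size trade-off concrete, here is a minimal sketch of a fixed-size word chunker with overlap. It is not necessarily the chunker Aeon uses: `chunk_words` is a hypothetical helper, the default `size` and `overlap` values are placeholders rather than tuned settings, and splitting on whitespace is a simplification of real tokenization.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words
    with the previous window, so content near a boundary appears in both."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "a b c d e f g h i j"
    for chunk in chunk_words(sample, size=4, overlap=1):
        print(chunk)
```

A smaller `size` yields more precise but more fragmented chunks; a larger one keeps related content together at the cost of spending more of the context window per retrieved chunk. The overlap softens the fragmentation by repeating boundary words in adjacent chunks.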

Completed: I learned everything I wanted from this project.