Project Aeon: Local-First Assistant Platform
A local-first RAG platform: an assistant where your documents never leave your machine. FastAPI orchestrates requests, ChromaDB stores and searches document embeddings, and Ollama runs the model locally, with clean boundaries between the API, vector store, and UI. No API keys, no external servers.
Context
Project Aeon started from a specific problem: cloud-based AI assistants require sending your documents to external servers. The goal was RAG-powered conversations over local files, with everything running on your own machine. No API keys, no data leaving the host.
Architecture
The system has three backend layers and a thin frontend.
FastAPI serves the backend API and orchestrates requests: taking a user query, hitting the vector store, assembling the prompt, and forwarding the generation request to the local model (a sketch of this flow follows the component descriptions).
ChromaDB handles vector storage and similarity search over document embeddings.
Ollama runs the LLM locally. Model choice is a deployment concern, not a code change.
The Vue 3 frontend with Naive UI provides a clean chat interface, but most of the work is behind the API.
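A minimal sketch of how the backend layers fit together, assuming a persistent Chroma collection and Ollama's default local REST endpoint; the collection name, model name, prompt template, and top-k value here are illustrative, not the project's actual configuration.

```python
# Sketch of the orchestration layer: retrieve, assemble prompt, generate locally.
import chromadb
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
chroma = chromadb.PersistentClient(path="./chroma")   # local vector store on disk
collection = chroma.get_or_create_collection("docs")  # assumed collection name

OLLAMA_URL = "http://localhost:11434/api/generate"    # Ollama's default port
MODEL = "llama3"                                       # deployment concern, not a code change
TOP_K = 4                                              # illustrative retrieval depth


class Query(BaseModel):
    question: str


@app.post("/chat")
async def chat(query: Query) -> dict:
    # 1. Similarity search over the local document embeddings.
    hits = collection.query(query_texts=[query.question], n_results=TOP_K)
    context = "\n\n".join(hits["documents"][0])

    # 2. Assemble the prompt from retrieved chunks plus the user question.
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query.question}"
    )

    # 3. Forward the generation request to the local model via Ollama.
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(
            OLLAMA_URL, json={"model": MODEL, "prompt": prompt, "stream": False}
        )
    return {"answer": resp.json()["response"], "sources": hits["documents"][0]}
```

Because model choice lives in configuration (the `MODEL` value passed to Ollama), swapping models is a deployment change rather than a code change.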
Retrieval quality
The interesting part was getting retrieval quality right with local models. Cloud APIs can brute-force relevance with bigger models and longer context windows; locally, you have to be smarter about chunking, embedding quality, and context window management. The chunker ended up being one of the load-bearing decisions: too small and retrieval fragments related content, too large and the context window fills with noise.
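A toy fixed-size chunker with overlap makes that tradeoff concrete; the size and overlap values below are illustrative, since the project's actual splitting rules aren't shown here.

```python
def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows.

    Too small a `size` fragments related sentences across chunks; too large
    drags unrelated paragraphs into every retrieval hit and crowds the
    local model's limited context window. Overlap keeps sentences that
    straddle a boundary retrievable from at least one chunk.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```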
Status
Completed (learned everything I wanted to from this project)