When the SDK Says No: Vision-Aware RAG via a 300-Line Python Bridge

This is Part 3, the final chapter in turning a text-only knowledge base into a fully vision-aware RAG system. After moving CLIP embeddings into Oracle AI Database 26ai, I had something great: true multimodal search. Text queries found images. Images found related text. No OCR, no captions, just CLIP doing its magic in the database. But I quickly hit a wall. When I asked my knowledge management app: “What color is the car in this photo?” ...

November 22, 2025 · 14 min · Brian Hengen

Three LLMs, One App: Balancing Speed, Privacy, and Power

I spent a weekend fine-tuning a model for my knowledge management app, designed to handle notes, PDFs, and presentations with Oracle Database 23ai’s vector search (see my management AI post). It aced testing on my RTX 5090 server, but on my M2 MacBook Pro? Barely usable. A query like “Summarize last week’s customer meetings and identify risks” took over a minute, leaving me staring at a spinning wheel while my coffee got cold. ...

October 28, 2025 · 6 min · Brian Hengen

Fine-Tuning a Personal Executive Assistant: Lessons from My Management Notes

After successfully fine-tuning ChefBot on cooking recipes, I wondered: Could I fine-tune an AI on my own management experience to create a personalized executive assistant? Imagine asking your AI: “Summarize last week’s 1:1 with Sarah and suggest coaching points”, and getting a response in your voice, drawing from years of team dynamics and decision patterns. All running locally, with complete privacy and no API costs. This is the story of how I built exactly that. Spoiler: I failed three times before succeeding, and the lesson wasn’t about hyperparameters. ...

October 5, 2025 · 8 min · Brian Hengen

Fine-Tuning Qwen Models: From Theory to Practice

Imagine an AI that coordinates your entire cooking process—faster, smarter, and without ChatGPT’s API costs. With my RTX 5090 workstation humming, I’m answering: Can a specialized language model outcook ChatGPT in the kitchen? Over the past few days, I’ve been fine-tuning Qwen’s 32B and 14B parameter models to create ChefBot, an experimental Specialized Language Model (SLM). Think ChatGPT for recipes, but tuned for cooking data and running without per-token API costs. ...

September 29, 2025 · 3 min · Brian Hengen

Building the RTX 5090 AI Workstation - Step by Step

🚀 The Build Day Has Arrived! After months of anticipation and planning, the RTX 5090 finally launched and it was time to bring my AI/ML powerhouse to life. This is more than just a computer build - it’s the foundation for exploring the cutting edge of artificial intelligence and machine learning. Every component was carefully selected to maximize performance for training neural networks, running inference, and pushing the boundaries of what’s possible with consumer AI hardware. ...

September 24, 2025 · 3 min · Brian Hengen

Building the Ultimate AI Workstation

Building My AI Workstation: Kicking Off the Project. Over the past few years, I’ve watched the evolution of AI move from research labs into the everyday workflows of business, productivity, and creativity. Models that once required massive clusters are now accessible to individuals, and the frontier is shifting toward Specialized Language Models (SLMs): models tuned for specific domains, tasks, and contexts. To push my own understanding forward, I’ve decided to build a personal AI workstation: a high-performance rig that will let me fine-tune, benchmark, and experiment with large models in a hands-on way. This is not just about performance numbers; it’s about exploring how to apply AI practically and learning what it really takes to train and deploy models tailored for business value. ...

September 20, 2025 · 3 min · Brian Hengen

CLIP Inside Oracle AI Database 26ai: Fast, Multimodal RAG

After the 3-way LLM toggle went live, I turned my attention to embeddings, the invisible glue that powers search and RAG. Oracle OCI GenAI’s Cohere endpoint had been rock-solid in my testing: fast and reliable, with an 80K-token context. But every chunk still meant a network round-trip, and images were stuck behind OCR. With text-only embeddings, photos, diagrams, and whiteboards were blind spots in my knowledge base. ...

November 11, 2024 · 11 min · Brian Hengen
