When the SDK Says No: Vision-Aware RAG via a 300-Line Python Bridge

This is Part 3, the final chapter in turning a text-only knowledge base into a fully vision-aware RAG system. After moving CLIP embeddings into Oracle AI Database 26ai, I had something great: true multimodal search. Text queries found images. Images found related text. No OCR, no captions, just CLIP doing its magic in the database. But I quickly hit a wall. When I asked my knowledge management app: “What color is the car in this photo?” ...

November 22, 2025 · 14 min · Brian Hengen

CLIP Inside Oracle AI Database 26ai: Fast, Multimodal RAG

After the 3-way LLM toggle went live, I turned my attention to embeddings, the invisible glue that powers search and RAG. Oracle OCI GenAI’s Cohere endpoint had been rock-solid in my testing: fast, reliable, and with an 80K-token context. But every chunk still meant a network round-trip, and images were stuck behind OCR. With text-only embeddings, photos, diagrams, and whiteboards remained blind spots in my knowledge base. ...

November 11, 2024 · 11 min · Brian Hengen
