MediaPipe Articles
Android Local LLM Inference: LiteRT, MediaPipe, Quantization, and Production Trade-offs
A practical guide to Android local LLM inference across LiteRT, ONNX Runtime Mobile, MediaPipe LLM Inference, INT4 quantization, GPU delegates, KV cache memory, and device fallback.
Android On-device RAG: From Local Vector Databases to LLM Inference
A practical walkthrough of on-device RAG on Android, covering document chunking, local vector search with SQLite, MediaPipe LLM Inference, and performance trade-offs.