Performance Optimization Articles
Android 16 Edge-to-Edge: WindowInsets Dispatch and System Bar Adaptation
Android 16 enforces edge-to-edge for targetSdk 36 apps, which makes it a breaking change. This guide explains the WindowInsets dispatch path and practical adaptation strategies for both View and Compose UIs.
Android Local LLM Inference: LiteRT, MediaPipe, Quantization, and Production Trade-offs
A practical guide to Android local LLM inference across LiteRT, ONNX Runtime Mobile, MediaPipe LLM Inference, INT4 quantization, GPU delegates, KV cache memory, and device fallback.
Android On-device AI Benchmarking: Latency, Throughput, Power, and Thermal Degradation
A practical benchmark methodology for Android on-device AI inference across latency, throughput, power, thermal throttling, long-tail metrics, GPU sync, and automated test reports.
Android Official Skills Deep Dive: Redefining Android Development Workflows with AI Agents
A deep dive into Google's official android/skills repository, a structured instruction set for AI agents that covers Compose migration, Navigation 3, R8 optimization, and other core Android workflows.
Android Bitmap Memory Model: From Java Heap to Hardware Bitmap
A deep dive into how Android Bitmap pixel storage moved between native heap, Java heap, and GPU memory, and what that means for OOM prevention.
Android RecyclerView Cache: Four Levels, Reuse, and Prefetch
A deep dive into RecyclerView's four cache layers, their different hit costs, GapWorker prefetch, and practical tuning for smoother scrolling.
Inside the Android ART dex2oat Pipeline: From DEX Bytecode to OAT Machine Code
A practical walkthrough of the dex2oat pipeline, compiler filter trade-offs, JIT and AOT cooperation, and Baseline Profile startup optimization.
Android On-device AI Chat Compose UI Architecture: Streaming Rendering and Multi-turn Conversation State
Compose UI patterns for on-device LLM chat apps, using token buffering, state isolation, and a single source of truth to keep streaming output smooth.
Kotlin Value Classes and Inline Classes: Zero-Overhead Type Safety
A deep dive into Kotlin inline class and value class compilation, boxing elimination, JVM signatures, Android performance patterns, and serialization compatibility.
Kotlin Symbol Processing: KSP from Annotation Scanning to Code Generation
A full walkthrough of KSP as a KAPT replacement, from SymbolProcessor and Resolver to CodeGenerator, incremental builds, multiplatform support, and migration.
Android Cache Systems: LruCache, DiskLruCache, and Offline-first Design
A practical deep dive into Android caching, from LruCache and DiskLruCache internals to three-level cache coordination, consistency, and offline-first engineering.
Android Stability Monitoring: From Crash SDKs to APM Dashboards
A deep dive into Android production stability monitoring, including Java and native crash collection, ANR detection, reporting pipelines, stack clustering, and APM dashboards.
Android On-device Speech Recognition: From SpeechRecognizer to Android 16 ASR
A full-stack look at Android on-device speech recognition, from AudioRecord capture and SpeechRecognizer APIs to Android 16's built-in ASR engine.
Android On-device LLM Context Window Engineering
A practical Android on-device LLM context management strategy covering layered prompt compression, summary caching, dialog state machines, and token budget allocation.
Android On-device LLM Streaming Output: From Tokens to Compose UI
A full-stack architecture for Android on-device LLM streaming output, covering KV cache memory pressure, Kotlin Flow backpressure, and incremental Compose rendering.
Android On-device AI Model Delivery and Version Management
A practical model delivery architecture that decouples on-device AI models from APK releases with three-layer versioning, BSDiff incremental updates, and hot rollback.
Android On-device AI Power and Thermal Management: From SoC DVFS to Thermal Throttling
A practical look at sustained on-device LLM inference, GPU power profiles, DVFS scheduling, thermal throttling, and thermal-aware load scheduling that reduced P99 latency from 890 ms to 380 ms.
Android On-device AI Memory Bandwidth: GPU Shared Memory to NPU Zero-copy
A practical guide to Android on-device AI memory-bandwidth optimization, from camera-to-GPU data movement to AHardwareBuffer, ION reuse, and NPU zero-copy paths.
Android Cold Start Optimization with Baseline Profiles
A practical Android cold start optimization guide covering Baseline Profile principles, generation, integration, measurement, test results, and rollout caveats.
Android On-device AI Profiling with Perfetto: NPU Scheduling and Memory Bandwidth
A Perfetto-based profiling method for Android on-device AI inference, tracing NPU scheduling, GPU counters, DRM contention, and memory bandwidth bottlenecks.