Performance Optimization Articles

Android 16 Edge-to-Edge: WindowInsets Dispatch and System Bar Adaptation

Android 16 forces edge-to-edge for apps that target SDK 36, which is a breaking change. This guide explains the WindowInsets dispatch path and practical adaptation strategies for both View and Compose UIs.

Android Local LLM Inference: LiteRT, MediaPipe, Quantization, and Production Trade-offs

A practical guide to Android local LLM inference across LiteRT, ONNX Runtime Mobile, MediaPipe LLM Inference, INT4 quantization, GPU delegates, KV cache memory, and device fallback.

Android On-device AI Benchmarking: Latency, Throughput, Power, and Thermal Degradation

A practical benchmark methodology for Android on-device AI inference across latency, throughput, power, thermal throttling, long-tail metrics, GPU sync, and automated test reports.

Android Official Skills Deep Dive: Redefining Android Development Workflows with AI Agents

A deep dive into Google's official android/skills repository, a structured instruction set for AI agents that covers Compose migration, Navigation 3, R8 optimization, and other core Android workflows.

Android Bitmap Memory Model: From Java Heap to Hardware Bitmap

A deep dive into how Android Bitmap pixel storage moved between native heap, Java heap, and GPU memory, and what that means for OOM prevention.

Android RecyclerView Cache: Four Levels, Reuse, and Prefetch

A deep dive into RecyclerView's four cache layers, their different hit costs, GapWorker prefetch, and practical tuning for smoother scrolling.

Inside the Android ART dex2oat Pipeline: From DEX Bytecode to OAT Machine Code

A practical walkthrough of the dex2oat pipeline, compiler filter trade-offs, JIT and AOT cooperation, and Baseline Profile startup optimization.

Android On-device AI Chat Compose UI Architecture: Streaming Rendering and Multi-turn Conversation State

Compose UI patterns for on-device LLM chat apps, using token buffering, state isolation, and a single source of truth to keep streaming output smooth.

Kotlin Value Classes and Inline Classes: Zero-Overhead Type Safety

A deep dive into Kotlin inline class and value class compilation, boxing elimination, JVM signatures, Android performance patterns, and serialization compatibility.

Kotlin Symbol Processing: KSP from Annotation Scanning to Code Generation

A full walkthrough of KSP as a KAPT replacement, from SymbolProcessor and Resolver to CodeGenerator, incremental builds, multiplatform support, and migration.

Android Cache Systems: LruCache, DiskLruCache, and Offline-first Design

A practical deep dive into Android caching, from LruCache and DiskLruCache internals to three-level cache coordination, consistency, and offline-first engineering.

Android Stability Monitoring: From Crash SDKs to APM Dashboards

A deep dive into Android production stability monitoring, including Java and native crash collection, ANR detection, reporting pipelines, stack clustering, and APM dashboards.

Android On-device Speech Recognition: From SpeechRecognizer to Android 16 ASR

A full-stack look at Android on-device speech recognition, from AudioRecord capture and SpeechRecognizer APIs to Android 16's built-in ASR engine.

Android On-device LLM Context Window Engineering

A practical Android on-device LLM context management strategy covering layered prompt compression, summary caching, dialog state machines, and token budget allocation.

Android On-device LLM Streaming Output: From Tokens to Compose UI

A full-stack architecture for Android on-device LLM streaming output, covering KV cache memory pressure, Kotlin Flow backpressure, and incremental Compose rendering.

Android On-device AI Model Delivery and Version Management

A practical model delivery architecture that decouples on-device AI models from APK releases with three-layer versioning, BSDiff incremental updates, and hot rollback.

Android On-device AI Power and Thermal Management: From SoC DVFS to Thermal Throttling

A practical look at sustained on-device LLM inference, GPU power profiles, DVFS scheduling, thermal throttling, and thermal-aware load scheduling that reduced P99 latency from 890 ms to 380 ms.

Android On-device AI Memory Bandwidth: GPU Shared Memory to NPU Zero-copy

A practical guide to Android on-device AI memory-bandwidth optimization, from camera-to-GPU data movement to AHardwareBuffer, ION reuse, and NPU zero-copy paths.

Android Cold Start Optimization with Baseline Profiles

A practical Android cold start optimization guide covering Baseline Profile principles, generation, integration, measurement, test results, and rollout caveats.

Android On-device AI Profiling with Perfetto: NPU Scheduling and Memory Bandwidth

A Perfetto-based profiling method for Android on-device AI inference, tracing NPU scheduling, GPU counters, DRM contention, and memory bandwidth bottlenecks.