Android Multi-Process Architecture: WebView Isolation and Remote Service Stability

Last year I took over stability work for an e-commerce app. WebView-related native crashes were in the top three of the crash leaderboard. After reading the crash stacks, we found SIGSEGVs inside libwebviewchromium.so: internal rendering-engine state had gone bad, and upper-layer Java code could not catch it.

The idea at the time was straightforward: if WebView crashes are not controllable, isolate WebView in another process.

The engineering value of multiple processes: more than crash isolation

Android gives each process an independent VM instance and memory space. A crash in one process does not bring down the whole app, and memory pressure in one process does not consume the main process’s heap quota.

This also sidesteps the per-process Java heap limit, which varies by device and is often 512 MB or less. An isolated WebView process can hold 200 to 300 MB of memory without counting against the main process. On low-end devices with 3 to 4 GB of RAM, this can decide whether the app survives in the background.
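The heap cap can be read at runtime instead of assumed. A minimal sketch using only Runtime from the Java standard library; the class and method names are illustrative and not from our codebase:

```java
// Illustrative helper (hypothetical name): inspect the current process's Java heap budget.
final class HeapBudget {
    private HeapBudget() {}

    /** Maximum heap the runtime will attempt to use, in whole megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Heap currently in use (allocated minus free), in whole megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }
}
```

Logging both values from each process makes it easy to confirm that the budgets are independent. Native allocations made by the rendering engine are not counted in these figures.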

The value of multiple processes depends on the scenario. In our project, I split the use cases into three groups:

Stability isolation: put WebView and third-party SDKs, such as maps or live-streaming SDKs, into isolated processes. Native crashes in those modules do not affect the main business flow.

Memory decoupling: run memory-heavy pages such as large-image viewers and video players in separate processes. When the user exits, the system can reclaim the entire process space, leaving no leaked memory behind.

Keep-alive strategy: dual-process keep-alive patterns are mostly ineffective after Android 8.0, so I will not dwell on them. I would rather spend resources adapting vendor push channels than fight Android’s background limits.

WebView in an isolated process: implementation and pitfalls

Configuring an isolated process takes only one Manifest line:

<activity
    android:name=".webview.WebViewActivity"
    android:process=":webview" />

The colon prefix makes the process private to the application: it runs as packageName:webview, and components from other applications cannot be hosted in it. The declaration is only the first step. The real work is cross-process communication.

The core problem with an isolated WebView process is that URL loading, JavaScript calls, and Cookie synchronization are initiated by the main process but must execute across processes. Our early approach passed the URL through an Intent and opened the WebView page with startActivity. That was enough for simple cases, but as requirements grew, the gaps appeared: JS Bridge callbacks could not be returned, page state such as scroll position and form content was lost, and Cookie sync needed special handling.

AIDL interface design

We later switched to a bidirectional AIDL interface. The main process acted as the Client, and the WebView process acted as the Server:

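// IWebViewService.aidl
// Also needs a package declaration and imports for IJsCallback,
// IWebViewEventListener, and a Parcelable Cookie type declared in its own .aidl file.
interface IWebViewService {
    // Map and List parameters need a direction tag (in, out, or inout)
    void loadUrl(String url, in Map headers);
    void evaluateJavascript(String script, IJsCallback callback);
    void syncCookies(String domain, in List<Cookie> cookies);
    void registerEventListener(IWebViewEventListener listener);
}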

IJsCallback and IWebViewEventListener are callback interfaces that provide bidirectional communication between the main process and the WebView process. There is one hard rule here: callback interfaces must be marked oneway. A synchronous callback blocks the calling Binder thread in the WebView process until the main process returns, and that stall can back up into the main process’s own calls and end in an ANR.

// IJsCallback.aidl
oneway interface IJsCallback {
    void onResult(String result);
}
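To make the oneway contract concrete, here is a plain-Java analogy. It is not Binder and none of these names come from the Android SDK; it only models the two properties that matter here: the caller returns immediately, and calls on one channel are handled one at a time in call order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java analogy (hypothetical class): a fire-and-forget callback channel.
final class OnewayChannel {
    // A single worker thread plays the role of the receiving side.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> delivered =
            Collections.synchronizedList(new ArrayList<>());

    /** Returns immediately; the result is handled later, in call order. */
    void onResult(String result) {
        worker.execute(() -> delivered.add(result));
    }

    /** Stops accepting calls, waits for queued ones, and returns what was handled. */
    List<String> drain() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(delivered);
    }
}
```

The caller never learns whether the handler succeeded, which is the price of not blocking on it.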

One pitfall I hit: WebView initialization is asynchronous in the isolated process. When the main process calls loadUrl through AIDL, the WebView instance may not have finished initializing. The fix is to introduce a readiness signal:

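// WebView process side
private final CountDownLatch mReadyLatch = new CountDownLatch(1);

@Override
public void loadUrl(String url, Map headers) {
    try {
        // Bounded wait (3 s is an illustrative value); drop the call on timeout
        if (!mReadyLatch.await(3, TimeUnit.SECONDS)) return;
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
    }
    mHandler.post(() -> mWebView.loadUrl(url, headers));
}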

Use CountDownLatch instead of polling state, because a waiting Binder thread does not consume CPU time slices. But be careful: an AIDL implementation method must not wait for long, so bound the wait with a timeout. The Binder thread pool is small (16 threads per process by default); occupying them for a long time makes other calls queue and time out.
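An alternative to blocking is to queue early calls and replay them once the WebView is ready, so no Binder thread is parked at all. This is a different technique from the latch above, sketched in plain Java under the assumption that each queued task is cheap (for example, a post to the main-thread Handler). ReadyGate and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical helper: buffer calls until the WebView is ready, then replay them.
final class ReadyGate {
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private boolean ready;

    /** Runs the task now if the gate is open; otherwise queues it. Never waits. */
    synchronized void runWhenReady(Runnable task) {
        if (ready) {
            task.run();
        } else {
            pending.add(task);
        }
    }

    /** Opens the gate and replays queued tasks in arrival order. */
    synchronized void markReady() {
        ready = true;
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run();
        }
    }
}
```

Tasks run while the lock is held, which keeps replayed and new calls in order but means a task should only hand work off rather than do it inline.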

Process death handling

The WebView process can be killed by the system at any time, and the main process must notice:

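private final ServiceConnection mConnection = new ServiceConnection() {
    @Override
    public void onServiceConnected(ComponentName name, IBinder service) {
        // Required override: cache IWebViewService.Stub.asInterface(service) here
    }

    @Override
    public void onServiceDisconnected(ComponentName name) {
        // WebView process died; bind again
        bindWebViewService();
    }
};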

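onServiceDisconnected is called only when the connection is lost unexpectedly, such as when the hosting process crashes or is killed. A normal unbindService does not trigger it. The original binding stays registered, and the system may call onServiceConnected again once the service is running, so the rebind path has to tolerate a duplicate connection. Rebind inside the callback, and wrap it with exponential backoff to avoid a reconnect loop.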
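The backoff itself is only a few lines. A minimal sketch in plain Java; ReconnectBackoff is a hypothetical helper, and the base and cap values are whatever the caller chooses:

```java
// Hypothetical helper: compute reconnect delays with capped exponential backoff.
final class ReconnectBackoff {
    private final long baseDelayMs;
    private final long maxDelayMs;
    private int attempt;

    ReconnectBackoff(long baseDelayMs, long maxDelayMs) {
        this.baseDelayMs = baseDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    /** Delay before the next rebind attempt: base, 2x base, 4x base, up to the cap. */
    synchronized long nextDelayMs() {
        int shift = Math.min(attempt, 20); // bound the shift so the value cannot overflow
        attempt++;
        return Math.min(maxDelayMs, baseDelayMs << shift);
    }

    /** Call after a successful connection so the next failure starts from the base delay. */
    synchronized void reset() {
        attempt = 0;
    }
}
```

In onServiceDisconnected, the rebind would then be scheduled with something like a Handler's postDelayed using nextDelayMs(), and reset() would be called from onServiceConnected.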

Remote Service memory optimization

An isolated process is not free. Each extra process costs:

  • VM instance: about 10 MB, including ART dex cache and JIT code cache
  • System resources: Binder thread pool and Ashmem shared-memory regions
  • The process’s own Application initialization cost

Our app once split push, WebView, and image loading into three separate processes. Cold start on low-end devices became 800 ms slower because Application.onCreate ran in every process.

Slimming down Application initialization is a prerequisite for this architecture:

public class MyApplication extends Application {
    @Override
    public void onCreate() {
        super.onCreate();

        // Only process-independent initialization: logging and crash handling
        LogManager.init();
        CrashHandler.init();
        
        // Defer process-specific initialization to each process entry point
        if (isMainProcess()) {
            initMainProcessModules();
        }
    }
    
    private boolean isMainProcess() {
        // getProcessName(Context) is a project helper not shown here;
        // on API 28+ the static Application.getProcessName() returns the same value
        String processName = getProcessName(this);
        return getPackageName().equals(processName);
    }
}

Check the current process name and initialize only the modules that process needs. The WebView process does not need image libraries, network libraries, or database connection pools. The native libraries and memory footprint from those modules can easily exceed 30 MB.
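Once there are more than two processes, a chain of if statements gets hard to audit. One way to keep it readable is a small registry keyed by process suffix. This is a sketch in plain Java, not our production code; ProcessInitPlan and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry: map each process to the initializers it actually needs.
final class ProcessInitPlan {
    private final List<Runnable> shared = new ArrayList<>();
    private final Map<String, List<Runnable>> bySuffix = new HashMap<>();

    /** Registers a task that every process runs (logging, crash handling). */
    ProcessInitPlan always(Runnable task) {
        shared.add(task);
        return this;
    }

    /** Registers a task for one process: "" is the main process, ":webview" a private one. */
    ProcessInitPlan only(String suffix, Runnable task) {
        bySuffix.computeIfAbsent(suffix, k -> new ArrayList<>()).add(task);
        return this;
    }

    /** Runs the shared tasks, then the tasks registered for the current process. */
    void run(String packageName, String processName) {
        String suffix = processName.startsWith(packageName)
                ? processName.substring(packageName.length())
                : processName;
        shared.forEach(Runnable::run);
        bySuffix.getOrDefault(suffix, Collections.emptyList()).forEach(Runnable::run);
    }
}
```

Application.onCreate would build the plan once and call run with getPackageName() and the current process name, so the per-process module list lives in one place.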

Another common pitfall is Binder payload size. Binder’s transaction buffer is about 1 MB per process and is shared by all transactions in flight, so the practical limit for a single call is well below that. Passing large objects not only adds serialization overhead, but also involves a kernel-space memory copy. Do not pass a Bitmap object directly across processes. Use a Bundle with a ParcelFileDescriptor, or pass a file path and let the target process decode it.
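A size check at the call site keeps this rule from depending on reviewer memory. A minimal sketch in plain Java; PayloadRouter is a hypothetical helper and the 64 KB threshold is an assumed safety margin, not a platform constant:

```java
// Hypothetical guard: decide how to move a payload across processes by size.
final class PayloadRouter {
    enum Route { INLINE, FILE_DESCRIPTOR }

    // Assumed margin, far below the shared transaction buffer, because other
    // in-flight transactions in the same process draw from the same buffer.
    static final int INLINE_LIMIT_BYTES = 64 * 1024;

    private PayloadRouter() {}

    /** Small payloads go in the call itself; anything larger goes by descriptor or path. */
    static Route routeFor(long payloadBytes) {
        if (payloadBytes < 0) {
            throw new IllegalArgumentException("payloadBytes must be >= 0");
        }
        return payloadBytes <= INLINE_LIMIT_BYTES ? Route.INLINE : Route.FILE_DESCRIPTOR;
    }
}
```

A decoded full-screen image is several megabytes, so it always lands on the FILE_DESCRIPTOR route under this threshold.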

Tradeoffs in cross-process data sharing

In a multi-process architecture, singleton objects and in-memory caches stop being shared because every process has an independent address space. SharedPreferences becomes unreliable for the same reason: each process keeps its own in-memory copy of the file. There are three main choices for data sharing:

ContentProvider: good for structured data sharing. It handles the cross-process plumbing for you, including Binder transport and change notifications through ContentResolver. The cost is higher overhead, and complex queries may need multiple IPC calls.

MMKV multi-process mode: a replacement for SharedPreferences. It uses mmap underneath for multi-process reads and writes. Writes still need locking, and high-frequency writes can contend.

Socket / Messenger: use a long-lived Socket connection when realtime behavior matters and you are willing to own serialization. Use Messenger when realtime requirements are lower; underneath, it is a message-queue model.

Our project used ContentProvider for Cookie sync, MMKV for page state, and AIDL callbacks for WebView JavaScript results. Do not chase a single unified solution. Choose the communication method based on the data’s characteristics.

Practical advice

Multi-process architecture is a tradeoff. Stability and resource cost must be balanced precisely. These are the three lessons I keep from production incidents:

  1. Create extra processes only where necessary. If a module’s crash can be caught and degraded by upper layers, do not put it in a separate process. WebView native crashes cannot be caught by Java code, which is why isolation is justified.

  2. Make Application.onCreate process-aware. Each process should load only the modules it needs. This is the most effective way to keep a multi-process architecture from slowing down startup.

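  3. Make Binder callbacks oneway by default. Do not assume the remote process always responds quickly. One synchronous callback timeout can cascade into a main-process ANR. oneway calls on the same Binder object are still dispatched one at a time in call order, but the caller gets no return value and no signal that the callback ran, and ordering across different callback objects is not guaranteed. Stability matters more than those guarantees. In the production flows I have seen, I have not encountered a business requirement that truly depended on callback order.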

Further reading