Android Treble: blessing or trouble – Part III

Karim Yaghmour, CEO Opersys inc. (www.opersys.com)

Inside Treble

To understand how Treble achieves its goals, we need to take a closer look at the internals of Android and get a better understanding of how the Android framework interacts with the lower layers of the system.  Only then can we explore the technologies introduced by Treble to modularize the links between the two, such as HIDL, VINTF, VNDK, and VTS.  While it’s outside the scope of this post to describe each of these in detail, I will describe them sufficiently to highlight their relevance with regards to our earlier discussion.  Note that many of those will require a significant time investment on the part of platform developers to master properly.  Through this discussion we’ll also see that there are likely a number of other aspects underlying Google’s decisions that aren’t spelled out as-is in their official documentation but that are worth keeping in mind still.

Underneath the Framework

The following diagram provides a high-level overview of Android’s internals:

The framework is essentially all the yellow boxes.  The gray boxes underneath the framework are part of what is commonly referred to as the “native” layer, so called because it is mostly made up of code compiled for the native processor architecture while most of the framework is written in architecture-independent Java code.  This native layer contains much of what is referred to as the “vendor implementation” in the Google Treble diagrams we saw earlier.  To run properly, the framework has traditionally depended on several key components from the native layer:

– System libraries for a range of support functionality such as a C library, SSL, GL, etc.

– Hardware Abstraction Layer (HAL) modules for hardware support

– Native daemon processes for privilege separation and key service hardening

– Command line tools and capabilities as provided by Toybox

– The Linux kernel’s capabilities, drivers, system calls and interfaces

– SE Android / SE Linux security

Historically, there were no limits to the number or nature of changes an SoC vendor or device manufacturer could make in those layers without penalty, since the vast majority of the functional surface of those layers was never tested by the CTS.  As long as the app development layer (illustrated by the “android.*” and “java.*” boxes in the previous diagram) behaved the same as in an official Android release from Google, CTS would pass without a problem.  Treble, as we shall see, formalizes much of the coupling between the framework and the native components and enforces the validation of the formal rules using the VTS.

Linux Kernel

Aside from the Android code-base managed and released by Google, Android depends on one major component that Google does not control in any way: the Linux kernel.  Linux had of course existed for a very long time before Android came along.  Its adoption as the basis of the Android OS by the Android team was, to an extent, a recognition of Linux’s success at becoming the de-facto OS for much of the embedded space as seen by the level of support it enjoyed at the time (and continues to enjoy) from SoC vendors.  Over the course of its 25+ year history, much has been written about Linux and its development model.  As a first stop, you can refer to Wikipedia for an explanation of its history and its development.  A quick search in your favorite search engine should yield quite a few articles and books.  Suffice it to say, though, that Google is not the originator of Linux.  It’s only yet another company that needs Linux’s capabilities for its products.  Hence, Google is downstream from the existing Linux development process.  Whereas, for instance, Google is shown as the first party in the Android distribution flow as seen earlier, it’s essentially the second party in line in that same diagram when it comes to the Linux kernel.

Hence, for the purposes of the Linux kernel, Google is yet another intermediary.  And much as Google worries about the lack of responsiveness of intermediaries downstream from it in making sure users get up-to-date Android releases as quickly as possible, so too have Linux kernel developers historically been worried about Google’s and its intermediaries’ reliance on older kernel releases.  In short, it hasn’t been uncommon to see new Android devices running on what are effectively “end-of-life” kernel versions.  While the kernel’s capabilities and features remain largely hidden from regular users, the failure of new kernel releases to reach devices in the form of updates is a major concern with regard to one aspect we haven’t spent much time discussing so far: security.  Indeed, there are security fixes in every new kernel release, and since the kernel is the foundation of the Android stack, as can be seen in the earlier diagram, Android’s own security mechanisms are often at stake when a security issue is found in the kernel.

Furthermore, it hasn’t been uncommon to see the same Android release running on different kernels from different SoC vendors.  Android itself didn’t have, prior to Treble, any hard requirements with regards to the specific kernel version to be used with Android.  Security fixes were generally applied in a very uneven fashion from one SoC vendor to the next.  Even for a given SoC vendor, some security fixes applied in newer kernels would be backported whereas others were not, without necessarily any apparent logic or rationale in the choice of fixes being applied.  Much like Android updates, kernel updates for Android were therefore spotty at best.

Treble’s goals for the kernels used by Android follow the same pattern as that charted for Android itself.  Namely, the aim is to make kernel updates for Android devices easier, faster and more consistent.  To do so, Treble aims to separate the deliverables related to the kernel based on the role of each party in the ecosystem.  To keep vendors honest, Treble specifies that several kernel-related aspects will be tested by the VTS.  Specifically, Treble sets rules for the kernel versions that must be used, their basic configuration, the system calls that must be exposed (ABI and API), and the filesystems and filesystem features that must be supported.

Google provides a number of detailed pages explaining its current requirements for 8.x/Oreo as well as its projected plans moving forward.  In general, the long-term goal seems to be for Google to base its own Androidized kernel releases on the Long Term Support (LTS) releases which aim at providing long-term support for specific kernel releases.  Google in fact announced at Linaro Connect last fall that they would work with LTS release maintainer Greg Kroah-Hartman to support the 4.4 LTS kernel for 6 years instead of the initial 2 years generally planned for LTS, thereby essentially allowing for 4 Android releases to run on that 4.4 LTS kernel. Those kernels would then have a Google-specified base configuration and be used by SoC vendors to provide per-SoC vendor kernels which would be customized by OEMs/ODMs by way of kernel config overrides, loadable modules, and device-tree overlays. In other words, the vendor additions would be modularized on top of the LTS-maintained kernel.
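To make the layering idea above more concrete, here is a minimal sketch in Python of how a Google-specified base configuration, an SoC vendor fragment and an OEM override could be merged, and how a test could then flag any required base option that was overridden along the way.  This is only an illustration and not how the VTS or the kernel build system actually implements it.  CONFIG_ANDROID_BINDER_IPC and CONFIG_SECURITY_SELINUX are real kernel options, but treating them as the required base set is an assumption made here; CONFIG_SOC_EXAMPLE and CONFIG_OEM_SENSOR are invented names, as are all the helper functions.

```python
# Illustrative sketch only: option choices and helper names are assumptions.

# Assumed "required base" options; the real list is defined by Google.
BASE_REQUIRED = {
    "CONFIG_ANDROID_BINDER_IPC": "y",
    "CONFIG_SECURITY_SELINUX": "y",
}


def parse_fragment(text):
    """Parse kernel-config style text into a {option: value} dict.

    'CONFIG_FOO=y' sets a value; '# CONFIG_FOO is not set' is read as 'n'.
    """
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line and not line.startswith("#") and "=" in line:
            name, value = line.split("=", 1)
            options[name.strip()] = value.strip()
    return options


def layer(*fragments):
    """Merge fragments in order; later fragments override earlier ones."""
    merged = {}
    for fragment in fragments:
        merged.update(parse_fragment(fragment))
    return merged


def violations(config, required=BASE_REQUIRED):
    """Return the required options that are missing or were overridden."""
    return sorted(
        name for name, value in required.items() if config.get(name) != value
    )


base = "CONFIG_ANDROID_BINDER_IPC=y\nCONFIG_SECURITY_SELINUX=y\n"
soc = "CONFIG_SOC_EXAMPLE=y\n"                         # hypothetical SoC option
oem_ok = "CONFIG_OEM_SENSOR=m\n"                       # hypothetical OEM module
oem_bad = "# CONFIG_SECURITY_SELINUX is not set\n"     # overrides a base option

print(violations(layer(base, soc, oem_ok)))
print(violations(layer(base, soc, oem_bad)))
```

The point of the sketch is the ordering: vendor and OEM additions sit on top of the base, and a compliance check only needs to confirm that the base requirements survived the layering.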

Hardware Interface Definition Language (HIDL)

Per the architecture diagram introduced earlier, the Android framework does not know the specifics of the hardware it needs to interface with.  Instead, it relies on HAL modules to provide per-device-type support for each hardware type.  For example, the SurfaceFlinger, which is the framework component responsible for rendering content to the screen, relies on a HAL module type known as “hwcomposer” (which stands for “HardWare Composer”) to do the actual work of rendering screen sections (known as “surfaces”) to the screen.  The Location System service, as another example, can’t provide actual location without, say, the GPS HAL module.  And so on for every hardware type.  Whereas Google specifies the interface between the framework service and the corresponding HAL module type, it’s the responsibility of the device manufacturer to provide the HAL modules that properly implement those interfaces, generally based on reference versions provided by the SoC vendor for their reference design and/or reference board.

Before 8.x/Oreo, Android’s way of specifying its interface requirements for HAL modules was through C header files.  That is, a HAL module author would include a C programming header into their project and proceed to implement the functionality specified by that header in their code.  This module would then be built and shipped as part of the manufacturer’s image for the given device.  At boot time, the Android system would load this HAL module into the framework and the framework would proceed to use it as-is.  Given that HAL module interfaces could and did sometimes change (and in some cases drastically so) from one version to the next, any Android update required programmers to go back to old HAL module implementations, include the new C header from the new Android version, modify their code according to any changes between this new version and the older version, recompile their module as part of a full Android build, and release the updated HAL module as part of a full new system image.  In short, the older working HAL module had to be re-engineered anew for it to be of any use in a new Android version.  Sometimes the work was trivial.  In some key cases, however, it was sufficiently significant as to go undone.  This was further compounded by the “all or nothing” nature of such an endeavor.  Going from version N to version N+1 required updating all HAL modules to the new HAL interface definitions specified in the new version.

To solve this issue, Treble introduces the Hardware Interface Definition Language (HIDL).  HIDL is a major part of Treble and it formalizes the HAL interface definitions by forcing them to be described using a new high-level, non-programming-language specific format with enforced versioning.  If you are familiar with the AIDL format used to describe service interfaces for application development, HIDL follows a similar (but different) format for describing HAL interfaces.  In 8.x/Oreo, for instance, HIDL is used to describe the hwcomposer interfaces version 2.1 and GNSS (which includes GPS) version 1.0, among many other hardware definitions.  Once published by Google, the definition corresponding to a certain version number doesn’t change anymore in the future.  In 9.x/“P”, for example, Android may include, say, hwcomposer v2.2 or v3.0 or GNSS v1.1 or v2.0, but hwcomposer v2.1 and GNSS v1.0 will always remain as they were defined in 8.x.

When Android is freshly ported to a new device, all HAL module implementations should typically begin life by inheriting from the latest HIDL definition found for each hardware type in that version of Android.  For hwcomposer in 8.x/Oreo this would be the v2.1 HIDL definition and for GNSS this would be the v1.0 HIDL definition.  Once that version of Android is shipped, the modules should continue to work as-is without any updating or modification as long as future Android releases still support the HIDL interface versions they implement.  If, for example, the Android framework in 9.x/“P” still supports hwcomposer 2.1 and GNSS 1.0 then the modules created based on those versions in 8.x/Oreo should work as-is in the new version.  That’s one of the greatest benefits of Treble as it fulfills the promise of making HAL module updates easier, faster and less costly.  The only caveat here is that this completely depends on Google continuing to provide support for those HIDL interface versions in future Android releases.  As we saw earlier, however, they have every incentive to keep this promise for as long as possible.
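The compatibility rule just described can be sketched in a few lines of Python.  This is a simplified model, not actual Android code: it treats compatibility as an exact match on interface name and version, and the contents of the “future-hypothetical” release are invented purely to illustrate the caveat about Google dropping support for an interface version.  Only the 8.x entries (hwcomposer 2.1, GNSS 1.0) come from the discussion above.

```python
# Simplified model of HIDL version compatibility; names are illustrative.
from typing import NamedTuple


class HalVersion(NamedTuple):
    name: str
    major: int
    minor: int


# Interface versions each framework release accepts.
# "future-hypothetical" is an invented release used to show the caveat.
FRAMEWORK_SUPPORT = {
    "8.x": {
        HalVersion("hwcomposer", 2, 1),
        HalVersion("gnss", 1, 0),
    },
    "future-hypothetical": {
        HalVersion("hwcomposer", 2, 1),   # still supported: module works as-is
        HalVersion("hwcomposer", 2, 2),
        HalVersion("gnss", 1, 1),         # gnss 1.0 assumed dropped here
    },
}


def modules_needing_rework(device_modules, release):
    """Return the shipped modules whose interface version is no longer accepted."""
    supported = FRAMEWORK_SUPPORT[release]
    return [module for module in device_modules if module not in supported]


# Modules shipped on a device first ported to 8.x/Oreo.
device = [HalVersion("hwcomposer", 2, 1), HalVersion("gnss", 1, 0)]

print(modules_needing_rework(device, "8.x"))
print(modules_needing_rework(device, "future-hypothetical"))
```

In this model only the GNSS module would need attention in the invented future release, whereas under the pre-Treble header-based approach every HAL module had to be revisited on each upgrade.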

In short, this is one aspect where Android’s lack of upgradability was entirely due to Google’s way of engineering the stack, and they’ve taken steps to avoid this being a problem moving forward.

In part IV, we’ll look at the rest of the changes made by Treble to the Android stack, and conclude this series.

About The Author:
Karim Yaghmour is part serial entrepreneur, part unrepentant geek. He’s most widely known for his O’Reilly books: “Building Embedded Linux Systems” and “Embedded Android”. As an active member of the open source community since the mid-90s, he pioneered the world of Linux tracing with the Linux Trace Toolkit.

© 2018, Opersys inc.