Ask HN: Why is Android moving to AOT Java compilation?
27 points by rockdoe on July 4, 2014 | 55 comments
http://anandtech.com/show/8231/a-closer-look-at-android-runtime-art-in-android-l

I'm a bit surprised to see Android move to AOT compilation, together with large claimed performance benefits. As far as I know, aside from ART, all state-of-the-art Java runtimes use JIT. In theory, JIT should have the advantage of live profile data, the ability to optimize speculatively and back off when wrong, etc...

The article lists a bunch of benefits, but they don't really stand up to scrutiny. Overhead from repeatedly having to JIT the code? Cache it - which also removes the battery-life argument. AoT having the advantage of seeing all the code and doing global optimizations? It's just the opposite!

I could see some advantage of AoT for first startup, but that isn't even claimed as an advantage, because the AoT compilation happens at install time, not at packaging time.

It's even stranger as modern Android phones have 4 to even 8 cores. That leaves plenty of horsepower to do the JITing in the background and be even faster.

Now, I can see that ART is faster simply because it's a better, second iteration on Dalvik (which wasn't great). Dalvik was 2-3 times slower than standard JITs. ART seems to make up this difference, but that still leaves the question: why did Google choose to go the AoT route instead of JIT?



Startup time. A JIT doesn't kick into action until after the app has started up and run through some hot code paths. Until then, things are interpreted and thus slower.

Secondly, there are limits to how much you can optimize at run time. On servers there is more leeway in terms of available memory and CPU power, but on mobile you can only optimize so much before the optimization and JIT code-generation overhead itself slows things down.

JIT-generated code has to be generated every time the app runs. Besides, the pages used to store JIT code are process-private - i.e. no sharing.

Compared to all of this, AOT can afford to optimize more, it only has to do it once, and the framework code can be written out to common shared files on disk, enabling sharing of those pages and reducing memory consumption. This gives you speed, battery, and memory benefits over JIT.
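The warm-up behaviour described above can be sketched as a toy tiered runtime. This is only an illustration of the general technique, not ART or Dalvik internals; the class name, method names, and threshold are all invented:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Toy model of tiered execution: methods start on a counted "interpreted"
// slow path and are promoted to a "compiled" fast path once they become hot.
public class TieredRuntime {
    private static final int HOT_THRESHOLD = 3;

    private final Map<String, Integer> invocationCounts = new HashMap<>();
    private final Map<String, IntUnaryOperator> compiledCache = new HashMap<>();

    public int invoke(String name, IntUnaryOperator body, int arg) {
        IntUnaryOperator compiled = compiledCache.get(name);
        if (compiled != null) {
            return compiled.applyAsInt(arg);   // fast tier: no counting overhead
        }
        int count = invocationCounts.merge(name, 1, Integer::sum);
        if (count >= HOT_THRESHOLD) {
            // A real JIT would generate native code here; we just cache the body.
            compiledCache.put(name, body);
        }
        return body.applyAsInt(arg);            // slow tier
    }

    public boolean isCompiled(String name) {
        return compiledCache.containsKey(name);
    }

    public static void main(String[] args) {
        TieredRuntime rt = new TieredRuntime();
        for (int i = 0; i < 5; i++) {
            rt.invoke("square", x -> x * x, i);
        }
        System.out.println("square compiled: " + rt.isCompiled("square"));
    }
}
```

A real JIT replaces the body with generated native code at promotion time; here promotion merely skips the counting overhead, which is enough to show why the first few invocations of every method are the slow ones.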


+1. Battery life is the limiting factor in mobile phones; they usually run unplugged all day.

It is worthwhile to put up with all the engineering pain, because having a phone that lasts all day is what matters most to end users.

Whenever I get a new phone, I have fun with it for about a week, and then turn everything off and try not to use it much so that I can still make and take calls by the end of the day.


> Up until that time things are interpreted and thus slower.

That's true - but not specific to mobile.

> For servers there is more leeway in terms of available memory and CPU power but on mobile you can only optimize so much without slowing things down with optimization and jit code generation overhead.

Which makes me wonder why it's going into new versions of Android (running on 4 to 8 core devices, i.e. having plenty of free horsepower). If anything this is becoming less of an issue.

> JIT generated code has to be generated every time the app runs.

False.

The memory-savings argument might be something - particularly with heavy use of system libraries. Android phones usually have relatively little memory compared to their other resources.


"Which makes me wonder why it's going into new versions of Android (running on 4 to 8 core devices, i.e. having plenty of free horsepower)."

... running on a very taxed battery.


>>JIT generated code has to be generated every time the app runs.

> False.

How so? Do you know of production class JITs that cache generated native code and reuse it every time the process is re-executed?



Nope. I know for sure HotSpot doesn't do it - neither for desktop nor for server. The "Oracle JIT" refers to the JIT that Oracle uses for Java code that executes inside the Oracle database. Completely different environment and usage model.

Read the first comment in your link to get a sense of why it isn't practical to do persistent code caching for a dynamic language like Java.


If you cache everything, you're just doing AoT compilation.

But in general, I would say that the performance advantages of JIT have never really materialized outside of a few specific types of code. In theory, there should be room for them to appear, but in theory, we're all running Itanium chips. In practice, we haven't been able to write software to take advantage of those theoretical possibilities very well.

Look at it this way -- there are a lot of really smart people working at Google on Android, and despite having a completely working JIT environment and centuries of man-effort to devote to it, they went with a complete rewrite of the runtime in order to gain speed. They wouldn't have done that if there were easier paths to efficiency.


> If you cache everything, you're just doing AoT compilation.

Not at all - that's a false dichotomy. You can restore the previously JIT-ed code from the cache and still do new profile-directed optimizations on top of it. Some Java JITs already do this, just not (AFAIK!) HotSpot.

> there are a lot of really smart people working at Google on Android

...and yet their previous runtime was 3 times slower than Sun's, requiring a full rewrite. Sorry, that's just an appeal to authority, not an argument.
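The restore-then-reprofile scheme might look like the following sketch. All names and thresholds are hypothetical, and a real implementation would persist actual native code keyed by a bytecode checksum rather than an in-memory version number:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a persistent JIT code cache that still allows re-optimization:
// compiled artifacts from a previous run are restored at startup, while
// lightweight profiling continues so hot methods can be recompiled later.
public class PersistentCodeCache {
    private final Map<String, Integer> cache = new HashMap<>();        // method -> artifact version
    private final Map<String, Integer> profileCounters = new HashMap<>();

    // Simulate restoring last run's compiled code at startup.
    public void restore(Map<String, Integer> persisted) {
        cache.putAll(persisted);
    }

    public boolean hasCompiled(String method) {
        return cache.containsKey(method);
    }

    // Profiling continues even for cached methods, so the runtime can
    // recompile with better information once enough new data has arrived.
    public void recordInvocation(String method) {
        int n = profileCounters.merge(method, 1, Integer::sum);
        if (n == 100) {                            // made-up reprofiling threshold
            cache.merge(method, 1, Integer::sum);  // "recompile": bump the version
        }
    }

    public int version(String method) {
        return cache.getOrDefault(method, 0);
    }

    public static void main(String[] args) {
        PersistentCodeCache cc = new PersistentCodeCache();
        cc.restore(Map.of("onDraw", 1));           // warm start from a previous run
        System.out.println("warm start: " + cc.hasCompiled("onDraw"));
        for (int i = 0; i < 100; i++) cc.recordInvocation("onDraw");
        System.out.println("reoptimized version: " + cc.version("onDraw"));
    }
}
```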


"You can restore the past JIT-ed code from the cache and do new profile directed optimizations."

Sure, but now you're paying for the compilation/runtime analysis stage more than once. If the JIT compiler is actively doing any profiling of your code, that's time and watts being spent not running the program the user wanted you to run. In order for it to be a net gain, you have to improve the efficiency of the user's program enough through JIT to pay for this repeated cost of doing the analysis however often you do it.

You can profile more intensively and more often, thus improving the generated code, but now you need to improve it a lot. Or you can profile lightly or infrequently, but now you have way less information with which to improve it.

In practice, it seems to be very hard to do significantly better than just running it through a good compiler once.
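The trade-off in this comment can be made concrete with a toy break-even calculation; every number below is made up purely for illustration:

```java
// Toy break-even model: recurring JIT profiling pays off only if the
// speedup it buys outweighs the overhead paid on every run.
public class BreakEven {
    public static void main(String[] args) {
        double aotTimePerRun = 100.0;          // ms: compiled once, no runtime overhead
        double jitTimePerRun = 92.0;           // ms: better profile-guided code...
        double profilingOverheadPerRun = 5.0;  // ms: ...but profiling is paid every run

        double jitTotalPerRun = jitTimePerRun + profilingOverheadPerRun;
        System.out.println("AOT per run: " + aotTimePerRun + " ms");
        System.out.println("JIT per run: " + jitTotalPerRun + " ms");
        System.out.println("JIT wins: " + (jitTotalPerRun < aotTimePerRun));
    }
}
```

With these invented numbers the JIT still comes out ahead; shrink the speedup or grow the overhead and the inequality flips, which is exactly the uncertainty the comment is pointing at.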


> In practice, it seems to be very hard to do significantly better than just running it through a good compiler once.

Why aren't the other Java runtimes doing it, then?


I can only guess, but I'd say it's a combination of factors.

First, the JIT on my quad-core i7 can be pretty intense. I'm more willing to pay the overhead than I am on a device that has a tiny battery and gets hot in my pocket. So the JIT on my desktop is going to be better than the one on my phone.

Second, Java is mostly a server language these days. The types of programs that run on servers are going to be more amenable to profile-guided optimizations than UI heavy event handling code that constitutes a large part of phone apps.

And on desktops and servers, it just doesn't matter that much. Particularly on a server, startup time is pretty much irrelevant. So the downsides to JITs are minimized due to the nature of the workload.


They are, though. There are quite a few JVMs with AOT compilers available.


Good points... I've been doing Java development for 14 years, and it is just plain slower for most things I've done. Sure, you can write a loop that will optimize correctly, but most code executes more slowly or suffers from class-loading slowness or GC delays.

This is why my Android phone needs a quad-core 1.5 GHz processor to feel as fast as an iPhone 4S.


> modern Android phones have 4 to even 8 cores. That leaves plenty of horsepower to do the JITing in the background

Android can't really look at the world like that; the platform is used in everything from wearables to what will be console-class devices.

Android must be able to work at the low end - take Project Svelte, for example, which is important for getting Android (back) to being usable on 512 MB devices.

Also, the most effective power-management strategies typically race to idle and then power off, rather than keeping power consumers alive at some nominal frequency.

> Overhead from repeatedly having to JIT the code? Cache it

Caching in memory is limited by basic memory constraints, and caching to persistent storage is not a great option because of the relative speed of mobile storage and, to a certain extent, wear-levelling effects on flash.

I'd argue that what gets lost in the fog when comparing JITs and virtual machines with static/AOT solutions is that VMs can be very performant when you can ignore startup time and there is a single VM instance to manage on the system.

Take an Android device and look at the number of activities resident at any one time: you'll have 10, maybe 20-plus activities present, each an instance of a virtual machine spawned from Zygote. This makes the memory footprint of the VM critical, because the penalty for restarting activities is often pretty high.


Because making your users wait for your code to compile over and over again on their device is not what you want.

.NET has the same issue, and MS also went this route for Windows Phone; it's now making its way to the desktop.

If it can be done at build time, it should be - I think everyone is realizing that on many fronts. Long startup times and battery power consumed by a JIT make for a worse user experience.


But why not just cache the JITed code? There are third-party runtimes that do it (Excelsior JET), and it's planned in the official Java 8 runtime.

Caching on first run still gives you the ability to do more/new profile-based optimization later.


> But why not just cache the JITed code?

What is the difference between AOT and sufficiently aggressive JIT caching?

> Caching first-run still gives you the ability to do more/new profile-based optimization later.

Well, you could have your runtime run with the overhead of an interpreter, a profiler, and an optimizer, and hope it all turns out well. Or you could just run the machine code without those complexities and know that it is probably close to optimal anyway.


> What is the difference between AOT and sufficiently aggressive JIT caching?

AoT doesn't adapt to changing program profiles.


But it also doesn't spend any CPU time, memory pressure, cache lines, etc., on trying to adapt to changing profiles.


Well, I think the answer to your question then would have to be that not enough advantage can be gained from changing program profiles to compensate for the disadvantages of JIT.


Yeah, but AOT + PGO does.


"It's planned in the official Java 8 runtime."

Do you have any sources on that? Java 8 has been released for a while now, and this is the first time I hear about this.

Given the dynamism of most Java code and the very wide use of runtime-generated code (via cglib or ASM, dynamic proxies, etc.), I think this could actually be quite hard. (Well, the runtime could still cache the generated code: if the dynamically generated bytecode matches the stored hash, it could just reuse the cached code. But maybe there are other issues as well.)
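The hash-check idea in that parenthetical could be sketched like this; the class and proxy names are invented for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Sketch of hash-validated code caching: cached compiled code is reused
// only if the (possibly runtime-generated) bytecode hashes to the same
// value it had when the cache entry was written.
public class HashValidatedCache {
    private final Map<String, String> cachedHashes = new HashMap<>();

    private static String sha256(byte[] bytecode) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(bytecode)) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e);   // SHA-256 is always available
        }
    }

    public void store(String className, byte[] bytecode) {
        cachedHashes.put(className, sha256(bytecode));
    }

    // Reuse the cached native code only when the bytecode is unchanged.
    public boolean canReuse(String className, byte[] bytecode) {
        return sha256(bytecode).equals(cachedHashes.get(className));
    }

    public static void main(String[] args) {
        HashValidatedCache cache = new HashValidatedCache();
        byte[] generated = "proxy-v1".getBytes(StandardCharsets.UTF_8);
        cache.store("com.example.$Proxy0", generated);

        byte[] sameAgain = "proxy-v1".getBytes(StandardCharsets.UTF_8);
        byte[] regenerated = "proxy-v2".getBytes(StandardCharsets.UTF_8);
        System.out.println("unchanged: " + cache.canReuse("com.example.$Proxy0", sameAgain));
        System.out.println("changed: " + cache.canReuse("com.example.$Proxy0", regenerated));
    }
}
```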


Excelsior guy here. Actually, Excelsior JET is a proper AOT compiler. That is, a developer or build engineer runs it on their workstation (or build server) and then ships a native executable to end users. We used to have a caching JIT, and even an option to recompile the cache into a single binary, but concluded it wasn't worth the hassle and deprecated it last year.


If you cache, you lose most of the benefits of a JIT; you might as well AOT once for everyone.


I'm not sure I understand how.

You've gone through the JIT process of live profiling, hot-spot detection, compiling down to native code, etc., then saved the result along the way, so that on the next launch you can start fast with everything already optimized (but also in a position to continue to learn and improve should the opportunity arise).

Your first run or two will be slower; beyond that you should be as fast as your code will go.


What magical extra CPU, memory, cache, and battery have you introduced into that third run to handle the process of "continuing to learn and improve should the opportunity arise"? Because if you're using the same one that's running the user's program, then you won't be running as fast as the code can go.

You have to pay for profile guided optimization. That optimizer is another program running on your CPU. It needs memory, time slices, access to secondary storage, etc. It's worth it only if the user code can be improved so much that the total of the two is smaller than what you would have gotten by just compiling it once. And that's not easy to do for general purpose computation.


Arguments, please. The post you are responding to already rebuts this.


See DCKing's response. You can reduce JIT startup time - but not eliminate it - with multi-stage interpretation, background JIT, and persistent caching. And how much complexity, runtime overhead, and battery use did you add to your runtime to get that far?

JITed languages are not known for their performance; generally speaking, they are faster than interpreted ones and slower than AOT ones. So what performance benefit did the JIT bring compared to a good AOT compiler?


Apart from technical reasons, there might be political or legal ones. One would be to move the platform further away from Oracle's Java. Standard Java uses the JVM; Android uses the Dalvik VM. While technically quite different, one could argue that in both cases Java is compiled to bytecode, and thus that Android is ripping off Java. ART would mean moving to machine code and thus a step away from the VM paradigm. However, I don't think this is a compelling reason, since the disputes have always been about the Java API and parts of the source, not about the architecture.

Another (more paranoid) idea is that ART would nip stuff like Cydia Substrate in the bud. Substrate is basically a framework to hook arbitrary functions in apps or in the OS. For that, it depends on rewriting Dalvik bytecodes. If Google wants to lock the platform down to disallow stuff (bypassing DRM, turning off the camera sound, installing facial-recognition software on Glass, etc.) it would make sense to stop that kind of tool.


There are certified JVMs with AOT compilers available.


Again and again, Apple's much-criticized design choices turn out to be the right ones. In a resource-constrained environment, AOT compilation and reference counting get you to a fairly high level of performance and good levels of abstraction, with enough simplification of the programmer's task, but without taking too much away from UX in the form of lag.

Mobile is always going to be resource-constrained, for the same reason highways are always busy at rush hour: such resources are too valuable to stay unused for long.

(Dalvik will never go to ref counting, but the design trade-offs in its GC will reflect the same goals.)


You're comparing apples and oranges; Apple had the privilege of choosing its own hardware and programming language.

For Android they went with Java because of its popularity, and anyone can use it on their own hardware.

So you have a Java-based language running on ARM, x86, and MIPS - what would you choose? A proven approach like a JIT VM, or AOT?


> You're comparing apples and oranges, Apple had the privilege of choosing its own hardware and programming language.

Google and Apple both used already existing languages and adapted existing kernels. If you're saying that Apple didn't use an existing language or that Google didn't have the option of being more restrictive with hardware, then you are wrong on both counts. Google made choices which turned out to be worse for users over a certain timeframe. They were more attractive to vendors and manufacturers, however.

> For android they went for java because of popularity, and everyone can use it on their own hardware.

Google was optimizing for programmer popularity, not user experience.

> So you have a java based language running on ARM, x86 and MIPS, what would you choose? A proven approach like JIT VM or AOT?

If you're going to go with what's more proven under constrained resources, then you'd go AOT. If you're trying to optimize for UX and lag, but stay high-level enough for rapid development, then AOT with a reference-counting GC.
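For readers unfamiliar with the distinction being drawn here: reference counting reclaims an object deterministically the moment its count reaches zero, rather than at some later collection pause. A toy model of retain/release semantics, written in Java purely for illustration (Java itself uses tracing GC, and all names below are invented):

```java
// Toy model of reference counting: each owner retains the object, and the
// "deallocation" happens deterministically when the last owner releases it.
public class RefCounted {
    private int refCount = 1;          // creation implies one owner
    private boolean freed = false;

    public void retain() {
        refCount++;
    }

    public void release() {
        if (--refCount == 0) {
            freed = true;              // deterministic reclamation point
        }
    }

    public boolean isFreed() {
        return freed;
    }

    public static void main(String[] args) {
        RefCounted obj = new RefCounted();
        obj.retain();                   // a second owner takes a reference
        obj.release();                  // the first owner is done
        System.out.println("after one release: freed=" + obj.isFreed());
        obj.release();                  // last owner done -> reclaimed now
        System.out.println("after last release: freed=" + obj.isFreed());
    }
}
```

The predictable reclamation point is what keeps pause-induced UI lag out of the picture, at the cost of per-assignment counting overhead and no handling of reference cycles.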


I didn't say that; I said that both went with what was the optimal choice at the time.

Why didn't Apple go with Swift from the start? Because Objective-C was used in NeXT and then OS X and so on. When the time was right and Swift was mature, it was introduced.

The same with JIT VMs, which, again, are the norm for Java bytecode.

Also, Google could not be restrictive about hardware, since they are not a manufacturer and the OS is open source. Restrict it to ARM CPUs only? Fine - Intel forks it and ports it to x86 anyway.

Smart people work at both companies, and they evaluate the pros and cons of every approach anyway.


Re: "For android they went for java because of popularity, and everyone can use it on their own hardware."

Only the hardware-independence part is true: Android was targeted at any vendor, while Apple targeted only their own A-series processors.


Power is a bigger constraint on mobile devices than raw speed.


But would AOT have a significant effect in this regard?

In my experience, the major power consumers are the display, its backlight, and the various connectivity interfaces. Oh, and a working GPS can surely drain the phone in an hour or two. Compared to those, any difference in CPU power consumption between AOT and JIT (if there is any, and if it is somehow in favour of AOT) seems totally negligible. But maybe I'm just wrong on this.


What advantage does AoT have over JIT in terms of power? Recompiling is a non-issue, if you can cache AoT code you can cache JIT code.

Not to mention, faster code uses less battery.


Recompiling is an issue on large code bases - that's the problem. Power is secondary; slow app startup is primary, in my experience.

If you cache JIT code, it loses its live-data benefit for the most part; you might as well AOT on your build server instead.


> If you cache JIT code, it loses its live-data benefit for the most part

As pointed out elsewhere: that's just not true.


IMO, more static code means even more predictability in object placement in memory, thus allowing much more aggressive memory optimization: less fragmentation, efficient paging, etc. It also means you have static code and a GC interacting with it - that's pretty new, isn't it?


Not really, these are orthogonal concerns. One could just as easily write an abusive unpredictable memory allocator in Go (native, GC) as in Java (bytecode, GC) or C (native, no GC).


Maybe they're not really moving away from JIT but just Dalvik. After they have the new runtime running well as AOT-compiled, they can move some of the optimizations into a tracer or other profiler, in a hybrid approach.


Maybe it helps with latency? When a user presses a button, you don't want to have to compile some code before you can show them the result.


That's not really an advantage of AoT over JIT+interpreter, is it? AoT requires compiling at first run, the JIT can just use the interpreter (and preferably compile in the background).


Wouldn't the background JIT thread possibly cause some of Android's notorious stuttering in the UI?

Let's be honest: today's smartphones are more powerful than desktops of only 3-5 years ago for most things; they have as much RAM, decent GPUs, as many cores, and nearly as much storage. It's actually pretty miraculous what's been crammed into these tiny mobile computers. When you take the battery out of one, it's pretty astonishing how much computing fits in a few ounces.


> Wouldn't the background JIT thread possibly cause some of Android's notorious stuttering in the UI?

No? Why would it? Unless it's a single-core device with bad scheduling and prioritization, that makes no sense.


Because the interpreter that's running while your code is being compiled isn't slower than the JITed code, right?


The other cores might be busy, or turned off for power reasons.


This is a question for Stack Overflow, not HN. Also, just use Google.


Fairly sure SO would close the question as unsuitable, FWIW.


Closed as primarily opinion-based.



Stack Exchange maybe?



