The Harsh Siberian JVM: A Big Interview About Excelsior JET

Recently, we wrote about the lengths Alibaba went to in order to make life with OpenJDK more bearable. There were comments like "so while we are suffering here with ordinary Java, the Chinese have already built their own special one." Alibaba is impressive, of course, but Russia also has fundamental projects of its own where a "special Java" is being built.

In Novosibirsk, a team has spent 18 years building its own JVM, written entirely independently and in demand far beyond Russia's borders. It is not just another HotSpot fork that does the same thing slightly better. Excelsior JET is a set of solutions that let you do quite different things with AOT compilation. "Pff, GraalVM has AOT," you might say. But GraalVM is still very much a research project, while JET has been proven in production for years.

This is an interview with some of the Excelsior JET developers. I hope it will be especially interesting to anyone who wants to discover new things that can be done with Java, and to people who are curious about the life of JVM engineers and want to take part in it themselves.

In the fall, I flew to one of the Novosibirsk Java conferences, sat down with Ivan Uglyansky (dbg_nsk) and Nikita Lipsky (pjBooms), who are among the JET developers, and recorded this interview. Ivan works on the JET runtime: GC, class loading, multithreading support, profiling, a GDB plugin. Nikita is one of the initiators of the JET project and has taken part in the research and development of almost every component of the product, from the core to product features: OSGi at the JVM level, Java Runtime Slim-Down (JET had Java modules back in 2007), two bytecode verifiers, Spring Boot support, and so on.

Oleg Chirukhin: Can you tell the uninitiated about your product?

Nikita Lipsky: It is surprising, of course, that we have been on the market for 18 years and are still not that well known. We make an unusual JVM. It is unusual in that it bets on AOT compilation, i.e. we try to compile Java bytecode into machine code ahead of time.

Initially, the idea of the project was to make Java fast. Performance is what we went to market with. At that time Java was still interpreted, and static compilation into machine code promised to improve Java performance not by a few times but by orders of magnitude. Even now that JIT exists, if you compile everything in advance you spend no run-time resources on compilation, so the compiler can take more time and memory and produce more optimized code.

Besides performance, a side effect of static bytecode compilation is that it protects Java applications from decompilation. After compilation no bytecode remains, only machine code, and that is much harder to decompile into source code than Java bytecode. In practice it is impossible: you can disassemble it, but you will not regenerate the source. From bytecode, on the other hand, Java sources can easily be regenerated down to the variable names, and there are many tools for that.

In addition, it was once assumed that Java is on every computer: you distribute your Java applications as bytecode, and they run the same way everywhere. In reality things were not that rosy, because one user has one Java and another has a different one. If you distributed a program as bytecode, all sorts of surprises could follow, from the user simply being unable to start your program to strange behavior that you could not reproduce on your own machine. Our product has always had the advantage that you distribute your Java application as an ordinary native application. You do not depend on whatever runtime the user has (or does not have) installed.

Ivan Uglyansky: In short, you do not need to require that Java be installed.

Oleg: A dependency on the operating system remains, right?

Nikita: That's right. Many people object that if you compile to native code, the slogan "Write once, run anywhere" stops working. But that is not so. I mention from time to time in my talks that it says "write once", not "build once". You can build your Java application for each platform, and it will work everywhere.

Oleg: Really everywhere?

Nikita: Wherever it is supported. Ours is a Java-compatible solution: if you write in Java, your code will work wherever Java works. If you use the version compiled by us, it works where we provide support: Windows, Linux, Mac, x86, amd64, ARM32. Where we do not, you can still run your applications on regular Java, so in that sense the portability of your Java applications does not suffer.

Oleg: Are there constructs that behave differently on different platforms? Parts of the platform that are not fully implemented, some standard libraries?

Ivan: It happens, but it is not JET-specific. Look, for example, at the AsynchronousFileChannel implementation in the JDK itself: it is completely different on Windows and on POSIX, which is logical. Some things are implemented only on certain platforms, such as SCTP support (see sun.nio.ch.sctp.SctpChannelImpl on Windows) and SDP (see sun.net.sdp.SdpSupport). There is no particular contradiction in that either, but it does mean that "Write once, run anywhere" is not entirely true.
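An editorial illustration of Ivan's point, as a minimal sketch: the public AsynchronousFileChannel API is the same everywhere, while the concrete class behind it is picked by the JDK for the platform it runs on. The class name ChannelImplProbe and the temp-file prefix are made up for this example, and the printed class name depends on the JDK and operating system, so no output is shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelImplProbe {
    /** Opens an asynchronous channel on a temp file and reports which class implements it. */
    static String implementationName() {
        try {
            Path tmp = Files.createTempFile("probe", ".bin");
            try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(tmp, StandardOpenOption.READ)) {
                // The public type is identical on every platform;
                // the concrete class is chosen by the JDK at run time.
                return channel.getClass().getName();
            } finally {
                // Runs after the channel is closed, so the delete also works on Windows.
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(implementationName());
    }
}
```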

As for the JVM implementation itself, the differences between operating systems can of course be huge. Take just the fact that on OS X the Cocoa event loop has to run on the main thread, so startup there differs from the other platforms.

Nikita: Still, from the outside it all looks and works almost the same for the user.

Oleg: What about performance? Is it the same on all platforms?

Nikita: There is a difference. For example, the Linux file system works better and faster than the Windows one.

Oleg: And porting between processors?

Nikita: That is a fun activity! The whole team suddenly starts porting. The fun usually lasts six months to a year.

Oleg: Does it happen that a piece of code runs slower on another platform?

Ivan: That may be because we simply have not had time to implement or adapt some optimization. It worked well on x86, then we moved to AMD64 and did not get around to adapting it. Because of that, the code may be slower.

Another performance example: ARM has a weak memory model, so you need many more barriers there for everything to work correctly. On AMD64 some places worked essentially for free, because the memory model there is different. On ARM you have to insert more barriers, and that is not free.
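To make the barrier remark concrete, here is a small sketch (not JET code) of the kind of Java source where a JVM has to guarantee ordering. The Java memory model requires that a write made before a volatile write is visible after the matching volatile read. On x86/AMD64 the hardware already provides most of that ordering, while on ARM the JVM typically has to emit explicit barrier instructions around such accesses. The class and method names are illustrative.

```java
class Publication {
    private int payload;             // plain field, no ordering guarantees of its own
    private volatile boolean ready;  // volatile accesses are where a JVM must enforce ordering

    void publish(int value) {
        payload = value;  // ordinary write
        ready = true;     // volatile write: the write above must become visible first
    }

    /** Spins until the flag is set, then returns the published payload. */
    int awaitPayload() {
        while (!ready) {  // volatile read
            Thread.yield();
        }
        return payload;   // happens-before guarantees the published value is seen
    }

    /** Publishes a value from one thread and reads it from another. */
    static int roundTrip(int value) {
        Publication p = new Publication();
        Thread writer = new Thread(() -> p.publish(value));
        writer.start();
        int seen = p.awaitPayload();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
    }
}
```

The source is identical on every platform; only the machine code the JVM generates for the two volatile accesses differs, which is where the extra cost on ARM comes from.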

Oleg: Let's talk about a hot topic: "Java on embedded devices".
Suppose I am building a plane whose flight control runs on a Raspberry Pi. What typical problems do people run into when doing that? And how can JET, and AOT compilation in general, help?

Nikita: A plane on a Raspberry Pi is an interesting topic, of course. We did an ARM32 port, and JET now runs on the Raspberry Pi. We have a certain number of embedded customers, but not enough to talk about their "typical" problems. Still, it is not hard to guess what problems they might have with Java.

Ivan: What are the problems with Java on the Raspberry Pi? Memory consumption, for one. If the application and the JVM need too much of it, life on a poor Raspberry Pi is hard. In addition, fast startup matters on embedded devices, so that the application does not spend too long warming up. AOT solves both problems well, so we are working on improving our embedded support. Speaking of the Raspberry Pi specifically, BellSoft is worth mentioning: they are now actively working on this with HotSpot. Regular Java is fully available there.

Nikita: In addition, embedded systems have few resources, so there is no room for a JIT compiler. AOT compilation by itself therefore improves performance.

Also, embedded devices often run off the mains, on battery. Why spend battery on a JIT compiler when you can compile everything in advance?

Oleg: What features do you have? I understand that JET is a very large, complex system with a lot in it. You have AOT compilation, so you can compile an executable. What else? What are the interesting components worth talking about?

Ivan: We have a number of features related to performance.

I recently gave a talk about PGO, a relatively new feature of ours. We have a profiler built right into the JVM, plus a set of optimizations driven by the profile it collects. After recompiling with a profile we often get a serious performance boost, because execution profile information is added to our powerful static analyses and optimizations. It is a somewhat hybrid approach: taking the best of JIT and AOT compilation.

We have two great features for speeding up startup further. The first: you monitor the order in which memory pages are touched at startup and lay out the executable accordingly when linking.

Nikita: The second: when you launch the executable, you see which pieces of memory get pulled in, and then, instead of pulling them in in arbitrary order, you load the right pieces right away. That also speeds up startup considerably.

Ivan: These are separate product features.

Nikita: The first is called Startup Optimizer, and the second Startup Accelerator. They work differently. To use the first, you run the application before compilation, and it records the order in which your code was executed. The code is then linked in that order. The second involves running your application after compilation: we then know what was loaded from where, and on subsequent launches we load everything in the right order.
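The idea behind the first feature can be sketched in a few lines. This is only an illustration of "record the execution order, then lay the code out in that order", not how JET's linker actually works; the class StartupLayout and the method names in the lists are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class StartupLayout {
    /**
     * Returns the methods in link order: those seen in the startup trace first,
     * in order of first use, followed by all remaining methods in their original order.
     */
    static List<String> layout(List<String> allMethods, List<String> startupTrace) {
        Set<String> ordered = new LinkedHashSet<>();  // keeps insertion order, drops repeats
        for (String method : startupTrace) {
            if (allMethods.contains(method)) {        // ignore trace entries we cannot place
                ordered.add(method);
            }
        }
        ordered.addAll(allMethods);                   // cold code goes after the startup path
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        List<String> methods = Arrays.asList("a", "b", "c", "d");
        List<String> trace = Arrays.asList("c", "a", "c");  // recorded during a test run
        System.out.println(layout(methods, trace));
    }
}
```

Code that runs at startup ends up contiguous in the file, so fewer pages have to be read from disk before the application becomes responsive.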

In addition to performance-related features, there are a number of product features that make JET more convenient to use.

For example, we can package Windows distributions: one step, and you have a Windows installer. You can distribute Java applications like real native applications. There is much more. For instance, there is a classic problem with AOT compilers and Java when an application uses its own class loaders. If you have a custom class loader, it is unclear which classes we should AOT-compile, because the logic that resolves references between classes can be anything. As a result, no Java AOT compiler except ours works with non-standard class loaders. Our AOT compiler has special support for certain classes of applications where we know how their custom loaders work and how references between classes are resolved. For example, we support Eclipse RCP, and we have customers who write desktop applications on Eclipse RCP and compile them. We support Tomcat, which also uses custom loaders, so you can compile Tomcat applications with us. We also recently released a version of JET with Spring Boot support out of the box.
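Why custom class loaders are hard for an AOT compiler can be shown with a deliberately extreme sketch. The loader below is hypothetical and unrelated to Eclipse RCP, Tomcat, or JET: it invents a class at run time for any name in an assumed "generated." package by emitting a minimal class file with no members. Nothing on disk tells a static compiler that these classes will exist, and two loader instances produce two distinct classes with the same name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RuntimeNamedLoader extends ClassLoader {
    /** Builds the bytes of a minimal public class with the given name and no members. */
    static byte[] emptyClass(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);  // magic number
            out.writeShort(0);         // minor version
            out.writeShort(52);        // major version (Java 8 class file format)
            out.writeShort(5);         // constant pool count = 4 entries + 1
            out.writeByte(1); out.writeUTF(name.replace('.', '/'));  // #1 Utf8: this class
            out.writeByte(7); out.writeShort(1);                     // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");      // #3 Utf8: superclass
            out.writeByte(7); out.writeShort(3);                     // #4 Class -> #3
            out.writeShort(0x0021);    // access flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);         // this_class
            out.writeShort(4);         // super_class
            out.writeShort(0);         // interfaces count
            out.writeShort(0);         // fields count
            out.writeShort(0);         // methods count
            out.writeShort(0);         // attributes count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Arbitrary run-time rule: only names in the "generated" package are synthesized.
        if (!name.startsWith("generated.")) {
            throw new ClassNotFoundException(name);
        }
        byte[] b = emptyClass(name);
        return defineClass(name, b, 0, b.length);
    }

    /** Loads a class through a fresh loader instance. */
    static Class<?> load(String name) {
        try {
            return new RuntimeNamedLoader().loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "generated.Plugin" + (args.length > 0 ? args[0] : "0");
        Class<?> first = load(name);
        Class<?> second = load(name);
        System.out.println(first.getName() + ", same class object: " + (first == second));
    }
}
```

Real frameworks are far less arbitrary, which is what makes the special-case support described above feasible: if the compiler knows a framework's lookup rules, it can predict which loader will define which class.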

Oleg: And which server runs underneath?

Nikita: Whichever you like. Whatever Spring Boot supports will work: Tomcat, Undertow, Jetty, WebFlux.

Ivan: It should be mentioned that for Jetty we do not support its custom class loaders.

Nikita: Jetty as a standalone web server has a custom class loader. But there is also Jetty Embedded, which can work without custom loaders, and Jetty Embedded runs fine on Excelsior JET. Inside Spring Boot, Jetty will work in the next version, like any other server supported by Spring Boot.

Oleg: So is the user interface to JET essentially javac and java, or something else?

Ivan: No. We offer users several ways of using JET. First, there is a GUI in which the user sets all the options, presses a button, and the application gets compiled. To build an installer so that end users can install the application, the user clicks through a few more buttons. But this approach is a bit dated (the GUI was developed back in 2003), so Nikita now develops plugins for Maven and Gradle, which are much more convenient and familiar to modern users.

Nikita: You add seven lines to pom.xml or build.gradle, run mvn jet:build, and a finished executable comes out the other end.

Oleg: Everyone loves Docker and Kubernetes these days. Can you build for them?

Nikita: Docker is the next topic. The Maven/Gradle plugins have a packaging parameter, and I can add packaging of applications for Docker.

Ivan: That is still work in progress. But in general, we have tried running JET-compiled applications in Docker.

Nikita: It works. Without Java. Take a bare Linux, drop a JET-compiled application into it, and it starts.

Oleg: And what is the output with Docker packaging? A container, or do you put the executable into a Dockerfile?

Nikita: For now you simply write a JET-specific Dockerfile, which is three lines. After that everything works through the regular Docker tools.

I am playing with microservices right now. I compile them with JET and launch them; they discover each other and communicate. Nothing had to be done in the JVM for that.

Oleg: Cloud providers have now launched things like Amazon Lambda and Google Cloud Functions. Can JET be used there?

Nikita: I think we need to go to the providers of all these services and tell them that if they use us, all their lambdas will run faster. But so far that is just an idea.

Oleg: And they really would run faster!

Nikita: Yes, most likely, though there is more work to be done in that direction.

Oleg: The problem I see here is the compilation time of a lambda. What is your compilation time like?

Ivan: It exists, and it is a problem that users of an ordinary JVM with JIT never think about. Usually you just launch the application and it works (albeit slowly at first because of compilation). With us there is a separate step, AOT compilation of the entire application. That can be painful, so we are working on speeding this stage up. For example, we have incremental recompilation, where only the changed parts of the application are recompiled. We call it smart recompilation. Nikita and I worked on exactly that during the last development period, pairing.

Nikita: Java poses certain problems for smart recompilation, for example circular dependencies within Java applications: they are everywhere.

Ivan: There are many problems that are not obvious until you think the task through. If you have a static AOT compiler that performs various global optimizations, it is not so easy to work out what exactly needs to be recompiled. You have to remember all those optimizations, and they can be nontrivial. For example, the compiler may have done some tricky devirtualization or inlined something goodness knows where. If you changed one class or one JAR, that does not mean that only it needs to be recompiled. No, it is all much more tangled. You have to compute and remember every optimization the compiler made.

Nikita: In effect it is the same thing a JIT does when it decides to deoptimize, only in an AOT compiler. And the decision is not about deoptimization but about recompilation.
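The bookkeeping Ivan describes can be sketched as a dependency graph. This is a conservative toy model under stated assumptions, not JET's algorithm: every cross-class optimization (an inlined call, a devirtualized call site) is recorded as "class X relied on class Y", and a change to Y triggers recompilation of everything that transitively relied on it. The class RecompilationPlanner and its methods are hypothetical names. The visited set is what keeps the circular dependencies Nikita mentions from looping forever.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RecompilationPlanner {
    // dependents.get(Y) = classes whose compiled code embeds assumptions about Y
    private final Map<String, Set<String>> dependents = new HashMap<>();

    /** Remembers that compiling compiledClass used knowledge of reliedOnClass (e.g. inlining). */
    void recordOptimization(String compiledClass, String reliedOnClass) {
        dependents.computeIfAbsent(reliedOnClass, k -> new TreeSet<>()).add(compiledClass);
    }

    /** Returns the changed class plus every class that transitively relied on it. */
    Set<String> toRecompile(String changedClass) {
        Set<String> result = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(changedClass);
        while (!work.isEmpty()) {
            String current = work.poll();
            if (result.add(current)) {  // false if already visited, which breaks cycles
                work.addAll(dependents.getOrDefault(current, Collections.emptySet()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RecompilationPlanner planner = new RecompilationPlanner();
        planner.recordOptimization("A", "B");  // A inlined a method of B
        planner.recordOptimization("C", "A");  // C inlined a method of A
        planner.recordOptimization("B", "C");  // circular dependency
        planner.recordOptimization("D", "E");
        System.out.println(planner.toRecompile("B"));
        System.out.println(planner.toRecompile("E"));
    }
}
```

A real compiler would track finer-grained facts (which method, which assumption) so that fewer classes are invalidated, but the shape of the problem is the same.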

Oleg: About the speed of smart recompilation. If I take Hello, World, compile it, and then change two letters in the word Hello...

Nikita: It compiles quickly.

Oleg: As in, not a minute?

Nikita: Seconds.

Ivan: But it still depends on whether we include platform classes in the executable.

Oleg: Can you do without that?

Nikita: Yes, by default our platform is split into several DLLs. We implemented Jigsaw at the very beginning :-) That is, we split the Java SE classes into components a very long time ago.

Ivan: The point is that our runtime and the platform classes are all precompiled by us and, yes, split into DLLs. When you run a JET-compiled application, the runtime and the entire platform are present as those DLLs. So in practice it looks like this: you take "Hello, world" and compile it, and you are actually compiling a single class. That takes a few seconds.

Nikita: About 4 seconds with global; a couple of seconds without. Global is when you link all the platform classes, compiled into native code, into one large file.

Oleg: Can I do some kind of hot reload?

Nikita: No.

Oleg: No? That's sad. But one could generate a single DLL, link it again, and then...

Nikita: Since we have a JIT (by the way, yes, we have a JIT too!), you can of course load pieces of code, unload them, and load them back. For example, all the code that runs through our JIT, say in OSGi, can be reloaded if you like. But the hot reload that HotSpot has, where you sit in the debugger and change code on the fly, we do not have. It cannot be done without a loss of performance.

Oleg: At the development stage, performance is not so important.

Nikita: At the development stage you use HotSpot, and you do not need anything else. Ours is a solution compliant with the Java specification. If you use HotSpot and rely on hot reload while debugging, that is fine. You debug, compile with JET, and everything works as it did on HotSpot. That is how it should be, and usually is. If not, you write to support and we sort it out.

Oleg: And what about debugging with JET? JVM TI? How much of it is supported?

Ivan: One of the core values of JET is security: user code is not accessible to anyone, because everything is compiled to native code. That is somewhat at odds with the question of whether we support JVM TI. If we did, any skilled developer who knows how JVM TI works could get access to anything very quickly. We do not support JVM TI at present.

Nikita: It is optional according to the specification. Platform implementers may or may not support it.

Ivan: There are many reasons. It is optional and it undermines security, so users would not thank us for it. It is also very HotSpot-specific inside. Not long ago our guys tried supporting JVM TI as a pilot project and got to a certain stage, but they kept running into the fact that it is heavily tailored to HotSpot. In principle it is possible, but what business problem would it solve?

Nikita: If something worked on HotSpot but does not work on JET, that is not your problem. It is ours.

Oleg: Got it. And do you have any additional features that HotSpot lacks and that need to be controlled directly? Ones I would like to play with and dig into.

Nikita: Exactly one feature. We have an official platform extension called Windows Services: you can compile Java applications into real Windows services that are managed through the standard Windows service tools, and so on. In that case you have to pull in our own JAR in order to compile such applications.

Oleg: That is not the biggest problem.

Nikita: The interface for these services is very simple. And for debugging, you run your application not as a Windows service but through main. Whether any service-specific debugging is needed, I do not know.

Oleg: Suppose a new developer who previously worked with HotSpot wants to build something with JET. Does he need to learn anything, to understand anything about life in general or about JET?

Ivan: He needs to copy seven lines into pom.xml if Maven is used and then run jet:build; if JET is installed on the machine, the Java application will be compiled into an executable. The idea is that you do nothing special: you just build and get the result.

Nikita: Or, if you know the command line that starts your application, you paste it into our GUI and it figures the rest out. You give the build command, you get the executable, and that's it.

Ivan: It is very simple; you do not need to invent anything new. Take how HotSpot's AOT works: as you yourself pointed out in a talk, you have to obtain a list of all the methods, put it into a file, pass it along, transform it. We need nothing of the sort. You just take your HotSpot launch command line, paste it into the GUI or add a small snippet to pom.xml, and, hooray, after a while (because this is AOT compilation) you get an exe file that you can run.

Oleg: Do I need to relearn how to work with the GC?

Nikita: Yes. We have our own GC, and it has to be tuned differently than in HotSpot. We have very few public knobs.

Oleg: Is there a "make it good" or "don't" knob?

Ivan: There is such a knob. There is a "set Xmx" knob, a "set the number of workers" knob... There are many knobs, but why would you need to know about them? If something unexpected happens, write to us.

Of course, much in our GC is configurable. We can tune the young generation and the frequency of GC cycles. All of that is tunable, but these are not general-purpose options. We understand that people know -Xmx and specify it, so we parse it. A few more common options work with JET, but in general everything is different.

Nikita: We also have a public option that lets you set how much of your application's time the GC is allowed to consume.

Oleg: In percent?

Nikita: In tenths of a percent. We realized that whole percent are too much, too coarse.

Ivan: If you are spending whole percent on GC, something is wrong.

Oleg: But all those enterprise people spend their working hours opening a printout of GC activity with a graph and meditating over it. Can one meditate with you?

Nikita: We have special people inside the company who do the meditating.

Ivan: We have our own log format, so people are unlikely to make sense of it. Although who knows? If they stare at it long enough, perhaps they can. Everything is written there. But most likely it is better to send it to us, and we will meditate.

Oleg: But naturally you will do that for money, whereas looking on your own is free.

Nikita: If you are our customer, you have a support contract, and of course we do it as part of support.

Ivan: But if the problem is an obvious one, we may tell you even without a support contract.

Oleg: What if it is a bug?

Nikita: If it is a bug, then of course we accept reports from everyone, always. There is no "we won't fix the bug until you buy". If it is a bug, we fix it. In general, users love our support. They usually say it is of very high quality and that they have never seen anything like it anywhere. Perhaps that is because we staff support ourselves, in rotation.

Oleg: Who is "we"?

Nikita: Developers, JVM engineers.

Oleg: How often?

Nikita: It varies. Usually we take turns for two weeks each. But if you have to deliver a mega-feature by a certain deadline, you get immunity from support for that time so that you can focus on the task.

Ivan: In theory everyone should take turns. But sometimes someone heroically takes a double dose and does support for a month or more instead of two weeks. Personally I like doing support, but if you do it for too long you start to forget what you do in life, because all you do is answer emails. And you still want to hack on the JVM. So after a while you need to go back.

Oleg: Do you have a hierarchy? How many levels of management? 20 or more?

Nikita: Come on, there are only 10 of us on the team.

Ivan: 15 with students.

Oleg: What I mean is: does management take part in this, or just watch?

Nikita: About bosses. Of course there is a head person, and there are many local leads.

Oleg: Does each person have their own area?

Nikita: It is a person who takes on some big task and leads it. That rotates too. You can take a big task and lead it once, and the next time you will be led.

Ivan: In general, we do not have a deep hierarchy. We have one level of management. And as for watching from above, absolutely not. Sometimes our management heroically takes over support when a release is near.

Nikita: Management is one person; his name is Vitaly Mikheyev.

Oleg: Can he be seen anywhere? At conferences or elsewhere?

Nikita: My own conference talks actually began when the St. Petersburg Java Day came to Novosibirsk, organized by Belokrylov, then at Oracle and now at BellSoft. He asked whether we wanted to speak, and Vitaly and I spoke together. Afterwards I suggested that we keep speaking together, but he decided he did not want to anymore.

Oleg: What was the talk?

Nikita: "The Story of a JVM in Pictures".

Ivan: I remember that talk; I was either an intern or had just stopped being one. I still think it is one of the best talks I have seen.

Nikita: Maybe it was the "premiere effect": when you speak for the first time in your life, you put a lot of energy into it.

Oleg: What did you talk about?

Nikita: The two of us covered JET from beginning to end in 20 minutes.

Oleg: Two of you, in just 20 minutes? Usually, with several speakers, a talk only gets longer.

Nikita: We went through all the key topics in a very brisk and lively way.

Oleg: Vanya, did it influence your decision about what to do next, whether to work at the company?

Ivan: It may well be!

Oleg: People often ask why go to conferences and talks, why listen to something when you can google it.

Nikita: In my opinion, attending conferences is very useful. Live contact and direct participation are nothing like watching on YouTube. It matters that you take part directly rather than virtually; you come into contact with the primary source. The difference is about the same as between attending a live concert and listening to a recording. Probably even bigger, because how much can you talk to your favorite performer after a concert? At a conference you can find the speaker you need and ask him everything you want, no problem.

Ivan: By the way, the decision to "stay at the company" is another story. We have a rather unique system for recruiting staff and interns. We take interns in their second or third year, usually from the mechanics and mathematics or physics departments. Interns are immersed very deeply in the subject; mentors help them understand the various VM mechanisms, implementation details, and so on. That is worth a lot.

After a while they start getting real assignments, writing real production code. You make changes to the JVM, go through reviews, tests, and benchmarks (to check that nothing has regressed), and commit for the first time. After that you focus on your thesis. Usually the thesis is also a cool part of the JVM, something experimental, research-oriented. That kind of work is usually done by students. Then you perhaps productize it, or perhaps not. I have never seen so much time invested in interns anywhere else. I personally value that a great deal, because I remember how much time was spent on me. The output is a JVM engineer. Where else is there such a factory producing JVM engineers?

Oleg: Are you not afraid of information leaking through interns, who then describe everything openly in their theses?

Nikita: It is not a problem: we are afraid of leaks abroad, and nobody is going to read much in Russian. That is our protection, obfuscation :-)

Ivan: This year a student of mine defended his thesis; I was his supervisor, and we did have the problem that not everything could be written in the thesis. We were disclosing part of a completely closed topic, and our reviewer, for example, asked why we did not discuss certain things. I told him that we could not talk about that, it is highly secret. "I can whisper it in your ear, but it must go no further." So we are a little wary of it after all, but in general you can find a lot of interesting things in the theses.

Oleg: And how does one find theses done with Excelsior?

Nikita: You go to the dean's office and ask to read such-and-such a thesis.

Ivan: There is a list of successfully defended theses on our site, but only titles, without links. And we have no unsuccessfully defended ones.

Oleg: They either die or defend.

Ivan: Exactly! Our graduates' average grade is 5.0; there are some who simply never get to the thesis.

Oleg: In this training, this factory of JVM engineers, what are the stages of becoming a Jedi? When do they hand you a lightsaber, and when can you start swinging it?

Nikita: Pretty quickly. Young people become Jedi fast these days; I am proud of them.

Ivan: By the way, Nikita was my first mentor when I was an intern. As for the internship: first you go through selection: you solve problems and come to one or more interviews. If all goes well, you become an intern. At first you read scientific papers in English, either on topics most likely related to your future thesis or simply on JVM-related subjects, to get into the context. You read them and write an essay explaining what is going on in them. The essay gets a very harsh review; some scientists would envy such proofreading and preparation. The result is full-fledged articles in Russian. After that it is time for you to write code, and you are gradually introduced to how everything works.

Nikita: And that's where science ends.

Ivan: Not necessarily!

Nikita: It is a little sad: in the early 2000s we published papers that were accepted by ACM journals.

Ivan: Well, we still do that. What is the problem?

Nikita: What was our last paper in an ACM journal?

Ivan: In ACM, well, we just have not tried! Now we blog. It is the same thing, except that people actually read it. It is a similar activity.

Returning to the Jedi topic: after that you do something in production for the first time, under strict supervision. It has to be a small task unrelated to your future thesis.

Oleg: For example, writing a comment.

Ivan: No. Real functionality. The student must make a first real contribution that stays in the product. Then he starts on the thesis: a research project that later becomes the thesis. Then, ideally, he should productize that research; that is the fourth stage. It does not always happen, because not every piece of research needs to be, or can be, productized right now. In any case, a new JVM engineer emerges after these stages.

Oleg: And which stages are the hardest? What do people spend a lot of time on? Is it mathematics, or understanding the standard, or some other deep thing? What does the structure of the knowledge consist of?

Ivan: I would say it is quite hard to absorb that much context. What distinguishes our internship from others is that you cannot just slap together a task and ship it to production. Before that you need to understand how everything is put together, see at least part of the big picture, take a great many factors into account, and in general understand a lot about the JVM world. I remember how I learned; of course I did not have that understanding at first. I remember having flashes of insight: "So that is how it works!" And little by little everything came together into one picture.

Oleg: Is that picture specific to JET?

Nikita: No, it is specific to the JVM.

Ivan: Some parts explain to you why JET is JET. I remember once going to one of the mentors and asking why we have two compilers: an optimizing one and a baseline one. I did not really understand what for. And that was a moment when I had a flash of insight into how JET works.

Oleg: And why two compilers?

Ivan: One is optimizing and powerful. The other optimizes less but works faster and more reliably.

Nikita: The optimizing compiler may break some day, while the baseline one must never break.

Ivan: Besides, we also use the baseline compiler as our JIT. Yes, we have a JIT too; it is needed for correctness. But we do not run the optimizing compiler in it, we use baseline instead. Such is the side effect. If something suddenly goes wrong in the optimizing compiler, we can fall back to baseline during compilation, and it is sure to work.
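The fallback arrangement Ivan describes is a common pattern and can be sketched in a few lines. This is a toy illustration with strings standing in for methods and machine code; FallbackCompiler and both compiler stand-ins are hypothetical, and the assumption that the baseline path never fails is taken from the interview.

```java
import java.util.function.UnaryOperator;

class FallbackCompiler {
    /** Tries the optimizing compiler first and falls back to baseline if it fails. */
    static String compile(String method,
                          UnaryOperator<String> optimizing,
                          UnaryOperator<String> baseline) {
        try {
            return optimizing.apply(method);
        } catch (RuntimeException e) {
            // The optimizer gave up; the baseline compiler is assumed never to fail.
            return baseline.apply(method);
        }
    }

    public static void main(String[] args) {
        // Stand-in for an optimizer that cannot handle some inputs.
        UnaryOperator<String> fragileOptimizer = m -> {
            if (m.contains("tricky")) {
                throw new IllegalStateException("optimizer gave up on " + m);
            }
            return "optimized:" + m;
        };
        UnaryOperator<String> baseline = m -> "baseline:" + m;

        System.out.println(compile("simpleMethod", fragileOptimizer, baseline));
        System.out.println(compile("trickyMethod", fragileOptimizer, baseline));
    }
}
```

The same simple compiler can then double as the JIT for code that only appears at run time, which is the side effect mentioned above.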

Oleg: Do you compile any UI applications? Startup speed matters there.

Nikita: Of course. For a long time we positioned ourselves specifically as a desktop solution. Most of our users are still on Windows.

Ivan: But as far as I have heard, attitudes are shifting toward other platforms. For example, those same desktop developers use Macs.

Oleg: Can you compile for Android?

Nikita: We looked into it. You can compile: there is a Linux emulator for Android, and we support Linux. So you can compile for that Linux on Android and run some Swing application on a phone or tablet. There have been successful experiments.

Oleg: And without Linux?

Nikita: Compile dex files? No.

Oleg: But Android has a Native SDK. Compile a DLL and hook it up.

Nikita: There were unsuccessful experiments; something did not take off, unfortunately. On Android the .so libraries are hacked in some way and do not work the way they do on Linux, and we did not have time to figure out exactly what the differences are. But there is an idea to make it possible to use real Java on Android, not the Android flavor: compile it into a .so library, and then that piece of Java could interface with Dalvik like an ordinary native dynamic library. Then you could put 90 percent of your application into that library, for example all the business logic.

Oleg: And then you could also compile for iOS and build a universal platform for running everything?

Nikita: Yes. Start with Android, then move on to iOS. But that requires a lot of investment, so we are not going there yet, unfortunately.

Oleg: Considering there are 15 of you.

Nikita: We have wanted iOS for about ten years. It needs big investments, and we still cannot bring ourselves to commit.

Ivan: The problem of limited resources is real, unfortunately.

Oleg: Tell us about how the team is organized?

Ivan: It is worth saying that there are two strong camps: compiler developers and runtime developers.

Oleg: But you compile to native code. Aren't the compiler and the runtime the same thing?

Nikita: The runtime is a full-fledged JVM, and there is a lot in a JVM: threads, reflection, synchronization, garbage collection, memory management. These are hefty pieces, and they are very complex. In places they are much less trivial than the compiler. Not every compiler engineer on our team can implement synchronization efficiently, for instance. And if you make a mistake somewhere in the GC, things are bad, because debugging does not work there. No debugging is possible there at all: you sit and meditate at night, debugging in your sleep, trying to work out what happened.

Oleg: I saw that Shipilev wrote, for example, all sorts of visualizations of how Shenandoah works. He had this kind of debugging where colors showed how blocks move around. And surely you can attach GDB to native code.

Ivan: Of course, that is what we do. Runtime and compiler people have somewhat different approaches to development, debugging and so on. For example, since our compilers are written in managed languages, Java and Scala, you can develop them and (most of the time) debug them right from IDEA, which is very convenient. In the runtime you have only two allies: GDB and debug printing.

Nikita: But, by the way, you can write unit tests for the runtime, and those can be debugged in IDEA too!

Ivan: Yep. But the mindset is also somewhat different between compiler and runtime development in general. I would say that in the runtime the main difficulty is how many things can be happening at the same time; you need to understand that and even feel it. It is this strange sense of time and of the life of the whole JVM. In the compiler it is different, but also very interesting. Pasha Pavlov and I discussed this recently, and he put it really well: in the compiler the difficulties are of another kind; they arise from "a combinatorial explosion of possible states and situations due to the (often entirely non-obvious) hidden complexity of the mathematical model".

All in all, these are two very different worlds, but together they make up one whole: the entire JVM.

Oleg: Which side of the barricades are you on?

Ivan: I am a runtime guy.

Nikita: I am a compiler guy. But lately I have been doing product features. Spring Boot support, for instance, means you have to touch practically every JET component a little to make everything work.

Ivan: When needed, we dive into other people's components and do things there. Besides, there are people who are on neither side of the barricades; they are true hybrids, half compiler engineers and half runtime engineers. For example, we are building a new JIT right now, and that is full-fledged work on both sides of the barricades. It has to be made to run in both the compiler and the runtime.

Oleg: So a JIT is both compiler and runtime at once?

Ivan: Yes, you could put it that way.

Nikita: So there is some specialization, but very often you have to work on the boundary and get your head around everything nearby.

Oleg: And tell us, what does a working day look like at your company? I could describe a web developer's daily routine, for example, but that is dreadful, so let's skip it and go straight to you.

Ivan: Actually, there is nothing super special here. We have goals and plans that we can carry out. Sometimes the plan is spelled out precisely. Sometimes you simply need to get some task done. There is an issue tracker with bugs that need fixing. You come in, see which task of yours is the hottest, and start working on it.

As for what is unusual and sets us apart: suppose you have written some code, in the runtime, in the compiler, or somewhere else. Normally a programmer then runs it and sees whether it works. With us, after that you need to run the check-in test, which by itself takes 1.5 to 2 hours.

Nikita: Our sources used to live in Visual SourceSafe, where a commit was called a check-in. Before you could check in to VSS, you had to pass the check-in test. We have since moved off VSS, but it is still called the check-in test.

Ivan: It is a test that builds all of JET, that is, the whole JVM with your changes, and runs the basic tests. It takes 1.5 to 2 hours. Only after that do you have a ready JET to check whether your change worked. Your bug probably showed up in some particular application, so you then have to compile that application, walk through the scenario, and see whether it worked. How many attempts like that do you get per day? Not many. So we try to write good-quality code from the start.

Oleg: Is JET written in Scala?

Ivan: One of the compilers is written in Scala.

The JET compiler itself is written in Scala, and it is compiled by JET. That is, it is built with the previous version of JET. The result is an executable that is then used. Imagine: first you take the Scala sources and run scalac over them, which gives you bytecode…

Nikita: Most of the time in the check-in test goes to compiling the Java platform itself into machine code, because all of it has to be compiled, it is huge, and it takes an hour and a half to compile.

Oleg: Could you split it into units and distribute it across a build cluster, the way people do with C++?

Nikita: Good idea, we think about it from time to time. I have ideas on how to do it, but we never get around to parallelizing our beloved compilation.

Ivan: Another unusual thing in a JVM engineer's day: sometimes you spend a very long time debugging a bug because it is sporadic. My record was a bug that showed up once a year. Trouble struck, I thought I understood what had happened, I checked, it did not reproduce, I committed. A year went by and it happened again. I called it the "year-old bug". That is the kind of thing you get. I hope I have fixed it now, but a year has not passed yet, so I do not know :-) Bugs like that are very unpleasant. I had a fairly large GC-related project which, because of unaccounted-for cases or mistakes, could cause a huge number of sporadic bugs. You find out something went wrong only after the fact, when you look at the wreckage. You have no reproduction scenario, and no stress testing will help you. Nothing helps at all.

Oleg: So what do you do, turn to Kashpirovsky?

Ivan: In cases like that I leave a trap. I work out what exactly happened from the current crash log. If it happens again, I will print a huge amount of information about the problem into a separate file. Then I tell the QA engineer: if this ever happens again, look for that file, it is very important. It contains new clues about what exactly went wrong.

Oleg: Don't these traps affect performance?

Ivan: I make them so that they do not. The impact is minimal; they start costing performance only once the problem has occurred. It seems to work. With the GC there are all sorts of problems. You are lucky if you see that some object is corrupted. But the problem can be much more complicated: the GC did something wrong, it collected an object or did not collect one. As a result, after two hours of the application running, you get some crash. What was it? Which method?
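The "trap" pattern from the last two answers, a check that is nearly free until the problem occurs and then dumps everything it can into a separate file, might look roughly like this. It is a minimal sketch written in Java purely for illustration; the source does not say how the real traps are implemented, and `EXPECTED_MARK`, `checkHeader` and the report fields are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class TrapSketch {

    /** Hypothetical invariant: a healthy object header carries this marker. */
    static final int EXPECTED_MARK = 0xCAFE;

    /**
     * Cheap check on the fast path; the expensive dump runs only on failure.
     * Returns true if the trap fired.
     */
    public static boolean checkHeader(int mark, long address, Path dumpFile) throws IOException {
        if (mark == EXPECTED_MARK) {
            return false; // fast path: one comparison, no allocation, no I/O
        }
        // Slow path: record as much context as possible for later analysis.
        List<String> report = List.of(
                "trap: unexpected object header",
                "address=0x" + Long.toHexString(address),
                "mark=0x" + Integer.toHexString(mark),
                "thread=" + Thread.currentThread().getName());
        Files.write(dumpFile, report);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dump = Files.createTempFile("trap-", ".log");
        System.out.println(checkHeader(0xCAFE, 0x1000L, dump)); // healthy object
        System.out.println(checkHeader(0xDEAD, 0x2000L, dump)); // corrupted object
        System.out.println(Files.readAllLines(dump));
    }
}
```

The point of the split is that the normal path pays for a single comparison, while all formatting and file I/O is deferred to the moment the invariant is already broken.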

Nikita: You were off by one bit in one place, and there is no telling when it will happen next.

Ivan: At that point no logs, no heap dump, nothing will help you. There is no information. The only information you have is that it blew up once. After that, all you can do is leave traps. And that is pretty much it.

Nikita: Or convince yourself that a particle probably flew in from space and flipped a single bit in memory.

Oleg: And if it shows up consistently, can something be done?

Ivan: Then it is some very simple bug that we will fix quickly.

Oleg: And what about bugs in the JIT?

Nikita: The JIT does have simple bugs. You look at the assert that fired and at the stack trace, and it is clear what happened. You fix it on the spot.

Ivan: It is worth saying that our testing is very good. We run the JCK. If something passes the JCK, that already means it is decently written. On top of that, we have a large number of real applications. In testing we use UI tests that click through GUI applications compiled with JET. There are also just some tricky scenarios. Recently we made a test that took a particular project from GitHub and checked it out one commit at a time, building it with JET: check out the next commit, rebuild with JET, and so on. Our QA department works very well, and everything is well covered by tests.

Oleg: But a tester has to understand what is being tested, right? Doesn't it turn out that a tester needs higher qualifications than a developer?

Ivan: Our QA Lead is a runtime engineer. He now also does QA, but he writes runtime code too. I think that speaks for itself. Yes, he knows what is going on and how things should be. He knows how to test and understands the specifics.

Oleg: How many years does it take to become a QA engineer on your project? Roughly speaking, in web development QA is unfortunately often the job with the most monotonous work, where people click buttons and check that they turned blue. With you it is probably different somehow.

Ivan: The situation is the same as with support. We cannot hire a person off the street to sit on support, because they would not have the qualifications, so we do support duty ourselves from time to time. Accordingly, the QA Lead does not write tests all the time; he leads. And although the people he leads are not JVM engineers, he is one himself, and that says a lot.

Oleg: What does your testing look like? Is there some prepared reference code that the QA people simply compare against, and if it differs, there is a problem?

Nikita: We have the JCK, tests from Oracle. Oracle's testers supply them to us, which is good, because there is a test written for practically every letter of the specification. Besides the JCK, usually some real applications are taken, compiled, clicked through, and then that clicking is automated.

Oleg: What does it take to get the JCK?

Nikita: Pay Oracle money. Beyond that, you have to be making a JVM that differs from Oracle's in some very unusual way.

Ivan: There are different ways. For example, if you contribute to OpenJDK, you will be given the JCK.

Nikita: If you have been given some special star saying you are not just anybody but have already done something good, then yes, they will probably give it to you. To become a licensee, you need to prove that your product differs from Oracle's product in some non-trivial way. For example, that it runs on a platform Oracle does not support.

Oleg: And if it differs trivially?

Nikita: Then they have the right to refuse. We managed to convince them that we offer something beyond what already exists; this is called "value add". So we were given the JCK as part of a commercial Java license, the paid kind. They gave us the JCK and said that if we did not pass it within three months, we should close up shop.

Oleg: That is brutal! Then why don't all the people who make OpenJDK forks get together and write their own version?

Ivan: That is a pretty expensive undertaking. Imagine you have 70 thousand tests. We have a certain secret set of tests that are not in the JCK, and maybe it would be good if they were, but we do not publish them. Because it may be that neither JET nor HotSpot would pass them. There are subtle places like that in the specification.

Oleg: Will you give a talk on the subtle places in the specification sometime? That would be interesting.

Nikita: Misha Bykov from Oracle and I gave a joint talk about JVM support. There I covered some subtle places in the specification. It was "Stories from the life of JVM support" at the Joker conference.

Oleg: Is there a recording?

Nikita: Yes, of course. About the JVM specification: here is a real recent case. Some of our JVM engineers noticed that a piece of code was not written according to the specification and decided to rewrite it, then sent it to me for review. I read it and realize that the bug is in the specification.

Oleg: Maybe the bug is in the JCK?

Nikita: No, in the specification. I sent a description of it to the relevant mailing list, and Alex Buckley, the person currently in charge of the Java and JVM specifications, fixed the bug in JVM Specification 12.

Ivan: The JCK has bugs too.

Nikita: When we started passing the JCK, we began sending Sun incorrect tests in batches. We wrote huge walls of text proving that a test was incorrect, and they had to exclude those tests.

Oleg: Is proving a test incorrect almost harder than writing one?

Nikita: Of course it is harder. It is very painstaking work. There were, for example, incorrect CORBA tests. You sit there, a complex multithreaded test is trying to connect somewhere, and you come up with and explain the various situations. I wrote something like 6 pages on why a CORBA test was incorrect. The point is that the multithreading picture may differ from the one on HotSpot, and you have to prove that a situation the test does not expect is possible.

Oleg: Can this be converted into a doctoral thesis: six reports, one doctorate?

Nikita: I sat for two weeks writing it all out. CORBA used to be a mandatory part, but now it has finally been removed. Swing is mandatory too. The JCK has automated Swing tests, and then there are several hundred tests where everything has to be clicked through by hand. So after every release, one dedicated person sits down and clicks through all those forms on every platform.

Ivan: These are called the "JCK interactive tests".

Nikita: To release the product, we must be able to provide proof at any moment that we pass the JCK. And for that we need to pass these interactive tests.

Oleg: Is this person recorded on video or what?

Nikita: No, he verifies that everything works, and then the result is encrypted in such a way that it serves as proof that we really ran these tests.

Oleg: You could just write a script that does it.

Ivan: That is the thing: no. There are tests that are clicked through by robots. Those are graphical too.

Oleg: So there are tests that cannot be automated at all?

Nikita: It is written that they must not be passed by a robot.

Oleg: So it all relies on honesty. I wonder how many others besides you are that honest. I picture this: a hundred years go by, and artificial intelligence does everything. And the artificial intelligence has one human who runs the JCK, because everything is written in Java.

I have a feeling we should wrap up. It would be good to come up with some kind of afterword and a wish for the readers.

Nikita: Do interesting things; it gives you drive. It does not matter what you do: if you enjoy it, that is wonderful. My motivation for working on this project is that over the years the interesting problems somehow never run out. There are constant challenges, challenge upon challenge. I would advise everyone to look for work with challenges, because then life is more fun.

Oleg: Is that somehow related to the fact that all the work involves virtual machines or systems programming? Why a challenge?

Nikita: Virtual machines are continuous research. Of course, you would not spend twenty years working on some C compiler, because there the new knowledge will eventually run out. But with Java it somehow does not! It is still unclear how to do GC right. How to build a compiler right is also still unclear to everyone, both the HotSpot people and us.

Ivan: It is not just research; it is the cutting edge of compiler and runtime science. Our guys are currently reworking the compiler backend, and for that they are digging into the latest research papers (naturally, for a real JVM these need adaptation and extension). Usually you read a paper, and then what? Nothing. The paper describes a prototype, someone measured something on it, and that is all. Here you have the chance to implement it on a real JVM and see how it behaves in practice. That is really cool; few places in the world offer such an opportunity.

A lot of cool things are happening in Java right now. The faster release cadence and the megaprojects that are appearing are all great. Loom and Metropolis, for instance, are very substantial and very fun projects in the Java ecosystem. Even without reference to any particular JVM, the point is simply that progress is charging ahead; that is cool, and it is worth watching, understanding and admiring. Watch a talk about Loom and just look at how it behaves in practice, at the fact that such prototypes work. It gives hope for the future. So my parting call to everyone is not only to follow new technologies and understand them, but also to take part in their development. It really is all very cool.

A moment of advertising. Since you have read this far, Java clearly matters a lot to you, so you should know that the JPoint Java conference will take place in Moscow on April 5-6. Nikita Lipsky will be speaking there, along with other well-known Java speakers (Simon Ritter, Rafael Winterhalter). You can also submit a talk proposal and try yourself as a speaker; the Call for Papers is open until January 31. Up-to-date information about the program and tickets is available on the official conference website. To judge the quality of the other talks from the previous conference, you can watch the archive of video recordings on YouTube. See you at JPoint!

Source: https://habr.com/ru/post/437180/
