PVS-Studio for Java

In the seventh version of the PVS-Studio static analyzer, we added support for the Java language. It's time to tell you a little about how we started to make support for the Java language, what we did and what further plans. And, of course, the article will show the first tests of the analyzer on open source projects.

PVS-Studio

For Java developers who have never heard of the PVS-Studio tool, I’ll give a very brief description of it.

PVS-Studio is a tool for detecting errors and potential vulnerabilities in the source code of programs written in C, C ++, C # and Java. Works on Windows, Linux and macOS.

PVS-Studio performs static code analysis and generates a report that helps the programmer to find and fix defects. For those who are interested in exactly how PVS-Studio is looking for errors, I suggest you to read the article " Technologies used in the PVS-Studio code analyzer to search for errors and potential vulnerabilities ."

Start

I could come up with a smart story, as we had been thinking for two years about which next language to support in PVS-Studio. The fact that Java is a reasonable choice, based on the high popularity of this language and so on.

However, as it happens in life, everything was decided not by a deep analysis, but by an experiment :). Yes, we were thinking in which direction to further develop the PVS-Studio analyzer. Considered such programming languages as: Java, PHP, Python, JavaScript, IBM RPG. And we were inclined towards the Java language, but the final choice was still not made. Those whose eyes are stuck on an unfamiliar IBM RPG, refer here to this note , from which everything will become clear.

At the end of 2017, colleague Yegor Bredikhin looked at what ready-made libraries for parsing code (in other words, parsers) for new directions interesting to us. And I came across several projects for parsing Java code. Based on Spoon , he quickly managed to make a prototype analyzer with a pair of diagnostics. Moreover, it became clear that we could use some C ++ analyzer mechanisms using SWIG in a Java analyzer. We looked at what happened and realized that our next analyzer would be for Java.

Thanks to Yegor for his initiative and active work he has done on the Java analyzer. He described exactly how the development was in the article " Development of a new static analyzer: PVS-Studio Java ".

Competitors?

There are many free and commercial static code analyzers for Java in the world. Enumerating them all in the article does not make sense, and I just leave a link to " List of tools for static code analysis " (see section Java and Multi-language).

However, I know that first of all we will be asked about IntelliJ IDEA, FindBugs and SonarQube (SonarJava).

IntelliJ IDEA

IntelliJ IDEA has a very powerful static code analyzer built into it. Moreover, the analyzer is developing, and its authors are closely monitoring our activities. With IntelliJ IDEA we will be the hardest. We will not be able to surpass IntelliJ IDEA in diagnostic capabilities, at least for now. Therefore, we will try to concentrate on our other advantages.

Static analysis in IntelliJ IDEA is, first of all, one of the design environment chips, which imposes certain restrictions on it. We are free in what we can do with our analyzer. For example, we can quickly adapt the analyzer to the specific needs of the customer. Fast and deep support is our competitive advantage. Our clients directly communicate directly with programmers who are developing a particular part of PVS-Studio.

There are many opportunities in PVS-Studio to integrate it into the development cycle of large old projects. This is an integration with SonarQube . This is a mass suppression of analyzer messages, which allows you to immediately start using the analyzer in a large project to track errors only in new or modified code. PVS-Studio is being integrated into the continuous integration process. I think it is these and other possibilities that will help our analyzer find a place under the sun in the Java world.

Findbugs

Project FindBugs abandoned . But it should be remembered for the reason that it is perhaps the most well-known free static analyzer of Java code.

The successor to FindBugs is the SpotBugs project. However, it is less popular, and what will happen to it is also not entirely clear.

In general, we believe that although FindBugs has been and remains extremely popular, and besides, also a free analyzer, we should not think about it. This project is just a quiet and calm thing of the past.

PS By the way, now PVS-Studio can also be used for free when working with open source projects.

SonarQube (SonarJava)

We believe that we do not compete with SonarQube, but supplement it. PVS-Studio integrates with SonarQube, which allows developers to find more errors and potential vulnerabilities in their projects. How to integrate the tool PVS-Studio and other analyzers into SonarQube, we regularly tell at master classes that we hold at various conferences ( example ).

How to run PVS-Studio for Java

We have made available to users the most popular ways to integrate the analyzer into the build system:

Plugin for maven;
Plugin for gradle;
Plugin for IntelliJ IDEA

At the testing stage, we met many users who had self-written assembly systems, especially in mobile development. They liked the ability to run the analyzer directly, listing the source code and the classpath.

You can find detailed information on all methods of launching the analyzer on the " How to run PVS-Studio Java " documentation page.

We could not ignore the SonarQube code quality control platform, which is so popular among Java developers, so we added Java language support to our SonarQube plugin .

Future plans

We have a lot of ideas that require additional study, but some plans characteristic of any of our analyzers look like this:

Creation of new diagnostics and refinement of existing ones;
The development of dataflow analysis;
Increase reliability and ease of use.

Perhaps we will find time to adapt IntelliJ IDEA plugin for CLion. Hello C ++ developers reading about Java analyzer :-)

Examples of errors found in open source projects.

I will not be me if I do not show in the article any errors found using the new analyzer. It would be possible to take some large open Java project and write a classic article with an analysis of errors, as we usually do .

However, I immediately anticipate questions, and can we find something in projects such as IntelliJ IDEA, FindBugs, and so on. Therefore, I simply have no way out, and I will start with these projects. So, I decided to quickly check and write out some interesting examples of errors from the following projects:

IntelliJ IDEA Community Edition . I think it is not necessary to explain why this project was chosen :).
SpotBugs . As I wrote earlier, the FindBugs project does not develop. Therefore, let's take a look at the SpotBugs project, which is the successor to FindBugs. SpotBugs is a classic static Java code analyzer.
Some of the projects of the company SonarSource, which develops software for continuous code quality control. Let's look at the SonarQube and SonarJava project .

To write about bugs in these projects is a difficult task. The fact is that these projects are very high quality. Actually, this is not surprising. As our observations show, the code of static analyzers is always well tested and tested using other tools.

Despite all this, I have to start with these projects. I won't have a second chance to write something about them. I am sure that after the release of PVS-Studio for Java, the developers of the listed projects will use PVS-Studio and start using it for regular or, at least, for periodic checks of their code. For example, I know that Tagir Valeev ( lany ), one of the developers of JetBrains, who is engaged in the static code analyzer IntelliJ IDEA, while I write an article, is already playing with the Beta version of PVS-Studio. He has already written to us about 15 letters with bug reports and recommendations. Thank you, Tagir!

Fortunately, I do not need to find as many errors as possible in one particular project. Now my task is to show that the PVS-Studio analyzer for Java did not appear in vain and will be able to replenish the range of other tools designed to improve the quality of the code. I just skimmed through the analyzer reports and wrote out a few errors that seemed interesting to me. If possible, I tried to write out errors of different types. Let's see what happened.

IntelliJ IDEA, integer division

private static boolean checkSentenceCapitalization(@NotNull String value) { List<String> words = StringUtil.split(value, " "); .... int capitalized = 1; .... return capitalized / words.size() < 0.2; // allow reasonable amount of // capitalized words }

PVS-Studio warning: V6011 [CWE-682] The '0.2' literal of the 'double' type is compared to the value of the 'int' type. TitleCapitalizationInspection.java 169

As intended, the function should return true if less than 20% of words begin with a capital letter. In fact, verification does not work, since integer division occurs. As a result of the division, only two values can be obtained: 0 or 1.

The function returns a false value only if all words begin with a capital letter. In all other cases, the division will result in 0, and the function will return the true value.

IntelliJ IDEA, suspicious loop

 public int findPreviousIndex(int current) { int count = myPainter.getErrorStripeCount(); int foundIndex = -1; int foundLayer = 0; if (0 <= current && current < count) { current--; for (int index = count - 1; index >= 0; index++) { // <= int layer = getLayer(index); if (layer > foundLayer) { foundIndex = index; foundLayer = layer; } } .... }

PVS-Studio warning: V6007 [CWE-571] Expression 'index> = 0' is always true. Updater.java 184

First look at the condition (0 <= current && current <count) . It is executed only if the value of the variable count is greater than 0.

Now look at the cycle:

 for (int index = count - 1; index >= 0; index++)

The index variable is initialized by the expression count - 1 . Since the count variable is greater than 0, the initial value of the index variable is always greater than or equal to 0. It turns out that the loop will be executed until the index variable overflows.

Most likely, this is just a typo and should be done not increment, but decrement of the variable:

 for (int index = count - 1; index >= 0; index--)

IntelliJ IDEA, Copy-Paste

 @NonNls public static final String BEFORE_STR_OLD = "before:"; @NonNls public static final String AFTER_STR_OLD = "after:"; private static boolean isBeforeOrAfterKeyword(String str, boolean trimKeyword) { return (trimKeyword ? LoadingOrder.BEFORE_STR.trim() : LoadingOrder.BEFORE_STR).equalsIgnoreCase(str) || (trimKeyword ? LoadingOrder.AFTER_STR.trim() : LoadingOrder.AFTER_STR).equalsIgnoreCase(str) || LoadingOrder.BEFORE_STR_OLD.equalsIgnoreCase(str) || // <= LoadingOrder.BEFORE_STR_OLD.equalsIgnoreCase(str); // <= }

PVS-Studio Warning: V6001 [CWE-570] There are identical sub-expressions 'LoadingOrder.BEFORE_STR_OLD.equalsIgnoreCase (str)' to the left | operator. Check lines: 127, 128. ExtensionOrderConverter.java 127

Good old effect of the last line . The programmer hurried and, having multiplied a line of code, forgot to fix it. As a result, twice the string str is compared with BEFORE_STR_OLD . Most likely, one of the comparisons should be with AFTER_STR_OLD .

IntelliJ IDEA, typo

 public synchronized boolean isIdentifier(@NotNull String name, final Project project) { if (!StringUtil.startsWithChar(name,'\'') && !StringUtil.startsWithChar(name,'\"')) { name = "\"" + name; } if (!StringUtil.endsWithChar(name,'"') && !StringUtil.endsWithChar(name,'\"')) { name += "\""; } .... }

PVS-Studio Warning: V6001 [CWE-571] There are identical sub-expressions '! StringUtil.endsWithChar (name,' "')' operator. JsonNamesValidator.java 27

This code snippet verifies that the name is in single or double quotes. If this is not the case, double quotes are added automatically.

Because of a typo, the ending of the name is checked only for the presence of double quotes. As a result, the name taken in single quotes will be processed incorrectly.

Name

 'Abcd'

due to the addition of extra double quotes will turn into:

 'Abcd'"

IntelliJ IDEA, incorrect protection against array overrun

 static Context parse(....) { .... for (int i = offset; i < endOffset; i++) { char c = text.charAt(i); if (c == '<' && i < endOffset && text.charAt(i + 1) == '/' && startTag != null && CharArrayUtil.regionMatches(text, i + 2, endOffset, startTag)) { endTagStartOffset = i; break; } } .... }

PVS-Studio warning: V6007 [CWE-571] Expression 'i <endOffset' is always true. EnterAfterJavadocTagHandler.java 183

The subexpression i <endOffset in the condition of the if statement does not make sense. The variable i is always less than endOffset , which follows from the condition of the loop.

Most likely, the programmer wanted to protect himself from going over the line boundary when calling functions:

text.charAt (i + 1)
CharArrayUtil.regionMatches (text, i + 2, endOffset, startTag)

In this case, the subexpression to check the index should be: i <endOffset - 2 .

IntelliJ IDEA, duplicate check

 public static String generateWarningMessage(....) { .... if (buffer.length() > 0) { if (buffer.length() > 0) { buffer.append(" ").append( IdeBundle.message("prompt.delete.and")).append(" "); } } .... }

PVS-Studio warning: V6007 [CWE-571] Expression 'buffer.length ()> 0' is always true. DeleteUtil.java 62

This can be a harmless redundant code, as well as a serious error.

If the duplicate check appeared by chance, for example, during refactoring, then there is nothing wrong with that. The second check can simply be deleted.

But another scenario is possible. The second check should be completely different and the code behaves differently than intended. Then this is a real mistake.

Note. By the way, there are a lot of different redundant checks. And often it is clear that this is not a mistake. However, it is also impossible to call analyzer messages with false positives. For clarification, here is an example, also taken from IntelliJ IDEA:

 private static boolean isMultiline(PsiElement element) { String text = element.getText(); return text.contains("\n") || text.contains("\r") || text.contains("\r\n"); }

The analyzer says that the text.contains ("\ r \ n") function always returns false. And indeed, if the symbol "\ n" and "\ r" are not found, then there is no point in searching for "\ r \ n". This is not an error, and the code is bad only because it works a little slower, performing a meaningless search for a substring.

How to deal with such code, in each case, it is up to the programmers to decide. When writing articles, as a rule, I simply do not pay attention to such code.

IntelliJ IDEA, something is wrong

 public boolean satisfiedBy(@NotNull PsiElement element) { .... @NonNls final String text = expression.getText().replaceAll("_", ""); if (text == null || text.length() < 2) { return false; } if ("0".equals(text) || "0L".equals(text) || "0l".equals(text)) { return false; } return text.charAt(0) == '0'; }

PVS-Studio warning: V6007 [CWE-570] Expression '"0". Equals (text)' is always false. ConvertIntegerToDecimalPredicate.java 46

This code accurately contains a logical error. But I find it difficult to say what the programmer wanted to check, and how to correct the flaw. Therefore, here I just point out the meaningless check.

In the beginning, it is checked that the string must contain at least two characters. If not, the function returns false .

Next comes the “0” .equals (text) check. It is meaningless, since a string cannot contain only one character.

In general, something is wrong here and the code should be corrected.

SpotBugs (FindBugs successor), iteration limit error

 public static String getXMLType(@WillNotClose InputStream in) throws IOException { .... String s; int count = 0; while (count < 4) { s = r.readLine(); if (s == null) { break; } Matcher m = tag.matcher(s); if (m.find()) { return m.group(1); } } throw new IOException("Didn't find xml tag"); .... }

PVS-Studio warning: V6007 [CWE-571] Expression 'count <4' is always true. Util.java 394

As planned, the search for the xml tag should be carried out only in the first four lines of the file. But because you forgot to increment the count variable, the entire file will be read.

Firstly, it can be a very slow operation, and secondly, somewhere in the middle of the file, something can be found that will be perceived as an xml tag, but it will not be one.

SpotBugs (FindBugs successor) mashing values

 private void reportBug() { int priority = LOW_PRIORITY; String pattern = "NS_NON_SHORT_CIRCUIT"; if (sawDangerOld) { if (sawNullTestVeryOld) { priority = HIGH_PRIORITY; // <= } if (sawMethodCallOld || sawNumericTestVeryOld && sawArrayDangerOld) { priority = HIGH_PRIORITY; // <= pattern = "NS_DANGEROUS_NON_SHORT_CIRCUIT"; } else { priority = NORMAL_PRIORITY; // <= } } bugAccumulator.accumulateBug( new BugInstance(this, pattern, priority).addClassAndMethod(this), this); }

PVS-Studio warning: V6021 [CWE-563] The variable is assigned but it is not used. FindNonShortCircuit.java 197

The value of the priority variable is set depending on the value of the sawNullTestVeryOld variable. However, this does not play any role. Next, the priority variable will be assigned a different value in any case. Explicit error in the logic of the function.

SonarQube, Copy-Paste

 public class RuleDto { .... private final RuleDefinitionDto definition; private final RuleMetadataDto metadata; .... private void setUpdatedAtFromDefinition(@Nullable Long updatedAt) { if (updatedAt != null && updatedAt > definition.getUpdatedAt()) { setUpdatedAt(updatedAt); } } private void setUpdatedAtFromMetadata(@Nullable Long updatedAt) { if (updatedAt != null && updatedAt > definition.getUpdatedAt()) { setUpdatedAt(updatedAt); } } .... }

PVS-Studio: V6032 It is odd that the body of the method 'setUpdatedAtFromDefinition' is fully equivalent to the body of another method 'setUpdatedAtFromMetadata'. Check lines: 396, 405. RuleDto.java 396

The setUpdatedAtFromMetadata method uses the definition field. Most likely, the metadata field should be used. This is very similar to the consequences of an unsuccessful copy-paste.

SonarJava, duplicates when initializing a Map

 private final Map<JavaPunctuator, Tree.Kind> assignmentOperators = Maps.newEnumMap(JavaPunctuator.class); public KindMaps() { .... assignmentOperators.put(JavaPunctuator.PLUSEQU, Tree.Kind.PLUS_ASSIGNMENT); .... assignmentOperators.put(JavaPunctuator.PLUSEQU, Tree.Kind.PLUS_ASSIGNMENT); .... }

PVS-Studio warning: V6033 [CWE-462] An item with the same key 'JavaPunctuator.PLUSEQU' has already been added. Check lines: 104, 100. KindMaps.java 104

Twice the same key-value pair is placed on the map. Most likely, this was due to carelessness, and in fact there is no real error. However, this code needs to be checked in any case, as you may have forgotten to add some other pair.

Conclusion

What conclusion can there be? I invite everyone, without delay, to download PVS-Studio and try to test your work projects in the Java language! Download PVS-Studio .

Thank you all for your attention. Hopefully, soon we will please readers with a series of articles devoted to testing various open Java projects.

If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. PVS-Studio for Java .