Using com.intellij.codeInspection.dataFlow

Hi, this one is probably for you too, Maxim. If I can't get the degenerator
working, another way for my plugin to pass code to Soot is to compile a
method's code to bytecode and pass that (Soot takes source or class
files as input). This would probably also be faster. It would be totally
cool if I could use the "compiler" in codeInspection.dataFlow for this. Unfortunately,
there are no public methods for getting at the parts of an instruction (how
do I know what's being pushed, etc.), and I get an NPE when I try to use the
memory state.

Would it be feasible to convert the instructions produced by DataFlowRunner
to bytecode instructions? If so, would it be possible for you to either move
it to the OpenAPI, or add some public getters for the instruction fields?

If you could do that, I could make IDEA an even more powerful machine
for code analysis, because IDEA would then support every inspection
that Soot supports. It might make implementing some things easier for you
guys as well.

Thanks very much,
-Keith


5 comments

No, that won't be possible, since the dataflow machine doesn't operate on concrete
data items but rather on equivalence classes of the abstract memory
state.
BTW, which inspections does Soot support that are missing from IDEA?

-


Maxim Shafirov
http://www.jetbrains.com
"Develop with pleasure!"



Soot provides a framework for analysis which makes implementing many analyses
simple, or at least simpler than doing it by hand. IDEA provides nothing
at all to OpenAPI developers. One example of analyses based on Soot
is http://indus.projects.cis.ksu.edu/, which I can imagine being used
to produce some fancy inspections and features for IDEA.

Anyway, are you sure it wouldn't be possible? Soot doesn't work on such equivalence
classes, but it seems that DfaValues and Instructions could be converted
to bytecode in some manner that Soot would work with. Soot doesn't need to
know everything. For example, in my plugin's use of Soot, it doesn't know
anything about other methods: each method call is assumed to be correct, and
Soot assumes that a method with the given argument types and return
type exists.


Keith,
the fact that Soot uses bytecode instructions as its input does not mean it doesn't
use some internal representation for its purposes. It also does not mean
it provides the same analysis accuracy we do.
For instance, try this: would it find any problems in the following
code?

public void foo(int x, int y) {
    int z;
    if (x == 2 && y == 3) {
        z = 1;
    } else if (x == 2) {
        z = 2;
    } else {
        z = 3;
    }

    if (x == 2 && z == 3) {
        // Unreachable: z == 3 only when x != 2.
    }
}

That's quite simple, isn't it? Well, here's another one from real life:

public void foo(Object o) {
    if (o instanceof PsiElement) {
    }
    else if (o instanceof PsiClass) {
        // Unreachable.
    }
}
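
To make the second example concrete, here is a minimal runnable sketch. The two interfaces are hypothetical stand-ins for the real PSI types (declared locally so the snippet compiles without IDEA on the classpath); the only thing assumed about them is that PsiClass is a subtype of PsiElement. Every PsiClass already matches the first test, so the second branch can never be taken.

```java
// Hypothetical stand-ins for the real IntelliJ PSI types; only the
// subtype relationship (PsiClass extends PsiElement) matters here.
interface PsiElement {}
interface PsiClass extends PsiElement {}

class InstanceofOrderSketch {

    // Returns a label for the branch that handles the given object.
    static String branchFor(Object o) {
        if (o instanceof PsiElement) {
            return "first";
        } else if (o instanceof PsiClass) {
            // Never reached: any PsiClass is also a PsiElement,
            // so it was already handled by the branch above.
            return "second";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(branchFor(new PsiClass() {}));   // a PsiClass instance
        System.out.println(branchFor(new PsiElement() {})); // a plain PsiElement
        System.out.println(branchFor("not PSI"));           // unrelated object
    }
}
```

The compiler accepts this code without complaint, which is why catching it takes a type-aware analysis rather than a syntax check.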

> Anyway, are you sure it wouldn't be possible?

Well, don't you think this isn't the best way to get bytecode out of source
code? Using IDEA's data flow analyzer as the medium seems weird to me.

-


Maxim Shafirov
http://www.jetbrains.com
"Develop with pleasure!"


> Keith, the fact Soot uses bytecode instructions as its input does not
> mean it doesn't use some internal representation for its purposes. It

It uses a three-address system called Jimple, which contains about 10 instruction
types, and an idea of locals (no stack).

> also does not mean it provides same analysis accuracy we do. For
> instance, try the following. Would it find any problems in the
> following code?:
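
As a rough illustration of what "three-address, locals, no stack" means, here is a small sketch that turns a stack-style instruction sequence into three-address statements. It does not use Soot's API and its output is not real Jimple syntax; the mini instruction set (push, add, mul, store) and the `$t` temporary names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative only: not Soot's API, and the output is not real Jimple.
class ThreeAddressSketch {

    // Converts stack-style ops ("push a", "add", "store c", ...) into
    // statements where every intermediate value lives in a named local.
    static List<String> toThreeAddress(List<String> stackOps) {
        Deque<String> stack = new ArrayDeque<>(); // names of values on the simulated operand stack
        List<String> out = new ArrayList<>();
        int temp = 0;                              // counter for fresh temporaries
        for (String op : stackOps) {
            String[] parts = op.split(" ");
            switch (parts[0]) {
                case "push":
                    stack.push(parts[1]);
                    break;
                case "add":
                case "mul": {
                    String right = stack.pop();
                    String left = stack.pop();
                    String t = "$t" + temp++;
                    String sign = parts[0].equals("add") ? " + " : " * ";
                    out.add(t + " = " + left + sign + right);
                    stack.push(t); // the result is now referred to by name
                    break;
                }
                case "store":
                    out.add(parts[1] + " = " + stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unknown op: " + op);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stack form of: c = (a + b) * 2
        List<String> program = Arrays.asList(
                "push a", "push b", "add", "push 2", "mul", "store c");
        for (String statement : toThreeAddress(program)) {
            System.out.println(statement);
        }
    }
}
```

For the sample program this yields three statements, each with at most two operands and one result, and no implicit stack left to track.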


I don't believe Soot has a built-in constant-conditions test. However, its
BranchedRefVarsAnalysis, combined with its constant and copy propagation and
a little custom analysis code, would probably catch those cases. Soot does
catch things like this:

String s = "";
int x = 2;
if (x > 3) s = null;
s.toString(); // Soot knows it will not throw NPE

>> Anyway, are you sure it wouldn't be possible?

> Well, don't you think this isn't the best way to get bytecode out of
> source code? Using IDEA's data flow analyzer as the medium seems
> weird to me.


From my experience, I think it is a better way, both for performance and
for accuracy.

1. Performance
My plugin is pretty slow, and 70% of its running time is spent in Soot's
source parser. Additionally, to inspect a single method, I need to rework
the PSI a great deal to remove all other code from the class while keeping it
in a compilable state.

2. Accuracy
Soot simply will not accept malformed code. If you're missing a semicolon,
it can't compile the file at all. This is the opposite of IDEA, which deals
with errors gracefully and still parses and analyzes what it can. My plugin
does not have this luxury. What IDEA sees as a slightly illegally defined
class, Soot sees as completely broken code.

A third reason is that Soot does not support Java 5 source yet, so I have
to run a Java 5-to-Java 1.4 converter on every analysis, which hurts performance
a little more. Of course, caching would help with this, so maybe all is not
lost.
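
A minimal sketch of the caching idea, assuming the converter can be treated as a plain text-to-text function: results are keyed by the exact source text, so unchanged code is never converted twice. The class name and the stand-in converter are hypothetical and not part of any existing tool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative sketch: caches converter output keyed by the exact source text.
// The converter passed in is a placeholder for whatever Java 5 to 1.4 tool is used.
class ConversionCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final UnaryOperator<String> converter;
    private int conversions = 0; // how many times the real converter actually ran

    ConversionCacheSketch(UnaryOperator<String> converter) {
        this.converter = converter;
    }

    // Returns the converted text, running the converter only on a cache miss.
    String convert(String source) {
        String result = cache.get(source);
        if (result == null) {
            result = converter.apply(source);
            conversions++;
            cache.put(source, result);
        }
        return result;
    }

    int conversions() {
        return conversions;
    }

    public static void main(String[] args) {
        // Stand-in converter: just tags the text so the effect is visible.
        ConversionCacheSketch cache =
                new ConversionCacheSketch(src -> "/* converted */ " + src);
        cache.convert("class A {}");
        cache.convert("class A {}"); // unchanged source: served from the cache
        cache.convert("class B {}");
        System.out.println(cache.conversions()); // prints 2
    }
}
```

Keying on the full text keeps the sketch simple; a real plugin would more likely key on a per-file modification stamp or a hash, and would need some eviction policy.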

Maybe a third option would be writing a PSI-to-Soot compiler myself. However,
I'm not familiar with either the JLS or the class file format and bytecode instructions,
so it would not be easy.

I could also try to convert the PSI tree to a Polyglot source tree, which Soot
uses as its actual input; that might save parsing and validation time.



Yep, using PSI instead of the DFA instructions is the right move. It covers
both the error-handling and the 1.4 compatibility issues.
-


Maxim Shafirov
http://www.jetbrains.com
"Develop with pleasure!"
