Getting covering tests for a method

Permanently deleted user

Created February 14, 2007 19:38

Dave,
this is a feature I've been longing for for ages (at least two =)). Though nowadays it seems more like a TeamCity feature with visualization in IDEA (you still have to run your tests to get the info, so why not on the server?). Also note that in the case of an out-of-method-body modification it becomes extremely difficult to bound the number of tests to be re-run to ensure nothing is broken (without this logic I don't see value in this feature): if you change a method signature, chances are that some calls which previously resolved to another overload will start calling the changed one. A similar conflict could happen when you change the extends list of a class. The test set could be safely approximated (e.g. include all tests touching all overloads with the same name), but how small the superset will be is unclear to me. Or probably I'm too scientific/pedantic, and this 0.0001 probability is not really worth bothering about.
This is up to the IDEA/TeamCity teams to decide.

Eugene.

Permanently deleted user

Created February 14, 2007 19:53

Yeah, I thought about the tracking issue, and I'm pretty much convinced that the 10%-of-the-effort solution (simply don't try to track changes between coverage runs) delivers more than 90% of the value. "Run the tests that used this method last time I did a full coverage run" can be a very big time saving, and you can always finish with a full test run (or more likely have TeamCity do it).

Agreed on the TeamCity integration. Unless I'm actively trying to drive up coverage numbers, simply having the previous night's build's coverage visible in my project is more than enough.

Tempted to go hack on Emma myself...

--Dave Griffith

Permanently deleted user

Created February 14, 2007 20:03

Well, from my experience with Emma, one would do better to write method-level instrumentation from scratch than to try to understand everything Vlad has put into Emma. I don't want to put Vlad down, but I had trouble understanding the code in Emma.

Eugene.

Permanently deleted user

Created February 14, 2007 21:13

Naw. I worked with Vlad a while back in Austin. Once you get past the first 300 line method, it's all downhill from there. Besides, I've got experience unsnarling undocumented code written by brilliant, crazy Russians.

Hmm, looks like the methods that need to be modified are, um, challenging. Cyclomatic complexities of 55 and 37, respectively. No rest for the wicked.

--Dave Griffith

Permanently deleted user

Created February 16, 2007 05:40

Well, I just started reading the code coverage sections of http://www.amazon.com/Why-Programs-Fail-Systematic-Debugging/dp/1558608664/sr=8-1/qid=1171603150/ref=pd_bbs_sr_1/102-7386398-7611300?ie=UTF8&s=books , and it turns out we've both been thinking about this all wrong. This feature is actually way more valuable than we realized. We've been thinking about code coverage as independent of test outcome (probably since that's how all the commercial tools currently approach it). What if instead we collect coverage on a per-test basis, and then correlate it with the success or failure of each test?

1) For a given failing test, it would be handy to highlight its coverage, since you know the code defect must be in one or more of the lines covered.

2) For a test suite, it would be cool to highlight lines covered only or predominantly in failing tests, as it seems reasonable to expect that defects are more likely there than in code covered by both failing and successful tests.

3) One could imagine building statistical test tools, which throw random inputs at a system and check for failures. Correlating the coverages for failed and successful runs could help isolate defects nearly automatically. In particular, if for a given failing test case you find the successful test case whose code coverage most closely resembles the failing case, the differences between the two are likely places to find the defect.
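
Points 1 and 2 can be sketched in a few lines of Python. This assumes per-test line coverage is available as a map from test name to a set of line identifiers, plus the set of failing test names; both are hypothetical inputs here, and the simple failing/total ratio is just one plausible way to score "predominantly in failing tests", not the book's formula.

```python
from typing import Dict, Set


def failure_ratio(coverage_by_test: Dict[str, Set[str]], failed: Set[str]) -> Dict[str, float]:
    """For each covered line, the fraction of its covering tests that failed."""
    counts = {}  # line -> [failing tests covering it, all tests covering it]
    for test, lines in coverage_by_test.items():
        for line in lines:
            entry = counts.setdefault(line, [0, 0])
            entry[1] += 1
            if test in failed:
                entry[0] += 1
    return {line: failing / total for line, (failing, total) in counts.items()}


# Hypothetical per-test line coverage.
coverage = {
    "testA": {"Foo.java:10", "Foo.java:11"},
    "testB": {"Foo.java:10", "Foo.java:12", "Foo.java:13"},
    "testC": {"Foo.java:10", "Foo.java:12"},
}
failed = {"testB"}

ratios = failure_ratio(coverage, failed)
# Lines to highlight first: highest ratio at the top.
ranked = sorted(ratios, key=lambda line: (-ratios[line], line))
print(ranked)
```

A ratio of 1.0 marks a line covered only by failing tests, and the lines covered by a single failing test (point 1) are simply its entry in the coverage map.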

The book, by the way, is quite awesome, and highly recommended for anyone with an interest in either software debugging or tools. It has a lot more to say on this subject, and all of it quite eye-opening.

--Dave Griffith

Permanently deleted user

Created February 16, 2007 12:03

1) For a given failing test, it would be handy to
highlight its coverage, +since you know the code
defect must be in one or more of the lines
covered+

Surely it would help to have coverage for a failing test: this approximates the debugging in the next run (i.e. speculative debugging would be possible in some cases).

2) For a test suite, it would be cool to highlight
lines covered only or predominantly in failing tests,
as it seems reasonable to expect that defects are
more likely than in code covered by both failing and
successful tests.

I rather doubt this probabilistic assumption will work in practice.

3)One could imagine building statistical test tools,
which throw random inputs at a system and check for
failures. Correlating the coverages for failed and
successful runs could help isolate defects nearly
automatically. In particular, if for a given failing
test case you find the successful test case whose
code coverage most closely resembles the failing
case, the differences between the two are likely
places to find the defect.

If only computing resources were unlimited:)
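
For a sense of what the comparison in point 3 involves, here is a small Python sketch. It assumes each run's coverage is a set of line identifiers and uses Jaccard similarity as the "most closely resembles" measure; that choice, and the names below, are assumptions for illustration. The work is one set comparison per passing run.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Size of the intersection over size of the union (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def nearest_passing(failing: Set[int], passing: Dict[str, Set[int]]) -> Tuple[str, Set[int]]:
    """Return the most similar passing run and the lines where the two runs differ."""
    best = max(passing, key=lambda name: jaccard(failing, passing[name]))
    return best, failing ^ passing[best]


# Hypothetical coverage: line numbers executed by each run.
failing_run = {1, 2, 3, 4}
passing_runs = {
    "run_17": {1, 2, 3},
    "run_42": {1, 5, 6},
}
closest, suspects = nearest_passing(failing_run, passing_runs)
print(closest, sorted(suspects))
```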

The book, by the way, is quite awesome, and highly
recommended for anyone with an interest in either
software debugging or tools. It has a lot more to
say on this subject, and all of it quite
eye-opening.

Yes. Though I only read the chapters that were available freely, I must say the book is rather fascinating.

Eugene.

Permanently deleted user

Created February 16, 2007 14:39

+Surely it would help to have coverage for failing test: this approximates the debugging in the next run (i.e. speculative debugging would be possible in some cases).+

Speculative debugging is always possible, as my junior colleagues continually demonstrate. It's just not all that smart. http://www.jetbrains.net/jira/browse/IDEA-11573

If only computing resources were unlimited:)

Well, the big chip brains are saying we should expect thousand-core CPUs in not too many years, and statistical defect analysis is just the sort of "embarrassingly parallel" problem that hits the sweet spot for those. Until then: http://www.jetbrains.net/jira/browse/IDEA-11575

--Dave Griffith

Permanently deleted user

Created February 18, 2007 11:25

I like this idea of individual test coverage. I have implemented it a few times myself to try something for research purposes. Of course, for real development it would need some real use and value, though that shouldn't be too hard to find. If you do hack Emma or something to better support tracing individual tests, it would be nice to have that available to experiment with :).

I haven't read the book you mention, but Zeller's stuff is always cool, and from what I have seen of the book's contents it looks nice. Many of the concepts, I think, are also available in his delta debugging and other publications. I find it a bit more difficult to fully apply input variation in many cases at the white-box level, where the parameters are often not very suitable for random generation or variation (objects, etc). Still, it would be nice if something like this were available in an IDE.

Permanently deleted user

Created February 18, 2007 14:02

The book is basically the cleaned-up and expanded course notes of Zeller's graduate course. Looking through scholar.google.com, it does seem to draw heavily from his papers and previous works. That said, it's an excellent presentation.

I think I've figured out the easiest way to extend Emma for tagging individual tests. I'm going to have Emma instrument the bytecode of JUnit test methods to report their names (and classes) to an extended Emma runtime. I also need to do something clever with JUnit setUp() calls, to link their coverage to the next called test method. Should work, and won't require mucking with things too much. It does require that Emma know about JUnit, but I can certainly abstract that out so that other frameworks can work via plugins. If I do all that, I'll certainly release it to the community.
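
The bookkeeping side of that plan might look roughly like the Python sketch below. It is not Emma code: the class and its three hooks are hypothetical stand-ins for what an extended runtime would receive from instrumented bytecode. Hits that arrive while no test is active (such as setUp) are held back and credited to the next test that announces itself.

```python
class PerTestCoverage:
    """Hypothetical runtime-side recorder that tags coverage hits with the running test."""

    def __init__(self):
        self.by_test = {}       # test name -> set of methods hit
        self._pending = set()   # hits seen while no test is active, e.g. from setUp
        self._current = None

    def test_started(self, name):
        # Called by the instrumented test method on entry.
        self._current = name
        self.by_test[name] = set(self._pending)  # credit setUp hits to this test
        self._pending.clear()

    def test_finished(self):
        # Called by the instrumented test method on exit.
        self._current = None

    def hit(self, method):
        # Called by instrumented application code.
        if self._current is None:
            self._pending.add(method)
        else:
            self.by_test[self._current].add(method)


recorder = PerTestCoverage()
recorder.hit("Fixture.open")          # setUp for the first test
recorder.test_started("testRead")
recorder.hit("Reader.read")
recorder.test_finished()
recorder.hit("Fixture.open")          # setUp for the second test
recorder.test_started("testWrite")
recorder.hit("Writer.write")
recorder.test_finished()
print(recorder.by_test)
```

One known gap in this sketch: tearDown hits would also be credited to the following test, so a real version would need a separate hook for them.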

One other thing that individualized test coverage enables is efficient mutation testing to estimate the quality of test suites. Start with a working test suite, inject a bug into the source code by mutating a method, and see if the test suite fails (running only those tests which execute the mutated method). Wash, rinse, repeat 1000x, and you get a map of which parts of the product aren't tested well enough to actually find bugs, and maybe even an indication of how many defects are left in the product.
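
A toy Python version of that loop, under heavy assumptions: the "product" is a dict of functions, each test is a function returning True on success, the per-test coverage map is given, and mutants are hand-written replacements. All names are invented for illustration; a real tool would generate mutants by rewriting bytecode or source.

```python
def surviving_mutants(original, tests, coverage, mutants):
    """Return the targets of mutants that no covering test detected."""
    survivors = []
    for target, mutant in mutants:
        mutated = dict(original)
        mutated[target] = mutant
        # Run only the tests known to execute the mutated function.
        relevant = [name for name, covered in coverage.items() if target in covered]
        if all(tests[name](mutated) for name in relevant):
            survivors.append(target)
    return survivors


# Hypothetical product code.
original = {
    "add": lambda a, b: a + b,
    "double": lambda x: x * 2,
}
# Hypothetical tests: each returns True when it passes.
tests = {
    "test_add": lambda fns: fns["add"](2, 3) == 5,
    "test_double_zero": lambda fns: fns["double"](0) == 0,
}
# Per-test coverage from an earlier run.
coverage = {
    "test_add": {"add"},
    "test_double_zero": {"double"},
}
# Injected bugs.
mutants = [
    ("add", lambda a, b: a - b),
    ("double", lambda x: x * 3),
]

survivors = surviving_mutants(original, tests, coverage, mutants)
print(survivors)
```

Here the mutated "double" survives because its only covering test checks an input on which the bug is invisible, which is the kind of weak spot the map is meant to expose.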

--Dave Griffith

Permanently deleted user

Created February 19, 2007 19:33

I have done single test coverage tracing with Emma by simply adding trace calls to the JUnit framework code and using this to tell my trace code when a test has started and stopped. With JUnit 3 you actually have to add the code to the framework and recompile. With JUnit 4 there should be a listener framework to enable this without touching the JUnit code itself. Haven't tried that yet though.

From the called trace code I used the Emma internal interface to dump the coverage and parsed out the coverage data for the individual test. After this was done the call returned to the JUnit code which continued with the next test. This is very slow and inefficient but I didn't have the resources to go dig through all of Emma to understand and modify its internal in-memory representations and get the coverage from there. Also, Emma provides reset coverage which I used between tests, but it requires some tricks to get it to work for class coverage.
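
The start/stop-and-reset pattern can be illustrated without Emma at all. The Python sketch below is only an analogy: it uses the standard sys.settrace hook as a stand-in for the coverage runtime, with a fresh hit set per test playing the role of the reset, and it records function names rather than lines. The toy functions and tests are hypothetical.

```python
import sys


def run_with_per_test_coverage(tests):
    """Run each test under its own tracer and return test name -> functions called."""
    results = {}
    for test in tests:
        hits = set()  # a fresh set per test plays the role of "reset coverage"

        def tracer(frame, event, arg, hits=hits):
            if event == "call":
                hits.add(frame.f_code.co_name)
            return None  # no per-line tracing needed for this sketch

        previous = sys.gettrace()
        sys.settrace(tracer)      # "test started"
        try:
            test()
        finally:
            sys.settrace(previous)  # "test stopped"
        results[test.__name__] = hits
    return results


# Hypothetical code under test and two tests.
def parse():
    return 1


def render():
    return 2


def test_parse():
    parse()


def test_both():
    parse()
    render()


results = run_with_per_test_coverage([test_parse, test_both])
print(results)
```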

If you could just provide a way to better control and access the coverage data from an Emma API that would be great for me to integrate with different test frameworks.

-Teemu