Indexing and other stuff

Hi,
This is my first post in this forum. I've searched for a similar topic but haven't found one.

I want to share some thoughts about the Intellij Idea index.
I mean, it's really great to have everything indexed so the IDE can quickly show you that a method was overridden by a subclass, the usages, etc...
But there are a few annoying things that I would like to discuss, and maybe we can come up with an improved solution.

Before starting, I would like to mention that I'm a Linux user; I know there are other OSes that use more limited file systems.

As everyone knows, almost every software developer works with several versions of the software. Of course, most of the work is usually done in main, but from time to time a branch is opened (with some kind of triage) and a new version is released.
Even more, there are teams (like the one I work in) that are specialized in GA versions where customer issues are solved and hotfixes are delivered to customers.
So, if we use Subversion as the version control system, we will probably have several working copies, one per branch. But... IntelliJ IDEA has a single index. If something happens while working on one branch (let's say a power cut, or you accidentally kick the power cord), the whole cache gets corrupted. And... we have to re-index every single branch again. This could easily be solved if we could specify an index folder per project. (I know that space can be saved by keeping everything together, but maybe we can optimize that with hard links.)
If instead we use Git, for example, we have a single working tree and switch it from branch to branch. And... every time we switch from one branch to another, the IDE re-indexes everything. This could also be solved if, in each branch, we modified the project to point to a different cache directory (there are other, more clever options, of course).
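For what it's worth, the directory that holds the IDE's caches and indexes can already be relocated via the documented `idea.system.path` property in `idea.properties`; a per-branch workaround might be sketched like this (the paths below are made-up examples, not a recommendation):

```properties
# idea.properties -- point this IDE instance's system directory (which
# contains the caches/indexes) at a branch-specific location.
# Example path only; adjust to your own setup.
idea.system.path=${user.home}/.idea-system/branch-10g
```

Note that this relocates the whole system directory (not just the index) and applies per IDE installation rather than per project, so it is only a partial workaround for the multi-branch problem described above.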

Apart from that, I think there are still other things to improve. For example, if the project is big, indexing it may take a couple of hours. And... every developer has to wait those couple of hours. Here at work we are 50 developers on the same LAN, and we are not taking advantage of that. I mean, if someone has already indexed a particular Git commit, why would I need to index it too? And if no one has indexed it completely yet, maybe we can split the indexing task among the different open IDEs.
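The "reuse a colleague's index" idea could start as simply as a lookup keyed by commit hash before indexing locally. Here is a toy sketch; everything in it (the shared path, the `<commit>.index` layout, the class and method names) is a hypothetical illustration, not an actual IntelliJ API:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SharedIndexLookup {
    // Hypothetical LAN share where hosts publish finished indexes.
    static final Path SHARED_INDEX_ROOT = Paths.get("/mnt/lan-share/idea-indexes");

    /** Returns the published index for this commit, or null if we must index locally. */
    static Path findPublishedIndex(String commitHash) {
        Path candidate = SHARED_INDEX_ROOT.resolve(commitHash + ".index");
        return Files.exists(candidate) ? candidate : null;
    }

    public static void main(String[] args) {
        String head = args.length > 0 ? args[0] : "0123abcd";
        Path hit = findPublishedIndex(head);
        if (hit != null) {
            System.out.println("Reusing shared index: " + hit);
        } else {
            System.out.println("No shared index for " + head + "; indexing locally.");
        }
    }
}
```

Of course the hard part is what the comment below also raises: the index data itself would have to be in a shareable, relocatable format for this to work.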

These are my thoughts; I would like to hear what you think about them.
And... thanks for making the best Java IDE.

Cheers,
Ariel

8 comments

Did you try to improve performance by using SSDs on the dev machines? How big is your project, and how many branches are concurrently checked out? I doubt a "centralized"/LAN approach would work under all conditions (and it would possibly lead to other problems), although it sounds like a nice idea, maybe worth tracking.


Hi Yann,
Thanks for answering. Of course, buying faster hardware is an alternative. Currently I have 2 SCSI drives in RAID-0 (hardware RAID). I haven't tried SSDs yet (but it seems to me that the IntelliJ developers already have :-) ).
Our project is quite big. Each version has approx. 1.4 GB of Java source code (including JUnit tests).
Currently I have 4 branches (SVN) for our 10g version. Then we moved our repository to (let's say) Git for 11g (2 branches), and main is also in it.

In your answer you mentioned the word "centralized". In fact, I mean just the opposite (every host with its own index), but with some "distributed indexing" feature.
Even more, if there is some kind of version control system, we can check whether another developer's host has already indexed that commit (or we could take the closest one and apply the current changes on top).
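The "take the closest one" idea could amount to walking HEAD's ancestry until a commit with a published index is found, then indexing only the diff on top. A toy sketch, with the ancestry and the set of published commits as in-memory placeholder data (real code would get both from the VCS):

```java
import java.util.List;
import java.util.Set;

public class NearestIndexedCommit {
    /**
     * Given HEAD's ancestry (newest first) and the set of commits for which
     * some host has already published an index, return the closest indexed
     * ancestor, or null if none exists and we must index from scratch.
     */
    static String nearestIndexed(List<String> ancestry, Set<String> published) {
        for (String commit : ancestry) {
            if (published.contains(commit)) {
                return commit;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Placeholder data: c4 is HEAD, c1 is the oldest ancestor.
        List<String> ancestry = List.of("c4", "c3", "c2", "c1");
        Set<String> published = Set.of("c2", "c1");
        System.out.println("Closest indexed ancestor: " + nearestIndexed(ancestry, published));
    }
}
```

From the closest indexed ancestor, only the files changed between that commit and HEAD would need re-indexing, which is exactly the "append the current changes" part of the suggestion.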

With my post I'm not trying to file a Request for Enhancement (RFE) for the next version, just to initiate a discussion on what can be done (those were my suggestions, and I hope other users post their own ideas), and maybe we end up with an even smarter IDEA.


Cheers,
Ariel


You should really try out SSDs first; this will make a _huge_ difference to indexing speed (IMHO every developer machine should have SSDs these days anyway). Also try increasing RAM so the OS/filesystem cache can be more effective. Yes, I just called it "centralized" ;-) AFAIK the current design/content of the index isn't shareable. And wouldn't sharing only save the CPU time needed to create the index? Even when shared via the LAN, it would still have to be written to the local disk of every developer.


Well, it depends. Reading a single 1.4 GB file is not the same as reading 100,000 files that together add up to 1.4 GB. Even less so if you read one file, then parse it, then read another, then write partial results, etc.

But anyway, the distributed index is just one piece of my comment. There is still the multi-branch issue.

Cheers,
Ariel

Edited:
Of course, I'm assuming that the project is quite big. If the project is small, making indexing distributed will only make things worse.


@Yann,

Do you have data to share showing where an SSD improved indexing speed? Is this on Linux or another platform?

Thanks,
Chandra.


I'm using Mac OS X, though this should make a big difference on any platform.


Give an option to select indexing, with sub-options "Index all" or "Index individually selected files".

