File parsing

The way projects are set up today regarding parsing of files and libraries has got to change. It simply takes way too much time.

The change made from 3.0 to 4.0, where all symbols are collected up front to support symbol searches, has to be made optional or something like that. The data could be parsed on the first request for such a search, parsed in the background after the project has been created, or skipped altogether. I never use symbol search.

I mean the wild-card symbol search feature.
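To make the first of those options concrete, here is a rough sketch of "parse on the first search request". It is only an illustration: `LazySymbolIndex`, `symbolCollector`, and `buildCount` are names I made up, not anything from IDEA, and I am assuming the expensive parse can be wrapped in a single call that returns the symbol names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names are IDEA APIs.
public class LazySymbolIndex {
    private final Supplier<List<String>> symbolCollector; // assumed: the expensive parse step
    private List<String> symbols;                         // stays null until the first search
    private int buildCount = 0;                           // how many times the parse has run

    public LazySymbolIndex(Supplier<List<String>> symbolCollector) {
        this.symbolCollector = symbolCollector;
    }

    /** Wildcard search: '*' matches any run of characters, '?' exactly one. */
    public synchronized List<String> search(String wildcard) {
        if (symbols == null) {
            // Pay the parsing cost only when someone actually searches.
            symbols = symbolCollector.get();
            buildCount++;
        }
        Pattern pattern = Pattern.compile(toRegex(wildcard));
        List<String> hits = new ArrayList<>();
        for (String symbol : symbols) {
            if (pattern.matcher(symbol).matches()) {
                hits.add(symbol);
            }
        }
        return hits;
    }

    public synchronized int buildCount() {
        return buildCount;
    }

    // Translate the wildcard into a regex, quoting everything that is not a wildcard.
    private static String toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }
}
```

Someone who never searches never pays for the symbol collection, and someone who does pays for it once, on the first search.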

And apart from the when-to-parse problem, the parsing itself needs to be faster. I realize that computers get faster and all that, but the time currently spent parsing is not acceptable.

The improvements from, I think, 4.5.1 and 4.5.2 (?) helped a lot though; they stopped the OOME problems I had. More of that effort! ;)


The way has got to change?

Why do you care how it is implemented?
If you say it is too slow, that's a different statement altogether.
Still, I am using a fairly large project and have no real complaints about parsing speed. Make sure you do not use a network file system for your sources.
In fact, I think the IDEA developers have done an incredibly good job of building the project repository. If you have followed the EAP progress, you will have noticed that work in that area is continuous (which unfortunately means a repository rebuild in each affected EAP version, but hey, that's why it is EAP).


I have no problem waiting a minute or two for the JDK to be parsed every time I download a new JDK, nor for the 5 seconds it takes to load my current 35,000-line project. I am on a 2.5-year-old laptop, with a 4200rpm disk, so I'm not even using recent and/or fast hardware.

Maybe there is a problem with your hardware or network connection.


I'd have to agree that it is a lot slower than I find reasonable. In 3.x I never had the problem described above. I have some larger projects with many dependent jars, and they can take 3 minutes to load at times.
I have a P4 3.0 GHz with 1 GB RAM, and I've been running 4.5.2 (just switched to Irida). None of my resources are on shared drives.

Although my project is large, some of my coworkers continue to use 3.x without the heavy frontend parsing.
As suggested above, perhaps making the parsing configurable would let users adapt it to their needs.

As much as I would never return to 3.x, the startup parsing on 4.x really frustrates me to no end and has encouraged some of my coworkers not to upgrade to 4.x (which I still think is silly).

I would love to see an improvement (even by way of flexible config) in this area.


Since you didn't read any of what I said:

I'm talking about the CURRENT state. I know how much work the developers put into this. All sources are on a local disk, the project is 3.5k classes in 3.1k files. About 15 megs of source code. And the source base I'm working on is three concurrent versions of the same source tree. 15 * 3 megs of source code. I realise that that's a lot of source code, and that there might be things I can do to ease the burden for IDEA.

But the points about how much data is sampled during the parse process are still valid. I never ever use the find-by-symbol-with-wildcard feature, and I'd be fine with having to wait for a complete re-parse if I were to use it, or with any of the other alternatives I presented in the initial post. I just don't want to have to wait for that parse when I create a new project.

Also, it seems like that huge parse tree takes up a lot of memory and CPU time for data merges when I do things like compiling or updating sources from the source control system. It simply puts a lot of strain on my system while providing little in return.

This topic has been up before, and I think the answer went along the lines of it not being possible to only sample some of the data during the parse, so perhaps the parsing can be done in the background allowing me to start working sooner. (Of course with the implication that any use of any feature requiring the sample data would have to wait for the completion of the parsing).
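Just to show what I mean by background parsing, here is a minimal sketch in present-day Java. It is not how IDEA works internally, and `BackgroundParse`, `isReady`, and `awaitSymbols` are names I invented. I am also assuming the full parse can be handed over as one task that yields the sampled data.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch: names are illustrative, not IDEA APIs.
public class BackgroundParse {
    private final CompletableFuture<List<String>> parsed;

    public BackgroundParse(Supplier<List<String>> fullParse) {
        // Kick off the expensive parse on another thread as soon as the project opens.
        this.parsed = CompletableFuture.supplyAsync(fullParse);
    }

    /** Features that do not need the sampled data can check this and carry on. */
    public boolean isReady() {
        return parsed.isDone();
    }

    /** Features that do need the sampled data wait here until the parse is complete. */
    public List<String> awaitSymbols() {
        return parsed.join();
    }
}
```

Editing could start right away, and only something like the symbol search would call `awaitSymbols()` and wait for the parse to finish.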


And I'm on an 800 MHz laptop with 384 MB RAM (which is to be replaced with a faster one AnyDayNow(tm)(r)). As much as I'd like IDEA to work at the speed of Notepad, I realize that that's impossible; that's why I'd like to see some discussion and reasoning around handling the parsing differently.

