My rant on Indexing again

I've been a PyCharm user since it was first released. For personal projects within a relatively small PYTHONPATH it works great.

Now I work at a company with a 15+ year old Python code base: hundreds of packages in PYTHONPATH on a shared network path.

99% of those packages NEVER change. Also, I'm working on multiple projects at a time and every time I open PyCharm it starts indexing THE SAME hundreds of packages AGAIN.

There are dozens of complaints about indexing, but JetBrains for some reason seems to be completely ignoring this problem.

We're not asking to make it faster, or make the problem magically disappear, just PLEASE give us the ability to control this process:

- Exclude paths from indexing. Super important! We still need some paths to be in PYTHONPATH, so they can't be removed, but they NEVER change, so there is absolutely no reason to reindex them every single time.

- Disable indexing altogether on demand.

- Allow the indexing to run once and DO NOT re-index on restart unless explicitly asked.

- Prioritize other IDE actions over indexing. While PyCharm is indexing the IDE is pretty much useless.

Please make PyCharm great again by giving us control over the indexing!

21 comments

Hi,

I agree with you that the issue exists, and you're not the first one to ask this, but I haven't found a related feature request in our system.

You are welcome to submit a feature request at https://youtrack.jetbrains.com/issues/PY

Information on how to use YouTrack: https://intellij-support.jetbrains.com/hc/en-us/articles/207241135-How-to-follow-YouTrack-issues-and-receive-notifications

It is best to explain the use case, and why this feature is important, in the ticket's comments on YouTrack.


I also have major issues regarding indexing. I use IntelliJ Ultimate mainly for Python work. I work with environments that have a huge number of files which have nothing to do with the code base. Opening these environments brings IntelliJ to its knees. I'd just like to work and not get into all the indexing and memory heap discussions. Why doesn't JetBrains set out some guidelines or profiles for optimizing IntelliJ based on project size? Like best practices.


Have you tried to exclude specific paths using Project Structure settings? You can add directories from outside of your project and mark them as excluded, which should stop PyCharm indexing them.


I've only excluded directories within my project using 'Mark Directory As'. As for excluding paths from the Project Structure settings, I assumed when I created a project that anything under the Global Libraries section was needed to run the project, and only that project.


You can add external content roots and mark them as excluded. The code will be able to refer to them if needed, but they shouldn't be indexed by PyCharm.


Hey Andrey, thank you for your suggestion, but unfortunately, this didn't work. Well, sort of... I added a few huge network-mounted directories as extra project roots and excluded them. After File->Invalidate Cache and restart it's indexing them again; however, it seems to finish much quicker. I'll see how it behaves for a few days. Thanks.


Another suggestion (actually I should have suggested this in the first place):

Open your project interpreter settings, click gear icon > show all, then select your interpreter and click on the interpreter path icon (it's a little buried, I know).

If you see any network paths there - delete them, and they should never bother you again (in terms of indexing)


And my project won't find modules :) The idea is to stop indexing and generating skeletons for some paths.

I know it sounds weird, but for historical reasons we've been versioning packages in sys.path. So we ended up with hundreds of packages in sys.path, each of which may have dozens of versions compiled for different OS architectures. So when I run my project, all I care about is my project files + common system paths like /usr/lib/pythonx.x, they should be indexed and tracked, while those network paths should be just available to the interpreter and IDE should not care about them whatsoever.

 


Thanks Andrey,

In my setup I create a new Python SDK for each new project. When I open up a project and view the project structure I see all the Python SDKs I've created. Is that the way it should be? And can I exclude those paths?

Also, regarding your comment 'Open your project interpreter settings, click gear icon > show all': when I look at Project Structure -> Platform Settings -> SDKs, I don't see any of the icons you mention.


@C T Matsumoto

Are you using IDEA? PyCharm doesn't use the term "SDK" for its interpreter, so I assume you're using IDEA, but IDEA's interface is different with regard to setting up Python. The interpreter paths are located in Project structure > SDK > Select your Python SDK > Classpath tab.


What are network paths? I'm back because I've just installed the 2019 update and the indexing is actually bringing my machine to its knees. I've got 8 cores and they are almost all maxed out. It would be a tremendous help if IntelliJ came with some guidelines about setting up memory and optimizing the IDE for your particular situation.


>What are network paths?

UNC paths or anything on a network-mounted drive.
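
For anyone unsure which of their interpreter paths fall into that category, here is a minimal Python sketch. It assumes a Linux machine where /proc/mounts is readable; the NETWORK_FS set and the function names are illustrative assumptions, not something PyCharm itself uses, and symlinked paths are not resolved.

```python
# Assumption: filesystem types treated as "network"; extend this for your site.
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3", "smbfs", "afs", "fuse.sshfs"}


def parse_mounts(text):
    """Return (mount_point, fstype) pairs from /proc/mounts-style text."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[1], fields[2]))
    return mounts


def is_network_path(path, mounts):
    """True for UNC paths or paths whose deepest mount is a network filesystem."""
    if path.startswith("\\\\"):  # UNC path such as \\server\share
        return True
    best_len, best_type = -1, None
    for mount_point, fstype in mounts:
        mp = mount_point.rstrip("/") or "/"
        prefix = mp if mp.endswith("/") else mp + "/"
        # Match only on whole path components, and keep the longest match.
        if (path == mp or path.startswith(prefix)) and len(mp) > best_len:
            best_len, best_type = len(mp), fstype
    return best_type in NETWORK_FS


def network_entries(paths, mounts_file="/proc/mounts"):
    """Filter `paths` (for example sys.path) down to network-mounted entries."""
    with open(mounts_file) as fh:
        mounts = parse_mounts(fh.read())
    return [p for p in paths if p and is_network_path(p, mounts)]
```

Calling network_entries(sys.path) from the project interpreter (after import sys) should list the entries that live on such mounts.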

To narrow the issue down, please try creating a new project and check if the issue is reproduced.

Please also try deleting the `.idea` directory from your affected project and check if the issue is reproduced. You can back up that directory and restore it after the test if you don't want to lose any project settings.


In our case, ALL of our PYTHONPATH is network-mounted. And there are many packages in it that only make sense at runtime (we have some run-time import hooks in place), and therefore all the indexing and skeleton generation that PyCharm does on hundreds of paths is useless and takes so much time. We really need a way to tell PyCharm to NOT EVEN LOOK at certain paths :)  Please let us choose what to index!

P.S. The solution of adding those paths as roots and excluding them simply doesn't work. After a restart, PyCharm will be indexing it all over again.

Cheers.


Those paths should be listed under "interpreter paths" in the interpreter settings. You can delete them and PyCharm shouldn't index them. Here's the settings I'm talking about:

Let me know how it goes for you.


We can't remove interpreter paths; we won't be able to run or debug the project then! As I said, we need many packages from those paths at runtime, but they should never be indexed for the reason I described above: historically, those paths are a huge hierarchy of versions. For example, consider the path:

pst_tools/
.. __init__.py
.. v1/
.... __init__.py
.... foo.py
.. v2/
.... __init__.py
.... foo.py

And there are hundreds of those versions. And imagine that some module in my project imports pst_tools.v25.foo

It must be found in PYTHONPATH, but it never changes! The IDE should either index it once and forget about it forever OR let us ignore the whole pst_tools branch from indexing even if we lose most IDE features. I don't care if PyCharm will tell me it can't do certain operations it could've done had this path been indexed. Maybe because this is all network mounted the IDE gets confused and thinks that it needs to re-index the whole PYTHONPATH every time?

I know this is super weird, but this is what we have to deal with now.  :(
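
To get a feel for how much work a tree like this represents, here is a rough Python sketch that counts `.py` files per top-level directory under one path entry. The function name is made up for illustration, and using file counts as a proxy for indexing cost is an assumption; PyCharm's indexer may weigh things differently.

```python
import os
from collections import Counter


def count_python_files(root):
    """Count .py files under each top-level entry of `root`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Attribute every file to the first path component below root.
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += sum(1 for name in filenames if name.endswith(".py"))
    return counts
```

Calling count_python_files(entry).most_common(10) on a single PYTHONPATH entry would show which packages dominate it. Note that walking a large network mount this way can itself be slow.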


I can suggest trying the following solution/workaround:

1. Remove all network-mounted paths from PYTHONPATH. If it only consists of network paths, just unset the variable.

2. In PyCharm, edit your run configuration, and add PYTHONPATH there with the same value as it was in your system.

This should make PyCharm stop indexing paths from PYTHONPATH, because when the project is opened, PYTHONPATH is empty/non-existent. And when you run the script, PYTHONPATH will be created according to the run configuration, so that the interpreter can find the libraries.
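
The principle behind this workaround can be sketched in plain Python: the environment the IDE starts in carries no PYTHONPATH, and the variable is injected only into the process that actually runs the script. The helper below is a hypothetical stand-in for what a run configuration does, not PyCharm's implementation.

```python
import os
import subprocess
import sys


def run_with_pythonpath(args, extra_paths):
    """Run a child interpreter with PYTHONPATH set only for that child."""
    env = dict(os.environ)
    # Inject the paths for the child only; os.environ itself stays untouched.
    env["PYTHONPATH"] = os.pathsep.join(extra_paths)
    return subprocess.run(
        [sys.executable, *args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
```

The parent process never sees the extra paths, which is the property the workaround relies on.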

Let us know if that works or not for you.


Thanks, Andrey

That helped with indexing, however there is another issue, less annoying though :) 

Our main system interpreter's site-packages contains a .pth file, which pulls everything I removed from PYTHONPATH back into the interpreter. Now PyCharm seems to skip indexing, BUT the skeleton generator still goes through these paths and it still takes a while. Much less than indexing though! Which is good. When I start PyCharm and do

ps -au | grep generator3.py

I see ...helpers/generator3.py ..... which takes a long argument of paths to scan including the ones from the .pth file.
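
To check which entries a .pth file contributes, the following Python sketch approximates the parsing rules documented for the standard `site` module: blank lines and `#` comments are skipped, and lines starting with `import` are executed rather than treated as paths. It is a simplification (unlike `site`, it does not check whether each path exists), and the function name is my own.

```python
import os


def paths_from_pth(site_dir):
    """List (pth_file, path) pairs declared by .pth files in `site_dir`."""
    found = []
    for name in sorted(os.listdir(site_dir)):
        if not name.endswith(".pth"):
            continue
        pth_file = os.path.join(site_dir, name)
        with open(pth_file, encoding="utf-8", errors="replace") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # blank line or comment
                if line.startswith(("import ", "import\t")):
                    continue  # executable line, not a path entry
                # Relative entries are resolved against the site directory.
                found.append((name, os.path.normpath(os.path.join(site_dir, line))))
    return found
```

Running it over each directory returned by site.getsitepackages() should show which file is responsible for the extra paths.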

I can live with that for now. Thank you so much for your support!

 


Well, I don't think there is a way to stop the skeleton generator, unfortunately. It refreshes skeletons every time you open a project, and I guess the core issue is that PyCharm simply doesn't cope well with network mounts. This is a known issue, and we have to make do with weird workarounds, but our general recommendation is to not use network drives for anything related to projects / libraries.


I removed the .idea directory and it sped up the indexing dramatically. The IDE seems to work, however, all the system specific packages, modules, classes etc are no longer in the tab completion. Also all my imports of system specific classes are marked as errors, while at the same time everything seems to work. I also noticed that directories I now exclude are correctly marked orange. This was also broken.

As far as UNC paths are concerned, I don't have any.


################################

>I removed the .idea directory and it sped up the indexing dramatically. The IDE seems to work, however, all the system specific packages, modules, classes etc are no longer in the tab completion. Also all my imports of system specific classes are marked as errors, while at the same time everything seems to work.

################################

Well, that's because...

################################

>So when I run my project, all I care about is my project files + common system paths like /usr/lib/pythonx.x, they should be indexed and tracked, while those network paths should be just available to the interpreter and IDE should not care about them whatsoever.

################################

 

Now that the IDE "does not care about them", it will give warnings about missing modules/libraries/classes etc., because we removed them from indexing. If you want those objects to be resolved, we can include them back into PYTHONPATH and they will be indexed, but you'll have the performance issues again.

You can suppress those warnings/errors so they don't bother you, but they will also be suppressed for objects that you import from local libraries...

As I said, these are all weird workarounds because, unfortunately, PyCharm doesn't get along well with network drives. Sorry about that.


In my case it is indexing the full anaconda directory.

I have to admit that I am disappointed in the level of support we are getting for a licensed product. I have a Toolbox license and PyCharm is one of the many IDEs I am using. The way this is treated is worse than in open source products.

My machines are killed every evening at work, so I have to wait for hours for the indexing to stop. None of the suggested fixes really works. This should be addressed properly.

 

 
