My rant on Indexing again
I've been a PyCharm user since it was first released. For personal projects with a relatively small PYTHONPATH it works great.
Now I work at a company with a 15+ year old Python code base, with hundreds of packages in PYTHONPATH on a shared network path.
99% of those packages NEVER change. Also, I'm working on multiple projects at a time and every time I open PyCharm it starts indexing THE SAME hundreds of packages AGAIN.
There are dozens of complaints on indexing but JetBrains for some reason is completely ignorant about this problem.
We're not asking to make it faster, or make the problem magically disappear, just PLEASE give us the ability to control this process:
- Exclude paths from indexing. Super important! We still need some paths to be in PYTHONPATH, so they can't be removed, but they NEVER change, so there is absolutely no reason to reindex them every single time.
- Disable the indexing whatsoever on demand.
- Allow the indexing to run once and DO NOT re-index on restart unless explicitly asked.
- Prioritize other IDE actions over indexing. While PyCharm is indexing the IDE is pretty much useless.
Please make PyCharm great again by giving us control over the indexing!
Hi,
I agree with you that the issue exists, and you're not the first one to ask this, but I haven't found a related feature request in our system.
You are welcome to submit a feature request at https://youtrack.jetbrains.com/issues/PY
Information on how to use YouTrack: https://intellij-support.jetbrains.com/hc/en-us/articles/207241135-How-to-follow-YouTrack-issues-and-receive-notifications
It is best to explain the use case, and why this feature is important, in the ticket's comments on YouTrack
I also have major issues regarding indexing. I use IntelliJ Ultimate mainly for Python work. I work with environments that have a huge number of files which have nothing to do with the code base. Opening these environments brings IntelliJ to its knees. I'd just like to work and not get into all the indexing and memory heap discussions. Why doesn't JetBrains set out some guidelines or profiles for optimizing IntelliJ based on project size, like best practices?
Have you tried to exclude specific paths using Project Structure settings? You can add directories from outside of your project and mark them as excluded, which should stop PyCharm indexing them.
I've only excluded directories within my project using 'Mark Directory As'. As for excluding paths from the Project Structure settings, I assumed that when I created a project, anything under the Global Libraries section was needed to run the project and only that project.
You can add external content roots and mark them as excluded. The code will be able to refer to them if needed, but they shouldn't be indexed by PyCharm.
Hey Andrey, thank you for your suggestion, but unfortunately, this didn't work. Well, sort of... I added a few huge network-mounted directories as extra project roots and excluded them. After File->Invalidate Cache and restart it's indexing them again, however it seems to finish much quicker. I'll see how it behaves for a few days. Thanks.
Another suggestion (actually I should have suggested this in the first place):
Open your project interpreter settings, click gear icon > show all, then select your interpreter and click on interpreter path icon (it's a little buried, I know)
If you see any network paths there - delete them, and they should never bother you again (in terms of indexing)
And my project won't find modules :) The idea is to stop indexing and generating skeletons for some paths.
I know it sounds weird, but for historical reasons we've been versioning up packages in sys.path. So we ended up with hundreds of packages in sys.path, each of which may have dozens of versions compiled for different OS architectures. So when I run my project, all I care about is my project files + common system paths like /usr/lib/pythonx.x. They should be indexed and tracked, while those network paths should just be available to the interpreter, and the IDE should not care about them whatsoever.
Thanks Andrey,
In my set up I create a new python SDK for each new project. When I open up a project and view the project structure I see all the python SDKs I've created. Is that the way it should be? And can I exclude those paths?
Also in your comment 'Open your project interpreter settings, click gear icon > show all' When I look Project Structure -> Platform Settings -> SDKs I don't see any of the icons you mention.
@C T Matsumoto
Are you using IDEA? PyCharm doesn't use the term "SDK" for its interpreter, so I assume you're using IDEA, but IDEA's interface is different with regard to setting up Python. The paths are located in Project structure > SDK > Select your Python SDK > Classpath tab
What are network paths? I'm back because I've just installed the 2019 update and the indexing is actually bringing my machine to its knees. I've got 8 cores and they are almost all maxed out. It would be a tremendous help if IntelliJ had some guidelines about setting up memory and optimizing the IDE for your particular situation.
>What are network paths?
UNC paths or anything on a network-mounted drive.
To narrow the issue down, please try creating a new project and check if the issue is reproduced.
Please also try deleting the `.idea` directory from your affected project and check if the issue is reproduced. You can back up that directory and restore it after the test if you don't want to lose any project settings.
In our case, ALL our PYTHONPATH is network-mounted. And there are many packages in it that only make sense at runtime (we have some run-time import hooks in place), and therefore all the indexing and skeleton generation that PyCharm does on hundreds of paths is useless and takes so much time. We really need a way to tell PyCharm to NOT EVEN LOOK! at certain paths :) Please let us choose what to index!
P.S. The solution of adding those paths as roots and excluding them simply doesn't work. After a restart, PyCharm indexes it all over again.
Cheers.
Those paths should be listed under "interpreter paths" in the interpreter settings. You can delete them and PyCharm shouldn't index them. Here's the settings I'm talking about:
Let me know how it goes for you.
We can't remove interpreter paths, we won't be able to run or debug the project then! As I said we need many packages from those paths at runtime, but they should never be indexed because of the reason I described above - Historically those paths are a huge hierarchy of versions. For example, consider the path:
pst_tools/
.. __init__.py
.. v1/
.... __init__.py
.... foo.py
.. v2/
.... __init__.py
.... foo.py
And there are hundreds of those versions. And imagine that some module in my project imports pst_tools.v25.foo.py
It must be found in PYTHONPATH, but it never changes! The IDE should either index it once and forget about it forever OR let us ignore the whole pst_tools branch from indexing even if we lose most IDE features. I don't care if PyCharm will tell me it can't do certain operations it could've done had this path been indexed. Maybe because this is all network mounted the IDE gets confused and thinks that it needs to re-index the whole PYTHONPATH every time?
I know this is super weird, but this is what we have to deal with now. :(
I can suggest to try the following solution/workaround:
1. Remove all network-mounted paths from PYTHONPATH. If it only consists of network paths, just unset the variable.
2. In PyCharm, edit your run configuration, and add PYTHONPATH there with the same value as it was in your system.
This should make PyCharm stop indexing paths from PYTHONPATH, because on opening the project, PYTHONPATH is empty/non-existent. And when you run the script, PYTHONPATH will be created according to run configuration, so that interpreter could find the libraries.
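To decide which entries to take out of the system variable and which to leave, something like the following rough sketch may help. It is not part of PyCharm; the function name `split_pythonpath` and the idea of recognizing network mounts by a list of path prefixes are assumptions for illustration only, and you would supply the prefixes that match your own mounts.

```python
import os


def split_pythonpath(value, network_prefixes, sep=os.pathsep):
    """Split a PYTHONPATH string into (local, network) entry lists.

    `network_prefixes` is a hypothetical list of mount prefixes
    (for example "/mnt/shared"); it is not something PyCharm provides.
    """
    local, network = [], []
    prefixes = tuple(network_prefixes)
    for entry in value.split(sep):
        if not entry:
            # Skip empty entries produced by doubled or trailing separators.
            continue
        if entry.startswith(prefixes):
            network.append(entry)
        else:
            local.append(entry)
    return local, network
```

The `network` list joined with `os.pathsep` is the value you would paste into the run configuration's PYTHONPATH, and the `local` list is what could stay in the system variable.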
Let us know if that works or not for you.
Thanks, Andrey
That helped with indexing, however there is another issue, less annoying though :)
Our main system interpreter's site-packages contains a .pth file, which pulls everything I removed from PYTHONPATH back into the interpreter. Now PyCharm seems to skip indexing, BUT the skeleton generator still goes through these paths and it still takes a while. Much less than indexing, though! Which is good. When I start PyCharm and do
ps -au | grep generator3.py
I see ...helpers/generator3.py ..... which takes a long argument of paths to scan including the ones from the .pth file.
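For anyone who wants to check which directories their own .pth files pull in, here is a minimal sketch. The helper name `paths_from_pth` is made up for this example, and it only approximates what Python's `site` module does: it skips blank lines, comments and `import` lines, and it does not check whether the listed directories actually exist.

```python
import os


def paths_from_pth(site_dir):
    """Return the directories that .pth files in site_dir list for sys.path."""
    added = []
    for name in sorted(os.listdir(site_dir)):
        if not name.endswith(".pth"):
            continue
        with open(os.path.join(site_dir, name), encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                # Blank lines and comments are ignored; lines starting with
                # "import" are executed by site.py rather than added as paths.
                if not line or line.startswith("#"):
                    continue
                if line.startswith(("import ", "import\t")):
                    continue
                # Relative entries are resolved against the site directory;
                # absolute entries are returned unchanged by os.path.join.
                added.append(os.path.join(site_dir, line))
    return added
```

Calling it on each directory returned by `site.getsitepackages()` should show the same network paths that appear in the generator3.py argument list.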
I can live with that for now. Thank you so much for your support!
Well, I don't think there is a way to stop the skeleton generator, unfortunately. It refreshes skeletons every time you open a project, and I guess the core issue is that PyCharm simply doesn't work well with network mounts. This is a known issue, and we have to make do with weird workarounds, but our general recommendation is to not use network drives for anything related to the project or its libraries.
I removed the .idea directory and it sped up the indexing dramatically. The IDE seems to work; however, all the system-specific packages, modules, classes etc. are no longer in the tab completion. Also, all my imports of system-specific classes are marked as errors, while at the same time everything seems to work. I also noticed that directories I now exclude are correctly marked orange. This was also broken.
As far as UNC paths are concerned, I don't have any.
>I removed the .idea directory and it sped up the indexing dramatically. The IDE seems to work; however, all the system-specific packages, modules, classes etc. are no longer in the tab completion. Also, all my imports of system-specific classes are marked as errors, while at the same time everything seems to work.
Well, that's because...
>So when I run my project, all I care about is my project files + common system paths like /usr/lib/pythonx.x, they should be indexed and tracked, while those network paths should be just available to the interpreter and IDE should not care about them whatsoever.
Now that the IDE "doesn't care about them", it will give warnings about missing modules, libraries, classes, etc., because we removed them from the indexing. If you want those objects to be resolved, we can add them back to PYTHONPATH and they will be indexed, but you'll have the performance issues again.
You can suppress those warnings/errors so they don't bother you, but they will also be suppressed for objects that you import from local libraries...
As I said, these are all weird workarounds because, unfortunately, PyCharm does not get along well with network drives. Sorry about it.
In my case it is indexing the full anaconda directory.
I have to admit that I am disappointed in the level of support we are getting for a licensed product. I have a license on Toolbox and PyCharm is one of the many IDEs I am using. The way this is treated is worse than with open source products.
My machines are killed every evening at work, so I have to wait for hours for the indexing to stop. None of the suggested fixes really works. This should be addressed properly.
It's about a year later, and an install of PyCharm 2020.1 on an 8-core i9 has had the fans going and blowing for over 5 minutes already. The machine is somewhat responsive but noticeably slower. Oh, PyCharm needs to "index", whatever that is.
My code base is not huge, and it is inexplicable why this software should have to take over and hog the Mac.
This problem goes back many moons and JetBrains appears utterly uninterested in solving it. I am no longer interested in paying for this. I never thought I'd say this, but Microsoft is way better with their progress, focus and attitude on MS VSC.
I am not on a network and previously did not have a problem. However, for some reason that I have forgotten, I once selected [File] [Invalidate Caches / Restart ...]
Since I did that weeks ago, my machine has not stopped indexing and re-indexing, and I have problems working because some functions and features are disabled.
I have tried leaving the machine powered on and the project open in the hope that it would complete indexing, but I don't think that is the issue. I think it is performing the indexing repeatedly.
I have a 2nd Windows PC on which I edit the same repository and projects, and on that one the indexing performs just fine. Of course I will not do the invalidate caches operation on that one.
I am on 2019.2.6
Hello @Lmm305-themzlab!
First of all, please install the latest update, which is 2020.1. If it does not help, please report the issue here
https://youtrack.jetbrains.com/issues/PY with attached logs folder zipped from ***Help | Collect logs and Diagnostic Data*** for a closer look.
I updated to 2020.1.1 and the behavior seems the same, so I collected the logs and data and submitted an issue, thanks!
The legacy of stock answers from "support" will continue until you get the next "improved" PyCharm with updates on how to navigate through a debugger nicely and how to integrate git better - AKA - updates no one cares about. When a so-called flagship IDE maxes out your memory and takes more than 20 mins to index, there is something wrong, really really wrong.
Really agree. I lose hours of productivity because somehow PyCharm decides to index the entire project again. It has a crazy memory consumption and renders the IDE rather useless. I didn't even change ANY package these last days, just restarted my computer.
Can we just make a button: reindex? and tell pycharm to otherwise be silent?
Hello Roelant Stegmann ,
please report the issue here
https://youtrack.jetbrains.com/issues/PY with attached logs folder zipped from ***Help | Collect logs and Diagnostic Data*** . Thank you!