If I'm running a remote PyCharm IDE in a container, which IDE directories should I preserve between containers?

Hello,

I'm running a remote PyCharm IDE in a Docker container and would like to know which IDE directories (config, cache, etc.) I should preserve when I delete the container and start another one, i.e. by mounting them from a persistent filesystem.

The goal is to keep the directories that make development convenient and hide the fact that a new container was created, without keeping stateful things that hinder reproducibility, i.e. things whose absence would make a fresh container started without the preserved directories fail.

Here is what I'm doing so far: https://github.com/CLAIRE-Labo/python-ml-research-template/blob/26ffd831a62132f1f87fee6d8ddc5ff15592a90c/installation/docker-amd64-cuda/entrypoints/remote-development-setup.sh#L54

For example, I'm preserving the $HOME/.config/JetBrains directory, which keeps the IDE settings. However, it seems like the interpreter is indexed(?) (see screenshot) every time I start a new container, so I may be missing some files.

Preserving the whole $HOME directory is not an option, as many files can end up there that a fresh container would then lack, making the setup impossible to reproduce without it. For example, one can install pip packages there, which will be missing when a fresh container is created.

Thanks!

Hi @Skander Moalla! Wouldn't Dev Containers be better suited for this scenario?


Hello Mikhail Tarabrikov! I don't have direct access to my container; it's a remote container running behind a Kubernetes scheduler (Run:ai).

PyCharm can't attach to a running container in Kubernetes the way VS Code can (https://code.visualstudio.com/docs/devcontainers/attach-container#_attach-to-a-container-in-a-kubernetes-cluster).

So I'm running an SSH server and the PyCharm IDE in the container, and I connect to it as if it were a remote machine. To make it fully behave like a remote machine, I'm looking to preserve some config directories and the like (while keeping the reproducibility benefit of the container).

I don't think Dev Containers are an option here. Let me know if I should dig deeper into them.


Thanks Mikhail Tarabrikov for the link! I think what I'm looking for is in the system directory `.cache/JetBrains` (https://www.jetbrains.com/help/pycharm/directories-used-by-the-ide-to-store-settings-caches-plugins-and-logs.html#system-directory).

But that directory contains both things that should persist and things that shouldn't, e.g.:

…/RemoteDev/ seems to be about the currently active remote development IDE and should not be persisted across containers.

…/RemoteDev-PY/ seems to have both things that should be persisted, like …/index/, and things that shouldn't, like `pid.279.temp.jbr`.

Could you give some details about what each directory contains, or what can safely be persisted to reduce startup time? Thanks


The missing things at startup were:


Skander Moalla, we would recommend binding all directories mentioned in https://www.jetbrains.com/help/pycharm/directories-used-by-the-ide-to-store-settings-caches-plugins-and-logs.html, namely:

~/.cache/JetBrains
~/.config/JetBrains
~/.local/share/JetBrains

Otherwise (if you bind only subdirectories), there is no guarantee that everything will work as expected.
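
When the scheduler only offers a single persistent volume rather than three separate bind mounts, one possible way to keep these three directories is to symlink them into that volume from the container's entrypoint. The following is a minimal sketch under that assumption: the function name `link_jetbrains_dirs`, the volume path, and the layout under it are hypothetical, and this thread does not confirm that the IDE treats symlinks exactly like bind mounts.

```shell
# Sketch: point the three JetBrains directories at a persistent volume.
# persist_root and its internal layout are assumptions for illustration.
link_jetbrains_dirs() {
    persist_root=$1   # mount point of the persistent volume (hypothetical)
    home_dir=$2       # home directory of the container user
    for rel in .cache/JetBrains .config/JetBrains .local/share/JetBrains; do
        target="$persist_root/$rel"
        link="$home_dir/$rel"
        # Make sure the persisted directory and the link's parent exist.
        mkdir -p "$target" "$(dirname "$link")"
        # Replace an existing symlink; leave a real directory untouched.
        if [ -L "$link" ]; then
            rm "$link"
        fi
        if [ ! -e "$link" ]; then
            ln -s "$target" "$link"
        fi
    done
}

# Example call from an entrypoint script (paths are placeholders):
# link_jetbrains_dirs /mnt/persist "$HOME"
```

The whole top-level directories are linked rather than subdirectories, in line with the recommendation above.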


I see. Makes sense.

I'm worried about two things in this case:

1- If only one container at a time has these directories bound, can I expect container startup and tear-down to play nicely, as if it were a machine rebooting?

2- If multiple containers bind the same directories, can I expect them to play nicely? As shown in the previous screenshots, they contain entries named after PIDs, which can conflict when shared between different systems.


Most plugins live on the client side, so I feel like binding the following directories could be what I'm looking for?

~/.cache/JetBrains/RemoteDev-PY
~/.config/JetBrains/RemoteDev-PY

Skander Moalla,

1- Yes, this should work the same way as with a virtual machine.

2- No, I would expect collisions when the same directories are shared by several containers running the same version of the IDE.

We would recommend including the following binds (not the children):

~/.cache/JetBrains
~/.config/JetBrains
~/.local/share/JetBrains

E.g. ~/.cache/JetBrains/RemoteDev would contain installation files and session-related data, while ~/.cache/JetBrains/RemoteDev-PY would have indexes, caches and other IDE-related data.


Then how should I proceed when running multiple containers?

It seems like `~/.*/JetBrains/RemoteDev` contains container-specific data that will cause collisions, while `~/.*/JetBrains/RemoteDev-PY` doesn't. Am I missing something?

Thanks a lot for the help!


Skander Moalla,

"how should I proceed when running multiple containers"

I think it would depend on the Run:ai configuration. Based on their documentation, I would try mount options like the following for persistent storage, S3 buckets, local volumes, etc.:

-v /raid/public/john/cache/container-1:~/.cache/JetBrains:rw
-v /raid/public/john/config/container-1:~/.config/JetBrains:rw
-v /raid/public/john/local/container-1:~/.local/share/JetBrains:rw

[…]

-v /raid/public/john/cache/container-N:~/.cache/JetBrains:rw
-v /raid/public/john/config/container-N:~/.config/JetBrains:rw
-v /raid/public/john/local/container-N:~/.local/share/JetBrains:rw
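
The per-container pattern above can be generated rather than typed by hand. Below is a small sketch, assuming Docker-style `-v host:container:mode` syntax (whether Run:ai accepts the same form depends on its configuration). The function name `jetbrains_mount_flags` and the container home `/home/john` are hypothetical. The sketch takes an absolute container home as a parameter because Docker expects an absolute path on the container side of `-v`, and a `~` in the middle of an argument is generally not expanded by the shell.

```shell
# Sketch: print the three -v flags for one container.
# host_base, name and container_home are hypothetical parameters.
jetbrains_mount_flags() {
    host_base=$1        # e.g. /raid/public/john
    name=$2             # e.g. container-1
    container_home=$3   # absolute home path inside the container (assumed)
    printf '%s\n' \
        "-v $host_base/cache/$name:$container_home/.cache/JetBrains:rw" \
        "-v $host_base/config/$name:$container_home/.config/JetBrains:rw" \
        "-v $host_base/local/$name:$container_home/.local/share/JetBrains:rw"
}

# Prints one flag per line for container-1.
jetbrains_mount_flags /raid/public/john container-1 /home/john
```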

OK, I see what you mean now. So the idea would be to keep a copy of the directories per project/container (assuming one development container per project).

I guess that avoids any issues, but it also goes against the benefit of having the same IDE configuration to work on multiple projects. One would have to reconfigure the IDE for every project/container.

I think this is unavoidable for the cache (two different containers/projects might use the same interpreter name, and their caches would conflict), but it should be okay for config and local, no?

What do you think of the following? Do you see any issues? (The difference is in the all-containers directories.)

Would you have an alternative that avoids reconfiguring the IDE for every new project?

-v /raid/public/john/cache/container-1:~/.cache/JetBrains:rw
-v /raid/public/john/config/all-containers:~/.config/JetBrains:rw
-v /raid/public/john/local/all-containers:~/.local/share/JetBrains:rw

[…]

-v /raid/public/john/cache/container-N:~/.cache/JetBrains:rw
-v /raid/public/john/config/all-containers:~/.config/JetBrains:rw
-v /raid/public/john/local/all-containers:~/.local/share/JetBrains:rw
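
For this mixed layout, the host-side directories could be prepared up front so that every mount source exists before a container starts. This is only a sketch: the function name `prepare_mixed_layout` is hypothetical, the directory names follow the example above, and whether sharing config and local between containers that run at the same time is safe has not been verified in this thread.

```shell
# Sketch: create host directories for the mixed layout:
# one cache directory per container, shared config and local directories.
prepare_mixed_layout() {
    host_base=$1   # e.g. /raid/public/john
    shift          # remaining arguments are container names
    mkdir -p "$host_base/config/all-containers" \
             "$host_base/local/all-containers"
    for name in "$@"; do
        mkdir -p "$host_base/cache/$name"
    done
}

# Example call (names are placeholders):
# prepare_mixed_layout /raid/public/john container-1 container-2
```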

Skander Moalla, I think it is worth trying with shared .local and .config directories. At first sight, I don't see any issues.

