Remote Python interpreter via Docker (and nvidia-docker) problem (GPU)

Answered

Hi!

Our setup

We have a number of GPU servers in our office. Each server can use nvidia-docker to run TensorFlow and similar frameworks with GPU support. I can also launch the regular Docker executable with --device flags so that it sees the GPUs. My development machine is a Mac, so I'm using the Mac version of PyCharm.
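For readers unfamiliar with the --device approach mentioned above, here is a minimal sketch of how such a command line could be assembled. The device node names and the image name are assumptions for illustration; the real nodes can be listed with `ls /dev/nvidia*` on the server. The container may also need the NVIDIA driver libraries, which nvidia-docker normally provides, so this is not a full replacement for it.

```python
from typing import Iterable, List


def docker_run_with_devices(image: str, devices: Iterable[str]) -> List[str]:
    """Build a `docker run` argument list exposing each host device node."""
    cmd = ["docker", "run", "--rm"]
    for dev in devices:
        # --device HOST:CONTAINER maps a host device node into the container
        cmd += ["--device", f"{dev}:{dev}"]
    cmd.append(image)
    return cmd


# Assumed device nodes (hypothetical for this sketch); check the server.
GPU_DEVICES = ["/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm"]

if __name__ == "__main__":
    # The image name is only an example.
    print(" ".join(docker_run_with_devices("tensorflow/tensorflow:latest-gpu", GPU_DEVICES)))
```

The function only builds the argument list; passing it to `subprocess.run` on the server would actually start the container.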

What I want to do

I want to use PyCharm's Docker integration to connect to the servers on our LAN and run/debug code inside a Docker container on them, i.e. use the Python interpreter located inside a Docker container on a server.

My problem

There are several obstacles to doing this. I think I've come very close to solving them, but I'm now stuck. The main problem seems to be that the Docker integration in PyCharm only supports locally running Docker containers.

What I've done

I've tried a multitude of things. Here I mention a couple:

Forward nvidia-docker socket

1. Forward the nvidia-docker socket via SSH to my local machine (like so: https://medium.com/@dperny/forwarding-the-docker-socket-over-ssh-e6567cfab160)

2. Configure the nvidia-docker plugin to allow outside connections, and set the permissions so that the forwarding can happen

3. Hook up PyCharm to this socket by setting the Engine API URL in the Docker dialogue to <my-forwarded-nvidia-docker-socket>

This works to the extent that PyCharm connects to it, but I get a "NotAcceptableException" in the Docker dialogue (the same place where I enter the Engine API URL). I'm guessing this is because PyCharm sees that this is not a Docker socket but an nvidia-docker socket.

However, if I instead forward the regular Docker socket at /var/run/docker.sock to my local machine in the same way, everything works just fine! The only problem is that I then can't see the GPUs. For that I need to either pass --device flags to Docker or run nvidia-docker instead of Docker.
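As a rough sketch of the socket forwarding described in this section: OpenSSH 6.7 and later can forward a Unix socket with `-L local_socket:remote_socket`. The host name `user@gpu-server` and the local path `/tmp/docker-gpu.sock` are hypothetical placeholders, not values from this setup.

```python
import shlex
from typing import List


def socket_forward_command(remote: str, local_socket: str,
                           remote_socket: str = "/var/run/docker.sock") -> List[str]:
    """Build an ssh command that forwards a remote Unix socket to a local one."""
    # -n: no stdin, -N: no remote command, -T: no TTY
    # -L local:remote forwards the local socket path to the remote socket
    return ["ssh", "-nNT", "-L", f"{local_socket}:{remote_socket}", remote]


if __name__ == "__main__":
    # Hypothetical host and local socket path, for illustration only.
    cmd = socket_forward_command("user@gpu-server", "/tmp/docker-gpu.sock")
    print(" ".join(shlex.quote(part) for part in cmd))
```

With the tunnel running, the Engine API URL in PyCharm would then be something like unix:///tmp/docker-gpu.sock, assuming that local path.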

Docker compose

The next thing I tried was to force Docker to take those --device flags. One way to do this is reported here: http://www.yigitozen.info/posts/2017/02/28/pycharm-docker-nvidia-gpu-container.html The idea is to generate a docker-compose file with nvidia-docker-compose and then use that yml file to launch Docker Compose instead of Docker in PyCharm. This seemed to work fine if I was on the local machine, but not remotely. Here's what I tried:

1. Forward the regular Docker socket via the method in the previous section

2. Set up Docker in PyCharm to use that Docker server

3. Use the nvidia-docker.yml file to launch Docker Compose in PyCharm

This didn't work and frankly showed some pretty odd behaviour. PyCharm connects to the server via the SSH tunnel but starts dumping a bunch of tmp files in /Users/<user>/Library/PyCharm/somethingsomething on the server, and then complains that it can't find them. I think this is because it assumes the socket it connects to is running on my local Mac (which would explain the distinctly Mac-like folder structure), when it is in fact running on the server. I couldn't verify whether this would actually work with the GPUs, because I was unable to get past the tmp file problem.

Is there a way to solve this problem?

It seems that any of these three fixes should do it:

1. Find a way to circumvent the NotAcceptableException

2. Get Docker Compose to work nicely with remote servers

3. Be able to pass --device flags to Docker

(I'm guessing (2) is a hard one).

Can anybody help with this?


Michael,

Thank you for the great explanation! Regarding the solutions you proposed:

  1. I've created a new issue about the NotAcceptableException. To investigate this problem we need more information. Please take a look at the comment below the issue.
  2. This existing issue would technically solve PyCharm's problem with docker-compose not working with remote servers.
  3. I've created another issue for passing --device flags to Docker in Python run configurations.

I think the second point will most likely be ready first. We will try to get it done in PyCharm 2017.3.

Permanently deleted user

Hi Alexander

Thanks a lot! I'll add the logs to the new issue (no. 1) over the weekend.

 


Hello, any update on this? I'm interested in the functionality as well.


Hi,

possible answer/workaround:

https://github.com/NVIDIA/nvidia-docker/issues/303#issuecomment-415668695 

Simply change Docker's "default-runtime" to "nvidia" by adding the line

"default-runtime": "nvidia"

to your /etc/docker/daemon.json file.
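A minimal sketch of what the resulting configuration could look like, written as a small helper that merges the setting into an existing config. It assumes the "nvidia" runtime is provided by nvidia-docker2 as `nvidia-container-runtime`; if your daemon.json already registers that runtime, only the "default-runtime" key is new. The helper name is made up for this example.

```python
import json
from typing import Any, Dict


def with_nvidia_default_runtime(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a daemon.json config with nvidia as the default runtime."""
    updated = dict(config)
    updated["default-runtime"] = "nvidia"
    runtimes = dict(updated.get("runtimes", {}))
    # Assumption: the runtime binary installed by nvidia-docker2 is on PATH.
    runtimes.setdefault("nvidia", {"path": "nvidia-container-runtime", "runtimeArgs": []})
    updated["runtimes"] = runtimes
    return updated


if __name__ == "__main__":
    # Start from an empty config; in practice, load the existing file first.
    print(json.dumps(with_nvidia_default_runtime({}), indent=4))
```

The printed JSON would go into /etc/docker/daemon.json on the server, and the Docker daemon needs a restart before the change takes effect. After that, containers started through the regular Docker socket use the nvidia runtime without extra flags.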
