Running a remote interpreter on a cluster with srun
I need to run Python scripts on a lab cluster that has GPU machines. To do this, I ssh to a login node and then run a command that connects to the GPU machine. So there is an extra command
interact -gpu
to run after the ssh session.
Is there a way to do this with PyCharm, so that PyCharm runs the remote interpreter on the GPU compute node?
I'd like to use PyCharm mostly for remote debugging on those GPU machines.
I have the same question. I need to run srun to get access to the GPU resource node on a cluster. I cannot run it on the head node.
I have the same problem. Does anyone know how to solve it?
It is not possible at the moment, unfortunately. If you want to debug the remote script, you can use the Debug Server: https://www.jetbrains.com/help/pycharm/remote-debugging-with-product.html#remote-debug-config
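For anyone trying the Debug Server route, here is a minimal sketch of what the script running on the GPU node might contain. It assumes the pydevd-pycharm package is installed on the cluster in a version matching your PyCharm build, and that a Python Debug Server run configuration is already listening in the IDE. The host "localhost", the port 12345, and the helper name attach_to_debug_server are placeholders for illustration only. The compute node has to be able to reach that host and port, for example through a forwarded port.

```python
def attach_to_debug_server(host="localhost", port=12345):
    """Connect this process to a PyCharm Python Debug Server that is already listening.

    host and port are placeholders: use whatever address the compute node
    can actually reach and the port set in the run configuration.
    """
    # Imported lazily so that defining this helper does not require the package.
    import pydevd_pycharm  # provided by the pydevd-pycharm package

    pydevd_pycharm.settrace(
        host,
        port=port,
        stdoutToServer=True,
        stderrToServer=True,
    )


# Call this near the top of the script you launch on the GPU node, e.g.:
# attach_to_debug_server()
```

Once the call connects, breakpoints set in PyCharm should be hit in the process running on the compute node.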
Hi
Have you guys found a solution to this?
Yes, you set up a debug server on your local machine, as suggested in the previous comment.
I create an SSH tunnel to the compute node.
How did you create an SSH tunnel directly to the compute node? I am not very familiar with this, as I only started using PyCharm a month ago.
This is not PyCharm-specific. If PyCharm can SSH to the compute node directly, then do that. If not, you can use the "-L" option of the ssh command to make a tunnel. Typically, you have a login node through which you connect to the compute node. So you can run "ssh -L 2222:compute_node:22 login_node". This connects through SSH from your computer to the login_node, and it also listens for TCP connections on port 2222 of your local computer. When a connection arrives, it is tunneled through this SSH connection and forwarded from the login_node to port 22 on the compute_node (which is the typical port for SSH).
Then, you can create a remote interpreter in PyCharm to localhost:2222 (which will be the compute_node). If you could connect directly to the compute node, you could create a remote interpreter to compute_node directly and that'd be it.
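To make the pieces of the "-L" argument explicit, here is a small sketch that only assembles the command line. It does not open the tunnel. The function name build_tunnel_command is made up for illustration, and "login_node" and "compute_node" are the same placeholder host names used above.

```python
import shlex


def build_tunnel_command(login_node, compute_node, local_port=2222, remote_port=22):
    """Return the argv for `ssh -L local_port:compute_node:remote_port login_node`."""
    # local_port   : port opened on your own machine (what PyCharm connects to)
    # compute_node : host the login node forwards the connection to
    # remote_port  : port on the compute node (22 is the standard SSH port)
    forward = f"{local_port}:{compute_node}:{remote_port}"
    return ["ssh", "-L", forward, login_node]


cmd = build_tunnel_command("login_node", "compute_node")
print(shlex.join(cmd))  # prints: ssh -L 2222:compute_node:22 login_node
```

The tunnel only exists while that ssh session stays open, so keep it running in a terminal while you use the interpreter.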
Some clusters may have a port blocking policy. Try your luck.
Do you mean that they don't allow SSH'ing to the compute node?
As I understand your description, the compute node talks to my port 2222. The cluster that I use does not allow the compute node to connect to any non-standard ports such as 2222. So, in my understanding, your solution would not work on my cluster. Were you able to get it to work?
I run it daily for the cluster I use.
It's not as you described. The IDE connects through SSH to port 2222 on your own machine, and the ssh command forwards that connection to port 22 on your compute node (the standard SSH port). The compute node never connects to port 2222.
Hi
I got the cluster node to listen for TCP connections, but when I run my code with the SSH interpreter in PyCharm, I think it is still running on the login node, since I don't see any indication that it is on the specific node. My workaround is that the cluster provides an interactive Python, so I can run my files there and edit them in PyCharm.
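One way to check where the code actually runs is to print the host name from inside the script. This is only a sketch: the helper name describe_host is made up, and the SLURM_JOB_ID check assumes the cluster uses Slurm (the thread mentions srun). If you SSH straight to the compute node rather than going through a job allocation, that variable may be unset, so the host name is the main thing to look at.

```python
import os
import socket


def describe_host():
    """Collect a few hints about where this interpreter is running."""
    return {
        "hostname": socket.gethostname(),
        # Set by Slurm inside a job allocation (assumes a Slurm cluster).
        "slurm_job_id": os.environ.get("SLURM_JOB_ID"),
        # Often set by the scheduler when GPUs are allocated to the job.
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }


for key, value in describe_host().items():
    print(f"{key}: {value}")
```

Run it with the PyCharm SSH interpreter and compare the host name with what you see in a plain ssh session on the login node.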
Hi
Could you be more specific on how to create a remote interpreter in PyCharm to localhost:2222?
I tried and did not find a way to do so.
You first do:
ssh -L 2222:compute_node:22 login_node
Then you create the interpreter pointing to localhost:2222
Thanks, Santiago Castro
May I know how to set up an interpreter on localhost:2222? The SSH settings in PyCharm give me no way to do this.
You add the interpreter the same way as if it were on the compute_node, but with the host "localhost" and the port 2222 instead of port 22.
Santiago Castro
Thanks for the reply!
I can now use the SSH tunnel to start a Jupyter notebook in a web browser, but PyCharm still does not work. Any thoughts are much appreciated!
Below are my steps:
1. I start the jupyter-lab backend on the remote server and get the ports.
2. Using http://localhost:8888, I can open JupyterLab in a web browser.
3. However, when I try to set up the configs in PyCharm, it keeps connecting to the port forever.
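One possible cause, offered only as a guess, is that the SSH interpreter in PyCharm is pointed at a port that does not speak SSH (for example the forwarded Jupyter port). An SSH server sends an identification line starting with "SSH-" as soon as a client connects, so a small check like the sketch below can tell whether a local port really leads to an SSH server. The function names and the default port 2222 are illustrative.

```python
import socket


def read_ssh_banner(host="localhost", port=2222, timeout=3.0):
    """Return the first line the server sends, or None if nothing usable arrives."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = sock.recv(256)
    except OSError:
        # Covers refused connections and timeouts (e.g. an HTTP server
        # that waits for the client to speak first).
        return None
    line = data.split(b"\n", 1)[0].strip()
    return line.decode("ascii", errors="replace") or None


def looks_like_ssh(banner):
    """True if the banner looks like an SSH identification string."""
    return banner is not None and banner.startswith("SSH-")


# Example use:
# print(looks_like_ssh(read_ssh_banner("localhost", 2222)))
```

If this reports False for the port you gave PyCharm, the interpreter is probably not being pointed at the forwarded SSH port.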