Code with me plugin with own lobby server, session expiration issue

Hello

We have deployed the lobby, relay and redis servers in our kubernetes using our own helm chart.

In Intellij one can start a new session and get a share link. But no user is able to join. The lobby server logs show that a session is created, but shortly after deleted.

Is this an indication of some bad configuration? Or is there a setting we have missed?

Below the lobby server logs:

2020-11-19 11:30:13,239 INFO events - event - hostBuildNumber=IU-202.7660.26, relayRegion=forced_uri, disabledFeaturesOnHost=p2p_quic, eventType=sessionCreated, sessionId=JYsgFlg8RArkXu_xuv2U9Q, type=event
2020-11-19 11:30:13,240 INFO requestLog - http request POST /createSession - duration=17, path=/createSession, requestId=ad1dbf14-8020-445e-81b3-b2b27255f2eb, ip=10.22.145.248, sessionId=JYsgFlg8RArkXu_xuv2U9Q, httpCode=201, successful=true
2020-11-19 11:30:13,451DEBUG c.i.c.s.LobbyServer - Loaded session 'JYsgFlg8RArkXu_xuv2U9Q' from redis key '5-JYsgFlg8RArkXu_xuv2U9Q':
| {
| "created" : 1605781813.235219000,
| "updated" : 1605781813.235220000,
| "eventsLogging" : true,
| "hostBuildNumber" : "IU-202.7660.26",
| "hostFingerprint" : "30DD.......................removed..................90E0",
| "hostProjectName" : "vvn-vrsvereinbarung-service",
| "hostUserMoniker" : "myUsername",
| "relayUrl" : "wss://cwm-relay-test.kubedev.corp.test",
| "relayRegion" : "forced_uri",
| "hostDirectTcpEndpoints" : [ "tcp://10.0.75.1:5990", "tcp://172.16.130.99:5990", "tcp://192.168.146.161:5990", "tcp://84.75.144.62:5990" ],
| "hostFeatures" : [ "ws_relay", "direct_tcp" ]
| } - duration=2, path=/JYsgFlg8RArkXu_xuv2U9Q, requestId=a8984e6a-6272-43f0-af5b-83ddadf54f48, ip=10.22.145.248, sessionId=JYsgFlg8RArkXu_xuv2U9Q
2020-11-19 11:30:13,453DEBUG c.i.c.s.LobbyServer - Removing session 'JYsgFlg8RArkXu_xuv2U9Q' by redis key '5-JYsgFlg8RArkXu_xuv2U9Q' - path=/JYsgFlg8RArkXu_xuv2U9Q, sessionId=JYsgFlg8RArkXu_xuv2U9Q, requestId=a8984e6a-6272-43f0-af5b-83ddadf54f48, ip=10.22.145.248
2020-11-19 11:30:13,454 INFO events - event - eventType=sessionRemoved, sessionId=JYsgFlg8RArkXu_xuv2U9Q, type=event
2020-11-19 11:30:13,455 INFO requestLog - http request DELETE /JYsgFlg8RArkXu_xuv2U9Q - duration=7, path=/JYsgFlg8RArkXu_xuv2U9Q, requestId=a8984e6a-6272-43f0-af5b-83ddadf54f48, ip=10.22.145.248, sessionId=JYsgFlg8RArkXu_xuv2U9Q, httpCode=200, successful=true
2020-11-19 11:30:27,794 WARN requestLog - HTTP 404 code for user: Session 'JYsgFlg8RArkXu_xuv2U9Q' is not found - duration=2, path=/JYsgFlg8RArkXu_xuv2U9Q/info, requestId=89bed566-1254-4a95-aca8-845878eda18d, ip=10.22.145.248, sessionId=JYsgFlg8RArkXu_xuv2U9Q, httpCode=404, successful=false

The deployment is done with a helm chart: https://github.com/wemu/jetbrains-cwm-chart 

Any hints would be cool :-)

 

0

Hello,

If possible, can you please check if IntelliJ IDEA v2020.3 (https://www.jetbrains.com/idea/nextversion) solves the problem for you. 

Thank you  

1

Thanks for your reply!

Good news and more questions :)

Tried the same with the 2020.3 EAP version. That looks way better! The log contains continous session load statements. And if I stop the sharing in IntelliJ a session delete is visible:

2020-11-26 16:28:30,950 INFO requestLog - http request POST /createSession - duration=30, path=/createSession, requestId=935870d5-50ec-49f4-a80e-9b42f5860be4, ip=10.22.145.248, sessionId=LVyxt36P2lH2b6_xEEz5yQ, httpCode=201, successful=true
2020-11-26 16:28:31,105DEBUG c.i.c.s.LobbyServer - Loaded session 'LVyxt36P2lH2b6_xEEz5yQ' from redis key '5-LVyxt36P2lH2b6_xEEz5yQ':
2020-11-26 16:28:31,151 INFO requestLog - http request POST /LVyxt36P2lH2b6_xEEz5yQ/joinPoll - duration=49, path=/LVyxt36P2lH2b6_xEEz5yQ/joinPoll, requestId=033f2fd7-ef9f-4132-b058-e6f70d6c7064, ip=10.22.145.248, sessionId=LVyxt36P2lH2b6_xEEz5yQ, httpCode=200, successful=true
2020-11-26 16:29:02,251DEBUG c.i.c.s.LobbyServer - Loaded session 'LVyxt36P2lH2b6_xEEz5yQ' from redis key '5-LVyxt36P2lH2b6_xEEz5yQ':
2020-11-26 16:32:39,732 INFO requestLog - http request POST /LVyxt36P2lH2b6_xEEz5yQ/joinPoll - duration=4, path=/LVyxt36P2lH2b6_xEEz5yQ/joinPoll, requestId=3a8d1499-f0a0-4a3f-abd1-757cf97ee59e, ip=10.22.145.249, sessionId=LVyxt36P2lH2b6_xEEz5yQ, httpCode=200, successful=true
2020-11-26 16:33:10,800DEBUG c.i.c.s.LobbyServer - Loaded session 'LVyxt36P2lH2b6_xEEz5yQ' from redis key '5-LVyxt36P2lH2b6_xEEz5yQ':
2020-11-26 16:33:10,801 INFO requestLog - http request POST /LVyxt36P2lH2b6_xEEz5yQ/joinPoll - duration=3, path=/LVyxt36P2lH2b6_xEEz5yQ/joinPoll, requestId=0cc090b9-f59d-4c81-8919-51ef5c20759f, ip=10.22.145.249, sessionId=LVyxt36P2lH2b6_xEEz5yQ, httpCode=200, successful=true
2020-11-26 16:33:27,047DEBUG c.i.c.s.LobbyServer - Loaded session 'LVyxt36P2lH2b6_xEEz5yQ' from redis key '5-LVyxt36P2lH2b6_xEEz5yQ':

2020-11-26 16:33:27,052DEBUG c.i.c.s.LobbyServer - Removing session 'LVyxt36P2lH2b6_xEEz5yQ' by redis key '5-LVyxt36P2lH2b6_xEEz5yQ' - path=/LVyxt36P2lH2b6_xEEz5yQ, sessionId=LVyxt36P2lH2b6_xEEz5yQ, requestId=3983161e-b246-4cc8-8d0b-0850ae496fcf, ip=10.22.145.251
2020-11-26 16:33:27,053 INFO events - event - eventType=sessionRemoved, sessionId=LVyxt36P2lH2b6_xEEz5yQ, type=event
2020-11-26 16:33:27,054 INFO requestLog - http request DELETE /LVyxt36P2lH2b6_xEEz5yQ - duration=20, path=/LVyxt36P2lH2b6_xEEz5yQ, requestId=3983161e-b246-4cc8-8d0b-0850ae496fcf, ip=10.22.145.251, sessionId=LVyxt36P2lH2b6_xEEz5yQ, httpCode=200, successful=true

IntelliJ now shows an error related to the relay server:

Error creating websocket 'wss://cwm-relay-test.kubedev.corp.test/register/NqVow3h_mM8fpwMVnz0VYg'

java.util.concurrent.CompletionException: java.net.http.WebSocketHandshakeException
            at java.base/java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
            at java.base/java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
            at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1074)
            at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
            at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)
            at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:840)
            at java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
            at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
            at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
            at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
            at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
            at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.net.http.WebSocketHandshakeException
            at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.resultFrom(OpeningHandshake.java:224)
            at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
            ... 9 more
Caused by: jdk.internal.net.http.websocket.CheckFailedException: Unexpected HTTP response status code 503
            at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.checkFailed(OpeningHandshake.java:341)
            at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.handleResponse(OpeningHandshake.java:250)
            at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.resultFrom(OpeningHandshake.java:220)
            ... 10 more

 

Is there a way to check connectivity with the relay server? something similar like /info resource that the lobby server offers? I checked the ports and service / ingress yaml files but the relay server log does not do anything. Not sure it listens on the port at all.

 

Thanks a lot!

kind regards

werner

 

 

 

0

Hello,

If possible, can you please share the full stack trace with the exceptions that you get. If convenient, you may create an issue at the YouTrack in CWM project (https://youtrack.jetbrains.com/issues/CWM).

What for the diagnostics, you may use the prometheus metrics address:

-prometheusAddr: The address on which to listen for prometheus metrics request. E.g.: 127.0.0.1:4422

Right now you can ask the / resource, it should return the WebSocket relay in Go. This is not documented and as such can be changed at any time.

Also you can ask any non-existing resource, e.g. /non-existing-resourcerelay.log should log the 404 for that.

Please share the results here or in the issue created at the YouTrack.

Thank you 

0

Thanks for the reply!

The exception is gone after I upgraded to 2020.3 RC - so that's good!

I still struggle connecting to the relay server, despite its configuration being almost the same as for the lobby server. But that might be caused by our kubernetes / service restrictions. Gonna check with an admin next week. Not sure why the lobby server works but the relay server is so picky (ingress seems not to like to connect to the relay server)

Currently I can only get the relay server to log something from within the relay server (wget localhost....). Not via the service name from the lobby pod and not via the ingress link. But I'll ask an admin first.

Both relay and lobby server are http? for the server and promtheus? the ssl termination is supposed to be done on the ingress / nginx, I got that right?

 

Thanks a lot!

0

update:

with some trying and erroring I was able to resolve my relay server connectivity issues. The addresses for the relay server and prometheus must be specified as ":4422" and ":4099" (or whatever port one likes). If the host is put there as well (localhost, 127.0.0.1) the server only listens to local connections.

This was probably obvious for all the Go people out there. Me just looking at the administration guide at https://www.jetbrains.com/help/cwm/code-with-me-administration-guide.html#relay did not really get that :)

Thanks again! :)

0

Hello,

> Both relay and lobby server are http? for the server and promtheus? the ssl termination is supposed to be done on the ingress / nginx, I got that right?

yes, you are correct.

Thank you for the feedback, we will change the documentation so that it becomes more clear.

0

请先登录再写评论。