Databricks console bricked after idling

Answered

We are using the official Databricks JDBC driver, available at https://www.databricks.com/spark/jdbc-drivers-download

We have run into an issue: after idling for an hour, a console gets into a state where every subsequent query throws an SQL exception. The error message is attached at the bottom of the post.

Is there any mechanism for the console to recreate the connection on errors like this? It seems that the session ID expired after idling for an hour, and the session needs to be reinitiated to get a new session ID. I suspect that "SQLException" is not specific enough to trigger a reconnect.

 

Mar 03 11:30:12.140 ERROR 56 com.databricks.client.exceptions.ExceptionConverter.toSQLException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP Response code: 400, Error message: BAD_REQUEST: Invalid SessionHandle: Session [ce5cc898-f617-4c43-9866-8d42b718ad80] is closed..
java.sql.SQLException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP Response code: 400, Error message: BAD_REQUEST: Invalid SessionHandle: Session [ce5cc898-f617-4c43-9866-8d42b718ad80] is closed..
    at com.databricks.client.hivecommon.api.HS2Client.handleTTransportException(Unknown Source)
    at com.databricks.client.spark.jdbc.DowloadableFetchClient.handleTTransportException(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2Client.executeStatement(Unknown Source)
    at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeNonRowCountQueryHelper(Unknown Source)
    at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(Unknown Source)
    at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.<init>(Unknown Source)
    at com.databricks.client.hivecommon.dataengine.HiveJDBCDataEngine.prepare(Unknown Source)
    at com.databricks.client.jdbc.common.SStatement.executeNoParams(Unknown Source)
    at com.databricks.client.jdbc.common.BaseStatement.executeQuery(Unknown Source)
    at com.databricks.client.hivecommon.jdbc42.Hive42Statement.executeQuery(Unknown Source)
    at com.intellij.database.remote.jdbc.impl.RemoteStatementImpl.executeQuery(RemoteStatementImpl.java:179)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:360)
    at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
    at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:587)
    at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
    at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:705)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
    at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:704)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: com.databricks.client.support.exceptions.ErrorException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP Response code: 400, Error message: BAD_REQUEST: Invalid SessionHandle: Session [01edb987-28c8-12aa-9ddc-4ea530934451] is closed..
    ... 29 more
Caused by: com.databricks.client.jdbc42.internal.apache.thrift.transport.TTransportException: HTTP Response code: 400, Error message: BAD_REQUEST: Invalid SessionHandle: Session [01edb987-28c8-12aa-9ddc-4ea530934451] is closed.
    at com.databricks.client.hivecommon.api.TETHttpClient.handleHeaderErrorMessage(Unknown Source)
    at com.databricks.client.hivecommon.api.TETHttpClient.handleErrorResponse(Unknown Source)
    at com.databricks.client.hivecommon.api.TETHttpClient.flushUsingHttpClient(Unknown Source)
    at com.databricks.client.hivecommon.api.TETHttpClient.flush(Unknown Source)
    at com.databricks.client.jdbc42.internal.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
    at com.databricks.client.jdbc42.internal.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at com.databricks.client.jdbc42.internal.apache.hive.service.rpc.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:216)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper.send_ExecuteStatement(Unknown Source)
    at com.databricks.client.jdbc42.internal.apache.hive.service.rpc.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:208)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper.callExecuteStatement(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper.access$400(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper$5.clientCall(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper$5.clientCall(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper.executeWithRetry(Unknown Source)
    at com.databricks.client.hivecommon.api.HS2ClientWrapper.ExecuteStatement(Unknown Source)
    ... 27 more
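
For anyone who hits the same error from their own JDBC code rather than from the console, here is a minimal sketch of how the condition could be recognized. It assumes the only reliable signal is the "Invalid SessionHandle" text seen in the trace above; `SessionErrors` and `isSessionClosed` are hypothetical names, not part of the Databricks driver or the IDE, and matching on message text is a heuristic that may break if the wording changes.

```java
public final class SessionErrors {
    // Text copied from the error above; this is a heuristic, not a driver contract.
    private static final String MARKER = "Invalid SessionHandle";

    private SessionErrors() {}

    /** Returns true if the throwable or any of its causes mentions the closed-session marker. */
    public static boolean isSessionClosed(Throwable error) {
        int depth = 0;
        // Walk the cause chain; the depth limit guards against cyclic causes.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            String message = t.getMessage();
            if (message != null && message.contains(MARKER)) {
                return true;
            }
        }
        return false;
    }
}
```

A caller could use this check to decide whether to open a new connection before retrying the query.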

7 comments

Hello,

Could you please share screenshots with database settings?


Can you specify which database settings exactly?


I am looking at Data Sources and Drivers -> Project Data Sources -> "Databricks" -> Options (tab).


I looked through them. Could "Auto-disconnect after 300 s" solve this problem?

"Run keep-alive query" should keep the session alive.
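
Outside the IDE, the same keep-alive idea can be approximated in application code. The sketch below is only an illustration under assumptions: `KeepAlive` is a hypothetical helper, `SELECT 1` is assumed to be an acceptable cheap query, and whether periodic queries actually prevent the Databricks session from expiring is not confirmed in this thread.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class KeepAlive implements AutoCloseable {
    // Single daemon thread so the pinger never blocks JVM shutdown.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "jdbc-keep-alive");
                t.setDaemon(true);
                return t;
            });

    /** Starts pinging the connection every {@code period} units until close() is called. */
    public KeepAlive(Connection connection, long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> ping(connection), period, period, unit);
    }

    /** Runs a cheap query; returns false instead of throwing if the query fails. */
    public static boolean ping(Connection connection) {
        try (Statement statement = connection.createStatement()) {
            statement.execute("SELECT 1"); // assumed keep-alive query
            return true;
        } catch (SQLException e) {
            return false; // the session may already be closed
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Note that the ping runs on a separate thread, so the connection should not be used for other work at the same moment unless the driver is known to tolerate concurrent use.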


Yes, you may try auto-disconnect, so that the session is closed when idle.


Yaroslav Bedrov, what exception should be thrown for the console to reconnect automatically?

As I understand it, SQLException signals to the console that there was an issue with the SQL command, not with the underlying connection.


It seems there is no exception that would trigger a reconnection. It is possible to send "keep alive" requests, though.


I have found this one https://docs.oracle.com/javase/8/docs/api/java/sql/SQLRecoverableException.html

It says:

The subclass of SQLException thrown in situations where a previously failed operation might be able to succeed if the application performs some recovery steps and retries the entire transaction or in the case of a distributed transaction, the transaction branch. At a minimum, the recovery operation must include closing the current connection and getting a new connection.

If this exception were thrown instead of a generic SQLException, would the console reinitiate the connection?
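
To illustrate what the quoted Javadoc describes, here is a rough sketch of a caller that treats `SQLRecoverableException` as the signal to close the connection, open a new one, and retry once. `ReconnectingRunner`, `ConnectionFactory`, and `Work` are hypothetical names made up for this example. It would only help with a driver that actually throws this subclass; the trace in the original post shows a plain `java.sql.SQLException`.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLRecoverableException;

public final class ReconnectingRunner {
    /** Hypothetical callback that opens a fresh connection. */
    @FunctionalInterface
    public interface ConnectionFactory { Connection open() throws SQLException; }

    /** Hypothetical unit of work executed against a connection. */
    @FunctionalInterface
    public interface Work<T> { T run(Connection connection) throws SQLException; }

    private final ConnectionFactory factory;
    private Connection current;

    public ReconnectingRunner(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** Runs the work; on SQLRecoverableException, reconnects and retries exactly once. */
    public synchronized <T> T run(Work<T> work) throws SQLException {
        if (current == null) {
            current = factory.open();
        }
        try {
            return work.run(current);
        } catch (SQLRecoverableException e) {
            // Minimum recovery per the Javadoc: close the connection and get a new one.
            closeQuietly(current);
            current = null;
            current = factory.open();
            return work.run(current); // a second failure propagates to the caller
        }
    }

    private static void closeQuietly(Connection connection) {
        try {
            connection.close();
        } catch (SQLException ignored) {
            // The old connection is being discarded anyway.
        }
    }
}
```

Retrying only once is a deliberate choice in this sketch: it avoids looping forever when the failure is not really caused by an expired session.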


Currently there is no proper handler for such cases, and the session has to be reinitiated manually. We're reworking this part and will try to make it automatic. I created an issue on YouTrack: https://youtrack.jetbrains.com/issue/DBE-17697/Reinitiate-session-in-case-of-session-id-expiration

You may follow it for updates.

