Using SparkSQL with CDH Hadoop distribution

Permanently deleted user

Created September 23, 2020 10:22

I found an example of using JetBrains software to run SparkSQL queries through the Spark Thrift Server with a JDBC driver. However, CDH, the most popular and solid Hadoop distribution, doesn't officially support the Spark Thrift Server [1].

It is possible to write SQL queries in Python using the pyspark library, with the spark.sql() function as a wrapper, but that does not look like a good, permanent solution for SQL development.
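For reference, the wrapper approach I mean looks roughly like the sketch below. The helper names (split_statements, run_script, demo) and the sample queries are made up for illustration, and the semicolon splitting is deliberately naive. Only SparkSession and spark.sql() come from pyspark itself.

```python
def split_statements(script):
    """Split a SQL script on semicolons and drop empty fragments.

    Naive on purpose: a semicolon inside a string literal would break it.
    """
    return [part.strip() for part in script.split(";") if part.strip()]


def run_script(spark, script):
    """Pass each statement to spark.sql() and collect the returned DataFrames.

    `spark` is expected to be a pyspark SparkSession (anything with .sql()).
    """
    return [spark.sql(statement) for statement in split_statements(script)]


def demo():
    """Hypothetical entry point; assumes pyspark is installed on the node."""
    from pyspark.sql import SparkSession  # imported lazily, only needed here

    spark = (
        SparkSession.builder
        .appName("sql-wrapper-sketch")  # illustrative application name
        .enableHiveSupport()            # assumes a Hive metastore is configured
        .getOrCreate()
    )
    try:
        for df in run_script(spark, "SHOW DATABASES; SELECT 1 AS probe"):
            df.show()
    finally:
        spark.stop()
```

In a pyspark shell, where a session named spark already exists, calling run_script(spark, "...") directly would be enough. Every query still ends up as a Python string, so there is no completion, schema introspection, or result grid, which is why I am looking for a way to do this from DataGrip.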

How can DataGrip be used with the CDH distribution for SQL queries?

[1] https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_633_unsupported_features.html
