How to set up PyCharm to run Spark

Hi there.

I'm trying to learn Spark and Python with PyCharm. I found some useful tutorials on YouTube and on blogs, but I'm stuck when I try to run simple Spark code such as:

from pyspark.sql import SparkSession
spark = SparkSession.builder \
      .master("local[1]") \
      .appName("SparkByExamples.com") \
      .getOrCreate()

I get errors like this:

/opt/spark/bin/spark-class: line 71: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java: No such file or directory
/opt/spark/bin/spark-class: line 97: CMD: bad array subscript

My JAVA_HOME was set with: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre

It's frustrating because I can run java or javac from any directory, which I assumed meant the Java path was correct. But when I try to run Python code that uses Spark, PyCharm does not seem to find Java.
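Being able to run java from any directory only shows that java is on PATH. The first error line suggests that spark-class is looking for the binary at JAVA_HOME/bin/java instead, so the two can disagree. A minimal sketch for checking this from inside PyCharm follows; the function name check_java_home is made up for illustration, and it only inspects the environment, it does not change anything:

```python
import os
import shutil


def check_java_home(env=None):
    """Report whether JAVA_HOME points at a directory containing bin/java."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    # This mirrors the path shown in the error message: <JAVA_HOME>/bin/java
    candidate = os.path.join(java_home, "bin", "java")
    if os.path.isfile(candidate):
        return f"OK: {candidate} exists"
    return f"MISSING: {candidate} does not exist"


if __name__ == "__main__":
    # What the shell would run when you type "java"
    print("java on PATH:", shutil.which("java"))
    # What the running Python process sees for JAVA_HOME
    print(check_java_home())
```

Running this as a PyCharm run configuration and again from a terminal may give different results, since the IDE does not necessarily inherit the variables exported in a shell startup file.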

For information, I'm using Manjaro Linux and have already followed this blog post: https://www.datacamp.com/tutorial/installation-of-pyspark

Can anyone here help me?

Thank you.

2 comments

Hi guys... I found the cause of those errors: the environment path was not set correctly. After I fixed it and rebooted my computer, the new path took effect and the code runs as I want.
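The comment above does not say what the path was changed to, so the following is only a guess at one possible approach, not the fix that was actually used. It assumes that the java found on PATH (possibly through symlinks) lives in a bin directory directly under the JDK or JRE root, and derives JAVA_HOME from that. The helper name java_home_from_executable is hypothetical:

```python
import os
import shutil


def java_home_from_executable(java_path):
    """Return the directory two levels above the resolved java binary."""
    # Follow symlinks such as /usr/bin/java to the real file
    real = os.path.realpath(java_path)
    # Strip the trailing "bin/java" to get the installation root
    return os.path.dirname(os.path.dirname(real))


if __name__ == "__main__":
    java = shutil.which("java")
    if java is None:
        print("No java executable found on PATH")
    else:
        # Assumption: the resolved binary sits at <root>/bin/java
        os.environ["JAVA_HOME"] = java_home_from_executable(java)
        print("JAVA_HOME =", os.environ["JAVA_HOME"])
```

If this is used inside a script, the variable has to be set before SparkSession.builder ... getOrCreate() is called, because that call is what launches the JVM. Setting JAVA_HOME in the PyCharm run configuration or in the shell profile is the more permanent alternative.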

So this topic can be closed.


I'm having the same problem you encountered:

+ /home/***/.local/lib/python3.11/site-packages/pyspark/bin/spark-class: line 71: /usr/lib/jvm/java-11-openjdk-arm64/bin/java: No such file or directory

+/home/***/.local/lib/python3.11/site-packages/pyspark/bin/spark-class: line 97: CMD: bad array subscript

Can I ask how you fixed it (what did you change your path to)?
