How to set up IntelliJ 14 Scala Worksheet to run Spark
I'm trying to create a SparkContext in an IntelliJ 14 Scala Worksheet.
Here are my dependencies:
name := "LearnSpark"
version := "1.0"
scalaVersion := "2.11.7"
// for working with Spark API
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0"
Here is the code I run in the worksheet:
import org.apache.spark.{SparkContext, SparkConf}
val conf = new SparkConf().setMaster("local").setAppName("spark-play")
val sc = new SparkContext(conf)
The error is:
15/08/24 14:01:59 ERROR SparkContext: Error initializing SparkContext.
java.lang.ClassNotFoundException: rg.apache.spark.rpc.akka.AkkaRpcEnvFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
When I run Spark as a standalone app it works fine. For example:
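(A minimal standalone app of the kind I mean is sketched below; the object name and the toy job are placeholders rather than my actual code.)

import org.apache.spark.{SparkConf, SparkContext}

object SparkPlay {
  def main(args: Array[String]): Unit = {
    // Same configuration as in the worksheet: local master, simple app name
    val conf = new SparkConf().setMaster("local").setAppName("spark-play")
    val sc = new SparkContext(conf)
    // A trivial job, just to confirm the context actually works
    val sum = sc.parallelize(1 to 10).reduce(_ + _)
    println(s"sum = $sum")
    sc.stop()
  }
}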
Can someone provide some guidance on where I should look to resolve this exception?
Thanks.
I also have this question and am running into a similar error.
I managed to somehow get worksheets working on my desktop with SparkContext, but I have no idea how. This was great for some modeling, but the SparkContext didn't handle saveAsTextFile correctly on my Windows system.
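(To be concrete, the call I mean was along these lines, with the output path just an illustrative placeholder; sc here is the worksheet's SparkContext.)

// Write a small RDD out as text files; this is the step that misbehaved on Windows
sc.parallelize(Seq("a", "b", "c")).saveAsTextFile("C:/tmp/spark-output")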
I'm now trying to get it working with IntelliJ installed on a development Hadoop cluster, and I'm running into the same issue with Akka on a fresh Spark project. Can anyone provide a list of steps that will get past this error, both in the Scala worksheet and in the Scala console?
I've spent quite a bit of time over the last month searching for answers to this topic, and it seems to remain unanswered. This appears to be an IntelliJ-specific issue.
Thanks for your help.
Below is a code snippet:
My build.sbt is:
The error I receive is:
It might also be helpful to mention the Scala version and the libraries I included in my project.
I'm using
and
with Scala version 2.10 in order to be compatible with the Spark artifacts built for Scala 2.10.
I got past my issue.
Several other posts mentioned adding the dependencies in build.sbt. I had done this on my workstation but not on the CentOS development server.
More for others to read, than for my benefit, here is what seems to have made the difference:
I had been trying to add Maven libraries to my project using the New Project Library option. Even though I was using Scala 2.11, Maven would only locate org.apache.spark libraries built for Scala 2.10. I clearly didn't (and still don't at this time) understand SBT all that well.
I added the following lines to my build.sbt file.
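(I won't reproduce my exact file, but the lines were of this general form; the artifacts and versions below are illustrative rather than a copy of what I used. As I understand it, the %% operator makes sbt append the Scala binary version, e.g. _2.10 or _2.11, to the artifact name, which is how the right Spark build gets selected.)

// Spark dependencies pulled in by sbt; versions shown are examples only
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.4.0"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.4.0"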
This addition left all of the dependencies underlined in red, with errors indicating "Unresolved dependency". A banner also appeared indicating that build.sbt had changed and offering the options Refresh project, Enable auto-import, and Ignore. I still haven't figured out how to make this banner appear again (without modifying my build.sbt file) or how to perform these actions manually.
When I clicked the Refresh project link in the banner, the little processing bar at the bottom of the screen churned for a while, and all kinds of objects were added to the libraries of my project. This is clearly what SBT is meant to do. I also received some warnings about multiple dependencies with different versions; I'm still not sure how to address those.
With this completed, my Scala Worksheet now gives me a different error:
I suspect this error happened because, in a different package within my IntelliJ Scala project, I had successfully created a SparkContext but hadn't stopped it. I cleared the error by closing and re-opening IntelliJ.
My next error when I tried executing my worksheet was:
I cleared this by adding another import to my Scala worksheet.
This import returned me to the state where another SparkContext was active, so I exited IntelliJ and got back in. I was able to create a SparkContext the first time I executed the Scala worksheet, but then got the SparkContext active error again, so I added a line to the end of the worksheet.
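(Concretely, a stop call on the context at the end of the worksheet, something like the following.)

// Stop the context when the worksheet finishes, so the next run can create a fresh one
sc.stop()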
This didn't work, so for now, I will exit and re-start IntelliJ when I want to experiment with code. There were some examples online that show how to create a client that will execute the Spark logic. I'll work on that next.
My current Scala worksheet looks like this:
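(Roughly the following; this is the shape of it rather than a verbatim copy.)

import org.apache.spark.{SparkConf, SparkContext}

// Local master, so the worksheet doesn't need a cluster
val conf = new SparkConf().setMaster("local").setAppName("worksheet")
val sc = new SparkContext(conf)

// A small job just to exercise the context
val rdd = sc.parallelize(1 to 100)
rdd.count()

// Attempt to release the context at the end (as mentioned above, this didn't always help)
sc.stop()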
My SBT looks like this: