Type: Task
Resolution: Done
Priority: Major - P3
Affects Version/s: 3.0.1
Component/s: Connection Management, Error Handling
Environment: Ubuntu Linux 14.04, JDK 7
I am trying to integrate MongoDB with Apache Spark to process data. When I execute my program with the following command:

../spark-1.3.0-bin-hadoop2.4/bin/spark-submit --master spark://luis-VirtualBox:7077 --jars $(echo /home/luis/mongo-spark/lib/*.jar | tr ' ' ',') --class JavaWordCount target/scala-2.10/mongo-spark_2.10-1.0.jar mydb.testCollection mydb.outputTest7

I get the following exception:
15/03/23 17:05:34 WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID 4, 10.0.2.15): java.lang.IllegalStateException: open
at org.bson.util.Assertions.isTrue(Assertions.java:36)
at com.mongodb.DBTCPConnector.getPrimaryPort(DBTCPConnector.java:406)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:184)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:167)
at com.mongodb.DBCollection.insert(DBCollection.java:161)
at com.mongodb.DBCollection.insert(DBCollection.java:107)
at com.mongodb.DBCollection.save(DBCollection.java:1049)
at com.mongodb.DBCollection.save(DBCollection.java:1014)
at com.mongodb.hadoop.output.MongoRecordWriter.write(MongoRecordWriter.java:105)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:1000)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:979)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I have read in several places that this is caused by a closed connection, but I do not close any connection anywhere in my code.
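For reference, here is a minimal sketch of how a job like mine is wired up (the URIs and the elided transformations are illustrative, not my exact code). The point is that reading and writing go through mongo-hadoop's `MongoInputFormat` and `MongoOutputFormat`, so the insert in the stack trace happens inside `MongoRecordWriter` on the executors, not through a `MongoClient` that my own code opens or closes:

```java
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.bson.BSONObject;

public final class JavaWordCount {
    public static void main(String[] args) {
        // args[0] = input namespace (e.g. mydb.testCollection)
        // args[1] = output namespace (e.g. mydb.outputTest7)
        Configuration conf = new Configuration();
        conf.set("mongo.input.uri", "mongodb://localhost:27017/" + args[0]);
        conf.set("mongo.output.uri", "mongodb://localhost:27017/" + args[1]);

        JavaSparkContext sc = new JavaSparkContext();

        // Read documents from MongoDB as (ObjectId, BSONObject) pairs.
        JavaPairRDD<Object, BSONObject> docs = sc.newAPIHadoopRDD(
                conf, MongoInputFormat.class, Object.class, BSONObject.class);

        // ... word-count transformations elided ...

        // This save is where MongoRecordWriter.write() runs on each
        // executor; the IllegalStateException in the trace is raised here.
        docs.saveAsNewAPIHadoopFile("file:///unused", Object.class,
                BSONObject.class, MongoOutputFormat.class, conf);

        sc.stop();
    }
}
```

The connection used by `MongoRecordWriter` is created and closed by the connector itself, which is why I don't understand where the "open" assertion failure comes from.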
Thank you in advance.