2014-11-10 18 views
7

Od miesiąca mamy mały klaster iskier, który pomyślnie wykonał zadania lub pozwolił mi uruchomić powłokę iskrową w klastrze.Spark nie może już wykonywać zadań. Executory nie tworzą katalogu

Nie ma znaczenia, że ​​przesyłam zadanie do klastra lub łączę się z nim za pomocą powłoki, błąd jest zawsze taki sam.

[email protected]~]$ $SPARK_HOME/bin/spark-shell 
Spark assembly has been built with Hive, including Datanucleus jars on classpath 
14/11/10 20:43:01 INFO spark.SecurityManager: Changing view acls to: root, 
14/11/10 20:43:01 INFO spark.SecurityManager: Changing modify acls to: root, 
14/11/10 20:43:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root,); users with modify permissions: Set(root,) 
14/11/10 20:43:01 INFO spark.HttpServer: Starting HTTP Server 
14/11/10 20:43:01 INFO server.Server: jetty-8.y.z-SNAPSHOT 
14/11/10 20:43:01 INFO server.AbstractConnector: Started [email protected]:60223 
14/11/10 20:43:01 INFO util.Utils: Successfully started service 'HTTP class server' on port 60223. 
Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_65) 
Type in expressions to have them evaluated. 
Type :help for more information. 
14/11/10 20:43:05 INFO spark.SecurityManager: Changing view acls to: root, 
14/11/10 20:43:05 INFO spark.SecurityManager: Changing modify acls to: root, 
14/11/10 20:43:05 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root,); users with modify permissions: Set(root,) 
14/11/10 20:43:05 INFO slf4j.Slf4jLogger: Slf4jLogger started 
14/11/10 20:43:05 INFO Remoting: Starting remoting 
14/11/10 20:43:05 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:41369] 
14/11/10 20:43:05 INFO Remoting: Remoting now listens on addresses: [akka.tcp://[email protected]:41369] 
14/11/10 20:43:05 INFO util.Utils: Successfully started service 'sparkDriver' on port 41369. 
14/11/10 20:43:05 INFO spark.SparkEnv: Registering MapOutputTracker 
14/11/10 20:43:05 INFO spark.SparkEnv: Registering BlockManagerMaster 
14/11/10 20:43:05 INFO storage.DiskBlockManager: Created local directory at /mnt/spark/spark-local-20141110204305-a4f0 
14/11/10 20:43:05 INFO storage.DiskBlockManager: Created local directory at /mnt2/spark/spark-local-20141110204305-991c 
14/11/10 20:43:05 INFO util.Utils: Successfully started service 'Connection manager for block manager' on port 56708. 
14/11/10 20:43:05 INFO network.ConnectionManager: Bound socket to port 56708 with id = ConnectionManagerId(ip-10-237-182-163.ec2.internal,56708) 
14/11/10 20:43:05 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB 
14/11/10 20:43:05 INFO storage.BlockManagerMaster: Trying to register BlockManager 
14/11/10 20:43:05 INFO storage.BlockManagerMasterActor: Registering block manager ip-10-237-182-163.ec2.internal:56708 with 265.4 MB RAM 
14/11/10 20:43:05 INFO storage.BlockManagerMaster: Registered BlockManager 
14/11/10 20:43:05 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-fa8cd9e8-5a4a-40a4-bc76-c2215886873e 
14/11/10 20:43:05 INFO spark.HttpServer: Starting HTTP Server 
14/11/10 20:43:05 INFO server.Server: jetty-8.y.z-SNAPSHOT 
14/11/10 20:43:05 INFO server.AbstractConnector: Started [email protected]:36394 
14/11/10 20:43:05 INFO util.Utils: Successfully started service 'HTTP file server' on port 36394. 
14/11/10 20:43:06 INFO server.Server: jetty-8.y.z-SNAPSHOT 
14/11/10 20:43:06 INFO server.AbstractConnector: Started [email protected]:4040 
14/11/10 20:43:06 INFO util.Utils: Successfully started service 'SparkUI' on port 4040. 
14/11/10 20:43:06 INFO ui.SparkUI: Started SparkUI at http://ec2-54-91-220-90.compute-1.amazonaws.com:4040 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Connecting to master spark://ec2-54-91-220-90.compute-1.amazonaws.com:7077... 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0 
14/11/10 20:43:06 INFO repl.SparkILoop: Created spark context.. 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141110204306-0389 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/0 on worker-20140929210658-ip-10-225-160-49.ec2.internal-60693 (ip-10-225-160-49.ec2.internal:60693) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/0 on hostPort ip-10-225-160-49.ec2.internal:60693 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/1 on worker-20140929210658-ip-10-147-28-32.ec2.internal-60731 (ip-10-147-28-32.ec2.internal:60731) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/1 on hostPort ip-10-147-28-32.ec2.internal:60731 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/2 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/2 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/2 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/1 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/2 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/2) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/2 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/2 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/3 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/3 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/0 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/3 is now RUNNING 
Spark context available as sc. 
scala> 14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/3 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/3) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/3 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/3 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/4 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/4 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/4 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/4 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/4) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/4 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/4 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/5 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/5 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/5 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/5 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/5) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/5 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/5 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/6 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/6 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/6 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/6 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/6) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/6 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/6 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/7 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/7 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/7 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/7 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/7) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/7 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/7 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/8 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/8 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/8 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/8 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/8) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/8 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/8 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/9 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/9 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/9 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/9 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/9) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/9 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/9 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/10 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/10 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/10 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/10 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/10) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/10 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/10 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor added: app-20141110204306-0389/11 on worker-20140929210657-ip-10-69-165-231.ec2.internal-47794 (ip-10-69-165-231.ec2.internal:47794) with 4 cores 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141110204306-0389/11 on hostPort ip-10-69-165-231.ec2.internal:47794 with 4 cores, 12.4 GB RAM 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/11 is now RUNNING 
14/11/10 20:43:06 INFO client.AppClient$ClientActor: Executor updated: app-20141110204306-0389/11 is now FAILED (java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/11) 
14/11/10 20:43:06 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141110204306-0389/11 removed: java.io.IOException: Failed to create directory /root/spark/work/app-20141110204306-0389/11 
14/11/10 20:43:06 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our application: FAILED 
14/11/10 20:43:06 ERROR scheduler.TaskSchedulerImpl: Exiting due to error from cluster scheduler: Master removed our application: FAILED` 

Jeśli jednak uruchomię go tylko w trybie lokalnym, łączy się i działa poprawnie.

Pochylam się do tego, że jest to pewnego rodzaju błąd przyzwolenia, ale przez cały czas nie działaliśmy.

EDIT

Po trochę więcej kopanie odkryłem, że węzły pracownik były miejsca na dysku. W folderze roboczym okazuje się, że jest przechowywany słoik, który jest kopiowany, a także plik wyjściowy i stdr dla tego zadania. Czy jest tak, aby usunąć je po zakończeniu, ponieważ mamy ustawienia logowania dla zadań, aby przejść do S3.

Odpowiedz

Powiązane problemy