Mam uaktualniony do Apache Spark 1.5.1, ale nie jestem pewien, czy to spowodowało. Mam klucze dostępu w iskrowniku, który zawsze działał.Dlaczego nie mogę już czytać z aplikacji AWS S3 w aplikacji Spark?
Exception in thread "main" java.lang.NoSuchMethodError: org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lorg/jets3t/service/security/AWSCredentials;)V
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
.format("com.databricks.spark.csv")
.option("inferSchema", "true")
.load("s3n://ossem-replication/gdelt_data/event_data/" + args[0]);
df.write()
.format("com.databricks.spark.csv")
.save("/user/spark/ossem_data/gdelt/" + args[0]);
Więcej błędu poniżej. Istnieje klasa, która nie zawiera metody, co oznacza, że zależności są niedopasowane. Wygląda na to, że jets3t nie zawiera metody RestS3Service. (Lorg/jets3t/service/security/AWSCredentials;) V Czy ktoś może mi to wyjaśnić?
Exception in thread "main" java.lang.NoSuchMethodError: org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lorg/jets3t/service/security/AWSCredentials;)V
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at org.apache.hadoop.fs.s3native.$Proxy24.initialize(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:272)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1277)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.take(RDD.scala:1272)
at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1312)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.first(RDD.scala:1311)
at com.databricks.spark.csv.CsvRelation.firstLine$lzycompute(CsvRelation.scala:101)
at com.databricks.spark.csv.CsvRelation.firstLine(CsvRelation.scala:99)
at com.databricks.spark.csv.CsvRelation.inferSchema(CsvRelation.scala:82)
at com.databricks.spark.csv.CsvRelation.<init>(CsvRelation.scala:42)
at com.databricks.spark.csv.DefaultSource.createRelation(DefaultSource.scala:74)
at com.databricks.spark.csv.DefaultSource.createRelation(DefaultSource.scala:39)
at com.databricks.spark.csv.DefaultSource.createRelation(DefaultSource.scala:27)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:125)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
at com.bah.ossem.spark.GdeltSpark.main(GdeltSpark.java:20)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
To brzmi jak po modernizacji, coś się zmieniło w API, które zgłosiły swój kod niepoprawny. Czy czytałeś dzienniki zmian między wersją, która działała, a wersją, którą masz teraz? –
Spróbuj przeczytać [to] (http://apache-spark-user-list.1001560.n3.nabble.com/spark-read-data-from-aws-s3-td25260.html). –
Nadal nie działa. –