python - Spark java.lang.VerifyError


I get the following error when I try to use the Python client for Spark:

lines = sc.textFile("hdfs://...")
lines.take(10)

I suspect the Spark and Hadoop versions might not be compatible. Here is the output of hadoop version:

Hadoop 2.5.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r cc72e9b000545b86b75a61f4835eb86d57bfafc0
Compiled by jenkins on 2014-11-14T23:45Z
Compiled with protoc 2.5.0
From source with checksum df7537a4faa4658983d397abf4514320
This command was run using /etc/hadoop-2.5.2/share/hadoop/common/hadoop-common-2.5.2.jar

I have Spark 1.3.1.
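As a quick sanity check, the versions in play can be confirmed from the PySpark shell itself. A minimal sketch, assuming an active SparkContext named sc (note that sc._jvm is an internal PySpark handle into the JVM via Py4J):

    # Spark version of the running context
    print(sc.version)  # should print 1.3.1 here
    # Hadoop client version that this Spark build actually loaded, via Py4J
    print(sc._jvm.org.apache.hadoop.util.VersionInfo.getVersion())

If the second line prints something other than 2.5.x, the Spark build is carrying a different Hadoop client (and likely a different protobuf) than the cluster's Hadoop 2.5.2.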

file "/etc/spark/python/pyspark/rdd.py", line 1194, in take     totalparts = self._jrdd.partitions().size() file "/etc/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__ file "/etc/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value     py4j.protocol.py4jjavaerror: error occurred while calling o21.partitions.     : java.lang.verifyerror: class org.apache.hadoop.hdfs.protocol.proto.clientnamenodeprotocolprotos$appendrequestproto overrides final method getunknownfields.    ()lcom/google/protobuf/unknownfieldset;     @ java.lang.classloader.defineclass1(native method)     @ java.lang.classloader.defineclasscond(classloader.java:631)     @ java.lang.classloader.defineclass(classloader.java:615)     @ java.security.secureclassloader.defineclass(secureclassloader.java:141)     @ java.net.urlclassloader.defineclass(urlclassloader.java:283)     @ java.net.urlclassloader.access$000(urlclassloader.java:58)     @ java.net.urlclassloader$1.run(urlclassloader.java:197)     @ java.security.accesscontroller.doprivileged(native method)     @ java.net.urlclassloader.findclass(urlclassloader.java:190)     @ java.lang.classloader.loadclass(classloader.java:306)     @ sun.misc.launcher$appclassloader.loadclass(launcher.java:301)     @ java.lang.classloader.loadclass(classloader.java:247)     @ java.lang.class.getdeclaredmethods0(native method)     @ java.lang.class.privategetdeclaredmethods(class.java:2436)     @ java.lang.class.privategetpublicmethods(class.java:2556)     @ java.lang.class.privategetpublicmethods(class.java:2566)     @ java.lang.class.getmethods(class.java:1412)     @ sun.misc.proxygenerator.generateclassfile(proxygenerator.java:409)     @ sun.misc.proxygenerator.generateproxyclass(proxygenerator.java:306)     @ java.lang.reflect.proxy.getproxyclass0(proxy.java:610)     @ java.lang.reflect.proxy.newproxyinstance(proxy.java:690)     @ org.apache.hadoop.ipc.protobufrpcengine.getproxy(protobufrpcengine.java:92)     @ org.apache.hadoop.ipc.rpc.getprotocolproxy(rpc.java:537)     @ org.apache.hadoop.hdfs.namenodeproxies.creatennproxywithclientprotocol(namenodeproxies.java:366)     @ org.apache.hadoop.hdfs.namenodeproxies.createnonhaproxy(namenodeproxies.java:262)     @ org.apache.hadoop.hdfs.namenodeproxies.createproxy(namenodeproxies.java:153)     @ org.apache.hadoop.hdfs.dfsclient.<init>(dfsclient.java:602)     @ org.apache.hadoop.hdfs.dfsclient.<init>(dfsclient.java:547)     @ org.apache.hadoop.hdfs.distributedfilesystem.initialize(distributedfilesystem.java:139)     @ org.apache.hadoop.fs.filesystem.createfilesystem(filesystem.java:2591)     @ org.apache.hadoop.fs.filesystem.access$200(filesystem.java:89)     @ org.apache.hadoop.fs.filesystem$cache.getinternal(filesystem.java:2625)     @ org.apache.hadoop.fs.filesystem$cache.get(filesystem.java:2607)     @ org.apache.hadoop.fs.filesystem.get(filesystem.java:368)     @ org.apache.hadoop.fs.path.getfilesystem(path.java:296)     @ org.apache.hadoop.mapred.fileinputformat.singlethreadedliststatus(fileinputformat.java:256)     @ org.apache.hadoop.mapred.fileinputformat.liststatus(fileinputformat.java:228)     @ org.apache.hadoop.mapred.fileinputformat.getsplits(fileinputformat.java:313)     @ org.apache.spark.rdd.hadooprdd.getpartitions(hadooprdd.scala:203)     @ org.apache.spark.rdd.rdd$$anonfun$partitions$2.apply(rdd.scala:219)     @ org.apache.spark.rdd.rdd$$anonfun$partitions$2.apply(rdd.scala:217)     @ scala.option.getorelse(option.scala:120)     @ 
org.apache.spark.rdd.rdd.partitions(rdd.scala:217)     @ org.apache.spark.rdd.mappartitionsrdd.getpartitions(mappartitionsrdd.scala:32)     @ org.apache.spark.rdd.rdd$$anonfun$partitions$2.apply(rdd.scala:219)     @ org.apache.spark.rdd.rdd$$anonfun$partitions$2.apply(rdd.scala:217)     @ scala.option.getorelse(option.scala:120)     @ org.apache.spark.rdd.rdd.partitions(rdd.scala:217)     @ org.apache.spark.api.java.javarddlike$class.partitions(javarddlike.scala:64)     @ org.apache.spark.api.java.abstractjavarddlike.partitions(javarddlike.scala:46)     @ sun.reflect.nativemethodaccessorimpl.invoke0(native method)     @ sun.reflect.nativemethodaccessorimpl.invoke(nativemethodaccessorimpl.java:39)     @ sun.reflect.delegatingmethodaccessorimpl.invoke(delegatingmethodaccessorimpl.java:25)     @ java.lang.reflect.method.invoke(method.java:597)     @ py4j.reflection.methodinvoker.invoke(methodinvoker.java:231)     @ py4j.reflection.reflectionengine.invoke(reflectionengine.java:379)     @ py4j.gateway.invoke(gateway.java:259) 

I have been searching for this problem, and people refer to the version of protobuf, but I am not familiar with how to set it correctly. Any ideas?

Check the pom.xml of the Spark you compiled and search for the protobuf version; aligning it with the one your Hadoop uses might solve the problem.
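For orientation, this is roughly what to look for; the fragment below is illustrative, not copied verbatim from Spark's build files, so verify the names and values against the tree you actually built. Spark's pom.xml drives the dependency through a protobuf.version Maven property, and the Hadoop 2.x build profiles raise it to the 2.5.x line. The VerifyError above is the classic symptom of mixing the two lines: classes generated by protoc 2.5 (such as Hadoop 2.5.2's ClientNamenodeProtocolProtos) override getUnknownFields, which protobuf 2.5.0 made final, so they fail bytecode verification when an older protobuf 2.4.x jar sits on the classpath.

    <!-- illustrative pom.xml fragment; verify against your build -->
    <properties>
      <protobuf.version>2.4.1</protobuf.version>    <!-- old default, suits Hadoop 1.x -->
    </properties>
    <profile>
      <id>hadoop-2.4</id>
      <properties>
        <protobuf.version>2.5.0</protobuf.version>  <!-- what Hadoop 2.x expects -->
      </properties>
    </profile>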

Or the problem might be something else entirely, as mentioned in this JIRA thread:

https://issues.apache.org/jira/browse/spark-7238
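If the protobuf versions do turn out to be mismatched, one way out is to rebuild Spark against the cluster's Hadoop version so that the matching protobuf is pulled in. A sketch following the pattern in Spark 1.3's building documentation, where the hadoop-2.4 profile covers Hadoop 2.4.x and later:

    mvn -Phadoop-2.4 -Dhadoop.version=2.5.2 -DskipTests clean package

Alternatively, downloading a Spark 1.3.1 binary prebuilt for "Hadoop 2.4 and later" avoids compiling at all.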

