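Using protobuf 3.x in Spark jobs
Author: IntoHole | Reposting is permitted, provided the original source, author information, and copyright notice are credited with a hyperlink
URL: http://www.buyiker.com/2019/03/13/spark-use-protobuf3.html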
Background
- Spark depends on protobuf 2.x. If a job uses protobuf 3.x, it fails at runtime with an error like the following:
NoSuchMethodError / NoClassDefFoundError: com.google.protobuf.*
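- Cause
- Spark loads its own bundled jars first, so however you add protobuf 3.x to the job, the classes resolved at runtime are the 2.x ones shipped with Spark.
- Solution (Maven project): use the maven-shade-plugin to relocate the package path of the protobuf 3.x dependency. Once packaged, the project refers to the relocated 3.x classes and no longer resolves directly to the copy bundled with Spark.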
<build>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <filtering>false</filtering>
    </resource>
  </resources>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>1.4</version>
      <executions>
        <execution>
          <!-- Run the shade goal when the project is packaged -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <!-- Bundle only the artifacts whose IDs match these patterns -->
            <artifactSet>
              <includes>
                <include>*:silkroad*</include>
                <include>*:protobuf*</include>
              </includes>
            </artifactSet>
            <!-- Replace the main artifact with the shaded jar instead of attaching a second one -->
            <shadedArtifactAttached>false</shadedArtifactAttached>
            <!-- Move com.google.protobuf.* to shaded.com.google.protobuf.* so it cannot clash with Spark's 2.x copy -->
            <relocations>
              <relocation>
                <pattern>com.google.protobuf</pattern>
                <shadedPattern>shaded.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
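
To check whether the relocation above took effect, one option is to ask the JVM where each protobuf class was loaded from. The following is a minimal sketch, not part of the original post: the class name `ProtobufOrigin` and its helper methods are hypothetical, and the two class names it looks up assume the `com.google.protobuf` to `shaded.com.google.protobuf` relocation configured above.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not from the original post).
public class ProtobufOrigin {

    /** Returns the jar or directory a class was loaded from, or a marker if the JVM does not expose it. */
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String) have no code source.
        return source == null ? "(bootstrap/unknown)" : String.valueOf(source.getLocation());
    }

    /** Looks a class up by name without initializing it and reports its origin, or that it is missing. */
    public static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className, false, ProtobufOrigin.class.getClassLoader());
            return className + " <- " + locationOf(cls);
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not on classpath";
        }
    }

    public static void main(String[] args) {
        // Original package: on a Spark cluster this is expected to point at Spark's bundled protobuf jar.
        System.out.println(describe("com.google.protobuf.Message"));
        // Relocated package (assumes the shadedPattern from the POM above): should point at the job's shaded jar.
        System.out.println(describe("shaded.com.google.protobuf.Message"));
    }
}
```

If the second line reports the job's own jar, the shaded protobuf 3.x classes are the ones in use. One caveat: the shade plugin may also rewrite string constants that match a relocation pattern, so this check is best compiled outside the shaded artifact, otherwise both lookups could end up pointing at the relocated name.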