Big Data Development Troubleshooting Notes

iseeyu (2024-02-21)

II. Hive

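1. A Hive job fails, and the log shows that virtual memory is insufficient
Cause and fix: the cluster nodes do not have enough virtual memory. The simplest fix is to turn off the NodeManager's virtual memory check.
Edit yarn-site.xml and add the following property: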
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
Distribute the file to the other nodes in the cluster, then restart the cluster.
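
Before restarting, it can help to confirm that the edited file on each node really carries the new value. The sketch below is one way to do that with standard tools. The function name vmem_check_value is an assumption made for illustration, and it expects <name> and <value> to sit on consecutive lines, as in the property shown above.

```shell
# Print the <value> that follows yarn.nodemanager.vmem-check-enabled in a
# yarn-site.xml file. Assumes <name> and <value> are on consecutive lines.
# vmem_check_value is a hypothetical helper name, not part of Hadoop.
vmem_check_value() {
  grep -A1 '<name>yarn.nodemanager.vmem-check-enabled</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Example call, using the Hadoop install path that appears later in these notes:
#   vmem_check_value /kkb/install/hadoop-3.1.4/etc/hadoop/yarn-site.xml
```

If the function prints nothing, the property is missing from that file (or is laid out differently from what the sketch assumes).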
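2. When data is loaded into a bucketed table with load data local ... overwrite, the overwrite does not take effect and the rows are appended to the target table. To replace the table contents, use insert overwrite table target_table select * from source_table instead.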

III. HBase

1. Running an HBase jar on the cluster fails with the error shown below:

image.png

Cause: the HADOOP_CLASSPATH environment variable is not set in hadoop-env.sh, so the hadoop jar command cannot find the jars the program depends on.
Fix: edit hadoop-env.sh and add the HADOOP_CLASSPATH environment variable.

[hadoop@node02 hadoop]$ cd /kkb/install/hadoop-3.1.4/etc/hadoop
[hadoop@node02 hadoop]$ vim hadoop-env.sh 
export HADOOP_CLASSPATH=/kkb/install/hbase-2.2.2/lib/*
# The trailing * is required, otherwise the error persists. Do not use $HBASE_HOME in place of the absolute path.
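
A quick sanity check for the directory used in HADOOP_CLASSPATH: a Java classpath wildcard only picks up jar files directly inside the directory, not those in subdirectories. The sketch below counts what such an entry would match. The function name count_jars is an assumption for illustration.

```shell
# Count the .jar files directly inside a directory, i.e. roughly what a
# classpath entry of the form DIR/* would pick up (subdirectories excluded).
# count_jars is a hypothetical helper name.
count_jars() {
  find "$1" -maxdepth 1 -name '*.jar' | wc -l | tr -d ' '
}

# Example call with the path from the export line above:
#   count_jars /kkb/install/hbase-2.2.2/lib
```

A result of 0 would suggest the path is wrong or the directory is empty on that node.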

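2. After HBase starts normally on the cluster, the HRegionServer processes shut down on their own after a few minutes
The log shows the following error: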
java.lang.NoClassDefFoundError: org/apache/htrace/SpanReceiver

image.png

Fix: copy htrace-core4-4.2.0-incubating.jar into the HBase lib directory.

[hadoop@node03 client-facing-thirdparty]$ pwd
/kkb/install/hbase-2.2.2/lib/client-facing-thirdparty
[hadoop@node03 client-facing-thirdparty]$ cp /kkb/install/hbase-2.2.2/lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar /kkb/install/hbase-2.2.2/lib/

IV. Flume

1. The Flume agent fails with the error shown below:

image.png

Cause: apache-flume-1.9.0-bin and hadoop-3.1.4 both ship a guava jar, but in different versions, and the two conflict.
Fix: replace the older guava jar in Flume with the newer one from Hadoop.

cd /kkb/install/apache-flume-1.9.0-bin/lib
rm -f guava-11.0.2.jar
cp /kkb/install/hadoop-3.1.4/share/hadoop/common/lib/guava-27.0-jre.jar .
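
To see which guava versions are present before and after the swap, the jar file names can be inspected. The sketch below is a minimal helper under the assumption that the jars follow the guava-<version>.jar naming seen above. The function name guava_versions is hypothetical.

```shell
# List the guava versions found in a lib directory, one per line,
# derived from file names such as guava-11.0.2.jar.
# guava_versions is a hypothetical helper name.
guava_versions() {
  for jar in "$1"/guava-*.jar; do
    [ -e "$jar" ] || continue          # the glob matched nothing
    basename "$jar" .jar | sed 's/^guava-//'
  done
}

# Example calls with the two directories used above:
#   guava_versions /kkb/install/apache-flume-1.9.0-bin/lib
#   guava_versions /kkb/install/hadoop-3.1.4/share/hadoop/common/lib
```

After the fix, both calls should report the same version.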

V. MySQL

1. Exporting a MySQL table to a CSV file
Syntax: select * from tablename into outfile "directory/tablename.csv" fields terminated by ',' lines terminated by '\n';
Error: ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement
Fix: check the value of the MySQL variable secure_file_priv and write the export file under the directory it names.

mysql> select * from students  into outfile "/tmp/students.csv" fields terminated by ',' lines terminated by '\n'; 
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement
mysql> show variables like '*file*';
Empty set (0.00 sec)

mysql> show variables like '*secure_file*';
Empty set (0.00 sec)

mysql> show variables like '%secure_file%'; 
+------------------+-----------------------+
| Variable_name    | Value                 |
+------------------+-----------------------+
| secure_file_priv | /var/lib/mysql-files/ |
+------------------+-----------------------+
1 row in set (0.00 sec)

mysql> select * from students  into outfile "/var/lib/mysql-files/students.csv" fields terminated by ',' lines terminated by '\n';           
Query OK, 6 rows affected (0.01 sec)
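
When the export path is built in a script, it can be checked against the secure_file_priv directory before the query is sent. The sketch below is a plain string comparison under the assumption that both arguments are absolute paths. The function name is_under_dir is hypothetical, and the check does not resolve symlinks or ".." components.

```shell
# Succeed if the export path lies inside the given directory.
# Pure string comparison: symlinks and ".." are not resolved.
# is_under_dir is a hypothetical helper name.
is_under_dir() {
  case "$1" in
    "${2%/}"/*) return 0 ;;
    *)          return 1 ;;
  esac
}

# Example calls with the two paths from the session above:
#   is_under_dir /tmp/students.csv /var/lib/mysql-files/                  # fails
#   is_under_dir /var/lib/mysql-files/students.csv /var/lib/mysql-files/  # succeeds
```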

VI. Spark

1. The same jar, submitted in cluster deploy mode, runs successfully on YARN but fails on Spark standalone

bin/spark-submit --master spark://node01:7077 \
--deploy-mode cluster \
--class com.kkb.spark.core.SparkCountCluster  \
--executor-memory 1G \
--total-executor-cores 2 \
hdfs://node01:8020/original-spark-core-1.0-SNAPSHOT.jar \
hdfs://node01:8020/word.txt hdfs://node01:8020/output

Error:

Launch Command: "/kkb/install/jdk1.8.0_141/bin/java" "-cp" "/kkb/install/spark-2.3.3-bin-hadoop2.7/conf/:/kkb/install/spark-2.3.3-bin-hadoop2.7/jars/*:/kkb/install/hadoop-3.1.4/etc/hadoop/" "-Xmx1024M" "-Dspark.eventLog.enabled=true" "-Dspark.submit.deployMode=cluster" "-Dspark.yarn.historyServer.address=node01:4000" "-Dspark.app.name=com.kkb.spark.core.SparkCountCluster" "-Dspark.driver.supervise=false" "-Dspark.executor.memory=1g" "-Dspark.eventLog.dir=hdfs://node01:8020/spark_log" "-Dspark.master=spark://node01:7077" "-Dspark.driver.extraClassPath=/kkb/install/hadoop-3.1.4/share/hadoop/common/hadoop-lzo-0.4.20.jar" "-Dspark.eventLog.compress=true" "-Dspark.cores.max=2" "-Dspark.executor.extraClassPath=/kkb/install/hadoop-3.1.4/share/hadoop/common/hadoop-lzo-0.4.20.jar" "-Dspark.history.ui.port=4000" "-Dspark.rpc.askTimeout=10s" "-Dspark.jars=hdfs://node01:8020/original-spark-core-1.0-SNAPSHOT.jar" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.153.110:41049" "/kkb/install/spark-2.3.3-bin-hadoop2.7/work/driver-20210115100008-0000/original-spark-core-1.0-SNAPSHOT.jar" "com.kkb.spark.core.SparkCountCluster" "hdfs://node01:8020/word.txt" "hdfs://node01:8020/output"
========================================

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:65)
    at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
    at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:187)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.Partitioner$$anonfun$4.apply(Partitioner.scala:78)
    at org.apache.spark.Partitioner$$anonfun$4.apply(Partitioner.scala:78)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:78)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:326)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:326)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:325)
    at com.kkb.spark.core.SparkCountCluster$.main(SparkCountCluster.scala:12)
    at com.kkb.spark.core.SparkCountCluster.main(SparkCountCluster.scala)
    ... 6 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    ... 45 more
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:180)
    at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
    ... 50 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
    ... 52 

Solution 1:
Point --master at port 6066, i.e. the REST URL: spark://node01.kaikeba.com:6066 (cluster mode)

image.png

bin/spark-submit --master spark://node01:6066 \
--deploy-mode cluster \
--class com.kkb.spark.core.SparkCountCluster  \
--executor-memory 1G \
--total-executor-cores 2 \
hdfs://node01:8020/original-spark-core-1.0-SNAPSHOT.jar \
hdfs://node01:8020/word.txt hdfs://node01:8020/output

Solution 2:
Edit the configuration file spark-defaults.conf and add the setting spark.master spark://node01:7077,node02:7077

image.png

Copyright notice: this article was published by 西安泽虎代运营. Please credit the source when reposting.
Source: https://www.0291.com.cn/post/57175.html