Spark: loading a local file with textFile throws IllegalArgumentException


As the title says.


 JavaSparkContext jsc = new JavaSparkContext(sparkConf);
JavaRDD<String> lines = jsc.textFile(args[0]);

The input argument is /home/users/spark/test/r.txt.
The following exception is thrown:


 Exception in thread "main" java.lang.IllegalArgumentException
at java.net.URI.create(URI.java:841)
at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:168)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:146)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:375)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:409)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:381)
at org.apache.spark.SparkContext$$anonfun$24.apply(SparkContext.scala:560)
at org.apache.spark.SparkContext$$anonfun$24.apply(SparkContext.scala:560)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:149)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:149)
at scala.Option.map(Option.scala:145)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:149)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:172)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
at org.apache.spark.api.java.JavaPairRDD.reduceByKey(JavaPairRDD.scala:490)
at JSparkTest.main(JSparkTest.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.URISyntaxException: Illegal character in authority at index 7: hdfs://${hadoop.host.name}:54310
    at java.net.URI$Parser.fail(URI.java:2810)
    at java.net.URI$Parser.parseAuthority(URI.java:3148)
    at java.net.URI$Parser.parseHierarchical(URI.java:3059)
    at java.net.URI$Parser.parse(URI.java:3015)
    at java.net.URI.<init>(URI.java:577)
    at java.net.URI.create(URI.java:839)
    ... 36 more

Why does this happen?

spark java

KEY大好 12 years, 1 month ago

The error message says an argument is illegal.
The URI is probably wrong:


 Illegal character in authority at index 7: hdfs://${hadoop.host.name}:54310

Could it be that ${hadoop.host.name} has not been set?
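
The failure can be reproduced with plain java.net.URI, without Spark or Hadoop. This is only a minimal sketch: the class name UriPlaceholderDemo, the helper tryParse, and the sample address 127.0.0.1 are made up for illustration.

```java
import java.net.URI;

public class UriPlaceholderDemo {
    // Returns null if the string parses as a URI, otherwise the parser's message.
    static String tryParse(String s) {
        try {
            URI.create(s);
            return null;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the underlying URISyntaxException as the cause.
            return e.getCause() != null ? e.getCause().getMessage() : e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Unresolved placeholder, exactly as it appears in the stack trace.
        System.out.println(tryParse("hdfs://${hadoop.host.name}:54310"));
        // Same URI with a concrete host filled in (sample address, an assumption).
        System.out.println(tryParse("hdfs://127.0.0.1:54310"));
    }
}
```

The first call reports a parse error because the literal text ${hadoop.host.name} is not a legal host, while the second parses cleanly. That suggests the placeholder was never substituted before Hadoop tried to build the default file system URI.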

呆毛不天然 answered 12 years, 1 month ago

Corrected answer: the cause should be that ${hadoop.host.name} has not been set.

You can add it to the relevant properties configuration file:


 hadoop.host.name=x.x.x.x

Alternatively, pass it as a parameter when launching the Java program:


 java -Dhadoop.host.name=x.x.x.x AwesomeJavaApplication
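
As far as I know, Hadoop's Configuration expands ${name} placeholders in property values and consults Java system properties when doing so, which is why the -D flag can help. The sketch below only illustrates that idea with the standard library. It is not Hadoop's actual code, and the class PlaceholderExpansionSketch and the method expand are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderExpansionSketch {
    // Matches ${name} and captures the name.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with System.getProperty(name).
    // Names without a matching system property are left untouched.
    static String expand(String value) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String resolved = System.getProperty(m.group(1));
            String replacement = resolved != null ? resolved : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Run with -Dhadoop.host.name=<your host> to see the placeholder resolved;
        // without it, the string is printed unchanged.
        System.out.println(expand("hdfs://${hadoop.host.name}:54310"));
    }
}
```

If the property is missing, the placeholder survives verbatim, which matches the string shown in the exception message.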

逆廻十六夜 answered 12 years, 1 month ago
