spark1.2里的一小段scala代码看不懂
def map[U: ClassTag](f: T => U): RDD[U] = {
val cleanF = sc.clean(f)
new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.map(cleanF))
}
RDD.scala里的这个方法里的context, pid, iter不知道从哪来的啊??
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala
爆肝女青年A
9 years, 9 months ago
Answers
private[spark] class MapPartitionsRDD[U: ClassTag, T: ClassTag](
prev: RDD[T],
f: (TaskContext, Int, Iterator[T]) => Iterator[U], // (TaskContext, partition index, iterator)
preservesPartitioning: Boolean = false)
extends RDD[U](prev) {
override def compute(split: Partition, context: TaskContext): Iterator[U] =
f(context, split.index, firstParent[T].iterator(split, context))
}
方法的参数列表,传入一个参数为(TaskContext, Int, Iterator[T])返回为Iterator[U]的函数作为MapPartitionsRDD的构造函数的参数f,方法compute会调用这个方法。
DT的路过
answered 9 years, 9 months ago