leftOuterJoin JavaPairRDD<Integer, Integer> and JavaPairRDD<Integer, Map<Integer, Integer>>

I am trying to perform a leftOuterJoin of a JavaPairRDD<Integer, Integer> and a JavaPairRDD<Integer, Map<Integer, Integer>>, and the return type in my function signature is

JavaPairRDD<Integer, Tuple2<Integer, Optional<Map<Integer, Integer>>>>

Optional here is com.google.common.base.Optional

Is this the correct return type when I perform leftOuterJoin?

My IDE is giving this error

no instance(s) of type variable(s) W exist so that Optional<W> conforms to Optional<Map<Integer, Integer>>

I couldn't find proper documentation for this. If there are any links to understand this better that would be helpful too. Thanks.

1 answer

  • answered 2018-05-16 06:36 Oli

    According to the javadoc (https://spark.apache.org/docs/2.2.0/api/java/index.html?org/apache/spark/api/java/JavaPairRDD.html), a left outer join between an RDD of type JavaPairRDD<K, V> and an RDD of type JavaPairRDD<K, W> returns a JavaPairRDD<K, Tuple2<V, Optional<W>>>.

    This is what you wrote, except that the Optional type is the one defined in Spark's Java API: org.apache.spark.api.java.Optional<T>. It is not the one defined by Google, hence the error your IDE throws at you ;-)
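
To illustrate the answer above, here is a minimal sketch of the join with the Spark import in place. It assumes Spark 2.x on the classpath (the javadoc linked above is for 2.2.0) and a local master. The class name LeftOuterJoinExample, the method name join, and the sample data are made up for illustration and do not come from the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
// Spark's own Optional, NOT com.google.common.base.Optional
import org.apache.spark.api.java.Optional;

import scala.Tuple2;

public class LeftOuterJoinExample {

    // Hypothetical helper: builds two small pair RDDs and left-outer-joins them.
    static List<Tuple2<Integer, Tuple2<Integer, Optional<Map<Integer, Integer>>>>> join(
            JavaSparkContext sc) {
        List<Tuple2<Integer, Integer>> leftData =
                Arrays.asList(new Tuple2<>(1, 10), new Tuple2<>(2, 20));

        Map<Integer, Integer> inner = new HashMap<>();
        inner.put(100, 1000);
        List<Tuple2<Integer, Map<Integer, Integer>>> rightData =
                Arrays.asList(new Tuple2<>(1, inner));

        JavaPairRDD<Integer, Integer> left = sc.parallelizePairs(leftData);
        JavaPairRDD<Integer, Map<Integer, Integer>> right = sc.parallelizePairs(rightData);

        // Compiles because Optional resolves to org.apache.spark.api.java.Optional.
        JavaPairRDD<Integer, Tuple2<Integer, Optional<Map<Integer, Integer>>>> joined =
                left.leftOuterJoin(right);

        // Sort by key so the collected order is deterministic.
        return joined.sortByKey().collect();
    }

    public static void main(String[] args) {
        // Assumption: local mode, for demonstration only.
        SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("leftOuterJoinExample");
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            for (Tuple2<Integer, Tuple2<Integer, Optional<Map<Integer, Integer>>>> row : join(sc)) {
                Optional<Map<Integer, Integer>> maybeMap = row._2()._2();
                // Key 2 has no match on the right side, so its Optional is not present.
                System.out.println(row._1() + " -> " + row._2()._1()
                        + ", right side present: " + maybeMap.isPresent());
            }
        } finally {
            sc.stop();
        }
    }
}
```

If the import is switched back to com.google.common.base.Optional, the assignment to joined should stop compiling with a type mismatch like the one in the question, since leftOuterJoin returns Spark's Optional.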