Asked by: 小点点

Importing spark.implicits._ in Scala

Apparently, this is an object inside a class in Scala. When I import it in a method like this:

def f() = {
  val spark = SparkSession()....
  import spark.implicits._
}

It works fine, but I'm writing a test class and I'd like this import to be available for all of the tests. I've tried:

class SomeSpec extends FlatSpec with BeforeAndAfter {
  var spark:SparkSession = _

  //This won't compile
  import spark.implicits._

  before {
    spark = SparkSession()....
    //This won't either
    import spark.implicits._
  }

  "a test" should "run" in {
    //Even this won't compile (although it already looks bad here)
    import spark.implicits._

    //This was the only way i could make it work
    val spark = this.spark
    import spark.implicits._
  }
}

Not only does this look bad, I don't want to do it in every test. What is the "correct" way to do this?


3 Answers

Anonymous user

You can do something similar to what is done in the Spark test suites. For example, this will work (inspired by SQLTestData):

class SomeSpec extends FlatSpec with BeforeAndAfter { self =>

  var spark: SparkSession = _

  private object testImplicits extends SQLImplicits {
    protected override def _sqlContext: SQLContext = self.spark.sqlContext
  }
  import testImplicits._

  before {
    spark = SparkSession.builder().master("local").getOrCreate()
  }

  "a test" should "run" in {
    // implicits are working
    val df = spark.sparkContext.parallelize(List(1,2,3)).toDF()
  }
}

Alternatively, you can directly use something like SharedSQLContext, which provides a testImplicits: SQLImplicits, i.e.:

class SomeSpec extends FlatSpec with SharedSQLContext {
  import testImplicits._

  // ...

}
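To see why the `testImplicits` object above can be imported at class level even though `spark` is a `var` that is only assigned in `before`, here is a minimal plain-Scala sketch (no Spark; `SharedState`, `label`, and `Tagged` are made-up stand-ins): the implicit dereferences the mutable member at *call* time, not at import time.

```scala
// Stand-in for the pattern: a trait holds a late-initialized `var`,
// and a nested object exposes implicits that read it when invoked.
trait SharedState { self =>
  var label: String = _ // assigned later, like `spark` in `before { ... }`

  object testImplicits {
    implicit class Tagged(val s: String) {
      // reads self.label at call time, so it sees the later assignment
      def tagged: String = s"$s:${self.label}"
    }
  }
}

object SharedStateDemo extends SharedState {
  import testImplicits._ // legal: `testImplicits` is a stable object

  def demo(): String = {
    label = "ready"      // late initialization, as in `before`
    "value".tagged
  }
}
```

The import compiles because `testImplicits` is a stable path, and the `var` is only touched when an implicit method actually runs.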

Anonymous user

I think the code on GitHub in the SparkSession.scala file can give you a good hint:

      /**
       * :: Experimental ::
       * (Scala-specific) Implicit methods available in Scala for converting
       * common Scala objects into [[DataFrame]]s.
       *
       * {{{
       *   val sparkSession = SparkSession.builder.getOrCreate()
       *   import sparkSession.implicits._
       * }}}
       *
       * @since 2.0.0
       */
      @Experimental
      object implicits extends SQLImplicits with Serializable {
        protected override def _sqlContext: SQLContext = SparkSession.this.sqlContext
      }

Here, the "spark" in spark.implicits._ is just the SparkSession object we created.

Here is another reference!
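The reason the original `import spark.implicits._` on a `var` would not compile is that a Scala import clause requires a *stable identifier*, and a `var` is not one. A minimal sketch without Spark (here `Session` and `RichInt` are made-up stand-ins for SparkSession and its implicits):

```scala
// A nested `implicits`-style object, as in SparkSession.
class Session {
  object implicits {
    implicit class RichInt(val i: Int) {
      def doubled: Int = i * 2
    }
  }
}

object StableImportDemo {
  def demo(): Int = {
    val session = new Session  // `val`: a stable identifier
    import session.implicits._ // compiles, because the path is stable
    3.doubled                  // the imported implicit is in scope
  }
  // var session = new Session
  // import session.implicits._ // would NOT compile: a `var` is not stable
}
```

This is also why the asker's `val spark = this.spark` workaround compiles: copying the `var` into a local `val` produces a stable identifier to import from.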

Anonymous user

I just instantiate the SparkSession and, before using it, "import implicits":

@transient lazy val spark = SparkSession
  .builder()
  .master("spark://master:7777")
  .getOrCreate()

import spark.implicits._
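This works because a `lazy val` is a stable identifier (so the class-level import compiles), while its body only runs on first access, deferring the actual session construction. A small sketch of that deferral, without Spark (`LazyValDemo` and its members are illustrative):

```scala
object LazyValDemo {
  private var initialized = false

  lazy val resource: String = {
    initialized = true // side effect records when the body actually runs
    "ready"
  }

  def demo(): (Boolean, String, Boolean) = {
    val before = initialized // false: the lazy body has not run yet
    val value  = resource    // first access triggers initialization
    (before, value, initialized)
  }
}
```

Note that this pattern eagerly wires the import to a cluster master; in a test suite you may still prefer the `testImplicits` approach from the first answer so the session can be rebuilt between tests.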