Packages

  • package org.apache.spark.sql.qualityFunctions
  • object LambdaCompilationUtils

    Functionality related to LambdaCompilation. Seemingly all HigherOrderFunctions use a lazy val match to extract the NamedLambdaVariables from the Spark LambdaFunction after bind has been called. By the time doGenCode is called, eval _could_ already have been called and the lazy val evaluated, so simply rewriting the tree may not fully work. Additionally, the NamedLambdaVariable type is bound in those lazy vals, which means _ANY_ HigherOrderFunction may not tolerate swapping out NamedLambdaVariables for another NamedExpression.

    To add to the fun, the open source Spark HoFs all use CodegenFallback, as does NamedLambdaVariable, so it is possible to swap out some of these implementations if, for example, an array_transform is nested in a Fun1 or Fun2. Similarly, a Fun1 can call a Fun2, so the following assumptions hold for each Fun1/FunN doGenCode:

    1. Use the processLambda function to evaluate the function.
    2. compilationHandlers uses the quality.lambdaHandlers environment variable to load a comma-separated list of fqn=handler pairs.
    3. The handler in each pair is loaded; each pair maps a fully qualified class name to a handler (e.g. org.apache.spark.sql.catalyst.expressions.ZipWith=handler.fqn).
    4. processLambda then evaluates the expression tree; for each matching HoF class name it calls the handler.
    5. Handlers perform the custom doGenCode for that expression rather than the default OSS CodegenFallback.
    6. Handlers return the ExprCode AND a list of NamedLambdaVariables that must have .value.set called upon them (i.e. those we can't optimise).
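The fqn=handler format from step 2 can be illustrated with a small parser. This is a minimal sketch, not the library's own loading code: HandlerConfig and parse are hypothetical names, and the lenient treatment of blank or malformed entries is an assumption.

```scala
object HandlerConfig {
  // Parses "fqn=handler,fqn=handler" into a Map from the HoF class name
  // to the handler class name.
  // Assumption: blank entries and entries without both sides of '=' are skipped.
  def parse(raw: String): Map[String, String] =
    raw.split(",").iterator
      .map(_.trim)
      .filter(_.nonEmpty)
      .flatMap { pair =>
        // Split on the first '=' only, so each pair yields at most two parts.
        pair.split("=", 2) match {
          case Array(fqn, handler) if fqn.trim.nonEmpty && handler.trim.nonEmpty =>
            Some(fqn.trim -> handler.trim)
          case _ => None
        }
      }
      .toMap
}
```

Calling HandlerConfig.parse on the example pair from step 3 gives a single-entry map keyed by org.apache.spark.sql.catalyst.expressions.ZipWith.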

    NB The fqn will also be used to check for named lambdas used through registerLambdaFunctions.

    https://github.com/apache/spark/pull/21954 introduced the lambda variable with an AtomicReference and its inherent performance hit and, due to the difficulty of threading the holder through the expression chain, did not have a compilation approach. Once it is threaded and bind has been called, the variable id is stable, as is the AtomicReference, so it can be swapped out for a simple variable in the same object.
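The difference between the two ways of holding a lambda variable can be sketched in plain Scala. Nothing here is Spark or Quality code: LambdaVariableSketch, HolderBacked and both sum functions are hypothetical names that only contrast a value read through an AtomicReference holder with one kept in an ordinary local variable.

```scala
import java.util.concurrent.atomic.AtomicReference

object LambdaVariableSketch {
  // Interpreted style: every write and read goes through an AtomicReference holder.
  final class HolderBacked {
    private val holder = new AtomicReference[Any]()
    def set(v: Any): Unit = holder.set(v)
    def get: Any = holder.get()
  }

  // Applies the lambda body "x * 2" per element, with x stored in the holder.
  def sumDoubledViaHolder(xs: Seq[Int]): Long = {
    val x = new HolderBacked
    var total = 0L
    xs.foreach { v =>
      x.set(v) // boxes the Int and writes it to the holder
      total += x.get.asInstanceOf[Int] * 2L
    }
    total
  }

  // Compiled style: the same body, with x as a simple variable.
  def sumDoubledViaField(xs: Seq[Int]): Long = {
    var x = 0
    var total = 0L
    xs.foreach { v =>
      x = v
      total += x * 2L
    }
    total
  }
}
```

Both functions compute the same result; the second avoids the holder indirection, which is the substitution the paragraph above describes.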

    quality.lambdaHandlers overrides the default for a given platform on a per-fqn basis, so you only need to "add" or "replace" the HoFs that cause issues (for example TransformValues), not the entire list of OSS HigherOrderFunctions. Note that some versions of Databricks provide compilation of their HoFs that may not be compatible in approach.

    Disable this approach by using quality.lambdaHandlers to map FunN to the default DoCodegenFallbackHandler:

    quality.lambdaHandlers=org.apache.spark.sql.qualityFunctions.FunN=org.apache.spark.sql.qualityFunctions.DoCodegenFallbackHandler
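The per-fqn override behaviour can be pictured as a map merge in which configured entries win over platform defaults. This is only a sketch: HandlerOverrides, effective and the default.Handler name are hypothetical, and the real defaults are not listed on this page.

```scala
object HandlerOverrides {
  // Entries in overrides replace the default for the same fqn;
  // every other default is kept unchanged.
  def effective(defaults: Map[String, String],
                overrides: Map[String, String]): Map[String, String] =
    defaults ++ overrides
}
```

With the disabling setting above as the only override, FunN maps to DoCodegenFallbackHandler and any other defaults stay as they were.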

    Definition Classes
    qualityFunctions
  • LambdaCompilationHandler
  • NamedLambdaVariableOps

trait LambdaCompilationHandler extends AnyRef

Linear Supertypes
AnyRef, Any

Abstract Value Members

  1. abstract def shouldTransform(expr: Expression): Seq[NamedLambdaVariable]

    returns

    an empty Seq if the expression should be transformed (i.e. there is a custom solution for it); otherwise the full set of NamedLambdaVariables found

  2. abstract def transform(expr: Expression, scope: Map[ExprId, NamedLambdaVariableCodeGen]): Expression

    Transforms the expression using the scope of replaceable named lambda variable expressions.
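The contract of the two abstract members can be sketched with stand-in types. Everything below is hypothetical and deliberately independent of Spark: Expr, LambdaVar, CompiledVar, Call and Handler only mirror the shape of Expression, NamedLambdaVariable, NamedLambdaVariableCodeGen and LambdaCompilationHandler so that the example runs on its own.

```scala
object HandlerContractSketch {
  // Stand-in expression tree; NOT Spark's Expression hierarchy.
  sealed trait Expr
  final case class LambdaVar(id: Long, name: String) extends Expr
  final case class CompiledVar(id: Long, javaName: String) extends Expr
  final case class Call(fn: String, args: Seq[Expr]) extends Expr

  // Mirrors the two abstract members documented above.
  trait Handler {
    def shouldTransform(expr: Expr): Seq[LambdaVar]
    def transform(expr: Expr, scope: Map[Long, CompiledVar]): Expr
  }

  // Collects every lambda variable in the tree, in traversal order.
  def lambdaVars(expr: Expr): Seq[LambdaVar] = expr match {
    case v: LambdaVar   => Seq(v)
    case Call(_, args)  => args.flatMap(lambdaVars)
    case _: CompiledVar => Nil
  }

  // No custom solution: report all variables and leave the tree unchanged.
  object Fallback extends Handler {
    def shouldTransform(expr: Expr): Seq[LambdaVar] = lambdaVars(expr)
    def transform(expr: Expr, scope: Map[Long, CompiledVar]): Expr = expr
  }

  // Custom solution: return empty, then swap in-scope variables for compiled ones.
  object Replacing extends Handler {
    def shouldTransform(expr: Expr): Seq[LambdaVar] = Nil
    def transform(expr: Expr, scope: Map[Long, CompiledVar]): Expr = expr match {
      case LambdaVar(id, _) if scope.contains(id) => scope(id)
      case Call(fn, args) => Call(fn, args.map(transform(_, scope)))
      case other => other
    }
  }
}
```

An empty result from shouldTransform signals that the handler takes over the expression, while a non-empty result lists the variables that still need their values set in the usual way.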

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  14. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  16. def toString(): String
    Definition Classes
    AnyRef → Any
  17. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
