
SparkSubmitArguments

SparkSubmitArguments is a custom SparkSubmitArgumentsParser that handles the command-line arguments of the spark-submit script. The spark-submit actions use these arguments for execution (possibly with an explicit env environment).

SparkSubmitArguments is created when the spark-submit script is launched, with only args passed in, and is later used to print the arguments in verbose mode.

Command-Line Options

--files

  • Configuration Property: spark.files
  • Configuration Property (Spark on YARN): spark.yarn.dist.files

Printed out to standard output when the --verbose option is used

When SparkSubmit is requested to prepareSubmitEnvironment, the files are:

Creating Instance

SparkSubmitArguments takes the following to be created:

  • Arguments (Seq[String])
  • Environment Variables (default: sys.env)

SparkSubmitArguments is created when:

Loading Spark Properties

loadEnvironmentArguments(): Unit

loadEnvironmentArguments loads the Spark properties for the current execution of spark-submit.

loadEnvironmentArguments reads command-line options first, followed by Spark properties and then the system's environment variables.
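
The lookup order can be illustrated with a minimal sketch. The object PrecedenceSketch and the helper resolve are hypothetical names introduced here for illustration, not part of SparkSubmitArguments, and the keys in the usage note are only examples. The sketch assumes that the first source holding a value wins.

```scala
object PrecedenceSketch {
  // Hypothetical helper (not Spark API): returns the first defined value,
  // checking the command-line value, then Spark properties, then the environment.
  def resolve(
      cliValue: Option[String],
      sparkProperties: Map[String, String],
      propertyKey: String,
      env: Map[String, String],
      envKey: String): Option[String] =
    cliValue
      .orElse(sparkProperties.get(propertyKey))
      .orElse(env.get(envKey))
}
```

For example, calling PrecedenceSketch.resolve(None, Map("spark.master" -> "local[*]"), "spark.master", sys.env, "MASTER") falls through to the Spark property because no command-line value was given.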

Note

Spark configuration properties start with the spark. prefix and can be set using the --conf [key=value] command-line option.
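
A key=value pair like the one accepted by --conf could be split as in the following sketch. ConfSketch and parseConf are hypothetical names, and splitting at the first = only (so that values may themselves contain =) is an assumption of this sketch, not a statement about the exact parsing code.

```scala
object ConfSketch {
  // Hypothetical helper (not Spark API): splits "key=value" at the first '='
  // so that the value itself may contain further '=' characters.
  def parseConf(pair: String): (String, String) =
    pair.split("=", 2) match {
      case Array(key, value) if key.nonEmpty => (key, value)
      case _ =>
        throw new IllegalArgumentException(s"Expected key=value but got: $pair")
    }
}
```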

Handling Options

handle(
  opt: String,
  value: String): Boolean

handle parses the input opt argument and returns true. It throws an IllegalArgumentException when it finds an unknown opt.

handle sets the internal properties listed in the Command-Line Options, Spark Properties and Environment Variables table.
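
The contract described above (set an internal property and return true, or throw for an unknown option) can be sketched as follows. HandleSketch and its Args class are a hypothetical, much-reduced stand-in with only three options chosen for illustration. It is not the actual SparkSubmitArguments code.

```scala
object HandleSketch {
  // Hypothetical, reduced stand-in for the parser state (not Spark API).
  final class Args {
    var master: Option[String] = None
    var files: Option[String] = None
    var verbose: Boolean = false

    // Sets the matching internal property and returns true,
    // or throws IllegalArgumentException for an unknown option.
    def handle(opt: String, value: String): Boolean = {
      opt match {
        case "--master"  => master = Some(value)
        case "--files"   => files = Some(value)
        case "--verbose" => verbose = true
        case _ =>
          throw new IllegalArgumentException(s"Unexpected argument '$opt'.")
      }
      true
    }
  }
}
```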

mergeDefaultSparkProperties

mergeDefaultSparkProperties(): Unit

mergeDefaultSparkProperties merges Spark properties from the default Spark properties file (spark-defaults.conf) with those specified through the --conf command-line option.
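
The merge can be pictured with a small sketch. MergeSketch and mergeDefaults are hypothetical names, and the sketch assumes that a property given through --conf takes precedence over the same key in spark-defaults.conf, so a default only fills in keys that were not set explicitly.

```scala
object MergeSketch {
  // Hypothetical helper (not Spark API): defaults fill in missing keys only.
  // With ++, entries from the right-hand map win on key collisions.
  def mergeDefaults(
      defaults: Map[String, String],
      explicit: Map[String, String]): Map[String, String] =
    defaults ++ explicit
}
```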

isPython Flag

isPython: Boolean = false

isPython indicates whether the application resource is a PySpark application (a Python script or pyspark shell).
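
One way such a check could look is sketched below. IsPythonSketch is a hypothetical name, and both the .py suffix test and the pyspark-shell resource name are assumptions of this sketch rather than details stated above.

```scala
object IsPythonSketch {
  // Assumed name of the primary resource used by the pyspark shell.
  val PysparkShell = "pyspark-shell"

  // Hypothetical check (not Spark API): true for a Python script
  // or for the pyspark shell resource, false otherwise (including null).
  def isPython(primaryResource: String): Boolean =
    primaryResource != null &&
      (primaryResource.endsWith(".py") || primaryResource == PysparkShell)
}
```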
