SparkSubmitArguments¶
SparkSubmitArguments is a custom SparkSubmitArgumentsParser that handles the command-line arguments of the spark-submit script. The actions use these arguments for execution (possibly with an explicit env environment).
SparkSubmitArguments is created when the spark-submit script is launched (with only args passed in) and is later used to print the arguments in verbose mode.
Command-Line Options¶
--files¶
- Configuration Property: spark.files
- Configuration Property (Spark on YARN): spark.yarn.dist.files
Printed out to standard output when the --verbose option is used
When SparkSubmit is requested to prepareSubmitEnvironment, the files are:
Creating Instance¶
SparkSubmitArguments takes the following to be created:
- Arguments (Seq[String])
- Environment Variables (default: sys.env)
SparkSubmitArguments is created when:
- SparkSubmit is requested to parseArguments and is launched as a command-line application
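The two constructor inputs can be illustrated with a minimal sketch. `SubmitArgs` below is a hypothetical, simplified stand-in (not Spark's class) that only mirrors the shape described above: a sequence of arguments and an environment map that defaults to `sys.env`.

```scala
object CreatingInstanceSketch {
  // Hypothetical stand-in for SparkSubmitArguments; it only stores its inputs.
  final class SubmitArgs(
      val args: Seq[String],
      val env: Map[String, String] = sys.env) // defaults to the process environment

  // Only args passed in: env falls back to sys.env.
  val fromCommandLine = new SubmitArgs(Seq("--verbose", "app.jar"))

  // Explicit environment, which is convenient for tests.
  val withExplicitEnv = new SubmitArgs(Seq("app.jar"), Map("MASTER" -> "local[*]"))
}
```

Passing the environment explicitly keeps the parsing logic independent of the machine it runs on.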
Loading Spark Properties¶
loadEnvironmentArguments(): Unit
loadEnvironmentArguments loads the Spark properties for the current execution of spark-submit.
loadEnvironmentArguments reads command-line options first, followed by Spark properties and the system's environment variables.
Note
Spark configuration properties start with the spark. prefix and can be set using the --conf [key=value] command-line option.
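The lookup order described above (command-line option, then Spark property, then environment variable) can be sketched as a chain of `Option` fallbacks. This is a minimal illustration, not Spark's code: the `resolve` helper and all of its parameter names are hypothetical.

```scala
object LoadEnvironmentSketch {
  // Resolves a single setting using the order described in the text:
  // 1. the value given on the command line (if any),
  // 2. the Spark property with the given key,
  // 3. the environment variable with the given name.
  def resolve(
      cliValue: Option[String],
      sparkProperties: Map[String, String],
      propertyKey: String,
      env: Map[String, String],
      envKey: String): Option[String] =
    cliValue
      .orElse(sparkProperties.get(propertyKey))
      .orElse(env.get(envKey))
}
```

For example, with no command-line value, a `spark.master` property of `yarn` and a `MASTER` environment variable of `local[*]`, `resolve` returns `Some("yarn")`, because the Spark property is consulted before the environment.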
Handling Options¶
handle(
opt: String,
value: String): Boolean
handle parses the input opt argument and returns true, or throws an IllegalArgumentException when opt is unknown.
handle sets the internal properties listed in the table Command-Line Options, Spark Properties and Environment Variables.
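The behavior described above can be sketched as a pattern match over the option name. This is a simplified illustration under assumptions: the three options and the mutable fields are chosen for the example only, and the exception message is hypothetical.

```scala
object HandleSketch {
  // Hypothetical internal properties set by handle.
  var master: Option[String] = None
  var files: Option[String] = None
  var verbose: Boolean = false

  def handle(opt: String, value: String): Boolean = {
    opt match {
      case "--master"  => master = Some(value)
      case "--files"   => files = Some(value)
      case "--verbose" => verbose = true // a flag, so value is ignored
      case _ =>
        // Unknown options are rejected rather than silently skipped.
        throw new IllegalArgumentException(s"Unexpected argument '$opt'.")
    }
    true // every recognized option reports success
  }
}
```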
mergeDefaultSparkProperties¶
mergeDefaultSparkProperties(): Unit
mergeDefaultSparkProperties merges Spark properties from the default Spark properties file (i.e. spark-defaults.conf) with those specified through the --conf command-line option.
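A minimal sketch of such a merge follows, assuming that a property given through --conf takes precedence over the same key in spark-defaults.conf (the precedence documented for spark-submit). The `merge` helper and its parameter names are hypothetical, and both sources are modeled as plain maps.

```scala
object MergeDefaultsSketch {
  // defaults: properties read from spark-defaults.conf
  // conf:     properties given with --conf on the command line
  def merge(
      defaults: Map[String, String],
      conf: Map[String, String]): Map[String, String] =
    defaults ++ conf // on a key collision the right-hand side (conf) wins
}
```

A default is therefore used only when the same key was not given on the command line.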
isPython Flag¶
isPython: Boolean = false
isPython indicates whether the application resource is a PySpark application (a Python script or pyspark shell).
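One way such a flag could be derived from the application resource is sketched below. The exact rule is an assumption for illustration: a resource counts as PySpark when it ends with `.py` or equals the `pyspark-shell` marker. The helper name and the constant are hypothetical here.

```scala
object IsPythonSketch {
  // Assumed marker for the pyspark shell resource.
  val PysparkShell = "pyspark-shell"

  // True for a Python script or the pyspark shell, false otherwise (including null).
  def isPython(primaryResource: String): Boolean =
    primaryResource != null &&
      (primaryResource.endsWith(".py") || primaryResource == PysparkShell)
}
```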