Yandex Managed Service for Apache Spark™

Managed Spark API, gRPC: JobService.Create

Created by Yandex Cloud. Last updated: September 25, 2025.

In this article:

  • gRPC request
  • CreateJobRequest
  • SparkJob
  • PysparkJob
  • SparkConnectJob
  • operation.Operation
  • CreateJobMetadata
  • Job
  • SparkJob
  • PysparkJob
  • SparkConnectJob

Creates a job for a Spark cluster.

gRPC request

rpc Create (CreateJobRequest) returns (operation.Operation)

CreateJobRequest

{
  "cluster_id": "string",
  "name": "string",
  // Includes only one of the fields `spark_job`, `pyspark_job`, `spark_connect_job`
  "spark_job": {
    "args": [
      "string"
    ],
    "jar_file_uris": [
      "string"
    ],
    "file_uris": [
      "string"
    ],
    "archive_uris": [
      "string"
    ],
    "properties": "map<string, string>",
    "main_jar_file_uri": "string",
    "main_class": "string",
    "packages": [
      "string"
    ],
    "repositories": [
      "string"
    ],
    "exclude_packages": [
      "string"
    ]
  },
  "pyspark_job": {
    "args": [
      "string"
    ],
    "jar_file_uris": [
      "string"
    ],
    "file_uris": [
      "string"
    ],
    "archive_uris": [
      "string"
    ],
    "properties": "map<string, string>",
    "main_python_file_uri": "string",
    "python_file_uris": [
      "string"
    ],
    "packages": [
      "string"
    ],
    "repositories": [
      "string"
    ],
    "exclude_packages": [
      "string"
    ]
  },
  "spark_connect_job": {
    "jar_file_uris": [
      "string"
    ],
    "file_uris": [
      "string"
    ],
    "archive_uris": [
      "string"
    ],
    "properties": "map<string, string>",
    "packages": [
      "string"
    ],
    "repositories": [
      "string"
    ],
    "exclude_packages": [
      "string"
    ]
  },
  // end of the list of possible fields
  "service_account_id": "string"
}
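
As a concrete illustration, here is a minimal sketch of calling JobService.Create from Python with stubs generated from the public protos. The module path yandex.cloud.spark.v1, the endpoint, and all IDs below are placeholders and assumptions, not values confirmed by this article; authentication follows the usual Yandex Cloud pattern of passing an IAM token in gRPC metadata.

import grpc

# Assumption: module path of the generated stubs; check your SDK version.
from yandex.cloud.spark.v1 import job_pb2, job_service_pb2, job_service_pb2_grpc

# Assumption: service endpoint; take the actual host from the endpoint list.
channel = grpc.secure_channel("spark.api.cloud.yandex.net:443",
                              grpc.ssl_channel_credentials())
stub = job_service_pb2_grpc.JobServiceStub(channel)

iam_token = "<IAM_TOKEN>"  # hypothetical placeholder; obtain a real token via IAM

request = job_service_pb2.CreateJobRequest(
    cluster_id="<cluster_id>",                  # required
    name="example-job",                         # optional
    spark_job=job_pb2.SparkJob(                 # exactly one of the three job specs
        main_jar_file_uri="s3a://<bucket>/app.jar",
        main_class="com.example.Main",
        args=["--input", "s3a://<bucket>/input"],
    ),
    service_account_id="<service_account_id>",  # used to access Cloud resources
)

# Create returns an operation.Operation; the created Job arrives in its response.
operation = stub.Create(request, metadata=[("authorization", f"Bearer {iam_token}")])
print(operation.id)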

cluster_id (string)
Required field. ID of the cluster to create the Spark job in.

name (string)
Optional. Name of the job.

spark_job (SparkJob)
Includes only one of the fields spark_job, pyspark_job, spark_connect_job.

pyspark_job (PysparkJob)
Includes only one of the fields spark_job, pyspark_job, spark_connect_job.

spark_connect_job (SparkConnectJob)
Includes only one of the fields spark_job, pyspark_job, spark_connect_job.

service_account_id (string)
Service account used to access Cloud resources.

SparkJob

args[] (string)
Optional arguments to pass to the driver.

jar_file_uris[] (string)
JAR file URIs to add to the classpaths of the Spark driver and tasks.

file_uris[] (string)
URIs of files to be copied to the working directory of Spark drivers and distributed tasks.

archive_uris[] (string)
URIs of archives to be extracted in the working directory of Spark drivers and tasks.

properties (map<string, string>)
A mapping of property names to values, used to configure Spark.

main_jar_file_uri (string)
URI of the JAR file containing the main class.

main_class (string)
The name of the driver's main class.

packages[] (string)
List of Maven coordinates of JARs to include on the driver and executor classpaths.

repositories[] (string)
List of additional remote repositories to search for the Maven coordinates given with --packages.

exclude_packages[] (string)
List of groupId:artifactId dependencies to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts.
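
The Maven-related fields above mirror spark-submit's --packages, --repositories, and --exclude-packages behavior. Below is a hedged sketch of a SparkJob spec that uses them, assuming the same generated stubs as above; all coordinates and URIs are illustrative placeholders.

# Sketch: a SparkJob that pulls an extra Maven dependency at submit time.
spark_job = job_pb2.SparkJob(
    main_jar_file_uri="s3a://<bucket>/app.jar",
    main_class="org.example.App",
    packages=["org.apache.spark:spark-avro_2.12:3.5.1"],  # groupId:artifactId:version
    repositories=["https://repo1.maven.org/maven2"],      # searched in addition to defaults
    exclude_packages=["com.google.guava:guava"],          # groupId:artifactId to drop
    properties={"spark.executor.instances": "2"},
)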

PysparkJob

args[] (string)
Optional arguments to pass to the driver.

jar_file_uris[] (string)
JAR file URIs to add to the classpaths of the Spark driver and tasks.

file_uris[] (string)
URIs of files to be copied to the working directory of Spark drivers and distributed tasks.

archive_uris[] (string)
URIs of archives to be extracted in the working directory of Spark drivers and tasks.

properties (map<string, string>)
A mapping of property names to values, used to configure Spark.

main_python_file_uri (string)
URI of the main Python file to use as the driver. Must be a .py file.

python_file_uris[] (string)
URIs of Python files to pass to the PySpark framework.

packages[] (string)
List of Maven coordinates of JARs to include on the driver and executor classpaths.

repositories[] (string)
List of additional remote repositories to search for the Maven coordinates given with --packages.

exclude_packages[] (string)
List of groupId:artifactId dependencies to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts.
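
For PySpark, the driver is the Python file named in main_python_file_uri, while the other URI lists distribute supporting files. A hedged sketch with placeholder URIs, under the same stub assumptions as above:

# Sketch: a PysparkJob spec. main_python_file_uri must point to a .py file.
pyspark_job = job_pb2.PysparkJob(
    main_python_file_uri="s3a://<bucket>/main.py",
    python_file_uris=["s3a://<bucket>/helpers.py"],   # passed to the PySpark framework
    file_uris=["s3a://<bucket>/config.json"],         # copied to the working directory
    archive_uris=["s3a://<bucket>/env.tar.gz"],       # extracted in the working directory
    properties={"spark.submit.deployMode": "cluster"},
    args=["--date", "2025-01-01"],                    # placeholder driver arguments
)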

SparkConnectJob

jar_file_uris[] (string)
JAR file URIs to add to the classpaths of the Spark driver and tasks.

file_uris[] (string)
URIs of files to be copied to the working directory of Spark drivers and distributed tasks.

archive_uris[] (string)
URIs of archives to be extracted in the working directory of Spark drivers and tasks.

properties (map<string, string>)
A mapping of property names to values, used to configure Spark.

packages[] (string)
List of Maven coordinates of JARs to include on the driver and executor classpaths.

repositories[] (string)
List of additional remote repositories to search for the Maven coordinates given with --packages.

exclude_packages[] (string)
List of groupId:artifactId dependencies to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts.
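
Note that a SparkConnectJob carries no main class or file: it only declares dependencies and Spark properties, and the resulting Job exposes connect_url (see the Job message below) for Spark Connect clients. A hedged sketch with placeholder values:

# Sketch: a SparkConnectJob spec; no main class or file, only deps and properties.
spark_connect_job = job_pb2.SparkConnectJob(
    packages=["org.apache.spark:spark-avro_2.12:3.5.1"],
    properties={"spark.executor.instances": "2"},
)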

operation.Operation

{
  "id": "string",
  "description": "string",
  "created_at": "google.protobuf.Timestamp",
  "created_by": "string",
  "modified_at": "google.protobuf.Timestamp",
  "done": "bool",
  "metadata": {
    "cluster_id": "string",
    "job_id": "string"
  },
  // Includes only one of the fields `error`, `response`
  "error": "google.rpc.Status",
  "response": {
    "id": "string",
    "cluster_id": "string",
    "created_at": "google.protobuf.Timestamp",
    "started_at": "google.protobuf.Timestamp",
    "finished_at": "google.protobuf.Timestamp",
    "name": "string",
    "created_by": "string",
    "status": "Status",
    // Includes only one of the fields `spark_job`, `pyspark_job`, `spark_connect_job`
    "spark_job": {
      "args": [
        "string"
      ],
      "jar_file_uris": [
        "string"
      ],
      "file_uris": [
        "string"
      ],
      "archive_uris": [
        "string"
      ],
      "properties": "map<string, string>",
      "main_jar_file_uri": "string",
      "main_class": "string",
      "packages": [
        "string"
      ],
      "repositories": [
        "string"
      ],
      "exclude_packages": [
        "string"
      ]
    },
    "pyspark_job": {
      "args": [
        "string"
      ],
      "jar_file_uris": [
        "string"
      ],
      "file_uris": [
        "string"
      ],
      "archive_uris": [
        "string"
      ],
      "properties": "map<string, string>",
      "main_python_file_uri": "string",
      "python_file_uris": [
        "string"
      ],
      "packages": [
        "string"
      ],
      "repositories": [
        "string"
      ],
      "exclude_packages": [
        "string"
      ]
    },
    "spark_connect_job": {
      "jar_file_uris": [
        "string"
      ],
      "file_uris": [
        "string"
      ],
      "archive_uris": [
        "string"
      ],
      "properties": "map<string, string>",
      "packages": [
        "string"
      ],
      "repositories": [
        "string"
      ],
      "exclude_packages": [
        "string"
      ]
    },
    // end of the list of possible fields
    "ui_url": "string",
    "service_account_id": "string",
    "connect_url": "string"
  }
  // end of the list of possible fields
}

An Operation resource. For more information, see Operation.

id (string)
ID of the operation.

description (string)
Description of the operation. 0-256 characters long.

created_at (google.protobuf.Timestamp)
Creation timestamp.

created_by (string)
ID of the user or service account who initiated the operation.

modified_at (google.protobuf.Timestamp)
The time when the Operation resource was last modified.

done (bool)
If the value is false, the operation is still in progress. If true, the operation is completed, and either error or response is available.

metadata (CreateJobMetadata)
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.

error (google.rpc.Status)
The error result of the operation in case of failure or cancellation. Includes only one of the fields error, response. If done == false and no failure was detected, neither error nor response is set. If done == false and a failure was detected, error is set. If done == true, exactly one of error or response is set.

response (Job)
The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is the standard Create/Update, the response should be the target resource of the operation. Any method that returns a long-running operation should document the response type, if any. Includes only one of the fields error, response.
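
Since Create returns a long-running operation, a client typically polls it until done and then unpacks response into a Job. Below is a sketch assuming the standard OperationService stubs from yandex.cloud.operation and its usual endpoint; both module path and host are assumptions, not values confirmed by this article.

import time

import grpc
from yandex.cloud.operation import operation_service_pb2, operation_service_pb2_grpc
from yandex.cloud.spark.v1 import job_pb2  # assumed module path, as above

# Assumption: the shared operation endpoint; check the API endpoint list.
op_channel = grpc.secure_channel("operation.api.cloud.yandex.net:443",
                                 grpc.ssl_channel_credentials())
op_stub = operation_service_pb2_grpc.OperationServiceStub(op_channel)

# `operation` is the Operation returned by JobService.Create (see the first sketch).
while not operation.done:
    time.sleep(5)
    operation = op_stub.Get(
        operation_service_pb2.GetOperationRequest(operation_id=operation.id),
        metadata=[("authorization", f"Bearer {iam_token}")],
    )

# When done, exactly one of `error` / `response` is set.
if operation.HasField("error"):
    raise RuntimeError(operation.error.message)

job = job_pb2.Job()
operation.response.Unpack(job)  # response is a google.protobuf.Any holding a Job
print(job.id, job.cluster_id)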

CreateJobMetadata

cluster_id (string)
Required field. ID of the Spark cluster.

job_id (string)
ID of the Spark job.

Job

Spark job.

id (string)
Required. Unique ID of the Spark job. This ID is assigned by MDB when the Spark job is created.

cluster_id (string)
Required. Unique ID of the Spark cluster.

created_at (google.protobuf.Timestamp)
The time when the Spark job was created.

started_at (google.protobuf.Timestamp)
The time when the Spark job was started.

finished_at (google.protobuf.Timestamp)
The time when the Spark job finished.

name (string)
Name of the Spark job.

created_by (string)
ID of the user who created the job.

status (enum Status)
Job status:
  • STATUS_UNSPECIFIED
  • PROVISIONING: the job has been created and is waiting to be acquired.
  • PENDING: the job has been acquired and is waiting for execution.
  • RUNNING: the job is running.
  • ERROR: the job failed.
  • DONE: the job finished.
  • CANCELLED: the job was cancelled.
  • CANCELLING: the job is waiting for cancellation.

spark_job (SparkJob)
Job specification. Includes only one of the fields spark_job, pyspark_job, spark_connect_job.

pyspark_job (PysparkJob)
Job specification. Includes only one of the fields spark_job, pyspark_job, spark_connect_job.

spark_connect_job (SparkConnectJob)
Job specification. Includes only one of the fields spark_job, pyspark_job, spark_connect_job.

ui_url (string)
Spark UI URL.

service_account_id (string)
Service account used to access Cloud resources.

connect_url (string)
Spark Connect URL.
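
Once the Job is unpacked from the operation response, the fields documented above can be read directly. A short sketch; enum and field access follow standard generated-protobuf conventions, which is an assumption about the stubs:

# Sketch: inspecting a returned Job message.
print("status:", job_pb2.Job.Status.Name(job.status))  # e.g. RUNNING, DONE
print("Spark UI:", job.ui_url)
if job.connect_url:                 # populated for Spark Connect jobs
    print("Spark Connect:", job.connect_url)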

SparkJob

args[] (string)
Optional arguments to pass to the driver.

jar_file_uris[] (string)
JAR file URIs to add to the classpaths of the Spark driver and tasks.

file_uris[] (string)
URIs of files to be copied to the working directory of Spark drivers and distributed tasks.

archive_uris[] (string)
URIs of archives to be extracted in the working directory of Spark drivers and tasks.

properties (map<string, string>)
A mapping of property names to values, used to configure Spark.

main_jar_file_uri (string)
URI of the JAR file containing the main class.

main_class (string)
The name of the driver's main class.

packages[] (string)
List of Maven coordinates of JARs to include on the driver and executor classpaths.

repositories[] (string)
List of additional remote repositories to search for the Maven coordinates given with --packages.

exclude_packages[] (string)
List of groupId:artifactId dependencies to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts.

PysparkJob

args[] (string)
Optional arguments to pass to the driver.

jar_file_uris[] (string)
JAR file URIs to add to the classpaths of the Spark driver and tasks.

file_uris[] (string)
URIs of files to be copied to the working directory of Spark drivers and distributed tasks.

archive_uris[] (string)
URIs of archives to be extracted in the working directory of Spark drivers and tasks.

properties (map<string, string>)
A mapping of property names to values, used to configure Spark.

main_python_file_uri (string)
URI of the main Python file to use as the driver. Must be a .py file.

python_file_uris[] (string)
URIs of Python files to pass to the PySpark framework.

packages[] (string)
List of Maven coordinates of JARs to include on the driver and executor classpaths.

repositories[] (string)
List of additional remote repositories to search for the Maven coordinates given with --packages.

exclude_packages[] (string)
List of groupId:artifactId dependencies to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts.

SparkConnectJob

jar_file_uris[] (string)
JAR file URIs to add to the classpaths of the Spark driver and tasks.

file_uris[] (string)
URIs of files to be copied to the working directory of Spark drivers and distributed tasks.

archive_uris[] (string)
URIs of archives to be extracted in the working directory of Spark drivers and tasks.

properties (map<string, string>)
A mapping of property names to values, used to configure Spark.

packages[] (string)
List of Maven coordinates of JARs to include on the driver and executor classpaths.

repositories[] (string)
List of additional remote repositories to search for the Maven coordinates given with --packages.

exclude_packages[] (string)
List of groupId:artifactId dependencies to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts.
