Data Proc API, REST: Job.Get
Returns the specified job.
HTTP request
GET https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters/{clusterId}/jobs/{jobId}
Path parameters
Field |
Description |
clusterId |
string Required field. ID of the cluster to request a job from. |
jobId |
string Required field. ID of the job to return. To get a job ID, make a JobService.List request. |
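The request above can be sketched in Python with only the standard library. The helper names `job_url` and `get_job` are hypothetical, and the `Authorization: Bearer` header is an assumption: this page does not describe authentication, so check the service's authentication documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://dataproc.api.cloud.yandex.net/dataproc/v1"


def job_url(cluster_id: str, job_id: str) -> str:
    """Build the Job.Get URL, percent-encoding both path parameters."""
    cluster = urllib.parse.quote(cluster_id, safe="")
    job = urllib.parse.quote(job_id, safe="")
    return f"{BASE_URL}/clusters/{cluster}/jobs/{job}"


def get_job(cluster_id: str, job_id: str, token: str) -> dict:
    """Send the GET request and decode the JSON response body."""
    request = urllib.request.Request(
        job_url(cluster_id, job_id),
        # Assumption: the API accepts a bearer token; not stated on this page.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `get_job("<clusterId>", "<jobId>", "<token>")` would return the job as a dictionary with the fields described in the Response section; a non-200 status raises `urllib.error.HTTPError`.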
Response
HTTP Code: 200 - OK
{
"id": "string",
"clusterId": "string",
"createdAt": "string",
"startedAt": "string",
"finishedAt": "string",
"name": "string",
"createdBy": "string",
"status": "string",
// Includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`
"mapreduceJob": {
"args": [
"string"
],
"jarFileUris": [
"string"
],
"fileUris": [
"string"
],
"archiveUris": [
"string"
],
"properties": "object",
// Includes only one of the fields `mainJarFileUri`, `mainClass`
"mainJarFileUri": "string",
"mainClass": "string"
// end of the list of possible fields
},
"sparkJob": {
"args": [
"string"
],
"jarFileUris": [
"string"
],
"fileUris": [
"string"
],
"archiveUris": [
"string"
],
"properties": "object",
"mainJarFileUri": "string",
"mainClass": "string",
"packages": [
"string"
],
"repositories": [
"string"
],
"excludePackages": [
"string"
]
},
"pysparkJob": {
"args": [
"string"
],
"jarFileUris": [
"string"
],
"fileUris": [
"string"
],
"archiveUris": [
"string"
],
"properties": "object",
"mainPythonFileUri": "string",
"pythonFileUris": [
"string"
],
"packages": [
"string"
],
"repositories": [
"string"
],
"excludePackages": [
"string"
]
},
"hiveJob": {
"properties": "object",
"continueOnFailure": "boolean",
"scriptVariables": "object",
"jarFileUris": [
"string"
],
// Includes only one of the fields `queryFileUri`, `queryList`
"queryFileUri": "string",
"queryList": {
"queries": [
"string"
]
}
// end of the list of possible fields
},
// end of the list of possible fields
"applicationInfo": {
"id": "string",
"applicationAttempts": [
{
"id": "string",
"amContainerId": "string"
}
]
}
}
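Because the response carries only one of `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`, a client usually needs to find out which one is present. A minimal sketch, assuming the response has already been decoded into a dictionary; `job_spec` is a hypothetical helper name, not part of the API:

```python
JOB_SPEC_FIELDS = ("mapreduceJob", "sparkJob", "pysparkJob", "hiveJob")


def job_spec(job: dict) -> tuple:
    """Return (field_name, specification) for the single job spec present."""
    present = [name for name in JOB_SPEC_FIELDS if name in job]
    if len(present) != 1:
        raise ValueError(f"expected exactly one job spec field, found {present}")
    name = present[0]
    return name, job[name]
```

For a Spark job, `job_spec(job)` would return `("sparkJob", {...})`; a response with none or several of the four fields is rejected rather than silently guessed at.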
A Data Proc job. For details about the concept, see documentation.
Field |
Description |
id |
string ID of the job. Generated at creation time. |
clusterId |
string ID of the Data Proc cluster that the job belongs to. |
createdAt |
string (date-time) Creation timestamp. String in RFC3339 format. To work with values in this field, use the APIs described in the |
startedAt |
string (date-time) The time when the job was started. String in RFC3339 format. To work with values in this field, use the APIs described in the |
finishedAt |
string (date-time) The time when the job was finished. String in RFC3339 format. To work with values in this field, use the APIs described in the |
name |
string Name of the job, specified in the JobService.Create request. |
createdBy |
string ID of the user who created the job. |
status |
enum (Status) Job status.
|
mapreduceJob |
Specification for a MapReduce job. Includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`. |
sparkJob |
Specification for a Spark job. Includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`. |
pysparkJob |
Specification for a PySpark job. Includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`. |
hiveJob |
Specification for a Hive job. Includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`. |
applicationInfo |
Attributes of the YARN application. |
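The `createdAt`, `startedAt`, and `finishedAt` fields are RFC3339 strings, which may carry more fractional digits than Python's `datetime` can hold. The sketch below parses them and computes a job's run time. The helper names are hypothetical, and it assumes that a job that has not started or finished yet has the corresponding field absent or empty, which this page does not state.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2})[Tt](\d{2}:\d{2}:\d{2})(?:\.(\d+))?([Zz]|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC3339 timestamp into a timezone-aware datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC3339 timestamp: {value!r}")
    date, time, fraction, offset = match.groups()
    # datetime stores microseconds only: truncate (not round) extra digits
    # and pad short fractions to exactly six digits.
    micros = (fraction or "")[:6].ljust(6, "0")
    if offset in ("Z", "z"):
        offset = "+00:00"
    return datetime.fromisoformat(f"{date}T{time}.{micros}{offset}")


def job_duration(job: dict) -> Optional[timedelta]:
    """Return finishedAt - startedAt, or None if either value is missing."""
    started, finished = job.get("startedAt"), job.get("finishedAt")
    if not started or not finished:
        return None  # assumption: missing or empty means "not yet"
    return parse_rfc3339(finished) - parse_rfc3339(started)
```

Truncating to microseconds loses at most one microsecond of precision, which is normally irrelevant for job run times.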
MapreduceJob
Field |
Description |
args[] |
string Optional arguments to pass to the driver. |
jarFileUris[] |
string JAR file URIs to add to CLASSPATH of the Data Proc driver and each task. |
fileUris[] |
string URIs of resource files to be copied to the working directory of Data Proc drivers |
archiveUris[] |
string URIs of archives to be extracted to the working directory of Data Proc drivers and tasks. |
properties |
object (map<string, string>) Property names and values, used to configure Data Proc and MapReduce. |
mainJarFileUri |
string HCFS URI of the .jar file containing the driver class. Includes only one of the fields `mainJarFileUri`, `mainClass`. |
mainClass |
string The name of the driver class. Includes only one of the fields `mainJarFileUri`, `mainClass`. |
SparkJob
Field |
Description |
args[] |
string Optional arguments to pass to the driver. |
jarFileUris[] |
string JAR file URIs to add to CLASSPATH of the Data Proc driver and each task. |
fileUris[] |
string URIs of resource files to be copied to the working directory of Data Proc drivers |
archiveUris[] |
string URIs of archives to be extracted to the working directory of Data Proc drivers and tasks. |
properties |
object (map<string, string>) Property names and values, used to configure Data Proc and Spark. |
mainJarFileUri |
string The HCFS URI of the JAR file containing the driver class. |
mainClass |
string The name of the driver class. |
packages[] |
string List of Maven coordinates of JARs to include on the driver and executor classpaths. |
repositories[] |
string List of additional remote repositories to search for the Maven coordinates given with --packages. |
excludePackages[] |
string List of groupId:artifactId pairs to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts. |
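The SparkJob fields line up closely with standard `spark-submit` options (`--class`, `--jars`, `--files`, `--archives`, `--packages`, `--repositories`, `--exclude-packages`, `--conf`). The sketch below shows that correspondence for a decoded `sparkJob` object. It is an illustration only: `spark_submit_args` is a hypothetical helper, and this page does not say that Data Proc launches jobs with exactly this command line.

```python
def spark_submit_args(spark_job: dict) -> list:
    """Translate a sparkJob object into an equivalent spark-submit argument list."""
    args = []
    if spark_job.get("mainClass"):
        args += ["--class", spark_job["mainClass"]]
    # Each list-valued field becomes one flag with comma-separated values.
    for field, flag in (
        ("jarFileUris", "--jars"),
        ("fileUris", "--files"),
        ("archiveUris", "--archives"),
        ("packages", "--packages"),
        ("repositories", "--repositories"),
        ("excludePackages", "--exclude-packages"),
    ):
        values = spark_job.get(field) or []
        if values:
            args += [flag, ",".join(values)]
    # Properties become repeated --conf key=value pairs (sorted for stable output).
    for key, value in sorted((spark_job.get("properties") or {}).items()):
        args += ["--conf", f"{key}={value}"]
    if spark_job.get("mainJarFileUri"):
        args.append(spark_job["mainJarFileUri"])
    # Driver arguments always go last, after the application JAR.
    args += spark_job.get("args") or []
    return args
```

Reading a returned job this way can help when reproducing a run by hand outside the cluster's job service.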
PysparkJob
Field |
Description |
args[] |
string Optional arguments to pass to the driver. |
jarFileUris[] |
string JAR file URIs to add to CLASSPATH of the Data Proc driver and each task. |
fileUris[] |
string URIs of resource files to be copied to the working directory of Data Proc drivers |
archiveUris[] |
string URIs of archives to be extracted to the working directory of Data Proc drivers and tasks. |
properties |
object (map<string, string>) Property names and values, used to configure Data Proc and PySpark. |
mainPythonFileUri |
string URI of the file with the driver code. Must be a .py file. |
pythonFileUris[] |
string URIs of Python files to pass to the PySpark framework. |
packages[] |
string List of Maven coordinates of JARs to include on the driver and executor classpaths. |
repositories[] |
string List of additional remote repositories to search for the Maven coordinates given with --packages. |
excludePackages[] |
string List of groupId:artifactId pairs to exclude while resolving the dependencies provided in --packages, to avoid dependency conflicts. |
HiveJob
Field |
Description |
properties |
object (map<string, string>) Property names and values, used to configure Data Proc and Hive. |
continueOnFailure |
boolean Flag indicating whether a job should continue to run if a query fails. |
scriptVariables |
object (map<string, string>) Query variables and their values. |
jarFileUris[] |
string JAR file URIs to add to CLASSPATH of the Hive driver and each task. |
queryFileUri |
string URI of the script with all the necessary Hive queries. Includes only one of the fields `queryFileUri`, `queryList`. |
queryList |
List of Hive queries to be used in the job. Includes only one of the fields `queryFileUri`, `queryList`. |
QueryList
Field |
Description |
queries[] |
string List of Hive queries. |
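A Hive job carries its queries either as a script URI (`queryFileUri`) or inline (`queryList.queries`), never both. A minimal sketch of reading whichever one is set from a decoded `hiveJob` object; `hive_query_source` and the `"file"` / `"inline"` labels are hypothetical names chosen for this example:

```python
def hive_query_source(hive_job: dict) -> tuple:
    """Return ("file", uri) or ("inline", [queries]) for a hiveJob object."""
    has_file = bool(hive_job.get("queryFileUri"))
    has_list = "queryList" in hive_job
    if has_file == has_list:
        # Either both are set or neither is: not a valid oneof.
        raise ValueError("expected exactly one of queryFileUri, queryList")
    if has_file:
        return "file", hive_job["queryFileUri"]
    return "inline", list(hive_job["queryList"].get("queries", []))
```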
ApplicationInfo
Field |
Description |
id |
string ID of the YARN application. |
applicationAttempts[] |
YARN application attempts. |
ApplicationAttempt
Field |
Description |
id |
string ID of the YARN application attempt. |
amContainerId |
string ID of the YARN Application Master container. |
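When tracking a job down in YARN, the Application Master container IDs are often the useful part of `applicationInfo`. A small sketch that collects them from a decoded job response; `am_container_ids` is a hypothetical helper, and it assumes `applicationInfo` may be absent, for example before the job has been submitted to YARN:

```python
def am_container_ids(job: dict) -> list:
    """Collect Application Master container IDs from all YARN attempts."""
    info = job.get("applicationInfo") or {}
    return [
        attempt["amContainerId"]
        for attempt in info.get("applicationAttempts") or []
        if attempt.get("amContainerId")  # skip attempts without a container ID
    ]
```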