Apache Airflow™
To operate under the Apache Airflow™
YQExecuteQueryOperator
To run queries in Yandex Query, initialize YQExecuteQueryOperator with the arguments listed below.
Required arguments:
- name: Apache Airflow™ job name.
- sql: Text of the SQL query to run in Yandex Query.
Optional arguments:
- folder_id: Folder to execute the query in. If not specified, it is the same as the Managed Service for Apache Airflow™ folder.
- yandex_conn_id: ID of the yandexcloud-type connection containing the Yandex Cloud connection parameters. If not specified, the connection named yandexcloud_default is used. The yandexcloud_default connection is pre-installed as part of Managed Service for Apache Airflow™, so you do not need to create it.
Example:
yq_operator = YQExecuteQueryOperator(task_id="yq_operator", sql="SELECT 'Hello, world!'")
In this example, we are creating an Apache Airflow™ job with the yq_operator ID, which runs the SELECT 'Hello, world!' query. For the full example of a query to Yandex Query from Managed Service for Apache Airflow™, see Automating Yandex Query tasks with Yandex Managed Service for Apache Airflow™.
Returned values
Successful YQExecuteQueryOperator execution outputs data in the form of a dictionary (dict) containing an array of column descriptions and an array of rows with the results.
Query:
yq_operator = YQExecuteQueryOperator(task_id="yq_operator", sql="SELECT 'Hello, world!'")
Result:
{
'rows': [['Hello, world!']],
'columns': [{'name': 'column0', 'type': 'String'}]
}
| Field | Description |
|---|---|
| columns | Array of returned value columns |
| columns[].name | Column name |
| columns[].type | Column data type |
| rows | Array of result rows with the returned value. The number of array elements of each row is the same as the number of columns from the columns parameter. |
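Because each row holds its values in column order, the returned dictionary can be turned into one record per row. The following is a minimal sketch in plain Python; the rows_to_dicts helper is hypothetical and not part of the operator, and the sample result is copied from the example above.

```python
from typing import Any


def rows_to_dicts(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Pair each value in a row with the name of its column."""
    names = [column["name"] for column in result["columns"]]
    return [dict(zip(names, row)) for row in result["rows"]]


# Result copied from the example above.
result = {
    "rows": [["Hello, world!"]],
    "columns": [{"name": "column0", "type": "String"}],
}

records = rows_to_dicts(result)
print(records)  # [{'column0': 'Hello, world!'}]
```

This relies only on the documented guarantee that every row has as many elements as there are entries in columns.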
YQL and Python type mapping
Below are the rules for converting YQL types to Python results.
Scalar types
| YQL type | Python type | Example in Python |
|---|---|---|
| Int8, Int16, Int32, Uint8, Uint16, Uint32, Int64, Uint64 | int | 647713 |
| Bool | bool | True |
| Float, Double | float (NaN and Inf are represented as None) | 7.88731023, None |
| Decimal | Decimal | 45.23410083 |
| Utf8 | str | String text |
| String | str, bytes | String text |
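Two rows of this table need care in downstream code: a String value may arrive as either str or bytes, and a Float or Double value may arrive as None. Below is a minimal sketch of normalizing both cases. The helpers as_text and as_float are hypothetical, the UTF-8 encoding is an assumption, and mapping None to NaN is a simplification, since None may also stand for an infinity.

```python
import math


def as_text(value, encoding="utf-8"):
    """Return a String value as str, decoding it if it arrived as bytes."""
    # Assumption: byte strings hold UTF-8 text; binary payloads should stay as bytes.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def as_float(value):
    """Return a Float/Double value, substituting NaN when it arrived as None."""
    # The original value (NaN or Inf) cannot be recovered from None.
    return math.nan if value is None else value


print(as_text(b"String text"))  # String text
print(as_float(7.88731023))     # 7.88731023
```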
Complex types
| YQL type | Python type | Example in Python |
|---|---|---|
| Json, JsonDocument | str (the entire node is inserted as a string) | {"a":[1,2,3]} |
| Date, Datetime, and Timestamp | datetime | 2022-02-09 |
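Since Json and JsonDocument values arrive as strings, they have to be parsed before their contents can be accessed. A minimal sketch using the standard json module, with the sample value taken from the table above:

```python
import json

# A Json value as it would appear in a result row: a plain string.
raw_value = '{"a":[1,2,3]}'

# Parse the string into regular Python objects.
parsed = json.loads(raw_value)

print(parsed["a"])  # [1, 2, 3]
```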
Optional types
| YQL type | Python type | Example in Python |
|---|---|---|
| Optional | Original type or None | 1 |
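Code that reads an Optional column should expect None alongside regular values. The sketch below substitutes a default for missing values in one column. The column_values helper and the sample data are made up for illustration, including the type string shown in columns.

```python
def column_values(result, name, default=None):
    """Return the values of one column, replacing None with a default."""
    index = [column["name"] for column in result["columns"]].index(name)
    return [default if row[index] is None else row[index] for row in result["rows"]]


# Hypothetical result with an optional integer column.
sample = {
    "rows": [[1], [None], [3]],
    "columns": [{"name": "value", "type": "Optional<Int32>"}],
}

print(column_values(sample, "value", default=0))  # [1, 0, 3]
```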
Containers
| YQL type | Python type | Example in Python |
|---|---|---|
| List<Type> | list | [1,2,3,4] |
| Dict<KeyType, ValueType> | dict | {key1: value1, key2: value2} |
| Set<KeyType> | set | set(key_value1, key_value2) |
| Tuple<Type1, Type2> | tuple | (element1, element2, ..) |
| Struct<Name:Utf8,Age:Int32> | dict | { "Name": "John", "Age": 128 } |
| Variant<Type1, Type2> with a tuple | list | list[64563, 1] |
| Variant<value:Int32,error:String> with a structure | dict | {key1: value1, key2: value2} |
Special types
| YQL type | Python type |
|---|---|
| Void, Null | None |
| EmptyList | [] |
| EmptyDict | {} |