Generating an image using YandexART
With YandexART, you can generate images in asynchronous mode. In response to an asynchronous request, the model will return an operation object containing an operation ID, which you can use to track the operation progress and get the result after the generation is completed. Generating a result in asynchronous mode can take from a few minutes up to several hours.
Getting started
To use the examples:
- Create a service account and assign it the ai.imageGeneration.user role.
  You also need to assign the ai.languageModels.user role to the service account; in the example, we will utilize a YandexGPT API model to generate a prompt for YandexART.
- Get the service account API key and save it.
  The following examples use API key authentication. Yandex Cloud ML SDK also supports IAM token and OAuth token authentication. For more information, see Authentication in Yandex Cloud ML SDK.
- Use the pip package manager to install the ML SDK library:
    pip install yandex-cloud-ml-sdk
For the cURL example:
- Get API authentication credentials as described here: Authentication with the Yandex Foundation Models API.
  To access the YandexART API, first assign the ai.imageGeneration.user role to the user or service account you will use to authenticate with the API.
- Install the utilities used in the example: cURL and jq.
Generate an image
Note
YandexART logs user prompts to generate better responses. Do not use sensitive information and personal data in your prompts.
This code includes four independent examples illustrating different uses of the SDK interface:
- Example 1: A simple request of one text description.
- Example 2: A request of two text descriptions with the result saved to a file named ./image.jpeg.
- Example 3: A request of two text descriptions with weight specified.
- Example 4: A combination of a request to a YandexGPT API model (to generate an extended prompt) and a request to a YandexART model (to generate an image based on that prompt).
The code in the example does not return an operation object but waits for the models to execute their requests and stores the result in the result variable.
- Create a file named generate-image.py and paste the following code into it:

    #!/usr/bin/env python3

    from __future__ import annotations

    import pathlib

    from yandex_cloud_ml_sdk import YCloudML

    message1 = "a red cat"
    message2 = "Miyazaki style"


    def main():
        sdk = YCloudML(
            folder_id="<folder_ID>",
            auth="<API_key>",
        )

        model = sdk.models.image_generation("yandex-art")

        # configuring model for all of future runs
        model = model.configure(width_ratio=1, height_ratio=2, seed=50)

        # Sample 1: simple run
        operation = model.run_deferred(message1)
        result = operation.wait()
        print(result)

        # Sample 2: run with several messages, saving the result to file
        path = pathlib.Path("./image.jpeg")
        try:
            operation = model.run_deferred([message1, message2])
            result = operation.wait()
            path.write_bytes(result.image_bytes)
        finally:
            # the sample cleans up after itself; remove this block to keep the file
            path.unlink(missing_ok=True)

        # Sample 3: run with several messages specifying weight
        operation = model.run_deferred([{"text": message1, "weight": 5}, message2])
        result = operation.wait()
        print(result)

        # Sample 4: example of combining YandexGPT API and YandexART models
        gpt = sdk.models.completions("yandexgpt")
        messages = gpt.run(
            [
                "you need to create a prompt for a yandexart model",
                # spaces around "in" keep the two prompt parts from running together
                "of " + message1 + " in " + message2,
            ]
        )
        print(messages)

        operation = model.run_deferred(messages)
        result = operation.wait()
        print(result)


    if __name__ == "__main__":
        main()

  Where:
  - message1: Main part of the image generation prompt.
  - message2: Clarifying part of the image generation prompt.
  - <folder_ID>: ID of the folder in which the service account was created.
  - <API_key>: Service account API key you got earlier, required for authentication in the API.
  Note
  As input data for a request, Yandex Cloud ML SDK can accept a string, a dictionary, an object of the TextMessage class, or an array containing any combination of these data types. For more information, see Yandex Cloud ML SDK usage.
- Run the created file:
    python3 generate-image.py
  Result:
    ImageGenerationModelResult(model_version='', image_bytes=<889288 bytes>)
    ImageGenerationModelResult(model_version='', image_bytes=<1062632 bytes>)
    GPTModelResult(alternatives=(Alternative(role='assistant', text='Here is an example of what a request to a YandexART model may look like:\n\n"Create an image of a red cat in Hayao Miyazaki anime style. Make the background in soft pastel shades with details emphasizing the atmosphere of magic and comfort."\n\n*Note that this is just an example and you can adapt it to your needs.*', status=<AlternativeStatus.FINAL: 3>),), usage=Usage(input_text_tokens=31, completion_tokens=76, total_tokens=107), model_version='07.03.2024')
    ImageGenerationModelResult(model_version='', image_bytes=<1180073 bytes>)
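The note on input types (a string, a dictionary, or an array mixing them) may be easier to picture with a small sketch. The helper below is hypothetical and is not the SDK's implementation: `to_messages` is an illustrative name, objects of the TextMessage class are left out, and the output shape simply mirrors the text/weight pairs used in the examples on this page.

```python
from typing import List, Union

PromptPart = Union[str, dict]


def to_messages(parts: Union[PromptPart, List[PromptPart]]) -> List[dict]:
    """Normalize a string, a dict, or a list of both into a list of message dicts."""
    if isinstance(parts, (str, dict)):
        parts = [parts]
    messages = []
    for part in parts:
        if isinstance(part, str):
            # A bare string carries only the text; no weight is assumed here.
            messages.append({"text": part})
        elif isinstance(part, dict) and "text" in part:
            # A dict is copied as-is, keeping any weight it specifies.
            messages.append(dict(part))
        else:
            raise TypeError(f"unsupported prompt part: {part!r}")
    return messages


single = to_messages("a red cat")
mixed = to_messages([{"text": "a red cat", "weight": 5}, "Miyazaki style"])
print(single)
print(mixed)
```

The mixed call mirrors Sample 3 above, where one description carries an explicit weight and the other is passed as a plain string.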
The example below is intended to be run in macOS and Linux. To run it in Windows, see how to work with Bash in Microsoft Windows.
- Create a file with the request body, e.g., prompt.json:

    {
      "modelUri": "art://<folder_ID>/yandex-art/latest",
      "generationOptions": {
        "seed": "1863",
        "aspectRatio": {
          "widthRatio": "2",
          "heightRatio": "1"
        }
      },
      "messages": [
        {
          "weight": "1",
          "text": "a pattern of pastel colored succulents of multiple varieties, hd full wallpaper, sharp focus, many intricate details, picture depth, top view"
        }
      ]
    }

  Where:
  - modelUri: YandexART model ID which contains a Yandex Cloud folder ID.
  - seed: Generation seed.
  - text: Text description of the image to use for generation.
  - weight: Text description weight. If a request contains more than one description, their individual impact will be calculated based on weight, with the sum of all weights equal to 1.
  - aspectRatio: (Optional) Aspect ratio of the generated image:
    - widthRatio: Width (default value: 1).
    - heightRatio: Height (default value: 1).
- To send a request to the neural network using the ImageGenerationAsync.generate method, run the following command:

    curl \
      --request POST \
      --header "Authorization: Bearer <IAM_token_value>" \
      --data "@prompt.json" \
      "https://llm.api.cloud.yandex.net/foundationModels/v1/imageGenerationAsync"

  Where:
  - <IAM_token_value>: Your account's IAM token.
  - prompt.json: JSON file with request parameters.
  The service will return the operation object in response:
    {"id":"fbveu1sntj**********","description":"","createdAt":null,"createdBy":"","modifiedAt":null,"done":false,"metadata":null}
  Save the operation id you get in the response.
- Generating an image may take from a few seconds up to a few hours. Wait for a while and send a request to https://llm.api.cloud.yandex.net:443/operations/<operation_ID> to get the generation result. When the image is ready, the response contains it in Base64 encoding; the command below decodes it and saves it to a file named image.jpeg.

    curl \
      --request GET \
      --header "Authorization: Bearer <IAM_token_value>" \
      https://llm.api.cloud.yandex.net:443/operations/<operation_ID> | jq -r '.response | .image' | base64 -d > image.jpeg

  Where:
  - <IAM_token_value>: IAM token you obtained when getting started.
  - <operation_ID>: id field value obtained in response to the generation prompt.
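For readers who would rather script these steps in Python than in Bash, the request and the decoding step can be sketched with the standard library alone. This is an illustration under stated assumptions, not an official client: `build_generation_request` and `decode_image` are hypothetical helper names, the URL, header, and field names are taken from the cURL commands above, and the sample operation at the end is a made-up stand-in for a real response.

```python
import base64
import json
import urllib.request
from typing import Optional

API_URL = "https://llm.api.cloud.yandex.net/foundationModels/v1/imageGenerationAsync"


def build_generation_request(iam_token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that the first cURL command makes."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Bearer {iam_token}"},
    )


def decode_image(operation: dict) -> Optional[bytes]:
    """Return the decoded image bytes, or None while the operation is not done."""
    if not operation.get("done"):
        return None
    # Same path as the jq filter above: .response | .image
    return base64.b64decode(operation["response"]["image"])


body = {
    "modelUri": "art://<folder_ID>/yandex-art/latest",
    "generationOptions": {
        "seed": "1863",
        "aspectRatio": {"widthRatio": "2", "heightRatio": "1"},
    },
    "messages": [{"weight": "1", "text": "a red cat"}],
}
request = build_generation_request("<IAM_token_value>", body)
print(request.get_method(), request.full_url)

# Made-up finished operation (assumption), used only to exercise decode_image.
fake_done = {
    "id": "example-id",
    "done": True,
    "response": {"image": base64.b64encode(b"\xff\xd8demo").decode("ascii")},
}
image_bytes = decode_image(fake_done)
```

To actually send the request, pass it to `urllib.request.urlopen(request)` and parse the JSON reply to read the operation `id`; writing `image_bytes` to image.jpeg then mirrors the `base64 -d > image.jpeg` part of the pipeline.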
See also
- YandexART overview
- Examples of working with ML SDK on GitHub