© 2025 Direct Cursus Technology L.L.C.
Recognizing text in image archives in Yandex Vision OCR

Written by Yandex Cloud. Improved by Danila N. Updated on May 7, 2025.
  • Before you begin
    • Required paid resources
  • Create a bucket
  • Create a VM
  • Configure the VM
    • Set up the Yandex Cloud CLI
    • Set up a service account
    • Set up the AWS CLI
  • Create an archive with images
  • Prepare a script for digitizing and uploading images
    • Configure the environment
    • Create a script
  • Double-check the recognition results
  • How to delete created resources

Use the Yandex Vision OCR service to recognize text in images. You can also store both the source images and recognition results in Yandex Object Storage.

To set up an infrastructure for text recognition using Vision OCR and export the results automatically to Object Storage:

  1. Prepare your cloud.
  2. Create a bucket.
  3. Create a VM.
  4. Configure the VM.
  5. Create an archive with images.
  6. Prepare a script for recognizing the images and uploading the results.
  7. Double-check the recognition results.

If you no longer need these resources, delete them.

Before you begin

Sign up in Yandex Cloud and create a billing account:

  1. Navigate to the management console and log in to Yandex Cloud or register a new account.
  2. On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.

If you have an active billing account, you can navigate to the cloud page to create or select a folder for your infrastructure to operate in.

Learn more about clouds and folders.

Required paid resources

The infrastructure costs for image recognition and data storage include:

  • A fee for VM computing resources (see Yandex Compute Cloud pricing).
  • A fee for data storage in a bucket and operations with data (see Yandex Object Storage pricing).
  • A fee for using a dynamic or a static public IP (see Yandex Virtual Private Cloud pricing).
  • A fee for using Yandex Vision OCR (see pricing for Yandex Vision OCR).

Create a bucket

To create an Object Storage bucket to store the source images and recognition results:

Management console
  1. Go to the Yandex Cloud management console and select the folder where you will perform the operations.
  2. On the folder page, click Create resource and select Bucket.
  3. In the Name field, enter the bucket name following the naming conventions, such as vision-bucket.
  4. In the Bucket access field, select Restricted.
  5. In the Storage class field, select Cold.
  6. Click Create bucket.

Create a VM

Management console
  1. In the management console, click Create resource and select Virtual machine.

  2. In the Name field, enter a name for the VM, such as vision-vm. The name must meet the following requirements:

    • It must be from 2 to 63 characters long.
    • It may contain lowercase Latin letters, numbers, and hyphens.
    • It must start with a letter and cannot end with a hyphen.
  3. Select an availability zone to place the VM in.

  4. Under Image/boot disk selection, go to the Cloud Marketplace tab and select a public CentOS 7 image.

  5. Under Disks and file storages, select the parameters:

    • Type: SSD.
    • Size: 19 GB.
  6. Under Computing resources, select:

    • Platform: Intel Cascade Lake.
    • Guaranteed vCPU share: 20%.
    • vCPU: 2.
    • RAM: 2 GB.
  7. Under Network settings, select the network and subnet to connect the VM to. If there aren't any networks, create one:

    1. Select Create network.

    2. In the window that opens, enter the network name and the folder to host the network.

    3. (optional) To automatically create subnets, select the Create subnets option.

    4. Click Create.

      Each network must have at least one subnet. If there is no subnet, create one by selecting Add subnet.

  8. In the Public address field, keep Auto to assign your VM a random external IP address from the Yandex Cloud pool, or select a static address from the list if you reserved one in advance.

  9. Enter the VM access information:

    • Enter the username in the Login field.

    • In the SSH key field, paste the contents of the public key file.

      You need to create the key pair for the SSH connection yourself. For details, see Creating an SSH key pair.

  10. Click Create VM.

  11. Wait for the VM status to change to Running and save its public IP address: you will need it to connect over SSH.
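
The naming requirements from step 2 can be checked before you type a name into the console. The sketch below is only an illustration: is_valid_vm_name is a hypothetical helper, not part of Yandex Cloud or its CLI, and it encodes only the three rules listed above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a Yandex Cloud tool):
# returns 0 if the name satisfies the VM naming rules listed in step 2.
is_valid_vm_name() (
    LC_ALL=C            # make the a-z range match ASCII lowercase letters only
    name=$1
    # Rule 1: from 2 to 63 characters long
    [ "${#name}" -ge 2 ] && [ "${#name}" -le 63 ] || return 1
    case $name in
        *[!a-z0-9-]*) return 1 ;;   # Rule 2: lowercase Latin letters, numbers, hyphens
        [!a-z]*)      return 1 ;;   # Rule 3a: must start with a letter
        *-)           return 1 ;;   # Rule 3b: cannot end with a hyphen
    esac
    return 0
)

# Usage:
#   is_valid_vm_name vision-vm && echo "vision-vm is a valid name"
```

The function body runs in a subshell so that the locale setting does not leak into the calling shell.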

Configure the VM

Set up the Yandex Cloud CLI

  1. Connect to the VM via SSH.

  2. Install the Yandex Cloud CLI and create a profile.

  3. Make sure that the Yandex Cloud CLI runs correctly:

    CLI

    Run the following command on the VM:

    yc config list
    

    Result:

    token: AQ...gs
    cloud-id: b1gdtdqb1900f5rqqvli
    folder-id: b1gveg9vude9g3uioa50
    

    Save the folder-id parameter: you'll need it to set up a service account.
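
Instead of copying the folder ID by hand, you could read it from the command output. This is a minimal sketch that assumes the output has the same "key: value" layout as the sample result above; extract_folder_id is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the value of the folder-id line from
# `yc config list` output supplied on standard input.
extract_folder_id() {
    awk '$1 == "folder-id:" { print $2; exit }'
}

# Usage on the VM (assumes the yc profile is already configured):
#   FOLDERID=$(yc config list | extract_folder_id)
#   echo "$FOLDERID"
```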

Set up a service account

CLI
  1. Create a service account:

    yc iam service-account create \
      --name <service_account_name> \
      --description "<service_account_description>"
    

    Where:

    • --name is the service account name, such as vision-sa.
    • --description is a description of the service account, for example, this is the vision service account.

    Result:

    id: aje6aoc8hccuh5tp55bg
    folder_id: b1gv87ssvu497lpgjh5o
    created_at: "2022-10-12T14:04:43.198559512Z"
    name: vision-sa
    description: this is vision service account
    

    Save the id parameter: this is the service account ID you'll need in the setup process.

  2. Assign the editor role to the service account:

    yc resource-manager folder add-access-binding <folder_id> \
      --role editor \
      --subject serviceAccount:<service_account_ID>
    

    Where:

    • --role: The role to assign.
    • --subject: The service account ID, prefixed with serviceAccount:.
  3. Create a static access key for the service account:

    yc iam access-key create \
      --service-account-id <service_account_ID> \
      --description "<key_description>"
    

    Where:

    • --service-account-id: Service account ID.
    • --description: A description for the key, for example, this key is for vision.

    Result:

    access_key:
      id: ajen8d7fur27bt8losom
      service_account_id: aje6aoc8hccuh5tp55bg
      created_at: "2022-10-12T15:08:08.045280520Z"
      description: this key is for vision
      key_id: YC...li
    secret: YC...J5
    

    Save the following parameters (you'll need them to set up the AWS CLI utility):

    • key_id: The ID of the static access key.
    • secret: The secret key.
  4. Create an authorized key for the service account:

    yc iam key create \
      --service-account-id <service_account_ID> \
      --output key.json
    

    Where:

    • --service-account-id: Service account ID.
    • --output: The name of the JSON file to save the authorized key to.

    Result:

    id: aje3qc9pagb9kedkhdn5
    service_account_id: aje6aoc8hccuh5tp55bg
    created_at: "2022-10-13T12:53:04.810240976Z"
    key_algorithm: RSA_2048
    
  5. Create a Yandex Cloud CLI profile to run on behalf of the service account, such as vision-profile:

    yc config profile create vision-profile
    

    Result:

    Profile 'vision-profile' created and activated
    
  6. Specify the authorized key of the service account in the profile configuration:

    yc config set service-account-key key.json
    
  7. Get an IAM token for the service account:

    yc iam create-token
    

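    Save the IAM token: you'll need it to send images to Vision OCR. An IAM token is valid for a limited time (up to 12 hours), so issue a new one if requests start failing with an authorization error.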
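
If you redirect the output of the commands above to a file, the values you are asked to save (id, key_id, secret) can be pulled out with a small filter. This is a sketch under the assumption that the output looks like the sample results shown above; yaml_field is a hypothetical helper and access-key.yaml is a hypothetical file name. It is not a general YAML parser.

```shell
#!/bin/sh
# Hypothetical helper: prints the value of the first "<key>: <value>" line
# found on standard input, ignoring indentation and surrounding quotes.
yaml_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)             # drop indentation
            if (index(line, key ":") == 1) {     # line starts with "<key>:"
                sub(/^[^:]*:[ \t]*/, "", line)   # drop the key and the colon
                gsub(/^"|"$/, "", line)          # drop surrounding double quotes
                print line
                exit
            }
        }'
}

# Usage (assumes the key was created as in step 3):
#   yc iam access-key create --service-account-id <service_account_ID> > access-key.yaml
#   KEY_ID=$(yaml_field key_id < access-key.yaml)
#   SECRET=$(yaml_field secret < access-key.yaml)
```

The file holds the secret key in plain text, so delete it once the AWS CLI is configured.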

Set up the AWS CLI

  1. Update the packages installed on the VM:

    sudo yum update -y
    
  2. Install the AWS CLI:

    sudo yum install awscli -y
    
  3. Set up the AWS CLI:

    aws configure
    

    Specify the parameter values:

    • AWS Access Key ID: The key_id value of the static access key that you created when setting up the service account.
    • AWS Secret Access Key: The secret key that you generated when setting up the service account.
    • Default region name: ru-central1.
    • Default output format: json.
  4. Make sure that the ~/.aws/credentials file contains the key_id and secret values, stored as aws_access_key_id and aws_secret_access_key:

    cat ~/.aws/credentials
    
  5. Make sure that the ~/.aws/config file contains the values you specified for Default region name and Default output format, stored as region and output:

    cat ~/.aws/config
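
For reference, the sketch below prints what the two files are typically expected to contain after aws configure, so you have something to compare the cat output against. The helper names render_aws_credentials and render_aws_config are hypothetical, and the exact layout written by your AWS CLI version may differ slightly (for example, in key order).

```shell
#!/bin/sh
# Hypothetical helper: prints the expected contents of ~/.aws/credentials
# for the given key ID ($1) and secret key ($2).
render_aws_credentials() {
    printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' "$1" "$2"
}

# Hypothetical helper: prints the expected contents of ~/.aws/config
# for the given region ($1) and output format ($2).
render_aws_config() {
    printf '[default]\nregion = %s\noutput = %s\n' "$1" "$2"
}

# Usage:
#   render_aws_config ru-central1 json | diff - ~/.aws/config
```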
    

Create an archive with images

  1. Upload images that contain text to the bucket.

    Tip

    You can use the sample image of a penguin crossing road sign.

  2. To make sure that the images were uploaded, list the bucket contents:

    aws --endpoint-url=https://storage.yandexcloud.net s3 ls s3://<bucket_name>/
    
  3. Save the images from the bucket to the VM, for example, to the my_pictures folder:

    aws --endpoint-url=https://storage.yandexcloud.net s3 cp s3://<bucket_name>/ my_pictures --recursive
    
  4. Compress the images into an archive, for example, my_pictures.tar:

    tar -cf my_pictures.tar my_pictures/*
    
  5. Delete the image directory:

    rm -rfd my_pictures
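
Because step 5 deletes the only local copy of the images, it may be worth checking the archive between steps 4 and 5. The sketch below compares the number of files in the directory with the number of file entries in the archive; archive_matches_dir is a hypothetical helper, and the check only counts files, it does not compare their contents.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the directory ($1) and the tar archive ($2)
# contain the same number of regular files.
archive_matches_dir() {
    # Count regular files in the directory
    dir_count=$(find "$1" -type f | wc -l | tr -d ' ')
    # Count archive entries that are not directories (directory entries end with /)
    tar_count=$(tar -tf "$2" | grep -v '/$' | wc -l | tr -d ' ')
    [ "$dir_count" -eq "$tar_count" ]
}

# Usage, before deleting the image directory:
#   archive_matches_dir my_pictures my_pictures.tar && rm -rf my_pictures
```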
    

Prepare a script for digitizing and uploading images

Configure the environment

  1. Install the jq package. The script will use it to process the results from Vision OCR:

    sudo yum install jq -y
    
  2. Install the text editor nano:

    sudo yum install nano -y
    
  3. Set the environment variables necessary for the script to run:

    export BUCKETNAME="<bucket_name>"
    export FOLDERID="<folder_id>"
    export IAMTOKEN="<IAM_token>"
    

    Where:

    • BUCKETNAME: The bucket name.
    • FOLDERID: The folder ID.
    • IAMTOKEN: The IAM token that you issued when setting up the service account.
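
The script reads these variables without checking them, so an unset variable only shows up later as a failed request or upload. A minimal sketch of a pre-flight check is shown below; require_env is a hypothetical helper, not something the tutorial's script defines.

```shell
#!/bin/sh
# Hypothetical helper: prints every listed variable that is unset or empty
# and returns a non-zero status if at least one is missing.
require_env() {
    missing=0
    for var in "$@"; do
        eval "value=\${$var:-}"      # read the variable whose name is in $var
        if [ -z "$value" ]; then
            echo "Missing environment variable: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, for example at the top of the script:
#   require_env BUCKETNAME FOLDERID IAMTOKEN || exit 1
```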

Create a script

The script includes the following steps:

  1. Create the relevant directories.
  2. Unpack the archive with images.
  3. Process all the images one-by-one:
    1. Base64-encode the image.
    2. Create a request body for the given image.
    3. Send the image in a POST request to Vision OCR for recognition.
    4. Save the result to the output.json file.
    5. Extract the recognized text from output.json and save it to a text file.
  4. Add the resulting text files to an archive.
  5. Upload the archive with the text files to Object Storage.
  6. Delete the auxiliary files.

The script includes comments for each step.

To implement the script:

  1. Create a file, for example, vision.sh, and open it in the nano text editor:

    sudo nano vision.sh
    
  2. Copy the script text to vision.sh:

    #!/bin/bash
    
    # Create a directory for the recognized text
    echo "Creating directories..."
    mkdir -p my_pictures_text
    
    # Unpack the archive; the images are extracted to the my_pictures directory
    echo "Extracting pictures to the my_pictures directory..."
    tar -xf my_pictures.tar
    
    # Recognize the images from the archive:
    # loop through the files in the directory and process them one by one
    for f in my_pictures/*
    do
        # Base64-encode the image to send it to Vision OCR.
        # -w 0 disables line wrapping (GNU coreutils), so the encoded string
        # contains no newlines and body.json remains valid JSON
        CODEIMG=$(base64 -w 0 "$f")
    
        # Create the body.json file to be sent in a POST request to Vision OCR
        cat <<EOF > body.json
    {
    "folderId": "$FOLDERID",
    "analyze_specs": [{
    "content": "$CODEIMG",
    "features": [{
    "type": "TEXT_DETECTION",
    "text_detection_config": {
    "language_codes": ["en","ru"]
    }
    }]
    }]
    }
    EOF
        # Send the image to Vision OCR for recognition and write the result to output.json
        echo "Processing file $f in Vision OCR..."
        curl -X POST --silent \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer ${IAMTOKEN}" \
        -d '@body.json' \
        https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze > output.json
    
        # Get the image file name to be used below
        IMAGE_BASE_NAME=$(basename -- "$f")
        IMAGE_NAME="${IMAGE_BASE_NAME%.*}"
    
        # Extract the recognized words from output.json and write them, separated by spaces, to a TXT file named after the image file
        jq -r '.results[].results[].textDetection.pages[].blocks[].lines[].words[].text' output.json | awk -v ORS=" " '{print}' > "my_pictures_text/${IMAGE_NAME}.txt"
    done
    
    # Add the directory with the text files to an archive
    echo "Packing text files to archive..."
    tar -cf my_pictures_text.tar my_pictures_text
    
    # Send the text file archive to the bucket
    echo "Sending archive to Object Storage Bucket..."
    aws --endpoint-url=https://storage.yandexcloud.net s3 cp my_pictures_text.tar s3://$BUCKETNAME/ > /dev/null
    
    # Delete the auxiliary files
    echo "Cleaning up..."
    rm -f body.json output.json my_pictures_text.tar
    rm -rf my_pictures my_pictures_text
    
  3. Set the permissions to run the script:

    sudo chmod 755 vision.sh
    
  4. Run the script:

    ./vision.sh
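
The script names each text file after its image by taking the base name and stripping the last extension. The sketch below isolates that step so you can see how a given image path maps to an output file; text_file_for is a hypothetical helper written for illustration, using the same basename and parameter expansion as the script.

```shell
#!/bin/sh
# Hypothetical helper: prints the text file path the script would use
# for the image path given in $1.
text_file_for() {
    base=$(basename -- "$1")                            # drop the directory part
    printf 'my_pictures_text/%s.txt\n' "${base%.*}"     # drop the last extension only
}

# Usage:
#   text_file_for "my_pictures/road sign.jpg"
```

One consequence of this naming scheme is that two images that differ only in extension, such as sign.jpg and sign.png, map to the same text file, so the later one overwrites the earlier one.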
    

Double-check the recognition results

Management console
  1. In the Yandex Cloud management console, select the folder where the bucket with the recognition results is located.
  2. Select Object Storage.
  3. Open the bucket with the recognition results.
  4. Make sure that the bucket contains the my_pictures_text.tar archive.
  5. Download and unpack the archive.
  6. Make sure that the text in the <image name>.txt file matches the text in the image.
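
The script joins the recognized words with single spaces and leaves a trailing space, so a direct comparison with text you typed by hand can fail on whitespace alone. A possible way around this, sketched below, is to normalize whitespace on both sides first; normalize_text is a hypothetical helper, and the comparison stays case-sensitive.

```shell
#!/bin/sh
# Hypothetical helper: collapses every run of whitespace (including newlines)
# on standard input into one space and trims the ends.
normalize_text() {
    tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//'
}

# Usage (replace the placeholders with your own file and text):
#   recognized=$(normalize_text < "my_pictures_text/<image name>.txt")
#   expected=$(printf '%s' "<text you expect>" | normalize_text)
#   [ "$recognized" = "$expected" ] && echo "match" || echo "mismatch"
```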

How to delete created resources

To stop paying for the resources you created:

  1. Delete all the objects from the bucket.
  2. Delete the respective bucket.
  3. Delete the VM.
  4. Delete the static public IP if you reserved one.
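
The first three steps can also be done from the command line. The sketch below only prints the commands so that you can review them before running anything; cleanup_commands is a hypothetical helper, and it assumes the AWS CLI and Yandex Cloud CLI are configured as described above and that the active profile is allowed to delete the VM. Releasing a reserved static IP is left to the management console.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands that delete
# the objects, the bucket ($1), and the VM ($2).
cleanup_commands() {
    bucket=$1
    vm=$2
    echo "aws --endpoint-url=https://storage.yandexcloud.net s3 rm s3://$bucket --recursive"
    echo "aws --endpoint-url=https://storage.yandexcloud.net s3 rb s3://$bucket"
    echo "yc compute instance delete $vm"
}

# Usage: review the printed commands, then run them one by one.
#   cleanup_commands vision-bucket vision-vm
```

Running the VM deletion from the VM itself ends your SSH session, so run that command last or from another machine.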
