© 2025 Direct Cursus Technology L.L.C.
Yandex Foundation Models

In this article:

  • Getting started
  • Run the search

Using embeddings to search through the knowledge base

Written by
Yandex Cloud
Updated at April 11, 2025

Embeddings make it possible to search a knowledge base for the most relevant answer to your question.

Getting started

To use the examples:

SDK

  1. Create a service account and assign the ai.languageModels.user role to it.

  2. Get the service account API key and save it.

    The following examples use API key authentication. Yandex Cloud ML SDK also supports IAM token and OAuth token authentication. For more information, see Authentication in Yandex Cloud ML SDK.

  3. Use the pip package manager to install the ML SDK library:

    pip install yandex-cloud-ml-sdk

Python 3

  1. Create a service account and assign the ai.languageModels.user role to it.
  2. Get an IAM token for your service account.
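
Whichever credential you saved in the steps above, one way to keep it out of the example scripts is to read it from an environment variable. This is only a sketch: the variable names `YC_FOLDER_ID` and `YC_API_KEY` and the helper `load_setting` are assumptions made for illustration, not names defined by Yandex Cloud or ML SDK.

```python
import os


def load_setting(name, env=os.environ):
    """Return the value of environment variable `name`, or fail with a clear message."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (hypothetical variable names, set them in your shell first):
# folder_id = load_setting("YC_FOLDER_ID")
# api_key = load_setting("YC_API_KEY")
```

The returned values can then be passed wherever the examples use the <folder_ID>, <API_key>, or <IAM_token> placeholders.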

Run the search

In the example, the doc_texts array holds the source texts to vectorize, i.e., the knowledge base, and the query_text variable holds the search query. Once you have the embeddings, calculate the cosine distance between the query vector and each vector in the knowledge base. The text with the smallest distance (that is, the highest cosine similarity) is the one most closely related to the query.
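
The distance step can be tried without any API calls. The following is a minimal, dependency-free sketch that uses made-up three-dimensional vectors in place of real embeddings (real embedding vectors are much longer). The helpers `cosine_similarity` and `most_similar` are hypothetical and not part of ML SDK. The similarity computed here equals 1 minus the cosine distance.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query_vec, doc_vecs):
    """Index of the document vector with the highest similarity to the query."""
    sims = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy vectors standing in for real embeddings (assumed values).
toy_docs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9]]
toy_query = [0.8, 0.2, 0.1]

best = most_similar(toy_query, toy_docs)
print(best)  # prints 0: the first toy vector points in almost the same direction as the query
```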

Note

As input data for a request, Yandex Cloud ML SDK can accept a string, a dictionary, an object of the TextMessage class, or an array containing any combination of these data types. For more information, see Yandex Cloud ML SDK usage.

SDK
  1. Create a file named search-knowledge-base.py and paste the following code into it:

    #!/usr/bin/env python3
    # pylint: disable=import-outside-toplevel
    
    from __future__ import annotations
    from yandex_cloud_ml_sdk import YCloudML
    
    doc_texts = [
        """Alexander Sergeyevich Pushkin (May 26 [June 6], 1799, Moscow – January 29 [February 10], 1837, St. Petersburg)
        was a Russian poet, playwright, and novelist, the progenitor of Russian realism,
        a literary critic and theorist, historian, essay writer, and journalist.""",
        """Matricaria is a genus of annual flowering plants of the aster (composite) family. According to today's classification, it includes around 70 species of low-rise fragrant herbs that blossom from the first year of life.""",
    ]
    query_text = "when is Pushkin's birthday?"
    
    
    def main():
        import numpy as np
        from scipy.spatial.distance import cdist
    
        sdk = YCloudML(
            folder_id="<folder_ID>",
            auth="<API_key>",
        )
    
        query_model = sdk.models.text_embeddings("query")
        query_embedding = query_model.run(query_text)
    
        doc_model = sdk.models.text_embeddings("doc")
        doc_embeddings = [doc_model.run(text) for text in doc_texts]
    
        query_embedding = np.array(query_embedding)
    
        dist = cdist([query_embedding], doc_embeddings, metric="cosine")
        sim = 1 - dist
        result = doc_texts[np.argmax(sim)]
        print(result)
    
    
    if __name__ == "__main__":
        main()
    

    Where:

    • <folder_ID>: ID of the folder in which the service account was created.

    • <API_key>: API key of the service account that you got earlier. It is required for authentication in the API.


    For more information about accessing text vectorization models, see Accessing models.

  2. Run the created file:

    python3 search-knowledge-base.py
    

    Result:

    Alexander Sergeyevich Pushkin (May 26 [June 6], 1799, Moscow – January 29 [February 10], 1837, St. Petersburg)
        was a Russian poet, playwright, and novelist, the progenitor of Russian realism,
        a literary critic and theorist, historian, essay writer, and journalist.
    
Python 3

  1. Create a file named search-knowledge-base.py and paste the following code into it:

    import requests
    import numpy as np
    from scipy.spatial.distance import cdist
    
    FOLDER_ID = "<folder_ID>"
    IAM_TOKEN = "<IAM_token>"
    doc_uri = f"emb://{FOLDER_ID}/text-search-doc/latest"
    query_uri = f"emb://{FOLDER_ID}/text-search-query/latest"
    embed_url = "https://llm.api.cloud.yandex.net:443/foundationModels/v1/textEmbedding"
    headers = {"Content-Type": "application/json", "Authorization": f"Bearer {IAM_TOKEN}", "x-folder-id": f"{FOLDER_ID}"}
    doc_texts = [
      """Alexander Sergeyevich Pushkin (May 26 [June 6], 1799, Moscow — January 29 [February 10], 1837, St. Petersburg) was a Russian poet, playwright, and novelist, the progenitor of Russian realism, a literary critic and theorist, historian, essay writer, and journalist.""",
      """Matricaria is a genus of annual flowering plants of the aster (composite) family. According to today's classification, it includes around 70 species of low-rise fragrant herbs that blossom from the first year of life."""
    ]
    query_text = "when is Pushkin's birthday?"
    
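    def get_embedding(text: str, text_type: str = "doc") -> np.ndarray:
        """Return the embedding of `text` from the doc or the query model."""
        query_data = {
            "modelUri": doc_uri if text_type == "doc" else query_uri,
            "text": text,
        }
    
        response = requests.post(embed_url, json=query_data, headers=headers)
        # Fail early on an HTTP error instead of a KeyError on "embedding"
        response.raise_for_status()
        return np.array(response.json()["embedding"])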
    
    query_embedding = get_embedding(query_text, text_type="query")
    docs_embedding = [get_embedding(doc_text) for doc_text in doc_texts]
    
    # Calculating cosine distance
    dist = cdist(query_embedding[None, :], docs_embedding, metric="cosine")
    
    # Calculating cosine similarity
    sim = 1 - dist
    
    # most similar doc text
    print(doc_texts[np.argmax(sim)])
    

    Where:

    • <folder_ID>: Yandex Cloud folder ID.
    • <IAM_token>: IAM token of the service account that you got earlier. It is required for authentication in the API.
  2. Run the created file:

    python3 search-knowledge-base.py
    

    Result:

    Alexander Sergeyevich Pushkin (May 26 [June 6], 1799, Moscow — January 29 [February 10], 1837, St. Petersburg) was a Russian poet, playwright, and novelist, the progenitor of Russian realism, a literary critic and theorist, historian, essay writer, and journalist.
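
Both examples return only the single closest text. If you need several candidates, or want to return nothing when no text is close enough, the similarity scores can be ranked instead. The sketch below assumes the similarities have already been computed (for example, the `sim` array from the scripts above flattened into a list). The helper `top_matches` and its `min_score` cutoff are hypothetical, and a suitable cutoff depends on your data.

```python
def top_matches(texts, scores, k=3, min_score=0.0):
    """Return up to k (score, text) pairs with score >= min_score, best first."""
    ranked = sorted(zip(scores, texts), key=lambda pair: pair[0], reverse=True)
    return [(score, text) for score, text in ranked[:k] if score >= min_score]


# Made-up texts and scores for illustration only.
sample_texts = ["text about Pushkin", "text about Matricaria", "text about rivers"]
sample_scores = [0.62, 0.31, 0.45]

for score, text in top_matches(sample_texts, sample_scores, k=2, min_score=0.4):
    print(f"{score:.2f}  {text}")
```

With the made-up scores above, the loop prints the Pushkin text first and the rivers text second; the Matricaria text falls below the cutoff.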
    

See also

  • Text vectorization
  • Examples of working with ML SDK on GitHub
