© 2025 Direct Cursus Technology L.L.C.

In this article:

  • Getting started
  • Required paid resources
  • Prepare the infrastructure
  • Create a folder
  • Create a service account for the DataSphere project
  • Add the service account to a project
  • Prepare notebooks and your neural network's architecture
  • Train a neural network
  • Upload the model architecture and weights
  • Create a Cloud Functions function
  • Create a function version
  • How to delete the resources you created

Running computations in Yandex DataSphere using the API

Written by
Yandex Cloud
Updated at March 6, 2025

In Yandex DataSphere, you can run code via the API without opening your project. This is useful when you need to automate routine operations, further train a neural network, or deploy a service that does not need to respond quickly via the API.

Using a simple convolutional neural network (CNN) as an example, this tutorial describes how to deploy a model trained in DataSphere with Yandex Cloud Functions. The model's output will be saved to DataSphere project storage.

For information on how to deploy a service returning results via the API, see Deploying a service based on a Docker image with FastAPI.

  1. Prepare your infrastructure.
  2. Prepare notebooks.
  3. Train a neural network.
  4. Upload the model architecture and weights.
  5. Create a Cloud Functions function.

If you no longer need the resources you created, delete them.

Getting started

Before getting started, register in Yandex Cloud, set up a community, and link your billing account to it.

  1. On the DataSphere home page, click Try for free and select an account to log in with: Yandex ID or your working account with the identity federation (SSO).
  2. Select the organization (managed in Yandex Cloud Organization) that you are going to use in Yandex Cloud.
  3. Create a community.
  4. Link your billing account to the DataSphere community you are going to work in. Make sure you have a linked billing account and its status is ACTIVE or TRIAL_ACTIVE. If you do not have a billing account yet, create one in the DataSphere interface.

Required paid resources

The cost of running computations via the API includes:

  • Fee for DataSphere computing resource usage.
  • Fee for the number of Cloud Functions function calls.

Prepare the infrastructure

Log in to the Yandex Cloud management console and select the organization you use to access DataSphere. On the Yandex Cloud Billing page, make sure you have a billing account linked.

If you have an active billing account, go to the cloud page and create or select a folder to deploy your infrastructure in.

Note

If you use an identity federation to access Yandex Cloud, billing details might be unavailable to you. In this case, contact your Yandex Cloud organization administrator.

Create a folder

Management console
  1. In the management console, select a cloud and click Create folder.
  2. Give your folder a name, e.g., data-folder.
  3. Click Create.

Create a service account for the DataSphere project

To access a DataSphere project from a Cloud Functions function, you need a service account with the datasphere.community-projects.editor role.

Management console
  1. In the management console, go to data-folder.
  2. In the list of services, select Identity and Access Management.
  3. Click Create service account.
  4. Enter a name for the service account, e.g., datasphere-sa.
  5. Click Add role and assign the service account the datasphere.community-projects.editor role.
  6. Click Create.

Add the service account to a project

To enable the service account to run a DataSphere project, add it to the list of project members:

  1. Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.

  2. In the Members tab, click Add member.
  3. Select the datasphere-sa account and click Add.

Prepare notebooks and your neural network's architecture

Clone the Git repository containing the notebooks with examples of training and testing the ML model:

  1. In the top menu, click Git and select Clone.
  2. In the window that opens, enter https://github.com/yandex-cloud-examples/yc-datasphere-batch-execution.git and click Clone.

Wait until cloning is complete. It may take some time. You will see the cloned repository folder in the File Browser section.

The repository contains two notebooks and the neural network architecture:

  • train_classifier.ipynb: Notebook for downloading a training sample of the CIFAR10 dataset and training a simple neural network.

  • test_classifier.ipynb: Notebook for testing the model.

  • my_nn_model.py: Neural network architecture. The network takes three-channel (RGB) images as input for classification. It contains two convolutional layers, each followed by a max-pooling layer, and three linear layers:

    import torch.nn as nn
    import torch.nn.functional as F
    import torch
    
    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)
    
        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = torch.flatten(x, 1) # flatten all dimensions except batch
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
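
The input size of the first linear layer, 16 * 5 * 5, follows from how the convolution and pooling layers shrink a 32x32 CIFAR10 image. The sketch below traces one spatial dimension through the layers in plain Python. The helper names conv_out and trace_feature_map are hypothetical and are not part of the repository; the formula assumes no padding and no dilation, which matches the layer arguments above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_feature_map(size=32):
    """Follow one spatial dimension of a CIFAR10 image through Net.forward."""
    size = conv_out(size, kernel=5)            # conv1: 32 -> 28
    size = conv_out(size, kernel=2, stride=2)  # pool:  28 -> 14
    size = conv_out(size, kernel=5)            # conv2: 14 -> 10
    size = conv_out(size, kernel=2, stride=2)  # pool:  10 -> 5
    return size


side = trace_feature_map()
flattened = 16 * side * side  # conv2 produces 16 output channels
print(side, flattened)  # 5 400
```

If you change the kernel sizes or the input resolution, the first argument of fc1 has to change accordingly.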
    

Train a neural network

In the train_classifier.ipynb notebook, you will download the CIFAR10 training sample and train a simple neural network. The trained model's weights will be saved to the project storage in a file named cifar_net.pth.

  1. Open the DataSphere project:

    1. Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.

    2. Click Open project in JupyterLab and wait for the loading to complete.
    3. Open the notebook tab.
  2. Import the libraries required to train the model:

    import torch
    import torchvision
    import torchvision.transforms as transforms
    import torch.optim as optim
    from my_nn_model import Net
    
  3. Download the CIFAR10 dataset to train the model. The dataset images belong to 10 classes:

    transform = transforms.Compose(
        [transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    
    batch_size = 4
    
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                            shuffle=True, num_workers=2)
    
    classes = ('plane', 'car', 'bird', 'cat',
            'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
    
  4. Output sample images from the dataset:

    import matplotlib.pyplot as plt
    import numpy as np
    
    def imshow(img):
        img = img / 2 + 0.5 # unnormalize
        npimg = img.numpy()
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
        plt.show()
    
    dataiter = iter(trainloader)
    images, labels = next(dataiter)
    imshow(torchvision.utils.make_grid(images))
    print(' '.join(f'{classes[labels[j]]:5s}' for j in range(batch_size)))
    
  5. Create a loss function and an optimizer required to train the neural network:

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    net = Net()
    net.to(device)
    
    criterion = torch.nn.CrossEntropyLoss()  # nn is not imported above, so use torch.nn
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
    
  6. Train the model for five epochs:

    for epoch in range(5):
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
    
            optimizer.zero_grad()
    
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
    
            running_loss += loss.item()
            if i % 2000 == 1999:
                print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
                running_loss = 0.0
    
    print('Finished Training')
    
  7. Save the resulting model to the project disk:

    torch.save(net.state_dict(), './cifar_net.pth')
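
The transform in step 3 and the imshow helper in step 4 are inverses of each other. As a minimal sketch, the plain-Python functions below reproduce the per-channel arithmetic: ToTensor scales pixels to [0, 1], Normalize with mean 0.5 and standard deviation 0.5 maps them to [-1, 1], and img / 2 + 0.5 maps them back. The function names normalize and unnormalize are hypothetical and only illustrate the formulas.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel formula applied by transforms.Normalize: (value - mean) / std."""
    return (value - mean) / std


def unnormalize(value):
    """Inverse used by imshow() in step 4: value / 2 + 0.5."""
    return value / 2 + 0.5


pixels = [0.0, 0.25, 0.5, 1.0]  # sample pixel values after ToTensor()
normalized = [normalize(p) for p in pixels]
restored = [unnormalize(n) for n in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```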
    

Upload the model architecture and weights

In the test_classifier.ipynb notebook, you will load the model architecture and the weights created by running the train_classifier.ipynb notebook. The loaded model is used to make predictions on the test sample. The prediction results are saved to a file named test_predictions.csv.

  1. Open the DataSphere project:

    1. Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.

    2. Click Open project in JupyterLab and wait for the loading to complete.
    3. Open the notebook tab.
  2. Import the libraries required to run the model and make predictions:

    import torch
    import torchvision
    import torchvision.transforms as transforms
    from my_nn_model import Net
    import pandas as pd
    
  3. Prepare the objects that will enable you to access the test sample:

    transform = transforms.Compose(
        [transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    
    batch_size = 4
    
    testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                        download=True, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                            shuffle=False, num_workers=2)
    
    classes = ('plane', 'car', 'bird', 'cat',
            'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
    
  4. Select the device to run the model on, CPU or GPU:

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    
  5. Load the trained model's weights and make predictions based on the test sample:

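    net = Net()
    net.to(device)
    # map_location lets weights saved on a GPU load on a CPU-only configuration
    net.load_state_dict(torch.load('./cifar_net.pth', map_location=device))
    
    predictions = []
    predicted_labels = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data[0].to(device), data[1].to(device)
            outputs = net(images)
            # the index of the highest score in each row is the predicted class
            _, predicted = torch.max(outputs.data, 1)
            predictions.append(predicted.tolist())
            # iterate over the actual batch, which may be shorter than batch_size
            predicted_labels.append([classes[idx] for idx in predicted.tolist()])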
    
  6. Save the predictions in pandas.DataFrame format:

    final_pred = pd.DataFrame({'class_idx': [item for sublist in predictions for item in sublist],
                               'class': [item for sublist in predicted_labels for item in sublist]})
    
  7. Save the model predictions to a file:

    final_pred.to_csv('/home/jupyter/datasphere/project/test_predictions.csv')
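
The prediction loop in step 5 picks the index of the highest score for each image and then flattens the per-batch lists into two columns. The sketch below shows the same logic in plain Python on made-up scores, so it can be followed without PyTorch. The names argmax and predict_batches and the sample scores are hypothetical and used for illustration only.

```python
CLASSES = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def argmax(scores):
    """Index of the largest score (the indices part of torch.max(outputs, 1))."""
    return max(range(len(scores)), key=scores.__getitem__)


def predict_batches(batches):
    """Turn batches of per-class scores into flat lists of indices and labels."""
    class_idx, class_names = [], []
    for batch in batches:
        for scores in batch:  # iterate over actual rows, so a short last batch is fine
            idx = argmax(scores)
            class_idx.append(idx)
            class_names.append(CLASSES[idx])
    return class_idx, class_names


# Two made-up batches with 10 scores per image; the second batch is shorter.
batches = [
    [[0.1] * 3 + [2.0] + [0.1] * 6, [1.5] + [0.0] * 9],
    [[0.0] * 8 + [0.9, 0.3]],
]
class_idx, class_names = predict_batches(batches)
print(class_idx)    # [3, 0, 8]
print(class_names)  # ['cat', 'plane', 'ship']
```

The two flat lists correspond to the class_idx and class columns of the DataFrame built in step 6.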
    

Create a Cloud Functions function

To run cells without opening JupyterLab, you need a Cloud Functions function that triggers notebook computations via the API.

Management console
  1. In the management console, select the folder where you want to create a function.
  2. Select Cloud Functions.
  3. Click Create function.
  4. Enter a name for the function, e.g., ai-function.
  5. Click Create function.

Create a function version

Versions contain the function code, run parameters, and all required dependencies.

Management console
  1. In the management console, select the folder containing the function.

  2. Select Cloud Functions.

  3. Select the function to create a version of.

  4. Under Last version, click Create in editor.

  5. Select the Python runtime environment. Do not select the Add files with code examples option.

  6. Choose the Code editor method.

  7. Click Create file and specify a file name, e.g., index.

  8. Enter the function code, replacing <project_ID> with your project ID and setting notebookId to the absolute path to the project notebook:

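    import requests
    
    def handler(event, context):
        # Replace <project_ID> with the ID of your DataSphere project
        url = 'https://datasphere.api.cloud.yandex.net/datasphere/v2/projects/<project_ID>:execute'
        # Absolute path to the notebook to run
        body = {"notebookId": "/home/jupyter/datasphere/project/test_classifier.ipynb"}
        # Token of the service account specified for the function version
        headers = {"Content-Type": "application/json",
                   "Authorization": "Bearer {}".format(context.token['access_token'])}
        resp = requests.post(url, json=body, headers=headers)
    
        # Return the API response as the function result
        return {
            'body': resp.json(),
        }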
    

    Where:

    • <project_ID>: ID of the DataSphere project displayed on the project page under its name.
    • notebookId: Absolute path to the project notebook.
  9. Under Parameters, set the version parameters:

    • Entry point: index.handler.
    • Service account: datasphere-sa.
  10. In the top-right corner, click Save changes.
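
If you plan to run more than one notebook or project, it may help to separate building the request from sending it. The sketch below is one possible way to do that. The helper build_execute_request and the constant API_URL are hypothetical names and are not part of any Yandex Cloud SDK; the URL, header, and body layout are taken from the function code in step 8.

```python
# Hypothetical constant: URL template copied from the handler in step 8.
API_URL = 'https://datasphere.api.cloud.yandex.net/datasphere/v2/projects/{project_id}:execute'


def build_execute_request(project_id, notebook_path, access_token):
    """Return (url, headers, body) for the execute call made by the handler."""
    if not notebook_path.startswith('/'):
        raise ValueError('notebookId must be an absolute path to the notebook')
    url = API_URL.format(project_id=project_id)
    headers = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {access_token}',
    }
    body = {'notebookId': notebook_path}
    return url, headers, body


# Placeholder values for illustration only.
url, headers, body = build_execute_request(
    '<project_ID>',
    '/home/jupyter/datasphere/project/test_classifier.ipynb',
    '<access_token>',
)
print(url)
```

Inside the handler, the returned values could be passed to requests.post(url, json=body, headers=headers), with context.token['access_token'] as the token.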

How to delete the resources you created

To stop paying for the resources you created:

  • Delete the function.
  • Delete the project.
