Installing Stackland on Yandex Cloud VMs

Written by

Yandex Cloud

Updated at July 13, 2026

Introduction
Get a cloud network ready
Prepare a service account
Create a disk image
Create a bastion host
Create cluster VMs
Configure DNS
Prepare the Stackland configuration
Download sladm
Prepare secrets
Install the cluster
Check the installation
Troubleshooting
- yandexcloud-lb restarts with a metadata service error
- PVCs remain Pending
Useful links

This guide describes how to install Yandex Cloud Stackland on Yandex Cloud VMs from a custom boot image.

Note

Common use cases for Stackland deployment over VMs in Yandex Cloud are demonstration, prototyping, and test benches.

For production loads, we recommend using managed Yandex Cloud services in the zones where they are available.

Introduction

To deploy a minimal cluster, you will need:

One bastion host on Ubuntu 22.04 or higher. This host is used to run sladm, access the cluster, and give nodes access to the internet.
At least three VMs for your future Stackland cluster. All the VMs must reside in the same cloud network and the same subnet.
Custom disk image created from a Stackland raw image.
On each VM: one boot disk created from the custom image and one separate data disk.
Service account with the permissions to manage Network Load Balancer and use the cloud network.

See the recommended resources for cluster nodes in the Infrastructure section.

Note

Links to release artifacts use the $VERSION variable. Replace it with the current Stackland version.

The CLI examples below use variables. Before you run the commands, set them to the following:

export VERSION=26.1.5
export ZONE=ru-central1-d
export NETWORK_NAME=stackland-network
export SUBNET_NAME=stackland-subnet
export SUBNET_CIDR=10.130.0.0/24
export SECURITY_GROUP_NAME=stackland-sg
export DNS_ZONE_NAME=stackland-internal
export BASE_DOMAIN=stackland.internal
export CLUSTER_NAME=main
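
Before running the commands below, it may help to confirm that every variable is actually set in the current shell. The following is a minimal sketch, not part of the yc or sladm tooling; `require_vars` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The variable names are the ones exported above.
if require_vars VERSION ZONE NETWORK_NAME SUBNET_NAME SUBNET_CIDR \
    SECURITY_GROUP_NAME DNS_ZONE_NAME BASE_DOMAIN CLUSTER_NAME; then
  echo "all variables are set"
fi
```

The helper returns a non-zero status if at least one variable is missing, so it can also guard a script that wraps the commands from this guide.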

Get a cloud network ready

Create a cloud network with a subnet to deploy the bastion host and cluster VMs in.

Proceed with the following requirements in mind:

All cluster VMs must reside in the same subnet. This facilitates the operation of Network Load Balancer and the target group.
Cluster VMs must have stable internal IP addresses.
The bastion host must have internet access and network access to cluster VMs.
Cluster nodes must have internet access via the bastion host, NAT, or another network mechanism in your infrastructure.
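
Since all cluster VMs must share one subnet and keep fixed internal addresses, you may want to check the planned node addresses against the subnet range before creating anything. The sketch below uses only shell arithmetic; `ip_to_int` and `ip_in_cidr` are hypothetical helper names, and the addresses are the example values used in this guide.

```shell
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (for example, 10.130.0.0/24).
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Example node addresses from this guide, checked against the example subnet.
for ip in 10.130.0.11 10.130.0.12 10.130.0.13; do
  if ip_in_cidr "$ip" "${SUBNET_CIDR:-10.130.0.0/24}"; then
    echo "$ip is inside the subnet"
  else
    echo "$ip is outside the subnet"
  fi
done
```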

Create a DNS zone for the new cluster. The examples below use the stackland.internal domain.

If using the CLI, run these commands:

yc vpc network create --name $NETWORK_NAME

yc vpc subnet create \
  --name $SUBNET_NAME \
  --zone $ZONE \
  --range $SUBNET_CIDR \
  --network-name $NETWORK_NAME

NETWORK_ID=$(yc vpc network get $NETWORK_NAME --format json | jq -r '.id')

yc dns zone create \
  --name $DNS_ZONE_NAME \
  --zone $BASE_DOMAIN. \
  --private-visibility \
  --network-ids $NETWORK_ID

If using security groups for VMs, create a group and allow the traffic required for cluster installation and operation:

yc vpc security-group create \
  --name $SECURITY_GROUP_NAME \
  --network-name $NETWORK_NAME

SECURITY_GROUP_ID=$(yc vpc security-group get $SECURITY_GROUP_NAME --format json | jq -r '.id')

yc vpc security-group update-rules $SECURITY_GROUP_ID \
  --add-rule "direction=ingress,protocol=any,predefined=self_security_group" \
  --add-rule "direction=ingress,protocol=tcp,port=22,v4-cidrs=<admin_ip_address>/32" \
  --add-rule "direction=ingress,protocol=tcp,from-port=30000,to-port=32767,v4-cidrs=0.0.0.0/0" \
  --add-rule "direction=egress,protocol=any,v4-cidrs=0.0.0.0/0"

The 30000-32767 range rule is required for Network Load Balancer to access the NodePort ports of the system ingress. The example gives the standard NodePort range in Kubernetes. If you have another range set in your configuration, specify it in the security group rule. For production benches, limit the traffic source in this rule according to your organization’s network policy.

Prepare a service account

Create a service account to be associated with your cluster VMs. Assign the following roles to it:

load-balancer.admin: To create and delete network load balancers and target groups.
vpc.user: To work with cloud network resources.

Run the following commands:

yc iam service-account create --name stackland-yc-lb

SA_ID=$(yc iam service-account get stackland-yc-lb --format json | jq -r '.id')
FOLDER_ID=$(yc config get folder-id)

yc resource-manager folder add-access-binding "$FOLDER_ID" \
  --role load-balancer.admin \
  --subject serviceAccount:"$SA_ID"

yc resource-manager folder add-access-binding "$FOLDER_ID" \
  --role vpc.user \
  --subject serviceAccount:"$SA_ID"

Create a disk image

Create a custom disk image from the Stackland raw image. The raw image is available at:

https://storage.yandexcloud.net/stackland-public/stackland/$VERSION/images/stackland-amd64-$VERSION.raw

You can follow the Uploading a custom image guide to create an image.

If using the CLI, run this command:

yc compute image create \
  --name stackland-$VERSION \
  --source-uri https://storage.yandexcloud.net/stackland-public/stackland/$VERSION/images/stackland-amd64-$VERSION.raw \
  --os-type linux \
  --min-disk-size 150GB

Wait for the image to change its status to READY.

Create a bastion host

Create a VM on Ubuntu 22.04 or higher in the same subnet where the cluster nodes will reside.

Install the required utilities on the bastion host:

sudo apt update
sudo apt install unzip jq curl wget -y

Install kubectl to be able to check cluster health status after installation. For example:

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/

Install and configure the CLI for Yandex Cloud.

If cluster nodes access the internet via the bastion host, enable IPv4 routing and NAT. For a NAT configuration example, see Installing Stackland on Yandex BareMetal.

Create cluster VMs

Create at least three VMs for your Stackland cluster.

For each one, configure the following settings:

Boot disk: From the stackland-$VERSION custom image.
Additional data disk: At least the minimum required size.
Service account: Service account you created earlier.
Subnet: Same subnet the bastion host is in.
Internal IP address: Static.
Public IP address: Do not assign if hosts are accessed via the bastion host.
cluster-name label: The value must match that of metadata.name in the StacklandClusterConfig resource. In the example below, it is main.

If using the CLI, add the --labels cluster-name=$CLUSTER_NAME argument when creating the VMs.

If the VMs have already been created, add the label separately:

yc compute instance add-labels node1 --labels cluster-name=$CLUSTER_NAME
yc compute instance add-labels node2 --labels cluster-name=$CLUSTER_NAME
yc compute instance add-labels node3 --labels cluster-name=$CLUSTER_NAME

Warning

The cluster-name label is required for the yandexcloud-lb component. Without it, the balancer's operator will not be able to get the cluster name from the Yandex Cloud metadata service and will keep on restarting with the failed to get cluster name from metadata: metadata service returned status 404 error.

Here is an example of creating the first node using the CLI:

yc compute instance create \
  --name node1 \
  --zone $ZONE \
  --hostname node1.$BASE_DOMAIN \
  --platform standard-v3 \
  --cores 32 \
  --memory 64GB \
  --core-fraction 100 \
  --labels cluster-name=$CLUSTER_NAME \
  --create-boot-disk name=node1-boot,image-name=stackland-$VERSION,type=network-ssd,size=150,auto-delete=true \
  --create-disk name=node1-data,type=network-ssd,size=400,device-name=data,auto-delete=true \
  --network-interface subnet-name=$SUBNET_NAME,ipv4-address=10.130.0.11,security-group-ids=$SECURITY_GROUP_ID \
  --service-account-name stackland-yc-lb

Use this example to create your other nodes by changing names, FQDNs, IP addresses, and disk names.
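
To avoid editing the command by hand for each node, the changing parts can be filled in by a small function. This is a sketch only: `print_node_command` is a hypothetical helper that prints the command instead of running it, so you can review the result before pasting it into the shell. The flags and sizes are copied from the node1 example, and variables such as $ZONE are left unexpanded on purpose.

```shell
# Print (not run) the create command for node number $1 with internal IP $2.
print_node_command() {
  n=$1
  ip=$2
  cat <<EOF
yc compute instance create \\
  --name node${n} \\
  --zone \$ZONE \\
  --hostname node${n}.\$BASE_DOMAIN \\
  --platform standard-v3 \\
  --cores 32 \\
  --memory 64GB \\
  --core-fraction 100 \\
  --labels cluster-name=\$CLUSTER_NAME \\
  --create-boot-disk name=node${n}-boot,image-name=stackland-\$VERSION,type=network-ssd,size=150,auto-delete=true \\
  --create-disk name=node${n}-data,type=network-ssd,size=400,device-name=data,auto-delete=true \\
  --network-interface subnet-name=\$SUBNET_NAME,ipv4-address=${ip},security-group-ids=\$SECURITY_GROUP_ID \\
  --service-account-name stackland-yc-lb
EOF
}

# Commands for the two remaining nodes of the example cluster.
print_node_command 2 10.130.0.12
print_node_command 3 10.130.0.13
```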

Once the VMs are created, write down the following for each node:

FQDN or name to use in the Stackland configuration.
Internal IP address.
MAC address of the network interface.
Boot disk name within the guest OS. Usually the name is /dev/vda, but you should check the value for your configuration.
Data disk name within the guest OS. Usually the name is /dev/vdb if there is one additional disk connected to the VM.

Configure DNS

Configure DNS records for the cluster domain.

For cluster nodes, create A records pointing to the internal IP addresses of your VMs:

node1.stackland.internal.  A  10.130.0.11
node2.stackland.internal.  A  10.130.0.12
node3.stackland.internal.  A  10.130.0.13

If using the CLI, run this command:

yc dns zone add-records $DNS_ZONE_NAME \
  --record "node1 300 A 10.130.0.11" \
  --record "node2 300 A 10.130.0.12" \
  --record "node3 300 A 10.130.0.13"

Prepare records for system endpoints:

api.sys.$baseDomain: For the address of the network load balancer you use to access the Kubernetes API, or for internal IP addresses of the combined or control-plane nodes.
*.sys.$baseDomain: For the address you are going to assign to the network load balancer after installation.

If the address of the network load balancer is not known in advance, create or update a wildcard record after you complete the installation and create Network Load Balancer.

Prior to the installation, create a record for the Kubernetes API:

yc dns zone add-records $DNS_ZONE_NAME \
  --record "api.sys 300 A 10.130.0.11" \
  --record "api.sys 300 A 10.130.0.12" \
  --record "api.sys 300 A 10.130.0.13"

After the installation, get the external IP address of your Network Load Balancer and add a wildcard record:

INGRESS_IP=$(kubectl get svc -n stackland-ingress ingress-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

yc dns zone add-records $DNS_ZONE_NAME \
  --record "*.sys 300 A $INGRESS_IP"

Prepare the Stackland configuration

Create a folder named config/ on the bastion host and save the cluster configuration files in it.

Note

Do not specify ipPools for the yandex-nlb load balancer. Network Load Balancer does not support this parameter.

Cluster configuration example:

apiVersion: v1alpha1
kind: StacklandClusterConfig
metadata:
  name: main
spec:
  platform:
    type: yandexcloud
    loadBalancer:
      type: yandex-nlb

  cluster:
    baseDomain: stackland.internal

    networking:
      hostsNetwork:
        - cidr: 10.130.0.0/24
      clusterNetwork:
        - cidr: 172.16.0.0/16
      servicesNetwork:
        - cidr: 10.96.0.0/12

    storage:
      defaultStorageClass: stackland-other

  genericHostConfig:
    disksConfig:
      - installDisk:
          name: /dev/vda
      - dataDisk:
          name: /dev/vdb
    networkConfig:
      routes:
        - to: 0.0.0.0/0
          via: 10.130.0.1
          iface: eth0
      resolvers:
        - 10.130.0.2
      timeservers:
        - 10.130.0.2

In Compute Cloud, VM network disks are displayed as VirtIO devices. Stackland treats such disks as the stackland-other storage class; therefore, specify defaultStorageClass: stackland-other for this scenario.

If you specify stackland-ssd, persistent volumes for system components may remain Pending, and you will get the did not have enough free storage message in pod events.
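
Because a wrong storage class only shows up later as Pending volumes, a quick text check of the configuration file before installation can save time. The sketch below is an assumption-laden illustration: `check_storage_class` is a hypothetical helper, and the file name `config/cluster.yaml` in the usage comment is an example, since this guide does not prescribe file names inside config/.

```shell
# Succeed only if the file sets defaultStorageClass to stackland-other.
check_storage_class() {
  if grep -Eq '^[[:space:]]*defaultStorageClass:[[:space:]]*stackland-other[[:space:]]*$' "$1"; then
    echo "ok: defaultStorageClass is stackland-other"
  else
    echo "check $1: defaultStorageClass is not stackland-other" >&2
    return 1
  fi
}

# Usage (the file name is an example):
# check_storage_class config/cluster.yaml
```

This is a plain text match, not a YAML parser, so it does not replace the sladm validation step.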

Host configuration example:

apiVersion: v1alpha1
kind: StacklandHostsList
metadata:
  name: main
spec:
  hosts:
    - hostname: node1.stackland.internal
      role: combined
      networkConfig:
        interfaces:
          - macaddress: d0:0d:20:97:18:17
            name: eth0
        addresses:
          - interface: eth0
            ip: 10.130.0.11/24

    - hostname: node2.stackland.internal
      role: combined
      networkConfig:
        interfaces:
          - macaddress: d0:0d:1f:a3:b5:05
            name: eth0
        addresses:
          - interface: eth0
            ip: 10.130.0.12/24

    - hostname: node3.stackland.internal
      role: combined
      networkConfig:
        interfaces:
          - macaddress: d0:0d:1a:c5:b7:a5
            name: eth0
        addresses:
          - interface: eth0
            ip: 10.130.0.13/24

Replace the IP addresses, MAC addresses, DNS servers, NTP servers, and host names with what is relevant for your infrastructure.
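
If you keep the values you wrote down for each node in a simple list, the host entries can be generated rather than typed. The following sketch reproduces the layout of the example above; `render_hosts_list` is a hypothetical helper, and it assumes every node has the combined role and a single eth0 interface, as in the example.

```shell
# Read "hostname mac ip/prefix" triples from standard input and print a
# StacklandHostsList manifest named after the cluster passed as $1.
render_hosts_list() {
  cluster=$1
  cat <<EOF
apiVersion: v1alpha1
kind: StacklandHostsList
metadata:
  name: $cluster
spec:
  hosts:
EOF
  while read -r host mac addr; do
    [ -n "$host" ] || continue   # skip empty lines
    cat <<EOF
    - hostname: $host
      role: combined
      networkConfig:
        interfaces:
          - macaddress: $mac
            name: eth0
        addresses:
          - interface: eth0
            ip: $addr
EOF
  done
}

# Example values from this guide; replace them with your own.
render_hosts_list main <<'EOF'
node1.stackland.internal d0:0d:20:97:18:17 10.130.0.11/24
node2.stackland.internal d0:0d:1f:a3:b5:05 10.130.0.12/24
node3.stackland.internal d0:0d:1a:c5:b7:a5 10.130.0.13/24
EOF
```

Redirect the output to a file in config/ and review it before running the validation step.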

Download `sladm`

On the bastion host, download and unzip sladm:

wget https://storage.yandexcloud.net/stackland-public/stackland/$VERSION/sladm-$VERSION-linux-amd64.zip
unzip sladm-$VERSION-linux-amd64.zip
chmod +x sladm

Prepare secrets

Generate a StacklandSecretsConfig resource:

./sladm secrets add --out config/secrets.yaml --license-key key.json

Where key.json is the file with the Stackland license key.

Install the cluster

Prior to installing, check the configuration:

./sladm validate --config config/

Run the installation:

./sladm install --config config/ --installation-timeout 2h 2>&1 | tee install-$(date +%y%m%d-%H%M).log

The installation takes approximately one hour. Keep the installation log until you finish checking the installation: it records the status transitions of nodes and components and, after a successful installation, shows the management console address and the admin login and password.

Warning

The installation log contains the admin password. Do not publish the log or disclose it to third parties.
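
One way to reduce the risk of exposing the log on a shared bastion host is to create the file with owner-only permissions before the installer writes to it. This is a sketch of that idea, not a sladm feature; `create_private_log` is a hypothetical helper name.

```shell
# Create (or truncate) a file that only its owner can read and write.
create_private_log() {
  # The subshell keeps the umask change from leaking into the calling shell.
  ( umask 077 && : > "$1" && chmod 600 "$1" )
}

# Usage with the installation command from this guide:
# LOG_FILE=install-$(date +%y%m%d-%H%M).log
# create_private_log "$LOG_FILE"
# ./sladm install --config config/ --installation-timeout 2h 2>&1 | tee -a "$LOG_FILE"
```

Writing to an existing file with tee keeps the permissions set by the helper.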

If the installation did not finish within the allotted time or failed with an error, refer to Troubleshooting.

Check the installation

After the installation is complete, the following message should appear in the sladm log:

✓ Your Stackland cluster is ready

sladm will also display the management console address and the default login and password.

Check the status of the installation manually on the bastion host.

If sladm has not copied kubeconfig to the user’s home directory yet, specify the kubeconfig file explicitly:

export KUBECONFIG=./_out/kubeconfig

Make sure the Kubernetes API is available and all nodes are Ready:
```
kubectl get nodes -o wide
```

Check whether the initial installation of the platform is completed:

kubectl get platformconfig main -o jsonpath='{.status.initialInstall.state}{"\n"}'

Expected value:

Installed
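
If the state is not Installed yet, you can poll it instead of re-running the command by hand. The sketch below is a generic retry loop; `wait_for_output` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary example values.

```shell
# Re-run a command until its output equals the expected value
# or the attempts run out.
# Usage: wait_for_output <expected> <attempts> <delay_seconds> <command...>
wait_for_output() {
  expected=$1
  attempts=$2
  delay=$3
  shift 3
  actual=
  i=1
  while [ "$i" -le "$attempts" ]; do
    actual=$("$@" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
      echo "reached $expected after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up: last value was '$actual'" >&2
  return 1
}

# Usage (requires a working kubeconfig):
# wait_for_output Installed 60 30 \
#   kubectl get platformconfig main -o jsonpath='{.status.initialInstall.state}'
```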

Check the status of components:
```
kubectl get componentinstallations -o wide
```
All components should be Ready.

Make sure there are no stuck PVCs or PVCs with an incorrect storage class:
```
kubectl get pvc -A
kubectl get pvc -A | grep -E 'Pending|Lost|stackland-ssd' || true
```
For this scenario, system component PVCs should use stackland-other.

Make sure there are no pods with errors:

kubectl get pod -A --field-selector=status.phase!=Succeeded | grep -E 'Pending|Error|CrashLoop|Init|0/|ContainerCreating' || true

If the command outputs pods, check their events and logs:

kubectl describe pod <pod_name> -n <namespace>
kubectl logs <pod_name> -n <namespace> --previous --tail=100

Check system ingress resources:
```
kubectl get ingress -A
kubectl get svc -n stackland-ingress ingress-controller -o wide
```
In the ADDRESS field, ingress resources must have the Network Load Balancer external IP address specified.

Make sure that system endpoints are available:

https://console.sys.$baseDomain: Cluster management console.
https://dashboard.sys.$baseDomain: Cluster dashboard.
https://grafana.sys.$baseDomain: Cluster charts in Grafana.
https://prometheus.sys.$baseDomain: Cluster metrics in Prometheus.
https://alertmanager.sys.$baseDomain: Cluster alerts in Alertmanager.

If the *.sys.$baseDomain wildcard record does not point at the Network Load Balancer address yet, get the address of the load balancer you created and update the DNS record.

For example:

INGRESS_IP=$(kubectl get svc -n stackland-ingress ingress-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

yc dns zone add-records $DNS_ZONE_NAME \
  --record "*.sys 300 A $INGRESS_IP"

If you are testing access from a workstation without using a public DNS zone, add these records to the hosts file on your computer:

<load_balancer_ip> console.sys.stackland.internal auth.sys.stackland.internal kubeconfig.sys.stackland.internal grafana.sys.stackland.internal docs.sys.stackland.internal

Note

If you open the Network Load Balancer IP address directly in your browser, the ingress will return default backend - 404. This is the expected behavior: routing is done based on host name. Open the console at https://console.sys.$baseDomain.

Test access to the management console:

curl -I --max-time 10 -H "Host: console.sys.$BASE_DOMAIN" http://$INGRESS_IP
curl -kI --max-time 10 \
  --resolve console.sys.$BASE_DOMAIN:443:$INGRESS_IP \
  https://console.sys.$BASE_DOMAIN

Expected result:

HTTP request returns a redirect to HTTPS.
HTTPS request returns a redirect to the login page or the login page itself with code 200.

Check the status of your load balancer’s target group:

yc load-balancer network-load-balancer list
yc load-balancer target-group list
yc load-balancer network-load-balancer target-states <load_balancer_name_or_id> \
  --target-group-id <target_group_id>

All nodes should be HEALTHY.

If the nodes remain UNHEALTHY, make sure the ingress NodePort is reachable from the bastion host:

INGRESS_HTTP_NODEPORT=$(kubectl get svc -n stackland-ingress ingress-controller -o jsonpath='{.spec.ports[?(@.port==80)].nodePort}')

for ip in <node_1_ip> <node_2_ip> <node_3_ip>; do
  timeout 3 bash -c "</dev/tcp/$ip/$INGRESS_HTTP_NODEPORT" && \
    echo "$ip:$INGRESS_HTTP_NODEPORT open" || \
    echo "$ip:$INGRESS_HTTP_NODEPORT closed"
done

If the NodePort is reachable from the subnet, but the target group remains UNHEALTHY, check:

Security group rules for incoming traffic from Network Load Balancer to NodePorts of your nodes.
Network ACLs, routes, and NAT rules that can affect the return traffic from nodes to Network Load Balancer health checks.
That the internal IP addresses of all nodes are specified in the target group.
That the stackland-ingress/ingress-controller service uses the same NodePorts as Network Load Balancer.

To additionally test the ingress from the subnet, run this command on the bastion host:

curl -I --max-time 10 -H "Host: console.sys.$BASE_DOMAIN" \
  http://<any_node_ip>:$INGRESS_HTTP_NODEPORT

Expected response: HTTP redirect to HTTPS or login page.

Troubleshooting

This section covers issues that commonly occur when installing Stackland on Yandex Cloud VMs. For general installation issues, refer to Diagnostics and troubleshooting.

`yandexcloud-lb` restarts with a metadata service error

If the installation stops at the load-balancer or ingress component, check the pods of the load balancer component:

kubectl get pod -n stackland-load-balancer
kubectl logs -n stackland-load-balancer deploy/yandexcloud-lb --tail=100

If the logs contain the failed to get cluster name from metadata: metadata service returned status 404 error, make sure all cluster VMs have the cluster-name label matching metadata.name in the StacklandClusterConfig resource:

yc compute instance add-labels node1 --labels cluster-name=$CLUSTER_NAME
yc compute instance add-labels node2 --labels cluster-name=$CLUSTER_NAME
yc compute instance add-labels node3 --labels cluster-name=$CLUSTER_NAME

Once the label is added, restart the yandexcloud-lb pod:

kubectl delete pod -n stackland-load-balancer -l app.kubernetes.io/name=yandexcloud-lb

PVCs remain `Pending`

If the installation stops at the iam, logging, or storage component, and pod events contain the did not have enough free storage message, check the storage class and TopoLVM capacity:

kubectl get nodes -o custom-columns='NAME:.metadata.name,OTHER:.status.capacity.capacity\.topolvm\.io/stackland-other,SSD:.status.capacity.capacity\.topolvm\.io/stackland-ssd'
kubectl get pvc -A

For Compute Cloud VMs, the capacity should be available at capacity.topolvm.io/stackland-other. The installation configuration must contain the following parameter:

storage:
  defaultStorageClass: stackland-other

If PVCs have already been created with stackland-ssd, change the configuration and restart the installation. On a test bench without user data, you can delete only the PVCs that are Pending so that the operators recreate them with the correct storage class.
