Deploying GlusterFS in high performance mode

Written by
Yandex Cloud
Updated at May 7, 2025
  • Get your cloud ready
    • Required paid resources
  • Configure the CLI profile
  • Set up your resource environment
  • Deploy your resources
  • Install and configure GlusterFS
  • Test the solution’s availability
  • Test the solution’s performance
  • How to delete the resources you created

GlusterFS is a parallel, distributed, and scalable file system. Thanks to horizontal scaling, it can provide the cloud with an aggregate bandwidth of tens of GB/s and hundreds of thousands of IOPS.

Use this tutorial to create an infrastructure made up of 30 segments sharing a common GlusterFS file system. Placing storage disks in a single availability zone will ensure the high performance of your file system. In our scenario, it is the speed of accessing physical disks that limits performance, while network latency is less important.

To configure a high-performance file system:

  1. Get your cloud ready.
  2. Configure the CLI profile.
  3. Set up an environment for deploying the resources.
  4. Deploy your resources.
  5. Install and configure GlusterFS.
  6. Test the solution’s availability.
  7. Test the solution’s performance.

If you no longer need the resources you created, delete them.

Get your cloud ready

Sign up for Yandex Cloud and create a billing account:

  1. Navigate to the management console and log in to Yandex Cloud or register a new account.
  2. On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.

If you have an active billing account, you can navigate to the cloud page to create or select a folder for your infrastructure to operate in.

Learn more about clouds and folders.

Required paid resources

The infrastructure support costs include:

  • Fee for continuously running VMs and disks (see Yandex Compute Cloud pricing).
  • Fee for using public IP addresses and outbound traffic (see Yandex Virtual Private Cloud pricing).

Configure the CLI profile

  1. If you do not have the Yandex Cloud CLI yet, install it and authenticate following the instructions.

  2. Create a service account:

    Management console
    CLI
    API
    1. In the management console, select the folder where you want to create a service account.
    2. In the list of services, select Identity and Access Management.
    3. Click Create service account.
    4. Specify the service account name, e.g., sa-glusterfs.
    5. Click Create.

    The folder specified when creating the CLI profile is used by default. To change the default folder, use the yc config set folder-id <folder_ID> command. You can specify a different folder using the --folder-name or --folder-id parameter.

    Run the command below to create a service account, specifying sa-glusterfs as its name:

    yc iam service-account create --name sa-glusterfs
    

    Where --name is the service account name.

    Result:

    id: ajehr0to1g8b********
    folder_id: b1gv87ssvu49********
    created_at: "2023-06-20T09:03:11.665153755Z"
    name: sa-glusterfs
    

    To create a service account, use the ServiceAccountService/Create gRPC API call or the create REST API method for the ServiceAccount resource.

  3. Assign the administrator role for the folder to the service account:

    Management console
    CLI
    API
    1. On the management console home page, select a folder.
    2. Go to the Access bindings tab.
    3. Find the sa-glusterfs account in the list and click it.
    4. Click Edit roles.
    5. Click Add role in the dialog that opens and select the admin role.

    Run this command:

    yc resource-manager folder add-access-binding <folder_ID> \
       --role admin \
       --subject serviceAccount:<service_account_ID>
    

    To assign a role for a folder to a service account, use the setAccessBindings REST API method for the ServiceAccount resource or the ServiceAccountService/SetAccessBindings gRPC API call.

  4. Set up the CLI profile to run operations on behalf of the service account:

    CLI
    1. Create an authorized key for the service account and save it to a file:

      yc iam key create \
      --service-account-id <service_account_ID> \
      --folder-id <ID_of_folder_with_service_account> \
      --output key.json
      

      Where:

      • service-account-id: Service account ID.
      • folder-id: Service account folder ID.
      • output: Authorized key file name.

      Result:

      id: aje8nn871qo4********
      service_account_id: ajehr0to1g8b********
      created_at: "2023-06-20T09:16:43.479156798Z"
      key_algorithm: RSA_2048
      
    2. Create a CLI profile to run operations on behalf of the service account:

      yc config profile create sa-glusterfs
      

      Result:

      Profile 'sa-glusterfs' created and activated
      
    3. Configure the profile:

      yc config set service-account-key key.json
      yc config set cloud-id <cloud_ID>
      yc config set folder-id <folder_ID>
      

      Where:

      • service-account-key: Authorized key file name.
      • cloud-id: Cloud ID.
      • folder-id: Folder ID.
    4. Export your credentials to environment variables:

      export YC_TOKEN=$(yc iam create-token)
      export YC_CLOUD_ID=$(yc config get cloud-id)
      export YC_FOLDER_ID=$(yc config get folder-id)
      
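To make sure the new profile is active and points to the right cloud and folder, you can run a quick check (a sketch using standard yc commands; the sa-glusterfs profile and service account names follow the examples above):

    # Show the available profiles; the active one is marked accordingly
    yc config profile list
    # Show the settings of the active profile (service-account-key, cloud-id, folder-id)
    yc config list
    # Confirm the service account is reachable on behalf of this profile
    yc iam service-account get sa-glusterfs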

Set up your resource environment

  1. Create an SSH key pair:

    ssh-keygen -t ed25519
    

    We recommend using the default key file name.

  2. Install Terraform.

  3. Clone the yandex-cloud-examples/yc-distributed-ha-storage-with-glusterfs GitHub repository and go to the yc-distributed-ha-storage-with-glusterfs folder:

    git clone https://github.com/yandex-cloud-examples/yc-distributed-ha-storage-with-glusterfs.git
    cd ./yc-distributed-ha-storage-with-glusterfs
    
  4. Edit the variables.tf file, specifying the parameters of the resources you are deploying:

    Warning

    The values set in the file result in deploying a resource-intensive infrastructure.
    To deploy the resources within your available quotas, use the values below or adjust the values to your specific needs.

    1. Under is_ha, change default to false.

    2. Under client_node_per_zone, change default to 30.

    3. Under storage_node_per_zone, change default to 30.

      Note

      In this scenario, we deploy 30 VMs. You can change this number depending on the required total storage size or aggregate bandwidth.
      The maximum aggregate bandwidth of the system is each segment's bandwidth (450 MB/s for network SSDs) multiplied by the number of segments (30), which comes to around 13.5 GB/s.
      The system capacity is the number of segments (30) multiplied by the size of each storage disk (1 TB), which comes to 30 TB. This arithmetic is sketched in the shell snippet after this list.

    4. If you specified a name other than the default one when creating the SSH key pair, change default to <public_SSH_key_path> under local_pubkey_path.

    5. If you need enhanced performance and do not require guaranteed data integrity, you can use non-replicated SSDs. To do this, change default to network-ssd-nonreplicated under disk_type. In addition, make sure the default value under disk_size is a multiple of 93.
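
    The back-of-the-envelope arithmetic from the note above can be double-checked with a couple of shell expressions (a sketch; the 450 MB/s per segment and 1 TB per disk figures come from that note):

      SEGMENTS=30         # storage_node_per_zone
      SEGMENT_BW=450      # MB/s per network SSD segment
      SEGMENT_SIZE=1      # TB per storage disk
      echo "Aggregate bandwidth: $(( SEGMENTS * SEGMENT_BW )) MB/s (~13.5 GB/s)"
      echo "Total capacity:      $(( SEGMENTS * SEGMENT_SIZE )) TB"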

Deploy your resources

  1. Initialize Terraform:
    terraform init
    
  2. Check the Terraform file configuration:
    terraform validate
    
  3. Preview the list of new cloud resources:
    terraform plan
    
  4. Create the resources:
    terraform apply -auto-approve
    
  5. Wait for the process to complete and check the output:
    Outputs:
    
    connect_line = "ssh storage@158.160.108.137"
    public_ip = "158.160.108.137"
    

This will create 30 VMs for hosting client code (client01, client02, etc.) in the folder and 30 VMs for distributed data storage (gluster01, gluster02, etc.) bound to the client VMs and placed in the same availability zone.
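
To check that the VMs were actually created, you can list the compute instances in the folder (a sketch; the client* and gluster* names come from the Terraform configuration in the repository):

    yc compute instance list | grep -E 'client|gluster'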

Install and configure GlusterFS

  1. Connect to the client01 VM using the command from the process completion output:

    ssh storage@158.160.108.137
    
  2. Switch to the root user:

    sudo -i
    
  3. Install ClusterShell:

    dnf install epel-release -y
    dnf install clustershell -y
    echo 'ssh_options: -oStrictHostKeyChecking=no' >> /etc/clustershell/clush.conf
    
  4. Create the configuration files:

    # Quote the heredoc delimiter so that $CFGDIR is written literally to the config file
    cat > /etc/clustershell/groups.conf <<'EOF'
    [Main]
    default: cluster
    confdir: /etc/clustershell/groups.conf.d $CFGDIR/groups.conf.d
    autodir: /etc/clustershell/groups.d $CFGDIR/groups.d
    EOF

    cat > /etc/clustershell/groups.d/cluster.yaml <<EOF
    cluster:
       all: '@clients,@gluster'
       clients: 'client[01-30]'
       gluster: 'gluster[01-30]'
    EOF
    
  5. Install GlusterFS:

    clush -w @all hostname # check and auto add fingerprints
    clush -w @all dnf install centos-release-gluster -y
    clush -w @all dnf --enablerepo=powertools install glusterfs-server -y
    clush -w @gluster mkfs.xfs -f -i size=512 /dev/vdb
    clush -w @gluster mkdir -p /bricks/brick1
    clush -w @gluster "echo '/dev/vdb /bricks/brick1 xfs defaults 1 2' >> /etc/fstab"
    clush -w @gluster "mount -a && mount"
    
  6. Restart GlusterFS:

    clush -w @gluster systemctl enable glusterd
    clush -w @gluster systemctl restart glusterd
    
  7. From gluster01, probe the gluster02 through gluster30 VMs to check their availability and add them to the trusted storage pool:

    clush -w gluster01 'for i in {2..9}; do gluster peer probe gluster0$i; done'
    clush -w gluster01 'for i in {10..30}; do gluster peer probe gluster$i; done'
    
  8. Create a vol0 folder on each data storage VM and combine the bricks into the stripe-volume distributed shared volume:

    clush -w @gluster mkdir -p /bricks/brick1/vol0
    export STRIPE_NODES=$(nodeset -S':/bricks/brick1/vol0 ' -e @gluster)
    clush -w gluster01 gluster volume create stripe-volume ${STRIPE_NODES}:/bricks/brick1/vol0 
    
  9. Apply the additional performance settings:

    clush -w gluster01 gluster volume set stripe-volume client.event-threads 8
    clush -w gluster01 gluster volume set stripe-volume server.event-threads 8
    clush -w gluster01 gluster volume set stripe-volume cluster.shd-max-threads 8
    clush -w gluster01 gluster volume set stripe-volume performance.read-ahead-page-count 16
    clush -w gluster01 gluster volume set stripe-volume performance.client-io-threads on
    clush -w gluster01 gluster volume set stripe-volume performance.quick-read off
    clush -w gluster01 gluster volume set stripe-volume performance.parallel-readdir on
    clush -w gluster01 gluster volume set stripe-volume performance.io-thread-count 32
    clush -w gluster01 gluster volume set stripe-volume performance.cache-size 1GB
    clush -w gluster01 gluster volume set stripe-volume performance.cache-invalidation on
    clush -w gluster01 gluster volume set stripe-volume performance.md-cache-timeout 600
    clush -w gluster01 gluster volume set stripe-volume performance.stat-prefetch on
    clush -w gluster01 gluster volume set stripe-volume server.allow-insecure on
    clush -w gluster01 gluster volume set stripe-volume network.inode-lru-limit 200000
    clush -w gluster01 gluster volume set stripe-volume features.shard-block-size 128MB
    clush -w gluster01 gluster volume set stripe-volume features.shard on
    clush -w gluster01 gluster volume set stripe-volume features.cache-invalidation-timeout 600
    clush -w gluster01 gluster volume set stripe-volume storage.fips-mode-rchecksum on
    
  10. Mount the stripe-volume shared folder on the client VMs:

    clush -w gluster01  gluster volume start stripe-volume
    clush -w @clients mount -t glusterfs gluster01:/stripe-volume /mnt/
    
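After completing these steps, you can optionally check the cluster state before moving on (a sketch using standard ClusterShell and GlusterFS commands):

    # The node groups should resolve to the expected host lists
    nodeset -e @gluster
    nodeset -e @clients
    # gluster01 should report 29 peers in the "Peer in Cluster (Connected)" state
    clush -w gluster01 gluster peer status
    # The volume should be started and contain 30 bricks
    clush -w gluster01 gluster volume info stripe-volume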

Test the solution’s availability

  1. Check the status of the stripe-volume shared folder:

    clush -w gluster01  gluster volume status
    
  2. Create a text file:

    cat > /mnt/test.txt <<EOF
    Hello, GlusterFS!
    EOF
    
  3. Make sure the file is available on all client VMs:

    clush -w @clients sha256sum /mnt/test.txt
    

    Result:

    client01: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85  /mnt/test.txt
    client02: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85  /mnt/test.txt
    client03: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85  /mnt/test.txt
    ...
    client30: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85  /mnt/test.txt
    
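As an extra check that writes propagate between clients, you can write a new file from one client VM and read it back from another (an optional sketch; the client numbers are arbitrary):

    clush -w client02 "echo 'written from client02' > /mnt/test2.txt"
    clush -w client03 cat /mnt/test2.txt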

Test the solution’s performance

IOR is a benchmark for concurrent I/O operations you can use to test the performance of parallel data storage systems using various interfaces and access patterns.

  1. Install the dependencies:

    clush -w @clients dnf install -y autoconf automake pkg-config m4 libtool git mpich mpich-devel make fio
    cd /mnt/
    git clone https://github.com/hpc/ior.git
    cd ior
    mkdir prefix
    
  2. Close the shell and open it again:

    ^C
    sudo -i
    module load mpi/mpich-x86_64
    cd /mnt/ior
    
  3. Install IOR:

    ./bootstrap
    ./configure --disable-dependency-tracking  --prefix /mnt/ior/prefix
    make 
    make install
    mkdir -p /mnt/benchmark/ior
    
  4. Run IOR:

    export NODES=$(nodeset  -S',' -e @clients)
    mpirun -hosts $NODES -ppn 16 /mnt/ior/prefix/bin/ior  -o /mnt/benchmark/ior/ior_file -t 1m -b 16m -s 16 -F
    mpirun -hosts $NODES -ppn 16 /mnt/ior/prefix/bin/ior  -o /mnt/benchmark/ior/ior_file -t 1m -b 16m -s 16 -F -C
    

    Result:

    IOR-4.1.0+dev: MPI Coordinated Test of Parallel I/O
    Options:
    api                 : POSIX
    apiVersion          :
    test filename       : /mnt/benchmark/ior/ior_file
    access              : file-per-process
    type                : independent
    segments            : 16
    ordering in a file  : sequential
    ordering inter file : no tasks offsets
    nodes               : 30
    tasks               : 480
    clients per node    : 16
    memoryBuffer        : CPU
    dataAccess          : CPU
    GPUDirect           : 0
    repetitions         : 1
    xfersize            : 1 MiB
    blocksize           : 16 MiB
    aggregate filesize  : 120 GiB
    
    Results:
    
    access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
    ------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
    write     1223.48    1223.99    4.65        16384      1024.00    2.44       100.39     88.37      100.44     0
    read      1175.45    1175.65    4.83        16384      1024.00    0.643641   104.52     37.97      104.54     0
    
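Since fio is installed together with the other dependencies above, you can also run a quick single-client sanity check on the mounted volume (a sketch; the job name, file location, and sizes are arbitrary):

    fio --name=seq-write --directory=/mnt/benchmark \
        --rw=write --bs=1M --size=4G --numjobs=4 --group_reporting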

How to delete the resources you created

To stop paying for the resources created, delete them:

terraform destroy -auto-approve
