Deploying GlusterFS in high performance mode
Use this tutorial to create an infrastructure made up of 30 segments sharing a common GlusterFS file system. Placing storage disks in a single availability zone will ensure the high performance of your file system. In our scenario, it is the speed of accessing physical disks that limits performance, while network latency is less important.
To configure a high-performance file system:
- Get your cloud ready.
- Configure the CLI profile.
- Set up an environment for deploying the resources.
- Deploy your resources.
- Install and configure GlusterFS.
- Test the solution’s availability.
- Test the solution’s performance.
If you no longer need the resources you created, delete them.
Get your cloud ready
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one.
If you have an active billing account, you can go to the cloud page
Learn more about clouds and folders.
Required paid resources
The infrastructure support costs include:
- Fee for continuously running VMs and disks (see Yandex Compute Cloud pricing).
- Fee for using public IP addresses and outbound traffic (see Yandex Virtual Private Cloud pricing).
Configure the CLI profile
- If you do not have the Yandex Cloud CLI yet, install it and get authenticated according to the instructions provided.
- Create a service account:

  Management console

  - In the management console, select the folder where you want to create a service account.
  - In the list of services, select Identity and Access Management.
  - Click Create service account.
  - Specify the service account name, e.g., sa-glusterfs.
  - Click Create.

  CLI

  The folder specified in the CLI profile is used by default. You can specify a different folder through the --folder-name or --folder-id parameter.

  Run the command below to create a service account, specifying sa-glusterfs as its name:

    yc iam service-account create --name sa-glusterfs

  Where name is the service account name.

  Result:

    id: ajehr0to1g8b********
    folder_id: b1gv87ssvu49********
    created_at: "2023-06-20T09:03:11.665153755Z"
    name: sa-glusterfs

  API

  To create a service account, use the ServiceAccountService/Create gRPC API call or the create REST API method for the ServiceAccount resource.
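  For reference, below is a minimal sketch of the REST call. It assumes the public IAM endpoint iam.api.cloud.yandex.net and an IAM token exported in the IAM_TOKEN variable; treat the exact request shape as an assumption and check the API reference before relying on it:

    # Create a service account named sa-glusterfs in the specified folder (sketch)
    curl -X POST \
      -H "Authorization: Bearer ${IAM_TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{"folderId": "<folder_ID>", "name": "sa-glusterfs"}' \
      https://iam.api.cloud.yandex.net/iam/v1/serviceAccounts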
- Assign the administrator role for the folder to the service account:

  Management console

  - On the management console home page, select a folder.
  - Go to the Access bindings tab.
  - Find the sa-glusterfs account in the list and click it.
  - Click Edit roles.
  - In the dialog that opens, click Add role and select the admin role.

  CLI

  Run this command:

    yc resource-manager folder add-access-binding <folder_ID> \
      --role admin \
      --subject serviceAccount:<service_account_ID>
  API

  To assign a role for a folder to a service account, use the setAccessBindings REST API method for the Folder resource or the FolderService/SetAccessBindings gRPC API call.
- Set up the CLI profile to run operations on behalf of the service account:

  CLI

  - Create an authorized key for the service account and save it to a file:

      yc iam key create \
        --service-account-id <service_account_ID> \
        --folder-id <ID_of_folder_with_service_account> \
        --output key.json

    Where:
    - service-account-id: Service account ID.
    - folder-id: Service account folder ID.
    - output: Authorized key file name.

    Result:

      id: aje8nn871qo4********
      service_account_id: ajehr0to1g8b********
      created_at: "2023-06-20T09:16:43.479156798Z"
      key_algorithm: RSA_2048

  - Create a CLI profile to run operations on behalf of the service account:

      yc config profile create sa-glusterfs

    Result:

      Profile 'sa-glusterfs' created and activated
  - Configure the profile:

      yc config set service-account-key key.json
      yc config set cloud-id <cloud_ID>
      yc config set folder-id <folder_ID>

    Where:
    - service-account-key: Authorized key file.
    - cloud-id: Cloud ID.
    - folder-id: Folder ID.

  - Export your credentials to environment variables:

      export YC_TOKEN=$(yc iam create-token)
      export YC_CLOUD_ID=$(yc config get cloud-id)
      export YC_FOLDER_ID=$(yc config get folder-id)
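To make sure the profile is active and the IDs are set correctly, you can print the current CLI configuration (an optional check, not part of the original procedure):

  yc config list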
Set up your resource environment
- Create an SSH key pair:

    ssh-keygen -t ed25519

  We recommend using the default key file name.

- Clone the yandex-cloud-examples/yc-distributed-ha-storage-with-glusterfs GitHub repository and go to the yc-distributed-ha-storage-with-glusterfs folder:

    git clone https://github.com/yandex-cloud-examples/yc-distributed-ha-storage-with-glusterfs.git
    cd ./yc-distributed-ha-storage-with-glusterfs
- Edit the variables.tf file, specifying the parameters of the resources you are deploying (a sketch of the edited variable blocks is shown after this list):

  Warning

  The values set in the file result in deploying a resource-intensive infrastructure.
  To deploy the resources within your available quotas, use the values below or adjust them to your specific needs.

  - Under is_ha, change default to false.
  - Under client_node_per_zone, change default to 30.
  - Under storage_node_per_zone, change default to 30.

    Note

    In our scenario, we will deploy 30 VMs. You can change this number depending on the required final storage size or total bandwidth.
    To estimate the maximum aggregate bandwidth of the entire system, multiply the bandwidth of each segment (450 MB/s for network SSDs) by the number of segments (30), which gives around 13.5 GB/s.
    To calculate the system capacity, multiply the number of segments (30) by the size of each storage disk (1 TB), which gives 30 TB.

  - If you specified a name other than the default one when creating the SSH key pair, change default to <public_SSH_key_path> under local_pubkey_path.
  - If you need enhanced performance and guaranteed data integrity is not critical, you can use non-replicated SSDs. To do this, change default to network-ssd-nonreplicated under disk_type. In addition, make sure the default value under disk_size is a multiple of 93.
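For orientation, here is a sketch of how the edited variable blocks in variables.tf might look after these changes. Only the default values come from the steps above; the types, descriptions, and any other attributes in the repository's file may differ:

  variable "is_ha" {
    type    = bool
    default = false
  }

  variable "client_node_per_zone" {
    type    = number
    default = 30
  }

  variable "storage_node_per_zone" {
    type    = number
    default = 30
  }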
Deploy your resources
- Initialize Terraform:
terraform init
- Check the Terraform file configuration:
terraform validate
- Preview the list of new cloud resources:
terraform plan
- Create the resources:
terraform apply -auto-approve
- Wait for the process to complete and the outputs to be displayed:

Outputs:
connect_line = "ssh storage@158.160.108.137"
public_ip = "158.160.108.137"
This will create 30 VMs for hosting client code (client01, client02, etc.) in the folder and 30 VMs for distributed data storage (gluster01, gluster02, etc.) bound to the client VMs and placed in the same availability zone.
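If you want to confirm that all 60 VMs were created before moving on, you can list the compute instances in the folder (an optional check, not part of the original procedure):

yc compute instance list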
Install and configure GlusterFS
- Connect to the client01 VM using the command from the process completion output:

ssh storage@158.160.108.137
- Switch to the root user:

sudo -i
- Install ClusterShell:

dnf install epel-release -y
dnf install clustershell -y
echo 'ssh_options: -oStrictHostKeyChecking=no' >> /etc/clustershell/clush.conf
- Create the configuration files:

cat > /etc/clustershell/groups.conf <<EOF
[Main]
default: cluster
confdir: /etc/clustershell/groups.conf.d $CFGDIR/groups.conf.d
autodir: /etc/clustershell/groups.d $CFGDIR/groups.d
EOF

cat > /etc/clustershell/groups.d/cluster.yaml <<EOF
cluster:
    all: '@clients,@gluster'
    clients: 'client[01-30]'
    gluster: 'gluster[01-30]'
EOF
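Optionally, verify that ClusterShell resolves the node groups as expected before running cluster-wide commands (this check is not part of the original procedure; nodeset ships with ClusterShell):

nodeset -e @clients
nodeset -e @gluster
nodeset -c @all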
- Install GlusterFS:

clush -w @all hostname # check and auto add fingerprints
clush -w @all dnf install centos-release-gluster -y
clush -w @all dnf --enablerepo=powertools install glusterfs-server -y
clush -w @gluster mkfs.xfs -f -i size=512 /dev/vdb
clush -w @gluster mkdir -p /bricks/brick1
clush -w @gluster "echo '/dev/vdb /bricks/brick1 xfs defaults 1 2' >> /etc/fstab"
clush -w @gluster "mount -a && mount"
- Restart GlusterFS:

clush -w @gluster systemctl enable glusterd
clush -w @gluster systemctl restart glusterd
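You can optionally confirm that glusterd is running on every storage node; the -b flag makes clush collate identical output (not part of the original procedure):

clush -w @gluster -b systemctl is-active glusterd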
- Check the availability of the gluster02 through gluster30 VMs:

clush -w gluster01 'for i in {2..9}; do gluster peer probe gluster0$i; done'
clush -w gluster01 'for i in {10..30}; do gluster peer probe gluster$i; done'
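To verify that all storage nodes joined the trusted pool, you can query the peer status from the first node (an optional check):

clush -w gluster01 gluster peer status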
- Create a vol0 folder in each data storage VM and combine these folders into the stripe-volume shared folder:

clush -w @gluster mkdir -p /bricks/brick1/vol0
export STRIPE_NODES=$(nodeset -S':/bricks/brick1/vol0 ' -e @gluster)
clush -w gluster01 gluster volume create stripe-volume ${STRIPE_NODES}:/bricks/brick1/vol0
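Before applying the performance settings, you can check that the volume was assembled from all 30 bricks (an optional check):

clush -w gluster01 gluster volume info stripe-volume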
- Apply the additional performance settings:

clush -w gluster01 gluster volume set stripe-volume client.event-threads 8
clush -w gluster01 gluster volume set stripe-volume server.event-threads 8
clush -w gluster01 gluster volume set stripe-volume cluster.shd-max-threads 8
clush -w gluster01 gluster volume set stripe-volume performance.read-ahead-page-count 16
clush -w gluster01 gluster volume set stripe-volume performance.client-io-threads on
clush -w gluster01 gluster volume set stripe-volume performance.quick-read off
clush -w gluster01 gluster volume set stripe-volume performance.parallel-readdir on
clush -w gluster01 gluster volume set stripe-volume performance.io-thread-count 32
clush -w gluster01 gluster volume set stripe-volume performance.cache-size 1GB
clush -w gluster01 gluster volume set stripe-volume performance.cache-invalidation on
clush -w gluster01 gluster volume set stripe-volume performance.md-cache-timeout 600
clush -w gluster01 gluster volume set stripe-volume performance.stat-prefetch on
clush -w gluster01 gluster volume set stripe-volume server.allow-insecure on
clush -w gluster01 gluster volume set stripe-volume network.inode-lru-limit 200000
clush -w gluster01 gluster volume set stripe-volume features.shard-block-size 128MB
clush -w gluster01 gluster volume set stripe-volume features.shard on
clush -w gluster01 gluster volume set stripe-volume features.cache-invalidation-timeout 600
clush -w gluster01 gluster volume set stripe-volume storage.fips-mode-rchecksum on
- Mount the stripe-volume shared folder on the client VMs:

clush -w gluster01 gluster volume start stripe-volume
clush -w @clients mount -t glusterfs gluster01:/stripe-volume /mnt/
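To confirm that the volume is mounted on every client, you can collate the mount information across the client group (an optional check, not part of the original procedure):

clush -w @clients -b df -hT /mnt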
Test the solution’s availability
- Check the status of the stripe-volume shared folder:

clush -w gluster01 gluster volume status
- Create a text file:

cat > /mnt/test.txt <<EOF
Hello, GlusterFS!
EOF
- Make sure the file is available on all client VMs:

clush -w @clients sha256sum /mnt/test.txt
Result:
client01: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85 /mnt/test.txt
client02: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85 /mnt/test.txt
client03: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85 /mnt/test.txt
...
client30: 5fd9c031531c39f2568a8af5512803fad053baf3fe9eef2a03ed2a6f0a884c85 /mnt/test.txt
Test the solution’s performance
IOR
- Install the dependencies:

clush -w @clients dnf install -y autoconf automake pkg-config m4 libtool git mpich mpich-devel make fio
cd /mnt/
git clone https://github.com/hpc/ior.git
cd ior
mkdir prefix
- Close the shell and open it again:

^C
sudo -i
module load mpi/mpich-x86_64
cd /mnt/ior
- Install IOR:

./bootstrap
./configure --disable-dependency-tracking --prefix /mnt/ior/prefix
make
make install
mkdir -p /mnt/benchmark/ior
- Run IOR:

export NODES=$(nodeset -S',' -e @clients)
mpirun -hosts $NODES -ppn 16 /mnt/ior/prefix/bin/ior -o /mnt/benchmark/ior/ior_file -t 1m -b 16m -s 16 -F
mpirun -hosts $NODES -ppn 16 /mnt/ior/prefix/bin/ior -o /mnt/benchmark/ior/ior_file -t 1m -b 16m -s 16 -F -C
Result:
IOR-4.1.0+dev: MPI Coordinated Test of Parallel I/O

Options:
api                 : POSIX
apiVersion          :
test filename       : /mnt/benchmark/ior/ior_file
access              : file-per-process
type                : independent
segments            : 16
ordering in a file  : sequential
ordering inter file : no tasks offsets
nodes               : 30
tasks               : 480
clients per node    : 16
memoryBuffer        : CPU
dataAccess          : CPU
GPUDirect           : 0
repetitions         : 1
xfersize            : 1 MiB
blocksize           : 16 MiB
aggregate filesize  : 120 GiB

Results:
access    bw(MiB/s)  IOPS     Latency(s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------    ---------  ----     ----------  ----------  ---------  -------   --------  --------  --------  ----
write     1223.48    1223.99  4.65        16384       1024.00    2.44      100.39    88.37     100.44    0
read      1175.45    1175.65  4.83        16384       1024.00    0.643641  104.52    37.97     104.54    0
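For a quick single-client sanity check without MPI, you can also run fio, which was installed together with the dependencies above. The job parameters below are an illustrative sketch rather than part of the original procedure; adjust the size and the number of jobs to your needs:

# Sequential write test against the mounted GlusterFS volume (sketch)
fio --name=seq-write --directory=/mnt/benchmark --rw=write --bs=1M --size=4G \
    --numjobs=4 --ioengine=psync --group_reporting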
How to delete the resources you created
To stop paying for the resources created, delete them:
terraform destroy -auto-approve