
Testing InfiniBand throughput

Written by Yandex Cloud
Updated on April 18, 2025
  1. Connect to the VM over SSH.
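
    For example, assuming a hypothetical user name and VM public IP address (both shown here as placeholders):

    ssh <username>@<VM_public_IP_address>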

  2. Install tools for testing:

    sudo apt update
    sudo apt install perftest numactl
    
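    Optionally, check that the ib_write_bw utility from the perftest package is now available:

    which ib_write_bw
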
  3. Create a file named /etc/security/limits.d/limits.conf with the following contents:

    * soft memlock unlimited
    * hard memlock unlimited
    
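    One way to create this file (a sketch; any editor run with root privileges works just as well):

    printf '%s\n' '* soft memlock unlimited' '* hard memlock unlimited' | sudo tee /etc/security/limits.d/limits.conf
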
  4. Log out and log back in or reboot the machine to apply the changes. Check the limit using this command:

    ulimit -l
    

    The result should be unlimited.

  5. Create a file named infiniband_test.sh with the following contents:

    #!/bin/bash
    set -eu
    
    # Testing the memlock limit
    echo "Current memlock limit:"
    ulimit -l
    if [[ $(ulimit -l) != "unlimited" ]]; then
       echo "Memlock limit is not unlimited."
       echo "Create a file named /etc/security/limits.d/limits.conf with the following content:"
       echo "* soft memlock unlimited"
       echo "* hard memlock unlimited"
       exit 1
    fi
    
    # Cleanup function: terminate all ib_write_bw processes upon script completion
    clean() {
       killall -9 ib_write_bw &>/dev/null
    }
    trap clean EXIT
    
    # Test parameters
    size=33554432  # Block size in bytes
    iters=10000    # Number of iterations
    q=1            # Number of queue pairs
    
    # Specify CPU numbers and network device names for different NUMA nodes
    # Example:
    numa0_cpu=40      # Client CPU (NUMA node 0)
    numa1_cpu=130     # Server CPU (NUMA node 1)
    numa0_net=mlx5_0  # Network interface for the client
    numa1_net=mlx5_7  # Network interface for the server
    
    # Start the server on NUMA node 1
    numactl -C $numa1_cpu --membind 1 /usr/bin/ib_write_bw --ib-dev=$numa1_net --report_gbits -s $size  --iters $iters -q $q &>/dev/null &
    sleep 1
    
    # Start the client on NUMA node 0 with high priority (negative niceness may require root)
    nice -n -20 numactl -C $numa0_cpu --membind 0 /usr/bin/ib_write_bw --ib-dev=$numa0_net --report_gbits -s $size --iters $iters -q $q localhost &
    wait
    
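    The CPU numbers and mlx5_* device names in the script are examples; the right values depend on the VM's NUMA topology. You can look up the CPUs of each NUMA node and the available InfiniBand devices, for instance, with:

    numactl --hardware
    ls /sys/class/infiniband/
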
  6. Make the script executable:

    chmod +x infiniband_test.sh
    
  7. Run the script:

    ./infiniband_test.sh
    

    Result:

    ---------------------------------------------------------------------------------------
    #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
    33554432    10000            394.58             394.40                    0.001469
    ---------------------------------------------------------------------------------------
    
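    As a consistency check, the message rate follows from the average bandwidth and the block size: 394.40 Gbit/s ÷ (33,554,432 bytes × 8 bits) ≈ 1,469 messages per second, i.e. about 0.001469 Mpps, matching the MsgRate column.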
