Yandex Cloud
Search
Contact UsGet started
  • Blog
  • Pricing
  • Documentation
  • All Services
  • System Status
    • Featured
    • Infrastructure & Network
    • Data Platform
    • Containers
    • Developer tools
    • Serverless
    • Security
    • Monitoring & Resources
    • ML & AI
    • Business tools
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Customer Stories
    • Start testing with double trial credits
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Education and Science
    • Yandex Cloud Partner program
  • Blog
  • Pricing
  • Documentation
© 2025 Direct Cursus Technology L.L.C.
Yandex Managed Service for PostgreSQL
  • Getting started
    • All tutorials
      • Managing extensions
      • pg_cron
      • pg_repack
      • pgaudit
      • pgcrypto
      • postgresql_anonymizer
      • Hunspell dictionaries for full-text search
  • Access management
  • Pricing policy
  • Terraform reference
  • Monitoring metrics
  • Audit Trails events
  • Public materials
  • Release notes

In this article:

  • Installing pg_repack in a PostgreSQL cluster
  • Installing the pg_repack client
  • Running pg_repack
  • Usage example
  1. Step-by-step tutorials
  2. PostgreSQL extensions and dictionaries
  3. pg_repack

Using pg_repack in Managed Service for PostgreSQL

Written by
Yandex Cloud
Updated at December 24, 2024
  • Installing pg_repack in a PostgreSQL cluster
  • Installing the pg_repack client
  • Running pg_repack
  • Usage example

PostgreSQL tables and indexes may get bloated. When running transactions that update data in tables and indexes, the older data is retained in case you need to roll back the transactions. This causes tables and indexes to bloat during bulk data updates. You can use the pgstattuple extension or the pgsql-bloat-estimation queries to estimate the amount of bloat.

Older data versions are not deleted automatically. To free up storage space and remove bloat, you can delete the data you no longer need using the VACUUM FULL or CLUSTER commands. However, these commands must obtain an exclusive lock on the processed tables, which may not be convenient or possible.

The pg_repack extension allows you to remove bloat in tables and indexes by repacking them. Unlike other methods, pg_repack works without holding an exclusive lock on the tables. For more information, see the extension page.

Installing pg_repack in a PostgreSQL clusterInstalling pg_repack in a PostgreSQL cluster

  1. Add the pg_repack extension to your database.

  2. Assign the mdb_admin or mdb_superuser role to the owner of this database, if not assigned yet.

    You can get the owner's name with the list of databases in the cluster.

  3. Set the following PostgreSQL cluster parameters to zero:

    • statement_timeout
    • idle_in_transaction_session_timeout

    Warning

    Other values for these settings may cause long-running commands to interrupt or idle transactions to terminate. If this happens, pg_repack may fail.

Installing the pg_repack clientInstalling the pg_repack client

To use the extension, you need a client with the same name. The client must be installed on a host that can connect to the PostgreSQL cluster.

To install the client:

  1. Check the pg_repack version installed in the PostgreSQL cluster.

  2. Install the pg_repack client.

    Warning

    The client and extension versions must match; otherwise, the connection will return this error:

    ERROR: pg_repack failed with error: program 'pg_repack ...' does not match database library 'pg_repack ...'
    

    Depending on your operating system and connected repositories, you can install the client using a package manager, e.g., apt or yum. If the required client version is not available in the repositories, build it from the source files yourself.

    How to build the pg_repack client on Ubuntu 22.04 LTS
    1. Connect the PostgreSQL Apt repository, which contains some of the required dependencies:

      sudo apt install -y postgresql-common && \
      sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh -y $(lsb_release -cs)
      
    2. Install the dependencies.

      Select the postgresql-server-dev-* package version matching the version of the PostgreSQL cluster you plan to connect to using the client. This will make your client more stable.

      This example command installs such a package for PostgreSQL 16 together with other packages:

      sudo apt install git build-essential \
                       zlib1g-dev libzstd-dev liblz4-dev \
                       libreadline-dev \
                       postgresql-server-dev-16
      
    3. Clone the pg_repack Git repository by selecting the relevant version tag and open the local directory containing the repository.

      Example command for the ver_1.4.8 tag:

      git clone https://github.com/reorg/pg_repack.git \
          --branch ver_1.4.8 --depth 1 && \
      cd pg_repack
      
    4. Run the build:

      make
      

      Wait for the build process to complete.

    5. Place the pg_repack executable file in any directory listed in the PATH environment variable for the current user.

      Example command to move the file to /usr/local/bin:

      sudo install bin/pg_repack /usr/local/bin
      
    6. Check the client version:

      pg_repack --version
      

      This should output the version matching the tag you selected earlier, e.g.:

      pg_repack 1.4.8
      

Running pg_repackRunning pg_repack

Tip

Run pg_repack when the PostgreSQL cluster load is minimal: repacking database objects will put extra load on your cluster.

Cluster and host state data is available in the management console.

To run pg_repack and repack database objects:

  1. Make sure the cluster has enough free storage space: at least twice the total size of the tables and indexes to repack.

    During a repack, pg_repack works with copies of tables and indexes, which requires additional storage space.

  2. Run the pg_repack client with the parameters you need.

    Below are example commands to run the client with common parameter combinations. For a description of all supported parameters, see the extension page.

    Tip

    Add the --dry-run parameter to dry run pg_repack and review the planned updates.

    This will output the list of objects to repack when running pg_repack in regular mode.

    Run the command you need:

    • Command to repack the specified database tables:

      Repacking with connection without SSL
      Repacking with connection via SSL
      pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
                -U <username> \
                -d <DB_name> \
                -t <table_name>
      
      PGSSLMODE='verify-full' \
      pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
                -U <username> \
                -d <DB_name> \
                -t <table_name>
      

      If you need to repack multiple tables, provide the required number of -t parameters, one for each table.

    • Command to repack the specified database indexes:

      Repacking with connection without SSL
      Repacking with connection via SSL
      pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
                -U <username> \
                -d <DB_name> \
                -i <index_name>
      
      PGSSLMODE='verify-full' \
      pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
                -U <username> \
                -d <DB_name> \
                -i <index_name>
      

      If you need to repack multiple indexes, provide the required number of -i parameters, one for each index.

    • Command to repack all tables and indexes in the database:

      Repacking with connection without SSL
      Repacking with connection via SSL
      pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
                -U <username> \
                -d <DB_name>
      
      PGSSLMODE='verify-full' \
      pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
                -U <username> \
                -d <DB_name>
      

Usage exampleUsage example

Note

The example was tested in the following environment:

  • PostgreSQL 16 cluster where:

    • Database named db1 is created with user1 as its owner.
    • pg_repack version 1.4.8 is installed in the db1 database.
  • Yandex Cloud VM running Ubuntu 22.04 LTS where:

    • pg_repack client version 1.4.8 built from source files is installed.
    • SSL connections to the cluster are enabled.

This example uses pgbench, a load-testing tool for PostgreSQL which allows you to automate the following operations:

  • Creating test tables and indexes and populating the tables with test data.
  • Running multiple queries to test tables, including INSERT and DELETE queries that update the table contents.

When a pgbench run is over, test tables and their relevant indexes become bloated as a result of a large number of queries. Also, pgbench runs the VACUUM command, which removes table and index bloat, at the beginning rather than the end of its operation. So we will use pgbench to create bloated tables and indexes to demonstrate how pg_repack works.

To test pg_repack on the pgbench test tables and indexes:

  1. Get the PostgreSQL cluster ID.

  2. Add the pgstattuple extension to the db1 database.

    This extension allows you to estimate table and index bloat to see how pg_repack works.

  3. Install pgbench on a VM:

    sudo apt install postgresql-contrib
    

    Note

    The PostgreSQL Apt repository contains the postgresql-contrib package.

    If you deleted this repository from the virtual machine after building the pg_repack client on Ubuntu 22.04 LTS, reconnect the repository.

  4. Create test tables and indexes in the db1 database by connecting to it as its owner (user1):

    PGSSLMODE='verify-full' \
    pgbench -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
            -U user1 -i -s 1 db1
    

    This will create the following tables and indexes:

    Table Index
    pgbench_accounts pgbench_accounts_pkey
    pgbench_branches pgbench_branches_pkey
    pgbench_tellers pgbench_tellers_pkey
    pgbench_history —
  5. View the statistics for the pgbench_* tables and indexes:

    1. Connect to the db1 database as the owner (user1). Use psql to connect.

    2. Request table statistics:

      SELECT * FROM pgstattuple('pgbench_accounts');
      SELECT * FROM pgstattuple('pgbench_branches');
      SELECT * FROM pgstattuple('pgbench_tellers');
      SELECT * FROM pgstattuple('pgbench_history');
      
    3. Request index statistics:

      SELECT * FROM pgstattuple('pgbench_accounts_pkey');
      SELECT * FROM pgstattuple('pgbench_branches_pkey');
      SELECT * FROM pgstattuple('pgbench_tellers_pkey');
      

    The dead_tuple_count columns must show zero for all query results. This means there are no bloated tables and indexes.

    Here is an example of partial output for the pgbench_accounts table:

     table_len | tuple_count | ... | dead_tuple_count | ... | free_space | free_percent
    -----------+-------------+-...-+------------------+-...-+------------+--------------
      13434880 |      100000 | ... |                0 | ... |     188960 |         1.41
    
  6. Run pgbench once:

    PGSSLMODE='verify-full' \
    pgbench -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
            -U user1 -c 5 -j 2 -t 1000 db1
    

    Wait for the pgbench run to complete. This may take several minutes.

  7. Once again, view the statistics for the pgbench_* tables and indexes:

    Non-zero values in the dead_tuple_count columns indicate table and index bloat. For test tables and indexes created by pgbench, the pgbench_tellers table will show the greatest bloating.

  8. Dry run pg_repack to review the planned updates:

    PGSSLMODE='verify-full' \
    pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
              -U user1 \
              -d db1 \
              --dry-run
    

    Since some indexes and tables are not bloated, the output will not include all database objects.

    Output example:

    INFO: repacking table "public.pgbench_accounts"
    INFO: repacking table "public.pgbench_branches"
    INFO: repacking table "public.pgbench_tellers"
    
  9. Run pg_repack in regular mode to repack tables and indexes:

    PGSSLMODE='verify-full' \
    pg_repack -k -h c-<cluster_ID>.rw.mdb.yandexcloud.net -p 6432 \
              -U user1 \
              -d db1
    

    Wait for the pg_repack run to complete. This may take several minutes.

  10. Once again, view the statistics for the pgbench_* tables and indexes:

    The dead_tuple_count columns must show zero for all query results. This means there are no bloated tables or indexes, and pg_repack ran correctly.

Was the article helpful?

Previous
pg_cron
Next
pgaudit
© 2025 Direct Cursus Technology L.L.C.