GeeseFS
Functionality
Performance
Compared to goofys and s3fs, GeeseFS handles large numbers of small files (up to 1 MB) much faster and achieves similar or higher performance with large files. For more information about benchmark tests, see the GeeseFS repository on GitHub.
To achieve this performance, GeeseFS implements:
- Parallel readahead.
- Heuristic readahead for random access: if the size of multiple blocks requested in a row is below the threshold, GeeseFS downloads smaller blocks from storage for upcoming requests.
- Parallel multipart uploads of objects to storage.
- Optimized object updates: the client and storage only exchange modified object parts.
- Background downloads of directory trees and small objects: when a directory is requested, GeeseFS downloads the whole tree in a single request to storage.
- Asynchronous object write, rename, and delete.
- Disk cache for reads and writes.
POSIX compatibility
In addition to the basic functions of the POSIX standards (open, read, write, close, and so on), GeeseFS supports the following features:
- Read-after-write consistency.
- Partial writes (please note that partial writes in buckets with versioning may result in intermediate object versions).
- fsync: Synchronization of the contents of an object or directory between the VM memory and storage.
- truncate: Changing object size at will.
- Soft links (symlinks).
- xattr: Extended file attributes.
- Directory renames.
- readdir: Reads of directory metadata.
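As a rough illustration of the operations listed above, the following sketch exercises a few of them with standard tools. MOUNT_POINT is a hypothetical variable, not a GeeseFS setting; if it is unset, the script falls back to a temporary local directory so it can be tried without a mounted bucket. The truncate utility is assumed to be available (it ships with GNU coreutils).

```shell
#!/bin/sh
# Exercise a few POSIX operations in a directory.
# MOUNT_POINT is a hypothetical variable; defaults to a scratch directory.
set -eu
MOUNT_POINT="${MOUNT_POINT:-$(mktemp -d)}"
work="$MOUNT_POINT/posix-demo"
mkdir -p "$work/dir"

# Read-after-write: the data is readable right after the write returns.
printf 'hello world\n' > "$work/dir/file.txt"
cat "$work/dir/file.txt"

# truncate: shrink the file to 5 bytes (assumes the coreutils `truncate` tool).
truncate -s 5 "$work/dir/file.txt"

# Soft link pointing at the file (relative target).
ln -s file.txt "$work/dir/link.txt"

# Directory rename.
mv "$work/dir" "$work/renamed"

# readdir: list the renamed directory.
ls "$work/renamed"
```

To try it against a mounted bucket, set MOUNT_POINT to the mount point before running the script.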
Partial updating and appending of object data
GeeseFS supports partial updating and appending of object data in Object Storage buckets.
To enable partial object updates, use the --enable-patch option.
To learn more, see the GeeseFS repository on GitHub:
- Partial object updates: Description of partial updating and appending of object data.
- Concurrent Updates: Description of how an object can be partially updated by multiple concurrent requests.
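For orientation, here is a minimal sketch of what an append and a partial in-place update look like from the file system side. It uses only standard tools and makes no claim about how GeeseFS translates these writes into storage requests. MOUNT_POINT is a hypothetical variable that defaults to a temporary local directory; on a real setup it would point to a bucket mounted with the --enable-patch option.

```shell
#!/bin/sh
# Append to a file and overwrite a few bytes in the middle of it.
# MOUNT_POINT is a hypothetical variable; defaults to a scratch directory.
set -eu
MOUNT_POINT="${MOUNT_POINT:-$(mktemp -d)}"
obj="$MOUNT_POINT/patch-demo.txt"

# Initial 10-byte file.
printf '0123456789' > "$obj"

# Append 3 bytes to the end.
printf 'ABC' >> "$obj"

# Overwrite 4 bytes starting at offset 2; conv=notrunc keeps the rest intact.
printf 'xxxx' | dd of="$obj" bs=1 seek=2 conv=notrunc 2>/dev/null

cat "$obj"; echo
```

After both writes the file contains 01xxxx6789ABC: only bytes 2 to 5 and the appended tail differ from the initial content.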
Limitations
GeeseFS does not support the following:
- Working with file and directory access permissions, including the chmod and chown commands.
  When mounting the file system, you can specify:
  - Access permissions to all files or directories in the --file-mode and --dir-mode option values, respectively.
  - ID of the owner of all files and directories in the --uid option value.
  - ID of the group all files and directories belong to in the --gid option value.
  For example:
    geesefs \
      --file-mode=0666 \
      --dir-mode=0777 \
      --uid=1000 \
      <bucket_name> <mount_point>
- Hard links.
- File locking.
- Correct time of the last access to the file (atime) and the last change of the file's attributes (ctime). Both fields always contain the time of the file's last modification, same as the mtime field.
- Creating files larger than 1 TB.
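The timestamp limitation above can be checked with a small helper. This is a sketch only: same_times is a hypothetical function name, and it relies on the stat -c syntax from GNU coreutils (BSD and macOS stat use different options).

```shell
#!/bin/sh
# same_times FILE: succeed if atime, mtime, and ctime (in whole seconds) are equal.
# Assumes GNU `stat -c`; the function name is illustrative.
same_times() {
    times=$(stat -c '%X %Y %Z' "$1") || return 2
    atime=${times%% *}      # first field: last access
    rest=${times#* }
    mtime=${rest%% *}       # second field: last modification
    ctime=${rest#* }        # third field: last attribute change
    [ "$atime" = "$mtime" ] && [ "$mtime" = "$ctime" ]
}
```

For example, `same_times <mount_point>/<file> && echo "all three timestamps match"` would be expected to print the message for a file on a GeeseFS mount, while on a regular local file system the three values often differ.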
Getting started
- Create a service account.
- Assign the service account the roles required for your project. For more information about roles, see the Identity and Access Management documentation.
- Create a static access key.
Note
A service account is only allowed to view a list of buckets in the folder it was created in.
A service account can also perform actions with objects in buckets created in other folders. To enable this, assign the service account roles for the relevant folder or its bucket.
Installation
Linux
- Install the utilities required by FUSE. For example:
  - Debian, Ubuntu:
      sudo apt-get install fuse
  - CentOS:
      sudo yum install fuse
- Download and install GeeseFS:
    wget https://github.com/yandex-cloud/geesefs/releases/latest/download/geesefs-linux-amd64
    chmod a+x geesefs-linux-amd64
    sudo cp geesefs-linux-amd64 /usr/bin/geesefs
macOS
- Install the macFUSE package. For more information, see the installation guide in the macFUSE repository on GitHub.
- Download and install GeeseFS:
    platform='arm64'
    if [[ $(uname -m) == 'x86_64' ]]; then platform='amd64'; fi
    wget https://github.com/yandex-cloud/geesefs/releases/latest/download/geesefs-mac-$platform
    chmod a+x geesefs-mac-$platform
    sudo cp geesefs-mac-$platform /usr/bin/geesefs
You can also build GeeseFS yourself from its source code. For more information, see the guide.
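The download commands in this section differ only in the release file name. The sketch below derives that name from the output of uname. The function name geesefs_asset is hypothetical, and the Linux arm64 file name is an assumption that follows the naming pattern of the other files; check the releases page before relying on it.

```shell
#!/bin/sh
# geesefs_asset OS ARCH: print the release file name for a platform.
# OS and ARCH are expected to be the outputs of `uname -s` and `uname -m`.
# The function name is illustrative; geesefs-linux-arm64 is an assumed name.
geesefs_asset() {
    case "$1" in
        Linux)  os=linux ;;
        Darwin) os=mac ;;
        *) echo "unsupported OS: $1" >&2; return 1 ;;
    esac
    case "$2" in
        x86_64|amd64)  arch=amd64 ;;
        arm64|aarch64) arch=arm64 ;;
        *) echo "unsupported architecture: $2" >&2; return 1 ;;
    esac
    echo "geesefs-$os-$arch"
}

# Show the name for the current machine without downloading anything.
asset=$(geesefs_asset "$(uname -s)" "$(uname -m)") || asset=""
echo "release file for this machine: ${asset:-unknown}"
```

The resulting name can then be appended to https://github.com/yandex-cloud/geesefs/releases/latest/download/ and passed to wget, as in the commands above.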
Authentication
GeeseFS uses a static access key for Object Storage. You can set it using one of the following methods:
Linux/macOS
- Using the credentials file, which you need to put into the ~/.aws/ folder:
    [default]
    aws_access_key_id = <key_ID>
    aws_secret_access_key = <secret_key>
  If the key file is located elsewhere, specify its path in the --shared-config parameter when mounting the bucket:
    geesefs \
      --shared-config <path_to_key_file> \
      <bucket_name> <mount_point>
- Using environment variables:
    export AWS_ACCESS_KEY_ID=<key_ID>
    export AWS_SECRET_ACCESS_KEY=<secret_key>
Note
You can run the geesefs command with superuser privileges (sudo). In this case, make sure to provide the key either in the --shared-config parameter or using environment variables.
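On Linux or macOS, the credentials file can be created with a short script such as the sketch below. The function name write_credentials and the placeholder key values are hypothetical. Restricting the file to its owner is a general precaution for secret keys, not something this document states GeeseFS requires. The demo writes to a scratch directory rather than to ~/.aws.

```shell
#!/bin/sh
# write_credentials DIR KEY_ID SECRET: create DIR/credentials, readable only by its owner.
# The function name is illustrative; the file layout is the one shown above.
write_credentials() {
    dir=$1
    mkdir -p "$dir"
    # Tighten the umask in a subshell so the secret is never world-readable.
    ( umask 077
      printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
          "$2" "$3" > "$dir/credentials" )
    # Also fix permissions if the file already existed.
    chmod 600 "$dir/credentials"
}

# Demo with placeholder values in a scratch directory.
demo_dir=$(mktemp -d)
write_credentials "$demo_dir/.aws" 'EXAMPLE_KEY_ID' 'EXAMPLE_SECRET'
cat "$demo_dir/.aws/credentials"
```

For real use, the call would be `write_credentials "$HOME/.aws" <key_ID> <secret_key>`. When running geesefs with sudo, pass the resulting path in the --shared-config parameter, as the note above describes.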
Windows
- Using the credentials file, which you need to put into the users/<current_user>/.aws/ folder:
    [default]
    aws_access_key_id = <key_ID>
    aws_secret_access_key = <secret_key>
  If the key file is located elsewhere, specify its path in the --shared-config parameter when mounting the bucket:
    geesefs <bucket_name> <mount_point> ^
      --shared-config <path_to_key_file>
- Using environment variables:
    set AWS_ACCESS_KEY_ID=<key_ID>
    set AWS_SECRET_ACCESS_KEY=<secret_key>
When using GeeseFS on a Compute Cloud VM that has a linked service account, you can enable simplified authentication that does not require a static access key. To do this, use the --iam option when mounting the bucket.
Mounting a bucket
Select the folder or disk where you want to mount the bucket. Make sure you have sufficient rights to perform this operation.
When mounting a bucket, you can also configure GeeseFS settings related to system performance and object access rights. To view the list of options and their descriptions, run geesefs --help.
- For one-time bucket mounting, run the following command:
    geesefs <bucket_name> <mount_point>
- To automatically mount a bucket at system startup:
  Linux/macOS
  - Add the following line to the /etc/fuse.conf file:
      user_allow_other
  - Add the following line to the /etc/fstab file:
      <bucket_name> <mount_point> fuse.geesefs _netdev,allow_other,--file-mode=0666,--dir-mode=0777 0 0
    Note
    To ensure that the bucket is mounted correctly, provide the full absolute path to the mount point without ~. Example: /home/user/mountpoint.
  Windows
  Create a Windows service that will automatically run at system startup:
  - Run CMD as an administrator.
  - Run the following command:
      sc create <service_name> ^
        binPath="<command_for_mounting>" ^
        DisplayName= "<service_name>" ^
        type=own ^
        start=auto
    Where binPath is the path to the geesefs.exe file with the required mounting parameters. For example: C:\geesefs\geesefs.exe <bucket_name> <mount_point>.
    Result:
      [SC] CreateService: Success
  - Click Start and start typing Services in the Windows search bar. Run the Services application as an administrator.
  - In the window that opens, find the service you created earlier, right-click it, and select Properties.
  - On the Log on tab, select This account and specify your Windows account name and password.
    If necessary, click Browse → Advanced → Search to find the user you need on the computer.
  - Click OK.
  To delete the created service, open CMD as an administrator and run the following command:
      sc delete <service_name>
  Result:
      [SC] DeleteService: Success
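For the Linux/macOS /etc/fstab entry described earlier in this section, the following sketch generates the line and rejects mount points that are not absolute paths, which is the mistake the note about ~ warns against. The function name fstab_line and the sample bucket name are hypothetical; the mount options are the ones from the fstab line shown in this section.

```shell
#!/bin/sh
# fstab_line BUCKET MOUNT_POINT: print an /etc/fstab entry for GeeseFS.
# Fails if MOUNT_POINT is not an absolute path (for example, a literal ~ or a relative path).
# The function name is illustrative.
fstab_line() {
    case "$2" in
        /*) ;;
        *) echo "mount point must be an absolute path: $2" >&2; return 1 ;;
    esac
    printf '%s %s fuse.geesefs _netdev,allow_other,--file-mode=0666,--dir-mode=0777 0 0\n' \
        "$1" "$2"
}

# Print the entry for a sample bucket; nothing is written to /etc/fstab.
fstab_line my-bucket /home/user/mountpoint
```

Note that an unquoted ~ on the command line is expanded by the shell before the function sees it, so the check only catches a literal ~ (for example, one read from a configuration file) or a relative path.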