GeeseFS
Features
Performance
Compared to goofys and s3fs, GeeseFS handles large numbers of small files (up to 1 MB each) much faster and achieves similar or higher performance with large files. For more information about benchmark tests, see the GeeseFS repository on GitHub.
To achieve this, GeeseFS implements:
- Parallel readahead.
- Heuristic readahead for random access: if the size of multiple blocks requested in a row is below the threshold, GeeseFS downloads smaller blocks from storage for upcoming requests.
- Parallel multipart uploads of objects to storage.
- Optimized object updates: the client and storage only exchange the modified parts of an object.
- Background downloads of small object and directory trees: when a directory is requested, GeeseFS downloads the whole tree in a single request to storage.
- Asynchronous object write, rename, and delete.
- Disk cache for reads and writes.
POSIX compatibility
In addition to the basic functions of the POSIX standards (open, read, write, close, and so on), GeeseFS supports the following features:
- Read-after-write consistency.
- Partial writes (please note that partial writes in buckets with versioning may result in intermediate object versions).
- fsync: Synchronization of the contents of an object or directory between the VM memory and storage.
- truncate: Changing object size at will.
- Soft links (symlinks).
- xattr: Extended file attributes.
- Directory renames.
- readdir: Reads of directory metadata.
Partial updating and appending of object data
GeeseFS supports partial updating and appending of object data in Object Storage buckets.
To enable partial object updates, use the --enable-patch option.
To learn more, see the GeeseFS repository on GitHub:
- Partial object updates: Description of partial updating and appending of object data.
- Concurrent Updates: Description of how an object can be partially updated by multiple concurrent requests.
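From the client side, a partial update is an ordinary in-place write or append to a file under the mount point. The sketch below shows both operations with standard tools. The helper names patch_bytes and append_bytes and the file names in the usage comments are hypothetical, and the sketch assumes the bucket was mounted with --enable-patch so that only the changed range needs to be sent. The helpers work on any file, so they can be tried on a local file first.

```shell
# Overwrite bytes in place at a given byte offset without truncating the file.
# Usage: patch_bytes <file> <offset> <data>
patch_bytes() {
    # bs=1 makes seek a byte offset; conv=notrunc keeps the rest of the file.
    printf '%s' "$3" | dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

# Append data to the end of a file.
# Usage: append_bytes <file> <data>
append_bytes() {
    printf '%s' "$2" >> "$1"
}

# Example on a mounted bucket (hypothetical file names):
#   patch_bytes <mount_point>/data.bin 1024 'new-bytes'
#   append_bytes <mount_point>/log.txt 'one more line'
```

Whether a given write is actually sent as a partial update depends on the mount options and the storage side, so treat this as an illustration of the access pattern rather than a guarantee.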
Limitations
GeeseFS does not support the following:
- Working with file and directory access permissions, including the chmod and chown commands.
  When mounting the file system, you can specify:
  - Access permissions to all files or directories in the --file-mode and --dir-mode option values, respectively.
  - ID of the owner of all files and directories in the --uid option value.
  - ID of the group all files and directories belong to in the --gid option value.
  Here is an example:
      geesefs \
        --file-mode=0666 \
        --dir-mode=0777 \
        --uid=1000 \
        <bucket_name> <mount_point>
- Hard links.
- File locking.
- Correct time of the last access to the file (atime) and the last change of the file's attributes (ctime). Both fields always contain the time of the file's last modification, same as in the mtime field.
- Creating files larger than 1 TB.
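Because chmod and chown are not supported, ownership and permissions have to be chosen at mount time. A minimal sketch, assuming you want the mounted files to belong to the user who runs the command: the hypothetical helper mount_cmd_for_current_user only prints a command line built from the options listed above, so it can be reviewed before it is run. The default modes are assumptions made for this example, not GeeseFS defaults.

```shell
# Print a geesefs command line that makes the current user and group the
# owner of all files and directories. Only options named in this section are used.
# Usage: mount_cmd_for_current_user <bucket_name> <mount_point> [file_mode] [dir_mode]
mount_cmd_for_current_user() {
    bucket=$1
    mount_point=$2
    file_mode=${3:-0644}   # assumed default for this sketch
    dir_mode=${4:-0755}    # assumed default for this sketch
    printf 'geesefs --file-mode=%s --dir-mode=%s --uid=%s --gid=%s %s %s\n' \
        "$file_mode" "$dir_mode" "$(id -u)" "$(id -g)" "$bucket" "$mount_point"
}

# Example:
#   mount_cmd_for_current_user <bucket_name> <mount_point> 0600 0700
```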
Getting started
- Assign to the service account the roles required for your project, e.g., storage.editor for a bucket (to work with a particular bucket) or a folder (to work with all buckets in this folder). For more information about roles, see Access management with Yandex Identity and Access Management.
  To work with objects in an encrypted bucket, a user or service account must have the following roles for the encryption key in addition to the storage.configurer role:
  - kms.keys.encrypter: To read the key, encrypt and upload objects.
  - kms.keys.decrypter: To read the key, decrypt and download objects.
  - kms.keys.encrypterDecrypter: This role includes the kms.keys.encrypter and kms.keys.decrypter permissions.
  For more information, see Key Management Service service roles.
- Create a static access key. As a result, you will get the static access key data. To authenticate in Object Storage, you will need the following:
  - key_id: Static access key ID
  - secret: Secret key
  Save key_id and secret: you will not be able to get the key value again.
Note
A service account is only allowed to view a list of buckets in the folder it was created in.
A service account can perform actions with objects in buckets that are created in folders different from the service account folder. To enable this, assign the service account roles for the appropriate folder or its bucket.
Installation
Debian/Ubuntu
- Make sure the FUSE utilities are installed in the distribution:
      apt list --installed | grep fuse
  Warning
  Many Linux distributions have the utilities for working with FUSE pre-installed by default. Reinstalling or deleting them may lead to OS failures.
- If the FUSE utilities are not installed, run this command:
      sudo apt-get install fuse
- Download and install GeeseFS:
      wget https://github.com/yandex-cloud/geesefs/releases/latest/download/geesefs-linux-amd64
      chmod a+x geesefs-linux-amd64
      sudo cp geesefs-linux-amd64 /usr/bin/geesefs
CentOS
- Make sure the FUSE utilities are installed in the distribution:
      yum list installed | grep fuse
  Warning
  Many Linux distributions have the utilities for working with FUSE pre-installed by default. Reinstalling or deleting them may lead to OS failures.
- If the FUSE utilities are not installed, run this command:
      sudo yum install fuse
- Download and install GeeseFS:
      wget https://github.com/yandex-cloud/geesefs/releases/latest/download/geesefs-linux-amd64
      chmod a+x geesefs-linux-amd64
      sudo cp geesefs-linux-amd64 /usr/bin/geesefs
macOS
- Install the macFUSE package.
- Enable support for third-party kernel extensions. This step is only required the first time you use macFUSE on an Apple Silicon Mac.
- Allow loading the macFUSE kernel extension (Apple Silicon and Intel Mac).
  For more information on installing macFUSE, see this installation guide in the macFUSE GitHub repository.
- Download and install GeeseFS:
      platform='arm64'
      if [[ $(uname -m) == 'x86_64' ]]; then platform='amd64'; fi
      wget https://github.com/yandex-cloud/geesefs/releases/latest/download/geesefs-mac-$platform
      chmod a+x geesefs-mac-$platform
      sudo cp geesefs-mac-$platform /usr/local/bin/geesefs
Windows
- Download and install WinFSP.
- Download the geesefs-win-x64.exe file.
- Rename geesefs-win-x64.exe to geesefs.exe for convenience.
- Create a folder named geesefs and move the geesefs.exe file there.
- Add geesefs to the PATH variable:
  - Click Start and type Change system environment variables in the Windows search bar.
  - Click Environment Variables... at the bottom right.
  - In the window that opens, find the PATH parameter and click Edit.
  - Add your folder path to the list.
  - Click OK.
You can also build GeeseFS yourself using its source code. For more information, see the guide.
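The Linux and macOS download commands above differ only in the name of the binary. The sketch below derives that name from the output of uname; geesefs_asset and geesefs_url are hypothetical helper names. Only the binary names that appear in the steps above are covered, and any other platform is reported as unknown rather than guessed.

```shell
# Map an OS name and CPU architecture (as printed by `uname -s` and `uname -m`)
# to a release file name taken from the installation steps.
# Usage: geesefs_asset <os> <arch>
geesefs_asset() {
    case "$1/$2" in
        Linux/x86_64)  echo geesefs-linux-amd64 ;;
        Darwin/x86_64) echo geesefs-mac-amd64 ;;
        Darwin/arm64)  echo geesefs-mac-arm64 ;;
        *) echo "no known binary for $1/$2" >&2; return 1 ;;
    esac
}

# Print the download URL for the machine the script runs on.
geesefs_url() {
    asset=$(geesefs_asset "$(uname -s)" "$(uname -m)") || return 1
    echo "https://github.com/yandex-cloud/geesefs/releases/latest/download/$asset"
}

# Example:
#   wget "$(geesefs_url)"
```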
Authentication
GeeseFS uses the static access key to Object Storage you got earlier. You can set it using one of the following methods:
Linux/macOS
- Using the credentials file, which you need to put into the ~/.aws/ folder:
  - Create a directory:
        mkdir ~/.aws
  - Create a file named credentials with the following contents:
        [default]
        aws_access_key_id = <key_ID>
        aws_secret_access_key = <secret_key>
  If the key file is located elsewhere, specify its path in the --shared-config parameter when mounting the bucket:
      geesefs \
        --shared-config <path_to_key_file> \
        <bucket_name> <mount_point>
  The key file must have the same structure as ~/.aws/credentials.
- Using environment variables:
      export AWS_ACCESS_KEY_ID=<key_ID>
      export AWS_SECRET_ACCESS_KEY=<secret_key>
Note
You can run the geesefs command with superuser privileges (sudo). In this case, make sure to provide the key information either in the --shared-config parameter or using environment variables.
Windows
- Using the credentials file, which you need to put into the users/<current_user>/.aws/ folder:
      [default]
      aws_access_key_id = <key_ID>
      aws_secret_access_key = <secret_key>
  If the key file is located elsewhere, specify its path in the --shared-config parameter when mounting the bucket:
      geesefs ^
        --shared-config <path_to_key_file> ^
        <bucket_name> <mount_point>
  The key file must have the same structure as ~/.aws/credentials.
  Specify an existing folder as the mount point.
- Using environment variables:
      set AWS_ACCESS_KEY_ID=<key_ID>
      set AWS_SECRET_ACCESS_KEY=<secret_key>
When using GeeseFS on a Compute Cloud VM that has a linked service account, you can enable simplified authentication that does not require a static access key. To do this, use the --iam option when mounting the bucket.
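On Linux and macOS, the credentials file described above can also be created by a script. The sketch below writes it to a folder of your choice with owner-only permissions; write_credentials is a hypothetical helper name, and restricting the permissions is a precaution added here, not a GeeseFS requirement. Keeping the file at a fixed path and passing it in --shared-config may also help with the sudo case, since ~ can point to a different home folder when a command runs as the superuser.

```shell
# Write an AWS-style credentials file that only its owner can read.
# Usage: write_credentials <directory> <key_ID> <secret_key>
write_credentials() {
    (
        umask 077                       # new files and folders: owner-only access
        mkdir -p "$1" || exit 1
        printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
            "$2" "$3" > "$1/credentials" || exit 1
        chmod 600 "$1/credentials"      # also tighten a file that already existed
    )
}

# Example:
#   write_credentials "$HOME/.aws" '<key_ID>' '<secret_key>'
#   geesefs --shared-config "$HOME/.aws/credentials" <bucket_name> <mount_point>
```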
Mounting a bucket
Select the folder or disk where you want to mount the bucket. Make sure you have sufficient rights to perform this operation.
When mounting a bucket, you can also configure GeeseFS settings related to system performance and object access rights. To view the list of options and their descriptions, run geesefs --help.
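On Linux and macOS, the mount point has to be an existing folder that you are allowed to use. A minimal sketch of a pre-flight check follows; check_mount_point is a hypothetical helper name. The emptiness check is a precaution added here, since files already in the folder are hidden while something is mounted over it. The check does not apply to Windows, where the mount point must be a folder that does not exist yet.

```shell
# Succeed only if the path is an existing, writable, empty directory.
# Usage: check_mount_point <path>
check_mount_point() {
    if [ ! -d "$1" ]; then
        echo "$1: not an existing folder" >&2; return 1
    fi
    if [ ! -w "$1" ]; then
        echo "$1: no write permission" >&2; return 1
    fi
    # ls -A lists every entry except . and .. so any output means "not empty".
    if [ -n "$(ls -A "$1")" ]; then
        echo "$1: folder is not empty" >&2; return 1
    fi
}

# Example:
#   check_mount_point <mount_point> && geesefs <bucket_name> <mount_point>
```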
- For one-time bucket mounting:

  Linux/macOS
  - Make sure the .aws/credentials file contains the up-to-date static key data or provide the path to it in the --shared-config parameter.
  - Create a folder for mounting:
        mkdir <mount_point>
  - Mount the bucket:
        geesefs <bucket_name> <mount_point>
    You should specify an existing folder as the mount point.

  Windows
  - Make sure the .aws/credentials file contains the up-to-date static key data or provide the path to it in the --shared-config parameter.
  - Mount the bucket:
        geesefs <bucket_name> <mount_point>
    As the mount point, specify the name of the new folder that will be created when you mount the bucket. You cannot specify the name of an existing folder.
- To automatically mount a bucket at system startup:

  macOS
  - Create a folder for automatic mounting:
        mkdir <mount_point>
  - Create a file named com.geesefs.automount.plist with the autorun agent configuration:
        nano /Users/<username>/Library/LaunchAgents/com.geesefs.automount.plist
  - Set the agent configuration by specifying the name of the bucket and the absolute path to the mount point:
        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
        <plist version="1.0">
        <dict>
            <key>Label</key>
            <string>com.geesefs.automount</string>
            <key>ProgramArguments</key>
            <array>
                <string>/usr/local/bin/geesefs</string>
                <string><bucket_name></string>
                <string><absolute_path_to_mount_point></string>
            </array>
            <key>RunAtLoad</key>
            <true/>
            <key>KeepAlive</key>
            <dict>
                <key>NetworkState</key>
                <true/>
            </dict>
        </dict>
        </plist>
    Note
    Specify an existing empty folder as the mount point.
    For the bucket to be mounted correctly, provide the full absolute path to the mount point and to the key file without ~, e.g., /home/user/.
  - Enable the agent you created:
        launchctl load /Users/<username>/Library/LaunchAgents/com.geesefs.automount.plist
  - Reboot and check that the bucket has been mounted to the specified folder.

  To disable agent autorun, use this command:
        launchctl unload /Users/<username>/Library/LaunchAgents/com.geesefs.automount.plist
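The macOS agent configuration above can also be generated instead of edited by hand. The sketch below prints the same property list with the bucket name and mount point filled in, and rejects a mount point that is not an absolute path, which covers the "no ~" rule from the note. automount_plist is a hypothetical helper name, and the sketch assumes that neither value contains characters that need escaping in XML, such as & or <.

```shell
# Print a launchd agent definition that mounts a bucket with GeeseFS.
# Usage: automount_plist <bucket_name> <absolute_path_to_mount_point>
automount_plist() {
    case "$2" in
        /*) ;;   # absolute path: accepted
        *) echo "mount point must be an absolute path without ~: $2" >&2; return 1 ;;
    esac
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.geesefs.automount</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/geesefs</string>
        <string>$1</string>
        <string>$2</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <dict>
        <key>NetworkState</key>
        <true/>
    </dict>
</dict>
</plist>
EOF
}

# Example:
#   automount_plist <bucket_name> /Users/<username>/<mount_point> \
#       > /Users/<username>/Library/LaunchAgents/com.geesefs.automount.plist
```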
  Linux
  - Create a folder for automatic mounting:
        mkdir <mount_point>
  - Open the /etc/fuse.conf file:
        sudo nano /etc/fuse.conf
  - Add the following line to it:
        user_allow_other
  - Open the /etc/fstab file:
        sudo nano /etc/fstab
  - Add the following line to the /etc/fstab file:
        <bucket_name> /home/<username>/<mount_point> fuse.geesefs _netdev,allow_other,--file-mode=0666,--dir-mode=0777,--shared-config=/home/<username>/.aws/credentials 0 0
    If you created the .aws/credentials file for the root user, you do not need to specify the --shared-config parameter.
    Note
    For the bucket to be mounted correctly, provide the full absolute path to the mount point and to the key file without ~, e.g., /home/user/.
  - Reboot and check that the bucket has been mounted to the specified folder.
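The /etc/fstab line from the Linux steps above has a fixed shape, so it can be assembled by a small helper that also enforces the absolute-path rule from the note. This is a sketch under stated assumptions: fstab_entry and is_absolute are hypothetical names, and the mount options are copied from the example line rather than chosen for your setup.

```shell
# Succeed if the argument is an absolute path.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Print an /etc/fstab line for mounting a bucket with GeeseFS.
# Usage: fstab_entry <bucket_name> <mount_point> [credentials_file]
fstab_entry() {
    is_absolute "$2" || { echo "not an absolute path: $2" >&2; return 1; }
    opts='_netdev,allow_other,--file-mode=0666,--dir-mode=0777'
    # The credentials path is optional, e.g. when root has its own .aws/credentials.
    if [ -n "${3:-}" ]; then
        is_absolute "$3" || { echo "not an absolute path: $3" >&2; return 1; }
        opts="$opts,--shared-config=$3"
    fi
    printf '%s %s fuse.geesefs %s 0 0\n' "$1" "$2" "$opts"
}

# Example (appends the line to /etc/fstab):
#   fstab_entry <bucket_name> /home/<username>/<mount_point> \
#       /home/<username>/.aws/credentials | sudo tee -a /etc/fstab
```

Running sudo mount -a afterwards is a common way to try the new entry without rebooting, although a reboot remains the real test of automatic mounting.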
  To disable automounting, remove the line with the bucket name from the /etc/fstab file.

  Windows
  Create a Windows service that will automatically run at system startup:
  - Run CMD as an administrator.
  - Run this command:
        sc create <service_name> ^
          binPath= "<command_for_mounting>" ^
          DisplayName= "<service_name>" ^
          type= own ^
          start= auto
    Where binPath is the path to the geesefs.exe file with the required mounting parameters. Here is an example: C:\geesefs\geesefs.exe <bucket_name> <mount_point>. As the mount point, specify the name of the new folder that will be created when you mount the bucket. You cannot specify the name of an existing folder.
    Result:
        [SC] CreateService SUCCESS
  - Click Start and start typing Services in the Windows search bar. Run the Services application as an administrator.
  - In the window that opens, find the service you created earlier, right-click it, and select Properties.
  - On the Log on tab, select This account and specify your Windows account name and password.
    If necessary, click Browse → Advanced → Search to find the user you need on the computer.
  - Click OK.

  To delete the created service, open CMD as an administrator and run the following command:
        sc delete <service_name>
  Result:
        [SC] DeleteService SUCCESS