Object metadata export (S3 Inventory)
In Object Storage, you can export metadata of bucket objects (S3 Inventory) for further analysis and cataloging. Data is exported to a target bucket in CSV format.
Warning
Object metadata export (S3 Inventory) is currently in Preview and is provided free of charge. Going forward, this will be a paid feature.
No fees will apply to LIST and HEAD operations performed when using S3 Inventory.
PUT operations and data storage in the target bucket are billable. For more information, see the Object Storage pricing policy.
Export of all objects’ metadata is useful for the following tasks:
- Data discovery and analysis.
- Deduplication of bucket files.
- Simplifying synchronization with other storage systems and services.
- Improving observability.
- Implementing backup and versioned data recovery scenarios.
How export works
The S3 Inventory export is a list of objects with metadata, generated based on the eventual consistency model.
There is a reason for implementing such an approach. To ensure horizontal scalability, fault tolerance, and availability, Object Storage partitions data into separate segments, or shards. Thus, to avoid impacting bucket performance, all shards are scanned asynchronously during export generation.
If objects are written to a shard that has already been scanned, the export will not include data on those objects. Similarly, object deletions may not be immediately reflected. This is standard behavior for LIST operations on buckets.
Tip
Before performing any action with an object featured in an export, check its current state using the HeadObject method as described in this guide.
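The check suggested in the tip can be sketched as follows. This is a minimal example, not a definitive implementation: it assumes a boto3-style S3 client whose head_object call returns the ETag wrapped in double quotes and raises an error carrying a 404 code when the object is missing. The helper name object_matches_export is hypothetical.

```python
def object_matches_export(s3_client, bucket, key, exported_etag):
    """Return True if the object still exists and its ETag equals the exported one.

    Assumption: s3_client behaves like a boto3 S3 client, i.e.
    head_object(Bucket=..., Key=...) returns a dict with a quoted "ETag"
    and raises an exception whose `.response` contains
    {"Error": {"Code": "404"}} when the object no longer exists.
    """
    try:
        head = s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        error = getattr(exc, "response", {}).get("Error", {})
        if error.get("Code") in ("404", "NoSuchKey", "NotFound"):
            return False  # the object was deleted after its shard was scanned
        raise  # other failures (permissions, network) are not "missing"
    # The ETag in the response is quoted; the report stores the bare value.
    return head["ETag"].strip('"') == exported_etag
```

Comparing the ETag catches both deletions and overwrites that happened after the export was generated.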
Source and target buckets
Metadata is exported from the source bucket. This bucket contains the export configuration and objects whose metadata will be exported. You can export metadata for all objects in the source bucket or filter objects by folder (prefix).
The exported metadata is written to the target bucket. You can specify an export prefix in the configuration to organize exported files within the target bucket.
Prerequisites
- Source and target can be the same bucket.
- Source and target buckets must reside in the same cloud.
- Encryption must be disabled on the target bucket.
Note
To write reports to a target bucket with an access policy configured, add a rule to this policy to allow any account to perform the PutObject action, and specify <export_prefix>/ as the resource.
Export configuration and metadata types
You can create an export configuration using the Yandex Cloud CLI or Yandex Cloud API. The configuration includes the following parameters:
- Configuration name.
- Target bucket and export prefix.
- Report frequency (daily or weekly).
- Included versions (current versions only or all object versions).
- Optional prefix to filter objects included in the report.
- Configuration status: enabled or disabled.
- List of optional object metadata fields.
Each export report contains a list of source bucket objects and their metadata. The following fields are included by default:
- BUCKET_NAME: Source bucket name.
- KEY: Object key.
If exporting metadata of all object versions, the report will also indicate:
- VERSION_ID: Version ID.
- IS_LATEST: Latest version flag.
- DELETE_MARKER: Delete marker flag.
Optionally, you can add the following metadata fields to the configuration:
- SIZE: Size in bytes, excluding incomplete parts of multipart uploads, object metadata, and delete markers.
- LAST_MODIFIED_DATE: Creation or last modification date.
- ETAG: Hash.
- STORAGE_CLASS: Storage class.
- IS_MULTIPART_UPLOADED: Multipart upload indicator.
- ENCRYPTION_STATUS: Encryption status.
- OBJECT_LOCK_RETAIN_UNTIL_DATE: Version lock expiration date.
- OBJECT_LOCK_MODE: Version lock type.
- OBJECT_LOCK_LEGAL_HOLD_STATUS: Version legal hold status.
- CHECKSUM_ALGORITHM: Algorithm used to compute the checksum.
- OBJECT_ACCESS_CONTROL_LIST: ACL, Base64-encoded.
- OBJECT_OWNER: Owner account ID.
For an up-to-date list of available parameters, refer to the API Reference.
Export results
S3 Inventory generates an export manifest, a manifest checksum file, and a report.
Manifest
The report manifest consists of two files:
- <export_prefix>/<source_bucket_name>/<configuration_ID>/<export_date>/manifest.json: Manifest file.
- <export_prefix>/<source_bucket_name>/<configuration_ID>/<export_date>/manifest.checksum: MD5 checksum of the manifest.
Where:
- <export_prefix>: Prefix used for the export.
- <source_bucket_name>: Name of the bucket to export metadata from.
- <configuration_ID>: Export configuration ID.
- <export_date>: Export date in YYYY-MM-DDThh:mmZ format.
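As an illustration of the manifest layout, the sketch below builds both manifest keys from their components and checks a downloaded manifest against its checksum file. It assumes that manifest.checksum contains the MD5 hex digest of manifest.json as text; the function names are hypothetical, and downloading the two objects is left to whichever S3 client you use.

```python
import hashlib


def manifest_keys(export_prefix, source_bucket, configuration_id, export_date):
    """Build the object keys of manifest.json and manifest.checksum."""
    base = f"{export_prefix}/{source_bucket}/{configuration_id}/{export_date}"
    return f"{base}/manifest.json", f"{base}/manifest.checksum"


def manifest_is_intact(manifest_bytes, checksum_text):
    """Compare the MD5 of manifest.json with the contents of manifest.checksum.

    Assumption: the checksum file stores the digest as a hex string,
    possibly followed by a newline.
    """
    actual = hashlib.md5(manifest_bytes).hexdigest()
    return actual == checksum_text.strip().lower()
```

For example, manifest_keys("inventory", "source-bucket", "my-config", "2025-05-25T00:00Z") returns the two keys under inventory/source-bucket/my-config/2025-05-25T00:00Z/.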
The manifest contains the following report information:
- Source bucket.
- Target bucket.
- S3 Inventory version.
- Bucket scan start time.
- File format.
- Report schema, i.e., fields included in the report.
- List of files in the report.
Manifest example:
{
  "sourceBucket": "source-bucket",
  "destinationBucket": "example-inventory-destination-bucket",
  "version": "2016-11-30",
  "creationTimestamp": "1514944800000",
  "fileFormat": "CSV",
  "fileSchema": "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, Size",
  "files": [
    {
      "key": "prefix/source-bucket/config-name/data/3a6d560f-d5d5-434c-a896-15b13f52ac09.csv",
      "size": 2147483647,
      "MD5checksum": "f11166069f1990abeb9c97ace9cdfabc"
    }
  ]
}
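A manifest like the one above could be read with a few lines of Python. This is a sketch under two assumptions: creationTimestamp is Unix time in milliseconds, and fileSchema is a comma-separated list of column names in report order. The function name read_manifest is hypothetical.

```python
import json
from datetime import datetime, timezone


def read_manifest(manifest_text):
    """Extract the column names, scan start time, and report files from a manifest."""
    manifest = json.loads(manifest_text)
    # fileSchema lists the report columns separated by commas.
    columns = [name.strip() for name in manifest["fileSchema"].split(",")]
    # Assumption: creationTimestamp is Unix time in milliseconds.
    started = datetime.fromtimestamp(
        int(manifest["creationTimestamp"]) / 1000, tz=timezone.utc
    )
    # Each entry describes one CSV file of the report.
    files = [(f["key"], f["size"], f["MD5checksum"]) for f in manifest["files"]]
    return columns, started, files
```

The returned column list can then serve as the header when parsing the report files, and the MD5 values can be used to verify each downloaded file.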
Report
The report is a CSV file containing metadata of all source bucket objects. For large buckets, the report may be split into multiple files.
Note
Objects in the report may not be sorted in any particular order.
The report is stored at <export_prefix>/<source_bucket_name>/<configuration_ID>/data/<report_name>.csv, where:
- <export_prefix>: Prefix used for the export.
- <source_bucket_name>: Name of the bucket to export metadata from.
- <configuration_ID>: Export configuration ID.
- <report_name>: Report UUID from the manifest.
Report example:
source-bucket-name,some-file-key-1,16777216,2024-11-26 08:22:15.12345+00,STANDARD,75662d2b5026e477f88a1b385fccfad7,f,SSE-S3,,,,MD5,,ajegtlf2q28a********
source-bucket-name,some-file-key-2,647168,2025-05-25 22:05:28.12345+00,COLD,7f9429f312poga209cd412aae2020ae,f,SSE-S3,,,,MD5,,ajegtlf2q28a********
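Rows like these can be parsed with Python's standard csv module. The sketch below is illustrative only: the report has no header row in the example, so the column order in ASSUMED_COLUMNS is inferred from the sample values and is an assumption; in practice, take the order from the manifest's fileSchema. The helper names are hypothetical. The second function groups keys by size and ETag to surface deduplication candidates.

```python
import csv
import io
from collections import defaultdict

# Assumption: column order inferred from the sample rows, not confirmed.
ASSUMED_COLUMNS = [
    "BUCKET_NAME", "KEY", "SIZE", "LAST_MODIFIED_DATE", "STORAGE_CLASS",
    "ETAG", "IS_MULTIPART_UPLOADED", "ENCRYPTION_STATUS",
    "OBJECT_LOCK_RETAIN_UNTIL_DATE", "OBJECT_LOCK_MODE",
    "OBJECT_LOCK_LEGAL_HOLD_STATUS", "CHECKSUM_ALGORITHM",
    "OBJECT_ACCESS_CONTROL_LIST", "OBJECT_OWNER",
]


def read_report(report_text, columns=ASSUMED_COLUMNS):
    """Yield one dict per report row, keyed by the given column names."""
    for row in csv.reader(io.StringIO(report_text)):
        if row:  # skip blank lines
            yield dict(zip(columns, row))


def duplicate_candidates(rows):
    """Group object keys that share both size and ETag.

    These are candidates only: confirm with HeadObject before acting,
    since the export may be out of date.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["SIZE"], row["ETAG"])].append(row["KEY"])
    return {sig: keys for sig, keys in groups.items() if len(keys) > 1}
```

All values are kept as strings, exactly as they appear in the CSV; convert SIZE to an integer or parse the date only where your analysis needs it.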