Yandex project
© 2025 Yandex.Cloud LLC

Analytical processing of Yandex Object Storage data

Written by
Yandex Cloud
Updated at March 6, 2025

In this example, you will run an analytical query on a dataset of New York City taxi rides. The data for the example is stored as Parquet files in a Yandex Object Storage bucket.

The result is a histogram showing how many rides fall into each ride duration range.

To run this example:

  1. Get started.
  2. Connect to the data.
  3. Run the query.
  4. Review the result.

Note

Yandex Cloud provides the New York City taxi trips dataset as is. Yandex Cloud makes no representations, express or implied, warranties, or conditions pertaining to your use of the specified dataset. To the extent allowed by your local laws, Yandex Cloud shall not be liable for any loss or damage, including direct, consequential, special, indirect, incidental, or exemplary, resulting from your use of the dataset.

NYC Taxi and Limousine Commission (TLC):

The data was collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP). The taxi trip data is not generated by the TLC, and the TLC makes no representations whatsoever about the accuracy of this data.

Take a look at the dataset source and its use policy.

Get started

  1. Log in to the management console. If you have not signed up yet, go to the management console and follow the on-screen instructions.
  2. On the Yandex Cloud Billing page, make sure you have a billing account linked and its status is ACTIVE or TRIAL_ACTIVE. If you do not have a billing account yet, create one.
  3. If you do not have a folder yet, create one.

Connect to the data

  1. In the management console, select the folder where you want to create a connection.

  2. In the list of services, select Yandex Query.

  3. In the left-hand panel, select Tutorial.

  4. Under Create infrastructure for tutorial, click Create connection.

    A new connection creation page will open. View the default parameter values, but do not edit them.

  5. Click Create.

    The data binding page will open. View the default parameter values, but do not edit them.

  6. Click Create.

Run the query

  1. In the query editor in the Query interface, click New analytics query.

  2. Enter the query text in the text field:

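    -- Read every row from tutorial-analytics.
    $data =
    SELECT
        *
    FROM
        `tutorial-analytics`;
    
    -- Ride duration in minutes: drop-off time minus pickup time.
    $ride_time =
    SELECT
        DateTime::ToMinutes(tpep_dropoff_datetime - tpep_pickup_datetime) AS ride_time
    FROM
        $data;
    
    -- Build a histogram of ride durations and print it as text.
    SELECT
        Histogram::Print(histogram(ride_time))
    FROM
        $ride_time;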
    
  3. Click Run.
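
To make the query logic easier to follow, here is a minimal local sketch in Python of what it computes: a ride duration in minutes for each row, then a count of rides per duration. The sample rows, the timestamp format, and the helper names `ride_minutes` and `duration_counts` are hypothetical and are not part of Yandex Query. The sketch also assumes durations are truncated to whole minutes, and it counts exact values instead of building the adaptive bins that the YQL histogram produces.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows (pickup, drop-off); not taken from the real dataset.
SAMPLE_RIDES = [
    ("2019-01-01 00:10:00", "2019-01-01 00:16:30"),
    ("2019-01-01 08:00:00", "2019-01-01 08:06:10"),
    ("2019-01-01 09:15:00", "2019-01-01 09:27:45"),
    ("2019-01-01 12:00:00", "2019-01-01 11:58:00"),  # drop-off before pickup
]

# Assumed timestamp format for the sample rows above.
FMT = "%Y-%m-%d %H:%M:%S"


def ride_minutes(pickup: str, dropoff: str) -> int:
    """Ride duration in whole minutes, truncated toward zero (an assumption)."""
    delta = datetime.strptime(dropoff, FMT) - datetime.strptime(pickup, FMT)
    return int(delta.total_seconds() / 60)


def duration_counts(rides) -> Counter:
    """Count rides per whole-minute duration (exact values, not adaptive bins)."""
    return Counter(ride_minutes(pickup, dropoff) for pickup, dropoff in rides)


counts = duration_counts(SAMPLE_RIDES)
for minutes in sorted(counts):
    print(f"{minutes:>4} min: {counts[minutes]} ride(s)")
```

A row whose drop-off time precedes its pickup time yields a negative duration in this sketch, which is one plausible source of negative values in a duration histogram.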

Review the result

Once the query completes, the result shows the distribution of taxi ride durations (in minutes) by number of rides:

Kind: AdaptiveWard Bins: 100 WeightsSum: 140151844.000 Min: -531231.000 Max: 43648.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:   -5706.500 F:       4.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:   -4177.000 F:       3.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:   -2905.625 F:       8.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:   -1156.556 F:       9.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:     -43.545 F:    1685.000
█████████░░░░░░░░░░░░░░░░ P:       0.523 F: 3205072.000
███████████░░░░░░░░░░░░░░ P:       2.000 F: 3974384.000
█████████████████░░░░░░░░ P:       3.000 F: 6216464.000
██████████████████████░░░ P:       4.000 F: 7799899.000
████████████████████████░ P:       5.000 F: 8431504.000
█████████████████████████ P:       6.000 F: 8637705.000
████████████████████████░ P:       7.000 F: 8461147.000
███████████████████████░░ P:       8.000 F: 8122270.000
██████████████████████░░░ P:       9.000 F: 7643893.000
████████████████████░░░░░ P:      10.000 F: 7143245.000
██████████████████░░░░░░░ P:      11.000 F: 6549030.000
█████████████████░░░░░░░░ P:      12.000 F: 6013493.000
███████████████░░░░░░░░░░ P:      13.000 F: 5452450.000
██████████████░░░░░░░░░░░ P:      14.000 F: 4955050.000
████████████░░░░░░░░░░░░░ P:      15.000 F: 4470485.000
███████████░░░░░░░░░░░░░░ P:      16.000 F: 4047062.000
███████████████████░░░░░░ P:      17.474 F: 6886725.000
████████████████░░░░░░░░░ P:      19.475 F: 5569891.000
█████████████░░░░░░░░░░░░ P:      21.474 F: 4499806.000
██████████░░░░░░░░░░░░░░░ P:      23.475 F: 3646437.000
████████░░░░░░░░░░░░░░░░░ P:      25.475 F: 2962072.000
██████░░░░░░░░░░░░░░░░░░░ P:      27.476 F: 2414497.000
█████░░░░░░░░░░░░░░░░░░░░ P:      29.476 F: 1962886.000
████░░░░░░░░░░░░░░░░░░░░░ P:      31.535 F: 1676489.000
███░░░░░░░░░░░░░░░░░░░░░░ P:      33.542 F: 1301808.000
████░░░░░░░░░░░░░░░░░░░░░ P:      35.855 F: 1408697.000
███░░░░░░░░░░░░░░░░░░░░░░ P:      38.569 F: 1206848.000
███░░░░░░░░░░░░░░░░░░░░░░ P:      41.900 F: 1264922.000
██░░░░░░░░░░░░░░░░░░░░░░░ P:      45.386 F:  745821.000
█░░░░░░░░░░░░░░░░░░░░░░░░ P:      48.358 F:  597152.000
█░░░░░░░░░░░░░░░░░░░░░░░░ P:      51.440 F:  521645.000
█░░░░░░░░░░░░░░░░░░░░░░░░ P:      54.776 F:  442015.000
█░░░░░░░░░░░░░░░░░░░░░░░░ P:      58.505 F:  443528.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:      62.515 F:  344650.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:      67.911 F:  308517.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:     115.984 F:   22039.000
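
If you want to work with the printed histogram programmatically, the sketch below parses a few of the lines shown above into number pairs. It assumes that `P` is the bin position (ride duration in minutes) and `F` is the number of rides in that bin. The helper name `parse_bins` and the regular expression are illustrative and are not part of Yandex Query.

```python
import re

# A few lines copied from the output above.
SAMPLE_OUTPUT = """\
████████████████████████░ P:       5.000 F: 8431504.000
█████████████████████████ P:       6.000 F: 8637705.000
████████████████████████░ P:       7.000 F: 8461147.000
░░░░░░░░░░░░░░░░░░░░░░░░░ P:     -43.545 F:    1685.000
"""

# Matches "P: <number> F: <number>" anywhere in a line.
BIN_RE = re.compile(r"P:\s*(-?\d+(?:\.\d+)?)\s+F:\s*(-?\d+(?:\.\d+)?)")


def parse_bins(text):
    """Return (position, frequency) pairs for every line that contains a bin."""
    bins = []
    for line in text.splitlines():
        match = BIN_RE.search(line)
        if match:
            bins.append((float(match.group(1)), float(match.group(2))))
    return bins


bins = parse_bins(SAMPLE_OUTPUT)

# The bin with the highest frequency, and the ride count at negative positions.
peak_position, peak_frequency = max(bins, key=lambda b: b[1])
negative_rides = sum(freq for pos, freq in bins if pos < 0)

print(f"Peak bin: {peak_position} min with {peak_frequency:.0f} rides")
print(f"Rides in negative-duration bins: {negative_rides:.0f}")
```

Applied to the full output, the same parsing makes it straightforward to drop bins at negative positions before charting the distribution.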

See also

  • Named expressions. YQL syntax
  • HISTOGRAM. Built-in YQL functions
  • SQL expression format
  • Batch processing
