Background

The Yandex Go team, led by Ksenia Kolesnikova, Head of Internal Efficiency Analytics, ran a product analytics course at Moscow Institute of Physics and Technology (MIPT) to find promising interns and junior specialists. To get the course started quickly and ensure efficient student learning, they used Yandex Cloud services for working with data, such as Yandex Managed Service for PostgreSQL, Yandex DataSphere with serverless Jupyter® Notebook, and Yandex DataLens.

Out of 24 students enrolled, 22 completed the course, with seven of them joining the Yandex Go team.

Experienced teachers and real data

The analytics job market in Russia is still quite small since most universities do not train students for this role. Faced with the task of finding promising analyst interns, the Yandex Go team decided to go beyond a regular internship and train students on real data.

We launched the Product Analytics course and brought in teachers with real experience from Yandex Go, as well as Yandex Cloud experts, to deliver it. During the course, they shared real-life challenges they face every day and the tech stack they use as analysts. The program covered 13 theory topics and 10 hands-on homework assignments, all tied into a single project.

Yandex has been working closely with MIPT for quite a while. MIPT ranks among the world’s top 300 technical universities according to Quacquarelli Symonds and Times Higher Education, the two major ranking agencies. Only three Russian universities are on that list: Moscow State University, HSE University, and MIPT. The Data Analysis Department, which is one of the core departments at the Faculty of Innovation and High Technology, was founded and is still supported by Yandex. That is why we chose this university to run our first data analytics course.

Before the course started, students had to go through a selection process: out of 80 applicants, 24 got in. The students were divided into six teams of four members each. Each team had a mentor.

Cloud for quick start

Studying theory alone will not make you an analyst. You need hands-on training with real big data. That is why our students worked with data that Yandex Go collected from various cities.

While setting up our course, we faced the challenge of organizing the work with big data. Analysts often use professional tools that involve purchasing or renting infrastructure and deploying databases. Analyzing such data in Microsoft Excel tables will not work.

However, providing students with professional tools would slow down course development, make it much more expensive, and require the Yandex Go team to bring in extra staff to manage the infrastructure. A more effective solution was to leverage Yandex Cloud services: Yandex Managed Service for PostgreSQL to store and access data, Yandex DataSphere with serverless Jupyter® Notebook to run the analysis, and Yandex DataLens to visualize the results.

How the course worked

To set up a working environment for students, the Yandex Go teaching team uploaded the collected data to Yandex Managed Service for PostgreSQL. This removed the need to maintain the database manually. With Yandex Managed Service for PostgreSQL, it takes just a few minutes to allocate resources, install the DBMS, and create databases. The service also handles backups and updates automatically.

Each student team got access to Yandex DataSphere, a tool for ML development with a user-friendly Jupyter® Notebook interface. Using it, the students could process and analyze data from the connected database. Yandex DataSphere also enabled the students to work together on their team project and share their progress with teachers. To visualize the results as charts and dashboards, they employed Yandex DataLens.
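As a rough illustration of this workflow, here is a minimal Python sketch of what a notebook cell querying the course database might look like. It is not the course's actual code: the rides table, its columns, the sample rows, and the helper names build_postgres_url and rides_per_city are all hypothetical, and an in-memory SQLite database stands in for the PostgreSQL cluster so that the example runs anywhere. In a real notebook, the same query function could be given a connection opened by a PostgreSQL driver, using the host, port, and credentials of your own cluster.

```python
import sqlite3
from urllib.parse import quote


def build_postgres_url(host: str, dbname: str, user: str, password: str,
                       port: int = 5432, sslmode: str = "verify-full") -> str:
    """Build a libpq-style connection URL.

    All values are placeholders; use the host and port of your own cluster.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode={sslmode}"
    )


def rides_per_city(conn) -> list[tuple[str, int]]:
    """Count rows per city using any DB-API 2.0 connection."""
    cur = conn.cursor()
    # 'rides' and 'city' are hypothetical names, not the real course schema.
    cur.execute("SELECT city, COUNT(*) FROM rides GROUP BY city ORDER BY city")
    rows = cur.fetchall()
    cur.close()
    return [(city, int(n)) for city, n in rows]


# Stand-in for the course database: an in-memory SQLite table with made-up rows.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE rides (city TEXT, fare REAL)")
demo.executemany(
    "INSERT INTO rides VALUES (?, ?)",
    [("Kazan", 310.0), ("Moscow", 540.0), ("Moscow", 420.0)],
)

print(rides_per_city(demo))
```

Keeping the query in a function that accepts any DB-API connection lets a team develop and test the analysis locally and then point it at the shared database without changing the logic.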

Course results

Out of 24 students enrolled, 22 completed the course, with seven of them joining the Yandex Go team. The students highly appreciated working with experienced teachers and mentors, as well as the chance to explore real Yandex Go data. The Yandex Go team aims to enhance the course, run it again at MIPT, and scale it to other universities.

Advantages

With Yandex Cloud technologies, you can create and run courses in any data processing and analysis domain:

  • Industrial analytics
  • Deep learning and machine learning
  • Python programming
  • Natural language processing
  • Introduction to AI

Using our tools, you do not need to purchase or rent infrastructure, set up databases, configure environments, or provide user support. In Yandex Cloud, all cloud data tools get deployed quickly and run only during hands-on training and homework assignments. This helps start courses faster and reduces infrastructure expenses.

Opinion

Fyodor Lavrentyev,
Course Mentor, CDO at Yandex Go

Yandex Cloud makes it easy to deploy training environments, manage teams, and configure role-based access for students and teachers. We really liked the built-in monitoring system, which immediately shows whether we have enough resources. If we don’t, we can easily scale up. With DataSphere, there is no need even for that, as it automatically starts and stops virtual machines, which makes things really convenient.