The analytics job market in Russia is still quite small since most universities do not train students for this role. Faced with the task of finding promising analyst interns, the Yandex Go team decided to go beyond a regular internship and train students on real data.

We launched the Product Analytics course and brought in teachers with real experience from Yandex Go, as well as Yandex Cloud experts, to deliver it. During the course, they shared real-life challenges they face every day and the tech stack they use as analysts. The program covered 13 theory topics and 10 hands-on homework assignments, all tied into a single project.

Yandex has been working closely with MIPT for quite a while. MIPT ranks among the world’s top 300 technical universities according to Quacquarelli Symonds and Times Higher Education, two major ranking agencies. Only three Russian universities are on that list: Moscow State University, HSE University, and MIPT. The Data Analysis Department, one of the core departments at the Faculty of Innovation and High Technology, was founded by Yandex and is still supported by it. That is why we chose this university to run our first data analytics course.

Before the course started, students had to go through a selection process: out of 80 applicants, 24 got in. The students were divided into six teams of four members each. Each team had a mentor.

Cloud for a quick start

Studying theory alone will not make you an analyst. You need hands-on training with real big data. That is why our students worked with data that Yandex Go collected from various cities.

While setting up the course, we had to work out how students would handle big data. Data at this scale cannot be analyzed in Microsoft Excel spreadsheets. Analysts typically rely on professional tools instead, which means purchasing or renting infrastructure and deploying databases.

However, setting up such tools for students ourselves would have slowed down course development, made it much more expensive, and required the Yandex Go team to bring in extra staff to manage the infrastructure. A more effective solution was to use Yandex Cloud services: Yandex Managed Service for PostgreSQL to store and access data, Yandex DataSphere with serverless Jupyter® Notebook to run the analysis, and Yandex DataLens to visualize the results.
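To give a feel for the store-then-analyze workflow described above, here is a minimal sketch of the kind of query a student might run from a notebook. It is an illustration only, not the course material: the `rides` table, its `city` and `price` columns, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the managed PostgreSQL instance so that the snippet runs anywhere.

```python
import sqlite3


def average_price_by_city(conn):
    """Return {city: average price} from a hypothetical `rides` table."""
    rows = conn.execute(
        "SELECT city, AVG(price) FROM rides GROUP BY city ORDER BY city"
    ).fetchall()
    return {city: avg for city, avg in rows}


# Stand-in database: in the course setup this would be a PostgreSQL
# connection; SQLite is used here only to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (city TEXT, price REAL)")
conn.executemany(
    "INSERT INTO rides VALUES (?, ?)",
    [("Moscow", 300.0), ("Moscow", 500.0), ("Kazan", 200.0)],  # made-up rows
)

result = average_price_by_city(conn)
print(result)
```

With a real PostgreSQL database, the same aggregate query would run through a PostgreSQL driver, and the resulting table could then be charted in a visualization tool.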