Analyzing public data on road accidents in Russia
Note
In this tutorial, all objects are created and stored in a workbook. If you use legacy folder navigation, create a separate directory to work in.
Create a directory
- Go to the DataLens home page.
- In the left-hand panel, select All objects or Personal directory.
- In the top-right corner, click Create → Directory.
- Enter a name for the directory.
- Click Create.
This guide analyzes public data on road accidents in Russia. Besides exploring the accident statistics, you will learn how to:
- Work with the DataLens key entities: Connections, Datasets, Charts, and Dashboards.
- Merge multiple sources within a single dataset.
- Use the Date and time data type and edit chart-level groupings.
- Work with geodata: geopoints and geopolygons.
- Create public dashboards that will be available to everyone from any device without authentication.
The following files will provide the source data:
- Traffic accident statistics for April-December 2018 in CSV format.
- Polygon reference for regions of Russia in CSV format.
Download them before you proceed with this guide.
To visualize and explore data, set up DataLens and follow the steps below:
- Create a workbook.
- Create a connection to files.
- Create a dataset.
- Analyze accident distribution density.
- Analyze the number of accidents and death rate.
- Analyze the statistics by week, day of week, and time of day.
- Create a map with color fill by region.
- Create a dashboard and add charts to it.
- Add selectors to the dashboard.
- Add a map with color fill by region to the dashboard.
- Publish the dashboard.
Getting started
To get started with DataLens:
- Log in to your Yandex account. If you do not have an account, create one.
- Open the DataLens home page.
- Click Open DataLens.
- Confirm that you have read the Terms of use and click Log in.

If you already use Yandex Cloud:
- Log in to your Yandex account.
- Open the DataLens home page.
- Click Open DataLens.
- Select one of the options:
  - If you already have an organization, select it from the drop-down menu in the Organizations tab and click DataLens.
    Note
    To activate a DataLens instance, the user must have the admin or owner role. For more information about roles, see Access management in Yandex Cloud Organization.
  - If you have a cloud but no organization, click Add new DataLens. In the window that opens, enter your organization's name and description and click Create organization and DataLens. For more information about working with organizations, see Getting started with organizations.

If you have any technical questions about the service, contact Yandex Cloud support.
Create a workbook
- Go to the DataLens home page.
- In the left-hand panel, select Collections and workbooks.
- In the top-right corner, click Create → Create workbook.
- Enter the workbook name: Tutorials.
- Click Create.
Create a connection to files
Create a connection to source data files:
- In the top-right corner of the workbook, click Create → Connection.
- Under Files and services, select the Files connection.
- Click Upload files.
- Select dtp201804-1.csv. Wait until the table content appears on the screen.
- Upload the polygon reference for regions: click Upload files again.
- Select Regions.csv. Wait for the data to load.
- Select the uploaded Regions.csv file from the list on the left.
- For the Column header parameter, set the Yes value.
- Click Create connection.
- Enter dtp_data for the connection name and click Create.
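Setting the Column header parameter to Yes means the first row of the file is read as field names rather than as data. The difference can be sketched with Python's csv module. This is only an illustration, not how DataLens reads files: the sample rows, the polygon value format, and the comma delimiter are assumptions, and the real Regions.csv may differ.

```python
import csv
import io

# Invented two-row sample; the real Regions.csv content will differ.
SAMPLE = (
    "Accident region,Polygon\n"
    'Region A,"[[[55.0, 37.0], [55.1, 37.0], [55.1, 37.1]]]"\n'
    'Region B,"[[[59.9, 30.2], [60.0, 30.2], [60.0, 30.3]]]"\n'
)


def read_with_header(text):
    """Treat the first row as field names (similar to Column header = Yes)."""
    return list(csv.DictReader(io.StringIO(text)))


def read_without_header(text):
    """Treat every row, including the first one, as data."""
    return list(csv.reader(io.StringIO(text)))


rows = read_with_header(SAMPLE)
raw = read_without_header(SAMPLE)

print(rows[0]["Accident region"])  # value addressed by column name
print(len(rows), len(raw))         # the header row is not counted as data in rows
```

With the header enabled, each value can be addressed by its column name, which is what later makes fields such as Accident region and Polygon available in the dataset.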
Create a dataset
Create a dataset based on the dtp_data connection:
- In the top-right corner of the connection page, click Create dataset.
- Drag the dtp201804-1.csv table to the workspace.
- Drag the Regions.csv table to the workspace. An error message will appear because the tables have not been linked yet.
- Click the link icon between the tables.
- Click Add link.
- Select the fields to link: reg_name and Accident region.
- Click Apply.
  Make sure that the data in the preview section is presented correctly.
- Go to the Fields tab. On this tab, you can add data fields and calculated fields, change field data types and aggregation rules, and rename fields.
- Rename the fields as follows:
  - reg_code to Region code
  - reg_name to Region name
  - road_code to Road code
  - road_name to Road name
  - road_type to Road type
  - oktmo to OKTMO code
  - address to Address
  - crash_type_name to Accident type
  To rename a field, click its current name, delete it, and enter a new one.
- The accident date and time are stored in the following fields:
  - crash_date: Numeric date in 20190218 format.
  - crash_time: Time in 19:34 text format.
  These fields will not be used in charts in this format. Click the hide icon to hide them (the icon appears when you hover over the field).
- Create a calculated field to display information about the accident date and time:
  - In the top-right corner, click Add field.
  - At the top-left, specify the field name: Date and time.
  - In the formula field, enter DATETIME_PARSE(str([crash_date])+' '+str([crash_time])).
    Tip
    - To avoid errors, select fields for the formula from the list on the left instead of entering them manually.
    - To display information about functions, click Reference at the top right.
  - Click Create.
  The preview section now correctly displays the accident date and time as a Date and time type value with T as the separator. The new field is at the top of the table. To edit the field formula, click the formula icon next to the field.
- Rename the fields as follows:
  - crash_reason to Cause of accident
  - fatalities_amount to Death toll
  - victims_amount to Number of victims
  - vehicles_amount to Number of vehicles
  - participants_amount to Number of participants
- For the Death toll, Number of victims, Number of vehicles, and Number of participants fields, specify the Sum aggregation type. Fields with defined aggregate functions become measures and change their color to blue.
- Create a field with accident coordinates:
  - In the top-right corner, click Add field.
  - At the top-left, specify the field name: Geopoint.
  - In the formula field, enter GEOPOINT([latitude],[longitude]).
  - Click Create.
  Note
  The example is based on ready-to-use data in coordinate format.
  A new field of the Geopoint type will appear in the list of fields; the data for this field will appear in the preview section.
- Create the following fields:
  - Number of accidents with the SUM(1) formula. You do not need to set the aggregation type for this field: when you add it to the chart, DataLens will apply the aggregation function automatically.
  - Accident death rate with the [Death toll]/[Number of accidents] formula.
- For the Polygon field, change the data type to Geopolygon.
- Save the dataset:
  - In the top-right corner, click Save.
  - Enter Accident as the dataset name and click Create.
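The dataset steps above (linking the two tables, building Date and time from crash_date and crash_time, and deriving the two measures) can be approximated outside DataLens. The following is a minimal Python sketch, not DataLens's implementation: the sample rows and the polygon value are invented, and the link is treated as a left join, which is an assumption about the link type.

```python
from datetime import datetime

# Invented sample rows; the real files have more columns and rows.
ACCIDENTS = [
    {"reg_name": "Region A", "crash_date": 20180401, "crash_time": "19:34", "fatalities_amount": 1},
    {"reg_name": "Region A", "crash_date": 20180402, "crash_time": "08:05", "fatalities_amount": 0},
    {"reg_name": "Region B", "crash_date": 20180403, "crash_time": "23:10", "fatalities_amount": 2},
]
REGIONS = [
    {"Accident region": "Region A", "Polygon": "[[[55.0, 37.0], [55.1, 37.0], [55.1, 37.1]]]"},
]


def link_tables(accidents, regions):
    """Attach region columns to each accident where reg_name equals Accident region."""
    by_name = {region["Accident region"]: region for region in regions}
    missing = {"Accident region": None, "Polygon": None}  # no matching region found
    return [{**row, **by_name.get(row["reg_name"], missing)} for row in accidents]


def parse_date_time(crash_date, crash_time):
    """Rough analogue of DATETIME_PARSE(str([crash_date])+' '+str([crash_time]))."""
    return datetime.strptime(f"{crash_date} {crash_time}", "%Y%m%d %H:%M")


def measures(rows):
    """Compute Number of accidents, Death toll, and Accident death rate for the rows."""
    accidents = sum(1 for _ in rows)                        # SUM(1)
    deaths = sum(row["fatalities_amount"] for row in rows)  # Death toll with Sum aggregation
    return {
        "Number of accidents": accidents,
        "Death toll": deaths,
        "Accident death rate": deaths / accidents,
    }


linked = link_tables(ACCIDENTS, REGIONS)
print(parse_date_time(20180401, "19:34").isoformat())
print(measures(linked))
```

The ISO string printed for the parsed value uses T between the date and the time, which matches the separator mentioned for the preview section.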
Analyze accident distribution density
To visualize accident distribution density by region, create a heat map:
- On the dataset page, click Create chart in the top-right corner.
- Select the Map visualization type.
- Select the Heat map (Geopoints) layer type.
- Add points with coordinates to the map. To do this, drag the Geopoint field from Dimensions to the Heat map (Geopoints) section.
- Save the chart:
  - In the top-right corner, click Save.
  - In the window that opens, enter Accident heat map as the chart name and click Save.
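Conceptually, a heat map shows how many points fall into each small area of the map. The sketch below illustrates that idea by counting points per grid cell in Python. It is only an illustration under stated assumptions: the source does not describe how DataLens actually computes or renders density, and the coordinates and the 0.5-degree cell size are invented.

```python
import math
from collections import Counter

# Invented coordinates (latitude, longitude).
POINTS = [(55.75, 37.62), (55.76, 37.60), (59.93, 30.33)]


def density_grid(points, cell_deg=0.5):
    """Count points per grid cell; a cell is identified by integer (row, column) indices."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


grid = density_grid(POINTS)
for cell, count in grid.most_common():
    print(cell, count)
```

Cells with higher counts correspond to the "hotter" areas on the map.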
Analyze the number of accidents and death rate
To analyze the number of accidents by region, create a bar chart.
- Create a copy of the chart you created in the previous step:
  - In the top-right corner, open the menu and select Save as copy.
  - In the window that opens, enter Number of accidents by region as the name for the new chart and click Save.
- For the visualization type, select Bar chart.
- Add region names to your chart. To do this, drag the Region name field from Dimensions to the Y section.
- Add the number of accidents to your chart. To do this, drag the Number of accidents field from Measures to the X section.
- Sort the chart by the number of accidents in descending order. To do this, drag the Number of accidents field from Measures to the Sorting section.
  The chart will show the number of accidents by region. Most accidents occurred in the Moscow region.
  Note
  Labels may be missing for some of the bars. If this is the case, hover over the bar to see a tooltip with the region name.
- Save your chart: in the top-right corner, click Save.
To analyze the accident death rate by region, create another bar chart:
- Create a copy of the Number of accidents by region chart:
  - In the top-right corner, open the menu and select Save as copy.
  - In the window that opens, enter Death rate by region as the name for the new chart and click Save.
- Replace the number of accidents measure with the accident death rate measure on the X axis. To do this, drag the Accident death rate field from Measures to the X section and hold it over the Number of accidents field until the latter turns red.
- Change the sorting from the number of accidents to the accident death rate in descending order. To do this, drag the Accident death rate field from Measures to the Sorting section and hold it over the Number of accidents field until the latter turns red.
  This time, Ingushetia, Kalmykia, and the Volgograd region are far ahead in death rate.
- Save your chart: in the top-right corner, click Save.
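The two charts in this section amount to grouping accidents by region, computing a measure per group, and sorting in descending order. A minimal Python sketch of that logic follows; the rows are invented and the code illustrates the aggregation only, not how DataLens executes it.

```python
from collections import defaultdict

# Invented rows: one row per accident.
ROWS = [
    {"Region name": "Region A", "Death toll": 0},
    {"Region name": "Region A", "Death toll": 0},
    {"Region name": "Region A", "Death toll": 1},
    {"Region name": "Region B", "Death toll": 1},
    {"Region name": "Region B", "Death toll": 1},
]


def stats_by_region(rows):
    """Return the number of accidents, deaths, and death rate for each region."""
    totals = defaultdict(lambda: {"accidents": 0, "deaths": 0})
    for row in rows:
        entry = totals[row["Region name"]]
        entry["accidents"] += 1               # SUM(1)
        entry["deaths"] += row["Death toll"]  # Sum aggregation
    return {
        region: {**entry, "death_rate": entry["deaths"] / entry["accidents"]}
        for region, entry in totals.items()
    }


def ranked(stats, measure):
    """Region names sorted by the given measure in descending order."""
    return sorted(stats, key=lambda region: stats[region][measure], reverse=True)


stats = stats_by_region(ROWS)
print(ranked(stats, "accidents"))
print(ranked(stats, "death_rate"))
```

In this sample the two rankings differ, which mirrors the observation above that the regions with the most accidents are not necessarily the ones with the highest death rate.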
Analyze the statistics by week, day of week, and time of day
Let's see how the number of accidents and related deaths are distributed by week, day of week, and time of day.
- Analyze the number of accidents and death rate by week:
  - Create a copy of the chart you created in the previous step:
    - In the top-right corner, open the menu and select Save as copy.
    - In the window that opens, enter Number of accidents and death rate by week as the name for the new chart and click Save.
  - For the visualization type, choose Line chart.
  - Delete the fields in all sections of the chart. To do this, click the delete icon next to each field in a section (the icon appears when you hover over the field).
  - Add accident date and time to your chart. To do this, drag the Date and time field from Dimensions to the X section.
  - Group dates by week:
    - Click the calendar icon next to the Date and time field in the X section.
    - In the Grouping field, choose Date portion ⟶ Week and click Apply.
  - Drag:
    - The Number of accidents field to the Y section.
    - The Accident death rate field to the Y2 section.
    The chart now shows two lines: the number of accidents and the death rate. If you hover over a point on the chart, a tooltip appears with specific values.
  - Save your chart: in the top-right corner, click Save.
- Now analyze the number of accidents and death rate by day of week:
  - Create a copy of the chart you created in the previous step:
    - In the top-right corner, open the menu and select Save as copy.
    - In the window that opens, enter Number of accidents and death rate by day of week as the name for the new chart and click Save.
  - Change the grouping of the Date and time field to Date portion ⟶ Day of week.
    This chart reveals clearer trends: both measures increase towards the end of the week.
  - Save your chart: in the top-right corner, click Save.
- Analyze the number of accidents and death rate by time of day:
  - Create a copy of the chart you created in the previous step:
    - In the top-right corner, open the menu and select Save as copy.
    - In the window that opens, enter Number of accidents and death rate by time of day as the name for the new chart and click Save.
  - Change the grouping of the Date and time field to Date portion ⟶ Hour.
    The chart shows that fewer accidents occur at night, but their death rate is higher.
  - Save your chart: in the top-right corner, click Save.
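Grouping by a date portion means replacing each timestamp with one of its parts (week number, day of week, or hour) and then aggregating per part. The Python sketch below illustrates this with invented timestamps. It uses ISO week numbers and ISO weekdays (Monday is 1), which is an assumption; DataLens may number weeks and days differently.

```python
from collections import Counter
from datetime import datetime

# Invented accident timestamps.
TIMESTAMPS = [
    datetime(2018, 4, 2, 8, 15),    # Monday
    datetime(2018, 4, 6, 23, 40),   # Friday
    datetime(2018, 4, 7, 23, 5),    # Saturday
    datetime(2018, 4, 14, 2, 30),   # Saturday of the following week
]

# Each date portion maps a timestamp to the value used for grouping.
DATE_PORTIONS = {
    "week": lambda ts: ts.isocalendar()[1],
    "day_of_week": lambda ts: ts.isoweekday(),
    "hour": lambda ts: ts.hour,
}


def count_by(timestamps, portion):
    """Count accidents per value of the chosen date portion."""
    extract = DATE_PORTIONS[portion]
    return Counter(extract(ts) for ts in timestamps)


for portion in DATE_PORTIONS:
    print(portion, dict(count_by(TIMESTAMPS, portion)))
```

Switching the chart grouping from Week to Day of week or Hour corresponds to changing only the extraction step; the aggregation stays the same.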
Create a map with color fill by region
- Create a copy of the chart you created in the previous step:
  - In the top-right corner, open the menu and select Save as copy.
  - In the window that opens, enter Regions map as the name for the new chart and click Save.
- Select the Map visualization type.
- Select Polygons (Geopolygons) as the layer type.
- Add polygons to the map. To do this, drag the Polygon field from Dimensions to the Geopolygons section.
- Change the colors of the polygons based on the number of accidents. To do this, drag the Number of accidents field from Measures to the Colors section.
- Drag the following fields to the Tooltips section:
  - Region name
  - Number of accidents
  - Death toll
  - Number of victims
  - Number of vehicles
  - Number of participants
  - Accident death rate
  The chart will show a map with color fill by region. If you hover over a region, information for that region appears.
- Save your chart: in the top-right corner, click Save.
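Coloring polygons by a measure can be thought of as mapping each region's value onto a gradient between two colors. The sketch below shows one simple way to do this with linear interpolation in Python. The values, the two gradient colors, and the linear scale are all assumptions made for illustration; the source does not describe the palette or scale DataLens uses.

```python
# Invented number of accidents per region.
ACCIDENTS_BY_REGION = {"Region A": 10, "Region B": 60, "Region C": 110}


def lerp_color(low, high, t):
    """Linearly interpolate between two RGB colors; t is in the range 0..1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


def fill_colors(values, low=(255, 255, 204), high=(189, 0, 38)):
    """Map each value to a hex color between low (minimum) and high (maximum)."""
    smallest, largest = min(values.values()), max(values.values())
    span = largest - smallest
    colors = {}
    for region, value in values.items():
        t = (value - smallest) / span if span else 0.0  # all values equal: use the low color
        r, g, b = lerp_color(low, high, t)
        colors[region] = f"#{r:02x}{g:02x}{b:02x}"
    return colors


colors = fill_colors(ACCIDENTS_BY_REGION)
for region, color in colors.items():
    print(region, color)
```

Regions with the smallest value get the light end of the gradient and regions with the largest value get the dark end.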
Create a dashboard and add charts to it
Create a dashboard to add charts and other widgets to:
- In the left-hand panel, select Collections and workbooks and go to the Tutorials workbook.
- In the top-right corner, click Create → Dashboard.
- In the panel at the bottom of the page, select Chart.
- In the window that opens, click Select.
- Select the Accident heat map chart.
- Click Add.
- Repeat steps 3-6 to add the following charts to the dashboard:
  - Number of accidents by region
  - Number of accidents and death rate by week
  - Number of accidents and death rate by day of week
  - Number of accidents and death rate by time of day
- On the dashboard, create a widget with two tabs:
  - In the top-right corner of the Number of accidents by region chart, click the settings icon.
  - In the left part of the window, click Add.
  - In the new tab, click Select.
  - Select the Death rate by region chart.
  - Click Save.
- Adjust the size of the charts with your mouse and place them on the dashboard as you prefer.
Add selectors to the dashboard
Selectors allow you to filter data by value.
- Add a widget for data filtering:
  - In the panel at the bottom of the page, choose Selector.
  - Choose the Accident dataset.
  - Select the Region name field.
  - Add selectors by the following fields to the widget:
    - Road name
    - Cause of accident
    - Accident type
    - Road type
    To do this, on the left side under Selectors, click Add and select a field name.
  - Click Save.
- Position the widget on the dashboard where you see fit.
- Save the dashboard:
  - In the top-right corner of the dashboard, click Save.
  - Enter Accidents in Russia as the dashboard name and click Create.
Your dashboard is ready.
On the dashboard, you can:
- Select selector values for data filtering.
- Rescale the map.
- Switch the Number of accidents by region chart to Death rate by region and back. To do this, click the icon to the right of the chart name and select the appropriate tab.
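Filtering by selector values can be pictured as keeping only the rows whose field values are among the selected ones, with all selectors combined. The following Python sketch illustrates that behavior under stated assumptions: the rows are invented, an empty selection is treated as "no filter", and selectors are combined with a logical AND. It is not a description of how DataLens applies filters internally.

```python
# Invented rows using field names from the dataset.
ROWS = [
    {"Region name": "Region A", "Road type": "Federal", "Accident type": "Collision"},
    {"Region name": "Region A", "Road type": "Local", "Accident type": "Rollover"},
    {"Region name": "Region B", "Road type": "Federal", "Accident type": "Collision"},
]


def apply_selectors(rows, selections):
    """Keep rows matching every selector; an empty selection does not filter."""
    def matches(row):
        return all(
            not allowed or row[field] in allowed
            for field, allowed in selections.items()
        )
    return [row for row in rows if matches(row)]


filtered = apply_selectors(
    ROWS,
    {"Region name": {"Region A"}, "Road type": set(), "Accident type": {"Collision"}},
)
print(filtered)
```

Every chart on the dashboard that is built on the same dataset would then be drawn from the filtered rows only.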
Add a map with color fill by region to the dashboard
- Click Edit at the top right.
- In the top-right corner of the Accident heat map chart, click the settings icon.
- In the left part of the window, click Add.
- In the new tab, click Select.
- Select the Regions map chart.
- Click Save.
- In the top-right corner, click Save.
Now you can switch between charts: the heat map with accident density or the color fill map.
Publish the dashboard
To configure public access to the dashboard:
- At the top of the screen, open the menu and select Public access.
- In the window that opens, enable the link sharing option next to the dashboard name.
- Click Continue to confirm publication of the dashboard and the data connected to it.
- Click Copy link to copy the dashboard's unique URL.
- Click Apply.
- Open a new tab in the browser and go to the copied link.
You can share this link with your colleagues, partners, and friends. No authentication or authorization is required to open the dashboard link.
Note
You can enable or disable public access separately for each chart and dataset associated with the dashboard.
For more information, see Configuring public access for objects.