Working with datasets
In this section, you will learn how to work with a dataset:
- Creating a dataset
- Adding data
- Configuring links
- Creating fields
- Configuring fields
- Adding parameters
- Adding filters
- Configuring access
Creating a dataset
You can create a new dataset using one of the following methods:
-
Go to the DataLens home page
. -
Click Create dataset.
- Go to the page with workbooks and collections
. - Open the workbook to create a dataset in.
- In the top-right corner, click Create and select Dataset.
In the left-hand panel, select
Adding data
You can add data to a dataset from tables available in the source or run an SQL query against the source.
Note
You can use SQL queries for database sources only.
-
In the dataset, go to the Sources tab.
-
In the connection panel, click
Add. -
Select a data source connection.
-
Add data from the data source:
TableSQL queryDrag the tables you need from the table panel to the workspace. The system will automatically link these tables together based on the first matching field name and data type.
Warning
- The maximum number of tables in a single dataset is 32.
- The maximum number of fields in a single dataset is 1,200.
Note
- To use subqueries as a source, in the connection settings, enable SQL query access level → Allow subqueries in datasets when creating or editing a connection.
- Describing a dataset via a SQL query does not support parameters. However, you can use parameterized queries in QL charts.
-
In the table panel, click
Add. -
Enter a Source name and enter the SQL code in the Subquery field.
-
Click Create.
Example of SQL query
A SQL query joins data from
samples.MS_SalesFacts
,samples.MS_Products
, andsamples.MS_Shops
and groups them on theStore
and theProduct type
fields. A sales total is computed for each group. At the same time, only sales records from 2019 are taken into consideration:SELECT t3.ShopName AS "Store", t2.ProductCategory AS "Product type", COUNT(t1.OrderID) AS "Number of sales", SUM(t1.Price*t1.ProductCount) AS "Sales total" FROM samples.MS_SalesFacts t1 INNER JOIN samples.MS_Products t2 ON t2.ProductID=t1.ProductID INNER JOIN samples.MS_Shops t3 ON t3.ShopID=t1.ShopID WHERE toYear(t1.OrderDatetime)='2019' -- condition for selecting sales for the specified year (2019) GROUP BY "Store", "Product type" -- group by Store and Product type ORDER BY "Store", "Product Type" -- sort by Store and Product type
Replacing a connection
To replace a connection in a dataset:
-
In the dataset, go to the Sources tab.
-
Delete all the tables from the workspace.
-
In the connection panel, click
next to the connection and select Replace connection. -
Select another connection.
-
Drag the tables to be used as a data source for the dataset to the workspace.
-
Edit or delete the calculated fields which use fields not present in the new table.
-
In the top-right corner, click Save.
Note
If the data schema in the source changed, update the dataset fields.
Configuring links
You can join data from source tables.
Note
When you merge data from multiple tables, a dataset will get duplicates of the fields used to create links. You can remove them manually from the list of dataset fields. You can also describe the dataset using an SQL query and exclude duplicate fields from the final set.
-
In the dataset, go to the Sources tab.
-
Click the table link icon.
-
Select the link type:
inner
,left
,right
, orfull
. -
Select the fields to join the tables on. You can only link fields with the same data type. DataLens gets information about the field's data type directly from the connection. If you need to join tables by fields with different data types, use one of these options:
-
Convert all the fields to the same data type at the source level.
-
Describe the dataset with an SQL query. Use the CAST or CONVERT
functions to convert the data type. For example:SELECT * FROM lead INNER JOIN user ON lead.assigned_by_id = CONVERT(user.external_id, CHAR)
-
-
If required, link other fields in the tables. To do this, click Add link.
-
Optionally, you can disable the Optimize link option to make the link required. In this case, the
JOIN
operation will be performed even if you select fields from a single table. -
Click Apply.
By default, when you add a new table from a data source to the workspace, it is automatically linked to the first table listed there. If required, you can link a new table to a specific table. To do this, hover the new table over a previously added one and release when you see a gray border around the latter. A new link to the specified table is created.
You can replace a previously added table. When dragging a new table, hover it over the icon with circular arrows to the right of the table you are replacing for the icon background to turn blue, then release.
Note
If the updated table retains the same set of fields with the same names, the field IDs in the dataset will not change. In this case, the charts created based on this dataset will not be disturbed. If the updated table's schema has changed, update the dataset fields.
Creating fields
You can add fields to a dataset from data source tables or create calculated fields using formulas.
-
In the dataset, go to the Fields tab.
-
At the top of the screen, click
Add field. -
Create a field:
Calculated fieldData field-
Enter a name for the field.
-
Enter a formula to calculate the field value using data source functions.
-
Optionally, enable Don't display in wizard to hide the field in the wizard when creating a chart.
-
Click Create. The field will appear in the dataset.
-
Enter a name for the field.
-
Go to the Field from source tab.
-
Specify the field parameters:
- Field ID: ID of the field you are creating that is used to provide parameters for filtering by this field on the dashboard or in the chart URL. It may contain lowercase Latin letters, as well as the
_
and-
characters. - Source: Table in the data source with the field you need.
- Source field: Field from the selected table in the data source.
- Field type: Field data type.
- Aggregation: Aggregation function applied to the field.
- Field ID: ID of the field you are creating that is used to provide parameters for filtering by this field on the dashboard or in the chart URL. It may contain lowercase Latin letters, as well as the
-
Optionally, enable Don't display in wizard to hide the field in the wizard when creating a chart.
-
Click Create. The field will appear in the dataset.
-
Configuring fields
You can perform the following operations on dataset fields:
-
Rename a field: To rename a field, click its current name and enter a new one.
-
Change the source: If the table schema in the source has changed, you can select a new field from the source to match the dataset field.
-
Change the field data type: Data type determines how to process data and which functions can apply to the field. To change the data type, click the current type and select a new one from the list of available types.
-
Select an aggregation type: Aggregation type determines the aggregation function to apply to the field. To change the aggregation type, click the current type and select a new one from the list of available aggregation types.
-
Configure access permissions: You can restrict access to field values for different users or user groups. To do this, click
and set the access configuration in'field_value': user_email
format. For more information, see Managing access at the data row level (RLS). -
Sort fields: You can sort fields by name, data type, aggregation type, and other parameters. To do this, use the sorting buttons at the top of the Fields table.
-
Add a description: You can add a comment to a field in the Description column. This can help other users understand what the field is used for or how it is calculated.
-
Hide fields: You can hide fields you do not need at the moment. Other users will not see these fields in the wizard when creating a chart. To do this, click
.
Updating fields
You need to update fields if you changed the schema of a table (or tables) in the data source or manually deleted fields from a dataset.
To update fields in a dataset:
-
In the dataset, go to the Fields tab.
-
At the top of the screen, click Update fields.
- If a field is missing in the dataset but present in the data source, you will see that field in the list.
- If a field is present in both the dataset and the data source, the field will be updated, e.g., its type will change.
- If a field is present in the dataset but missing in the data source, that field will be marked in red. Delete it from the dataset or replace it with a different field from the data source.
-
In the top-right corner, click Save.
Replacing fields
If a field has been removed from a data source, it will not be updated when a dataset is updated. Charts featuring this field will show the ERR.DS_API.FORMULA.UNKNOWN_SOURCE_COLUMN
error. You can delete this field from the chart or replace its source in the dataset.
To replace the field source in the dataset:
-
In the left-hand panel, click
Datasets and select the dataset you need. -
In the top-left corner, select the Fields tab.
-
In the dataset, select the field you want to replace.
-
Go to the Field source column.
-
In the field settings, select the field you wish to replace the deleted field with from the Field from source list.
-
Click Save.
Adding parameters
You can use parameters in calculated dataset fields.
Note
Parameters added at the dataset level are available in all charts created based on this dataset.
-
In the dataset, go to the Parameters tab.
-
Click Add.
-
In the Add parameter window, specify:
- Name: Sets the parameter name.
- Type: Parameter data type.
- Default value: This is a required value. It is used when no parameter value is set in the dashboard, chart URL, or chart settings.
-
Click Add.
You can also create parameters at the chart level.
Adding filters
At the dataset level, you can add default filters that will automatically apply to any new chart created from the current dataset.
- In the dataset, go to the Filtering tab.
- Click Add.
- Select the field to apply the filter to.
- Set a filtering rule.
- In the bottom-right corner, click Apply filter.
Note
Default filters are not applied to data in the dataset preview area.
Configuring access
The way you configure access to a dataset varies depending on whether it is located in a workbook or a folder.
Before assigning access permissions to a user, add this user to DataLens.
A dataset inherits all access permissions from a workbook. Assign the user permissions for the appropriate workbook and they will apply to your dataset.
-
On the dataset page, click
in the top menu and select Permissions. -
In the window that opens, click Dataset access.
-
Enter a username or select the All group.
Note
Assigning permissions to user groups is supported for workbooks and collections.
-
(Optional) Write a comment for the user you are granting permissions to.
-
Specify the appropriate permissions and click Add.
Note
You can assign roles for linked objects. For example, if you grant access to a chart, you can also grant access to a dataset and connection.
For more information on object access, see Managing access to DataLens.