Workbench
Data Browser#
Data Browser Overview#
The Data Browser lists all of your Snowflake Schemas and datasets.
You can show or hide the Data Browser by clicking "<<".
The Data Browser provides the following information:
- an 'Upload File' button to upload data from your device
- all of your Snowflake Warehouses, ordered alphabetically
- all of your Snowflake Schemas and datasets
- a filter option to browse more specifically through your data
Entries in the Data Browser are ordered as follows:
- Snowflake Warehouses are sorted alphabetically
- Snowflake Schemas are sorted alphabetically
- tables are sorted alphabetically
- columns are sorted in the order defined in Snowflake's Schema
Browsing Datasets#
To find a dataset, enter text in the search bar. You can filter by databases, schemas and tables. Filtering by column is not possible.
Matching entries are listed below the search bar.
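The filtering rule above can be illustrated with a small sketch. It is not Datameer's implementation: the `Table` class, the `filter_tables` function, the sample catalog, and the case-insensitive substring matching are all assumptions made for illustration. It only shows that database, schema and table names are searched while column names are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """Hypothetical catalog entry; not a Datameer or Snowflake type."""
    database: str
    schema: str
    name: str
    columns: tuple = ()


def filter_tables(tables, text):
    """Return tables whose database, schema or table name contains the text.

    Column names are deliberately not searched.
    Assumption: matching is a case-insensitive substring match.
    """
    needle = text.strip().lower()
    if not needle:
        return list(tables)
    return [
        t for t in tables
        if any(needle in part.lower() for part in (t.database, t.schema, t.name))
    ]


# Illustrative sample data only.
catalog = [
    Table("SALES_DB", "PUBLIC", "ORDERS", ("ORDERKEY", "ORDERPRICE")),
    Table("SALES_DB", "PUBLIC", "CUSTOMERS", ("CUSTKEY", "NAME")),
    Table("HR_DB", "STAFF", "EMPLOYEES", ("ID", "ORDERKEY")),
]

matches = filter_tables(catalog, "order")
print([t.name for t in matches])
```

In this sketch, `EMPLOYEES` is not returned for the search text "order" even though it has a column named `ORDERKEY`, because only the database, schema and table names are compared.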
More Filtering Options#
You can filter the entries by the following criteria:
- "All Datasets": all available datasets are displayed
- "Added Datasets": shows the datasets that have been added to and are in use in the current Project
Flow Area#
The Flow Area visually represents your entire transformation pipeline, including data sources, transformations and deployed datasets.
The toolbar above the Flow Area provides the main controls: it allows you to open the SQL Editor, build new transformations and refresh the Snowflake sources in use.
Data Source Node Options#
Starting from a data source node you can:
- remove the node from the Project (note: this does not remove it from your Snowflake instance)
- open the node in the Notebook
Transformation Node Options#
For a transformation node you can:
- edit the node
- deploy or re-deploy the node to Snowflake
- exchange the transformation source
- delete the node from the Project
- open the node in the Notebook
- cache the dataset
Deployed Nodes Options#
For a deployed node you can:
- deploy or re-deploy the node to Snowflake
- materialize a table
- show the Execution History
- show the Deployment History
- remove the node from the Project (note: this does not remove it from your Snowflake instance)
Scheduled Email Node Options#
For a scheduled email node you can:
- modify the scheduling
- pause and resume the scheduling
- delete the scheduling
Grouping Node Options#
Once two or more nodes are grouped, you can do the following from the context menu:
- ungroup the grouped nodes
- expand the group and view all grouped nodes
Flow Area Visibility#
You can customize the display of your Flow Area by clicking on the different icons:
- hide unconnected nodes, so that only nodes used in the pipeline are displayed
- make the nodes fit to the size of the Flow Area
- zoom in and zoom out
Data Grid#
Data Grid Overview#
The Data Grid lets you preview data, explore the columns and metadata of a dataset, and use several exploration features. You can also share your data via email, export it to a Google Sheet or download it as a CSV file.
To preview the data and get information about its rows and columns, click on "Preview".
To explore the columns of a dataset, click on "Columns". This view presents the column names (table schema), their data types and, optionally, metadata.
To view the profile information such as column metrics, click on "Profile".
To view and copy the SQL query for views, click on "SQL".
To keep the most important data in view, you can limit the number of displayed columns. To do so, click on the 'Columns' drop-down and select the columns you need. Hidden columns can be shown again later from the same drop-down.
Column Metrics#
You can investigate column metrics in the 'Profile' tab for any data type. Statistics can be shown for each column.
To perform calculations, switch to the "Profile" tab in the Data Grid. Click on "Calculate" in the required columns. The metrics are displayed for the respective column.
To calculate all columns at once, click on "Calculate".
You can also display only the calculated columns. To do so, select the checkbox "See Only Calculated". Only the columns with calculated metrics are then shown.
To enlarge the metrics view, click on the 'Maximize' button. Click the button again to return to the minimized display.
As a calculation result, each metric shows:
- Data type and column name
- a bar that visualizes the ratio between available values and missing entries in the column, e.g. if you have 500 values and 500 missing entries, the bar is half green and half red
- total number of values and total number of empty entries
- total number of unique values: if 5,000 values are calculated in the preview, the count can be at most 5,000. If there are only 5 different values among those 5,000, 'Unique' shows '5'
- values for minimum, average and maximum
- a visualization, depending on the data type
Clicking the 'Minimize' button reduces the display to the number of unique and null values as well as the minimum, average and maximum value.
To refresh the calculation, click on the 'Refresh' button next to the graphics.
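As a rough illustration of what these metrics mean, the following sketch computes them for a single numeric column in plain Python. It is not how Datameer or Snowflake calculates the profile. The function name `profile_column`, the key names, and the use of `None` for an empty entry are assumptions made for this example.

```python
from statistics import mean


def profile_column(values):
    """Compute summary metrics for one numeric column.

    Assumption: `None` stands for an empty/null entry.
    """
    present = [v for v in values if v is not None]
    total = len(values)
    return {
        "values": len(present),            # number of available values
        "empty": total - len(present),     # number of empty/null entries
        "unique": len(set(present)),       # number of distinct values
        # Share of the bar that would be drawn green (available values).
        "filled_ratio": len(present) / total if total else 0.0,
        "min": min(present) if present else None,
        "avg": mean(present) if present else None,
        "max": max(present) if present else None,
    }


# 500 available values (only 5 distinct ones) and 500 missing entries.
column = [1, 2, 3, 4, 5] * 100 + [None] * 500
result = profile_column(column)
print(result)
```

With this sample input, half of the entries are missing, so `filled_ratio` is 0.5 (a bar that is half green and half red), and 'unique' is 5 because only five different values occur among the 500 available ones.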
Examples
Datatype: INTEGER
In this sampled dataset, the column "0_ORDERKEY" contains 150,000,000 values in total, 0 empty/null values and 150,000,000 unique values. The minimum value is 1, the average is 300,000 and the maximum is 600,000,000. Because all values are unique, no graphic is presented in this case.
Datatype: DECIMAL
In this sampled dataset, the column "0_ORDERPRICE" contains 150,000,000 values in total, 0 empty/null values and 34,700,489 unique values. The minimum value is 811.7, the average is 151K (rounded) and the maximum is 591K (rounded). The graphic shows the number of values that fall on a specific price. The dashed line marks the average.
Datatype: STRING
In this sampled dataset, the column "0_ORDERPRIORITY" contains 150,000,000 values in total, 5 unique values and 0 empty/null values. Whitespace is 0, and the STRING length has a minimum of 5, a maximum of 15 and an average of 8. The graphic shows the 5 unique STRING values and how often each is used in this dataset.
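The STRING metrics can be sketched in the same way. This is an illustration only, not Datameer's implementation. The function name `profile_strings`, the sample values, and the interpretation of "whitespace" as the number of values that consist only of whitespace characters are assumptions.

```python
from collections import Counter


def profile_strings(values):
    """Compute summary metrics for one STRING column.

    Assumption: `None` stands for an empty/null entry.
    """
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]
    return {
        "values": len(present),
        "empty": len(values) - len(present),
        "unique": len(set(present)),
        # Assumption: counts values made up only of whitespace characters.
        "whitespace": sum(1 for v in present if v.isspace()),
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "avg_length": sum(lengths) / len(lengths) if lengths else None,
        # How often each distinct value occurs (the basis for a bar chart).
        "counts": Counter(present),
    }


# Hypothetical sample values, chosen for illustration only.
priorities = ["1-URGENT", "2-HIGH", "3-MEDIUM", "4-NOT SPECIFIED", "5-LOW"]
result = profile_strings(priorities * 2 + [None])
print(result["min_length"], result["max_length"], result["avg_length"])
```

The `counts` entry holds the frequency of each distinct value, which corresponds to the kind of per-value usage graphic described for the STRING example.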
SQL Syntax#
The SQL Editor highlights the SQL syntax so that the individual syntax elements can be identified.
Inspector#
The Inspector contains all information about the currently selected dataset, table, warehouse or Project.
Project 'Info' Tab#
The Project's 'Info' tab provides the following information:
- Project description: view and add a description
- Project tags: view and add tags
- date and time when the Project was created
- date and time when the Project was last modified
Project 'Settings' Tab#
The Project's 'Settings' tab provides the following information:
- collaborators
- option to clone the Project
- option to delete the Project
Source Node 'Info' Tab#
The source data 'Info' tab provides the following information:
- Snowflake source as fully qualified name
- status: view and add a status
- description: view and add a description
- tags: view and add tags
- date and time when the source asset has been created
- date and time when the source asset has been modified
- 'created in' information
- the number of Projects that use the source and the associated Project list
Source Node 'Settings' Tab#
Each source node 'Settings' tab provides the following options:
- add the node to a Project
- delete the node (from Snowflake and Datameer)
Transformation Node 'Info' Tab#
Each transformation node 'Info' tab provides the following information:
- description: view and add a description
- transformation node owner
- tags: view and add tags
- date and time when the transformation node has been created
- date and time when the transformation node has been modified
Datameer's AI features enable you to generate a description simply by clicking on the "Auto-generate Description" icon next to the 'Description' section.
Transformation Node 'Transform' Tab#
Each transformation node 'Transform' tab provides the following information:
- the associated source node
- a button to exchange the source node
- the recipe list with all transformations
- a button to add a transformation to the recipe
Deployed Node 'Info' Tab#
Each deployed node 'Info' tab provides:
- Snowflake target with link
- status information and status editing option
- description and description edit option
- tags and tag edit option
- set property option
- metadata: creation date and time, last modified date and time, table/view, number of Projects in which the node is used
Deployed Node 'Deployment' Tab for Views#
Each deployed view node 'Deployment' tab provides:
- a Deployment History button
Deployed Node 'Deployment' Tab for Tables#
Each deployed table node 'Deployment' tab provides:
- table materialization option
- Deployment History button
- Execution History button
- date and time information for the last run (with status) and the next run
- execution trigger configuration
Deployed Node 'Settings' Tab#
Each deployed node 'Settings' tab provides the following options:
- add the node to a Project
- delete the node (from Snowflake and Datameer)