Importing from Cloud Storage
Find here all information and how-tos for importing data from a cloud storage provider.
Importing from Azure Data Lake Gen2#
To import data from Azure Data Lake Storage Gen2:
- Click the "+" button and choose "Import Job or right-click in the File Browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click on the connection for Azure Data Lake Gen2 and confirm with "Select". The connection is displayed.
- Select the required file type from the drop-down "File Type" and confirm with "Next".
- Enter the file or folder name as it is named in your storage.
- Define the delimiter character, the schema, and the column names.
INFO: The default delimiter is the comma (',') character.
- Select the schema of the imported data.
- If needed, uncheck the checkbox if the first row does not contain the column names.
INFO: The checkbox is marked by default, meaning the column names are taken from the first row.
- If you want to filter by date and time, select the filter method from the drop-down (see the date-window sketch after this procedure).
INFO: For the filter mode 'Fixed dates', select the start date and end date from the calendar.
INFO: For the filter mode 'Dynamic dates', enter a 'das' expression as the start and end expression, e.g. 'TODAY()-4d'.
- If needed, exclude data by file modification date by entering the number of days.
- If needed, modify the advanced settings, e.g. the character encoding, and confirm with "Next". The tab 'Data Fields' opens.
- Confirm with "Next". The tab 'Define Fields' opens.
- Mark all required columns.
- If needed, enter placeholder values and confirm with "Apply".
- Decide how to handle invalid data.
- Decide whether you want to partition the data and confirm with "Next". The tab 'Schedule' opens.
INFO: If you have checked 'Partition Data', enter a date expression and select the date format from the drop-down.
- Decide whether the import shall be triggered manually or on a schedule.
- Select the option for data retention.
- If needed, enter the number of sample records and the maximum number of errors to log, and confirm with "Next". The tab 'Save' opens.
INFO: Higher values lead to more precise preview results but can significantly decrease performance.
- If needed, enter an import job description.
- Unmark the checkbox if the import shall not start immediately after saving.
INFO: The checkbox is marked by default so that the import starts right after saving the import job.
- If needed, enter an email address for the notifications and confirm with "Next". The 'Save Import Job' dialog opens.
- Select the path the data shall be imported to, enter a name, and confirm with "Save". The data import from Azure Data Lake Storage Gen2 is finished.
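The dynamic date filter above accepts 'das' expressions such as 'TODAY()-4d'. The following Python sketch only illustrates the date window that such a pair of start and end expressions describes; it is not the product's 'das' evaluator, and the helper name is made up for this example:

```python
from datetime import date, timedelta

def dynamic_window(days_back: int) -> tuple[date, date]:
    """Illustrative only: the window a start expression like
    'TODAY()-4d' combined with an end expression of 'TODAY()'
    would select. Spectrum evaluates the real 'das' expressions
    itself; this merely mirrors the date arithmetic for clarity."""
    today = date.today()
    return today - timedelta(days=days_back), today

start, end = dynamic_window(4)  # corresponds to 'TODAY()-4d' .. 'TODAY()'
print(f"Import files dated between {start} and {end}")
```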
Importing from Google Cloud Storage#
INFO: Find here all information about importing data from Google Cloud Storage.
Requirements: Configuring Google Cloud Storage as a Connection#
A Google Cloud Storage connection with Spectrum must be created before importing data.
INFO: Note that a Google Cloud Storage bucket must not contain a "_" (underscore) in its name.
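This naming rule is easy to verify before creating the connection. The following Python sketch is purely illustrative; the helper function is hypothetical and not part of Spectrum:

```python
def is_valid_spectrum_bucket_name(name: str) -> bool:
    """Hypothetical check: a Google Cloud Storage bucket used with
    Spectrum must not contain an underscore in its name."""
    return "_" not in name

print(is_valid_spectrum_bucket_name("my-import-bucket"))  # True
print(is_valid_spectrum_bucket_name("my_import_bucket"))  # False
```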
Configuring Import Jobs with Google Cloud Storage#
To import data from Google Cloud Storage:
- Click the "+" button and choose "Import Job" or right-click in the File Browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click on the connection for Google Cloud Storage and confirm with "Select". The connection is displayed.
- Select the required file type from the drop-down 'File Type' and confirm with "Next".
- Click "Browse" to select the folder/ file from the Google Cloud Storage. The 'Remote Data Browser' opens.
- Select the required folder or file and confirm with "Select". The data name is displayed in the 'File or Folder' field.
TIP: You can scroll the results or filter them.
- Define the delimiter character, the schema, and the column names.
INFO: The default delimiter is ','.
- If needed, choose a filter from the drop-down to apply a time range to the time patterns in the file name path.
- If needed, set a file filter to exclude specific files based on their age (see the age-filter sketch after this procedure).
- If needed, modify the advanced settings, e.g. the character encoding, and confirm with "Next". The tab 'Data Fields' opens.
- Mark all required columns.
- If needed, view the raw records.
- If needed, enter placeholder values and confirm with "Apply".
- Decide how to handle invalid data.
- Decide whether you want to partition the data.
- Confirm with "Next". The tab 'Schedule' opens.
- Decide whether the import is triggered manually or on a schedule.
- Select how the data is replaced or appended and confirm with "Next". The tab 'Save' opens.
- If needed, enter an import job description.
- Mark the checkbox if the import shall start immediately after saving.
- If needed, enter an email address for the notifications and confirm with "Next". The 'Save Import Job' dialog opens.
- Select the path the data shall be imported to, enter a name, and confirm with "Save". The data import from Google Cloud Storage is finished.
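The age-based file filter above keeps only files modified within a given number of days. The Python sketch below illustrates the idea on a local folder; Spectrum applies the equivalent filter to objects in Google Cloud Storage, and the function name and path here are hypothetical:

```python
import os
import time

def files_modified_within(folder: str, max_age_days: int) -> list[str]:
    """Illustration of an age-based file filter: keep only files
    whose modification time falls within the last `max_age_days`
    days. Spectrum applies the equivalent filter remotely."""
    cutoff = time.time() - max_age_days * 86400
    return [
        entry.path
        for entry in os.scandir(folder)
        if entry.is_file() and entry.stat().st_mtime >= cutoff
    ]

# Placeholder folder: only consider files changed in the last 7 days.
print(files_modified_within("/tmp/import-staging", 7))
```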
Importing from Snowflake#
To import from Snowflake:
- Click the "+" button and choose "Import Job" or right-click in the File Browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click on the connection for Snowflake and confirm with "Select". The connection is displayed.
- The Snowflake database name is the one set in the connector.
INFO: If not previously set in the connector, enter the Snowflake warehouse name.
- Select whether to import from a table or a view and select the schema to be used (a conceptual sketch follows this procedure). A preview of the imported data is displayed.
- Review the schema and click "Next".
- Review the schedule, data retention, and advanced properties for the job.
- Add a description and check the box if you would like the import to start immediately after saving. Click "Save", and name the file.
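Conceptually, the import job reads the selected table or view through the configured warehouse, database, and schema. The sketch below expresses that idea with the public snowflake-connector-python package and placeholder credentials; it is not how Spectrum performs the import internally:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder connection values; substitute your own account details.
conn = snowflake.connector.connect(
    user="IMPORT_USER",
    password="********",
    account="my_account",      # e.g. xy12345.eu-central-1
    warehouse="MY_WAREHOUSE",  # the warehouse name from the connector
    database="MY_DATABASE",    # the database set in the connector
    schema="PUBLIC",           # the schema selected in the import job
)

try:
    cur = conn.cursor()
    # Read from a table or view, as selected in the import job.
    cur.execute("SELECT * FROM MY_TABLE LIMIT 10")
    for row in cur.fetchall():
        print(row)
finally:
    conn.close()
```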