Importing from Cloud Storage
Find here all information and how-tos for importing data from a cloud storage provider.
Importing from Azure Data Lake Gen2#
To import data from Azure Data Lake Storage Gen2:
- Click the "+" button and choose "Import Job or right-click in the File Browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click on the connection for Azure Data Lake Gen2 and confirm with "Select". The connection is displayed.
- Select the required file type from the drop-down "File Type" and confirm with "Next".
- Enter the file or folder name as it is named in your storage.
- Define the delimiter character, the schema, and the column names.
INFO: The default delimiter is the comma (',') character.
- Select the schema of the imported data.
- If needed, uncheck the checkbox if the first row does not contain the column names.
INFO: The checkbox is marked by default, meaning the column names are taken from the first row.
- If you want to filter by date and time, select the filter method from the drop-down (see the date-window sketch after this procedure).
INFO: For the filter mode 'Fixed dates', select the start date and end date from the calendar.
INFO: For the filter mode 'Dynamic dates', enter a 'das' expression as the start and end expression, e.g. 'TODAY()-4d'.
- If needed, exclude data by file modification date by entering the number of days.
- If needed, modify the advanced settings, e.g. the character encoding, and confirm with "Next". The tab 'Data Fields' opens.
- Confirm with "Next". The tab 'Define Fields' opens.
- Mark all required columns.
- If needed, enter placeholder values and confirm with "Apply".
- Decide how to handle invalid data.
- Decide whether you want to partition the data and confirm with "Next". The tab 'Schedule' opens.
INFO: If you have checked 'Partition Data', enter a date expression and select the date format from the drop-down.
- Decide whether the import shall be triggered manually or on a schedule.
- Select the option for data retention.
- If needed, enter the number of sample records and the maximum number of errors to log, and confirm with "Next". The tab 'Save' opens.
INFO: Higher values lead to more precise preview results but can significantly decrease performance.
- If needed, enter an import job description.
- Unmark the checkbox if the import shall not start immediately after saving.
INFO: The checkbox is marked by default so that the import starts right after saving the import job.
- If needed, enter an email address for the notifications and confirm with "Next". The 'Save Import Job' dialog opens.
- Select the path the data shall be imported to, enter a name, and confirm with "Save". The data import from Azure Data Lake Storage Gen2 is finished.
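The dynamic date filter above accepts 'das' expressions such as 'TODAY()-4d'. The following Python sketch only illustrates the date window that such a pair of start and end expressions describes; it is not the product's 'das' evaluator, and the helper name is made up for this example:

```python
from datetime import date, timedelta

def dynamic_window(days_back: int) -> tuple[date, date]:
    """Illustrative only: the window a start expression like
    'TODAY()-4d' combined with an end expression of 'TODAY()'
    would select. Spectrum evaluates the real 'das' expressions
    itself; this merely mirrors the date arithmetic for clarity."""
    today = date.today()
    return today - timedelta(days=days_back), today

start, end = dynamic_window(4)  # corresponds to 'TODAY()-4d' .. 'TODAY()'
print(f"Import files dated between {start} and {end}")
```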
Importing from Google Cloud Storage#
INFO: Find here all information about importing data from Google Cloud Storage.
Requirements: Configuring Google Cloud Storage as a Connection#
A Google Cloud Storage connection with Spectrum must be created before importing data.
INFO: Note that a Google Cloud Storage bucket must not contain a "_" (underscore) in its name.
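This naming rule is easy to verify before creating the connection. The following Python sketch is purely illustrative; the helper function is hypothetical and not part of Spectrum:

```python
def is_valid_spectrum_bucket_name(name: str) -> bool:
    """Hypothetical check: a Google Cloud Storage bucket used with
    Spectrum must not contain an underscore in its name."""
    return "_" not in name

print(is_valid_spectrum_bucket_name("my-import-bucket"))  # True
print(is_valid_spectrum_bucket_name("my_import_bucket"))  # False
```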
Configuring Import Jobs with Google Cloud Storage#
To import data from Google Cloud Storage:
- Click the "+" button and choose "Import Job" or right-click in the File Browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click on the connection for Google Cloud Storage and confirm with "Select". The connection is displayed.
- Select the required file type from the drop-down 'File Type' and confirm with "Next".
- Click "Browse" to select the folder/ file from the Google Cloud Storage. The 'Remote Data Browser' opens.
- Select the required folder or file and confirm with "Select". The data name is displayed in the 'File or Folder' field.
TIP: You can scroll the results or filter them.
- Define the delimiter character, the schema, and the column names.
INFO: The default delimiter is ','.
- If needed, choose a filter from the drop-down to apply a time range to the time patterns in the file name path.
- If needed, set a file filter to exclude specific files based on their age (see the age-filter sketch after this procedure).
- If needed, modify the advanced settings, e.g. the character encoding, and confirm with "Next". The tab 'Data Fields' opens.
- Mark all required columns.
- If needed, view the raw records.
- If needed, enter placeholder values and confirm with "Apply".
- Decide how to handle invalid data.
- Decide whether you want to partition the data.
- Confirm with "Next". The tab 'Schedule' opens.
- Decide whether the import is triggered manually or on a schedule.
- Select how the data is replaced or appended and confirm with "Next". The tab 'Save' opens.
- If needed, enter an import job description.
- Mark the checkbox if the import shall start immediately after saving.
- If needed, enter an email address for the notifications and confirm with "Next". The 'Save Import Job' dialog opens.
- Select the path the data shall be imported to, enter a name, and confirm with "Save". The data import from Google Cloud Storage is finished.
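The age-based file filter above keeps only files modified within a given number of days. The Python sketch below illustrates the idea on a local folder; Spectrum applies the equivalent filter to objects in Google Cloud Storage, and the function name and path here are hypothetical:

```python
import os
import time

def files_modified_within(folder: str, max_age_days: int) -> list[str]:
    """Illustration of an age-based file filter: keep only files
    whose modification time falls within the last `max_age_days`
    days. Spectrum applies the equivalent filter remotely."""
    cutoff = time.time() - max_age_days * 86400
    return [
        entry.path
        for entry in os.scandir(folder)
        if entry.is_file() and entry.stat().st_mtime >= cutoff
    ]

# Placeholder folder: only consider files changed in the last 7 days.
print(files_modified_within("/tmp/import-staging", 7))
```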
Importing from Snowflake#
To import from Snowflake:
- Click the "+" button and choose "Import Job" or right-click in the File Browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click on the connection for Snowflake and confirm with "Select". The connection is displayed.
- The Snowflake database name is the one set in the connector.
INFO: If not previously set in the connector, enter the Snowflake warehouse name.
- Select whether to import from a table or a view and select the schema to be used (a conceptual sketch follows this procedure). A preview of the imported data is displayed.
- Review the schema and click "Next".
- Review the schedule, data retention, and advanced properties for the job.
- Add a description and check the box if you would like the import to start immediately after saving. Click "Save", and name the file.
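Conceptually, the import job reads the selected table or view through the configured warehouse, database, and schema. The sketch below expresses that idea with the public snowflake-connector-python package and placeholder credentials; it is not how Spectrum performs the import internally:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder connection values; substitute your own account details.
conn = snowflake.connector.connect(
    user="IMPORT_USER",
    password="********",
    account="my_account",      # e.g. xy12345.eu-central-1
    warehouse="MY_WAREHOUSE",  # the warehouse name from the connector
    database="MY_DATABASE",    # the database set in the connector
    schema="PUBLIC",           # the schema selected in the import job
)

try:
    cur = conn.cursor()
    # Read from a table or view, as selected in the import job.
    cur.execute("SELECT * FROM MY_TABLE LIMIT 10")
    for row in cur.fetchall():
        print(row)
finally:
    conn.close()
```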