This tutorial gives you hands-on experience with the Snowflake database and Domino. You will follow a basic data collection, engineering, and loading workflow, and then create a model in Python that uses the data in Snowflake.
Domino offers various methods to connect to Snowflake:
- Snowflake SnowSQL.
- Domino Data Source using Snowflake.
- Snowflake Connector for Python.
- Snowflake Snowpark.
In this Get Started series, you’ll learn how to work with Domino Data Stores to crush big data with the following workflow:
- Preliminaries (data engineering):
  - Find data.
  - Understand the data.
  - Get the data.
  - Wrangle data into a format usable for analysis.
- Analysis:
  - Look at the data, normally using a subset of the complete dataset.
  - Clean the data: deal with missing and errant values.
  - Identify the features (variables) that you believe matter for your prediction to work.
- Model development:
  - Try out several algorithms to determine which one produces the best results.
  - Save the training function.
- Model training:
  - Run the model training function on the complete dataset.
  - Collect the model.
  - Test again.
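The analysis, model development, and model training steps above can be sketched in plain Python. This is only an illustrative outline, not the code used later in the tutorial: the data is made up, and the helper names (`clean`, `fit_mean`, `fit_line`, `train`) are hypothetical. It cleans the rows, compares two candidate algorithms on a held-out split, and then refits the winner on all rows.

```python
import random


def clean(rows):
    """Drop rows with a missing feature or target value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_mean(rows):
    """Baseline candidate: always predict the mean of the target."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def fit_line(rows):
    """Second candidate: least-squares fit of y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return lambda x: (mean_y - slope * mean_x) + slope * x


def mean_squared_error(model, rows):
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def train(rows, candidates, seed=0):
    """Training function: pick the best candidate on a held-out split,
    then refit that candidate on every cleaned row."""
    rows = clean(rows)
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    split = int(len(shuffled) * 0.75)
    train_rows, test_rows = shuffled[:split], shuffled[split:]
    scores = {
        name: mean_squared_error(fit(train_rows), test_rows)
        for name, fit in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, candidates[best](rows)


# Made-up data standing in for a table loaded from Snowflake.
rng = random.Random(0)
full = [(float(x), 2.0 * x + 1.0 + rng.uniform(-0.5, 0.5)) for x in range(200)]
full[3] = (None, 7.0)    # missing feature
full[10] = (10.0, None)  # missing target

CANDIDATES = {"mean": fit_mean, "line": fit_line}

# Model development: explore on a subset first.
name_on_subset, _ = train(full[:40], CANDIDATES)

# Model training: run the same training function on the complete dataset.
name_on_full, model = train(full, CANDIDATES)

# Test again on the cleaned data.
print(name_on_subset, name_on_full, mean_squared_error(model, clean(full)))
```

Keeping the whole procedure in one `train` function is what lets the same code run first on a subset and then, unchanged, on the complete dataset.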
Keep the following in mind:
- This tutorial is aimed at data science professionals familiar with JupyterLab, Jupyter Notebooks, and the Python language.
- The code is for illustration purposes. It is functional, tested, and offers a very basic view into the use of Domino with data in Snowflake.
- Domino offers multiple connectivity modes with Snowflake, primarily:
  - Domino Data Sources: meant for read-oriented exploration.
  - The Snowflake Python library: meant for full-featured database operations in Snowflake.
- Use Domino’s file sync functionality to save your progress to the project’s repository throughout the tutorial.
To follow this tutorial, you need:
- Familiarity with Domino Workspaces and Datasets.
- Access permissions (username, password, and authorization) to a Snowflake database.
- The name of your Snowflake warehouse, database, and schema.
- Domino permissions to set up a Snowflake Data Source (if applicable).
- Snowflake’s SnowSQL command line tool for the data engineering and loading sections of this tutorial.
- Familiarity with the SQL language and the Pandas library.
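Before starting, it can help to confirm that you have every connection detail listed above. The following is a minimal sketch, assuming the details are stored in environment variables; the variable names (`SNOWFLAKE_USER` and so on) are hypothetical, so adjust them to your setup. The account identifier is not in the list above but is included because `snowflake.connector.connect` from the Snowflake Connector for Python requires it.

```python
import os

# Hypothetical environment variable names; rename to match your environment.
REQUIRED_SETTINGS = {
    "account": "SNOWFLAKE_ACCOUNT",
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PASSWORD",
    "warehouse": "SNOWFLAKE_WAREHOUSE",
    "database": "SNOWFLAKE_DATABASE",
    "schema": "SNOWFLAKE_SCHEMA",
}


def load_snowflake_settings(env=None):
    """Collect connection settings and report any that are missing."""
    env = os.environ if env is None else env
    settings = {key: env.get(var, "") for key, var in REQUIRED_SETTINGS.items()}
    missing = [REQUIRED_SETTINGS[key] for key, value in settings.items() if not value]
    if missing:
        raise KeyError("Missing Snowflake settings: " + ", ".join(missing))
    return settings


def connect(settings):
    """Open a connection; needs the snowflake-connector-python package."""
    # Imported here so the settings check above works without the package.
    import snowflake.connector

    return snowflake.connector.connect(**settings)
```

Calling `connect(load_snowflake_settings())` then fails early with a list of the missing variables instead of a less specific connection error.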
The tutorial is designed to be followed in sequence:
- Data engineering: Prepare and load the data into Snowflake.
- Use Snowflake with a Domino Data Source: A simple connectivity example.
- Use Snowflake’s Python driver in Domino: Build a data update service with a Domino Job.
- Snowflake Snowpark: Create a model in Domino and set it up as a Snowflake user-defined function (Video).
