The simplest and fastest way to connect to and interact with a Snowflake database is a Domino Data Source. Data Sources simplify database setup via configuration records and are optimized for read-heavy, SQL-based usage.
-
From inside your project, click on Data in the Project menu bar. This will open the Data window.
-
On the Data Sources page, click the + Add a Data Source button.
-
In the popup window, select Snowflake from the list.
-
On the Configure window, enter your account and connection credentials, add an optional description, then click Next.
-
On the next Authenticate window, enter your Snowflake username and password, then click Test Credentials.
-
Domino will attempt to connect to the specified database using the credentials you entered. If all goes well, you will see a message confirming that the connection was successful. Click Next.
-
On the next Permissions screen, click Finish Setup.
You are now ready to use the Data Source.
-
Start a Domino Workspace. As the workspace starts up, you will notice a red dot next to Data in the Project menu. Once the Workspace is loaded, click Data.
-
The left pane of the screen will expand to show the Data Source. Click the copy icon to copy the connection code associated with the Snowflake Data Source.
-
Next, on the Launcher page, click on the Python shortcut in the Notebook section to open a Python Notebook.
-
Use Command + V (on Mac) or Ctrl + V (on Windows and Linux) to paste the copied code into the first cell of the notebook. The Data Source functionality in Domino enables users to connect to the database using these templated commands. It will look something like this:
from domino.data_sources import DataSourceClient

# Instantiate a client and fetch the DataSource instance
ds = DataSourceClient().get_datasource("weather_db")

# Create a simple wrapper from the query result
res = ds.query("select * from {{table_reference}}")

# Load the result into a Pandas dataframe
df = res.to_pandas()
-
Modify the third statement (the query) so that it matches one of your database tables:
res = ds.query("select * from state_province")
This will load the list of states from the database table into a result object, which is then converted into a Pandas dataframe with the following code:
df = res.to_pandas()
-
Add the following code to display the first few rows of the table:
df.head()
-
Run the cell to get the following results:
  STATE_CODE            NAME
0         AK          ALASKA
1         AL         ALABAMA
2         AR        ARKANSAS
3         AS  AMERICAN SAMOA
4         AZ         ARIZONA
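Once loaded, the result is an ordinary pandas dataframe, so all the usual pandas operations apply. As a quick illustration, the following sketch uses a stand-in dataframe (copied from the `df.head()` output above, since the full query result depends on your database) to look up a single state:

```python
import pandas as pd

# Stand-in for the dataframe returned by res.to_pandas() above;
# rows copied from the df.head() output shown in this tutorial.
df = pd.DataFrame({
    "STATE_CODE": ["AK", "AL", "AR", "AS", "AZ"],
    "NAME": ["ALASKA", "ALABAMA", "ARKANSAS", "AMERICAN SAMOA", "ARIZONA"],
})

# Ordinary pandas filtering works on the query result as-is
arkansas = df[df["NAME"] == "ARKANSAS"]
```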
-
As another example, run the following code, with a different query, to count how many weather stations are located in each country (the country code is the first two characters of the station identifier):
sfQuery = """
SELECT SUBSTR(STATION_ID, 1, 2) AS COUNTRY_CODE,
       COUNT(STATION_ID) AS STATION_COUNT
FROM WEATHER_STATION
GROUP BY SUBSTR(STATION_ID, 1, 2)
ORDER BY STATION_COUNT DESC;
"""
res = ds.query(sfQuery)
df = res.to_pandas()
df
The following results are returned:
  COUNTRY_CODE  STATION_COUNT
0           GM           1123
1           NL            386
2           SP            207
3           UK            158
4           FR            111
5           IT            104
6           PO             26
7           AU             12
8           SZ             10
9           BE              1
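The same aggregation can also be reproduced client-side with pandas, which is a useful cross-check when experimenting with a query. This sketch uses a handful of made-up station identifiers (the real ones live in the WEATHER_STATION table) to show that grouping on the first two characters yields the same kind of COUNTRY_CODE / STATION_COUNT table:

```python
import pandas as pd

# Hypothetical station identifiers; in the real table these come
# from the WEATHER_STATION.STATION_ID column.
stations = pd.DataFrame(
    {"STATION_ID": ["GM0001", "GM0002", "NL0001", "SP0001", "GM0003"]}
)

counts = (
    stations["STATION_ID"].str[:2]   # country code = first two characters
    .value_counts()                  # count stations per code, descending
    .rename_axis("COUNTRY_CODE")
    .reset_index(name="STATION_COUNT")
)
```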
In summary, the Data Source usage pattern consists of the following steps:
-
Define the query in SQL.
-
Use the Data Source to send the query and collect the results.
-
Convert the results into a Pandas dataframe.
-
Do something productive with the dataframe.
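Putting the four steps together, the "do something productive" part might look like this. The dataframe below is a stand-in with the values copied from the station-count results above (steps 1 through 3 would normally produce it via `ds.query(...).to_pandas()`), and the example computes each country's share of all stations:

```python
import pandas as pd

# Stand-in for the dataframe produced in steps 1-3 via
# ds.query(...).to_pandas(); values copied from the results above.
df = pd.DataFrame({
    "COUNTRY_CODE": ["GM", "NL", "SP", "UK", "FR", "IT", "PO", "AU", "SZ", "BE"],
    "STATION_COUNT": [1123, 386, 207, 158, 111, 104, 26, 12, 10, 1],
})

# Step 4: do something productive with the dataframe --
# e.g. compute each country's share of all stations, in percent.
df["SHARE_PCT"] = (100 * df["STATION_COUNT"] / df["STATION_COUNT"].sum()).round(1)
```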
Next, use Snowflake’s Python driver for read and write operations to create a predictive weather model.