Connecting to Redshift from Domino




Overview

This article describes how to connect to AWS Redshift from Domino. In order to connect successfully, you must have network connectivity between Redshift and your Domino deployment.




Connecting to Redshift with Domino Data Sources

The easiest way to connect to a Redshift instance from Domino is to use a Domino Data Source.

Configuration

To create a Redshift Data Source select Redshift as the Data Source type from the New Data Source wizard and specify the relevant parameters as shown below.

create_data_source_redshift.png

Authentication

Specify the credentials that will be used to connect to Redshift.

Currently the only authentication mechanism supported is username and password. The credentials will be securely stored in the Domino secret store backed by HashiCorp Vault.

credentials_data_source_redshift.png

Retrieving data

Once a Redshift data source is configured, users, who have Domino permissions to use it and have specified their credentials, can use the Domino Data API to retrieve data through the connector.

For more information, please refer to the following documentation.




Connecting to Redshift with Python or R package

If you are not using a Domino Data Source, this part of the guide shows how to connect to Redshift and retrieve results using specific Python and R packages.

Credentials

Your database is likely secured with a username and password. We recommend storing these credentials as environment variables in your project so you can access them at runtime without needing to include them in your code.

Python

To establish a connection to Redshift with the psycopg2 library:

import psycopg2
import os

HOST = os.environ['REDSHIFT_HOST']
PORT = 5439 # redshift default
USER = os.environ['REDSHIFT_USER']
PASSWORD = os.environ['REDSHIFT_PASSWD']
DATABASE = 'mydatabase'

def db_connection():
    conn = psycopg2.connect(
        host=HOST,
        port=PORT,
        user=USER,
        password=PASSWORD,
        database=DATABASE,
    )
    return conn

example_query = "SELECT * FROM my_table LIMIT 5"

conn = db_connection()
try:
    cursor = conn.cursor()
    cursor.execute(example_query)
    results = cursor.fetchall() # careful, the results could be huge
    conn.commit()
    print results
finally:
    conn.close()

# using pandas
import pandas as pd
conn = db_connection()
try:
    df = pd.read_sql(example_query, conn)
    df.to_csv('results/outfile.csv', index=False)
finally:
    conn.close()

R

To establish a connection to Redshift with the RPostgreSQL library:

install.packages("RPostgreSQL")
library(RPostgreSQL)

redshift_host <- Sys.getenv("REDSHIFT_HOST")
redshift_port <- "5439"
redshift_user <- Sys.getenv("REDSHIFT_USER")
redshift_password <- Sys.getenv("REDSHIFT_PASSWORD")
redshift_db <- "mydatabase"

drv <- dbDriver("PostgreSQL")
conn <- dbConnect(
    drv,
    host=redshift_host,
    port=redshift_port,
    user=redshift_user,
    password=redshift_password,
    dbname=redshift_db)

tryCatch({
    example_query <- "SELECT * FROM my_table LIMIT 5"
    results <- dbGetQuery(conn, example_query)
    print(results)
}, finally = {
    dbDisconnect(conn)
})