AWS Redshift Connector
Run SQL queries on AWS Redshift and save results to your data store.
Why use this connector?
Query directly from Valohai: No need to export data manually. Write SQL, run execution, get CSV output.
Version your queries: Every query is saved with the execution. Reproduce results months later by checking which query ran when.
Feed downstream jobs: Query outputs get datum URLs. Use them as inputs in other executions or pipelines.
Requirements
Redshift cluster on your AWS account
Cluster security group allows connections from valohai-sg-workers
Authentication via IAM role or username/password
Authentication options
Option 1: IAM role (recommended)
If your Valohai workers run on AWS with IAM roles:
Attach the policy below to ValohaiWorkerRole (or your worker role):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "GetRedshiftCredentials",
"Effect": "Allow",
"Action": "redshift:GetClusterCredentials",
"Resource": "*"
}
]
}

Set environment variables:
RSCLUSTERIDENTIFIER: Redshift cluster identifier
RSDATABASE: Database name
RSHOST: Cluster endpoint (e.g., my-cluster.abc123.us-east-1.redshift.amazonaws.com)
RSREGION: AWS region (e.g., us-east-1)
RSIAM: Set to 1
RSPORT: (Optional) Default is 5439
Option 2: Username and password
If not using IAM roles:
Set environment variables:
RSCLUSTERIDENTIFIER: Redshift cluster identifier
RSDATABASE: Database name
RSHOST: Cluster endpoint
RSREGION: AWS region
RSIAM: Set to 0
RSUSER: Redshift username
RSPASSWORD: Redshift password (mark as secret)
RSPORT: (Optional) Default is 5439
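The connector's own code isn't shown on this page, but the two authentication modes can be sketched as a helper that assembles connection settings from these variables (build_connection_params is a hypothetical name; the real script may differ):

```python
import os


def build_connection_params(env):
    """Assemble Redshift connection settings from RS* variables.

    With RSIAM=1 the caller is expected to fetch temporary credentials
    at runtime via the redshift:GetClusterCredentials API; with RSIAM=0
    the static RSUSER/RSPASSWORD pair is used directly.
    """
    params = {
        "host": env["RSHOST"],
        "port": int(env.get("RSPORT", "5439")),  # default Redshift port
        "database": env["RSDATABASE"],
        "cluster": env["RSCLUSTERIDENTIFIER"],
        "region": env["RSREGION"],
    }
    if env.get("RSIAM", "0") == "1":
        params["auth"] = "iam"  # temporary credentials fetched at runtime
    else:
        params["auth"] = "password"
        params["user"] = env["RSUSER"]
        params["password"] = env["RSPASSWORD"]
    return params
```

In an execution you would pass os.environ as env; the sketch takes a plain dict so either mode is easy to exercise.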
Add environment variables
Environment variables can be added:
Project-wide: Project Settings → Environment Variables
Organization-wide: Admin users can create environment variable groups that can be passed to several projects.
Per-execution: Set when creating the execution
We recommend project or organization settings for credentials.
Run a query
Open your project
Click Create Execution
Expand valohai-ecosystem → Select redshift-query
Configure parameters:
query: Your SQL query
output-path: (Optional) Output filename; default is results.csv
datum-alias: (Optional) Alias for easy reference, e.g., latest-orders
Verify environment variables are set
Click Create Execution
Example query
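For instance, a query like the following (table and column names are illustrative, not from your schema):

```sql
SELECT order_id, customer_id, total
FROM orders
WHERE order_date >= '2024-01-01'
LIMIT 100;
```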
Results are saved as results.csv (or your custom output path) and uploaded to your data store.
Use query results
The output of the execution gets a datum URL. Reference it in other executions either by the URL directly or by its datum alias.
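For example, assuming a datum alias of latest-orders, a downstream step in valohai.yaml could take the query result as an input (a sketch; step name, image, and command are placeholders):

```yaml
- step:
    name: analyze-orders
    image: python:3.11
    command: python analyze.py
    inputs:
      - name: orders
        default: datum://latest-orders
```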
Or use it in a pipeline by passing the execution output to the next node.
Troubleshooting
Connection refused
Check:
Redshift cluster security group allows connections from valohai-sg-workers
RSHOST includes the full cluster endpoint (not just the identifier)
RSPORT is correct (default: 5439)
Authentication fails
If using IAM (RSIAM=1):
Verify ValohaiWorkerRole has the redshift:GetClusterCredentials permission
Check that the worker role is properly attached to your workers
If using username/password (RSIAM=0):
Verify RSUSER and RSPASSWORD are correct
Ensure the password is marked as a secret in Valohai
Query returns no results
Redshift queries run successfully even if they return zero rows. Check your WHERE clauses and table names.
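Because an empty result set still produces a valid CSV containing only a header row, a downstream step can guard against it explicitly (a sketch; assumes the default results.csv output name):

```python
import csv


def count_result_rows(path):
    """Count data rows in a query result CSV, excluding the header row."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    # An empty file and a header-only file both count as zero data rows.
    return max(len(rows) - 1, 0)
```

A downstream script could then fail fast with a clear message, e.g. raise an error when count_result_rows("results.csv") == 0.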
Next steps
Other database connectors:
Build your own: