Introduction#
- create a Redshift cluster
- create a connection
- run some quries from a notebook
- github
Network Stack#
We need a VPC to host a Reshift cluster
this.vpc = new aws_ec2.Vpc(this, "VpcRedshift", {vpcName: "VpcRedshift",cidr: "10.0.0.0/16",maxAzs: 3,enableDnsSupport: true,enableDnsHostnames: true,subnetConfiguration: [{cidrMask: 24,name: "PublicSubnet1",subnetType: aws_ec2.SubnetType.PUBLIC,},{cidrMask: 24,name: "PublicSubnet2",subnetType: aws_ec2.SubnetType.PUBLIC,},{cidrMask: 24,name: "PublicSubnet3",subnetType: aws_ec2.SubnetType.PUBLIC,},],});
then create a security group for the cluster, need to open port 5439
this.sg = new aws_ec2.SecurityGroup(this, "SecurityGroupForRedshiftCluster", {securityGroupName: "SecurityGroupForRedshiftCluster",vpc: this.vpc,});this.sg.addIngressRule(aws_ec2.Peer.anyIpv4(), aws_ec2.Port.tcp(5439));
Cluster Stack#
let create a redshift cluster with 2 nodes, a database and a user name demo. First need to create role for the cluster
const role = new aws_iam.Role(this, "IamRoleForRedshiftCluster", {roleName: "IamRoleForRedshiftCluster",assumedBy: new aws_iam.CompositePrincipal(new aws_iam.ServicePrincipal("redshift.amazonaws.com"),new aws_iam.ServicePrincipal("sagemaker.amazonaws.com"),new aws_iam.ServicePrincipal("redshift-serverless.amazonaws.com")),});role.addToPolicy(new aws_iam.PolicyStatement({effect: aws_iam.Effect.ALLOW,resources: ["*"],actions: ["s3:*"],}));role.addManagedPolicy(aws_iam.ManagedPolicy.fromAwsManagedPolicyName("AmazonRedshiftAllCommandsFullAccess"));
then create subnet group
const subnetGroup = new aws_redshift.CfnClusterSubnetGroup(this,"SubnetGroupForRedshiftCluster",{description: "subnet group for redshift cluster",subnetIds: props.vpc.publicSubnets.map((subnet) => subnet.subnetId),});
finally, create the redshift cluster with databaes and user name
const cluster = new aws_redshift.CfnCluster(this, "RedshiftCluster", {clusterType: "multi-node",dbName: "demo",masterUsername: "demo",masterUserPassword: "Demo@2023",nodeType: "dc2.large",numberOfNodes: 2,port: 5439,publiclyAccessible: true,iamRoles: [role.roleArn],availabilityZone: props.vpc.availabilityZones[0],clusterSubnetGroupName: subnetGroup.ref,vpcSecurityGroupIds: [props.sg.securityGroupId],});cluster.addDependsOn(subnetGroup);
Serverless Stack#
similarly, first we need to create role and security as for the cluster group, then create a namspace
const namespace = new aws_redshiftserverless.CfnNamespace(this,"RedshiftNameSpace",{namespaceName: "demo",adminUsername: "admin",adminUserPassword: "Admin#2023",dbName: "demo",defaultIamRoleArn: role.roleArn,iamRoles: [role.roleArn],});
create a workgroup
const workgroup = new aws_redshiftserverless.CfnWorkgroup(this,"RedshiftWorkGroup",{workgroupName: "demo",baseCapacity: 32,namespaceName: "demo",subnetIds: props.vpc.publicSubnets.map((subnet) => subnet.subnetId),publiclyAccessible: true,securityGroupIds: [sg.securityGroupId],});
Connection#
Either using notebook sql or normal query editor, we need to create a connection first.
- double check the associated IAM roles
- create a connection with username, dbname as specified in the cluster stack above
then can run query to create a table
CREATE TABLE region (R_REGIONKEY bigint NOT NULL,R_NAME varchar(25),R_COMMENT varchar(152))diststyle all;
then copy data from S3 to the table
COPY region FROM 's3://redshift-immersionday-labs/data/region/region.tbl.lzo'iam_role defaultregion 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;