Getting Started with Data Warehousing on AWS

Let’s say you want to get started building a data warehouse on AWS. You’re going to need to know the following:

  • Do you need a data lake, a data warehouse, a data mart, or some combination?
  • What is Redshift, the AWS data warehouse service?
  • How to set up your account and provision the resources for your data warehouse.
  • How to do initial ETL (Extract, Transform, Load) from your operational data stores into your data warehouse.
  • How to automate ongoing ETL into your data warehouse.
  • How to handle ongoing maintenance of the data warehouse: space usage, performance, and security.
  • How to then provide access to the data to your users through a BI presentation layer like Tableau, Looker, or some other BI system.
  • How to access the data through query tools or an API
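If ETL is a new term for you, here is a minimal sketch of the three steps in Python. It assumes in-memory SQLite databases as stand-ins for both the operational data store and the warehouse, and the table and column names (orders, fact_orders, amount_cents) are made up for illustration. A real load into Redshift would look different, but the shape is the same.

```python
import sqlite3


def make_demo_source():
    """Build a tiny in-memory 'operational' database (hypothetical schema)."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "  Alice ", 1250), (2, "BOB", 300)],
    )
    source.commit()
    return source


def run_initial_etl(source, warehouse):
    """Copy every order from the source into a warehouse fact table."""
    # Extract: read the raw rows from the operational store.
    rows = source.execute(
        "SELECT id, customer, amount_cents FROM orders"
    ).fetchall()

    # Transform: tidy up customer names and convert cents to dollars.
    cleaned = [
        (order_id, customer.strip().lower(), cents / 100.0)
        for order_id, customer, cents in rows
    ]

    # Load: write the cleaned rows into the warehouse table.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount_usd REAL)"
    )
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", cleaned)
    warehouse.commit()
    return len(cleaned)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    loaded = run_initial_etl(make_demo_source(), warehouse)
    print(f"Loaded {loaded} rows")
```

The initial ETL is a one-time full copy like this one. The ongoing ETL from the next bullet would only pick up rows that are new or changed since the last run.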

For this article we’ll take a look at starting points to get familiar with the concepts and first implementations of a data warehouse.

Resources

AWS Data Warehouse Overview

AWS Data Warehouse - high-level overview of the tiers of a data warehouse, what a data warehouse is used for, and the differences between a data warehouse, an operational database, a data lake, and a data mart.

AWS Data Services Course

AWS Data Services - overview course on getting started with AWS data services. Available with a free trial or a LinkedIn Premium subscription.

AWS Data Warehouse Implementation Overview

Implementing a Data Warehouse on AWS - overview of data warehousing on AWS, from the AWS YouTube channel.

AWS Data Warehouse Walkthrough

Deploy a Data Warehouse on AWS - walkthrough to set up an Amazon Redshift cluster, load sample data, and set up SQL Workbench/J for data analysis.
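To give a feel for the "load sample data" step: bulk loads from S3 into Redshift are typically done with the COPY command. The sketch below only builds the SQL text for a CSV load. The helper name build_copy_statement, the table name, the bucket path, and the IAM role ARN are placeholders I made up, not values from the walkthrough, so swap in your own.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, ignore_header=1):
    """Return a Redshift COPY statement that loads CSV files from S3.

    All argument values are supplied by the caller. Nothing here talks to
    AWS; the function only assembles the SQL text.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")

    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS CSV\n"
        f"IGNOREHEADER {int(ignore_header)};"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(
        build_copy_statement(
            table="sales",
            s3_uri="s3://example-bucket/sample/sales.csv",
            iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
        )
    )
```

You would run the resulting statement from a SQL client such as SQL Workbench/J once the cluster is up and the IAM role has read access to the bucket.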

Next Steps

Choose an overview video or article and review it. Then get started with Deploy a Data Warehouse on AWS. Be sure to check back for updates as I work on answering the questions I posed at the top of the article.