Project Architecture
Infrastructure Provided
As stated on the main page of this documentation, the datadelivery project deploys a set of preconfigured AWS services to improve the user's journey when exploring data.
The diagram of services deployed can be found below:
In essence, by using datadelivery, users will have:
- S3 buckets for storing data and assets
- IAM policies and a role to manage access
- Databases in the Data Catalog
- A one-time scheduled Glue Crawler to catalog data
- An Athena workgroup to store query results
Module Structure
In datadelivery, the AWS resources are declared in different Terraform files.
Considering the project architecture presented above, the table below shows how the source repository is structured.
| File | Description |
|---|---|
| storage.tf | Creates all S3 buckets and uploads all files in the data/ folder to the buckets |
| iam.tf | Creates the IAM policies and the role to be assumed by a Glue Crawler |
| catalog.tf | Sets up a preconfigured Glue Crawler to catalog raw files as tables in the Data Catalog |
| athena.tf | Creates a preconfigured Athena workgroup to help users run their queries |
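
To make the table above more concrete, here is a minimal sketch of the kind of resources each file could declare. It is not the module's actual source: every resource label, bucket name, prefix, and workgroup name below is a hypothetical placeholder, and the sketch assumes the AWS provider in version 4 or later (where aws_s3_object is available). The IAM role only carries the trust policy, and the crawler schedule is left out; the real module attaches permissions and a one-time schedule that are not reproduced here.

```hcl
# Illustrative sketch only. All names and values are assumptions,
# not taken from the datadelivery source code.

# storage.tf (sketch): one bucket and one object per local file in data/
resource "aws_s3_bucket" "raw" {
  bucket = "example-datadelivery-raw" # hypothetical; bucket names must be globally unique
}

resource "aws_s3_object" "data_files" {
  # fileset returns every file path under data/, relative to that folder
  for_each = fileset("${path.module}/data", "**")

  bucket = aws_s3_bucket.raw.id
  key    = "data/${each.value}"
  source = "${path.module}/data/${each.value}"
  etag   = filemd5("${path.module}/data/${each.value}") # re-upload when content changes
}

# iam.tf (sketch): a role that the Glue service is allowed to assume
resource "aws_iam_role" "crawler" {
  name = "example-glue-crawler-role" # hypothetical

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "glue.amazonaws.com" }
    }]
  })
}

# catalog.tf (sketch): a database and a crawler pointed at the uploaded files
resource "aws_glue_catalog_database" "raw" {
  name = "example_raw" # hypothetical
}

resource "aws_glue_crawler" "raw" {
  name          = "example-raw-crawler" # hypothetical
  database_name = aws_glue_catalog_database.raw.name
  role          = aws_iam_role.crawler.arn

  s3_target {
    path = "s3://${aws_s3_bucket.raw.bucket}/data/"
  }
}

# athena.tf (sketch): a workgroup that writes query results to a bucket prefix
resource "aws_athena_workgroup" "analytics" {
  name = "example-datadelivery-workgroup" # hypothetical

  configuration {
    result_configuration {
      output_location = "s3://${aws_s3_bucket.raw.bucket}/athena-results/"
    }
  }
}
```

In this sketch the crawler targets only the data/ prefix, so the Athena results written under athena-results/ in the same bucket are not cataloged as tables. The actual module may well use separate buckets for these purposes.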