Operational Excellence

Support development and run workloads effectively by gaining insight into operations continuously

This pillar focuses on running and monitoring your deployment and continually evaluating your systems for improvement. It covers a great deal of how you set things up, with Infrastructure as Code (IaC) as a core practice, as well as how you respond to requests and incidents.

From a business point of view, a good solutions architect should be looking at how you fold learnings back into your processes, ensuring lessons are learned when there are operational issues and helping the business either avoid them or recover more quickly. You should also be looking at new features as they are released and deciding whether they will help you remove operational overhead. A great example of this is when Intelligent-Tiering was released. In a nutshell, Intelligent-Tiering manages your data across S3 storage classes, moving the least active objects to cheaper tiers of storage. Before Intelligent-Tiering, a DevOps team would have to monitor access patterns and make decisions about creating lifecycle policies; now that Intelligent-Tiering is here, it frees up the DevOps team's time and removes the human-error factor too.
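As a purely illustrative sketch using Terraform (which we'll use throughout the book), with made-up bucket and rule names and assuming AWS provider v4 or later, the configuration below transitions every new object straight into the INTELLIGENT_TIERING storage class, after which S3 moves objects between tiers for you:

resource "aws_s3_bucket" "example" {
  bucket = "sqcows-tiering-demo"
}

# Transition all objects to Intelligent-Tiering as soon as they land,
# so nobody has to hand-tune per-storage-class lifecycle transitions.
resource "aws_s3_bucket_lifecycle_configuration" "tiering" {
  bucket = aws_s3_bucket.example.id

  rule {
    id     = "all-objects-to-intelligent-tiering"
    status = "Enabled"

    filter {}

    transition {
      days          = 0
      storage_class = "INTELLIGENT_TIERING"
    }
  }
}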

1 - The AWS Command Line

Useful AWS CLI commands

OpEx Sec Rel Perf Cost Sus

Let's start off by getting familiar with the AWS CLI. If you are using the AWS Web Terminal as suggested in Getting Started, this will be super simple. Manipulating large numbers of files is much simpler via the CLI than through the console interface; the sync command is one example of this, allowing you to easily copy a large number of files to or from an S3 bucket.

Below are a few commands that are super useful:

s3 make bucket

Quickly create a bucket from the command line. If you don't specify --region, the CLI will fall back to your configured default region.

aws s3 mb s3://sqcows-bucket --region eu-west-1

s3 remove bucket

Use with care!!! The --force flag is useful when you have a non-empty bucket.

aws s3 rb s3://sqcows-bucket
aws s3 rb s3://sqcows-bucket --force # Delete a non-empty bucket

s3 ls commands

Just like the *nix equivalent, there is a recursive option, which is useful when you start racking up lots of objects and prefixes.

aws s3 ls
aws s3 ls s3://sqcows-bucket
aws s3 ls s3://sqcows-bucket --recursive
aws s3 ls s3://sqcows-bucket --recursive  --human-readable --summarize

s3 cp commands

aws s3 cp getdata.php s3://sqcows-bucket
aws s3 cp /local/dir/data s3://sqcows-bucket --recursive
aws s3 cp s3://sqcows-bucket/getdata.php /local/dir/data
aws s3 cp s3://sqcows-bucket/ /local/dir/data --recursive
aws s3 cp s3://sqcows-bucket/init.xml s3://backup-bucket
aws s3 cp s3://sqcows-bucket s3://backup-bucket --recursive

s3 mv commands

aws s3 mv source.json s3://sqcows-bucket
aws s3 mv s3://sqcows-bucket/getdata.php /home/project
aws s3 mv s3://sqcows-bucket/source.json s3://backup-bucket
aws s3 mv /local/dir/data s3://sqcows-bucket/data --recursive
aws s3 mv s3://sqcows-bucket s3://backup-bucket --recursive

s3 rm commands

aws s3 rm s3://sqcows-bucket/<file_to_delete>
aws s3 rm s3://sqcows-bucket --recursive #delete all the files in the bucket!!!!

s3 sync commands

I tend to use these over the cp commands as they work really well and feel more like rsync.

aws s3 sync backup s3://sqcows-bucket
aws s3 sync s3://sqcows-bucket/backup /tmp/backup
aws s3 sync s3://sqcows-bucket s3://backup-bucket

2 - Infrastructure as Code

Automate all the things

OpEx Sec Rel Perf Cost Sus

The command line is great, but let's face it, we are going to want to use the API more in our automation. When it comes to Infrastructure as Code there are a few options, and all have their own merits. Personally, I tend to use Terraform, but there are other options like Amazon's own CloudFormation or the CDK.

Terraform: https://learn.hashicorp.com/terraform
CloudFormation: https://aws.amazon.com/cloudformation/

Why should you use IaC?

Having your infrastructure (in this case your S3 buckets) as code not only helps you build your workload out but also allows you to replicate your setup in other environments, and even lets you recover more quickly in a disaster situation by rebuilding your setup quickly and reliably. By turning your infrastructure into deployable code you remove human error and lessen the chances of something working in stage and then failing in production.

Terraform

Terraform is an open-source infrastructure as code software tool that provides a consistent CLI workflow to manage hundreds of cloud services. Terraform codifies cloud APIs into declarative configuration files.

Throughout this book we'll be using Terraform; however, you may already be an expert in CloudFormation, so if you feel inclined you could rewrite the examples, and we are always happy to accept pull requests to the book.

3 - Tags

Tagging everything!

OpEx Sec Rel Perf Cost Sus

Why are tags important, you may ask? Well, tags can help in a few ways, such as cost allocation, defining the sensitivity of the data, recording which environment the data is for, and even access control. They can also be used to show the owner of certain files or workloads, helping you quickly contact users, and other companies use tags to classify the sensitivity of data in terms of Personally Identifiable Information (PII). Those PII tags can then be used to include or exclude access for IAM users.
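For example, here is a hedged Terraform sketch of that last idea (the tag key, tag value, bucket name, and policy name are all made up for illustration), denying reads of any object tagged as PII:

# Deny GetObject on any object carrying the (illustrative) Confidentiality=PII tag.
data "aws_iam_policy_document" "deny_pii_read" {
  statement {
    effect    = "Deny"
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::sqcows-bucket/*"]

    condition {
      test     = "StringEquals"
      variable = "s3:ExistingObjectTag/Confidentiality"
      values   = ["PII"]
    }
  }
}

resource "aws_iam_policy" "deny_pii_read" {
  name   = "deny-pii-read"
  policy = data.aws_iam_policy_document.deny_pii_read.json
}

Attach a policy like this to any group of IAM users who should never be able to read PII-tagged objects.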

Resource constraints

Each resource (a bucket in our case) can have up to 50 user-defined tags, and these are broadly split into four areas: Technical, Automation, Business, and Security. Here are the restrictions you should consider when creating tags:

  • Maximum number of tags per resource: 50
  • Maximum key length: 128 Unicode characters in UTF-8
  • Maximum value length: 256 Unicode characters in UTF-8
  • Prefix restriction: do not use the aws: prefix in your tag names or values because it is reserved for AWS use. You can't edit or delete tag names or values with this prefix. Tags with this prefix do not count against your tags-per-resource limit.
  • Character restrictions: tags may only contain Unicode letters, digits, whitespace, or these symbols: _ . : / = + - @

Below are the tags I recommend to customers, grouped into the four areas, but of course, you can add more or leave some out.

Technical tags:
  • Name: the application name
  • Role: the application role/tier
  • Environment: dev/stage/prod, etc.
  • Cluster: if the resource is used in a particular cluster, tag it
  • Version: stick to semver or the git release if possible

Automation tags:
  • Created by: name the deployment system, CF, TF, or even by hand
  • Do Not Delete: true or false (allows the automation to check and bail out, preventing data loss)
  • Date and Time: release date and time

Business tags:
  • Project: which project the resource is aligned to
  • Owner: which team owns the data
  • Cost Centre: who should be billed for the usage
  • Customer: add this if you charge your customer for usage

Security tags:
  • Confidentiality: mark if the data contains sensitive data such as personal information
  • Compliance: does this data need to be handled in a certain way, such as HIPAA
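One way to keep a tag set like this consistent across resources is to define it once in Terraform and reuse it everywhere. A hedged sketch (the values are placeholders, not a prescribed standard):

locals {
  common_tags = {
    Name            = "sqcows-demo"
    Role            = "storage"
    Environment     = "dev"
    CreatedBy       = "terraform"
    Project         = "s3-best-practices"
    Owner           = "platform-team"
    CostCentre      = "12345"
    Confidentiality = "low"
  }
}

# Then on each resource:
#   tags = local.common_tags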

Using Terraform

Let's take a look at how this would look in Terraform; the code can be found in the folder chapter1/001. It consists of five files:

main.tf

As the title suggests, this is where we define the main stuff, as in what we want to do. In this example, it's to create an S3 bucket and set versioning and tags on that bucket.
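The actual file lives in chapter1/001; as a rough sketch only (the resource names, the exact tag set, and the provider v4+ split between the bucket and its versioning resource are assumptions, not the book's verbatim code), main.tf might look something like this:

# Create the bucket; bucket_prefix lets AWS append a unique suffix to the name.
resource "aws_s3_bucket" "demo" {
  bucket_prefix = "sqcows-demo-bucket-"

  tags = {
    Project     = var.project
    Environment = var.environment
    CreatedBy   = "terraform"
  }
}

# Turn on versioning for the bucket created above.
resource "aws_s3_bucket_versioning" "demo" {
  bucket = aws_s3_bucket.demo.id

  versioning_configuration {
    status = "Enabled"
  }
}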

providers.tf

This is where we tell Terraform what to connect to, and in our case it's AWS. We could also add some extra information to make sure the Terraform state is stored remotely, but that's beyond the scope of this book.
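A minimal sketch (taking the region from a variable, which is an assumption rather than the book's exact code):

provider "aws" {
  # Region to deploy into, defined in variables.tf below.
  region = var.region
}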

variables.tf

Anything we might want to change the value of in any of the other files should be stored as a variable in this file. You'll see them referenced in the other files like so: ${var.project}. In this case this populates the Project tag in the main.tf file.
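A sketch of the kind of variables involved (everything beyond project is an assumption to match the earlier sketches):

variable "project" {
  description = "Populates the Project tag in main.tf"
  type        = string
  default     = "sqcows"
}

variable "environment" {
  description = "dev/stage/prod etc."
  type        = string
  default     = "dev"
}

variable "region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "eu-west-1"
}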

versions.tf

We use this file to define the minimum version of the providers and modules we are using.
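Something along these lines (the exact version constraints here are assumptions):

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.0"
    }
  }
}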

outputs.tf

The outputs file is a place where we can ask for information back from Terraform, for instance the name of a bucket once it has been created.
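For example (output names are assumptions, matching the bucket resource sketched above), returning the generated bucket name and ARN:

output "bucket_name" {
  description = "Name of the bucket that was created"
  value       = aws_s3_bucket.demo.id
}

output "bucket_arn" {
  description = "ARN of the bucket, handy for wiring up policies"
  value       = aws_s3_bucket.demo.arn
}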

To run this example make sure you are in the correct folder and run:

terraform init
terraform plan
terraform apply

Remember to answer yes when prompted on the apply. Once it has run you'll see a new bucket in your AWS console called sqcows-demo-bucket-

If you go and inspect that bucket you'll see versioning enabled, and you can also inspect all the tags. The choice of tags you require is up to you; however, one great way to find things that are not tagged correctly is to use AWS Config with the required-tags rule to enforce that all buckets are built with tags.
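As a hedged sketch of that idea (the tag keys are examples only, and AWS Config's configuration recorder must already be running in the account and region):

resource "aws_config_config_rule" "required_tags" {
  name = "required-tags"

  source {
    owner             = "AWS"
    source_identifier = "REQUIRED_TAGS"
  }

  # Flag any bucket missing these tag keys (example keys only).
  input_parameters = jsonencode({
    tag1Key = "Project"
    tag2Key = "Owner"
  })

  scope {
    compliance_resource_types = ["AWS::S3::Bucket"]
  }
}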

To clean up after this example just run the following command:

terraform destroy

Also type yes when prompted and it will clean up the deployment for you.

Technical considerations:

Which tags are useful to your teams? Having an owner and a sensitivity level on data can help you prioritise work quickly in a crisis. You may also consider that versioning is not required because you choose to replicate data to another region.

Business considerations:

What tags are important to the business? Will they help you pin down the costs per click of operations, for example? You should also be involved in setting the number of versions to keep. This could change from project to project, but it allows your business to know the risk and impact.

4 - Access Points

Using access points to protect and control your data

How do S3 Access Points Work?

Access points are configured by policy to grant access to either specific users or applications. An example would be allowing groups of users (even from other accounts) access to your data lake.

Each access point is attached to a single S3 bucket. The access point then carries a network origin control (the source from which you'd like apps or users to connect) and its own Block Public Access settings. You could, of course, restrict the origin to your VPC, or get more granular in the access point policy and only allow certain subnets or single IPs. You can even use this to grant certain origins access only to objects with a certain prefix or with specific tags.

Access Points Diagram

You can access the data in a shared bucket either through the access point's ARN directly, or via its alias when an operation requires a full bucket name.
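As a hedged Terraform sketch (the bucket, access point name, and vpc_id variable are all made up for illustration), an access point that only accepts requests originating from a single VPC and keeps public access blocked might look like this:

resource "aws_s3_bucket" "shared" {
  bucket = "sqcows-shared-data"
}

resource "aws_s3_access_point" "analytics" {
  bucket = aws_s3_bucket.shared.id
  name   = "analytics-ap"

  # Network origin control: only accept requests from inside this VPC.
  vpc_configuration {
    vpc_id = var.vpc_id
  }

  # Keep the access point itself locked down from public access.
  public_access_block_configuration {
    block_public_acls       = true
    block_public_policy     = true
    ignore_public_acls      = true
    restrict_public_buckets = true
  }
}

An access point policy, granting for example a specific role read access under a prefix, can then be attached to it separately.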

When to use S3 Access Points

S3 Access Points simplify how you manage data access for your application set to your shared data sets on S3. You no longer have to manage a single, complex bucket policy with hundreds of different permission rules that need to be written, read, tracked, and audited. With S3 Access Points, you can now create application-specific access points permitting access to shared data sets with policies tailored to the specific application.

  • Large shared data sets: Using Access Points, you can decompose one large bucket policy into separate, discrete access point policies for each application that needs to access the shared data set. This makes it simpler to focus on building the right access policy for an application, while not having to worry about disrupting what any other application is doing within the shared data set.
  • Copy data securely: Copy data securely at high speeds between same-region Access Points with the S3 Copy API, traversing AWS internal networks and VPCs.
  • Restrict access to VPC: An S3 Access Point can limit all S3 storage access to happen from a Virtual Private Cloud (VPC). You can also create a Service Control Policy (SCP) and require that all access points be restricted to a Virtual Private Cloud (VPC), firewalling your data to within your private networks.
  • Test new access policies: Using access points you can easily test new access control policies before migrating applications to the access point, or copying the policy to an existing access point.
  • Limit access to specific account IDs: With S3 Access Points you can specify VPC Endpoint policies that permit access only to access points (and thus buckets) owned by specific account IDs. This simplifies the creation of access policies that permit access to buckets within the same account, while rejecting any other S3 access via the VPC Endpoint.
  • Provide a unique name: S3 Access points allow you to specify any name that is unique within the account and region. For example, you can now have a “test” access point in every account and region.

Whether creating an access point for data ingestion, transformation, restricted read access, or unrestricted access, using S3 Access Points simplifies the work of creating, sharing, and maintaining access to data in your shared S3 buckets.
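To illustrate the "Limit access to specific account IDs" point from the list above, here is a hedged Terraform sketch (the account ID is the usual documentation placeholder, the region is an example, and a vpc_id variable is assumed to be declared elsewhere) of a Gateway VPC endpoint whose policy only allows S3 requests that go through access points owned by your own account:

data "aws_iam_policy_document" "s3_endpoint" {
  statement {
    effect    = "Allow"
    actions   = ["s3:*"]
    resources = ["*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    # Only allow access points owned by this account ID (placeholder value).
    condition {
      test     = "StringEquals"
      variable = "s3:DataAccessPointAccount"
      values   = ["111122223333"]
    }
  }
}

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = var.vpc_id
  service_name = "com.amazonaws.eu-west-1.s3"
  policy       = data.aws_iam_policy_document.s3_endpoint.json
}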