This post is based on the sessions in Fugue's Cloud Security Masterclass series that focus on AWS IAM security.
Fugue's Cloud Security Masterclass series is designed to help cloud engineers think more critically about cloud misconfiguration — why it occurs, how malicious actors exploit it, and ways to prevent it.
Our two-part session on Locking Down the Security of AWS IAM (Part 1 and Part 2) has proved to be extremely popular with cloud engineers who want to better understand the AWS Identity and Access Management (IAM) service, so we thought a blog post was warranted.
Why AWS IAM?
Cloud misconfigurations are gaps in infrastructure configuration that hackers can squeeze into and exploit. Like all IAM services offered by cloud service providers, AWS IAM can be extremely secure when done right, but it's an enigma to many people, and when done wrong it can open attack vectors you may never have imagined.
Many advanced IAM misconfigurations are:
- Not considered compliance violations
- Not recognized as vulnerabilities
- Easy to overlook
- Incredibly common
- Increasingly exploited in major breaches
Hackers using found credentials is the obvious example, but breaches have also occurred when bad actors used IAM as a "network" to get across service boundaries. (See Josh's blog post A Technical Analysis of the Capital One Cloud Misconfiguration Breach.)
IAM is made up of many moving parts, but understanding how the parts relate is key to developing a mental model so you can reason about IAM on your own, rather than relying on cookie-cutter, one-size-fits-all approaches from others. After all, there's no such thing as one-size-fits-all with cloud; every use case is different. It's important to learn to think critically about security for your unique AWS use cases — and the IAM masterclasses are designed to help you do just that.
AWS IAM basics
AWS IAM can become complex in implementation, but conceptually it's rather simple: principals are granted permission to take actions on resources.
- Principal: A human user or a process (such as a program running on an EC2 instance). Principals initiate actions against cloud APIs.
- Action: An API call that can be made to interact with resources (such as PutObject, GetObject, or ListBuckets for S3 buckets). Actions allow you to create, edit, delete, and view resources.
- Resource: An object that exists within an AWS service (such as an EC2 instance or S3 bucket). Resources have a unique ARN, or Amazon Resource Name.
A policy ties everything together. It's how you define IAM access and authorization privileges in AWS. Permissions in a policy determine whether a principal is allowed or denied an action on a resource.
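To make the principal, action, and resource vocabulary concrete, here is a minimal sketch of an identity-based policy built as a Python dictionary and printed as JSON. The bucket name and the make_statement helper are hypothetical and exist only for illustration. Note that an identity-based policy has no Principal element, because the principal is whichever user, group, or role the policy is attached to.

```python
import json


def make_statement(effect, actions, resources):
    """Build one IAM policy statement as a plain dict (hypothetical helper)."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("effect must be 'Allow' or 'Deny'")
    return {"Effect": effect, "Action": list(actions), "Resource": list(resources)}


# Hypothetical bucket name, used only for illustration.
BUCKET_ARN = "arn:aws:s3:::example-bucket"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        # Actions: two S3 API calls. Resource: every object in the bucket.
        make_statement("Allow", ["s3:GetObject", "s3:PutObject"], [f"{BUCKET_ARN}/*"]),
    ],
}

print(json.dumps(policy, indent=2))
```

Anything this statement does not mention, such as deleting objects or touching another bucket, is not granted by it.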
The policy "stack"
There are several types of IAM policies, and some grant permissions while others restrict permissions. These policies make up a policy evaluation "stack." When any API call — meaning a request to take an action — makes its way to AWS, it goes through the stack to determine whether the call is allowed or denied.
If any layer of the stack explicitly denies the action, then the action is rejected — even if another layer explicitly allows it. At the end of the stack, if no policy explicitly allows the action, it's implicitly denied.
Writing IAM policies is about judiciously poking holes in the "deny all" wall so the right principals can take the right actions on the right resources.
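The core rule described above (an explicit deny always wins, an explicit allow is required, and everything else is implicitly denied) can be sketched in a few lines of Python. This is a deliberately simplified toy, not AWS's actual evaluation engine: it ignores principals, conditions, NotAction, and the limiting policy types, and it approximates IAM wildcards with Python's fnmatch. The function names and example policies are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM lets Action, Resource, and Statement be a single item or a list."""
    return value if isinstance(value, list) else [value]


def _applies(stmt, action, resource):
    # Action names are case-insensitive; resource ARNs are matched as written.
    action_hit = any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(stmt["Action"])
    )
    resource_hit = any(
        fnmatchcase(resource, pattern) for pattern in _as_list(stmt["Resource"])
    )
    return action_hit and resource_hit


def evaluate(policies, action, resource):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    allowed = False
    for policy in policies:
        for stmt in _as_list(policy["Statement"]):
            if not _applies(stmt, action, resource):
                continue
            if stmt["Effect"] == "Deny":
                return "ExplicitDeny"  # any matching deny ends evaluation
            if stmt["Effect"] == "Allow":
                allowed = True
    return "Allow" if allowed else "ImplicitDeny"


# Hypothetical policies: one pokes a hole in the wall, one closes part of it.
allow_read = {"Statement": [{"Effect": "Allow", "Action": "s3:Get*",
                             "Resource": "arn:aws:s3:::example-bucket/*"}]}
deny_secret = {"Statement": [{"Effect": "Deny", "Action": "s3:*",
                              "Resource": "arn:aws:s3:::example-bucket/secret/*"}]}
stack = [allow_read, deny_secret]

print(evaluate(stack, "s3:GetObject", "arn:aws:s3:::example-bucket/public/a.txt"))   # Allow
print(evaluate(stack, "s3:GetObject", "arn:aws:s3:::example-bucket/secret/key.pem")) # ExplicitDeny
print(evaluate(stack, "s3:PutObject", "arn:aws:s3:::example-bucket/public/a.txt"))   # ImplicitDeny
```

The third request shows the "deny all" wall at work: no statement mentions s3:PutObject, so the call fails without anyone having written a deny.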
The 6 types of IAM policies
When you think "IAM policy," you might be thinking "identity-based policy," such as one that allows a specific principal to take actions against a resource. But there are 6 types of policies, and it's critical to understand all of them and how they interrelate:
- Identity-based policies
- Resource-based policies
- Permissions boundaries
- Organizations service control policies (SCPs)
- Access control lists (ACLs)
- Session policies
In particular, it's useful to know that these 2 types of policies can grant or limit permissions:
- Identity-based policies allow a given principal to take specific actions against specific resources.
- Resource-based policies allow specific principals to take specific actions against a given resource.
These 3 policy types limit permissions, meaning they can take away what the policies listed above have granted:
- Permissions boundaries restrict the maximum permissions an identity-based policy can grant to a user or role.
- Organizations service control policies (SCPs) restrict the maximum permissions for an organization or organizational unit.
- Session policies restrict the permissions granted by a user or role's identity-based policy when that entity creates a temporary session.
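One way to picture the three limiting policy types is as filters: an action survives only if the identity-based policy grants it and every limiting layer in use also permits it. The sketch below models that as a set intersection. It is a rough mental model under stated assumptions: it covers only the identity-based path, treats each layer as a flat set of action names, and ignores resources, conditions, and explicit denies. The function name and the example action sets are hypothetical.

```python
def effective_actions(granted, *limits):
    """Intersect the granted actions with every limiting layer that is in use."""
    result = set(granted)
    for limit in limits:
        if limit is not None:  # None means that layer is not configured
            result &= set(limit)
    return result


# Hypothetical example values.
identity_policy = {"s3:GetObject", "s3:PutObject", "ec2:RunInstances"}
permissions_boundary = {"s3:GetObject", "s3:PutObject"}
scp = None  # no service control policy restriction in this example
session_policy = {"s3:GetObject"}

print(effective_actions(identity_policy, permissions_boundary, scp, session_policy))
# {'s3:GetObject'}
```

None of the limiting layers adds anything: if the identity-based policy never granted an action, no boundary, SCP, or session policy can grant it.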
Access control lists (ACLs) determine whether principals in other accounts can access a given resource. Don't use these; they're a legacy access control method and even AWS suggests using alternative methods.
For a detailed explanation of how AWS evaluates policies, including the order in which each type of policy is evaluated, see Policy evaluation logic.
3 simple use cases
There are countless use cases for IAM, but here are three simple examples.
Use Case 1: Account setup. This most basic use case is for setting up an AWS account implementing role-based access control (RBAC) by assigning users to groups with appropriate policies. Out of the box, an AWS account has a single user: root. Similar to root on Unix, root on AWS gives complete access to everything in your account (including billing information), and you cannot reduce the permissions assigned to it. Here's a tip: Don't use it. Use root only to create a user for yourself, create a group, assign the group an AdministratorAccess policy, and put the user in the group. (See AWS's tutorial here.) Then stop using root!
RBAC is your friend. The critical part of this use case is to gain a basic understanding of how to properly set up users, groups, roles, and attachments to policies. Get to know the policy stack; look at all the policy types and learn when to use which. For example, an S3 bucket should always have a resource-based policy called a bucket policy. Don't rely on identity-based policies alone for security.
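As a rough illustration of what a bucket policy looks like, the sketch below builds one as a Python dictionary. Unlike an identity-based policy, a resource-based policy names its principals explicitly in a Principal element. The bucket name, account ID, role name, and the bucket_policy helper are all hypothetical placeholders.

```python
import json


def bucket_policy(bucket, role_arn):
    """Build a minimal bucket policy that lets one role read and write objects."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAppRoleObjectAccess",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},  # who may act on this bucket
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            }
        ],
    }


# Hypothetical bucket, account ID, and role name.
example = bucket_policy("example-bucket", "arn:aws:iam::111122223333:role/app-role")
print(json.dumps(example, indent=2))
```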
Use Case 2: ABAC. This use case is for larger teams with many resources. ABAC, or attribute-based access control, allows you to scope actions and resources by team. It defines permissions based on tags: if the principal and the resource have the same user-defined tag, the principal can take action on the resource. In this way, Developer Team A and Developer Team B can have the same identity-based policy but access different resources. Developer Team A can only see resources tagged for Developer Team A, and Developer Team B can only see Developer Team B resources. (See this AWS tutorial for more detailed information.)
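A tag-matching policy of this kind might look like the sketch below. The condition compares the tag on the resource with the tag on the principal, using the aws:ResourceTag and aws:PrincipalTag condition keys. The tag key "team" and the two EC2 actions are assumptions chosen for illustration, and tags_match is only a local stand-in for the comparison AWS performs, not an AWS API.

```python
import json

TAG_KEY = "team"  # hypothetical tag key shared by principals and resources

abac_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    # Resource tag must equal the caller's own tag value.
                    "aws:ResourceTag/" + TAG_KEY: "${aws:PrincipalTag/" + TAG_KEY + "}"
                }
            },
        }
    ],
}


def tags_match(principal_tags, resource_tags, key=TAG_KEY):
    """Local stand-in for the condition: both sides carry the same value for key."""
    value = principal_tags.get(key)
    return value is not None and resource_tags.get(key) == value


print(json.dumps(abac_policy, indent=2))
print(tags_match({"team": "a"}, {"team": "a"}))  # True
print(tags_match({"team": "a"}, {"team": "b"}))  # False
```

Both teams can share this one policy, because the tag values, not the policy text, decide who reaches what.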
This approach allows your team and resources to grow without proliferating IAM roles. A sprawl of IAM roles that nobody maintains is a big attack vector. You may lose control of the boundaries of your configurations, which delights hackers because they can find and exploit policies you've forgotten about.
Note that if you use ABAC, you'll need to limit who can write tags. Otherwise, you increase the attack surface of your cloud infrastructure because bad actors can give themselves access to things they shouldn't be able to see, or prevent others from accessing the things they should. One example approach is to use IAM policies or bucket policies to deny S3 tagging rights to all users except one, and then use S3 tags for ABAC. Just keep in mind that ABAC is not a silver bullet, but it can be a simpler way of controlling access where it might otherwise be a nightmare.
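Here is one possible shape for the tag-writing restriction, sketched as a bucket policy that denies S3 tagging actions to every principal except a single designated one. The bucket name, the tag-admin role ARN, and the helper name are hypothetical, and the exact list of actions to deny should be checked against your own use case before relying on anything like this.

```python
import json


def deny_tagging_except(bucket, tag_admin_arn):
    """Build a bucket policy that blocks tag writes for everyone but one principal."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyTagWritesExceptTagAdmin",
                "Effect": "Deny",
                "Principal": "*",
                "Action": [
                    "s3:PutBucketTagging",
                    "s3:PutObjectTagging",
                    "s3:DeleteObjectTagging",
                ],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # The deny applies unless the caller is the tag admin.
                "Condition": {"ArnNotEquals": {"aws:PrincipalArn": tag_admin_arn}},
            }
        ],
    }


# Hypothetical bucket, account ID, and role name.
tag_policy = deny_tagging_except(
    "example-bucket", "arn:aws:iam::111122223333:role/tag-admin"
)
print(json.dumps(tag_policy, indent=2))
```

Because this is an explicit deny, it holds even if some other policy later grants tagging permissions by accident.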
Use Case 3: Compute resources as principals. This use case is often glossed over, but it's a vector for a lot of data breaches on cloud. When compute resources (such as an EC2 instance) have access to other resources (such as S3 buckets) via IAM, it's critical not to give the compute instance an overpermissioned policy. For instance, if an EC2 instance has the permission ec2:ReplaceIamInstanceProfileAssociation, a hacker who compromises it can potentially escalate privileges by swapping the instance profile for a more permissive one and then use the new privileges to steal data.
Security in IAM is not just about making sure your infrastructure doesn't have unused privileges, but also about making sure the privileges it does use aren't overly permissive. If they are, that can be a sign that your application architecture needs to be redesigned. For instance, it's a best practice for compute instances not to have the s3:ListBucket permission. If they do, the bad guys can easily see your S3 topology and potentially run commands like aws s3 sync, which require list permissions.
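A quick way to spot the kind of overly permissive grants discussed above is to scan a policy's Allow statements against a watchlist of actions. The sketch below does this for the two actions mentioned in this section, and it also catches wildcard grants such as s3:* that cover them. The watchlist, function name, and example policy are assumptions for illustration. The check is intentionally naive: it ignores NotAction, conditions, and resource scoping, and it approximates IAM wildcards with Python's fnmatch.

```python
from fnmatch import fnmatchcase

# Actions called out in this section; extend the list for your own review.
RISKY_ACTIONS = ("ec2:ReplaceIamInstanceProfileAssociation", "s3:ListBucket")


def risky_grants(policy, risky=RISKY_ACTIONS):
    """Return the watchlisted actions that any Allow statement covers."""
    statements = policy["Statement"]
    if isinstance(statements, dict):
        statements = [statements]
    found = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        patterns = stmt.get("Action", [])
        if isinstance(patterns, str):
            patterns = [patterns]
        for pattern in patterns:
            for action in risky:
                # Action names are case-insensitive, so compare in lowercase.
                if fnmatchcase(action.lower(), pattern.lower()):
                    found.add(action)
    return sorted(found)


# Hypothetical policy attached to an EC2 instance's role.
instance_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:*", "ec2:DescribeInstances"], "Resource": "*"}
    ],
}

print(risky_grants(instance_policy))  # ['s3:ListBucket']
```

A hit is a prompt to ask whether the instance really needs that action, not proof of a vulnerability.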
Check out the Cloud Security Masterclass series!
To learn more about the topics discussed above, and much more — including best practices, common pitfalls, and steps to take when writing a new IAM policy — check out the two masterclasses, available on demand:
- Cloud Security Masterclass: Locking down the Security of AWS IAM, Part 1
- Cloud Security Masterclass: Locking down the Security of AWS IAM, Part 2
You can find a full list of past and upcoming masterclasses on our Cloud Security Masterclass page.