Adventures in AWS: Automating Service Limit Checks

In this series, I explore some of the everyday challenges facing an AWS developer/sysadmin. Today: can you script service limit checks in PowerShell even without service-specific APIs?

THE PROBLEM
AWS sets default limits on the number of resources you can create in a given account. The limits apply to EBS volume storage, EC2 instance reservations, and total CloudFormation stacks, among many others. The full list of default limits (at least the ones you can change) is available here. Changing a limit requires submitting a support ticket and can take anywhere from a few minutes to several days (or never), depending on how much additional capacity you've requested. You can view your current EC2 service limits inside the AWS Console for your account.

I’m okay with submitting support tickets to change a limit, but usually I don’t find out if I’ve reached that limit until I kick off a CloudFormation template containing multiple resources and the stack rolls back with an error like this:

Maximum number of active gp2 volumes bytes, 20, exceeded.

Rolled-back stacks are messy and undesirable, and I don't want to poke around manually in the AWS console to check limits before launching an automated build. If only there were an API call I could make to check my limits before attempting to create any resources. Sadly, there's no "get-all-service-limits" API call for any AWS service, at least not yet. So until AWS comes out with a better solution, let's take a look at three creative ways to automate those pesky service limit checks in PowerShell.

Side note: in this post, we’re talking about imposed account limits, not AWS infrastructure limitations. If you ever see an error like this…

We currently do not have sufficient i2.xlarge capacity in the Availability Zone you requested (us-east-1a). Our system will be working on provisioning additional capacity. You can currently get i2.xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1e, us-east-1c, us-east-1d.

…it means that AWS is actually running at or near capacity in the requested availability zone for the given EC2 instance type and literally can’t support your request, even if you have plenty of limit space. You’re out of luck until usage goes down across the AZ, or AWS provisions more computing resources, or both.

THE SOLUTIONS

Option 1: DescribeAccountAttributes
EC2 provides a DescribeAccountAttributes API action that would be a great place to display service limits. AWS has chosen to expose only a few of those limits through this call, however, as you'll see if you run the Get-EC2AccountAttribute cmdlet in the AWS Tools for PowerShell:

(Get-EC2AccountAttribute).AttributeName
vpc-max-security-groups-per-interface
max-instances
supported-platforms
default-vpc
max-elastic-ips
vpc-max-elastic-ips

There are a couple of VPC and elastic IP limits in there, plus a "max-instances" attribute that displays the total number of On-Demand EC2 instances allowed in the region. (h/t Eric Hammond) But that doesn't help if you want to check limits for specific EC2 instance types or EBS volume storage, among many other possibilities. On to Option 2!
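That said, the max-instances attribute is still handy as a coarse headroom check. Here's a sketch of the idea, assuming the AWS Tools for PowerShell are loaded with default credentials and region configured (note: whether stopped instances count against the limit isn't documented, so treat the result as an approximation):

```powershell
# Rough sketch: overall On-Demand instance headroom for the current region.
# Assumes AWS Tools for PowerShell with default credentials/region set up.
$maxInstances = [int](Get-EC2AccountAttribute -AttributeName "max-instances").AttributeValues[0].AttributeValue

# Counting only running instances here; this is an approximation, since
# AWS doesn't document exactly which states count against the limit.
$running = @((Get-EC2Instance).Instances | Where-Object { $_.State.Name -eq "running" })

Write-Host "On-Demand headroom: $($maxInstances - $running.Count) of $maxInstances instances"
```

This only covers the region-wide On-Demand total, not per-instance-type limits, which is exactly the gap the next two options try to fill.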

Option 2: TrustedAdvisor Checks
TrustedAdvisor is a tool that monitors the services in your AWS account and provides recommendations for cost, performance, and security optimization. You have to be on a paid AWS support plan to get the most out of it, but several of TrustedAdvisor's most helpful features are free to everyone, including the "Service Limits" check. And yes, you can pull down the results of these checks through the PowerShell Tools, including the current limit and the total resources used toward the limit for many services.

Let's say we want to find out how much EBS gp2 volume storage (in GiB) is still available. Step 1 is to get the top-level information about the service limit check:

Get-ASATrustedAdvisorChecks -Language en | Where-Object {$_.Category -eq "performance" -and $_.Name -eq "Service Limits"}

This call returns some interesting output:

Category : performance
Description : Checks for usage that is more than 80% of the service limit.
Values are based on a snapshot, so your current usage might
differ. Limit and usage data can take up to 24 hours to reflect
any changes. In some cases, your usage might be greater than the
indicated limit for a period of time.

Alert Criteria
Yellow: Usage is more than 80% of the service limit.
Recommended Action
If you anticipate exceeding a service limit, open a case in
Support Center to request a limit
increase.
Additional Resources
Trusted Advisor FAQs
AWS Service Limits
EC2 Service Limits (per region) in the
Amazon EC2 console
Id : eW7HH0l7J9
Metadata : {Region, Service, Limit Name, Limit Amount...}
Name : Service Limits

Expanding the Metadata property shows all the values returned for the limit check: Region, Service, Limit Name, Limit Amount, Current Usage and Status (“ok”, “warning” or “error”). Now it’s just a matter of finding the EBS gp2 limit information and subtracting the current usage from the limit:

$checkId = (Get-ASATrustedAdvisorChecks -Language en | Where-Object {$_.Category -eq "performance" -and $_.Name -eq "Service Limits"}).Id
$alerts = (Get-ASATrustedAdvisorCheckResult -CheckId $checkId -Language en).FlaggedResources
# Metadata fields, in order: Region, Service, Limit Name, Limit Amount, Current Usage, Status
$gp2_info = $alerts | Where-Object {$_.Metadata.Contains("us-east-1") -and $_.Metadata.Contains("General Purpose (SSD) volume storage (GiB)")}
$gp2_free = [int]$gp2_info.Metadata[3] - [int]$gp2_info.Metadata[4]

Write-Host "Total free EBS gp2 storage (GiB): $gp2_free"

As cool as this is, it's not the end of the story. TrustedAdvisor currently doesn't return information about every service limit; for example, it's silent about individual On-Demand EC2 instance types. (Jason Antman's awslimitchecker Python module falls back to the hard-coded documented AWS default limit for anything it can't get through TrustedAdvisor.)
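If you need those missing limits in a script anyway, one pragmatic pattern (borrowed from the awslimitchecker approach just mentioned) is to keep a table of documented defaults and let per-account overrides win. A minimal sketch; the keys and limit values below are made up for illustration:

```powershell
# Illustrative table of documented default limits; keys and values are
# made up for this example, not authoritative AWS numbers.
$defaultLimits = @{
    "ec2:i2.xlarge"       = 8
    "ec2:gp2-storage-gib" = 20480
}

function Get-FallbackLimit([string]$Key, [hashtable]$Overrides = @{}) {
    # Per-account overrides (e.g. recorded after a limit-increase ticket
    # is resolved) take precedence over the hard-coded defaults.
    if ($Overrides.ContainsKey($Key))     { return $Overrides[$Key] }
    if ($defaultLimits.ContainsKey($Key)) { return $defaultLimits[$Key] }
    throw "No known limit for $Key"
}

Get-FallbackLimit "ec2:i2.xlarge"                                       # 8, the hard-coded default
Get-FallbackLimit "ec2:i2.xlarge" -Overrides @{ "ec2:i2.xlarge" = 50 }  # 50, the recorded override
```

The obvious downside is that the table goes stale the moment AWS changes a default or grants you an increase you forget to record, which is why this only makes sense as a fallback of last resort.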

So if you want to know how many i2.xlarge instances you can deploy in your account, there’s no automation available to you; you’ll just have to look in the console.

Unless you script against the console itself… 🙂

Option 3: Screen Scrape
Yeah, this is a little silly, but we've gone too far to stop now. The On-Demand instance information we want is available on the "Limits" page of the EC2 console, so why not just download the webpage with PowerShell and parse out the relevant HTML? To make this happen, we'll have to do a little magic with authentication, because we need a URL that works with temporary IAM credentials. The AWS security blog explains how to generate temporary STS credentials using your long-lived AWS access and secret keys. I keep temporary creds (access and secret keys plus a session token) for various profiles in my .aws/credentials file, so I wrote the following function to grab the credentials out of that file, use them to generate a signin token, and use that in turn to create a temporary link to my profile's EC2 service limit page in us-east-1:

function Get-EC2ServiceLimitWebPage([string]$ProfileName)
{
    #load temporary credentials from profile (assumes aws_access_key_id,
    #aws_secret_access_key and aws_session_token appear in that order on the
    #three lines below the [$ProfileName] header)
    $credfile = "$env:userprofile\.aws\credentials"
    #Select-String's LineNumber is 1-based, so it indexes the line *after* the header in the 0-based array below
    $marker = (select-string "\[$ProfileName\]" $credfile).LineNumber
    $creds = Get-Content $credfile
    $access = ($creds[$marker] | ConvertFrom-StringData).aws_access_key_id
    $secret = ($creds[$marker+1] | ConvertFrom-StringData).aws_secret_access_key
    $sessiontoken = ($creds[$marker+2] | ConvertFrom-StringData).aws_session_token

    #request temporary signin token from federation endpoint
    $json = @{ sessionId = $access; sessionKey = $secret; sessionToken = $sessiontoken } | ConvertTo-Json -Compress
    $Uri = "https://signin.aws.amazon.com/federation"
    $headers = @{
        "Content-Type" = "application/x-www-form-urlencoded"
    }
    $body = @{
        Action="getSigninToken";
        Session=$json
    }
    $SigninToken = (Invoke-RestMethod -Method GET -Uri $Uri -Headers $headers -Body $body).SigninToken

    #use signin token to build pre-signed URL pointing to EC2 service limit page (valid for 15 minutes)
    Add-Type -AssemblyName System.Web   # needed for HttpUtility in Windows PowerShell
    $destURL = [System.Web.HttpUtility]::UrlEncode("https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Limits")
    $request_parameters = "?Action=login"
    $request_parameters += "&Issuer="
    $request_parameters += "&Destination=$destURL"
    $request_parameters += "&SigninToken=$SigninToken"
    $request_url = $Uri
    $request_url += $request_parameters

    #grab limit page
    $response = Invoke-WebRequest $request_url

    return $response
}
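Hypothetical usage, assuming you have a [dev] profile with temporary STS credentials already sitting in your credentials file:

```powershell
# Fetch the EC2 Limits page for the (hypothetical) "dev" profile
$page = Get-EC2ServiceLimitWebPage -ProfileName "dev"

$page.StatusCode   # 200 if the federation login succeeded
$page.ParsedHtml   # the DOM to drill into for limit values (Windows PowerShell only)
```

Note that ParsedHtml is only populated in Windows PowerShell, which uses Internet Explorer's engine to parse the page; later cross-platform PowerShell versions return raw content you'd have to parse yourself.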

The response returned is an object containing the parsed HTML for the webpage; PowerShell exposes the markup as a DOM you can drill into property by property. The AWS Console UI changes pretty frequently, so drilling down through the div tags is left as an exercise for the reader, but you should be able to get any EC2 limit you want this way.

I’ll update this post in the future if and when AWS improves their service limit APIs. In the meantime, let me know if this was helpful to you or if I missed anything!

8 thoughts on "Adventures in AWS: Automating Service Limit Checks"

    1. forrestbrazeal says:

      Thanks for chiming in, Jason! I actually linked to your project in the original post. My impression at the time was that you were using hard-coded default limits for anything that wasn’t exposed through TrustedAdvisor. Is that still true?

      1. jantman says:

        Forrest,

        Wow, I completely missed that link, sorry.

awslimitchecker has come pretty far since then; I don't know how many people are actually using it, but it's getting a few hundred downloads a day (I assume many of those are scheduled jobs that download and run it daily), and I know that at least two users are running it on behalf of a large number of clients.

        Unfortunately, how to determine limits from a script is somewhat of a mess. Some services (AutoScaling, RDS, IAM, some EC2 things) can be directly retrieved in realtime from the API, which makes it easy. Some others can be retrieved from Trusted Advisor, in which case we use that. If all of those fail, we fall back to the documented default value and allow the user to override it. At my day job, we simply have JSON files of the limit overrides for each of our accounts (for limits not available from the API or Trusted Advisor), and update them every time a limit increase support ticket is resolved. The full list of limits it currently supports, and where their values come from, is available at: http://awslimitchecker.readthedocs.io/en/latest/limits.html

        While your approach is fine for an internal script, my interpretation of the AWS Terms is that screen-scraping is explicitly forbidden, and that was confirmed by AWS Support. As such, I didn’t think it pertinent to include in a public project (not to mention that makes the code, especially authentication, considerably more complex).

        I’m trying to get our AWS account manager to put pressure on the various Service Teams to include an API to retrieve limits like RDS does, but it seems to be slow going (for the most part, as far as I can tell, when feature requests enter AWS they rarely come out).

        -Jason

      2. forrestbrazeal says:

        Hi Jason,

        Thanks for your reply! You’re absolutely correct that limit checking in AWS is still a mess, with some limits exposed via service-level APIs, some in TrustedAdvisor, and some (OnDemand instances in particular) nowhere to be found.

        You’re also correct about the screen scraping 🙂 There’s a reason I didn’t include the actual HTML parsing code in the post…

        We’ve contacted AWS Support at my day job requesting better limit-checking APIs, and received similar responses (or lack thereof) to yours. My suspicion has always been that AWS doesn’t want to make it too easy to automate limit checks because they don’t want to get spammed with automated limit-increase requests. If there’s another reason, I’d be fascinated to hear it.

      3. jantman says:

        I sort of feel like they’d be happy to allow it, since every time someone bumps up against a limit, there’s a good likelihood they’ll decide to clean up some old stuff, which means less revenue for AWS. I imagine if there were an easier (maybe even built-in) way of checking your limits and opening requests when you approach them, there would be a lot more forgotten-about billable resources lying around. Of course, I’m not sure AWS really *wants* this…

        Over the past few years, I’ve come up with 3 theories on why limits may be handled how they are:

        1. They focus on features that either can be advertised as selling points, are requested by their biggest customers, or fix major issues. “We’ll tell you before we prevent you from creating things” certainly isn’t a selling point. And I imagine that Enterprise customers (Netflix, people of that scale) have dedicated staff to deal with this, and I know they certainly get priority service. So, customers in the middle range – big enough to hit limits, but not big enough to have Enterprise support – feel the pain. There’s definitely not a good/easy venue to bring this concern up to AWS, and there’s no real way to know how many other people have.

        2. If you look at the various APIs, it’s clear that the different Service Teams have little coordination. Tagging resources in some service takes a list of key/value lists, in some it takes a hash/map/dictionary of key/value pairs, and in some it takes a hash/map/dictionary with two keys, “Name” and “Value”. Different APIs have different names for the same thing. It’s entirely possible that getting a feature rolled out across all the services is nearly impossible, and that they all manage work in isolation.

        3. Some services simply can’t do it. For Spot Instances, for example, it seems that the limits are actually determined by some separate backend service, and it may not even be feasible to surface them to the customer-facing API.

        Maybe one day we’ll have an answer. Until then I’ll just keep pinging our account rep about this, and maybe go to one of the AWS conferences and see if I can jam it into a Q&A…
