Overview

So, I recently had to automate cross-account Route53 DNS authorization and, believe it or not, discovered along the way that AWS has a security gap that allows a Route53 backdoor, as described in this Blog Post.

Regardless, you might wonder what the initial goal was. Well, the idea was to leverage the AWS Route53 Resolver service, which allows for hybrid DNS query resolution. That means no more DNS controllers in the cloud, because AWS manages all of that for you. This Blog Post and a few similar ones from AWS show what Route53 Resolver is capable of.

Nevertheless, automating this deployment was a challenge. Usually something like this wouldn’t be an obstacle, but this one has steps that can only be executed from within the accounts themselves, with the AWS CLI. One of these steps is the association of the centralized DNS-VPC with the Hosted Zones of the Leaf accounts. I am using Terraform to deploy all my resources, and since it wasn’t possible to simply make that API call cross-account, I had to leverage the SDK in this case. I decided to go with Lambda and Python to achieve this.

Note: I am using Terraform to deploy my resources across accounts, and in most cases one can use providers to distinguish between regions or accounts for infrastructure delivery. However, as mentioned above, that approach is not possible here.

Hybrid-DNS


AWS Resources needed for this Deployment


Centralized Shared Services Account:

  • Shared Services DNS-VPC
  • Route53 Resolver Inbound and Outbound Endpoints for Shared Services DNS-VPC
  • Appropriate Forwarding Rules for the Queries that go On-Premise/AWS and for the Reverse-lookups
  • AWS RAM service used for sharing these Forwarding Rules to the Leaf accounts
  • Lambda "zoneAssociation" function

Leaf Accounts:

  • Leaf VPC
  • Route53 Private Hosted Zone
  • Lambda "zoneAuthAssociation" function

Automation Process and Dependencies


I am delivering the entire infrastructure from a single, isolated account. Let's call it the "Delivery" Account.

First Phase "Delivery" -> "Shared Services" cross-account code execution:

  1. Deploying Shared Services DNS-VPC
  2. Deploying Route53 Resolver Endpoints and attaching them to the DNS-VPC
  3. Deploying Forwarding Rules
  4. Sharing Forwarding Rules with the Leaf Accounts
  5. Deploying "zoneAssociation" Lambda

Second Phase "Delivery" -> "Leaf Account" cross-account code execution:

  1. Deploying Leaf VPC
  2. Deploying Route53 Private Hosted Zone
  3. Associating already shared Forwarding Rules with the Leaf VPC
  4. Associating Private Hosted Zone with the Leaf VPC
  5. Invoking "zoneAuthAssociation" Lambda that will authorize Zone association for this specific Hosted Zone
  6. Invoking "zoneAssociation" Lambda inside Shared Services Account that will associate Hosted Zone to the centralized DNS-VPC where the Route53 Resolver Endpoints reside
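
The ordering of steps 5 and 6 matters: the authorization has to exist in the Leaf account before the Shared Services account can associate the zone. Below is a minimal sketch of that ordering outside of Terraform. The helper name invoke_in_order is hypothetical, and the two arguments leaf_lambda and shared_lambda are assumed to be boto3 Lambda clients created with credentials for the respective accounts. The payload keys match the ones used in the Terraform code further down; "zoneAuthAssociation" reads its values from environment variables, so it simply ignores the payload.

```python
import json


def invoke_in_order(leaf_lambda, shared_lambda, hosted_zone_id, region, dns_vpc_id):
    """Run step 5 (authorize), then step 6 (associate); stop on the first failure.

    leaf_lambda / shared_lambda: boto3 Lambda clients for the Leaf and
    Shared Services accounts (assumed to be created by the caller).
    """
    payload = {
        "hosted_zone_id": hosted_zone_id,
        "region": region,
        "dns_vpc_id": dns_vpc_id,
    }
    steps = [
        (leaf_lambda, "zoneAuthAssociation"),  # step 5, runs in the Leaf account
        (shared_lambda, "zoneAssociation"),    # step 6, runs in Shared Services
    ]
    for client, function_name in steps:
        response = client.invoke(
            FunctionName=function_name,
            InvocationType="RequestResponse",  # synchronous: wait for the result
            Payload=json.dumps(payload).encode("utf-8"),
        )
        # An exception raised by the handler shows up as a FunctionError field
        if "FunctionError" in response:
            raise RuntimeError(f"{function_name} failed: {response['FunctionError']}")
    return [name for _, name in steps]
```

Passing the clients in as arguments keeps the credential handling (profiles, assumed roles) out of the helper and makes it easy to exercise with stand-in objects.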

Terraform Code for Lambda Deployment Dependencies


resource "aws_route53_zone" "this" {
  name = "${var.project_name}.${var.aws_domain}"
  
  vpc {
    vpc_id = "${module.vpc.vpc_id}"
    vpc_region = "${var.region}"
  }
  
  lifecycle {
    ignore_changes = ["vpc"]
  }

  tags = module.label.tags
}

data "aws_lambda_invocation" "lambda-auth-association" {
  function_name = "${aws_lambda_function.lambda-auth-association.function_name}"
  input = <<JSON
{
  "foo": "bar"
}
JSON

  depends_on = ["aws_lambda_function.lambda-auth-association"]
}

## Execute only once, waiting on the resolution https://github.com/terraform-providers/terraform-provider-aws/issues/4746 or https://github.com/hashicorp/terraform/issues/17034
data "aws_lambda_invocation" "lambda-association" {
  provider = "aws.shared_services"

  function_name = "${var.lambda_association_name}"
  input = <<JSON
{
  "hosted_zone_id": "${aws_route53_zone.this.id}",
  "region": "${var.region}",
  "dns_vpc_id": "${var.dns_vpc_id}"
}
JSON

  depends_on = ["data.aws_lambda_invocation.lambda-auth-association"]
}

data "archive_file" "lambda-auth-association" {
  type          = "zip"
  source_file   = "${path.module}/lambdas/zone_auth_association.py"
  output_path   = "${path.module}/lambdas/zone_auth_association.zip"
}

resource "aws_lambda_function" "lambda-auth-association" {
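  # Reference the archive so the zip is always built before the function is created
  filename      = "${data.archive_file.lambda-auth-association.output_path}"
  function_name = "zoneAuthAssociation"
  role          = "${aws_iam_role.lambda-auth-association.arn}"
  # Handler format is <file name without .py>.<function>; the zipped file is zone_auth_association.py
  handler       = "zone_auth_association.handler"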

  runtime = "python3.7"
  timeout = "60"

  environment {
    variables = {
      HOSTED_ZONE_ID = "${aws_route53_zone.this.id}"
      REGION = "${var.region}"
      DNS_VPC_ID = "${var.dns_vpc_id}"
    }
  }
  
  depends_on = ["aws_iam_role_policy_attachment.route53-fullaccess", "aws_iam_role.lambda-auth-association"]
}

resource "aws_iam_role" "lambda-auth-association" {
  name = "lambda-auth-association"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

data "aws_iam_policy" "route53-fullaccess" {
  arn = "arn:aws:iam::aws:policy/AmazonRoute53FullAccess"
}

resource "aws_iam_role_policy_attachment" "route53-fullaccess" {
  role       = "${aws_iam_role.lambda-auth-association.name}"
  policy_arn = "${data.aws_iam_policy.route53-fullaccess.arn}"
}

data "aws_iam_policy" "cloudwatch-fullaccess" {
  arn = "arn:aws:iam::aws:policy/CloudWatchFullAccess"
}

resource "aws_iam_role_policy_attachment" "cloudwatch-fullaccess" {
  role       = "${aws_iam_role.lambda-auth-association.name}"
  policy_arn = "${data.aws_iam_policy.cloudwatch-fullaccess.arn}"
}

data "aws_route53_resolver_rule" "aws" {
  name        = "aws"
}

data "aws_route53_resolver_rule" "onprem" {
  name        = "onprem"
}

data "aws_route53_resolver_rule" "aws-reverselookup" {
  name        = "aws-reverselookup"
}

data "aws_route53_resolver_rule" "onprem-reverselookup" {
  name        = "onprem-reverselookup"
}

resource "aws_route53_resolver_rule_association" "aws" {
  resolver_rule_id = "${data.aws_route53_resolver_rule.aws.id}"
  vpc_id           = "${module.vpc.vpc_id}"
}

resource "aws_route53_resolver_rule_association" "onprem" {
  resolver_rule_id = "${data.aws_route53_resolver_rule.onprem.id}"
  vpc_id           = "${module.vpc.vpc_id}"
}

resource "aws_route53_resolver_rule_association" "aws-reverselookup" {
  resolver_rule_id = "${data.aws_route53_resolver_rule.aws-reverselookup.id}"
  vpc_id           = "${module.vpc.vpc_id}"
}

resource "aws_route53_resolver_rule_association" "onprem-reverselookup" {
  resolver_rule_id = "${data.aws_route53_resolver_rule.onprem-reverselookup.id}"
  vpc_id           = "${module.vpc.vpc_id}"
}

Lambda Code for Zone Authorization Association


"""
Authorize Associations to the DNS-VPC
Python Version: 3.7.0
"""

import os
import boto3
from botocore.exceptions import ClientError
import logging
hosted_zone_id = os.getenv('HOSTED_ZONE_ID')
region         = os.getenv('REGION')
dns_vpc_id     = os.getenv('DNS_VPC_ID')

def aws_session(role_arn=None, session_name='lambda_session'):
    """
    If role_arn is given assumes a role and returns boto3 session
    otherwise return a regular session with the current IAM user/role
    """
    if role_arn:
        client = boto3.client('sts')
        response = client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
        session = boto3.Session(
            aws_access_key_id=response['Credentials']['AccessKeyId'],
            aws_secret_access_key=response['Credentials']['SecretAccessKey'],
            aws_session_token=response['Credentials']['SessionToken'])
        return session
    else:
        return boto3.Session()
    
def auth_dns_vpc():
    """Authorize the DNS-VPC to be associated with this account's private hosted zone."""
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    try:
        client = boto3.client('route53')

        response = client.create_vpc_association_authorization(
            HostedZoneId=str(hosted_zone_id),
            VPC={
                'VPCRegion': str(region),
                'VPCId': str(dns_vpc_id)
            }
        )
        logger.debug(response)
        logger.info("Successfully authorized association to DNS-VPC for own private hosted zone!")
    except Exception as error:
        # 'response' is never assigned when the API call fails, so log only the error
        logger.exception(error)

def handler(event, context):
    session_regular = aws_session()
    auth_dns_vpc()
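
One possible follow-up, not part of the deployment described here: AWS's guidance for cross-account associations recommends deleting the authorization once the association exists, since removing it does not affect the association itself. A minimal sketch, assuming a hypothetical helper delete_authorization that receives a boto3 Route53 client for the Leaf account:

```python
def delete_authorization(route53, hosted_zone_id, region, dns_vpc_id):
    """Remove the leaf zone's authorization for the DNS-VPC.

    route53: a boto3 Route53 client for the account that owns the hosted zone
    (assumed to be created by the caller). Returns the request that was sent.
    """
    request = {
        "HostedZoneId": hosted_zone_id,
        "VPC": {"VPCRegion": region, "VPCId": dns_vpc_id},
    }
    # Only the authorization is removed; an existing association stays in place
    route53.delete_vpc_association_authorization(**request)
    return request
```

This would have to run only after the "zoneAssociation" Lambda has succeeded, otherwise the association call is rejected.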

Lambda Code for Zone-DNSVPC Association


"""
Associate DNS-VPCs with the appropriate private hosted zones
Python Version: 3.7.0
"""

import os
import boto3
from botocore.exceptions import ClientError
import logging

def aws_session(role_arn=None, session_name='lambda_session'):
    """
    If role_arn is given assumes a role and returns boto3 session
    otherwise return a regular session with the current IAM user/role
    """
    if role_arn:
        client = boto3.client('sts')
        response = client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
        session = boto3.Session(
            aws_access_key_id=response['Credentials']['AccessKeyId'],
            aws_secret_access_key=response['Credentials']['SecretAccessKey'],
            aws_session_token=response['Credentials']['SessionToken'])
        return session
    else:
        return boto3.Session()
    
def dns_vpc_zone_association(hosted_zone_id, region, dns_vpc_id):
    """Associate the DNS-VPC with a leaf account's private hosted zone."""
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    try:
        client = boto3.client('route53')
        response = client.associate_vpc_with_hosted_zone(
            HostedZoneId=hosted_zone_id,
            VPC={
                'VPCRegion': region,
                'VPCId': dns_vpc_id
            },
            Comment='Associate DNS-VPC with the leaf hosted zone'
        )
        logger.debug(response)
        logger.info("Successfully associated DNS-VPC with the leaf private hosted zone!")
    except ClientError as e:
        if e.response['Error']['Code'] == "ConflictingDomainExists":
            # Tolerated as a no-op; likely the association already exists from an earlier run
            logger.info("Skipping association: %s", e)
        else:
            # Log other API errors instead of discarding them silently
            logger.exception(e)
    except Exception as error:
        # 'response' is never assigned when the API call fails, so log only the error
        logger.exception(error)

def handler(event, context):
    hosted_zone_id = event['hosted_zone_id']
    region         = event['region']
    dns_vpc_id     = event['dns_vpc_id']

    session_regular = aws_session()
    dns_vpc_zone_association(hosted_zone_id, region, dns_vpc_id)
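
Because the function above logs errors rather than raising them, it can be useful to check afterwards that the association really exists. The sketch below is one way to do that from the Shared Services account, using the Route53 list_hosted_zones_by_vpc call. The helper name is_associated is hypothetical, and the route53 argument is assumed to be a boto3 Route53 client supplied by the caller.

```python
def is_associated(route53, hosted_zone_id, region, dns_vpc_id):
    """Return True if the DNS-VPC is associated with the given private hosted zone.

    route53: a boto3 Route53 client for the account that owns the DNS-VPC
    (assumed to be created by the caller).
    """
    # Accept both "Z123..." and "/hostedzone/Z123..." forms of the zone ID
    wanted = hosted_zone_id.split("/")[-1]
    kwargs = {"VPCId": dns_vpc_id, "VPCRegion": region}
    while True:
        page = route53.list_hosted_zones_by_vpc(**kwargs)
        for zone in page.get("HostedZoneSummaries", []):
            if zone["HostedZoneId"].split("/")[-1] == wanted:
                return True
        # Follow pagination until there are no more results
        token = page.get("NextToken")
        if not token:
            return False
        kwargs["NextToken"] = token
```

A call such as is_associated(boto3.client('route53'), hosted_zone_id, region, dns_vpc_id) could be added at the end of the handler, or run by hand after a deployment.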

Conclusion

As you can see, it is pretty easy to combine IaC with the SDK in an automated manner. I did have to leverage data sources such as aws_lambda_invocation to invoke the Lambdas and add a few depends_on lines, but in the end it did its job. I find AWS's efforts to replace the need for virtual appliances in the cloud with their own managed services an excellent path forward. Especially after the introduction of Transit Gateway, I can confirm that Route 53 Resolver is an excellent fit for a Hybrid-DNS architecture.