AWS (Study Notes, Lesson 25): Using AWS Batch

What this lesson covers:

  • AWS Batch overall architecture
  • AWS Batch hands-on

1. AWS Batch overall architecture

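  1. Overall architecture of AWS Batch
    Compared with online development, batch processing is a development area of its own. AWS Batch jobs can run on either EC2 or Fargate.
    • Users place jobs into a job queue in various ways.
    • AWS Batch takes the jobs out of the job queue.
    • It then runs each job on Fargate or EC2.
      The hands-on below uses Fargate.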
  2. Steps for setting up AWS Batch
    • Create a compute environment.
    • Create a job queue and bind it to the compute environment you just created.
    • Create a job definition.
    • Finally, create a job.
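
The order of these steps matters because each later resource refers to an earlier one by name. A minimal Python sketch of that order, assuming a boto3-style Batch client: all resource names, the subnet and security group IDs, the role ARN, and the `busybox` image are hypothetical placeholders, and a recording stub stands in for the real client so the sequence can be checked without an AWS account.

```python
class RecordingClient:
    """Stand-in for boto3.client("batch"): records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method name is accepted and logged with its keyword arguments.
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


def set_up_batch(batch):
    # 1. Compute environment (Fargate, as in the hands-on).
    batch.create_compute_environment(
        computeEnvironmentName="demo-ce",            # hypothetical name
        type="MANAGED",
        computeResources={
            "type": "FARGATE",
            "maxvCpus": 4,
            "subnets": ["subnet-EXAMPLE"],           # placeholder
            "securityGroupIds": ["sg-EXAMPLE"],      # placeholder
        },
    )
    # Real code would wait here until the environment reports VALID.

    # 2. Job queue, bound to the compute environment from step 1.
    batch.create_job_queue(
        jobQueueName="demo-queue",
        priority=1,
        computeEnvironmentOrder=[{"order": 1, "computeEnvironment": "demo-ce"}],
    )

    # 3. Job definition: container image plus command.
    batch.register_job_definition(
        jobDefinitionName="demo-jobdef",
        type="container",
        platformCapabilities=["FARGATE"],
        containerProperties={
            "image": "busybox",                      # assumed image
            "command": ["echo", "hello"],
            "executionRoleArn": "arn:aws:iam::111122223333:role/EXAMPLE",
            "resourceRequirements": [
                {"type": "VCPU", "value": "0.25"},
                {"type": "MEMORY", "value": "512"},
            ],
        },
    )

    # 4. Job, tying the queue and the definition together.
    batch.submit_job(
        jobName="demo-job", jobQueue="demo-queue", jobDefinition="demo-jobdef"
    )


client = RecordingClient()
set_up_batch(client)
print([name for name, _ in client.calls])
```

With real credentials, passing `boto3.client("batch")` instead of the stub would issue the same four calls against AWS.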

2. AWS Batch hands-on

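  1. First, define the compute environment
    • Define the compute environment

    • Use Fargate and the default role

      • Use the default role here; next, look at the permissions and trust relationship that AWS configures for this role.
      • Look at AWSServiceRoleForBatch
      • Permissions
        The permissions are mostly for EC2, ECS, and Auto Scaling, plus a few for CloudWatch Logs, IAM, and EKS.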
        {
           "Version": "2012-10-17",
           "Statement": [
               {
                   "Sid": "AWSBatchPolicyStatement1",
                   "Effect": "Allow",
                   "Action": [
                       "ec2:DescribeAccountAttributes",
                       "ec2:DescribeInstances",
                       "ec2:DescribeInstanceStatus",
                       "ec2:DescribeInstanceAttribute",
                       "ec2:DescribeSubnets",
                       "ec2:DescribeSecurityGroups",
                       "ec2:DescribeKeyPairs",
                       "ec2:DescribeImages",
                       "ec2:DescribeImageAttribute",
                       "ec2:DescribeSpotInstanceRequests",
                       "ec2:DescribeSpotFleetInstances",
                       "ec2:DescribeSpotFleetRequests",
                       "ec2:DescribeSpotPriceHistory",
                       "ec2:DescribeSpotFleetRequestHistory",
                       "ec2:DescribeVpcClassicLink",
                       "ec2:DescribeLaunchTemplateVersions",
                       "ec2:RequestSpotFleet",
                       "autoscaling:DescribeAccountLimits",
                       "autoscaling:DescribeAutoScalingGroups",
                       "autoscaling:DescribeLaunchConfigurations",
                       "autoscaling:DescribeAutoScalingInstances",
                       "autoscaling:DescribeScalingActivities",
                       "eks:DescribeCluster",
                       "ecs:DescribeClusters",
                       "ecs:DescribeContainerInstances",
                       "ecs:DescribeTaskDefinition",
                       "ecs:DescribeTasks",
                       "ecs:ListClusters",
                       "ecs:ListContainerInstances",
                       "ecs:ListTaskDefinitionFamilies",
                       "ecs:ListTaskDefinitions",
                       "ecs:ListTasks",
                       "ecs:DeregisterTaskDefinition",
                       "ecs:TagResource",
                       "ecs:ListAccountSettings",
                       "logs:DescribeLogGroups",
                       "iam:GetInstanceProfile",
                       "iam:GetRole"
                   ],
                   "Resource": "*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement2",
                   "Effect": "Allow",
                   "Action": [
                       "logs:CreateLogGroup",
                       "logs:CreateLogStream"
                   ],
                   "Resource": "arn:aws:logs:*:*:log-group:/aws/batch/job*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement3",
                   "Effect": "Allow",
                   "Action": [
                       "logs:PutLogEvents"
                   ],
                   "Resource": "arn:aws:logs:*:*:log-group:/aws/batch/job*:log-stream:*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement4",
                   "Effect": "Allow",
                   "Action": [
                       "autoscaling:CreateOrUpdateTags"
                   ],
                   "Resource": "*",
                   "Condition": {
                       "Null": {
                           "aws:RequestTag/AWSBatchServiceTag": "false"
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement5",
                   "Effect": "Allow",
                   "Action": "iam:PassRole",
                   "Resource": [
                       "*"
                   ],
                   "Condition": {
                       "StringEquals": {
                           "iam:PassedToService": [
                               "ec2.amazonaws.com",
                               "ec2.amazonaws.com.cn",
                               "ecs-tasks.amazonaws.com"
                           ]
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement6",
                   "Effect": "Allow",
                   "Action": "iam:CreateServiceLinkedRole",
                   "Resource": "*",
                   "Condition": {
                       "StringEquals": {
                           "iam:AWSServiceName": [
                               "spot.amazonaws.com",
                               "spotfleet.amazonaws.com",
                               "autoscaling.amazonaws.com",
                               "ecs.amazonaws.com"
                           ]
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement7",
                   "Effect": "Allow",
                   "Action": [
                       "ec2:CreateLaunchTemplate"
                   ],
                   "Resource": "*",
                   "Condition": {
                       "Null": {
                           "aws:RequestTag/AWSBatchServiceTag": "false"
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement8",
                   "Effect": "Allow",
                   "Action": [
                       "ec2:TerminateInstances",
                       "ec2:CancelSpotFleetRequests",
                       "ec2:ModifySpotFleetRequest",
                       "ec2:DeleteLaunchTemplate"
                   ],
                   "Resource": "*",
                   "Condition": {
                       "Null": {
                           "aws:ResourceTag/AWSBatchServiceTag": "false"
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement9",
                   "Effect": "Allow",
                   "Action": [
                       "autoscaling:CreateLaunchConfiguration",
                       "autoscaling:DeleteLaunchConfiguration"
                   ],
                   "Resource": "arn:aws:autoscaling:*:*:launchConfiguration:*:launchConfigurationName/AWSBatch*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement10",
                   "Effect": "Allow",
                   "Action": [
                       "autoscaling:CreateAutoScalingGroup",
                       "autoscaling:UpdateAutoScalingGroup",
                       "autoscaling:SetDesiredCapacity",
                       "autoscaling:DeleteAutoScalingGroup",
                       "autoscaling:SuspendProcesses",
                       "autoscaling:PutNotificationConfiguration",
                       "autoscaling:TerminateInstanceInAutoScalingGroup"
                   ],
                   "Resource": "arn:aws:autoscaling:*:*:autoScalingGroup:*:autoScalingGroupName/AWSBatch*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement11",
                   "Effect": "Allow",
                   "Action": [
                       "ecs:DeleteCluster",
                       "ecs:DeregisterContainerInstance",
                       "ecs:RunTask",
                       "ecs:StartTask",
                       "ecs:StopTask"
                   ],
                   "Resource": "arn:aws:ecs:*:*:cluster/AWSBatch*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement12",
                   "Effect": "Allow",
                   "Action": [
                       "ecs:RunTask",
                       "ecs:StartTask",
                       "ecs:StopTask"
                   ],
                   "Resource": "arn:aws:ecs:*:*:task-definition/*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement13",
                   "Effect": "Allow",
                   "Action": [
                       "ecs:StopTask"
                   ],
                   "Resource": "arn:aws:ecs:*:*:task/*/*"
               },
               {
                   "Sid": "AWSBatchPolicyStatement14",
                   "Effect": "Allow",
                   "Action": [
                       "ecs:CreateCluster",
                       "ecs:RegisterTaskDefinition"
                   ],
                   "Resource": "*",
                   "Condition": {
                       "Null": {
                           "aws:RequestTag/AWSBatchServiceTag": "false"
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement15",
                   "Effect": "Allow",
                   "Action": "ec2:RunInstances",
                   "Resource": [
                       "arn:aws:ec2:*::image/*",
                       "arn:aws:ec2:*::snapshot/*",
                       "arn:aws:ec2:*:*:subnet/*",
                       "arn:aws:ec2:*:*:network-interface/*",
                       "arn:aws:ec2:*:*:security-group/*",
                       "arn:aws:ec2:*:*:volume/*",
                       "arn:aws:ec2:*:*:key-pair/*",
                       "arn:aws:ec2:*:*:launch-template/*",
                       "arn:aws:ec2:*:*:placement-group/*",
                       "arn:aws:ec2:*:*:capacity-reservation/*",
                       "arn:aws:ec2:*:*:elastic-gpu/*",
                       "arn:aws:elastic-inference:*:*:elastic-inference-accelerator/*",
                       "arn:aws:resource-groups:*:*:group/*"
                   ]
               },
               {
                   "Sid": "AWSBatchPolicyStatement16",
                   "Effect": "Allow",
                   "Action": "ec2:RunInstances",
                   "Resource": "arn:aws:ec2:*:*:instance/*",
                   "Condition": {
                       "Null": {
                           "aws:RequestTag/AWSBatchServiceTag": "false"
                       }
                   }
               },
               {
                   "Sid": "AWSBatchPolicyStatement17",
                   "Effect": "Allow",
                   "Action": [
                       "ec2:CreateTags"
                   ],
                   "Resource": [
                       "*"
                   ],
                   "Condition": {
                       "StringEquals": {
                           "ec2:CreateAction": [
                               "RunInstances",
                               "CreateLaunchTemplate",
                               "RequestSpotFleet"
                           ]
                       }
                   }
               }
           ]
        }
        
      • Trust relationship
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "batch.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }
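
The trust relationship above says only that the AWS Batch service may assume this role. As a rough way to read such a document programmatically, the following Python sketch lists the service principals that an assume-role policy allows; the helper name `trusted_services` is made up for illustration and only handles the simple shape shown above.

```python
import json

# The trust relationship shown above, copied verbatim.
TRUST_POLICY = json.loads("""
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "batch.amazonaws.com"},
            "Action": "sts:AssumeRole"
        }
    ]
}
""")


def trusted_services(policy):
    """Return the service principals that are allowed to call sts:AssumeRole."""
    services = set()
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "*"; not a service principal
        svc = principal.get("Service", [])
        if isinstance(svc, str):
            svc = [svc]
        services.update(svc)
    return services


print(trusted_services(TRUST_POLICY))
```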
        
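    • Configure the VPC and other network settings

      • Use a public VPC
        A public VPC is used here to keep the exercise simple. In a real production environment the compute environment should sit in a private VPC for security, which then needs a NAT for outbound access.
    • The created compute environment
      After a short wait, its status changes to successful.

    • Define the job queue
      Defining a job queue is simple; it only has to be bound to the compute environment.

    • Define the job (job definition)
      A Fargate job definition comes down to a docker image plus an entry command.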

      • Set the job name
        • Set up a role for running the ECS task
          • Choose the service
          • Set the permissions
            Select the AWS managed policy AmazonECSTaskExecutionRolePolicy here.
          • Set the role on the job definition
          • Define the docker image and entry command in the job definition
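
The console fields above correspond to a fairly small request body. The sketch below builds one for a Fargate job definition in Python, in the shape that boto3's `register_job_definition` accepts. The function name, the `busybox` image, the role ARN, the 0.25 vCPU / 512 MiB sizing, and the public IP setting are assumptions for illustration, not values taken from the screenshots.

```python
def fargate_job_definition(name, image, command, execution_role_arn):
    """Build keyword arguments for a Fargate container job definition."""
    return {
        "jobDefinitionName": name,
        "type": "container",
        "platformCapabilities": ["FARGATE"],
        "containerProperties": {
            "image": image,                    # the docker image
            "command": command,                # the entry command
            # Role carrying AmazonECSTaskExecutionRolePolicy.
            "executionRoleArn": execution_role_arn,
            # Fargate sizes are given as strings; this pair is an assumed example.
            "resourceRequirements": [
                {"type": "VCPU", "value": "0.25"},
                {"type": "MEMORY", "value": "512"},
            ],
            # Assumption: jobs run in public subnets, as in this hands-on.
            "networkConfiguration": {"assignPublicIp": "ENABLED"},
        },
    }


job_def = fargate_job_definition(
    name="demo-jobdef",                                      # hypothetical
    image="busybox",                                         # assumed image
    command=["echo", "hello,world from job definition"],
    execution_role_arn="arn:aws:iam::111122223333:role/EXAMPLE",
)
print(job_def["containerProperties"]["command"])
```

With credentials in place, `boto3.client("batch").register_job_definition(**job_def)` would register it.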
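    • Run the job (job execution)

      • Submit a new job
      • Associate the job queue and the job definition
      • The entry command given on the job can override the entry command in the job definition
      • Submit the new job
    • Execution result

      • The job succeeds
      • Check the log
        The log shows that the command override worked: the hello,world from job defined on the job is printed, while the hello,world from job definition defined in the job definition does not take effect.
        The echo command is used here only for testing; in a real production environment it should be the image's entry point command.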
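
The override behaviour seen in the log can be modelled in a few lines. This Python sketch is only a model of that observation, not AWS Batch's implementation: `effective_command` is a hypothetical helper, the job, queue, and definition names are placeholders, and the request dictionary follows the shape of boto3's `submit_job` parameters.

```python
def effective_command(job_definition, submit_request):
    """Return the command a job would run: the job-level override if one
    is given, otherwise the command from the job definition."""
    overrides = submit_request.get("containerOverrides", {})
    if "command" in overrides:
        return overrides["command"]
    return job_definition["containerProperties"]["command"]


job_definition = {
    "jobDefinitionName": "demo-jobdef",  # hypothetical name
    "containerProperties": {
        "command": ["echo", "hello,world from job definition"],
    },
}

# Same keys as the keyword arguments of boto3's submit_job.
submit_request = {
    "jobName": "demo-job",        # hypothetical name
    "jobQueue": "demo-queue",     # hypothetical name
    "jobDefinition": "demo-jobdef",
    "containerOverrides": {"command": ["echo", "hello,world from job"]},
}

print(effective_command(job_definition, submit_request))
```

Dropping `containerOverrides` from the request makes the function fall back to the command in the job definition.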