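Alibaba Cloud SLB Load Balancing Configuration
1. Overview
1.1 Background
When business traffic exceeds what a single server can carry, or a service has to be highly available, load balancing becomes essential infrastructure. Alibaba Cloud SLB (Server Load Balancer) is among the most widely used cloud load balancing services in China and carries a very large volume of internet traffic.
During Double 11 in 2024, one e-commerce platform used an SLB cluster to carry a peak of 500,000 requests per second. Its backend fleet scaled elastically from the usual 20 servers to 200, the whole process was transparent to users, and service availability reached 99.99%. This was made possible by SLB's elastic scaling, health checks, and multi-zone disaster recovery design.
SLB provides Layer 4 (TCP/UDP) and Layer 7 (HTTP/HTTPS) load balancing. Layer 4 suits scenarios that need maximum performance, while Layer 7 offers richer traffic management such as URL-based routing, cookie-based session persistence, and HTTPS offloading.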
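1.2 Technical Features
Multiple product forms
The Alibaba Cloud load balancing product line consists of three products:
CLB (Classic Load Balancer): supports Layer 4 and Layer 7; mature and stable
ALB (Application Load Balancer): focused on Layer 7, with richer routing rules
NLB (Network Load Balancer): focused on Layer 4, with very high performance
Selection advice for 2025: prefer ALB/NLB for new workloads, and keep CLB as the stable choice for existing ones.
Elastic scaling
SLB instances scale automatically, with no manual capacity expansion:
CLB guaranteed-performance instances are billed by specification
ALB/NLB are billed by actual usage, with no specification to choose
Multi-zone disaster recovery
SLB can be deployed across zones and fails over automatically to the backup zone when the primary zone fails:
Primary/backup mode: one primary zone and one backup zone
Active-active mode (ALB): multiple zones serve traffic at the same time
Health checks
SLB continuously probes the health of backend servers:
Layer 4 health checks: TCP connection or UDP probe
Layer 7 health checks: HTTP/HTTPS requests
Unhealthy servers are isolated automatically and added back once they recover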
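1.3 Use Cases
| Scenario | Recommended product | Typical configuration |
|---|---|---|
| Web application | ALB | HTTPS listener + URL-based routing |
| API gateway | ALB | Multiple domains + forwarding rules + rate limiting |
| Game service | NLB | UDP listener + session persistence |
| Database proxy | NLB | TCP listener + backend server group |
| Hybrid cloud access | CLB | VPN Gateway + Cloud Enterprise Network integration |
| Microservices | ALB | gRPC support + service discovery integration |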
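1.4 Requirements
| Component | Requirement | Notes |
|---|---|---|
| VPC | Created | SLB must be inside a VPC |
| Zones | At least 2 | Needed for high-availability deployment |
| ECS instances | Running normally | Backend servers |
| Security group | Correctly configured | Allows SLB health check traffic |
| Alibaba Cloud account | Real-name verified | SLB service activated |
| RAM permissions | SLB-related permissions | Needed by the operations account |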
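2. Detailed Steps
2.1 Preparation
VPC network planning
Plan the network architecture before creating the SLB:
VPC: 10.0.0.0/8
├── Zone A
│   ├── Public subnet: 10.0.1.0/24 (SLB, NAT gateway)
│   └── Private subnet: 10.0.10.0/24 (ECS instances)
├── Zone B
│   ├── Public subnet: 10.0.2.0/24 (SLB standby)
│   └── Private subnet: 10.0.20.0/24 (ECS instances)
└── Zone C
    └── Private subnet: 10.0.30.0/24 (ECS instances for scale-out)
Backend server preparation
Make sure the backend ECS instances are running normally:
# Check the web service status
systemctl status nginx
# Confirm the ports are listening (-E is needed for the | alternation)
ss -tlnp | grep -E ':80|:443'
# Test the local service
curl -I http://localhost/health
# Check the security group rules (allow SLB health checks)
# Source: 100.64.0.0/10 (SLB health check range)
# Port: the service port
Preparing the infrastructure with Terraform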
# main.tf
provider "alicloud" {
region = "cn-hangzhou"
}
# VPC
resource "alicloud_vpc" "main" {
vpc_name = "prod-vpc"
cidr_block = "10.0.0.0/8"
}
# vSwitch - Zone A
resource "alicloud_vswitch" "zone_a" {
vpc_id = alicloud_vpc.main.id
cidr_block = "10.0.10.0/24"
zone_id = "cn-hangzhou-h"
vswitch_name = "prod-vsw-a"
}
# vSwitch - Zone B
resource "alicloud_vswitch" "zone_b" {
vpc_id = alicloud_vpc.main.id
cidr_block = "10.0.20.0/24"
zone_id = "cn-hangzhou-i"
vswitch_name = "prod-vsw-b"
}
# Security group
resource "alicloud_security_group" "web" {
name = "web-sg"
vpc_id = alicloud_vpc.main.id
description = "Security group for web servers"
}
# Security group rule - allow SLB health checks
resource "alicloud_security_group_rule" "slb_health_check" {
type = "ingress"
ip_protocol = "tcp"
port_range = "80/80"
security_group_id = alicloud_security_group.web.id
cidr_ip = "100.64.0.0/10"
description = "Allow SLB health check"
}
# ECS instances
resource "alicloud_instance" "web" {
count = 4
instance_name = "web-${count.index + 1}"
image_id = "aliyun_3_x64_20G_alibase_20231220.vhd"
instance_type = "ecs.g7.large"
security_groups = [alicloud_security_group.web.id]
vswitch_id = count.index % 2 == 0 ? alicloud_vswitch.zone_a.id : alicloud_vswitch.zone_b.id
system_disk_category = "cloud_essd"
system_disk_size = 40
tags = {
Environment = "prod"
Role = "web"
}
}
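2.2 Core Configuration
Creating a CLB instance
Create it in the console:
Log in to the SLB console -> Instances -> Create Load Balancer
Choose the configuration:
Instance type: Classic Load Balancer (CLB)
Specification: guaranteed-performance (choose according to the workload)
Network type: public / private
Primary zone: cn-hangzhou-h
Backup zone: cn-hangzhou-i
Create it with Terraform:
# CLB instance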
resource "alicloud_slb_load_balancer" "main" {
load_balancer_name = "prod-clb"
address_type = "internet"
load_balancer_spec = "slb.s3.medium"
vswitch_id = alicloud_vswitch.zone_a.id
master_zone_id = "cn-hangzhou-h"
slave_zone_id = "cn-hangzhou-i"
tags = {
Environment = "prod"
}
}
# HTTP listener
resource "alicloud_slb_listener" "http" {
load_balancer_id = alicloud_slb_load_balancer.main.id
backend_port = 80
frontend_port = 80
protocol = "http"
bandwidth = -1
sticky_session = "on"
sticky_session_type = "insert"
cookie_timeout = 86400
health_check = "on"
health_check_type = "http"
health_check_uri = "/health"
health_check_connect_port = 80
healthy_threshold = 3
unhealthy_threshold = 3
health_check_timeout = 5
health_check_interval = 2
health_check_http_code = "http_2xx,http_3xx"
x_forwarded_for {
retrive_slb_ip = true
retrive_slb_id = true
}
gzip = true
request_timeout = 60
idle_timeout = 15
}
Creating an ALB instance
ALB is a better fit for modern web applications:
# ALB instance
resource "alicloud_alb_load_balancer" "main" {
vpc_id = alicloud_vpc.main.id
address_type = "Internet"
address_allocated_mode = "Dynamic"
load_balancer_name = "prod-alb"
load_balancer_edition = "Standard"
load_balancer_billing_config {
pay_type = "PayAsYouGo"
}
zone_mappings {
vswitch_id = alicloud_vswitch.zone_a.id
zone_id = "cn-hangzhou-h"
}
zone_mappings {
vswitch_id = alicloud_vswitch.zone_b.id
zone_id = "cn-hangzhou-i"
}
}
# Server group
resource "alicloud_alb_server_group" "main" {
protocol = "HTTP"
vpc_id = alicloud_vpc.main.id
server_group_name = "prod-server-group"
server_group_type = "Instance"
health_check_config {
health_check_connect_port = 80
health_check_enabled = true
health_check_host = "$SERVER_IP"
health_check_http_version = "HTTP1.1"
health_check_interval = 2
health_check_method = "GET"
health_check_path = "/health"
health_check_protocol = "HTTP"
health_check_timeout = 5
healthy_threshold = 3
unhealthy_threshold = 3
health_check_codes = ["http_2xx", "http_3xx"]
}
sticky_session_config {
sticky_session_enabled = true
sticky_session_type = "Insert"
cookie_timeout = 86400
}
}
# Attach backend servers
resource "alicloud_alb_server_group_server_attachment" "main" {
count = 4
server_group_id = alicloud_alb_server_group.main.id
server_id = alicloud_instance.web[count.index].id
server_ip = alicloud_instance.web[count.index].private_ip
server_type = "Ecs"
port = 80
weight = 100
}
# Listener
resource "alicloud_alb_listener" "http" {
load_balancer_id = alicloud_alb_load_balancer.main.id
listener_protocol = "HTTP"
listener_port = 80
listener_description = "HTTP Listener"
default_actions {
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.main.id
}
}
}
}
Configuring an HTTPS listener
# Upload the SSL certificate
resource "alicloud_slb_server_certificate" "main" {
name = "prod-cert"
server_certificate = file("${path.module}/certs/server.crt")
private_key = file("${path.module}/certs/server.key")
}
# HTTPS listener (CLB)
resource "alicloud_slb_listener" "https" {
load_balancer_id = alicloud_slb_load_balancer.main.id
backend_port = 80
frontend_port = 443
protocol = "https"
bandwidth = -1
server_certificate_id = alicloud_slb_server_certificate.main.id
tls_cipher_policy = "tls_cipher_policy_1_2"
sticky_session = "on"
sticky_session_type = "insert"
cookie_timeout = 86400
health_check = "on"
health_check_uri = "/health"
healthy_threshold = 3
unhealthy_threshold = 3
health_check_timeout = 5
health_check_interval = 2
health_check_http_code = "http_2xx,http_3xx"
x_forwarded_for {
retrive_slb_ip = true
retrive_slb_id = true
}
gzip = true
request_timeout = 60
idle_timeout = 15
}
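# Redirect HTTP to HTTPS
# Note: a CLB instance can hold only one listener on port 80, so this
# listener takes the place of the plain HTTP listener defined earlier.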
resource "alicloud_slb_listener" "http_redirect" {
load_balancer_id = alicloud_slb_load_balancer.main.id
frontend_port = 80
protocol = "http"
bandwidth = -1
listener_forward = "on"
forward_port = 443
}
ALB HTTPS configuration (recommended)
# Create the HTTPS listener (ALB)
resource "alicloud_alb_listener" "https" {
load_balancer_id = alicloud_alb_load_balancer.main.id
listener_protocol = "HTTPS"
listener_port = 443
listener_description = "HTTPS Listener"
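# Note: ALB takes a certificate ID from the SSL Certificates Service (as in
# section 3.1), not a CLB server certificate; adjust this reference to match.
certificates {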
certificate_id = alicloud_slb_server_certificate.main.id
}
default_actions {
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.main.id
}
}
}
}
# HTTP-to-HTTPS redirect rule
resource "alicloud_alb_rule" "http_to_https" {
rule_name = "http-to-https"
listener_id = alicloud_alb_listener.http.id
priority = 1
rule_conditions {
type = "Header"
header_config {
key = "X-Forwarded-Proto"
values = ["http"]
}
}
rule_actions {
order = 1
type = "Redirect"
redirect_config {
protocol = "HTTPS"
port = "443"
http_code = "301"
}
}
}
2.3 Startup and Verification
Verifying SLB status
# Check SLB status with the Alibaba Cloud CLI
aliyun slb DescribeLoadBalancers --RegionId cn-hangzhou --LoadBalancerId lb-xxx
# List the listeners
aliyun slb DescribeLoadBalancerListeners --RegionId cn-hangzhou --LoadBalancerId lb-xxx
# Check backend server health
aliyun slb DescribeHealthStatus --RegionId cn-hangzhou --LoadBalancerId lb-xxx --ListenerPort 80
Testing load balancing
# Get the SLB public IP
SLB_IP=$(aliyun slb DescribeLoadBalancers \
  --LoadBalancerId lb-xxx \
  --output cols=Address | tail -1)
# Test an HTTP request
curl -I http://${SLB_IP}/
# Send several requests to observe how they are distributed
for i in {1..10}; do
curl -s http://${SLB_IP}/server-info | jq '.hostname'
done
# Test session persistence
# Repeated requests with the same cookie should reach the same backend
curl -c cookie.txt http://${SLB_IP}/
for i in {1..5}; do
curl -b cookie.txt -s http://${SLB_IP}/server-info | jq '.hostname'
done
# Test HTTPS
curl -I https://www.example.com/
# Test health checks
# Stop the service on one backend server
ssh web-1 "systemctl stop nginx"
# Wait for the health check to fail (about 10 seconds)
sleep 15
# Check backend server status
aliyun slb DescribeHealthStatus \
  --LoadBalancerId lb-xxx \
  --ListenerPort 80
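The repeated-request loops above print one hostname per request, which gets hard to read by eye once the request count grows. A minimal sketch of a tally helper follows; `tally_backends` is a hypothetical name, and the canned hostnames stand in for real responses from the `/server-info` endpoint used above.

```shell
# Tally how often each backend hostname appears.
# Reads one hostname per line on stdin and prints "<count> <hostname>",
# most frequent first.
tally_backends() {
  sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Canned data standing in for the curl | jq output (hypothetical hostnames).
printf '%s\n' web-1 web-2 web-1 web-3 web-1 web-2 | tally_backends
```

Against a live SLB, the input would come from the loop itself, with `jq -r '.hostname'` so the names arrive without quotes. With session persistence enabled and a fixed cookie, a single hostname should account for every request.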
Load testing
# Load test with wrk
wrk -t12 -c400 -d30s http://${SLB_IP}/
# Load test with ab
ab -n 10000 -c 100 http://${SLB_IP}/
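# Watch the SLB monitoring metrics
# Console -> SLB -> Monitoring -> view QPS, connections, traffic, and so on
3. Example Code and Configuration
3.1 Complete Configuration Examples
Production-grade ALB configuration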
# variables.tf
variable "region" {
default = "cn-hangzhou"
}
variable "environment" {
default = "prod"
}
variable "domain" {
default = "example.com"
}
# main.tf
terraform {
required_providers {
alicloud = {
source = "aliyun/alicloud"
version = "~> 1.210"
}
}
}
provider "alicloud" {
region = var.region
}
# Look up available zones
data "alicloud_zones" "available" {
available_resource_creation = "VSwitch"
}
# VPC
resource "alicloud_vpc" "main" {
vpc_name = "${var.environment}-vpc"
cidr_block = "10.0.0.0/8"
}
# vSwitches
resource "alicloud_vswitch" "main" {
count = 2
vpc_id = alicloud_vpc.main.id
cidr_block = "10.0.${count.index + 1}0.0/24"
zone_id = data.alicloud_zones.available.zones[count.index].id
vswitch_name = "${var.environment}-vsw-${count.index + 1}"
}
# ALB instance
resource "alicloud_alb_load_balancer" "main" {
vpc_id = alicloud_vpc.main.id
address_type = "Internet"
address_allocated_mode = "Dynamic"
load_balancer_name = "${var.environment}-alb"
load_balancer_edition = "Standard"
load_balancer_billing_config {
pay_type = "PayAsYouGo"
}
modification_protection_config {
status = "ConsoleProtection"
reason = "Production ALB"
}
dynamic "zone_mappings" {
for_each = alicloud_vswitch.main
content {
vswitch_id = zone_mappings.value.id
zone_id = zone_mappings.value.zone_id
}
}
tags = {
Environment = var.environment
ManagedBy = "terraform"
}
}
# Default server group
resource "alicloud_alb_server_group" "default" {
protocol = "HTTP"
vpc_id = alicloud_vpc.main.id
server_group_name = "${var.environment}-default-sg"
server_group_type = "Instance"
health_check_config {
health_check_enabled = true
health_check_connect_port = 80
health_check_host = "$SERVER_IP"
health_check_http_version = "HTTP1.1"
health_check_interval = 2
health_check_method = "GET"
health_check_path = "/health"
health_check_protocol = "HTTP"
health_check_timeout = 5
healthy_threshold = 3
unhealthy_threshold = 3
health_check_codes = ["http_2xx", "http_3xx"]
}
sticky_session_config {
sticky_session_enabled = true
sticky_session_type = "Insert"
cookie_timeout = 86400
}
tags = {
Environment = var.environment
}
}
# API server group
resource "alicloud_alb_server_group" "api" {
protocol = "HTTP"
vpc_id = alicloud_vpc.main.id
server_group_name = "${var.environment}-api-sg"
server_group_type = "Instance"
health_check_config {
health_check_enabled = true
health_check_connect_port = 8080
health_check_path = "/api/health"
health_check_protocol = "HTTP"
health_check_interval = 2
health_check_timeout = 5
healthy_threshold = 3
unhealthy_threshold = 3
health_check_codes = ["http_2xx"]
}
sticky_session_config {
sticky_session_enabled = false
}
}
# Static asset server group
resource "alicloud_alb_server_group" "static" {
protocol = "HTTP"
vpc_id = alicloud_vpc.main.id
server_group_name = "${var.environment}-static-sg"
server_group_type = "Instance"
health_check_config {
health_check_enabled = true
health_check_connect_port = 80
health_check_path = "/static/health.txt"
health_check_protocol = "HTTP"
health_check_interval = 5
health_check_timeout = 5
healthy_threshold = 2
unhealthy_threshold = 2
health_check_codes = ["http_2xx"]
}
sticky_session_config {
sticky_session_enabled = false
}
}
# HTTPS listener
resource "alicloud_alb_listener" "https" {
load_balancer_id = alicloud_alb_load_balancer.main.id
listener_protocol = "HTTPS"
listener_port = 443
listener_description = "Production HTTPS"
certificates {
certificate_id = alicloud_ssl_certificates_service_certificate.main.id
}
default_actions {
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.default.id
}
}
}
}
# HTTP listener (redirects to HTTPS)
resource "alicloud_alb_listener" "http" {
load_balancer_id = alicloud_alb_load_balancer.main.id
listener_protocol = "HTTP"
listener_port = 80
listener_description = "HTTP to HTTPS redirect"
default_actions {
type = "Redirect"
redirect_config {
protocol = "HTTPS"
port = "443"
http_code = "301"
}
}
}
# Forwarding rule - API routing
resource "alicloud_alb_rule" "api" {
rule_name = "api-route"
listener_id = alicloud_alb_listener.https.id
priority = 10
rule_conditions {
type = "Path"
path_config {
values = ["/api/*"]
}
}
rule_actions {
order = 1
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.api.id
}
}
}
}
# Forwarding rule - static asset routing
resource "alicloud_alb_rule" "static" {
rule_name = "static-route"
listener_id = alicloud_alb_listener.https.id
priority = 20
rule_conditions {
type = "Path"
path_config {
values = ["/static/*", "/assets/*", "*.css", "*.js", "*.png", "*.jpg"]
}
}
rule_actions {
order = 1
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.static.id
}
}
}
}
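# Forwarding rule - add response headers
# Note: lower priority values are evaluated first, so with priority 1 and
# path /* this rule matches every request before the API and static rules.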
resource "alicloud_alb_rule" "security_headers" {
rule_name = "security-headers"
listener_id = alicloud_alb_listener.https.id
priority = 1
rule_conditions {
type = "Path"
path_config {
values = ["/*"]
}
}
rule_actions {
order = 1
type = "InsertHeader"
insert_header_config {
key = "X-Content-Type-Options"
value = "nosniff"
value_type = "UserDefined"
}
}
rule_actions {
order = 2
type = "InsertHeader"
insert_header_config {
key = "X-Frame-Options"
value = "SAMEORIGIN"
value_type = "UserDefined"
}
}
rule_actions {
order = 3
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.default.id
}
}
}
}
# Outputs
output "alb_dns_name" {
value = alicloud_alb_load_balancer.main.dns_name
}
output "alb_id" {
value = alicloud_alb_load_balancer.main.id
}
NLB Layer 4 load balancing configuration
# NLB instance
resource "alicloud_nlb_load_balancer" "main" {
load_balancer_name = "${var.environment}-nlb"
load_balancer_type = "Network"
address_type = "Internet"
address_ip_version = "Ipv4"
vpc_id = alicloud_vpc.main.id
zone_mappings {
vswitch_id = alicloud_vswitch.main[0].id
zone_id = alicloud_vswitch.main[0].zone_id
}
zone_mappings {
vswitch_id = alicloud_vswitch.main[1].id
zone_id = alicloud_vswitch.main[1].zone_id
}
}
# Server group
resource "alicloud_nlb_server_group" "main" {
server_group_name = "${var.environment}-nlb-sg"
server_group_type = "Instance"
vpc_id = alicloud_vpc.main.id
scheduler = "Wrr"
protocol = "TCP"
health_check {
health_check_enabled = true
health_check_type = "TCP"
health_check_connect_port = 0
healthy_threshold = 2
unhealthy_threshold = 2
health_check_connect_timeout = 5
health_check_interval = 10
}
connection_drain = true
connection_drain_timeout = 60
preserve_client_ip_enabled = true
}
# TCP listener
resource "alicloud_nlb_listener" "tcp" {
listener_protocol = "TCP"
listener_port = 3306
listener_description = "MySQL Proxy"
load_balancer_id = alicloud_nlb_load_balancer.main.id
server_group_id = alicloud_nlb_server_group.main.id
idle_timeout = 900
proxy_protocol_enabled = false
}
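# UDP listener (game service)
# Assumes a separate UDP server group named "game", which this example does not define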
resource "alicloud_nlb_listener" "udp" {
listener_protocol = "UDP"
listener_port = 27015
listener_description = "Game Server"
load_balancer_id = alicloud_nlb_load_balancer.main.id
server_group_id = alicloud_nlb_server_group.game.id
}
3.2 Real-World Cases
Case 1: high-availability architecture for an e-commerce sales event
An e-commerce platform handles about 5,000 QPS on a normal day, with an estimated Double 11 peak of 50,000 QPS.
Architecture:
                      ┌─────────────────────────────────────┐
                      │              DNS (GTM)              │
                      │     Main site: www.example.com      │
                      └─────────────┬───────────────────────┘
                                    │
            ┌───────────────────────┼───────────────────────┐
            ▼                       ▼                       ▼
    ┌───────────────┐       ┌───────────────┐       ┌───────────────┐
    │ ALB (Hangzhou)│       │ ALB (Shanghai)│       │ ALB (Beijing) │
    │ Primary AZ A/B│       │ Primary AZ A/B│       │ Primary AZ A/B│
    └───────┬───────┘       └───────┬───────┘       └───────┬───────┘
            │                       │                       │
    ┌───────┼───────┐       ┌───────┼───────┐       ┌───────┼───────┐
    ▼       ▼       ▼       ▼       ▼       ▼       ▼       ▼       ▼
┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐
│ECS×10││ECS×10││ECS×10││ECS×10││ECS×10││ECS×10││ECS×10││ECS×10││ECS×10│
│ AZ-A ││ AZ-B ││ AZ-C ││ AZ-A ││ AZ-B ││ AZ-C ││ AZ-A ││ AZ-B ││ AZ-C │
└──────┘└──────┘└──────┘└──────┘└──────┘└──────┘└──────┘└──────┘└──────┘
Key configuration:
Multi-region deployment: GTM handles regional scheduling so users reach the nearest region
Multi-zone: the ALB in each region spans 3 zones
Auto scaling: ECS scales out and in automatically together with ESS
# ESS scaling group
resource "alicloud_ess_scaling_group" "web" {
min_size = 10
max_size = 200
scaling_group_name = "prod-web-asg"
vswitch_ids = alicloud_vswitch.main[*].id
# Attach the ALB server group
alb_server_group {
alb_server_group_id = alicloud_alb_server_group.default.id
weight = 100
port = 80
}
}
# Scale-out rule - triggered by QPS
resource "alicloud_ess_scaling_rule" "scale_out" {
scaling_group_id = alicloud_ess_scaling_group.web.id
scaling_rule_name = "scale-out-qps"
scaling_rule_type = "TargetTrackingScalingRule"
target_value = 1000 # target QPS per instance
metric_name = "ALBQPSPerInstance"
}
# Scale-in rule
resource "alicloud_ess_scaling_rule" "scale_in" {
scaling_group_id = alicloud_ess_scaling_group.web.id
scaling_rule_name = "scale-in"
scaling_rule_type = "SimpleScalingRule"
adjustment_type = "QuantityChangeInCapacity"
adjustment_value = -2
cooldown = 300
}
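As a rough sanity check on the scaling group above, the instance count a target-tracking rule would aim for can be estimated by dividing peak QPS by the per-instance target and clamping the result to the group's size limits. The sketch below assumes that simple model; `required_instances` is a hypothetical helper, and real scaling decisions also depend on cooldowns and metric delay.

```shell
# required_instances PEAK_QPS QPS_PER_INSTANCE MIN_SIZE MAX_SIZE
# Prints the estimated instance count, clamped to [MIN_SIZE, MAX_SIZE].
required_instances() {
  peak=$1
  per_instance=$2
  min_size=$3
  max_size=$4
  # Ceiling division: round up so a partial instance counts as a whole one.
  n=$(( (peak + per_instance - 1) / per_instance ))
  if [ "$n" -lt "$min_size" ]; then n=$min_size; fi
  if [ "$n" -gt "$max_size" ]; then n=$max_size; fi
  echo "$n"
}

# Figures from the case above: 50,000 QPS peak, 1,000 QPS per instance,
# group limits 10..200.
required_instances 50000 1000 10 200
```

With the daily load of 5,000 QPS the raw result is 5, which the clamp raises to the group's `min_size` of 10.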
Case 2: microservice API gateway
ALB serves as the single entry point for the microservices and routes requests by path.
# Service definitions
locals {
services = {
user = {
path = "/api/user/*"
port = 8001
priority = 10
}
order = {
path = "/api/order/*"
port = 8002
priority = 20
}
product = {
path = "/api/product/*"
port = 8003
priority = 30
}
payment = {
path = "/api/payment/*"
port = 8004
priority = 40
}
}
}
# Create a server group for each service
resource "alicloud_alb_server_group" "services" {
for_each = local.services
protocol = "HTTP"
vpc_id = alicloud_vpc.main.id
server_group_name = "${var.environment}-${each.key}-sg"
server_group_type = "Instance"
health_check_config {
health_check_enabled = true
health_check_connect_port = each.value.port
health_check_path = "/health"
health_check_protocol = "HTTP"
health_check_interval = 2
healthy_threshold = 3
unhealthy_threshold = 3
health_check_codes = ["http_2xx"]
}
}
# Create a routing rule for each service
resource "alicloud_alb_rule" "services" {
for_each = local.services
rule_name = "${each.key}-route"
listener_id = alicloud_alb_listener.https.id
priority = each.value.priority
rule_conditions {
type = "Path"
path_config {
values = [each.value.path]
}
}
rule_actions {
order = 1
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.services[each.key].id
}
}
}
}
Case 3: canary release configuration
# Production server group
resource "alicloud_alb_server_group" "prod" {
server_group_name = "prod-sg"
# ... configuration omitted
}
# Canary server group
resource "alicloud_alb_server_group" "canary" {
server_group_name = "canary-sg"
# ... configuration omitted
}
# Canary rule - route by header
resource "alicloud_alb_rule" "canary_header" {
rule_name = "canary-by-header"
listener_id = alicloud_alb_listener.https.id
priority = 5
rule_conditions {
type = "Header"
header_config {
key = "X-Canary"
values = ["true"]
}
}
rule_actions {
order = 1
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.canary.id
}
}
}
}
# Canary rule - route by percentage
resource "alicloud_alb_rule" "canary_weight" {
rule_name = "canary-by-weight"
listener_id = alicloud_alb_listener.https.id
priority = 100
rule_conditions {
type = "Path"
path_config {
values = ["/*"]
}
}
rule_actions {
order = 1
type = "ForwardGroup"
forward_group_config {
server_group_tuples {
server_group_id = alicloud_alb_server_group.prod.id
weight = 90
}
server_group_tuples {
server_group_id = alicloud_alb_server_group.canary.id
weight = 10
}
}
}
}
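The weighted rule above splits traffic in proportion to each group's weight relative to the sum of all weights. A small sketch of that arithmetic, using a hypothetical helper `expected_requests` and integer division, helps when judging whether the split observed in a test run is plausible:

```shell
# expected_requests WEIGHT TOTAL_WEIGHT REQUESTS
# Prints how many of REQUESTS a group with WEIGHT out of TOTAL_WEIGHT
# should receive on average (rounded down).
expected_requests() {
  echo $(( $3 * $1 / $2 ))
}

# Weights from the rule above: prod 90, canary 10, over 1,000 test requests.
echo "prod:   $(expected_requests 90 100 1000)"
echo "canary: $(expected_requests 10 100 1000)"
```

Actual counts fluctuate around these averages, so a short test run that lands near 100 canary requests out of 1,000 is consistent with the configured weights.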
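4. Best Practices and Caveats
4.1 Best Practices
Performance optimization
Choosing the right instance specification
CLB specification reference:
- slb.s1.small: max connections 5,000, QPS 1,000
- slb.s2.small: max connections 50,000, QPS 5,000
- slb.s2.medium: max connections 100,000, QPS 10,000
- slb.s3.small: max connections 200,000, QPS 20,000
- slb.s3.medium: max connections 500,000, QPS 50,000
- slb.s3.large: max connections 1,000,000, QPS 100,000
ALB/NLB are pay-as-you-go, so no specification needs to be chosen.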
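Tuning the health check configuration
# Recommended settings
health_check_config {
health_check_interval = 2 # probe every 2 seconds
health_check_timeout = 5 # time out after 5 seconds
healthy_threshold = 3 # 3 consecutive successes mark the server healthy
unhealthy_threshold = 3 # 3 consecutive failures mark the server unhealthy
}
# Failure detection time = interval × unhealthy_threshold = 6 seconds
# Recovery detection time = interval × healthy_threshold = 6 seconds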
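The two formulas above are simple products, so different settings are easy to compare before applying them. A minimal sketch with a hypothetical helper, `detection_seconds`; it uses the same approximation as the comments above and ignores the probe timeout, so real detection can take somewhat longer:

```shell
# detection_seconds INTERVAL THRESHOLD
# Approximate seconds until a state change is declared:
# interval × number of consecutive probes required.
detection_seconds() {
  echo $(( $1 * $2 ))
}

# Recommended settings above: interval 2s, thresholds 3.
echo "default group: $(detection_seconds 2 3)s"
# Static asset group from section 3.1: interval 5s, thresholds 2.
echo "static group:  $(detection_seconds 5 2)s"
```

Shorter intervals detect failures sooner at the cost of more probe traffic to each backend.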
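Enabling connection reuse
# Backend Nginx configuration with HTTP keep-alive to the upstream
upstream backend {
server 127.0.0.1:8080; # placeholder application address (an assumption, adjust as needed)
keepalive 100; # keep up to 100 idle connections per worker
}
server {
location / {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}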
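Security hardening
TLS configuration
# Use a secure TLS policy
tls_cipher_policy = "tls_cipher_policy_1_2" # TLS 1.2, weak cipher suites disabled
# Supported policies:
# - tls_cipher_policy_1_0: TLS 1.0 and later; most compatible, least secure
# - tls_cipher_policy_1_1: TLS 1.1 and later (TLS 1.0 disabled)
# - tls_cipher_policy_1_2: TLS 1.2 only, recommended
# - tls_cipher_policy_1_2_strict: TLS 1.2 with stricter cipher suites
# - tls_cipher_policy_1_2_strict_with_1_3: TLS 1.2/1.3, most secure
Access control
# ALB access control list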
resource "alicloud_alb_acl" "whitelist" {
acl_name = "office-whitelist"
acl_entries {
entry = "1.2.3.0/24"
description = "Office Network"
}
acl_entries {
entry = "4.5.6.0/24"
description = "VPN Gateway"
}
}
# Attach to a listener
resource "alicloud_alb_listener" "admin" {
# ...
acl_config {
acl_type = "White"
acl_relations {
acl_id = alicloud_alb_acl.whitelist.id
}
}
}
DDoS protection
# Anti-DDoS Pro instance
resource "alicloud_ddoscoo_instance" "main" {
name = "prod-ddos"
bandwidth = 30
base_bandwidth = 30
service_bandwidth = 100
port_count = 50
domain_count = 50
}
High-availability configuration
Cross-zone deployment
# At least 2 zones
zone_mappings {
vswitch_id = alicloud_vswitch.zone_a.id
zone_id = "cn-hangzhou-h"
}
zone_mappings {
vswitch_id = alicloud_vswitch.zone_b.id
zone_id = "cn-hangzhou-i"
}
Backend server distribution
# Spread backend servers evenly across zones
resource "alicloud_instance" "web" {
count = 6
vswitch_id = element(alicloud_vswitch.main[*].id, count.index % 2)
# Instances 0, 2, 4 land in AZ-A; instances 1, 3, 5 land in AZ-B
}
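The `count.index % 2` expression above alternates instances between the two vSwitches. The same modulo mapping can be sketched in shell to preview the placement for any instance count; `zone_for_index` is a hypothetical helper, and the zone labels AZ-A and AZ-B follow the comment in the listing.

```shell
# zone_for_index INDEX
# Mirrors count.index % 2: even indexes go to AZ-A, odd indexes to AZ-B.
zone_for_index() {
  case $(( $1 % 2 )) in
    0) echo "AZ-A" ;;
    1) echo "AZ-B" ;;
  esac
}

# Preview the placement of six instances.
for i in 0 1 2 3 4 5; do
  echo "instance $i -> $(zone_for_index "$i")"
done
```

An even instance count keeps the two zones balanced; with an odd count, AZ-A carries one extra instance.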
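Failover testing
# Simulate a zone failure
# 1. Stop all instances in one zone
# 2. Watch SLB switch over to the other zone automatically
# 3. Verify service availability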
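4.2 Caveats
| Error type | Symptom | Cause | Solution |
|---|---|---|---|
| Health check failure | All backends are unhealthy | Security group does not allow the probes | Add 100.64.0.0/10 to the security group |
| 502 Bad Gateway | Backend returns an error | Backend service is faulty or timing out | Check the backend service and adjust timeouts |
| 504 Gateway Timeout | Request times out | Backend processing takes too long | Increase request_timeout |
| Session not sticky | Requests reach different backends | Cookie configuration problem | Check the sticky_session settings |
| HTTPS certificate error | Browser warns that the site is not secure | Certificate is mismatched or expired | Renew the certificate and check the domain |
| Connections exhausted | New connections cannot be established | Specification too small or backend slow | Upgrade the specification and optimize the backend |
| High latency | Long response times | Cross-region access or slow backend | Use GTM for nearest-region access |
| Uneven traffic | Some backends are overloaded | Weight settings or session persistence | Adjust weights and check session settings |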
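Health check configuration notes
# Make sure the backend health check endpoint works
# 1. It returns a 2xx or 3xx status code
# 2. Its response time is shorter than the health check timeout
# 3. The check path exists and is reachable
# Example check
curl -I http://backend-server/health
# Expected output: HTTP/1.1 200 OK
Session persistence notes
Choosing a session persistence type:
- Insert Cookie: SLB inserts the cookie; the backend is unaware of it
- Rewrite Cookie: SLB rewrites the cookie returned by the backend
- Server Cookie: uses a cookie specified by the backend
Notes:
1. Insert Cookie requires the client to support cookies
2. Mobile apps must handle cookies correctly
3. Session persistence can lead to uneven load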
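5. Troubleshooting and Monitoring
5.1 Troubleshooting
Troubleshooting health check failures
# Step 1: confirm the security group rules
aliyun ecs DescribeSecurityGroupAttribute --SecurityGroupId sg-xxx --Direction ingress | grep 100.64
# Step 2: simulate a check as if from the SLB health check range
# Run this on an ECS instance in the same VPC
curl -I http://backend-ip:80/health
# Step 3: check the backend service status
ssh backend-server "systemctl status nginx"
ssh backend-server "curl -I localhost/health"
# Step 4: check the SLB listener configuration
aliyun slb DescribeLoadBalancerHTTPListenerAttribute --LoadBalancerId lb-xxx --ListenerPort 80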
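When reading backend access logs during this kind of troubleshooting, it helps to tell health check probes apart from user traffic. The sketch below tests whether an IPv4 address falls inside 100.64.0.0/10, the health check range named above. Both helper names are hypothetical, and the code uses only POSIX shell arithmetic.

```shell
# ip_to_int A.B.C.D
# Converts a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_health_check_range IP
# Succeeds (exit 0) when IP is inside 100.64.0.0/10.
in_health_check_range() {
  ip=$(ip_to_int "$1")
  base=$(ip_to_int 100.64.0.0)
  # A /10 prefix keeps the top 10 bits and clears the low 22 bits.
  mask=$(( (0xFFFFFFFF << 22) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( base & mask )) ]
}

for addr in 100.64.0.1 100.127.255.255 100.128.0.1 10.0.10.5; do
  if in_health_check_range "$addr"; then
    echo "$addr: health check range"
  else
    echo "$addr: other source"
  fi
done
```

The range spans 100.64.0.0 through 100.127.255.255, so the first two sample addresses match and the last two do not.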
Troubleshooting connection issues
# Check the SLB connection count
aliyun cms DescribeMetricLast \
  --Namespace acs_slb_dashboard \
  --MetricName ActiveConnection \
  --Dimensions '[{"instanceId":"lb-xxx"}]'
# Check connection counts on the backend
ss -s
netstat -an | grep ESTABLISHED | wc -l
# Check TIME_WAIT sockets
netstat -an | grep TIME_WAIT | wc -l
# Tune kernel parameters (on the backend servers)
cat >> /etc/sysctl.conf << 'EOF'
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
EOF
sysctl -p
Troubleshooting performance issues
# Check SLB QPS and latency
aliyun cms DescribeMetricList \
  --Namespace acs_slb_dashboard \
  --MetricName Qps \
  --Dimensions '[{"instanceId":"lb-xxx"}]' \
  --StartTime "2025-01-09T00:00:00Z" \
  --EndTime "2025-01-09T23:59:59Z"
# Check backend response time
aliyun cms DescribeMetricList \
  --Namespace acs_slb_dashboard \
  --MetricName Rt \
  --Dimensions '[{"instanceId":"lb-xxx"}]'
# Measure response time with curl
curl -w "@curl-format.txt" -o /dev/null -s http://slb-ip/
# Contents of curl-format.txt:
# time_namelookup: %{time_namelookup}
# time_connect: %{time_connect}
# time_appconnect: %{time_appconnect}
# time_pretransfer: %{time_pretransfer}
# time_redirect: %{time_redirect}
# time_starttransfer: %{time_starttransfer}
# time_total: %{time_total}
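The format file referenced above has to exist before the `curl -w "@curl-format.txt"` call works. A minimal sketch that writes it follows; `write_curl_format` is a hypothetical helper. Each line ends with a literal `\n` escape because curl drops the file's own line breaks when loading a write-out format, so without the escapes all values would print on one line.

```shell
# write_curl_format [PATH]
# Writes a curl --write-out format file (default: ./curl-format.txt).
write_curl_format() {
  cat > "${1:-curl-format.txt}" <<'EOF'
time_namelookup:    %{time_namelookup}\n
time_connect:       %{time_connect}\n
time_appconnect:    %{time_appconnect}\n
time_pretransfer:   %{time_pretransfer}\n
time_redirect:      %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
time_total:         %{time_total}\n
EOF
}

# Usage: write_curl_format && curl -w "@curl-format.txt" -o /dev/null -s http://slb-ip/
```

A large gap between `time_connect` and `time_starttransfer` points at backend processing time, while a high `time_connect` by itself suggests a network or SLB-side delay.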
5.2 Performance Monitoring
CloudMonitor configuration
# Create alarm rules
resource "alicloud_cms_alarm" "slb_qps" {
name = "slb-qps-high"
project = "acs_slb_dashboard"
metric = "Qps"
dimensions = {
instanceId = alicloud_slb_load_balancer.main.id
}
escalations_critical {
statistics = "Average"
comparison_operator = ">="
threshold = "50000"
times = 3
}
contact_groups = ["ops-team"]
period = 60
}
resource "alicloud_cms_alarm" "slb_5xx" {
name = "slb-5xx-high"
project = "acs_slb_dashboard"
metric = "StatusCode5xx"
dimensions = {
instanceId = alicloud_slb_load_balancer.main.id
port = "443"
}
escalations_critical {
statistics = "Sum"
comparison_operator = ">="
threshold = "100"
times = 3
}
contact_groups = ["ops-team"]
period = 60
}
resource "alicloud_cms_alarm" "unhealthy_servers" {
name = "slb-unhealthy-servers"
project = "acs_slb_dashboard"
metric = "UnhealthyServerCount"
dimensions = {
instanceId = alicloud_slb_load_balancer.main.id
port = "443"
}
escalations_critical {
statistics = "Average"
comparison_operator = ">="
threshold = "1"
times = 2
}
contact_groups = ["ops-team"]
period = 60
}
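Key monitoring metrics
| Metric | Description | Suggested alarm threshold |
|---|---|---|
| Qps | Requests per second | >80% of the specification limit |
| ActiveConnection | Active connections | >80% of the specification limit |
| NewConnection | New connections | >80% of the specification limit |
| TrafficRX/TX | Inbound/outbound traffic | >80% of bandwidth |
| StatusCode5xx | Number of 5xx errors | >1% of total requests |
| StatusCode4xx | Number of 4xx errors | >5% of total requests |
| Rt | Average response time | >500ms |
| UnhealthyServerCount | Number of unhealthy servers | >=1 |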
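Several thresholds in the table are ratios rather than absolute counts, for example 5xx errors above 1% of total requests. A sketch of that comparison in integer arithmetic, with a hypothetical helper `exceeds_ratio`, can be handy in ad-hoc scripts that read raw counters:

```shell
# exceeds_ratio COUNT TOTAL PERCENT
# Succeeds (exit 0) when COUNT / TOTAL is strictly greater than PERCENT%.
# Cross-multiplying avoids floating point: COUNT*100 > TOTAL*PERCENT.
exceeds_ratio() {
  [ "$2" -gt 0 ] && [ $(( $1 * 100 )) -gt $(( $2 * $3 )) ]
}

# Example: 150 5xx responses out of 10,000 requests against the 1% threshold.
if exceeds_ratio 150 10000 1; then
  echo "5xx ratio above threshold"
else
  echo "5xx ratio within threshold"
fi
```

A total of zero never triggers, which keeps idle periods from raising false alarms.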
Grafana dashboard
{
"panels": [
{
"title": "QPS trend",
"type": "graph",
"datasource": "aliyun-cms",
"targets": [
{
"namespace": "acs_slb_dashboard",
"metric": "Qps",
"dimensions": {"instanceId": "$slb_id"}
}
]
},
{
"title": "响应时间",
"type": "graph",
"targets": [
{
"namespace": "acs_slb_dashboard",
"metric": "Rt"
}
]
},
{
"title": "HTTP status code distribution",
"type": "piechart",
"targets": [
{"metric": "StatusCode2xx"},
{"metric": "StatusCode3xx"},
{"metric": "StatusCode4xx"},
{"metric": "StatusCode5xx"}
]
},
{
"title": "Backend server health",
"type": "stat",
"targets": [
{"metric": "HealthyServerCount"},
{"metric": "UnhealthyServerCount"}
]
}
]
}
5.3 Backup and Recovery
Exporting the configuration
#!/bin/bash
# export-slb-config.sh
REGION="cn-hangzhou"
OUTPUT_DIR="./slb-backup/$(date +%Y%m%d)"
mkdir -p "${OUTPUT_DIR}"
# Export SLB instance configuration
aliyun slb DescribeLoadBalancers \
  --RegionId "${REGION}" \
  --output json > "${OUTPUT_DIR}/slb-instances.json"
# Export listener configuration
for lb_id in $(jq -r '.LoadBalancers.LoadBalancer[].LoadBalancerId' "${OUTPUT_DIR}/slb-instances.json"); do
aliyun slb DescribeLoadBalancerListeners \
  --LoadBalancerId "${lb_id}" \
  --output json > "${OUTPUT_DIR}/listener-${lb_id}.json"
# Export backend server group configuration
aliyun slb DescribeVServerGroups \
  --LoadBalancerId "${lb_id}" \
  --output json > "${OUTPUT_DIR}/vserver-groups-${lb_id}.json"
done
# Export certificates
aliyun slb DescribeServerCertificates \
  --RegionId "${REGION}" \
  --output json > "${OUTPUT_DIR}/certificates.json"
echo "Backup completed: ${OUTPUT_DIR}"
Managing the configuration with Terraform
# Import existing resources into Terraform
terraform import alicloud_slb_load_balancer.main lb-xxx
terraform import alicloud_slb_listener.http lb-xxx:http:80
# Generate configuration
terraform show -no-color > imported-config.tf
# Validate the configuration
terraform plan
Disaster recovery procedure
# 1. Create a new SLB instance (with Terraform or the console)
terraform apply
# 2. Switch DNS
aliyun alidns UpdateDomainRecord --RecordId xxx --RR www --Type A --Value
# 3. Verify that the new SLB works
curl -I http://new-slb-ip/
# 4. Update the CDN origin configuration (if any)
aliyun cdn ModifyCdnDomainConfig --DomainName www.example.com --Sources '[{"content":"new-slb-ip","type":"ipaddr","priority":"20","port":80}]'
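6. Summary
6.1 Key Points
This article covered the configuration of Alibaba Cloud SLB load balancing and related best practices:
Product selection: CLB for existing workloads, ALB for Layer 7 applications, NLB for high-performance Layer 4 scenarios
High-availability design: cross-zone deployment, health checks, backend server distribution
HTTPS configuration: certificate management, TLS policies, HTTP redirection
Traffic management: path- and header-based routing, session persistence, canary releases
Security hardening: access control, DDoS protection, security group configuration
Monitoring and alerting: key metrics, anomaly alarms, performance analysis
6.2 Further Learning
GTM (Global Traffic Manager): multi-region active-active architecture
DCDN (Dynamic Route for CDN): combining SLB with CDN
WAF (Web Application Firewall): Layer 7 security protection
Service mesh: integrating ALB with ASM
Kubernetes Ingress: ALB as the Kubernetes entry point
6.3 References
Alibaba Cloud SLB documentation: https://help.aliyun.com/product/27537.html
ALB documentation: https://help.aliyun.com/product/211127.html
NLB documentation: https://help.aliyun.com/product/439469.html
Terraform Alibaba Cloud provider: https://registry.terraform.io/providers/aliyun/alicloud/latest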
Appendix
A. Command Quick Reference
| Operation | Command |
|---|---|
| List SLB instances | aliyun slb DescribeLoadBalancers |
| List listeners | aliyun slb DescribeLoadBalancerListeners --LoadBalancerId lb-xxx |
| Check health status | aliyun slb DescribeHealthStatus --LoadBalancerId lb-xxx |
| Add backend servers | aliyun slb AddBackendServers --LoadBalancerId lb-xxx --BackendServers '[...]' |
| Set weights | aliyun slb SetBackendServers --LoadBalancerId lb-xxx --BackendServers '[...]' |
| Upload a certificate | aliyun slb UploadServerCertificate --ServerCertificate ... --PrivateKey ... |
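B. Configuration Parameters
Listener parameters
| Parameter | Default | Description |
|---|---|---|
| bandwidth | -1 | Peak bandwidth; -1 means unlimited |
| request_timeout | 60 | Request timeout (seconds) |
| idle_timeout | 15 | Idle connection timeout (seconds) |
| gzip | on | Whether Gzip compression is enabled |
Health check parameters
| Parameter | Default | Description |
|---|---|---|
| health_check_interval | 2 | Check interval (seconds) |
| health_check_timeout | 5 | Timeout (seconds) |
| healthy_threshold | 3 | Healthy threshold |
| unhealthy_threshold | 3 | Unhealthy threshold |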
C. Glossary
| Term | Description |
|---|---|
| CLB | Classic Load Balancer |
| ALB | Application Load Balancer |
| NLB | Network Load Balancer |
| VServer Group | Virtual server group; a grouping of backend servers |
| Listener | Defines the port and protocol to listen on |
| Health Check | Probes the state of backend servers |
| Session Persistence | Routes the same client to the same backend |
| Forwarding Rule | Condition-based routing |