Problem: on Alibaba Cloud, nginx ip_hash forwards all requests to a single machine.
Investigation: nginx's error.log shows that most client IPs come from the 10.159.95 network segment.
2015/10/25 12:01:03 [warn] 21580#0: *12412231 an upstream response is buffered to a temporary file /home/work/nginx/proxy_temp/2/55/0000075552 while reading upstream, client: 10.159.95.***, server: ***.***.***, request: "POST *** HTTP/1.0", upstream: "http://***", host: "***.***.***"
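This suggests that nginx is not seeing the real user IP, but the IP of the SLB (Server Load Balancer) cluster sitting in front of it.
The official nginx documentation describes ip_hash as follows: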
Specifies that a group should use a load balancing method where requests are distributed between servers based on client IP addresses. The first three octets of the client IPv4 address, or the entire IPv6 address, are used as a hashing key. The method ensures that requests from the same client will always be passed to the same server except when this server is unavailable. In the latter case client requests will be passed to another server. Most probably, it will always be the same server as well.
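The key sentence is the second one: for IPv4, ip_hash uses only the first three octets of the address as the hash key. All nodes of an SLB cluster sit in the same network segment, so every request produces the same key and ip_hash sends them all to one upstream machine.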
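The effect can be reproduced with a minimal Python sketch. It follows the key selection quoted above (first three octets for IPv4, the whole address for IPv6), but the hash itself is a simple polynomial stand-in chosen for illustration, and it ignores server weights and failover; it is not claimed to match nginx's internal implementation. The names `ip_hash_key`, `pick_upstream`, and the upstream labels are hypothetical.

```python
import ipaddress


def ip_hash_key(client_ip: str) -> bytes:
    """Hash key as described in the docs: first 3 octets (IPv4) or full address (IPv6)."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        return addr.packed[:3]
    return addr.packed


def pick_upstream(client_ip: str, upstreams: list) -> str:
    """Map a client IP to one upstream using an illustrative polynomial hash."""
    h = 89
    for byte in ip_hash_key(client_ip):
        h = (h * 113 + byte) % 6271
    return upstreams[h % len(upstreams)]


upstreams = ["app1", "app2", "app3"]  # hypothetical backend names

# Every address in the SLB segment shares one key, so one backend gets everything.
slb_targets = {pick_upstream(f"10.159.95.{i}", upstreams) for i in range(1, 255)}

# Clients from different /24 networks produce different keys and spread out.
public_targets = {pick_upstream(f"203.0.{i}.1", upstreams) for i in range(256)}

print(slb_targets, public_targets)
```

Whatever the last octet is, the 10.159.95.x addresses collapse onto a single backend, which is the behaviour observed in the log above.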
Solution: let nginx obtain the user's real IP, and run ip_hash on that real client IP.
1. Recompile nginx with the --with-http_realip_module option.
2. In the http{} block of nginx.conf, add:
real_ip_header X-Forwarded-For;
set_real_ip_from 0.0.0.0/0;
real_ip_recursive on;
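3. Restart nginx and check the client: field in error.log; it now shows the user's real public egress IP.
4. Check the traffic on the upstream machines; it is now roughly balanced.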
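How the three realip directives from step 2 choose the client address can be sketched in Python as follows. This is a simplified model of the documented behaviour, not nginx code: if the connecting address is trusted (`set_real_ip_from`), the address is taken from `X-Forwarded-For`; with `real_ip_recursive on`, the header is walked from right to left and the last non-trusted address wins. The function name `resolve_client_ip` and the sample addresses are hypothetical.

```python
import ipaddress


def resolve_client_ip(remote_addr, x_forwarded_for, trusted_cidrs, recursive=True):
    """Simplified model of real_ip_header / set_real_ip_from / real_ip_recursive."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_cidrs]

    def is_trusted(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in trusted)

    # The header is only honoured when the connection comes from a trusted address.
    if not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    if not hops:
        return remote_addr

    # Non-recursive mode: take the last address in the header.
    if not recursive:
        return hops[-1]

    # Recursive mode: walk right to left and stop at the first non-trusted address.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop is trusted (e.g. 0.0.0.0/0): the leftmost address is used.
    return hops[0]


# Hypothetical request: a client at 203.0.113.9 reaches nginx through two SLB hops.
xff = "203.0.113.9, 10.159.95.20"
print(resolve_client_ip("10.159.95.7", xff, ["0.0.0.0/0"]))
print(resolve_client_ip("10.159.95.7", xff, ["10.159.95.0/24"]))
```

In this sample both trust settings resolve to the same client address. The difference is that `0.0.0.0/0` accepts whatever the header says from any sender, whereas listing only the SLB segment limits the header to requests that actually arrive through the SLB.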