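Docker Container Networking
Docker Container Network Modes
Docker provides several network modes (network drivers) that control how containers connect to each other and to the host. Each mode suits different use cases.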
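| Mode | Meaning (driver) | Isolated network namespace? | Can reach the host? | Typical use |
|---|---|---|---|---|
| `bridge` | Bridged network (default) | ✅ Yes | ✅ Yes | Default container-to-container communication |
| `host` | Shares the host's network stack | ❌ No | ✅ Container = host | High-performance networking; port mapping has no effect |
| `none` | No network (fully isolated) | ✅ Yes | ❌ No network | Security isolation, manual configuration |
| `container:<id>` | Shares another container's network namespace | ❌ Shared with the target container | ✅ Same as the target container | Multi-process setups (such as sidecars) |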
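Docker ships with the `bridge`, `host`, and `none` networks by default, and new containers use `bridge` unless another network is specified.
The following sections first cover `bridge`, `host`, `none`, and `container:<id>`.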
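Bridge Mode
How Bridge Mode Creates Interfaces
In bridge mode, a container:
- is attached automatically to Docker's default `bridge` network (the `docker0` bridge);
- gets a private IP address assigned by Docker (such as `172.17.0.x`);
- reaches external networks through NAT;
- exposes its services to the outside through port mapping with `-p`.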
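A bridge network is made up of the following components:
| Component | Role |
|---|---|
| `docker0` | Linux bridge created automatically by Docker (with the address 172.17.0.1/16); acts as a virtual switch for containers |
| veth pair | Each container is connected to the host through a pair of veth virtual interfaces |
| Container IP | Each container automatically gets a private IP, such as 172.17.0.2 |
| NAT forwarding | Containers reach external networks through SNAT on the host |
| Port mapping | `-p 8080:80` forwards a host port to a port inside the container |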
The following steps show how bridge mode implements container networking:

- Check the host's network interfaces

[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 DOWN 172.17.0.1/16 fe80::1439:dfff:fef7:5471/64

- Create two containers

[root@remote-host ~]# docker run -itd --rm --name web1 nginx:latest
02a2d2c0e998cb6c4e28d9fcece6aa9728f737f44655c2bca5cd5805da662441
[root@remote-host ~]# docker run -itd --rm --name web2 nginx:latest
eaffe4d3320acde35c6ba91079be9534a924f901325fe6a1e771a8193df0d0af

- Check the interfaces again

[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 UP 172.17.0.1/16 fe80::1439:dfff:fef7:5471/64
veth760500f@if2 UP fe80::d088:f5ff:fe1a:ee9a/64
veth67337d9@if2 UP fe80::88dd:a1ff:feb6:11ec/64

Two new interfaces have appeared, and `docker0` has gone from DOWN to UP. Look at the bridge information first.

- Inspect the bridge

[root@remote-host ~]# bridge link
461: veth760500f@enp6s18: mtu 1500 master docker0 state forwarding priority 32 cost 2
462: veth67337d9@enp6s18: mtu 1500 master docker0 state forwarding priority 32 cost 2

Both veth devices, `veth760500f` and `veth67337d9`, are attached to the `docker0` bridge.

- View the interface details

[root@remote-host ~]# ip link show
1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp6s18: mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
3: docker0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 16:39:df:f7:54:71 brd ff:ff:ff:ff:ff:ff
461: veth760500f@if2: mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether d2:88:f5:1a:ee:9a brd ff:ff:ff:ff:ff:ff link-netnsid 0
462: veth67337d9@if2: mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 8a:dd:a1:b6:11:ec brd ff:ff:ff:ff:ff:ff link-netnsid 1

One important detail: `461` and `462` are the `ifindex` values of the two veth devices.

- Get the PIDs of the container processes

[root@remote-host ~]# docker inspect -f '{{.State.Pid}}' web1
101492
[root@remote-host ~]# docker inspect -f '{{.State.Pid}}' web2
101568

- View the interfaces inside both containers

[root@remote-host ~]# nsenter -t 101492 -n ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if461: mtu 1500 qdisc noqueue state UP group default
link/ether fe:e2:1c:05:7b:36 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
valid_lft forever preferred_lft forever
[root@remote-host ~]# nsenter -t 101568 -n ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if462: mtu 1500 qdisc noqueue state UP group default
link/ether 8e:e4:ad:9e:a5:41 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
valid_lft forever preferred_lft forever

- Look up the `ifindex` of the veth peer for the interface inside each container's namespace

[root@remote-host ~]# nsenter -t 101492 -a cat /sys/class/net/eth0/iflink
461
[root@remote-host ~]# nsenter -t 101568 -a cat /sys/class/net/eth0/iflink
462
# ethtool shows the same information
[root@remote-host ~]# nsenter -t 101492 -n ethtool -S eth0 | head -n2
NIC statistics:
peer_ifindex: 461
[root@remote-host ~]# nsenter -t 101568 -n ethtool -S eth0 | head -n2
NIC statistics:
peer_ifindex: 462

The veth peer of the first container has `ifindex` 461, and the peer of the second has 462.

So Docker uses veth pairs to create each container's interface and to carry its traffic.

The `@if2` and `@if461` suffixes seen above give the `ifindex` of the peer interface in the other namespace: `veth760500f@if2` on the host means its peer has index 2 inside the container (the container's `eth0`), and `eth0@if461` inside the container means its peer has index 461 on the host. The Linux kernel assigns these indexes automatically when a veth pair that spans namespaces is created.

You can create a veth pair by hand to verify this:
[root@remote-host ~]# ip link add veth-host type veth peer name veth-container
[root@remote-host ~]# ip netns add test
[root@remote-host ~]# ip link set veth-container netns test
[root@remote-host ~]# ip link show veth-host
478: veth-host@if477: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 9e:13:c2:52:d8:e4 brd ff:ff:ff:ff:ff:ff link-netns test
[root@remote-host ~]# ip netns exec test ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
477: veth-container@if478: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 4a:62:d0:33:fb:73 brd ff:ff:ff:ff:ff:ff link-netnsid 0
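
The pairing rule used in the steps above (a container interface named `eth0@ifN` belongs to the host interface whose `ifindex` is `N`) can be written as a small lookup. This is only an illustrative sketch: the function names `parse_link_name` and `match_veth_peers` are made up for this example, and the input values are copied by hand from the transcripts rather than read from a live system.

```python
import re
from typing import Dict, Optional, Tuple


def parse_link_name(link: str) -> Tuple[str, Optional[int]]:
    """Split a name such as 'eth0@if461' into ('eth0', 461).

    Names without an '@ifN' suffix return None as the peer index.
    """
    match = re.fullmatch(r"([^@]+)@if(\d+)", link)
    if match is None:
        return link.split("@")[0], None
    return match.group(1), int(match.group(2))


def match_veth_peers(
    host_ifindex: Dict[str, int], container_links: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Map each container to the host veth whose ifindex equals its peer index."""
    name_by_index = {index: name for name, index in host_ifindex.items()}
    peers = {}
    for container, link in container_links.items():
        _, peer_index = parse_link_name(link)
        peers[container] = name_by_index.get(peer_index)
    return peers


# Values copied from the transcripts in this section.
host = {"veth760500f": 461, "veth67337d9": 462}
containers = {"web1": "eth0@if461", "web2": "eth0@if462"}
print(match_veth_peers(host, containers))
```

On a real host the same information would come from `/sys/class/net/<dev>/ifindex` and `/sys/class/net/<dev>/iflink`, as in the `nsenter` commands earlier.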
How Bridge Mode Networking Works
External Access to a Container (Port Mapping)
Port mapping is implemented with iptables DNAT rules:
[root@remote-host ~]# docker run -itd --rm --name nginx -p 8080:80 nginx:latest
91a3599d08592d01791ae98fe11a76b051faee1148d3a4db9c6ece76309b56e5
[root@remote-host ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
91a3599d0859 nginx:latest "/docker-entrypoint.…" 7 seconds ago Up 7 seconds 0.0.0.0:8080->80/tcp, [::]:8080->80/tcp nginx
[root@remote-host ~]# ss -ntlp | grep 8080
LISTEN 0 2048 0.0.0.0:8080 0.0.0.0:* users:(("docker-proxy",pid=116837,fd=7))
LISTEN 0 2048 [::]:8080 [::]:* users:(("docker-proxy",pid=116841,fd=7))
[root@remote-host ~]# docker inspect nginx | jq .[0].NetworkSettings.IPAddress
"172.17.0.4"
[root@remote-host ~]# iptables -vnL -t nat
...output omitted...
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
0 0 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:172.17.0.4:80
[root@remote-host ~]# curl localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
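
To make the two rules in the `DOCKER` chain easier to read, here is a toy model of how they are evaluated in order. It is a sketch under simplifying assumptions: it knows only about the two rules printed above, ignores every other chain and the `docker-proxy` listener, and the function name `docker_chain` is invented for this example.

```python
from typing import Optional, Tuple

# The container address and published port taken from the rule shown above.
CONTAINER_ADDR = ("172.17.0.4", 80)
PUBLISHED_PORT = 8080


def docker_chain(in_iface: str, proto: str, dport: int) -> Optional[Tuple[str, int]]:
    """Return the rewritten destination, or None when no DNAT applies."""
    # Rule 1: RETURN for anything that arrives on docker0.
    if in_iface == "docker0":
        return None
    # Rule 2: DNAT for tcp dpt:8080 arriving on any interface except docker0.
    if proto == "tcp" and dport == PUBLISHED_PORT:
        return CONTAINER_ADDR
    return None


print(docker_chain("enp6s18", "tcp", 8080))  # traffic from the LAN
print(docker_chain("docker0", "tcp", 8080))  # traffic from another container
```

The first rule is why the DNAT is skipped for packets that enter through `docker0`: the rule order matters, and the `RETURN` is hit before the `DNAT` line is considered.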
Container Access to External Networks (MASQUERADE)
Containers reach external networks through address masquerading (MASQUERADE):
[root@remote-host ~]# iptables -vnL -t nat
...output omitted...
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
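
The match condition of that `MASQUERADE` rule (source in `172.17.0.0/16`, leaving through any interface other than `docker0`) can be sketched as follows. This is a simplified model, not how netfilter is implemented; the function name `masquerade` and its `out_iface_addr` parameter are assumptions made for the example.

```python
import ipaddress

DOCKER_SUBNET = ipaddress.ip_network("172.17.0.0/16")


def masquerade(src: str, out_iface: str, out_iface_addr: str) -> str:
    """Return the source address a packet carries after the POSTROUTING rule."""
    from_container = ipaddress.ip_address(src) in DOCKER_SUBNET
    leaves_bridge = out_iface != "docker0"
    if from_container and leaves_bridge:
        # MASQUERADE substitutes the address of the outgoing interface.
        return out_iface_addr
    return src


# A container packet leaving through the host NIC takes the host's address.
print(masquerade("172.17.0.2", "enp6s18", "192.168.50.100"))
# Container-to-container traffic stays on docker0 and is left untouched.
print(masquerade("172.17.0.2", "docker0", "172.17.0.1"))
```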
Host Mode
Host mode is simpler: the container shares the host's network stack directly, so nginx listens on the host's own port 80:
[root@remote-host ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
124ca20cca7b nginx:latest "/docker-entrypoint.…" 9 seconds ago Up 9 seconds nginx-host
[root@remote-host ~]# ss -ntlp | grep 80
LISTEN 0 511 0.0.0.0:80 0.0.0.0:* users:(("nginx",pid=99885,fd=6),("nginx",pid=99847,fd=6))
LISTEN 0 511 [::]:80 [::]:* users:(("nginx",pid=99885,fd=7),("nginx",pid=99847,fd=7))
[root@remote-host ~]# curl localhost
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
None Mode
In this mode the container runs on its own with no external network connectivity:
[root@remote-host ~]# docker inspect busybox | jq .[0].NetworkSettings
{
"Bridge": "",
"SandboxID": "ade98b04c41e3cd3a1e75e23f3d64748f92e85863420548043309cf30679781b",
"SandboxKey": "/var/run/docker/netns/ade98b04c41e",
"Ports": {},
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"none": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"MacAddress": "",
"DriverOpts": null,
"GwPriority": 0,
"NetworkID": "1ba79e362f0d46298496b560cc1a04662d18d3b43395330ff5e620a864a74ee1",
"EndpointID": "87fa60bc106716450ee40a2d6209a00b8aebec9f286c57cd717cdd6cce737cda",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"DNSNames": null
}
}
}
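Container:\<id> Mode
`container:<id>` mode lets two containers share one network namespace, so both see the same interfaces, MAC address, and IP address: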
[root@remote-host ~]# docker run -itd --rm --name busybox-1 busybox:latest
8955a7e94462e88c6ca786b28f723a036f478aa4fcdf57e15fc3829480628bce
[root@remote-host ~]# docker run -itd --rm --name busybox-2 --network container:busybox-1 busybox:latest
35d6fe82b5f2e68f001b3878f012e7b974bf444e6a167e5d914d410c071d3194
[root@remote-host ~]# docker exec -it busybox-1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if493: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether ee:1c:3a:44:a7:eb brd ff:ff:ff:ff:ff:ff
inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
valid_lft forever preferred_lft forever
[root@remote-host ~]# docker exec -it busybox-2 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if493: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether ee:1c:3a:44:a7:eb brd ff:ff:ff:ff:ff:ff
inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
valid_lft forever preferred_lft forever
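Macvlan and Ipvlan Modes
Docker creates three networks by default: bridge, host, and none. Of these, host, none, and the `container:<id>` mode described earlier are special-purpose modes, while bridge is a host-private virtual network that reaches the outside through NAT. Beyond these, Docker also supports underlay networks, which attach containers directly to the physical network, through the macvlan and ipvlan drivers.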
Macvlan
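macvlan creates multiple virtual layer-2 interfaces on top of one physical NIC, each with its own MAC address. In Docker, every container's interface is a sub-interface of the physical NIC with an independent MAC address, which removes the bridge layer that bridge mode needs.
macvlan requirements:
- Linux only, with kernel 4.0 or later
- Some cloud providers forbid multiple MAC addresses on a single NIC, so `macvlan` cannot be used in those environments
- Not usable in rootless mode
- The network equipment must support promiscuous mode, that is, allow one physical interface to be assigned multiple MAC addresses
macvlan has the following modes:
| Mode | Description |
|---|---|
| bridge | macvlan interfaces under the same parent communicate directly (forwarded in the kernel, without a Linux bridge) |
| vepa | Traffic between macvlan interfaces must go through an external switch (hairpin forwarding) |
| passthru | A single macvlan interface takes over the parent NIC exclusively (similar to passthrough) |
| private | macvlan interfaces are fully isolated from each other (even under the same parent) |
macvlan creation options:
- `macvlan_mode`: the macvlan mode. One of `bridge`, `vepa`, `passthru`, or `private`; defaults to `bridge`
- `parent`: the parent interface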
[root@remote-host ~]# docker network create \
-d macvlan \
--subnet 192.168.50.0/24 \
--gateway 192.168.50.1 \
-o parent=enp6s18 \
macvaln_net
331438af5594d741e41e2f94b9182372cb17a6368512a5b85a88eb7f552fdda6
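Use `--ip-range=192.168.32.128/25` to define the address pool, and `--aux-address="my-router=192.168.32.129"` to exclude an address from it.
`macvlan` also supports VLAN sub-interfaces: pass the sub-interface name to `-o parent=`.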
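
To get a feel for what `--ip-range` and `--aux-address` leave available, the arithmetic can be checked with Python's `ipaddress` module. This is a rough sketch: `candidate_addresses` is a name invented here, and Docker's own IPAM may treat the first and last address of the range, or the gateway, differently from `hosts()`.

```python
import ipaddress
from typing import Dict, List


def candidate_addresses(
    ip_range: str, aux_addresses: Dict[str, str]
) -> List[ipaddress.IPv4Address]:
    """List the addresses in ip_range that are not reserved as aux addresses."""
    pool = ipaddress.ip_network(ip_range)
    reserved = {ipaddress.ip_address(addr) for addr in aux_addresses.values()}
    # hosts() skips the first and last address of the range.
    return [host for host in pool.hosts() if host not in reserved]


addresses = candidate_addresses(
    "192.168.32.128/25", {"my-router": "192.168.32.129"}
)
print(len(addresses), addresses[0], addresses[-1])
```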
Creating macvlan containers:
[root@remote-host ~]# docker run -d --rm --network macvaln_net --name macvlan1 busybox sleep infinity
9cda3285ff237174777bff3079a9d536c1d6f049773c55816d5e4e3ff59346a0
[root@remote-host ~]# docker run -d --rm --network macvaln_net --name macvlan2 busybox sleep infinity
48303fd696cf32be39637c37653fe81526d97a992f065692a70693d4fba7903a
[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 DOWN 172.17.0.1/16
[root@remote-host ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp6s18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 8e:8f:d0:c1:57:fe brd ff:ff:ff:ff:ff:ff
[root@remote-host ~]# ip -d link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 minmtu 0 maxmtu 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
2: enp6s18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500 addrgenmode none numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 parentbus virtio parentdev virtio5
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 8e:8f:d0:c1:57:fe brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.8e:8f:d0:c1:57:fe designated_root 8000.8e:8f:d0:c1:57:fe root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer 0.00 tcn_timer 0.00 topology_change_timer 0.00 gc_timer 60.83 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 no_linklocal_learn 0 mcast_vlan_snooping 0 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
[root@remote-host ~]# docker exec -it macvlan1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
4: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether 5e:08:24:85:a1:c3 brd ff:ff:ff:ff:ff:ff
inet 192.168.50.2/24 brd 192.168.50.255 scope global eth0
valid_lft forever preferred_lft forever
[root@remote-host ~]# docker exec -it macvlan2 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
5: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether 1a:ff:38:64:3c:03 brd ff:ff:ff:ff:ff:ff
inet 192.168.50.3/24 brd 192.168.50.255 scope global eth0
valid_lft forever preferred_lft forever
[root@remote-host ~]# nsenter -t 2377 -n ip -d link show eth0
4: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 5e:08:24:85:a1:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 1500
macvlan mode bridge bcqueuelen 1000 usedbcqueuelen 1000 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
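Here the container's interface is named eth0@if2. The if2 part means that the interface is a child of the interface with index 2, which is interface number 2 on the host (enp6s18).
macvlan has one more limitation: because of how the Linux kernel handles it, a macvlan container cannot communicate directly with its own host. There are two ways around this:
- Attach the container to both the `macvlan` network and a bridge network when creating it
- Create a `macvlan` interface on the host with the same parent interface as the containers, and give it an address in the Docker subnet
Both methods are demonstrated below:
Method 1: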
[root@remote-host ~]# docker run -d --network macvaln_net --name macvlan3 busybox sleep infinity
1fffdb5c820127c2b2038b799883093bd86114720ec6005670c6791442f2b9ed
[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 DOWN 172.17.0.1/16 fe80::8c8f:d0ff:fec1:57fe/64
[root@remote-host ~]# docker exec -it macvlan3 sh
/ # ping -c2 -w2 192.168.50.100
PING 192.168.50.100 (192.168.50.100): 56 data bytes
--- 192.168.50.100 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
[root@remote-host ~]# docker network create bridge-net
0732633f4767e530561653f76b2d5afeb630a04ca2faeac5abcd8b5c3b73af3d
[root@remote-host ~]# docker run -d --network macvaln_net --network bridge-net --name macvlan3 busybox sleep infinity
bdd4c3e9f2068f27874fb2242df7dba0c03ca36f67e82d7f313a184295f739db
[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 DOWN 172.17.0.1/16 fe80::8c8f:d0ff:fec1:57fe/64
br-0732633f4767 UP 172.18.0.1/16 fe80::ecfc:5aff:fedc:c036/64
veth38fd0c7@if2 UP fe80::e0ac:feff:feb2:8d7e/64
[root@remote-host ~]# docker exec -it macvlan3 sh
/ # ping -c2 -w2 172.18.0.1
PING 172.18.0.1 (172.18.0.1): 56 data bytes
64 bytes from 172.18.0.1: seq=0 ttl=64 time=0.070 ms
64 bytes from 172.18.0.1: seq=1 ttl=64 time=0.095 ms
--- 172.18.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.070/0.082/0.095 ms
Method 2:
# Create a new network for this demonstration
[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 DOWN 172.17.0.1/16 fe80::8c8f:d0ff:fec1:57fe/64
enp6s19 UP fe80::be24:11ff:feba:f53d/64
[root@remote-host ~]# docker network create \
-d macvlan \
--subnet 192.168.10.0/24 \
--gateway 192.168.10.1 \
-o parent=enp6s19 \
macvaln_net_1
56eb304eab68c4dbe4fa8e3637e02190b1e5a7d911765e06913bda06c54fe15a
[root@remote-host ~]# ip link add macvlan-host link enp6s19 type macvlan mode bridge
[root@remote-host ~]# ip addr add 192.168.10.1/24 dev macvlan-host
[root@remote-host ~]# ip link set macvlan-host up
[root@remote-host ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
enp6s18 UP 192.168.50.100/24 fe80::be24:11ff:fe88:76d6/64
docker0 DOWN 172.17.0.1/16 fe80::8c8f:d0ff:fec1:57fe/64
enp6s19 UP fe80::be24:11ff:feba:f53d/64
macvlan-host@enp6s19 UP 192.168.10.1/24 fe80::c4b6:89ff:fe73:ce6d/64
[root@remote-host ~]# docker run -d --rm --name macvlan4 --network macvaln_net_1 busybox sleep infinity
fdeda4a9b1e57a5276e2fa041cda8034bfeca59b3ef54d97e91c8f22e2ab0ba7
[root@remote-host ~]# docker exec -it macvlan4 sh
/ # wget 192.168.10.1:8080
Connecting to 192.168.10.1:8080 (192.168.10.1:8080)
saving to 'index.html'
index.html 100% |*******************************************************************| 13 0:00:00 ETA
'index.html' saved
/ #
Ipvlan
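ipvlan sub-interfaces all share the parent's MAC address and are told apart by their IP addresses.
ipvlan addresses two pain points of macvlan:
- Too many MAC addresses: the switch's CAM table may run out of space
- Broadcast traffic: heavy ARP broadcasting
ipvlan has the following options:
- `ipvlan_mode`: the `ipvlan` operating mode
  - `l2`: layer-2 network (like being plugged into a switch); the default
  - `l3`: routed mode (the host acts as the router)
  - `l3s`: stateful routing (solves return-traffic and NAT problems)
- `ipvlan_flag`: the `ipvlan` mode flag
  - `bridge`: sub-interfaces can talk to each other directly (within the host); the default
  - `private`: sub-interfaces are fully isolated from each other
  - `vepa`: traffic between sub-interfaces must pass through an external switch
- `parent`: the parent interface
The l2 and l3 modes of ipvlan:
- `l2` mode: all networks operate at layer 2, and crossing a layer-3 boundary requires an external router
- `l3` mode: all broadcast and multicast traffic is dropped; traffic between sub-interfaces is forwarded by the kernel, and traffic to the outside follows the host's routing table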
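
The practical difference between the two modes can be summed up in one decision: does traffic between two containers on the same parent have to leave the host? The sketch below encodes only the rule described in the list above. It assumes the default `bridge` flag, and `needs_external_router` is a hypothetical helper, not part of Docker or the kernel.

```python
import ipaddress
from typing import List


def needs_external_router(mode: str, src: str, dst: str, subnets: List[str]) -> bool:
    """Say whether traffic between two containers on one ipvlan parent must leave the host."""
    networks = [ipaddress.ip_network(subnet) for subnet in subnets]

    def subnet_of(address: str):
        ip = ipaddress.ip_address(address)
        return next((net for net in networks if ip in net), None)

    src_net, dst_net = subnet_of(src), subnet_of(dst)
    if src_net is None or dst_net is None:
        return True  # at least one address is outside this ipvlan network
    if mode == "l2":
        # Layer 2 only: different subnets need a router outside the host.
        return src_net != dst_net
    if mode in ("l3", "l3s"):
        # The kernel routes between sub-interfaces of the same parent.
        return False
    raise ValueError(f"unknown ipvlan mode: {mode}")


subnets = ["192.168.114.0/24", "192.168.116.0/24"]
print(needs_external_router("l2", "192.168.114.100", "192.168.116.100", subnets))
print(needs_external_router("l3", "192.168.114.100", "192.168.116.100", subnets))
```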
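Creating an l2-mode ipvlan network:
# Skip the single-subnet case and create two subnets directly
# The parent is a VLAN sub-interface, so external devices may not be able to reach these containers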
[root@remote-host ~]# docker network create -d ipvlan \
--subnet=192.168.114.0/24 --subnet=192.168.116.0/24 \
--gateway=192.168.114.254 --gateway=192.168.116.254 \
-o ipvlan_mode=l2 \
-o parent=enp6s18.114 ipvlan114
2a1a9536b329ead7ce886e5282cdd4b5b440db1a6a1571344929edaf8ca7316c
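There is also an `--internal` option; containers on a network created with it cannot reach external networks.
`ipvlan` also supports VLAN sub-interfaces: pass the sub-interface name to `-o parent=`.
`ipvlan` also accepts multiple `--subnet=` and `--gateway=` pairs to add more than one subnet.
Creating l2-mode ipvlan containers: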
[root@remote-host ~]# docker run -d --rm --network ipvlan114 --name test1 --ip 192.168.114.100 busybox sleep infinity
834dcc6b941ef0cddd02f1c6bc62fe937bb2648f5b6af60ba81bb346c16c2e47
[root@remote-host ~]# docker run -d --rm --network ipvlan114 --name test2 --ip 192.168.116.100 busybox sleep infinity
6158726e93b8c0bbe755393791517b38f4e03d95d9d153d7b6d1f1464f4eba56
[root@remote-host ~]# docker exec -it test1 sh
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
25: eth0@if24: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.114.100/24 brd 192.168.114.255 scope global eth0
valid_lft forever preferred_lft forever
/ # ip route
default via 192.168.114.254 dev eth0
192.168.114.0/24 dev eth0 scope link src 192.168.114.100
/ #
[root@remote-host ~]# docker exec -it test2 sh
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
26: eth0@if24: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.116.100/24 brd 192.168.116.255 scope global eth0
valid_lft forever preferred_lft forever
/ # ip route
default via 192.168.116.254 dev eth0
192.168.116.0/24 dev eth0 scope link src 192.168.116.100
/ # ping -c1 -w1 192.168.116.100
PING 192.168.116.100 (192.168.116.100): 56 data bytes
--- 192.168.116.100 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
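In `l2` mode, the two subnets need an external router to communicate with each other.
Creating an l3-mode ipvlan network:
# The previous l2 network has already been removed
# l3 mode ignores the --gateway option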
[root@remote-host ~]# docker network create -d ipvlan \
--subnet=192.168.114.0/24 \
--subnet=192.168.116.0/24 \
-o ipvlan_mode=l3 \
-o parent=enp6s18 ipvlan_l3
0528ef66e56a42d9214367e571646273b4340177aed13558950f74f2c82370f8
Creating l3-mode ipvlan containers:
[root@remote-host ~]# docker run -d --rm --network ipvlan_l3 --name ipvlan_l3_1 --ip 192.168.114.100 busybox sleep infinity
75f3b415588b1af3c8a154211a80f8eb925b932592b9675632383dff1d940837
[root@remote-host ~]# docker run -d --rm --network ipvlan_l3 --name ipvlan_l3_2 --ip 192.168.116.100 busybox sleep infinity
1bc5cdef4bfcab7e2ef355b7ee946dcfefde91eb9aa3d0e95c33ca4b138c3d48
[root@remote-host ~]# docker exec -it ipvlan_l3_1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
28: eth0@if27: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.114.100/24 brd 192.168.114.255 scope global eth0
valid_lft forever preferred_lft forever
[root@remote-host ~]# docker exec -it ipvlan_l3_1 ip route
default dev eth0 scope link
192.168.114.0/24 dev eth0 scope link src 192.168.114.100
[root@remote-host ~]# docker exec -it ipvlan_l3_2 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
29: eth0@if27: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.116.100/24 brd 192.168.116.255 scope global eth0
valid_lft forever preferred_lft forever
[root@remote-host ~]# docker exec -it ipvlan_l3_2 ip route
default dev eth0 scope link
192.168.116.0/24 dev eth0 scope link src 192.168.116.100
[root@remote-host ~]# docker exec -it ipvlan_l3_1 sh
/ # ping -c1 192.168.116.100
PING 192.168.116.100 (192.168.116.100): 56 data bytes
64 bytes from 192.168.116.100: seq=0 ttl=64 time=0.075 ms
--- 192.168.116.100 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.075/0.075/0.075 ms
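Here the default route of an l3-mode ipvlan container points at the container's own eth0 device rather than at a gateway address.
For a remote host to reach l3-mode containers, the remote host and any physical network in between must have routes for the container subnets that point to the physical interface of the Docker host.
The following tests external access:
# Network of the external device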
[root@test ~]# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
ens18 UP 192.168.50.189/24 fe80::be24:11ff:fec8:450/64
# Container network
[root@remote-host ~]# docker exec -it ipvlan_l3_1 sh
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
36: eth0@if2: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether bc:24:11:88:76:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.114.100/24 brd 192.168.114.255 scope global eth0
valid_lft forever preferred_lft forever
# Ping the external host from the container: no reply
[root@remote-host ~]# docker exec -it ipvlan_l3_1 sh
/ # ping -c1 -w1 192.168.50.189
PING 192.168.50.189 (192.168.50.189): 56 data bytes
--- 192.168.50.189 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
# Ping the container from the external host: no reply
[root@test ~]# ping -c1 -w1 192.168.114.100
PING 192.168.114.100 (192.168.114.100) 56(84) bytes of data.
--- 192.168.114.100 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
# Add a route on the external host
[root@test ~]# ip route add 192.168.114.0/24 via 192.168.50.100
# Test ping again
[root@remote-host ~]# docker exec -it ipvlan_l3_1 sh
/ # ping -c1 -w1 192.168.50.189
PING 192.168.50.189 (192.168.50.189): 56 data bytes
64 bytes from 192.168.50.189: seq=0 ttl=64 time=0.247 ms
--- 192.168.50.189 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.247/0.247/0.247 ms
[root@test ~]# ping -c1 -w1 192.168.114.100
PING 192.168.114.100 (192.168.114.100) 56(84) bytes of data.
64 bytes from 192.168.114.100: icmp_seq=1 ttl=64 time=0.220 ms
--- 192.168.114.100 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.220/0.220/0.220/0.000 ms
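
Why a single `ip route add` fixes both directions can be seen with a small longest-prefix-match lookup: before the route exists, the external host sends packets for 192.168.114.0/24 (including ping replies) somewhere other than the Docker host. This is a sketch only. The default route via 192.168.50.1 is an assumption, since the external host's routing table is not shown in the transcript, and `next_hop` is a name invented for the example.

```python
import ipaddress
from typing import List, Optional, Tuple

# (destination prefix, next hop); a next hop of None means directly connected.
Route = Tuple[str, Optional[str]]


def next_hop(routes: List[Route], dst: str) -> Optional[str]:
    """Longest-prefix match: return the next hop, or dst itself when on-link."""
    dst_ip = ipaddress.ip_address(dst)
    best = None
    for prefix, via in routes:
        network = ipaddress.ip_network(prefix)
        if dst_ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, via)
    if best is None:
        return None  # no matching route
    return best[1] if best[1] is not None else dst


# Hypothetical table of the external host before the change.
before = [("192.168.50.0/24", None), ("0.0.0.0/0", "192.168.50.1")]
# After: ip route add 192.168.114.0/24 via 192.168.50.100
after = before + [("192.168.114.0/24", "192.168.50.100")]

print(next_hop(before, "192.168.114.100"))
print(next_hop(after, "192.168.114.100"))
```

Only 192.168.114.0/24 was routed in the transcript; reaching containers in 192.168.116.0/24 from the external host would presumably need a second route of the same form.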
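Adding Name Resolution for Containers
Docker can add entries to a container's /etc/hosts. For example, suppose you create a mysql container and then an nginx container, and you want the nginx container to reach the mysql container by the name mysql. You can do it as follows:
The `busybox` image stands in for the mysql container here.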
[root@remote-host ~]# docker run -itd --rm --name mysql busybox:latest
fdc62f57f6d73f6556f7c95caedd2ecdedd81436b7114f78f9f342498f2954d0
[root@remote-host ~]# docker run -itd --rm --name nginx --link mysql nginx:latest
da902521358855950202d98f764d1f674ecd573e5e5c09d1b8ecbbf1f2a700cc
[root@remote-host ~]# docker inspect mysql | jq .[0].NetworkSettings.IPAddress
"172.17.0.2"
[root@remote-host ~]# docker exec -it nginx cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00:: ip6-localnet
ff00:: ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.2 mysql fdc62f57f6d7
172.17.0.3 da9025213588
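The nginx container's /etc/hosts now contains an entry that resolves mysql to 172.17.0.2.
`--link` is deprecated as a legacy feature in recent Docker versions. Containers attached to the same user-defined network can resolve each other directly by container name.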
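
What the `--link` entry buys the nginx container is an ordinary hosts-file lookup. The sketch below mimics that lookup on the file contents printed above; `resolve` is an illustrative helper written for this example, not the resolver the container actually uses.

```python
from typing import Optional

# Contents of /etc/hosts as printed by the nginx container above.
HOSTS = """\
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00:: ip6-localnet
ff00:: ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.2 mysql fdc62f57f6d7
172.17.0.3 da9025213588
"""


def resolve(hosts_text: str, name: str) -> Optional[str]:
    """Return the address of the first line that lists `name` as a hostname."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


print(resolve(HOSTS, "mysql"))
```

Because the entry is written once when the container is created, this approach depends on the mysql container keeping the same IP address.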