Detect Proxmox hardware unit hang
VM connectivity was somehow blocked
Recently I had reliability issues with local network connections to my Proxmox host and its services. It showed up as application crashes on a Windows 11 VM, where explorer.exe restarted every time I connected via RDP, and as Plex buffering the stream at random moments. It turned out the built-in network card on the Proxmox host itself kept resetting.
Detect the problem
In the Proxmox syslog I saw a lot of messages like these:
Jan 05 12:31:21 pve kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
TDH <c9>
TDT <10>
next_to_use <10>
next_to_clean <c8>
buffer_info[next_to_clean]:
time_stamp <11d57416d>
next_to_watch <c9>
jiffies <11d574588>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
Jan 05 12:31:22 pve kernel: e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
Jan 05 12:31:22 pve kernel: vmbr0: port 1(eno1) entered disabled state
Jan 05 12:31:26 pve kernel: e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 05 12:31:26 pve kernel: vmbr0: port 1(eno1) entered blocking state
Jan 05 12:31:26 pve kernel: vmbr0: port 1(eno1) entered forwarding state
Jan 05 12:31:28 pve pvestatd[1119]: status update time (6.257 seconds)
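To get a feeling for how often this happens, the kernel log can be filtered for the hang message. The following is a minimal sketch; the function name count_hangs is my own and purely illustrative, and it assumes the message text matches the log excerpt above.

```shell
#!/bin/sh
# count_hangs (hypothetical helper name): reads kernel log text on stdin
# and prints how many lines contain the e1000e hang message.
count_hangs() {
    # grep -c exits non-zero when it counts 0 matches; "|| true" keeps the
    # function's exit status clean while still printing the count.
    grep -c 'Detected Hardware Unit Hang' || true
}

# Usage on the Proxmox host (kernel messages of the current boot):
#   journalctl -k --no-pager | count_hangs
```

A count that keeps growing during normal use suggests the resets are ongoing rather than a one-off.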
I have an Intel I219-LM onboard chip, and after searching around it looks like this is a well-known problem.
root@pve:~# lspci -v | grep Ethernet
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-LM (rev 10)
DeviceName: Onboard - Ethernet
Subsystem: Lenovo Ethernet Connection (7) I219-LM
There are multiple variations of this problem with different NICs, but for me and many others, disabling offloading solved the problem.
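Before changing anything, it can help to see which offloads are currently enabled. This sketch filters the output of "ethtool -k" down to the four features that the ethtool short names gso, tso, sg and gro refer to. The function name offload_state is an assumption of mine, not part of any tool.

```shell
#!/bin/sh
# offload_state (hypothetical helper name): reads "ethtool -k <iface>"
# output on stdin and keeps only the four top-level features:
#   sg  -> scatter-gather
#   tso -> tcp-segmentation-offload
#   gso -> generic-segmentation-offload
#   gro -> generic-receive-offload
offload_state() {
    grep -E '^(scatter-gather|tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):'
}

# Usage on the Proxmox host (adjust the interface name):
#   ethtool -k eno1 | offload_state
```

Running the same command again after the fix should show these features as "off".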
Resolve the problem
Open /etc/network/interfaces and add the last two lines (the pre-up lines) to your bridge configuration. Check that the interface names are correct (via “ip a”, for example).
auto vmbr0
iface vmbr0 inet static
address 10.10.5.5/24
gateway 10.10.5.xxx
bridge-ports eno1
bridge-stp off
bridge-fd 0
pre-up /sbin/ethtool --offload vmbr0 gso off tso off sg off gro off
pre-up /sbin/ethtool --offload eno1 gso off tso off sg off gro off
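If you want to try the same settings before touching the config file, the ethtool command from the pre-up lines can also be run by hand. The wrapper below is only a sketch: the function name disable_offloads and the DRY_RUN variable are my own additions, and by default it only prints the commands instead of running them. Settings applied this way do not survive a reboot, which is why the pre-up lines are still needed.

```shell
#!/bin/sh
# disable_offloads (hypothetical helper name): takes interface names as
# arguments. With DRY_RUN=1 (the default here) it only prints the ethtool
# commands; with DRY_RUN=0 it runs them (needs root on the host).
disable_offloads() {
    for iface in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "/sbin/ethtool --offload $iface gso off tso off sg off gro off"
        else
            /sbin/ethtool --offload "$iface" gso off tso off sg off gro off
        fi
    done
}

# Usage (interface names as in the config above):
#   disable_offloads eno1 vmbr0              # print only
#   DRY_RUN=0 disable_offloads eno1 vmbr0    # actually apply
```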
In my case I have the loopback device (lo), the physical interface (eno1) and the bridge (vmbr0) for Proxmox. You need to reference your own bridge and interface in the config.
root@pve:~# ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
link/ether f8:75:a4:20:10:79 brd ff:ff:ff:ff:ff:ff
altname enp0s31f6
3: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether f8:75:a4:20:10:79 brd ff:ff:ff:ff:ff:ff
inet 10.10.5.5/24 scope global vmbr0
valid_lft forever preferred_lft forever
inet6 fe80::fa75:a4ff:fe20:1079/64 scope link
valid_lft forever preferred_lft forever
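The "master vmbr0" part in the eno1 line above is what tells you which interface belongs to the bridge. As a rough sketch, the one-line output of "ip -o link show" can be filtered for that. The function name bridge_ports is hypothetical and only used for illustration.

```shell
#!/bin/sh
# bridge_ports (hypothetical helper name): $1 is the bridge name, stdin is
# the output of "ip -o link show". Prints every interface whose line
# contains "master <bridge>".
bridge_ports() {
    awk -v br="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == "master" && $(i + 1) == br) {
                name = $2
                sub(/:$/, "", name)   # drop the trailing colon
                sub(/@.*/, "", name)  # drop an "@parent" suffix if present
                print name
            }
    }'
}

# Usage on the Proxmox host:
#   ip -o link show | bridge_ports vmbr0
```

On a host with running VMs or containers this would also list their tap or veth devices, so the physical NIC is the entry that matches your bridge-ports line.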
After applying the change and rebooting, my network interface is now stable and works as expected.