Behavior and Analysis
After we upgraded the VMware virtual hardware version of an Actifio appliance on ESX 6.5 from version 9 to 13 (the current one), this Linux guest crashed with a VMware core dump again and again. Problem analysis was horrible.
Most of the time, after VMware HA had restarted the guest, it crashed again as soon as I tried to connect via the SSH console. So the analysis was a nightmare, but after much time and a lot of retries we succeeded in downloading the vmcore files via scp.
So we could analyse them and found the reason:
    ls -l /var/crash/127.0.0.1-2017-10-19-07:32:44
    -rw------- 1 root root 629240952 Oct 19 13:33 vmcore
    -rw-r--r-- 1 root root 93396 Oct 19 13:32 vmcore-dmesg.txt

    grep BUG vmcore-dmesg.txt
    <2>kernel BUG at drivers/net/vmxnet3/vmxnet3_drv.c:1412!
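If the guest has crashed several times, each crash leaves its own subdirectory under /var/crash, and checking them one by one gets tedious. The following is a minimal sketch that runs the same grep over every vmcore-dmesg.txt it finds. The function name find_kernel_bugs is hypothetical, and the directory layout is assumed to match the listing above.

```shell
#!/bin/sh
# Hypothetical helper: print every "kernel BUG" line from the kdump dmesg
# files below a crash directory (defaults to /var/crash, as in the listing).
find_kernel_bugs() {
    crash_root="${1:-/var/crash}"
    # Assumption: each crash has its own subdirectory with a vmcore-dmesg.txt.
    for dmesg in "$crash_root"/*/vmcore-dmesg.txt; do
        [ -f "$dmesg" ] || continue   # glob matched nothing, or not a file
        # -H prefixes each hit with the file name so crashes can be told apart;
        # "|| true" keeps a file without hits from counting as a failure.
        grep -H 'kernel BUG' "$dmesg" || true
    done
}
```

Calling find_kernel_bugs without an argument scans /var/crash. Passing a directory, for example one holding the files copied off the guest via scp, scans that directory instead.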
This is a known VMware bug:
Bug 191201 – Randomly freezes due to VMXNET3
This issue affects all virtual machines running on an ESXi 6.5 host with virtual hardware version 13. The guest freezes randomly, sometimes several minutes after power-on and sometimes several hours after boot.
Solution
- The issue can be resolved with an ESX upgrade.
- Several workarounds already exist.
- We used this one: disable vmxnet3 in the VM's vmx file with the following setting:
    vmxnet3.rev.30 = FALSE
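For more than one VM, the edit can be scripted. This is only a sketch under stated assumptions: the function name set_vmx_option is hypothetical, the VM is assumed to be powered off while its vmx file is changed, and the value is written unquoted exactly as shown above. Entries in .vmx files are commonly written as key = "value", so check what your environment expects.

```shell
#!/bin/sh
# Hypothetical helper: set KEY to VALUE in a .vmx file, replacing any
# existing entry for KEY. Assumes the VM is powered off during the edit.
set_vmx_option() {
    vmx_file="$1"; key="$2"; value="$3"
    cp "$vmx_file" "$vmx_file.bak"   # keep a backup of the original file
    # Copy every line except an existing "KEY = ..." entry. The key is
    # compared as a fixed string, so the dots in it are not wildcards.
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)
        if (index(line, k) == 1 && substr(line, length(k) + 1) ~ /^[ \t]*=/) next
        print
    }' "$vmx_file.bak" > "$vmx_file"
    # Append the new entry in the form used in this post (unquoted value).
    printf '%s = %s\n' "$key" "$value" >> "$vmx_file"
}
```

With the VM powered off, a call such as set_vmx_option /path/to/guest.vmx vmxnet3.rev.30 FALSE (the path is a placeholder) writes the setting and leaves the previous file next to it as guest.vmx.bak.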
Links