Kernel panics with nested virtualization on Fedora 27 w/ AMD ThreadRipper?
Has anyone been successful in doing nested virtualization with AMD ThreadRipper processors and Fedora as a host OS?
I'm getting kernel panics in the hypervisor VM the moment I start the nested VM.
To reproduce, I have:
- downloaded the latest Fedora 27 Workstation DVD image
- installed it into a VM on my Fedora 27 workstation
- stopped the VM, changed the CPU type to 'host-passthrough', started the VM (see the virt-xml sketch after this list)
- updated the software in the VM
- installed kernel-debuginfo, crash, kexec-tools
- started the kdump service
- installed guestfish and then I ran:
truncate --size 5G /tmp/foo.img
export LIBGUESTFS_BACKEND=direct
guestfish -a /tmp/foo.img
- typed 'run' at the guestfish prompt
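For reference, here is roughly how I made the CPU change and verified that nested SVM was actually exposed; the VM name 'fedora27-l1' is just a placeholder, and the CPU change can equally be done in virt-manager:
# on the L0 host, with the VM shut down
virsh shutdown fedora27-l1
virt-xml fedora27-l1 --edit --cpu host-passthrough   # writes <cpu mode='host-passthrough'/> into the domain XML
virsh start fedora27-l1
# sanity checks
cat /sys/module/kvm_amd/parameters/nested   # on the L0 host: should report 1
grep -c svm /proc/cpuinfo                   # inside the hypervisor VM: should be non-zero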
Kernel traceback below; no vmcore is available (for some reason kdump triggers on a sysrq crash but not on this actual panic; see the sysrq note after the trace):
[ 216.002435] kernel BUG at arch/x86/kvm/x86.c:334!
[ 216.003054] invalid opcode: 0000 [#1] SMP NOPTI
[ 216.003566] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc snd_hda_codec_generic kvm_amd kvm snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev snd_timer joydev snd virtio_balloon parport_pc i2c_piix4 soundcore parport xfs libcrc32c virtio_net virtio_console virtio_blk qxl drm_kms_helper
[ 216.011428] ttm drm crc32c_intel serio_raw virtio_pci ata_generic pata_acpi qemu_fw_cfg virtio_rng virtio_ring virtio
[ 216.012640] CPU: 0 PID: 1559 Comm: qemu-system-x86 Not tainted 4.14.11-300.fc27.x86_64 #1
[ 216.013565] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-1.fc27 04/01/2014
[ 216.014552] task: ffff8caff9ce9f40 task.stack: ffffb9e0409ac000
[ 216.015233] RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm]
[ 216.015823] RSP: 0018:ffffb9e0409afcf0 EFLAGS: 00010246
[ 216.016418] RAX: 0000000048dfb000 RBX: 0000000000000000 RCX: 0000000000000000
[ 216.017206] RDX: 0000000000000663 RSI: 0000000000000000 RDI: 0000000000000000
[ 216.017950] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 216.018700] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 216.019497] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 216.020291] FS: 00007fbcef452700(0000) GS:ffff8cafffc00000(0000) knlGS:0000000000000000
[ 216.021166] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 216.021788] CR2: 0000561e99570958 CR3: 0000000048c88000 CR4: 00000000003406f0
[ 216.022585] Call Trace:
[ 216.022873] svm_exit+0x69/0x55e [kvm_amd]
[ 216.023340] ? svm_vcpu_run+0x183/0x560 [kvm_amd]
[ 216.023878] ? kvm_arch_vcpu_ioctl_run+0x55a/0x1650 [kvm]
[ 216.024493] ? get_futex_key+0x372/0x4d0
[ 216.024946] ? kvm_vcpu_ioctl+0x27e/0x5c0 [kvm]
[ 216.025513] ? kvm_vcpu_ioctl+0x27e/0x5c0 [kvm]
[ 216.026033] ? do_signal+0x19a/0x610
[ 216.026451] ? __fpu__restore_sig+0x96/0x440
[ 216.026964] ? do_vfs_ioctl+0xa1/0x610
[ 216.027421] ? SyS_ioctl+0x74/0x80
[ 216.027811] ? entry_SYSCALL_64_fastpath+0x1a/0x7d
[ 216.028355] Code: 00 d3 e2 f6 c2 1a 75 0d 31 c0 81 e2 00 01 04 00 0f 95 c0 01 c0 f3 c3 0f ff b8 03 00 00 00 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 53 89 ff 65
[ 216.030442] RIP: kvm_spurious_fault+0x5/0x10 [kvm] RSP: ffffb9e0409afcf0
[ 216.031163] ---[ end trace 0921574e14ea9f98 ]---
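(The sysrq test I mentioned above was just the usual crash trigger, run as root inside the hypervisor VM; something along these lines, with /var/crash being the stock Fedora dump target:)
echo 1 > /proc/sys/kernel/sysrq    # enable all sysrq functions
echo c > /proc/sysrq-trigger       # force a crash; kdump should write a vmcore under /var/crash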
Note that normal (non-nested) virtualization works fine.
I've had issues with nested virt on a 1950X as well; KSM seems to make it worse.
Have you found anything that mitigates these issues?
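In case it helps while testing, this is roughly how I've been switching KSM off on the host; Fedora ships it as ksm.service/ksmtuned.service, so the unit names may differ elsewhere:
sudo systemctl stop ksmtuned.service ksm.service
sudo systemctl disable ksmtuned.service ksm.service
echo 2 | sudo tee /sys/kernel/mm/ksm/run   # 2 = stop KSM and unmerge all currently shared pages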