I’ll have to try that. What I have tried so far is running a different kernel version and making sure my driver blacklists are correct. (I found that the GPU shouldn’t ever connect to snd_hda_intel. It briefly was connected again, but after fixing that, I still had the problem.)
For me, I have Intel integrated + AMD discrete. When I tried to set DRI_PRIME to 0, it complained that 0 was invalid; when I set it to 2, it said it had to be less than the number of GPUs detected (2). After digging in, I noticed my cards in /dev/dri/by-path were card1 and card2 rather than card0 and card1 like everyone online said they should be. Searching for that, I found a few threads like this one that mentioned simpledrm was enabled by default in 6.4.8, which apparently broke some kind of enumeration with AMD GPUs. I don’t really understand why, but setting that param made my cards number correctly, and PRIME selection works again.
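In case anyone wants to check their own numbering, here is a rough Python sketch of what I was looking at by hand. It assumes the entries in /dev/dri/by-path are symlinks whose names end in `-card` and point at the `cardN` nodes, which is how it looks on my machine; the function names are just ones I made up for this.

```python
import os
import re


def list_dri_cards(by_path_dir="/dev/dri/by-path"):
    """Map each '*-card' symlink name to the device node it resolves to."""
    cards = {}
    if not os.path.isdir(by_path_dir):
        return cards  # no DRM devices, or not a Linux system
    for name in sorted(os.listdir(by_path_dir)):
        if not name.endswith("-card"):
            continue  # skip the '-render' entries
        target = os.path.realpath(os.path.join(by_path_dir, name))
        cards[name] = os.path.basename(target)
    return cards


def card_indices(cards):
    """Return the sorted numeric suffixes of the resolved cardN nodes."""
    indices = []
    for device in cards.values():
        match = re.fullmatch(r"card(\d+)", device)
        if match:
            indices.append(int(match.group(1)))
    return sorted(indices)


if __name__ == "__main__":
    found = list_dri_cards()
    for link, device in found.items():
        print(f"{link} -> {device}")
    indices = card_indices(found)
    if indices and indices[0] != 0:
        print("Numbering does not start at card0:", indices)
```

On my broken setup this would have reported the numbering starting at 1 instead of 0.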
Huh. My issue seems different, but I’ll still test that flag to see if it changes anything. My problem looks like the device doesn’t return to the host after VM shutdown, possibly because of the reset bug (based on what I see in dmesg), which I hadn’t encountered in about a year of GPU passthrough VM usage.
Ahh, yeah, if it’s specifically when coming back from a VM, that sounds different. Maybe the vfio_pci driver isn’t getting swapped back to the real one? I barely know how it works, and I’m sure you’ve checked everything.
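If it helps, this is roughly how I would check which driver currently owns the card after the VM shuts down. It is only a sketch: it reads the `driver` symlink under /sys/bus/pci/devices, and the PCI addresses at the bottom are placeholders, not your actual ones. Note that sysfs shows the driver name as `vfio-pci` even though the module is vfio_pci.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, pci_addr, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or the device does not exist
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Placeholder addresses: substitute the GPU and its HDMI audio function.
    for addr in ("0000:03:00.0", "0000:03:00.1"):
        print(addr, "->", bound_driver(addr))
```

If the GPU still shows `vfio-pci` once the VM is off, the swap back to the host driver did not happen; if the audio function shows `snd_hda_intel`, that would match the blacklist issue mentioned earlier.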