Which script in particular? Could you link the actual script rather than lengthy forum discussions?

 

What have you tried already, where did you get stuck? Did you get errors while executing the script?


We made a script similar to the one posted here, but edited for the 9070:

 

#!/bin/bash

echo "0000:03:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind
sleep 2
echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind
sleep 2

As long as I run this after VM shutdown and/or before starting the VM again, I can (so far) reliably restart the VM without having to restart the Proxmox host.

How to do this?


1 hour ago, Rad25 said:

How to do this?

In liveattach-gpu-vm.sh, replace YourVMName with the name of your VM in QEMU.

 

In the XML files and the second bash script you need the PCI address of your GPU, which you can get by running "lspci -D" in a terminal. I have an NVIDIA GPU, and its address is listed as follows:

0000:09:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
0000:09:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)

The first line starts with the PCI address of the GPU; the second starts with the address of the GPU's audio function.

The replacement would look like this:

<address domain="0x0000" bus="0x09" slot="0x00" function="0x0"/>  <!-- gpu -->

<address domain="0x0000" bus="0x09" slot="0x00" function="0x1"/>  <!-- audio -->

In the bash script I would replace "0000:03:00.0" with "0000:09:00.0".

(Of course, yours will be different.)
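If you would rather not translate the address by hand, here is a minimal sketch of a helper that does it. The function name pci_to_xml is made up for illustration and is not part of the scripts discussed in this thread; it only assumes the address has the DDDD:BB:SS.F shape that "lspci -D" prints.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): turn an "lspci -D" address
# such as 0000:09:00.0 into a libvirt-style <address/> element.
pci_to_xml() {
    local addr="$1"
    # Expected shape: domain:bus:slot.function, all hexadecimal.
    local re='^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$'
    if [[ ! $addr =~ $re ]]; then
        echo "Not a PCI address: $addr" >&2
        return 1
    fi
    printf '<address domain="0x%s" bus="0x%s" slot="0x%s" function="0x%s"/>\n' \
        "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" \
        "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}"
}

pci_to_xml "0000:09:00.0"   # GPU
pci_to_xml "0000:09:00.1"   # audio
```

Feed it your own addresses and paste the output into the XML files.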


2 hours ago, C2dan88 said:

In liveattach-gpu-vm.sh replace YourVMName with the name of your vm in qemu

How do I create a .sh script? I have never done this.

 

sudo nano name.sh?


1 hour ago, Rad25 said:

sudo nano name.sh?

Pretty much. Any script starting with

#!/bin/bash

is basically a list of console commands that will be executed in order, so

echo "0000:03:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind

unbinds the vfio-pci driver from the device at <address domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>

sleep 2 should be replaced with 

while [[ -h /sys/bus/pci/drivers/vfio-pci/0000\:03\:00.0 ]];do sleep 0.5;done

This is more robust because it actually waits for the driver to be unbound before letting the script continue.

It basically says "while the symlink 0000:03:00.0 exists, wait 0.5 seconds, then check again". Once the driver is unbound, the symlink disappears and the script continues with ...

echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind

which binds (enables) the amdgpu driver on the device at <address domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>

Again, we replace sleep 2 with a more robust alternative:

until [[ -h /sys/bus/pci/drivers/amdgpu/0000\:03\:00.0 ]];do sleep 0.5;done

which waits for the symlink to appear (the driver to be bound).

 

Once you are done editing, you need to make your script executable with sudo chmod u+x name.sh
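One thing to keep in mind: the bind and unbind files under /sys are only writable by root, so the script itself has to run as root (for example with sudo ./name.sh). Putting sudo in front of an echo inside the script does not help, because the "> file" redirection is opened by the calling shell, not by sudo. A minimal sketch of a guard you could put at the top of the script follows; the name require_root and its optional argument are my own invention so the check can be tried without root.

```shell
#!/bin/bash
# Hypothetical guard (not from the thread): refuse to continue unless root.
# The optional argument overrides the detected user id, purely for testing.
require_root() {
    local uid="${1:-$(id -u)}"
    if [[ $uid -ne 0 ]]; then
        echo "Run this script as root, e.g. sudo ./name.sh" >&2
        return 1
    fi
}

# Typical use at the top of the script:
# require_root || exit 1
```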

It's a good idea to check manually that your bind commands work before you script them. In this case that would be with

echo "0000:03:00.0"|sudo tee /sys/bus/pci/drivers/amdgpu/bind
dmesg

and check for errors.
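Besides dmesg, you can ask sysfs directly which driver currently owns the device: each device directory has a "driver" symlink while a driver is bound. Here is a rough sketch; the function name current_driver and the SYSFS_PCI override are assumptions of mine, the latter only there so the function can be tried against a fake directory tree.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): print the driver currently
# bound to a PCI device, or "none" if no driver is bound.
# SYSFS_PCI can be overridden to point at a test directory.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

current_driver() {
    local device="$1"
    local link="${SYSFS_PCI}/devices/${device}/driver"
    if [[ -h $link ]]; then
        # The symlink target ends in the driver name, e.g. .../drivers/amdgpu
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example: current_driver "0000:03:00.0"
```

Running it before and after a manual bind should show whether the driver actually changed.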

If your VM didn't leave your card in an "accessible" state, rebinding can fail. That is why it's a good idea to give both functions (the GPU and its audio device) the same bind/unbind treatment; otherwise you are trying to "split your hardware between native and VM".

 

Cont'd...


The command

echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind

can hang if it fails. If it completes but never manages to bind, the loop

until [[ -h /sys/bus/pci/drivers/amdgpu/0000\:03\:00.0 ]];do sleep 0.5;done

will hang instead, because the symlink never appears.

The "even more robust" solution to this is to background the echo and add a timeout to the wait loop, thus:

echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind &
limit=100   # number of 0.1 s polls, i.e. about 10 seconds in total
until [[ -h /sys/bus/pci/drivers/amdgpu/0000\:03\:00.0 ]]; do
    sleep 0.1
    limit=$(( limit - 1 ))
    if [[ $limit -lt 1 ]]; then
        echo "Bind failed!!"
        exit 1
    fi
done

 

Now, because you are going to be binding/unbinding both the GPU and its associated audio device, it makes sense to turn the bind and unbind operations into functions:

#!/bin/bash
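# Bind a PCI device to a driver, then wait up to ~10 s for the sysfs symlink.
# Returns non-zero on timeout so the caller can react.
bind(){
    local device="$1"
    local driver="$2"
    local limit=100   # number of 0.1 s polls
    echo "Binding ${device} to ${driver}"
    # Backgrounded so a hanging write cannot block the script.
    echo "${device}" > "/sys/bus/pci/drivers/${driver}/bind" 2>/dev/null &
    until [[ -h "/sys/bus/pci/drivers/${driver}/${device}" ]]; do
        sleep 0.1
        limit=$(( limit - 1 ))
        if [[ $limit -lt 1 ]]; then
            echo "Binding ${device} to ${driver} driver failed!!"
            return 1
        fi
    done
}
# Unbind a PCI device from a driver, then wait up to ~10 s for the symlink to go away.
unbind(){
    local device="$1"
    local driver="$2"
    local limit=100   # number of 0.1 s polls
    echo "Unbinding ${device} from ${driver}"
    echo "${device}" > "/sys/bus/pci/drivers/${driver}/unbind" 2>/dev/null &
    while [[ -h "/sys/bus/pci/drivers/${driver}/${device}" ]]; do
        sleep 0.1
        limit=$(( limit - 1 ))
        if [[ $limit -lt 1 ]]; then
            echo "Unbinding ${device} from ${driver} driver failed!!"
            return 1
        fi
    done
}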


GPU="0000:03:00.0"
AUDIO="0000:03:00.1"

## before starting the VM
unbind "$GPU" "amdgpu"
unbind "$AUDIO" "snd_hda_intel"
bind "$GPU" "vfio-pci"
bind "$AUDIO" "vfio-pci"

## after stopping the vm
#unbind "$GPU" "vfio-pci"
#unbind "$AUDIO" "vfio-pci"
#bind "$GPU" "amdgpu"
#bind "$AUDIO" "snd_hda_intel"
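Instead of commenting the two halves in and out, you could pick the half from a command-line argument. The following is only a sketch under my own assumptions: the function name steps_for and the mode names attach/detach are invented, and it merely prints the steps (same devices and drivers as above) so you can check the order safely before wiring it to the real bind and unbind functions.

```shell
#!/bin/bash
# Hypothetical wrapper (not from the thread): list the bind/unbind steps
# for a given mode. It only prints them; nothing is written to /sys.
GPU="0000:03:00.0"
AUDIO="0000:03:00.1"

steps_for() {
    case "$1" in
        attach)   # before starting the VM
            echo "unbind $GPU amdgpu"
            echo "unbind $AUDIO snd_hda_intel"
            echo "bind $GPU vfio-pci"
            echo "bind $AUDIO vfio-pci"
            ;;
        detach)   # after stopping the VM
            echo "unbind $GPU vfio-pci"
            echo "unbind $AUDIO vfio-pci"
            echo "bind $GPU amdgpu"
            echo "bind $AUDIO snd_hda_intel"
            ;;
        *)
            echo "usage: attach|detach" >&2
            return 1
            ;;
    esac
}

steps_for attach
```

Each printed line has the form "action device driver", so once the order looks right it can be read back line by line and passed to the functions of the same name.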

 


/sys/bus/pci/drivers/vfio-pci/unbind: Permission denied
/sys/bus/pci/drivers/amdgpu/bind: Permission denied

 

How do I fix this?

My script.sh:

#!/bin/bash

echo "0000:03:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind
sleep 2
echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind
sleep 2

 
