
AMD GPU driver problem on Ubuntu 18.04

So I have a nice PC with a Ryzen 3600X, an ASRock Radeon 5700 Challenger, 16 GB of Corsair Vengeance RAM and an MSI B450 pro gaming max. Everything works fine in Windows, but now that I'm using Linux (Elementary OS 5.4 based on Ubuntu 18.04) I'm experiencing some issues and I think they have to do with my GPU driver. When playing back video (YouTube, Zoom), the system sometimes freezes and I have to restart my PC. It also happens while gaming.

 

I replaced my ASRock B450 PRO4 with an MSI B450 pro gaming max a couple of weeks ago because I already had RAM problems (it would only boot in slots A1/A2) and the RGB on ASRock motherboards is unusable. This greatly reduced the number of freezes, but they still happen sometimes.

 

Another strange problem is that I get the error "There is a problem with your graphics card. Please ensure that your card meets the minimum system requirements and that you have the latest drivers installed." when running the Epic Games Store in Wine. It was running fine before, but now that I've updated to kernel 5.9.4 I keep getting this error.

 

Those two problems together make me think it's a driver issue. I tried installing the Linux drivers from the AMD website, but then I got a dpkg error and when I rebooted, the desktop environment didn't start (startx didn't work either), so I reinstalled the OS (I had only just installed the OS, so it didn't really matter).
I have installed the drivers from ppa:oibaf/graphics-drivers but it doesn't seem to improve things...

 

Does anyone know what could be the problem here? Thanks in advance!


Are you using the correct Linux driver? There are only two available.

You might want to try upgrading the Ubuntu version to the latest 20.04, or upgrading Elementary OS to a version that is based on Ubuntu 20.04, and then try the other driver to see if it works.

[Attached image]


I tried to install the bottom one. Is it possible to install Ubuntu 20.04 while having Elementary OS?  I kinda like it...


1 minute ago, Kaassouffle said:

I tried to install the bottom one. Is it possible to install Ubuntu 20.04 while having Elementary OS?  I kinda like it...

I meant upgrade, as in upgrading the Ubuntu 18.04 base that Elementary OS uses to 20.04, not installing Ubuntu itself, or finding a version of Elementary OS that is based on Ubuntu 20.04.


Unfortunately, Elementary OS doesn't have a 20.04-based version yet.


the issue is the 5700 series is quite new, and Ubuntu 20.04 being LTS all the driver and kernel stuff it ships with is kinda outdated because it's an LTS release. i recommend you switch to either Ubuntu 20.10 or Manjaro Linux. Manjaro is arch based and has a bit of a learning curve compared to Ubuntu, so if you are familiar with Ubuntu i'd recommend installing 20.10.

 

you can do this from your existing install by going into the "Software & Updates" program and setting "Notify me of a new Ubuntu version" to "For any new version" and then upgrading from 20.04 to 20.10 through the software updater, that way you don't have to reinstall anything.

Edited by Ashley xD
typo


The title might be a bit misleading: I'm actually using Elementary OS, which is based on Ubuntu 18.04, so I can't upgrade to 20.10. I think I'll try Manjaro. I liked the feel of it very much when I tried it some time ago (it was a bit too difficult for me back then). I see that AMD doesn't provide drivers for Arch/Manjaro. Will the performance be fine on Manjaro, or will Ubuntu 20.10 be better?


The AMD drivers are in Ubuntu's kernel.

Do a kernel update and tell me what happens.


On Linux you want the AMDGPU Drivers, which are part of the kernel and mesa stack. You shouldn't need to download anything.

If you're having issues, then the kernel and/or Mesa build that ships with that distro is probably outdated.

 

Using the AMDGPU-PRO drivers, or I guess Radeon, will usually lead to compatibility and performance issues outside of workstation applications.


@mahyar I'm currently running kernel 5.9.4; the mainline GUI doesn't want to update to 5.9.6 for some reason.

@Nayr438 The command `sudo lshw -C display` tells me that I'm already running the amdgpu driver.
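
For anyone checking the same thing, here is a rough sketch of pulling just the driver name out of that output. It assumes lshw prints a `driver=` field on the `configuration:` line of the display section; `lshw_driver` is a made-up helper name, not part of any tool.

```shell
# Hypothetical helper: read `lshw -C display` output on stdin and print the
# value of every driver= field (assumed to sit on the "configuration:" line).
lshw_driver() {
    grep -o 'driver=[^ ]*' | cut -d= -f2
}

# Intended use:
#   sudo lshw -C display | lshw_driver
```

If that prints amdgpu, the open kernel driver is the one bound to the card.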


On 11/7/2020 at 12:39 AM, Ashley xD said:

the issue is the 5700 series is quite new, and Ubuntu 20.04 being LTS all the driver and kernel stuff it ships with is kinda outdated because it's an LTS release. i recommend you switch to either Ubuntu 20.10 or Manjaro Linux. Manjaro is arch based and has a bit of a learning curve compared to Ubuntu, so if you are familiar with Ubuntu i'd recommend installing 20.10.

 

you can do this from your existing install by going into the "Software & Updates" program and setting "Notify me of a new Ubuntu version" to "For any new version" and then upgrading from 20.04 to 20.10 through the software updater, that way you don't have to reinstall anything.

@Ashley xD So I switched to Manjaro but I still have issues... Youtube often crashes on Brave with error code SIGSEGV. I'm running kernel 5.9.3-1


28 minutes ago, Kaassouffle said:

@Ashley xD So I switched to Manjaro but I still have issues... Youtube often crashes on Brave with error code SIGSEGV. I'm running kernel 5.9.3-1

try chromium or firefox, Brave is a terrible browser.


When it crashes, does anything relevant appear in `journalctl --catalog --boot`? In case you haven't used it before, the navigation is the same as with Vim. To get to the bottom of the log, press Shift+G. You can move up and down with both the arrow keys (line by line) and PgUp/Down (multiple lines). To search, type the character “/” (without the quotation marks; I don't know which keyboard layout you're using and it's Shift+7 in the Bosnian layout), type your query, “sigsegv” or “brave” for example, hit Enter, and navigate through the matches with N and Shift+N.

 

Also, if you can, please try looking at the log when the entire computer crashes by running `journalctl --catalog --boot=-1` once the computer restarts.
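
If scrolling through the whole journal is too much, a filter along these lines might narrow it down. This is only a sketch: `gpu_lines` is a hypothetical helper, and the pattern it matches is a guess at what GPU-related messages contain, so adjust it as needed.

```shell
# Hypothetical helper: keep only lines on stdin that mention the GPU stack.
# The pattern is an assumption, not an exhaustive list.
gpu_lines() {
    grep -iE 'amdgpu|drm'
}

# Intended use, kernel messages from the boot that crashed:
#   journalctl --boot=-1 --dmesg --no-pager | gpu_lines
```

An empty result would not prove the GPU is innocent; a hard freeze can happen before anything is written to the log.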

 

Edit

 

Oh, and do stay away from the AMDGPU-Pro drivers. They're only really necessary for OpenCL, and even then you can try ROCm before caving in to the proprietary drivers.


On 11/12/2020 at 1:54 PM, Ashley xD said:

try chromium or firefox, Brave is a terrible browser.

Yeah I switched to firefox now... The page doesn't crash anymore but the OS still does sometimes

 

On 11/12/2020 at 4:20 PM, elsandosgrande said:

When it crashes, does anything relevant appear in `journalctl --catalog --boot`? In case you haven't used it before, the navigation is the same as with Vim. To get to the bottom of the log, press Shift+G. You can move up and down with both the arrow keys (line by line) and PgUp/Down (multiple lines). To search, type the character “/” (without the quotation marks; I don't know which keyboard layout you're using and it's Shift+7 in the Bosnian layout), type your query, “sigsegv” or “brave” for example, hit Enter, and navigate through the matches with N and Shift+N.

 

Also, if you can, please try looking at the log when the entire computer crashes by running `journalctl --catalog --boot=-1` once the computer restarts.

 

Edit

 

Oh, and do stay away from the AMDGPU-Pro drivers. They're only really necessary for OpenCL, and even then you can try ROCm before caving in to the proprietary drivers.


I've run `journalctl --catalog --boot`, but I don't really know what I'm looking for...
I did find some errors highlighted in red:


kernel: sp5100-tco sp5100-tco: Watchdog hardware is disabled
kernel: kvm: disabled by bios (this one occurred quite often)
kernel: sd 10:0:0:0: [sdd] No Caching mode page found

kernel: sd 10:0:0:0: [sdd] Assuming drive cache: write through

lightdm[1122]: gkr-pam: unable to locate daemon control file

colord-sane[1071]: io/hpmud/pp.c 627: unable to read device-id ret=-1

 

--boot=-1 gave the same errors and a weird pulseaudio error that I didn't get with --boot

pulseaudio[3621]: ALSA woke us up to write new data to the device, but there was actually nothing to write.
pulseaudio[3621]: Most likely this is a bug in the ALSA driver 'snd_hda_intel'. Please report this issue to the ALSA developers.
pulseaudio[3621]: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.

 


What's the output of the following?

  • dmesg | grep error
  • dmesg | grep amdgpu
  • glxinfo | grep "version"
  • uname -a
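
One caveat with the first command: `grep error` is case-sensitive, so it would skip kernel lines tagged `*ERROR*` as well as ones that only say "failed". A slightly wider net could look like the sketch below; `kernel_problems` is a hypothetical name and the keyword list is an assumption.

```shell
# Hypothetical helper: case-insensitive search of stdin for a few common
# trouble keywords. The keyword list is an assumption, extend as needed.
kernel_problems() {
    grep -iE 'error|fail|timeout'
}

# Intended use:
#   sudo dmesg | kernel_problems
```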

dmesg | grep error gives nothing

dmesg | grep amdgpu gives

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL core profile version string: 4.6 (Core Profile) Mesa 20.2.1
OpenGL core profile shading language version string: 4.60
OpenGL version string: 4.6 (Compatibility Profile) Mesa 20.2.1
OpenGL shading language version string: 4.60
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 20.2.1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
    GL_EXT_shader_implicit_conversions, GL_EXT_shader_integer_mix,

glxinfo | grep "version" gives

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL core profile version string: 4.6 (Core Profile) Mesa 20.2.1
OpenGL core profile shading language version string: 4.60
OpenGL version string: 4.6 (Compatibility Profile) Mesa 20.2.1
OpenGL shading language version string: 4.60
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 20.2.1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
    GL_EXT_shader_implicit_conversions, GL_EXT_shader_integer_mix,

uname -a gives

Linux [name] 5.9.3-1-MANJARO #1 SMP PREEMPT Sun Nov 1 14:25:36 UTC 2020 x86_64 GNU/Linux


I tried kernel versions 5.4.74-1, 5.8.18-1 and 5.10.rc2.d1101.g3cea11c-1 as well, and 5.9.3-1 seems to be the most stable.


1 hour ago, Kaassouffle said:

dmesg | grep amdgpu gives

Run this one again. It shouldn't be outputting glxinfo. Everything else seems fine however.

 

 

I am also wondering if you have a card that suffers from a similar issue as my MSI 5700XT.

I have an MSI card that runs the memory quite literally at its limits. On Windows it seems stable; on Linux it results in random freezes and restarts. Maybe this is something the Windows driver takes into account that AMDGPU doesn't, no idea.

The fix on my card was to lower the memory clock from 875 MHz to 850 MHz.

Try adding "amdgpu.ppfeaturemask=0xffffffff" to the kernel command line, either from the GRUB menu during boot (temporary) or in "/etc/default/grub" (permanent, after regenerating the GRUB config). Editing "/boot/grub/grub.cfg" directly also works, but that file is overwritten whenever the config is regenerated. You can just add it to the same line as "quiet splash".

Then install corectrl and try dropping the memory frequency and see if the crashes still occur.
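
As a sketch of the permanent route on a GRUB-based install: the parameter would usually go into `GRUB_CMDLINE_LINUX_DEFAULT` in /etc/default/grub, and the config then needs regenerating. The helper below is hypothetical and only checks, after a reboot, whether the running kernel actually received the parameter.

```shell
# /etc/default/grub would then contain a line like this (assuming the
# existing options are just "quiet splash"):
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.ppfeaturemask=0xffffffff"
# followed by regenerating the config, e.g. `sudo update-grub`.

# Hypothetical helper: succeed if the given kernel command line contains
# the amdgpu.ppfeaturemask parameter.
has_ppfeaturemask() {
    case " $1 " in
        *" amdgpu.ppfeaturemask="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Intended use after rebooting:
#   has_ppfeaturemask "$(cat /proc/cmdline)" && echo "ppfeaturemask is set"
```

If the check fails, the GRUB config was probably not regenerated, or a different boot entry was used.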


1 hour ago, Kaassouffle said:

Yeah I switched to firefox now... The page doesn't crash anymore but the OS still does sometimes

 

I've run `journalctl --catalog --boot`, but I don't really know what I'm looking for...
I did find some errors highlighted in red:


kernel: sp5100-tco sp5100-tco: Watchdog hardware is disabled
kernel: kvm: disabled by bios (this one occurred quite often)
kernel: sd 10:0:0:0: [sdd] No Caching mode page found

kernel: sd 10:0:0:0: [sdd] Assuming drive cache: write through

lightdm[1122]: gkr-pam: unable to locate daemon control file

colord-sane[1071]: io/hpmud/pp.c 627: unable to read device-id ret=-1

 

--boot=-1 gave the same errors and a weird pulseaudio error that I didn't get with --boot

pulseaudio[3621]: ALSA woke us up to write new data to the device, but there was actually nothing to write.
pulseaudio[3621]: Most likely this is a bug in the ALSA driver 'snd_hda_intel'. Please report this issue to the ALSA developers.
pulseaudio[3621]: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.

 

KVM being disabled by BIOS/UEFI probably just means that the processor's virtualization extensions are disabled (not relevant). The sdd and lightdm messages are of no worry either. The colord-sane message is not relevant either. It's the same with the watchdog message as far as I can tell. The PulseAudio messages do not seem suspect either, especially since the bug reports that I could find do not seem related to the issue at hand.

 

Did you run `journalctl --catalog --boot=-1` after your computer crashed and restarted? Usually, with GPU crashes, there is a crash dump somewhere and it's usually pointed to in journalctl, at least with Intel iGPUs. I haven't had GPU crashes with my previous AMD laptop, so I can't say for sure.

Edited by elsandosgrande
“[…] reports I could find […]” → “[…] reports that I could find […]”

Oops, I accidentally sent the same thing twice... `dmesg | grep amdgpu` gives

[    4.393282] [drm] amdgpu kernel modesetting enabled.
[    4.393364] amdgpu: Ignoring ACPI CRAT on non-APU system
[    4.393388] amdgpu: Topology: Add CPU node
[    4.393534] fb0: switching to amdgpudrmfb from EFI VGA
[    4.393597] amdgpu 0000:28:00.0: vgaarb: deactivate vga console
[    4.393646] amdgpu 0000:28:00.0: enabling device (0006 -> 0007)
[    4.393742] amdgpu 0000:28:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[    4.395183] amdgpu: ATOM BIOS: 113-EXT900122-L04
[    4.395241] amdgpu 0000:28:00.0: amdgpu: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
[    4.395243] amdgpu 0000:28:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[    4.395373] [drm] amdgpu: 8176M of VRAM memory ready
[    4.395377] [drm] amdgpu: 8176M of GTT memory ready.
[    5.142686] amdgpu 0000:28:00.0: amdgpu: RAS: optional ras ta ucode is not available
[    5.162764] amdgpu 0000:28:00.0: amdgpu: use vbios provided pptable
[    5.162767] amdgpu 0000:28:00.0: amdgpu: smc_dpm_info table revision(format.content): 4.5
[    5.213760] amdgpu 0000:28:00.0: amdgpu: SMU is initialized successfully!
[    5.246728] snd_hda_intel 0000:28:00.1: bound 0000:28:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[    5.589129] amdgpu: Topology: Add dGPU node [0x731f:0x1002]
[    5.589135] amdgpu 0000:28:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 10, active_cu_number 36
[    5.592560] fbcon: amdgpudrmfb (fb0) is primary device
[    5.592563] amdgpu 0000:28:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[    5.612772] amdgpu 0000:28:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[    5.612774] amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    5.612776] amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    5.612777] amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    5.612777] amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    5.612779] amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    5.612780] amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    5.612781] amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    5.612782] amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    5.612783] amdgpu 0000:28:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    5.612785] amdgpu 0000:28:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[    5.612786] amdgpu 0000:28:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[    5.612787] amdgpu 0000:28:00.0: amdgpu: ring vcn_dec uses VM inv eng 0 on hub 1
[    5.612788] amdgpu 0000:28:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 1 on hub 1
[    5.612789] amdgpu 0000:28:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 4 on hub 1
[    5.612790] amdgpu 0000:28:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[    5.613436] [drm] Initialized amdgpu 3.39.0 20150101 for 0000:28:00.0 on minor 0

I'll try setting the memory clock to 850

 

@elsandosgrande I ran the command right after the crash and restart.


1 hour ago, elsandosgrande said:

Did you run `journalctl --catalog --boot=-1` after your computer crashed and restarted? Usually, with GPU crashes, there is a crash dump somewhere and it's usually pointed to in journalctl, at least with Intel iGPUs. I haven't had GPU crashes with my previous AMD laptop, so I can't say for sure.

Not necessarily. This is only true if useful information is generated before the kernel panics or deadlocks, unless you have kdump set up, something that I think only Fedora does, and I am not even positive that they do.


6 minutes ago, Kaassouffle said:

I'll try setting the memory clock to 850

Everything looks fine. Just try setting your memory clock lower than its default; a -25 offset would probably suffice. Corectrl will show you the default values.


15 minutes ago, Nayr438 said:

Not necessarily. This is only true if useful information is generated before the kernel panics or deadlocks, unless you have kdump set up, something that I think only Fedora does, and I am not even positive that they do.

It had given me relevant information when Ubuntu crashed a few times in the past, so… ¯\_(ツ)_/¯

 

Edit

 

@Kaassouffle Correct me if I'm wrong, but that looks like the beginning of the log. If Nayr's right though, the log won't be necessary anyway.

Edited by elsandosgrande

I clocked my GPU memory down to 850 (instead of 875) and it's been great the last two days. It didn't crash a single time!

Thanks for the help, you guys are awesome!
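
For anyone landing here with the same freezes: to double-check that the lower memory clock is really in effect, the amdgpu driver lists its memory clock levels in sysfs, with the active one marked by an asterisk. A minimal sketch, assuming the card shows up as card0 (it may be card1 on some systems) and using a hypothetical helper name:

```shell
# Hypothetical helper: read pp_dpm_mclk-style lines on stdin
# (e.g. "3: 850Mhz *") and print the clock of the level marked with '*'.
active_mclk() {
    awk '/\*/ { print $2 }'
}

# Intended use (the card0 path is an assumption):
#   active_mclk < /sys/class/drm/card0/device/pp_dpm_mclk
```

Under load, the reported value should not go above the limit set in corectrl.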

