Everything posted by BlueJedi

  1. Would be nice. Not sure why LTT hasn't been doing this. I've never seen it for any of their streams. I saw one for GN's stream after this one.
  2. If Microcenter can't help you, you can try Gigabyte and go the RMA route. It could be that particular card has issues. Since I have no idea what you've already tried, I'll add that it's worth using DDU to remove the old drivers and then installing the latest, especially if you've updated Windows 10 to version 2004, which has a lot of changes focused on GPUs. Also make sure your RAM is stable as configured. A bad XMP profile can cause issues for Vega/Navi cards.
  3. Sounds like file permission or anti-virus issues. My guess is it can't write to the log, the config, or other files. I'd add exceptions for FAHClient in whatever anti-virus is on the system, if any, including Windows Defender (there's a quick example after this list). Track down the locations where it sticks the logs and config files and make sure the file permissions are right.
  4. What clock speed is it hitting before you do that? PPD estimates are based on TPF, which can vary a lot throughout a WU. How much does TPF go down? Does it stay there? How much is PPD being pushed up on average? (See the rough TPF-to-PPD math after this list.)
  5. As long as your card isn't stuck at idle clocks (<1000 MHz) or thermally constrained, I wouldn't worry. It still appears to be 100% utilized at those lower clocks, so it's probably working as intended. Compute workloads are handled differently by the driver; most GPUs don't try to hit max clocks on certain workloads. Depending on the kind of work, and how it's set up, there might not be much benefit to clocking faster for a particular WU. Some computational work is also more power intensive, so the GPU clocks back a bit to keep power and thermals under control. You can certainly trick it into higher clocks like you did there with Furmark, but the gains will be minimal, and you risk causing compute errors that may cause the WU to fail.
  6. It may have been defective out of the box then. If the card is unstable at stock settings I'd contact Gigabyte support. It might not be the source of your problem here, but sounds like a strong possibility. If nothing else it's best to rule it out.
  7. Go here: https://www.guru3d.com/files-details/display-driver-uninstaller-download.html Read everything before you download and run it; there are instructions there. The general idea is you'll want to boot into safe mode, run this, reboot, then reinstall with the latest drivers. It's a third-party tool to remove everything the AMD/Nvidia drivers leave behind.
  8. The GPU might not be dead, but it could be dying. Components coming back to life after being completely powered down for a while is often a sign of failing hardware. My only other guess is that if Secure Boot is enabled it might be causing issues with the card POSTing correctly.
  9. Did you use DDU to uninstall the old drivers? The normal uninstall leaves a lot behind. Some of it can interfere with performance, especially when you jump brands or a few generations on the card.
  10. Not too sure... but EC2 Spot instances are an Amazon cloud service, so I'm not sure that's an individual. It might be Amazon directly?
  11. I run all AMD GPUs (been a budget buyer for years) and I see that exact thing across all my cards. A few other people here have seen it too, so you're not alone. It's not isolated to specific projects, at least in my experience. One or more of the early OpenCL calls are sensitive to instability on the AMD driver; once it gets going it's fine. So card stability definitely plays a big part in how many failures you get.

      My factory-OC'd 280X, which can run about as hot as your 390, has the most failures out of all my cards. It's on a Linux box and too old to downclock via the power tables, so I'll have to throw a custom BIOS on it when I have time.

      Ultimately, though, the root issue is the AMD OpenCL driver itself; instability just makes it worse. I know from my own cards that the problem persists on Tahiti, Polaris, and Vega 10 and 20, on both Windows and Linux. Even my coolest cards running stock have WUs fail to initialize, just less often. I've been talking to AMD about it and it's a known issue, and supposedly high priority, but I haven't been given any firm answers on when it will be fixed. That isn't to say FAH or the OpenMM people couldn't find a workaround for Core22; whether it's worth their time, when the Nvidia driver is fine, I couldn't say.

      I'm with you though, I hope someone, anyone, fixes it. I have 105 failed WUs to 161 completed across all my cards since I started logging for the event, all of them failing at initialization. I can't imagine it's helping me get WUs right now. My PPD is probably a third of what it could be.
  12. Unfortunately I'm not aware of a way to limit it in that sense. The WUs likely aren't all that big, just computationally intense, so max packet size only helps to a point. Which WUs you're getting probably comes down to which instruction sets your CPU supports. The CPU your client reports in its configuration is used by the server to decide which WUs to assign, I believe. I imagine the solution is for the FAH people to tighten the requirements for those WUs; they might be handing them out to any CPU with AVX2 support, when practically that list needs to be trimmed a bit. This may also be a consequence of them trying to get as many WUs out there as they can.
  13. Check the logs. On Windows they're usually in C:\Users\<YourUser>\AppData\Roaming\FAHClient\logs. You can open File Explorer and type %appdata% in the address bar to go right to the AppData\Roaming folder for your user. Take a look at the log files and they should give some idea what FAHClient.exe was failing on, or at least what it was trying to do last. Post any relevant snippets here (it should be the last stuff in the log) and we can sort through it with you.
  14. True. That's why I don't necessarily think it was the solution for someone running multiple power hungry systems. As a diagnostic step, a small decent UPS rules out power entirely and can be fairly useful for overclocking and testing. It might even shed some light on the quality of the wall power under load.
  15. That's fair. I'm not suggesting a permanent solution for all your systems. If the issue is power dips, then a line conditioner will only get you so far. A good line-interactive UPS is a line conditioner too and makes up for any shortfalls from the wall. They usually give somewhat helpful real-time power stats as well, which can be handy for seeing what the wall power is doing while you're overclocking.
  16. For crypto, the goal is to get power usage as low as possible to make the returns profitable, so they leave a lot of performance on the table to keep the card profitable in the long run. For something like FAH, it's better to just configure the GPU at the driver level. That's easy enough to do on Windows in the Radeon software, and it can also be done on Linux with a bit more work (there's a sketch after this list). That way it's not permanent, and you can tune it to find a balance between performance and power that works for you and your setup rather than for crypto profit margins.
  17. I've run into a few cases where this wasn't enough to give fahclient permission to run OpenCL calls. I'm not entirely sure what the issue is. The only solution I've found is to run FAHClient as root, which obviously isn't ideal. (The usual group-membership fix is sketched after this list.)
  18. FAH is pretty picky about stability, more so than other similar projects even. It doesn't help that the AMD OpenCL driver has a few issues with Core22 and OpenMM; OCs amplify those issues. Also, I haven't found the latest AMD drivers stable for most compute. Black screens and hangs that seem mostly resolved for gaming still show up under compute workloads.
  19. Great guide. I just recently started monitoring my clients this way and diving into HFM. Thank you for putting it all in one place!
  20. It wasn't clear from your post: FAHClient is starting up successfully on the server, yeah? I know FAHClient doesn't play nice with Ubuntu 19.10 or later without playing dependency roulette.
  21. How do you find the 20.x AMD drivers on Windows for compute? They made some change after 19.11.3 where not all 4 compute groups are reported by the driver; Compute 2 is missing from Task Manager and HWiNFO. That's not the cause, but any later driver with that bug has been unstable so far, at least for Vega cards. Sounds like Navi fares better?
  22. I've been tempted. I already spend a lot of time working through weird issues trying to get AMD cards going on Linux with the amdgpu-pro drivers.
  23. I see this too on all of my AMD cards across multiple generations: Tahiti, Polaris, and Vega. It's the AMD OpenCL driver, which is separate from your display driver and doesn't change often. There's not much for it until AMD updates the OpenCL driver, or maybe FAH finds a workaround for Core22. It isn't isolated to a single project # either; I have WUs that go through fine and ones that hit this for the same projects.
  24. FAHClient is failing to download the core files it needs to fold a work unit. Probably a firewall issue. Make sure there are exceptions set up for FAHClient in Windows Defender and any third-party anti-virus software you use (example rule after this list).
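
A quick addendum on #3: a minimal sketch of the Windows Defender exceptions I mean, run from an elevated PowerShell prompt. The data-folder path is the usual default; adjust it if yours differs:

    # exclude the FAHClient process from real-time scanning
    Add-MpPreference -ExclusionProcess "FAHClient.exe"
    # exclude the folder where it writes logs and config (default location assumed)
    Add-MpPreference -ExclusionPath "$env:APPDATA\FAHClient"

Third-party anti-virus will have its own place to add the same exceptions.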
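
On #4, here's a rough worked example of how TPF maps to PPD, using made-up numbers and ignoring the quick-return bonus:

    100-frame WU at TPF 2:00  ->  100 x 120 s = 12,000 s per WU  ->  86,400 / 12,000 = 7.2 WU/day
    at a hypothetical 10,000 points per WU, that's ~72,000 PPD
    drop TPF to 1:48 (10% faster)  ->  8.0 WU/day  ->  ~80,000 PPD before bonus

As I understand the bonus formula, points also scale with how quickly the WU is returned, so real PPD moves faster than linearly with TPF. That's why a TPF dip that doesn't stick around doesn't tell you much about average PPD.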
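
On #16, here's roughly what the Linux side looks like with the open amdgpu driver, via sysfs. Treat this as a sketch: the card index and hwmon number vary per system, and the value is in microwatts:

    # find the hwmon node for your card (card0/hwmon0 here are assumptions)
    ls /sys/class/drm/card0/device/hwmon/
    # read the current power cap, reported in microwatts
    cat /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
    # set a 120 W cap; this resets on reboot, so it's safe to experiment
    echo 120000000 | sudo tee /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap

Since it resets on reboot, you'd want a startup script once you find a cap you like.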
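
On #17, the fix that usually gets suggested (and that wasn't always enough in my cases) is putting the fahclient user in the groups that own the GPU device nodes. Group names vary by distro, so check ls -l /dev/dri first; something like:

    # give the service user access to the render nodes (/dev/dri/renderD*)
    sudo usermod -aG video,render fahclient
    # restart the client so the new group membership applies (service name may differ)
    sudo systemctl restart FAHClient

When even that fails, running as root is the only workaround I've found, as noted above.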
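
On #24, if it does turn out to be Windows Firewall, an outbound allow rule from an elevated prompt looks something like this. The install path is the typical default and may differ on your machine:

    netsh advfirewall firewall add rule name="FAHClient" dir=out action=allow program="C:\Program Files (x86)\FAHClient\FAHClient.exe" enable=yes

Keep in mind Windows Firewall allows outbound traffic by default, so unless you've locked that down, third-party anti-virus is the more likely culprit; the exception idea is the same there.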