Everything posted by Camofelix

  1. I believe I had the E-cores disabled and had changed to a dynamic clock of 4.8 GHz with the ring bus at 4500 MHz. Power limits were set to "water cooled" mode. I've used XMP in the past but did not have it enabled.
  2. @Guest 5150 Just in case, I tried both sticks separately in each of the four slots; results were as above. Praying it isn't a dead board out of nowhere.
  3. Hi ladies and gents, looking for some advice after my system died on me yesterday.

Issue: When booting, the EZ Debug LED for the CPU flashes red for a moment; the case fans and the PSU fan spin up for a split second before shutting down, briefly start a second time, then shut down completely. I've ruled out the PSU, CMOS/BIOS revision/configuration, pulling all devices, and more (see steps attempted below). The system had been fine since November 2021.

Relevant specs:
- i7-12700K
- MSI PRO Z690-A DDR4
- 2x16 GB G.Skill Ripjaws 3200 16-18-18-38
Beyond that there are multiple disk drives, an RX 5600 XT, etc., but those have all been removed to troubleshoot this issue.

Steps attempted:
- Tried a different, known-good PSU
- Cleared CMOS, first with the jumper method, then by pulling the CMOS battery
- Used the motherboard's MSI BIOS flash-from-USB function with BIOS revisions 10, 11, 12, and 13
- Disconnected all external PCIe cards, peripherals, etc., leaving only CPU_power1, the case fans, and the 24-pin connected, and removed components such as the RAM

Any suggestions?
  4. After further review, and going down the rabbit hole, I ended up going with the Fractal Meshify 2, and it's been quite good so far. Thanks again for having taken the time, gentlemen.
  5. Yeah, time permitting I'm hoping to look into it after the kernel 5.17 merge window. It's a *somewhat* niche case, but malloc mixed with bit-shifting for exponential trees isn't completely uncommon in HPC, so this could have problems sitting there sucking up cycles in supercomputers as I type this. Thankfully those environments tend to use the Cray, Intel, or custom compilers, which are immune to this.
  6. Not quite sure what you mean. Went further down the rabbit hole, and it's a bug in how glibc (the GNU C library) and GCC handle malloc. Replacing the memory allocation routines with TCMalloc, jemalloc, or Hoard all yielded *massive* uplifts in performance, leading to GCC 12 surpassing ICC and Clang 11 (when those two are using the glibc malloc). I haven't had the time to integrate TCMalloc etc. with oneAPI yet, but hope to do so soon. A rough sketch of how the allocator swap is done is below.
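For anyone wanting to try the allocator swap themselves, here's a minimal sketch. The file name, loop bounds, and library path are made up for illustration; the two substitution methods in the comments (linking against the allocator or preloading it) are the standard ways of replacing glibc malloc, not anything specific to my test setup.

```c
/* alloc_demo.c - hypothetical allocation-heavy loop, for illustration only.
 *
 * Two common ways to swap in TCMalloc without touching the source:
 *   1) Link it in:   gcc -O3 alloc_demo.c -ltcmalloc -o alloc_demo
 *   2) Preload it:   LD_PRELOAD=/path/to/libtcmalloc.so ./alloc_demo
 * jemalloc works the same way (-ljemalloc or libjemalloc.so).
 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const size_t iterations = 1u << 20;      /* arbitrary workload size */
    size_t checksum = 0;

    for (size_t i = 0; i < iterations; i++) {
        /* Bit-shifted, exponentially varying allocation sizes, loosely in
         * the spirit of the binary-tree workload discussed above. */
        size_t sz = (size_t)1 << (i % 20);
        unsigned char *p = malloc(sz);
        if (!p)
            return 1;
        p[0] = (unsigned char)i;             /* touch the allocation */
        checksum += p[0];
        free(p);
    }

    printf("checksum: %zu\n", checksum);
    return 0;
}
```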
  7. Yes, but not in a while; think original DDR/DDR2 era and prior. There were also cards that would take memory on board as a high-speed cache. You'd load your files into it on boot like a ramdisk, but they used the PCI (not PCIe) bus instead of actual system RAM. It was a way of doing SSDs before SSDs really existed in the consumer space. Conceptually similar to a very dumb, very very very expensive Optane/SSD cache for an HDD.
  8. Addressing the size: there's no reason to minimize space on this sort of prototype product. From the PCB, it appears to be revision R1.00T. More relevant is that the larger size makes it easier to attach oscilloscope probes to the pads exposed on the PCB near the ROG logo, making it much easier to debug any issues. As for overbuilding motherboards: as you extend traces and then jump from one discrete material to another (from the main board to the DIMM's pins, for example), you get signal bounce-back, creating noise on the lines amongst other issues. LTT's cable testing video illustrates a simplified version of this problem. If the initial signal from an overbuilt board is cleaner than that from a lower-end board, the odds of success with the higher-end board, while not guaranteed, are higher. It's the same idea as when an OC motherboard has only one memory slot per channel, to avoid bounce-back* (the textbook formula below captures the idea). *Technically, reflection issues with 2 DIMMs per channel depend on T-topology vs. daisy-chain topology, but that's beyond the scope of this post.
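For the curious, the "bounce-back" above is just transmission-line reflection; this is textbook background rather than anything from the ASUS material:

```latex
% Reflection coefficient at an impedance discontinuity,
% where Z_0 is the trace's characteristic impedance and
% Z_L is the impedance seen at the junction (DIMM pin, connector, etc.):
\Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}
% A matched junction (Z_L = Z_0) gives \Gamma = 0, i.e. no reflection;
% any mismatch sends a fraction \Gamma of the signal back down the trace as noise.
```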
  9. Summary
Found floating online in a few small circles of beta/early testers are what look like prototype/sample ASUS ROG-branded DDR4-to-DDR5 adapter risers with integrated power and logic circuitry. These custom adapters with integrated components are needed because of the architectural differences between the DIMMs themselves. Thankfully, Alder Lake sports a memory controller capable of both DDR4 and DDR5, and if validated on the ROG boards, this could lead to a transitional adapter for early adopters who are waiting for DDR5 to mature and already own top-of-the-line DDR4.

Quotes

My thoughts
It's an interesting case we find ourselves in. This sort of device would only be possible if ASUS had either foreseen this issue, overbuilt their trace signalling beyond even normal spec, or both. What will be interesting is how, with this adapter, the highest-end ASUS boards, in combination with the highest-end Alder Lake processors, will be able to OC the current top-of-the-line DDR4 DIMMs for potential OC world records.

EDIT: For the sake of clarification, I'd like to highlight that the board in the video is a prototype that is intentionally oversized for easier debugging with tools such as an oscilloscope. If this product were to come to market, I would be very surprised to see it be even 1/3 as tall as the prototype shown in the video.

Sources
  10. Turns out it isn't Alder Lake at all; it's pervasive as far back as Nehalem on all GCC versions. There didn't seem to be a lot of interest on the LTT forums, so I stopped updating this thread, but the main L1T thread has much more info. I've been tracking this on the Level1Techs forums: https://forum.level1techs.com/t/wip-testing-update-its-not-just-alder-lake-it-goes-back-to-nehalem-gcc-50-performance-regressions-vs-clang-and-intel-compilers-in-specific-workloads-across-all-opt-settings/179712/10 I've dug through a lot of the assembly, but haven't gone *all the way down* the rabbit hole, as it were (if you count 100+ different runs as not going all the way down, I guess). It seems GCC is trying to pre-cache instructions a lot, almost N64 instruction-cache style, using wayyyyyyyy more registers at times and wasting cycles. If you want to eyeball the codegen yourself, there's a quick sketch of how to dump the assembly below.
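This isn't the exact procedure from the L1T thread, just the generic way to dump and compare each compiler's output; the file name and the toy function are made-up stand-ins for the benchmark's hot path.

```c
/* codegen_peek.c - toy stand-in for a hot routine, purely for comparing codegen.
 *
 *   gcc   -O3 -march=native -S codegen_peek.c -o gcc.s
 *   clang -O3 -march=native -S codegen_peek.c -o clang.s
 *   diff -u gcc.s clang.s    (or view them side by side and compare register use)
 */

/* Recursive, bit-shift-heavy toy function; enough for the compilers to make
 * interesting inlining and register allocation decisions. Not the benchmark. */
unsigned long mix(unsigned long x, unsigned depth)
{
    if (depth == 0)
        return x;
    unsigned long left  = mix(x ^ (x << 7), depth - 1);
    unsigned long right = mix(x ^ (x >> 9), depth - 1);
    return left * 0x9E3779B97F4A7C15UL + right;
}
```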
  11. Gotcha! My current thinking is the 5kD silent, because of the point mentioned above about dampening spindle noise; mount the AIO to the roof of the case, grab a pair of the brackets you linked above and mount them in the open space on the floor of the case near the PSU cover, then route cables from there.
  12. Interesting. The abundance of 3.5" drive cages has been the one thing most OEMs have kept; interesting that in the consumer space they've died off. That also explains why I see almost no 5.25" bays on most recommended case lists. Thankfully the PSU has always been oversized and run behind a line-interactive UPS, so that gives some peace of mind.
  13. TL;DR: Programs compiled with GCC (versions 7-12) are taking up to 50% longer to complete than those compiled with the open-source, LLVM-based Clang (11-13), Intel's LLVM-based ICX, or Intel's closed-source ICC.

Hi Ladies and Gents, I've been working on profiling the ins and outs of how Alder Lake handles various workloads in various environments, as a way of previewing how Sapphire Rapids, which uses the same Golden Cove core, will perform in HPC tasks. To that end I've already published a few hundred results on Twitter covering different scenarios with different kernels, compilers, memory sub-timings, etc.; those can be found here: External Link

Of interest today is this test of binary trees, run with a tree size of 26 (times in seconds):

GCC 7: 379.943716
GCC 8: 395.665537
GCC 9: 373.488119
GCC 10: 392.596422
GCC 11: 382.825910
GCC 12: 390.466340
Clang 11: 256.381165
Clang 12: 290.616438
Clang 13: 284.877824
Intel ICC: 249.630150
Intel ICX: 250.511041

The above is the output after running the test 20 times; run-to-run variance was within +/- 0.2%.

The Git repo with the test can be found here: https://github.com/FCLC/Choosing-a-compiler-performance-testing-GCC_ICC_ICPX_NVCC_CLANG_HIP/tree/main/Binary_tree

Would love to see results from anyone else, and their thoughts. For anyone curious what this style of test looks like, there's a rough sketch below; the actual code is in the repo above.
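The sketch below is NOT the code from the repo, just a minimal binary-tree allocation benchmark in the same spirit, so readers can see why the workload hammers malloc. The node layout, timing, and depth handling are simplified assumptions.

```c
/* tree_sketch.c - minimal binary-tree allocation benchmark, illustrative only.
 * Build:  gcc -O3 tree_sketch.c -o tree_sketch     (or clang / icx)
 * Run:    ./tree_sketch 20        (depth 26 needs several GB of RAM)
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct node {
    struct node *left, *right;
} node;

/* Recursively allocate a complete tree of the given depth. */
static node *build(int depth)
{
    node *n = malloc(sizeof *n);
    if (!n) { perror("malloc"); exit(1); }
    if (depth > 0) {
        n->left  = build(depth - 1);
        n->right = build(depth - 1);
    } else {
        n->left = n->right = NULL;
    }
    return n;
}

/* Walk the tree so the allocations can't be optimized away. */
static long count(const node *n)
{
    return n ? 1 + count(n->left) + count(n->right) : 0;
}

static void destroy(node *n)
{
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    free(n);
}

int main(int argc, char **argv)
{
    int depth = argc > 1 ? atoi(argv[1]) : 20;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    node *root = build(depth);
    long nodes = count(root);
    destroy(root);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("nodes: %ld, time taken is %f\n", nodes, secs);
    return 0;
}
```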
  14. Hi Ladies and Gents, I've been using workstations for a while now and haven't been in the full DIY space since ~2011. I'm looking for a case recommendation that fits the following:
- Full-size ATX
- Fits 6x 3.5" drives (or accepts additional internal 3.5" HDD cages and/or a 5.25" hot-swap drive bay to make up the difference between the built-in drive bays and the needed amount)
- Can fit/use a 240mm AIO
- Ideally sub 150 CAD (~120 USD)
- Silent if at all possible

Don't need:
- RGB
- Tempered glass
- Etc.

The Corsair 4000D/5000D looks like an option, but drive support seems questionable at best. Corsair seems to sell internal drive cages, so these may be a possibility: link to Corsair website add-in HDD cage accessories

Would *love* any and all recommendations. Thanks, FCLC

P.S. Details about the machine, not really needed for the above: the reason for the drives is that this machine, my dev/CFD workstation, has 2 ZFS RAIDZ1 arrays for bulk model storage that I need local access to. Other components are an Alder Lake 12700K (in AVX-512 mode), an MSI PRO Z690 DDR4, and the current VGA card is a 5600 XT. The PSU is a fully modular Seasonic 750W Gold from a few years back, ~2011-2013. Outside of the above are a SAS card, a USB add-in card, Optane drives, and NAND-based NVMe drives.
  15. I realize this is now an older thread (just short of 6 months), but for the sake of searchability: you're currently best off maximizing use of external accelerated solvers such as the PETSc framework and its solvers. It supports a much wider set of solvers, and those solvers are available for CUDA, HIP, SYCL, etc. They also tend to push updated code for performance uplifts multiple times per quarter. Even without a GPU, I've seen net uplifts of over 2x in total compute time for the same solver types (and matching results) just by using the PETSc AVX2, AVX-512, and Intel MKL plugins. A tiny sketch of what driving a PETSc solver looks like is below.
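For anyone who hasn't touched PETSc before, here's a rough sketch of what a linear solve looks like through its KSP interface. The matrix is a toy 1D Laplacian and the build command is just one possibility; this is standard-API boilerplate, not code tied to the AVX or MKL plugins mentioned above.

```c
/* petsc_sketch.c - minimal KSP linear solve on a toy 1D Laplacian.
 * Build (one possibility, depends on your PETSc install):
 *   mpicc petsc_sketch.c -o petsc_sketch $(pkg-config --cflags --libs PETSc)
 * Runtime options pick the solver/backend, e.g.:
 *   ./petsc_sketch -ksp_type cg -pc_type jacobi
 */
#include <petscksp.h>

int main(int argc, char **argv)
{
    PetscInitialize(&argc, &argv, NULL, NULL);

    const PetscInt n = 100;
    Mat A; Vec b, x; KSP ksp;
    PetscInt start, end;

    /* Assemble a tridiagonal (1D Laplacian) matrix over the locally owned rows. */
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetFromOptions(A);
    MatSetUp(A);
    MatGetOwnershipRange(A, &start, &end);
    for (PetscInt i = start; i < end; i++) {
        if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
        if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
        MatSetValue(A, i, i, 2.0, INSERT_VALUES);
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    /* Right-hand side of all ones, plus the solution vector. */
    MatCreateVecs(A, &x, &b);
    VecSet(b, 1.0);

    /* Solver type, preconditioner, GPU backend, etc. are chosen at runtime
     * from the options database rather than hard-coded here. */
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

    KSPDestroy(&ksp); MatDestroy(&A); VecDestroy(&b); VecDestroy(&x);
    PetscFinalize();
    return 0;
}
```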
  16. Older thread, so apologies for the bump, everyone. Wanted to let people know I've since gotten my hands on the 12700K and begun testing AVX-512 performance in different HPC applications. Some quick data can be found here: https://openbenchmarking.org/result/2112040-TJ-2111077TJ72&hgv=i7-12700K+P-Cores+%2B+AVX-512+DDR4&ppt=D

As of now I'm working on testing how different cache sizes scale per core, as a way of previewing the Golden Cove cores that will be in Sapphire Rapids (same basic topology, and by disabling a given number of cores we can approximate a given amount of shared L3 per core). The only thing this doesn't allow testing for is the new AMX instructions directly, but they seem to have an AVX fallback possibility (at a reciprocal performance hit). Feel free to search for #avx512 on Twitter and you should be able to find any experiments myself or other collaborators work on!

An interesting part is that the i7 in AVX-512 mode can obliterate the i9 in all-core mode for AVX-512 workloads like CFD and other engineering workloads. If you want to check whether your own chip and BIOS actually expose AVX-512, there's a quick snippet below.
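Whether AVX-512 is exposed on these chips depends on the BIOS/microcode, so here's a quick, generic userspace check using the GCC/Clang CPU-feature builtin. It only tests the AVX-512 Foundation subset and isn't tied to any particular board or BIOS.

```c
/* avx512_check.c - report whether the CPU advertises AVX-512F.
 * Build:  gcc -O2 avx512_check.c -o avx512_check     (GCC or Clang)
 */
#include <stdio.h>

int main(void)
{
    /* __builtin_cpu_supports checks the CPUID feature flags at runtime. */
    if (__builtin_cpu_supports("avx512f"))
        printf("AVX-512F is advertised by this CPU\n");
    else
        printf("AVX-512F is not advertised (fused off, disabled in BIOS, or unsupported)\n");
    return 0;
}
```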
  17. First patch from Intel acknowledging AVX-512 on Alder Lake, but marking it as unsupported. Phoronix article here: https://phoronix.com/scan.php?page=news_item&px=Intel-Alder-Lake-Tuning-GCC Actual code for the compiler here: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583958.html
  18. Same on my end. The URL is Google user content, so it's probably a permissions issue.
  19. It's now the first *real* post-launch window for the main NA offices to start making decisions on what's next for AVX-512. Thursday was launch and Friday was PR day, so now is when the real discussions will start happening (beyond the impromptu ones over the weekend, of course). It isn't a very big deal for the vast majority of users, but it's very strange for this to happen at Intel; it seems like a left-hand-not-talking-to-the-right-hand sort of situation. New early coverage of the performance is starting to come in from Phoronix: https://www.phoronix.com/scan.php?page=article&item=alder-lake-avx512 It seems Dr. Kinghorn of Puget has also started his AVX testing on Linux with the MKL, but only up to AVX2 so far. Not sure if he knows about the 512 enabling yet; then again, since it's not POR, Puget may not want to get involved with it.
  20. Follow-up on AVX-512 support: here's a post from Der8auer showing how to turn it on for ASUS boards. Please note: the part about AnandTech calling it a leak and being wrong, etc., is flat-out misleading and a mischaracterization of the facts. See this Twitter thread for context:
  21. Hoping to see AMD finally adopt it properly, especially since the spec was published as far back as 2013. The funny part to me is that Centaur (the part of VIA that actually owns the x86 license [and seems to be being spun off?]) already has AVX-512 in market. I don't expect Zen 4's AVX performance to be all that exceptional, unfortunately; compared to AVX1 and AVX2 it will be amazing, but hopefully they give us more than just the foundation set. Good point on the AVX2 implementation, I'd forgotten about that. It amazes me how many HPC devs don't look at the ISA/cluster they're targeting in more detail, but that's probably a topic for a different thread. (Haven't been on the LTT forums often in the last few years; any HPC/hardcore turbo-nerd areas worth taking a gander at?) Back to Alder Lake: still no word from Intel re: AVX-512 enablement, but MSI have now come out and said they're going to be enabling support across their entire Z690 lineup (they reached out to Dr. Ian C at AnandTech). To me that's as good a sign as any that, at least for the K SKUs, AVX-512 will be treated like overclocking.