Apple removing Rosetta 2 x86 emulation!?

15 minutes ago, StDragon said:

Honestly, what's the damn point? I remember when running a "hackintosh" was a useful edgy thing to do. But with MacOS being a walled garden, limited Steam library, and overall a giant PITA to implement and maintain with current hardware (specifically the GPU), WHY BOTHER??

Apple, you can keep your platform. I wouldn't run OSX on my PC even if you gave the OS away for free. 🥱

Rosetta 2 wasn't made to run Windows games; remember that its x86 emulation is limited to Mac software. A more realistic use case is word processing software that hasn't been updated, or letting someone run Skype on their Mac to video call their parents.

1 hour ago, curiousmind34 said:

Rosetta 2 wasn't made to run Windows games; remember that its x86 emulation is limited to Mac software. A more realistic use case is word processing software that hasn't been updated, or letting someone run Skype on their Mac to video call their parents.

Rosetta also appears to be fast enough to run (within reason) more demanding productivity software. Since more intensive applications tend to take time to optimize properly for an entirely different ISA, it makes sense for Rosetta to act as a bridge for hobbyists and other users until the software is ready.

I have to admit, watching that prototype ARM Mac (A12Z chip) run Rise of the Tomb Raider via emulation (keeping in mind the game has a Mac version) at 1080p and 60 fps was quite impressive.

Unfortunately, I doubt a lot of old games will be ported to ARM, so people will probably need a separate machine to enjoy them once Rosetta is dropped. That's a shame, as Rosetta is really a great engineering feat.

The pursuit of knowledge for the sake of knowledge.

Forever in search of my reason to exist.

2 hours ago, StDragon said:

Honestly, what's the damn point? I remember when running a "hackintosh" was a useful edgy thing to do. But with MacOS being a walled garden, limited Steam library, and overall a giant PITA to implement and maintain with current hardware (specifically the GPU), WHY BOTHER??

Apple, you can keep your platform. I wouldn't run OSX on my PC even if you gave the OS away for free. 🥱

That's not why fewer hackintoshes are made. Over the years it's gotten progressively harder to do. Apple said long ago that hackintoshing was something they would have to combat incrementally over a very long period of time, and that seems to have been the case. Many years ago hackintoshing was trivial. I once had an eeee (I forget how many e's, but more than two) that I bought specifically to hackintosh. It could be done out of the box. Everything worked. It dramatically improved the machine. Running Windows and OS X head to head, it became clear what a slow, buggy mess Windows was in comparison. It's been a long time since one could do something like that, though.

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.

12 minutes ago, Bombastinator said:

That's not why fewer hackintoshes are made. Over the years it's gotten progressively harder to do. Apple said long ago that hackintoshing was something they would have to combat incrementally over a very long period of time, and that seems to have been the case. Many years ago hackintoshing was trivial. I once had an eeee (I forget how many e's, but more than two) that I bought specifically to hackintosh. It could be done out of the box. Everything worked. It dramatically improved the machine. Running Windows and OS X head to head, it became clear what a slow, buggy mess Windows was in comparison. It's been a long time since one could do something like that, though.

Honestly my experience with Hackintoshes has gotten better as time has gone on. I got Catalina running on my custom rig in about an hour - much quicker than other machines.

York: Intel Core i7-2600 (4C/8T), 16GB 1600MHz DDR3, Zotac GTX 550 Ti AMP! 1GB, 250GB Samsung 870 EVO, Windows 7 Professional

Phobos: AMD Ryzen 7 2700 (8C/16T), 16GB 3000MHz DDR4, XFX Radeon RX 570 4GB, 1TB WD Blue SN550, 960GB Crucial M500, 2TB Seagate BarraCuda, Windows 10 Pro for Workstations

Bondi: Apple iMac G3 (Tray Loader, Revision A), PowerPC 750, 384MB PC-66, ATI Rage IIc 2MB, 4GB Quantum Fireball, Mac OS 9.2.2

40 minutes ago, BondiBlue said:

Honestly my experience with Hackintoshes has gotten better as time has gone on. I got Catalina running on my custom rig in about an hour - much quicker than other machines.

The complication is partly what one has to do to build one, and what one can get to work. Both have gotten worse over time. If you can fulfil the criteria, though, it's about the same, more or less. It depends more on what contortions are necessary to make a particular version work, which varies somewhat randomly for the end user, though it has gotten dramatically more complex for the unlock creators.

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


Surely, to port software to ARM, they just need to recompile, right? It isn't like they are developing for an entirely different OS. If the SDKs and APIs are all the same, they should just need to recompile the software from source with the respective ARM compiler/assembler, which can be done in seconds, well, for the human running the compiler, that is. Not sure how long the compile itself would take.

Sudo make me a sandwich


There's a chance this is BS and it's referring to an earlier thing where there was an implication in a dev build that it might be removable. There's also a chance that this is a later build and it went more in that direction. I don't know.

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


So Apple has added error strings to the OS just in case a legal battle with Intel forces them to stop distributing it in some regions. Currently, if Rosetta 2 fails to download, the user gets a rather useless error message. This change provides a more meaningful message, but it does not at all mean they will stop shipping Rosetta 2; it just means they want to be ready in case Intel sues them and wins in some regions of the world. (If Apple were planning on pulling Rosetta 2 support themselves, they would do it globally, so the error message would not say something about your region.)

1 hour ago, hishnash said:

So Apple has added error strings to the OS just in case a legal battle with Intel forces them to stop distributing it in some regions.

Will Intel actually stoop that low? Quite petty of them to do such a thing, if you ask me.

There is more that meets the eye
I see the soul that is inside

Making Windows Defender as good or even better than paid options

1 hour ago, captain_to_fire said:

Will Intel actually stoop that low? Quite petty of them to do such a thing, if you ask me.

When Microsoft was working on x86 Emulation in the past, Intel was throwing indirect warnings to Microsoft. Given the struggles Intel has been having lately, I wouldn't put it past them.

https://www.forbes.com/sites/tiriasresearch/2017/06/16/intel-threatens-microsoft-and-qualcomm-over-x86-emulation/?sh=739b91fd54f4

7 hours ago, wasab said:

Surely, to port software to ARM, they just need to recompile, right? It isn't like they are developing for an entirely different OS. If the SDKs and APIs are all the same, they should just need to recompile the software from source with the respective ARM compiler/assembler, which can be done in seconds, well, for the human running the compiler, that is. Not sure how long the compile itself would take.

There will be programs out there that use assembly code to squeeze every last ounce of performance out of their code, along with optimizations for special instruction sets. This is likely more pronounced in video and sound editing software, where performance in some areas is key. So it wouldn't be as simple as recompiling.

On top of that, if a vendor has a newer version, they might just see it as an opportunity to sell the newest version as compatible with the M1s and not remake the old version.

3735928559 - Beware of the dead beef

6 hours ago, Bombastinator said:

There's a chance this is BS and it's referring to an earlier thing where there was an implication in a dev build that it might be removable. There's also a chance that this is a later build and it went more in that direction. I don't know.

agreed but also doesnt apple also kill off others to make their own too?

29 minutes ago, pas008 said:

agreed but also doesnt apple also kill off others to make their own too?

Huh? Kill off what? This sentence didn't make sense to me. Undifferentiated pronouns.

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.

2 hours ago, Bombastinator said:

Huh? Kill off what? This sentence didn't make sense to me. Undifferentiated pronouns.

dont they kill off apps/features/etc from other companies

because they implement similar app/features/etc into their own

3 hours ago, pas008 said:

dont they kill off apps/features/etc from other companies

because they implement similar app/features/etc into their own

Heh. Yes and no. They don't destroy them so much as include often better versions for free. There's very little difference between that and, say, Google, who more or less destroyed Garmin with Google Maps. Google Maps only pretended to be free, of course, because there was data collection, but there is a similarity. More or less every OS maker has done this, often with less kindness. It's not something in any way specific to Apple. Microsoft didn't "deliberately destroy" solitaire software programs when it introduced FreeCell. The action did effectively put a cap on price and forced devs to create a materially better product than what was offered for nothing. I'm in a similar situation with my most important piece of phone software. Apple has two included programs, alarm clock and iCal, which together do most of the things informant5 does better than informant5. Informant5 actually uses iCal databases. It's value added, though, so they add a not-so-small monthly fee to use it. Which I pay. So there would be a situation where Apple didn't do that.

Edited by Bombastinator

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.

12 hours ago, hishnash said:

So Apple has added error strings to the OS just in case a legal battle with Intel forces them to stop distributing it in some regions. Currently, if Rosetta 2 fails to download, the user gets a rather useless error message. This change provides a more meaningful message, but it does not at all mean they will stop shipping Rosetta 2; it just means they want to be ready in case Intel sues them and wins in some regions of the world. (If Apple were planning on pulling Rosetta 2 support themselves, they would do it globally, so the error message would not say something about your region.)

10 hours ago, Nayr438 said:

When Microsoft was working on x86 Emulation in the past, Intel was throwing indirect warnings to Microsoft. Given the struggles Intel has been having lately, I wouldn't put it past them

But then why would it only be in a few regions? Unless Intel doesn't own the rights to x86 everywhere, or there is another company out there that bought rights from Intel to emulate x86 in certain regions? But I find that hard to believe.

🌲🌲🌲

Judge the product by its own merits, not by the Company that created it.

Don't dilute <good thing> by always trying to focus on, and drag conversation back to, <bad thing>.

🌲🌲🌲


I'm mainly a web developer and don't know that much about low level stuff. How difficult is it to write code in ARM compared to x86?

30 minutes ago, AldiPrayogi said:

I'm mainly a web developer and don't know that much about low level stuff. How difficult is it to write code in ARM compared to x86?

Highly depends on what you write.

If you're only using standard C and native APIs then it's literally just a flag in the compiler that is different. You can use exactly the same code for ARM and x86.

For most developers, it will not be any different writing code for ARM vs x86 since they stay in the high level languages.
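
As a rough sketch of what that looks like in practice, the C below has nothing ISA-specific in it, so the same file should build for either target. The names `sum_u32` and `target_arch` are made up for illustration; the predefined macros are ones GCC, Clang and MSVC set for their targets. With Apple's clang, the switch would be something like `-arch x86_64` versus `-arch arm64`.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain standard C: nothing here depends on the instruction set. */
uint64_t sum_u32(const uint32_t *values, size_t count) {
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Reports the architecture the compiler built for, based on the
   macros it predefines (illustrative helper, not a library call). */
const char *target_arch(void) {
#if defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#else
    return "other";
#endif
}
```

Built once per architecture, `target_arch()` should report a different string in each binary while `sum_u32` behaves the same in both.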

I don't know assembly so I have no idea how hard or easy it is to write ARM vs x86 assembly, but from what I've heard and seen it seems like ARM is easier. I mean, that's kind of the point of having a RISC ISA.

28 minutes ago, AldiPrayogi said:

I'm mainly a web developer and don't know that much about low level stuff. How difficult is it to write code in ARM compared to x86?

Hard.

The same reason why PPC cpu's are harder to write for than x86.

RISC stuff does things more efficiently, but it requires more logic in the compiler to do so. The result is less code.

64bit programs, written in C or C++ use intrinsics to get around the differences in cpu's. But it requires 64-bit cpu's to really mask the differences in cpu functionality.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#

They are completely optional to use, but if you do use them, then you also need re-implement them for different CPU types. That's where Rosetta comes in.

There is no 1:1 intrinsics, just like there are no 1:1 for the basic instructions.

This is why people who pre-optimize their programs to use assembly usually get laughed at, because they're only hurting their ability to debug it. If you decide to use intrinsics to optimize something, it has to be in the smallest way to make a large performance change. This is why it's usually in math libraries and machine learning libraries rather than in monolithic programs.

In short, if you can't program the thing you want in straight C (or C++, Rust, C#, or anything that generates a machine-language binary) then it's not going to be portable to another CPU. Even Rosetta2 will not be able to cover everything. Let's say you develop a program using 11th gen cpu instructions (eg AVX512) only. Not only is it not portable on OS X, it's not portable to AMD. OSX's compiler is clang-llvm. So updates to clang may introduce those intrinsics, even if the cpu on the system does not support it.

https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Clang-10-AVX512-Change

Quote

Unless 512-bit intrinsics are used in the source code, 512-bit ZMM registers will not be used since those operations lead to most processors running at a lower frequency state.

There's a lot of edge cases by using assembly directly. So it's not typically a good idea to go directly to assembly unless you know what you're doing.

If you're only programming in an interpreted language, eg javascript, or using a cross-compiler to javascript (eg wasm) none of these features are available, and all assembly without intrinsics are ignored, because there is no way to convey what the assembly is intended to do, as it would have to be converted back to what it theoretically is in C.

14 minutes ago, Kisai said:

Hard.

The same reason why PPC cpu's are harder to write for than x86.

RISC stuff does things more efficiently, but it requires more logic in the compiler to do so. The result is less code.

64bit programs, written in C or C++ use intrinsics to get around the differences in cpu's. But it requires 64-bit cpu's to really mask the differences in cpu functionality.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#

They are completely optional to use, but if you do use them, then you also need re-implement them for different CPU types. That's where Rosetta comes in.

There is no 1:1 intrinsics, just like there are no 1:1 for the basic instructions.

This is why people who pre-optimize their programs to use assembly usually get laughed at, because they're only hurting their ability to debug it. If you decide to use intrinsics to optimize something, it has to be in the smallest way to make a large performance change. This is why it's usually in math libraries and machine learning libraries rather than in monolithic programs.

In short, if you can't program the thing you want in straight C (or C++, Rust, C#, or anything that generates a machine-language binary) then it's not going to be portable to another CPU. Even Rosetta2 will not be able to cover everything. Let's say you develop a program using 11th gen cpu instructions (eg AVX512) only. Not only is it not portable on OS X, it's not portable to AMD. OSX's compiler is clang-llvm. So updates to clang may introduce those intrinsics, even if the cpu on the system does not support it.

https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Clang-10-AVX512-Change

There's a lot of edge cases by using assembly directly. So it's not typically a good idea to go directly to assembly unless you know what you're doing.

If you're only programming in an interpreted language, eg javascript, or using a cross-compiler to javascript (eg wasm) none of these features are available, and all assembly without intrinsics are ignored, because there is no way to convey what the assembly is intended to do, as it would have to be converted back to what it theoretically is in C.

I wonder if you know the answer to something I've been wondering about:

I'm wondering if my logic train works or if it merely left the station without me.

One of the things Apple said is that they're trying to get Parallels working on the M1. Parallels was an app that basically let one run most PC programs in macOS without bothering with a Windows partition and Boot Camp at all. Stuff just ran. There were serious limits. There was a bunch of stuff it wouldn't do. It ran really slowly too, which made it not so useful for games. Apparently one of the big wins of the M1 is that it can do the type of thing Parallels did for macOS with a mere 20% performance hit, and it can run things Parallels couldn't. This would make an M1, if perhaps not a very snappy Windows gaming machine, at least a functional one. Does this make sense or am I missing something fundamental here?

Edited by Bombastinator

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.

20 hours ago, captain_to_fire said:

Will Intel actually stoop that low? Quite petty of them to do such a thing, if you ask me.

They hired the actor who used to do Mac ads to do ads that say intel stuff are cool.

🖥️ Motherboard: MSI A320M PRO-VH PLUS ** Processor: AMD Ryzen 2600 3.4 GHz ** Video Card: Nvidia GeForce 1070 TI 8GB Zotac 1070ti 🖥️
🖥️ Memory: 32GB DDR4 2400 ** Power Supply: 650 Watts Power Supply Thermaltake +80 Bronze Thermaltake PSU 🖥️

🍎 2012 iMac i7 27"; 2007 MBP 2.2 GHZ; Power Mac G5 Dual 2GHZ; B&W G3; Quadra 650; Mac SE 🍎

🍎 iPad Air2; iPhone SE 2020; iPhone 5s; AppleTV 4k 🍎

9 hours ago, AldiPrayogi said:

I'm mainly a web developer and don't know that much about low level stuff. How difficult is it to write code in ARM compared to x86?

If you aren't dabbling with low-level stuff, it makes no difference whatsoever. If you're dealing with low-level stuff, ARM per se is pretty easy; it gets annoying due to the differences in peripherals, whereas x86 handles those in a more standard way.

8 hours ago, Kisai said:

Hard.

The same reason why PPC cpu's are harder to write for than x86.

RISC stuff does things more efficiently, but it requires more logic in the compiler to do so. The result is less code.

That's not how it works. ARM and PPC's asm is way simpler and easier to write than x86. The problem is that since it's way simpler and has fewer instructions, in the end you'll need more instructions to accomplish the same thing you'd do with far fewer instructions on x86.

CISC does stuff more efficiently if you're thinking about the amount of code required, at the cost of way higher complexity on the decoder and actual CPU backend.

(as a side note, nice example on how efficient CISC is space-wise can be seen here, where this simple code only requires 49 bytes, saving costs on flash/eeprom).

8 hours ago, Kisai said:

64bit programs, written in C or C++ use intrinsics to get around the differences in cpu's. But it requires 64-bit cpu's to really mask the differences in cpu functionality.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#

They are completely optional to use, but if you do use them, then you also need re-implement them for different CPU types. That's where Rosetta comes in.

Intrinsics aren't meant to "get around differences in cpu's", but to give the compiler a hint to use some specific instruction/extensions, such as AVX-512 that you mentioned, so you don't need to get down to asm by hand.

You can use intrinsics for most CPUs, not only 64bit ones. ARM also has intrinsics (such as the ones for NEON), and even for microcontrollers.
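
To illustrate how intrinsics are tied to a specific instruction-set extension, here is a minimal sketch that adds four floats with SSE intrinsics on x86 and NEON intrinsics on ARM, falling back to plain C elsewhere. The function name `add4` is hypothetical; `_mm_add_ps` and `vaddq_f32` are the actual intrinsics declared by `emmintrin.h` and `arm_neon.h`. It assumes a GCC- or Clang-style compiler that defines `__SSE2__` or `__ARM_NEON`; other compilers would simply take the scalar path.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* x86 SSE/SSE2 intrinsics */
#elif defined(__ARM_NEON)
#include <arm_neon.h>    /* ARM NEON intrinsics */
#endif

/* Adds four floats from a and b element-wise and stores them in out. */
void add4(const float *a, const float *b, float *out) {
#if defined(__SSE2__)
    __m128 va = _mm_loadu_ps(a);            /* unaligned 128-bit load */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb)); /* one packed add */
#elif defined(__ARM_NEON)
    float32x4_t va = vld1q_f32(a);
    float32x4_t vb = vld1q_f32(b);
    vst1q_f32(out, vaddq_f32(va, vb));      /* NEON equivalent */
#else
    for (size_t i = 0; i < 4; i++) {        /* portable fallback */
        out[i] = a[i] + b[i];
    }
#endif
}
```

The x86 and ARM branches do the same job with entirely different names and types, which is why code written against one set of intrinsics has to be ported by hand (or translated, as Rosetta does at the binary level).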

8 hours ago, Kisai said:

OSX's compiler is clang-llvm. So updates to clang may introduce those intrinsics, even if the cpu on the system does not support it.

That's not how it works. Modern compilers support AVX-512, AVX2 and other modern extensions, but those are not used by default. You need to enable them by setting the proper -march/-mtune flags, which by default are set to x86-64v1 (including extensions up to SSE2, but no AVX), meaning that the output is compatible with any x86-64 CPU.

AVX is only present on x86-64v3 (which is similar to intel's 4th gen feature set).
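
One way to see which feature level a build actually targets is to check the macros GCC and Clang predefine for each enabled extension. This is only a small sketch, and `compiled_simd_level` is an illustrative name, not part of any library. Built with no extra flags on x86-64 it would typically report the SSE2 baseline; adding something like `-mavx2` should change the answer.

```c
/* Returns a label for the widest x86 SIMD extension the compiler was
   allowed to use for this translation unit (compile-time check only;
   it says nothing about the CPU the binary later runs on). */
const char *compiled_simd_level(void) {
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "none";
#endif
}
```

Because the check happens at compile time, simply updating the compiler does not make a binary start using AVX-512; the flags (or explicit intrinsics) have to ask for it.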

8 hours ago, Kisai said:

There's a lot of edge cases by using assembly directly. So it's not typically a good idea to go directly to assembly unless you know what you're doing.

Intrinsics are not assembly; they're a way to actually avoid writing assembly.

8 hours ago, Kisai said:

If you're only programming in an interpreted language, eg javascript, or using a cross-compiler to javascript (eg wasm) none of these features are available, and all assembly without intrinsics are ignored, because there is no way to convey what the assembly is intended to do, as it would have to be converted back to what it theoretically is in C.

SIMD.js used to be a thing, and WASM SIMD is already available on Chrome Canary, and has been merged into Chakra Core.

If you compile the interpreters beneath those languages with said intrinsics/proper flags, then they'd make use of those.

Python even has PGO in order to build the fastest binary for the target system. Intel took it a step further on Clear Linux by enabling AVX2 by default.

8 hours ago, Bombastinator said:

One of the things Apple said is that they're trying to get Parallels working on the M1. Parallels was an app that basically let one run most PC programs in macOS without bothering with a Windows partition and Boot Camp at all. Stuff just ran. There were serious limits. There was a bunch of stuff it wouldn't do. It ran really slowly too, which made it not so useful for games. Apparently one of the big wins of the M1 is that it can do the type of thing Parallels did for macOS with a mere 20% performance hit, and it can run things Parallels couldn't. This would make an M1, if perhaps not a very snappy Windows gaming machine, at least a functional one. Does this make sense or am I missing something fundamental here?

That's not only related to Parallels, but to any virtualization task. When you ran a Windows VM on Intel Macs, you were just virtualizing the CPU using specific virtualization extensions to make it faster, meaning that the performance hit on CPU-bound tasks was pretty much nil.

What you seem to be complaining about is stuff that's GPU-dependent, since current consumer GPUs can't be easily virtualized/shared with a VM. The M1 won't make any difference on that.

Since most games are built for x86, you'd need to:

1- Emulate x86 (something that R2 takes care of)

2- Virtualize Windows (something that software like Parallels takes care of)

3- Emulate the GPU subsystem <- this is where you'd get stuck

Given all of the above overheads, performance would be shit.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga

9 hours ago, Bombastinator said:

One of the things Apple said is that they're trying to get Parallels working on the M1.

Parallels was introduced way back when XP was still an OS you could install. It's a little more than a "machine on top of a machine" in a similar way that Windows XP mode was on Windows Vista. Basically it just spins up a virtual machine, but the video driver acts as a "remote desktop" which is what allows for a seamless use of the mac and windows applications on the mac. This is likely also how Windows 10 is dealing with Linux on Windows UI stuff, but I haven't explored it.

To put things succinctly, Parallels is just a virtualization platform with some clever tricks to make it easier to use than say, QEMU (Virtualbox) and VMWare, as the latter are more about virtualizing a machine with no integration.

So it would be difficult to get Parallels to work as well as it does on the Intel Mac since even if it could leverage Rosetta2, Windows applications in the future might use code paths that aren't available. The OS though, should work unless they phase out installation on cpu's without AVX (which was introduced with the 4th gen intel cpu's), which I don't see Microsoft doing as long as there's no reason for it.

3 hours ago, igormp said:

That's not how it works. ARM and PPC's asm is way simpler and easier to write than x86. The problem is that since it's way simpler and has fewer instructions, in the end you'll need more instructions to accomplish the same thing you'd do with far fewer instructions on x86.

If you're writing just ASM, yes. But a C program is optimized differently for different CPUs, and Intel has a lot of instructions to do very specific things, whereas RISC CPUs kind of just require you to unroll those instructions. You're still unlikely to use assembly regardless of the CPU you are using unless you're building a special-purpose tool or library.

3 hours ago, igormp said:

CISC does stuff more efficiently if you're thinking about the amount of code required, at the cost of way higher complexity on the decoder and actual CPU backend.

Which is the entire complaint about Intel CPU's. That's why it uses so much more energy.

3 hours ago, igormp said:

Intrinsics aren't meant to "get around differences in cpu's", but to give the compiler a hint to use some specific instruction/extensions, such as AVX-512 that you mentioned, so you don't need to get down to asm by hand.

You can use intrinsics for most CPUs, not only 64bit ones. ARM also has intrinsics (such as the ones for NEON), and even for microcontrollers.

That's what I said. I know you can use it for 32-bit, but the way the Visual C compiler works is that you can't just drop inline assembly into the code in 64bit, it will berate you and tell you to use intrinsics. Only 32-bit lets you compile an OBJ and link it as-is.

https://docs.microsoft.com/en-us/cpp/intrinsics/compiler-intrinsics?view=msvc-160

Quote

The intrinsics are required on 64-bit architectures where inline assembly is not supported.

https://docs.microsoft.com/en-us/cpp/intrinsics/intrinsics-available-on-all-architectures?view=msvc-160

3 hours ago, igormp said:

That's not how it works. Modern compilers support AVX-512, AVX2 and other modern extensions, but those are not used by default. You need to enable them by setting the proper -march/-mtune flags, which by default are set to x86-64v1 (including extensions up to SSE2, but no AVX), meaning that the output is compatible with any x86-64 CPU.

AVX is only present on x86-64v3 (which is similar to intel's 4th gen feature set).

Intrinsics are not assembly; they're a way to actually avoid writing assembly.

On Visual C it's /Oi , and you're just rephrasing what I wrote. Use the intrinsics so you get portable code that will use assembly if available rather than going to inline assembly which will not be predictable.

3 hours ago, igormp said:

SIMD.js used to be a thing, and WASM SIMD is already available on Chrome Canary, and has been merged into Chakra Core.

If you compile the interpreters beneath those languages with said intrinsics/proper flags, then they'd make use of those.

Which isn't what happens on platforms that have packaged binaries, which is pretty much everything. And to be honest, even if you could cross-compile a C program with SIMD assembly to javascript WASM, you have no idea if the browser will have a code path for it now or in the future.

Honestly, people should just not cross-compile anything to javascript as that just introduces sloppy coding into the web browser, and that's why WASM gets used mostly for malware like cryptominers.

3 hours ago, igormp said:

Python even has PGO in order to build the fastest binary for the target system. Intel took it a step further on Clear Linux by enabling AVX2 by default.

Honestly, if Intel wants to help projects be optimized for their CPU, let them. But I've found that any time you tune something for one cpu type (eg I used to tune my FreeBSD systems for the cpu in the server), it would result in a chain of dependency hell where everything on the server had to be recompiled as well, and I just can't be bothered to waste an entire day compiling everything on the OS, for what likely amounts to only a few seconds saved here and there.

35 minutes ago, Kisai said:

But a C program is optimized differently for different CPU's, and Intel has a lot of instructions to do very specific things where as the RISC CPU's kinda just require you to unroll those instructions.

Yes, and that's the compiler's job. Even so, if you're writing it by hand, the asm for RISC CPUs is just more tedious but not hard, you'd just write more stuff repeatedly, while with a CISC one you'd need to remember weird mnemonics, so you can't say one is really harder than the other.

40 minutes ago, Kisai said:

That's what I said.

Then you didn't express yourself properly, or I did get it wrong.

40 minutes ago, Kisai said:

I know you can use it for 32-bit, but the way the Visual C compiler works is that you can't just drop inline assembly into the code in 64bit, it will berate you and tell you to use intrinsics. Only 32-bit lets you compile an OBJ and link it as-is.

https://docs.microsoft.com/en-us/cpp/intrinsics/compiler-intrinsics?view=msvc-160

Quote

The intrinsics are required on 64-bit architectures where inline assembly is not supported.

https://docs.microsoft.com/en-us/cpp/intrinsics/intrinsics-available-on-all-architectures?view=msvc-160

MSVC isn't the only compiler in existence. GCC supports asm blocks no matter the ISA.
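
For a concrete picture, this sketch uses GCC-style extended asm for the same addition on x86-64 and AArch64, with a plain C fallback for everything else. `add_asm` is a made-up name, and the code assumes GCC or Clang with the default AT&T syntax on x86.

```c
/* Adds two ints using inline assembly where a known ISA is detected. */
int add_asm(int a, int b) {
#if defined(__GNUC__) && defined(__x86_64__)
    /* AT&T syntax: addl src, dst. "+r" makes a both input and output. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#elif defined(__GNUC__) && defined(__aarch64__)
    int result;
    /* %w selects the 32-bit view of the chosen registers. */
    __asm__("add %w0, %w1, %w2" : "=r"(result) : "r"(a), "r"(b));
    return result;
#else
    return a + b; /* compilers or ISAs without a branch above */
#endif
}
```

Each ISA needs its own asm string and operand conventions, which is the part that does not survive a simple recompile.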

45 minutes ago, Kisai said:

On Visual C it's /Oi

Don't assume that's the case for either Clang or GCC.

46 minutes ago, Kisai said:

and you're just rephrasing what I wrote.

No, you implied that the compiler would introduce intrinsics in your code that would render it unable to run in other CPUs, which is not the case.

46 minutes ago, Kisai said:

Use the intrinsics so you get portable code that will use assembly if available rather than going to inline assembly which will not be predictable.

Most intrinsics aren't portable though, but you're right on the predictability.

51 minutes ago, Kisai said:

Which isn't what happens on platforms that have packaged binaries, which is pretty much everything. And to be honest, even if you could cross-compile a C program with SIMD assembly to javascript WASM, you have no idea if the browser will have a code path for it now or in the future.

Honestly, people should just not cross-compile anything to javascript as that just introduces sloppy coding into the web browser, and that's why WASM gets used mostly for malware like cryptominers.

My point was that you can enable SIMD on the interpreter itself, thus making it faster in a seamless way, not that you'd (or even should) use SIMD straight from the interpreted code itself (even though it is possible).

53 minutes ago, Kisai said:

Honestly, if Intel wants to help projects be optimized for their CPU, let them. But I've found that any time you tune something for one cpu type (eg I used to tune my FreeBSD systems for the cpu in the server), it would result in a chain of dependency hell where everything on the server had to be recompiled as well, and I just can't be bothered to waste an entire day compiling everything on the OS, for what likely amounts to only a few seconds saved here and there.

Apart from AVX-512, most features found on Clear Linux work for both Intel and AMD CPUs, so it's more of a proof of concept of what can be achieved with proper tuning, and which has paid off since GCC has introduced feature levels for x86 and most other distros are starting to enable those.

As for tuning specific software, in my own experience I haven't run into anything like that. Personally, I have TensorFlow, Python 3.5 and 3.7, and other miscellaneous software running without any problems here. If I'm going to compile those anyway, then I can just turn on those extra flags for some free performance.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga
