
M3 MacBook Pro Reviews. 8GB of RAM on a $1600 laptop is criticised heavily

filpo
4 hours ago, mecarry30 said:

I wish I could understand what any of this means as a first-year computer engineering student

If you want to give yourself some spoilers, grab Patterson's book on computer architecture; it gives you a nice walk-through of how the CPU works and how all of that memory hierarchy stuff happens.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


4 hours ago, Paul Thexton said:

You can boot Macs from an external drive. Always could, still can. Whether a write-worn SSD would somehow stop that from working though I couldn’t say.

All Macs with a disk drive were unable to boot from USB drives (artificial limitation).

 

Faulty flash in an M-era device bricks the device. AFAIK there is no UEFI/BIOS that would be able to boot/launch anything.

https://news.ycombinator.com/item?id=26114417


3 hours ago, Sauron said:

 

That might be the case... if the systems didn't come out of the factory with extremely limited capacity and the OEM capacity upgrades didn't come at multiple times the market value of consumer RAM. As a consumer it's also good to at least have the option; it's not always easy to project what your memory needs will be 5 years from now.

Conceptually, a laptop doesn't really "need" socketed/slotted RAM if the models available cover all use cases, including niche edge cases. Would 256GB of RAM in a MacBook Pro be useful? Yes. Would Apple be justified in offering that as a BYO option? No, because that's not generally a good use case for a laptop. Just to point back to my screenshot of my desktop with 96GB of RAM: only 32GB of it is really being used. There is a cross-over point beyond which the RAM in the system will likely never be fully utilized, and THAT is what the default RAM should be set to.

 

Same with the storage. When I bought the 16GB iPad, I was basically doing the "I don't know how much I need" thing, but that quickly became too small after several iOS updates, never mind various apps literally taking up 2GB by themselves. If I were to buy a new one, I would likely get at least the 1TB storage model.

 

So we have two variables, RAM and storage. And since they are soldered, Apple would need to make a huge number of configurations to appease everyone. That's why it shouldn't be done, and RAM/disk should be dealer/user-installable, even if the SoC is fixed to the board. If the SoC is fixed to the board, you're only dealing with two or three assembly lines, not 3x3 (9) configurations per base model. Heck, it's even MORE than that if you select the M3 Pro/Max model, which has (2x2x4) + (1x2x5) + (1x3x5) = 41 configurations. That's insane. It's also absurdly priced ($9000 CDN) to have it maxed out, when an equivalent Dell or HP doesn't exist.
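(For anyone checking that count: (2x2x4) + (1x2x5) + (1x3x5) = 16 + 10 + 15 = 41.)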

 

A 15" 64GB/4TB BYO 5680 Dell Precision with an 8GB Ada 2000 is $8000.

A 16" Alienware m16 with 64GB RAM+8TB SSD+ 16GB 4090 is $5900

A 16" Macbook Pro with 64GB/4TB SSD $6800.

A 16" Macbook Pro with 64GB/8TB SSD $8300.

 

Like, it's a bit embarrassing to line up that Alienware and Dell Precision against the MacBook Pro and see that the upgrade from 4TB to 8TB of storage on the Dell is $900 but $1500 on the Mac. 16GB to 32GB of RAM on the Dell is $200, and then to 64GB is $100 on top of that. Apple 64GB to 128GB of RAM? $1000. 36GB to 64GB? $500.

 

So never mind the different size tiers: each RAM upgrade on the MacBook Pro is priced linearly with capacity ($250 for 36GB, $500 for 64GB, $1000 for 128GB), which doesn't make sense. And it still doesn't explain why Apple's upgrade from 36GB to 64GB is $500 while Dell's is $100.

 

It feels really weird to see MacBook Pros priced like much more powerful laptops because the BYO options are overpriced. This is why people who bought Mac Pros and Mac Minis always opted for installing their own RAM and/or drives, when 16GB modules are $60 and 32GB modules are $120. The Dell upgrades are almost exactly in line with that, whereas Apple's are at least double. Storage is even worse: a 4TB PCIe 5.0 NVMe drive is $900, yet that 4TB to 8TB upgrade from Apple is $1500.
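Rough per-unit math, using the prices above: on RAM, Dell's 32GB-to-64GB upgrade is $100 for 32GB, about $3/GB, while Apple's 36GB-to-64GB upgrade is $500 for 28GB, about $18/GB. On storage, a retail 4TB drive at $900 works out to about $225/TB, Dell's 4TB-to-8TB upgrade is $900 for 4TB, also about $225/TB, and Apple's 4TB-to-8TB upgrade is $1500 for 4TB, about $375/TB.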

 

Anyway. The prices for storage and RAM on Apple devices are uncompetitive with Windows devices. Makes you wonder if Apple stockpiled all this stuff back in 2019 and is still selling it at peak-Covid prices.

 


6 hours ago, HenrySalayne said:

All Macs with a disk drive were unable to boot from USB drives (artificial limitation)

I literally did it with my first Core 2 Duo Mac Mini, and have done it many times since with newer hardware to test out beta OS releases etc. Perhaps it was a limitation in the PowerPC era?


6 hours ago, HenrySalayne said:

Faulty flash in an M-era device bricks the device. AFAIK there is no UEFI/BIOS that would be able to boot/launch anything.

https://news.ycombinator.com/item?id=26114417

Yeah, so a faulty SSD causing boot problems makes sense, with the initial boot stages coming from the EFI partition.
 

For people who are hyper-concerned about wear rates and want to run their main OS on an external drive (most likely Mac Mini users), that's still an option, and it would hugely reduce writes to the on-package storage, as it would literally just be OS updates writing to it.


On 11/10/2023 at 2:55 PM, Obioban said:

Every time anything Apple goes wrong it's national news. If SSDs were failing as machines aged, we'd know about it-- pretty confidently, that's not an issue. 

Not just Apple. Remember when Spotify was writing hundreds of GBs a day for 5 months due to a bug? Burning away precious SSD lifespan. https://www.extremetech.com/computing/239268-spotify-may-killing-ssd


On 11/11/2023 at 7:47 PM, Kisai said:

Anyway. The prices for storage and RAM on Apple devices are uncompetitive with Windows devices. Makes you wonder if Apple stockpiled all this stuff back in 2019 and is still selling it at peak-Covid prices.

Apple has always done this; even 10 years ago RAM upgrades were significantly more expensive than market price, despite the fact that they still had socketed RAM back then. Maybe they realized people were buying the base SKU and installing their own reasonably priced RAM... can't have that, I guess.

On 11/11/2023 at 7:47 PM, Kisai said:

Conceptually, a laptop doesn't really "need" socketed/slotted RAM if the models available cover all use cases, including niche edge cases. Would 256GB of RAM in a MacBook Pro be useful? Yes. Would Apple be justified in offering that as a BYO option? No, because that's not generally a good use case for a laptop.

Sure, but we wouldn't be having this conversation if the base SKU had 256GB of memory. Heck, we wouldn't be having it even if it had 16GB. The problem is that it has 8GB, making it either an extremely overpriced typing machine or a paperweight in two years. And as you pointed out, the higher capacity upgrades are unreasonably priced. So I can only see this as bait by Apple: claim a lower "starting from" price to lure you in, then get you to pay more in the configurator.

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*


The problem is not entirely with macOS memory allocation and paging. It's how every other app (looking at you, Chrome) eats up RAM very quickly. Most Linux distros can run fine on a gig or two.

 

Speaking as someone who doesn't actually work in the hardware or software departments, Apple PR's reasoning should have been "we tested our system using our included apps; Google is clearly unreliable".

Specs: Motherboard: Asus X470-PLUS TUF gaming (Yes I know it's poor but I wasn't informed) RAM: Corsair VENGEANCE® LPX DDR4 3200Mhz CL16-18-18-36 2x8GB

            CPU: Ryzen 9 5900X          Case: Antec P8     PSU: Corsair RM850x                        Cooler: Antec K240 with two Noctura Industrial PPC 3000 PWM

            Drives: Samsung 970 EVO plus 250GB, Micron 1100 2TB, Seagate ST4000DM000/1F2168 GPU: EVGA RTX 2080 ti Black edition


18 minutes ago, williamcll said:

The problem is not entirely with macOS memory allocation and paging. It's how every other app (looking at you, Chrome) eats up RAM very quickly. Most Linux distros can run fine on a gig or two.

 

Speaking as someone who doesn't actually work in the hardware or software departments, Apple PR's reasoning should have been "we tested our system using our included apps; Google is clearly unreliable".

macOS is not particularly light on memory, at least it wasn't last I tried it - I remember idling around 4GB+ on a (virtual) machine that only had about 6GB total. Besides, "professionals" (if we still want to pretend they are Apple's target) might want to run multiple programs at the same time; even if individual tasks would fit in the installed 8GB, running more than one at a time will quickly exceed that limit.

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*


I'm not going to claim that Apple SSDs wear out or die at any unusual rate, but I am going to claim that it sucks to get unlucky and have your SSD conk out, because now your laptop is a brick. Seen it happen.


On 11/11/2023 at 2:41 AM, igormp said:

Not really, you don't see that happening for all of the tiered levels of cache we have and also virtual memory, that's why we have the likes of the TLB.

First of all, thanks for all the links and info (again); I currently don't have the bandwidth to look through it in detail but will keep it in mind. Then, as you and leadeater both mentioned, I completely forgot about NUMA support, which surely implements "this" on the OS side.

 

Next: cache differs fundamentally from the task at hand here. First, it's tiny compared to possibly hundreds of GBytes of RAM, and second, it keeps copies of data that are present somewhere else; those copies are "simply" overwritten and don't have to be moved to a different memory tier before the respective memory areas can be used for something different.

I completely forgot about NUMA, so yeah, I can totally see how this would work if the OS manages it, since the OS has very good visibility of all allocated sections and can also use RAM itself for RAM housekeeping. But again, doing all this in a dedicated controller for such amounts of memory seems a bit unrealistic to me.

On 11/11/2023 at 2:41 AM, igormp said:

Hence why there's different algorithms to keep track of that. As an example, Intel's HBM offerings let you choose if you want it to run as part of the memory (so your first handful of RAM is really fast), as a cache (so totally transparent to the OS) or as standalone RAM (so no extra RAM on the system).

Hmm, yes, but none of these options implements what I would consider sensible for tiered RAM (moving data back and forth between the different tiers based on access patterns). Not sure if you refer to that as "first handful of RAM is really fast"?

On 11/11/2023 at 2:41 AM, igormp said:

The OS gets the syscalls and then sends it commands to alloc and free stuff.

Calls like malloc() and free() get forwarded to a hardware memory controller which then keeps track of all allocated sections? Where does it store that info if I have a system with 512GB of RAM and allocate it in 16kB chunks?

 

 


On 11/11/2023 at 3:25 PM, Sauron said:

Oh dear, are we really about to question the whole concept of tiered memory? This is established computer science, it has been and is done, swap and cache are just two of the ways it's widely used.

 

There are plenty of algorithms to choose from ranging from simply always prioritizing fast memory and only falling back to slow memory when the fast memory is full (which is more or less how swap works) to smarter predictive algorithms like the ones used to populate cache. This is a solved problem.

Yeah, how much overgeneralization do you want to have? Give me all of it...

Managing a few dozen MBytes of cache on a block level (which keeps copies of data, in contrast to one sole copy) is totally equivalent to managing possibly hundreds of GBytes of different tiers of RAM where data is only ever present as one single copy?

 

Next, neither of the algorithms you mention comes close to what I was referring to: first filling up the fast-tier RAM, and then dynamically moving data back and forth between the different RAM tiers based on usage patterns, ideally on a per-allocated-section level, which could be both tiny and huge. I could see something like that "easily" implemented in an OS (like for NUMA, as discussed before) but not in a dedicated memory controller - that's simply a ton of housekeeping to do.
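To make explicit what I mean, a very rough sketch in C of such a per-page rebalance pass (structures and thresholds made up on the spot, nothing real):

#include <stddef.h>

enum tier { FAST, SLOW };

struct page_info {
    enum tier     where;         /* which tier the page currently lives in */
    unsigned long access_count;  /* accesses observed since the last pass */
};

/* One periodic rebalance pass over all tracked pages: promote hot pages
   into the fast tier while there is room, demote cold ones out of it. */
void rebalance(struct page_info *pages, size_t n, size_t *fast_free_pages)
{
    const unsigned long HOT = 64, COLD = 1;   /* made-up thresholds */

    for (size_t i = 0; i < n; i++) {
        struct page_info *p = &pages[i];

        if (p->where == SLOW && p->access_count >= HOT && *fast_free_pages > 0) {
            p->where = FAST;          /* promote: copy the data up, remap the page */
            (*fast_free_pages)--;
        } else if (p->where == FAST && p->access_count <= COLD) {
            p->where = SLOW;          /* demote: copy the data down, remap the page */
            (*fast_free_pages)++;
        }
        p->access_count = 0;          /* start a fresh observation window */
    }
}

That's the kind of per-page bookkeeping I find hard to picture living entirely inside a memory controller.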


1 hour ago, Dracarris said:

First, it's tiny compared to possibly hundreds of GBytes of RAM

Irrelevant, the algorithms that make it work scale from a few kB to terabytes.

1 hour ago, Dracarris said:

and second, it keeps copies of data that are present somewhere else; those copies are "simply" overwritten and don't have to be moved to a different memory tier before the respective memory areas can be used for something different.

That can be changed, we have many different algorithms and can just "downgrade" or "upgrade" them at will:

https://en.wikipedia.org/wiki/Cache_replacement_policies

 

The migration code in the kernel I posted is exactly what's responsible for doing so. You can imagine it's the same idea behind deciding what should be compressed vs. what should not in memory, or what should be swapped out or not.
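If it helps to see the replacement-policy idea in code, here's a deliberately tiny LRU sketch over a pretend 4-slot "fast tier" (purely illustrative, not how any real kernel or MMU implements it):

#include <stdio.h>

#define SLOTS 4   /* pretend the "fast tier" only holds 4 pages */

struct slot {
    int page;                /* which page occupies this slot, -1 = empty */
    unsigned long last_use;  /* logical timestamp of the last access */
};

static struct slot fast_tier[SLOTS];
static unsigned long now;

/* Access a page: a hit refreshes its timestamp, a miss evicts the
   least recently used slot (i.e. "demotes" that page to the slow tier). */
static void access_page(int page)
{
    int lru = 0;

    for (int i = 0; i < SLOTS; i++) {
        if (fast_tier[i].page == page) {
            fast_tier[i].last_use = ++now;
            printf("page %d: hit\n", page);
            return;
        }
        if (fast_tier[i].last_use < fast_tier[lru].last_use)
            lru = i;   /* remember the coldest slot so far */
    }

    if (fast_tier[lru].page >= 0)
        printf("page %d: miss, demoting page %d\n", page, fast_tier[lru].page);
    else
        printf("page %d: miss, filling empty slot\n", page);

    fast_tier[lru].page = page;
    fast_tier[lru].last_use = ++now;
}

int main(void)
{
    for (int i = 0; i < SLOTS; i++)
        fast_tier[i].page = -1;

    int trace[] = { 1, 2, 3, 1, 4, 5, 2, 1 };
    for (int i = 0; i < (int)(sizeof trace / sizeof *trace); i++)
        access_page(trace[i]);
    return 0;
}

Whether the thing doing this is a cache, zswap or a tiering daemon, the shape of the decision is the same.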

1 hour ago, Dracarris said:

doing all this in a dedicated controller for such amounts of memory seems a bit unrealistic to me.

MMUs (not the same thing as the memory controller) are really powerful nowadays (they're pretty much a CPU on their own), and it's already done. You don't need tons of RAM to do so; as I mentioned before, we have stuff like TLBs to keep track of such things. I really recommend reading Patterson's book on computer architecture, it gives a really nice overview of how all of this memory hierarchy stuff works.

1 hour ago, Dracarris said:

Hmm, yes, but none of these options implements what I would consider sensible for tiered RAM (moving data back and forth between the different tiers based on access patterns). Not sure if you refer to that as "first handful of RAM is really fast"?

Yes, when using the HBM as part of RAM (Intel calls it "flat mode") you configure it as a different NUMA node, so the same NUMA idea applies here.

You can refer to Intel's manual for more info: https://cdrdv2-public.intel.com/769060/354227-intel-xeon-cpu-max-series-configuration-and-tuning-guide.pdf
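As a usage sketch (mine, not from the manual): once the HBM shows up as its own NUMA node, you can explicitly place an allocation on it with libnuma - the node number below is just an assumption, check numactl -H on the actual box:

/* build with: gcc hbm_alloc.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }

    size_t len = 64UL << 20;                /* 64 MiB */
    void *buf = numa_alloc_onnode(len, 1);  /* node 1 assumed to be the HBM node */
    if (!buf) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    memset(buf, 0, len);   /* touch the pages so they actually get placed */
    printf("allocated %zu bytes on node 1\n", len);

    numa_free(buf, len);
    return 0;
}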

1 hour ago, Dracarris said:

Calls like malloc() and free() get forwarded to a hardware memory controller which then keeps track of all allocated sections?

Yes.

1 hour ago, Dracarris said:

Where does it store that info if I have a system with 512GB of RAM and allocate it in 16kB chunks?

On the TLB. At this point I feel like I'm repeating myself; it'd be nice for you to actually read the stuff we gave you, it really answers a lot of your questions, and as leadeater said, it is a solved problem already.

1 hour ago, Dracarris said:

Managing a few dozen MBytes of cache on a block level (which keeps copies of data, in contrast to one sole copy) is totally equivalent to managing possibly hundreds of GBytes of different tiers of RAM where data is only ever present as one single copy?

It actually is really similar lol

2 hours ago, Dracarris said:

First filling up the fast-tier RAM, and then dynamically moving data back and forth between the different RAM tiers based on usage patterns, ideally on a per-allocated-section level, which could be both tiny and huge. I could see something like that "easily" implemented in an OS (like for NUMA, as discussed before) but not in a dedicated memory controller - that's simply a ton of housekeeping to do.

That's usually done in the OS for flexibility's sake, so your MMU is only responsible for knowing whether something is paged or not, where it is, and whether it's in use, but doesn't care about such migrations.

This gives a nice read on the tiering stuff: https://pmem.io/blog/2022/06/memory-tiering-part-2-writing-transparent-tiering-solution/

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


36 minutes ago, igormp said:

On the TLB. At this point I feel like I'm repeating myself; it'd be nice for you to actually read the stuff we gave you, it really answers a lot of your questions, and as leadeater said, it is a solved problem already.

Hm, again, not sure how a TLB would manage to magically do away with keeping track of millions of start pointers and sizes. A TLB literally translates addresses, usually virtual to physical, at a page level (fixed size), so I actually have no clue how you think it would solve the issue of keeping track of each and every 16kB block of memory solitaire.exe has malloc'd. Where is the housekeeping info stored about what was accessed how recently (some kind of usage score to decide what will be moved to a lower tier once the higher tier runs out of space)?

Every TLB miss means walking the page table, which is itself usually stored in RAM (and can therefore grow more or less arbitrarily), which was kind of my point from the beginning (the memory controller/MMU _not_ having its own rather large RAM to store all that housekeeping info).

 

You guys indeed make it sound like the memory pyramid in Patterson, where everything from cache on the CPU die to petabytes of HDD is "just another tier of memory" that can be managed in exactly the same way, nothing needs any special treatment, you just set some flag somewhere; that's simply not how things work in reality. I say that as a person who designs and implements microprocessor SoCs (HDL and low-level software). Yes, I have no clue about complex cache systems, but as a hardware architect I think some things are being presented in an oversimplified fashion here.

 

I agree that this problem is solved in software/the OS - I still don't see how it would be managed purely in hardware, be it in the memory controller, MMU, TLB or wherever, with the exception of keeping the actual scoreboard (what is malloc'd where and accessed how recently) in RAM itself.


16 minutes ago, Dracarris said:

Hm, again, not sure how a TLB would manage to magically do away with keeping track of millions of start pointers and sizes. A TLB literally translates addresses, usually virtual to physical, so I actually have no clue how you think it would solve the issue of keeping track of each and every 16kB block of memory solitaire.exe has malloc'd.

When you do a malloc, you actually end up with a page (either more space in an existing page, or new pages to use), so in the end you just need to keep track of pages, and that's the TLB's job.

If a program has requested two 16kB blocks at different points in time, you can't know if they're being fully utilized or not, just how recently the pages they're in have been used, and that's all you care about; it's up to the program to make proper use of the stuff it has allocated.
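You can poke at that yourself - a tiny sketch that just prints the page size and which page numbers two 16kB allocations start in (nothing authoritative, just for illustration):

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);  /* 4096 on most x86-64, 16384 on Apple silicon */
    printf("page size: %ld bytes\n", page);

    char *a = malloc(16 * 1024);
    char *b = malloc(16 * 1024);

    /* malloc carves these chunks out of pages it got from the OS;
       the OS/MMU/TLB only ever track the pages, never these chunks. */
    printf("a starts in page number %lu\n", (unsigned long)((uintptr_t)a / (uintptr_t)page));
    printf("b starts in page number %lu\n", (unsigned long)((uintptr_t)b / (uintptr_t)page));

    free(a);
    free(b);
    return 0;
}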

19 minutes ago, Dracarris said:

You guys indeed make it sound like the memory pyramid in Patterson, where everything from cache on the CPU die to petabytes of HDD is "just another tier of memory"

It basically is haha

From your ZFS storage with those SSD caches to your CPU with SRAM caches the ideas are pretty similar.

20 minutes ago, Dracarris said:

that can be managed in exactly the same way, nothing needs any special treatment, you just set some flag somewhere; that's simply not how things work in reality.

Not that simple, but the algorithms themselves are extremely similar. All you need is some metadata (either given by the hardware or that you handle yourself) and parameters that let you decide how to work with stuff on a performance (both bandwidth and latency) vs. capacity basis, since that's the major difference between all of those layers in the pyramid, be it an L1 cache or CXL-attached memory.

 

24 minutes ago, Dracarris said:

I agree that this problem is solved in software/the OS - I still don't see how it would be managed purely in hardware, be it in the memory controller, MMU, TLB or wherever, with the exception of keeping the actual scoreboard (what is malloc'd where) in RAM itself.

I guess it's important to say that I never once mentioned that it's entirely done in hardware (do point out if I did); in my first replies related to that I pointed to how Linux does it. Also, just as a reminder, the TLB is capable of handling petabytes of virtual memory, so it doesn't really matter how much of that is fast or slow, and we also have many different TLBs (like instruction and data TLBs) and multiple TLB levels, so all that indirection lets us address all of that stuff in hardware without needing tons of resources (with a small latency trade-off, of course).

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


6 minutes ago, igormp said:

I guess it's important to say that I never once mentioned that it's entirely done in hardware (do point out if I did); in my first replies related to that I pointed to how Linux does it. Also, just as a reminder, the TLB is capable of handling petabytes of virtual memory, so it doesn't really matter how much of that is fast or slow, and we also have many different TLBs (like instruction and data TLBs) and multiple TLB levels, so all that indirection lets us address all of that stuff in hardware without needing tons of resources (with a small latency trade-off, of course).

Two things only come clear to me now which so far caused a lot of confusion to me:

- It all happens on a page level, which severely limits the granularity and bounds worst-case scenarios

- In the end, the page table(s) aka "the big boy" resides in RAM itself

 

Just because the TLB can handle petabytes does not change the fact that at any given time it only contains a subset of (info about) everything that is currently allocated. The TLB is a lot about virtual memory - my question is orthogonal to that; you could have a system with physical addressing only and it would remain the same.

 

Now that we've cleared this confusion up, I'm still curious where the "recent usage" info about each page is stored - I guess in the page table itself, and then "bumped" every time a page is accessed? Now, if physical space in the highest memory tier runs low, a decision must be made about which physical page to move, since in the end you must make room in the physical highest tier. Seriously, I absolutely did not think about virtual memory in this context and it completely threw me off trying to make sense of all this.


4 minutes ago, Dracarris said:

- It all happens on a page level, which severely limits the granularity and bounds worst-case scenarios

Hence why we have all that discussion about the ideal page size, and how the M1/macOS uses 16kB pages instead of the usual 4kB you find in other places.
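Just for a sense of scale (assuming, say, a 128-entry data TLB): 128 x 4kB = 512kB of address space covered without a page walk, versus 128 x 16kB = 2MB with 16kB pages, so the same TLB reaches 4x as much memory per entry.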

6 minutes ago, Dracarris said:

Now that we've cleared this confusion up, I'm still curious where the "recent usage" info about each page is stored - I guess in the page table itself, and then "bumped" every time a page is accessed?

Yup:

Quote

A page table entry or other per-page information may also include information about whether the page has been written to (the dirty bit), when it was last used (the accessed bit, for a least recently used (LRU) page replacement algorithm), what kind of processes (user mode or supervisor mode) may read and write it, and whether it should be cached.
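To make those fields concrete, here's a toy page-table-entry layout in C (purely illustrative - real x86-64 or ARMv8 formats are laid out differently):

#include <stdint.h>

/* Toy PTE: one 64-bit word per virtual page. */
struct pte {
    uint64_t present  : 1;   /* page is mapped in physical memory */
    uint64_t writable : 1;   /* writes allowed */
    uint64_t user     : 1;   /* user mode may access (vs supervisor only) */
    uint64_t accessed : 1;   /* set by hardware on any access -> feeds LRU-style decisions */
    uint64_t dirty    : 1;   /* set by hardware on writes -> must be written back before reuse */
    uint64_t no_cache : 1;   /* bypass caching for this page */
    uint64_t reserved : 6;
    uint64_t pfn      : 52;  /* physical frame number the virtual page maps to */
};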

 

7 minutes ago, Dracarris said:

Seriously, I absolutely did not think about virtual memory in this context and it completely threw me off trying to make sense of all this.

Yeah, that's why we were saying that's all pretty much the same, you treat all of it as virtual memory with different tiers, hence why we said it's a solved problem since we've been doing it for ages lol

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


13 minutes ago, igormp said:

Yeah, that's why we were saying that's all pretty much the same, you treat all of it as virtual memory with different tiers, hence why we said it's a solved problem since we've been doing it for ages lol

But the different tiers must be actual, physical memory, with the virtual side looking all the same? When a page is marked as recently used/not used, someone still needs to look at the other side to figure out the physical address, move the page to a different physical address in the appropriate tier (first deciding on an address within that tier that is not used by another page, so the physical-address side of all page table entries has to be looked at), and subsequently the new physical address must be set/updated and any accesses to said page during the move stalled.

 

That sounds awfully complicated to me to do it all in hardware, and the MMU needs some very fast access to the page table.


6 minutes ago, Dracarris said:

But the different tiers must be actual, physical memory, with the virtual side looking all the same? When a page is marked as recently used/not used, someone still needs to look at the other side to figure out the physical address, move the page to a different physical address in the appropriate tier (first deciding on an address within that tier that is not used by another page, so the physical-address side of all page table entries has to be looked at), and subsequently the new physical address must be set/updated and any accesses during the move stalled.

 

That sounds awfully complicated to me to do it all in hardware, and the MMU needs some very fast access to the page table.

Most of that migration stuff is done in the OS (as I linked before). Once the OS decides what should be where, you just need to update the page tables with the new physical reference when moving stuff and mark the previous one as dirty (or do whatever else you want with it), all within the virtual realm. I guess you can have extra metadata to make such lookups/updates faster/easier, but I'm not up to speed on those details, sorry.

 

Since the page table lives in memory, and the MMU actually sits right in front of it, between the physical RAM and the CPU, I guess it's safe to assume that it already has some very fast access to the page table 😛

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


7 hours ago, igormp said:

Most of that migration stuff is done in the OS (as I linked before). Once the OS decides what should be where

Okay, but if all this lives in the OS still, then I would not consider this a fully hardware-managed tier, to come back to the origin of this discussion:

On 11/10/2023 at 11:12 PM, Dracarris said:

How would the controller know which data to put into which tier?

Answer: It does not know and putting data into the different tiers is still OS-managed through walking the page table/analyzing meta info in it.

On 11/11/2023 at 12:04 AM, Dracarris said:

To keep track of every allocated section in memory, the controller itself needs a sizeable amount of memory.

Answer: It's still in the page table, which is still in RAM, which is the "sizeable amount of memory". The TLB is only a fast access path to a subset of the page table; the actual record remains in RAM.

On 11/10/2023 at 11:24 PM, igormp said:

The strategies used are pretty similar to multi-cache systems, which we have plenty of, and can be either totally transparent to the OS

When the different tiers of RAM are NUMA nodes of which the OS must be aware (citing the nice Intel doc you linked below), I don't think of this as "totally transparent to the OS". Maybe in cache mode, but that does not give you extra RAM capacity, and also in that case the question is where the LRU information for every page in that now sizeable cache is stored and maintained, and who gives the trigger to actually move a page from one tier to the other.

Well, the Intel doc says in 2LM cache mode HBM simply is a direct-mapped cache, so quite far away from the discussed scenario.


7 hours ago, igormp said:

 

Since the page table lives in memory, and the MMU actually sits right in front of it, between the physical RAM and the CPU, I guess it's safe to assume that it already has some very fast access to the page table 😛

Makes a lot of sense 🙂 It however also means that both the MMU and the OS access/manipulate the page table, which is ultimately "just" some data structure in RAM which probably comes with its own kind of challenges and issues.


On 11/6/2023 at 7:27 PM, Zando_ said:

Might do. But I meant the lower capacity (256GB) SSDs in base model M-chip equipped Macs are slower than the higher capacity models of the same M-chip equipped Macs, so both would have the controller in the SoC making that distinction irrelevant. Seems to be specifically the 256GB models as they use a single NAND chip, older 128GBs used 2 NAND chips so they were actually faster: https://www.macrumors.com/2023/01/24/m2-mac-mini-256gb-slower-ssd/#:~:text=We have confirmed with the,benchmark results and real-world. And: https://www.macrumors.com/2023/06/13/15-inch-macbook-air-single-256gb-chip/

This is only true for the M2. According to the iFixit video (at 3:40), there are two NAND chips on the 256GB M3.

 


1 hour ago, Erebuxy said:

This is only true for the M2. According to the iFixit video (at 3:40), there are two NAND chips on the 256GB M3.

Indeed. That decision wouldn’t have been about deliberately making the base spec storage slower, it would have purely been down to BOM cost for the lower storage tier based on supply chain costs at the time.

 

Wouldn’t surprise me if this is something they do again in the future.


26 minutes ago, Paul Thexton said:

Indeed. That decision wouldn’t have been about deliberately making the base spec storage slower, it would have purely been down to BOM cost for the lower storage tier based on supply chain costs at the time.

 

Wouldn’t surprise me if this is something they do again in the future.

How many actually need that performance, and of those that did, how many bought higher-capacity options that came with two chips anyway?


3 minutes ago, leadeater said:

How many actually need that performance, and of those that did, how many bought higher-capacity options that came with two chips anyway?

I’m willing to bet that the majority who bought the lowest tier storage didn’t even notice the decreased speed until they saw hysterical  videos about it from professional benchmark runners.


3 hours ago, Paul Thexton said:

I’m willing to bet that the majority who bought the lowest tier storage didn’t even notice the decreased speed until they saw hysterical  videos about it from professional benchmark runners.

Or when they were running out of memory because they only had 8 GB of RAM and the entire machine slowed down well below the M1 model. 🙃

 

