
RAM speed in programming

Arktius

Hi guys, 

 

I'd like to know how important RAM speed is for programming. You can find dozens of benchmarks and articles about gaming, but nothing regarding programming. Is 3000 MHz CL16 more than enough, even for expensive calculations done on the CPU?


2 minutes ago, Arktius said:

I'd like to know how important RAM speed is for programming. You can find dozens of benchmarks and articles about gaming, but nothing regarding programming. Is 3000 MHz CL16 more than enough, even for expensive calculations done on the CPU?

Obviously, RAM speed matters whenever you're doing something that is really dependent on RAM speed. That said, if you have to ask these kinds of questions, you're very unlikely to be doing anything where RAM speed would matter.

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.


15 minutes ago, Arktius said:

Hi guys, 

 

I'd like to know how important RAM speed is for programming. You can find dozens of benchmarks and articles about gaming, but nothing regarding programming. Is 3000 MHz CL16 more than enough, even for expensive calculations done on the CPU?

I'd worry much more about being cache-friendly if you need performance.


In general you find the performance weak spots by measuring and profiling. For most programming, optimizations are not something you try to write in from the get-go, since 90% of the optimizations you could make wouldn't make a difference. Obviously there are exceptions (HFT comes to mind), but in general you write first, then optimize AFTER figuring out what needs optimizing.

 

To drive the point home: you could know of an optimization you used in the past and apply it without measuring, only to find out later that the compiler has since gotten better at that same optimization, and that you actually made things slightly worse by not just letting the compiler do its job. When it comes to performance: measure, measure, measure.
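To make "measure" concrete: even a crude timer around the suspect code beats guessing. A minimal sketch (sum_vector here is just a made-up stand-in for whatever you suspect is slow; a real profiler will tell you far more):

#include <chrono>
#include <iostream>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the code you suspect is slow.
static long long sum_vector(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

int main() {
    std::vector<int> data(10'000'000, 1);

    auto start = std::chrono::steady_clock::now();
    long long result = sum_vector(data);
    auto stop = std::chrono::steady_clock::now();

    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    std::cout << "result=" << result << ", took " << ms.count() << " ms\n";
}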

Gaming build:

CPU: i7-7700k (5.0ghz, 1.312v)

GPU(s): Asus Strix 1080ti OC (~2063mhz)

Memory: 32GB (4x8) DDR4 G.Skill TridentZ RGB 3000mhz

Motherboard: Asus Prime z270-AR

PSU: Seasonic Prime Titanium 850W

Cooler: Custom water loop (420mm rad + 360mm rad)

Case: Be quiet! Dark base pro 900 (silver)
Primary storage: Samsung 960 evo m.2 SSD (500gb)

Secondary storage: Samsung 850 evo SSD (250gb)

 

Server build:

OS: Ubuntu server 16.04 LTS (though will probably upgrade to 17.04 for better ryzen support)

CPU: Ryzen R7 1700x

Memory: Ballistix Sport LT 16GB

Motherboard: Asrock B350 m4 pro

PSU: Corsair CX550M

Cooler: Cooler master hyper 212 evo

Storage: 2TB WD Red x1, 128gb OCZ SSD for OS

Case: HAF 932 adv

 


16 hours ago, Arktius said:

I'd like to know how important RAM speed is for programming. You can find dozens of benchmarks and articles about gaming, but nothing regarding programming. Is 3000 MHz CL16 more than enough, even for expensive calculations done on the CPU?

RAM speed is not very important at all. For nearly every problem, RAM speed will not be the bottleneck.

By far the biggest bottleneck is the amount of RAM available, but that only comes into play when dealing with very large amounts of data.

Warning, rant in spoiler:

Spoiler

Because some idiots in the past thought that a capacitor and a transistor would always be cheaper than six transistors, we are now pretty much permanently stuck with RAM that could never approach on-die memory speeds. That is just the way it is, and speed differences in DRAM of the same era (i.e. DDR4 to DDR4, DDR3 to DDR3) won't really be that noticeable.



The tactics that should be used to improve memory access performance are to decrease the number of memory accesses and, where possible, write things to be cache-oblivious. Of course, you can only determine whether there is actually a bottleneck by profiling. So, as @reniat said, "measure, measure, measure".
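To give a rough idea of what that looks like in practice, here's a purely illustrative sketch (not from any real codebase): both loops sum the same N x N matrix, but the first walks memory sequentially while the second jumps a full row's worth of bytes on every access, so it misses cache constantly.

#include <cstddef>
#include <vector>

constexpr std::size_t N = 4096;

// Assumes m holds an N x N matrix in row-major order (m.size() == N * N).

// Cache-friendly: consecutive iterations touch consecutive addresses.
long long sum_row_major(const std::vector<int>& m) {
    long long total = 0;
    for (std::size_t row = 0; row < N; ++row)
        for (std::size_t col = 0; col < N; ++col)
            total += m[row * N + col];
    return total;
}

// Cache-hostile: each iteration jumps N * sizeof(int) bytes,
// so almost every access is a cache miss.
long long sum_col_major(const std::vector<int>& m) {
    long long total = 0;
    for (std::size_t col = 0; col < N; ++col)
        for (std::size_t row = 0; row < N; ++row)
            total += m[row * N + col];
    return total;
}

On a typical desktop the column-major version is often several times slower, and no DDR4 kit will close that gap the way fixing the access pattern does.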

ENCRYPTION IS NOT A CRIME


15 minutes ago, straight_stewie said:

Because some idiots in the past thought that a capacitor and a transistor would always be cheaper than six transistors, we are now pretty much permanently stuck with RAM that could never approach on-die memory speeds.

AFAIK DRAM is still orders of magnitude cheaper than SRAM. Look at the amount of die space a handful of megabytes of L3 cache takes.

(Source: a random AnandTech article pulled from Google.)

 

Imagine what a couple gigabytes would look like and cost.

 

Edit: I also see no technical reason why we would be "stuck" with anything. If cheap SRAM technology were to be invented, there's nothing to stop Intel/AMD from releasing new CPUs/chipsets that would use SRAM memory sticks, totally transparent to the software.


48 minutes ago, Unimportant said:

AFAIK DRAM is still orders of magnitude cheaper than SRAM.

There are a few problems with the die shot you posted: that was Ivy Bridge and, if you look closely, you can see that much of the area taken up by the L3 cache is not actually memory, but memory controllers. That's because L3 cache, at least in Intel chips, usually has a relatively high number of access channels (usually four per section and four sections, so 16 access channels). RAM won't dedicate quite that much space to the memory controllers.

[Image: annotated die shot of a quad-core Kaby Lake chip]

 

This is a Kaby Lake chip. Much of what you see labelled "L3$" is actually called the "Side Cache" and consists of 64 MB of memory. By my math, the "Side Cache" takes roughly 14.8 mm² for 64 MB. That's roughly 237 mm² per gigabyte, which is about 1/3 as dense as modern DRAM offerings, which yield roughly 1 gigabyte per 70 mm².
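Spelling out that arithmetic, using the figures above:

\[
\frac{14.8\ \text{mm}^2}{64\ \text{MB}} \times 1024\ \frac{\text{MB}}{\text{GB}} \approx 237\ \frac{\text{mm}^2}{\text{GB}},
\qquad
\frac{237\ \text{mm}^2/\text{GB}}{70\ \text{mm}^2/\text{GB}} \approx 3.4
\]

i.e. by these numbers the SRAM side cache needs roughly 3.4 times the area per gigabyte that modern DRAM does.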

So I guess the conclusion of my analysis is that, at least in terms of die usage, SRAM is competitive with DRAM. Even more so when you factor in the performance improvement.

However, I can't find a reasonable cost analysis, because SRAM is in very low demand so its cost is inflated. I'm not an economist, so any price adjustments I could come up with to make an apples-to-apples comparison on cost would be complete nonsense.



TL;DR.  For general programming, capacity is far, far more important than bandwidth or latency.

 

For general cases, it doesn't really matter a huge amount. Latency in terms of time (not cycles) hasn't changed much over the last decade or so, and likely won't unless there's a breakthrough.

 

Bandwidth can be more important, but only if you're moving a lot of data, such as copying extremely large strings, large buffers, etc., and even then, to really maximize your usage you have to use SSE and/or AVX instructions to get the best performance.

Wendell actually has a video discussing memcpy and how to improve its performance.
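For illustration only, here's a rough sketch of what a wide-register copy loop looks like with AVX2 intrinsics (assumes AVX2 hardware and a compiler flag like -mavx2; in practice your library's memcpy is usually already vectorized, so measure before rolling your own):

#include <immintrin.h>
#include <cstddef>
#include <cstring>

// Copies n bytes from src to dst, 32 bytes at a time with AVX2,
// falling back to plain memcpy for the leftover tail.
void copy_avx2(void* dst, const void* src, std::size_t n) {
    auto*       d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);

    std::size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m256i chunk = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(s + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(d + i), chunk);
    }
    std::memcpy(d + i, s + i, n - i);  // remaining < 32 bytes
}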

 

Again though, to sum it up, overall capacity is generally more important than the bandwidth of the memory.  


6 hours ago, Erik Sieghart said:

Worry about it when it becomes relevant.

 

It's relevant when your best optimized algorithm still isn't good enough. Algorithms > hardware.

While you indeed should not be optimizing too much prematurely, *some* hardware behavior should be kept in mind from the very beginning as it can have a great influence on the overall design.

 

We once had a multi-threaded application that ran glacially slowly; it turned out to be due to false sharing. To solve the false sharing issue, certain parts of the application's data structure had to be modified. What is the data structure in OOP? Your object structure. So we had to radically change part of the object structure which, as you're probably well aware, took a lot of refactoring and time (money).
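For anyone wondering what false sharing actually looks like, here's a hypothetical sketch (not our real code): in the first struct the two counters share one 64-byte cache line, so two threads incrementing them keep stealing that line from each other; in the second, alignas pushes each counter onto its own line.

#include <atomic>
#include <thread>

// Both counters live on the same cache line: writes from different
// threads invalidate each other's copy of that line (false sharing).
struct SharedLine {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// Each counter gets its own 64-byte cache line.
struct SeparateLines {
    alignas(64) std::atomic<long> a{0};
    alignas(64) std::atomic<long> b{0};
};

template <typename Counters>
void hammer(Counters& c) {
    std::thread t1([&] { for (int i = 0; i < 10'000'000; ++i) c.a.fetch_add(1, std::memory_order_relaxed); });
    std::thread t2([&] { for (int i = 0; i < 10'000'000; ++i) c.b.fetch_add(1, std::memory_order_relaxed); });
    t1.join();
    t2.join();
}

Running hammer on a SharedLine instance is typically several times slower than on a SeparateLines instance, even though the two threads never touch the same variable.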

 

When certain decisions have to be right from the very beginning, because they're too costly to change afterward, then thinking about and making the right decisions from the start isn't premature. The hardware, mostly caching, affects certain such decisions.

 

 

 


17 hours ago, Unimportant said:

When certain decisions have to be right from the very beginning, because they're too costly to change afterward, then thinking about and making the right decisions from the start isn't premature.

Completely valid; however, I'd argue that kind of decision making is less "writing code" and more "architecture design", and if you are doing that work you are PROBABLY not gonna need to ask that question on the LTT forum lol.

 

But it is good to point out that making those considerations early has its place; it's just not something that most newer developers are gonna find themselves needing to do.


20 hours ago, reniat said:

Completely valid; however, I'd argue that kind of decision making is less "writing code" and more "architecture design", and if you are doing that work you are PROBABLY not gonna need to ask that question on the LTT forum lol.

 

But it is good to point out that making those considerations early has its place; it's just not something that most newer developers are gonna find themselves needing to do.

I have to argue that hobby programmers would often need to make those types of architectural decisions, as they don't have the benefit of their employer having hired people specifically to architect for them.



35 minutes ago, straight_stewie said:

I have to argue that hobby programmers would often need to make those types of architectural decisions, as they don't have the benefit of their employer having hired people specifically to architect for them

Fair, but the only time you need to do this level of optimization during architecting is if you have a fairly strict performance requirement up front and can design around those requirements. For most general purpose software, regardless of team size, spending tons of time worrying about super specific performance considerations usually isn't necessary. I'm not saying there aren't times, but it's not something you do on every software project "just because".


Thanks for your comments. I've run out of RAM a couple of times and wanted to know whether the speed plays any role. One last question: is there a tool to monitor RAM usage? Again, I'm interested in the activity/speed rather than the amount of space being used.


4 hours ago, Arktius said:

Thanks for your comments. I've run out of RAM a couple of times and wanted to know whether the speed plays any role. One last question: is there a tool to monitor RAM usage? Again, I'm interested in the activity/speed rather than the amount of space being used.

There are tons of tools. None are perfect, as it's nearly impossible to pinpoint exactly what's responsible.

 

Personally I use Process Explorer (free from the Microsoft website) when I'm dealing with overall system integration on Windows. It's good for finding which application in a large deployment is causing the problem (if there is one, obviously).

 

Redgate has nice options for specific languages, especially for seeing what objects remain in memory that haven't been released, or at least how much memory is being committed or paged.

 

Those are the main two I use.


4 hours ago, Arktius said:

Is there a tool to monitor the RAM usage?

If you're talking about your specific application, then this is usually done via profiling. This is basically just an extension of a debugging environment where the system running your application tracks resource usage. Visual Studio's debugger does a lot of profiling out of the box. For example, you can see the resources in real time when debugging: https://i.imgur.com/P1YjMMT.mp4
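If you'd rather get numbers from inside your own program than from the debugger, the Win32 API can report things like working set and commit charge. A minimal Windows-only sketch (link against psapi.lib); note this reports how much memory is in use, not how hard the memory bus is being hit — for actual bandwidth/activity you need hardware counters via something like Intel VTune or Windows Performance Analyzer:

#include <windows.h>
#include <psapi.h>
#include <iostream>

int main() {
    PROCESS_MEMORY_COUNTERS pmc{};
    // Query memory statistics for the current process.
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
        std::cout << "Working set:      " << pmc.WorkingSetSize / 1024 << " KiB\n";
        std::cout << "Peak working set: " << pmc.PeakWorkingSetSize / 1024 << " KiB\n";
        std::cout << "Pagefile usage:   " << pmc.PagefileUsage / 1024 << " KiB\n";
    }
}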

 

There are tons of profiling tools out there for different environments though; it's not just a VS thing by any means.


On 4/25/2019 at 7:19 AM, straight_stewie said:

RAM speed is not very important at all. For nearly every problem, RAM speed will not be the bottleneck.

This is entirely dependent on the workload. I’m not even quite sure you can suggest “nearly every” problem is not memory bound in some way. 

15" MBP TB

AMD 5800X | Gigabyte Aorus Master | EVGA 2060 KO Ultra | Define 7 || Blade Server: Intel 3570k | GD65 | Corsair C70 | 13TB


2 hours ago, Blade of Grass said:

I’m not even quite sure you can suggest “nearly every” problem is not memory bound in some way. 

I guess I should have phrased my statement to say that "within any given generation of RAM, differences in RAM 'speed' have a negligible impact on program performance, in the context of small desktop applications". That statement is even supported by LMG's own tests comparing game performance across different sets of RAM.

 

Nearly every problem is "memory bound" in the sense that the very definition of a computer involves the statement "given an infinitely long tape of cells...", or in other words, Alan Turing's definition of a computer assumes that infinite memory is available.

My original statement was meant to say that the amount of memory available is of far more importance than the speed of the memory. In other words, don't choose an 8GB kit over a 16GB kit because it has a lower CAS latency.

 



17 hours ago, straight_stewie said:

I guess I should have phrased my statement to say that "within any given generation of RAM, differences in RAM 'speed' have a negligible impact on program performance, in the context of small desktop applications". That statement is even supported by LMG's own tests comparing game performance across different sets of RAM.

 

Nearly every problem is "memory bound" in the sense that the very definition of a computer involves the statement "given an infinitely long tape of cells...", or in other words, Alan Turing's definition of a computer assumes that infinite memory is available.

My original statement was meant to say that the amount of memory available is of far more importance than the speed of the memory. In other words, don't choose an 8GB kit over a 16GB kit because it has a lower CAS latency.

 

I get what you mean now.

 

A few things though. 

 

Memory speed is not just about CAS latency, but is also affected by the processor architecture. 

 

You've conflated Turing machines with computers. They are not the same. Turing machines are purely theoretical machines used in computing/complexity theory. As such, the definition of a computer contains no such statement.

 

And realistically, most algorithms are not even memory bound (i.e. have a high space complexity); instead, I would argue most are time bound (high time complexity), which is reflected in the volume of material on time vs. space complexity of algorithms.



4 hours ago, Blade of Grass said:

 

 

And realistically, most algorithms are not even memory bound (i.e. have a high space complexity); instead, I would argue most are time bound (high time complexity), which is reflected in the volume of material on time vs. space complexity of algorithms.

There are algorithms that trade space for speed though, e.g. anything that relies on a lookup table.
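A classic small example of that trade (illustrative only; in real code you'd just use a compiler builtin or std::popcount): precompute the bit count of every possible byte once, then answer popcount queries with table lookups instead of recomputing.

#include <array>
#include <cstdint>

// Build a 256-entry table holding the number of set bits in each byte value.
static std::array<std::uint8_t, 256> make_table() {
    std::array<std::uint8_t, 256> t{};
    for (int i = 0; i < 256; ++i) {
        int bits = 0;
        for (int v = i; v != 0; v >>= 1) bits += v & 1;
        t[i] = static_cast<std::uint8_t>(bits);
    }
    return t;
}

static const std::array<std::uint8_t, 256> kBitsPerByte = make_table();

// Popcount of a 32-bit word via four table lookups: spends 256 bytes of
// memory up front to avoid re-counting bits on every call.
int popcount32(std::uint32_t x) {
    return kBitsPerByte[x & 0xFF] + kBitsPerByte[(x >> 8) & 0xFF] +
           kBitsPerByte[(x >> 16) & 0xFF] + kBitsPerByte[(x >> 24) & 0xFF];
}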

Sudo make me a sandwich 


7 hours ago, wasab said:

There are algorithms that trade space for speed though, e.g. anything that relies on a lookup table.

Sure, but we still do not necessarily call them space bound. Few algorithms have space as a realistic limit to their operation, given a modern computer. 


