Blog Comments posted by Mira Yurizaki

  1. 5 hours ago, Arika S said:

    Current top 3 video games? 

    I can say the top one right now is Final Fantasy XIV.

     

    The other two seem kind of out there. There are a lot of games I've enjoyed, but it's hard to pick which ones I'd gush over again. But if you were to make me pick two, they'd have to be Chrono Trigger and Secret of Mana.

     

    5 hours ago, Arika S said:

    Do you have a dream you're working towards? 

    Settling down somewhere with a house, maybe a partner. I probably won't move from where I'm at in general though.

  2. I don't know if anyone keeps up with this, but another solution I thought of is keeping two sets of physics routines: one that's purely cosmetic and one that actually affects gameplay. Things like cloth, hair, and cinematic animation would go to the cosmetic simulation, while entity interaction would go to the gameplay simulation. The cosmetic simulation can run alongside the graphics routine and run as fast as possible, with the time delta simply being the last frame time, while the gameplay simulation runs with the game logic at a fixed interval to keep things simple.
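
    As a rough sketch of what I mean (all names here are mine, not from any particular engine), the gameplay simulation can use a fixed-timestep accumulator while the cosmetic one just takes the raw frame delta:

      import time

      # Hypothetical loop: gameplay physics at a fixed interval, cosmetic physics per frame.
      GAMEPLAY_DT = 1.0 / 60.0  # fixed tick for the simulation that affects gameplay

      def run_game_loop(update_gameplay_physics, update_cosmetic_physics, render):
          accumulator = 0.0
          previous = time.perf_counter()
          while True:
              now = time.perf_counter()
              frame_dt = now - previous
              previous = now

              # Gameplay simulation: fixed interval, may run 0..N times per rendered frame.
              accumulator += frame_dt
              while accumulator >= GAMEPLAY_DT:
                  update_gameplay_physics(GAMEPLAY_DT)
                  accumulator -= GAMEPLAY_DT

              # Cosmetic simulation (cloth, hair, cinematics): purely visual, runs as
              # fast as the frame rate allows, with the last frame time as its delta.
              update_cosmetic_physics(frame_dt)
              render()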

  3. The unicorn is getting a fast enough bus, but at the moment it's largely just that: a unicorn.

     

    The other question is how much of a benefit you're truly getting from this from a manufacturing standpoint. If we pull up some numbers on Wikipedia, we can find the following stats:

    • GT 1030
      • 384 SPUs, 24 TMUs, and 16 ROPs
      • 1.8 billion transistors
      • 74 mm^2 die
    • GTX 1050 Ti
      • 768 SPUs, 48 TMUs, and 16 ROPs
      • 3.3 billion transistors
      • 132 mm^2 die

    Even though the GTX 1050 Ti is basically double the GT 1030, the GTX 1050 Ti is the more efficient design, since it uses fewer transistors and less die space than straight doubling would suggest (3.3 billion vs. 3.6 billion transistors, 132 mm^2 vs. 148 mm^2). Also note that nothing else is different between the two: they were both designed with the same external bus and memory type. My math may be rough here, but from the same amount of silicon that gives you 66 GT 1030s, you can make 37 GTX 1050 Tis. Since two GT 1030s glued together stand in for one 1050 Ti, those 66 small dies make 33 pairs, so you can lose 4 of the 1050 Tis and still come out even, which works out to an 89% break-even yield. We also can't assume the GT 1030 has a 100% yield rate, so for the sake of simplicity say it too only yields 89%. That means only about 58 of the small dies (29 pairs) are good, which further increases the number of bad 1050 Tis you can tolerate before dropping below the break-even point. In other words, the GTX 1050 Ti can have a yield as low as roughly 78% before it stops making sense to build it (as much) instead of gluing two GT 1030s together.
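
    If you want to sanity check that arithmetic, here's the rough calculation (numbers from the Wikipedia stats above, treating two GT 1030s as one "glued" 1050 Ti):

      # Break-even sketch: same silicon area, small dies vs. double-size dies.
      gt1030_area = 74       # mm^2 per GT 1030 die
      gtx1050ti_area = 132   # mm^2 per GTX 1050 Ti die

      material = 66 * gt1030_area            # area that yields 66 GT 1030 dies (4,884 mm^2)
      ti_dies = material // gtx1050ti_area   # ~37 GTX 1050 Ti dies from the same area

      pairs = 66 // 2                        # 33 "glued" pairs of GT 1030s
      print(ti_dies - pairs)                 # 4 spare 1050 Ti dies before falling behind
      print(pairs / ti_dies)                 # ~0.89 -> the 89% break-even yield

      # If the GT 1030 itself also only yields 89%, only ~58 small dies (29 pairs) are good,
      # so the 1050 Ti can yield as low as ~78% and still break even.
      good_small = int(66 * 0.89)            # ~58 good GT 1030 dies
      print((good_small // 2) / ti_dies)     # ~0.78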

     

    We also have to consider what I mentioned in the blog post: a GPU for gaming is working on a time-sensitive task. Benchmarking something like POV-Ray or x264 on a CPU is fine because we don't really care about the order in which the final output is assembled (more or less), nor how long it takes (though the faster the better). On a GPU, the order in which the final output is assembled does matter, and we do care how long it takes to get something done. I'm not quite sure how much introducing latency between chips would affect overall graphics performance, and the only thing SLI shares is frame buffer data (I'm not sure what NVLink shares).

     

    But overall, until we solve the two biggest issues plaguing multi-GPU setups, namely that memory pools don't combine and that workload distribution is hard, I don't think chiplets will be anything more than a fancier way of doing multi-GPU setups.

  4. A note about this part of the blog:

    Quote

    So while adding more transistors per CPU core hasn't always been viable...

    What this means is that in GPU land, you can get away with simply duplicating your basic execution units. In AMD terms, this is a stream processor. In NVIDIA terms, a CUDA core.

     

    In CPUs, you can't just duplicate the basic execution units, which are the ALU, AGU, and FPU, and expect a linear improvement in performance. Most of the per-core transistor count increase in CPUs over time is likely due to adding unrelated or semi-related features like SIMD processing.

  5. 2 hours ago, dwang040 said:

    If the game engine is only capable of producing and updating at 60 Hz, sure, we can say that there is a "reason" to cap. But if that were the case, is a cap really necessary?

    No, but I wouldn't see a reason to have it uncapped either other than for bragging rights.

     

    Quote

    Hmm, I'm not particularly saying that a system that is incapable of producing 60+ fps would suffer a penalty because it's incapable of reaching more than 60 fps, and correct me if I misread your comment, but it sounds as if you're implying that systems that are capable of running the game at 100+ fps will have an advantage because they can produce more fps, thus forcing the engine to run faster? If that is the case, what about those systems that can only run the game at 30-45 fps? I can't say I remember seeing a lot of people saying that their game runs slow cause they couldn't reach 60 fps?

    It's about how often the system can run the game logic. I'm under the belief that most game engines run their logic at the same rate regardless of what processor you throw at them. The frame rate you get at the end reflects how much time is left over for the processor to spend sending render commands to the GPU. So if a game runs at 60 Hz, then every 16 or so milliseconds it'll run the logic. If it can complete this within, say, 500 microseconds, that leaves the CPU 15.5 milliseconds to compile and send GPU commands.

     

    But otherwise, yes, I'm implying that it may be advantageous to be able to run the logic more or less often.

     

    Quote

    For me personally, I question if the engine is scaling based on the fps cap. We know that increasing the cap will increase the engine speed, but what about decreasing the cap to 30 fps (Probably someone tried it out, but I couldn't find any info)? Would that slow down the game to half speed?

    No, because again my assumption is that the game will always run its logic at a consistent rate, regardless of the FPS it can spit out. In this case, if you get 30 FPS and the game logic runs at, say, 60 Hz, the game has already processed two logic cycles by the time you receive a frame.
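
    A tiny illustration of that, assuming a fixed 60 Hz logic rate:

      # How many logic ticks fit into one rendered frame at various frame rates?
      LOGIC_HZ = 60
      for fps in (120, 60, 30):
          ticks_per_frame = (1.0 / fps) / (1.0 / LOGIC_HZ)
          print(f"{fps} FPS -> {ticks_per_frame:g} logic tick(s) per rendered frame")
      # 120 FPS -> 0.5 (a tick every other frame), 60 FPS -> 1, 30 FPS -> 2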

     

    Just remember, graphics is a visual representation of the state of the world. As such, it's the last thing that gets done in game. I think a good video that "explains" this is here:

     

    (I say "explains" because it's an intermediate level video, the presenter doesn't explain most of the terms he uses)

  6. While it's easy to claim it's awful design to have things revolve around the frame rate, I would argue that, on the other end of the spectrum, it may not be useful to have a frame rate that exceeds how fast the game world runs. If the game world only updates at 60 Hz, there's no point in exceeding 60 FPS, because the graphics is a visual representation of the current state of the game world; you would just have extra frames rendering the same thing. Maybe if the graphics engine were fancy enough it would render in-between frames of animation, but those wouldn't really count for anything.

     

    Of course, you could also ask why developers won't allow the game world to run faster or slower. Because that would create an inconsistent experience between lower-end and higher-end systems. Imagine being able to cheat by running the game world at a lower rate: you could effectively phase through matter.
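
    To make that concrete, here's a toy sketch (made-up numbers and a deliberately naive collision check) of how a bigger world timestep lets a fast object skip over thin geometry:

      # Naive movement: position += velocity * dt, collision checked only at the end point.
      def phases_through(x, velocity, dt, wall_x=10.0, wall_thickness=0.5):
          new_x = x + velocity * dt
          hit = wall_x <= new_x <= wall_x + wall_thickness
          return new_x > wall_x + wall_thickness and not hit

      print(phases_through(9.8, 30.0, 1 / 60))  # False: a 60 Hz tick lands inside the wall, so the hit is caught
      print(phases_through(9.8, 30.0, 1 / 20))  # True: a 20 Hz tick steps clean over the wall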

  7. 41 minutes ago, CarnageTR said:

    SLI bottlenecks communication traffic. Linus have a video about it. NVlink might be solution.

    NVLink is still slow compared to VRAM bandwidth. Considering that GPUs are memory-bandwidth sensitive, I don't believe even running the links at VRAM bandwidth would solve everything, since memory-sensitive applications have issues in NUMA-based systems.

  8. It can, but developers who think their app deserves all of the CPU time in the world will never send a signal to say the thread is idling.

     

    EDIT: Though some OSes do track process utilization as a means to tell what the process is doing. I believe MINIX can use this to judge if a process is stuck in a forever loop and lower its priority automatically until it just dies.
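
    For what it's worth, the difference is basically this (a minimal sketch, not tied to any particular OS):

      import threading

      ready = threading.Event()

      def greedy_worker():
          # Never signals that it's idle: the scheduler just sees 100% utilization,
          # even though no useful work is being done.
          while not ready.is_set():
              pass  # busy-wait

      def polite_worker():
          # Blocking here yields the CPU, which is effectively the "I'm idling" signal
          # the scheduler can act on.
          ready.wait()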

  9. It was being handled that way on the application side, in that incoming ACK requests would immediately cause the state machine to go back to "Tx waiting" and never really get to "Waiting for ACK."

     

    Basically, I believe the solution was more or less to put sending ACKs out on the same "priority" as retrying the message. It's important to send ACKs out, but it's equally important to retry the message.
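
    In other words, something along these lines (hypothetical names, just to show the idea of one shared queue):

      import collections
      import time

      outgoing = collections.deque()  # ACKs and retries share one FIFO, so neither starves the other

      def queue_ack(seq_no):
          outgoing.append(("ACK", seq_no))

      def queue_retry(message):
          outgoing.append(("RETRY", message))

      def tx_loop(send):
          while True:
              if outgoing:
                  kind, payload = outgoing.popleft()  # sent strictly in arrival order
                  send(kind, payload)
              else:
                  time.sleep(0.001)  # idle briefly instead of spinning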

  10. As a note, the only potential issue I've seen related to moving from HDDs to SSDs is that partitions created on older HDDs with 512-byte or 512e sectors may not be 4K aligned, and misaligned partitions cause performance issues on SSDs, since SSDs have pretty much been using 4K pages since whenever. But practically all HDDs these days use 4K sectors with aligned partitions, so that shouldn't be a problem. But I'll check on that anyway.
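
    Checking alignment is simple enough, at least in principle (made-up offsets below): a partition is fine on an SSD if its starting byte offset is a multiple of 4096.

      def is_4k_aligned(start_offset_bytes: int) -> bool:
          return start_offset_bytes % 4096 == 0

      print(is_4k_aligned(1_048_576))  # True: 1 MiB offset, the modern partitioning default
      print(is_4k_aligned(32_256))     # False: the old 63-sector (31.5 KiB) offset from legacy tools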

  11. As an explanation for the choice of benchmarks...

    • The program must have an in-game benchmark, because I wanted as little influence as possible from outside programs on the results, even if that influence is negligible.
    • 3DMark was chosen because well, it's 3DMark :D
    • Unigine Heaven was chosen because it's still a popular DX11 benchmark
    • FFXIV was chosen mostly because it's what I'm playing right now :3  However it does not test network capabilities.
    • GTAV was chosen mostly because it's still a popular benchmark
    • Deus Ex: Mankind Divided was chosen due to it being rather stressful on cards
    • F1 2016 due to being relatively CPU intensive since the benchmark simulates all ~24 drivers.
  12. 8GB of shared RAM, and the only slides I can find from developers who showed the memory usage (look up the Killzone Shadow Fall post-mortem) posit that the main system takes 3GB for itself, leaving 5GB total for games. 3.5GB of that was used for the GPU and the rest for the game itself.

     

    As far as I know, the PS4 Pro did not increase the memory capacity. However it can render 4K using a tiled approach which requires less bandwidth.

     

    Also, as far as I know, the CPU is still a Jaguar CPU, just with a clock speed bump. Jaguar is a netbook-class architecture, so a desktop-class architecture should be more than enough to make up for any deficiencies that doubling the clock speed but halving the core count might cause. Also, you can't buy dual-socket AM1 boards.

  13. And therein lies the problem with this discussion: there are too many open variables that people will skew to make their side look better. For example, you mentioned the second-hand market. Guess what console buyers can do? Buy second hand if they want to. Which is why I set some strict, specific guidelines on my comparison.

     

    If you're going to start price comparing, you have to tie up as many open-ended variables as possible. Otherwise your biases will start creeping in and your argument will be weak.
