
Labs External Data

Following on from the discussion on tonight's WAN show, I have an idea.

 

Essentially: package up the Labs autobench software so that it can run on volunteer machines while they're idle (similar to BOINC) and upload the results, along with hardware info etc., to Labs servers for collation.

 

This removes human error, and the service could automatically account for confounding factors (like background tasks).
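To make the idea concrete, each volunteer client could upload a small self-describing record. A minimal sketch in Python (all field names here are hypothetical, not anything Labs has announced):

```python
from dataclasses import dataclass, asdict, field
import json
import platform

@dataclass
class BenchmarkReport:
    """Hypothetical shape of one volunteer-submitted result."""
    benchmark: str            # which autobench scenario ran
    score: float
    cpu: str
    os: str = field(default_factory=platform.platform)
    background_load_pct: float = 0.0  # CPU load sampled before the run

def to_upload_payload(report: BenchmarkReport) -> str:
    """Serialize deterministically so the server can de-duplicate."""
    return json.dumps(asdict(report), sort_keys=True)
```

The server side would then collate these by hardware configuration.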

 

Thoughts?


Just to help others find it, the short discussion takes place around the 1:20 mark (no timestamp yet, because the show is ongoing). It focuses on reviews, not benchmarks.

 

1 minute ago, ScottishTapWater said:

Essentially: package up the Labs autobench software so that it can run on volunteer machines while they're idle (similar to BOINC) and upload the results, along with hardware info etc., to Labs servers for collation.

The biggest issue is: how do you validate these results and make sure someone isn't trying to influence the numbers in their favor by submitting skewed results? The whole reason to have a lab in the first place is to measure things in a well-known environment in a repeatable fashion. Otherwise you just end up with UserBenchmark 2.0.

 

I'm also not sure how you would automatically account for background tasks. There's no known "task x reduces performance by y%" factor that you can subtract. At best you can measure with and without a given background task to arrive at an average performance loss per hardware type. Or refuse to benchmark when something is running that's not part of Windows. But there would be lots of variables to take into account, making it hard to transfer results from one system to another.
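For what it's worth, the "refuse to benchmark" option is the easy part. A minimal sketch (the allowlist contents are purely illustrative, not a real baseline of a clean Windows install; a real client would enumerate live processes with something like psutil):

```python
# Illustrative allowlist -- not a real baseline of a clean Windows install.
ALLOWLIST = {"system", "svchost.exe", "csrss.exe", "explorer.exe"}

def foreign_processes(running, allowlist=ALLOWLIST):
    """Return running process names that are not on the allowlist."""
    return sorted({name.lower() for name in running} - allowlist)

def should_run_benchmark(running):
    """Only benchmark when nothing unexpected is running."""
    return not foreign_processes(running)
```

The hard part, as noted, is deciding what belongs on that allowlist across thousands of different machines.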



Just now, Eigenvektor said:

Just to help others find it, the short discussion takes place around the 1:20 mark (no timestamp yet, because the show is ongoing). It focuses on reviews, not benchmarks.

 

The biggest issue is: how do you validate these results and make sure someone isn't trying to influence the numbers in their favor by submitting skewed results? The whole reason to have a lab in the first place is to measure things in a well-known environment in a repeatable fashion. Otherwise you just end up with UserBenchmark 2.0.

 

I'm also not sure how you would automatically account for background tasks. There's no known "task x reduces performance by y%" factor that you can subtract. At best you can measure with and without a given background task to arrive at an average performance loss per hardware type. Or refuse to benchmark when something is running that's not part of Windows. But there would be lots of variables to take into account, making it hard to transfer results from one system to another.

You validate the results by ensuring that the application has some sort of self-integrity check. Honestly, while you can never fully trust something running on a client machine, that's probably enough. Seriously, who's going to care enough to hack their way around it just to submit a benchmark result that will be immediately discounted as an outlier?

 

As for the rest, you just don't run the tests on any machine with more than a trivial amount of background processing going on, and everything else should average out with enough data.
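The "discarded as an outlier" part can be made concrete with a robust filter. A sketch using the median absolute deviation (the threshold is an arbitrary placeholder, not a tuned value):

```python
import statistics

def discard_outliers(scores, k=3.5):
    """Keep scores within k robust (MAD-based) deviations of the median."""
    med = statistics.median(scores)
    mad = statistics.median(abs(s - med) for s in scores)
    if mad == 0:
        # Degenerate case: more than half the scores are identical.
        return list(scores)
    # 1.4826 scales MAD to the standard deviation for normal data.
    return [s for s in scores if abs(s - med) / (1.4826 * mad) <= k]
```

A single faked score of 500 against a cluster near 100 gets dropped before aggregation, no matter who submitted it.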


Just now, ScottishTapWater said:

You validate the results by ensuring that the application has some sort of self-integrity check.

That's a very non-trivial task. As you said, you can't trust stuff running on a client machine, so it's hard to ensure the software hasn't been manipulated. Besides, who says I'm running your application in the first place and I'm not simply talking to your API directly?



44 minutes ago, Eigenvektor said:

That's a very non-trivial task. As you said, you can't trust stuff running on a client machine, so it's hard to ensure the software hasn't been manipulated. Besides, who says I'm running your application in the first place and I'm not simply talking to your API directly?

It's non-trivial, but it's not impossible to do a good enough job of it either. Anti-cheat systems have been doing it for years. There's less motive here to work around it too than there would be for a game cheat, so you just need to make it non-trivial, rather than impossible.

 

At the end of the day, the only people who would want to bypass such a system are hardware manufacturers, and the blowback if they were caught doing something like that would be so catastrophically brand-destroying (not to mention the legal ramifications) that I doubt they'd try it.
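To make "non-trivial rather than impossible" concrete: a result-signing sketch. The catch, as raised above, is that the key has to ship inside the client, so a determined attacker can extract it; it raises the bar rather than closing the door. Everything here is illustrative.

```python
import hashlib
import hmac
import json

# Hypothetical key. In practice it would be embedded (obfuscated) in the
# client binary, which is exactly why this is a deterrent, not a guarantee.
EMBEDDED_KEY = b"not-actually-secret"

def sign_result(result: dict) -> str:
    """Sign a canonical serialization of the result."""
    payload = json.dumps(result, sort_keys=True).encode()
    return hmac.new(EMBEDDED_KEY, payload, hashlib.sha256).hexdigest()

def verify_result(result: dict, signature: str) -> bool:
    """Server-side check that the payload wasn't altered after signing."""
    return hmac.compare_digest(sign_result(result), signature)
```

Anyone editing the score after the client signed it fails verification; anyone who extracts the key can forge at will.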


If anything, hardware vendors have more motivation and resources to cheat than your average gamer looking to increase their Battlefield score or whatever. There's a lot of potential money on the table. But you're correct that getting caught would be very bad, so there is some natural deterrent.

 

But no matter how big that risk actually is, you're still spending resources on making the system tamper-proof that would be better spent on improving the lab itself. The primary reason I would trust lab results is precisely that they are measured by professionals rather than user-submitted.



Sure, but you're never going to achieve the same breadth of configurations or a significant sample size. Not to mention that nothing ever runs the same in the real world as it does under lab conditions.

 

I think there's a case to be made for both. You could potentially have the "lab score" and the "crowd-sourced score". It would also be a useful sanity check for the lab... If their scores significantly differ from real world scores, then maybe their methodology needs looking at again.
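That sanity check could be as simple as comparing the lab score against the crowd-sourced median (the 10% tolerance here is an arbitrary placeholder):

```python
import statistics

def methodology_flag(lab_score, crowd_scores, tolerance=0.10):
    """True when the crowd-sourced median deviates from the lab score
    by more than the given relative tolerance."""
    crowd_median = statistics.median(crowd_scores)
    return abs(crowd_median - lab_score) / lab_score > tolerance
```

A flag wouldn't mean the lab is wrong, just that the two populations disagree enough to be worth investigating.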

 

I'm also not sure the engineering effort involved would be massive; there are fairly standardised solutions for accomplishing this sort of thing. That does somewhat rely on the assumption that the existing system has been designed in a way that it could fairly easily be packaged up, which may well not be the case.


Isn't the whole idea of building a specially equipped lab to test stuff in a very controlled environment?

 

So... what's the point if it's just going to use data from tests done by strangers whose skill and integrity we know nothing about, performed in environments we know nothing about?

 

10 hours ago, ScottishTapWater said:

It's non-trivial, but it's not impossible to do a good enough job of it either. Anti-cheat systems have been doing it for years.

And people already hate anti-cheat systems as it is.

 

Quote

At the end of the day, the only people who would want to bypass such a system are hardware manufacturers, and the blowback if they were caught doing something like that would be so catastrophically brand-destroying (not to mention the legal ramifications) that I doubt they'd try it.

And some extreme fanboys or groups, or bored crackers.
Or some bored kids with a weird attachment to e-peen.



Yes, let's hand a corporation free data of questionable value so they can monetize it.


6 hours ago, Poinkachu said:

Isn't the whole idea of building a specially equipped lab to test stuff in a very controlled environment?

 

So... what's the point if it's just going to use data from tests done by strangers whose skill and integrity we know nothing about, performed in environments we know nothing about?

 

And people already hate anti-cheat systems as it is.

 

And some extreme fanboys or groups, or bored crackers.
Or some bored kids with a weird attachment to e-peen.

To take your points one at a time:

 

Sure, but real world data is important too.

 

The idea would be to automate it, no skill required beyond installing it.

 

People hate anti-cheat because it affects their ability to play their games. That wouldn't really apply here, so I don't see why there would be an objection.

 

An extreme fanboy is unlikely to have the resources to generate enough fake results that they couldn't simply be discarded as outliers.


2 hours ago, Avocado Diaboli said:

Yes, let's hand a corporation free data of questionable value so they can monetize it.

You literally do that all the time you're on the internet, because you get something convenient or useful in return... I'd argue the Labs website will be far more useful to most of us than what most companies give in exchange for our data.

 

Plus, it would be voluntary, not like you'd have to contribute if you didn't want to.


So you want Labs to become... UserBenchmark?



I'm sure if they ever release a public version of MarkBench, it will have a fan database, just like the one we have for Cinebench. But as noted, those results couldn't be reviewed at the same level as in-house tests. Too many variables that can't be controlled.


