
Labs Transparency Thread

LMGcommunity

As promised, here is the thread where you can post your transparency suggestions for Labs. Here are some examples of things we'd like to see feedback about:

  • Ideas in our benchmarking.
  • What you'd like to see regarding test system information.
  • Benchmark details.
  • Testbed variations.

And, of course, any other transparency-related feedback is welcome, but similar to the Sponsor Complaints subforum, any off-topic comments will be subject to moderation.

 

Thank you for wanting to help us improve the way we do things here!


I'd love to see an in-depth video (or video series) detailing every benchmarking setup and the benchmarking processes. After that you would be able to refer/link to those videos whenever you use a specific benchmark in later videos. It would be an interesting way to show everyone "how the sausage is made" as Gary put it.


2 minutes ago, RoyalSpud said:

I'd love to see an in-depth video (or video series) detailing every benchmarking setup and the benchmarking processes. After that you would be able to refer/link to those videos whenever you use a specific benchmark in later videos. It would be an interesting way to show everyone "how the sausage is made" as Gary put it.

What if the setup or process changes? Would you expect a new video every time? A living document (maybe with shorter videos embedded) with extensive written detail feels more appropriate for this.



Just now, PocketNerd said:

What if the setup or process changes? Would you expect a new video every time? A living document (maybe with shorter videos embedded) with extensive written detail feels more appropriate for this.

You might be right in the longer term; unlisted (shorter) videos might be better. But a detailed overview of the testing setup and processes might be a good starting point to get initial feedback from the community.


Maybe employ Steve so you can get it right 100% of the time like he does👹

 

Seriously though, while I'm sure the community appreciates being asked, the team surely has more than enough experience and knowledge to get things right and to a high standard. Might be worth looking at the likes of Which? to see how they display the results of their testing.


23 minutes ago, RoyalSpud said:

I'd love to see an in-depth video (or video series) detailing every benchmarking setup and the benchmarking processes. After that you would be able to refer/link to those videos whenever you use a specific benchmark in later videos. It would be an interesting way to show everyone "how the sausage is made" as Gary put it.

I have to agree with another reply who said this could get complex with updates because tests will have to be revamped every time there's an architecture change or feature change. 

What I would suggest is a dedicated Labs channel (so it doesn't mess with any other LMG channel's metrics), and on that channel post much more informal, routine updates from the engineers working on the different tests, where they give overviews of what they're doing. Just assign each engineer who manages a test procedure, machine, or system an hour a week to do a quick ~15-minute recording covering their particular test. No opinions, no comparisons to other outlets (anything like that should probably be a formally reviewed LTT video), just a technical touchpoint. Maybe the first videos will need to be a bit longer than 15 minutes, but the following ones should be able to stay relatively short.


I found a link to watch the apology video without giving the YouTube upload a view (I cannot believe it was monetized), and saw the statement that there would be a video describing the process, or the details of the workflow, on Floatplane. Given what has happened, paywalling the transparency behind the processes that feed videos to YouTube suggests that no real change of mindset has occurred.

 

If you want to gain the respect and transparency that you allude to, not only should the processes be documented well, but they should be available to anyone who wants them, not just those who pay for FP.

 

Off the topic of transparency: I don't know exactly when Labs started producing output, but I recall people talking years ago about prototypes of keyboard key-pressing machines, soundproof chambers, PSU testers and the benchmark suite being built, yet I just don't see any results. Is this a case of everyone having a pet project and wanting to get to some kind of big result? The website doesn't appear to be available yet. It feels like someone needs to take charge and actually produce some results; I can't believe that the only output so far is effectively part of the controversy.


A publicly accessible database/website that displays the data you are generating, showing software versions, HWiNFO captures and, obviously, scores from the games and tests.
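To make the request concrete, here is a rough sketch of what one entry in such a database could look like; every field name here is purely illustrative, not something Labs has committed to:

```python
from dataclasses import dataclass

@dataclass
class BenchmarkRun:
    """One hypothetical row in a public Labs results database."""
    product: str                 # e.g. "Example GPU"
    test_name: str               # e.g. "Example Game, 1440p Ultra preset"
    score: float                 # average FPS, points, watts, ...
    unit: str                    # "fps", "pts", "W", ...
    driver_version: str          # GPU/chipset driver used for the run
    os_build: str                # OS build or kernel version
    app_version: str             # patch level of the software under test
    testbed_id: str              # which physical bench produced the number
    hwinfo_report_url: str = ""  # link to the full HWiNFO capture for the run
    notes: str = ""              # anything unusual about the run

run = BenchmarkRun(
    product="Example GPU",
    test_name="Example Game, 1440p Ultra preset",
    score=123.4,
    unit="fps",
    driver_version="536.99",
    os_build="22621.2134",
    app_version="1.63",
    testbed_id="gpu-bench-01",
)
print(run)
```

Even a flat table like this, exported as CSV or JSON, would let readers re-check any number shown in a video.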

 

 


19 hours ago, LMGcommunity said:

As promised, here is the thread where you can post your transparency suggestions for Labs. Here are some examples of things we'd like to see feedback about:

  • Ideas in our benchmarking.
  • What you'd like to see regarding test system information.
  • Benchmark details.
  • Testbed variations.

And, of course, any other transparency-related feedback is welcome.

 

Thank you for wanting to help us improve the way we do things here!

I don't mind hearing erroneous data from goofy LTT staff (especially Linus), but it should be corrected verbally in the same video. That is how facts are mixed with entertainment in scriptwriting.


1 hour ago, RoyalSpud said:

I'd love to see an in-depth video (or video series) detailing every benchmarking setup and the benchmarking processes.

 

1 hour ago, mMontana said:

IMVHO it's the other way around: LMG publish data, then the community review and promote (or bash the hell out) if things are

 

44 minutes ago, davejr said:

If you want to gain the respect and transparency that you allude to, not only should the processes be documented well, but they should be available to anyone who wants them, not just those who pay for FP.

 

33 minutes ago, Toakan said:

A publicly accessible database/website that displays the data you are generating, showing software versions, HWiNFO captures and, obviously, scores from the games and tests.

To help answer some of this, there will be a second Forum thread (hopefully today) that will provide our test procedures for both software and hardware-related items.


I think this is a tricky one and depends on what you want from Labs. For content purposes, I agree that video content explaining how you run a testing suite is largely useless, since the suite should change over time, especially if the process is automated, as it appears to be. Additionally, breaking it down for a non-technical audience (meaning people who aren't familiar with this kind of testing) is hard to do in a fun but meaningful way.

 

So the question is how Labs fits into the whole process. If you want to be the consumer watchdog that holds manufacturers to account, then Labs needs to be treated separately. If that is the case, publish technical papers on method development, and when something goes wrong, publish a paper on that.

 

I genuinely don't think videos are the way to go. Maybe a yearly video could walk a casual observer through the process of 'how an LTT video is made', and maybe an additional video (FP exclusive?) could cover large-scale launches and how they were tested, especially as those videos will likely be more time-sensitive than standard ones, and so more likely to let things slip through the net.

 

For anyone who genuinely cares about the testing process, a technical paper (2-3 pages) is not too much to read and can be interesting.  And regular papers on this topic are the only way to ensure the technical aspects are not overlooked. 


20 hours ago, LMGcommunity said:

As promised, here is the thread where you can post your transparency suggestions for Labs. Here are some examples of things we'd like to see feedback about:

  • Ideas in our benchmarking.
  • What you'd like to see regarding test system information.
  • Benchmark details.
  • Testbed variations.

And, of course, any other transparency-related feedback is welcome.

 

Thank you for wanting to help us improve the way we do things here!

In my opinion, it's as simple as employing proper Engineering Rigor. I'm not an engineer by credential, but am employed as an engineer at one of the world's most well-known corporations. From where I'm sitting, the testing and analysis are not performed using proper, established engineering processes. Just my $0.02.


3 minutes ago, TychoHybrid said:

As someone who works as an engineer (without a degree) for one of the world's most well-known corporations, here are my "humble but confident" two cents. I've learned the hard way the need for "Engineering Rigor".

 

This isn’t a complex problem, it just takes time, validation, and review. 


Methodical, systematic, and documented Conception, Design, Failure Mode Analysis, Pre-Review, Execution, Validation, and Post-Review.
 

When issues are found post-release, a methodical Root Cause Analysis should be carried out to enact continuous improvement. 
 

To be professional, let alone “World Class”, it’s as simple as this. To summarize what I’m saying, proper Engineering Rigor would eliminate these issues. 

I agree that this is the only way forward, but unless those are published as method-development or root-cause-analysis papers, how can the audience and the wider public trust what they're saying? The company has grown extremely fast, failed to put in fail-safes, and has now been called out for it.

 

I think 'trust me, I won't make the same mistake, it was an oopsy' doesn't work when the corporation is trying to convince us of something based on data. Suppose LTT claims that AMD's new chips are consistently performing 30% worse than expected. Who do I trust, AMD or LTT? AMD has something to gain from lying (increased sales), but LTT now has a shadow cast over ALL of its tests and results.

 


I believe integrating an upvote system (like Reddit's) for viewers to comment on testing methods and on ways of displaying/presenting results would be a great idea. Perhaps hosting a Reddit community for this might suffice; however, relying on external services is risky in the long run, so something like this would be better built into the forum (in my opinion).

 

Doing such a thing would allow filtering out bogus comments, and boost opinions the community feels strongly about.

 

Guidelines on the landing page could suggest that users weight their upvotes towards suggestions that provide strong evidence. The stronger the evidence (e.g. peer-reviewed papers), the more willing you should be to upvote that suggestion.

 

Perhaps weightings could also be applied if users have proof of proficiency in a given area, such as higher education or years of working in the industry; however, that would require experimentation before integration.
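Purely to illustrate the weighting idea (the tiers and multipliers below are invented, not a concrete proposal), the tally could look something like this:

```python
# Hypothetical weighted-vote tally: votes backed by evidence or verified
# expertise count for more than a plain upvote. Tier names are made up.
WEIGHTS = {
    "default": 1.0,
    "verified_industry_experience": 1.5,
    "cited_peer_reviewed_source": 2.0,
}

def suggestion_score(votes):
    """votes is a list of (direction, tier) tuples; direction is +1 or -1."""
    return sum(direction * WEIGHTS.get(tier, 1.0) for direction, tier in votes)

print(suggestion_score([(+1, "default"), (+1, "cited_peer_reviewed_source"), (-1, "default")]))
# -> 2.0
```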


I'm not qualified enough to tell you guys how to run a benchmarking lab, but I will say that a topic I've always been interested in is optimal hardware configurations, and hardware configs that don't make sense: the graphs that show the bottlenecks when you pair a really powerful GPU with otherwise budget parts.

 

The stuff Gamers Nexus does is cool and all, but I guess I just don't need to see a 50-point presentation about the acoustics of a case with a glass vs. vented side panel comparison, lol. But hey, some people geek out about that.


21 hours ago, LMGcommunity said:

Testbed variations.

Request validation-platform motherboards from vendors like Intel and AMD, equivalent or near-equivalent to what they use internally, which by design adhere more strictly to manufacturer default settings and behaviors and make extra diagnostic information available.


As LTT and LMG are YouTube channels that need to always be looking forward, I understand not editing every video to correct errors; even if they improve, we are all only human. I think the Labs data should be published when a video that uses it goes up, and all Labs review content should include a note pointing to that post for the most up-to-date data, graphs, and testing process. This would give them a good way to post corrections or updates based on new firmware test results or incorrect information. It would also be a place for community feedback on processes and results. If for some reason a video does not represent the product correctly (not just an incorrect graph), then an update video should be posted and the old one pulled down.

 

I understand that they have big plans for the Labs website and it's still cooking. They could still do this process with forum posts until it is ready and just update the links in videos down the road.

 

I love LTT and it sucks to see them in this position. I hope for a nice recovery and improvements going forward. Please don't lose your fun side 🙂


I really just think you need to open yourselves up to the scientific process and make your data freely available in an extensive but organised way. Allow it to be scrutinised. Allow it to be easily changed and updated as new information is released or discovered. Do not lock it behind a paywall.

 

Provide all of the data you can supply for the test conditions. Tell us all of the variables. Take criticism and allow your test methods to be fluid. Tell us the software versions of everything, tell us what drivers you're using.

 

You guys have a lot of resources at your disposal, and you could be a real force for good and for transparency if you pull it off properly, with integrity. But you must take criticism and allow yourselves not to be offended by it.

 

When measuring audio equipment, tell us what the air temperature and pressure were. It does matter. If you need help with anything to do with audio, you can message me, or get in contact with Amir from audiosciencereview.com.

 

Failing to disclose variables and admit mistakes will be the downfall of Labs.

 

Provide literally all info.

 

What cables did you use?

Drivers used?

Software versions used?

Ambient temperature?

For PSUs, tell us what the RMS voltage was. What was the peak-to-peak voltage? How much distortion did the sine wave have?
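Pulling those questions together, the kind of test-conditions record that could accompany every published measurement might look roughly like this; the field names and values are mine, purely for illustration:

```python
# Illustrative test-conditions record; every field is something a reader would
# need in order to reproduce or sanity-check a published measurement.
test_conditions = {
    "ambient_temperature_c": 21.3,
    "ambient_pressure_hpa": 1013.2,     # matters for acoustic/audio measurements
    "cables": ["XLR, 1.5 m, brand X"],  # hypothetical example entry
    "driver_versions": {"gpu": "536.99", "audio_interface": "4.86"},
    "software_versions": {"os_build": "22621.2134", "benchmark_app": "1.2.0"},
    # Mains/input conditions for power-supply testing:
    "input_rms_voltage_v": 120.1,
    "input_peak_to_peak_voltage_v": 339.4,
    "input_sine_distortion_percent": 1.8,
}
```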

 



Every time a mistake makes it into a video, there should be a thread with a post-mortem.

Almost like a status page: whenever Cloudflare or a similar company has an outage, they do a full post-mortem on what went wrong and what will change. It should be a little log of every mistake that gets made, so that everyone can understand why things happened, and keeping them all in one forum thread (or some other single place) would provide a searchable history. While this obviously doesn't fix the issue in the video itself, it would boost confidence in the results collected.

 

Also, provide some kind of versioning for results. If a flaw is found in a subset of data, it can then either have annotations added (on the lttlabs data site) or be invalidated en masse based on which version of the software was used or what the pipeline looked like at the time. This also gives an instant indicator of how sophisticated the whole stack was at that point. Since constant small improvements will keep happening over time, being able to tell data sources apart matters (e.g. test runner v2 vs v3, or data pipeline document #2 vs #5). With everything getting more transparent, I feel it's necessary to catalogue the data together with the documentation and software/methodology as they stood at that moment in time, to ensure consistency.
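A minimal sketch of what versioned results could look like (the version tags and field names here are hypothetical, just to show the shape of the idea):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VersionedResult:
    result_id: str
    value: float
    unit: str
    test_runner_version: str             # e.g. "v3" (hypothetical tag)
    pipeline_doc_version: str            # e.g. "doc-5" (hypothetical tag)
    superseded_by: Optional[str] = None  # set if a re-test replaces this result
    annotation: str = ""                 # link/summary of a relevant post-mortem

results = [
    VersionedResult("r-001", 144.2, "fps", "v2", "doc-2"),
    VersionedResult("r-002", 151.0, "fps", "v3", "doc-5"),
]

# Example: a post-mortem finds a flaw affecting everything produced by runner v2,
# so the whole batch can be annotated (or invalidated) in one pass.
for r in results:
    if r.test_runner_version == "v2":
        r.annotation = "See post-mortem thread: flaw found in test runner v2"
```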

 

Best of luck to the Labs team.


31 minutes ago, leadeater said:

Request validation-platform motherboards from vendors like Intel and AMD, equivalent or near-equivalent to what they use internally, which by design adhere more strictly to manufacturer default settings and behaviors and make extra diagnostic information available.

This is a really cool idea, but one that may not be practical.

1. they don't provide this gear. It literally doesn't leave the labs of either the chip-makers or their NDA'd partners

2. ... it's not representative of the experience an end-user will have anyway, making it academically interesting, but practically-speaking kind of irrelevant.


Always test CPUs with both an AMD and an Nvidia GPU test system. Driver overhead is a thing; depending on GPU load, the CPU might be affected differently between different GPU setups.

 

Always test GPUs with both an AMD and an Intel CPU test system. This will likely have less impact than the first one, but there are certain situations, for example ReBAR usage, where GPUs behave differently depending on the CPU.

 

State very clearly, especially for CPU tests, that the benchmarks you perform are designed to show differences in CPU performance, and that "CPU A is x% faster than CPU B" will not mean the typical gamer gets x% more FPS using CPU A. Maybe at some point you can do an explanation video using a midrange GPU setup and various CPUs, to show how performance scales in the typical use cases that most gamers will encounter.
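To put a toy model behind that last point (all numbers invented): the frame rate a gamer actually sees is roughly capped by whichever of the CPU or GPU is slower in that title, so a "faster" CPU only shows up when the GPU isn't already the limit.

```python
def effective_fps(cpu_fps, gpu_fps):
    """Crude bottleneck model: the slower component sets the frame rate."""
    return min(cpu_fps, gpu_fps)

# CPU A is 25% faster than CPU B in isolation...
cpu_b, cpu_a = 200, 250
# ...but with a midrange GPU that tops out around 120 fps in this title:
print(effective_fps(cpu_b, 120))  # 120
print(effective_fps(cpu_a, 120))  # 120 -- the "25% faster" CPU gains nothing here
```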


26 minutes ago, LinusTech said:

This is a really cool idea, but one that may not be practical.

1. they don't provide this gear. It literally doesn't leave the labs of either the chip-makers or their NDA'd partners

2. ... it's not representative of the experience an end-user will have anyway, making it academically interesting, but practically-speaking kind of irrelevant.

To invert the question previously asked, it would be interesting to see whether data can be pooled from a large number of tests to create a baseline expected performance level for a given set of parts, within ±1% or so, allowing for automated outlier detection. That wouldn't necessarily be helpful for first-time parts, but it would be a good first warning for repeated testing, even if most of the time the answer is "this performed slightly better because we used driver 522 over 502".

 

I believe it would also be good to publish results with the standard deviation from previous runs.
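A minimal sketch of what such an automated outlier check could look like, assuming results from previous runs of the same part on the same suite are kept (the three-standard-deviation threshold is an arbitrary choice for illustration):

```python
import statistics

def is_outlier(new_score, previous_scores, sigmas=3.0):
    """Flag a run whose score falls outside mean +/- sigmas * stdev of prior runs."""
    mean = statistics.mean(previous_scores)
    stdev = statistics.stdev(previous_scores)  # needs at least two prior runs
    return abs(new_score - mean) > sigmas * stdev

history = [142.0, 145.1, 143.7, 144.9, 143.2]  # invented prior results
print(is_outlier(121.0, history))  # True  -- worth re-testing before publishing
print(is_outlier(144.0, history))  # False
```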


5 minutes ago, Pekirt said:

I'd say this idea is very hard to implement correctly. You need an SOP for a wide range of products to be compared in the same test for the same performance. This may indeed be impossible. Moreover, "garbage in, garbage out" would be hard to avoid: you end up with a whole bunch of bad data, which unfairly skews the results, and you would be confident in your faulty results because there are so many of them. And this is the tech space: before you can run all those tests, new stuff comes out.

 

The criticism of the automation that LMG's new technical department would provide goes along the same lines. "Experience is very important," the critics would say.

From what I gather watching the videos, there's a relatively limited set of games and benchmarks run for the video results, and it's almost always the same. That's not to discount the garbage-in, garbage-out critique, but I would argue that giving the test engineers automation around the testing is never a bad thing.

In web development, for instance, there are alerts set up all over the place to monitor performance against expected outcomes. Deviation from them doesn't necessarily indicate a problem, but it's always good to have easy access to historical data.

Since the Labs team is re-testing the same hardware on the same software for every project, as Gary explained, the data for outlier detection shouldn't be impossible to collect and collate, and responses to an odd result shouldn't be hard to action.

While this is slightly less about transparency, having traceable data for the team working on testing, and the ability to alert them to erroneous data so they can re-test or investigate more quickly, would personally give me more confidence. Tracking such data would also potentially allow for standard-deviation bars on the graphs Labs produces, and other statistical inferences across time.
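As a rough illustration of the error-bar idea (numbers made up, and matplotlib is just an assumed charting choice): if every retest of the same part on the same software is logged, the published chart gets its error bars almost for free.

```python
import statistics
import matplotlib.pyplot as plt

# Invented repeat-run data for two parts on the same test suite.
runs = {
    "Part A": [101.2, 99.8, 100.5, 100.9],
    "Part B": [121.7, 120.2, 122.4, 121.1],
}

labels = list(runs)
means = [statistics.mean(v) for v in runs.values()]
stdevs = [statistics.stdev(v) for v in runs.values()]

plt.bar(labels, means, yerr=stdevs, capsize=6)
plt.ylabel("Average FPS (mean of repeat runs, +/- 1 std dev)")
plt.title("Illustrative chart with error bars from retest history")
plt.savefig("labs_error_bars_example.png")
```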


I would like to see LMG (or even just Labs) get ISO 9001 certified and make the standard work pertaining to video creation and testing methodology public.

 

 

Here is a link describing ISO 9001:2015

https://www.iso.org/files/live/sites/isoorg/files/archive/pdf/en/documented_information.pdf



