
Labs Transparency Thread

LMGcommunity

I don't know if this suggestion belongs here, but here goes.

 

I would like to see a product reviewed, especially for comparisons, regardless of the company behind it. For example, when reviewing chargers, Anker's chargers should be included in the comparison, even though the company has some history. People still buy them, and they need to know whether the product is good or not.

 

For a power bank, if the manufacturer breached privacy in the past with another product line, it should be, at most, an asterisk. But if they have a history of exploding batteries, then that should be heavily weighed when reviewing the product. Apple and NVIDIA are prime examples of companies with a 'history,' yet their products are still reviewed and compared against whenever their competitors' products are reviewed.

 

I noticed this trend is even more apparent for past sponsors you worked with but have a history with. If you have a history with them, I think you guys should take the stance of "We will review the product, but the company's history will be taken into consideration in the review, and we will let our audience know about it" instead of "We will never mention your products again." The latter is, in my opinion, destructive, as information that could have been shared with the audience is omitted.

 

Separate the 'product' from the 'company' when it makes sense. Obviously, when, for example, the product is a surveillance camera and it uploads footage without you knowing, that is part of the product and should be treated as such (the same goes for comparing the privacy history of cloud storage and password managers).

 

As for the database you're building in the labs, maybe maintain a company table (or page; I don't know what your database schema is) and record the notable history the audience should know about (yes, we forget about it all the time). Then you link this company entry to the product's data and flag any company history that may affect the product, for easier tracking (e.g., if Anker decides to rebrand its camera lineup under a whole different name). This way, when a product is added, the audience should be able to just click (or see) the company and its history.
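Purely as an illustration (I obviously don't know your actual schema, so every table and column name below is made up), the linkage could be as simple as a company table that every product row points back to:

```python
import sqlite3

# Hypothetical schema sketch: a company table that products link back to,
# plus a flag for history that affects a specific product line.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,      -- current brand name (rebrands just update this)
    history TEXT                -- notable history the audience should know
);
CREATE TABLE product (
    id               INTEGER PRIMARY KEY,
    company_id       INTEGER NOT NULL REFERENCES company(id),
    name             TEXT NOT NULL,
    category         TEXT,
    history_relevant INTEGER DEFAULT 0  -- does the company history affect this product?
);
""")
conn.execute("INSERT INTO company VALUES (1, 'ExampleCorp', 'Privacy incident in 2020')")
conn.execute("INSERT INTO product VALUES (1, 1, 'ExampleCam 2', 'camera', 1)")

# A product page can then surface the linked company history with one join:
for row in conn.execute("""
    SELECT p.name, c.name, c.history
    FROM product p JOIN company c ON p.company_id = c.id
    WHERE p.history_relevant = 1
"""):
    print(row)
```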

 

I want to end this on a positive note. I am writing this input wishing that you guys will grow. I have been watching you for 3-4 years and have missed almost nothing. I was a university student when I started watching, and your videos literally landed me a career. I am an entry-level Infrastructure Engineer at a huge company now, and arguably, your videos played as much of a role in this career as my CS degree. Thank you, and I wish you all the best.


On 8/25/2023 at 8:03 PM, Daniel White said:

ISO is more about process and documenting what you do

It depends on which ISO standard you are certified for. ISO 9000 requires that your processes are documented, that the affected employees understand how to find, read, and follow those processes, and finally that you can prove they are being followed. It also has a large component dealing with continual improvement. Other ISO standards have more results-driven requirements. All that being said, if their documentation meets ISO standards, and that is the documentation made public for people to comment on, nobody could complain about the quality of the documents, and their maintained certification would show they are adhering to the ISO process.


I would also like to see two streams of testing. First, real-world testing of components in systems, using appropriate benchmarking tools that have been automated for repeatability. These tests should also record environment data, including ambient temperature, relative humidity (or, even better, wet-bulb humidity), and the hardware of the system used to perform the testing (including the serial number of each component). This would allow for multiple benches representing different segments of recent generations (AMD vs. Intel, DDR4 vs. DDR5, NVIDIA vs. AMD vs. Intel), with the other components being as overspecced as possible in the hope that the component under test becomes the bottleneck. Finally, it is important to have multiple runs on each system to show the repeatability of the results.
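As a rough sketch of what I mean, each run could be logged as a record like this (field names and numbers are invented purely for illustration):

```python
from dataclasses import dataclass, field

# Minimal sketch of a per-run benchmark record carrying the environment
# data suggested above; all field names here are hypothetical.
@dataclass
class BenchmarkRun:
    benchmark: str                 # e.g. "Cinebench R23"
    result: float                  # benchmark score or average FPS
    ambient_temp_c: float          # ambient temperature in deg C
    relative_humidity_pct: float   # or wet-bulb temperature, if available
    component_serials: dict[str, str] = field(default_factory=dict)

# Multiple runs on the same bench demonstrate repeatability:
runs = [
    BenchmarkRun("Cinebench R23", 30212, 21.4, 43.0, {"cpu": "SN-A1"}),
    BenchmarkRun("Cinebench R23", 30198, 21.5, 43.2, {"cpu": "SN-A1"}),
    BenchmarkRun("Cinebench R23", 30240, 21.3, 42.8, {"cpu": "SN-A1"}),
]
scores = [r.result for r in runs]
mean = sum(scores) / len(scores)
spread = max(scores) - min(scores)
print(f"mean {mean:.0f}, spread {spread:.0f} ({100 * spread / mean:.2f}%)")
```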

 

Secondly, I would like to see a set of tests around thermally bound components. For cooling components, I would advocate for a test rig that can generate heat instantly and homogeneously on demand, with adequate instrumentation to show the results across the entirety of the contact patch: whether there are thermal deviations, and what the effective cooling area actually is. I would also like to see an equivalent device that can measure thermal deviation on heat-generating components (CPUs, GPUs), to check what kind of thermal discrepancies appear on heat spreaders. The two results could be mapped against each other to find the theoretically optimal cooling solution for a set of components. I think having a more hands-on set of results for these types of components will lead to better comparisons over time, as you are no longer beholden to changes over time in drivers, firmware, and BIOS versions.


4 hours ago, Sel said:

I noticed this trend is even more apparent for past sponsors you worked with but have a history with. If you have a history with them, I think you guys should take the stance of "We will review the product, but the company's history will be taken into consideration in the review, and we will let our audience know about it" instead of "We will never mention your products again." The latter is, in my opinion, destructive, as information that could have been shared with the audience is omitted.

 

I agree with the premise behind the suggestion, but have some quibbles with it in practice.

 

I understand the rationale behind redacting Anker from videos where the subject of the video was not specifically about Anker ("Let's replace all of the barrel jacks with USB-C," for example). If they were to attempt a comprehensive comparison test of all of the major charger/power brick vendors, I can see where it would make sense for Anker to be on the list, despite their history.

 

That said, there are alternatives to Anker products. There aren't exactly alternatives in the GPU space.

 

It would be worthwhile as a footnote appended to something like their 'production bible', at least. Where, exactly, does the line get drawn between 'censure' and 'censor'?

 

---

 

Otherwise, I do have an actual suggestion:

 

It might be worthwhile to pick up the book "Storytelling with Data." (I'm not one to go deep into the self-help genre, let alone the business self-help genre, but the book has a fair amount of insight, with good examples.) Anything to make the communicated data much easier to digest, regardless of the amount of focus spent on testing procedures and/or various certifications.

 

I would also suggest taking the time to create some standard spreadsheet templates that are already scripted with the various data visualizations needed for a quick pre-analysis of the data set you're working with.

 

A scatter plot with a regression line works wonders for picking up outliers, and doesn't need much more familiarity than "this dot is much further away from the line than the others." This can be a PITA to do in a spreadsheet, at least at first, because most of them don't have regression functions as part of the visualization package, but it can be done. Otherwise, pick up a copy of R (open source) and learn some of the more basic functions.
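As a quick sketch of the idea (in Python rather than R or a spreadsheet, with made-up data points), the residuals from the fitted line do the flagging for you:

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch of the scatter-plot-plus-regression-line outlier check described
# above, using invented data points.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 30.0, 16.2])  # 30.0 is the planted outlier

slope, intercept = np.polyfit(x, y, 1)          # least-squares regression line
residuals = y - (slope * x + intercept)

plt.scatter(x, y)
plt.plot(x, slope * x + intercept)
# Flag points whose residual is unusually large relative to the rest:
for xi, yi, r in zip(x, y, residuals):
    if abs(r) > 2 * residuals.std():
        plt.annotate("outlier?", (xi, yi))
plt.show()
```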

 

Another possible advantage of a 'pre-screening' spreadsheet is that it could be a relatively simple way to communicate all of the results from a given project without having to go into too much detail on each and every one individually. (At least until the specification and development is done on whatever future public-facing database.)

 

I am less concerned about the specific methodology, and more concerned with consistency and avoidance of obvious errors that confound the data.


For me, I think it would be good if all newly tested items were tested in the same way as existing items of that component type, so that we can see the difference in performance over previous items. If a whole new style of test is used for each new item, then every previously tested item would have to be retested under the new test for any conclusion to be usable.

So basically speaking, if the same test bed and tests are used for all GPUs, then regardless of when a test was carried out, the data will be comparable and allow us to see meaningful and accurate performance differences.


A changelog for any changes made to the tests/benchmarking goes a long way!


Establish and cultivate an academic culture instead of a corporate one:

1. Peer review the work. 

2. Hold progress reports (e.g., 30 minutes; 1 person per week). This keeps the entire team aware of what is going on and provides them with knowledge outside their day-to-day operations that might be useful one day (interdisciplinary).

 

In terms of transparency:

- Are there reviews regarding how an employee thinks through/solves a task (not the actual work; just how they approach a task, to catch conceptual misunderstandings early on)?

- Do new employees get tested in data analysis/measurement uncertainty and, if required, trained to an internal standard?

- Is this "test" repeated for long-term employees to ensure they still know this critical skill inside and out?

 

 

On 8/26/2023 at 1:21 AM, KieranHolroyd said:

standard deviation bars on the graphs produced by labs and other statistical inferences across time.

I think standard deviation is one of the most misunderstood concepts. Just ask somebody talking about it what a Student's t-test is. If they can't answer, they have probably never looked that deeply into it. If they know that but have never heard of the Q test, they at least have some knowledge.

Without this understanding, you might do stupid stuff like linearizing data to fit a linear function and then calculating a standard deviation on it.
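For what it's worth, here is a minimal sketch of the kind of t-test I mean (Python/scipy, with invented FPS numbers):

```python
from scipy import stats

# Sketch: a two-sample Student's t-test (Welch's variant) on two sets of
# benchmark runs, to check whether the means differ significantly.
bench_a = [68.9, 69.4, 69.1, 69.6, 69.3]  # invented FPS samples
bench_b = [68.2, 67.8, 68.1, 67.9, 68.4]

t_stat, p_value = stats.ttest_ind(bench_a, bench_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g. < 0.05) suggests the gap is not just run-to-run noise.
```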



On 8/26/2023 at 5:29 PM, pxlfall said:

It would be cool to also test it with "cybersport" config, like in 1280x1024, Nvidia / amd control panel - best performance for 3d settings etc etc. 

Is this a thing that people do? I have to imagine that there is a distinct disadvantage with that little detail.



On 8/24/2023 at 8:12 PM, LMGcommunity said:

As promised, here is the thread where you can post your transparency suggestions for Labs. Here are some examples of things we'd like to see feedback about:

  • Ideas in our benchmarking.
  • What you'd like to see regarding test system information.
  • Benchmark details.
  • Testbed variations.

And, of course, any other transparency-related feedback is welcome, but similar to the Sponsor Complaints subforum, any off-topic comments will be subject to moderation.

 

Thank you for wanting to help us improve the way we do things here!

hi guys. points for y'all on the openness! I'm not sure if this is something someone has already recommended, but I'd simply love for y'all to kick off every year with a video walkthrough of that year's benchmarking protocols.

thinking out loud, the video would start by reviewing last year's benchmark frameworks, going over both internal and community feedback (quick interview bits with staff and screenshots of top community feedback would be fun to go through). next up would be this year's changes, such as the games being used and any design improvements, plus an "in action" example of how benchmarks in 2024 would look. throw some kudos to the staff and community that were involved, and done!

hoping this is something y'all take to heart! help give me and the audience the transparency and trust showing that LMG really WANTS to be more advanced and more thoughtful with all the numbers. 😄


I would love to see some more detailed systematic and statistical analysis of the testing data. This, of course, makes your testing sweep more transparent as well.

 

By statistical analysis I don't mean that you should buy 10 GPUs for one benchmark. I want simpler things. What I would love to see is the standard deviation when taking the average FPS for one benchmark. This can be calculated really quickly if you collect every FPS data point separately. That way you can show how much the framerate fluctuates, and you can even compare two measurements. You could finally give the margin of error without guessing that your result should be within it. You can also give the digits after the integer FPS value: instead of saying 69 FPS, you could state something like 69.3±0.4 FPS. This shows how accurate your test sweep is! And then a 68.0±0.5 FPS bench won't count as significantly different from this 69.3±0.4 test

(the difference adds up to 2.03 standard deviations, assuming that >3 standard deviations is a significant deviation).

As another example, when you give only integer FPS values, you imply that your error is at least ±3 FPS, since you usually don't report to the next decimal place until the error drops below roughly ±2.9. This suggests that a bench at 50±3 FPS isn't significantly deviating from a measurement at 60±3 FPS, as that is still within a 3-standard-deviation error margin! I hope this illustrates what bothers me here, since such a gap can very well be the generational improvement you are looking for.

The averaging, of course, can also be done over a couple of games at 1440p with RT on. The standard deviations are still interesting then, when you compare different GPUs that went through the same test.
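To make this concrete, here is a minimal sketch of the suggested reporting (plain Python, with invented FPS samples chosen to roughly reproduce the 69.3±0.4 vs. 68.0±0.5 example above):

```python
import statistics

# Sketch: report mean FPS with its standard error, computed from per-run
# (or per-frame) FPS samples; all numbers are invented.
def mean_with_error(samples):
    mean = statistics.fmean(samples)
    # standard error of the mean = sample std dev / sqrt(n)
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, sem

run_a = [68.2, 70.5, 69.0, 70.1, 68.6, 69.4]
run_b = [66.5, 69.6, 67.8, 68.9, 66.9, 68.3]

(ma, ea), (mb, eb) = mean_with_error(run_a), mean_with_error(run_b)
print(f"A: {ma:.1f}±{ea:.1f} FPS, B: {mb:.1f}±{eb:.1f} FPS")

# Express the difference in combined standard deviations:
gap = abs(ma - mb) / (ea**2 + eb**2) ** 0.5
print(f"difference = {gap:.2f} combined std devs")  # >3 would count as significant here
```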

 

The systematic part is largely covered by your new, clear methodology papers, but of course slight changes in temperature, or even a different type of SSD being used for a test, should be noted to show the full environment of your test. This part could also include the reaction times of you or your code, giving even more insight into the test sweep.

 

I'm a physicist and had to do various types of data evaluation throughout my studies. If you have any questions about this, how to set it up, or even the formulas to use, feel free to ask. I have no idea how common this knowledge is, but I must say that its visualization can be very clear and straightforward to understand 🙂


  • 2 weeks later...

Last month, LTT Gary stated they would be open-sourcing the software they use in the lab, but as far as I can tell, none of that code has appeared in any public source control.

Have I just not found the correct thread yet? Is this still in the works but having some problems? Or has LTT backed out of doing this? It would be both concerning and a shame if it's the latter.

 


No news yet, though my eyebrows went through the roof when the apology video implied it would be released rapidly, heh. It'll take time. It's a real pain to open-source something that was wholly intended to stay in-house.

 

Additionally, in the first post-apocalypse graphics card review, Linus mentioned that they benchmarked fewer cards because they've reverted to doing them manually, having paused automated testing due to its issues. I would imagine we won't see them open-source it until it's at a level they are comfortable using themselves again.


Update -

1. We are still on schedule to publish the current MarkBench test harnesses on GitHub by 9/28/2023.

2. We are still on schedule to publish the game and application test settings for those harnesses on GitHub by 9/29/2023.

3. We will release any new test harnesses we develop on GitHub on a 30-day schedule.


bro, Sony is engaging in consumer fraud with these earbuds: they are substituting the 3.85 V battery with 3.7 V batteries, guaranteeing battery failures and reduced capacity because they failed to correct the firmware for these lower-voltage batteries, and they are secretly extending the warranty. Louis Rossmann made a video on this before the release of this video: https://www.youtube.com/watch?v=u525qqRHonM


  • 4 months later...

I hope this is the right place for this 😕

I watched the most recent Short Circuit on the 7600 XT, and I feel it lacked enough context in the data. You only compared it with like GPUs from the competition, which I feel leaves a lot to be desired. I understand it's a "Short Circuit," but for a lot of people this is their first and only look at this card, and I feel a wider testing range would better establish the position of the card. Why should a viewer care about this card over a 7600, a 4060, or an A750/A770 8GB, which are lower-end, lower-VRAM cards? If you had tested those cards, you might have shown the context for its existence. Likewise, a gamut of higher-end cards would show the viewer the benefits of going up the stack rather than just what's at this price point.

I personally think any video intended for a general audience should contain this context for the data, as unfortunately people make decisions based on single videos, even if that's not best practice.

Also, all that hard work on getting similar CPUs for GPU reviews is made a bit redundant given the lack of error bars in any of the graphs. I want to know how tight your data is at a glance: is the performance difference between two cards within error, or is it small but significant?
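As a sketch of what I mean (Python/matplotlib, with invented numbers), error bars make the "within error or not" question answerable at a glance:

```python
import matplotlib.pyplot as plt

# Sketch of error bars on a GPU comparison chart; the FPS values and
# per-card spreads below are invented for illustration.
cards = ["RX 7600", "RX 7600 XT", "RTX 4060"]
mean_fps = [62.0, 65.0, 64.0]
std_dev = [1.5, 1.8, 1.6]   # run-to-run standard deviation per card

plt.bar(cards, mean_fps, yerr=std_dev, capsize=5)
plt.ylabel("Average FPS (invented data)")
plt.show()
# Overlapping error bars suggest the gap between two cards may be within noise.
```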

I'd rather have a late review that is complete than one that feels rushed, or a smaller range of different tests but a larger sample of different GPUs.



tl;dr: the RX 7600 should have been in the testing for the 7600 XT review at the very least; it needs more context.


  • 1 month later...
On 3/28/2024 at 5:00 PM, LMGcommunity said:

Hey folks! Our first LTT Labs Product Page is live on our beta website!

 

Check it out, and feel free to leave your feedback here.

First off, I would like to have the text below the picture, or at least centered. And when you put up the graphs, also add the game you tested them in.


Would be nice to see the price evolution of a part: how it rises, drops, and maybe compares to other parts with similar specs/price over time. These prices could be taken from shopping sites and big retail stores. This would give you an easy way to see whether there might be a similar product with higher specs at a lower price that is a much better deal.
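Something as simple as this sketch would already cover the basic view (the dates and prices are invented; in practice they would come from retailer data):

```python
import matplotlib.pyplot as plt

# Sketch of a price-evolution chart for two comparable parts; all
# dates and prices below are hypothetical.
dates = ["2024-01", "2024-02", "2024-03", "2024-04", "2024-05"]
card_a = [599, 589, 569, 549, 552]   # invented USD prices
card_b = [499, 489, 479, 469, 459]

plt.plot(dates, card_a, marker="o", label="Card A (invented)")
plt.plot(dates, card_b, marker="o", label="Card B (invented)")
plt.ylabel("Lowest retail price (USD)")
plt.legend()
plt.show()
```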


8 hours ago, Tienenaar said:

Would be nice to see the price evolution of a part: how it rises, drops, and maybe compares to other parts with similar specs/price over time. These prices could be taken from shopping sites and big retail stores. This would give you an easy way to see whether there might be a similar product with higher specs at a lower price that is a much better deal.

PCPartPicker does this already, though I wonder if a link to it or some sort of mirroring would be helpful.

 

https://pcpartpicker.com/product/FnLdnQ/gigabyte-windforce-oc-geforce-rtx-4070-12-gb-video-card-gv-n4070wf3oc-12gd



On 4/6/2024 at 12:44 AM, Tienenaar said:

Would be nice to see the price evolution of a part.

Pricing is tough to include since it varies drastically depending on where you are in the world.


I feel like a good addition could be a chart of all graphics cards, CPUs, etc., like the PassMark website. That could make it much easier to navigate through everything as a less-informed buyer. On top of that, you could have a pop-out when you hover over a product in the list that shows some extra details, so you don't have to open up a whole new page.


On 3/28/2024 at 11:00 PM, LMGcommunity said:

Check it out, and feel free to leave your feedback here.

The comparison page needs some more work.


Highlights feel odd:
[screenshot: Highlights section of the comparison page]

I mean, I get it: it is just fetching the highlights of the individual GPUs, but it is laid out as if it were a direct comparison.
I'd probably just ditch the two-column layout and put them in separate rows (or tables, stacked vertically).

Performance is tedious to use:
[screenshot: Performance section of the comparison page]

Ok... but I don't see the RTX 4070 Super I want to compare it to.
I can get it by picking it from the dropdown and then picking the game & resolution again.
[screenshot: GPU selection dropdown]
To finally get what I came to compare, in RDR2 @ 1440p ULTRA:
7600 XT 16GB = 65/57; RTX 4070 Super 12GB = 117/103.
Also, when picking which GPU to show from the dropdown, the resolution changes (defaults to 1080p for the 7600 XT and 1440p for the RTX 4070)... easy for users to overlook that change.

 

PS: Will any other presets besides ULTRA be available in the future?



On a product review page, when scrolling down, I think the header and navigation bar take up too much space and distract from the experience of reading the review, especially on a 16:9 1080p monitor.

I think it would be better if they were around 30% smaller, or if, when scrolling down, the navigation bar integrated into the header bar as some sort of drop-down menu and only reappeared when scrolling all the way back up.


I think we need a new thread for the Labs website? This topic seems to have fulfilled its original purpose.


