MarkBench Development and Feature Requests

morganh · October 14, 2022

Hi, I commented on the video and was told to post here as well. I think performance testing for music production software would fill an empty niche, as I have yet to see anyone benchmark it. Like games, which have to get a frame out in time or you get tearing, the audio buffer has to be filled in time or you get pops and clicks, and the DAW (digital audio workstation) may eventually stop playback. It's also a mix of multi-threaded and single-threaded workloads. AFAIK, DAWs put one channel on a thread, so a single heavy channel can be enough to stop playback on its own.

It would be amazing to fork an open source DAW (or work with a company like Ableton) to add data collection for buffer fill times and other metrics, and possibly log when each thread is busy. It would also be great for this to be cross-platform. There are a few synths I've thought of that would be good for benchmarking. I don't have one of them, but I could definitely make very CPU-heavy presets for the others to see how many instances can run before buffers start to underrun, or use very long samples to push RAM and disk as well. Additionally, running Audacity declipping, click removal, or some other effect on an hour or two of audio would also be a good benchmark, as it takes a while to run.

my guess is that there would be some difference between cpus in addition to just the average performance changing, like how some gpus with some games are very stable or very unstable performance.
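A minimal sketch of the buffer-deadline idea described in this post, assuming a DAW could log one fill time per audio buffer. The function names and the sample numbers are hypothetical and not taken from any real DAW. The deadline is the buffer length divided by the sample rate, and any fill that takes longer counts as an underrun.

```python
def buffer_deadline_ms(buffer_size_frames: int, sample_rate_hz: int) -> float:
    """Time available to fill one audio buffer, in milliseconds."""
    return buffer_size_frames / sample_rate_hz * 1000.0


def summarize_fill_times(fill_times_ms, buffer_size_frames, sample_rate_hz):
    """Summarize logged fill times against the real-time deadline.

    fill_times_ms is a hypothetical log: one measured fill time per buffer.
    """
    if not fill_times_ms:
        raise ValueError("need at least one fill time")
    deadline = buffer_deadline_ms(buffer_size_frames, sample_rate_hz)
    return {
        "deadline_ms": deadline,
        # A fill that misses the deadline is heard as a pop or click.
        "underruns": sum(1 for t in fill_times_ms if t > deadline),
        # Worst case matters more than the mean for stability.
        "worst_ms": max(fill_times_ms),
        "mean_ms": sum(fill_times_ms) / len(fill_times_ms),
    }


# Made-up fill times for a 256-frame buffer at 48 kHz.
summary = summarize_fill_times([2.1, 2.4, 6.0, 2.2], 256, 48_000)
print(summary)
```

Reporting the worst case and the underrun count next to the mean is what would show the stable-versus-unstable difference between CPUs that the post guesses at.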

Allen Chen · October 14, 2022

My suggestions:

1. A GUI to change the presets/games/settings, rather than using the raw xml
2. Add in some optional tests, eg, ML ones, video encode/decode and the like
3. Ask the user if they want to help add data to your database
4. Add a public page to see all the data, for transparency, preferably with visualization options; also, make it available on Linux
Nice piece of software for only 6 months of development, btw

Edited October 14, 2022 by Allen Chen

Zephyrmoth · October 14, 2022

I assume you'll have already worked this out, but I'd really appreciate a harness for the FFXIV benchmarks specifically- they provide a score and rating based on the metrics generated, but most of them are hidden from you. Also it's either the first or second most played MMO at the moment.

It would also be interesting to use this as a data source for a sort of "non-weirdly biased UserBenchmark" by crowdsourcing benchmark results for various hardware configurations, though I imagine the community could do this easily. On that note, including a cryptographic signature in the officially built executable, so that modified results could be thrown out, could be handy.
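A rough sketch of how tamper-evident results could work, using only Python's standard `hmac`, `hashlib`, and `json` modules. This is an assumption-laden stand-in: an HMAC is a keyed hash with a shared secret, not a true public-key signature, and any key shipped inside an executable can in principle be extracted. The result fields and the key are hypothetical.

```python
import hashlib
import hmac
import json


def sign_result(result: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding."""
    # sort_keys gives a stable byte representation for the same data.
    payload = json.dumps(result, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_result(result: dict, tag: str, key: bytes) -> bool:
    """Check a submitted result against its tag in constant time."""
    return hmac.compare_digest(sign_result(result, key), tag)


# Hypothetical key and result, for illustration only.
demo_key = b"not-a-real-secret"
demo_result = {"benchmark": "example", "avg_fps": 118.4}
demo_tag = sign_result(demo_result, demo_key)
```

A server holding the same key could call `verify_result` on each upload and discard anything whose tag does not match.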

Also, making it easy for users to add custom harnesses, and potentially providing a curated GitHub repo for them if you can spare the dev time, would be great; while I understand limiting the scope of your personal benchmark suite, it would be nice to have a consistent way of scoring hardware for more niche titles.

Swiftwing · October 14, 2022

cooldown / idle monitor for X length - used between full benchmark and/or each test.

could also be used as toggleable trigger if the system holds thermal throttle for X amount to alt-f4 the active test and allow it to cool off.
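One way the cooldown monitor could look, as a sketch only. `read_temp_c` is a hypothetical callable that returns the current temperature in °C; how to read a real sensor depends on the platform and is not shown. The clock and sleep functions are parameters so the logic can be exercised without waiting.

```python
import time


def wait_for_cooldown(read_temp_c, target_c, hold_seconds,
                      poll_seconds=1.0, timeout_seconds=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Wait until the temperature stays at or below target_c for hold_seconds.

    Returns True once the system has stayed cool long enough,
    or False if timeout_seconds passes first.
    """
    start = clock()
    cool_since = None  # when the current cool streak began
    while True:
        now = clock()
        if read_temp_c() <= target_c:
            if cool_since is None:
                cool_since = now
            if now - cool_since >= hold_seconds:
                return True
        else:
            cool_since = None  # streak broken, start over
        if now - start >= timeout_seconds:
            return False
        sleep(poll_seconds)
```

The same loop with the comparison flipped could serve as the throttle trigger: if the system stays hot for X seconds, stop the active test and call `wait_for_cooldown` before resuming.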

*some rooms somewhere in the world are going to get hot from long duration testing?

Torreip · October 14, 2022

It seems like a fantastic tool! I have a little question though: what will the project be licensed under at launch, if you have any idea already?

The feature I would like to see is custom scripts and recordable macros, to avoid being limited to officially supported games or software. Of course it’s far from a perfect solution, but theoretically a user could macro a DDU run, adjust the parameters of a certain title, or even better, apply a light OC on the bench.

Back2Gaming.BossMac · October 14, 2022

The benchmark list shows games that have in-game benchmarks (F1, ACV, etc) except Microsoft Flight Simulator and CSGO. MSFS has run and go flight activities while CSGO has workshop benchmark maps. Will MarkBench support manually controlled benchmarked scenes recorded with a script for automation for games without benchmarks like Destiny 2 or Modern Warfare?

Great project! Reminds me of PugetBenchmark with their user-uploaded result libraries for easy comparison.

Jonas_Kramer · October 14, 2022

I don't know if it was mentioned yet, but as a video editor, it would be awesome to see how different GPUs or complete systems perform when playing back and rendering videos.

Julleeee · October 14, 2022

FEATURE REQUEST:

Please make it open source. There are a lot of weird OSes and graphics systems, and making it open source would enable us to either adjust it or make plugins / additional content for it. Doing so on GitHub would even enable you to profit from and build on the code we make! The open source community is massive, and something like this would be amazing to have.

DarkKnight2037 · October 14, 2022

I was just curious: for the benchmarks that it would run, we would have to own the software that the benchmarks run on in the first place, right? Also, would it be possible to add some feature where benchmarks could be tested at different wattages through it as well?

Data Fred · October 14, 2022

Data Scientist and hardware tinkerer here! Here's a few thoughts:

The software is super cool! If it can submit results to a network destination during testing, you could even do community events (like a live testing of a new GPU generation) if the visualizations are publicly available!
You could think of combining this with some hardware automation. I'm thinking of some kind of contraption that could
- Connect all GPUs (or other PCIe peripherals) to some sort of custom-built multiplexer, allowing many cards to be connected to the same PCI slot on a test bench
- Select one of the connected GPUs for testing, connecting it to the main board's PCI slot
- Toggle the power button to turn on the test bench
- Automatically run a set of benchmarks (or preparation steps) and submit the results to the database
- Turn the system off
- Connect the next GPU (or even multiple ones)
- Repeat
Grafana is a great frontend, but makes it important to design the database in the right way. The right design would even allow you to build some great dashboards, for example: Have a dashboard where you could select multiple GPUs for comparison and create the charts for each one automatically, same for RAM, CPU, benchmarked game, and other variations.
A personal recommendation for the backend: Consider the Elasticsearch stack. It's the best option in combination with Grafana, supremely fast and flexible, and comes with its own visualization tool, Kibana. Kibana has Canvas, which is intended for pixel-perfect, standardized visualizations, without its own, imposed branding (much more appealing than Grafana). Kibana and Grafana can coexist without problems on the same backend, ensuring great coverage for use cases for many audiences. Please don't waste your time on a tool like D3, which will only slow you down.

Excited to see where this journey takes the lab and LTT! Feel free to reach out for questions.

creesch · October 14, 2022

I posted the same in the release video thread not realizing this is the actual feedback/request thread.

I probably will have a lot more to say in the future, but for now I'll mostly stick with my thoughts on publishing the "mother of all testing databases". In the screenshots so far and in the video this is shown to be Grafana, which makes a lot of sense as it is a powerful tool that can present data in a lot of neat ways.

It is also a tool that (sometimes, depending on the source) can wildly misrepresent data if you are not careful. For example, in IT environments it is often combined with Prometheus as the underlying data source for monitoring applications. However, the way Prometheus works means that averages can be misleading if the dataset covers too short a period of time. It is a bit too much to explain in depth here, and I fully expect the folks in the lab to be aware of this as well.

However, when making it a public dataset, a lot of people will not be aware of these sorts of caveats. So my suggestion there would be to think about what to expose and how. For example, if you are giving Grafana dashboards to everyone, make sure to also put a big fat explanation (or a link to it) next to them that goes into how to read the data. That way you can somewhat reduce the chances of people wildly misinterpreting your presented data in internet slapfights.

Having said all that, I am a big fan of using what is already there for exposing and applying all this data. It is a sensible approach that hopefully will allow a lot more than would be possible if you tried to do it all with custom solutions.

In that regard, I am curious how things are set up in the background. Is the labs team leaning on the Floatplane development infrastructure to make sure the things they build are maintainable and such? If not, what is the approach there? To me it seems that a lot of what the labs does is effectively very similar to modern software development, and specifically very similar to test automation and performance testing. So I'd expect that internally there could be a lot of benefit from setting things up in the same way. For example, making use of GitLab (self-hosted) for the development, hosting, and possibly even triggering of MarkBench through pipelines. Because to keep things consistent you of course want to make sure things are run with the same version of MarkBench and the same configs for games. In a similar sense, I can see a benefit in keeping a repository with sets of game configs for various types of benchmarking. In fact, with a little bit of effort I can even see a future where you don't use a GUI, or where the GUI doesn't trigger a local MarkBench but instead triggers a pipeline which fires up the benchmarks on a few test benches in parallel.

As someone who works as a test automation engineer, I see a lot of potential here. Needless to say, I am excited about this. In fact, if it wasn't for me being located in Europe, I honestly would have considered applying for an LTT job at some point.

IdkIJustWannaAskSomething · October 14, 2022

Will it be open source and published under a libre licence like GPL v3? Ideally one that also restricts commercial sale or commercial redistribution of the software, so it doesn't end up like MINIX. Also, will you do a Linux release, or help the community port it to Linux? In your videos about Linux you always complained about support. Now is your time to do something about it, or the situation will never change. It would be great to also run this on Steam Decks, which is another reason a Linux port would be very cool. But open source is still the best way to go; then at least the community can do something with it.

Smokingpc · October 14, 2022

9 hours ago, AdamFromLTT said:

When it launches, MarkBench will be completely free.

When you say free, do you mean free as in freedom (libre) or just price? Because we already have a free as in freedom software called Phoronix Test Suite (PTS) that does everything you said in your video. Hopefully MarkBench is ACTUALLY free or else there won’t be a reason to use it.

I Just Want Ram Drives · October 14, 2022

Make it free for media-use as well.

The way to defeat the BS about favoring certain reviewers is to unionize the reviewers around a test suite that companies can't bypass. Make it so all the media reviewers can get around the tight time limitations.

HW100 · October 14, 2022

Already love the concept behind it. Doing ALL the benchmarks in one click? Oh hell yeah!

10 hours ago, AdamFromLTT said:

Is it free?

When it launches, MarkBench will be completely free.

Linus mentioned in the video that it will be free for personal use. So what about... Let's say other YouTubers who want to use it? Will it require a licence then? Or a special agreement?

Shepanator · October 14, 2022

Not really a feature suggestion but for the data visualisation you should look into either using Google data studio (already comes with Gsuite, is automated and works well with big datasets) or Metabase (basically an entire browser based database frontend with a GUI query builder & automated reports/dashboards with visualisation. There's an open source version which can be self hosted too)

ReHWolution · October 14, 2022

This is incredibly interesting. Will it also be free to use for other reviewers? Or is it gonna need a paid license?

gabbaii_ · October 14, 2022

In the video Linus mentions it's sometimes difficult to visualise a tonne of benchmark information at once. If we want to contribute, would it be possible for you to release a desensitised dataset for us visualisation nerds to play with?

dpr78 · October 14, 2022

Given the cost of electricity is only going up, especially in Europe, I'd be interested in seeing which cards are most power efficient at a given performance level.

Let's say I just want to play <favourite game> at 1080p on high settings and achieve, say, a solid 120fps; yeah, I could just get the latest Nvidia 40 series, but is it more efficient than an older 20 or 30 series, or one from AMD or Intel? Or does it not even matter that much?
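The comparison asked for here could be expressed as a sketch like the one below, assuming each run records average frame rate and average power draw with the frame rate capped at the target. The `Run` class, the card names, and every number are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class Run:
    card: str
    avg_fps: float
    avg_power_w: float

    @property
    def fps_per_watt(self) -> float:
        return self.avg_fps / self.avg_power_w

    @property
    def joules_per_frame(self) -> float:
        # watts are joules per second, so W / (frames per second) = J per frame
        return self.avg_power_w / self.avg_fps


def most_efficient(runs, target_fps):
    """Among runs that reach target_fps, pick the one drawing the least power."""
    eligible = [r for r in runs if r.avg_fps >= target_fps]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.avg_power_w)


# Placeholder data, invented for illustration only.
runs = [
    Run("Card A", avg_fps=120, avg_power_w=200),
    Run("Card B", avg_fps=120, avg_power_w=150),
    Run("Card C", avg_fps=90, avg_power_w=100),
]
best = most_efficient(runs, target_fps=120)
```

Card C draws the least power overall but misses the target, which is why the filter on `target_fps` comes before the comparison.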

NLStitch · October 14, 2022

Feedback/ Feature request;

It raises a lot of questions; How will you handle cpu instruction sets? Will you have support for legacy systems that lack certain cpu instruction sets? How will you handle benchmarks of software that is incompatible with the underlying hardware?

Since this will be the new standard, will you test older hardware or just current and next gen.. and if you manage to test older hardware.. how old will you go?

The reason for all these questions is quite simple: not everyone has the money to build a next gen system, so publishing results for older systems would benefit the used market and may even help fellow YT channels that review older hardware (e.g. "Using CPU/GPU X in 2022" style videos).

Frappocalypse · October 14, 2022

I can definitely understand not wanting to make it open-source right away (even just code shame from the types of hacky things you do to create an MVP or prototype application), though what might be a good middleground for allowing community contributions (or contributions from other creators) is documenting how to make harnesses, and making them externally loadable if they aren't already (e.g. through a harness directory). The phrasing in the video of developing harnesses for games already implies a setup somewhat like this, but I think if you documented the process of making one and made it easy to add harnesses then you could get a huge amount of community engagement on that front, and a lot of useful additional tests. I don't know what the current restrictions in MarkBench are, so there may be security concerns around making it easy for random users to run random scripts depending on whether it is a fairly restricted definition format or essentially just an open-ended Python script, but if it is relatively locked down this could be a very effective way to get support for a lot of games quite quickly.

In a similar vein, I'm not sure if the configuration settings are currently hardcoded, but if the harness defines the settings that a game has available and how to modify them, you could then set up configuration profiles for a harness with values for those settings. That could both lead to the creation of more useful harnesses by the community and open up the potential for testing games with different setting values, which may also be of interest (e.g. DLSS on/off, 1080p/1440p/4K res).
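To make the harness-directory and profile idea concrete, here is a small sketch under heavy assumptions. The JSON layout, the field names (`name`, `settings`, `default`, `choices`), and both functions are invented for illustration; MarkBench's real harness format is not known from this thread. Keeping definitions as data instead of scripts is one way to address the security concern raised above.

```python
import json
from pathlib import Path


def load_harnesses(harness_dir):
    """Load every *.json harness definition found in a directory."""
    harnesses = {}
    for path in sorted(Path(harness_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            definition = json.load(f)
        harnesses[definition["name"]] = definition
    return harnesses


def resolve_profile(harness, profile):
    """Merge a profile with the harness defaults and validate the values."""
    settings = harness["settings"]
    unknown = set(profile) - set(settings)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    resolved = {}
    for key, spec in settings.items():
        value = profile.get(key, spec["default"])
        if value not in spec["choices"]:
            raise ValueError(f"{key}={value!r} is not one of {spec['choices']}")
        resolved[key] = value
    return resolved


# A hypothetical harness definition, as it might appear in a JSON file.
EXAMPLE_HARNESS = {
    "name": "example_game",
    "settings": {
        "resolution": {"default": "1080p", "choices": ["1080p", "1440p", "4K"]},
        "upscaling": {"default": "off", "choices": ["off", "on"]},
    },
}
```

A profile such as `{"resolution": "1440p"}` would then resolve to 1440p with upscaling left at its default of off.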

(Also, a generic request for the framework to support Linux/macOS even if individual harnesses may not. Aside from the standard reasons, something like testing game performance under Proton on Linux versus native on Windows could be a different angle than the hardware tests this seems targeted towards, but very interesting all the same.)

This is a really cool project, and I wish you all the best with it!

Iris104 · October 14, 2022

I see your videos guys, it's amazing

A feature that I want is a laptop battery benchmark. This would run the benchmark (browse YouTube / video playback, etc.) and periodically send the data to the servers until the battery dies. That way I can just leave the laptop running at night with no need to worry, since the data will stay in the cloud. This is very useful when selling and buying a second hand laptop, as well as when reviewing a new laptop model. It could potentially also be useful for benchmarking a UPS.
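A sketch of the server-side half of this idea, assuming the laptop uploads a `(seconds_since_start, battery_percent)` sample at a fixed interval. The function name and the sample data are hypothetical. Because the last sample arrives before the machine dies, the runtime can be bounded to within one upload interval without the laptop ever reporting a final result.

```python
def estimate_runtime_minutes(samples, interval_s):
    """Bound the battery runtime from periodically uploaded samples.

    samples: list of (seconds_since_start, battery_percent), in arrival order.
    Returns (lower_bound_min, upper_bound_min, last_percent_seen).
    """
    if not samples:
        raise ValueError("no samples received")
    first_t = samples[0][0]
    last_t, last_pct = samples[-1]
    lower = (last_t - first_t) / 60.0
    # The machine shut down some time before the next upload was due.
    upper = lower + interval_s / 60.0
    return lower, upper, last_pct


# Made-up samples uploaded once a minute.
uploads = [(0, 100), (60, 99), (120, 97)]
bounds = estimate_runtime_minutes(uploads, interval_s=60)
```

A shorter upload interval tightens the bound, at the cost of the upload traffic itself affecting battery life.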

GorujoCY · October 14, 2022

38 minutes ago, Iris104 said:

I see your videos guys, it's amazing

A feature that I want is a laptop battery benchmark. This would run the benchmark (browse YouTube / video playback, etc.) and periodically send the data to the servers until the battery dies. That way I can just leave the laptop running at night with no need to worry, since the data will stay in the cloud. This is very useful when selling and buying a second hand laptop, as well as when reviewing a new laptop model. It could potentially also be useful for benchmarking a UPS.

The problem is that unless your computer hibernates or sleeps before the battery runs out, there's no way to submit the battery data at that point; you'd have to charge it first. One way would be to gather it from logs of some sort, but that isn't really an option. Maybe MarkBench could detect when the computer slept or hibernated, wait until the battery is back at 100% charge, and then give you the result and hopefully, with my suggestion quoted below:

11 hours ago, GorujoCY said:

Not gonna lie but everything is looking very promising and I'm all for it, im willing to contribute if a little and especially beta test it or some sort!

My suggestion is adding a way to submit the data online for the community to see, so if LTT for example doesn't wanna test the Intel Arc A380 for example, well maybe another youtuber or an individual does those automation benchmarks and submits them onto the website for users to see, this will generalize the way we see gaming and productivity data while finally not worrying about bloody userbenchmark and their manipulations or something like that but since the data would be community contributed, it really does help to give consumers realistic expectations going forward...

An optional one would also be trying to automate VR testing too, with what Headset is connected and what way etc. (eg. Direct SteamVR: Valve Index, SteamVR via Oculus Link/Air link/Virtual Desktop: Oculus Quest 2, SteamVR via iVry: phonevr, SteamVR via Oculus: Oculus Rift S, SteamVR via WMR: HP Reverb G2 and etc. and for "other" headsets: SteamVR via OpenXR: unknown) and I can also see the appeal for VR youtubers to use this if it gets implemented, genuinely guys you got this!

I hope this gets implemented and this is definitely going to be the perfect benchmark application by far!

to be able to submit that data for users to see...

Blobby Fisher · October 14, 2022

What's the expected dev time? Got about 60 cards I been too lazy to test. Also, can you make it cycle through multiple graphics cards in one PC? Like if mobo has a couple of full size slots, can it do all that's on 1 graphics card, then auto move to the other? Will save me a lot of time.

DevSwarnakar · October 14, 2022

My request is pretty simple, and I guess I'm not the only one wanting to say this: could you please include a Blender render test too, if you're checking the GPUs? We from the CG community would love to see that! Thanks!
