Suggestion to improve "Top 50 devices" list #73322
Reference: infrastructure/blender-open-data#73322
This is open for discussion and comes from a conversation about giving visibility to Nvidia devices using both CUDA and OptiX which are currently penalized by averaging the scores.
The simplest solution would be to query and display the fastest result for a device, instead of taking the median over all the results. But that may not be accurate enough, depending on how stable the results are. Another potential option would be to group results by both device name and device type (CUDA, OptiX, OpenCL, …), rather than just device name (here: https://developer.blender.org/source/blender-open-data/browse/master/website/opendata_main/views/home.py$138), calculate the median for each and then select the fastest.
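A minimal sketch of the second option, assuming each result can be reduced to a (device name, device type, render time) row. The sample values and the function name fastest_median_per_device are made up for illustration; this is not the code in home.py.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (device_name, device_type, render_time_seconds).
# These numbers are invented for the example, not Open Data results.
samples = [
    ("GPU A", "CUDA", 40.0),
    ("GPU A", "CUDA", 39.0),
    ("GPU A", "CUDA", 41.0),
    ("GPU A", "OPTIX", 21.0),
    ("GPU A", "OPTIX", 22.0),
    ("GPU B", "CUDA", 52.0),
    ("GPU B", "OPTIX", 27.0),
    ("GPU B", "OPTIX", 28.0),
]


def fastest_median_per_device(rows):
    """Group by (device name, device type), take the median of each group,
    then keep the device type with the lowest median for every device."""
    grouped = defaultdict(list)
    for name, device_type, render_time in rows:
        grouped[(name, device_type)].append(render_time)

    best = {}
    for (name, device_type), times in grouped.items():
        group_median = median(times)
        # Lower render time is better, so keep the smallest median.
        if name not in best or group_median < best[name][1]:
            best[name] = (device_type, group_median)
    return best


ranking = fastest_median_per_device(samples)
for name, (device_type, group_median) in sorted(ranking.items(), key=lambda kv: kv[1][1]):
    print(name, device_type, group_median)
```

Each device ends up represented by its best device type only, so a device with many CUDA samples is no longer pulled away from its OptiX result.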
Changed status from 'Needs Triage' to: 'Confirmed'
Added subscriber: @fsiddi
#74214 was marked as duplicate of this issue
Added subscriber: @pmoursnv
We could take e.g. the 10th percentile instead, which is also my preferred solution. The problem with the other approach is that it makes the results too hard to interpret/verify.
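As a rough illustration of the percentile idea, here is a sketch using the nearest-rank definition of a percentile. The choice of nearest-rank, the function name and the sample times are assumptions made for this example; the thread does not say which percentile definition would be used.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Nearest rank: the smallest 1-based rank covering at least p percent of the data.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical render times in seconds for one device, mixing a few fast
# (OptiX-like) and many slow (CUDA-like) results.
times = [21.0, 22.0, 21.5, 40.0, 39.0, 41.0, 38.5, 40.5, 39.5, 42.0]

p10 = percentile_nearest_rank(times, 10)
p50 = percentile_nearest_rank(times, 50)
print(p10, p50)
```

Because lower render times are better, a low percentile tracks the faster results of a device while being less sensitive to a single outlier than the plain minimum.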
Another approach might be to limit the results on the homepage to recent results (e.g. one year).
You can still send in results for CUDA on RTX GPUs right now though, in addition to OptiX. It's only that those ideally should not affect the overall rank, since it's the worse option.
There is some weird behavior with the results currently displayed on http://opendata.blender.org:
It shows the GeForce RTX 2080 SUPER as being the fastest GPU, even though e.g. the GeForce RTX 2080 Ti has better results. Looks like this is because it uses the OptiX results for the SUPER, but the CUDA results for all other RTX GPUs to determine rank.
GeForce RTX 2080 Ti: CUDA: 39.894, OptiX: 21.02255 (https://opendata.blender.org/benchmarks/query/?device_name=GeForce%20RTX%202080%20Ti&benchmark=bmw27&group_by=device_type)
GeForce RTX 2080 SUPER: CUDA: 52.1603, OptiX: 27.4187 (https://opendata.blender.org/benchmarks/query/?device_name=GeForce%20RTX%202080%20SUPER&benchmark=bmw27&group_by=device_type)
Yet on the home page, the GeForce RTX 2080 Ti shows 38.63 for the bmw27 scene (which is close to the CUDA number) and the GeForce RTX 2080 SUPER shows 28.94 (which is close to the OptiX number).
Technically the GeForce RTX 2080 Ti is the faster GPU of the two and the benchmark results prove that, but it is not being displayed correctly in both the Top 50 and Fastest GPUs lists.
@SemMulder mind having a look at the last remark here?
This is because the number on the home page is the median of all benchmarks of that (device, scene) combo. Since the Ti has more CUDA than OptiX results, the final number is closer to the CUDA result.
@SemMulder How about this for the Top 50 query:
This adds one additional step to the query which selects the better device type for a device (see the extra SELECT rank() in there; the rest is pretty much identical to the old query). It will only have an effect on RTX GPUs currently, which support two device types (CUDA and OptiX). But it ensures the median render time is only calculated for the better option instead of calculating it over all device types.
Ping =)
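To make the effect discussed above concrete, the following sketch compares a median pooled over all device types with a median restricted to the better device type. It is plain Python rather than the SQL from the proposal, and the render times are hypothetical values chosen so that CUDA samples outnumber OptiX samples.

```python
from statistics import median

# Hypothetical render times in seconds for one (device, scene) combo.
# CUDA samples outnumber OptiX samples, as described for the Ti.
cuda_times = [39.0, 40.0, 41.0, 38.0, 42.0]
optix_times = [21.0, 22.0]

# Median over every sample regardless of device type: the majority
# device type dominates the result.
pooled_median = median(cuda_times + optix_times)

# Median per device type, then keep the better (lower) one.
best_type_median = min(median(cuda_times), median(optix_times))

print(pooled_median, best_type_median)
```

With these numbers the pooled median lands in the CUDA range even though the device is considerably faster with OptiX, which is the mismatch reported for the home page.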
Added subscribers: @Walles, @SemMulder
@pmoursnv except that the actual SQL is too much for me, I like the idea and I think it is an improvement on the current tables.
Some issues still though:
num_cpu_threads might have to be taken into account as well. If you have two $FAST_CPUs in a box, that will be twice as fast as having just one, but both results will count towards the ranking in this table.
Thanks for thinking about this, I like top lists! :)
This is false :). A benchmark needs to have minimum_number_of_samples_per_benchmark for each of the hardcoded scenes in benchmarks to show up.
This is also not entirely accurate, since we put num_cpu_sockets in device_name.
Maybe we should just hardcode the Blender version to a recent one, and pick the minimum median render time over all configurations (i.e. the minimum over (device_type, os)).
If we want to not hardcode the Blender version, we could go for the Blender version that has the highest number of samples over the last couple of months / the last 10000 samples / whatever makes sense.
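One way the "last N samples" variant could look is sketched below, assuming each sample carries a sortable timestamp and a Blender version string. The function name, the tuple layout and the data are hypothetical.

```python
from collections import Counter


def most_sampled_version(samples, last_n):
    """Return the Blender version with the most samples among the
    last_n most recent samples. Each sample is (timestamp, version)."""
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    if not samples:
        raise ValueError("samples must not be empty")
    # Sort by timestamp and keep only the newest last_n samples.
    recent = sorted(samples, key=lambda sample: sample[0])[-last_n:]
    counts = Counter(version for _, version in recent)
    return counts.most_common(1)[0][0]


# Hypothetical data: integer timestamps, older samples on 2.81, newer on 2.82.
samples = [
    (1, "2.81"), (2, "2.81"), (3, "2.81"), (4, "2.81"),
    (5, "2.82"), (6, "2.82"), (7, "2.82"),
]

print(most_sampled_version(samples, last_n=7))
print(most_sampled_version(samples, last_n=4))
```

The window size controls how quickly the ranking switches to a new release: a large window keeps favoring the older version until enough new samples arrive.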
I'd like to emphasize the importance of the original issue again. The information about RTX GPUs has been broken for weeks now, which is really bad for the affected party (considering that Open Data does influence decisions as to what hardware to buy for Blender).
@SemMulder You haven't commented on the fix I proposed above. Are there objections? It seems a simple enough change that doesn't affect most of the data. Do you want me to create a Differential request or could you commit something along those lines?
Sorry about this, we had an internal discussion about this going on. I should have at least posted a message that we were evaluating our options.
I see you created a Differential, thank you for that. I have a few other improvements I'd like to make as well, such as taking the fastest median time over all configurations, where a configuration is the tuple (device_type, os, blender_version). I will implement and discuss this internally, after which we can hopefully resolve this issue.
Awesome, thank you! Let me know if there is anything else you need from me.
I'm hoping this can be resolved as soon as possible. If you expect more than a week, it would be helpful to get a rough time estimate as to when, so that I have more answer material for the inevitable press questions on the matter that we receive here.
The changes are in 39c9dc48a6. I am in the process of deploying it as we speak.
Changed status from 'Confirmed' to: 'Resolved'
The changes are deployed. I'm closing this ticket now, but still would like to know what you think. Note that we opted to remove the table in favor of improving the chart since it was hard to convey the required information in the form of a table.
Great job! The scene breakdown is especially useful.
Long term it might be useful to add a graph (or link to one) that compares all devices (CPU, GPU), but for now the two graphs work well.