3DMark 2005
Compared to 3DMark 03, the newly released 3DMark 05 has grown by one and a half times: it now occupies 280 MB as an archive and 633 MB when unpacked. System requirements have also changed. It now requires a processor of at least 2 GHz, at least 512 MB of RAM, and a graphics accelerator with support for pixel and vertex shaders version 2.0 and at least 128 MB of memory.
YOM: 2006
Developer: Futuremark Corporation
Platform: PC
Minimum system requirements:
Operating system: Microsoft Windows 2000 or XP
Processor: x86 compatible processor with MMX support, 2000 MHz
RAM: (512 MB recommended)
DirectX: DirectX 9.0c or later (required)
DirectX 9 and shader models
As the functionality of GPUs evolves, developers must take advantage of their additional capabilities to both improve the quality of scenes and gain additional performance.
Microsoft, the creator of DirectX, has demonstrated remarkable flexibility in allowing two leading GPU developers to spawn a suite of shader models that leverage advanced GPU functionality beyond the basic requirements of the 2.0 shader model. The industry needed a standardized approach to extending the core capabilities of DirectX 9, and the result was Shader Model 2.0a, Shader Model 2.0b and Shader Model 3.0.
For example, when Far Cry calculates illumination with model 3.0 pixel shaders, up to 4 light sources are processed in one pass; with model 2.0b, up to 3; and with model 2.0, only one.
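The effect of this per-pass limit is easy to estimate. The sketch below is illustrative only: it takes the lights-per-pass figures quoted above for Far Cry and counts the lighting passes needed for a given number of lights. The function name and the dictionary keys (borrowed from the HLSL profile names) are assumptions made for this example, not anything taken from the game.

```python
import math

# Lights handled per pass, as quoted in the text for Far Cry.
# The keys reuse HLSL pixel shader profile names purely as labels.
LIGHTS_PER_PASS = {"ps_3_0": 4, "ps_2_b": 3, "ps_2_0": 1}


def lighting_passes(light_count: int, profile: str) -> int:
    """Return how many lighting passes are needed for `light_count` lights."""
    if light_count < 0:
        raise ValueError("light_count must be non-negative")
    return math.ceil(light_count / LIGHTS_PER_PASS[profile])
```

With four lights, for instance, the 3.0 path needs a single pass, the 2.0b path needs two, and the 2.0 path needs four.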
So, with the transition of games to a new way of development, a new approach to performance assessment has appeared: instead of testing video cards under absolutely identical conditions, each accelerator should be tested with the shader model that most fully uses its capabilities. This is exactly how the developers at Futuremark envisioned 3DMark05.
3DMark05: stages of the journey
Futuremark, which calls its test suites "The Gamer's Benchmark", constantly adds support for new technologies and extends the functionality of 3DMark. First introduced in late 1998, Futuremark's benchmark suites have become the go-to tool for measuring graphics card performance and the balance of power among them, for people ranging from enthusiasts to corporate executives.
3DMark99 - concentrated on the speed of texturing and polygon processing.
3DMark2000 - received support for hardware transformation of polygons, and at the same time the complexity of scenes increased.
3DMark2001 - received support for vertex and pixel shaders 1.1 and additionally increased scene complexity - now there were already tens of thousands of polygons in the scenes.
3DMark03 - used shaders 1.x and 2.0. Only one of the gaming tests did not use pixel shaders. All other gaming tests made extensive use of DirectX 8 pixel and vertex shaders, and the final and most complex gaming test made extensive use of model 2.0 shaders. The complexity of the scenes increased to hundreds of thousands of polygons.
3DMark05 - raised the level of technology even higher. The package uses only shaders of model 2.0 and higher, and all shaders can be executed in any of the profiles corresponding to models 2.0, 2.0a, 2.0b and 3.0. The complexity of scenes has increased: now on average there can be more than one million polygons in a frame.
The main difference between 3DMark05 and previous versions of the test package is the use of at least model 2.0 shaders and the selection of the optimal rendering path for each video card.
Patric Ojala from Futuremark gives this example: “We use several specially prepared shaders, for example, in the first game test, a shader corresponding to the 3.0 model uses dynamic execution control and stops working when it detects that the surface is not illuminated. Another example would be a shader that uses Depth Stencil Textures.”
Graphics engine: using shaders
Futuremark's previous incarnations of 3DMark used modified versions of MAX-FX, but for 3DMark05 the company has developed a new graphics engine. All shaders used in scenes are written in a high-level language, HLSL. These shaders are not executed directly; before being sent to the accelerator they must be compiled, i.e., translated into a form the GPU and its driver can understand. DirectX offers several profiles, that is, several target configurations for the shader compiler, designed for GPUs with different functionality. So, say, for the ATI RADEON 9700 PRO the PS 2_0/VS 2_0 profile will be used, and for the NVIDIA GeForce 6800 Ultra, PS 3_0/VS 3_0. The latest generation of GPUs exceeds the basic requirements of DirectX 9.0, and although these GPUs support lower profiles such as PS 2_0/VS 2_0, by default 3DMark05 selects for them the profile that makes the most use of their functionality. The same applies to graphics processors that appear after the release of 3DMark05: based on the capabilities reported by their drivers, the profiles that most fully utilize their functionality will be selected.
So, in the new reincarnation of the test package, Futuremark moves even further away from the idea of comparing video cards in absolutely identical conditions. This is not surprising: all modern video cards support the basic requirements of DirectX 9.0, but above the basic requirements, all have completely different functionality. Putting them in the same conditions is incorrect: these same conditions in different cases will be optimal for some video cards and non-optimal for others. Instead, 3DMark05, by choosing the most functional profiles for each graphics processor, uses the capabilities of each video card to the maximum.
However, for those who are still interested in comparing the performance of video cards under the same conditions, the ability to select a profile for compiling HLSL shaders has been introduced. This way you can force the video card to work with a less functional profile, say, NVIDIA GeForce 6800 Ultra will not use shaders 3.0, but the speed of rendering scenes will, of course, change.
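The selection logic described here, a default "best available" profile plus an optional forced one, can be pictured with a small sketch. This is not Futuremark's code: the function name, the idea of passing the supported profiles as a set, and in particular the preference order between `ps_2_b` and `ps_2_a` are assumptions made for illustration. Only the profile names themselves are real HLSL compile targets.

```python
# Hypothetical preference order, most capable first. The relative order of
# ps_2_b and ps_2_a is an assumption; the two profiles are not strictly
# comparable feature by feature.
PREFERENCE = ("ps_3_0", "ps_2_b", "ps_2_a", "ps_2_0")


def pick_profile(supported, forced=None):
    """Pick the pixel shader profile used to compile the HLSL shaders.

    supported: set of profile names the GPU/driver reports.
    forced:    optional profile chosen by the user for like-for-like runs.
    """
    if forced is not None:
        if forced not in supported:
            raise ValueError(f"{forced} is not supported by this GPU")
        return forced
    for profile in PREFERENCE:
        if profile in supported:
            return profile
    raise ValueError("Shader Model 2.0 support is required")
```

A card that reports only `ps_2_0` gets that profile; a card that also reports `ps_3_0` defaults to it, but can still be forced down to `ps_2_0` for a comparison under identical conditions.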
Graphics Engine: CPU Usage
In gaming tests, Futuremark's new package does not use CPU resources for anything other than preparing data to build the scene. That is, there are no calculations related to game physics, logic or AI - “artificial intelligence” - in game tests.
Most of the built-in tests in regular games are organized in the same way: while a demo recording is played back and speed is measured, the calculation of AI, physics, etc. is turned off. For example, the built-in test in Doom3 is organized in this way.
So, in terms of the use of CPU resources in gaming tests, Futuremark developers tried to get closer to real gaming tests, and this approach seems to be quite justified, because 3DMark is, first of all, a test of video cards, not central processors.
Graphics engine: shadow calculation system
Dynamic shadows appeared in 3DMark 2001 scenes - the engine used projected shadow maps. In 3DMark03, in the second and third tests, the graphics engine switched to a different way of constructing shadows, the same one used by the “great and terrible” Doom3 - calculating volumes that bound shadowed areas and using a stencil buffer to determine the illumination of objects.
In 3DMark05, the developers moved away from this method of calculating shadows: while providing excellent quality, it nevertheless has a number of disadvantages. For each object that should cast a shadow, you need to create its “shadow volume”, a polygonal model whose faces on the side of the light source are the faces of the object itself, and whose sides are the silhouette of the object extruded away from the light source to infinity. Finding the edges that form the silhouette of an object and are subject to extrusion is a difficult task performed by the central processor, and the more complex the object, that is, the more polygons included in the calculation, the more time it takes to create the “shadow volume”.
Further use of these invisible “shadow volumes” requires rendering them into the stencil buffer, which significantly increases the fill-rate load on the GPU. And the more objects cast shadows, the greater the load on the GPU.
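To give a feel for the CPU work involved in building a shadow volume, here is a minimal sketch of the silhouette search, assuming a closed triangle mesh with counter-clockwise (outward-facing) triangles and a point light. An edge belongs to the silhouette when one of its two adjacent triangles faces the light and the other does not. All names here are hypothetical, and a real engine would use a far more optimized data structure.

```python
from collections import defaultdict


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def silhouette_edges(vertices, triangles, light_pos):
    """Return the sorted list of edges (as vertex index pairs) to extrude."""
    facing = defaultdict(list)  # edge -> "faces the light" flag per triangle
    for i0, i1, i2 in triangles:
        v0, v1, v2 = vertices[i0], vertices[i1], vertices[i2]
        normal = _cross(_sub(v1, v0), _sub(v2, v0))
        centroid = tuple((a + b + c) / 3 for a, b, c in zip(v0, v1, v2))
        lit = _dot(normal, _sub(light_pos, centroid)) > 0
        for a, b in ((i0, i1), (i1, i2), (i2, i0)):
            facing[(min(a, b), max(a, b))].append(lit)
    # Silhouette: shared by exactly two triangles that disagree about the light.
    return sorted(edge for edge, flags in facing.items()
                  if len(flags) == 2 and flags[0] != flags[1])
```

Every triangle of every shadow caster is visited each time the light or the object moves, which is why the cost grows with the polygon count.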
The shadow calculation method used in 3DMark05 is free from these shortcomings. 3DMark05 uses a type of shadow map called “perspective shadow maps” (PSM) to calculate dynamic shadows, with its own modifications to minimize the manifestation of their characteristic shortcomings.
When calculating dynamic shadows using a shadow map, the stages of building a scene look something like this:
- First, the scene is rendered from the position of the light source. No textures, pixel shaders, etc. are used at this stage, since all that is needed is the Z value, that is, the distance of the scene pixels from the light source. This value is written for each pixel to the output buffer in floating point format. The larger this buffer and the more precise its data format, the better the result will be.
- After the shadow map has been calculated, the scene is rendered in the usual way, from the camera position. Pixel shaders are used to determine the illumination of pixels: for each pixel in the scene, the shader finds the corresponding shadow map pixel and calculates the distance from the scene pixel to the light source. If this distance is equal to or less than the value stored in the shadow map, the pixel is illuminated. If this distance is greater, then some element of the scene was closer to the light source when the shadow map was calculated, and the pixel is in shadow.
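The two stages above can be sketched on the CPU in a few lines. This is a deliberately simplified illustration and not the 3DMark05 implementation: it assumes a directional light shining along the Z axis, an orthographic projection of the unit square onto the map, and a scene given as sample points `(x, y, depth)`. The small `bias` term is also an assumption, added to avoid self-shadowing from rounding.

```python
def texel(x, y, size):
    """Map a position in [0, 1) x [0, 1) to shadow-map texel indices."""
    u = min(int(x * size), size - 1)
    v = min(int(y * size), size - 1)
    return u, v


def build_shadow_map(points, size):
    """Stage 1: store, per texel, the smallest distance to the light."""
    shadow_map = [[float("inf")] * size for _ in range(size)]
    for x, y, depth in points:
        u, v = texel(x, y, size)
        shadow_map[v][u] = min(shadow_map[v][u], depth)
    return shadow_map


def is_lit(point, shadow_map, bias=1e-3):
    """Stage 2: a point is lit if nothing closer to the light covers its texel."""
    x, y, depth = point
    u, v = texel(x, y, len(shadow_map))
    return depth <= shadow_map[v][u] + bias
```

A point directly behind an occluder lands in the same texel with a larger depth and is therefore reported as shadowed, while the occluder itself passes the "equal to or less than" test.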
The main advantage of calculating dynamic shadows with shadow maps such as PSM is that no additional calculations are required from the central processor, and the number of calculations does not depend on the complexity of the scene: invisible but resource-consuming “shadow volumes” are not added.
Modern pixel processors support long and complex shaders, which makes it possible to determine the shading of each pixel in relation to several light sources in one pass, i.e., further reduce the amount of work.
Plus, this method uses pixel processors, and it is their performance that has been growing at the fastest rate lately - faster than the power of vertex processors, CPU performance, memory bus speed or texture fetch speed.
This method, of course, has its own weaknesses, but the developers at Futuremark assure that their modification of PSM is well suited for a wide variety of scenes and light sources.
Shadow maps for directional light sources are calculated in a resolution of 2048x2048; shadow maps are saved in floating point, R32F or D24X8 format. For these lights, the graphics engine calculates two shadow maps: one for objects closer to the camera, and one for the rest of the scene. This achieves increased accuracy in the calculation of shadows in those places where it is most noticeable, i.e., close to the camera, while maintaining the ability to calculate shadows for the rest of the scene. However, even this is sometimes not enough to completely avoid the appearance of artifacts - in the third 3DMark05 gaming test, shading artifacts are noticeable on sections of rocks located almost parallel to the sun's rays.
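The benefit of splitting the scene between two maps can be seen from the texel size alone. The sketch below is only an illustration: the split distance and the world-space extents covered by each map are hypothetical parameters, since the article does not state the values used by 3DMark05.

```python
SHADOW_MAP_SIZE = 2048  # resolution quoted in the text for directional lights


def texel_size(extent, resolution=SHADOW_MAP_SIZE):
    """World-space width covered by one shadow-map texel."""
    return extent / resolution


def pick_shadow_map(distance_to_camera, split_distance):
    """Choose the map for a pixel: 'near' close to the camera, else 'far'."""
    return "near" if distance_to_camera <= split_distance else "far"
```

If, say, the near map covered 64 units of the scene and the far map 2048 units, one texel would span 0.03125 units near the camera but a full unit in the distance, which is where coarse shadow edges would be least noticeable.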
Note the shadow cast by the cables near the "fin" of the flying ship, and the fragment of the canyon wall.
The developers note that these are not driver errors or hardware problems; rather, they are a manifestation of one of the weak points of the shadow map method.
For non-directional light sources, the 3DMark05 graphics engine builds six shadow maps in the R32F format of 512x512, placing the light source in the center of an imaginary cube, building shadow maps for the 6 faces of this cube and thus reducing this case to the case of a directional light source.
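With six maps arranged as the faces of a cube, each shaded point must first be assigned to one face. A common way to do this, shown here as a sketch rather than as the engine's actual code, is to take the direction from the light to the point and pick the axis with the largest absolute component. The face labels and the function name are assumptions for this example.

```python
def cube_face(light_pos, point):
    """Return which of the six cube faces ('+x', '-x', ...) covers `point`."""
    direction = [p - l for p, l in zip(point, light_pos)]
    # The dominant axis decides the face; ties resolve to the first axis.
    axis = max(range(3), key=lambda i: abs(direction[i]))
    sign = "+" if direction[axis] >= 0 else "-"
    return sign + "xyz"[axis]
```

Once the face is known, the lookup within that 512x512 map proceeds exactly as for a single shadow map.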
When the hardware supports Percentage Closer Filtering (PCF) and Depth Stencil Textures (DST), values are sampled from the shadow map in the pixel shader using PCF, that is, in effect, with ordinary bilinear filtering. If there is no hardware support for PCF, filtering is performed directly in the shader: the 4 values closest to the sampling point are selected from the shadow map and averaged.
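The in-shader fallback can be sketched as follows. This is an assumption-laden illustration, not the 3DMark05 shader: it takes the four texels nearest to the sampling point, compares each against the receiver's depth, and averages the four pass/fail results with equal weights. Averaging the comparison results rather than the raw depths, the equal weighting, and the `bias` term are all choices made for this example.

```python
import math


def shadow_factor(shadow_map, u, v, receiver_depth, bias=1e-3):
    """Return the lit fraction in [0, 1] from the 4 nearest shadow-map texels.

    u, v are normalized coordinates in [0, 1]; shadow_map is a square grid.
    """
    size = len(shadow_map)
    x0 = math.floor(u * size - 0.5)
    y0 = math.floor(v * size - 0.5)
    lit = 0
    for dy in (0, 1):
        for dx in (0, 1):
            # Clamp to the map edges so border samples stay valid.
            x = min(max(x0 + dx, 0), size - 1)
            y = min(max(y0 + dy, 0), size - 1)
            if receiver_depth <= shadow_map[y][x] + bias:
                lit += 1
    return lit / 4
```

A result between 0 and 1 produces a softened shadow edge instead of a hard, blocky one.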
These approaches give slightly different results, both in performance and in image quality, but the developers at Futuremark believe that DST and PCF, that is, hardware support for and hardware filtering of shadow maps, should be used whenever possible, since most major game developers are already using these GPU features and the demand for them will only increase in the future.
So, enough details. Let us finally move on to the description of the tests.
Game Test 1: Return to Proxycon:
The first game test definitely belongs to the section of action games: space pirates again attack the Proxycon cargo ship.
Reflecting the scenes of classic shooters, Game Test 1: Return to Proxycon combines fairly large rooms with narrow corridors, and a large number of simultaneously fighting foot soldiers, bringing the gaming situation closer to multiplayer games.
Most surfaces in Game Test 1: Return to Proxycon use shader-defined “metallic” materials with lighting calculations based on the Blinn-Phong model. The exponents required for the calculation are not calculated mathematically in the shaders; instead, samples from a pre-calculated table are used.
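The idea of replacing the exponent with a table lookup can be illustrated with a short sketch. It assumes the table is indexed by the clamped dot product of the normal and the half vector and holds that value raised to a fixed shininess; the table size, the shininess value and all names are hypothetical, and in a real shader the table would be a texture.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return v if length == 0 else tuple(c / length for c in v)


def build_pow_table(shininess, size=256):
    """Precompute x ** shininess for x sampled uniformly in [0, 1]."""
    return [(i / (size - 1)) ** shininess for i in range(size)]


def blinn_phong_specular(normal, light_dir, view_dir, table):
    """Blinn-Phong specular term using a lookup instead of pow()."""
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    half = normalize(tuple(a + b for a, b in zip(l, v)))
    n_dot_h = min(1.0, max(0.0, sum(a * b for a, b in zip(n, half))))
    index = round(n_dot_h * (len(table) - 1))
    return table[index]
```

The trade-off is one extra texture fetch in exchange for fewer arithmetic instructions in the shader.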
In total, the scene has 8 light sources that cast shadows: 2 directional light sources, for which shadow maps of 2048x2048 are calculated, and six non-directional sources, for which shadow maps of 512x512x6 are calculated.
Game Test 2: Firefly Forest:
This test is a good example of an outdoor scene using a lot of vegetation. The scene is relatively small, but extremely rich in detail.
A moonlit night. The ground is covered with thick grass, and the branches of the trees sway slightly in a light breeze...
Dense ground vegetation is displayed dynamically: its density and level of detail change as the camera moves. Blades of grass are displayed only where needed, reducing GPU load while maintaining high visual quality.
The ground surface in this test is rendered using the "metal" shader from the first test, but with the addition of base and detailed color/normal maps. The tree branch material does not use bump or specular maps, but does have a cube color map. The sky is rendered using a shader that simulates light scattering.
Moonlight is a directional light source that casts dynamic shadows. Shadows are calculated using a shadow map with a resolution of 2048x2048. The magic firefly illuminates the grass and trees as an omnidirectional light source with a 512x512x6 shadow cubemap.
Game Test 3: Canyon Flight:
The last game test features large open spaces: in this scene, a flying ship in the style of Jules Verne sails over the waves through a canyon guarded by a real sea devil.
The water surface is the most striking part of this scene. The water not only imitates reflections and refractions but also has its own transparency value, so that a sea monster moving through the water appears to be actually swimming in it, and not behind cloudy refractive glass.
The shader used to display the water surface is an improved modification of the "water" shader from 3DMark03, but the water surface is not only a shader. To correctly calculate refractions and reflections, including the correct display of shadows, six passes of the graphics accelerator are required. The shader itself uses readings from normal, refraction and reflection maps. In addition, a volumetric fog is used for objects located underwater, making them darker and less saturated as they move deeper from the surface of the water.
Fog is used in the scene to enhance the presence of a large open space, making distant rocks look more natural.
The shader used to render rocks is what the developers call the most complex 3DMark05 shader - when combined with shadow calculations, it barely fits into the specifications of pixel shaders model 2.0. The rock material uses two base textures, two normal maps and a Lambert lighting calculation.
The scene has one light source - the Sun. Sun shadows are calculated using two shadow maps with a resolution of 2048x2048, one map is used for objects close to the camera, and the second is used for the rest of the scene.
CPU Test:
3DMark05, like 3DMark03, uses game tests at 640x480 resolution with post-effects disabled and vertex shader software emulation to test the CPU speed. This shifts the balance of tests towards increasing the CPU load and makes the test results dependent on the CPU speed rather than the video card. To ensure that the tests are run under absolutely identical conditions on any system, both CPU Tests use a scene output mode with a fixed number of frames per second.
In the first CPU Test, the developers introduced additional calculations assigned to the central processor. Although the ship's flight through the canyon always follows a fixed trajectory, this test continuously calculates an optimal trajectory that follows the contours of the canyon. These calculations run in a separate thread, which makes it possible to take advantage of multiprocessor systems or processors with Hyper-Threading.
Fill Rate:
This test was carried over to 3DMark05 virtually unchanged. What has changed is visible to the naked eye: to reduce memory bandwidth requirements and isolate texturing speed, the developers reduced the resolution of the textures as much as possible, so they are now plain “cells”.
The test, as usual, has two modes: single texture overlay and multi-texturing. In single texture overlay mode, the scene has 64 surfaces with one texture layer on each, and in multi-texturing mode there are 8 surfaces with eight textures on each.
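Both modes apply the same total number of texture layers per frame, which the arithmetic below makes explicit. The throughput formula is a rough sketch under a stated assumption that every surface covers the whole screen; the article does not say this, and the function names and sample figures are hypothetical.

```python
def texture_layers(surfaces, layers_per_surface):
    """Total texture layers applied per frame."""
    return surfaces * layers_per_surface


def texels_per_second(fps, width, height, surfaces, layers_per_surface):
    """Rough texel throughput, assuming each surface fills the screen."""
    return fps * width * height * texture_layers(surfaces, layers_per_surface)
```

Since 64 x 1 and 8 x 8 both give 64 layers, any difference between the two results reflects how efficiently the card handles multi-texturing versus drawing many single-textured surfaces.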
Pixel Shader:
This test uses the most complex of 3DMark05's shaders, the rock surface shader in the third gaming test. The shader was carried over from the game test with only one change - shadows are not calculated here.
It is worth recalling that the shader is written in HLSL and that the basic GPU requirement, as in all other 3DMark05 tests, is support for model 2.0 shaders.
The developers note that the results of this test will be determined not only by the performance of pixel processors, but also by the speed of the memory bus - this shader intensively uses large-volume textures.
As a less memory-speed-dependent alternative, the developers considered a shader built on mathematical calculations, that is, one “creating textures on the fly”. But, firstly, a similar shader was already used in 3DMark03, and secondly, as Futuremark notes, game developers are much more willing to use ordinary textures than mathematical calculations in shaders.
Vertex Shader:
The test consists of two parts: in the first part, the speed of simple transformation of sea monster models is measured - the shader responsible for the transformation may well fit within the specifications of model 1.0 shaders, but the test, following the Futuremark ideology, uses DirectX 9.0.
The second, more complex version of the test uses a complex vertex shader to transform blades of grass. Each blade of grass bends independently of the others under the influence of the “wind” set by fractal noise calculated on the central processor. In order to reduce the influence of factors such as CPU performance and shading speed on the test results, the calculation of fractal noise is optimized as much as possible, and the blades of grass are located further away from the camera.
Batch Size Tests:
Batch Size Tests are a set of tests designed to measure the speed of rendering batches of different sizes, that is, groups of polygons sent by the application to the accelerator driver in one Direct3D function call. Each of these tests draws the same number of polygons, but the polygons are split into groups of a different size each time: 8, 32, 128, 512, 2048, and 32768 polygons.
To prevent video card drivers from combining small groups into larger ones for optimization purposes, each initial group of polygons is drawn with its own color - this causes the graphics pipeline to restart when each new group of polygons arrives.
The test reveals how much graphics pipeline reboots reduce the performance of the video card and shows the efficiency of the driver and video card on groups of different sizes - it is known that sending to the accelerator and rendering the same number of polygons with one function call and one group is faster than with many small groups.
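How quickly the number of function calls grows as batches shrink is simple arithmetic. The sketch below uses the batch sizes listed in the text; the total polygon count is a hypothetical figure, since the article does not state how many polygons each test draws.

```python
import math

# Batch sizes listed for the Batch Size Tests.
BATCH_SIZES = (8, 32, 128, 512, 2048, 32768)


def draw_calls(total_polygons, batch_size):
    """Number of draw calls needed to submit `total_polygons` polygons."""
    return math.ceil(total_polygons / batch_size)


def calls_per_test(total_polygons):
    """Draw-call count for each batch size, for the same polygon total."""
    return {size: draw_calls(total_polygons, size) for size in BATCH_SIZES}
```

For a hypothetical 1,048,576 polygons, the smallest batches need 131,072 calls while the largest need only 32, so the per-call overhead of the driver and pipeline dominates the small-batch results.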
Each incarnation of 3DMark has brought modern graphics cards to their knees by using new technologies and more complex scenes, but never before has the resulting drop in benchmark scores been accompanied by such a huge leap in image quality. To get the most out of the experience and appreciate the new tests, you should, of course, watch the demo mode: each game scene in the demo mode is a complete work of art.
It is noteworthy that this time Futuremark clearly divides gaming scenes into genres, and this is noticeable at first glance at the tests - a repetition of the situation with 3DMark03, where the second and third gaming tests behind a different shell were the same inside, did not happen. Each game scene is vastly different, and at first glance it should look like it mirrors the scenes from games in the near and distant future quite well.
It is worth noting Futuremark's new approach to testing video cards with different functionality - the use of HLSL shaders and its own optimal profile for each GPU is perhaps the most adequate reflection of existing trends in the gaming industry.