GeForce GTX 780 Ti tests

With the release of AMD's Volcanic Islands family of video cards, fierce competition broke out in the graphics adapter market. Nvidia even lowered the prices of its top-end video cards, but that did not help either. The company had to do something to turn the situation in its favor. It needed a special model that would take the lead from the flagship AMD Radeon R9 290X and remove it from the list of competitors.

In principle, Nvidia already had such a video card: the GeForce GTX Titan. At the time of its release it was the fastest single-chip video card. But its price is too high, and for most buyers it is simply unaffordable. Not even every gamer will decide to buy one. After all, for that money you can easily buy a good computer with excellent performance.

Now Nvidia has finally released a video card that is faster than its competitor from AMD and more affordable: the GeForce GTX 780 Ti. It also became the flagship of the GTX 7xx video card family.

It can be assumed that this is a kind of trump card that the "greens" kept up their sleeve. Nvidia simply waited and chose the right moment for the release.

Specifications

Compared with the GTX 780, three of the fifteen large blocks of the Kepler architecture (streaming multiprocessors) have been re-enabled in this video card. As a result it is much more powerful, especially in shader instruction throughput and texturing.

It also supports video memory running at an effective 7 GHz (7 Gbps), which raises bandwidth by roughly 16.5% over the GTX 780 (336 GB/s versus 288.4 GB/s).

Reference version dimensions

  • Length: 26.67 cm / 10.5 inches
  • Height: 11.16 cm / 4.376 inches
  • Width: Two slots

GPU Specifications

  • CUDA Cores: 2880
  • Base clock speed: 875 MHz
  • Boost clock speed: 928 MHz
  • Texture fill rate: 210 GigaTexels/s
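The texture fill rate above can be reproduced from the other figures in this article. The sketch below assumes the usual peak-rate model of one texel per texture unit per clock; the count of 240 texture units comes from the comparison table later in the article, and the function name is only illustrative.

```python
def texture_fill_rate_gtexels(texture_units: int, core_clock_mhz: float) -> float:
    """Peak texel rate in GigaTexels/s, assuming one texel per unit per clock."""
    return texture_units * core_clock_mhz / 1000.0


# GTX 780 Ti: 240 texture units at the 875 MHz base clock.
rate = texture_fill_rate_gtexels(texture_units=240, core_clock_mhz=875)
print(f"GTX 780 Ti peak texture fill rate: {rate:.0f} GTexels/s")
```

This is a theoretical peak at the base clock; with GPU Boost active the real figure may be higher.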

Memory Specifications

  • Memory speed (Gbps): 7.0
  • Memory capacity: 3072 MB
  • Memory interface: 384-bit GDDR5
  • Maximum memory bandwidth: 336 GB/s
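The bandwidth figure follows directly from the memory speed and the bus width listed above. A minimal sketch, with an illustrative function name; the GTX 780 data rate of 6.008 Gbps is taken from the effective memory clock quoted in the comparison table later in the article.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin data rate times bus width, 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


gtx_780_ti = memory_bandwidth_gb_s(7.0, 384)    # 7 Gbps on a 384-bit bus
gtx_780 = memory_bandwidth_gb_s(6.008, 384)     # 6008 MHz effective on the same bus

print(f"GTX 780 Ti: {gtx_780_ti:.1f} GB/s")
print(f"GTX 780:    {gtx_780:.1f} GB/s")
print(f"Increase:   {(gtx_780_ti / gtx_780 - 1) * 100:.1f}%")
```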

Features

  • FXAA and TXAA: +
  • NVIDIA SLI technology support: +
  • Purevideo: +
  • 3D Vision: +
  • PhysX: +

Software environment: CUDA

  • DirectX: 11
  • OpenGL: 4.3
  • Bus: PCI-E 3.0
  • 3D Games: +
  • Blu Ray 3D: +

Display specifications

  • Maximum digital resolution: 4096×2160
  • Maximum VGA resolution: 2048×1536
  • Display connectors: HDMI, DisplayPort, Dual Link DVI-I, Dual Link DVI-D
  • Multiple display support: +
  • HDCP: +
  • HDMI: +
  • Audio input for HDMI: Internal

Power and temperature

  • Maximum temperature: 95 °C
  • Power consumption: 250 W
  • Minimum system power requirements: 600 W
  • Power connectors: 6-pin & 8-pin

Comparison of GTX 780 Ti and GTX 780

Parameter | GeForce GTX 780 | GeForce GTX 780 Ti
GPU | GK110 | GK110
Number of transistors | 7.1 billion | 7.1 billion
Process technology, nm | 28 | 28
GPU clock speed, MHz (base / boost) | 863 / 900 | 875 / 928
Stream processors | 2304 | 2880
Texture units | 192 | 240
Raster operation units (ROPs) | 48 | 48
Video memory: type, capacity in MB | GDDR5, 3072 | GDDR5, 3072
Memory clock, MHz: real (effective) | 1502 (6008) | 1753 (7010)
Memory bus width, bits | 384 | 384
Interface | PCI Express 3.0 x16 | same
Display outputs | 1 x DL DVI-I, 1 x DL DVI-D, 1 x HDMI 1.4a, 1 x DisplayPort 1.2 | same
Maximum resolution | VGA: 2048×1536, DVI: 2560×1600, HDMI: 4096×2160, DisplayPort: 4096×2160 | same
Maximum power consumption, W | 250 | 250

Manufacturers

MSI GTX 780 Lightning 3 GB

It has a core frequency of 980 MHz and a memory frequency of 1502 MHz

MSI GTX 780 Twin Frozr Gaming 3 GB

ASUS GTX 780 DirectCU II OC 3 GB

It has a core frequency of 889 MHz and a memory frequency of 1502 MHz

EVGA GTX 780 Superclocked w/ ACX Cooler 3 GB

It has a core frequency of 967 MHz and a memory frequency of 1502 MHz

Gigabyte GTX 780 WindForce OC 3 GB

It has a core frequency of 954 MHz and a memory frequency of 1502 MHz

Palit GTX 780 Super JetStream 3 GB

It has a core frequency of 980 MHz and a memory frequency of 1550 MHz

MSI GTX 780 Gaming 6 GB

It has a core frequency of 902 MHz and a memory frequency of 1502 MHz

Comparison with competitors

Technical specifications | AMD Radeon R9 290X | Nvidia GeForce GTX 780 | Nvidia GeForce GTX 780 Ti | Nvidia GeForce GTX Titan
GPU | Hawaii XT | GK110 (GK110-300-A1) | GK110 (GK110-425-B1) | GK110 (GK110-400-A1)
Process technology | 28 nm | 28 nm | 28 nm | 28 nm
Number of transistors | 6.2 billion | 7.1 billion | 7.1 billion | 7.1 billion
GPU clock speed (base) | - | 864 MHz | 876 MHz | 837 MHz
GPU clock speed (boost) | 1000 MHz | 902 MHz | 928 MHz | 876 MHz
Memory frequency | 1250 MHz | 1502 MHz | 1750 MHz | 1502 MHz
Memory type | GDDR5 | GDDR5 | GDDR5 | GDDR5
Memory capacity | 4096 MB | 3072 MB | 3072 MB | 6144 MB
Memory bus width | 512 bit | 384 bit | 384 bit | 384 bit
Memory bandwidth | 320.0 GB/s | 288.4 GB/s | 336 GB/s | 288.4 GB/s
DirectX version | 11.2 | 11.1 | 11.1 | 11.1
Stream processors | 2816 | 2304 | 2880 | 2688
Texture units | 176 | 192 | 240 | 224
Raster operation pipelines (ROPs) | 64 | 48 | 48 | 48
TDP | > 250 W | 250 W | 250 W | 250 W

Comparing the 780 Ti with its competitors shows its pros and cons. For example, its number of stream processors is not much different from the others, while its memory bandwidth is slightly higher. AMD's card, on the other hand, has more video memory, a higher boost frequency and a higher pixel fill rate. Nowadays PowerTune and GPU Boost strongly influence real performance, which makes it hard to judge from specifications alone, so a comparison of these headline characteristics should not be given too much weight.

Overclocking and tests

Overclocking the GTX 780 Ti is something special. The video card has significant hidden reserves.

The company's engineers did their best. Thanks to a more refined chip and an improved power delivery system, the card can be overclocked by 200 MHz without raising the voltage.

Benchmark tests

Tests in games

Crysis 3

In Crysis 3, the video card showed excellent results, up to 36 fps at maximum settings.

Total War ROME II

This game is simply crammed with various modern graphical bells and whistles.

At Full HD resolution, fps ranged from 65 to 99 at maximum settings.

Call of Duty Ghosts

At Full HD resolution, fps ranged from 132 to 148 at maximum settings.

Conclusions

We can say that the GTX 780 Ti is an example of a high-quality video card that is designed for serious games. nVidia has taken a step in the development of the Kepler family of GPUs. One could argue that this architecture has reached its ceiling.


In this part we will study the video card and also get acquainted with the results of synthetic tests. Our laboratory tested an Nvidia reference card.

Board

  • GPU: GeForce GTX 780 Ti (GK110)
  • Interface: PCI Express x16
  • GPU operating frequency (ROPs): 875-1020 MHz (nominal: 875-1020 MHz)
  • Memory operating frequency, physical (effective): 1750 (7000) MHz (nominal: 1750 (7000) MHz)
  • Memory bus width: 384 bit
  • Number of computational blocks in the GPU / block operating frequency: 15 / 875-1020 MHz (nominal: 15 / 875-1020 MHz)
  • Number of ALUs per block: 192
  • Total number of ALUs: 2880
  • Number of texturing units: 240 (BLF/TLF/ANIS)
  • Number of rasterization units (ROP): 48
  • Dimensions: 270×100×37 mm (the card occupies 2 slots in the system unit)
  • PCB color: black
  • Power consumption (peak 3D/2D/sleep): 264/86/70 W
  • Output connectors: 1×DVI (Dual-Link/HDMI), 1×DVI (Single-Link/VGA), 1×HDMI 1.4a, 1×DisplayPort 1.2
  • Multi-GPU support: SLI (hardware)
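The unit totals in this list follow from the number of active computational blocks (SMX). The sketch below uses the 192 ALUs per block stated above and assumes 16 texture units per block, which is simply 240 divided by 15; the block counts for the GTX 780 and GTX Titan are not given in the article and are derived here from their stream processor counts (2304 / 192 and 2688 / 192).

```python
ALUS_PER_SMX = 192  # stated in the list above
TMUS_PER_SMX = 16   # assumption derived from 240 texture units / 15 blocks


def gk110_units(active_smx: int) -> dict:
    """Return the ALU and texture unit totals for a GK110 with the given block count."""
    return {"alus": active_smx * ALUS_PER_SMX, "tmus": active_smx * TMUS_PER_SMX}


# Block counts for the GTX 780 and Titan are derived, not quoted from the article.
for name, smx in [("GTX 780", 12), ("GTX Titan", 14), ("GTX 780 Ti", 15)]:
    units = gk110_units(smx)
    print(f"{name}: {smx} SMX -> {units['alus']} ALUs, {units['tmus']} TMUs")
```

The results match the stream processor and texture unit rows of the comparison tables, which is why the 780 Ti is described as a fully enabled GK110.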

Nvidia Geforce GTX 780 Ti 3072 MB 384-bit GDDR5 PCI-E

The card has 3072 MB of GDDR5 SDRAM memory located in 12 chips on the front side of the PCB.

The card requires additional power in the form of two connectors: 8- and 6-pin.

About the cooling system.


The cooling system is completely identical to the reference cooler of the GTX Titan. The cooler has a traditional closed design with a cylindrical fan at the end. The heatsink, pressed against the core, is based on a vapor chamber filled with a special, easily evaporated liquid. The lower plate of the chamber is pressed against the core; heat is transferred to the liquid, which evaporates and carries the heat to the upper plate (which has cooling fins), where the vapor condenses, and the cycle repeats. We have described this cooling scheme for top-end accelerators more than once.

The fan drives air through this heatsink and has a special impeller shape that reduces noise. That said, at maximum load the noise is still slightly noticeable, because the maximum speed is above 2200 rpm.

The memory chips are cooled by the central heatsink (the cooler has a special plate that presses against the memory chips and the power transistors).

We conducted a temperature study using the new version 4.2.1 of the EVGA PrecisionX utility (author A. Nikolaychuk AKA Unwinder) and obtained the following results.

After running the card for 6 hours under maximum gaming load, the maximum core temperature was 84 °C, which is perfectly normal for such a powerful accelerator.

Package contents. The reference card arrived in OEM packaging, so there are no bundled accessories.

Installation and drivers

Test bench configuration:

  • Computers based on the Intel Core i7-3960X processor (Socket 2011):
    • 2 Intel Core i7-3960X processors (overclocked to 4 GHz);
    • cooling system: Hydro Series H100i Extreme Performance CPU Cooler;
    • cooling system: Intel Thermal Solution RTS2011LC;
    • Asus Sabertooth X79 motherboard based on Intel X79 chipset;
    • MSI X79A-GD45(8D) motherboard based on Intel X79 chipset;
    • RAM 16 GB DDR3 Corsair Vengeance CMZ16GX3M4A1600C9 1600 MHz;
    • hard drive Seagate Barracuda 7200.14 3 TB SATA2;
    • hard drive WD Caviar Blue WD10EZEX 1 TB SATA2;
    • 2 SSD Corsair Neutron SSD CSSD-N120GB3-BK;
    • 2 Corsair CMPSU-1200AXEU power supplies (1200 W);
    • Corsair Obsidian 800D Full Tower case.
  • operating system: Windows 7 64-bit; DirectX 11;
  • monitor Dell UltraSharp U3011 (30″);
  • monitor Asus ProArt PA249Q (24″);
  • AMD drivers: Catalyst 13.11 beta8; Nvidia drivers: version 331.70 (for GTX 780 Ti) / 331.58 (for other GeForce cards)

VSync is disabled.

Synthetic tests

The synthetic test packages we use can be downloaded here:

  • D3D RightMark Beta 4 (1050) with a description on the website 3d.rightmark.org.
  • D3D RightMark Pixel Shading 2 and D3D RightMark Pixel Shading 3 - tests of pixel shaders versions 2.0 and 3.0.
  • RightMark3D 2.0 with a brief description: for Vista without SP1, for Vista with SP1.

For synthetic DirectX 11 tests, we used examples from the Microsoft and AMD SDKs, as well as the Nvidia demo program. First, there are HDRToneMappingCS11.exe and NBodyGravityCS11.exe from the DirectX SDK (February 2010). We also took applications from both video chip manufacturers: Nvidia and AMD. The examples DetailTessellation11 and PNTriangles11 were taken from the ATI Radeon SDK (they are also in the DirectX SDK). Additionally, Nvidia's demo program, Realistic Water Terrain, also known as Island11, was used.

Synthetic tests were carried out on the following video cards:

  • GeForce GTX 780 Ti with standard parameters (hereinafter GTX 780 Ti)
  • GeForce GTX Titan with standard parameters (hereinafter GTX Titan)
  • GeForce GTX 780 with standard parameters (hereinafter GTX 780)
  • Radeon R9 290X with standard parameters in “Uber Mode” (hereinafter R9 290X)
  • Radeon HD 7990 with standard parameters (hereinafter HD 7990)

These solutions were chosen for analyzing the results of the new high-end GeForce GTX 780 Ti for the following reasons. The GeForce GTX Titan is an exclusive model based on the same GK110 chip; it has a large amount of video memory and is sold at a much higher price. Titan was previously Nvidia's most powerful single-chip solution, and it will be interesting to see how much faster the new product turns out to be. A comparison with the GeForce GTX 780 is interesting because it is a less expensive video card from the company, based on the same chip but with a fifth fewer active execution units.

For our comparison, two video cards were selected from competing company AMD, based on different graphics processors and even different numbers of them. At the time of the release of Nvidia's new product, the Radeon R9 290X is its closest competitor in price, and at the same time the most productive video card from AMD. And the Radeon HD 7990 has two Tahiti video chips at once and is not a competitor to the GTX 780 Ti, but we will be interested to see how the speed of such a powerful dual-chip solution compares with the best single-chip solution from Nvidia.

Direct3D 9: Pixel Shaders tests

We will look at texturing and fill rate tests from the 3DMark Vantage package a little later, and the first group of pixel shaders that we use includes various versions of pixel programs of relatively low complexity: 1.1, 1.4 and 2.0, found only in old games, very simple for modern video chips.

Modern GPUs cope with the simplest tests with ease; the speed of powerful solutions in them always runs into various limits, which is especially true for GeForce. These tests cannot show the capabilities of modern video chips and are interesting only from the point of view of outdated gaming applications. The performance of modern video cards in them is often limited by texturing speed or fill rate, and Nvidia video cards have long ceased to be optimized for such tasks, as the results of today's comparison clearly show.

As you can see, the GeForce boards differ only slightly in speed from each other: the difference between the GTX 780 Ti and Titan is only 1-4%, while the theoretical difference is much larger. Although the new model turns out to be the best among the Nvidia cards in this comparison, it is clearly inferior to its main competitor, the Radeon R9 290X, which is always noticeably ahead. Let's look at the results of more complex intermediate pixel programs:

The Cook-Torrance test is more computationally intensive, and its speed depends more on the number of ALUs and their frequency, but also on the speed of the TMU. This test is historically better suited for AMD graphics solutions, although the new top-end GeForce boards based on the Kepler architecture also show strong results, which we can see from the generally good numbers of the new GeForce GTX 780 Ti.

The most powerful board of the GeForce GTX 700 family turned out to be 5-6% faster than the exclusive GTX Titan, which is also less than the theoretical difference and can only be explained by a bottleneck in ROP performance. Nvidia's new product slightly outperforms its main competitor in one of the tests, the Water test, where texturing speed matters more than mathematical performance, in which AMD boards have some advantage. Accordingly, in the second test the results of the GeForce GTX 780 Ti are slightly lower than those of the Radeon R9 290X. On average, there is clear parity in these tests.
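The "theoretical difference" between the GTX 780 Ti and the GTX Titan can be estimated from the specifications quoted earlier in the article. The sketch below assumes that peak shader throughput scales with the number of stream processors multiplied by the core clock; the class and function names are illustrative, and real results also depend on how GPU Boost behaves under load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    stream_processors: int
    base_mhz: int
    boost_mhz: int


def advantage_percent(a: Card, b: Card, use_boost: bool = False) -> float:
    """How much faster card a is than card b in peak ALU throughput, in percent."""
    clock_a = a.boost_mhz if use_boost else a.base_mhz
    clock_b = b.boost_mhz if use_boost else b.base_mhz
    ratio = (a.stream_processors * clock_a) / (b.stream_processors * clock_b)
    return (ratio - 1.0) * 100.0


# Figures from the specification list and the comparison table in this article.
gtx_780_ti = Card("GTX 780 Ti", 2880, 875, 928)
gtx_titan = Card("GTX Titan", 2688, 837, 876)

print(f"Base clocks:  {advantage_percent(gtx_780_ti, gtx_titan):.1f}%")
print(f"Boost clocks: {advantage_percent(gtx_780_ti, gtx_titan, use_boost=True):.1f}%")
```

Under these assumptions the expected gap is on the order of 12-14%, which gives a yardstick for the measured 1-6% differences above.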

Direct3D 9: Pixel Shaders 2.0 tests

These DirectX 9 pixel shader tests are more complex than the previous ones, they are close to what we now see in multi-platform games, and are divided into two categories. Let's start with the simpler version 2.0 shaders:

  • Parallax Mapping - a texture mapping method familiar from most modern games, described in detail in the article “Modern terminology of 3D graphics”.
  • Frozen Glass - a complex procedural texture of frozen glass with controllable parameters.

There are two variants of these shaders: one focused on mathematical calculations and one that favors sampling values from textures. Let's consider the mathematically intensive variants, which are more promising from the point of view of future applications:

These are universal tests in which performance depends on both the speed of the ALUs and the texturing speed; the overall balance of the chip and the efficiency of executing shader programs also matter. Our past studies show that in these specific tasks the GCN architecture from AMD is significantly better than Nvidia's Kepler architecture, and this was the case this time too.

In the Frozen Glass test, speed depends more on mathematical performance, and all GeForce boards hit some kind of bottleneck, because of which they are almost twice as slow as the best single-chip Radeon. The GeForce GTX 780 Ti is only 1% faster than the GTX Titan, which only confirms this strange performance bottleneck for all GeForce cards.

But in the second test, Parallax Mapping, the new GeForce GTX 780 Ti showed performance 15% higher than the GTX Titan, which is already very close to theory. The comparison with the rival Radeon R9 290X is less rosy: the AMD board is faster in this test by almost a third. Let's consider these same tests in the modification that favors texture samples over mathematical calculations:

Under these conditions the position of Nvidia video cards has improved somewhat, because they traditionally cope with texture sampling better than with mathematical calculations. But the Radeon R9 290X is still ahead of today's new product by a good margin, especially in the Frozen Glass test, where the difference remains embarrassingly large. The new product is 4-12% faster than the GTX Titan, which is more or less in line with theory. As for the R9 290X, the GTX 780 Ti comes close to it only in the Parallax Mapping test, and even there the difference exceeds 20%.

However, these were long-outdated tasks, with an emphasis on texturing, which is almost never seen in games. Next we will look at the results of two more pixel shader tests, but this time version 3.0, the most complex of our pixel shader tests for Direct3D 9. They are more indicative from the point of view of modern games on PC, including many multi-platform ones. The tests differ in that they heavily load both the ALU and texture modules; both shader programs are complex and lengthy and include a large number of branches:

  • Steep Parallax Mapping - a much “heavier” variant of the parallax mapping technique, also described in the article “Modern terminology of 3D graphics”.
  • Fur - a procedural shader that renders fur.

These tests are no longer limited by the performance of texture samples or fill rates alone, and the speed in them most of all depends on the efficiency of the execution of complex shader code. In the most difficult DX9 tests from the first version of the RightMark package, Nvidia video cards were slightly stronger in previous years, but the GCN architecture helped AMD video cards take the lead at least in the complex parallax mapping test, especially after carefully fine-tuning the Catalyst drivers.

Nvidia's new top-end product shows very good results in these tasks, outperforming the best of its predecessors based on the same GK110 chip by 11%, which is close to the theoretical difference in mathematical performance. Compared with the competitor's most powerful graphics card, based on the Hawaii chip, the GTX 780 Ti lags behind only in the parallax mapping test. In the Fur test AMD's new board, the Radeon R9 290X, still lost to the GeForce GTX 780 Ti, although not by much. In general, the situation in these tests is ambiguous.

Direct3D 10: PS 4.0 pixel shader tests (texturing, loops)

The second version of RightMark3D included two already familiar PS 3.0 tests for Direct3D 9, which were rewritten for DirectX 10, as well as two more new tests. The first pair added the ability to enable self-shadowing and shader supersampling, which further increases the load on video chips.

These tests measure the performance of pixel shaders that execute loops with a large number of texture samples (in the heaviest mode, up to several hundred samples per pixel) and a relatively small ALU load. In other words, they measure the speed of texture sampling and the efficiency of branching in the pixel shader.

The first pixel shader test is Fur. At the lowest settings it uses 15 to 30 texture samples from the height map and two samples from the main texture. The “High” effect detail mode increases the number of samples to 40-80, enabling “shader” supersampling raises it to 60-120, and the “High” mode together with SSAA is the heaviest, with 160 to 320 samples from the height map.
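The sample counts quoted for the Fur test are consistent with supersampling multiplying the per-pixel work by four. A minimal sketch of that arithmetic, with illustrative names; the factor of four is taken from the article's own description of shader supersampling.

```python
# Height-map sample ranges (min, max) per pixel without supersampling, from the text.
FUR_BASE_SAMPLES = {"Low": (15, 30), "High": (40, 80)}
SUPERSAMPLING_FACTOR = 4  # shader supersampling quadruples the work


def fur_height_map_samples(detail: str, supersampling: bool) -> tuple:
    """Return the (min, max) number of height-map samples for a Fur test mode."""
    low, high = FUR_BASE_SAMPLES[detail]
    factor = SUPERSAMPLING_FACTOR if supersampling else 1
    return (low * factor, high * factor)


for detail in ("Low", "High"):
    for ssaa in (False, True):
        label = "with SSAA" if ssaa else "no SSAA"
        print(f"{detail:4} {label:9} {fur_height_map_samples(detail, ssaa)}")
```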

Let's first check the modes without supersampling enabled; they are relatively simple, and the ratio of results in the “Low” and “High” modes should be approximately the same.

Performance in this test depends on the number and efficiency of TMUs, as well as the efficiency of executing complex programs. And in the version without supersampling, the effective fill rate and memory bandwidth also have an additional impact on performance. The results at the “High” level of detail are up to one and a half times lower than at the “Low” level.

In procedural fur rendering tasks with a large number of texture samples, AMD narrowed the gap with Nvidia boards over a couple of generations of graphics architectures, and with the release of chips based on the GCN architecture it took the lead outright. Radeon boards now lead these comparisons, which indicates that they execute these programs very efficiently.

The new top-end GeForce GTX 780 Ti is 11-12% ahead of the exclusive GTX Titan model, beating other Nvidia solutions, which is in line with the theory. But, taking into account the fact that in this test even AMD boards of the previous generation are faster than the new GeForce GTX 780 series, there is no point in considering a comparison of the R9 290X and GTX 780 Ti - the AMD model shows too high a result, not to mention the dual-chip card of the previous generation, which became the fastest here.

Let's look at the result of the same test, but with shader supersampling enabled, which increases the work by four times: perhaps in this situation something will change, and memory bandwidth with fill rate will have less effect:

The situation is similar to what we saw in the previous diagram, but Nvidia video cards fall even slightly further behind their AMD rivals. The new GeForce GTX 780 Ti turns out to be up to 11% faster than the GTX Titan, which is close to the theoretical difference in mathematical performance. Unfortunately, the loss to its direct competitor, the Radeon R9 290X, is very large. It is confirmed once again that AMD chips clearly have an advantage in such per-pixel calculations.

The next DX10 test measures the performance of complex pixel shaders with loops and a large number of texture samples, and is called Steep Parallax Mapping. At low settings it uses 10 to 50 texture samples from the height map and three samples from the main textures. Enabling the heavy mode with self-shadowing doubles the number of samples, and supersampling quadruples it. The most complex test mode, with both supersampling and self-shadowing, takes 80 to 400 texture samples, that is, eight times more than the simple mode. Let's first check the simple options without supersampling:

The second Direct3D 10 pixel shader test is more interesting from a practical point of view, since variants of parallax mapping are widely used in games, and heavy ones such as steep parallax mapping have long been used in many projects, for example in the Crysis and Lost Planet series. Besides supersampling, our test also lets you enable self-shadowing, which approximately doubles the load on the video chip; this mode is called “High”.
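The load multipliers described for Steep Parallax Mapping combine multiplicatively. The sketch below, with illustrative names, applies the article's factors (self-shadowing doubles the sample count, supersampling quadruples it) to the base range of 10 to 50 height-map samples.

```python
BASE_SAMPLES = (10, 50)  # height-map samples per pixel in the simplest mode


def steep_parallax_samples(self_shadowing: bool, supersampling: bool) -> tuple:
    """Return ((min, max) samples, load factor) for a Steep Parallax Mapping mode."""
    factor = (2 if self_shadowing else 1) * (4 if supersampling else 1)
    return tuple(n * factor for n in BASE_SAMPLES), factor


for shadow in (False, True):
    for ssaa in (False, True):
        samples, factor = steep_parallax_samples(shadow, ssaa)
        print(f"self-shadowing={shadow!s:5} supersampling={ssaa!s:5} "
              f"samples={samples} load x{factor}")
```

With both options enabled the range becomes 80 to 400 samples, eight times the simple mode, which matches the figures in the text.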

The diagram is generally similar to the previous one without SSAA, and this time the GeForce GTX 780 Ti is ahead of the GTX Titan by as much as 16-18%, which is even more than the theoretical difference in ALU speed. Most likely, the speed here also depends on video memory bandwidth. But since Nvidia video cards always perform worse in this test than competing solutions from AMD, the GeForce GTX 780 Ti in the updated D3D10 version of the test without supersampling again shows a worse result than the Radeon R9 290X, not to mention the dual-chip HD 7990. Let's see what difference enabling supersampling makes:

Everything is again approximately the same as in “Fur” - when supersampling and self-shadowing are enabled, the task becomes even more difficult; enabling two options together increases the load on the cards by almost eight times, causing a serious drop in performance. The difference between the speed performance of the tested video cards has changed only slightly; turning on supersampling has less of an impact than in the previous case.

We again see that Radeon graphics solutions work more efficiently in our D3D10 pixel shader tests than competing GeForce cards, and the top-end board based on the Hawaii chip outperforms the GeForce GTX 780 Ti announced today by a huge margin. Compared to other Nvidia boards, the new product performs better, outperforming the GTX Titan by 10-11%, which is approximately what theory predicts. The GTX 780 is, naturally, even further behind. Let's see what happens in purely computational tasks.

Direct3D 10: PS 4.0 Pixel Shader Tests (Compute)

The next couple of pixel shader tests contain a minimum number of texture fetches in order to reduce the influence of TMU performance. They use a large number of arithmetic operations and measure precisely the mathematical performance of video chips, that is, the speed of executing arithmetic instructions in a pixel shader.

The first math test is Mineral. This is a complex procedural texturing test that uses only two samples of texture data and 65 sin and cos instructions.

The results of peak mathematical tests usually correspond only approximately to the difference in frequencies and the number of computational units; they are influenced by how efficiently those units are used in specific solutions, and driver optimization also matters. In the Mineral test, the new GeForce GTX 780 Ti is only 8% ahead of the GTX Titan, which is clearly lower than the theoretical difference in mathematical performance between them. Some kind of limitation is probably at work, because this cannot be explained by the difference in characteristics.

As we already know, AMD architectures have always had a significant advantage over competing Nvidia solutions in such tests, but with the Kepler architecture the Californian company managed to increase the number of stream processors, and the peak mathematical performance of GeForce models, starting with the GTX 680, has grown significantly. We can see this from the results of our first mathematical test, where the best GeForce video card is still inferior to the board based on the Hawaii chip, but the competitor is only 9% ahead of the GTX 780 Ti. However, judging by the prices, the Nvidia graphics card should be ahead, so there is still some work to be done.

Let's look at the second shader calculation test, which is called Fire. It is heavier for an ALU, and there is only one texture fetch, and the number of sin and cos instructions has been doubled, to 130. Let's see what has changed with increasing load:

In the second mathematical test, however, the video cards rank quite differently relative to each other. The difference between the GTX Titan and today's new product in this test was even a little larger than the theoretical one: 19%. This looks much more like a true difference in math performance.

Unfortunately, even with such a strong result, Nvidia's new single-chip flagship of the GeForce GTX 700 series cannot beat its competitor from AMD, which also has a lower price. The latest AMD board turns out to be 12% faster in the second mathematical test. The only good news is that the GTX 780 Ti is clearly faster than the GTX 780 and Titan.

Direct3D 10: geometry shader tests

The RightMark3D 2.0 package has two geometry shader speed tests. The first is called “Galaxy” and uses a technique similar to “point sprites” from previous versions of Direct3D. It animates a particle system on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms should be widely used in future DirectX 10 games.

Changing the balancing in geometry shader tests does not affect the final rendering result, the final image is always exactly the same, only the methods of processing the scene change. The “GS load” parameter determines which shader the calculations are performed in—vertex or geometry. The number of calculations is always the same.

Let's look at the first version of the Galaxy test, with calculations in the vertex shader, for three levels of geometric complexity:

The ratio of speeds for different geometric complexity of scenes is approximately the same for all solutions, performance corresponds to the number of points, with each step the FPS drop is close to twofold. This task is not very difficult for modern video cards, and performance is limited by the speed of geometry processing, and sometimes by memory bandwidth.

There is some difference between the results of video cards based on Nvidia and AMD chips, due to differences in the geometric pipelines of the chips from these companies. If in previous tests with pixel shaders AMD boards were noticeably more efficient and faster, then geometry tests show that Nvidia boards are more productive in such tasks, even despite the increase in the number of geometry blocks in Hawaii.

But the difference between AMD and Nvidia is no longer as great as it used to be. Nvidia's solutions have always had better geometry performance and are therefore faster. Today's new GeForce GTX 780 Ti turns out to be approximately equal in performance to the earlier GTX Titan, which indicates that speed here is limited by the geometry pipeline. Let's see how the situation changes when we transfer part of the calculations to the geometry shader:

As the load changed in this test, the numbers improved slightly for both AMD and Nvidia boards. The video cards in this test of geometry shaders react weakly to changes in the GS load parameter, which is responsible for transferring part of the calculations to the geometry shader, so all the conclusions remain the same. The new Geforce GTX 780 Ti model still shows performance on par with other boards based on the GK110 chip. And the competing Radeon R9 290X still lags behind them, so nothing changes in the conclusions.

“Hyperlight” is the second test of geometry shaders, demonstrating the use of several techniques at once: instancing, stream output, buffer load. It uses dynamic geometry creation by drawing to two buffers, as well as a new Direct3D 10 feature, stream output. The first shader generates the direction of the rays and the speed and direction of their growth; this data is placed in a buffer, which the second shader uses for drawing. For each point of a ray, 14 vertices are built in a circle, up to a million output points in total.

A new type of shader programs is used to generate “rays”, and with the “GS load” parameter set to “Heavy” - also to draw them. In other words, in the “Balanced” mode, geometry shaders are used only to create and “grow” rays, the output is carried out using “instancing”, and in the “Heavy” mode, the geometry shader is also involved in output.

Unfortunately, “Hyperlight” simply does not work on any modern AMD video card, including the top-end Radeon R9 290X. At some point a driver update broke this test, and it no longer runs on boards from this company. Therefore the most interesting geometry test of our package, which places a heavy load on geometry shaders, cannot say anything about how AMD and Nvidia boards compare.

But we can at least see what has changed in the case of Nvidia solutions. The relative results of solutions in different modes approximately correspond to the change in load: in all cases, performance scales well and is close to theoretical parameters, according to which each subsequent level of “Polygon count” should be slightly less than twice as slow.

The rendering speed in this test is limited mainly by geometry performance, but in the case of balanced loading of geometry shaders, all results are close. The Geforce GTX 780 Ti showed a speed 6-8% higher than the Titan level, which suggests that the matter is clearly not only in geometric performance. However, the numbers may change significantly in the next diagram, in a test with more active use of geometry shaders. It will also be interesting to compare the results obtained in the “Balanced” and “Heavy” modes with each other.

The most important parameter in this test is geometry processing speed, which Nvidia handles very well, especially with the fully unlocked GK110 chip on which the GeForce GTX 780 Ti is based. Thanks to the larger number of geometry units, the GeForce GTX 780 Ti outperforms the GTX Titan by 14-19%, and the latter, in turn, is significantly faster than the lower-end board based on the GK110 chip, the GTX 780.

Direct3D 10: texture fetching speed from vertex shaders

The Vertex Texture Fetch tests measure the speed of a large number of texture fetches from the vertex shader. The tests are essentially similar, so the ratio between the cards' results in the Earth and Waves tests should be approximately the same. Both tests use displacement mapping based on texture sample data; the only significant difference is that the Waves test uses conditional branches, while the Earth test does not.

Let's look at the first "Earth" test, first in the "Effect detail Low" mode:

Previous research has shown that the results of this test can be affected by both fill rate and memory bandwidth, which is especially noticeable in the light mode. The results of Nvidia graphics cards often appear to be capped by some unidentified bottleneck, as evidenced by the similar results of all graphics cards based on the GK110 GPU.

As expected, the fastest among single-chip solutions in comparison was the top-end Radeon R9 290X, and the new GeForce GTX 780 Ti presented today is inferior to it in all modes, even in heavy mode, where the difference is least. Nvidia's new top-end board outperformed the GTX Titan in this test by 10-13%, which is close to theory. Let's look at the performance in the same test with an increased number of texture samples:

The situation on the diagram has changed significantly - the results of AMD solutions in heavy modes have worsened, while for GeForce they have remained in almost the same positions. Now the Radeon R9 290X shows results noticeably higher than the speed of the new Nvidia product only in the simplest mode, and in medium and heavy mode the Geforce GTX 780 Ti announced today is ahead of it. The difference between the GTX 780 Ti and the GTX Titan is 9-12%, which is in line with theory.

Let's look at the results of the second test of texture fetches from vertex shaders. The Waves test has a smaller number of samples, but it uses conditional branches. The number of bilinear texture samples in this case is up to 14 (“Effect detail Low”) or up to 24 (“Effect detail High”) for each vertex. The complexity of the geometry changes in the same way as in the previous test.

The results in the second "Waves" vertex texturing test are generally similar to what we saw in the previous charts. For some reason, the performance of all GeForce boards based on GK110 in the light mode remains far too low, and they are almost twice as slow as the dual-chip Radeon HD 7990. The new top-end GeForce GTX 780 Ti does not look bad next to its siblings in this test: the new single-chip flagship based on GK110 turned out to be 8-10% faster than the GTX Titan. Let's consider the second version of the same test:

In the second test of texture samples, as the task became more complex, the speed of all solutions dropped, and GeForce video cards were hit especially hard in the light modes. The results of the new GeForce GTX 780 Ti were only 5% better than those of the GTX Titan based on the same chip, which suggests that the main performance limit in this test for Nvidia video cards is most likely the performance of the ROP units.

3DMark Vantage: Feature tests

Synthetic tests from the 3DMark Vantage package will show us what we previously missed. Feature tests from this test package support DirectX 10 and are interesting in that they differ from ours and are still relevant. Probably, when analyzing the results of the new GeForce GTX 780 Ti video card in this package, we will draw some new useful conclusions that eluded us in tests from the RightMark family of packages.

Feature Test 1: Texture Fill

The first test measures the performance of texture fetch blocks. This involves filling a rectangle with values read from a small texture using multiple texture coordinates that change every frame.

The performance of AMD and Nvidia video cards in the Futuremark texture test is quite high, and the relative figures of the models are close to the corresponding theoretical parameters. The new top-end GeForce GTX 780 Ti, released today, is only 2% faster in this test than the formerly most powerful GTX Titan, which, admittedly, is not very close to theory.

Naturally, the GTX 780 lags even further behind the two most expensive Nvidia solutions in terms of texturing speed. As for comparing the GeForce GTX 780 Ti with the competing Radeon R9 290X, Nvidia's new product is slightly faster in texturing speed than the board based on the Hawaii GPU, which is what was expected from the theoretical figures.

Feature Test 2: Color Fill

The second task is a fill rate test. It uses a very simple pixel shader that does not limit performance. The interpolated color value is written to an off-screen buffer (render target) using alpha blending. The off-screen buffer uses the FP16 format (16 bits per channel), which is the one most often used in games with HDR rendering, so this test is quite relevant.

In this case, it is not the peak speed of the ROP units that is measured. The numbers from this 3DMark Vantage subtest show ROP performance as limited by video memory bandwidth (the so-called “effective fill rate”), so the test measures memory throughput rather than raw ROP performance.
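
As a rough illustration of why memory bandwidth can cap the fill rate here, the sketch below estimates a bandwidth-limited fill rate for the GeForce GTX 780 Ti. It assumes an RGBA FP16 render target (8 bytes per pixel) and that alpha blending reads and writes each pixel once; both are assumptions made for the example, not documented details of the 3DMark Vantage test. The 336 GB/s and 42 Gpixel/s figures are the card's specification values.

```python
# Rough estimate of a bandwidth-limited ("effective") fill rate.
# Assumptions (not taken from the test's documentation):
#   - RGBA FP16 render target: 4 channels x 2 bytes each
#   - alpha blending costs one read and one write per pixel

BYTES_PER_PIXEL = 4 * 2      # RGBA, 16 bits per channel
ACCESSES_PER_PIXEL = 2       # read destination, write blended result


def bandwidth_limited_fill_rate(bandwidth_gb_s):
    """Return the fill rate in Gpixel/s that memory bandwidth alone allows."""
    return bandwidth_gb_s / (BYTES_PER_PIXEL * ACCESSES_PER_PIXEL)


SPEC_BANDWIDTH_GB_S = 336.0  # GeForce GTX 780 Ti memory bandwidth
SPEC_PEAK_GPIXEL_S = 42.0    # GeForce GTX 780 Ti peak pixel fill rate

effective = bandwidth_limited_fill_rate(SPEC_BANDWIDTH_GB_S)
print(f"bandwidth-limited fill rate: {effective:.1f} Gpixel/s")
print(f"peak ROP fill rate:          {SPEC_PEAK_GPIXEL_S:.1f} Gpixel/s")
```

Under these assumptions the bandwidth-limited figure comes out well below the peak ROP rate, which is consistent with the test reflecting memory throughput rather than ROP speed.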

Therefore, the result of the announced Nvidia board in the performance test of ROP units turned out to be 10% better compared to the GTX Titan, since there is a theoretical difference in memory bandwidth between them. The same applies to outperforming the competitor in the form of the Radeon R9 290X - in fact, the speed of the ROP blocks is higher on the AMD board, but due to lower memory bandwidth it loses to the new Geforce GTX 780 Ti.

Feature Test 3: Parallax Occlusion Mapping

This is one of the most interesting feature tests, since a similar technique is already used in games. It draws one quadrilateral (more precisely, two triangles) using the Parallax Occlusion Mapping technique, which simulates complex geometry. Fairly resource-intensive ray tracing operations and a high-resolution depth map are used. The surface is also shaded using the heavy Strauss algorithm. For a video chip, this is a test of a very complex and heavy pixel shader containing numerous texture samples during ray tracing, dynamic branching and complex Strauss lighting calculations.

This test of the 3DMark Vantage package differs from the ones we conducted earlier in that the results depend not solely on the speed of mathematical calculations, the efficiency of branch execution or the speed of texture samples, but on several parameters simultaneously. To achieve high speed in this task, the correct balance of the GPU is important, as well as the efficiency of executing complex shaders.

In this case, both mathematical and texture performance are important, and possibly also ROP speed, since in this synthetic test from 3DMark Vantage the new GeForce GTX 780 Ti is only 5% ahead of the more expensive Nvidia board, which does not quite match the theoretical difference in texturing speed and computing performance.

If we compare the new product with its competitor's solution, then in this test the GTX 780 Ti cannot compete with the Radeon R9 290X, not to mention the dual-chip HD 7990, since AMD GPUs are more efficient in this particular task. Alas, the GTX 780 lags behind its closest competitor in price by 20%, which is quite a lot.

Feature Test 4: GPU Cloth

The fourth test is interesting because it calculates physical interactions (cloth simulation) on the video chip. Vertex simulation is used, combining the work of vertex and geometry shaders over several passes. Stream out is used to transfer vertices from one simulation pass to the next. Thus, the test measures the performance of vertex and geometry shaders and the stream out speed.

The rendering speed in this test should also depend on several parameters at once, and the main factors should be geometry processing performance and the efficiency of geometry shaders. But the picture on the diagram turned out to be very strange: both Radeon video cards show a frame rate of about 130 FPS, and the results of the three GeForces also hit a limit, as we have seen before, but at about 95-100 FPS.

And yet, the new product is 7% ahead of the expensive GTX Titan, oddly enough. The new top-of-the-range model from Nvidia shows speeds one-third worse than the competitor’s older board, the Radeon R9 290X. And all this despite the fact that the geometric performance of Nvidia video cards should be higher than that of competitor solutions, since they have a larger number of corresponding execution units. We will also recheck geometric performance in DirectX 11 tests.

Feature Test 5: GPU Particles

Test of physical simulation of effects based on particle systems calculated using a video chip. Vertex simulation is also used, each vertex representing a single particle. Stream out is used for the same purpose as in the previous test. Several hundred thousand particles are calculated, all are animated separately, and their collisions with the height map are also calculated.

Similar to one of our RightMark3D 2.0 tests, particles are rendered using a geometry shader that creates four vertices from each point to form a particle. But the test most of all loads shader units with vertex calculations; stream out is also tested.

In the second geometry test from 3DMark Vantage, the situation has changed, and this time the clear leader is the dual-chip Radeon HD 7990, which is outside today's main comparison. Nvidia's new product was only 1% ahead of the GTX Titan board based on the same GK110 chip, which indicates an emphasis on geometric performance, at least for Nvidia boards.

If we compare the speed of the new GeForce with its only competitor from AMD, the new board is very close to its rival: both show similar results in this task. This is a good result mostly for the Radeon, because it costs less. Earlier, the synthetic tests simulating cloth and particles from the 3DMark Vantage package, which actively use geometry shaders, showed Nvidia boards significantly ahead of competing AMD models, but now the picture is not so clear-cut.

Feature Test 6: Perlin Noise

The last feature test of the Vantage package is a mathematically intensive test of the video chip; it calculates several octaves of the Perlin noise algorithm in the pixel shader. Each color channel uses its own noise function to put more stress on the video chip. Perlin noise is a standard algorithm often used in procedural texturing and uses a lot of math.
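
To make the idea of "several octaves" concrete, here is a minimal CPU-side sketch of Perlin-style gradient noise summed over octaves, with a separate noise function per color channel. It is an illustration only: the hash-based gradient selection, the seeds and the function names are assumptions for this example, and it is not the shader used by 3DMark Vantage.

```python
import math


def _gradient(ix, iy, seed):
    """Hash a lattice point to a pseudo-random unit gradient vector."""
    h = (ix * 374761393 + iy * 668265263 + seed * 2246822519) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    angle = 2.0 * math.pi * h / 2**32
    return math.cos(angle), math.sin(angle)


def _fade(t):
    """Perlin's quintic smoothing curve 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)


def _lerp(a, b, t):
    return a + t * (b - a)


def gradient_noise(x, y, seed=0):
    """Single octave of 2D gradient noise; zero at integer lattice points."""
    x0, y0 = math.floor(x), math.floor(y)
    u, v = _fade(x - x0), _fade(y - y0)

    def corner(ix, iy):
        # Dot product of the corner gradient with the offset to (x, y).
        gx, gy = _gradient(ix, iy, seed)
        return gx * (x - ix) + gy * (y - iy)

    top = _lerp(corner(x0, y0), corner(x0 + 1, y0), u)
    bottom = _lerp(corner(x0, y0 + 1), corner(x0 + 1, y0 + 1), u)
    return _lerp(top, bottom, v)


def fractal_noise(x, y, octaves=4, seed=0):
    """Sum several octaves: each doubles the frequency and halves the amplitude."""
    total, amplitude, frequency = 0.0, 1.0, 1.0
    for octave in range(octaves):
        total += amplitude * gradient_noise(x * frequency, y * frequency, seed + octave)
        amplitude *= 0.5
        frequency *= 2.0
    return total


def noise_rgb(x, y, octaves=4):
    """One independent noise function per color channel (different seeds)."""
    return tuple(fractal_noise(x, y, octaves, seed) for seed in (0, 100, 200))


print(noise_rgb(0.37, 1.82))
```

Every extra octave adds another full noise evaluation per channel, which is why the arithmetic load grows quickly with the octave count.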

In a purely mathematical test from the Futuremark package, showing the peak performance of video chips in extreme tasks, we see a different distribution of results compared to similar tests from our test package. In this case, the performance of the solutions does not quite correspond to the theory and is at odds with what we saw earlier in mathematical tests from the RightMark 2.0 package.

AMD's Radeon video cards, based on GCN architecture chips, cope very well with such tasks and show better results where intensive "math" is performed. The only exception is the dual-chip Radeon HD 7990 board, which clearly did not work efficiently in this case. However, if we compare the GeForce GTX 780 Ti announced today with the Radeon R9 290X, the latter outperforms the Nvidia board by 18%.

The GTX 780 Ti video card released on the market today showed speeds even slightly slower than the GTX Titan model from the same manufacturer and based on the same chip, which absolutely does not correspond to the theory. Today's new product still outperformed the GTX 780 by 11%, although it should have won by a much greater margin. Probably, some limitation of GPU Boost had an effect, reducing the frequency of the GK110 in the GTX 780 Ti during the last synthetic test of the package.

Direct3D 11: Compute Shaders

To test Nvidia's new solution on tasks that use DirectX 11 features such as tessellation and compute shaders, we used examples from the SDKs and demos from Microsoft, Nvidia, and AMD.

First we'll look at tests that use compute shaders. Their introduction is one of the most important innovations in the latest versions of the DX API, and they are already used in modern games for various tasks: post-processing, simulations, and so on. The first test is an example of HDR rendering with tone mapping from the DirectX SDK, with post-processing that uses pixel and compute shaders.

The speed of calculations in compute and pixel shaders is approximately the same for all AMD and Nvidia boards, although video cards with GPUs of previous architectures did show differences (curiously, the Hawaii-based card shows one again, albeit a small one). Judging by our previous tests, the results in this task clearly depend not only on mathematical power and computational efficiency, but also on other factors such as memory bandwidth and ROP performance.

In this case, the speed of video cards is limited by bandwidth. Nvidia's new top-end board was 12% faster than its predecessor, the GTX Titan, in this test. If we compare the new product with the AMD board, then the Geforce GTX 780 Ti and the direct competitor Radeon R9 290X are approximately equal, although the Nvidia board is slightly more expensive.

The second compute shader test is also taken from the Microsoft DirectX SDK and shows a computational N-body gravity problem - a simulation of a dynamic particle system that is subject to physical forces such as gravity.
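
For readers unfamiliar with the N-body problem, the following is a minimal single-threaded sketch of one simulation step with direct pairwise gravity. It is only an illustration of the kind of computation involved: the function name, the softening term and the simple Euler-style integration are assumptions for this example, not the code of the DirectX SDK sample, which runs the equivalent work in a compute shader.

```python
import math


def nbody_step(positions, velocities, masses, dt, g=1.0, softening=1e-3):
    """Advance all bodies by one time step using direct O(N^2) gravity.

    positions, velocities: lists of [x, y, z]; masses: list of floats.
    Returns new (positions, velocities) without modifying the inputs.
    """
    n = len(positions)
    accelerations = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            delta = [positions[j][k] - positions[i][k] for k in range(3)]
            # Softening avoids a division by zero when bodies get very close.
            dist_sq = sum(d * d for d in delta) + softening * softening
            scale = g * masses[j] / (dist_sq * math.sqrt(dist_sq))
            for k in range(3):
                accelerations[i][k] += scale * delta[k]

    # Update velocities first, then positions with the new velocities.
    new_velocities = [
        [v[k] + a[k] * dt for k in range(3)]
        for v, a in zip(velocities, accelerations)
    ]
    new_positions = [
        [p[k] + v[k] * dt for k in range(3)]
        for p, v in zip(positions, new_velocities)
    ]
    return new_positions, new_velocities


# Two equal bodies at rest start falling toward each other.
pos = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
vel = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pos, vel = nbody_step(pos, vel, masses=[1.0, 1.0], dt=0.01)
print(pos, vel)
```

Each body interacts with every other body, so the work grows with the square of the particle count, and each pair is independent of the others; that makes the task a natural fit for massively parallel GPU execution.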

In this test, the balance of power between the solutions of the two companies turned out to be completely different. Nvidia graphics cards have a clear advantage in these calculation tasks, and Radeon graphics cards do not handle them very well. Therefore, it would be logical for this test to be won by the most powerful of Nvidia's boards, the GeForce GTX 780 Ti presented today, which has more active computing units and operates at a high frequency.

But no, the GTX 780 Ti again lost a couple of percent to the more expensive GTX Titan in the computing task. Most likely, in calculation tasks the frequency of the GK110 graphics processor on the gaming video card drops below the level set for the “computing” version, the GTX Titan. As for the competitor, the Radeon R9 290X was left far behind, at almost half the speed of the new Nvidia product.

Direct3D 11: Tessellation Performance

Compute shaders are very important, but another interesting innovation in Direct3D 11 is hardware tessellation. We looked at it in great detail in our theoretical article about the Nvidia GF100. Tessellation has been used for quite some time in DX11 games, such as STALKER: Call of Pripyat, DiRT 2, Aliens vs Predator, Metro Last Light, Civilization V, Crysis 3, Battlefield 3 and others. Some of them use tessellation for character models, others use it to simulate realistic water surfaces or landscapes.

There are several different schemes for subdividing graphic primitives (tessellation), for example Phong tessellation, PN Triangles and Catmull-Clark subdivision. For instance, the PN Triangles scheme is used in STALKER: Call of Pripyat, and Phong tessellation in Metro 2033. These methods can be integrated relatively quickly and easily into the game development process and existing engines, which is why they have become popular.

The first tessellation test will be the Detail Tessellation example from the ATI Radeon SDK. It implements not only tessellation, but also two different pixel-by-pixel processing techniques: simple normal map overlay and parallax occlusion mapping. Well, let's compare AMD and Nvidia DX11 solutions in different conditions:

In a simple bump mapping test, speed often comes down to bandwidth or ROP performance, and the result of the new Geforce GTX 780 Ti confirms this - it is almost identical to the speed of the GTX Titan in this test. All GeForces in this subtest are far behind the Radeon R9 290X, but not because of bandwidth, but because of the speed of ROP blocks.

In the second subtest, with noticeably more complex pixel-by-pixel calculations, things are a little more interesting. The efficiency of performing such mathematical calculations in pixel shaders is higher for GCN architecture chips than for Kepler, so it is not surprising that all Nvidia boards again lost out to the new solution based on the Hawaii chip. The Radeon R9 290X based on the new graphics processor is noticeably faster than the new GeForce GTX 780 Ti, which, in turn, overtook the GTX Titan by an impressive 18%, which approximately corresponds to the theory in terms of the speed of mathematical calculations.

In the tessellation test, the result of the new product is approximately the same as in the first subtest. The GTX 780 Ti model showed almost the same speed as the GTX Titan, losing to its direct rival in the Radeon R9 290X. This happened because in this tessellation test the division of triangles is moderate and the speed in it is not limited by the performance of the geometry processing units, so the triangle processing speed of AMD boards is enough to show good results.

The second tessellation performance test will be another example for 3D developers from the ATI Radeon SDK - PN Triangles. Actually, both examples are also included in the DX SDK, so we are sure that game developers create their code based on them. We tested this example with different tessellation factors to understand how much impact changing it has on overall performance.

This example uses more complex geometry, so comparing the geometric power of the various solutions in this test leads to different conclusions. All modern solutions presented in this article cope well with light and medium geometric loads and show high speed, but in difficult conditions Nvidia GPUs are still much more productive.

The GeForce GTX 780 Ti announced today showed an abnormally low result compared to the GTX Titan on the same GK110 chip. The 15-20% lag at the three simplest tessellation levels is hard to explain, because the GTX 780 Ti is faster than the Titan in all theoretical parameters (except for the amount of video memory). We are likely seeing the result of a software problem in the form of unoptimized drivers. Only with the most complex tessellation does the new product take the lead, as it should.

The comparison with the competitor in difficult conditions also favors the new product, because it has more geometry units than Hawaii. The GTX 780 Ti is therefore much faster than AMD's new-generation cards, but only in difficult conditions, where the speed of the Radeon drops sharply while that of the new Nvidia board remains quite high.

Let's take a look at the results of another test, the Nvidia Realistic Water Terrain demo, also known as Island. This demo uses tessellation and displacement mapping to render realistic-looking ocean surfaces and terrain.

The Island test is not a purely synthetic test that measures only geometric GPU performance, since it contains both complex pixel and compute shaders. Such a load is closer to real games, which use all GPU blocks and not just the geometry units, as in the previous geometry tests. However, the main load still falls on the geometry processing units.

We tested the solutions at four different tessellation factors; in this case, the setting is called Dynamic Tessellation LOD. At the very first triangle splitting factor, when the speed is not limited by the performance of the geometry units, the new top-end video card from AMD shows a fairly high result and tries to compete with GeForce, but it does not reach the level of the GTX 780 Ti even in this case. And as the geometric workload grows, Nvidia's new product pulls even further ahead.

Nvidia video cards are very fast in this test; the new Geforce GTX 780 Ti turned out to be 5-10% more productive than the more expensive GTX Titan, as it should be according to theory, unlike the previous test. The competitor still doesn’t have enough speed to compete with Nvidia cards, although in real games the load on geometric blocks is much less, and everything will be completely different there.

Conclusions on synthetic tests

The results of synthetic tests of the GeForce GTX 780 Ti, which has become the most powerful board in Nvidia's top series, as well as the results of other video cards from both manufacturers of discrete video chips, showed that the new board is one of the most powerful solutions on the market and should compete successfully with other top-end boards, despite its rather high price.

The main thing we have determined is that the new product is clearly faster than the Geforce GTX Titan in most tests, and this with a noticeable difference in price in favor of the GTX 780 Ti. For gaming, it's no surprise that Nvidia's new board is one of the more powerful offerings at the top end of the price range. With the exception of some tasks, the Nvidia model announced today performed well compared to the powerful Radeon R9 290X. Our set of synthetic tests showed that in terms of performance they will compete with each other in games, especially since Nvidia solutions traditionally perform better there than in “synthetics”.

The new GeForce GTX 780 Ti is clearly aimed at enthusiasts who are not ready to compromise, plan to play modern and future games at maximum settings in the highest resolutions, and are willing to pay a little more for it than the competing Radeon R9 290X costs. Those who already wanted to buy a GeForce GTX Titan for gaming will be happiest, and those who have recently bought one will be least happy. After all, the new Nvidia model is cheaper, yet will be even more productive in games. Let's move on to evaluating the real performance of the GTX 780 Ti in games in the next part of the article.

The notorious competition between the two main GPU manufacturers, NVIDIA and AMD, has once again given fans of the green team cause for joy. Before the Reds had time to enjoy the applause for the release of their new flagship, the Radeon R9 290X, the Californians cleverly tripped them up. Experts fully expected that after the release of a top-end video card from AMD, NVIDIA would not stand aside and would try to create a solution that was, if not much more powerful, then certainly no less powerful. Expectations were justified, and a new representative of the GeForce family is being released: the GTX 780 Ti video card.

The announced video adapter is, as of the end of 2013, the most powerful single-chip card for modern demanding games. The video card is built on the GK110 graphics processor, which was previously found in the GeForce GTX 780 and GTX TITAN. However, unlike these predecessors, the new product has a fully functional (not cut-down) core, which until now was found only in professional solutions. So, for example, in the GeForce GTX 780 Ti the number of computing cores is 2880, while in TITAN there are 2688 of them. But let's take a closer look at the characteristics of the above video cards in order to compare them.

Specifications

As can be seen from the table, the new product is ahead of its predecessors in many key parameters. Only the Titan has more video memory, but we have already covered how important this parameter is in a separate article. Thus, the final performance of the GTX 780 Ti should be noticeably higher than that of the GTX 780 and GTX TITAN. Well, now, actually, about performance.

Synthetic test results

*Highest possible quality with screen resolution 1920x1080

To conclude the review, the recommended price of the GeForce GTX 780 Ti video card is $699 for the US market and 24,990 rubles for Russia. The new product is expected to arrive on the Russian market after November 15, 2013.

Comparative testing of GeForce GTX 780Ti and AMD Radeon R9 290X

  • Analysis of geometric average results, purchase attractiveness and measurement of energy consumption

    Introduction

    After the release of AMD's flagship single-processor solutions - Radeon R9 290X and Radeon R9 290, NVIDIA lost its leadership in the gaming video card market, as its GeForce GTX Titan and GTX 780 accelerators were inferior to direct competitors.

    However, the company did not put up with this state of affairs, and within a fairly short time NVIDIA released its response to the opponent's move: the GeForce GTX 780 Ti 3072 MB video card. What is remarkable about it, and what hidden resources will allow the new product to compete with the opposing models?

    Firstly, it is based on a full, uncut GK110 graphics processor, which includes 2880 unified shader processors, 240 texture units and 48 raster operation units. The GPU itself operates at 876 MHz.

    Secondly, the effective operating frequency of the video memory was 7000 MHz, which, together with the 384-bit bus, made it possible to increase the video memory bandwidth to 336 GB/s. This was enough to neutralize the 512-bit memory bus of the Radeon R9 290X video card, whose video memory bandwidth is 320 GB/s.
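
The bandwidth figures above follow directly from the effective memory frequency and the bus width. A minimal sketch of the arithmetic, with an illustrative helper name that is not part of any vendor tool:

```python
def memory_bandwidth_gb_s(effective_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s: transfers per second x bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * bytes_per_transfer / 1000


gtx_780_ti = memory_bandwidth_gb_s(7000, 384)   # GeForce GTX 780 Ti
r9_290x = memory_bandwidth_gb_s(5000, 512)      # Radeon R9 290X

print(f"GeForce GTX 780 Ti: {gtx_780_ti:.0f} GB/s")
print(f"Radeon R9 290X:     {r9_290x:.0f} GB/s")
```

So the narrower 384-bit bus is more than compensated for by the higher memory frequency.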

    In this test we will study what the new NVIDIA product is capable of.

    The rivals of the GeForce GTX 780 Ti 3072 MB are:

    • GeForce GTX Titan 6144 MB;
    • GeForce GTX 780 3072 MB;
    • GeForce GTX 770 2048 MB;
    • GeForce GTX 680 2048 MB;

    • Radeon R9 290X 4096 MB;
    • Radeon R9 290 4096 MB;
    • Radeon R9 280X 3072 MB.

    Test configuration

    Tests were carried out on the following stand:

    • CPU: Intel Core i7-3770K (Ivy Bridge, D2, L3 8 MB), 1.0 V, Turbo Boost / Hyper Threading - off - 3500 @ 4600 MHz (1.25 V);
    • Motherboard: GigaByte GA-Z77X-UD5H, LGA 1155, BIOS F14;
    • CPU cooling system: Corsair Hydro Series H100 (~1300 rpm);
    • RAM: 2 x 4096 MB DDR3 Geil BLACK DRAGON GB38GB2133C10ADC (Spec: 2133 MHz / 10-11-11-30-1t / 1.5 V), X.M.P. - off;
    • Disk subsystem: 64 GB, SSD ADATA SX900;
    • Power supply: Thermaltake Toughpower 1200 Watt (standard fan: 140 mm inlet);
    • Case: open test bench;
    • Monitor: 27" ASUS PB278Q BK (Wide LCD, 2560x1440 / 60 Hz).

    Video cards:

    • Radeon R9 290X 4096 MB - 1000/5000 @ 1130/5800 MHz (Sapphire);
    • Radeon R9 290 4096 MB - 947/5000 @ 1120/5800 MHz (Sapphire);
    • Radeon R9 280X 3072 MB - 1000/6000 @ 1150/7000 MHz (Gigabyte);

    • GeForce GTX 780 Ti 3072 MB - 876/7000 @ 1110/7700 MHz (MSI);
    • GeForce GTX Titan 6144 MB - 837/6008 @ 970/7200 MHz (Gigabyte);
    • GeForce GTX 780 3072 MB - 863/6008 @ 1000/7200 MHz (Palit);
    • GeForce GTX 770 2048 MB - 1046/7000 @ 1260/7800 MHz (Zotac);

    • GeForce GTX 680 2048 MB - 1006/6008 @ 1260/7100 MHz (Gainward).

    Software:

    • Operating system: Windows 7 x64 SP1;
    • Video card drivers: NVIDIA GeForce 334.67 Beta and AMD Catalyst 14.1 BETA 1.6.
    • Utilities: FRAPS 3.5.9 Build 15586, AutoHotkey v1.0.48.05, MSI Afterburner 3.0.0 Beta 18.

    Testing tools and methodology

    For a clearer comparison of the video cards, all games used as test applications were run at resolutions of 1920x1080 and 2560x1440.

    Built-in benchmarks, FRAPS 3.5.9 Build 15586 and AutoHotkey v1.0.48.05 utilities were used as performance measurement tools. List of gaming applications:

    • Assassin's Creed 4 Black Flag (Nassau).
    • Batman: Arkham Origins (Gotham City).
    • Battlefield 4 (Tashgar).
    • Company of Heroes 2 (Benchmark).
    • Crysis (Benchmark - Village).
    • Far Cry 3 (Chapter 2. Hunter).
    • GRID 2 (Benchmark).
    • Max Payne 3 (Chapter 5. Alive If Not Exactly Well).
    • Metro: Last Light (Benchmark).
    • Saints Row IV (Game Start).
    • Sleeping Dogs (Benchmark).
    • Tom Clancy's Splinter Cell: Blacklist (Item zero).

    Minimum and average FPS values were measured in all games. In tests whose built-in benchmark could not report the minimum FPS, this value was measured with the FRAPS utility. VSync was disabled during testing.
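
As an illustration of how minimum and average FPS can be derived from a log of frame times, here is a small sketch. The function name and the sample data are hypothetical, and the per-second bucketing is an assumption about how such a minimum is typically computed, not a description of FRAPS internals.

```python
def fps_stats(frame_times_ms):
    """Return (minimum FPS, average FPS) from per-frame render times in ms.

    Minimum FPS is the lowest number of frames started within any full
    one-second window; average FPS is total frames over total time.
    """
    total_ms = sum(frame_times_ms)
    full_seconds = int(total_ms // 1000)
    if full_seconds < 1:
        raise ValueError("need at least one full second of data")

    frames_per_second = [0] * full_seconds
    start_ms = 0.0
    for frame_time in frame_times_ms:
        second = int(start_ms // 1000)
        if second < full_seconds:
            frames_per_second[second] += 1
        start_ms += frame_time

    average_fps = len(frame_times_ms) / (total_ms / 1000.0)
    return min(frames_per_second), average_fps


# Hypothetical run: one second at 50 FPS followed by one second at 25 FPS.
sample = [20] * 50 + [40] * 25
minimum, average = fps_stats(sample)
print(f"min: {minimum} FPS, avg: {average:.1f} FPS")
```

The example shows why both values are reported: the average hides the slow second that the minimum exposes.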

    Let's move directly to the tests.

    NVIDIA has decided to introduce another video card based on the GK110 chip, most likely the last one. This time, the GeForce GTX 780 Ti uses the full version of the "Kepler" architecture, which will continue to dominate the market until the advent of "Maxwell" next year. Judging by numerous rumors, the idea of introducing such a video card was born at NVIDIA quite a long time ago. Ever since the corresponding Tesla and Quadro video cards appeared, it was clear that a desktop version was just around the corner. The new GeForce GTX 780 Ti can be considered a response to the AMD Radeon R9 290X, but in any case it is NVIDIA's new flagship. The GeForce GTX Titan gives way to the GeForce GTX 780 Ti. But how much faster is the new product than the Titan? Or than the Radeon R9 290X? In our article we will answer all these questions.

    But before we get to the last chapter in the life of "Kepler", let us take a short excursion into the past. On March 22, 2012, NVIDIA introduced the GeForce GTX 680 video card, the first model based on the new "Kepler" GPU architecture. In GK104, the streaming multiprocessors known from the "Fermi" architecture grew from 32 to 192 CUDA cores each and were renamed SMX clusters. However, the changes affected not only the number of cores but also the proportions of control logic and computing units: in the new generation, the emphasis shifted to computing units. Eight SMX clusters of 192 CUDA cores each resulted in a total of 1536 cores.

    For almost a year, NVIDIA continued to optimize the "Kepler" architecture. As a result, in February 2013 the GK110 processor was released, containing up to 15 SMX clusters with 192 CUDA cores each, which gives 2880 cores. But the GeForce GTX Titan and the subsequent GTX 780 used fewer SMX clusters: 14 and 12, respectively. The reason, it seems to us, was the difficulty of chip production, since a die with 7.1 billion transistors and an area of 533 mm² is not easy to produce without defects on a 28-nm process. Even during the presentation of the first desktop video cards based on the GK110 chip, rumors began to circulate about the possible release of a "GeForce GTX Titan Ultra". And today this video card has finally appeared.
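
The core counts mentioned above are simply the number of active SMX clusters multiplied by 192. A short sketch of that arithmetic (the dictionary layout is just for illustration):

```python
CORES_PER_SMX = 192  # CUDA cores in one Kepler SMX cluster

# Active SMX clusters per card, as given in the text.
active_smx = {
    "GeForce GTX 680 (GK104)": 8,
    "GeForce GTX 780 (GK110)": 12,
    "GeForce GTX Titan (GK110)": 14,
    "GeForce GTX 780 Ti (GK110)": 15,
}

cuda_cores = {card: smx * CORES_PER_SMX for card, smx in active_smx.items()}

for card, cores in cuda_cores.items():
    print(f"{card}: {cores} CUDA cores")
```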

    Before we get into the details of the GeForce GTX 780 Ti, let us talk about the specifications.

    NVIDIA GeForce GTX 780 Ti

    • Retail price: RUB 24,990 in Russia; 649 euros in Europe
    • Manufacturer's website: NVIDIA

    Technical Specifications

    • GPU: GK110 (GK110-425-B1)
    • Technical process: 28 nm
    • Number of transistors: 7.1 billion
    • GPU clock speed (base frequency): 876 MHz
    • GPU clock speed (Boost frequency): 928 MHz
    • Memory frequency: 1,750 MHz
    • Memory type: GDDR5
    • Memory capacity: 3,072 MB
    • Memory bus width: 384 bit
    • Memory bandwidth: 336 GB/s
    • DirectX version: 11.1
    • Stream processors: 2,880
    • Texture blocks: 240
    • Raster Operation Pipelines (ROPs): 48
    • Pixel fill rate: 42 Gpixel/s
    • SLI/CrossFire: SLI
    • TDP: 250 W

    Of course, the distinguishing feature of the GeForce GTX 780 Ti is the full version of the GK110 processor. It is equipped with 15 SMX clusters, each with 192 CUDA cores, for a total of 2880 CUDA cores. The graphics processor is manufactured on a 28 nm process and contains 7.1 billion transistors on a 533 mm² die. Unlike AMD with its latest "Hawaii" video cards, NVIDIA has set a fixed base GPU frequency of 876 MHz, and GPU Boost raises the clock dynamically to at least 928 MHz. According to NVIDIA, most cards easily reach the 1 GHz mark in practice. This is the second NVIDIA video card with fast GDDR5 memory clocked at 1750 MHz. The memory is connected via a 384-bit interface, which gives a bandwidth of 336 GB/s. However, NVIDIA decided not to equip the card with 6 GB of video memory as standard: unlike the GeForce GTX Titan, it comes with only 3 GB. The situation may change once the first cards with alternative cooling systems are released, so GeForce GTX 780 Ti models with 6 GB of memory or even more may appear on the market. Now that the Radeon R9 290X and 290 come with 4 GB of memory, such a step seems quite logical to us.
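
    The 336 GB/s figure can be reproduced from the memory clock and the bus width. The sketch below assumes, as is usual for GDDR5, that four bits are transferred per pin for every cycle of the listed memory clock, so 1750 MHz corresponds to 7 Gbps per pin. The function name `gddr5_bandwidth_gb_s` is hypothetical and chosen only for this illustration.

```python
# Assumption: GDDR5 transfers 4 bits per pin per cycle of the listed memory clock.
GDDR5_TRANSFERS_PER_CLOCK = 4


def gddr5_bandwidth_gb_s(memory_clock_mhz: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    effective_rate_mt_s = memory_clock_mhz * GDDR5_TRANSFERS_PER_CLOCK  # mega-transfers/s
    bytes_per_transfer = bus_width_bits / 8  # the whole bus moves this many bytes at once
    return effective_rate_mt_s * bytes_per_transfer / 1000  # MB/s -> GB/s


# GeForce GTX 780 Ti: 1750 MHz memory clock on a 384-bit interface.
GTX_780_TI_BANDWIDTH = gddr5_bandwidth_gb_s(1750, 384)

if __name__ == "__main__":
    print(f"GeForce GTX 780 Ti: {GTX_780_TI_BANDWIDTH:.1f} GB/s")
```

    This is a theoretical peak only; the bandwidth actually achieved in games depends on access patterns and is lower.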

    The new GPU is equipped with 15 SMX clusters, which corresponds to 2880 CUDA stream processors and 240 texture units (16 per SMX cluster). The 384-bit memory interface is paired with 48 raster operation pipelines (ROPs). According to the manufacturer's specifications, the theoretical pixel fill rate is at least 42 Gpixel/s. Up to four GeForce GTX 780 Ti video cards can work together in an SLI configuration. Unlike high-end AMD graphics cards, the new NVIDIA cards still require an SLI bridge, although this may change in the next generation. The thermal design power (TDP) remains at 250 W, which may be another advantage over the Radeon R9 290X, whose TDP exceeds 250 W.
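
    The fill rate figures can be checked in the same way. The sketch below assumes that each ROP writes one pixel and each texture unit samples one texel per clock cycle, and that the 876 MHz base frequency is used; the helper `fill_rate_g_per_s` is a name made up for this example.

```python
BASE_CLOCK_MHZ = 876          # fixed base GPU frequency from the specifications
SMX_CLUSTERS = 15             # full GK110
TEXTURE_UNITS_PER_SMX = 16    # as stated in the text
ROPS = 48                     # raster operation pipelines


def fill_rate_g_per_s(units: int, clock_mhz: float) -> float:
    """Theoretical fill rate in giga-operations per second.

    Assumption: every unit completes one operation per clock cycle.
    """
    return units * clock_mhz / 1000  # mega-ops/s -> giga-ops/s


TEXTURE_UNITS = SMX_CLUSTERS * TEXTURE_UNITS_PER_SMX
PIXEL_FILL_RATE = fill_rate_g_per_s(ROPS, BASE_CLOCK_MHZ)
TEXEL_FILL_RATE = fill_rate_g_per_s(TEXTURE_UNITS, BASE_CLOCK_MHZ)

if __name__ == "__main__":
    print(f"Texture units:     {TEXTURE_UNITS}")
    print(f"Pixel fill rate:   {PIXEL_FILL_RATE:.1f} Gpixel/s")
    print(f"Texture fill rate: {TEXEL_FILL_RATE:.1f} Gtexel/s")
```

    Under these assumptions the pixel fill rate comes out at about 42 Gpixel/s, in line with the manufacturer's figure. With GPU Boost active, the real values will be somewhat higher.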

    GPU-Z screenshot of the NVIDIA GeForce GTX 780 Ti

    The GPU-Z screenshot confirms the key technical specifications and provides some additional details.

    Comparison of NVIDIA GeForce GTX 780 Ti with competitors

    | Model | AMD Radeon R9 290X | NVIDIA GeForce GTX 780 | NVIDIA GeForce GTX 780 Ti | NVIDIA GeForce GTX Titan |
    |---|---|---|---|---|
    | Retail price | about 480 euros | 20.2 thousand rubles; about 410 euros | 24,990 rubles; 649 euros | 32.2 thousand rubles; about 820 euros |
    | Manufacturer's website | AMD | NVIDIA | NVIDIA | NVIDIA |
    | Technical Specifications | | | | |
    | GPU | Hawaii XT | GK110 (GK110-300-A1) | GK110 (GK110-425-B1) | GK110 (GK110-400-A1) |
    | Technical process | 28 nm | 28 nm | 28 nm | 28 nm |
    | Number of transistors | 6.2 billion | 7.1 billion | 7.1 billion | 7.1 billion |
    | GPU clock speed (base frequency) | - | 864 MHz | 876 MHz | 837 MHz |
    | GPU clock speed (Boost frequency) | 1,000 MHz | 902 MHz | 928 MHz | 876 MHz |
    | Memory frequency | 1,250 MHz | 1,502 MHz | 1,750 MHz | 1,502 MHz |
    | Memory type | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
    | Memory capacity | 4,096 MB | 3,072 MB | 3,072 MB | 6,144 MB |
    | Memory bus width | 512 bit | 384 bit | 384 bit | 384 bit |
    | Memory bandwidth | 320.0 GB/s | 288.4 GB/s | 336 GB/s | 288.4 GB/s |
    | DirectX version | 11.2 | 11.1 | 11.1 | 11.1 |
    | Stream processors | 2,816 | 2,304 | 2,880 | 2,688 |
    | Texture blocks | 176 | 192 | 240 | 224 |
    | Raster Operation Pipelines (ROPs) | 64 | 48 | 48 | 48 |
    | TDP | > 250 W | 250 W | 250 W | 250 W |

    Comparison with both NVIDIA's own cards and the AMD competitor highlights the theoretical advantages and disadvantages of the GeForce GTX 780 Ti. Its lead in the number of stream processors is not that large, and its memory bandwidth is only slightly higher than that of the Radeon R9 290X. Memory capacity, Boost frequency and pixel fill rate (thanks to the larger number of raster operation pipelines), on the other hand, are higher for AMD. However, PowerTune and GPU Boost technologies now have a significant impact on performance that is difficult to estimate from theoretical specifications. So a comparison of theoretical computing performance, memory bandwidth and other technical characteristics is hardly that relevant. The test results will show how far the new video card can pull ahead of the GeForce GTX Titan and Radeon R9 290X.
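
    To put numbers on these paper differences, the values from the comparison table can be expressed as percentages relative to the GeForce GTX 780 Ti. This is only a rough sketch: the dictionary layout and the function names `percent_difference` and `compare` are our own, and the figures are copied from the table rather than measured.

```python
# Values copied from the comparison table in the article.
SPECS = {
    "Radeon R9 290X": {"stream_processors": 2816, "bandwidth_gb_s": 320.0, "rops": 64},
    "GeForce GTX 780": {"stream_processors": 2304, "bandwidth_gb_s": 288.4, "rops": 48},
    "GeForce GTX 780 Ti": {"stream_processors": 2880, "bandwidth_gb_s": 336.0, "rops": 48},
    "GeForce GTX Titan": {"stream_processors": 2688, "bandwidth_gb_s": 288.4, "rops": 48},
}


def percent_difference(value: float, reference: float) -> float:
    """How much larger (positive) or smaller (negative) value is, in percent."""
    return (value / reference - 1.0) * 100.0


def compare(card: str, reference: str) -> dict:
    """Per-specification percentage difference of card relative to reference."""
    return {
        key: percent_difference(SPECS[card][key], SPECS[reference][key])
        for key in SPECS[card]
    }


if __name__ == "__main__":
    for other in SPECS:
        if other == "GeForce GTX 780 Ti":
            continue
        diffs = compare("GeForce GTX 780 Ti", other)
        summary = ", ".join(f"{key} {value:+.1f}%" for key, value in diffs.items())
        print(f"GTX 780 Ti vs {other}: {summary}")
```

    Against the Radeon R9 290X this gives roughly 2% more stream processors, 5% more memory bandwidth and 25% fewer ROPs. Stream processors of different architectures are not directly comparable, which is one more reason to treat such figures with caution.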