OpenCL Surface Deformer demo with a mesh of 512×512 vertices

- GPU Computing: GeForce and Radeon OpenCL Test (Part 1)
- GPU Computing: GeForce and Radeon OpenCL Test (Part 3)
- GPU Computing: GeForce and Radeon OpenCL Test (Part 4 and conclusion)

The surface deformer is the first demo I coded in OpenCL. When I saw the simpleGL sample in the NVIDIA OpenCL SDK, I decided to improve it by rendering a real mesh with lighting instead of a grid of flat-colored vertices.

I started developing the demo on a GeForce GTS 250, and in the first versions I used a mesh plane of 800×800 vertices (around 1.2 million triangles). The mesh deformation ran at about 6 FPS… ouch, OpenCL is not really efficient! On a GTX 295, the demo ran at 14 FPS.

Then I tested the demo on a Radeon HD 5770. The difference was hard to believe: 51 FPS for the HD 5770! Two explanations: either Radeons are very, very powerful, or there is a problem somewhere. Actually both explanations are true: the Radeon HD 5000 is a fast card, and there are problems with NVIDIA's OpenCL driver.

The first problem on the GeForce side was ForceWare 190.89. A simple update to ForceWare 195.39 was enough to improve the frame rate a little: 12 FPS on the GTS 250 and 21 FPS on the GTX 295. And with the latest WHQL R195.62, NVIDIA's performance finally caught up with AMD's. So all GeForce tests, in this article and in the following ones, have been done with R195.62.

But it's still interesting to look at the reasons for the poor performance of NVIDIA's first OpenCL drivers. After some tests, reading and discussions, I found two things to optimize: transcendental functions and OpenCL work group size.

Transcendental functions (sine, cosine, …) are processed by the SFU, or Special Function Unit, on a GeForce. In the mesh deformer kernel, the sine function is called twice (once for the position and once for the normal).