digitalmars.D - OpenGL: C and D same code - different results

reply dyh <why.you.need my.email> writes:
Recently I tried a few OpenGL examples ported from C to D (on win32)
and ran into strange problems.

The translated example was built against the same C libs as the original, using the same
headers (translated from the platform SDK). The resulting binary uses the same DLLs
as the original. But there are some differences:

1. Behavior
In the original, calls to glIndexi() have *no* effect. In the translation they do.
In the original, calls to glColor3f() have an effect. In the translation they do *not*.
In the original the initial color is white. In the translation it is kind of brown.

2. Performance
The original example performs noticeably faster than the translated one.

This holds no matter which compiler switches I try (-O, -inline, -release, etc).
The example is extremely simple, and I do not see anything that could cause a
difference in performance. The GC is not used: there are no memory
allocations and no array slices. Actually there is no array usage at all. In fact
there is nothing at all except OpenGL API calls. And it is literally the same
library in both the example and the translation.

Here is the code (~300 lines):
original (C) http://paste.dprogramming.com/dpfwrsgw.php
translation (D) http://paste.dprogramming.com/dpu768pr.php

Any tips from opengl experts?
Feb 06 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
dyh wrote:
 recently I've tried few OpenGL examples ported from C to D (on win32)
 and have run into strange problems.
 
 Translated example was built against same C libs as original, using same
 (translated from platform sdk) headers. Resulted binary is using same dlls
 as original. But there are some differences:
 
 1. Behavior
 In original calls glIndexi() has *no* effect. In translation has.
 In original calls glColor3f() has effect. In translation has *not*.
 In original initial color is white. In translation it is kind of brown.
 
 2. Performance
 original example performs noticeably faster than translated one.
 
 No matter what compiler switches I've tried (-O, -inline, -release, etc).
 Example is extremely simple, and i do not see any possibilities to have any
 difference in performance. There is no GC used - there are no memory
 allocations array slice at all. Actually there are no array usage. In fact
 there is nothing at all except opengl api calls. And it is literally same
 library in both example and translation.
 
 Here is code (~300 lines) 
 original (C) http://paste.dprogramming.com/dpfwrsgw.php
 translation (D) http://paste.dprogramming.com/dpu768pr.php.
 
 Any tips from opengl experts?

I don't know why, but it seems pretty clear from the results you're seeing that the original version didn't actually get the PFD_TYPE_COLORINDEX visual it was asking for, whereas the D version does. That would explain the performance difference too, because a color index visual is probably going to be slow on most modern hardware. Recent hardware may not even support color index buffers, so it may mean you're getting a fallback software renderer in the D case.

Is there some reason why you really need to use a color index visual? You'd be much better off with a true color visual.

--bb
Feb 06 2007
parent Dave <Dave_member pathlink.com> writes:
The depth and breadth of experience and knowledge in this NG continues to amaze
me...

Bill Baxter wrote:
 dyh wrote:
 recently I've tried few OpenGL examples ported from C to D (on win32)
 and have run into strange problems.

 Translated example was built against same C libs as original, using same
 (translated from platform sdk) headers. Resulted binary is using same 
 dlls
 as original. But there are some differences:

 1. Behavior
 In original calls glIndexi() has *no* effect. In translation has.
 In original calls glColor3f() has effect. In translation has *not*.
 In original initial color is white. In translation it is kind of brown.

 2. Performance
 original example performs noticeably faster than translated one.

 No matter what compiler switches I've tried (-O, -inline, -release, etc).
 Example is extremely simple, and i do not see any possibilities to 
 have any
 difference in performance. There is no GC used - there are no memory
 allocations array slice at all. Actually there are no array usage. In 
 fact
 there is nothing at all except opengl api calls. And it is literally same
 library in both example and translation.

 Here is code (~300 lines) original (C) 
 http://paste.dprogramming.com/dpfwrsgw.php
 translation (D) http://paste.dprogramming.com/dpu768pr.php.

 Any tips from opengl experts?

I don't know why, but it seems pretty clear from the results you're seeing that the original version didn't actually get the PFD_TYPE_COLORINDEX visual it was asking for, whereas the D version does. That would explain the performance difference too because a color index visual is probably going to be slow on most modern hardware. Recent hardware may not even support color index buffers, so it may mean you're getting a fallback software renderer in the D case. Is there some reason why you really need to use a color index visual? You'd be much better off with a true color visual. --bb

Feb 06 2007
prev sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
dyh wrote:
 recently I've tried few OpenGL examples ported from C to D (on win32)
 and have run into strange problems.
 
 Translated example was built against same C libs as original, using same
 (translated from platform sdk) headers. Resulted binary is using same dlls
 as original. But there are some differences:
 
 1. Behavior
 In original calls glIndexi() has *no* effect. In translation has.
 In original calls glColor3f() has effect. In translation has *not*.
 In original initial color is white. In translation it is kind of brown.
 
 2. Performance
 original example performs noticeably faster than translated one.
 
 No matter what compiler switches I've tried (-O, -inline, -release, etc).
 Example is extremely simple, and i do not see any possibilities to have any
 difference in performance. There is no GC used - there are no memory
 allocations array slice at all. Actually there are no array usage. In fact
 there is nothing at all except opengl api calls. And it is literally same
 library in both example and translation.
 
 Here is code (~300 lines) 
 original (C) http://paste.dprogramming.com/dpfwrsgw.php
 translation (D) http://paste.dprogramming.com/dpu768pr.php.
 
 Any tips from opengl experts?

No tips, since I'm pretty new at this myself, but I've just finished translating nVidia's glsl_pseudo_instancing* sample from C++ to D, and believe it or not, D's version is actually faster (albeit by a mere 1%).

That really surprised me, since the MSVC 2005 compiler is really good at FPU stuff, plus that whole-program-optimization thing they have.

So I'm confident there must be something else going on in your case. A literal translation to D should not cause big differences. Well, no negative differences anyway ;)

By the way, the C++ version had many uninitialized variables. Thanks, Walter, for making floats default to NaN. It's really easy to track the source of an uninited variable this way!

L.

(*If anybody wants that app, just say so)
Feb 07 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Lionello Lunesu wrote:
 No tips, since I'm pretty new at this myself, but I've just finished 
 translating nVidia's glsl_pseudo_instancing* sample from C++ to D, and 
 believe it or not, D's version is actually faster (albeit by a mere 1%).
 
 That really surprised me, since the MSVC 2005 compiler is really good at 
 FPU stuff, plus that whole-program-optimization thing they have.
 
 So I'm confident there must be something else going on in your case. A 
 literal translation to D should not cause big differences. Well, no 
 negative differences anyway ;)
 
 By the way, the C++ version had many uninitialized variables. Thanks, 
 Walter, for making floats default to NaN. It's really easy to track the 
 source of a uninited variable this way!
 
 L.
 
 (*If anybody wants that app, just say it)

I'd be interested in seeing the code. --bb
Feb 07 2007
parent Lionello Lunesu <lio lunesu.remove.com> writes:
Bill Baxter wrote:
 Lionello Lunesu wrote:
 No tips, since I'm pretty new at this myself, but I've just finished 
 translating nVidia's glsl_pseudo_instancing* sample from C++ to D, and 
 believe it or not, D's version is actually faster (albeit by a mere 1%).

 That really surprised me, since the MSVC 2005 compiler is really good 
 at FPU stuff, plus that whole-program-optimization thing they have.

 So I'm confident there must be something else going on in your case. A 
 literal translation to D should not cause big differences. Well, no 
 negative differences anyway ;)

 By the way, the C++ version had many uninitialized variables. Thanks, 
 Walter, for making floats default to NaN. It's really easy to track 
 the source of a uninited variable this way!

 L.

 (*If anybody wants that app, just say it)

I'd be interested in seeing the code. --bb

Thought I'd check the license.txt:

"... Developer agrees not distribute the Materials or any derivative works created therewith without the express written permission of an authorized NVIDIA officer or employee. ..."

:(

I've written them an e-mail...

L.
Feb 07 2007
prev sibling parent reply Wolfgang Draxinger <wdraxinger darkstargames.de> writes:
Lionello Lunesu wrote:

 dyh wrote:
 recently I've tried few OpenGL examples ported from C to D (on
 win32) and have run into strange problems.
 
 Translated example was built against same C libs as original,
 using same (translated from platform sdk) headers. Resulted
 binary is using same dlls as original. But there are some
 differences:
 
 1. Behavior
 In original calls glIndexi() has *no* effect. In translation
 has. In original calls glColor3f() has effect. In translation
 has *not*. In original initial color is white. In translation
 it is kind of brown.
 
 2. Performance
 original example performs noticeably faster than translated
 one.
 
 No matter what compiler switches I've tried (-O, -inline,
 -release, etc). Example is extremely simple, and i do not see
 any possibilities to have any difference in performance. There
 is no GC used - there are no memory allocations array slice at
 all. Actually there are no array usage. In fact there is
 nothing at all except opengl api calls. And it is literally
 same library in both example and translation.
 
 Here is code (~300 lines)
 original (C) http://paste.dprogramming.com/dpfwrsgw.php
 translation (D) http://paste.dprogramming.com/dpu768pr.php.
 
 Any tips from opengl experts?


Yes, me. The problem is that you're using indexed mode. For some reason the non-D example does not get an index colour mode, but an RGB mode.

The OpenGL bindings for D that I've seen so far circumvent the usual linkage to the DLL, which normally happens by naming the DLL in the executable header. Instead the D bindings use LoadLibrary and GetProcAddress. Maybe this makes the D version actually get the index mode.

That also explains why glColor3f has an effect in the C99 example: in index mode glColor doesn't work, period. Instead you must set a palette for your drawable, which is then accessed by the index values. Since you don't set a palette, and glColor doesn't work, you will get only white shapes in index mode.

The bad performance is caused by the simple fact that index mode is no longer supported by modern hardware and must be emulated by the OpenGL software renderer, which is by nature very slow. In fact OpenGL2.0 no longer has indexed color mode. Just don't use it and always do things in RGB(A). If you really need an indexed image, first render in RGB(A) and dither afterwards.

Wolfgang Draxinger
-- 
E-Mail address works, Jabber: hexarith jabber.org, ICQ: 134682867
Feb 08 2007
parent reply Wolfgang Draxinger <wdraxinger darkstargames.de> writes:
Wolfgang Draxinger wrote:

 The OpenGL bindings for D, that I've seen so far circumvent the
 normally used linkage to the DLL, which is normally happening
 by specifying the DLL in the Executable header. Instead the D
 bindings use LoadLibrary and GetProcAddress. Maybe this makes
 the D version to actually get the index mode.

Got the explanation for that one, too: modern drivers intercept the linkage on executable load to insert some of their own juice into the code, mainly to make usage of features like antialiasing and PBuffers more efficient. Naturally, getting the function pointers via LoadLibrary and GetProcAddress will circumvent this and give you only the vanilla opengl32.dll, which will happily serve you a software-emulated index colour mode.

So instead of loader hacks, an OpenGL binding for D should just have a pragma to link against opengl32.lib (on Windows) or libGL.so on *nix and provide the identifiers.

Extension loading should be done by wglGetProcAddress or glXGetProcAddress _after_ a valid OpenGL context has been acquired anyway. Sooner or later I will fork/extend GLEW to GLEW'D (aka a GLEW that creates D code instead of C).

Wolfgang Draxinger
Feb 08 2007
parent reply Mike Parker <aldacron71 yahoo.com> writes:
Wolfgang Draxinger wrote:
 Wolfgang Draxinger wrote:
 
 The OpenGL bindings for D, that I've seen so far circumvent the
 normally used linkage to the DLL, which is normally happening
 by specifying the DLL in the Executable header. Instead the D
 bindings use LoadLibrary and GetProcAddress. Maybe this makes
 the D version to actually get the index mode.

 Got the explanation for that one, too: Modern drivers intercept the linkage on executable load to insert some of their own juice into the code. Mainly to make usage of features like antialiasing and PBuffers more efficient. Naturally getting the functions pointers via LoadLibrary and GetProcAddress will circumvent this and give you only the vanilla opengl32.dll which will happily serve you a software emulated index colour mode.

While it's true that statically linking to the import library on Windows will do some jiggity foo to get the current driver implementation loaded, going through LoadLibrary does not affect this. Every game out there based on the Quake 2 & 3 engines loads dynamically. All of the games using GarageGames' Torque Game Engine do it. So do the Java games out there using LWJGL (like Tribal Trouble, Bang! Howdy, and the games from PuppyGames.net).

Likely several other games I'm not aware of do the same.

You can test this with DerelictGL. I've used it several times in testing and always get a hardware-accelerated 32-bit color mode.

Antialiasing is set up during context creation via wgl extensions on Windows. pbuffers are through extensions also.
 
 Extension loading should be done by wglGetProcAddress or
 glxGetProcAddress _after_ a valid OpenGL context has been
 aquired anyway. Sooner or later I will fork/extend GLEW to
 GLEW'D (aka a GLEW that creates D code instead of C).

Yes, this is an issue on Windows. If the context is not created, you cannot properly load extensions. When you change contexts, there is no guarantee that previously loaded extensions will be valid. That's why extension loading is separated from DLL loading in DerelictGL. It also has a mechanism to reload extensions when you want to switch contexts.

So I think the OP's problem lies elsewhere. Besides, AFAIK, DerelictGL is the only binding that does go through LoadLibrary. The other OpenGL bindings I've seen all link statically to the import library.
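
The point about loading extensions only once a context exists is usually paired with a check that the extension is advertised at all. The following is a minimal sketch in C, not code from DerelictGL or from this thread: `has_extension` is a hypothetical helper name, and it assumes the caller passes the space-separated extension list that the current context reports.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears as a whole,
   space-delimited token in `ext_list`, and 0 otherwise.
   A plain strstr() is not enough, because one extension name can be
   a prefix of another. */
int has_extension(const char *ext_list, const char *name)
{
    size_t len;
    const char *p;

    if (ext_list == NULL || name == NULL || *name == '\0' ||
        strchr(name, ' ') != NULL) {
        return 0;
    }

    len = strlen(name);
    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        int starts = (p == ext_list) || (p[-1] == ' ');
        /* ...and must end at the end of the list or right before a space. */
        int ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends) {
            return 1;
        }
        p += len;
    }
    return 0;
}
```

With a current context, the list would typically come from glGetString(GL_EXTENSIONS) (or from the WGL extension string for wgl extensions), and the entry points would be fetched with wglGetProcAddress only when the name is present. After switching contexts, both the check and the lookup would be repeated.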
Feb 08 2007
parent reply Wolfgang Draxinger <wdraxinger darkstargames.de> writes:
Mike Parker wrote:

 While it's true that when statically linking to the import
 library on Windows will do some jiggity foo to get the current
 driver implementation loaded, going through LoadLibrary does
 not affect this. Every game out there based on the Quake 2 & 3
 engines loads dynamically.

I think the main reason for this is that a few years ago there were different versions of opengl32.dll (the SGI version, the Win9x version and the WinNT version, though the Win9x and WinNT DLLs are almost identical; the only difference is a value in the version resource entry). Linking without LoadLibrary could have caused some trouble, since some compilers created the link based on ordinal numbers, and those were not necessarily identical to the ones in the library installed on the developer's system.
 All of the games using the GarageGame's Torque Game Engine do
 it. So do the Java games out there using LWJGL (like Tribal
 Trouble, Bang! Howdy, and the games from PuppyGames.net).

 Likely several other games I'm not aware of do the same.

My engine does this, too, but for another reason: instead of PE or ELF binaries, the modules are contained in a custom, platform-independent format. Apart from OS-specific code, those modules can be loaded and run on any OS, as long as it is running on the architecture the code was compiled for. So the core game code must be compiled only once per architecture. And since only two architectures remain on which games are played (x86 and x86_64), this means I have to compile only twice for most of the code.

Of course the normal executable loader is ignorant of such a custom module format, so there is a small wrapper bootstrapping it. All system libraries must then of course be loaded through dlopen/LoadLibrary.

I think the reason that id's engines also use LoadLibrary is similar: they all contain some VM that should get access to OpenGL => LoadLibrary. In Java there is no other option.
 You can test this with DerelictGL. I've 
 used it several times in testing and always get a
 hardware-accelerated 32-bit color mode.

Well, you will have no problem getting a hardware-accelerated mode, since all OpenGL implementations can do this. But if you request a mode that the driver can't deal with, you'll get software emulation. Indexed colour mode is such a mode. Now some drivers intercept this and give the application a "mode not supported" error instead.
 Antialiasing is set up during context creation via wgl
 extensions on Windows. pbuffers are through extensions also.

Yes, this is true of course, but some drivers use code injection as a workaround to intercept program calls that might cause buggy behaviour when sharing contexts. I remember a nasty stencil buffer bug on R300 cards that was worked around with that method. Unfortunately the engines you mentioned circumvent this, rendering viewports larger than 1600x1200 on an R300 unusable in the first version of the "fixed" drivers. Later versions were injecting some code into the engine's executable itself to fix it.

Wolfgang Draxinger
Feb 08 2007
parent reply Mike Parker <aldacron71 yahoo.com> writes:
Wolfgang Draxinger wrote:

 
 I think the reason, that id's engine also use LoadLibrary is a
 similair: They all contain some VM that should get access to
 OpenGL => LoadLibrary. In Java there is no other option.

I'm not sure why id did this for Q3. Perhaps just because they had always done it that way? In Q2, true, there was the mini-driver for 3dfx cards to consider. That's why they implemented it that way then. But in Q3, AFAIK from looking at the source, the QVM doesn't care where the GL funcs come from. All GL calls are contained in the rendering modules and not exposed elsewhere. The benefit of this is that the function pointers can be replaced with debug wrappers.

As for Java, it's not the only option. Only the JNI DLL need be loaded dynamically. It can statically link to opengl32.lib itself.
 
 Well, you will have no problems getting a hardware accelerated
 mode, since all OpenGL implementations can do this. But if you
 request a mode, that the driver can't deal with you'll get
 software emulation. Indexed colour mode is such a mode. Now some
 drivers intercept this and give the application a "mode not
 supported" error instead.

That's why it's up to the programmer to verify the pixel format when it is returned to ensure it is hardware accelerated. On Windows, when software emulation is being used this can be determined by testing the pixel format flags. It can also be seen in the return of glGetString(GL_VENDOR).
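
As an illustration of the flag test described above, here is a minimal sketch in C. It is not code from the thread: `classify_pixel_format` and `is_generic_vendor` are hypothetical helper names, the flag values are copied from wingdi.h so that the logic also compiles off Windows, and `flags` is assumed to be the dwFlags field of the PIXELFORMATDESCRIPTOR that DescribePixelFormat() fills in for the chosen format. The vendor check assumes that Microsoft's generic software implementation reports a GL_VENDOR string containing "Microsoft".

```c
#include <string.h>

#ifdef _WIN32
#include <windows.h>   /* real PFD_* constants */
#else
/* Values copied from wingdi.h so the sketch also builds off Windows. */
#define PFD_GENERIC_FORMAT      0x00000040
#define PFD_GENERIC_ACCELERATED 0x00001000
#endif

typedef enum {
    ACCEL_FULL,      /* format comes from a hardware driver */
    ACCEL_PARTIAL,   /* generic format accelerated by a driver */
    ACCEL_SOFTWARE   /* generic (GDI) software implementation only */
} accel_kind;

/* Hypothetical helper: classify the dwFlags of the pixel format that
   was actually selected. */
accel_kind classify_pixel_format(unsigned long flags)
{
    if (flags & PFD_GENERIC_FORMAT) {
        return (flags & PFD_GENERIC_ACCELERATED) ? ACCEL_PARTIAL
                                                 : ACCEL_SOFTWARE;
    }
    return ACCEL_FULL;
}

/* Hypothetical helper: second opinion based on the string returned by
   glGetString(GL_VENDOR) once a context is current. */
int is_generic_vendor(const char *vendor)
{
    return vendor != NULL && strstr(vendor, "Microsoft") != NULL;
}
```

In a real program the flags would be read back after the format is selected, and an ACCEL_SOFTWARE result would be a reason to request a different format or at least to warn the user.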
 
 Yes, this is true of course, but some drivers use code injection
 as a workaround to intercept program calls that might cause
 buggy behaviour when sharing the contexts. I remember of a nasty
 stencil buffer bug on R300 cards, that was workarounded with
 that method. Unfortunately the engines you mentioned
 circumenvent this, rendering Viewports larger than 1600x1200 on
 a R300 unusable in the first version of the "fixed" drivers.
 Later versions where injecting some code into the engine's
 executable itself to fix it.

This is new information for me. Considering that dynamic loading of OpenGL is such a common practice, that's rather bad form on the vendor's part. I'll definitely have to look into this.

At any rate, we still haven't solved the OP's problem :)
Feb 08 2007
next sibling parent Wolfgang Draxinger <wdraxinger darkstargames.de> writes:
Mike Parker wrote:

 At any rate, we still haven't solved the OP's problem :)

The solution is simple: don't use indexed mode. The non-D version falls back to RGB(A) mode (glColor3f working indicates this). And index mode nowadays is only supported through SW emulation. Just don't use it.

Wolfgang Draxinger
Feb 08 2007
prev sibling parent dyh <why.you.need my.email> writes:
Mike Parker Wrote:
 At any rate, we still haven't solved the OP's problem :)

Sorry for my absence. I have been watching the thread, but wasn't able to do any actual research till now...

The cause of the described behavior turned out to be a different pixel format selected by ChoosePixelFormat(): it differs between the C and D versions. And it is entirely my fault: in the D version I removed the initialization of iPixelType to PFD_TYPE_COLORINDEX, as it is deprecated and, as far as I remember, the docs say it is ignored anyway.

What happens later is all explained in this thread by you guys. Thanks a lot.

Regards.
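
A mismatch of this kind can be caught early by comparing the pixel type that was requested with the one the selected format really has. The following is a minimal sketch in C under stated assumptions, not the code from either paste: `pixel_type_name` and `report_pixel_type` are hypothetical helper names, the two constants are copied from wingdi.h so the logic also compiles off Windows, and `obtained` is assumed to be the iPixelType field read back with DescribePixelFormat() for the format index that ChoosePixelFormat() returned.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>   /* real PFD_TYPE_* constants */
#else
/* Values copied from wingdi.h so the sketch also builds off Windows. */
#define PFD_TYPE_RGBA       0
#define PFD_TYPE_COLORINDEX 1
#endif

/* Hypothetical helper: readable name for an iPixelType value. */
const char *pixel_type_name(int pixel_type)
{
    switch (pixel_type) {
    case PFD_TYPE_RGBA:       return "RGBA";
    case PFD_TYPE_COLORINDEX: return "color index";
    default:                  return "unknown";
    }
}

/* Hypothetical helper: returns 1 when the selected format has the
   requested pixel type; otherwise prints a diagnostic and returns 0. */
int report_pixel_type(int requested, int obtained)
{
    if (requested == obtained) {
        return 1;
    }
    fprintf(stderr, "pixel format mismatch: requested %s, got %s\n",
            pixel_type_name(requested), pixel_type_name(obtained));
    return 0;
}
```

Running the same check in both the C and the D build would show directly whether the two programs end up with the same kind of pixel format, before any glColor3f() or glIndexi() call is made.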
Feb 12 2007