I want to render large triangle meshes with lighting and smooth
shading on Windows with either an ATI or NVIDIA graphics
card.
What is currently the fastest way to do this in OpenGL:
- display lists?
  maybe with minimization of state changes by grouping
  similar normal attributes
- vertex arrays?
- strips?
- other?
Any help would be greatly appreciated,
pierre
--
Pierre Alliez
INRIA Sophia-Antipolis
http://www-sop.inria.fr/geometrica/team/Pierre.Alliez/
An OpenGL-based Tutorial for CGAL:
http://www.cgal.org/Tutorials/Polyhedron/index.html
(Pierre, 6/7/2005 12:46:54 PM)
Pierre Alliez wrote:
> I want to render large triangle meshes with lighting and smooth
> shading on windows with either an ATI or NVidia graphics
> card.
>
> what is currently the fastest way to use OpenGL:
> - vertex arrays ?
Vertex arrays - put them on the graphics
card with the vertex_buffer_object extension.
Arrays without indices are faster, indices
are a bit slower but save memory. Your choice.
> - strips ?
Definitely not strips.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
In science it often happens that scientists say, 'You know
that's a really good argument; my position is mistaken,'
and then they actually change their minds and you never
hear that old view from them again. They really do it.
It doesn't happen as often as it should, because scientists
are human and change is sometimes painful. But it happens
every day. I cannot recall the last time something like
that happened in politics or religion.
- Carl Sagan, 1987 CSICOP keynote address
(fungus, 6/7/2005 1:11:16 PM)
fungus wrote:
> Pierre Alliez wrote:
>> I want to render large triangle meshes with lighting and smooth
>> shading on windows with either an ATI or NVidia graphics
>> card.
>>
>> what is currently the fastest way to use OpenGL:
>> - vertex arrays ?
>
> Vertex arrays - put them on the graphics
> card with the vertex_buffer_object extension.
>
> Arrays without indices are faster, indices
> are a bit slower but save memory. Your choice.
Are you sure about that? Without indices you have to repeat vertices, so a
lot of redundant calculations are done. With indices and a reasonable
ordering of the vertices, the vertex cache can reduce that
redundancy. Or am I missing something?
>> - strips ?
>
> Definitely not strips.
Doesn't that depend on how well the mesh can be stripped?
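To illustrate the vertex-cache point raised in this post, here is a minimal sketch in Python. It assumes a simple FIFO cache of recently transformed vertices; the 16-entry size, the FIFO policy and the helper names (transforms_needed, grid_indices) are illustrative assumptions, not a description of any particular card. A cache size of 0 models drawing without any reuse, which is what happens when every triangle carries its own copies of its vertices.

```python
from collections import deque


def transforms_needed(indices, cache_size=16):
    """Count vertex transforms, assuming a FIFO cache of transformed vertices.

    cache_size is a hypothetical parameter; real hardware may differ.
    """
    cache = deque(maxlen=cache_size)
    misses = 0
    for i in indices:
        if i not in cache:
            misses += 1       # vertex has to be transformed (again)
            cache.append(i)   # oldest entry drops out when the cache is full
    return misses


def grid_indices(w, h):
    """Index list (independent triangles) for a w x h grid of quads, row by row."""
    idx = []
    for y in range(h):
        for x in range(w):
            a = y * (w + 1) + x   # lower-left corner of the quad
            b = a + 1             # lower-right
            c = a + (w + 1)       # upper-left
            d = c + 1             # upper-right
            idx += [a, b, c, b, d, c]
    return idx


idx = grid_indices(4, 4)
print("index count:", len(idx))
print("transforms, no reuse:", transforms_needed(idx, cache_size=0))
print("transforms, 16-entry FIFO:", transforms_needed(idx))
```

How much the cache helps depends on the order of the indices, which is why the "reasonable ordering" caveat matters. Whether the saved transforms show up as frame time is a separate question that only a measurement on the target card can answer.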
(Rolf, 6/7/2005 2:43:38 PM)
Rolf Magnus wrote:
> fungus wrote:
>>Arrays without indices are faster, indices
>>are a bit slower but save memory. Your choice.
>
> Sure about that? I mean, you have to repeat vertices without indices, so a
> lot of redundant calculations are done.
>
It's going to depend on your graphics card
but anything recent can do the calculations,
no problem.
>
>>>- strips ?
>>
>>Definitely not strips.
>
>
> Doesn't that depend on how good the mesh can be stripped?
Again, it depends on your card. A modern card
can transform vertices so fast that it's going
to be sitting around waiting for you to send
the next strip.
Best to just give the card a huge chunk of data
to play with and let it do its thing.
(fungus, 6/7/2005 3:14:30 PM)
It depends on your data, I suppose. If it's static, then display lists
could be the best solution. Triangle strips can be used in this case, as
in many others, and could be a good choice: sending N vertices draws
N-2 triangles as a strip, instead of N/3 as independent triangles. With
triangle strips you have to be careful with normals, though. Maybe you
should use VBOs (Vertex Buffer Objects) too. If you are rendering dynamic
data, then it's probably better to use vertex arrays/VBOs than display
lists. Triangle strips are a good idea either way.
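The vertex counts mentioned above can be checked with a few lines of Python. This is only a sketch of the arithmetic for the ideal case of one unbroken strip; the function name vertices_sent is made up for illustration, and real meshes usually need several strips, so the saving is smaller in practice.

```python
def vertices_sent(num_triangles, mode):
    """Vertices submitted to draw num_triangles in a single batch.

    mode "triangles": independent triangles, three vertices each.
    mode "strip": one unbroken strip, each new vertex adds one triangle.
    """
    if num_triangles < 1:
        raise ValueError("need at least one triangle")
    if mode == "triangles":
        return 3 * num_triangles
    if mode == "strip":
        return num_triangles + 2
    raise ValueError("unknown mode: " + mode)


for t in (1, 10, 1000):
    print(t, "triangles:",
          vertices_sent(t, "triangles"), "vertices as triangles,",
          vertices_sent(t, "strip"), "as one strip")
```

For long strips the ratio approaches 3:1, which is the same statement as "N vertices give N-2 triangles instead of N/3".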
(BladeWise, 6/7/2005 3:25:44 PM)
BladeWise wrote:
> if you are rendering dynamic data, then maybe it's better
> to use VertexArrays/VBOs than DisplayLists...
These days I think VBOs are always better than
display lists. There's been some debate about
which is faster but I think the difference is
in how you do the VBO.
Indexed arrays are slower, this is easy to test.
Display lists are usually as fast as unindexed
arrays ... which leads me to suspect that the
display list compiler always makes unindexed
arrays no matter what data you pass to it.
For big objects this will use a lot more
valuable graphics card memory, which could
be a disaster in the making.
Best to use VBOs so you stay in control
and can decide the storage format yourself.
> TriangleStrips is anyway
> a good idea...
>
No!
Well, it depends. If you put them in a display
list the display list compiler will unmesh
them for you, see above.
(fungus, 6/7/2005 3:51:38 PM)
fungus wrote:
> BladeWise wrote:
>> if you are rendering dynamic data, then maybe it's better
>> to use VertexArrays/VBOs than DisplayLists...
>
> These days I think VBOs is always better than
> display lists. There's been some debate about
> which is faster but I think the difference is
> in how you do the VBO.
>
> Indexed arrays are slower, this is easy to test.
> Display lists are usually as fast as unindexed
> arrays ... which leads me to suspect that the
> display list compiler always makes unindexed
> arrays no matter what data you pass to it.
>
> For big objects this will use a lot more
> valuable graphics card memory, which could
> be a disaster in the making.
>
> Best to use VBOs so you stay in control
> and can decide the storage format yourself.
IIRC, in the last major thread on this we did some objective tests which
showed that isn't always true. IIRC, ATi cards do much better with display
lists.
>> TriangleStrips is anyway
>> a good idea...
>>
>
> No!
>
> Well, it depends. If you put them in a display
> list the the display list compiler will unmesh
> them for you, see above.
The last time I tested that it wasn't true either.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
(Jon, 6/8/2005 1:14:44 AM)
Jon Harrop wrote:
>
>>Well, it depends. If you put them in a display
>>list the the display list compiler will unmesh
>>them for you, see above.
>
>
> The last time I tested that it wasn't true either.
>
How can you tell? A display list is a black box.
(fungus, 6/8/2005 1:46:51 AM)
Jon Harrop wrote:
>
> IIRC, in the last major thread on this we did some objective tests which
> showed that isn't always true. IIRC, ATi cards do much better with display
> lists.
>
I think those tests didn't distinguish between
indexed/non-indexed arrays and that's why the
results we got were mixed.
If the drivers really do de-index the vertex
arrays (which I'm starting to believe they do)
then we need to do a new test...
(fungus, 6/8/2005 2:35:02 AM)
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:CNspe.43866$dr.40901@news.ono.com...
> Jon Harrop wrote:
> >
> > IIRC, in the last major thread on this we did some objective tests which
> > showed that isn't always true. IIRC, ATi cards do much better with display
> > lists.
> >
>
> I think those tests didn't distinguish between
> indexed/non-indexed arrays and that's why the
> results we got were mixed.
>
> If the drivers really do de-index the vertex
> arrays (which I'm starting to believe they do)
> then we need to do a new test...
It appears that NVIDIA drivers run some sort of stripifying algorithm on
display list triangles. That's the only reason I can figure the memory use
goes up so high for a while as it's compiling them, especially with unsorted
piles of triangles.
After chasing this issue for years, I've concluded that what's best is
what's best for your app. If you have to go through great pains & data
duplication to use a VBO over a display list, it's probably not worth it.
I wouldn't try to jam a square peg into a round hole just to use VBOs, or
display lists for that matter. It is important to test both representative
and worst-case data loads before committing to one way or the other.
jbw
(JB, 6/8/2005 3:03:15 AM)
On Tue, 07 Jun 2005 14:46:54 +0200, Pierre Alliez
<pierre.alliez@sophia.inria.fr> wrote:
Over the last few days a colleague and I optimized our old OpenGL code.
Switching to display lists did nothing for speed.
After rewriting the code to use Vertex Buffer Objects, redrawing a
scene is now twice as fast.
Howie
(Howie, 6/8/2005 5:33:42 AM)
fungus wrote:
> Jon Harrop wrote:
>>>Well, it depends. If you put them in a display
>>>list the the display list compiler will unmesh
>>>them for you, see above.
>>
>> The last time I tested that it wasn't true either.
>
> How can you tell? A display list is a black box.
The performance of strips in a display list was significantly higher. I did
that test many years ago though, so it may well have changed.
In fact, I noticed very recently that compiling a display list takes a lot
longer on my new computer than it ever used to...
(Jon, 6/8/2005 6:50:31 AM)
Howie wrote:
> Switch to displaylists has done nothing in speed.
> After rebuild the code to use Vertex Buffer Objects the redraw of an
> scene ist twice fast now.
Are you using nVidia hardware?
(Jon, 6/8/2005 6:51:16 AM)
fungus wrote:
> Rolf Magnus wrote:
>> fungus wrote:
>>>Arrays without indices are faster, indices
>>>are a bit slower but save memory. Your choice.
>>
>> Sure about that? I mean, you have to repeat vertices without indices, so
>> a lot of redundant calculations are done.
>>
>
> It's going to depend on your graphics card
> but anything recent can do the calculations,
> no problem.
With the OpenGL default render pipeline maybe, but what if you use some
rather complex vertex shaders?
>>>>- strips ?
>>>
>>>Definitely not strips.
>>
>>
>> Doesn't that depend on how good the mesh can be stripped?
>
> Again, it depends on your card. A modern card
> can transfrom vertices so fast that its going
> to be sitting around waiting for you to send
> the next strip.
Well, I wasn't thinking about converting them on the fly, but rather about
precalculating the strips, so you can send the vertices in at the same
speed, but get a smaller number of vertices to send.
> Best to just give the card a huge chunk of data to play with and let it do
> its thing.
I just wish there was some wiki or something where everyone can write his
results of benchmarking different things related to OpenGL, so that one
doesn't need to test everything that others already tested.
(Rolf, 6/8/2005 8:53:50 AM)
Rolf Magnus wrote:
>
> With the OpenGL default render pipeline maybe, but what if you use some
> rather complex vertex shaders?
>
:-)
Let's just go back to the universal answer to all
OpenGL performance questions:
"It depends..."
(fungus, 6/8/2005 9:28:37 AM)
On Wed, 08 Jun 2005 07:51:16 +0100, Jon Harrop <usenet@jdh30.plus.com>
wrote:
>Howie wrote:
>> Switch to displaylists has done nothing in speed.
>> After rebuild the code to use Vertex Buffer Objects the redraw of an
>> scene ist twice fast now.
>
>Are you using nVidia hardware?
Yes.
Howie
(Howie, 6/8/2005 10:02:25 AM)
Howie wrote:
> On Wed, 08 Jun 2005 07:51:16 +0100, Jon Harrop <usenet@jdh30.plus.com>
> wrote:
>>Howie wrote:
>>> Switch to displaylists has done nothing in speed.
>>> After rebuild the code to use Vertex Buffer Objects the redraw of an
>>> scene ist twice fast now.
>>
>>Are you using nVidia hardware?
>
> Yes.
From the tests we did during the last major thread on this topic, nVidia
hardware showed much better performance without display lists whereas ATi
hardware gave very good performance with display lists. So just be aware
that this choice is optimising for a certain type of hardware.
(Jon, 6/8/2005 1:47:05 PM)
Rolf Magnus wrote:
> I just wish there was some wiki or something where everyone can write his
> results of benchmarking different things related to OpenGL, so that one
> doesn't need to test everything that others already tested.
That's an excellent idea. :-)
(Jon, 6/8/2005 1:48:28 PM)
Jon Harrop wrote:
>
> From the tests we did during the last major thread on this topic, nVidia
> hardware showed much better performance without display lists whereas ATi
> hardware gave very good performance with display lists.
>
That's not what I get on my Radeon x800...
(fungus, 6/8/2005 2:09:52 PM)
> Vertex arrays - put them on the graphics
> card with the vertex_buffer_object extension.
>
> Arrays without indices are faster, indices
> are a bit slower but save memory.
What are 'Arrays without indices' in the context of vertex buffers?
(Makhno, 6/8/2005 10:51:40 PM)
Makhno wrote:
>>Vertex arrays - put them on the graphics
>>card with the vertex_buffer_object extension.
>>
>>Arrays without indices are faster, indices
>>are a bit slower but save memory.
>
>
> What are 'Arrays without indices' in the context of vertex buffers?
>
>
I mean glDrawArrays() vs. glDrawElements()
(fungus, 6/8/2005 11:32:55 PM)
fungus wrote:
> Makhno wrote:
>>>Vertex arrays - put them on the graphics
>>>card with the vertex_buffer_object extension.
>>>
>>>Arrays without indices are faster, indices
>>>are a bit slower but save memory.
>>
>> What are 'Arrays without indices' in the context of vertex buffers?
>
> I mean glDrawArrays() vs. glDrawElements()
Are you saying that glDrawArrays with individual triangles will be faster
than glDrawElements with triangle strips?
(Jon, 6/9/2005 3:03:40 AM)
Jon Harrop wrote:
>
> Are you saying that glDrawArrays with individual triangles will be faster
> than glDrawElements with triangle strips?
>
Using VBOs on modern cards? Yes.
Caveat: If you throw in a complex vertex
program you might swing the balance.
(fungus, 6/9/2005 3:42:37 AM)
Jon Harrop wrote:
> Rolf Magnus wrote:
>> I just wish there was some wiki or something where everyone can write his
>> results of benchmarking different things related to OpenGL, so that one
>> doesn't need to test everything that others already tested.
>
> That's an excellent idea. :-)
Does that mean you'd like to put one up? ;-)
(Rolf, 6/9/2005 10:27:05 AM)
fungus wrote:
> Jon Harrop wrote:
>>
>> Are you saying that glDrawArrays with individual triangles will be faster
>> than glDrawElements with triangle strips?
>
> Using VBOs on modern cards? Yes.
>
> Caveat: If you throw in a complex vertex
> program you might swing the balance.
Even without a complicated vertex program, I'd have thought that
glDrawArrays would require lots of duplication and, therefore, more data
transfer to the graphics card.
(Jon, 6/9/2005 10:58:12 AM)
Rolf Magnus wrote:
> Jon Harrop wrote:
>> Rolf Magnus wrote:
>>> I just wish there was some wiki or something where everyone can write
>>> his results of benchmarking different things related to OpenGL, so that
>>> one doesn't need to test everything that others already tested.
>>
>> That's an excellent idea. :-)
>
> Does that mean you'd like to put one up? ;-)
In theory. ;-)
Perhaps it would be better to have a repository of open source benchmarking
code and results.
(Jon, 6/9/2005 10:59:14 AM)
Jon Harrop wrote:
>>>Are you saying that glDrawArrays with individual triangles will be faster
>>>than glDrawElements with triangle strips?
>>
>>Using VBOs on modern cards? Yes.
>>
> Even without a complicated vertex program, I'd have thought that
> glDrawArrays would require lots of duplication and, therefore, more data
> transfer to the graphics card.
>
With a VBO there's no data transfer to the graphics card.
(fungus, 6/9/2005 11:32:51 AM)
fungus wrote:
> Jon Harrop wrote:
>>>>Are you saying that glDrawArrays with individual triangles will be
>>>>faster than glDrawElements with triangle strips?
>>>
>>>Using VBOs on modern cards? Yes.
>>
>> Even without a complicated vertex program, I'd have thought that
>> glDrawArrays would require lots of duplication and, therefore, more data
>> transfer to the graphics card.
>
> With a VBO there's no data transfer to the graphics card.
You mean beyond the initial transfer? Do we know that the data is static?
I'm quite surprised by this though. I'd have assumed that minimising memory
usage would be good, even if it is just minimising the loading of data from
the graphics card's memory into the GPU.
(Jon, 6/9/2005 2:08:28 PM)
Jon Harrop wrote:
> fungus wrote:
>>>>>Are you saying that glDrawArrays with individual triangles will be
>>>>>faster than glDrawElements with triangle strips?
>>>>
>>>>Using VBOs on modern cards? Yes.
>>>
>
> I'm quite surprised by this though.
>
Why? My Radeon x800 has three vertex transform
units, it can transform a triangle in a single
clock cycle. The latest NVIDIA cards also have
three vertex units.
On a previous generation card (eg. 9700) it only
takes three cycles to transform a triangle so that's
100 million triangles per second - enough to keep
up with the rasterization process.
I'm not sure why indexed arrays are slower but
I measured them and the difference is clear.
(fungus, 6/9/2005 7:36:21 PM)
> I mean glDrawArrays() vs. glDrawElements()
So you're saying, if I have a triangle mesh to draw, then instead of
re-indexing all the vertices in some complex fashion to try and draw it in
strips, I should just put all the vertices, repeating them where necessary,
into a vertex buffer array and call glDrawArrays() on it to get the best
performance?
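For readers unsure what "repeating them where necessary" means in data terms, here is a small Python sketch. It expands an indexed triangle list (the layout glDrawElements() consumes) into a flat list with duplicated vertices (the layout glDrawArrays() with GL_TRIANGLES consumes). The deindex helper and the quad data are made up for illustration, and no OpenGL call is made here.

```python
def deindex(vertices, indices):
    """Expand an indexed triangle list into a flat, duplicated vertex list."""
    if len(indices) % 3 != 0:
        raise ValueError("index count must be a multiple of 3")
    # Shared vertices are simply copied once per triangle that uses them.
    return [vertices[i] for i in indices]


# A unit quad made of two triangles that share the edge 0-2.
quad_vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]
quad_indices = [0, 1, 2, 0, 2, 3]

flat = deindex(quad_vertices, quad_indices)
print(len(quad_vertices), "indexed vertices ->", len(flat), "unindexed vertices")
```

The same expansion applies to normals and any other per-vertex attribute, so the memory cost grows with the size of the whole vertex record, not just the position.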
(Makhno, 6/9/2005 8:32:56 PM)
Makhno wrote:
>>I mean glDrawArrays() vs. glDrawElements()
>
>
> So you're saying, if I have a triangle mesh to draw, then instead of
> re-indexing all the verticies in some complex fashion to try and draw it in
> strips, I should just put all the verticies, repeating them where necessary,
> into a vertex buffer array and call glDrawArrays() on it to get the best
> performance?
>
On a modern card? Yes.
(fungus, 6/9/2005 9:03:12 PM)
> > So you're saying, if I have a triangle mesh to draw, then instead of
> > re-indexing all the verticies in some complex fashion to try and draw it
> > in strips, I should just put all the verticies, repeating them where
> > necessary, into a vertex buffer array and call glDrawArrays() on it to
> > get the best performance?
>
> On a modern card? Yes.
Even if glDrawArrays() uses a mode of GL_TRIANGLES, rather than
GL_TRIANGLE_STRIP? (which is what I meant to add to the paragraph above).
It strikes me as odd that specifying each triangle individually can be
faster than specifying them as a strip, but then I suppose graphics cards
work in mysterious ways.
(Makhno, 6/9/2005 9:12:25 PM)
So where do these graphics cards put all these vertices? Do they go into
their texture memory, or do they have special memory allocated for them?
(Makhno, 6/9/2005 9:27:08 PM)
Makhno wrote:
>>>So you're saying, if I have a triangle mesh to draw, then instead of
>>>re-indexing all the verticies in some complex fashion to try and draw it
>>>in strips, I should just put all the verticies, repeating them where
>>>necessary, into a vertex buffer array and call glDrawArrays() on it to
>>>get the best performance?
>>
>>On a modern card? Yes.
>
>
> Even if glDrawArrays() uses a mode of GL_TRIANGLES, rather than
> GL_TRIANGLE_STRIP? (which is what I meant to add to the paragraph above).
>
Yes.
> It strikes me as odd that specifying each triangle individually can be
> faster than specifying them as a strip, but then I suppose graphics cards
> work in myterious ways.
>
That's because you're thinking of it in terms
of software, which it isn't, it's a hardware
pipeline.
(fungus, 6/9/2005 9:27:39 PM)
"Makhno" <root@127.0.0.1> wrote in message
news:d8acgv$fhf$1@news6.svr.pol.co.uk...
> So where do these graphics cards put all these verticies? Does it go into
> their texture memory, or do they have special memory allocated for them?
>
>
That's what's been overlooked so far in this discussion, and one of the great
perils of single benchmarks. The assumption is that it *all* fits. Most cards
have just one unified memory: frame buffer, z-buffer, pbuffers, display lists,
VBOs and texture objects are all jammed into [128..256...etc] megs. There's not
really much left over after frame buffers etc.
If you are managing huge data (tens/hundreds of millions of triangles,
hundreds of large texture maps, etc.), then the tradeoff to be more economical
with memory may well be worth it, and unwinding a triangle strip may be a
really bad idea.
-jbw
(JB, 6/10/2005 2:17:12 AM)
Makhno wrote:
> So where do these graphics cards put all these verticies? Does it go into
> their texture memory, or do they have special memory allocated for them?
>
PC Graphics cards only have one type of
memory - no separate memory for textures,
frame buffer, etc. Everything goes in the
same place - frame buffer, textures, and
yes, vertices as well.
(fungus, 6/10/2005 9:18:54 AM)
fungus wrote:
> Jon Harrop wrote:
>> I'm quite surprised by this though.
>
> Why? My Radeon x800 has three vertex transform
> units, it can transform a triangle in a single
> clock cycle. The latest NVIDIA cards also have
> three vertex units.
>
> On a previous generation card (eg. 9700) it only
> takes three cycles to transform a triangle so that's
> 100 million triangles per second - enough to keep
> up with the rasterization process.
I'd have thought that memory access would be the limiting factor, and that
glDrawArrays would require more data and, therefore, more memory access.
> I'm not sure why indexed arrays are slower but
> I measured them and the difference is clear.
How exactly did you measure this?
(Jon, 6/10/2005 11:11:59 PM)
Jon Harrop wrote:
>
> I'd have thought that memory access would be the limiting factor, and that
> glDrawArrays would require more data and, therefore, more memory access.
>
But indexed arrays are much worse for a cache than
the neat, sequential access of glDrawArrays().
>
>>I'm not sure why indexed arrays are slower but
>>I measured them and the difference is clear.
>
> How exactly did you measure this?
>
Guess!
(fungus, 6/11/2005 12:27:07 AM)
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:Ibqqe.44289$dr.39091@news.ono.com...
> Jon Harrop wrote:
>>
>> I'd have thought that memory access would be the limiting factor, and that
>> glDrawArrays would require more data and, therefore, more memory access.
>>
>
> But indexed arrays are much worse for a cache than
> the neat, sequential access of glDrawArrays().
Statistically, each vertex of a triangle mesh has on average
6 triangles sharing it. Whether or not a six-fold duplication
of vertex data makes sense for an application really does
depend on the application. It certainly does not for things
like medical 3d image display/analysis, where triangle meshes
representing isosurfaces have millions of triangles. You still
have to get these into system memory before you can transfer
them to VRAM. Such applications even require having a
copy in system memory and in VRAM. And these meshes
might be dynamic, so you have the AGP bottleneck to include
in your measurements.
As always, there are trade offs to be considered and measured.
>>>I'm not sure why indexed arrays are slower but
>>>I measured them and the difference is clear.
>>
>> How exactly did you measure this?
>>
>
> Guess!
If you want to be a credible source of information, answers
like these are not helping your cause. Why not post a small
OpenGL program that does the comparison? This will allow
others to verify your results, show they are dependent on
graphics card and/or application, or debunk your claim.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/11/2005 2:29:16 AM
|
|
In article <Ibqqe.44289$dr.39091@news.ono.com>,
fungus <umailMY@SOCKSartlum.com> wrote:
>But indexed arrays are much worse for a cache than
>the neat, sequential access of glDrawArrays().
I think they would both be equally bad for a cache. A cache depends on
locality of reference--that is, repeated references to some small subset
of the total data. Sequential access through the entire data violates
that just as badly as random access through the entire data.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/11/2005 6:50:51 AM
|
|
Lawrence D'Oliveiro wrote:
> In article <Ibqqe.44289$dr.39091@news.ono.com>,
> fungus <umailMY@SOCKSartlum.com> wrote:
>
>
>>But indexed arrays are much worse for a cache than
>>the neat, sequential access of glDrawArrays().
>
>
> I think they would both be equally bad for a cache. A cache depends on
> locality of reference--that is, repeated references to some small subset
> of the total data. Sequential access through the entire data violates
> that just as badly as random access through the entire data.
A cache loads data in chunks. If you don't use
the entire chunk of data then you wasted some
bandwidth. Sequential access at least guarantees
that you're not doing this.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/11/2005 7:52:15 AM
|
|
fungus wrote:
>>>But indexed arrays are much worse for a cache than
>>>the neat, sequential access of glDrawArrays().
>>
>>
>> I think they would both be equally bad for a cache. A cache depends on
>> locality of reference--that is, repeated references to some small subset
>> of the total data. Sequential access through the entire data violates
>> that just as badly as random access through the entire data.
>
> A cache loads data in chunks. If you don't use
> the entire chunk of data then you wasted some
> bandwidth. Sequential access at least guarantees
> that you're not doing this.
According to NVidia documents, modern graphics cards have two caches. One
that is operating on raw memory chunks, and one that stores vertices (after
going through the vertex shader). The former will be most effective with
sequential access, the latter will only have any effect at all if a vertex
is accessed more than once, which doesn't ever happen with glDrawArrays().
And nobody forces you to have an indexed array in random order. You can
still sort the vertices so that the first cache is effective too. It might
even be more effective, because fewer vertices need to go through it.
|
|
0
|
|
|
|
Reply
|
Rolf
|
6/11/2005 10:10:54 AM
|
|
Rolf Magnus wrote:
>
> According to NVidia documents, modern graphics cards have two caches. One
> that is operating on raw memory chunks, and one that stores vertices (after
> going through the vertex shader). The former will be most effective with
> sequential access, the latter will only have any effect at all if a vertex
> is accessed more than once, which doesn't ever happen with glDrawArrays().
> And nobody forces you to have an indexed array in random order. You can
> still sort the vertices so that the first cache is effective too. It might
> even be more effective, because fewer vertices need to go through it.
>
Yep.
Remember I only said it was fastest "on modern cards".
A Radeon x800/GeForce 6800 has three vertex units so
it can transform all three vertices of a triangle in
a single cycle.
....
A GeForce2 needs indexed arrays and good use of the
vertex cache to get best results.
So basically we're back to "it depends"... :-)
The only way to get "best" performance is to benchmark
the graphics card on program startup and adapt your
program to what you find.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/11/2005 10:25:54 AM
|
|
Ok, here's a test program I just knocked together
to test this thing once and for all. The results
seem to disagree with what I got last time I tested
but maybe the drivers have been tweaked or something.
[ Then again, this is a "benchmark" and the other ]
[ was a "real dataset", so who knows... :-) ]
Here's the exe file:
http://www.artlum.com/arraytest.exe
Run it for a minute or so to let it stabilize.
On my Radeon x800 Pro I get:
Unindexed display list: 50 ms
Unindexed VBO: 45 ms
Indexed display list: 50 ms
Indexed VBO: 23 ms
(Smaller numbers are better).
So on my card:
a) VBO is faster than display list
b) Indexed VBOs are the fastest (twice as fast!)
Let's see what everybody else gets...
FWIW the source code is here:
http://www.artlum.com/arraytest.cpp
You won't be able to compile it but you might
find mistakes...
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/11/2005 5:18:27 PM
|
|
fungus wrote:
> Here's the exe file:
>
> http://www.artlum.com/arraytest.exe
If that's for Windows then I can't run it on any of the machines here. Any
chance of a Linux version?
> So on my card:
>
> a) VBO is faster than display list
> b) Indexed VBOs are the fastest (twice as fast!)
Ok, I'd have expected indexed to be faster but I'm surprised anything is
twice as fast as a display list. That implies your drivers are doing
something retarded with the contents of the display list.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
|
|
0
|
|
|
|
Reply
|
Jon
|
6/11/2005 5:41:53 PM
|
|
Jon Harrop wrote:
>
> If that's for Windows then I can't run it on any of the machines here. Any
> chance of a Linux version?
>
Not much...
> Ok, I'd have expected indexed to be faster but I'm surprised anything is
> twice as fast as a display list.
It's consistent no matter what the size of the array is.
E.g., with a smaller array I get:
Unindexed display list: 7.0 ms
Unindexed VBO: 4.5 ms
Indexed display list: 7.1 ms
Indexed VBO: 2.9 ms
> That implies your drivers are doing
> something retarded with the contents of the display list.
>
Latest ATI drivers (downloaded about a week ago).
The code is:
indexedDisplayList = glGenLists(1);
glNewList(indexedDisplayList, GL_COMPILE);
glDrawElements(GL_TRIANGLES,indices.size(),GL_UNSIGNED_SHORT,&indices[0]);
glEndList();
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/11/2005 9:08:37 PM
|
|
fungus wrote:
> Ok, here's a test program I just knocked together
> to test this thing once and for all. The results
> seem to disagree with what I got last time I tested
> but maybe the drivers have been tweaked or something.
>
> [ Then again, this is a "benchmark" and the other ]
> [ was a "real dataset", so who knows... :-) ]
>
> Here's the exe file:
>
> http://www.artlum.com/arraytest.exe
>
> Run it for a minute or so to let it stabilize.
>
>
> On my Radeon x800 Pro I get:
>
> Unindexed display list: 50 ms
> Unindexed VBO: 45 ms
> Indexed display list: 50 ms
> Indexed VBO: 23 ms
>
> (Smaller numbers are better).
>
>
> So on my card:
>
> a) VBO is faster than display list
> b) Indexed VBOs are the fastest (twice as fast!)
>
>
> Let's see what everybody else gets...
>
>
>
> FWIW the source code is here:
>
> http://www.artlum.com/arraytest.cpp
>
> You won't be able to compile it but you might
> find mistakes...
>
>
There must be something wrong with this application. It gives me
these numbers on a WildCat Realizm 100:
Unindexed display list: 0 ms
Unindexed VBO: 818 ms
Indexed display list: 0 ms
Indexed VBO: 764 ms
BTW, is it normal to see only a white shape?
|
|
0
|
|
|
|
Reply
|
Jacobo
|
6/11/2005 9:52:38 PM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:D1Fqe.50478$US.2981@news.ono.com...
> Ok, here's a test program I just knocked together
> to test this thing once and for all. The results
> seem to disagree with what I got last time I tested
> but maybe the drivers have been tweaked or something.
<snip>
Finally, we are getting somewhere with the ability to use
scientific method.
The numbers are in the order unindexed DL, unindexed VBO,
indexed DL, indexed VBO. Units in milliseconds.
-----
Windows Laptop, NVidia GeForce2 GO:
251, 241, 244, 187
Windows PC, NVidia GeForce4 TI 4200:
167, 100, 167, 83
Linux PC, NVidia GeForce4 TI 4200:
70 to 120 (varied), 90, 135 to 190 (varied), 120
Windows PC, NVidia FX 5600 Ultra:
69, 120, 69, 67
Windows PC, ATI Radeon 9800 Pro:
62, 58, 62, 27
Macintosh G4, ATI Radeon 8500:
software?, 240, software?, 93 (very very slow for display lists)
-----
My conclusion is that I will keep my current renderer
design (using indexed VBOs).
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/11/2005 10:18:40 PM
|
|
"Jacobo Rodriguez" <sdf@skdjf.com> wrote in message
news:D2Jqe.50520$US.44682@news.ono.com...
> There should be something wrong with this application. It throws to me
> these numbers with a WildCat Realizm 100:
>
> Unindexed display list: 0 ms
> Unindexed VBO: 818 ms
> Indexed display list: 0 ms
> Indexed VBO: 764 ms
>
> BTW, it is normal to see only a white shape?
White shapes showed up on the Macintosh with an ATI Radeon 8500,
but in color on a PC with an ATI Radeon 9800 Pro.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/11/2005 10:20:20 PM
|
|
In article <qpIqe.44421$dr.39790@news.ono.com>,
fungus <umailMY@SOCKSartlum.com> wrote:
>Jon Harrop wrote:
>>
>> If that's for Windows then I can't run it on any of the machines here. Any
>> chance of a Linux version?
>
>Not much...
If you make the source code available, others could port it.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/11/2005 10:26:15 PM
|
|
"Lawrence D'Oliveiro" <ldo@geek-central.gen.new_zealand> wrote in message
news:ldo-1C15F2.10261512062005@lust.ihug.co.nz...
> In article <qpIqe.44421$dr.39790@news.ono.com>,
> fungus <umailMY@SOCKSartlum.com> wrote:
>
>>Jon Harrop wrote:
>>>
>>> If that's for Windows then I can't run it on any of the machines here.
>>> Any chance of a Linux version?
>>
>>Not much...
>
> If you make the source code available, others could port it.
'fungus' posted his source code. I ported it to Linux and Macintosh.
Easy enough, given the very good separation of the specific feature
code from his application layer.
So what problem are you having with the porting?
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/12/2005 3:53:21 AM
|
|
Lawrence D'Oliveiro wrote:
> In article <qpIqe.44421$dr.39790@news.ono.com>,
> fungus <umailMY@SOCKSartlum.com> wrote:
>
>
>>Jon Harrop wrote:
>>
>>>If that's for Windows then I can't run it on any of the machines here. Any
>>>chance of a Linux version?
>>
>>Not much...
>
>
> If you make the source code available, others could port it.
I think a GLUT version would be quite easy...
the hard part is reading a decent timer in
an OS independent way.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 4:24:37 AM
|
|
Dave Eberly wrote:
> Windows Laptop, NVidia GeForce2 GO:
> 251, 241, 244, 187
>
Indexed VBO wins.
> Windows PC, NVidia GeForce4 TI 4200:
> 167, 100, 167, 83
>
Ditto.
> Linux PC, NVidia GeForce4 TI 4200:
> 70 to 120 (varied), 90, 135 to 190 (varied), 120
>
Not sure.
> Windows PC, NVidia FX 5600 Ultra:
> 69, 120, 69, 67
>
Indexed VBO by a whisker.
> Windows PC, ATI Radeon 9800 Pro:
> 62, 58, 62, 27
>
Indexed VBO is easily the fastest.
> Macintosh G4, ATI Radeon 8500
> software?, 240, software?, 93 (very very slow for display lists)
You ran it on a Mac???
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 4:27:03 AM
|
|
Lawrence D'Oliveiro wrote:
> In article <qpIqe.44421$dr.39790@news.ono.com>,
> fungus <umailMY@SOCKSartlum.com> wrote:
>
>
>>Jon Harrop wrote:
>>
>>>If that's for Windows then I can't run it on any of the machines here. Any
>>>chance of a Linux version?
>>
>>Not much...
>
>
> If you make the source code available, others could port it.
http://www.artlum.com/arraytest.cpp
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 4:28:03 AM
|
|
Jacobo Rodriguez wrote:
>
> There should be something wrong with this application. It throws to me
> these numbers with a WildCat Realizm 100:
>
> Unindexed display list: 0 ms
> Unindexed VBO: 818 ms
> Indexed display list: 0 ms
> Indexed VBO: 764 ms
>
Weird. If anything I'd say it should be the other
way around with display lists working (simplest code)
and the VBOs failing (more complex, more driver dependent).
> BTW, it is normal to see only a white shape?
No, it should be colored.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 4:32:51 AM
|
|
I just updated the benchmark program to include
triangle strips.
The results are weird. The fastest way to send
the geometry turned out to be triangle strips
inside a display list....
BUT... the speed is *very* dependent on strip
length. If your strips are less than 250 triangles
long then it's not worth it. If your strips are
only ten or twelve triangles then you can just
forget it - it's ten times slower than indexed
arrays.
Here's the results for my Radeon x800 Pro:
Unindexed display list: 50.13 ms
Unindexed VBO: 44.72 ms
Indexed display list: 50.07 ms
Indexed VBO: 22.98 ms
LongStrip display list: 18.72 ms
LongStrip VBO: 22.35 ms
ShortStrip display list: 114.13 ms
ShortStrip VBO: 190.49 ms
Based on this and Dave E's results from yesterday
I'd say indexed VBOs are the way to go at the
moment. By a strange coincidence this is also
what the manufacturers are telling us to use... :-)
PS: I made it write out a log file so you can
cut/paste the results - no need to write them
down.
PPS: I'd be interested to see the results for a
high end NVIDIA card....anybody?
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 11:52:27 AM
|
|
Dave Eberly wrote:
>>>> If that's for Windows then I can't run it on any of the machines here.
>>>> Any chance of a Linux version?
>>>
>>>Not much...
>>
>> If you make the source code available, others could port it.
>
> 'fungus' posted his source code. I ported it to Linux and Macintosh.
> Easy enough, given the very good separation of the specific feature
> code from his application layer.
Are you willing to post that code or does everyone need to duplicate your
work?
|
|
0
|
|
|
|
Reply
|
Rolf
|
6/12/2005 12:26:04 PM
|
|
Rolf Magnus wrote:
>>'fungus' posted his source code. I ported it to Linux and Macintosh.
>>Easy enough, given the very good separation of the specific feature
>>code from his application layer.
>
> Are you willing to post that code or does everyone need to duplicate your
> work?
>
I don't think the Linux/Macintosh drivers are going
to give wildly different results from Windows drivers.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 12:33:18 PM
|
|
fungus wrote:
> I just updated the benchmark program to include
> triangle strips.
>
> The results are weird. The fastest way to send
> the geometry turned out to be triangle strips
> inside a display list....
>
> BUT... the speed is *very* dependent on strip
> length. If your strips are less than 250 triangles
> long then it's not worth it. If your strips are
> only ten or twelve triangles then you can just
> forget it - it's ten times slower than indexed
> arrays.
So that means It Depends(tm) on how well the mesh can be converted to strips
(i.e. how long the average strip length is). I guess that means that strips
are of limited use.
> Here's the results for my Radeon x800 Pro:
> Unindexed display list: 50.13 ms
> Unindexed VBO: 44.72 ms
> Indexed display list: 50.07 ms
> Indexed VBO: 22.98 ms
> LongStrip display list: 18.72 ms
> LongStrip VBO: 22.35 ms
> ShortStrip display list: 114.13 ms
> ShortStrip VBO: 190.49 ms
>
> Based on this and Dave E's results from yesterday
> I'd say indexed VBOs are the way to go at the
> moment. By a strange coincidence this is also
> what the manufacturers are telling us to use... :-)
Heh.
> PS: I made it write out a log file so you can
> cut/paste the results - no need to write them
> down.
>
> PPS: I'd be interested to see the results for a
> high end NVIDIA card....anybody?
I guess my FX-5200 doesn't really count as "high end", does it? ;-)
|
|
0
|
|
|
|
Reply
|
Rolf
|
6/12/2005 12:37:20 PM
|
|
Rolf Magnus wrote:
> fungus wrote:
>>If your strips are less than 250 triangles
>>long then it's not worth it. If your strips are
>>only ten or twelve triangles then you can just
>>forget it - it's ten times slower than indexed
>>arrays.
>
>
> So that means It Depends(tm) on how well the mesh can be converted to strips
> (i.e. how long the average strip length is). I guess that means that strips
> are of limited use.
>
I'd say that trying to stripify general objects
is a waste of time. There's always triangles and
short strips left over after you've found the
long ones and this will make them useless.
If you know in advance that there's not going to
be any little bits left over then it might be
worth it.
(Caveat: I've only got results for my graphics
card and I can't be bothered to start swapping
cards around in the other machine).
>>PPS: I'd be interested to see the results for a
>>high end NVIDIA card....anybody?
>
> I guess my FX-5200 doesn't really count as "high end", does it? ;-)
>
No, but post 'em anyway. Let's see what you've got...
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 12:48:49 PM
|
|
fungus wrote:
>>>PPS: I'd be interested to see the results for a
>>>high end NVIDIA card....anybody?
>>
>> I guess my FX-5200 doesn't really count as "high end", does it? ;-)
>>
>
> No, but post 'em anyway. Let's see what you've got...
For that, I'd first need to port it to Linux ;-)
Now you know why I asked Dave for the ported sources.
|
|
0
|
|
|
|
Reply
|
Rolf
|
6/12/2005 12:57:27 PM
|
|
Dave Eberly wrote:
> 'fungus' posted his source code. I ported it to Linux and Macintosh.
> Easy enough, given the very good separation of the specific feature
> code from his application layer.
I have a GeForce 3 and FX5700Go that I'll run a Linux port on if you post
it.
> So what problem are you having with the porting?
Lack of time.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
|
|
0
|
|
|
|
Reply
|
Jon
|
6/12/2005 1:16:46 PM
|
|
fungus wrote:
> The results are weird. The fastest way to send
> the geometry turned out to be triangle strips
> inside a display list....
I really don't understand why there is any significant difference between
VBOs and display lists for static geometry, apart from DL compile time.
> BUT... the speed is *very* dependent on strip
> length. If your strips are less than 250 triangles
> long then it's not worth it.
On a GeForce 3 it used to be about 20 triangles per strip, to optimise cache
coherency for transformed data:
http://www.chem.pwf.cam.ac.uk/~jdh30/programming/opengl/performance/
The bit on "Reordering vertex data for AGP memory" is interesting - Kevin
Moule from nVidia had written an article saying that unindexed was faster
than indexed which I tested and found it to be wrong (then as well as now).
> If your strips are
> only ten or twelve triangles then you can just
> forget it - it's ten times slower than indexed arrays.
Again, I can't think why this should be ten times slower. What on earth is
going on that could cause such a slowdown?!
> Here's the results for my Radeon x800 Pro:
> Unindexed display list: 50.13 ms
> Unindexed VBO: 44.72 ms
> Indexed display list: 50.07 ms
> Indexed VBO: 22.98 ms
> LongStrip display list: 18.72 ms
> LongStrip VBO: 22.35 ms
> ShortStrip display list: 114.13 ms
> ShortStrip VBO: 190.49 ms
Those are pretty wildly varying results!
How much RAM is required in each case? Maybe for non-DL the speed can be
associated with the memory requirements. Also, how much RAM does your
graphics card have and do the relative performances change if you only use
half as much geometry (i.e. half the RAM)?
> Based on this and Dave E's results from yesterday
> I'd say indexed VBOs are the way to go at the
> moment. By a strange coincidence this is also
> what the manufacturers are telling us to use... :-)
Or display lists with long strips (which is the last thing I was using).
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
|
|
0
|
|
|
|
Reply
|
Jon
|
6/12/2005 1:28:37 PM
|
|
Jon Harrop wrote:
>
> I really don't understand why there is any significant difference between
> VBOs and display lists for static geometry, apart from DL compile time.
>
I guess it depends on the driver.
> The bit on "Reordering vertex data for AGP memory" is interesting - Kevin
> Moule from nVidia had written an article saying that unindexed was faster
> than indexed which I tested and found it to be wrong (then as well as now).
>
Two days ago I would have sworn that unindexed was
faster - based on previous testing with "real world"
data.
My benchmark is quite cache friendly. Maybe I should
mix the indices up and see what happens.
> Again, I can't think why this should be ten times slower. What on earth is
> going on that could cause such a slowdown?!
>
Maybe the strip sizes are being sent from the host
CPU and setup time is slow. Still... ten times
slower is pretty bad. I double checked the code
when I saw it but I can't see anything wrong.
> How much RAM is required in each case?
Not much. Certainly less than a megabyte.
> do the relative performances change if you only use
> half as much geometry (i.e. half the RAM)?
>
No.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/12/2005 1:40:14 PM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:sQOqe.44440$dr.12963@news.ono.com...
> You ran it on a Mac???
I ported your code to run within my engine/application layer
so that I could run it on a Linux box and a Macintosh.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/12/2005 3:23:08 PM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:jYVqe.50574$US.36085@news.ono.com...
> I don't think the Linux/Macintosh drivers are going
> to give wildly different results from Windows drivers.
I disagree with this assessment. As my Macintosh statistics
showed, the card appeared to switch to software rendering
for the display list cases.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/12/2005 3:27:10 PM
|
|
Jon Harrop <usenet@jdh30.plus.com> writes:
> On a GeForce 3 it used to be about 20 triangles per strip, to optimise cache
> coherency for transformed data:
>
> http://www.chem.pwf.cam.ac.uk/~jdh30/programming/opengl/performance/
>
> The bit on "Reordering vertex data for AGP memory" is interesting - Kevin
> Moule from nVidia had written an article saying that unindexed was faster
> than indexed which I tested and found it to be wrong (then as well
> as now).
Hah. I should tell Kevin that he got a new job :). Here's the actual
PDF from the talk Kevin gave:
http://www.cgl.uwaterloo.ca/~krmoule/talks/performance/performance.pdf
No, Kevin does not work for NVIDIA (nor has he, to my knowledge, ever
done so). He's a researcher at the University of Waterloo. Also,
unlike what your page states, that document is *not* an official
NVIDIA document. You may want to clear that up.
--
Stefanus Du Toit <sjdutoit@cgl.uwaterloo.ca>
Graduate Student
Computer Graphics Lab, University of Waterloo
|
|
0
|
|
|
|
Reply
|
Stefanus
|
6/12/2005 4:11:03 PM
|
|
Stefanus Du Toit wrote:
> No, Kevin does not work for NVIDIA (nor has he in my knowledge ever
> done so). He's a researcher at the University of Waterloo. Also,
> unlike what your page states, that document is *not* an official
> NVIDIA document.
Ah, ok. I certainly downloaded the document from www.nvidia.com but I may
have browsed past a disclaimer without noticing.
> You may want to clear that up.
Alas, I can't - I lost write access to my site when I left uni.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
|
|
0
|
|
|
|
Reply
|
Jon
|
6/12/2005 5:51:03 PM
|
|
> I'd say that trying to stripify general objects
> is a waste of time. There's always triangles and
> short strips left over after you've found the
> long ones and this will make them useless.
I found this webpage
http://www.codercorner.com/Strips.htm
It suggests any mesh can be turned into one long strip by the addition of
null vertices, which apparently don't cost very much. No idea how good this
is in practice.
|
|
0
|
|
|
|
Reply
|
Makhno
|
6/12/2005 6:48:24 PM
|
|
"Makhno" <root@127.0.0.1> wrote in message
news:d8i0df$7r7$1@news6.svr.pol.co.uk...
> > I'd say that trying to stripify general objects
> > is a waste of time. There's always triangles and
> > short strips left over after you've found the
> > long ones and this will make them useless.
>
> I found this webpage
> http://www.codercorner.com/Strips.htm
> It suggests any mesh can be turned into one long strip by the addition of
> null vertices, which apparently don't cost very much. No idea how good this
> is in practice.
>
>
If you are memory-bound (that is, you have too much stuff to fit in VRAM),
it's a *real* good idea.
-jbw
|
|
0
|
|
|
|
Reply
|
JB
|
6/13/2005 2:03:38 AM
|
|
In article <QKwqe.44307$dr.14870@news.ono.com>,
fungus <umailMY@SOCKSartlum.com> wrote:
>Lawrence D'Oliveiro wrote:
>> In article <Ibqqe.44289$dr.39091@news.ono.com>,
>> fungus <umailMY@SOCKSartlum.com> wrote:
>>
>>>But indexed arrays are much worse for a cache than
>>>the neat, sequential access of glDrawArrays().
>>
>> I think they would both be equally bad for a cache. A cache depends on
>> locality of reference--that is, repeated references to some small subset
>> of the total data. Sequential access through the entire data violates
>> that just as badly as random access through the entire data.
>
>A cache loads data in chunks. If you don't use
>the entire chunk of data then you wasted some
>bandwidth. Sequential access at least guarantees
>that you're not doing this.
Don't confuse caching with buffering. Cache blocks tend to be small, 1-2
kiB--much smaller than typical buffer sizes.
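To put rough numbers on the "chunk" argument, here is a toy model. It is an illustration only and does not describe any real GPU or CPU: the line size, the vertex size and the number of lines the cache holds are all assumptions, and `countLineFetches` is a made-up name. It counts how many line-sized blocks have to be fetched when vertices are read in a given order through a small FIFO cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Count how many cache-line fetches are needed to read the vertices in the
// given order, using a FIFO cache that holds `cachedLines` lines.
std::size_t countLineFetches(const std::vector<unsigned>& order,
                             std::size_t vertexBytes,
                             std::size_t lineBytes,
                             std::size_t cachedLines)
{
    assert(vertexBytes > 0 && lineBytes > 0);
    std::deque<std::size_t> cache;               // line numbers currently cached
    std::size_t fetches = 0;
    for (unsigned v : order) {
        // A vertex may straddle more than one line.
        std::size_t first = (v * vertexBytes) / lineBytes;
        std::size_t last  = (v * vertexBytes + vertexBytes - 1) / lineBytes;
        for (std::size_t line = first; line <= last; ++line) {
            if (std::find(cache.begin(), cache.end(), line) != cache.end())
                continue;                        // hit: line already cached
            ++fetches;                           // miss: fetch the whole line
            cache.push_back(line);
            if (cache.size() > cachedLines) cache.pop_front();
        }
    }
    return fetches;
}
```

With 16-byte vertices and 64-byte lines, reading vertices 0..7 in order costs 2 fetches. Reading them as 0,4,1,5,2,6,3,7 costs 8 fetches if only one line is cached, but 2 again if two lines are cached. So in this model the penalty depends on how far apart the indices jump relative to what the cache can hold, not on indexing as such.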
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/13/2005 3:01:28 AM
|
|
"Rolf Magnus" <ramagnus@t-online.de> wrote in message
news:d8h9f5$u15$03$1@news.t-online.com...
> Are you willing to post that code or does everyone need to duplicate your
> work?
I cut and pasted 'fungus's code into my own Wild Magic
development environment. I could post the test application and
require folks to download Wild Magic, but instead I wrote a
GLUT-based application with only the essence of what the test
program needs.
http://www.geometrictools.com/Temp/TestSpeed.zip
On a Windows PC, there is TestSpeed.vcproj for MS Visual
Studio .NET 2003 (VC++ 7.1). For folks who have VC++ 6
or VC++ 7.0, you can create your own project file and insert
the files TestSpeed.{h,cpp}, arraytest.cpp,
Wm3OpenGLExtensions.{h,c}, and Wm3WglExtensions.{h,c}.
On Linux, there is compile.sh to compile the program. When
you unzip the file, you need to make compile.sh executable via
"chmod 755 compile.sh".
On the Macintosh, there is an Xcode project. I got this to compile
and run, but the timings are wrong (0ms or 1ms) and only the
text displays. The debugger indicates I am actually getting to the
various OpenGL calls. The statistics I posted earlier of the Mac
were based on putting arraytest.cpp code into a Wild Magic
application, which does show everything. Not clear what the
problem is with the GLUT-based version.
The test code is not intended to be pretty or efficient. Just a hack
to get something running on all the platforms.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/13/2005 3:28:14 AM
|
|
Makhno wrote:
>>I'd say that trying to stripify general objects
>>is a waste of time. There's always triangles and
>>short strips left over after you've found the
>>long ones and this will make them useless.
>
>
> I found this webpage
> http://www.codercorner.com/Strips.htm
> It suggests any mesh can be turned into one long strip by the addition of
> null vertices, which apparently don't cost very much. No idea how good this
> is in practice.
>
I've never liked the sound of that method. Is there
a *guarantee* that triangles with repeated indices
will draw nothing? If you have links, post them!
Still, I put it in my program and it put VBOs back
at the top of the list again (by enough of a margin
to make it worth while implementing if we can
guarantee that it works correctly).
Display lists went slower. I don't know why....
Revised test:
Unindexed VBO: 44.80 ms
Indexed VBO: 23.31 ms
LongStrip VBO: 23.79 ms
JoinedStrip VBO: 17.78 ms ** Winner! **
Unindexed display list: 50.15 ms
Indexed display list: 50.24 ms
LongStrip display list: 19.23 ms
JoinedStrip display list: 21.98 ms
So...let's see some more results!
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
In science it often happens that scientists say, 'You know
that's a really good argument; my position is mistaken,'
and then they actually change their minds and you never
hear that old view from them again. They really do it.
It doesn't happen as often as it should, because scientists
are human and change is sometimes painful. But it happens
every day. I cannot recall the last time something like
that happened in politics or religion.
- Carl Sagan, 1987 CSICOP keynote address
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 3:34:47 AM
|
|
Dave Eberly wrote:
> I wrote a GLUT-based application with only the essence
> of what the test program needs.
> http://www.geometrictools.com/Temp/TestSpeed.zip
>
That's nice....but I just updated the program
again... :-)
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 3:41:23 AM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:Ef7re.44597$dr.28103@news.ono.com...
> That's nice....but I just updated the program
> again... :-)
Then folks can feel free to modify what I posted after
you post your updated program.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/13/2005 3:53:55 AM
|
|
Dave Eberly wrote:
> "fungus" <umailMY@SOCKSartlum.com> wrote in message
> news:Ef7re.44597$dr.28103@news.ono.com...
>
>
>>That's nice....but I just updated the program
>>again... :-)
>
>
> Then folks can feel free to modify what I posted after
> you post your updated program.
>
Same place as before:
http://www.artlum.com/arraytest.exe
http://www.artlum.com/arraytest.cpp
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 3:55:08 AM
|
|
U = unindexed
I = indexed
VBO = vertex buffer objects
DL = display lists
LS = long strips
JS = joined strips
order is: UVBO, IVBO, LSVBO, JSVBO, UDL, IDL, LSDL, JSDL
Macintosh ATI Radeon 8500:
242, 97, 100, 111, 1473, 1471, 113, 631
Linux NVidia GeForce4 TI4200:
90, 117, 68, 85, 241, 241, 213, 216
Windows (laptop) NVidia GeForce2 Go:
242, 200, 192, 237, 272, 265, 192, 236
Windows NVidia GeForce4 TI4200:
90, 67, 67, 84, 159, 159, 140, 142
Windows NVidia GeForceFX 5600 Ultra:
121, 72, 48, 57, 67, 67, 48, 59
Windows ATI Radeon 9800 Pro:
58, 27, 19, 19, 60, 62, 22, 26
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/13/2005 4:41:55 AM
|
|
On Wed, 08 Jun 2005 16:09:52 +0200, fungus <umailMY@SOCKSartlum.com>
wrote:
>Jon Harrop wrote:
>>
>> From the tests we did during the last major thread on this topic, nVidia
>> hardware showed much better performance without display lists whereas ATi
>> hardware gave very good performance with display lists.
>>
>
>That's not what I get on my Radeon x800...
NVIDIA Geforce GO5660
Unindexed VBO 164.18
Indexed VBO 266.65
LongStrip VBO 97.88
JoinedStrip VBO 116.89
Unindexed display list 252.01
Indexed display list 251.02
LongStrip display list 98.72
JoinedStrip display list 114.64
Howie
|
|
0
|
|
|
|
Reply
|
Howie
|
6/13/2005 5:40:03 AM
|
|
"Dave Eberly" <dNOSPAMeberly@usemydomain.com> wrote in message
news:i37re.3168$eM6.978@newsread3.news.atl.earthlink.net...
> On the Macintosh, there is an Xcode project. I got this to compile
> and run, but the timings are wrong (0ms or 1ms) and only the
> text displays. The debugger indicates I am actually getting to the
> various OpenGL calls. The statistics I posted earlier of the Mac
> were based on putting arraytest.cpp code into a Wild Magic
> application, which does show everything. Not clear what the
> problem is with the GLUT-based version.
I posted the source code and projects for the performance testing
that live on top of Wild Magic 3.3 (latest version). This also includes
'fungus's "latest" version.
http://www.geometrictools.com/Temp/WMTestSpeed.zip
The Mac version lives on top of AGL (not GLUT) and produces
meaningful statistics.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/13/2005 5:45:56 AM
|
|
Dave Eberly wrote:
> Macintosh ATI Radeon 8500:
> Linux NVidia GeForce4 TI4200:
> Windows (laptop) NVidia GeForce2 Go:
> Windows NVidia GeForce4 TI4200:
> Windows NVidia GeForceFX 5600 Ultra:
> Windows ATI Radeon 9800 Pro:
>
So....
Conclusions:
------------
VBO is the clear winner over display lists.
Not one card tested so far has display lists
faster than VBOs.
The decision seems to be between indexed arrays (IA)
and joined triangle strips (JTS). So far it's around
50:50 whether JTS is faster than IA. The difference
in speed between JTS and IA is usually significant,
suggesting that a program should optimize for this
at runtime.
I've found documents on both ATI and NVIDIA web sites
which suggest that joining strips with degenerate
triangles is an Ok thing to do. It seems the hardware
does indeed check for this.
The document below says:
"Use lists if you would have to add large
numbers of degenerate polys to stick with
strips (more than ~20% means use lists)"
http://www.ati.com/developer/gdc/D3DTutorial3_Pipeline_Performance.pdf&e=9717
This refers to Direct3D but there's no reason
to think OpenGL is any different.
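Optimizing for this at runtime could look roughly like the sketch below: time each candidate draw path for a few frames at startup and keep the fastest. The `Candidate` struct and `pickFastest` are names made up for illustration. Each draw callback is assumed to wrap your own glDrawArrays/glDrawElements code and end with glFinish(), so that the timing covers the actual rendering and not just command submission.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct Candidate {
    std::string name;             // e.g. "indexed VBO", "joined strip VBO"
    std::function<void()> draw;   // renders one frame and waits for completion
};

// Time every candidate over `frames` frames and return the index of the
// one with the lowest total time.
std::size_t pickFastest(const std::vector<Candidate>& candidates, int frames)
{
    assert(!candidates.empty() && frames > 0);
    using Clock = std::chrono::steady_clock;
    std::size_t best = 0;
    Clock::duration bestTime = Clock::duration::max();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        candidates[i].draw();                     // warm-up frame, not timed
        const Clock::time_point start = Clock::now();
        for (int f = 0; f < frames; ++f)
            candidates[i].draw();
        const Clock::duration elapsed = Clock::now() - start;
        if (elapsed < bestTime) {
            bestTime = elapsed;
            best = i;
        }
    }
    return best;
}
```

The obvious cost is that you have to build the data in both forms (indexed lists and joined strips) before you can measure them, so this only pays off if the mesh is drawn many times.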
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 7:26:37 AM
|
|
fungus wrote:
>
> http://www.ati.com/developer/gdc/D3DTutorial3_Pipeline_Performance.pdf&e=9717
>
Note that page 28 of that document clearly shows
that OpenGL is twice as fast as D3D.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 8:18:07 AM
|
|
Dave Eberly wrote:
> On Linux, there is compile.sh to compile the program. When
> you unzip the file, you need to make compile.sh executable via
> "chmod 755 compile.sh".
I just tried it on a 1.2GHz Athlon t-bird with a GeForce 3 running x86
Debian and a 1.8GHz Athlon64 with a GeForce FX5700Go running pure64 Debian
and both segfault.
Any ideas why?
PS: I had to tweak the AMD64 code to not conflict with the standard
library's definition of int64_t.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
|
|
0
|
|
|
|
Reply
|
Jon
|
6/13/2005 1:18:58 PM
|
|
"Jon Harrop" <usenet@jdh30.plus.com> wrote in message
news:42ad8800$0$2062$ed2e19e4@ptn-nntp-reader04.plus.net...
> I just tried it on a 1.2GHz Athlon t-bird with a GeForce 3 running x86
> Debian and a 1.8GHz Athlon64 with a GeForce FX5700Go running pure64 Debian
> and both segfault.
>
> Any ideas why?
I neither have these systems to test/debug nor do I have any knowledge
of what graphics hardware drivers are on these platforms. Moreover, I
do not have a 64-bit processor to see what joys arise when attempting
to use a 32-bit graphics system on them. All my tests were on 32-bit
machines. My only suggestion is step through the code with a debugger
or resort to the tried-and-true use of print statements to find out the
offending line of code. Welcome to the pain of portability.
> PS: I had to tweak the AMD64 code to not conflict with the standard
> library's definition of int64_t.
I converted glxext.h (version mentioned in Wm3GlxExtensions.h) and
preserved the ugly conditional compile block for int64_t that occurs
in GLX_OML_sync_control. You would think current day compiler
writers would finally agree on providing standardized definitions for
intN_t.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/13/2005 2:34:25 PM
|
|
Dave Eberly wrote:
> "Rolf Magnus" <ramagnus@t-online.de> wrote in message
> news:d8h9f5$u15$03$1@news.t-online.com...
>
>
>>Are you willing to post that code or does everyone need to duplicate your
>>work?
>
>
> I cut and paste 'fungus's code into my own Wild Magic
> development environment. I could post the test application and
> require folks to download Wild Magic, but instead I wrote a
> GLUT-based application with only the essence of what the test
> program needs.
> http://www.geometrictools.com/Temp/TestSpeed.zip
>
After testing it on my PC (WinXP, AMD 2.6+, WildCat Realizm 100), none of the
tests ran: I got 0 ms on all of them. Looking at the source code, I saw
something I don't like that could explain the errors: all the vertex arrays
are enabled with glEnable instead of glEnableClientState
(glEnable(GL_VERTEX_ARRAY); instead of glEnableClientState(GL_VERTEX_ARRAY);).
The same thing happens with glDisable. (I don't know how it can work for
other people with the arrays enabled this way...)
When I have more free time, I'll take a closer look at the source code.
Jacobo
|
|
0
|
|
|
|
Reply
|
Jacobo
|
6/13/2005 2:34:29 PM
|
|
Jacobo Rodriguez wrote:
> all vertex array were created using glEnable to enable them,
> instead of glEnableClientState
Did I really do that? How embarrassing...!
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 2:41:47 PM
|
|
fungus wrote:
> Jacobo Rodriguez wrote:
>
>> all vertex array were created using glEnable to enable them, instead
>> of glEnableClientState
>
> Did I really do that? How embarrassing...!
>
That would explain why some people got "white"
color.
Weirdly enough it appears to work on my driver
(it does what I intended, not what I said).
I've updated the version on the web site:
http://www.artlum.com/arraytest.exe
http://www.artlum.com/arraytest.cpp
|
|
0
|
|
|
|
Reply
|
fungus
|
6/13/2005 2:44:50 PM
|
|
"Jacobo Rodriguez" <sdf@skdjf.com> wrote in message
news:4Qgre.50758$US.42434@news.ono.com...
> After test it on my PC (WinXP Amd 2.6+, WildCat Realizm 100) None of the
> test are performed. I got 0 ms on all tests. After looking inside of the
> source code, I saw some things that I dont like and can explain the
> errors: all vertex array were created using glEnable to enable them,
> instead of glEnableClientState (glEnable(GL_VERTEX_ARRAY); instead of
> glEnableClientState(GL_VERTEX_ARRAY); ) the same thing happens with the
> glDisable (I dont know how it can works to the other people with the
> arrays enabled by this way...).
All I did was cut and paste without regards to checking the
correctness of 'fungus's code. At any rate, the thread evolved
into what it should have been: If a poster makes a claim about
something, back it up with source code so that others may
verify or debunk the claim. All focus is now on the source code,
its correctness, and its efficiency.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/13/2005 3:50:44 PM
|
|
> Unindexed VBO: 44.80 ms
> Indexed VBO: 23.31 ms
> LongStrip VBO: 23.79 ms
> JoinedStrip VBO: 17.78 ms ** Winner! **
> Unindexed display list: 50.15 ms
> Indexed display list: 50.24 ms
> LongStrip display list: 19.23 ms
> JoinedStrip display list: 21.98 ms
Are the Indexed/Unindexed/Strip VBOs also in display lists?
|
|
0
|
|
|
|
Reply
|
Makhno
|
6/13/2005 8:05:53 PM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:s97re.50681$US.35545@news.ono.com...
>
> I've never liked the sound of that method. Is there
> a *guarantee* that triangles with repeated indices
> will draw nothing? If you have links, post them!
>
>
> Still, I put it in my program and it put VBOs back
> at the top of the list again (by enough of a margin
> to make it worth while implementing if we can
> guarantee that it works correctly).
>
> Display lists went slower. I don't know why....
>
>
> Revised test:
>
> Unindexed VBO: 44.80 ms
> Indexed VBO: 23.31 ms
> LongStrip VBO: 23.79 ms
> JoinedStrip VBO: 17.78 ms ** Winner! **
> Unindexed display list: 50.15 ms
> Indexed display list: 50.24 ms
> LongStrip display list: 19.23 ms
> JoinedStrip display list: 21.98 ms
>
> So...let's see some more results!
>
>
Fungus,
Here are the results for my system. These suit my (time-poor, lazy?)
preference for just passing the triangles to the driver and letting it do
what it is good at.
John Paine
Results of OpenGL test with array size 150
------------------------------------------
Unindexed VBO: 57.21 ms
Indexed VBO: 28.59 ms
LongStrip VBO: 28.69 ms
JoinedStrip VBO: 35.87 ms
Unindexed display list: 30.12 ms
Indexed display list: 30.03 ms
LongStrip display list: 32.10 ms
JoinedStrip display list: 35.90 ms
System info:
------------
OS = Windows XP
CPUs = 2
RAM = 2047Mb
OpenGL info
-----------
Vendor = NVIDIA Corporation
Renderer = GeForce 6800 Series GPU/PCI/SSE2
Version = 1.5.3
Hardware acceleration? Yes
Color bits = 24
Depth bits = 24
Stencil bits = 0
Alpha bits = 0
Texture units = 4
Max anisotropic filter = 16
|
|
0
|
|
|
|
Reply
|
John
|
6/14/2005 12:19:50 AM
|
|
Makhno wrote:
>>Unindexed VBO: 44.80 ms
>>Indexed VBO: 23.31 ms
>>LongStrip VBO: 23.79 ms
>>JoinedStrip VBO: 17.78 ms ** Winner! **
>>Unindexed display list: 50.15 ms
>>Indexed display list: 50.24 ms
>>LongStrip display list: 19.23 ms
>>JoinedStrip display list: 21.98 ms
>
>
> Are the Indexed/Unindexed/Strip VBOs also in display lists?
>
no.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/14/2005 2:14:55 AM
|
|
John Paine wrote:
>
> Results of OpenGL test with array size 150
> ------------------------------------------
> Unindexed VBO: 57.21 ms
> Indexed VBO: 28.59 ms
> LongStrip VBO: 28.69 ms
> JoinedStrip VBO: 35.87 ms
> Unindexed display list: 30.12 ms
> Indexed display list: 30.03 ms
> LongStrip display list: 32.10 ms
> JoinedStrip display list: 35.90 ms
>
VBOs are fastest, but only by a whisker on
this card.
> Here are the results for my system. These suit my (time-poor, lazy?)
> preference for just passing the triangles to the driver and letting it do
> what it is good at.
>
But this only happens on your card. On other
cards VBOs enjoy a much bigger margin.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/14/2005 3:00:45 AM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:zLrre.50856$US.29794@news.ono.com...
> John Paine wrote:
>>
>> Results of OpenGL test with array size 150
>> ------------------------------------------
>> Unindexed VBO: 57.21 ms
>> Indexed VBO: 28.59 ms
>> LongStrip VBO: 28.69 ms
>> JoinedStrip VBO: 35.87 ms
>> Unindexed display list: 30.12 ms
>> Indexed display list: 30.03 ms
>> LongStrip display list: 32.10 ms
>> JoinedStrip display list: 35.90 ms
>>
>
> VBOs are fastest, but only by a whisker on
> this card.
>
>> Here are the results for my system. These suit my (time-poor, lazy?)
>> preference for just passing the triangles to the driver and letting it do
>> what it is good at.
>>
>
>
> But this only happens on your card. On other
> cards VBOs enjoy a much bigger margin.
>
>
Undeniably true, and implementing VBO's in my code has moved up on my list
of priorities as a result of the discussion and benchmarking in this thread.
But stripifying used to be on my list as well, so it seems that I've saved
myself some wasted effort as it never actually made it to the top. Care to
make any predictions for the future as to whether indexed VBO's will become
and stay dominant?
John
|
|
0
|
|
|
|
Reply
|
John
|
6/14/2005 4:26:31 AM
|
|
I just converted the test to use GLUT.
It doesn't change the results, it's just
GLUT instead of my own API.
http://www.artlum.com/arraytest.exe
http://www.artlum.com/arraytest.cpp
|
|
0
|
|
|
|
Reply
|
fungus
|
6/14/2005 8:39:55 AM
|
|
John Paine wrote:
>
> Undeniably true, and implementing VBO's in my code has moved up on my list
> of priorities as a result of the discussion and benchmarking in this thread.
> But stripifying used to be on my list as well, so it seems that I've saved
> myself some wasted effort as it never actually made it to the top. Care to
> make any predictions for the future as to whether indexed VBO's will become
> and stay dominant?
>
Based on what the card manufacturers are
saying...yes. Indexed VBOs are definitely
the future.
The reasons are:
a) It's what Direct3D does.
b) It works well with vertex programs.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/14/2005 8:47:43 AM
|
|
Some results sent to me via email...
Steve Gladin wrote:
> order is: UVBO, IVBO, LSVBO, JSVBO, UDL, IDL, LSDL, JSDL
>
>
> Windows 2K [P3@700], ATI Radeon 9600
> 96, 33, 33, 33, 94, 94, 32, 39
>
> Windows XP [PM@1700], ATI Radeon Mobility 9700
> 97, 40, 37, 38, 96, 96, 33, 40
>
We finally found a card where display lists win!
(But only for long triangle strips - not very
practical in real life)
|
|
0
|
|
|
|
Reply
|
fungus
|
6/14/2005 12:26:49 PM
|
|
"fungus" <umailMY@SOCKSartlum.com> wrote in message
news:f2Are.44797$dr.33579@news.ono.com...
>
> Some results sent to me via email...
>
>
> Steve Gladin wrote:
> > order is: UVBO, IVBO, LSVBO, JSVBO, UDL, IDL, LSDL, JSDL
> >
> >
> > Windows 2K [P3@700], ATI Radeon 9600
> > 96, 33, 33, 33, 94, 94, 32, 39
> >
> > Windows XP [PM@1700], ATI Radeon Mobility 9700
> > 97, 40, 37, 38, 96, 96, 33, 40
> >
>
> We finally found a card where display lists win!
> (But only for long triangle strips - not very
> practical in real life)
>
>
GeForce FX5200, WINXP home
212, 93, 112, 105, 107, 109, 101, 123
Indexed VBO wins, by just a shade above test noise.
What's remarkable is how close they mostly are -- mostly within noise of
each other.
However, I've also noted some unusual behavior when, for example, the
entire object is off-screen. Display lists render in 0 ms, VBO's don't
always.
I've also seen poor behavior for display lists on Linux (Quadro 4 980XGL), and
discovered it's because of the DrawArrays -- immediate-mode-defined (!)
display lists run considerably faster. Now that the new test is out in GLUT
form I'll re-run the tests.
jbw
|
|
0
|
|
|
|
Reply
|
JB
|
6/15/2005 1:32:58 AM
|
|
In article <xJwre.44774$dr.19763@news.ono.com>,
fungus <umailMY@SOCKSartlum.com> wrote:
>http://www.artlum.com/arraytest.cpp
I'm trying to get hold of this, but your server doesn't seem to be
responding.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/15/2005 3:00:53 AM
|
|
On Win2K, ATI Cat 5.2 drivers, P4 2.26GHz
ABIT Radeon 9600XT-VIO 256MB
Times: 76, 26, 26, 26, 75, 75, 25, 31
VBO and strips are consistently faster on this card and driver.
One thing I found when using OGL a few years ago was that DL were faster
than VBO on some cards. I confirmed this with D3D. Basically, the older
cards are very sensitive to the exact format of the vertices in the VBO. If
the card's hardware expected xyz and you only gave it xy, it reverted to
software vertex processing. The shader-based cards had a more flexible
vertex loader engine and could handle the different formats. On old cards,
the DL code knew which vertex format the card expected and would transform
xy to xyz internally, thus it would execute faster. Further evidence of this
was that the DL would use the same amount of memory whether I gave it xy or
xyz vertices.
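If you suspect a card falls into this category, one workaround is to do the padding yourself before filling the VBO. A minimal sketch, assuming tightly packed float xy pairs; `padXYtoXYZ` is a made-up helper name, not something from the test program:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expand tightly packed (x, y) pairs into (x, y, z) triples with a constant z.
std::vector<float> padXYtoXYZ(const std::vector<float>& xy, float z = 0.0f)
{
    assert(xy.size() % 2 == 0);              // must be whole xy pairs
    std::vector<float> xyz;
    xyz.reserve(xy.size() / 2 * 3);
    for (std::size_t i = 0; i + 1 < xy.size(); i += 2) {
        xyz.push_back(xy[i]);                // x
        xyz.push_back(xy[i + 1]);            // y
        xyz.push_back(z);                    // padded z
    }
    return xyz;
}
```

The padded array is then described to GL with 3 components per vertex (glVertexPointer(3, GL_FLOAT, 0, ...)) instead of 2. It costs 50% more memory for positions, so it is only worth doing if the 2-component format really does fall off the fast path on your hardware.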
Mike
|
|
0
|
|
|
|
Reply
|
mikegi
|
6/15/2005 4:18:13 AM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
fungus wrote:
> Unindexed display list: 50 ms
> Unindexed VBO: 45 ms
> Indexed display list: 50 ms
> Indexed VBO: 23 ms
The display list times *should* be identical. When you are compiling a
display list, the GL dereferences the vertex arrays immediately. So, the
performance of an "indexed display list" would be identical to an
"immediate mode display list".
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCsdEGX1gOwKyEAw8RApzsAJ9+ueVeqy9RaOPVJJ/GZvRLu17qWwCgiU3T
fkXUFHtpjpzsewr2WaMDDS8=
=aBp7
-----END PGP SIGNATURE-----
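To illustrate the point, here is a toy model of display-list compilation. It is not real driver code; `ToyDisplayList` and its methods are invented for this sketch, and how a real driver stores the compiled list is up to the driver. Both the indexed and the unindexed path copy the vertex data out of the client arrays at compile time, so what ends up recorded is the same stream of vertices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

inline bool operator==(const Vec3& a, const Vec3& b)
{
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Toy display list: stores fully dereferenced vertices, nothing else.
struct ToyDisplayList {
    std::vector<Vec3> recorded;

    // Analogue of compiling glDrawArrays into the list.
    void compileArrays(const std::vector<Vec3>& verts,
                       std::size_t first, std::size_t count)
    {
        assert(first + count <= verts.size());
        for (std::size_t i = 0; i < count; ++i)
            recorded.push_back(verts[first + i]);   // copied at compile time
    }

    // Analogue of compiling glDrawElements into the list.
    void compileElements(const std::vector<Vec3>& verts,
                         const std::vector<unsigned>& indices)
    {
        for (unsigned i : indices) {
            assert(i < verts.size());
            recorded.push_back(verts[i]);           // index resolved at compile time
        }
    }
};
```

In this model, changing the client arrays after compilation has no effect on the list, and an "indexed" list is indistinguishable from an "unindexed" one once compiled.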
|
|
0
|
|
|
|
Reply
|
Ian
|
6/16/2005 6:59:52 PM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Jon Harrop wrote:
> fungus wrote:
>
>>So on my card:
>>
>>a) VBO is faster than display list
>>b) Indexed VBOs are the fastest (twice as fast!)
>
> Ok, I'd have expected indexed to be faster but I'm surprised anything is
> twice as fast as a display list. That implies your drivers are doing
> something retarded with the contents of the display list.
Not really. Display lists are surprisingly hard to optimize in the
general case. That was part of the reason that VBOs were invented. :)
Given typical application usage patterns, I don't think as much effort
is put into optimizing display lists as there was before the gaming card
boom.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCsdIdX1gOwKyEAw8RAinuAKCD7bEBawXqmeN8Chdoy/mxVimK6wCfa3ij
5tPEX+R8Wgc0q72a75H1Tg0=
=KzdF
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/16/2005 7:04:35 PM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dave Eberly wrote:
> "Jon Harrop" <usenet@jdh30.plus.com> wrote in message
>
>>PS: I had to tweak the AMD64 code to not conflict with the standard
>>library's definition of int64_t.
>
> I converted glxext.h (version mentioned in Wm3GlxExtensions.h) and
> preserved the ugly conditional compile block for int64_t that occurs
> in GLX_OML_sync_control. You would think current day compiler
> writers would finally agree on providing standardized definitions for
> intN_t.
As one of the people responsible for that ugliness, current day
compilers *do* agree. The C99 spec adds stdint.h, which has all of that
good stuff. However, glxext.h has to be usable on a number of systems
where C99 is not available. I can assure you that this situation
irritates me more than anyone. :(
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCsdPHX1gOwKyEAw8RAnPBAKCRNxEgVNCFQESjBBIdYAxyL9XQRQCeLCSi
M+2UYob3Oj7+hPR0wJCIaEk=
=U26G
-----END PGP SIGNATURE-----
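For readers who have not seen this kind of block, the general shape of such a workaround is something like the sketch below. This is not the actual code from glxext.h; the `my_int64` name and the exact preprocessor tests are assumptions chosen for illustration.

```cpp
#include <limits.h>

// Pick a 64-bit signed integer type without assuming C99 is available.
#if defined(__cplusplus) && __cplusplus >= 201103L
  #include <cstdint>
  typedef std::int64_t my_int64;      // C++11 and later
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
  #include <stdint.h>
  typedef int64_t my_int64;           // C99 and later
#elif defined(_MSC_VER)
  typedef __int64 my_int64;           // older Microsoft compilers
#else
  typedef long long my_int64;         // common extension elsewhere
#endif

// Compile-time check that works on pre-C++11 compilers too: the array
// size becomes -1 (an error) if the type is not 64 bits wide.
typedef char my_int64_is_64_bits[(sizeof(my_int64) * CHAR_BIT == 64) ? 1 : -1];
```

The trouble described in the thread comes from exactly this kind of block: if the header defines the type itself on a system whose standard library already provides int64_t, the two definitions can clash.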
|
|
0
|
|
|
|
Reply
|
Ian
|
6/16/2005 7:11:38 PM
|
|
Ian Romanick wrote:
>
> When you are compiling a display list,
> the GL dereferences the vertex arrays immediately.
Can you say that for sure? Surely that's up
to the driver....
|
|
0
|
|
|
|
Reply
|
fungus
|
6/16/2005 7:15:57 PM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Jacobo Rodriguez wrote:
> After test it on my PC (WinXP Amd 2.6+, WildCat Realizm 100) None of the
> test are performed. I got 0 ms on all tests. After looking inside of the
> source code, I saw some things that I dont like and can explain the
> errors: all vertex array were created using glEnable to enable them,
> instead of glEnableClienState (glEnable(GL_VERTEX_ARRAY); instead of
> glEnableClientState(GL_VERTEX_ARRAY); ) the same thing happens with the
> glDisable (I dont know how it can works to the other people with the
> arrays enabled by this way...).
Ah, the classic. This "works" because GL_EXT_vertex_array was
originally specified to use Enable / Disable. Because of that, a *lot*
of implementations still support using Enable / Disable. In fact, the
GLX protocol code for the X.org libGL does this:
void __indirect_glEnable(GLenum cap)
{
    __GLX_DECLARE_VARIABLES();
    __GLX_LOAD_VARIABLES();
    if (!gc->currentDpy) return;
    switch(cap) {
    case GL_COLOR_ARRAY:
    case GL_EDGE_FLAG_ARRAY:
    case GL_INDEX_ARRAY:
    case GL_NORMAL_ARRAY:
    case GL_TEXTURE_COORD_ARRAY:
    case GL_VERTEX_ARRAY:
    case GL_SECONDARY_COLOR_ARRAY:
    case GL_FOG_COORD_ARRAY:
        __indirect_glEnableClientState(cap);
        return;
    default:
        break;
    }
    __GLX_BEGIN(X_GLrop_Enable,8);
    __GLX_PUT_LONG(4,cap);
    __GLX_END(8);
}
Since there are still some applications floating around that use the
GL_EXT_vertex_array version, I'm surprised that it didn't work on 3dlabs
drivers. That's interesting.
http://oss.sgi.com/projects/ogl-sample/registry/EXT/vertex_array.txt
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCsdVFX1gOwKyEAw8RAukjAJ0WXaqMc9zSToHVJ+I7DlktBjY4DQCcDXkd
MhbXwCuQquuwl0tP+FtCros=
=iE9W
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/16/2005 7:18:03 PM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
fungus wrote:
> FWIW the source code is here:
>
> http://www.artlum.com/arraytest.cpp
>
> You won't be able to compile it but you might
> find mistakes...
A quick review of the code revealed one of my wglGetProcAddress /
glXGetProcAddress pet peeves. Do *NOT* name your function pointer the
same thing as the actual function! There are a number of systems,
including some older versions of Linux where this will cause mysterious
crashes. You wouldn't put 'void *(*malloc)(size_t);' in a program and
expect libc to be happy, would you?
You're creating an entity with the same name, but a wildly different
type, as something that already exists in a system library. No good can
ever come of that.
Also, if you're using GLUT, why not just use glutGetProcAddress instead
of the loadGLfunc business?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCsdgBX1gOwKyEAw8RAjWsAJ0bmTWVajtpI3bNjtmMidJviqwyIwCfWV+D
qP+CSaw6Z0+REZ/Wo+94ASU=
=7xOQ
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/16/2005 7:29:45 PM
|
|
Ian Romanick wrote:
>
> A quick review of the code revealed one of my wglGetProcAddress /
> glXGetProcAddress pet peeves. Do *NOT* name your function pointer the
> same thing as the actual function! There are a number of systems,
> including some older versions of Linux where this will cause mysterious
> crashes. You wouldn't put 'void *(*malloc)(size_t);' in a program and
> expect libc to be happy, would you?
>
Good point, I never thought of that.
Surely the correct thing to do is put a big
#ifdef around the whole extension thing and
just let the library do its thing on Linux.
If not, how do you write portable code?
> You're creating an entity with the same name, but a wildly different
> type, as something that already exists in a system library. No good can
> ever come of that.
>
> Also, if you're using GLUT, why not just use glutGetProcAddress instead
> of the loadGLfunc business?
My copy of GLUT doesn't seem to have that function
(which I thought was weird because it *does* have
glutExtensionSupported()...)
I'm not a GLUT expert. This is actually my first
GLUT program.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/16/2005 7:42:44 PM
|
|
"Ian Romanick" <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote in
message news:1118949097.692643@q7.q7.com...
> As one of the people responsable for that uglyness, current day
> compilers *do* agree. The C99 spec adds stdint.h, which has all of that
> good stuff. However, glxext.h has to be usable on a number of systems
> where C99 is not available. I can assure you that this situation
> irritates me more than anyone. :(
You lost me on this. Your first sentence says that compilers
*do agree*. Later you say that not all compilers use the
C99 spec, which means that compilers *do not agree*.
Specifications are great, but as long as compiler writers do
not follow them, there will be differences. And even if a new
version of a compiler gets it right, older versions that do not
get it right will continue to be used. That's part of the pain in
trying to write portable software.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/16/2005 7:44:33 PM
|
|
"Ian Romanick" <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote in
message news:1118950181.833368@q7.q7.com...
> A quick review of the code revealed one of my wglGetProcAddress /
> glXGetProcAddress pet peeves. Do *NOT* name your function pointer the
> same thing as the actual function! There are a number of systems,
> including some older versions of Linux where this will cause mysterious
> crashes.
How do you suggest using glxext.h? Including it directly in a program
does not appear to work. The various extension wrapping systems
have to go to great pains to make a wrapper that works on multiple
platforms.
On Red Hat Fedora Core 3, and whatever version of g++ and ld
ship with it, naming the function pointer the same thing did not even
work. But naming it something different, and then #define'ing the
function's real name to be that other name worked. But somehow
I suspect you will object to this, too.
So honestly, are we all missing some simple way to just include
the wglext.h and glxext.h files that you can download from the
SGI site and have them work without compiler issues?
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/16/2005 7:52:30 PM
|
|
Dave Eberly wrote:
> "Ian Romanick" <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote in
> message news:1118949097.692643@q7.q7.com...
>
>
>>As one of the people responsable for that uglyness, current day
>>compilers *do* agree. The C99 spec adds stdint.h, which has all of that
>>good stuff. However, glxext.h has to be usable on a number of systems
>>where C99 is not available. I can assure you that this situation
>>irritates me more than anyone. :(
>
>
> You lost me on this. Your first sentence says that compilers
> *do agree*.
I parse it as "current day compilers *do* agree",
ie. older ones don't.
> Later you say that not all compilers use the
> C99 spec, which means that compilers *do not agree*.
>
What he said. :-)
|
|
0
|
|
|
|
Reply
|
fungus
|
6/16/2005 7:55:34 PM
|
|
Dave Eberly wrote:
>
>> Do *NOT* name your function pointer the
>>same thing as the actual function! There are a number of systems,
>>including some older versions of Linux where this will cause mysterious
>>crashes.
>
> How do you suggest using glxext.h? Including it directly in a program
> does not appear to work. The various extension wrapping systems
> have to go to great pains to make a wrapper that works on multiple
> platforms.
>
I think this is more fuel for my argument that
providing up to date OpenGL libraries sounds
nice but it's really a bad idea in practice.
I think we should all stick with OpenGL 1.1
libraries and use extensions for the newer
stuff. It avoids all sorts of problems.
|
|
0
|
|
|
|
Reply
|
fungus
|
6/16/2005 8:00:57 PM
|
|
In article <1118949097.692643@q7.q7.com>,
Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>The C99 spec adds stdint.h, which has all of that
>good stuff. However, glxext.h has to be usable on a number of systems
>where C99 is not available.
Apparently, most of the closed-source compiler vendors (Microsoft et al)
are not keen on C99.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/17/2005 7:08:55 PM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
fungus wrote:
> Ian Romanick wrote:
>
>> When you are compiling a display list,
>> the GL deferences the vertex arrays immediatly.
>
> Can you say that for sure? Surely that's up
> to the driver....
No. The OpenGL 1.2 spec explicitly says that vertex arrays are
dereferenced immediately when compiling a display list. It's the first
paragraph of page 239 (page 253 of the PDF) of the OpenGL 2.0 spec:
"Vertex array pointers are dereferenced when the commands
ArrayElement, DrawArrays, DrawElements, or DrawRangeElements
are accumulated into a display list."
This is required because vertex arrays are client-side state and display
lists are server-side state. When the display list is executed on the
server, there is no way for the server to access the client-side state.
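The spec wording can be pictured with a toy model, written in plain C with no OpenGL calls at all. All names here (toy_display_list, toy_record_draw_arrays, and so on) are hypothetical. The sketch only mimics the rule that the array is read when the command is recorded, so later changes to the client array do not affect playback; it says nothing about how a real driver stores the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a server-side display list: it owns a private copy. */
typedef struct {
    float *copied;  /* vertex data captured at record time */
    size_t count;   /* number of floats captured */
} toy_display_list;

/* Models "DrawArrays accumulated into a display list": the client
 * pointer is dereferenced now and the data is copied.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int toy_record_draw_arrays(toy_display_list *list,
                                  const float *client_array, size_t count)
{
    assert(list != NULL && client_array != NULL && count > 0);
    list->copied = (float *)malloc(count * sizeof(float));
    if (list->copied == NULL)
        return -1;
    memcpy(list->copied, client_array, count * sizeof(float));
    list->count = count;
    return 0;
}

/* Models executing the list: only the captured copy is consulted. */
static float toy_playback_first_value(const toy_display_list *list)
{
    assert(list != NULL && list->copied != NULL && list->count > 0);
    return list->copied[0];
}

/* Releases the captured copy. */
static void toy_free(toy_display_list *list)
{
    free(list->copied);
    list->copied = NULL;
    list->count = 0;
}
```

Recording an array, overwriting its first element, and then playing the list back yields the value that was present at record time.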
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCtg2sX1gOwKyEAw8RAoUqAJoCy0y5CfPt7zwkGU6eVeqOzFztHQCeKMbf
wg/Civ1MLjJMZlBTuEsSZXs=
=Ezto
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/20/2005 12:07:40 AM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dave Eberly wrote:
> "Ian Romanick" <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote in
> message news:1118950181.833368@q7.q7.com...
>
>
>>A quick review of the code revealed one of my wglGetProcAddress /
>>glXGetProcAddress pet peeves. Do *NOT* name your function pointer the
>>same thing as the actual function! There are a number of systems,
>>including some older versions of Linux where this will cause mysterious
>>crashes.
>
> How do you suggest using glxext.h? Including it directly in a program
> does not appear to work. The various extension wrapping systems
> have to go to great pains to make a wrapper that works on multiple
> platforms.
>
> On Red Hat Fedora Core 3, and whatever version of g++ and ld
> ship with it, naming the function pointer the same thing did not even
> work. But naming it something different, and then #define'ing the
> functions real name to be that other name worked. But somehow
> I suspect you will object to this, too.
>
> So honestly, are we all missing some simple way to just include
> the wglext.h and glxext.h files that you can download from the
> SGI site and have them work with compiler issues?
The Quake and Quake2 source code released by id Software handle this
correctly.
1. Test for the existence of the desired functionality by looking at
the string returned by glGetString(GL_EXTENSIONS) or
glGetString(GL_VERSION). Set some flag to know what code paths the
application should use for rendering.
2. Get pointers to the functions using glXGetProcAddress (or
wglGetProcAddress). Store the pointers in variables with names
different than the functions. I believe Quake 2 uses things like
qglMultiTexCoord3fv, for example.
3. Only call those function pointers when the flag set in step #1 is
set. If you're stylistically comfortable using a macro like '#define
glMultiTexCoord3fv(t,v) (*qglMultiTexCoord3fv)(t,v)' that's fine. Your
redefined glMultiTexCoord3fv doesn't end up in the object file, so it
doesn't matter.
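Step 1 can be sketched as a whole-token search of the extension string. This is an illustrative helper, not code taken from Quake; the name example_has_extension is an assumption, and the string passed in is expected to be the space-separated list that glGetString(GL_EXTENSIONS) returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if 'name' appears as a complete, space-delimited token in
 * 'ext_list', else 0. A bare strstr() would also accept a name that is
 * only the prefix of a longer extension name. */
static int example_has_extension(const char *ext_list, const char *name)
{
    size_t name_len;
    const char *cursor;

    assert(name != NULL);
    if (ext_list == NULL)
        return 0;
    name_len = strlen(name);
    if (name_len == 0 || strchr(name, ' ') != NULL)
        return 0;

    cursor = ext_list;
    while ((cursor = strstr(cursor, name)) != NULL) {
        const char *end = cursor + name_len;
        int starts_token = (cursor == ext_list) || (cursor[-1] == ' ');
        int ends_token = (*end == ' ') || (*end == '\0');
        if (starts_token && ends_token)
            return 1;
        cursor = end;  /* keep searching after this partial match */
    }
    return 0;
}
```

The result would be stored in a flag once, after a context is current, and consulted before every use of the extension.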
It is *not* enough that glXGetProcAddress return non-NULL. The GLX spec
differs from the WGL "spec" (there is no formal, written spec for WGL)
in that pointers returned by glXGetProcAddress are context independent.
Every time you call glXGetProcAddress for the same function name within
an application, no matter what context is bound, you will get the same
pointer. The same is *not* true on WGL. As a side-effect, it is
perfectly legal in GLX to call glXGetProcAddress with no context bound.
Here's the problem: since there is no context, the GL *cannot* know
what functions are supported.
On Linux, when you call glXGetProcAddress you get a pointer to a
trampoline function that will call the real function once a context is
bound. As a result, you can call 'glXGetProcAddress("glFooBar")' on
most Linux drivers (Nvidia's drivers being the only exception that I
know of) and get a pointer to something. Just don't try calling it. :)
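The point about non-NULL pointers can be sketched as follows. The loader here is a hypothetical stand-in that returns a dummy trampoline for every name, mimicking the behaviour described above; all example_* names are assumptions, and int / unsigned int stand in for GLsizei / GLuint so that the sketch needs no GL headers. Only the qgl prefix follows the Quake 2 convention mentioned earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*example_proc)(void);
/* Shape of a loader such as glXGetProcAddress or wglGetProcAddress. */
typedef example_proc (*example_loader)(const char *name);
/* Pointer type for the entry point being loaded. */
typedef void (*example_gen_queries_fn)(int n, unsigned int *ids);

/* The pointer deliberately does NOT share the name of the GL function. */
static example_gen_queries_fn qglGenQueriesARB = NULL;
static int have_occlusion_query = 0;

/* Dummy trampoline: what a loader may hand back for any name. */
static void example_trampoline(void) { }

/* Hypothetical loader that never returns NULL. */
static example_proc example_always_nonnull_loader(const char *name)
{
    (void)name;
    return example_trampoline;
}

/* The flag comes from the advertised extension string; a non-NULL
 * pointer alone is not trusted.
 * Simplification: substring search; a real check should match whole
 * tokens of the extension string. */
static void example_init_occlusion_query(const char *ext_list,
                                         example_loader load)
{
    assert(load != NULL);
    qglGenQueriesARB = (example_gen_queries_fn)load("glGenQueriesARB");
    have_occlusion_query =
        ext_list != NULL
        && strstr(ext_list, "GL_ARB_occlusion_query") != NULL
        && qglGenQueriesARB != NULL;
}
```

Rendering code would then test have_occlusion_query before calling through qglGenQueriesARB, and never call the pointer when the flag is clear.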
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCthBIX1gOwKyEAw8RAqZqAJ9Y8sMPJLSg8olsTU6NfWFpgF96DQCfe+WA
mxIPbtYDuQYN0w+Z2JnnyYY=
=+PIL
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/20/2005 12:18:50 AM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dave Eberly wrote:
> "Ian Romanick" <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote in
> message news:1118949097.692643@q7.q7.com...
>
>>As one of the people responsable for that uglyness, current day
>>compilers *do* agree. The C99 spec adds stdint.h, which has all of that
>>good stuff. However, glxext.h has to be usable on a number of systems
>>where C99 is not available. I can assure you that this situation
>>irritates me more than anyone. :(
>
> You lost me on this. Your first sentence says that compilers
> *do agree*. Later you say that not all compilers use the
> C99 spec, which means that compilers *do not agree*.
Modern compilers agree, but not all compilers are modern. glxext.h is
"expected" to work on systems older than 1999, which is when C99 became
available. That's all I meant by that.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCthEXX1gOwKyEAw8RAvntAJ49RAW0XlBXMr3g3nBtbqrM5yPY0QCfVMQ9
G8/Q2Ez9mKZclEner8Mk3ng=
=Okdp
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/20/2005 12:22:10 AM
|
|
Ian Romanick wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> fungus wrote:
>
>>Ian Romanick wrote:
>>
>>
>>>When you are compiling a display list,
>>>the GL deferences the vertex arrays immediatly.
>>
>>Can you say that for sure? Surely that's up
>>to the driver....
>
>
> No. The OpenGL 1.2 spec explictly says that vertex arrays are
> dereference immediatly when compiling a display list. It's the first
> paragraph of page 239 (page 253 of the PDF) of the OpenGL 2.0 spec:
>
Yes, I know that.
I was referring to the case of indexed vs. non-indexed
arrays. You said:
"The display list times *should* be identical. When you
are compiling a display list, the GL deferences the vertex
arrays immediatly. So, the performance of an "indexed
display list" would be identical to an "immediate mode
display list"."
Isn't it up to the driver how this works?
|
|
0
|
|
|
|
Reply
|
fungus
|
6/20/2005 6:36:13 AM
|
|
"Ian Romanick" <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote in
message news:1119226727.390599@q7.q7.com...
> Dave Eberly wrote:
>> So honestly, are we all missing some simple way to just include
>> the wglext.h and glxext.h files that you can download from the
>> SGI site and have them work with compiler issues?
>
> The Quake and Quake2 source code released by id Software handle this
> correctly.
<snip of outline for wrapping extensions>
If I understand your response, I cannot just #include the *.h files
and grab the function pointers. What you describe is a process
to write an extension wrapper, something I have already done.
So it appears that having *.h files is irrelevant. One must extract
the information from them and build a wrapper.
Your comments about loading the function pointers were
interesting. You say you contributed to glxext.h. If you have
any influence on this extension mechanism, it would really be
helpful to be able to go to the SGI site and download the latest wrapper
rather than download glxext.h and wglext.h followed by a lengthy
process to write our own wrapper.
--
Dave Eberly
http://www.geometrictools.com
|
|
0
|
|
|
|
Reply
|
Dave
|
6/20/2005 2:02:20 PM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
fungus wrote:
> Ian Romanick wrote:
>> fungus wrote:
>>> Ian Romanick wrote:
>>>
>>>> When you are compiling a display list,
>>>> the GL deferences the vertex arrays immediatly.
>>>
>>> Can you say that for sure? Surely that's up
>>> to the driver....
>>
>> No. The OpenGL 1.2 spec explictly says that vertex arrays are
>> dereference immediatly when compiling a display list. It's the first
>> paragraph of page 239 (page 253 of the PDF) of the OpenGL 2.0 spec:
>
> Yes, I know that.
>
> I was referring to the case of indexed vs. non-indexed
> arrays. You said:
>
> "The display list times *should* be identical. When you
> are compiling a display list, the GL deferences the vertex
> arrays immediatly. So, the performance of an "indexed
> display list" would be identical to an "immediate mode
> display list"."
>
> Isn't it up to the driver how this works?
Ah. I see what you mean. I suppose at some level it is up to the
driver. However, dereferencing the vertex array in that manner
effectively makes it the same as immediate-mode calls. Once it's cooked
down to that level, the driver ought to be able to do the same level of
optimizations (or not) on both. So, while there's no guarantee that the
performance will be the same, it certainly is common.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCtvOKX1gOwKyEAw8RAsnRAJ42yEabYV4c+4z9KGcM24As3vwNNgCeMvsQ
Gu+Zx9eRfJVSWQS9DAP/B4E=
=DHKm
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/20/2005 4:28:31 PM
|
|
Ian Romanick wrote:
> Ah. I see what you mean. I suppose at some level it is up to the
> driver. However, dereferencing the vertex array in that manner
> effectively makes it the same as immediate-mode calls...
I think you are incorrectly assuming that the driver will start by
flattening the indirections into the equivalent of immediate mode calls. I
hope that this is not the case, and I doubt that it is, as it could require
substantially more memory and could be slower to render. I would expect the driver to simply
copy the index (if any) and vertex arrays, not losing any performance.
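A rough worked example of the memory argument, as a sketch under stated assumptions: a closed triangle mesh has about twice as many triangles as vertices, each vertex takes 24 bytes (three floats of position and three of normal), and indices take 4 bytes each. The function names are hypothetical, and nothing here says what any particular driver actually does.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes for an indexed mesh: one copy of each vertex plus three
 * indices per triangle. */
static size_t example_indexed_bytes(size_t vertices, size_t triangles,
                                    size_t bytes_per_vertex,
                                    size_t bytes_per_index)
{
    assert(bytes_per_vertex > 0 && bytes_per_index > 0);
    return vertices * bytes_per_vertex + 3 * triangles * bytes_per_index;
}

/* Bytes after flattening: every triangle carries its own three
 * complete vertices, so shared vertices are duplicated. */
static size_t example_flattened_bytes(size_t triangles,
                                      size_t bytes_per_vertex)
{
    assert(bytes_per_vertex > 0);
    return 3 * triangles * bytes_per_vertex;
}
```

For 100,000 vertices and 200,000 triangles this gives 4,800,000 bytes indexed against 14,400,000 bytes flattened, a factor of three under these assumptions.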
But as fungus says, we don't know what the driver is doing. Indeed, some of
the drivers appear to be doing something else as the display list
performance is worse than the VBO performance for exactly the same data.
This means the driver is doing something to actually slow rendering down,
rather than optimising it.
We might be able to get more information if we look at memory usage...
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com
|
|
0
|
|
|
|
Reply
|
Jon
|
6/20/2005 5:04:36 PM
|
|
In article <1119226727.390599@q7.q7.com>,
Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
> 1. Test for the existance of the desired functionality by looking at
>the string returned by glGetString(GL_EXTENSIONS) or
>glGetString(GL_VERSION). Set some flag to know what code paths the
>application should use for rendering.
>
> 2. Get pointers to the functions using glXGetProcAddress (or
>wglGetProcAddress). Store the pointers in variables with names
>different than the functions. I believe Quake 2 uses things like
>qglMultiTexCoord3fv, for example.
>
> 3. Only call those function pointers when the flag set in step #1 is
>set. If you're stylistically comfortable using a macro like '#define
>glMultiTexCoord3fv(t,v) (*qglMultiTexCoord3fv)(t,v)' that's fine. Your
>redefined glMultiTexCoord3fv doesn't end up in the object file, so it
>doesn't matter.
Why not just compile in weak references to glMultiTexCoord3fv, and only
execute those calls if the glGetString(GL_EXTENSIONS) string says you
can? That would be a lot simpler than all this stuffing around with
procedure pointers that might or might not be context-dependent.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/23/2005 1:18:02 AM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Lawrence D'Oliveiro wrote:
> In article <1119226727.390599@q7.q7.com>,
> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>
>> 1. Test for the existance of the desired functionality by looking at
>>the string returned by glGetString(GL_EXTENSIONS) or
>>glGetString(GL_VERSION). Set some flag to know what code paths the
>>application should use for rendering.
>>
>> 2. Get pointers to the functions using glXGetProcAddress (or
>>wglGetProcAddress). Store the pointers in variables with names
>>different than the functions. I believe Quake 2 uses things like
>>qglMultiTexCoord3fv, for example.
>>
>> 3. Only call those function pointers when the flag set in step #1 is
>>set. If you're stylistically comfortable using a macro like '#define
>>glMultiTexCoord3fv(t,v) (*qglMultiTexCoord3fv)(t,v)' that's fine. Your
>>redefined glMultiTexCoord3fv doesn't end up in the object file, so it
>>doesn't matter.
>
> Why not just compile in weak references to glMultiTexCoord3fv, and only
> execute those calls if the glGetString(GL_EXTENSIONS) string says you
> can? That would be a lot simpler than all this stuffing around with
> procedure pointers that might or might not be context-dependent.
Because the symbols might not be there *at all*. The Linux OpenGL ABI
(link below) only requires that libGL have the symbols (i.e., visible
via nm or something similar) for OpenGL 1.2 and ARB_multitexture (so
glMultiTexCoord3fv was a bad example). Your weak references might not
be resolved in cases where the functionality *is* available.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCu5IwX1gOwKyEAw8RAlGGAJ9nuR/lSGwUOlOthw+Gadtc+xZ8gACeNuyd
1rWvHwwnMIKg/FLmtac71IY=
=Qxdq
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/24/2005 4:34:01 AM
|
|
In article <1119587641.342914@q7.q7.com>,
Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Lawrence D'Oliveiro wrote:
>> In article <1119226727.390599@q7.q7.com>,
>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>
>>> 1. Test for the existance of the desired functionality by looking at
>>>the string returned by glGetString(GL_EXTENSIONS) or
>>>glGetString(GL_VERSION). Set some flag to know what code paths the
>>>application should use for rendering.
>>>
>>> 2. Get pointers to the functions using glXGetProcAddress (or
>>>wglGetProcAddress). Store the pointers in variables with names
>>>different than the functions. I believe Quake 2 uses things like
>>>qglMultiTexCoord3fv, for example.
>>>
>>> 3. Only call those function pointers when the flag set in step #1 is
>>>set. If you're stylistically comfortable using a macro like '#define
>>>glMultiTexCoord3fv(t,v) (*qglMultiTexCoord3fv)(t,v)' that's fine. Your
>>>redefined glMultiTexCoord3fv doesn't end up in the object file, so it
>>>doesn't matter.
>>
>> Why not just compile in weak references to glMultiTexCoord3fv, and only
>> execute those calls if the glGetString(GL_EXTENSIONS) string says you
>> can? That would be a lot simpler than all this stuffing around with
>> procedure pointers that might or might not be context-dependent.
>
>Because the symbols might not be there *at all*.
That's fine. That's why the references have to be weak.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/24/2005 10:09:08 AM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Lawrence D'Oliveiro wrote:
> In article <1119587641.342914@q7.q7.com>,
> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>Lawrence D'Oliveiro wrote:
>>>In article <1119226727.390599@q7.q7.com>,
>>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>>
>>>> 3. Only call those function pointers when the flag set in step #1 is
>>>>set. If you're stylistically comfortable using a macro like '#define
>>>>glMultiTexCoord3fv(t,v) (*qglMultiTexCoord3fv)(t,v)' that's fine. Your
>>>>redefined glMultiTexCoord3fv doesn't end up in the object file, so it
>>>>doesn't matter.
As I mentioned in another message, glMultiTexCoord3fv is a bad example
because the Linux OpenGL ABI requires that symbol be in libGL. So, I'm
going to stop using that function as my example. ;)
>>>Why not just compile in weak references to glMultiTexCoord3fv, and only
>>>execute those calls if the glGetString(GL_EXTENSIONS) string says you
>>>can? That would be a lot simpler than all this stuffing around with
>>>procedure pointers that might or might not be context-dependent.
>>
>>Because the symbols might not be there *at all*.
>
> That's fine. That's why the references have to be weak.
I think you've missed what I'm saying. There exist, in the real world,
systems where libGL does *not* have the symbol glGenQueriesARB, for
example, but 'glGetString(GL_EXTENSIONS)' returns a string containing
'GL_ARB_occlusion_query' and 'glXGetProcAddress("glGenQueriesARB")'
returns a pointer to a valid function.
On those systems, a program that uses the technique you describe will
crash unexpectedly.
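The failure case can be sketched without OpenGL. Assumptions: GCC or Clang on an ELF system such as Linux, where an unresolved weak reference links as a null address; all example_* names are hypothetical, and example_get_proc merely stands in for glXGetProcAddress.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry point that no library in this sketch exports.
 * GCC/Clang extension: the weak attribute lets the program link anyway,
 * and the address is then null at run time. */
extern void example_missing_entry_point(int n, unsigned int *ids)
    __attribute__((weak));

typedef void (*example_entry_fn)(int n, unsigned int *ids);

/* Stand-in for the implementation a driver supplies at run time. */
static void example_driver_impl(int n, unsigned int *ids)
{
    int i;
    assert(n >= 0 && ids != NULL);
    for (i = 0; i < n; ++i)
        ids[i] = (unsigned int)(i + 1);
}

/* Stand-in for glXGetProcAddress: finds the function even though the
 * library exports no symbol for it. */
static example_entry_fn example_get_proc(const char *name)
{
    (void)name;
    return example_driver_impl;
}

/* Returns 1 if the weak reference was resolved by the linker, else 0. */
static int example_weak_symbol_present(void)
{
    return example_missing_entry_point != NULL;
}
```

In this sketch the weak reference reports the function as absent while the loader returns a working pointer, so code that trusts the extension string and then calls through the weak symbol would jump to a null address.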
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCwIeBX1gOwKyEAw8RAjycAJ4l3VNBxKH7gOXYR9xZv3ogyi3QGQCggsvJ
Fg+fGebnriAJmZ4murI8aO4=
=tpvj
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
6/27/2005 10:49:38 PM
|
|
In article <1119912575.901582@q7.q7.com>,
Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Lawrence D'Oliveiro wrote:
>> In article <1119587641.342914@q7.q7.com>,
>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>>Lawrence D'Oliveiro wrote:
>>>>In article <1119226727.390599@q7.q7.com>,
>>>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>>>
>>>>> 3. Only call those function pointers when the flag set in step #1 is
>>>>>set. If you're stylistically comfortable using a macro like '#define
>>>>>glMultiTexCoord3fv(t,v) (*qglMultiTexCoord3fv)(t,v)' that's fine. Your
>>>>>redefined glMultiTexCoord3fv doesn't end up in the object file, so it
>>>>>doesn't matter.
>
>As I mentioned in another message, glMultiTexCoord3fv is a bad example
>because the Linux OpenGL ABI requires that symbol be in libGL. So, I'm
>going to stop using that function as my example. ;)
>
>>>>Why not just compile in weak references to glMultiTexCoord3fv, and only
>>>>execute those calls if the glGetString(GL_EXTENSIONS) string says you
>>>>can? That would be a lot simpler than all this stuffing around with
>>>>procedure pointers that might or might not be context-dependent.
>>>
>>>Because the symbols might not be there *at all*.
>>
>> That's fine. That's why the references have to be weak.
>
>I think you've missed what I'm saying. There exist, in the real world,
>systems where libGL does *not* have the symbol glGenQueriesARB, for
>example, but 'glGetString(GL_EXTENSIONS)' returns a string containing
>'GL_ARB_occlusion_query' and 'glXGetProcAddress("glGenQueriesARB")'
>returns a pointer to a valid function.
>
>On those systems, a program that uses the technique you describe will
>crash unexpectedly.
In that case, skip the glGetString call and simply check that the weak
symbol is non-nil before trying to call it.
|
|
0
|
|
|
|
Reply
|
Lawrence
|
6/28/2005 6:41:23 AM
|
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Lawrence D'Oliveiro wrote:
> In article <1119912575.901582@q7.q7.com>,
> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>Lawrence D'Oliveiro wrote:
>>>In article <1119587641.342914@q7.q7.com>,
>>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>>
>>>>Lawrence D'Oliveiro wrote:
>>>>
>>>>>Why not just compile in weak references to glMultiTexCoord3fv, and only
>>>>>execute those calls if the glGetString(GL_EXTENSIONS) string says you
>>>>>can? That would be a lot simpler than all this stuffing around with
>>>>>procedure pointers that might or might not be context-dependent.
>>>>
>>>>Because the symbols might not be there *at all*.
>>>
>>>That's fine. That's why the references have to be weak.
>>
>>I think you've missed what I'm saying. There exist, in the real world,
>>systems where libGL does *not* have the symbol glGenQueriesARB, for
>>example, but 'glGetString(GL_EXTENSIONS)' returns a string containing
>>'GL_ARB_occlusion_query' and 'glXGetProcAddress("glGenQueriesARB")'
>>returns a pointer to a valid function.
>>
>>On those systems, a program that uses the technique you describe will
>>crash unexpectedly.
>
> In that case, skip the glGetString call and simply check that the weak
> symbol is non-nil before trying to call it.
And that will crash on systems where the libGL exports the symbol but
the driver doesn't support the function. I really don't understand the
resistance to doing things the documented, correct way. Is it really
worth it to save 5 lines of code?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iD8DBQFCxZDHX1gOwKyEAw8RAi3sAJ98B8cp1sSQX5OIOUvgFWNWcfvUYwCeP+by
09bYhyDEjPHTPNPmMETnuvA=
=qpvj
-----END PGP SIGNATURE-----
|
|
0
|
|
|
|
Reply
|
Ian
|
7/1/2005 6:30:15 PM
|
|
In article <1120242612.910472@q7.q7.com>,
Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>Lawrence D'Oliveiro wrote:
>> In article <1119912575.901582@q7.q7.com>,
>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>>Lawrence D'Oliveiro wrote:
>>>>In article <1119587641.342914@q7.q7.com>,
>>>> Ian Romanick <idr@SPAM_MUST_DIEparanormal-entertainment.com> wrote:
>>>>
>>>>>Lawrence D'Oliveiro wrote:
>>>>>
>>>>>>Why not just compile in weak references to glMultiTexCoord3fv, and only
>>>>>>execute those calls if the glGetString(GL_EXTENSIONS) string says you
>>>>>>can? That would be a lot simpler than all this stuffing around with
>>>>>>procedure pointers that might or might not be context-dependent.
>>>>>
>>>>>Because the symbols might not be there *at all*.
>>>>
>>>>That's fine. That's why the references have to be weak.
>>>
>>>I think you've missed what I'm saying. There exist, in the real world,
>>>systems where libGL does *not* have the symbol glGenQueriesARB, for
>>>example, but 'glGetString(GL_EXTENSIONS)' returns a string containing
>>>'GL_ARB_occlusion_query' and 'glXGetProcAddress("glGenQueriesARB")'
>>>returns a pointer to a valid function.
>>>
>>>On those systems, a program that uses the technique you describe will
>>>crash unexpectedly.
>>
>> In that case, skip the glGetString call and simply check that the weak
>> symbol is non-nil before trying to call it.
>
>And that will crash on systems where the libGL exports the symbol but
>the driver doesn't support the function.
I thought the GL library _was_ part of the driver, so why should it be
exporting a function it doesn't support?
>I really don't understand the
>resistance to doing things the documented, correct way. Is it really
>worth it to save 5 lines of code?
And the extra indirection time every time you call the function.
Lawrence | 7/13/2005 8:49:04 AM
Lawrence D'Oliveiro wrote:
>
>>I really don't understand the
>>resistance to doing things the documented, correct way. Is it really
>>worth it to save 5 lines of code?
>
> And the extra indirection time every time you call the function.
a) The OpenGL library will most likely do the same
indirection - that's how it adapts to different
drivers and contexts. It may even be worse: if there are
stack parameters involved it might need to copy
them.
b) Modern CPUs have look-ahead logic to eliminate
things like indirections so they never reach the
core CPU.
fungus | 7/13/2005 12:33:51 PM
In article <TS7Be.49525$dr.20502@news.ono.com>,
fungus <umailMY@SOCKSartlum.com> wrote:
>Lawrence D'Oliveiro wrote:
>>
>>>I really don't understand the
>>>resistance to doing things the documented, correct way. Is it really
>>>worth it to save 5 lines of code?
>>
>> And the extra indirection time every time you call the function.
>
>a) The OpenGL library will most likely do the same
>indirection...
This is an extra one on top of that.
>b) Modern CPUs have look-ahead logic to eliminate
>things like indirections so they never reach the
>core CPU.
Nevertheless, the code is still simpler if you just use a weak reference
instead of having to manually set up a function pointer under a
different name from the function you actually want to call.
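For readers unfamiliar with the technique, a weak reference can be sketched roughly like this. It is only a sketch under assumptions: `__attribute__((weak))` is a GCC/Clang extension and the behaviour shown is that of ELF platforms such as Linux; `hypothetical_gl_entry_point` and `call_if_linked` are made-up names that no library defines, so the symbol resolves to a null address at link time.

```c
#include <stddef.h>

/* Hypothetical entry point that nothing in this example defines.
   Because the reference is weak, linking still succeeds and the
   symbol's address is NULL when no library provides it. */
extern void hypothetical_gl_entry_point(void) __attribute__((weak));

/* Call the function only if the linker found a definition.
   Returns 1 if the call was made, 0 if the symbol was absent. */
int call_if_linked(void)
{
    if (hypothetical_gl_entry_point != NULL) {
        hypothetical_gl_entry_point();
        return 1;
    }
    return 0;
}
```

Note that a non-NULL result only says that some linked library exports the symbol. It says nothing about whether the current driver or context supports the function, which is the objection raised earlier in the thread.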
Lawrence | 7/15/2005 11:21:05 AM
Lawrence D'Oliveiro wrote:
>
>>>And the extra indirection time every time you call the function.
>>
>>a) The OpenGL library will most likely do the same
>>indirection...
>
>
> This is an extra one on top of that.
>
By using your own pointer you might bypass the
other layers and go straight to the driver
- faster.
>>b) Modern CPUs have look-ahead logic to eliminate
>>things like indirections so they never reach the
>>core CPU.
>
> Nevertheless, the code is still simpler if you just use a weak reference
Like, how much simpler? Two or three lines
of code? Big deal....
After the initial learning curve I really don't
see where the pain is in using the extension
mechanism.
fungus | 7/15/2005 2:00:06 PM
In article <KjPBe.49837$dr.43495@news.ono.com>,
fungus <umailMY@SOCKSartlum.com> wrote:
>Lawrence D'Oliveiro wrote:
>>
>>>>And the extra indirection time every time you call the function.
>>>
>>>a) The OpenGL library will most likely do the same
>>>indirection...
>>
>> This is an extra one on top of that.
>
>By using your own pointer you might bypass the
>other layers and go straight to the driver
>- faster.
You might or you might not. You might simply get an indirect pointer to
the driver's indirect pointer.
>>>b) Modern CPUs have look-ahead logic to eliminate
>>>things like indirections so they never reach the
>>>core CPU.
>>
>> Nevertheless, the code is still simpler if you just use a weak reference
>
>Like, how much simpler? Two or three lines
>of code?
For _every_ new API call.
Lawrence | 7/16/2005 12:04:55 AM
Lawrence D'Oliveiro wrote:
> In article <KjPBe.49837$dr.43495@news.ono.com>,
> fungus <umailMY@SOCKSartlum.com> wrote:
>
>
>>Lawrence D'Oliveiro wrote:
>>
>>>>>And the extra indirection time every time you call the function.
>>>>
>>>>a) The OpenGL library will most likely do the same
>>>>indirection...
>>>
>>>This is an extra one on top of that.
>>
>>By using your own pointer you might bypass the
>>other layers and go straight to the driver
>>- faster.
>
>
> You might or you might not. You might simply get an indirect pointer to
> the driver's indirect pointer.
>
Not very likely. The .lib file you link to is a
wrapper around the driver so the "functions" you
call are most likely just pointers anyway.
>>Like, how much simpler? Two or three lines
>>of code?
>
> For _every_ new API call.
With a macro it's one line and I doubt that's going
to put your project behind schedule.
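Something along these lines is what the macro approach might look like. This is a rough sketch only: `DECLARE_GL_PROC`, `LOAD_GL_PROC`, `demo_get_proc_address`, `demoAdd` and `demoMissing` are all hypothetical names, and the stand-in loader takes the place of wglGetProcAddress / glXGetProcAddress so the example builds without OpenGL.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type, as a GL loader would return. */
typedef void (*generic_proc)(void);

/* Stand-in for wglGetProcAddress / glXGetProcAddress (assumption). */
generic_proc demo_get_proc_address(const char *name);

/* One line per entry point: defines the pointer type name_fn and a
   pointer variable p_name, initially NULL. */
#define DECLARE_GL_PROC(ret, name, args) \
    typedef ret (*name##_fn) args;       \
    name##_fn p_##name = NULL

/* One line per entry point at init time: look the function up by its
   own name and store it in the matching pointer. */
#define LOAD_GL_PROC(name) \
    (p_##name = (name##_fn)demo_get_proc_address(#name))

/* Fake "driver" function so the sketch has something to load. */
static int demo_add_impl(int a, int b) { return a + b; }

generic_proc demo_get_proc_address(const char *name)
{
    if (strcmp(name, "demoAdd") == 0)
        return (generic_proc)demo_add_impl;
    return NULL; /* unknown entry point */
}

DECLARE_GL_PROC(int, demoAdd, (int, int));
DECLARE_GL_PROC(void, demoMissing, (void));

/* Load everything once; returns 1 if the required pointer was found. */
int load_demo_procs(void)
{
    LOAD_GL_PROC(demoAdd);
    LOAD_GL_PROC(demoMissing);
    return p_demoAdd != NULL;
}
```

After load_demo_procs() the call site is just p_demoAdd(2, 3). Adding a `#define demoAdd p_demoAdd` after the declarations would let call sites keep the original function name.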
fungus | 7/16/2005 5:04:54 AM