A texture array is conceptually what should be used in this case. One
advantage of this is that we don't have to generate mipmaps ourselves but can
let the graphics driver take care of it. The same goes for selecting the
mipmap level. This even allows choosing different mipmap levels for different
textures.
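A rough sketch of such a setup; sizes, format, and the pixel source are
placeholders, not the actual landscape loading code:

    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
    // Allocate all layers at once, then fill them one by one.
    glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_RGBA8, width, height, layerCount,
                 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    for (int layer = 0; layer < layerCount; ++layer)
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, layer, width, height, 1,
                        GL_RGBA, GL_UNSIGNED_BYTE, pixels[layer]);
    // One call builds mipmaps for every layer; the driver then also picks
    // the mipmap level per sample, which can differ between textures.
    glGenerateMipmap(GL_TEXTURE_2D_ARRAY);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER,
                    GL_LINEAR_MIPMAP_LINEAR);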
This is a somewhat experimental change since it makes OpenGL 3.0 a hard
requirement for OpenClonk. I expect that this is fine, but if this causes
failures during landscape creation on common hardware/drivers we should
revisit.
Add a C4ShaderCall parameter to the most important drawing functions, and
make C4DrawGL's CreateSpriteShader public with extra parameters to specify
additional defines and shader slices. C4Sky uses this to compile its own
shader with OC_SKY defined.
Instead of one draw call for each tile, do the whole operation with a single
draw call by setting GL_REPEAT on the texture. This affects sky, the upper
board and the background.
This also allows removing some code that was making sure surfaces are big
enough.
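The tiled draw then collapses to something like this sketch (coordinates
and texture are placeholders; fixed-function style to match the code of
that era):

    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
    // One quad with texture coordinates running from 0 to the number of
    // repetitions tiles the texture across the whole area in one call.
    glBegin(GL_QUADS);
    glTexCoord2f(0.0f,   0.0f);   glVertex2f(x0, y0);
    glTexCoord2f(tilesX, 0.0f);   glVertex2f(x1, y0);
    glTexCoord2f(tilesX, tilesY); glVertex2f(x1, y1);
    glTexCoord2f(0.0f,   tilesY); glVertex2f(x0, y1);
    glEnd();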
Previously, the em <-> pixels conversion used a hardcoded value. Now the GUI scales with the font size that can be selected in the options.
Sadly, all scales were off before, since the hardcoded value was too low.
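The conversion itself reduces to a one-liner; a sketch with invented
names, where the font height comes from the options:

    // Derive the em-to-pixel factor from the configured font height
    // instead of a hardcoded constant.
    inline int EmToPixels(float em, int fontHeight)
    {
        return static_cast<int>(em * fontHeight + 0.5f);
    }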
In comparison to the old system, this is a downgrade - instead of being
able to set a full color mapping by gamma ramp, we now get just a value
per colour channel.
The upside is that we do not need to play around with the global gamma
ramps any more, which was arguably the wrong way to do it.
This commit will likely break everything that has been using gamma so far.
Instead of doing the transformation when drawing a mesh, do it up front.
This makes the OpenGL normal matrix more consistent, since it does not
include the Ogre-To-Clonk transformation, and the transformation does not
need to be inverted in the shader.
As a side effect, all Attach transformations were updated, since before
they were specified in the OGRE reference frame, not the Clonk reference
frame.
For whatever reason, the shader code that was passed to the compiler was
different from the code that got written to the shader log. This is a
huge pain in the ass when trying to debug shader errors because the line
information is completely wrong. I assume this decision was a premature
optimization, so I've removed it and we'll now log the exact same code
as the shader compiler sees.
Several rendering changes have left the non-rendering build unable to
build from source. Dummy out all of these functions to make it work
again.
I'm sure there was a reason to have a separate DebugLog function inside
C4Draw, with a different visibility trigger, but I don't see it. Also
there was no DebugLogF, so that's fun too.
The GLEW headers of Ubuntu 12.04 LTS don't know about GL_KHR_debug yet,
so we have to test for it before using its enum. Additionally, drivers
without support for GL_KHR_debug would emit INVALID_ENUM, so we test for
driver support too.
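That results in a double guard, roughly like this: a compile-time check
for old headers plus a run-time check for the driver:

    #ifdef GL_KHR_debug          // old GLEW headers don't define this
        if (GLEW_KHR_debug)      // and old drivers don't implement it
            glEnable(GL_DEBUG_OUTPUT);
    #endif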
When an error's log output is rendered graphically, the graphics operation
can lead to another error (or the same error again), which will be logged
graphically again, and so forth, until the stack overflows. So log to the
log file only.
To create debug contexts, we have to use glXCreateContextAttribsARB. To use
that, we have to initialize GLEW, which means creating a dummy GL context. To
create a dummy context with the same FB config as the final one, we need to...
initialize GLEW, because it suppresses the GLX 1.4 function declarations.
So instead we'll just manually initialize the three function pointers we're
going to need.
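A sketch of that manual bootstrap; which three entry points are actually
fetched is my assumption here:

    #include <GL/glx.h>

    static PFNGLXCREATECONTEXTATTRIBSARBPROC pglXCreateContextAttribsARB;
    static PFNGLXCHOOSEFBCONFIGPROC pglXChooseFBConfig;
    static PFNGLXGETVISUALFROMFBCONFIGPROC pglXGetVisualFromFBConfig;

    static void LoadGLXEntryPoints()
    {
        // glXGetProcAddressARB is exported directly and needs no context.
        pglXCreateContextAttribsARB =
            reinterpret_cast<PFNGLXCREATECONTEXTATTRIBSARBPROC>(
                glXGetProcAddressARB(
                    reinterpret_cast<const GLubyte *>("glXCreateContextAttribsARB")));
        pglXChooseFBConfig =
            reinterpret_cast<PFNGLXCHOOSEFBCONFIGPROC>(
                glXGetProcAddressARB(
                    reinterpret_cast<const GLubyte *>("glXChooseFBConfig")));
        pglXGetVisualFromFBConfig =
            reinterpret_cast<PFNGLXGETVISUALFROMFBCONFIGPROC>(
                glXGetProcAddressARB(
                    reinterpret_cast<const GLubyte *>("glXGetVisualFromFBConfig")));
    }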
The GL driver is allowed to use different entry points depending on the
context. This means that we can't just initialize GLEW once and use it
all the time, but we must refresh the entry point list every time we
create a new context.
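In practice that means a glewInit() call right after each context becomes
current, roughly (WGL shown; the GLX path is analogous):

    // A fresh context may expose different entry points, so rebuild
    // GLEW's function table every time one is created.
    wglMakeCurrent(hDC, hRC);
    glewInit();  // must be re-run for each new context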
Some resources can't be shared across different rendering contexts while
others can. Additionally, the standard GLEW library does not support
multiple rendering contexts (that's what GLEW MX is for), even though it
might work on some (or even most) cards. WGL supports reuse of a
rendering context across multiple windows as long as the pixel formats
are the same.
4x3 matrices use the same number of uniform components as 4x4 ones.
If we're short on uniform components, don't transpose the transformation
matrix before sending it to the shader, and transpose it in the shader
itself instead, saving 4 components per bone.
The last row of the bone transformation matrix is always (0, 0, 0, 1), so
there's no point in uploading it. Also reduce the max bone count to 80,
which means the uniform array will fit into the available space on GeForce
6000 and 7000 series GPUs.
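Combined with the transposition trick above, the upload side might look
like this sketch (uniform location handling and matrix layout are
assumptions):

    #include <array>
    #include <vector>
    #include <GL/glew.h>

    // Send each bone as the three meaningful rows of its transposed 4x3
    // matrix; the shader reconstructs the implicit (0,0,0,1) row. This
    // costs 3 vec4 uniforms per bone instead of 4.
    // Shader side: uniform vec4 bones[3 * MAX_BONES];
    void UploadBones(GLint location,
                     const std::vector<std::array<float, 16>> &bones)
    {
        std::vector<GLfloat> rows;
        rows.reserve(bones.size() * 12);
        for (const auto &m : bones)                // m is a row-major 4x4
            rows.insert(rows.end(), m.begin(), m.begin() + 12); // rows 0..2
        glUniform4fv(location, static_cast<GLsizei>(rows.size() / 4),
                     rows.data());
    }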
As long as we're not actually using a different shader for meshes
without bones, we need to upload an identity matrix so there's defined
data in the bone slot.
Doing skinning on the GPU shows a noticeable performance improvement in
pretty much any situation, but especially so in scenes with lots of
animated objects with high polygon counts.
CStdGL::CheckGLError calls glGetError, which is really, really slow
because it has to flush the pipeline to check whether there's an error or
not. Plus, it's not like we can do anything about it anyway. If you want
to be notified when an error happens, pass --debug-opengl to the
executable.
Instead of transforming all vertices on the CPU every time an animation
progresses, we now only recalculate the skeleton, leaving the heavy
lifting for the GPU. This also means we no longer have to push all
vertices onto the bus every frame, because the mesh isn't changing and
can therefore be stored in a GL_STATIC_DRAW VBO when it's first loaded.
The downside of this approach is that there's only a limited number of
uniforms and vertex attributes we can pass to the shader. At the moment
these limits are a maximum of 128 bones per skeleton, and no vertex can
be influenced by more than 8 bones at once. So far this is no problem,
as the most complex skeleton in the base game uses less than 64 bones
and no more than 6 bone weights per vertex.
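A minimal sketch of the vertex-shader side under those limits, with
invented attribute and uniform names (not OpenClonk's actual shader):

    static const char *kSkinningVS = R"(
        #version 120
        uniform mat4 bones[128];                   // per-skeleton limit
        attribute vec4 boneIndices0, boneIndices1; // up to 8 influences
        attribute vec4 boneWeights0, boneWeights1; // per vertex
        void main()
        {
            vec4 skinned = vec4(0.0);
            for (int i = 0; i < 4; ++i)
            {
                skinned += boneWeights0[i]
                         * (bones[int(boneIndices0[i])] * gl_Vertex);
                skinned += boneWeights1[i]
                         * (bones[int(boneIndices1[i])] * gl_Vertex);
            }
            gl_Position = gl_ModelViewProjectionMatrix * skinned;
        }
    )";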
Instead of having the default vertex shader hard-coded into the engine,
allow loading it from Graphics.ocg. There's still a fall-back version
wired into the engine because we can't return an error from
GetVertexShaderCodeForPass.
While we're still not doing skinning on the GPU, copying the vertex data
to a VBO immediately after updating the animation allows us to re-use
that data for unanimated meshes. It also allows us to store unanimated
data on the GPU, instead of transferring it over the bus for each frame.
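The buffer usage hint then follows directly from whether a mesh is
animated; a minimal sketch:

    GLuint vbo;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    // Unanimated meshes are uploaded once and stay on the GPU; animated
    // ones are re-uploaded after each animation update.
    glBufferData(GL_ARRAY_BUFFER, vertexBytes, vertexData,
                 animated ? GL_DYNAMIC_DRAW : GL_STATIC_DRAW);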
In the very common case where the C4Surface only uses a single texture
to store its data, a lot of the work GetTexAt does is actually
unnecessary. Split it up so we can inline the fast path and only fall
back to the slow path when the surface is split up into multiple
textures.
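The split might look roughly like this (member and helper names are
invented for the sketch):

    // Hypothetical shape of the split; the real code lives in C4Surface.
    inline C4TexRef *C4Surface::GetTexAt(int x, int y)
    {
        if (textureCount == 1)      // common case: a single texture,
            return textures[0];     // no tile arithmetic needed
        return GetTexAtSlow(x, y);  // out-of-line path for tiled surfaces
    }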
glGetString(GL_EXTENSIONS) is deprecated starting with OpenGL 3.0.
Instead, you're now supposed to retrieve the list of extensions one by
one with glGetStringi.
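The replacement pattern looks like this:

    GLint count = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &count);
    for (GLint i = 0; i < count; ++i)
    {
        const char *ext =
            reinterpret_cast<const char *>(glGetStringi(GL_EXTENSIONS, i));
        // compare ext against the extensions we're interested in
    }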
We've been using OpenGL 2.1 features for some time now, and hardware
started supporting OpenGL 2.1 back in 2005. I doubt this will make anyone
unable to run the game, and it's certainly better than crashing because
of a null pointer dereference when some GL function we use can't be
found.
The MSDN reference for wglMakeCurrent states that the first (hdc)
parameter is ignored when the second one is NULL. This is incorrect: the
function validates the hdc parameter before doing any work. Since we
have a DC anyway, it's no problem to pass that to wglMakeCurrent.
Depending on how current your headers are, the userParam parameter to
GLDEBUGPROCARB may be const, or it may not. The ARB has added the const
qualifier at some point after publishing the specs. Hooray for breaking
API changes.
This introduces a new command line parameter "--debug-opengl", which
will create special debug OpenGL contexts and attach a callback that the
driver will invoke when it detects a problem. The callback will then
write the error message to the logfile, and break into the debugger if
one is attached.
Currently only works on Windows.
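A sketch of the callback wiring; the stderr logging stands in for the
engine's logfile helper:

    #include <cstdio>

    static void APIENTRY OnGLDebugMessage(GLenum source, GLenum type,
        GLuint id, GLenum severity, GLsizei length, const GLchar *message,
        const GLvoid *userParam) // const-ness varies between headers
    {
        fprintf(stderr, "GL debug: %.*s\n", (int)length, message);
    #ifdef _WIN32
        if (IsDebuggerPresent())
            DebugBreak();        // break into an attached debugger
    #endif
    }

    // After creating the context with WGL_CONTEXT_DEBUG_BIT_ARB set:
    glDebugMessageCallbackARB(OnGLDebugMessage, nullptr);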
gluErrorString returns latin-1 encoded strings. Our code expects to
receive UTF-8 encoded strings everywhere, so make sure that the strings
are converted before passing them on.
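For reference, the conversion itself is tiny, since Latin-1 maps 1:1 onto
the first 256 Unicode code points; a sketch, not the engine's actual
helper:

    #include <string>

    // Code points U+0080..U+00FF become two UTF-8 bytes; everything
    // below stays as-is.
    std::string Latin1ToUtf8(const char *s)
    {
        std::string out;
        for (; *s; ++s)
        {
            unsigned char c = static_cast<unsigned char>(*s);
            if (c < 0x80)
                out += static_cast<char>(c);
            else
            {
                out += static_cast<char>(0xC0 | (c >> 6));
                out += static_cast<char>(0x80 | (c & 0x3F));
            }
        }
        return out;
    }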
Graphics are now pre-loaded and may then be accessed in random order. Reduces Objects.ocd load time from 20 seconds to 1 second for me.
Some ordering is still broken (e.g. material.ocg and player files).
While none of the mismatches had any side effects, this silences a
number of -Wreorder warnings that were drowning out potentially
important diagnostics.
This should improve cache coherency by having all surface tiles adjacent
instead of strewn across the heap. This will also remove an indirection
in the common case of only using one tile.
With this change, an additional rectangle is stored in C4FoWRegion that
represents the area covered by the viewport in fractional floating point
coordinates. This allows the light texture to be created for an arbitrary
portion of the landscape, and the coordinate transformations for the
shaders will still work.
Also, since the additional rectangle uses floating point precision, the
computed coordinate transformations now give the exact same result as for
the landscape pixel-by-pixel, and there should not be any offsets left.
I also hope that this change improves or fixes the single-pixel lines of
sky that are sometimes seen at the edges of the viewport.
This gets rid of GL state changes for questionable gain. It also fixes drawing
of fade sky backgrounds in global viewports (where, for some reason, the shade
model was set to GL_FLAT instead of GL_SMOOTH).
This makes it possible to see the whole landscape without any areas
covered by FoW in the global viewport. Basically, the ambient lighting is
set to 1.0 independently of the ambient light map. In the course of this,
a second
shader for the landscape has been introduced.
There were two problems with the previous transforms:
1) For inverting the Y axis for the ambient map, the total height of the
output window is needed, not only the viewport region.
2) The Y offset to only use the part of the light texture that is being
rendered to was not applied.
In order to keep the transformations more readable, a new lightweight class
C4FragTransform has been introduced which can only handle translations
and scales in x and y.
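Conceptually such a class only needs to track four numbers; a sketch of
the idea, not the actual C4FragTransform interface:

    // Maps (x, y) to (sx * x + tx, sy * y + ty).
    class FragTransform
    {
        float sx = 1.0f, sy = 1.0f, tx = 0.0f, ty = 0.0f;
    public:
        // Each appended operation applies to the input before the
        // transformations appended earlier.
        void Scale(float x, float y)     { sx *= x; sy *= y; }
        void Translate(float x, float y) { tx += sx * x; ty += sy * y; }
        float MapX(float x) const { return sx * x + tx; }
        float MapY(float y) const { return sy * y + ty; }
    };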
This allows ignoring slice declarations using `#define slice(x)`, which
will be useful for custom mesh material shaders, allowing them to be
written such that they can be used standalone in a mesh viewer but also as
slices for OpenClonk, in which case lighting and color modulation will be
applied automatically.
Since we're no longer using DirectX, nVidia's automatic detection no longer classifies OpenClonk as a game that should use the high-performance GPU. Note that this flag does not work on some old drivers (version < 302 according to the specs). To support these old drivers, we would have to link against DirectX despite not using it.
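The flag in question is presumably NVIDIA's documented NvOptimusEnablement
export:

    // Exporting this symbol with value 1 asks Optimus drivers (>= 302)
    // to run the process on the high-performance GPU.
    extern "C" {
        __declspec(dllexport) unsigned long NvOptimusEnablement = 0x00000001;
    }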
This doesn't fix the material preview in editor mode yet, but at least there's no more assertion failure. We should probably create a proper render target surface for that.
Otherwise the default pack alignment is 4, and when the horizontal window
dimension is not a multiple of 4, glReadPixels() would write past the end
of the buffer we provided.
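The fix boils down to a single line before the read; sketched here for an
RGB read into a tightly packed buffer:

    glPixelStorei(GL_PACK_ALIGNMENT, 1);  // byte-tight rows, no padding
    glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pixels);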
Otherwise, we would partly write files with uninitialized data, such as the
padding bytes in a BMP, or the palette if not specified explicitly. This
mostly fixes corresponding valgrind warnings, but also makes sure we obtain
the same BMP files, bit by bit, every time we store the same StdSurface8
object.
This will allow us to avoid some code duplication when computing the
coordinate transform from fragment coordinates to ambient and light texture
coordinates.