Introduction
I have implemented part of “View-dependent displacement mapping” (VDM), presented at SIGGRAPH’03 by Wang et al. In order to efficiently render detailed surface mesostructure, VDM models surface displacements along the viewing direction. Unlike traditional displacement mapping, VDM allows for efficient rendering of self-shadows, occlusions and silhouettes without increasing the complexity of the underlying surface mesh, as it is performed in screen-space. VDM is based on per-pixel processing, and with hardware acceleration it can render mesostructure with rich visual appearance in real time.
Figure 1: The definition and geometry of view-dependent depths
The algorithm takes as input a textured height field sample and converts this to a VDM representation, which is a function dVDM(x,y,φ,θ,c) encoded in a high dimensional texture. The distances are obtained in the original paper by a ray-casting preprocess, but I opted for a technique using the depth buffer (see Implementation). During rendering, the parameters are obtained on a per-vertex basis (in a vertex program on the graphics hardware) and interpolated. Then, silhouettes, texture offsets, shadows and shading are computed for each pixel in the fragment program.
The principal techniques that were covered in COMP236 and that are employed in the algorithm are: perspective-correct texture mapping, bump mapping, displacement mapping, meshes, triangle strip acceleration, display lists, ray casting algorithms and pre-computed illumination models for shading. New techniques that I learned during this project are rendering to textures and graphics hardware programming in HLSL with precomputed textures for lighting and shadowing computations.
Implementation
I have implemented a working version of the algorithm for 128x128 texture patches, without the compression that is discussed in the paper. The basic preprocessing steps are meshing the height field and determining the view-dependent depths with the depth buffer, for 32x8 viewing directions and without curvature. The rendering part was implemented in a vertex and a fragment program.
Pre-processing
The VDM synthesis is performed as a pre-process, and implemented in C++. The basic steps performed are:
- Read in the height data, form triangles, and reorganize them into triangle strips for efficient rendering. The input height image file can be of any square size, in most image formats. I used 1024x1024 input images. A normal map is generated with NVidia's Normal Map Generator. Figures 2 to 4 show an example input height field image, the generated normal map, and a 3D view as rendered in the VDM generating application.
- Render views of the height field and of the reference plane, separately, to offscreen textures (pbuffers) for uniform samples of (φ, θ) that cover the entire hemisphere, using Mark Harris' RenderTexture class.
- The outputs of the pre-process are 128x128 texture patches for
different views. The values in these texture patches correspond to
the view-dependent depths, measured from the reference plane to the
height field (see figure 1). To produce these, we use the previously
generated views (stored in the offscreen textures) and use them as a
projective texture to project them onto the reference square.
We place the camera above this reference square patch (call it the
canonical view) and render the required reparameterized view (for both
heightfield and reference plane), subtract one from the other, and store
the result into an image file. Figures 5 and 6 illustrate the setup and
show the canonical view of the depths in the application,
first of the mesostructure and then of the reference quad.


Figure 5: The setup for texture projection
Figure 6: View-dependent depths, after projection onto square patch
The green arrow in figure 7 illustrates the transformation that has to be performed on the texture coordinates (specified in object space on the canonical square patch). After this transformation, the coordinates live in the space of the generated view (here called the light's clip space, in analogy to shadow mapping). Figure 8 shows the resulting down-scaled 128x128 texture patch for φ=33.75° and θ=33.75°, after subtracting the reference depths from the mesostructure depths.
Figure 7: Texture coordinates are specified in object space, and must be transformed to the light's point of view (similar to shadow mapping)
Figure 8: Texture patch for φ=33.75° and θ=33.75°
The code for the transformation of the view-dependent depths to the canonical view is therefore simple and elegant with projective textures:

    // set the texture viewing matrix, it's just the light's MVP
    glMatrixMode( GL_TEXTURE );
    glLoadIdentity();
    glTranslatef( 0.5f, 0.5f, 0.5f ); // Offset: shift [-0.5, 0.5] to [0, 1]
    glScalef( 0.5f, 0.5f, 0.5f );     // Bias: scale clip space [-1, 1] to [-0.5, 0.5]
    glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0);
    glMultMatrixf( g_eyeLookatMatrix );
    glMatrixMode( GL_MODELVIEW );

- Organize the entire collection of 32x8 patches into a big 2048x2048 lookup texture map by tiling them in a 16x16 grid. This step is performed with the ImageMagick montage program; the final result is shown in figure 9. The green pixels correspond to directions where there was no intersection with the mesostructure (stored as a 0/1 value in the green channel).
Figure 2: The input height field (1024x1024)
Figure 3: Normal map (1024x1024)
Figure 4: 3D view of the height field
Figure 9: The resulting tiled VDM texture (2048x2048)
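As a rough illustration of the (φ, θ) sampling used by the pre-process, the following C++ sketch enumerates the 32x8 directions in 11.25° steps and converts each one to a unit vector. The names `ViewSample` and `sampleHemisphere` are hypothetical and not part of the original generator. Treating θ as the angle measured from the patch normal is an assumption; the actual application may use a different convention.

```cpp
#include <cmath>
#include <vector>

// One sampled viewing direction (hypothetical helper type).
struct ViewSample {
    double phiDeg;    // azimuth around the patch normal, in [0, 360)
    double thetaDeg;  // ASSUMPTION: angle from the patch normal, in [0, 90)
    double x, y, z;   // unit direction, with z along the patch normal
};

// Enumerate 32 azimuth samples x 8 theta samples in 11.25 degree steps.
std::vector<ViewSample> sampleHemisphere() {
    const double kStepDeg = 11.25;
    const double kDegToRad = 3.14159265358979323846 / 180.0;
    std::vector<ViewSample> samples;
    samples.reserve(32 * 8);
    for (int iPhi = 0; iPhi < 32; ++iPhi) {
        for (int iTheta = 0; iTheta < 8; ++iTheta) {
            ViewSample s;
            s.phiDeg = iPhi * kStepDeg;
            s.thetaDeg = iTheta * kStepDeg;
            const double phi = s.phiDeg * kDegToRad;
            const double theta = s.thetaDeg * kDegToRad;
            // Standard spherical-to-Cartesian conversion.
            s.x = std::sin(theta) * std::cos(phi);
            s.y = std::sin(theta) * std::sin(phi);
            s.z = std::cos(theta);
            samples.push_back(s);
        }
    }
    return samples;
}
```

The 256 resulting directions match the number of tiles in the 16x16 grid of the lookup texture.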
Rendering
This was implemented in an HLSL vertex program and an HLSL fragment program. NVidia's FX Composer was used for development and debugging.
Vertex program
The vertex program is responsible for transforming the eye, light and viewing directions to tangent (aka texture) space, defined by the vertex normal, binormal and tangent vector. We also compute the angles φ and θ from the tangent-space view vector. All these parameters are then sent over to the fragment program through texture coordinates.
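The tangent-space transform and the angle computation can be sketched on the CPU as follows. This is a minimal C++ illustration, not the original HLSL vertex program: `Vec3`, `toTangentSpace` and `viewAngles` are hypothetical names, the tangent frame is assumed to be orthonormal, and measuring θ from the normal and φ from the tangent axis is an assumed convention.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

inline double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Express v in the frame (tangent t, binormal b, normal n).
// ASSUMPTION: t, b, n are unit length and mutually orthogonal.
Vec3 toTangentSpace(const Vec3& v, const Vec3& t, const Vec3& b, const Vec3& n) {
    return { dot(v, t), dot(v, b), dot(v, n) };
}

struct ViewAngles { double phi, theta; };  // radians

// Spherical angles of a non-zero tangent-space vector.
// phi is wrapped to [0, 2*pi), theta is the angle from the normal (z axis).
ViewAngles viewAngles(const Vec3& vTangent) {
    const double kTwoPi = 6.28318530717958647692;
    const double len = std::sqrt(dot(vTangent, vTangent));
    double phi = std::atan2(vTangent.y, vTangent.x);
    if (phi < 0.0) phi += kTwoPi;
    double c = vTangent.z / len;
    if (c > 1.0) c = 1.0;    // guard acos against rounding
    if (c < -1.0) c = -1.0;
    return { phi, std::acos(c) };
}
```

In the shader these values would be computed per vertex and interpolated, so the fragment program only has to turn them into patch indices.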
Fragment program
The fragment program has the following inputs:
- Vertex NDC position
- Tangent space vectors for lighting: lightvector, halfangle vector, viewvector
- The viewing direction in spherical coordinates (φ, θ)
- 4 texture maps: material color decal map, normal map, Phong
lighting lookup map and tiled VDM displacement map. The Phong lookup
map is generated procedurally in the shader, effectively encoding diffuse and
specular coefficients for the corresponding <N,L> and <N,H> dot
products.
The following HLSL .fx snippet illustrates this process:

    texture phongMap
    <
        string texturetype = "2D";
        string function = "Phong";
        int width = 256, height = 256;
    >;

    // Blinn/Phong lighting model
    // - this function is used to build the look-up table texture
    float4 Phong(float2 dots : POSITION) : COLOR
    {
        const float shininess = 90.0;
        float NdotL = dots.x;
        float NdotH = dots.y;
        float diffuse = max(NdotL, 0.0);
        float specular = pow(NdotH, shininess);
        if (NdotL <= 0) specular = 0;
        return float4(diffuse, diffuse, diffuse, specular);
    }
Here's a general overview of the per-pixel algorithm in the fragment program.
Figure 10: Flowchart of per-pixel operations for rendering
First, the appropriate texture coordinates into the tiled VDM displacement map are computed. Remember, we have to index into the correct 128x128 patch. The following algorithm is used:

    // index into composed VDM map =====
    int iTheta, iPhi;
    float tPhi = modf(IN.viewAngle.x / PI * 16, iPhi);
    float tTheta = modf(IN.viewAngle.y / PI * 16, iTheta);
    float2 vdm_xy;
    vdm_xy.x = (iPhi + IN.TexCoord0.x) / 16.0;
    if (iTheta > 15) vdm_xy.x = vdm_xy.x + 0.5;
    vdm_xy.y = (iTheta + IN.TexCoord0.y) / 16.0;

The texture sample contains (as we encoded it during the VDM generation preprocess) the view-dependent displacement dVDM in the red channel and a binary value in the green channel. By killing all fragments that have value 1 in the green channel, we can effectively determine the silhouettes. The following figures show depths for two different views, and the effect of killing fragments for silhouette determination (here still shown in full red).
Figure 11: Heights for two different views
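To make the patch addressing easier to follow, here is a small C++ sketch of the same mapping on the CPU. The layout is inferred from the montage script given later in this writeup (each grid row holds eight patches of one index followed by eight patches of that index plus 16), so it should be read as an assumption rather than as the author's code. The names `tiledVdmCoord`, `index32` and `index8` are hypothetical, and the row is wrapped explicitly here instead of relying on texture repeat addressing.

```cpp
struct Vec2 { double x, y; };

// Map a patch (index32 in 0..31, index8 in 0..7) and a coordinate (u, v)
// inside that patch, each in [0, 1), to a coordinate in the 16x16 tiled map.
// ASSUMED layout: row = index32 mod 16, and patches with index32 >= 16
// occupy the right half of the row.
Vec2 tiledVdmCoord(int index32, int index8, double u, double v) {
    const double kTiles = 16.0;
    const int row = index32 % 16;
    const int col = index8 + 8 * (index32 / 16);
    return { (col + u) / kTiles, (row + v) / kTiles };
}
```

For example, `tiledVdmCoord(16, 0, 0.0, 0.0)` lands at the start of the right half of the first row, which corresponds to the `+ 0.5` branch in the shader lookup.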
Then, we compute the appropriate texture offset for accessing the decal and normal map. The offset is the key to accessing the correct color and normal values, as they are not stored at position T (i.e. the texture coordinates on the reference plane, passed through by the application) but at position T' = T + dT, where dT = dVDM * Vxy (Vxy is the projection of the view direction onto the reference plane). This is illustrated in the following figure:
Figure 12: Texture offset calculation
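The offset formula T' = T + dVDM * Vxy can be written out as a short C++ sketch. The function name `displacedTexCoord` is hypothetical. It is assumed that the view vector is given in tangent space, points from the eye into the surface, and that dVDM is expressed in texture-coordinate units; the sign convention in the real shader may differ.

```cpp
struct Vec2 { double x, y; };
struct Vec3 { double x, y, z; };

// T' = T + dVDM * Vxy, where Vxy is the projection of the tangent-space
// view direction onto the reference plane (its x and y components).
Vec2 displacedTexCoord(const Vec2& T, double dVDM, const Vec3& viewTangent) {
    return { T.x + dVDM * viewTangent.x,
             T.y + dVDM * viewTangent.y };
}
```

When the surface is viewed straight on, Vxy is zero and T' equals T, which is why the warping only becomes visible at grazing angles.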
Once the correct offset value is obtained, we have access to the
correct color value and normal from the decal and normal texture maps respectively.
This allows us to compute the final shading value for the fragment by
looking up the Phong lighting coefficients in the Phong lookup map.
Here, we show the decal map and the normal map, sampled at the real texture coordinates
T'.
Figure 13: Decal map sampled at real texture coordinates
Figure 14: Normal map sampled at real texture coordinates
The following figures show results for ambient lighting only, diffuse lighting only and full Phong lighting (ambient + diffuse + specular). Notice how the color map is correctly warped, giving the impression of real geometry, as opposed to bump mapping.
Figure 15: Ambient lighting
Figure 16: Diffuse lighting
Figure 17: Phong Lighting
From figure 12, it is clear how we can determine self-shadowing. We do another depth lookup, but this time with the spherical coordinates of the light: (φL, θL). This gives us the distance dL to the reference plane, along the light's direction. If this distance is less than the distance from the mesostructure point P' to point P" on the reference plane, the point is in shadow, and we only apply ambient lighting. My implementation of this part has some issues: I had to tweak an offset in the comparison in order to get a plausible result. Moreover, large aliasing effects are noticeable in the following image. I presume these will be less apparent at a smaller scale, when the patch is tiled over a general surface.
Figure 18: Self-shadows added
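The shadow comparison described above can be sketched in C++ as follows. This is only an illustration under stated assumptions, not the shader from this project: the function names are hypothetical, both direction vectors are assumed to be unit length in tangent space with a non-zero z component for the light, and `bias` stands in for the hand-tuned offset mentioned in the text.

```cpp
#include <cmath>

// Distance from the mesostructure point P' to the point P" where the light
// ray through P' crosses the reference plane.
// dView:  view-dependent depth of P' (measured along the view ray)
// viewZ:  z component of the unit tangent-space view direction
// lightZ: z component of the unit tangent-space light direction (non-zero)
double distanceToPlaneAlongLight(double dView, double viewZ, double lightZ) {
    const double depthBelowPlane = dView * std::fabs(viewZ);
    return depthBelowPlane / std::fabs(lightZ);
}

// P' is shadowed when the depth stored for the light direction (dLight)
// is shorter than the distance P'P", i.e. something else is hit first.
// ASSUMPTION: bias is a small hand-tuned offset against self-shadowing.
bool inShadow(double dLight, double dView, double viewZ, double lightZ,
              double bias) {
    return dLight + bias < distanceToPlaneAlongLight(dView, viewZ, lightZ);
}
```

A fragment for which `inShadow` returns true would receive ambient lighting only.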
Results: a comparison with simple bump mapping
From the following two images, it is clear that VDM offers a considerable
increase in realism over simple bump mapping. The algorithm operates in
image space, so it was possible to implement it in a simple one-pass
fragment shader that runs entirely on the GPU and only marginally
increases the rendering cost. Unlike classic displacement mapping, the
real geometry does not have to be changed.
Note how the effect of silhouettes and
stretching and warping of the color and normal maps gives
the impression of real geometry, while we only render a flat
reference patch.
Figure 19: Bump mapping: no silhouettes or texture displacement
Figure 20: VDM gives increased realism by silhouettes and decal displacements
Extensions to the basic algorithm
Due to the limited resolution of the VDM texture patches (128x128), considerable banding and aliasing effects are observed. The following three extensions try to smooth out these artifacts.
Interpolation between (φ, θ) VDM texture patches.
Sampling only one VDM texture patch in the fragment program results in
discontinuities when the viewing angle changes. This effect appears in
the image as concentric 'bands' of color and lighting (due to normal
sampling). Doing a simple bilinear interpolation between adjacent
VDM texture patches improves the result considerably, as shown in the
following comparison. The banding has almost completely disappeared.
Figure 21: Without interpolation
Figure 22: With interpolation: no banding
The following code snippet shows how the bilinear interpolation is done:
    // view-dependent distance ======
    float3 dVDM = tex2D(vdmTex, vdm_xy);
    #ifdef INTERPOLATION
    float2 steptexX = float2(1.0/16.0, 0);
    float2 steptexY = float2(0, 1.0/16.0);
    float2 steptexXY = float2(1.0/16.0, 1.0/16.0);
    float3 dVDMPhi1 = (1-tPhi) * tex2D(vdmTex, vdm_xy)
                    + tPhi * tex2D(vdmTex, vdm_xy + steptexX);
    float3 dVDMPhi2 = (1-tPhi) * tex2D(vdmTex, vdm_xy + steptexY)
                    + tPhi * tex2D(vdmTex, vdm_xy + steptexXY);
    //dVDM.r = (1-tTheta) * dVDMPhi1.r + tTheta * dVDMPhi2.r;
    dVDM = (1-tTheta) * dVDMPhi1 + tTheta * dVDMPhi2;
    #endif
Soft (anti-aliased) silhouettes. For each pixel, the basic algorithm decides if there is mesostructure to be rasterized, depending on the value in the green VDM texture channel. This hard decision results in aliasing effects along the silhouette of the mesostructure. An alternative approach is a two-step technique:
- Determine if there is an intersection. If there is not, kill the fragment.
- If there is an intersection, weight the resulting shading value for the fragment by the weighted sum (bilinear interpolation) of adjacent samples in the green channel.
Figure 23: Aliased silhouettes
Figure 24: Anti-aliased silhouettes
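The two-step silhouette test can be sketched in C++ as below. This is a hedged illustration rather than the shader used for the figures: `bilerp` and `silhouetteWeight` are hypothetical names, the four green-channel samples are whatever neighbours the implementation interpolates between (the writeup does not say whether these are adjacent patches or adjacent texels), and since green = 1 marks "no intersection", the weight is assumed to be one minus the interpolated flag.

```cpp
// Bilinear interpolation of four samples with weights tx, ty in [0, 1].
double bilerp(double v00, double v10, double v01, double v11,
              double tx, double ty) {
    const double a = (1.0 - tx) * v00 + tx * v10;
    const double b = (1.0 - tx) * v01 + tx * v11;
    return (1.0 - ty) * a + ty * b;
}

// Returns a coverage weight in [0, 1] for the fragment's shading value.
// gNearest: green channel of the nearest sample (1 = no intersection).
// g00..g11: green channel of the four neighbouring samples.
// A return value of 0 corresponds to killing the fragment.
double silhouetteWeight(double gNearest,
                        double g00, double g10, double g01, double g11,
                        double tx, double ty) {
    if (gNearest >= 0.5) {
        return 0.0;  // step 1: no intersection, kill the fragment
    }
    // Step 2: ASSUMPTION: coverage is 1 minus the interpolated "miss" flag.
    return 1.0 - bilerp(g00, g10, g01, g11, tx, ty);
}
```

Fragments well inside the mesostructure keep a weight of 1, while those next to a miss fade out gradually instead of being cut off at a hard edge.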
Soft self-shadows. A similar approach can be used for
self-shadows. We interpolate the distances from the mesostructure to the
reference surface in the light's direction between adjacent VDM texture
patches. Presumably due to my limited implementation of shadows, the improvement is
only slightly noticeable.
Figure 23: Without interpolation
Figure 24: With interpolation
Limitations and further work
- Only the hemisphere above the reference geometry was sampled. It is not guaranteed that rays that have no intersection for a particular view (φ, θ) have no intersection with the adjacent patch if we were to tile the patches over an arbitrary surface. This is illustrated in figure 25. A discussion with one of the authors made clear that they assume the mesostructure touches the reference patch on its borders. This assumption implies that the textures can only be tiled over closed, well-behaved manifolds. The way I implemented it, it works for single patches too, but if I were to tile these, seams would appear.
- Curvature was not taken into account. The patches can therefore not be applied to curved surfaces. In order to do so, we have to change the dVDM(x,y,φ,θ) representation to a 5-dimensional function dVDM(x,y,φ,θ,c).
- The main disadvantage of this algorithm is the heavy demand on hardware memory, because all the textures have to be stored on the card. The authors solved this by performing PCA compression on the data, effectively reducing the 68 MB of textures to 4 MB. Nevertheless, if you wanted a different mesostructure for different triangles, you would have to send these huge VDM textures to the card for every triangle that is rasterized, so this technique is only really suited for repeated tilings of textures.
Figure 25: Some rays do not intersect the first patch, but are incorrectly clipped away by the algorithm.
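To get a feeling for the memory demand, the texel count of an uncompressed VDM can be estimated with a few lines of C++. The helper names are hypothetical, and the bytes-per-texel value is an assumption that depends on the texture format actually used. The sketch does not attempt to reproduce the 68 MB figure quoted from the paper, which also includes the curvature dimension.

```cpp
#include <cstdint>

// Number of texels in an uncompressed VDM data set.
std::uint64_t vdmTexelCount(std::uint64_t patchSize, std::uint64_t nPhi,
                            std::uint64_t nTheta, std::uint64_t nCurvature) {
    return patchSize * patchSize * nPhi * nTheta * nCurvature;
}

// Size in bytes for a given texel size (ASSUMPTION: e.g. 4 for RGBA8).
std::uint64_t vdmBytes(std::uint64_t patchSize, std::uint64_t nPhi,
                       std::uint64_t nTheta, std::uint64_t nCurvature,
                       std::uint64_t bytesPerTexel) {
    return vdmTexelCount(patchSize, nPhi, nTheta, nCurvature) * bytesPerTexel;
}
```

With the settings used here (128x128 patches, 32x8 directions, no curvature), `vdmTexelCount(128, 32, 8, 1)` equals 2048 * 2048, the size of the tiled lookup texture; every additional curvature sample multiplies that amount.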
Some extra screenshots
A little extra to illustrate how a computer graphics grad can feel when
implementing cool algorithms...
Code, references and acknowledgements
Everything nicely packed up in a zip file (with example data): nicoVDM.zip
Here are the links to separate parts of the code:
- VDM texture map generator
- HLSL shader code (.fx format)
- .fx composer project file.
Short HOWTO for the VDM texture map generator:
    Usage: VDMCaster
    filename: height map image file, in any format that ImageMagick
              can handle (includes .tga, .png, ...)
              The program will exit if the file cannot be read.
              If no filename was given, the program will try to load 'bump.tga'.

    Keystrokes:
    1/2   rotate theta +/- 1 degree
    3/4   rotate phi +/- 1 degree
    5/6   rotate theta +/- 11.25 degrees
    7/8   rotate phi +/- 11.25 degrees
    9     toggle heightfield rendering
    0     toggle reference plane rendering
    r     toggle 3D view / canonical projected view
    g     generate a 128x128 VDM depth texture patch of the current
          3D / canonical view. the resulting image will be saved as
          'vdm-<phi>-<theta>.png'
    G     generate a set of 128x128 VDM depth texture patches.
          theta and phi are uniformly sampled between (0,90) and (0,360)
          degrees respectively, in steps of 11.25 degrees.
          the output is a set of images in the current directory
          in the following format: 'vdm-<phi>-<theta>.png'
To generate a tiled VDM depth map, one must assemble the entire set of 128x128 patches into a large image file. This can easily be achieved with the following script (it assumes the availability of the ImageMagick montage tool).

    #!/bin/bash
    set -v
    fnames=""
    starttheta=0
    endtheta=7
    secondtheta=0
    while [ $endtheta -lt 16 ]; do
        for theta in `seq $starttheta $endtheta`; do
            for phi in `seq 0 7`; do
                fnames="$fnames vdm-$theta-$phi.png"
            done
            let theta=theta+16
            for phi in `seq 0 7`; do
                fnames="$fnames vdm-$theta-$phi.png"
            done
        done
        let starttheta=starttheta+8
        let endtheta=endtheta+8
    done
    montage +frame +shadow +label -tile 16x16 \
        -geometry 128x128+0+0 $fnames VDM.png

This script gives you a VDM.png file that you can then load into the shader.
To run the HLSL shaders, the easiest way is to load the .fxcomposer project file in FX Composer. There you can assign the decal and normal maps you want to use, and the generated VDM depth map.
The paper is available on ACM's website (requires registration):
“View-dependent displacement mapping”, SIGGRAPH’03, by Lifeng Wang, Xi Wang, Xin Tong, Stephen Lin, Shimin Hu, Baining Guo, Heung-Yeung Shum
Thanks to:
- Brandon Lloyd: for pointing me to the paper in the first place.
- Leonard McMillan: for the useful discussion and help with understanding the paper.
- Xi Wang: as one of the implementing authors, for giving me pointers on some of the specifics of the assumptions and implementation details.
- Jingdan Zhang: for motivation and sitting next to me in the GLab ;-)