1.9. Coordinate Transforms

The purpose of the OpenGL graphics processing pipeline is to convert three-dimensional

descriptions of objects into a two-dimensional image that can be displayed. In many ways, this

process is similar to using a camera to convert a real-world scene into a two-dimensional print.

To accomplish the transformation from three dimensions to two, OpenGL defines several

coordinate spaces and transformations between those spaces. Each coordinate space has some

properties that make it useful for some part of the rendering process. The transformations

defined by OpenGL afford applications a great deal of flexibility in defining the 3D-to-2D

mapping. For success at writing shaders in the OpenGL Shading Language, understanding the

various transformations and coordinate spaces used by OpenGL is essential.

In computer graphics, MODELING is the process of defining a numerical representation of an

object that is to be rendered. For OpenGL, this usually means creating a polygonal

representation of an object so that it can be drawn with the polygon primitives built into

OpenGL. At a minimum, a polygonal representation of an object needs to include the

coordinates of each vertex in each polygon and the connectivity information that defines the

polygons. Additional data might include the color of each vertex, the surface normal at each

vertex, one or more texture coordinates at each vertex, and so on.

In the past, modeling an object was a painstaking effort, requiring precise physical

measurement and data entry. (This is one of the reasons the Utah teapot, modeled by Martin

Newell in 1975, has been used in so many graphics images. It is an interesting object, and the

numerical data is freely available. Several of the shaders presented in this book are illustrated

with this object; see, for example, Color Plate 24.) More recently, a variety of modeling tools

have become available, both hardware and software, and this has made it relatively easy to

create numerical representations of three-dimensional objects that are to be rendered.

Three-dimensional object attributes, such as vertex positions and surface normals, are defined

in OBJECT SPACE. This coordinate space is one that is convenient for describing the object that is

being modeled. Coordinates are specified in units that are convenient to that particular object.

Microscopic objects may be modeled in units of angstroms, everyday objects may be modeled

in inches or centimeters, buildings might be modeled in feet or meters, planets could be

modeled in miles or kilometers, and galaxies might be modeled in light years or parsecs. The

origin of this coordinate system (i.e., the point (0, 0, 0)) is also something that is convenient

for the object being modeled. For some objects, the origin might be placed at one corner of the

object's three-dimensional bounding box. For other objects, it might be more convenient to

define the origin at the centroid of the object. Because of its intimate connection with the task

of modeling, this coordinate space is also often referred to as MODEL SPACE or the MODELING

COORDINATE SYSTEM. Coordinates are referred to equivalently as object coordinates or modeling

coordinates.

To compose a scene that contains a variety of three-dimensional objects, each of which might

be defined in its own unique object space, we need a common coordinate system. This common

coordinate system is called WORLD SPACE or the WORLD COORDINATE SYSTEM, and it provides a

common frame of reference for all objects in the scene. Once all the objects in the scene are

transformed into a single coordinate system, the spatial relationships between all the objects,

the light sources, and the viewer are known. The units of this coordinate system are chosen in a

way that is convenient for describing a scene. You might choose feet or meters if you are

composing a scene that represents one of the rooms in your house, but you might choose city

blocks as your units if you are composing a scene that represents a city skyline. The choice for

the origin of this coordinate system is also arbitrary. You might define a three-dimensional

bounding box for your scene and set the origin at the corner of the bounding box such that all

of the other coordinates of the bounding box have positive values. Or you may want to pick an

important point in your scene (the corner of a building, the location of a key character, etc.)

and make that the origin.

After world space is defined, all the objects in the scene must be transformed from their own

unique object coordinates into world coordinates. The transformation that takes coordinates

from object space to world space is called the MODELING TRANSFORMATION. If the object's modeling

coordinates are in feet but the world coordinate system is defined in terms of inches, the object

coordinates must be scaled by a factor of 12 to produce world coordinates. If the object is

defined to be facing forward but in the scene it needs to be facing backwards, a rotation must

be applied to the object coordinates. A translation is also typically required to position the

object at its desired location in world coordinates. All of these individual transformations can be

put together into a single matrix, the MODEL TRANSFORMATION MATRIX, that represents the

transformation from object coordinates to world coordinates.
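
The composition described above can be sketched in plain C, without calling OpenGL at all. This is a minimal illustration, not OpenGL's own implementation: the Mat4 type and every function name below are hypothetical helpers, and the rotation is limited to a half turn about the y axis so that it matches the "facing backwards" example. It scales feet to inches, turns the object around, and then translates it to its place in the world.

```c
#include <string.h>

/* 4x4 matrix in column-major order, the layout glLoadMatrix and glMultMatrix
   expect: element (row, col) is stored at m[col * 4 + row]. */
typedef struct { float m[16]; } Mat4;

Mat4 mat4_identity(void) {
    Mat4 r;
    memset(r.m, 0, sizeof r.m);
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
    return r;
}

/* Returns a * b. Applied to a column vector, b acts first and a second. */
Mat4 mat4_mul(Mat4 a, Mat4 b) {
    Mat4 r;
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a.m[k * 4 + row] * b.m[col * 4 + k];
            r.m[col * 4 + row] = sum;
        }
    }
    return r;
}

/* Uniform scale, e.g. 12 to convert feet to inches. */
Mat4 mat4_scale(float s) {
    Mat4 r = mat4_identity();
    r.m[0] = r.m[5] = r.m[10] = s;
    return r;
}

/* Rotation by 180 degrees about the y axis: forward becomes backward. */
Mat4 mat4_half_turn_y(void) {
    Mat4 r = mat4_identity();
    r.m[0] = -1.0f;
    r.m[10] = -1.0f;
    return r;
}

Mat4 mat4_translate(float x, float y, float z) {
    Mat4 r = mat4_identity();
    r.m[12] = x;
    r.m[13] = y;
    r.m[14] = z;
    return r;
}

/* out = a * in, for a homogeneous column vector (x, y, z, w). */
void mat4_transform(Mat4 a, const float in[4], float out[4]) {
    for (int row = 0; row < 4; ++row) {
        out[row] = 0.0f;
        for (int k = 0; k < 4; ++k)
            out[row] += a.m[k * 4 + row] * in[k];
    }
}

/* Object space (feet) to world space (inches): scale, rotate, translate. */
Mat4 make_model_matrix(float tx, float ty, float tz) {
    Mat4 scale = mat4_scale(12.0f);
    Mat4 rotate = mat4_half_turn_y();
    Mat4 translate = mat4_translate(tx, ty, tz);
    return mat4_mul(translate, mat4_mul(rotate, scale));
}
```

With make_model_matrix(100, 0, 50), the object-space point (1, 0, 0) is scaled to (12, 0, 0), turned to (-12, 0, 0), and translated to the world-space point (88, 0, 50). The order of the factors matters: the rightmost matrix is applied to the vertex first.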

After the scene has been composed, the viewing parameters must be specified. One aspect of

the view is the vantage point (i.e., the eye or camera position) from which the scene will be

viewed. Viewing parameters also include the focus point (also called the lookat point or the

direction in which the camera is pointed) and the up direction (e.g., the camera may be held

sideways or upside down).

The viewing parameters collectively define the VIEWING TRANSFORMATION, and they can be

combined into a matrix called the VIEWING MATRIX. A coordinate multiplied by this matrix is

transformed from world space into EYE SPACE, also called the EYE COORDINATE SYSTEM. By definition,

the origin of this coordinate system is at the viewing (or eye) position. Coordinates in this space

are called eye coordinates. The spatial relationships in the scene remain unchanged, but

orienting the coordinate system in this way makes it easy to determine the distance from the

viewpoint to various objects in the scene.

Although some 3D graphics APIs allow applications to separately specify the modeling matrix

and the viewing matrix, OpenGL combines them into a single matrix called the MODELVIEW MATRIX.

This matrix is defined to transform coordinates from object space into eye space (see Figure

1.2).

Figure 1.2. Coordinate spaces and transforms in OpenGL

You can manipulate a number of matrices in OpenGL. Call the glMatrixMode function to select the

modelview matrix or one of OpenGL's other matrices. Load the current matrix with the identity

matrix by calling glLoadIdentity, or replace it with an arbitrary matrix by calling glLoadMatrix. Be

sure you know what you're doing if you specify an arbitrary matrix; the transformation might

give you a completely incomprehensible image! You can also multiply the current matrix by an

arbitrary matrix by calling glMultMatrix.

Applications often start by setting the current modelview matrix to the view matrix and then

add on the necessary modeling matrices. You can set the modelview matrix to a reasonable

viewing transformation with the gluLookAt function. (This function is not part of OpenGL proper

but is part of the OpenGL utility library that is provided with every OpenGL implementation.)

OpenGL actually supports a stack of modelview matrices. Calling glPushMatrix duplicates the topmost

matrix and pushes the copy onto the top of the stack. When this is done, you can

concatenate other transformations to the topmost matrix with the functions glScale, glTranslate,

and glRotate to define the modeling transformation for a particular three-dimensional object in the

scene. Then, pop this topmost matrix off the stack with glPopMatrix to get back to the original

view transformation matrix. Repeat the process for each object in the scene.
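
The push, concatenate, and pop pattern can be imitated in software to see what happens to the matrices. The sketch below is a toy stand-in for the modelview stack, not OpenGL code: MatrixStack and the stack_* functions are hypothetical names, the stack depth is arbitrary, and only translation and scaling are shown. Like glTranslate and glScale, each call post-multiplies the topmost matrix.

```c
#include <assert.h>
#include <string.h>

#define STACK_DEPTH 32

typedef struct {
    float m[STACK_DEPTH][16];   /* column-major matrices */
    int top;                    /* index of the current (topmost) matrix */
} MatrixStack;

/* Starts with a single identity matrix, as after glLoadIdentity. */
void stack_init(MatrixStack *s) {
    s->top = 0;
    memset(s->m[0], 0, sizeof s->m[0]);
    s->m[0][0] = s->m[0][5] = s->m[0][10] = s->m[0][15] = 1.0f;
}

/* Duplicates the topmost matrix, in the manner of glPushMatrix. */
void stack_push(MatrixStack *s) {
    assert(s->top + 1 < STACK_DEPTH);
    memcpy(s->m[s->top + 1], s->m[s->top], sizeof s->m[0]);
    s->top++;
}

/* Discards the topmost matrix, in the manner of glPopMatrix. */
void stack_pop(MatrixStack *s) {
    assert(s->top > 0);
    s->top--;
}

/* top = top * T(x, y, z): only the last column changes. */
void stack_translate(MatrixStack *s, float x, float y, float z) {
    float *m = s->m[s->top];
    for (int row = 0; row < 4; ++row)
        m[12 + row] += m[row] * x + m[4 + row] * y + m[8 + row] * z;
}

/* top = top * S(sx, sy, sz): each of the first three columns is scaled. */
void stack_scale(MatrixStack *s, float sx, float sy, float sz) {
    float *m = s->m[s->top];
    for (int row = 0; row < 4; ++row) {
        m[row] *= sx;
        m[4 + row] *= sy;
        m[8 + row] *= sz;
    }
}
```

A typical sequence would call stack_init, apply the viewing transformation, then for each object call stack_push, apply that object's modeling transformations, draw, and call stack_pop. After the pop, the topmost matrix is the unmodified viewing matrix again, ready for the next object.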

At the time light source positions are specified with the glLight function, they are transformed by

the current modelview matrix. Therefore, light positions are stored within OpenGL as eye

coordinates. You must set up the modelview matrix to perform the proper transformation

before light positions are specified or you won't get the lighting effects that you expect. The

lighting calculations that occur in OpenGL are defined to happen on a per-vertex basis in the

eye coordinate system. For the necessary reflection computations, light positions and surface

normals must be in the same coordinate system. OpenGL implementations often choose to do

lighting calculations in eye space; therefore, the incoming surface normals have to be

transformed into eye space as well. You accomplish this by transforming surface normals by the

inverse transpose of the upper leftmost 3 x 3 matrix taken from the modelview matrix. At that

point, you can apply the per-vertex lighting formulas defined by OpenGL to determine the lit

color at each vertex.
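
To make the normal transformation concrete, here is a small C sketch that derives the inverse transpose of the upper-left 3 x 3 portion of a column-major modelview matrix. The function names are hypothetical. It relies on the identity that the inverse transpose of a matrix equals its cofactor matrix divided by its determinant. Transformed normals may no longer be unit length if the modelview matrix contains scaling, so they may need to be renormalized before lighting.

```c
/* Computes the 3x3 normal matrix (column-major) from a 4x4 column-major
   modelview matrix. Returns 0 if the upper-left 3x3 block is singular. */
int make_normal_matrix(const float mv[16], float nm[9]) {
    float a[3][3];   /* upper-left block, indexed a[row][col] */
    float c[3][3];   /* cofactors of a */

    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            a[row][col] = mv[col * 4 + row];

    c[0][0] = a[1][1] * a[2][2] - a[1][2] * a[2][1];
    c[0][1] = a[1][2] * a[2][0] - a[1][0] * a[2][2];
    c[0][2] = a[1][0] * a[2][1] - a[1][1] * a[2][0];
    c[1][0] = a[0][2] * a[2][1] - a[0][1] * a[2][2];
    c[1][1] = a[0][0] * a[2][2] - a[0][2] * a[2][0];
    c[1][2] = a[0][1] * a[2][0] - a[0][0] * a[2][1];
    c[2][0] = a[0][1] * a[1][2] - a[0][2] * a[1][1];
    c[2][1] = a[0][2] * a[1][0] - a[0][0] * a[1][2];
    c[2][2] = a[0][0] * a[1][1] - a[0][1] * a[1][0];

    float det = a[0][0] * c[0][0] + a[0][1] * c[0][1] + a[0][2] * c[0][2];
    if (det == 0.0f)
        return 0;

    /* inverse transpose = cofactor matrix / determinant */
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            nm[col * 3 + row] = c[row][col] / det;
    return 1;
}

/* out = nm * in, for a 3-component normal. */
void transform_normal(const float nm[9], const float in[3], float out[3]) {
    for (int row = 0; row < 3; ++row) {
        out[row] = 0.0f;
        for (int k = 0; k < 3; ++k)
            out[row] += nm[k * 3 + row] * in[k];
    }
}
```

The reason for not using the modelview matrix directly shows up with non-uniform scaling. If the modelview matrix scales x by 2, the tangent (1, -1, 0) becomes (2, -1, 0). The normal (1, 1, 0) transformed by the normal matrix becomes (0.5, 1, 0), which is still perpendicular to the transformed tangent; transforming it by the modelview matrix itself would give (2, 1, 0), which is not.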

After coordinates have been transformed into eye space, the next step is to define a viewing

volume. This is the region of the three-dimensional scene that is visible in the final image. The

transformation that takes the objects in the viewing volume into CLIP SPACE (also known as the

CLIPPING COORDINATE SYSTEM, a coordinate space that is suitable for clipping) is called the PROJECTION

TRANSFORMATION. In OpenGL, you establish the projection transformation by calling glMatrixMode to

select the projection matrix and then setting this matrix appropriately. Parameters that may go

into creating an appropriate projection matrix are the field of view (how much of the scene is

visible), the aspect ratio (the horizontal field of view may differ from the vertical field of view),

and near and far clipping planes to eliminate things that are too far away or too close (for

perspective projections, weirdness will occur if you try to draw things that are at or behind the

viewing position). Three utility functions set the projection matrix: glOrtho, glFrustum, and

gluPerspective. The difference between these functions is that glOrtho defines a parallel projection

(i.e., parallel lines in the scene are projected to parallel lines in the final two-dimensional

image), whereas glFrustum and gluPerspective define perspective projections (i.e., parallel lines in

the scene are foreshortened to produce a vanishing point in the image, such as railroad tracks

converging to a point in the distance).
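
The difference between the two kinds of projection is easiest to see in the matrices themselves. The C sketch below builds column-major matrices from the formulas given in the glFrustum and glOrtho reference pages, taking the same six parameters. The function names are hypothetical, and z_near and z_far are distances from the eye to the clipping planes (positive for the perspective case).

```c
#include <assert.h>
#include <string.h>

/* Perspective projection with the same parameters as glFrustum. */
void make_frustum(float left, float right, float bottom, float top,
                  float z_near, float z_far, float m[16]) {
    assert(right != left && top != bottom && z_far != z_near);
    memset(m, 0, 16 * sizeof(float));
    m[0]  = 2.0f * z_near / (right - left);
    m[5]  = 2.0f * z_near / (top - bottom);
    m[8]  = (right + left) / (right - left);
    m[9]  = (top + bottom) / (top - bottom);
    m[10] = -(z_far + z_near) / (z_far - z_near);
    m[11] = -1.0f;   /* puts -z_eye into w_clip, which causes foreshortening */
    m[14] = -2.0f * z_far * z_near / (z_far - z_near);
}

/* Parallel projection with the same parameters as glOrtho. */
void make_ortho(float left, float right, float bottom, float top,
                float z_near, float z_far, float m[16]) {
    assert(right != left && top != bottom && z_far != z_near);
    memset(m, 0, 16 * sizeof(float));
    m[0]  = 2.0f / (right - left);
    m[5]  = 2.0f / (top - bottom);
    m[10] = -2.0f / (z_far - z_near);
    m[12] = -(right + left) / (right - left);
    m[13] = -(top + bottom) / (top - bottom);
    m[14] = -(z_far + z_near) / (z_far - z_near);
    m[15] = 1.0f;    /* w_clip stays 1, so there is no foreshortening */
}

/* out = m * in, for a homogeneous column vector (x, y, z, w). */
void transform_point(const float m[16], const float in[4], float out[4]) {
    for (int row = 0; row < 4; ++row) {
        out[row] = 0.0f;
        for (int k = 0; k < 4; ++k)
            out[row] += m[k * 4 + row] * in[k];
    }
}
```

With make_frustum(-1, 1, -1, 1, 1, 3, m), an eye-space point on the near plane, (0, 0, -1), produces clip coordinates with z = -1 and w = 1, while a point on the far plane, (0, 0, -3), produces z = 3 and w = 3. The w component grows with distance from the eye, and the later division by w is what makes distant objects appear smaller.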

FRUSTUM CLIPPING is the process of eliminating any graphics primitives that lie outside an axis-aligned

cube in clip space. This cube is defined such that the x, y, and z components of the clip

space coordinate are less than or equal to the w component for the coordinate, and greater

than or equal to -w (i.e., -w ≤ x ≤ w, -w ≤ y ≤ w, and -w ≤ z ≤ w). Graphics primitives (or

portions thereof) that lie outside this cube are discarded. Frustum clipping is always performed

on all incoming primitives in OpenGL. USER CLIPPING, on the other hand, is a feature that can be

enabled or disabled by the application. Applications can call glClipPlane to specify one or more

clipping planes that further restrict the size of the viewing volume, and each clipping plane can

be individually enabled with glEnable. At the time user clipping planes are specified, OpenGL

transforms them into eye space using the inverse of the current modelview matrix. Each plane

specified in this manner defines a half-space, and only the portions of primitives that lie within

the intersection of the view volume and all of the enabled half-spaces defined by user clipping

planes are drawn.
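
The two tests can be written down for a single point, as in the C sketch below. The function names are hypothetical, and the code only classifies points; actual clipping also has to cut primitives that straddle a boundary, which is not shown. The user clipping plane is assumed to be already expressed in eye space as coefficients (A, B, C, D), with the kept half-space being the side where A*x + B*y + C*z + D*w is greater than or equal to zero.

```c
/* Returns 1 if a clip-space point lies inside the viewing volume,
   that is, -w <= x <= w, -w <= y <= w, and -w <= z <= w. */
int inside_view_volume(const float clip[4]) {
    float w = clip[3];
    for (int i = 0; i < 3; ++i) {
        if (clip[i] < -w || clip[i] > w)
            return 0;
    }
    return 1;
}

/* Returns 1 if an eye-space point lies in the half-space kept by a user
   clipping plane given by eye-space coefficients (A, B, C, D). */
int inside_user_plane(const float plane[4], const float eye[4]) {
    float d = plane[0] * eye[0] + plane[1] * eye[1]
            + plane[2] * eye[2] + plane[3] * eye[3];
    return d >= 0.0f;
}
```

Note that the limits depend on w: the clip-space point (2, 0, 0, 3) is inside the viewing volume even though its x component is greater than 1, whereas (1.5, 0, 0, 1) is outside. A point is drawn only if it passes the frustum test and the plane test for every enabled user clipping plane.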

The next step in the transformation of vertex positions is the perspective divide. This operation

divides each component of the clip space coordinate by the homogeneous coordinate w. The

resulting x, y, and z components lie in the range [-1, 1], and the resulting w coordinate is always 1,

so it is no longer needed. In other words, all the visible graphics primitives are transformed into

a cubic region between the point (-1, -1, -1) and the point (1, 1, 1). This is the NORMALIZED DEVICE

COORDINATE SPACE, which is an intermediate space that allows the viewing area to be properly

mapped onto a viewport of arbitrary size and depth.
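
The perspective divide is a one-line operation per component. A minimal C sketch, using a hypothetical function name, might look like this; it reports failure for w equal to zero, a case that frustum clipping normally removes before this stage.

```c
/* Divides a clip-space coordinate (x, y, z, w) by w to produce normalized
   device coordinates. Returns 0 without writing ndc when w is zero. */
int perspective_divide(const float clip[4], float ndc[3]) {
    if (clip[3] == 0.0f)
        return 0;
    ndc[0] = clip[0] / clip[3];
    ndc[1] = clip[1] / clip[3];
    ndc[2] = clip[2] / clip[3];
    return 1;
}
```

For example, the clip-space coordinate (2, -1, 3, 4) satisfies the clipping limits and becomes the normalized device coordinate (0.5, -0.25, 0.75).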

Pixels within a window on the display device aren't referred to with floating-point coordinates

from -1 to 1; they are usually referred to with coordinates defined in the WINDOW COORDINATE

SYSTEM, where x values range from 0 to the width of the window minus 1, and y values range

from 0 to the height of the window minus 1. Therefore, one more transformation step is

required. The VIEWPORT TRANSFORMATION specifies the mapping from normalized device coordinates

into window coordinates. You specify this mapping by calling the OpenGL functions glViewport,

which specifies the mapping of the x and y coordinates, and glDepthRange, which specifies the

mapping of the z coordinate. Graphics primitives are rasterized in the window coordinate

system.
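
The viewport transformation is a scale and an offset applied to each component. The C sketch below writes it out using the values an application would pass to glViewport and glDepthRange. The Viewport struct and the function name are hypothetical, introduced only for this illustration.

```c
typedef struct {
    int x, y, width, height;   /* as passed to glViewport */
    float z_near, z_far;       /* as passed to glDepthRange */
} Viewport;

/* Maps normalized device coordinates in [-1, 1] to window coordinates. */
void ndc_to_window(const Viewport *vp, const float ndc[3], float win[3]) {
    win[0] = (ndc[0] + 1.0f) * 0.5f * (float)vp->width + (float)vp->x;
    win[1] = (ndc[1] + 1.0f) * 0.5f * (float)vp->height + (float)vp->y;
    win[2] = (vp->z_far - vp->z_near) * 0.5f * ndc[2]
           + (vp->z_far + vp->z_near) * 0.5f;
}
```

For a 640 x 480 viewport at (0, 0) with a depth range of 0 to 1, the center of normalized device space, (0, 0, 0), maps to the window coordinate (320, 240, 0.5). The results are continuous values: the extreme corner (1, 1, 1) maps to (640, 480, 1), the outer edge of the last pixel rather than a pixel index.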
