Rendering (computer graphics) explained

Rendering or image synthesis is the process of generating a photorealistic or non-photorealistic image from a 2D or 3D model by means of a computer program. The resulting image is referred to as a rendering. Multiple models can be defined in a scene file containing objects in a strictly defined language or data structure. The scene file contains geometry, viewpoint, textures, lighting, and shading information describing the virtual scene. The data contained in the scene file is then passed to a rendering program to be processed and output to a digital image or raster graphics image file. The term "rendering" is analogous to the concept of an artist's impression of a scene. The term "rendering" is also used to describe the process of calculating effects in a video editing program to produce the final video output.

A software application or component that performs rendering is called a rendering engine,[1] render engine, rendering system, graphics engine, or simply a renderer.

Rendering is one of the major sub-topics of 3D computer graphics, and in practice it is always connected to the others. It is the last major step in the graphics pipeline, giving models and animation their final appearance. With the increasing sophistication of computer graphics since the 1970s, it has become a more distinct subject.

Rendering has uses in architecture, video games, simulators, movie and TV visual effects, and design visualization, each employing a different balance of features and techniques. A wide variety of renderers are available for use. Some are integrated into larger modeling and animation packages, some are stand-alone, and some are free open-source projects. On the inside, a renderer is a carefully engineered program based on multiple disciplines, including light physics, visual perception, mathematics, and software development.

Though the technical details of rendering methods vary, the general challenges to overcome in producing a 2D image on a screen from a 3D representation stored in a scene file are handled by the graphics pipeline in a rendering device such as a GPU. A GPU is a purpose-built device that assists a CPU in performing complex rendering calculations. If a scene is to look relatively realistic and predictable under virtual lighting, the rendering software must solve the rendering equation. The rendering equation does not account for all lighting phenomena, but instead acts as a general lighting model for computer-generated imagery.
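For reference, the rendering equation is commonly written in the form below. The notation is not taken from the text above and follows a widespread convention: L_o is the radiance leaving point x in direction ω_o, L_e is the emitted radiance, f_r is the surface's reflectance function (BRDF), L_i is the incoming radiance from direction ω_i, n is the surface normal, and Ω is the hemisphere of directions above the surface.

```latex
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\, (\omega_i \cdot \mathbf{n})\, d\omega_i
```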

In the case of 3D graphics, scenes can be pre-rendered or generated in real time. Pre-rendering is a slow, computationally intensive process that is typically used for movie creation, where scenes can be generated ahead of time, while real-time rendering is often done for 3D video games and other applications that must dynamically create scenes. 3D hardware accelerators can improve real-time rendering performance.

Features

A rendered image can be understood in terms of a number of visible features. Rendering research and development has been largely motivated by finding ways to simulate these efficiently. Some relate directly to particular algorithms and techniques, while others are produced together.

Inputs

Before a 3D scene or 2D image can be rendered, it must be described in a way that the rendering software can understand. Historically, inputs for both 2D and 3D rendering were usually text files, which are easier than binary files for humans to edit and debug. For 3D graphics, text formats have largely been supplanted by more efficient binary formats, and by APIs which allow interactive applications to communicate directly with a rendering component without generating a file on disk (although a scene description is usually still created in memory prior to rendering).

Traditional rendering algorithms use geometric descriptions of 3D scenes or 2D images. Applications and algorithms that render visualizations of data scanned from the real world, or scientific simulations, may require different types of input data.

The PostScript format (which is often credited with the rise of desktop publishing) provides a standardized, interoperable way to describe 2D graphics and page layout. The Scalable Vector Graphics (SVG) format is also text-based, and the PDF format uses the PostScript language internally. In contrast, although many 3D graphics file formats have been standardized (including text-based formats such as VRML and X3D), different rendering applications typically use formats tailored to their needs, and this has led to a proliferation of proprietary and open formats, with binary files being more common.

2D vector graphics

A vector graphics image description may include:

3D geometry

A geometric scene description may include:

Many file formats exist for storing individual 3D objects or "models". These can be imported into a larger scene, or loaded on-demand by rendering software or games. A realistic scene may require hundreds of items like household objects, vehicles, and trees, and 3D artists often utilize large libraries of models. In game production, these models (along with other data such as textures, audio files, and animations) are referred to as "assets".

Volumetric data

Scientific and engineering visualization often requires rendering volumetric data generated by 3D scans or simulations. Perhaps the most common source of such data is medical CT and MRI scans, which need to be rendered for diagnosis. Volumetric data can be extremely large, and requires specialized data formats to store it efficiently, particularly if the volume is sparse (with empty regions that do not contain data).

Before rendering, level sets for volumetric data can be extracted and converted into a mesh of triangles, e.g. by using the marching cubes algorithm. Algorithms have also been developed that work directly with volumetric data, for example to render realistic depictions of the way light is scattered and absorbed by clouds and smoke, and this type of volumetric rendering is used extensively in visual effects for movies. When rendering lower-resolution volumetric data without interpolation, the individual cubes or "voxels" may be visible, an effect sometimes used deliberately for game graphics.

Photogrammetry and scanning

Photographs of real world objects can be incorporated into a rendered scene by using them as textures for 3D objects. Photos of a scene can also be stitched together to create panoramic images or environment maps, which allow the scene to be rendered very efficiently but only from a single viewpoint. Scanning of real objects and scenes using structured light or lidar produces point clouds consisting of the coordinates of millions of individual points in space, sometimes along with color information. These point clouds may either be rendered directly or converted into meshes before rendering. (Note: "point cloud" sometimes also refers to a minimalist rendering style that can be used for any 3D geometry, similar to wireframe rendering.)

Neural approximations and light fields

A more recent, experimental approach is the description of scenes using radiance fields, which define the color, intensity, and direction of incoming light at each point in space. (This is conceptually similar to, but not identical to, the light field recorded by a hologram.) For any useful resolution, the amount of data in a radiance field is so large that it is impractical to represent it directly as volumetric data, and an approximation function must be found. Neural networks are typically used to generate and evaluate these approximations, sometimes using video frames, or a collection of photographs of a scene taken at different angles, as "training data".

Algorithms related to neural networks have recently been used to find approximations of a scene as 3D Gaussians. The resulting representation is similar to a point cloud, except that it uses fuzzy, partially-transparent blobs of varying dimensions and orientations instead of points. As with neural radiance fields, these approximations are often generated from photographs or video frames.

Outputs

The output of rendering may be displayed immediately on the screen (many times a second, in the case of real-time rendering such as games) or saved in a raster graphics file format such as JPEG or PNG. High-end rendering applications commonly use the OpenEXR file format, which can represent finer gradations of colors and high dynamic range lighting, allowing tone mapping or other adjustments to be applied afterwards without loss of quality.

Quickly rendered animations can be saved directly as video files, but for high-quality rendering, individual frames (which may be rendered by different computers in a cluster or render farm and may take hours or even days to render) are output as separate files and combined later into a video clip.

The output of a renderer sometimes includes more than just RGB color values. For example, the spectrum can be sampled using multiple wavelengths of light, or additional information such as depth (distance from camera) or the material of each point in the image can be included (this data can be used during compositing or when generating texture maps for real-time rendering, or used to assist in removing noise from a path-traced image). Transparency information can be included, allowing rendered foreground objects to be composited with photographs or video. It is also sometimes useful to store the contributions of different lights, or of specular and diffuse lighting, as separate channels, so lighting can be adjusted after rendering. The OpenEXR format allows storing many channels of data in a single file.

Techniques

Choosing how to render a 3D scene usually involves trade-offs between speed, memory usage, and realism (although realism is not always desired). The techniques developed over the years follow a loose progression, with more advanced methods becoming practical as computing power and memory capacity increased. Multiple techniques may be used for a single final image.

An important distinction is between image order algorithms, which iterate over pixels of the image plane, and object order algorithms, which iterate over objects in the scene. For simple scenes, object order is usually more efficient, as there are fewer objects than pixels.

2D vector graphics
  • The vector displays of the 1960s-1970s used deflection of an electron beam to draw line segments directly on the screen. Nowadays, vector graphics are rendered by rasterization algorithms that also support filled shapes. In principle, any 2D vector graphics renderer can be used to render 3D objects by first projecting them onto a 2D image plane.
3D rasterization
  • Adapts 2D rasterization algorithms so they can be used more efficiently for 3D rendering, handling hidden surface removal via scanline or z-buffer techniques. Different realistic or stylized effects can be obtained by coloring the pixels covered by the objects in different ways. Surfaces are typically divided into meshes of triangles before being rasterized. Rasterization is usually synonymous with "object order" rendering (as described above).
Ray casting
  • Uses geometric formulas to compute the first object that a ray intersects. It can be used to implement "image order" rendering by casting a ray for each pixel, and finding a corresponding point in the scene. Ray casting is a fundamental operation used for both graphical and non-graphical purposes, e.g. determining whether a point is in shadow, or checking what an enemy can see in a game.
Ray tracing
  • Simulates the bouncing paths of light caused by specular reflection and refraction, requiring a varying number of ray casting operations for each path. Advanced forms use Monte Carlo techniques to render effects such as area lights, depth of field, blurry reflections, and soft shadows, but computing global illumination is usually in the domain of path tracing.
Radiosity
  • A finite element analysis approach that breaks surfaces in the scene into pieces, and estimates the amount of light that each piece receives from light sources, or indirectly from other surfaces. Once the irradiance of each surface is known, the scene can be rendered using rasterization or ray tracing.
Path tracing
  • Uses Monte Carlo integration with a simplified form of ray tracing, computing the average brightness of a sample of the possible paths that a photon could take when traveling from a light source to the camera (for some images, thousands of paths need to be sampled per pixel). It was introduced as a statistically unbiased way to solve the rendering equation, giving ray tracing a rigorous mathematical foundation.

    Each of the above approaches has many variations, and there is some overlap. Path tracing may be considered either a distinct technique or a particular type of ray tracing. Note that the usage of terminology related to ray tracing and path tracing has changed significantly over time.

    Ray marching is a family of algorithms, used by ray casting, for finding intersections between a ray and a complex object, such as a volumetric dataset or a surface defined by a signed distance function. It is not, by itself, a rendering method, but it can be incorporated into ray tracing and path tracing, and is used by rasterization to implement screen-space reflection and other effects.

    A technique called photon mapping traces paths of photons from a light source to an object, accumulating data about irradiance which is then used during conventional ray tracing or path tracing. Rendering a scene using only rays traced from the light source to the camera is impractical, even though it corresponds more closely to reality, because a huge number of photons would need to be simulated, only a tiny fraction of which actually hit the camera.

    Some authors call conventional ray tracing "backward" ray tracing because it traces the paths of photons backwards from the camera to the light source, and call following paths from the light source (as in photon mapping) "forward" ray tracing. However, the meaning of these terms is sometimes reversed. Tracing rays starting at the light source can also be called particle tracing or light tracing, which avoids this ambiguity.

    Real-time rendering, including video game graphics, typically uses rasterization, but increasingly combines it with ray tracing and path tracing. To enable realistic global illumination, real-time rendering often relies on pre-rendered ("baked") lighting for stationary objects. For moving objects, it may use a technique called light probes, in which lighting is recorded by rendering omnidirectional views of the scene at chosen points in space (often points on a grid to allow easier interpolation). These are similar to environment maps, but typically use a very low resolution or an approximation such as spherical harmonics. (Note: Blender uses the term 'light probes' for a more general class of pre-recorded lighting data, including reflection maps.)

    Rasterization

    See main article: Rasterization.

    The term rasterization (in a broad sense) encompasses many techniques used for 2D rendering and real-time 3D rendering. 3D animated films were rendered by rasterization before ray tracing and path tracing became practical.

    A renderer combines rasterization with geometry processing (which is not specific to rasterization) and pixel processing which computes the RGB color values to be placed in the framebuffer for display.

    The main tasks of rasterization (including pixel processing) are:

    3D rasterization is typically part of a graphics pipeline in which an application provides lists of triangles to be rendered, and the rendering system transforms and projects their coordinates, determines which triangles are potentially visible in the viewport, and performs the above rasterization and pixel processing tasks before displaying the final result on the screen.

    Historically, 3D rasterization used algorithms like the Warnock algorithm and scanline rendering (also called "scan-conversion"), which can handle arbitrary polygons and can rasterize many shapes simultaneously. Although such algorithms are still important for 2D rendering, 3D rendering now usually divides shapes into triangles and rasterizes them individually using simpler methods.

    High-performance algorithms exist for rasterizing 2D lines, including anti-aliased lines, as well as ellipses and filled triangles. An important special case of 2D rasterization is text rendering, which requires careful anti-aliasing and rounding of coordinates to avoid distorting the letterforms and preserve spacing, density, and sharpness.
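One widely known example of such a line-rasterization algorithm is Bresenham's line algorithm, which the text above does not name; it is used here only as an illustration. The sketch below uses the common integer-only error-accumulation form, and the function name `bresenham_line` is hypothetical.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the integer pixel coordinates on a line from (x0, y0) to (x1, y1)."""
    pixels = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy  # running error term; only integer arithmetic is needed
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:  # error allows a step along x
            error += dy
            x0 += step_x
        if doubled <= dx:  # error allows a step along y
            error += dx
            y0 += step_y
    return pixels
```

Because only additions and comparisons of integers are involved, this style of algorithm was well suited to early hardware; anti-aliased variants additionally compute a coverage value for each pixel.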

    After 3D coordinates have been projected onto the image plane, rasterization is primarily a 2D problem, but the 3rd dimension necessitates hidden surface removal. Early computer graphics used geometric algorithms or ray casting to remove the hidden portions of shapes, or used the painter's algorithm, which sorts shapes by depth (distance from camera) and renders them from back to front. Depth sorting was later avoided by incorporating depth comparison into the scanline rendering algorithm. The z-buffer algorithm performs the comparisons indirectly by including a depth or "z" value in the framebuffer. A pixel is only covered by a shape if that shape's z value is lower (indicating closer to the camera) than the z value currently in the buffer. The z-buffer requires additional memory (an expensive resource at the time it was invented) but simplifies the rasterization code and permits multiple passes. Memory is now faster and more plentiful, and a z-buffer is almost always used for real-time rendering.
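The depth comparison described above can be sketched as follows. To keep the example short, it assumes that each shape is an axis-aligned rectangle at a single constant depth, which is a simplification (a real rasterizer interpolates z across each triangle). The function name and the tuple layout are hypothetical.

```python
import math

def render_with_z_buffer(width, height, rectangles, background="."):
    """Draw axis-aligned rectangles, keeping the nearest one at each pixel.

    Each rectangle is (x0, y0, x1, y1, z, color) with inclusive pixel bounds;
    a smaller z means closer to the camera.
    """
    color_buffer = [[background] * width for _ in range(height)]
    z_buffer = [[math.inf] * width for _ in range(height)]
    for x0, y0, x1, y1, z, color in rectangles:
        for y in range(max(y0, 0), min(y1, height - 1) + 1):
            for x in range(max(x0, 0), min(x1, width - 1) + 1):
                # Cover the pixel only if this shape is closer than what is stored.
                if z < z_buffer[y][x]:
                    z_buffer[y][x] = z
                    color_buffer[y][x] = color
    return ["".join(row) for row in color_buffer]
```

Unlike the painter's algorithm, the result does not depend on the order in which the shapes are submitted, so no depth sorting is needed.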

    A drawback of the basic z-buffer algorithm is that each pixel ends up either entirely covered by a single object or filled with the background color, causing jagged edges in the final image. Early anti-aliasing approaches addressed this by detecting when a pixel is partially covered by a shape, and calculating the covered area. The A-buffer (and other sub-pixel and multi-sampling techniques) solves the problem less precisely but with higher performance. For real-time 3D graphics, it has become common to use complicated heuristics (and even neural networks) to perform anti-aliasing.

    In 3D rasterization, color is usually determined by a pixel shader or fragment shader, a small program that is run for each pixel. The shader does not (or cannot) directly access 3D data for the entire scene (this would be very slow, and would result in an algorithm similar to ray tracing) and a variety of techniques have been developed to render effects like shadows and reflections using only texture mapping and multiple passes.

    Older and more basic 3D rasterization implementations did not support shaders, and used simple shading techniques such as flat shading (lighting is computed once for each triangle, which is then rendered entirely in one color), Gouraud shading (lighting is computed using normal vectors defined at vertices and then colors are interpolated across each triangle), or Phong shading (normal vectors are interpolated across each triangle and lighting is computed for each pixel).
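The difference between these three shading techniques can be illustrated with a small sketch. It assumes a purely diffuse (Lambertian) lighting term and barycentric weights for the position inside a triangle; all function names are hypothetical, and real implementations would also handle color and specular highlights.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

def diffuse(normal, light_dir):
    """Lambertian term: cosine of the angle between normal and light direction."""
    return max(0.0, dot(normalize(normal), normalize(light_dir)))

def flat(face_normal, light_dir):
    """Flat shading: one lighting value for the whole triangle."""
    return diffuse(face_normal, light_dir)

def gouraud(vertex_normals, light_dir, weights):
    """Gouraud shading: light the vertices, then interpolate the results."""
    lit = [diffuse(n, light_dir) for n in vertex_normals]
    return sum(w * value for w, value in zip(weights, lit))

def phong(vertex_normals, light_dir, weights):
    """Phong shading: interpolate the normal, then light each pixel."""
    blended = tuple(
        sum(w * n[axis] for w, n in zip(weights, vertex_normals))
        for axis in range(3)
    )
    return diffuse(blended, light_dir)
```

For a point midway between two vertices whose normals tilt in opposite directions, Gouraud shading averages two dimmer values, while Phong shading first recovers a normal facing the light and so produces a brighter result.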

    Until relatively recently, Pixar used rasterization for rendering its animated films. Unlike the renderers commonly used for real-time graphics, the Reyes rendering system in Pixar's RenderMan software was optimized for rendering very small (pixel-sized) polygons, and incorporated stochastic sampling techniques more typically associated with ray tracing.

    Ray casting

    See main article: Ray casting.

    One of the simplest ways to render a 3D scene is to test if a ray starting at the viewpoint (the "eye" or "camera") intersects any of the geometric shapes in the scene, repeating this test using a different ray direction for each pixel. This method, called ray casting, was important in early computer graphics, and is a fundamental building block for more advanced algorithms. Ray casting can be used to render shapes defined by constructive solid geometry (CSG) operations.
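A minimal sketch of this per-pixel test is shown below, assuming the scene contains only spheres, the camera sits at the origin looking along the negative z axis, and the image plane spans [-1, 1] at z = -1. These conventions and the function names are assumptions made for the example.

```python
import math

def intersect_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the nearest hit, or None."""
    oc = tuple(o - c for o, c in zip(origin, center))
    a = sum(d * d for d in direction)
    half_b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = half_b * half_b - a * c
    if discriminant < 0:
        return None  # the ray misses the sphere
    root = math.sqrt(discriminant)
    for t in ((-half_b - root) / a, (-half_b + root) / a):
        if t > 1e-9:  # ignore intersections behind the ray origin
            return t
    return None

def cast_rays(width, height, spheres):
    """Cast one ray per pixel and return rows of characters.

    Each sphere is (center, radius, symbol); '.' marks a pixel with no hit.
    """
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the pixel centre to a point on the image plane at z = -1.
            x = (i + 0.5) / width * 2 - 1
            y = 1 - (j + 0.5) / height * 2
            nearest = None
            for center, radius, symbol in spheres:
                t = intersect_sphere((0.0, 0.0, 0.0), (x, y, -1.0), center, radius)
                if t is not None and (nearest is None or t < nearest[0]):
                    nearest = (t, symbol)
            row += nearest[1] if nearest else "."
        rows.append(row)
    return rows
```

For example, `cast_rays(5, 5, [((0.0, 0.0, -3.0), 1.0, "#")])` marks the centre pixel as covered and leaves the corner pixels empty.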

    Early ray casting experiments include the work of Arthur Appel in the 1960s. Appel rendered shadows by casting an additional ray from each visible surface point towards a light source. He also tried rendering the density of illumination by casting random rays from the light source towards the object and plotting the intersection points (similar to the later technique called photon mapping).

    When rendering scenes containing many objects, testing the intersection of a ray with every object becomes very expensive. Special data structures are used to speed up this process by allowing large numbers of objects to be excluded quickly (such as objects behind the camera). These structures are analogous to database indexes for finding the relevant objects. The most common are the bounding volume hierarchy (BVH), which stores a pre-computed bounding box or sphere for each branch of a tree of objects, and the k-d tree which recursively divides space into two parts. Recent GPUs include hardware acceleration for BVH intersection tests. K-d trees are a special case of binary space partitioning, which was frequently used in early computer graphics (it can also generate a rasterization order for the painter's algorithm). Octrees, another historically popular technique, are still often used for volumetric data.
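The way a bounding volume hierarchy excludes objects can be sketched as follows. The example assumes axis-aligned bounding boxes and uses the so-called slab test for the ray-box intersection, a common choice that the text above does not specify. The dictionary-based node layout and the function names are hypothetical.

```python
import math

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does the ray (for t >= 0) pass through the axis-aligned box?"""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def candidate_objects(node, origin, direction):
    """Collect objects from leaves whose bounding boxes the ray enters."""
    if not ray_hits_box(origin, direction, node["min"], node["max"]):
        return []  # the whole subtree is excluded with a single test
    if "objects" in node:
        return list(node["objects"])
    found = []
    for child in node["children"]:
        found += candidate_objects(child, origin, direction)
    return found
```

Only the objects returned by `candidate_objects` need an exact intersection test, which is where the saving comes from when a scene contains many objects.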

    Geometric formulas are sufficient for finding the intersection of a ray with shapes like spheres, polygons, and polyhedra, but for most curved surfaces there is no analytic solution, or the intersection is difficult to compute accurately using limited precision floating point numbers. Root-finding algorithms such as Newton's method can sometimes be used. To avoid these complications, curved surfaces are often approximated as meshes of triangles. Volume rendering (e.g. rendering clouds and smoke), and some surfaces such as fractals, may require ray marching instead of basic ray casting.
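As an illustration of ray marching, the sketch below uses the variant often called sphere tracing, in which a signed distance function gives a step size that is guaranteed not to pass through the surface. The name "sphere tracing", the example distance function, and the parameter defaults are assumptions for this example rather than details from the text above.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, -3.0), radius=1.0):
    """Signed distance from a point to a sphere (negative inside)."""
    return math.dist(point, center) - radius

def ray_march(origin, direction, sdf, max_steps=128, max_distance=100.0, epsilon=1e-6):
    """March along a unit-length direction; return the hit distance or None."""
    t = 0.0
    for _ in range(max_steps):
        point = tuple(o + t * d for o, d in zip(origin, direction))
        distance = sdf(point)
        if distance < epsilon:
            return t  # close enough to the surface to count as a hit
        t += distance  # safe step: nothing is nearer than this distance
        if t > max_distance:
            break
    return None
```

For volumetric data, a fixed or adaptive step size is typically used instead of a distance function, with density or color accumulated at each step.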

    Ray tracing

    See main article: Ray tracing (graphics).

    Ray casting can be used to render an image by tracing light rays backwards from a simulated camera. After finding a point on a surface where a ray originated, another ray is traced towards the light source to determine if anything is casting a shadow on that point. If not, a reflectance model (such as Lambertian reflectance for matte surfaces, or the Phong reflection model for glossy surfaces) is used to compute the probability that a photon arriving from the light would be reflected towards the camera, and this is multiplied by the brightness of the light to determine the pixel brightness. If there are multiple light sources, brightness contributions of the lights are added together. For color images, calculations are repeated for multiple wavelengths of light (e.g. red, green, and blue).
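The shadow test and brightness calculation described above might be sketched as follows. The example assumes point lights, spherical occluders, Lambertian reflectance, and a single brightness value instead of separate color channels; it also ignores the fall-off of light with distance. All names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hit_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to a sphere, or None if it misses."""
    oc = tuple(o - c for o, c in zip(origin, center))
    half_b = dot(direction, oc)
    discriminant = half_b * half_b - (dot(oc, oc) - radius * radius)
    if discriminant < 0:
        return None
    root = math.sqrt(discriminant)
    for t in (-half_b - root, -half_b + root):
        if t > 0:
            return t
    return None

def direct_light(point, normal, lights, occluders):
    """Sum Lambertian contributions of all lights visible from a surface point.

    lights is a list of (position, brightness); occluders is a list of
    (center, radius) spheres; normal must be unit length.
    """
    total = 0.0
    for light_position, brightness in lights:
        to_light = tuple(l - p for l, p in zip(light_position, point))
        distance = math.sqrt(dot(to_light, to_light))
        direction = tuple(c / distance for c in to_light)
        # Start slightly above the surface so the ray does not hit it again.
        start = tuple(p + 1e-4 * n for p, n in zip(point, normal))
        in_shadow = False
        for center, radius in occluders:
            t = hit_sphere(start, direction, center, radius)
            if t is not None and t < distance:
                in_shadow = True
                break
        if not in_shadow:
            # Lambertian reflectance: proportional to the cosine of the angle.
            total += brightness * max(0.0, dot(normal, direction))
    return total
```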

    Classical ray tracing (also called Whitted-style or recursive ray tracing) extends this method so it can render mirrors and transparent objects. If a ray traced backwards from the camera originates at a point on a mirror, the reflection formula from geometric optics is used to calculate the direction the reflected ray came from, and another ray is cast backwards in that direction. If a ray originates at a transparent surface, rays are cast backwards for both reflected and refracted rays (using Snell's law to compute the refracted direction), and so ray tracing needs to support a branching "tree" of rays. In simple implementations, a recursive function is called to trace each ray.
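The two direction calculations mentioned above can be written compactly in vector form. The sketch below assumes unit-length vectors, with the incoming direction pointing towards the surface and the normal pointing away from it; `n1` and `n2` are the refractive indices on the incoming and outgoing sides. The function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(incident, normal):
    """Mirror reflection of a direction about a unit surface normal."""
    d = dot(incident, normal)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

def refract(incident, normal, n1, n2):
    """Refracted direction from Snell's law, or None on total internal reflection."""
    eta = n1 / n2
    cos_i = -dot(incident, normal)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None  # no transmitted ray exists at this angle
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * i + (eta * cos_i - cos_t) * n for i, n in zip(incident, normal))
```

A recursive tracer would call both functions at a transparent surface and trace one new ray for each returned direction, which produces the branching tree of rays described above.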

    Ray tracing usually performs anti-aliasing by taking the average of multiple samples for each pixel. It may also use multiple samples for effects like depth of field and motion blur. If evenly-spaced ray directions or times are used for each of these features, many rays are required, and some aliasing will remain. Cook-style, stochastic, or Monte Carlo ray tracing avoids this problem by using random sampling instead of evenly-spaced samples. This type of ray tracing is commonly called distributed ray tracing, or distribution ray tracing because it samples rays from probability distributions. Distribution ray tracing can also render realistic "soft" shadows from large lights by using a random sample of points on the light when testing for shadowing, and it can simulate chromatic aberration by sampling multiple wavelengths from the spectrum of light.
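Random sampling within a pixel can be sketched as below. The example uses jittered (stratified) sampling, in which one random point is chosen inside each cell of a small grid; this is one common way of combining randomness with even coverage and is an assumption of the example. The test shape, a disc, stands in for whatever a real renderer would find by tracing a ray through each sample point.

```python
import random

def inside_disc(x, y, cx=4.0, cy=4.0, radius=3.0):
    """Stand-in for tracing a ray: is the sample point covered by the shape?"""
    return (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2

def pixel_value(px, py, samples_per_axis=4, rng=None):
    """Average jittered samples: one random point in each cell of an n x n grid."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    n = samples_per_axis
    hits = 0
    for i in range(n):
        for j in range(n):
            x = px + (i + rng.random()) / n
            y = py + (j + rng.random()) / n
            hits += 1 if inside_disc(x, y) else 0
    return hits / (n * n)
```

Pixels on the edge of the shape receive intermediate values, which smooths the edge; the same averaging extends to depth of field or motion blur by also randomizing the lens position or the time of each sample.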

    Real surface materials reflect small amounts of light in almost every direction because they have small (or microscopic) bumps and grooves. A distribution ray tracer can simulate this by sampling possible ray directions, which allows rendering blurry reflections from glossy and metallic surfaces. However, if this procedure is repeated recursively to simulate realistic indirect lighting, and if more than one sample is taken at each surface point, the tree of rays quickly becomes huge. Another kind of ray tracing, called path tracing, handles indirect light more efficiently, avoiding branching, and ensures that the distribution of all possible paths from a light source to the camera is sampled in an unbiased way.

    Ray tracing was often used for rendering reflections in animated films, until path tracing became standard for film rendering. Films such as Shrek 2 and Monsters University also used distribution ray tracing or path tracing to precompute indirect illumination for a scene or frame prior to rendering it using rasterization.

    Advances in GPU technology have made real-time ray tracing possible in games, although it is currently almost always used in combination with rasterization. This enables visual effects that are difficult with only rasterization, including reflection from curved surfaces and interreflective objects, and shadows that are accurate over a wide range of distances and surface orientations. Ray tracing support is included in recent versions of the graphics APIs used by games, such as DirectX, Metal, and Vulkan.

    Ray tracing has been used to render simulated black holes, and the appearance of objects moving at close to the speed of light, by taking spacetime curvature and relativistic effects into account during light ray simulation.

    Radiosity

    See main article: Radiosity (computer graphics).

    Radiosity (named after the radiometric quantity of the same name) is a method for rendering objects illuminated by light bouncing off rough or matte surfaces. This type of illumination is called indirect light, environment lighting, or diffuse lighting, and the problem of rendering it realistically is called global illumination. Rasterization and basic forms of ray tracing (other than distribution ray tracing and path tracing) can only roughly approximate indirect light, e.g. by adding a uniform "ambient" lighting amount chosen by the artist. Radiosity techniques are also suited to rendering scenes with area lights such as rectangular fluorescent lighting panels, which are difficult for rasterization and traditional ray tracing. Radiosity is considered a physically-based method, meaning that it aims to simulate the flow of light in an environment using equations and experimental data from physics; however, it often assumes that all surfaces are opaque and perfectly Lambertian, which reduces realism and limits its applicability.

    In the original radiosity method (first proposed in 1984) now called classical radiosity, surfaces and lights in the scene are split into pieces called patches, a process called meshing (this step makes it a finite element method). The rendering code must then determine what fraction of the light being emitted or diffusely reflected (scattered) by each patch is received by each other patch. These fractions are called form factors or view factors (first used in engineering to model radiative heat transfer). The form factors are multiplied by the albedo of the receiving surface and put in a matrix. The lighting in the scene can then be expressed as a matrix equation (or equivalently a system of linear equations) that can be solved by methods from linear algebra.

    Solving the radiosity equation gives the total amount of light emitted and reflected by each patch, which is divided by area to get a value called radiosity that can be used when rasterizing or ray tracing to determine the color of pixels corresponding to visible parts of the patch. For real-time rendering, this value (or more commonly the irradiance, which does not depend on local surface albedo) can be pre-computed and stored in a texture (called an irradiance map) or stored as vertex data for 3D models. This feature was used in architectural visualization software to allow real-time walk-throughs of a building interior after computing the lighting.

    The large size of the matrices used in classical radiosity (the square of the number of patches) causes problems for realistic scenes. Practical implementations may use Jacobi or Gauss-Seidel iterations, which is equivalent (at least in the Jacobi case) to simulating the propagation of light one bounce at a time until the amount of light remaining (not yet absorbed by surfaces) is insignificant. The number of iterations (bounces) required is dependent on the scene, not the number of patches, so the total work is proportional to the square of the number of patches (in contrast, solving the matrix equation using Gaussian elimination requires work proportional to the cube of the number of patches). Form factors may be recomputed when they are needed, to avoid storing a complete matrix in memory.
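The Jacobi approach can be sketched as follows, assuming the usual form of the radiosity equation in which the radiosity of patch i equals its own emission plus its albedo times the form-factor-weighted sum of the radiosities of all patches. The function name, the argument layout, and the stopping tolerance are hypothetical.

```python
def solve_radiosity(emission, albedo, form_factors, tolerance=1e-9, max_bounces=1000):
    """Jacobi iteration for B[i] = E[i] + albedo[i] * sum_j F[i][j] * B[j].

    Returns the radiosity of each patch and the number of bounces simulated.
    """
    n = len(emission)
    radiosity = list(emission)  # before any bounce, only emitted light is present
    for bounce in range(1, max_bounces + 1):
        updated = [
            emission[i]
            + albedo[i] * sum(form_factors[i][j] * radiosity[j] for j in range(n))
            for i in range(n)
        ]
        change = max(abs(u - r) for u, r in zip(updated, radiosity))
        radiosity = updated
        if change < tolerance:  # remaining unabsorbed light is insignificant
            return radiosity, bounce
    return radiosity, max_bounces
```

Each pass over the patches costs work proportional to the square of the number of patches, and the number of passes depends on how quickly light is absorbed in the scene rather than on how finely it is meshed.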

    The quality of rendering is often determined by the size of the patches, e.g. very fine meshes are needed to depict the edges of shadows accurately. An important improvement is hierarchical radiosity, which uses a coarser mesh (larger patches) for simulating the transfer of light between surfaces that are far away from one another, and adaptively sub-divides the patches as needed. This allows radiosity to be used for much larger and more complex scenes.

    Alternative and extended versions of the radiosity method support non-Lambertian surfaces, such as glossy surfaces and mirrors, and sometimes use volumes or "clusters" of objects as well as surface patches. Stochastic or Monte Carlo radiosity uses random sampling in various ways, e.g. taking samples of incident light instead of integrating over all patches, which can improve performance but adds noise (this noise can be reduced by using deterministic iterations as a final step, unlike path tracing noise). Simplified and partially precomputed versions of radiosity are widely used for real-time rendering, combined with techniques such as octree radiosity that store approximations of the light field.

    Path tracing

    See main article: Path tracing.

    As part of the approach known as physically based rendering, path tracing has become the dominant technique for rendering realistic scenes, including effects for movies. For example, the popular open source 3D software Blender uses path tracing in its Cycles renderer. Images produced using path tracing for global illumination are generally noisier than when using radiosity (the main competing algorithm for realistic lighting), but radiosity can be difficult to apply to complex scenes and is prone to artifacts that arise from using a tessellated representation of irradiance.

    Like distributed ray tracing, path tracing is a kind of stochastic or randomized ray tracing that uses Monte Carlo or Quasi-Monte Carlo integration. It was proposed and named in 1986 by Jim Kajiya in the same paper as the rendering equation. Kajiya observed that much of the complexity of distributed ray tracing could be avoided by only tracing a single path from the camera at a time (in Kajiya's implementation, this "no branching" rule was broken by tracing additional rays from each surface intersection point to randomly chosen points on each light source). Kajiya suggested reducing the noise present in the output images by using stratified sampling and importance sampling for making random decisions such as choosing which ray to follow at each step of a path. Even with these techniques, path tracing would not have been practical for film rendering, using computers available at the time, because the computational cost of generating enough samples to reduce variance to an acceptable level was too high. Monster House, the first feature film rendered entirely using path tracing, was not released until 20 years later.
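The effect of importance sampling on noise can be seen in a much simpler setting than a full path tracer. The sketch below estimates the light received by a surface point under a uniformly bright sky of radiance 1, for which the exact answer is π. It compares sampling directions uniformly over the hemisphere with sampling them in proportion to the cosine term. The scenario and function names are assumptions chosen for illustration, not Kajiya's implementation.

```python
import math
import random

def estimate_uniform(num_samples, rng):
    """Uniform hemisphere sampling: cos(theta) is uniform on [0, 1)."""
    pdf = 1.0 / (2.0 * math.pi)  # constant density over the hemisphere
    total = 0.0
    for _ in range(num_samples):
        cos_theta = rng.random()
        total += cos_theta / pdf  # integrand divided by sampling density
    return total / num_samples

def estimate_cosine_weighted(num_samples, rng):
    """Cosine-weighted sampling: density proportional to cos(theta)."""
    total = 0.0
    for _ in range(num_samples):
        cos_theta = math.sqrt(1.0 - rng.random())  # in (0, 1], never zero
        pdf = cos_theta / math.pi
        total += cos_theta / pdf
    return total / num_samples
```

Because the cosine-weighted density matches the integrand exactly in this idealized case, every sample yields the same value and the estimate has no noise at all; in a real scene the incoming light varies with direction, so importance sampling reduces the variance without eliminating it.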

    In its basic form, path tracing is inefficient (requiring too many samples) for rendering caustics and scenes where light enters indirectly through narrow spaces. Attempts were made to address these weaknesses in the 1990s. Bidirectional path tracing has similarities to photon mapping, tracing rays from the light source and the camera separately, and then finding ways to connect these paths (but unlike photon mapping it usually samples new light paths for each pixel rather than using the same cached data for all pixels). Metropolis light transport samples paths by modifying paths that were previously traced, spending more time exploring paths that are similar to other "bright" paths, which increases the chance of discovering even brighter paths. Multiple importance sampling provides a way to reduce variance when combining samples from more than one sampling method, particularly when some samples are much noisier than the others.
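
    The idea behind multiple importance sampling can be sketched on a one-dimensional integral. The example below assumes the balance heuristic (one of the weighting rules proposed by Veach and Guibas) with an equal number of samples per strategy, and combines a uniform sampler with a sampler whose density is 2x. The two strategies and all function names are illustrative assumptions, not part of any particular renderer.

```python
import math
import random

def pdf_uniform(x):
    """Density of strategy A: x uniform on (0, 1]."""
    return 1.0

def pdf_linear(x):
    """Density of strategy B: x = sqrt(u), which has density 2x on (0, 1]."""
    return 2.0 * x

def balance_weight(pdf_this, pdfs_all):
    """Balance heuristic for equal sample counts: w_i = p_i / sum_k p_k."""
    return pdf_this / sum(pdfs_all)

def mis_estimate(f, n, rng):
    """Estimate the integral of f over (0, 1] using n samples from each strategy."""
    total = 0.0
    for _ in range(n):
        x_a = 1.0 - rng.random()             # sample from strategy A
        x_b = math.sqrt(1.0 - rng.random())  # sample from strategy B
        for x, pdf in ((x_a, pdf_uniform), (x_b, pdf_linear)):
            # Weight each sample by how likely its own strategy was to produce it,
            # relative to all strategies that could have produced it.
            w = balance_weight(pdf(x), (pdf_uniform(x), pdf_linear(x)))
            total += w * f(x) / pdf(x)
    return total / n

if __name__ == "__main__":
    rng = random.Random(1)
    # The exact integral of 3x^2 over (0, 1] is 1.
    print(mis_estimate(lambda x: 3.0 * x * x, 10000, rng))
```

    Because the weights for a given point sum to one across strategies, the combined estimator stays unbiased while down-weighting samples that come from a strategy poorly suited to that point.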

    This later work was summarized and expanded upon in Eric Veach's 1997 PhD thesis, which helped raise interest in path tracing in the computer graphics community. The Arnold renderer, first released in 1998, proved that path tracing was practical for rendering frames for films, and that there was a demand for unbiased and physically based rendering in the film industry; other commercial and open source path tracing renderers began appearing. Computational cost was addressed by rapid advances in CPU and cluster performance.

    Path tracing's relative simplicity and its nature as a Monte Carlo method (sampling hundreds or thousands of paths per pixel) have made it attractive to implement on a GPU, especially on recent GPUs that support ray tracing acceleration technology such as Nvidia's RTX and OptiX. However, bidirectional path tracing and Metropolis light transport are more difficult to implement efficiently on a GPU.

    Research into improving path tracing continues. Recent path guiding approaches construct approximations of the light field probability distribution in each volume of space, so paths can be sampled more effectively. Many techniques have been developed to denoise the output of path tracing, reducing the number of paths required to achieve acceptable quality, at the risk of losing some detail or introducing small-scale artifacts that are more objectionable than noise; neural networks are now widely used for this purpose.

    Neural rendering

    Neural rendering is a rendering method using artificial neural networks.[2] [3] Neural rendering includes image-based rendering methods that are used to reconstruct 3D models from 2-dimensional images.[2] One of these methods is photogrammetry, in which a collection of images of an object taken from multiple angles is turned into a 3D model. There have also been recent developments in generating and rendering 3D models from text and coarse paintings, notably by Nvidia, Google and various other companies.

    Scientific and mathematical basis

    See main article: Unbiased rendering.

    The implementation of a realistic renderer always has some basic element of physical simulation or emulation: some computation which resembles or abstracts a real physical process.

    The term "physically based" indicates the use of physical models and approximations that are more general and widely accepted outside rendering. A particular set of related techniques has gradually become established in the rendering community.

    The basic concepts are moderately straightforward, but intractable to calculate; and a single elegant algorithm or approach has been elusive for more general purpose renderers. In order to meet demands of robustness, accuracy and practicality, an implementation will be a complex combination of different techniques.

    Rendering research is concerned with both the adaptation of scientific models and their efficient application.

    Mathematics used in rendering includes: linear algebra, calculus, numerical mathematics, signal processing, and Monte Carlo methods.

    The rendering equation

    See main article: Rendering equation.

    This is the key academic/theoretical concept in rendering. It serves as the most abstract formal expression of the non-perceptual aspect of rendering. All more complete algorithms can be seen as solutions to particular formulations of this equation.

    L_o(x,\omega) = L_e(x,\omega) + \int_\Omega L_i(x,\omega')\, f_r(x,\omega',\omega)\, (\omega' \cdot \vec{n})\, d\omega'

    Meaning: at a particular position and direction, the outgoing light (Lo) is the sum of the emitted light (Le) and the reflected light. The reflected light is the sum of the incoming light (Li) from all directions, multiplied by the surface reflection and the cosine of the incoming angle. By connecting outward light to inward light via an interaction point, this equation stands for the whole 'light transport', that is, all the movement of light in a scene.
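
    As a worked special case (an illustrative assumption, not part of the source text), consider an ideal diffuse surface whose reflection term is the constant \rho/\pi, where \rho is a hypothetical albedo between 0 and 1, lit by the same incoming radiance L_i from every direction. The integral then collapses to a closed form:

```latex
\begin{aligned}
L_o(x,\omega) &= L_e(x,\omega) + \frac{\rho}{\pi}\, L_i \int_\Omega (\omega' \cdot \vec{n})\, d\omega' \\
\int_\Omega (\omega' \cdot \vec{n})\, d\omega'
  &= \int_0^{2\pi}\!\int_0^{\pi/2} \cos\theta \, \sin\theta \, d\theta \, d\phi
   = 2\pi \cdot \tfrac{1}{2} = \pi \\
L_o(x,\omega) &= L_e(x,\omega) + \rho\, L_i
\end{aligned}
```

    Under these assumptions a non-emitting diffuse surface reflects the fraction \rho of the light it receives, which is why the factor 1/\pi appears in the diffuse reflection term.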

    The bidirectional reflectance distribution function

    The bidirectional reflectance distribution function (BRDF) expresses a simple model of light interaction with a surface as follows:

    f_r(x,\omega',\omega) = \frac{dL_r(x,\omega)}{L_i(x,\omega')\, (\omega' \cdot \vec{n})\, d\omega'}

    Light interaction is often approximated by even simpler models, namely diffuse reflection and specular reflection, although both can also be expressed as BRDFs.
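
    To make the two simple models concrete, the sketch below writes each as a BRDF and numerically checks how much light it reflects in total. It assumes a Lambertian model for diffuse reflection and a normalized (modified) Phong lobe for specular reflection; neither choice, nor any of the function and parameter names, comes from the text above.

```python
import math

def lambertian_brdf(albedo):
    """Ideal diffuse BRDF: a constant, independent of both directions."""
    return albedo / math.pi

def phong_brdf(k_s, exponent, cos_alpha):
    """Normalized Phong lobe (an assumed specular model).

    cos_alpha is the cosine of the angle between the mirror direction
    and the outgoing direction.
    """
    return k_s * (exponent + 2.0) / (2.0 * math.pi) * max(cos_alpha, 0.0) ** exponent

def hemispherical_reflectance(brdf_of_cos_theta, steps=2000):
    """Integrate f_r * cos(theta) over the hemisphere with the midpoint rule.

    Assumes the BRDF depends only on the polar angle theta of the incoming
    direction, so the azimuthal integral contributes a factor of 2*pi.
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        c = math.cos(theta)
        total += brdf_of_cos_theta(c) * c * math.sin(theta) * d_theta
    return 2.0 * math.pi * total

if __name__ == "__main__":
    # Diffuse: the reflected fraction equals the albedo.
    print(hemispherical_reflectance(lambda c: lambertian_brdf(0.8)))
    # Specular, viewed along the normal: the mirror direction of an incoming
    # ray then makes the angle theta with the view direction.
    print(hemispherical_reflectance(lambda c: phong_brdf(0.5, 20, c)))
```

    Both integrals come out at or below 1 for these parameter choices, which is the energy-conservation property a physically plausible BRDF is expected to have.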

    Geometric optics

    Rendering is practically exclusively concerned with the particle aspect of light physics, known as geometrical optics. Treating light, at its basic level, as particles bouncing around is a simplification, but appropriate: the wave aspects of light are negligible in most scenes, and are significantly more difficult to simulate. Notable wave aspect phenomena include diffraction (as seen in the colours of CDs and DVDs) and polarisation (as seen in LCDs). Both types of effect, if needed, are made by appearance-oriented adjustment of the reflection model.

    Visual perception

    Though it receives less attention, an understanding of human visual perception is valuable to rendering. This is mainly because image displays and human perception have restricted ranges. A renderer can simulate a wide range of light brightness and color, but current displays (movie screens, computer monitors, etc.) cannot handle so much, and something must be discarded or compressed. Human perception also has limits, and so does not need to be given large-range images to create realism. This can help solve the problem of fitting images into displays, and, furthermore, suggest what short-cuts could be used in the rendering simulation, since certain subtleties will not be noticeable. This related subject is tone mapping.
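
    As one hedged illustration of compressing a wide brightness range into what a display can show, the sketch below applies the simple global Reinhard operator, L / (1 + L), followed by a gamma encoding. The choice of operator, the gamma value of 2.2 and the function names are assumptions made for this example; production tone mappers are considerably more elaborate.

```python
def reinhard(luminance):
    """Simple global Reinhard curve: maps [0, infinity) into [0, 1)."""
    return luminance / (1.0 + luminance)

def tone_map(pixels_hdr, gamma=2.2):
    """Map non-negative linear luminances to 8-bit display values.

    gamma=2.2 is an assumed approximation of a typical display response.
    """
    out = []
    for lum in pixels_hdr:
        compressed = reinhard(max(lum, 0.0))   # compress the dynamic range
        encoded = compressed ** (1.0 / gamma)  # encode for the display
        out.append(min(255, int(encoded * 255.0 + 0.5)))
    return out

if __name__ == "__main__":
    # Luminances spanning five orders of magnitude all land in 0..255.
    print(tone_map([0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]))
```

    The curve is nearly linear for dark values and flattens for bright ones, so highlights are compressed rather than clipped.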

    Sampling and filtering

    One problem that any rendering system must deal with, no matter which approach it takes, is the sampling problem. Essentially, the rendering process tries to depict a continuous function from image space to colors by using a finite number of pixels. As a consequence of the Nyquist–Shannon sampling theorem (or Kotelnikov theorem), any spatial waveform that can be displayed must span at least two pixels per cycle, so the highest displayable frequency is proportional to image resolution. In simpler terms, an image cannot display details (peaks or troughs in color or intensity) that are smaller than one pixel.

    If a naive rendering algorithm is used without any filtering, high frequencies in the image function will cause ugly aliasing to be present in the final image. Aliasing typically manifests itself as jaggies, or jagged edges on objects where the pixel grid is visible. In order to remove aliasing, all rendering algorithms (if they are to produce good-looking images) must use some kind of low-pass filter on the image function to remove high frequencies, a process called antialiasing.
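
    A small sketch of the difference between naive point sampling and a filtered result, assuming supersampling with a box filter (only one of many possible low-pass filters, and not an ideal one). The "scene" is a hypothetical image function containing a single slanted edge; all names are illustrative.

```python
def image_function(x, y):
    """Continuous 'scene': white below the line y = 0.37 * x + 0.2, black above it."""
    return 1.0 if y < 0.37 * x + 0.2 else 0.0

def render(width, height, samples_per_axis):
    """Sample a regular sub-pixel grid and average it per pixel (box filter).

    samples_per_axis=1 reproduces naive rendering with one sample at each pixel center.
    """
    n = samples_per_axis
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            total = 0.0
            for sy in range(n):
                for sx in range(n):
                    x = px + (sx + 0.5) / n
                    y = py + (sy + 0.5) / n
                    total += image_function(x, y)
            row.append(total / (n * n))
        image.append(row)
    return image

if __name__ == "__main__":
    for label, n in (("naive", 1), ("4x4 supersampled", 4)):
        print(label)
        for row in render(8, 4, n):
            print(" ".join(f"{v:4.2f}" for v in row))
```

    The naive image contains only 0 and 1, which produces the staircase pattern along the edge; the supersampled image contains intermediate values that approximate how much of each pixel the white region covers.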

    Hardware

    Rendering is usually limited by available computing power and memory bandwidth, and so specialized hardware has been developed to speed it up ("accelerate" it), particularly for real-time rendering. Hardware features such as a framebuffer for raster graphics are required to display the output of rendering smoothly in real time.

    History

    In the era of vector monitors (also called calligraphic displays), a display processing unit (DPU) was a dedicated CPU or coprocessor that maintained a list of visual elements and redrew them continuously on the screen by controlling an electron beam. Advanced DPUs such as Evans & Sutherland's Line Drawing System-1 (and later models produced into the 1980s) incorporated 3D coordinate transformation features to accelerate rendering of wire-frame images. Evans & Sutherland also made the Digistar planetarium projection system, which was a vector display that could render both stars and wire-frame graphics (the vector-based Digistar and Digistar II were used in many planetariums, and a few may still be in operation). A Digistar prototype was used for rendering 3D star fields for the film  - some of the first 3D computer graphics sequences ever seen in a feature film.

    Shaded 3D graphics rendering in the 1970s and early 1980s was usually implemented on general-purpose computers, such as the PDP-10 used by researchers at the University of Utah. It was difficult to speed up using specialized hardware because it involves a pipeline of complex steps, requiring data addressing, decision-making, and computation capabilities typically only provided by CPUs (although dedicated circuits for speeding up particular operations were proposed). Supercomputers or specially designed multi-CPU computers or clusters were sometimes used for ray tracing. In 1981, James H. Clark and Marc Hannah designed the Geometry Engine, a VLSI chip for performing some of the steps of the 3D rasterization pipeline, and started the company Silicon Graphics (SGI) to commercialize this technology.

    Home computers and game consoles in the 1980s contained graphics coprocessors that were capable of scrolling and filling areas of the display, and drawing sprites and lines, though they were not useful for rendering realistic images. Towards the end of the 1980s PC graphics cards and arcade games with 3D rendering acceleration began to appear, and in the 1990s such technology became commonplace. Today, even low-power mobile processors typically incorporate 3D graphics acceleration features.

    GPUs

    See main article: Graphics processing unit. The 3D graphics accelerators of the 1990s evolved into modern GPUs. GPUs are general-purpose processors, like CPUs, but they are designed for tasks that can be broken into many small, similar, mostly independent sub-tasks (such as rendering individual pixels) and performed in parallel. This means that a GPU can speed up any rendering algorithm that can be split into subtasks in this way, in contrast to 1990s 3D accelerators which were only designed to speed up specific rasterization algorithms and simple shading and lighting effects (although tricks could be used to perform more general computations).

    Due to their origins, GPUs typically still provide specialized hardware acceleration for some steps of a traditional 3D rasterization pipeline, including hidden surface removal using a z-buffer, and texture mapping with mipmaps, but these features are no longer always used. Recent GPUs have features to accelerate finding the intersections of rays with a bounding volume hierarchy, to help speed up all variants of ray tracing and path tracing, as well as neural network acceleration features sometimes useful for rendering.

    GPUs are usually integrated with high-bandwidth memory systems to support the read and write bandwidth requirements of high-resolution, real-time rendering, particularly when multiple passes are required to render a frame. However, memory latency may be higher than on a CPU, which can be a problem if the critical path in an algorithm involves many memory accesses. GPU design accepts high latency as inevitable (in part because a large number of threads are sharing the memory bus) and attempts to "hide" it by efficiently switching between threads, so a different thread can be performing computations while the first thread is waiting for a read or write to complete.

    Rendering algorithms will run efficiently on a GPU only if they can be implemented using small groups of threads that perform mostly the same operations. As an example of code that meets this requirement: when rendering a small square of pixels in a simple ray-traced image, all threads will likely be intersecting rays with the same object and performing the same lighting computations. For performance and architectural reasons, GPUs run groups of around 16-64 threads called warps or wavefronts in lock-step (all threads in the group are executing the same instructions at the same time). If not all threads in the group need to run particular blocks of code (due to conditions) then some threads will be idle, or the results of their computations will be discarded, causing degraded performance.
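
    The cost of divergence within a lock-step group can be illustrated with a deliberately simplified model, written in Python rather than GPU code. It assumes a single if/else, fixed costs per branch, and a non-empty group; real hardware scheduling is more complex, and the function names are hypothetical.

```python
def warp_cost(branch_taken, cost_if, cost_else):
    """Toy cost of one lock-step group executing `if cond: A else: B`.

    branch_taken holds one boolean per thread. The group pays for every branch
    that at least one of its threads takes; threads not on that branch idle.
    """
    cost = 0
    if any(branch_taken):
        cost += cost_if
    if not all(branch_taken):
        cost += cost_else
    return cost

def utilization(branch_taken, cost_if, cost_else):
    """Fraction of thread-cycles spent on useful work (group must be non-empty)."""
    useful = sum(cost_if if taken else cost_else for taken in branch_taken)
    return useful / (len(branch_taken) * warp_cost(branch_taken, cost_if, cost_else))

if __name__ == "__main__":
    coherent = [True] * 32                       # all threads take the same branch
    divergent = [i % 2 == 0 for i in range(32)]  # threads alternate between branches
    for name, group in (("coherent", coherent), ("divergent", divergent)):
        print(name, warp_cost(group, 10, 10), utilization(group, 10, 10))
```

    In this model the divergent group takes twice as long as the coherent one and keeps each thread busy only half the time, which mirrors the degraded performance described above.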

    Chronology of algorithms and techniques

    The following is a rough timeline of frequently mentioned rendering techniques, including areas of current research. Note that even in cases where an idea was named in a specific paper, there were almost always multiple researchers or teams working in the same area (including earlier related work). When a method is first proposed it is often very inefficient, and it takes additional research and practical efforts to turn it into a useful technique.

    The list focuses on academic research and does not include hardware. (For more history, see External links, as well as Computer graphics#History and Golden age of arcade video games#Technology.)

    See also

    Further reading

    External links

    Notes and References

    1. Web site: What is a Rendering Engine? | Dictionary . 2024-02-21 . 2024-02-21 . https://web.archive.org/web/20240221024025/https://arvisual.eu/dictionary/rendering-engine/#:~:text=Definition,with%20a%20given%203D%20software. . live .
    2. State of the Art on Neural Rendering. 10.1111/cgf.14022. 2020. Tewari. A.. Fried. O.. Thies. J.. Sitzmann. V.. Lombardi. S.. Sunkavalli. K.. Martin-Brualla. R.. Simon. T.. Saragih. J.. Nießner. M.. Pandey. R.. Fanello. S.. Wetzstein. G.. Zhu. J.-Y.. Theobalt. C.. Agrawala. M.. Shechtman. E.. Goldman. D. B.. Zollhöfer. M.. Computer Graphics Forum. 39. 2. 701–727. 2004.03805. 215416317.
    3. Knight. Will. A New Trick Lets Artificial Intelligence See in 3D. en-US. Wired. 2022-02-08. 1059-1028. 2022-02-07. https://web.archive.org/web/20220207230740/https://www.wired.com/story/new-way-ai-see-3d/. live.
    4. Gouraud . H. . Henri Gouraud (computer scientist) . 1971 . Continuous shading of curved surfaces . dead . IEEE Transactions on Computers . 20 . 6 . 623–629 . 10.1109/t-c.1971.223313 . 123827991 . https://web.archive.org/web/20100702012343/http://www.cs.uiowa.edu/~cwyman/classes/spring05-22C251/papers/ContinuousShadingOfCurvedSurfaces.pdf . 2010-07-02.
    5. Web site: History | School of Computing . live . https://web.archive.org/web/20131203035242/http://www.cs.utah.edu/dept/history/ . 2013-12-03 . 2021-11-22 . en-US.
    6. Phong . B-T . Bui Tuong Phong . 1975 . Illumination for computer generated pictures . dead . Communications of the ACM . 18 . 6 . 311–316 . 10.1.1.330.4718 . 10.1145/360825.360839 . 1439868 . https://web.archive.org/web/20120327165141/http://jesper.kalliope.org/blog/library/p311-phong.pdf . 2012-03-27.
    7. Blinn . J.F. . James F. Blinn . Newell . M.E. . M. E. Newell . 1976 . Texture and reflection in computer generated images . Communications of the ACM . 19 . 10 . 542–546 . 10.1.1.87.8903 . 10.1145/360349.360353 . 408793.
    8. Blinn . James F. . 20 July 1977 . Models of light reflection for computer synthesized pictures . ACM SIGGRAPH Computer Graphics . 11 . 2 . 192–198 . 10.1145/965141.563893 . dl.acm.org . free.
    9. Crow . F.C. . Franklin C. Crow . 1977 . Shadow algorithms for computer graphics . 11 . 2 . 242–248 . https://web.archive.org/web/20120113074712/https://design.osu.edu/carlson/history/PDFs/crow-shadows.pdf . 2012-01-13 . 2011-07-15 . Computer Graphics (Proceedings of SIGGRAPH 1977) . dead.
    10. Williams . L. . Lance Williams (graphics researcher) . 1978 . Casting curved shadows on curved surfaces . 12 . 3 . 270–274 . 10.1.1.134.8225 . Computer Graphics (Proceedings of SIGGRAPH 1978).
    11. Blinn . J.F. . James F. Blinn . 1978 . Simulation of wrinkled surfaces . Computer Graphics (Proceedings of SIGGRAPH 1978) . 12 . 3 . 286–292 . https://web.archive.org/web/20120121143525/http://research.microsoft.com/pubs/73939/p286-blinn.pdf . 2012-01-21 . live.
    12. Fuchs . H. . Henry Fuchs . Kedem . Z.M. . Zvi M. Kedem . Naylor . B.F. . Bruce F. Naylor . 1980 . On visible surface generation by a priori tree structures . Computer Graphics (Proceedings of SIGGRAPH 1980) . 14 . 3 . 124–133 . 10.1.1.112.4406.
    13. Whitted . T. . Turner Whitted . 1980 . An improved illumination model for shaded display . Communications of the ACM . 23 . 6 . 343–349 . 10.1.1.114.7629 . 10.1145/358876.358882 . 9524504.
    14. Cook . R.L. . Robert L. Cook . Torrance . K.E. . Kenneth E. Torrance . 1981 . A reflectance model for computer graphics . Computer Graphics (Proceedings of SIGGRAPH 1981) . 15 . 3 . 307–316 . 10.1.1.88.7796.
    15. Williams . L. . Lance Williams (graphics researcher) . 1983 . Pyramidal parametrics . Computer Graphics (Proceedings of SIGGRAPH 1983) . 17 . 3 . 1–11 . 10.1.1.163.6298.
    16. Glassner . A.S. . Andrew Glassner . 1984 . Space subdivision for fast ray tracing . IEEE Computer Graphics & Applications . 4 . 10 . 15–22 . 10.1109/mcg.1984.6429331 . 16965964.
    17. Porter . T. . Duff . T. . Tom Duff . 1984 . Compositing digital images . Computer Graphics (Proceedings of SIGGRAPH 1984) . 18 . 3 . 253–259 . https://web.archive.org/web/20150216062332/http://keithp.com/~keithp/porterduff/p253-porter.pdf . 2015-02-16 . live.
    18. Cook . R.L. . Robert L. Cook . Porter . T. . Carpenter . L. . Loren Carpenter . 1984 . Distributed ray tracing . Computer Graphics (Proceedings of SIGGRAPH 1984) . 18 . 3 . 137–145.
    19. Goral . C. . Cindy M. Goral . Torrance . K.E. . Kenneth E. Torrance . Greenberg . D.P. . Donald P. Greenberg . Battaile . B. . Bennett Battaile . 1984 . Modeling the interaction of light between diffuse surfaces . Computer Graphics (Proceedings of SIGGRAPH 1984) . 18 . 3 . 213–222 . 10.1.1.112.356.
    20. Cohen . M.F. . Michael F. Cohen . Greenberg . D.P. . Donald P. Greenberg . 1985 . The hemi-cube: a radiosity solution for complex environments . Computer Graphics (Proceedings of SIGGRAPH 1985) . 19 . 3 . 31–40 . 10.1145/325165.325171 . https://web.archive.org/web/20140424063155/http://arnetminer.org/dev.do?m=downloadpdf&url=http%3A%2F%2Farnetminer.org%2Fpdf%2FPDFFiles2%2F--g---g-Index1255026826706%2FThe%2520hemi-cube%2520%2520a%2520radiosity%2520solution%2520for%2520complex%2520environments1255058011060.pdf . 2014-04-24 . 2020-03-25 . dead.
    21. Arvo . J. . James Arvo . 1986 . Backward ray tracing . SIGGRAPH 1986 Developments in Ray Tracing course notes . 10.1.1.31.581.
    22. Book: Wu, Xiaolin . An efficient antialiasing technique . ACM SIGGRAPH Computer Graphics . July 1991 . 25 . 4 . 143–152 . 10.1145/127719.122734 . 978-0-89791-436-9.
    23. Book: Wu, Xiaolin . Graphics Gems II . Morgan Kaufmann . 1991 . 978-0-12-064480-3 . James Arvo . San Francisco . 446–450 . Fast Anti-Aliased Circle Generation.
    24. Hanrahan . P. . Pat Hanrahan . Salzman . D. . David Salzman . Aupperle . L. . Larry Aupperle . 1991 . A rapid hierarchical radiosity algorithm . Computer Graphics (Proceedings of SIGGRAPH 1991) . 25 . 4 . 197–206 . 10.1.1.93.5694.
    25. Oren . M. . Nayar . S.K. . July 1994 . Generalization of Lambert's Reflectance Model . SIGGRAPH . 239–246.
    26. Tumblin . J. . Jack Tumblin . Rushmeier . H.E. . Holly Rushmeier . 1993 . Tone reproduction for realistic computer generated images . live . IEEE Computer Graphics & Applications . 13 . 6 . 42–48 . 10.1109/38.252554 . 6459836 . https://web.archive.org/web/20111208231341/http://smartech.gatech.edu/bitstream/handle/1853/3686/92-31.pdf?sequence=1 . 2011-12-08.
    27. Hanrahan . P. . Pat Hanrahan . Krueger . W. . Wolfgang Krueger . 1993 . Reflection from layered surfaces due to subsurface scattering . Computer Graphics (Proceedings of SIGGRAPH 1993) . 27 . 165–174 . 10.1.1.57.9761.
    28. Lafortune . Eric . Willems . Yves . December 1993 . Bi-directional path tracing . 145–153 . 2 September 2024 . Proceedings of Third International Conference on Computational Graphics and Visualization Techniques (CompuGraphics) . 21 May 2022 . https://web.archive.org/web/20220521174659/https://www.cs.princeton.edu/courses/archive/fall03/cs526/papers/lafortune93.pdf . live .
    29. Book: Miller, Gavin . Proceedings of the 21st annual conference on Computer graphics and interactive techniques - SIGGRAPH '94 . 24 July 1994 . ACM . 978-0897916677 . 319–326 . Efficient algorithms for local and global accessibility shading . 10.1145/192161.192244 . 7 May 2018 . http://dl.acm.org/citation.cfm?id=192161.192244 . https://web.archive.org/web/20211122155805/https://dl.acm.org/doi/10.1145/192161.192244 . 22 November 2021 . live . dl.acm.org . 15271113.
    30. Jensen . H.W. . Henrik Wann Jensen . Christensen . N.J. . Niels Jørgen Christensen . 1995 . Photon maps in bidirectional monte carlo ray tracing of complex objects . Computers & Graphics . 19 . 2 . 215–224 . 10.1.1.97.2724 . 10.1016/0097-8493(94)00145-o.
    31. Veach . Eric . Guibas . Leonidas J. . 15 September 1995 . Optimally combining sampling techniques for Monte Carlo rendering . 419–428 . 10.1145/218380.218498 . 2 September 2024 . SIGGRAPH95: 22nd International ACM Conference on Computer Graphics and Interactive Techniques . 26 July 2024 . https://web.archive.org/web/20240726032649/https://dl.acm.org/doi/10.1145/218380.218498 . live .
    32. Veach . E. . Eric Veach . Guibas . L. . Leonidas J. Guibas . 1997 . Metropolis light transport . Computer Graphics (Proceedings of SIGGRAPH 1997) . 16 . 65–76 . 10.1.1.88.944.
    33. Veach . E. . Eric Veach . Guibas . L. . Leonidas J. Guibas . 1997 . Metropolis light transport . Computer Graphics (Proceedings of SIGGRAPH 1997) . 16 . 65–76 . 10.1.1.88.944.
    34. Keller . A. . Alex Keller . 1997 . Instant Radiosity . Computer Graphics (Proceedings of SIGGRAPH 1997) . 24 . 49–56 . 10.1.1.15.240.
    35. Sloan . P. . Peter Pike Sloan . Kautz . J. . Jan Kautz . Snyder . J. . John Snyder (computer scientist) . 2002 . Precomputed Radiance Transfer for Real-Time Rendering in Dynamic, Low Frequency Lighting Environments . Computer Graphics (Proceedings of SIGGRAPH 2002) . 29 . 527–536 . https://web.archive.org/web/20110724151631/http://www.mpi-inf.mpg.de/~jnkautz/projects/prt/prtSIG02.pdf . 2011-07-24 . dead.
    36. Matusik . W. . Pfister . H. . Brand . M. . McMillan . L. . July 2003 . A Data-Driven Reflectance Model . ACM Transactions on Graphics (TOG). 22 . 3 . 759–769 . 10.1145/882262.882343. 23 November 2024.
    37. Loper . Matthew M . Black . Michael J . 6 September 2014 . OpenDR: An approximate differentiable renderer . Zurich, Switzerland . Springer International Publishing . 8695 . 154–169 . 10.1007/978-3-319-10584-0_11 . 2 September 2024 . Computer Vision - ECCV 2014 . 24 June 2024 . https://web.archive.org/web/20240624151721/https://files.is.tue.mpg.de/black/papers/OpenDR.pdf . live .
    38. Practical Path Guiding for Efficient Light-Transport Simulation . Müller . Thomas . Gross . Markus . Novák . Jan . Computer Graphics Forum (Proceedings of EGSR) . 36 . 4 . June 2017 . 91–100 . The Eurographics Association & John Wiley & Sons, Ltd. . 10.1111/cgf.13227 . 4 September 2024.
    39. Bitterli . Benedikt . Wyman . Chris . Pharr . Matt . Shirley . Peter . Lefohn . Aaron . Jarosz . Wojciech . July 2020 . Spatiotemporal reservoir resampling for real-time ray tracing with dynamic direct lighting . ACM Transactions on Graphics. 39 . 4 . 10.1145/3386569.3392481 . 2 September 2024 . 1 March 2024 . https://web.archive.org/web/20240301064934/https://cs.dartmouth.edu/~wjarosz/publications/bitterli20spatiotemporal.html . live .