VISION is defined as the physiological sense of sight, by which the form, color, size, movements, and distance of objects are perceived.

*"During the process of seeing, the eye has to fulfill two main tasks. First, the eye has to recognize details of a scene, which means it has to perceive the spatial resolution of the picture. The second task is to recognize changes in a scene, in other words, to perceive the temporal resolution of a scene.

The term 'seeing,' as such, actually only describes the idea that light reflected by the objects surrounding us enters our eyes. The eye itself contains several parts that process reflected light and generate the image that our brain understands. When light has entered our eye, it passes through the cornea, the iris, the pupil and finally the lens. All these parts work together to put a focused image onto the back of the eye which is called the retina. Once on the retina, the image can be recognized and processed by the brain. To process the image information in the brain the retina is equipped with photoreceptors, which are stimulated differently.

There are two different kinds of photoreceptors: rods and cones. (These names are based on their actual shapes.) It was found that with the rods, we are able to see black and white, while the cones give us the ability to distinguish between different colors. There are different kinds of cones, which are especially sensitive to red, green and blue. If light is reflected on a high number of cones, the cones then enable us to get a high spatial resolution of the image, since small changes in the color can be recognized. Rods are more sensitive to the intensity of light itself. An important aspect of the rods and cones in the context of digital video is their number and their distribution on the retina. If we look, for example, at the center of the retina, we will only find cones. Areas further away from the center have a much higher density of rods. This is the reason why we have to look directly at an image to get all the details. (...)

In total, we have about one hundred and twenty million rods and only around eight million cones on the retina. The latter, as stated, are distributed close to the center of the retina. This leads to the fact that the eye is, in general, relatively less sensitive to color, especially to color changes. Video compression techniques, like the one used in MPEG-2 Video, therefore utilize this low color sensitivity by reducing the color information per image. MPEG-2 uses Discrete Cosine Transformation to identify and subsequently remove high frequency changes in color."
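To make the last point of the quoted passage concrete, here is a rough Python sketch of the two ideas it mentions: storing color at lower resolution than brightness, and using a discrete cosine transform to isolate high-frequency detail. It is only an illustration, not MPEG-2 itself. All function names are hypothetical, the RGB to YCbCr constants are the common full-range BT.601 (JPEG-style) ones, and zeroing coefficients is a crude stand-in for the quantization step a real encoder applies to 8x8 blocks.

```python
import math


def rgb_to_ycbcr(r, g, b):
    """Split an RGB pixel into brightness (Y) and two color-difference values.

    Assumption: full-range BT.601 constants, as used by JPEG.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Halve a color plane in both directions by averaging each 2x2 block.

    `plane` is a list of rows; width and height are assumed to be even.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_counts(width, height):
    """Return (samples at full color resolution, samples with 4:2:0 color)."""
    full = 3 * width * height
    subsampled = width * height + 2 * (width // 2) * (height // 2)
    return full, subsampled


def dct_1d(values):
    """Orthonormal DCT-II of a short list of samples (1-D for brevity)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(values)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def drop_high_frequencies(coeffs, keep):
    """Zero every coefficient from index `keep` on (stand-in for quantization)."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]


if __name__ == "__main__":
    full, sub = sample_counts(720, 576)
    print("samples per frame:", full, "->", sub)
    row = [100.0, 102.0, 101.0, 140.0, 139.0, 141.0, 100.0, 99.0]
    print(drop_high_frequencies(dct_1d(row), keep=3))
```

With color stored at half resolution in each direction, a 720x576 frame needs half as many samples as it would with full-resolution color, while the brightness plane, which the eye resolves most finely, is left untouched.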

You can find more information here. The passage above is quoted from *"ATM & MPEG-2, INTEGRATING DIGITAL VIDEO INTO BROADBAND NETWORKS" by Michael Orzessek and Peter Sommer.