From Wikipedia.org:
"In digital imaging a pixel, or pel, (picture element) is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled."
Most pixels are square, which is why they aren't, strictly speaking, dots. Some camera image sensors have had both square and rectangular pixels; the Nikon D1X image sensor is one example.
The D3s and D300s I use have 12.1 and 12.3 megapixel (MP) image sensors, respectively. In round numbers that's 12 million picture elements on each image sensor.
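If you want to sanity-check that round number, multiply the image dimensions. The figures below are the published full-resolution image sizes for these two cameras; treat them as a sketch (the marketed MP counts run slightly higher because "effective pixels" include photosites that don't end up in the final image):

```python
# Rough pixel-count arithmetic; dimensions are the cameras'
# published full-size output, used here just for illustration.
sensors = {
    "Nikon D3":   (4256, 2832),   # width x height in pixels
    "Nikon D300": (4288, 2848),
}

for name, (w, h) in sensors.items():
    mp = w * h / 1_000_000        # picture elements, in millions
    print(f"{name}: {w} x {h} = {mp:.1f} MP")
# Nikon D3: 4256 x 2832 = 12.1 MP
# Nikon D300: 4288 x 2848 = 12.2 MP
```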
There are two kinds of image sensor: charge-coupled devices (CCDs), which were invented in 1969, the year I graduated from high school (Class of '69 forever!!! Yep, I'm an old guy), and complementary metal-oxide-semiconductor (CMOS) active-pixel sensors.
Both types of sensor do essentially the same thing: they capture light and convert it into electrical voltages (signal). CMOS image sensors use less power, so they generate less heat. Heat is one source of image noise, so less heat means less image noise.
Part of the reason long digital exposures have more noise is that the image sensor gets hotter the longer power is applied to it.
Make a note here: the image sensor in a digital camera, whether its pixels are CCD or CMOS, isn't a digital device; it's an analog device. Make another note here: neither type of image sensor can record color.
OK, so we now have the basics of what a pixel is.
About now you ask, "OK! But how does the voltage (pixel) get changed into a piece of a picture, and how come we can make color photographs if a camera image sensor can't record color?"
I'm glad you asked.
First, let's handle where the color comes from. The color is mathematically interpolated. For our purposes, only the second sense of the Dictionary.com definition of interpolate is needed:
in·ter·po·late: 2. Mathematics. To insert, estimate, or find an intermediate term in (a sequence).
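To make that definition concrete, here's a minimal sketch of the simplest form, linear interpolation: estimating a value between two known neighbors. The function name and numbers are purely illustrative:

```python
def lerp(a, b, t):
    """Linearly interpolate between known values a and b.
    t = 0 returns a, t = 1 returns b, t = 0.5 the midpoint."""
    return a + (b - a) * t

# Estimate the missing middle term of the sequence 10, ?, 30:
print(lerp(10, 30, 0.5))  # 20.0
```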
Yep, the color is estimated, but the estimate is pretty accurate because of a filter array placed in front of the image sensor, called a Bayer array.
Note that each array segment has three colors, red, green, and blue (RGB), and that the array is passive: it just sits there in front of the pixels and uses no power.
Digital images are made using the RGB color model. A single Bayer array segment has two green squares because human eyes are most sensitive to green light. The red square covers a single pixel, each green square covers a single pixel, and the blue square covers a single pixel. A 12 MP image sensor has 3,000,000 of those four-pixel Bayer segments (4 × 3,000,000 = 12 million pixels).
The light falling on any four pixels grouped together like that is almost certainly all the same color and the same intensity, because those pixels are really, really small.
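Here's a minimal sketch of how that repeating 2 x 2 segment tiles across a sensor. I'm assuming the common RGGB ordering (red/green on one row, green/blue on the next), which is one of several layouts Bayer filters use:

```python
import numpy as np

def bayer_pattern(height, width):
    """Return an array of color labels for an RGGB Bayer mosaic.
    Each sensor pixel sits under exactly one color filter."""
    pattern = np.empty((height, width), dtype="<U1")
    pattern[0::2, 0::2] = "R"   # even rows, even cols: red
    pattern[0::2, 1::2] = "G"   # even rows, odd cols:  green
    pattern[1::2, 0::2] = "G"   # odd rows,  even cols: green
    pattern[1::2, 1::2] = "B"   # odd rows,  odd cols:  blue
    return pattern

print(bayer_pattern(4, 6))
# [['R' 'G' 'R' 'G' 'R' 'G']
#  ['G' 'B' 'G' 'B' 'G' 'B']
#  ['R' 'G' 'R' 'G' 'R' 'G']
#  ['G' 'B' 'G' 'B' 'G' 'B']]
```

Note that half the squares are green, matching the eye's extra sensitivity to green light.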
But not all red light is exactly red. More often it is some subtle shade of red. In the RGB color model, different shades of color can be made by adding differing amounts of the three colors in the model.
Pure red is R=255, G=0, B=0. Pure green is R=0, G=255, B=0. Pure blue is R=0, G=0, B=255.
Yellow is a mix: R=255, G=255, B=0. Cyan is a mix: R=0, G=255, B=255. Any shade of red, yellow, green, cyan, or blue in between will have some of all three RGB colors.
White is a mix of all three at maximum value: R=255, G=255, B=255.
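A short sketch of those mixes as 8-bit RGB triples; this is just the standard additive-color arithmetic, with each channel capped at its maximum of 255:

```python
# Additive RGB mixing with 8-bit channels (0-255 each).
RED   = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE  = (0, 0, 255)

def mix(*colors):
    """Add colors channel by channel, capping each channel at 255."""
    return tuple(min(255, sum(ch)) for ch in zip(*colors))

print(mix(RED, GREEN))        # (255, 255, 0)   -> yellow
print(mix(GREEN, BLUE))       # (0, 255, 255)   -> cyan
print(mix(RED, GREEN, BLUE))  # (255, 255, 255) -> white
```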
So even though the image sensor can't record color, with the Bayer array in front of the pixels the voltage each pixel generates is determined in part by the color of the light falling on it, which is what lets the colors in the image be mathematically interpolated.
The voltages are still analog information, though, and the mathematical interpolation can only be performed on digital data. The voltages the pixels generate are also really small, so first they need to be amplified; how much they get amplified is determined by the camera's ISO setting.
Once amplified, the voltages are then input to an analog-to-digital (A/D) converter.
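Here's a minimal sketch of those two steps, amplification and A/D conversion. The gain values and the 12-bit converter depth are illustrative assumptions, not any particular camera's specs:

```python
def digitize(voltage, gain, full_scale=1.0, bits=12):
    """Amplify an analog pixel voltage, then quantize it.
    A 12-bit A/D converter maps 0..full_scale volts to 0..4095."""
    amplified = voltage * gain               # the ISO setting sets the gain
    amplified = min(amplified, full_scale)   # values past full scale clip
    levels = 2 ** bits - 1
    return round(amplified / full_scale * levels)

faint_signal = 0.0002  # volts, illustrative
print(digitize(faint_signal, gain=1000))  # lower ISO:  819
print(digitize(faint_signal, gain=4000))  # higher ISO: 3276 (noise gets amplified too)
```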
If the camera has been set up to record only Raw image data files, the output of the A/D converter is written to the memory card, and the image data is not yet a photo you can see; it's all just 1's and 0's, or Raw data. The Raw image data file has to be converted into a photo outside the camera using any of many Raw converters.
If JPEG, TIFF, or Raw + JPEG has been selected for output, the JPEG and TIFF files have to be made in the camera.
In the camera, a demosaicing algorithm (an algorithm is a set of rules for solving a problem in a finite number of steps) is applied to the digital data. It interpolates the digitized voltages the image sensor/Bayer array captured, then further processes the image data to complete the JPEG or TIFF conversion before the image files are written to the memory card.
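To show the idea, here is a minimal sketch of the simplest demosaicing approach, bilinear interpolation: each pixel's two missing color values are estimated by averaging nearby pixels that do carry that color. Real in-camera algorithms are far more sophisticated, and the RGGB layout is the same assumption as in the earlier sketch:

```python
import numpy as np

def demosaic_bilinear(mosaic):
    """Naive bilinear demosaic of an RGGB Bayer mosaic.
    mosaic: 2-D array of digitized pixel values, one color sample per pixel.
    Returns an (H, W, 3) RGB image with the missing samples estimated."""
    h, w = mosaic.shape
    rgb = np.zeros((h, w, 3))
    known = np.zeros((h, w, 3))
    # Scatter each sample into its own color channel (RGGB layout).
    rgb[0::2, 0::2, 0] = mosaic[0::2, 0::2]; known[0::2, 0::2, 0] = 1  # red
    rgb[0::2, 1::2, 1] = mosaic[0::2, 1::2]; known[0::2, 1::2, 1] = 1  # green
    rgb[1::2, 0::2, 1] = mosaic[1::2, 0::2]; known[1::2, 0::2, 1] = 1  # green
    rgb[1::2, 1::2, 2] = mosaic[1::2, 1::2]; known[1::2, 1::2, 2] = 1  # blue
    out = np.empty_like(rgb)
    for c in range(3):
        vals = np.pad(rgb[:, :, c], 1)    # zero-pad so the 3x3 window fits
        mask = np.pad(known[:, :, c], 1)
        # Sum of samples, and count of known samples, in each 3x3 window.
        sums = sum(vals[i:i + h, j:j + w] for i in range(3) for j in range(3))
        counts = sum(mask[i:i + h, j:j + w] for i in range(3) for j in range(3))
        avg = sums / np.maximum(counts, 1)
        # Keep measured samples; fill the gaps with the neighborhood average.
        out[:, :, c] = np.where(known[:, :, c] == 1, rgb[:, :, c], avg)
    return out
```

The averaging works because, as noted above, the light falling on a handful of adjacent pixels is almost certainly the same color and intensity.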
Since JPEG is a lossy, compressed, final, ready-to-print file type, JPEG files require less memory card space. Unfortunately, because so much image data is discarded in making a JPEG file, they can't be edited very much, if at all.
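As a rough illustration of that lossy trade-off, here's a sketch using the Pillow imaging library (my choice for the example; it isn't part of any camera) to save the same image at different JPEG quality settings and compare file sizes:

```python
import os
import numpy as np
from PIL import Image

# A synthetic 512x512 test image: a smooth gradient plus a little noise.
rng = np.random.default_rng(0)
arr = (np.linspace(0, 255, 512)[None, :, None]
       + rng.normal(0, 10, (512, 512, 3))).clip(0, 255).astype("uint8")
img = Image.fromarray(arr)

for quality in (95, 75, 30):
    img.save("test.jpg", quality=quality)  # lower quality discards more data
    print(f"quality={quality}: {os.path.getsize('test.jpg'):,} bytes")
```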