http://spectrum.ieee.org/biomedical/imaging/neuromorphic-vision-chips/2

Neuromorphic Vision Chips
ICs that mimic the structure of the retina turn shifting light and shade
into moving edges and surfaces, much as the eyes do in serving the visual
cortex.
Unconventional sensors

Functionally, neuromorphic vision chips do what a video camera does when
combined with a computer running some dedicated vision program, perhaps an
algorithm for detecting edges. Computationally, though, the architectures of
the two systems are quite different. Neuromorphic systems, like nervous
systems, use massively parallel, analog, nonclocked, collective processing,
rather than the numerical and symbolic processing basic to artificial
intelligence and conventional machine vision. Circuits with these desirable
properties can implement the types of mathematical operations that occur in
what is called early vision. (Early vision is the set of processes that
make use of two-dimensional intensity arrays to recover distance, texture,
and other physical properties associated with the surfaces of the
three-dimensional objects visible around the viewer.)

The first reflex of today's system engineers, surrounded as they are by
digital computers, is to sample and digitize the incoming video signal as
soon as possible. Yet since the brightness of an image is continuous in time
and amplitude, why import unnecessary artifacts? Why not instead exploit the
physics of conductances, capacitances, and nonlinearities inherent in
transistors to implement operations that are expensive in the digital
domain? When such analog circuits are integrated with 2-D arrays of
photoreceptors, the resulting silicon retinas capture the image with a
virtuosity no digital computer can match unless capable of hundreds of
millions of floating-point operations per second. And the package can be as
small as 1 cm².

Before these devices can be built, several key components must be designed.
Adaptive photoreceptors are needed to sense image intensities over eight
orders of magnitude--the range of natural lighting from moonlight to high
noon. Linear and nonlinear resistive grids must filter the image in order to
reduce the ever-present noise and to enhance and detect certain features,
such as edges. Smart communication protocols are necessary to send streams
of visual information between chips. Velocity sensors have to reliably
detect motion in the scene. Finally, the chips must be able to adapt their
outputs to wide variations in parameters using on-chip learning.
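
The smoothing performed by a linear resistive grid can be sketched in a few
lines of software. In the toy model below (written in Python; the conductance
values and node count are illustrative choices, not taken from any actual
chip), each node is tied to its input through a vertical conductance and to
its neighbors through lateral conductances, and simple relaxation finds the
steady-state voltages, which form a locally smoothed copy of the input.

    import numpy as np

    def resistive_grid_smooth(v_in, G=1.0, g=4.0, iterations=2000):
        """Relax a 1-D linear resistive grid to steady state.

        Each node i couples to its input through a vertical conductance G and
        to its neighbors through a lateral conductance g.  At steady state
            G*(v_in[i] - v[i]) + g*(v[i-1] - 2*v[i] + v[i+1]) = 0,
        which is solved here by Jacobi-style relaxation.  The ratio g/G sets
        the smoothing length; both values are placeholders.
        """
        v = v_in.copy()
        for _ in range(iterations):
            left = np.roll(v, 1)
            right = np.roll(v, -1)
            left[0], right[-1] = v[0], v[-1]   # open (reflecting) boundaries
            v = (G * v_in + g * (left + right)) / (G + 2.0 * g)
        return v

    if __name__ == "__main__":
        noisy_edge = np.concatenate([np.zeros(50), np.ones(50)])
        noisy_edge += 0.1 * np.random.randn(100)
        smooth = resistive_grid_smooth(noisy_edge)
        print(smooth[45:55].round(2))   # noise suppressed, edge kept as a ramp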

Not every IC dedicated to visual algorithms is a neuromorphic vision chip.
The latter processes the image on the same physical plane on which it acquires
the image (focal-plane processing). On the other hand, dedicated
signal-processing circuits take the digitized output of a camera and apply a
particular visual algorithm to every picture element (pixel) in the image,
one after the other.

The dedicated circuits are usually based either on standard digital
signal-processing (DSP) chips or on digital systems specially designed for
such applications as block matching for video applications or filtering
images using convolution. Block matching is popular for estimating motion in
images. In convolution, the most common image-processing technique, passing
a "filter function" over each point in the original image transforms it into
the filtered image. The new value of a pixel is the sum of the products of
the filter function's coefficients with the image intensities in the pixel's
neighborhood, suitably normalized.
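
For readers who want that operation spelled out, here is a minimal software
rendering of the convolution, in Python; the 3-by-3 Laplacian kernel is just
a familiar edge-enhancing example, not the filter of any particular chip.

    import numpy as np

    def convolve2d(image, kernel):
        """Naive 2-D convolution: the new value of each pixel is the sum of
        products of the (flipped) kernel with the image intensities around it."""
        kh, kw = kernel.shape
        ph, pw = kh // 2, kw // 2
        padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
        flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
        out = np.zeros_like(image, dtype=float)
        for y in range(image.shape[0]):
            for x in range(image.shape[1]):
                out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * flipped)
        return out

    # A common 3x3 edge-detecting (Laplacian) filter function.
    laplacian = np.array([[0,  1, 0],
                          [1, -4, 1],
                          [0,  1, 0]], dtype=float)

    image = np.zeros((8, 8))
    image[:, 4:] = 1.0                        # a vertical step edge
    edges = convolve2d(image, laplacian)      # strong response along the edge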

In these applications, a mathematical operation that needs to be repeated
over and over again is cast in special-purpose digital hardware; otherwise,
it would limit system performance too much. One example is the correlation
chip that Woodward Yang at Harvard University, Cambridge, Mass., developed
for recognizing faces. Here, the most demanding operation is to match one
face against a large database. So small chunks of the image are fed to
Yang's digital chip, which matches them against a template. The chip carries
out about 100 000 correlations each second on a 64-by-64-pixel image and
outputs the best fit. But although the correlator chip by itself only
requires 0.1 W of power, the entire system, including camera and
microprocessor, is still large and power hungry.
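
The internals of Yang's chip are not described here, but the matching it
performs is essentially correlation of image chunks against a stored template.
A rough software equivalent, using normalized cross-correlation and function
names chosen only for this sketch, might look like this:

    import numpy as np

    def normalized_correlation(chunk, template):
        """Score a chunk against a same-sized template; 1.0 means a perfect
        match up to overall gain and offset."""
        c = chunk - chunk.mean()
        t = template - template.mean()
        denom = np.sqrt((c * c).sum() * (t * t).sum())
        return float((c * t).sum() / denom) if denom else 0.0

    def best_match(image, template):
        """Slide the template over a 64-by-64 image and return the offset
        with the highest correlation score."""
        th, tw = template.shape
        best, best_pos = -1.0, (0, 0)
        for y in range(image.shape[0] - th + 1):
            for x in range(image.shape[1] - tw + 1):
                s = normalized_correlation(image[y:y + th, x:x + tw], template)
                if s > best:
                    best, best_pos = s, (y, x)
        return best_pos, best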

Adaptive photoreceptors

Today, there are two approaches to image acquisition. The first, sensors
based on charge-coupled devices (CCDs), dominates the consumer market. The
CCDs sense light intensity by integrating the photocharge in time on a grid
of some 800 by 600 pixels. The continuously valued output at each pixel,
digitized in time, constitutes the output of the camera. It is typically
sent to a "frame-grabber" board, where its amplitude is digitized (usually
to 8-bit, or 256-level, resolution) for further analysis.

The amplitude of light in the natural world, however, swings over eight
orders of magnitude from moonlight to a sun-filled day, while the dynamic
range of CCDs is unfortunately much less. When the dynamic range needed to
process the image exceeds the CCD's capability, the image is clipped; and
blooming can occur when the charge on a pixel exceeds its holding capacity
and the excess spills over into neighboring pixels. A clipped region in the
image will be uniformly white, with no details apparent. Blooming manifests
itself as a white line in the image, created by the excess charge that flows
from the bright pixel onto and along a rail in the imager. The usual remedy
for a limited dynamic range is to include automatic gain control. In this
case, a mechanical iris will serve, or else the charge integration time of
the imager may be adjusted to the brightness of the scene.
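
A toy calculation makes the mismatch concrete. The Python fragment below
(arbitrary units, made-up numbers) pushes intensities spanning eight decades
through a linear 8-bit quantizer, as a frame grabber would, and through a
logarithmic code; the linear code discards everything below roughly 0.2
percent of full scale, while the logarithmic code keeps every decade distinct.

    import numpy as np

    # Scene intensities spanning eight orders of magnitude (arbitrary units).
    intensities = np.logspace(0, 8, 9)        # 1, 10, ..., 1e8

    # Linear 8-bit encoding, full scale set to the brightest value.
    full_scale = intensities.max()
    linear_codes = np.clip(np.round(255 * intensities / full_scale), 0, 255)

    # Logarithmic encoding: a fixed step per decade of intensity, roughly
    # what a logarithmic photoreceptor delivers.
    log_codes = np.round(255 * np.log10(intensities) / 8)

    for i, lin, log in zip(intensities, linear_codes, log_codes):
        print(f"{i:>12.0f}  linear={int(lin):3d}  log={int(log):3d}")
    # The linear code rounds to 0 for everything below about 0.2% of full
    # scale, while the log code still distinguishes every decade.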

CCD cameras do not compute. Indeed, they should not, since their output, a
series of bits that can easily be transmitted to a TV monitor, should look
as much as possible like the input when displayed on the screen. This also
implies that the image requires high resolution all over, since it is not
known ahead of time where the viewer will be looking.

Biological creatures view things differently--the photoreceptors in their
eyes sense the intensity continuously in time and adapt to the local image
intensity in both space and time, thereby maximizing the receptors' dynamic
range. Photoreceptors with similar properties can be built using CMOS
devices. A simple photodiode can logarithmically compress the photocurrent
into a voltage signal, but its response is very slow at lower intensities.
Further, device mismatches due to fabrication-process variations will skew the
response of adjacent receptors to identical input. Indeed, the variation in
voltage due to device mismatch can be as large as the signal itself. All
these problems can be solved by adaptive photoreceptors.

Some of the best adaptive photoreceptors have been designed by Tobias
Delbruck at Caltech. The response of his five-transistor photoreceptor is
logarithmic, so that the differential response to a constant contrast is
unaffected by changes in the absolute light intensity. Its output adapts to
slow (seconds long) changes in image intensity over more than six orders of
magnitude, while preserving a high gain for transient changes in the image.
And, in stark contrast to CCDs, no expensive clocks are needed, reducing
power consumption and the need for support circuitry.
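
Delbruck's circuit is analog, but its qualitative behavior--logarithmic
compression plus slow adaptation with high transient gain--can be caricatured
in software. The time constant and gain in the sketch below are placeholders,
not measured values.

    import numpy as np

    def adaptive_photoreceptor(intensity, dt=1e-3, tau_adapt=2.0,
                               transient_gain=20.0):
        """Toy model of an adaptive logarithmic photoreceptor.

        The instantaneous response is the log of intensity; a slow low-pass
        'adaptation state' tracks it with time constant tau_adapt (seconds),
        and the output amplifies the difference, so slow drifts are cancelled
        while fast changes come through with high gain.
        """
        log_i = np.log(np.asarray(intensity, dtype=float))
        adapt = log_i[0]
        out = np.empty_like(log_i)
        alpha = dt / tau_adapt
        for k, x in enumerate(log_i):
            out[k] = transient_gain * (x - adapt) + adapt   # amplify transients
            adapt += alpha * (x - adapt)                    # slow adaptation
        return out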

There is a price, though. In a 1.2-µm CMOS process, a single adaptive
photoreceptor uses about 52 by 52 µm² of silicon real estate, compared to 7
by 7 µm² for a state-of-the-art CCD pixel [see "Vision chips compared"].

Image resolution is another important difference between artificial and
natural vision systems. While we primates sample the world in daylight using
one to two million photoreceptors, other animals need many fewer. Highly
evolved insects that use vision to find food and mates and avoid predators
and obstacles have in effect 10 000 or fewer pixels with which to sample
their environment. Although their visual performance in real time is beyond
current machine-vision systems, even the cheapest hand-held video camera has
many, many more pixels. The moral here is that while we humans are used to
seeing high-resolution images, many visual tasks need far fewer pixels.






