The public is increasingly aware that video surveillance can contribute to their feeling of safety in public areas. Following the debate about the privacy aspects of such surveillance, guidelines are being drafted at local and national level to determine where and in what manner video surveillance may take place. These guidelines also cover how long the images may be kept and how they must be encrypted.
A complete digital IP video surveillance system consists of cameras, video codecs, a transportation network, communication protocols, a video recorder, video management software, activators, event management software, intelligent image recognition and finally, the monitors that display the video images.
To guarantee optimum operation of the complete video surveillance system, all these components have to be interoperable. That doesn't seem to pose any problem in this time of omnipresent IP networks and Ethernet connections, but in practice it turns out to be more complicated.
At present, Ethernet is the universally accepted standard for IP networks. One should, however, ask whether Ethernet meets the specific transport requirements of a video surveillance system under all circumstances. On reflection, it is clear that a number of difficulties still have to be overcome, even though Ethernet is the present-day standard.
It starts with the initial design of the network. After all, that is what determines the diameter of the Ethernet segments, the deployment of routers and routing protocols, the IP plan, the number of subnets, etc. For video surveillance systems, in most cases we opt for a multicast environment (point-to-multipoint).
Moreover, all network components and peripherals have to be interoperable. Since the communication takes place according to standardized protocols, they must all obey the same rules. This is where problems often occur.
Let's take IGMP (Internet Group Management Protocol) as an example. IGMP and IGMP snooping are widely used standards that regulate the multicast traffic on an Ethernet network. IGMP was developed and implemented for a network in which the multicast streams go from a (central) router to the hosts. IGMP (snooping) makes sure that a stream is only received by hosts that have joined that particular stream.
In a CCTV network the streams are reversed. They run from the hosts, in this case the encoders, to the central router. IGMP is not designed for this purpose. Even with an encoder and a switch that both comply with the standard, it is still possible that all multicast streams are flooded to every port. In that case, whether or not a CCTV multicast network operates properly depends on the way the manufacturers have implemented IGMP in the encoder and the switch.
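The flooding problem can be made concrete with a toy model of a snooping switch. This is only a sketch: the class `SnoopingSwitch`, its method names and the port numbers are invented for illustration, and the fallback of flooding a group that has no known members is an assumption about one possible implementation, since real switches differ on exactly this point.

```python
from collections import defaultdict


class SnoopingSwitch:
    """Toy model of a layer-2 switch with IGMP snooping (illustrative only)."""

    def __init__(self, ports, router_port):
        self.ports = set(ports)
        self.router_port = router_port
        self.members = defaultdict(set)  # group address -> ports that joined

    def join(self, group, port):
        # An IGMP membership report for `group` was snooped on `port`.
        self.members[group].add(port)

    def leave(self, group, port):
        # An IGMP leave for `group` was snooped on `port`.
        self.members[group].discard(port)

    def egress_ports(self, group, ingress_port):
        """Return the ports on which a frame for `group` is sent out."""
        listeners = self.members.get(group, set())
        if listeners:
            # Known group: forward to the members and the router port only.
            out = listeners | {self.router_port}
        else:
            # Unknown group: ASSUMED fallback is to flood every port.
            out = set(self.ports)
        return out - {ingress_port}


switch = SnoopingSwitch(ports=[1, 2, 3, 4], router_port=4)
# An encoder on port 1 starts streaming before any decoder has joined.
print(sorted(switch.egress_ports("239.1.1.1", ingress_port=1)))  # [2, 3, 4]
# A decoder on port 3 joins the group; the stream is now pruned.
switch.join("239.1.1.1", 3)
print(sorted(switch.egress_ports("239.1.1.1", ingress_port=1)))  # [3, 4]
```

With dozens of encoders that stream continuously, the first case means every port carries every stream, which is why the behaviour for groups without members matters so much in a CCTV network.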
VIDEO CODECS AND TRANSPORT MECHANISMS
Video codecs take care of encoding and decoding the camera images and in addition compress the images to reduce the size of the video stream. The compression algorithm used determines the final video quality, the necessary storage space and the required processor power. Naturally, the goal is to get the best video quality while taking up as little bandwidth and processor power as possible. But the choice of a certain compression standard also depends on the specific surveillance application.
Terms such as MPEG-4 or MPEG-2 do not say anything about the way video images are transported across the network. Both signals can be encapsulated in a transport stream or an elementary stream. These streams in turn can be encapsulated in RTP (Real-time Transport Protocol), which is carried over UDP and IP.
This stream has to be transmitted to a single decoder or a group of decoders. To establish, maintain, and end such a connection, we need yet another set of protocols: SAP (Session Announcement Protocol), RTSP (Real-Time Streaming Protocol), etc. These protocols suggest standardization, but they leave each manufacturer enough room to develop its own variant.
In other words, most IP systems are not automatically interoperable.
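To illustrate the layering, the sketch below wraps a payload in the 12-byte fixed RTP header. The function name `build_rtp_packet` and the example values are hypothetical; payload type 96 is taken from the dynamic range, and real encoders differ in how they packetize the video, which is part of the interoperability problem described here.

```python
import struct

RTP_VERSION = 2


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Prefix `payload` with a fixed RTP header (no padding, extension or CSRCs)."""
    byte0 = RTP_VERSION << 6  # version 2, padding=0, extension=0, CSRC count=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order, 12 bytes in total
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,   # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,        # 32-bit stream source identifier
    )
    return header + payload


# One 188-byte MPEG transport stream packet used as a dummy payload.
packet = build_rtp_packet(b"\x00" * 188, seq=1, timestamp=90000, ssrc=0x1234ABCD)
print(len(packet))  # 200
```

The resulting bytes would then be handed to a UDP socket, which adds the UDP and IP headers in turn.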
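As an impression of what such a session protocol looks like on the wire, the sketch below builds a minimal RTSP DESCRIBE request, the message a decoder typically sends to ask an encoder for a description of its stream. The helper name `rtsp_describe_request` and the URL are hypothetical; the address lies in a range reserved for documentation.

```python
def rtsp_describe_request(url, cseq=1):
    """Return a minimal RTSP/1.0 DESCRIBE request as bytes."""
    lines = [
        f"DESCRIBE {url} RTSP/1.0",
        f"CSeq: {cseq}",             # sequence number echoed in the response
        "Accept: application/sdp",   # ask for an SDP stream description
        "",                          # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Hypothetical encoder address and stream path.
print(rtsp_describe_request("rtsp://192.0.2.10/stream1").decode("ascii"))
```

The request format itself is fixed, but the URL layout, the optional headers and the contents of the returned description are where products tend to diverge.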
Theoretically, Network Video Recording (NVR) is simple: after all, the video streams are already digitized and can be written directly to the hard disk. However, streams for live viewing have a high resolution and full frame rate. The storage space required for this type of stream is huge and expensive. Therefore, CIF resolution (352x288 pixels) at only a few frames per second is generally used for simple background recording, which results in a significantly smaller video stream than the one used for live viewing.
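The difference is easy to estimate, because storage is simply bit rate multiplied by recording time. The bit rates below (4 Mbit/s for a live-viewing stream, 250 kbit/s for a CIF background stream) are assumed round numbers for illustration only; actual values depend on the codec, the scene and the settings.

```python
def storage_gigabytes(bitrate_bps, hours):
    """Storage needed for a constant-bit-rate stream, in decimal gigabytes."""
    seconds = hours * 3600
    return bitrate_bps / 8 * seconds / 1e9  # bits -> bytes -> GB


live = storage_gigabytes(4_000_000, 24)      # assumed 4 Mbit/s live-view stream
background = storage_gigabytes(250_000, 24)  # assumed 250 kbit/s CIF stream
print(f"{live:.1f} GB vs {background:.1f} GB per camera per day")
# 43.2 GB vs 2.7 GB per camera per day
```

Multiplied by the number of cameras and the retention period set by the guidelines, the gap between the two figures quickly decides the size of the disk array.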
Various solutions exist for storing larger video streams. Local storage on a hard disk that is placed in or near the camera is an option, but placing such a maintenance-prone component in the field is hardly a solution. Another possibility is to convert the video stream in the network video recorder itself, but this requires extra processor power or additional hardware. It is also possible to use multi-core codecs that are capable of sending two or more streams simultaneously. A fourth option is to only store relevant footage. In that case, one should use motion detection software or external triggers that activate a camera.
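The fourth option can be sketched with the simplest form of motion detection, frame differencing. This is a minimal illustration and not a description of any particular product: frames are plain lists of grayscale pixel rows, and the function names and both thresholds are assumptions chosen for the example.

```python
def motion_detected(prev_frame, frame, pixel_threshold=25, changed_fraction=0.01):
    """Return True if enough pixels changed between two grayscale frames."""
    changed = 0
    total = 0
    for prev_row, row in zip(prev_frame, frame):
        for before, after in zip(prev_row, row):
            total += 1
            if abs(before - after) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= changed_fraction


def frames_to_record(frames, **thresholds):
    """Yield the indices of frames that differ noticeably from their predecessor."""
    previous = None
    for index, frame in enumerate(frames):
        if previous is not None and motion_detected(previous, frame, **thresholds):
            yield index
        previous = frame


static = [[10] * 4 for _ in range(4)]   # 4x4 frame, uniform background
moved = [row[:] for row in static]
moved[1][2] = 200                       # one pixel changes brightness
print(list(frames_to_record([static, static, moved, moved])))  # [2]
```

In practice a recorder would also keep a few seconds before and after each trigger, so that the stored footage shows how an event started and ended.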
MAKING EVERYTHING INTEROPERABLE
For the operator, the system is an extension of his senses: sight, hearing, touch and memory. From his point of view, the system should only perform the tasks that it is supposed to perform. The intercommunication between network components such as codecs and software ensures that this is exactly what happens. All components have to "speak the same language" so that they can all "understand" each other.
The only way to guarantee that this intercommunication works correctly is to test a set-up in every conceivable situation. The large number of components that make up a video surveillance system, and their impact on the network load, often lead to unexpected errors.