Watch and learn

Posted to Articles on Monday, February 28, 2022

IF ONE phrase could sum up the advances coming to the video surveillance industry in 2022, it’s ‘deep learning’. This sub-set of Artificial Intelligence is progressing at a monumental pace, asserts Uri Guterman, in turn unlocking untold potential for surveillance system end users.

As American billionaire tech entrepreneur and investor Mark Cuban has advised: “Artificial Intelligence, deep learning, machine learning... Whatever you’re doing, if you don’t understand this terminology then learn all about it. Otherwise, you’re going to be a dinosaur within three years.”

Essentially, everything that Artificial Intelligence (AI)-powered cameras and video analytics deliver in the coming years will stem from deep learning.

It wasn’t long ago that security teams used video solely to detect and deter crimes like theft and vandalism. Even when correctly positioned, traditional cameras captured images that could only tell a security team what was happening as it occurred. The cameras wouldn’t be expected to go beyond this narrow scope of work.

Further, in the 1990s and early 2000s, we simply didn’t have the hardware to run the complex algorithms and models required for deep learning, and certainly not on camera devices themselves.

All of that’s changing. Advances in those devices mean that the images – and audio – captured relay greater detail to operators along with a wealth of data. It’s impossible for a human to sift through all of this information, which is where AI and automation come in. Now, the industry is going a step further with deep learning to achieve added value from a surveillance installation.

The good news here is that deep learning applications are available at the edge and are also easy to implement, with training generally being available from – and offered by – the application developers themselves.

Learning by example

For those new to the terminology, ‘deep learning’ describes a form of AI where a model is ‘trained’ on hundreds of thousands of data sets to learn by example and from experience. In video analytics, deep learning involves training a mathematical network structure, inspired by the neurons of the human brain, to identify and recognise specific objects, vehicles and human attributes or behaviour with human-like accuracy.

In short, it can then tell the difference between a bicycle and a truck or a male child versus an older woman.

If you still believe deep learning is reserved solely for Hollywood plotlines, you’re sorely mistaken. Deep learning has become mainstream, particularly so in business-to-consumer markets where the likes of Disney use it to improve the visitor experience in its theme parks and Burberry to detect handbag authenticity. Even Netflix employs deep learning to recommend the next ‘binge-watch’.

The pace at which deep learning-based applications are being implemented is accelerating – not to mention their increasing ease-of-use and scalability – so it’s really imperative for security leaders to keep pace with it.

In terms of use cases, HD cameras and connected sensors, coupled with video management systems and deep learning, now have a broad application beyond the traditional security remit. This includes boosting productivity, improving the customer experience, creating a safer working environment, proactive maintenance and minimising claims for lost goods and damages.

One capability of today’s advanced AI cameras is detecting suspicious behaviour. By monitoring what the ‘norm’ looks like, behaviour such as loitering, or someone following closely behind another person, can be flagged for security personnel.

A retail store worker taking things from a stock room or skimming the cash drawer – even someone picking up an object and quickly moving towards a store’s exit – can be detected and those security staff on duty duly alerted.

Likewise, potential anti-social behaviour and vandalism are detectable. A group of youths massing in an area could trigger an alert (or, assuming integration with a PA system, initiate an automatic dispersal message to play over speakers). Someone displaying threatening behaviour, or raising their voice, can be picked up through an audio system. Deep learning models are so accurate that they can differentiate between individuals dancing and individuals fighting.

It isn’t just applicable to crime deterrence, though. Crowd formation alerts can be issued when the number of people in an area increases beyond predefined parameters. This is helpful in many contexts, from improving a customer’s in-store experience, or making a commute more pleasant in a station, right through to supporting social distancing efforts in an office.
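
As a minimal sketch of how such a crowd formation alert might work, the snippet below fires once when a people count rises above a predefined limit and re-arms only after the count has dropped back. The class name, the limit of ten people and the two-person clearing margin are illustrative assumptions rather than any vendor’s actual settings.

```python
from dataclasses import dataclass


@dataclass
class CrowdAlert:
    """Hypothetical alert rule: fire once when a count exceeds a limit."""
    max_people: int        # the predefined parameter for this area (assumed)
    clear_margin: int = 2  # assumed margin so the alert does not flicker
    active: bool = False

    def update(self, people_count: int) -> bool:
        """Return True only on the update where the alert first triggers."""
        if not self.active and people_count > self.max_people:
            self.active = True
            return True
        if self.active and people_count <= self.max_people - self.clear_margin:
            self.active = False  # crowd has dispersed, so re-arm the rule
        return False


rule = CrowdAlert(max_people=10)
counts = [8, 10, 11, 12, 10, 9, 8, 11]  # illustrative per-minute counts
fired = [rule.update(c) for c in counts]
```

Here the alert triggers at the first count of 11, stays quiet while the crowd persists and triggers again only after the count has fallen to eight and then risen once more.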

Emergency response

It’s tough to monitor a site 24/7 and difficult in the extreme if the security team has to keep on top of multiple dispersed sites with different installations. Human operators can tire and become task-bored. With deep learning, though, not only is the detection itself much more accurate, but also the camera is trained to classify types of objects such that it knows when the alert is actually a person as opposed to an animal jumping over the fence, thus avoiding false alarms.

This ability to minimise time-wasting and reduce the occurrence of costly false alarms means that Control Room operators and security personnel are able to focus on responding to real incidents and emergencies.
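
The false alarm filtering described above can be sketched as a simple rule applied to a model’s classified detections. The class labels, the 0.6 confidence threshold and the `Detection` structure are assumptions made for illustration and not the output format of any particular camera.

```python
from typing import Iterable, NamedTuple


class Detection(NamedTuple):
    label: str         # class name reported by a hypothetical on-camera model
    confidence: float  # model score between 0 and 1


ALERT_CLASSES = {"person", "vehicle"}  # assumed classes worth an alarm
MIN_CONFIDENCE = 0.6                   # assumed threshold


def should_alert(detections: Iterable[Detection]) -> bool:
    """Raise an alarm only for confident detections of a relevant class."""
    return any(
        d.label in ALERT_CLASSES and d.confidence >= MIN_CONFIDENCE
        for d in detections
    )


animal_at_fence = [Detection("animal", 0.91)]
intruder = [Detection("animal", 0.40), Detection("person", 0.88)]
```

With these assumed values, `should_alert(animal_at_fence)` is false, while `should_alert(intruder)` is true because a person has been detected with sufficient confidence.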

Through deep learning, AI can monitor hundreds of camera feeds for emergencies like someone falling, a potential mugging, flooding and the outbreak of fire. Moreover, deep learning models trained to differentiate between individuals are able to support teams in finding a missing child. Audio analysis can recognise critical sounds like a window breaking, people screaming or shouting, and explosions. Alarms and relevant information may be relayed to on-site security teams and, where appropriate, any first responders from the Emergency Services.

Deep learning can also help to boost operators’ situational awareness. Accurate object classification can help them decide what to focus on in a frame, for example an individual with glasses carrying a bag or a specific type of vehicle, such as a blue van. If triggered by an alarm event, cameras can automatically lock on to people or vehicles that are potentially involved, following them around a camera’s field of view and helping to direct security personnel towards them.

This ability isn’t just helpful as a crime response and investigation tactic. It can also assist operators in feeling less overwhelmed when monitoring activity in open environments like car parks, stadiums, city centres, Shopping Centres, airports and busy stations.

Business intelligence

Occupancy monitoring tracks how many individuals are in a specific space. This may help to support public health measures and inform site managers on how an area is being used, pinpointing those zones of high footfall (which likely need more cleaning and maintenance).

Meanwhile, cameras in warehouses and factories can monitor equipment, not only to ensure that it isn’t stolen, but also to track its performance and warn a maintenance team when performance drops. In turn, this avoids any unexpected and unwanted downtime.

So far, we’ve examined analytics assisting in real-time. Deep learning can also help with investigations post-event.

AI can help operators to scan through hours of footage in minutes, looking for specific features or video of an event. It pinpoints exactly what an operator needs, including people wearing specific clothing items or those of a certain age, as well as different vehicle makes, colours and number plates. Such capabilities mean that shoplifters can be quickly apprehended or stolen vehicles tracked down by the authorities.

Here, the footage will not only help in swiftly identifying and catching suspects, but may also be used as evidence in a Court of Law if legal proceedings should occur.
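
Post-event search of this kind typically runs over the metadata that analytics attach to recorded footage rather than over the video itself. The sketch below assumes a hypothetical `Event` record with a set of descriptive tags. The field names and sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One hypothetical metadata record written by a camera's analytics."""
    time: str              # recording timestamp
    camera: str
    object_type: str       # "person" or "vehicle"
    attributes: frozenset  # descriptive tags, such as {"blue", "van"}


def search(events, object_type, required_attributes):
    """Return events of the given type carrying every requested attribute."""
    wanted = frozenset(required_attributes)
    return [
        e for e in events
        if e.object_type == object_type and wanted <= e.attributes
    ]


events = [
    Event("14:02:10", "car-park-1", "vehicle", frozenset({"blue", "van"})),
    Event("14:05:42", "entrance", "person", frozenset({"glasses", "bag"})),
    Event("14:09:03", "car-park-2", "vehicle", frozenset({"red", "car"})),
    Event("14:11:27", "exit", "person", frozenset({"bag"})),
]

blue_vans = search(events, "vehicle", {"blue", "van"})
bag_carriers = search(events, "person", {"bag"})
```

An operator looking for a blue van would be pointed straight to the single matching clip from the first car park camera, instead of reviewing every recording by hand.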

Another often overlooked benefit that deep learning brings to video surveillance is in improving video footage and optimising bandwidth. With hours of video being recorded across multiple areas and sites, storage space and bandwidth are necessarily at a premium. AI-based compression technology automatically applies a low compression rate to detected objects and a high compression rate to the rest of the image, which minimises the bandwidth and storage required.
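
To give a rough feel for the saving, the sketch below blends two bitrates according to how much of the frame is occupied by detected objects. This is a deliberately simplified model, and the figures used (4 Mbps at low compression, 1 Mbps at high compression, objects covering 10% of the frame) are illustrative assumptions, not measurements from any real camera.

```python
def blended_bitrate(object_fraction, low_compression_mbps, high_compression_mbps):
    """Estimate the stream bitrate when only detected objects keep full detail.

    Simplifying assumption: bitrate scales linearly with the share of the
    frame encoded at each compression level.
    """
    if not 0.0 <= object_fraction <= 1.0:
        raise ValueError("object_fraction must be between 0 and 1")
    return (object_fraction * low_compression_mbps
            + (1.0 - object_fraction) * high_compression_mbps)


def storage_gb_per_day(mbps):
    """Convert a constant bitrate in Mbps to decimal gigabytes per day."""
    seconds_per_day = 24 * 60 * 60
    return mbps * seconds_per_day / 8 / 1000


uniform = 4.0                               # assumed: whole frame at low compression
selective = blended_bitrate(0.10, 4.0, 1.0)  # assumed: objects cover 10% of frame
saving = 1.0 - selective / uniform
```

Under these assumptions the stream falls from 4 Mbps to about 1.3 Mbps, a saving of roughly two thirds in both bandwidth and daily storage per camera.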

AI plays a vital role in providing high-quality images. AI-based noise reduction technology, for example, intelligently identifies movement to reduce blur in noisy low-light environments. AI-based preferred shutter technology adjusts the shutter speed depending on motion and lighting conditions.

Finally, AI-based Wide Dynamic Range with scene analysis technology takes full advantage of image contrast to see detailed objects clearly even in those environments where strong backlight conditions exist.

Accessibility of AI

Moving on from the use cases, it’s also worth knowing that deep learning is becoming much more accessible and cost-effective due to edge computing. The global edge AI market size is predicted to grow from US$590 million in 2020 to US$1,835 million by 2026, with video and image recognition having the largest share of this growth.
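
For context, the implied annual growth rate can be worked out from the two figures quoted above. The short calculation below assumes six years of compounding between 2020 and 2026. The resulting rate is derived here for illustration and isn’t quoted by the forecast’s source.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly growth rate linking two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in US$ million, as quoted in the text
rate = compound_annual_growth_rate(590, 1835, 2026 - 2020)
print(f"Implied annual growth: {rate:.1%}")
```

That works out at a little under 21% per year.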

Edge AI essentially means that the cameras themselves are pre-loaded with video analytics functions like people counting and accurate object detection and classification. This cuts down on the amount of data that has to be transmitted back to a server or, in some instances, removes the need for a server entirely. It also opens up more opportunities for security teams to run on-board analytics without an extensive video management system (VMS) or network video recorder (NVR) set-up.

This makes AI cameras better suited to installations where scalability is a priority (where the sites to be monitored are highly distributed or remote, for example). It also works well for smaller installations (eg a car park) where a more cost-effective solution not reliant on a VMS or NVRs is demanded.

Cameras that run AI on the edge begin analysing as soon as they’re plugged in and receiving footage. This brings all of the power of AI to the security function, without the need for investment in specialist knowledge or data science skills to implement the deep learning model. The system will begin work and deliver insights within hours or days. No coding is required.

Ethical usage

It’s critical for detailed discussions around the ethical use of AI in video surveillance to occur if we are to safeguard essential data protection and privacy-related rights. We shouldn’t just limit the use of AI to what’s permissible by law, but in parallel must always set very clear policies that ensure respect for shared human values.

Compliance with legislation including the European Union’s General Data Protection Regulation is becoming increasingly important. For their part, surveillance system manufacturers must always seek to build in AI with responsible use in mind. AI-based live view privacy masking technology, for example, is an important development in ensuring that surveillance-related data protection and privacy rules are respected and observed at all times.

Uri Guterman is Head of Product and Marketing at Hanwha Techwin Europe (www.hanwha-security.eu)