AI on the edge

Trend 9: Edge-based intelligence to address latency and point-specific contextual learning

Smart reply, grammar auto-suggestions, sentence completion while typing on a phone, voice recognition, voice assistants, facial biometrics to unlock a phone, autonomous vehicle navigation, robotics, and augmented reality applications all use local, natively deployed AI models to improve response time. In the absence of a local AI model, inference or prediction would depend on a remote server, and the experience would be suboptimal. Edge-based AI plays a quintessential role in remote locations where network connectivity may not be continuous, where response times must be fractions of a second and network latency cannot be afforded, and where hypercontextualization with user-specific data is required.

Edge-based AI is feasible because of significant improvements in specialized embedded hardware and software for edge processing, such as Google's tensor processing units (TPUs), field-programmable gate arrays (FPGAs), and GPUs.

At the edge, two things happen: inference or prediction, and training or learning. For inference or prediction, a lightweight model is deployed on the device. A model with training capability can also learn from the local context and synchronize with the central model at the appropriate time. The synchronization can be done by sharing only the model parameters, weights, features, etc., without compromising data privacy. Once the central model incorporates several such updates from different remote edge-based AI models, it can retrain and share the updated model footprint with all the edge devices or clients, so that everybody benefits from the central learning capacity. This process of distributed learning is called federated learning. It is employed as a strategy where sharing raw data faces challenges of privacy, shareability, network transport limitations, etc., but the benefits of abstracted learning through a central capacity are still needed.
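
As a minimal sketch of this synchronization loop, assuming a toy linear model and NumPy rather than any specific framework, each simulated edge client computes a local update on its private data and shares only the resulting weights, which the central server averages and redistributes:

```python
# Minimal federated-averaging sketch (toy model, illustrative only):
# each edge client trains locally and shares only parameters;
# the central server averages them and pushes the result back.
import numpy as np

def local_update(global_weights, local_data, lr=0.1):
    """One step of local training on an edge device (simple linear model)."""
    x, y = local_data
    pred = x @ global_weights
    grad = x.T @ (pred - y) / len(y)      # gradient of mean squared error
    return global_weights - lr * grad     # updated local weights

def federated_average(client_weights):
    """Server-side aggregation: only weights are shared, never raw data."""
    return np.mean(client_weights, axis=0)

rng = np.random.default_rng(0)
global_weights = np.zeros(3)

for round_ in range(5):                   # a few synchronization rounds
    client_weights = []
    for _ in range(4):                    # four simulated edge devices
        x = rng.normal(size=(20, 3))      # private, device-local data
        y = x @ np.array([1.0, -2.0, 0.5])  # hypothetical ground truth
        client_weights.append(local_update(global_weights, (x, y)))
    global_weights = federated_average(client_weights)

print("aggregated weights after 5 rounds:", global_weights)
```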

TensorFlow Lite provides a complete toolkit to convert TensorFlow models to TensorFlow Lite models that can run on edge devices. Even with smaller models trained on less data, TensorFlow Lite benefits from CPU and GPU acceleration on the device. MobileNet models adapt several state-of-the-art convolutional neural network architectures for on-device use through network architecture patterns such as depth-wise separable convolutions and hyperparameter optimization of width and resolution multipliers, with the corresponding trade-offs in accuracy and latency.
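
A minimal conversion sketch follows, assuming a Keras MobileNetV2 model and default post-training optimization; the model choice, width multiplier, and output file name are illustrative, not the specific configuration described above:

```python
# Convert a Keras MobileNet model to TensorFlow Lite for edge deployment.
import tensorflow as tf

# MobileNetV2 with a reduced width multiplier (alpha) trades accuracy for size.
model = tf.keras.applications.MobileNetV2(weights=None, alpha=0.5)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default quantization
tflite_model = converter.convert()

with open("mobilenet_v2_edge.tflite", "wb") as f:
    f.write(tflite_model)                             # deployable on edge devices
```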

A large telecom company, in collaboration with Infosys, developed a robust video intelligence solution for smart spaces. The solution ingests data through the real-time streaming protocol (RTSP) from CCTV cameras, runs deep learning and computer vision models to detect people across feeds, and tracks their movement. It derives insights such as people density in an area, ingress/egress counts, dwell time analysis, and wait time analysis.
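
The ingest-and-count loop might look like the sketch below. The RTSP URL is a placeholder, and a classical OpenCV HOG person detector stands in for the deep learning models used in the actual solution, so the example stays self-contained:

```python
# Read an RTSP feed and count people per frame (simplified stand-in pipeline).
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture("rtsp://camera.example/stream")  # placeholder CCTV feed

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    print(f"people detected in frame: {len(boxes)}")     # basis for density counts

cap.release()
```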

These models have been extended for social distancing compliance. Infosys also developed an elevated body temperature detection solution as part of its COVID-19 response. The solution ingests feeds from thermal cameras into edge devices, runs deep learning and computer vision models on the edge to deduce a person's temperature from the thermal feed, and then compares it against a set threshold to detect elevated body temperature as an indication of febrile symptoms.
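
The threshold check itself can be sketched as below, assuming the upstream pipeline already yields a thermal frame in degrees Celsius and a detected face bounding box; the threshold value is illustrative, not a clinical standard:

```python
# Flag elevated body temperature from a thermal frame and a face bounding box.
import numpy as np

FEVER_THRESHOLD_C = 38.0   # illustrative threshold, not a clinical standard

def is_elevated(thermal_frame: np.ndarray, face_box: tuple) -> bool:
    x, y, w, h = face_box
    face_region = thermal_frame[y:y + h, x:x + w]
    return float(face_region.max()) >= FEVER_THRESHOLD_C

frame = np.full((120, 160), 36.5)            # simulated thermal frame
frame[40:60, 70:90] = 38.4                   # simulated warm face region
print(is_elevated(frame, (70, 40, 20, 20)))  # -> True
```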

Infosys has also built India's first commercial autonomous golf cart using an autonomous driving platform. The vehicle uses deep learning and computer vision models to detect objects and lanes for autonomous navigation, along with simultaneous localization and mapping (SLAM) and visual SLAM (vSLAM) models. The first autonomous vehicle is now deployed in a leading automotive OEM's plant.