AI on the edge

Trend 8: Address latency and point-specific contextual learning with edge-based intelligence

Smart Reply, grammar auto-suggestions, sentence completion while typing on a phone, voice recognition, voice assistants, facial biometrics to unlock a phone, autonomous vehicle navigation, robotics and augmented reality applications all use local, natively deployed AI models to improve response time to user actions. Without a local AI model, inference or prediction would have to run on a remote server, and the experience would be markedly worse. Edge-based AI therefore plays a key role in delivering a responsive user experience.

Edge-based AI is critical in remote locations, where network connectivity may be intermittent, where response times must be fractions of a second and network latency cannot be tolerated, and where hypercontextualization with user-specific data in the local environment is required.

Edge-based AI is feasible because of significant improvements in specialized embedded hardware and software for edge processing, such as the Google Tensor Processing Unit (TPU), field-programmable gate arrays (FPGAs) and GPUs.

At the edge, two things typically happen: inference (prediction) and training (learning). For inference, a lightweight model is deployed locally to make predictions. A model with training capability can learn from the local context and, at an appropriate time, synchronize with the central model. Synchronization can be done by sharing only the model parameters (weights, features and so on) without sharing the actual data, thus preserving data privacy. Once the central model has incorporated several such updates from different remote edge-based AI models, it can retrain and share the updated model footprint with all the edge devices or clients, ensuring that everybody benefits from the central learning capacity. This process of distributed learning is called federated learning. It is typically employed where sharing raw data faces challenges of data privacy, shareability or network transport limitations, but the benefits of the abstracted learning available through the central capacity are still needed.
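The aggregation step described above can be sketched in a few lines. The following is a minimal, illustrative example (not any specific framework's API): each hypothetical client reports only its locally trained weights and a sample count, and the server merges them with a sample-weighted average, the core idea behind federated averaging.

```python
# Minimal federated-averaging sketch. Each edge client shares only its
# model weights plus the number of local samples; raw data never leaves
# the device. The server merges updates into a new global model.

def federated_average(client_updates):
    """client_updates: list of (weights, num_samples) pairs,
    where weights is a flat list of floats."""
    total_samples = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    merged = [0.0] * dim
    for weights, n in client_updates:
        share = n / total_samples  # clients with more data weigh more
        for i, w in enumerate(weights):
            merged[i] += w * share
    return merged

# Three hypothetical edge devices report locally trained weights.
updates = [
    ([0.2, 0.4], 100),
    ([0.4, 0.2], 300),
    ([0.6, 0.6], 100),
]
global_weights = federated_average(updates)
```

The merged `global_weights` would then be broadcast back to every edge device, completing one round of the synchronization loop described above.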

TensorFlow Lite provides a complete toolkit to convert TensorFlow models into a format that can run on edge devices; the converted models can also take advantage of CPU and GPU acceleration. MobileNet models adapt state-of-the-art CNNs to on-device constraints through architectural techniques such as depthwise separable convolutions, and through width and resolution multiplier hyperparameters, with corresponding trade-offs between accuracy and latency.
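The saving from depthwise separable convolutions can be seen with simple arithmetic (illustrative only, not a framework implementation). A standard convolution with a DK x DK kernel, M input channels and N output channels costs DK*DK*M*N multiply-adds per output position; factoring it into a depthwise pass plus a 1x1 pointwise pass cuts that to DK*DK*M + M*N.

```python
# Cost comparison behind MobileNet's depthwise separable convolutions.

def standard_conv_cost(dk, m, n):
    # One DK x DK kernel per (input channel, output channel) pair.
    return dk * dk * m * n

def separable_conv_cost(dk, m, n):
    # Depthwise pass (DK*DK per input channel) + 1x1 pointwise pass (M*N).
    return dk * dk * m + m * n

# A typical mid-network layer: 3x3 kernel, 256 channels in and out.
dk, m, n = 3, 256, 256
ratio = separable_conv_cost(dk, m, n) / standard_conv_cost(dk, m, n)
# The ratio works out to 1/N + 1/DK^2, roughly an 8-9x reduction here.
```

This order-of-magnitude reduction in multiply-adds is what makes such models practical on phone-class hardware.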

Infosys partnered with a large European car manufacturer to use edge computing to identify and predict failures of spindle machines in a brownfield environment through an “internet of things” gateway. This helped in setting up a cost-effective solution for handling the large data feeds coming from the spindle machine.

A large global mining company used wearable devices for safety monitoring of miners on the move, sourcing the data to the cloud for processing on a near-real-time basis.

For a large global manufacturer, optimization of the engine maintenance, repair and overhaul shop floor is driven through edge computing, which provides the availability and predictability of the machines for scheduling operations.


To keep yourself updated on the latest technology and industry trends, subscribe to the Infosys Knowledge Institute's publications.

Infosys TechCompass