TechCasts

AI in the Embedded World (Part 2)

In our second episode, we explore the integration of AI into safety-critical embedded systems, highlighting the difficulties of running AI on resource-constrained platforms, balancing performance with safety and security, and weighing the trade-offs between accuracy and system complexity. We also discuss strategies like model optimization, hardware acceleration, and system partitioning to tackle these challenges effectively.

Moreover, this episode emphasizes the importance of ensuring transparent decision-making and compliance with safety standards in AI-integrated systems to build a connected future that's both safe and secure, while navigating the complexities of AI certification and deployment.

Read the Transcript

Amani: Hello, and welcome back to the SYSGO TechCast! Today, let’s start with a scenario: Imagine you’re driving down a quiet street. Suddenly, a child steps into the road. In that critical moment, the AI in your car instantly recognizes the danger and applies the brakes, preventing what could have been a tragic accident. This is AI integrated directly into devices, operating in real time to enhance safety and responsiveness.

In our last episode, we discussed how this kind of technology is transforming our approach to safety across different domains, from automotive to healthcare. Today, we will dive even deeper into the technical challenges and security concerns of these AI systems.

From your understanding, Abdessalem, what are the most common challenges that can arise when integrating AI into embedded systems?

Abdessalem: Running inference or training an AI model on an embedded system presents challenges similar to those faced when performing any task on resource-constrained platforms. So, here we have to carefully select the processing architecture, optimize memory usage, and stay within strict power limitations.

Embedded systems often struggle to support AI models that demand significant resources. While inferencing usually requires less computational power than training, both processes are still heavily restricted by the hardware's limitations. The reason is that AI relies heavily on complex mathematical operations, particularly those involving multi-dimensional arrays, or tensors, which are fundamental for neural networks. Efficiently managing these tensors within the tight constraints of an embedded system is a significant challenge.

In contrast to Cloud environments, where resources can be scaled to meet the needs of AI applications, embedded systems often run multiple applications with different levels of criticality on the same hardware, especially in fields like automotive and industrial automation. This means the limited resources must be carefully allocated among all applications, not just the AI tasks, and this adds another layer of complexity.

While there are many challenges we could discuss, I want to specifically emphasize the critical importance of security and privacy here, because cyberattacks are a genuine threat to the future of intelligent embedded systems, whether they come from remote intrusions or physical access. AI systems are heavily reliant on data - often sensitive data - to make decisions. This makes them attractive targets for attackers who may seek to manipulate the AI’s behavior or steal information. My greatest concern is the potential for intelligent systems to be tampered with. As we increasingly delegate tasks to machines across various fields, the stakes are higher than ever. The last thing anyone wants is to be in a vehicle that fails to stop when it should, or to have their location exposed to unauthorized individuals.

These challenges lie ahead, they are significant, and the future brings new risks that must be addressed now. The future we want to see is precisely what SYSGO has been working towards for decades: a safe and secure connected future. It’s not just about designing robust AI applications, but also about ensuring that the underlying platforms running them are built to the highest standards of safety and security.

Amani: Given the challenges you mentioned with running AI on resource-constrained embedded systems, such as managing tensor operations within tight power and memory limits, could you share some specific practices or strategies to overcome these challenges without compromising system performance?

Abdessalem: The first step in addressing resource constraints is to opt for a less complex model. While maintaining overall accuracy is important, it is also crucial to recognize the trade-off between accuracy and complexity. Several techniques can help reduce model size without significantly impacting model performance. One such technique is Quantization, which converts high-precision numerical values to lower-precision formats and substantially reduces the model's memory footprint and computational load. Another method is Pruning, which removes less important or redundant weights and connections in a neural network to decrease the model's complexity and size.
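To make these two techniques concrete, here is a minimal sketch of magnitude-based pruning followed by post-training dynamic quantization in PyTorch. The small network, the layers chosen, and the 30% pruning ratio are illustrative assumptions for this episode, not a SYSGO or production implementation.

```python
# Minimal sketch: magnitude-based pruning followed by post-training dynamic
# quantization with PyTorch. The small network and the 30% pruning ratio are
# illustrative assumptions, not a production model.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small example network standing in for a resource-hungry model.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Pruning: zero out the 30% of weights with the smallest magnitude in the
# first layer, reducing the effective complexity of the network.
prune.l1_unstructured(model[0], name="weight", amount=0.3)
prune.remove(model[0], "weight")  # make the pruning permanent

# Quantization: convert the Linear layers from float32 to int8 for inference,
# shrinking the memory footprint and the computational load.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The quantized model is used for inference exactly like the original one.
example_input = torch.randn(1, 128)
print(quantized_model(example_input))
```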

In addition to these techniques, other strategies can further optimize performance under resource constraints. For example, Knowledge Distillation involves training a smaller model to replicate the behavior of a larger, more complex one while achieving comparable accuracy.
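As an illustration, the following is a minimal sketch of a single knowledge distillation training step in PyTorch, where the student learns from both the teacher's softened outputs and the true labels. The teacher and student sizes, the temperature, and the loss weighting are illustrative assumptions only.

```python
# Minimal sketch of knowledge distillation: a small "student" model is trained
# to match the softened outputs of a larger "teacher" model. Model sizes,
# temperature, and loss weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature, alpha = 4.0, 0.7  # softening factor and distillation weight

def distillation_step(x, labels):
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)

    # Soft targets: the student mimics the teacher's probability distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard targets: the student still learns from the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random data standing in for a real dataset.
x = torch.randn(16, 128)
labels = torch.randint(0, 10, (16,))
print(distillation_step(x, labels))
```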

Additional methods, such as Model Partitioning and Federated Learning, can also be employed depending on the AI use case. Model partitioning divides tasks between local Edge devices and Cloud servers, while federated learning enables multiple devices to collaboratively train a model without centralized data storage, maintaining privacy and reducing latency at the same time.
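For federated learning specifically, the sketch below shows the basic federated averaging (FedAvg) idea in PyTorch: each device trains on its own private data, and only the resulting model weights, never the raw data, are shared and averaged. The simulated devices, random data, and round count are placeholders, not a real deployment.

```python
# Minimal sketch of federated averaging (FedAvg): each device trains a copy of
# the model on its own local data, and only the model weights are sent back
# and averaged. Data, model, and round counts are illustrative assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

global_model = nn.Linear(20, 2)

def local_training(model, data, labels, epochs=1):
    """Train a local copy on one device's private data."""
    local_model = copy.deepcopy(model)
    optimizer = torch.optim.SGD(local_model.parameters(), lr=0.1)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = F.cross_entropy(local_model(data), labels)
        loss.backward()
        optimizer.step()
    return local_model.state_dict()

def federated_average(state_dicts):
    """Average the weights contributed by all participating devices."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
    return avg

# Three simulated devices, each holding its own private data.
devices = [(torch.randn(32, 20), torch.randint(0, 2, (32,))) for _ in range(3)]

for _ in range(5):  # communication rounds
    updates = [local_training(global_model, x, y) for x, y in devices]
    global_model.load_state_dict(federated_average(updates))
```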

Considering hardware optimizations is also important. Depending on the resource needs and the use case, approaches such as using massively parallel architectures like GPUs or FPGAs, leveraging tensor-optimized architectures, or integrating hardware accelerators with legacy architectures, such as RISC-V with custom AI instructions, can also significantly enhance AI performance.

Amani: Now as we shift our focus to safety: What are the issues that may arise when integrating AI into safety-critical systems, especially in relation to safety and security certifications? And how can these issues be addressed?

Abdessalem: Integrating AI into safety-critical systems always introduces several significant challenges due to the inherent characteristics of these systems and AI applications. These challenges can create obstacles in the design and certification of systems, particularly in critical domains such as automotive, aerospace, industrial automation, and healthcare.

Safety standards demand deterministic and predictable system behavior. This requirement ensures that the system consistently responds to inputs in a manner that can always be tested and validated. However, AI, especially machine learning models, often exhibits non-deterministic behavior. For example, AI models may alter their internal state based on new data or generate different outputs with slight input variations. This adaptability conflicts with the need for consistent and testable behavior, posing difficulties for safety certification.

Transparency in decision-making is another critical requirement for safety-critical systems. Each decision path must be traceable and comprehensible to ensure compliance with safety standards. However, AI models, specifically deep learning networks, are frequently perceived as "black boxes" due to their complexity and the difficulty in interpreting their decision-making processes. This opacity is at odds with safety requirements, where clear and justifiable decision-making is essential.

AI models often need updates or retraining to enhance performance, correct errors, or adapt to new data. These updates can alter the system's behavior, requiring re-certification. This creates a tension between the need for dynamic updates to maintain AI effectiveness and the requirement for static configurations in safety-certified systems. In addition, AI models typically rely on probabilistic approaches and may not perform optimally under every possible scenario. Ensuring that AI models can handle worst-case scenarios to the degree required by safety standards is therefore challenging, as they might excel in average conditions but struggle with edge cases.

The evolving nature of AI technology also presents challenges, as many existing safety standards do not fully address AI-specific issues such as model verification, training data bias, and the implications of AI decision-making processes. This gap between current standards and the needs of AI-integrated systems often leads to conflicts during the certification process.

Amani: So, in your opinion, how can those conflicts be addressed?

Abdessalem: To address these concerns, one approach is to design systems where AI applications handle non-critical tasks, while critical decisions are made by traditional, verifiable algorithms. For example, PikeOS allows the system to be divided into isolated partitions, each with its own level of criticality. This means AI applications can run in one partition while safety-critical tasks operate in another, with strong isolation between them. This setup prevents faults or unpredictability in the AI components from affecting the safety-critical parts of the system.

Let’s consider a Lane Departure Warning System (LDWS) as an example: an AI-based system that uses a camera to detect lane markings and warns the driver if the vehicle drifts out of its lane. Such an application can operate in a Linux-based partition within PikeOS. Since it is not safety-critical, it does not require the same level of timing precision as safety-critical functions like ESC or ABS. The safety-critical partition can be certified independently according to automotive safety standards, while keeping the AI partition outside that scope simplifies the certification process and still enables advanced AI features in the vehicle.

In cases where AI applications handle critical tasks, using more interpretable models can improve traceability and transparency in decision-making. Explainable AI (XAI) is an area gaining attention, focusing on making AI systems' decision-making processes more transparent and understandable to humans. However, a trade-off often exists between model interpretability and performance, as more interpretable models (e.g., linear models, decision trees) may be less accurate than complex models based on neural networks.

Another critical aspect is verifying edge cases: rare scenarios where typical environmental conditions do not apply. For instance, an autonomous vehicle using an AI system to identify pedestrians may struggle if trained only on clear-weather data. If the system fails to detect a pedestrian in poor visibility, such as darkness, this worst-case scenario could lead to severe consequences, as the vehicle may not take the necessary action. To address this, diffusion models can be used to generate diverse training data, such as high-resolution, realistic images of pedestrians in a wide range of conditions, which can significantly improve the system's ability to handle edge cases.
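As a rough illustration of that idea, the snippet below uses the Hugging Face diffusers library to generate synthetic pedestrian images under adverse conditions for augmenting a training set. The specific model checkpoint and prompts are assumptions made for illustration, not what any particular system actually uses.

```python
# Illustrative sketch: generating synthetic edge-case training images with a
# pre-trained diffusion model via the Hugging Face diffusers library.
# The checkpoint name and prompts below are assumptions for illustration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompts = [
    "a pedestrian crossing a street at night in heavy rain, dashcam view",
    "a pedestrian in thick fog at dusk, urban road, dashcam view",
]

# Each generated image can be added to the training set to cover conditions
# that are rare or missing in the originally collected data.
for i, prompt in enumerate(prompts):
    image = pipe(prompt).images[0]
    image.save(f"synthetic_pedestrian_{i}.png")
```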

Assessing the safety of AI systems is complex, and if they cannot fully meet safety requirements, it is essential to have a comprehensive understanding of the risks involved. It is also important to stay informed about ongoing efforts by certification authorities and bodies as they work to define the requirements for certifiable systems with AI applications.

Amani: Thank you, Abdessalem, for all the information and insights you shared with us! Coming to an end, what advice would you give to someone just starting their career in AI, specifically in embedded systems?

Abdessalem: During my modest experience with AI, one thing has become clear to me: the only constant in the AI field is rapid change! Think about how the field has shifted over the years. Neural networks, for example, were once considered niche, and now they're at the heart of deep learning. So, one of the best pieces of advice I have ever received is to diversify your knowledge base. Don’t jump into one specific area too early. Instead, focus on building skills that teach you how to learn effectively and adapt quickly. For example, rather than just becoming an expert in a specific machine learning framework (like TensorFlow), invest time in understanding the core principles - things like algorithms, data structures, and the mathematical foundations behind AI. At the end of the day, these fundamentals are timeless and will serve you well, even as specific tools and technologies change.

Amani: And with that we wrap up our discussion for today. Thank you for joining us and diving deep into the fascinating world of AI in embedded systems. We've covered a lot of ground, from ensuring safety and transparency to preparing for the rapid changes in the field. Remember, the journey in AI is as much about continuous learning and adaptation as it is about technology itself.

Keep exploring, stay curious, and join us next time for more insights into the evolving world of artificial intelligence.  

Until then, take care and keep innovating!


Listen to our TechCasts on YouTube.