From product design and creative content to software engineering, artificial intelligence is steadily changing how we build and interact with technology. But while AI can speed up development, the real challenge lies in trusting its output, particularly in safety-critical applications. How can we ensure that AI-generated software is correct, secure, and efficient under real-world conditions?
Bosch Research sees great promise in the generate-and-check approach for driving innovation and practical impact. It pairs generative AI, which proposes solutions, with systematic checks that enforce correctness, safety, and performance. The result balances AI creativity with rigor, a balance that fits software engineering particularly well.
How Generate-and-Check Works
Think of solving a crossword puzzle: you may try out different words, but each candidate is checked against the answer's length and the letters already in place. Similarly, in software engineering, AI can generate new code or refactor existing code, while automated checks verify compliance with rules and desired outcomes.
These rules can be as simple as coding-style enforcement or as advanced as formal verification of software properties. Rather than verifying every possible system state, the approach verifies each concrete AI proposal, ensuring safety, correctness, and adherence to requirements.
The result is AI assistance that is less error-prone and requires far less constant human oversight.
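To make the loop concrete, here is a minimal sketch in Python. It assumes a hypothetical propose_patch function wrapping a generative model and a list of check functions; these names are illustrative only and do not refer to actual Bosch Research tooling.

```python
# Illustrative generate-and-check loop; `propose_patch` and the checks are
# hypothetical placeholders, not Bosch tooling.
from typing import Callable, Optional

Check = Callable[[str], bool]  # a check takes candidate code and returns pass/fail


def generate_and_check(task: str,
                       propose_patch: Callable[[str, int], str],
                       checks: list[Check],
                       max_attempts: int = 5) -> Optional[str]:
    """Request candidates from the generator until one passes every check."""
    for attempt in range(max_attempts):
        candidate = propose_patch(task, attempt)       # generative step
        if all(check(candidate) for check in checks):  # systematic checks
            return candidate                           # accepted proposal
    return None                                        # no acceptable candidate


# Checks can be as lightweight as a style rule ...
def style_check(code: str) -> bool:
    return all(len(line) <= 100 for line in code.splitlines())

# ... or as heavyweight as running a test suite or a formal verifier,
# wrapped behind the same Check signature.
```

The key design point is that the generator never decides on its own: every proposal must clear the same battery of checks before it is accepted.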
Use Case 1: Smarter Code Refactoring
Refactoring is a natural fit for generate-and-check. The AI proposes improvements, e.g., migrating to more efficient frameworks, while automated checks verify that the new version is equivalent to the old code.
Unlike traditional approaches that rely mostly on unit tests, this one guarantees behavioral invariance: the refactored code behaves exactly the same while improving maintainability or efficiency. Tools developed at Bosch Research also support profiling, to confirm that performance has stayed the same or improved after the changes.
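As a rough illustration of such an equivalence check, the sketch below uses differential testing on sampled inputs plus a crude timing comparison, a lighter-weight stand-in for the stronger checks described above; all function names are invented for this example and do not refer to Bosch tools.

```python
# Differential check for a refactoring: same behaviour, equal or better speed.
# `legacy_impl` and `refactored_impl` stand in for the old and new code.
import random
import time
from typing import Callable, Iterable


def behaviorally_equivalent(legacy: Callable[[int], int],
                            refactored: Callable[[int], int],
                            inputs: Iterable[int]) -> bool:
    """Both versions must agree on every sampled input."""
    return all(legacy(x) == refactored(x) for x in inputs)


def not_slower(legacy: Callable[[int], int],
               refactored: Callable[[int], int],
               inputs: list[int],
               tolerance: float = 1.10) -> bool:
    """The refactored version may be at most 10% slower under this crude timing."""
    def timed(fn: Callable[[int], int]) -> float:
        start = time.perf_counter()
        for x in inputs:
            fn(x)
        return time.perf_counter() - start
    return timed(refactored) <= tolerance * timed(legacy)


# Example usage with two toy implementations of the same function.
def legacy_impl(n: int) -> int:
    return sum(range(n + 1))      # straightforward summation loop


def refactored_impl(n: int) -> int:
    return n * (n + 1) // 2       # closed-form equivalent


samples = [random.randint(0, 10_000) for _ in range(1_000)]
assert behaviorally_equivalent(legacy_impl, refactored_impl, samples)
assert not_slower(legacy_impl, refactored_impl, samples)
```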
Use Case 2: Reliable Software Translation
Software translation is another area where AI excels but still demands human oversight. Translating legacy code into a safer or more modern language is appealing, but traditional transpilers often fail to produce code that is idiomatic in the target language.
With generate-and-check, AI can translate code idiomatically while automated tools verify functional correctness, safety, security, and performance. This makes it possible to modernize codebases at scale without silently introducing vulnerabilities.
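One plausible functional-correctness check in such a pipeline is to replay inputs recorded from the legacy system against the translated program and compare outputs. The sketch below assumes a hypothetical golden-test file format and says nothing about Bosch's actual tooling.

```python
# Golden-output check for a translated program: the translated binary must
# reproduce the legacy program's recorded outputs. Paths and file format are
# hypothetical, for illustration only.
import json
import subprocess
from pathlib import Path


def passes_golden_tests(translated_binary: Path, golden_file: Path) -> bool:
    """Replay recorded (input, output) pairs against the translated program."""
    cases = json.loads(golden_file.read_text())
    for case in cases:
        result = subprocess.run(
            [str(translated_binary)],
            input=case["stdin"],
            capture_output=True,
            text=True,
            timeout=10,
        )
        if result.returncode != 0 or result.stdout != case["expected_stdout"]:
            return False
    return True


# In a generate-and-check pipeline this would be one check among several,
# alongside e.g. sanitizer runs, static analysis, and performance profiling.
```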
Embedding into the Developer Workflow
AI becomes truly valuable for developers when it integrates with their existing toolchains. Generate-and-check can take several forms:
IDE plugins for quick, low-latency assistance during coding.
Background workflows for longer tasks, such as legacy migration, where AI proposals are rolled out as pull requests. Each PR can carry evidence, such as performance metrics or validation results, so developers keep their agency while benefiting from automated rigor (a sketch of such evidence follows below).
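As a loose sketch of what such evidence could look like when attached to a pull request, the snippet below renders check results into a PR description; the check names, numbers, and format are assumptions for illustration, not a Bosch workflow.

```python
# Assembling evidence for an AI-generated pull request (illustrative only).
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def pr_description(summary: str, results: list[CheckResult]) -> str:
    """Render the proposal plus its evidence so reviewers can decide quickly."""
    lines = [summary, "", "Automated evidence:"]
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        lines.append(f"- [{status}] {r.name}: {r.detail}")
    return "\n".join(lines)


print(pr_description(
    "Migrate logging module to a newer framework",
    [
        CheckResult("behavioral equivalence", True, "1,000 sampled inputs agree"),
        CheckResult("performance", True, "p95 latency unchanged"),
        CheckResult("static analysis", True, "no new warnings"),
    ],
))
```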
This keeps AI an aid rather than a substitute: it offers reliable recommendations while developers make the final decisions.
Looking Forward
The generate-and-check paradigm is not merely a technical approach but a shift in mindset toward trustworthy AI in software engineering. By combining AI's generative power with reliable verification, it enables safer, better, and more efficient software development.
(This article has been adapted and modified from content published by Bosch.)