Codemod AI: An experiment with multiple before and after examples

June 20, 2024 · 3 min read

Posted By

Pardis Pashakhanloo, Founding AI Engineer

In a previous technical post, we introduced an iterative approach to improve the accuracy of autogenerated codemods by integrating deterministic software engineering tools with AI. This approach significantly improved accuracy when using a single pair of before and after code examples to generate a codemod.

In real-world use, however, users typically provide more than one example pair in order to cover various edge cases. To make our approach more practical and effective, we added the ability to generate a codemod from multiple pairs of before and after code examples. We then evaluated this new feature on our public registry.
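To make the idea of multiple example pairs concrete, here is a minimal sketch of how a candidate codemod could be scored against several before and after pairs. It assumes a codemod can be modeled as a plain function from source text to source text; the names `pass_rate` and `rename_fetch` and the example pairs are hypothetical and are not taken from Codemod AI itself.

```python
from typing import Callable, Sequence, Tuple

# Assumption for illustration: a codemod is any function from source text to source text.
Codemod = Callable[[str], str]
ExamplePair = Tuple[str, str]  # (before, after)


def pass_rate(codemod: Codemod, pairs: Sequence[ExamplePair]) -> float:
    """Return the fraction of (before, after) pairs the codemod reproduces."""
    if not pairs:
        raise ValueError("at least one before/after pair is required")
    passed = 0
    for before, after in pairs:
        try:
            # Compare ignoring leading/trailing whitespace only.
            ok = codemod(before).strip() == after.strip()
        except Exception:
            ok = False  # a crashing codemod counts as a failed test case
        passed += ok
    return passed / len(pairs)


# Toy codemod, for illustration only: rename a function at its call sites.
def rename_fetch(source: str) -> str:
    return source.replace("oldFetch(", "newFetch(")


# Hypothetical example pairs; the third is an edge case the toy codemod misses.
pairs = [
    ("oldFetch(url)", "newFetch(url)"),
    ("await oldFetch(a, b)", "await newFetch(a, b)"),
    ("const f = oldFetch", "const f = newFetch"),
]

print(pass_rate(rename_fetch, pairs))
```

Each additional pair acts as one more constraint: the toy codemod handles the two call sites but not the bare reference, so it only partially satisfies the set.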

Experimental Setup

Evaluation Results

Satisfying more constraints potentially means more sophisticated codemods, i.e., codemods that are harder to generate with AI. The following analysis quantifies this difficulty. With three refinement iterations, Codemod AI generates codemods that pass at least half of their test cases for 45 of 57 codemods (78.95%). However, codemod generation becomes more difficult as the number of before and after pairs increases, which shows up as a drop in accuracy. The following chart summarizes these results.
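The headline metric can be reproduced with a few lines of arithmetic. The sketch below assumes each codemod has a per-codemod pass rate (the fraction of its test cases that pass) and counts how many meet a threshold of one half. The function name `share_meeting_threshold` and the sample pass rates are made up for illustration; only the 45 of 57 figure comes from the evaluation.

```python
from typing import Sequence


def share_meeting_threshold(pass_rates: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of codemods whose pass rate is at least `threshold`."""
    if not pass_rates:
        raise ValueError("no codemods to evaluate")
    hits = sum(1 for rate in pass_rates if rate >= threshold)
    return hits / len(pass_rates)


# Made-up pass rates for four hypothetical codemods (not real results).
sample = [1.0, 0.5, 0.25, 0.0]
print(share_meeting_threshold(sample))  # 2 of 4 meet the threshold

# The reported figure: 45 of 57 codemods pass at least half of their test cases.
print(f"{45 / 57:.2%}")  # 78.95%
```

Note that the threshold is inclusive, so a codemod that passes exactly half of its test cases counts toward the total.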

Discussion

The results demonstrate that our iterative approach continues to be effective when dealing with multiple before and after examples. Even when the model cannot perfectly generalize to all examples, it can capture at least some of the desired transformations.

This evaluation is our most realistic yet, as it guides the model to create a codemod that passes all given test cases. Future work will focus on improving generalization and exploring techniques to leverage multiple examples more effectively within the iterative refinement process.
