Comparative Analysis of Transformer and CodeBERT for Program Translation
DOI:
https://doi.org/10.3126/nccsrj.v3i1.72334

Keywords:
Program Translation, Transformer, Code Bidirectional Encoder Representations from Transformers, Bilingual Evaluation Understudy (BLEU), Code Bilingual Evaluation Understudy (CodeBLEU)

Abstract
Program translation is the process of automatically converting the source code of a program written in one programming language into an equivalent program in another. This study compares a transformer model and a CodeBERT-based encoder-decoder model on the program translation task. Specifically, 6-layer and 12-layer variants of each model were trained for 50 and 100 epochs to translate programs from Java to Python and from Python to Java, using a corpus of 3,133 Java-Python parallel program pairs. Among the variants, the 6-layer transformer trained for 50 epochs achieved the highest Java-to-Python scores, with a BLEU of 0.28 and a CodeBLEU of 0.28, while the 6-layer transformer trained for 100 epochs achieved the highest Python-to-Java scores, with a BLEU of 0.39 and a CodeBLEU of 0.40. These results indicate that the transformer models outperform the CodeBERT models on this task, and that translation quality differs between the Java-to-Python and Python-to-Java directions.
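
To make the setup concrete, the sketch below shows one way to instantiate a CodeBERT-based encoder-decoder and score a candidate translation with BLEU, using the Hugging Face Transformers library and NLTK. This is a minimal illustration under assumed settings, not the authors' exact configuration: the checkpoint name, generation parameters, and smoothing choice are assumptions, and the model would need fine-tuning on the Java-Python parallel corpus before producing meaningful translations.

# Minimal sketch (assumed setup, not the paper's exact configuration):
# a CodeBERT-based encoder-decoder for Java-to-Python translation.
from transformers import EncoderDecoderModel, RobertaTokenizer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base")

# Initialize both encoder and decoder from pretrained CodeBERT weights;
# the decoder's cross-attention layers start untrained.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/codebert-base", "microsoft/codebert-base"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Translate a toy Java snippet (fine-tuning on the parallel corpus is
# required before the output is a usable Python program).
java_src = "public int add(int a, int b) { return a + b; }"
inputs = tokenizer(java_src, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Token-level BLEU against a reference translation, with smoothing so
# short sequences do not collapse to zero.
reference = tokenizer.tokenize("def add(a, b): return a + b")
candidate = tokenizer.tokenize(prediction)
bleu = sentence_bleu([reference], candidate,
                     smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {bleu:.2f}")

CodeBLEU, the second metric reported in the study, extends BLEU with syntax- and data-flow-aware components and is not part of NLTK; it requires a separate implementation.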