Phrase-Based Statistical Translation of Programming Languages
Phrase-based statistical machine translation approaches have been highly successful in translating between natural languages and are heavily used by commercial systems (e.g. Google Translate).
The main objective of this work is to investigate the applicability of these approaches to translating between programming languages. To that end, we investigated several variants of the phrase-based translation approach: i) a direct application of the approach to programming languages, ii) a novel modification of the approach that incorporates the grammatical structure of the target programming language (so as to avoid generating target programs that do not parse), and iii) a combination of ii) with custom rules added to improve the quality of the translation.
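To make the phrase-based idea concrete, the following is a minimal sketch and not the system described in this work. It assumes a tiny hand-written phrase table (the entries and probabilities in `PHRASE_TABLE` are invented for illustration), uses greedy longest-match decoding in place of the beam search and language model a real phrase-based decoder would use, and substitutes a bracket-balance check (`is_balanced`, a hypothetical helper) for an actual parser of the target language.

```python
# Toy illustration only: the phrase table below is hypothetical and hand-written,
# not learned from a parallel corpus as in a real phrase-based system.
PHRASE_TABLE = {
    ("Console", ".", "WriteLine"): [(("System", ".", "out", ".", "println"), 0.9)],
    ("string",): [(("String",), 0.8), (("string",), 0.2)],
    ("bool",): [(("boolean",), 0.95)],
}

MAX_PHRASE_LEN = max(len(src) for src in PHRASE_TABLE)


def translate(tokens):
    """Greedy monotone decoding (a simplification of real decoders):
    at each position, take the longest known source phrase and emit its
    most probable target phrase; copy unknown tokens unchanged."""
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            src = tuple(tokens[i:i + n])
            if src in PHRASE_TABLE:
                best, _prob = max(PHRASE_TABLE[src], key=lambda c: c[1])
                out.extend(best)
                i += n
                break
        else:
            # No phrase starts here: pass the token through as-is.
            out.append(tokens[i])
            i += 1
    return out


def is_balanced(tokens):
    """Crude stand-in for parsing the target language:
    checks only that brackets are properly nested."""
    pairs = {")": "(", "}": "{", "]": "["}
    stack = []
    for t in tokens:
        if t in pairs.values():
            stack.append(t)
        elif t in pairs:
            if not stack or stack.pop() != pairs[t]:
                return False
    return not stack


csharp = ["Console", ".", "WriteLine", "(", "name", ")", ";"]
java = translate(csharp)
print(" ".join(java), "| structurally ok:", is_balanced(java))
```

Variant i) corresponds roughly to `translate` alone, which can emit token sequences that are not valid programs. Variant ii) is only hinted at here by the post-hoc `is_balanced` filter; incorporating the target grammar into the translation process itself is a stronger mechanism than rejecting candidates afterwards.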
To experiment with the above systems, we investigated machine translation from C# to Java. For training, which takes about 60 hours, we used a parallel corpus of 20,499 C#-to-Java method translations. We then evaluated each of the three systems above by translating 1,000 C# methods. Our experimental results indicate that with the most advanced system, about 60% of the translated methods compile (taking the top-ranked translation), and that out of a random sample of 50 correctly compiled methods, 68% (34 methods) were semantically equivalent to the reference solution.
Fri 24 Oct
10:30 - 12:00