Mining the Ecosystem to Improve Type Inference For Dynamically Typed Languages
Dynamically typed languages lack information about the types of variables in the source code. Developers care about this information as it supports program comprehension. Basic type inference techniques are helpful, but may yield many false positives or negatives. We propose to mine information from the software ecosystem on how frequently given types are inferred unambiguously to improve the quality of type inference for a single system.

This paper presents an approach to augment existing type inference techniques by supplementing the information available in the source code of a project with data from other projects written in the same language. For all available projects, we track how often messages are sent to instance variables throughout the source code. Then, predictions for the type of a variable are made based on the messages sent to it. The evaluation of a proof-of-concept prototype shows that this approach works well for types that are sufficiently popular, like those found in the standard libraries, and tends to create false positives for unpopular or domain-specific types. The false positives are, in most cases, fairly easily identifiable. Also, the evaluation data shows a substantial increase in the number of correctly inferred types when compared to the non-augmented type inference.
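The idea in the abstract can be illustrated with a small sketch. It is not the authors' prototype: the class table, the selector names, the observation format, and the scoring rule (summing how often each sent message co-occurred with a type elsewhere in the ecosystem) are all assumptions made for illustration. Basic inference keeps every class that understands all messages sent to a variable; the ecosystem counts are then used only to rank those candidates.

```python
from collections import Counter

# Hypothetical toy data: the selectors (messages) each class understands.
CLASS_SELECTORS = {
    "OrderedCollection": {"add:", "do:", "size", "removeFirst"},
    "Set": {"add:", "do:", "size"},
    "AuditLog": {"add:", "removeFirst", "flush"},
}


def candidate_types(selectors_sent, class_selectors):
    """Basic inference: keep every class that understands all sent messages."""
    sent = set(selectors_sent)
    return sorted(
        name for name, understood in class_selectors.items() if sent <= understood
    )


def mine_ecosystem(observations):
    """Count (selector, type) co-occurrences across other projects.

    `observations` is an assumed input format: pairs of
    (selectors sent to a variable, type inferred unambiguously for it).
    """
    counts = Counter()
    for selectors, type_name in observations:
        for selector in set(selectors):
            counts[(selector, type_name)] += 1
    return counts


def rank_candidates(selectors_sent, candidates, counts):
    """Order candidates by ecosystem frequency (assumed scoring rule)."""
    sent = set(selectors_sent)

    def score(type_name):
        return sum(counts[(selector, type_name)] for selector in sent)

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(candidates, key=lambda t: (-score(t), t))


# Usage: messages sent to one instance variable in the project under analysis.
sent = ["add:", "removeFirst"]
ecosystem = [(["add:", "do:"], "OrderedCollection")] * 3 + [(["add:"], "Set")]

candidates = candidate_types(sent, CLASS_SELECTORS)
ranked = rank_candidates(sent, candidates, mine_ecosystem(ecosystem))
```

In this toy setup both `AuditLog` and `OrderedCollection` understand the two messages, so basic inference alone is ambiguous; the mined counts place the popular standard-library type first. This also hints at the failure mode the abstract describes: a variable that really holds an unpopular, domain-specific type would be ranked below the popular one.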
Thu 23 Oct
|16:15 - 16:37|
James Skene, Auckland University of Technology