Article ID: 461495
Journal: Journal of Systems and Software
Published Year: 2016
Pages: 14
File Type: PDF
Abstract

• Generalize a semantic code search approach using input/output examples as queries.
• Approach extension utilizes symbolic execution to find relevant code with many paths.
• Rank source code results based on match strength between a specification and the code.
• Evaluate search results against state-of-the-practice approaches, Google and Merobase.
• Based on opinions of 30 programmers, our search returns the most relevant results.

In this work we generalize, improve, and extensively assess our semantic source code search engine, through which developers use an input/output query model to specify what behavior they want instead of how it may be implemented. Under this approach, a code repository contains programs encoded as constraints, and an SMT solver finds encoded programs that match an input/output query. The search engine returns a list of source code snippets that match the specification.

The initial instantiation of this approach showed potential but was limited. It only encoded single-path programs, reported just complete matches, did not rank the results, and was only partly assessed. In this work, we explore the use of symbolic execution to address some of these technical shortcomings. We implemented a tool, Satsy, that uses symbolic execution to encode multi-path programs as constraints and applies a novel ranking algorithm based on the strength of the match between an input/output query and the program paths traversed by symbolic execution. An assessment of the relevance of Satsy’s results versus those of other search engines, Merobase and Google, on eight novice-level programming tasks gathered from StackOverflow, using the opinions of 30 study participants, reveals that Satsy often outperforms the competition in terms of precision, and that matches are found in seconds.
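
To make the input/output query model concrete, the following is a minimal illustrative sketch in Python, not Satsy's implementation. It assumes each snippet has already been reduced to a list of paths, each a pair of a path condition and an output expression, and it replaces the SMT solver with direct evaluation on the example inputs. The repository contents, the names `REPOSITORY`, `match_strength`, and `search`, and the scoring rule (fraction of examples satisfied, then number of distinct paths exercised) are all assumptions made for illustration; the abstract does not specify the actual ranking algorithm.

```python
from typing import Callable, Dict, List, Tuple

# One "path" stands in for a symbolically executed program path:
# (path condition, output expression). Both are assumptions of this sketch.
Path = Tuple[Callable[[int], bool], Callable[[int], int]]
Example = Tuple[int, int]  # (input, expected output)

# Hypothetical repository of pre-encoded snippets.
REPOSITORY: Dict[str, List[Path]] = {
    "abs_value": [(lambda x: x < 0, lambda x: -x),
                  (lambda x: x >= 0, lambda x: x)],
    "negate": [(lambda x: True, lambda x: -x)],
    "identity": [(lambda x: True, lambda x: x)],
}


def match_strength(paths: List[Path], examples: List[Example]) -> Tuple[float, int]:
    """Return (fraction of examples satisfied, distinct paths exercised).

    Direct evaluation is used here in place of an SMT solver.
    """
    satisfied = 0
    used_paths = set()
    for inp, expected in examples:
        for index, (guard, expr) in enumerate(paths):
            if guard(inp) and expr(inp) == expected:
                satisfied += 1
                used_paths.add(index)
                break
    return satisfied / len(examples), len(used_paths)


def search(examples: List[Example]) -> List[Tuple[str, float, int]]:
    """Rank snippets that satisfy at least one example (assumed scoring rule)."""
    scored = []
    for name, paths in REPOSITORY.items():
        fraction, covered = match_strength(paths, examples)
        if fraction > 0:  # partial matches are kept, not only complete ones
            scored.append((name, fraction, covered))
    # Strongest match first; ties broken by paths covered, then by name.
    scored.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return scored


# Query: "given -3 return 3, given 4 return 4".
results = search([(-3, 3), (4, 4)])
```

Under these assumptions, the query ranks `abs_value` first as a complete match that exercises both of its paths, while `identity` and `negate` follow as partial matches that each satisfy one of the two examples.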

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Networks and Communications