Show simple item record

dc.contributor.authorSrinivasan, Venkatesh
dc.date.accessioned2017-03-08T18:55:56Z
dc.date.available2017-03-08T18:55:56Z
dc.date.issued2017-03-08T18:55:56Z
dc.identifier.citationTR1844en
dc.identifier.otherTR1844
dc.identifier.urihttp://digital.library.wisc.edu/1793/76116
dc.description.abstractThe analysis of binaries has gotten an increasing amount of attention from the academic community in the last decade. The results of binary analysis have been predominantly used to answer questions about the properties of binaries. Another potential use of analysis results is to rewrite the binary via semantic transformations. Semantics-based binary-rewriting tools become particularly important when one wishes to modify the functionality of the binary, and the source code and/or compiler toolchain for the binary is unavailable. Semantics-based binary rewriting can be done for the purposes of binary optimization (offline optimization, superoptimization), software reuse (partial evaluation, slicing, binary translation), or software security (binary obfuscation, de-obfuscation). This dissertation proposes various algorithms for synthesizing a machine-code implementation from a semantic specification of the desired behavior, and describes how one can use a machine-code synthesizer to build different semantics-based binary-rewriting tools. The first part of this dissertation introduces the problem of machine-code synthesis, proposes a framework for semantics-based binary rewriting via machine-code synthesis, and presents several algorithms for machine-code synthesis. A key challenge in synthesizing machine code is the enormous size of the synthesis search-space. Instruction sets like Intel's IA-32 have around 43,000 unique instruction schemas; this huge instruction pool, along with the exponential cost inherent in enumerative synthesis, results in an enormous search space for the synthesizer: even for relatively small specifications, a naïve synthesizer might take several days to find an implementation. We have developed a suite of techniques---ranging from pruning heuristics to machine learning---that assist an enumerative synthesizer in combating the enormous search space, and speeding up synthesis from the order of days to the order of seconds. The second part of this dissertation describes how one can instantiate the aforementioned synthesizer-equipped framework to create different semantics-based binary rewriters. In particular, we have instantiated the framework to create two tools that rewrite binaries for purposes of software reuse: a novel machine-code partial evaluator, and an improved machine-code slicer. Among other potential uses, these tools can be used to either modify or extract a component from a binary that lacks source code.en
dc.language.isoen_USen
dc.relation.ispartofseriestechnical reports;TR1844
dc.subjectslicingen
dc.subjectpartial evaluationen
dc.subjectmachine learningen
dc.subjectIA-32 instruction seten
dc.subjectmachine-code synthesisen
dc.titleSynthesis of Machine Code: Algorithms and Applicationsen
dc.typeTechnical Reporten


Files in this item

Thumbnail

This item appears in the following Collection(s)

  • CS Technical Reports
    Technical Reports Archive for the Department of Computer Sciences at the University of Wisconsin-Madison

Show simple item record