• Login
    View Item 
    •   MINDS@UW Home
    • MINDS@UW Madison
    • College of Letters and Science, University of Wisconsin–Madison
    • Department of Computer Sciences, UW-Madison
    • CS Technical Reports
    • View Item
    •   MINDS@UW Home
    • MINDS@UW Madison
    • College of Letters and Science, University of Wisconsin–Madison
    • Department of Computer Sciences, UW-Madison
    • CS Technical Reports
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Dynamic Query Re-Planning Using QOOP

    Thumbnail
    File(s)
    TR1855.pdf (1.538Mb)
    Date
    2018-09-27
    Author
    Mahajan, Kshiteej
    Chowdhury, Mosharaf
    Akella, Aditya
    Chawla, Shuchi
    Metadata
    Show full item record
    Abstract
    Modern data processing clusters are highly dynamic – both in terms of the number of concurrently running jobs and their resource usage. To improve job performance, recent works have focused on optimizing the cluster scheduler and the jobs’ query planner with a focus on picking the right query execution plan (QEP) – represented as a directed acyclic graph – for a job in a resource-aware manner, and scheduling jobs in a QEP-aware manner. However, because existing solutions use a fixed QEP throughout the entire execution, the inability to adapt a QEP in reaction to resource changes often leads to large performance inefficiencies. This paper argues for dynamic query re-planning, wherein we re-evaluate and re-plan a job’s QEP during its execution. We show that designing for re-planning requires fundamental changes to the interfaces between key layers of data analytics stacks today, i.e., the query planner, the execution engine, and the cluster scheduler. Instead of pushing more complexity into the scheduler or the query planner, we argue for a redistribution of responsibilities between the three components to simplify their designs. Under this redesign, we analytically show that a greedy algorithm for re-planning and execution alongside a simple max-min fair scheduler can offer provably competitive behavior even under adversarial resource changes. We prototype our algorithms atop Apache Hive and Tez. Via extensive experiments, we show that our design can offer a median performance improvement of 1.47× compared to state-of-the-art alternatives.
    Subject
    Big-Data Systems
    Query Planning
    Scheduling
    Permanent Link
    http://digital.library.wisc.edu/1793/78832
    Type
    Technical Report
    Citation
    TR1855
    Part of
    • CS Technical Reports

    Contact Us | Send Feedback
     

     

    Browse

    All of MINDS@UWCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

    My Account

    Login

    Contact Us | Send Feedback