• Login
    View Item 
    •   MINDS@UW Home
    • MINDS@UW Madison
    • College of Letters and Science, University of Wisconsin–Madison
    • Department of Computer Sciences, UW-Madison
    • CS Technical Reports
    • View Item
    •   MINDS@UW Home
    • MINDS@UW Madison
    • College of Letters and Science, University of Wisconsin–Madison
    • Department of Computer Sciences, UW-Madison
    • CS Technical Reports
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    RDBMS Index Support for Sparse Data Sets

    Thumbnail
    File(s)
    TR1566.pdf (2.095Mb)
    Date
    2006
    Author
    Beckmann, Jennifer
    Chu, Eric
    Naughton, Jeffrey
    Publisher
    University of Wisconsin-Madison Department of Computer Sciences
    Metadata
    Show full item record
    Abstract
    Maintenance costs and storage overheads incurred by indexes often limit the number of indexes created per table in an RDBMS. For sparse data, where a table may have hundreds of attributes, indexing only a few attributes means that a vanishingly small percentage of attributes will have indexes, which unfortunately means that a table scan is the only evaluation plan for almost all selection queries on that table. This paper demonstrates that sparsity of the data actually enables index support for most, if not all, attributes in the data. Our approach leverages "sparse indexes", which are partial indexes that store only non-null values. Sparse indexes incur low maintenance costs and storage overheads because most values in a sparse table are null. Properties of the data lead us to two other contributions toward index support for sparse data; we show that sparse indexes benefit greatly from building all indexes in one-pass of the data; and we identify that multi-column sparse indexes are preferable as covering indexes when attributes in the data are correlated. We qualitatively evaluate our approaches with synthetic and real-world data to show that our suggestions significantly out-perform traditional indexing approaches designed for dense data.
    Permanent Link
    http://digital.library.wisc.edu/1793/60506
    Type
    Technical Report
    Citation
    TR1566
    Part of
    • CS Technical Reports

    Contact Us | Send Feedback
     

     

    Browse

    All of MINDS@UWCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

    My Account

    Login

    Contact Us | Send Feedback