    Towards Building XML Statistics for the Hidden Web

    File(s)
    TR1477.pdf (3.025Mb)
    Date
    2003
    Author
    Aboulnaga, Ashraf
    Naughton, Jeffrey
    Publisher
    University of Wisconsin-Madison Department of Computer Sciences
    Abstract
    There is currently a lot of interest in developing Internet query processors that can pose elaborate queries on XML data on the Web. Such query processors can query data sources that have static XML files, but they should also be able to query "hidden Web" data sources that export an XML view of data stored in a database. To optimize queries that involve these hidden Web data sources, we need to have XML statistics that can be used to estimate the selectivity of queries posed to these sources. Since we can only access the data at a hidden Web data source by issuing queries, we need to develop on-line XML statistics that are built by observing queries to a hidden Web data source and their result sizes. In this paper, we assume that queries to a hidden Web data source are XPath selections from a virtual XML document that represents all the data at this source. We observe the user XPath queries to the data source and convert them to a more abstract and generalized form that we call annotated path expressions. We describe an on-line statistics structure that stores such annotated path expressions and information about their selectivity for use in estimating the selectivity of future XPath queries. We experimentally demonstrate the convergence and accuracy of our proposed on-line statistics using real and synthetic XML data sets.
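
To make the idea in the abstract concrete, here is a minimal sketch of an on-line statistics store that learns from observed queries and their result sizes. It is an illustration under stated assumptions, not the structure described in the report: the `generalize` function, the `OnlineStats` class, the `?` placeholder, and the averaging estimator are all hypothetical. In particular, replacing predicate literals with a placeholder is only a rough stand-in for the report's annotated path expressions.

```python
import re
from collections import defaultdict
from typing import Optional


def generalize(xpath: str) -> str:
    """Abstract an XPath selection by replacing literals inside predicates with '?'.

    Hypothetical simplification: this only approximates the notion of a
    generalized path expression mentioned in the abstract.
    """
    def strip_literals(match: re.Match) -> str:
        predicate = match.group(0)
        # Replace quoted string literals first, then bare numeric literals.
        predicate = re.sub(r'"[^"]*"|\'[^\']*\'', "?", predicate)
        predicate = re.sub(r"\b\d+(\.\d+)?\b", "?", predicate)
        return predicate

    # Only rewrite text inside [...] predicates; path steps are left untouched.
    return re.sub(r"\[[^\]]*\]", strip_literals, xpath)


class OnlineStats:
    """Stores observed result sizes keyed by generalized query form."""

    def __init__(self) -> None:
        self._total = defaultdict(int)   # sum of observed result sizes per key
        self._count = defaultdict(int)   # number of observations per key

    def observe(self, xpath: str, result_size: int) -> None:
        """Record the result size returned by the source for one query."""
        key = generalize(xpath)
        self._total[key] += result_size
        self._count[key] += 1

    def estimate(self, xpath: str) -> Optional[float]:
        """Estimate a result size as the mean of past observations (assumed estimator)."""
        key = generalize(xpath)
        if self._count.get(key, 0) == 0:
            return None  # no query of this shape has been observed yet
        return self._total[key] / self._count[key]


stats = OnlineStats()
stats.observe("/catalog/book[price > 30]/title", 12)
stats.observe("/catalog/book[price > 45]/title", 8)
estimate_seen = stats.estimate("/catalog/book[price > 10]/title")
estimate_unseen = stats.estimate("/catalog/author")
```

A plain mean ignores how selectivity varies with the predicate value. It is used here only to show the observe-then-estimate loop, in which statistics improve as more queries to the source are seen.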
    Permanent Link
    http://digital.library.wisc.edu/1793/60350
    Type
    Technical Report
    Citation
    TR1477
    Part of
    • CS Technical Reports
