    • Database Support for Matching: Limitations and Opportunities 

      Kini, Ameet; Shankar, Srinath; DeWitt, David; Naughton, Jeffrey (University of Wisconsin-Madison Department of Computer Sciences, 2005)
      A match join of R and S with predicate theta is a subset of the theta join of R and S such that each tuple of R and S contributes to at most one result tuple. Match joins and their generalizations arise in many scenarios, ...
    • On the Integration of Structure Indexes and Inverted Lists 

      Kaushik, Raghav; Krishnamurthy, Rajasekar; Naughton, Jeffrey; Ramakrishnan, Raghu (University of Wisconsin-Madison Department of Computer Sciences, 2003)
      We consider the problem of how to combine structure indexes and inverted lists to answer queries over a native XML DBMS, where the queries specify both path and keyword constraints. We augment the inverted list entries to ...
    • RDBMS Index Support for Sparse Data Sets 

      Beckmann, Jennifer; Chu, Eric; Naughton, Jeffrey (University of Wisconsin-Madison Department of Computer Sciences, 2006)
      Maintenance costs and storage overheads incurred by indexes often limit the number of indexes created per table in an RDBMS. For sparse data, where a table may have hundreds of attributes, indexing only a few attributes ...
    • A Survey of the Existing Landscape of ML Systems 

      Kumar, Arun; McCann, Robert; Naughton, Jeffrey; Patel, Jignesh M. (2015-11-27)
      We survey the existing landscape of ML systems to identify gaps that motivate our vision of a unifying abstraction to support the iterative process of model selection and lay a principled foundation for model selection ...
    • To Join or Not to Join? Thinking Twice about Joins before Feature Selection 

      Kumar, Arun; Naughton, Jeffrey; Patel, Jignesh M.; Zhu, Xiaojin (2015-11-27)
      Closer integration of machine learning (ML) with data processing is a booming area in both the data management industry and academia. Almost all ML toolkits assume that the input is a single table, but many datasets are ...