Querying Nested Relations
Participants: Peter Buneman, Leonid Libkin,
Dan Suciu, Val Tannen, Limsoon Wong
Background
We present a new principle for the development of database query languages
that the primitive operations should be organized around types. Viewing
a relational database as consisting of sets of records, this
principle dictates that we should investigate separately operations
for records and sets. There are two immediate advantages of this approach.
Firstly, it provides a language for structures in which record and
set types may be freely combined: nested relations or complex objects,
Second, the fundamental operations for sets are
closely related to those for other ``collection types''
such as bags and lists, and this suggests how database languages
may be uniformly extended to these new types.
We also look into various questions on the expressive power of
query languages resulting from this paradigm. We prove a number
of very fundamental results---the conservative extension property,
the finite-cofiniteness property, and the bounded degree property---on
nested relational query languages and on SQL.
Achievements
-
We demonstrated that a nested relational calculus, NRC(=), based on our
new paradigm is equivalent to various traditional nested relational
query languages. This allows expressive power results to be transferred
from our calculus to the traditional ones.
-
We showed that NRC(=) has the convervative extension property.
That is, every function expressible in NRC(=)
from relations of up to m levels of nestings
to relations up to n levels of nestings
can be expressed using an NRC(=) expression that does not
generate intermediate data involving more than max(m,n) levels
of nestings. In particular, functions from flat relations to flat relations
can be expressed using an NRC(=) expression that generates on
flat relations as intermediate data.
-
We also showed that
the conservative extension property continues to hold when aggregate
functions are added to NRC(=), implying that the flat fragment of NRC(=)
augmented with aggregate functions precisely captures SQL.
Thus this fragment of NRC(=) can be used for a theoretical analysis
of the expressive power of SQL.
-
We showed that NRC(=) and its extensions with aggregate functions---and thus
SQL---satisfy the finite-cofinite property on several varieties of graphs.
For example, every function from graphs to graphs that is
expressible in NRC(=) and its extensions with aggregate functions
either hold for finitely many k-multicycles (up to isomorphism).
This result allowed us to settle the Grumbach-Milo conjecture on
a nested bag query language.
-
We proved that NRC(=) and its extensions with aggregate functions---and thus
SQL---satisfy the bounded degree property. That is, every function
from graphs to graphs that is expressible in NRC(=) and its extensions
with aggregate functions can only produce an output graph containing
a small nmber of distinct degrees bounded by a function on the maximum
degree of the input graphs. In short, SQL cannot produce a complex
graph from simple input graphs. This property allows practically all
known expressive power results on relational algebra or first order
logic to be lifted to NRC(=) and SQL.
-
We proved perhaps one of the few known results on ``intensional''
expressive power on query languages. In particular, we showed that
any uniform translation of queries based on a sequential form of
iteration over NRC(=) to a data-parallel form of iteration over
NRC(=) must map some PTIME queries into exponential-space queries.
It is known that in the presence of higher-order functions
sequential iterations and data-parallel iterations can be
mutually translated into each other without losing efficiency.
Thus, our result demonstrated that it could be beneficial to
add higher-order function capability to query languages.
- We developed the
Kleisli Query System based on NRC(=).
-
Won multiple awards. In particular,
- The 1997 SNAS & NSTB Young Scientist Award
- The 1998 NUS Staff Achievement Award
- The 2014 ICDT Test of Time Award
Selected Publications
-
Val Breazu-Tannen, Peter Buneman, Limsoon Wong.
Naturally Embedded Query languages.
Proceedings of 4th International Conference on Database Theory,
Berlin, Germany, 140-154, October 1992.
-
Limsoon Wong.
Normal Forms and Conservative Properties of Query Languages
over Collection Types.
Proceedings of 12th ACM Symposium on Principles of Database Systems,
Washington, D. C., 26-36, May 1993.
-
Leonid Libkin, Limsoon Wong.
Semantic Representations and Query Languages for Or-sets.
Proceedings of 12th ACM Symposium on Principles of Database Systems,
Washington, D. C., 37-48, May 1993.
-
Leonid Libkin, Limsoon Wong.
Aggregate Functions, Conservative Extension, and Linear Orders.
Proceedings of 4th International Workshop on
Database Programming Languages,
Manhattan, New York, 282-294, August 1993.
-
Leonid Libkin, Limsoon Wong.
Some Properties of Query Languages for Bags.
Proceedings of 4th International Workshop on
Database Programming Languages,
Manhattan, New York, 97-114, August 1993.
-
Peter Buneman, Leonid Libkin, Dan Suciu, Val Tannen, Limsoon Wong.
Comprehension Syntax.
ACM SIGMOD Record, 23(1):87-96, March 1994. (Reviewed invited paper)
PS
- Leonid Libkin, Limsoon Wong.
New Techniques for Studying Set Languages, Bag Languages,
and Aggregate Functions.
Proceedings of 13th ACM Symposium on Principles of Database Systems,
Minneapolis, Minnesota, 155-166, May 1994.
-
Leonid Libkin, Limsoon Wong.
Conservativity of Nested Relational Calculi with
Internal Generic Functions.
Information Processing Letters, 49(6):273-280, March 1994.
PS
-
Dan Suciu, Limsoon Wong.
On Two Forms of Structural Recursion.
Proceedings of 5th International Conference on Database Theory,
Prague, Czech Republic, 111-124, January 1995.
PS
-
Limsoon Wong.
Polymorphic Queries Across Sets, Bags, and Lists.
ACM SIGPLAN Notices, 30(4):39-44, April 1995.
PS
-
Peter Buneman, Shamim Naqvi, Val Tannen, Limsoon Wong.
Principles of Programming with Complex Objects and Collection Types.
Theoretical Computer Science, 149(1):3--48, September 1995.
(Reviewed invited paper)
PDF
-
Limsoon Wong.
An Introduction to Remy's Fast Polymorphic Record Projection.
ACM SIGMOD Record, 24(3):34--39, September 1995.
PDF
-
Leonid Libkin, Limsoon Wong.
On Representation and Querying Incomplete Information in
Databases with Multisets.
Information Processing Letters, 56:209--214, November 1995.
PS
-
Leonid Libkin, Limsoon Wong.
Semantic Representations and Query Languages for Or-sets.
Journal of Computer and System Sciences, 52(1):125--142, February 1996.
PS
-
Limsoon Wong.
Normal Forms and Conservative Extension Properties for
Query Languages over Collection Types.
Journal of Computer and System Sciences, 52(3):495--505, June 1996.
(Reviewed invited paper)
PDF
-
Leonid Libkin, Rona Machlin, Limsoon Wong.
A Query Language for Multidimensional Arrays:
Design, Implementation, and Optimization Techniques.
Proceedings of ACM SIGMOD International Conference on Management of Data,
Montreal, Canada, 228-239, June 1996.
PS
-
Stephane Grumbach, Leonid Libkin, Tova Milo, Limsoon Wong.
Query Languages for Bags: Expressive Power and Complexity.
ACM SIGACT News, 27(2):30--37, June 1996. (Guest database column)
PS
-
Guozhu Dong, Leonid Libkin, Limsoon Wong.
Local Properties of Query Languages.
Proceedings of 6th International Conference on Database Theory,
Delphi, Greece, 140-154, January 1997.
-
Leonid Libkin, Limsoon Wong.
On the Power of Aggregation in Relational Query Languages.
Proceedings of 6th International Workshop on Database
Programming Languages,
Estes Park, Colorado, August 1997, 260--280.
-
Leonid Libkin, Limsoon Wong.
Query Languages for Bags and Aggregate Functions.
Journal of Computer and System Sciences,
55(2):241--272, October 1997. (Reviewed invited paper.)
PS
-
Guozhu Dong, Leonid Libkin, Limsoon Wong.
Local Properties of Query Languages.
Theoretical Computer Science, 239:277--308, 2000.
(Reviewed invited paper.)
PS
PDF
Dissertations
- Limsoon Wong.
Querying Nested Collections.
PhD thesis,
Dept of Computer and Information Science,
University of Pennsylvania,
Philadelphia,
1994.
Selected Presentations
-
Limsoon Wong.
Adventures of a Logician-Engineer.
Public Lecture by Winner of 1997 Singapore National Academy
of Science Young Scientist Award,
Singapore, April 1998.
-
Limsoon Wong.
Invariants in SQL.
Talk at Alfred Renyi Institute of Mathematics,
Hungarian Academy of Sciences, Budapest, June 1999.
-
Limsoon Wong.
Invariants in SQL.
Invited talk at KAIST Advanced Information Technology Research Center,
Taejon, Korea, November 2000.
PPT
-
Limsoon Wong.
Adventures of a Logician-Engineer: A Journey Through Logic,
Engineering, Medicine, Biology, and Statistics.
Invited talk at 8th International Symposium on Symbolic
and Numeric Algorithms for Scientific Computing (SYNASC 2006),
Timisoara, Romania, 26-29 September 2006.
PPT
- Peter Buneman, Val Tannen, Limsoon Wong.
A retrospective on naturally embedded query languages.
Test of Time Award talk at 17th International Conference on
Database Theory,
Athens, Greece, 25 March 2014.
PDF
Acknowledgements
This project is supported in part by
ONR grant NOOO-14-88-K-0634 (Buneman, Tannen),
NSF grant CCR-90-57570 (Suciu, Tannen: 90 - 95),
ARO grant DAALO3-89-C-0031-PRIME (Tannen, Wong: 91 - 94),
NSF grant IRI-86-10617 (Buneman),
NSF grant IRI-90-04137 (Libkin, Suciu: 91 - 94; Wong: 90), and
EDB grant to establish the NUS Bioinformatics Center (Wong: 96 - 98).
Last updated: 26/3/14, Limsoon Wong.