December 12, 2011 / ecschroder

Querying XML, UML, Indexes, Constraints and Triggers

These are some notes from the database class that Stanford offered online & free. Good times…

Querying XML

XSL = Extensible Stylesheet Language
XSLT = XSL Transformations

XSLT performs rule-based transformations. It can be used to:
– match templates and replace
– recursively match templates
– extract values (using XPath)
– iterate (for-each)
– test conditions (if)

Watch out for some strange whitespace behavior.
XSLT has an implicit template priority scheme.

Here’s a snippet of XSL that finds the Title child element of each Book element in an XML document and returns its value within a set of <BookTitle> tags:

<xsl:stylesheet ….. >
<xsl:template match="Book">
<BookTitle><xsl:value-of select="Title" /></BookTitle>
</xsl:template>
</xsl:stylesheet>
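
For readers who want to poke at the idea without an XSLT processor, here is a rough Python imitation of what that template does. It is not XSLT: Python’s standard library has no XSLT engine, so the loop below only plays the role of the template. The sample data, the `book_titles` function, and the `Titles` wrapper element are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the Book and Title names come from the notes.
SAMPLE = """
<Bookstore>
  <Book Price="85"><Title>A First Course in Database Systems</Title></Book>
  <Book Price="100"><Title>Database Systems: The Complete Book</Title></Book>
</Bookstore>
"""

def book_titles(xml_text):
    """Build one <BookTitle> element per <Book>, imitating the template."""
    root = ET.fromstring(xml_text)
    out = ET.Element("Titles")  # wrapper added so the result has a single root
    for book in root.iter("Book"):            # plays the role of match="Book"
        title = ET.SubElement(out, "BookTitle")
        title.text = book.findtext("Title")   # plays the role of value-of select="Title"
    return out

result = book_titles(SAMPLE)
print(ET.tostring(result, encoding="unicode"))
```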

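Note: If part of the data doesn’t match any of the templates, XSLT’s built-in rules output its text content, so the result picks up a concatenated string of all the unmatched leaf values. To fix that (i.e. to remove the extraneous string), add an empty template for text nodes: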

<xsl:template match="text()" />

That way, the stray text nodes will match this template instead of the built-in rule. Since we left the template blank, the extra string won’t appear in the output.

If part of an XML document matches two different templates, the processor will use the second one (the one that appears later in the stylesheet). The exception is that more specific rules take priority over more general ones. (That’s sorta similar to CSS.)

Here’s an example of a rule with a condition:
<xsl:template match="node[@Price &lt; 90]">
Note that we used &lt; to represent <, since a literal < isn’t allowed inside an XML attribute value.

match="/" matches the document root (the node above the top-level element)

match="*|@*|text()" matches every element, attribute, and text node

Recursively apply templates

If you want to restructure the entire XML document, this generic template will help. It returns a generic copy of everything.

<xsl:template match="*|@*|text()">
<xsl:copy><xsl:apply-templates select="*|@*|text()" /></xsl:copy>
</xsl:template>
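
The recursion in that copy template may be easier to see as an ordinary recursive function. This is only a sketch in Python rather than XSLT, and `copy_tree` and the sample document are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

def copy_tree(node):
    """Recursively copy an element: its tag, attributes, text, and children."""
    clone = ET.Element(node.tag, dict(node.attrib))  # copies the attributes (@*)
    clone.text = node.text                           # copies the text content
    clone.tail = node.tail
    for child in node:                               # recurse into child elements (*)
        clone.append(copy_tree(child))
    return clone

# Hypothetical input document.
source = ET.fromstring(
    '<Bookstore><Book Price="85"><Title>Notes</Title></Book></Bookstore>'
)
copied = copy_tree(source)
print(ET.tostring(copied, encoding="unicode"))
```

To restructure a document instead of copying it, you would special-case certain tags before recursing, which is the same job that extra, more specific templates do in a stylesheet.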

You can also use conditionals (if) and iterations (for each) in templates. That’s useful for turning XML into HTML.

Unified Modeling Language

Higher-level database design models include Entity Relationship Model (E/R) and Unified Modeling Language (UML). Both of these design models are graphical. Both can be translated to relational models automatically.

There are five key concepts in UML:
– classes
– associations
– association classes
– subclasses
– composition and aggregation

Classes consist of a name, attributes, and (optionally) methods.

Associations are the relationships between objects of two different classes. Association multiplicity can be one-to-one, many-to-one, many-to-many, or complete.

A multiplicity of m..n on the C2 end of an association means that each object of class C1 is related to at least m and at most n objects of class C2.

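Association classes attach attributes to an association itself. They are unnecessary if the multiplicity is 0..1 or 1..1, because the association’s attributes can then be folded into one of the two classes. Otherwise, the association class is drawn as its own class box attached to the association line.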

Subclasses are a specialization, whereas superclasses are a generalization. Subclasses can be incomplete (or partial) or complete. They can also be disjoint (or exclusive) or overlapping.

Composition and aggregation are used to describe how objects of one class belong to objects of another class. For example, with composition, every department belongs to a college; with aggregation, some, but not all, apartments belong to a college.

UML to Relations

UML can be translated directly to relations, provided that every “regular class” has a key. (Subclasses are not regular classes.)

Classes to relations

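Straightforward, just like you’d think: each class becomes a relation with the same attributes, and the class’s key becomes the primary key.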

Associations to relations

Create a relation that takes the primary key from both sides of the association.

Association classes to relations

The key of the resulting relation depends on multiplicity. There’s a rule:
If 0..1 is on the left side, then the primary key from the right side is the key of the association relation.
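
Here is a small sketch of those two translations using Python’s built-in sqlite3 module. The Student, College, Applied, and Attends tables and the `decision` column are hypothetical, made up for the example. Applied stands for a many-to-many association, so its key is the pair of keys. Attends assumes each student relates to at most one college (0..1 on the College side), so the Student key alone is the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (sID INTEGER PRIMARY KEY, sName TEXT);
CREATE TABLE College (cName TEXT PRIMARY KEY, state TEXT);

-- Many-to-many association: the key is the pair of keys from both sides.
-- "decision" stands in for an attribute of an association class.
CREATE TABLE Applied (
    sID      INTEGER REFERENCES Student(sID),
    cName    TEXT REFERENCES College(cName),
    decision TEXT,
    PRIMARY KEY (sID, cName)
);

-- 0..1 on the College side: the key from the Student side is enough.
CREATE TABLE Attends (
    sID   INTEGER PRIMARY KEY REFERENCES Student(sID),
    cName TEXT REFERENCES College(cName)
);
""")
conn.execute("INSERT INTO Student VALUES (1, 'Amy')")
conn.executemany("INSERT INTO College VALUES (?, ?)",
                 [("Stanford", "CA"), ("MIT", "MA")])
conn.executemany("INSERT INTO Applied VALUES (?, ?, ?)",
                 [(1, "Stanford", "Y"), (1, "MIT", "N")])
conn.execute("INSERT INTO Attends VALUES (1, 'Stanford')")

# A second college for the same student violates the single-column key.
try:
    conn.execute("INSERT INTO Attends VALUES (1, 'MIT')")
    second_college_allowed = True
except sqlite3.IntegrityError:
    second_college_allowed = False

applied_count = conn.execute("SELECT COUNT(*) FROM Applied").fetchone()[0]
print(second_college_allowed, applied_count)
```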

Subclasses to relations

Can be translated many different ways. The best translation depends on the properties. A few schemas include:
1. Subclass relations contain superclass key and specialized attributes.
2. Subclass relations contain all attributes.
3. One relation containing all superclass and subclass attributes.
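
To make the three options concrete, here is a sketch that builds each one in an in-memory SQLite database and lists the resulting columns. It assumes a hypothetical superclass S with key K and attribute A, plus subclasses S1 (attribute B) and S2 (attribute C). The `DESIGNS` dictionary and the `tables_for` helper are made up for the example.

```python
import sqlite3

DESIGNS = {
    # 1. Subclass relations contain the superclass key and specialized attributes.
    "design1": """
        CREATE TABLE S  (K INTEGER PRIMARY KEY, A TEXT);
        CREATE TABLE S1 (K INTEGER PRIMARY KEY REFERENCES S(K), B TEXT);
        CREATE TABLE S2 (K INTEGER PRIMARY KEY REFERENCES S(K), C TEXT);
    """,
    # 2. Subclass relations contain all attributes.
    "design2": """
        CREATE TABLE S  (K INTEGER PRIMARY KEY, A TEXT);
        CREATE TABLE S1 (K INTEGER PRIMARY KEY, A TEXT, B TEXT);
        CREATE TABLE S2 (K INTEGER PRIMARY KEY, A TEXT, C TEXT);
    """,
    # 3. One relation containing all superclass and subclass attributes.
    "design3": """
        CREATE TABLE S (K INTEGER PRIMARY KEY, A TEXT, B TEXT, C TEXT);
    """,
}

def tables_for(schema_sql):
    """Create the schema in a fresh database and return {table: [columns]}."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_sql)
    names = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return {
        name: [col[1] for col in conn.execute(f"PRAGMA table_info({name})")]
        for (name,) in names
    }

layouts = {label: tables_for(sql) for label, sql in DESIGNS.items()}
for label, layout in layouts.items():
    print(label, layout)
```

Roughly speaking, design 1 needs a join to see all of a subclass object’s attributes, design 2 repeats the superclass attributes, and design 3 leaves B or C null for objects outside that subclass.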

The cardinality |C| of a class C is equal to the number of objects in the class.


Indexes

When querying a database, indexes make the difference between scanning the entire table and locating the matching tuples immediately.

Underlying data structures for indexes are:
– Balanced trees (a.k.a. B trees, B+ trees)
– Hash tables

Balanced trees can be used for locating tuples whose attribute equals a value or falls in a certain range, like att = value, att < value, or value1 < att < value2. Lookups take logarithmic time.

Hash tables, on the other hand, run in (roughly) constant time. They can only be used for att = value.
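
A toy illustration of the difference, in plain Python rather than a real DBMS: a dict stands in for a hash index, and a sorted list searched with bisect stands in for the ordered leaf level of a B+ tree. The data and the function names are made up for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical (GPA, student ID) rows.
rows = [(3.9, 123), (3.4, 234), (3.9, 345), (2.9, 456), (3.5, 567)]

# Hash-style index: attribute value -> positions of matching rows.
hash_index = {}
for pos, (gpa, _sid) in enumerate(rows):
    hash_index.setdefault(gpa, []).append(pos)

# Ordered index: sorted (value, position) pairs.
ordered_index = sorted((gpa, pos) for pos, (gpa, _sid) in enumerate(rows))
keys = [gpa for gpa, _pos in ordered_index]

def lookup_equal(value):
    """att = value: a single hash probe."""
    return [rows[pos] for pos in hash_index.get(value, [])]

def lookup_range(low, high):
    """low < att < high: binary search for both ends, then read the slice."""
    start = bisect_right(keys, low)   # first key strictly greater than low
    stop = bisect_left(keys, high)    # first key not less than high
    return [rows[pos] for _gpa, pos in ordered_index[start:stop]]

print(lookup_equal(3.9))
print(lookup_range(3.0, 3.9))
```

The dict has no useful notion of order, so a range query against it would have to look at every key. That is the reason hash indexes are limited to equality conditions.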

Many database management systems build indexes automatically on primary keys.

There are some downsides to indexing:
– extra space (not really a big deal, just a marginal concern)
– overhead in index creation (a medium-sized concern)
– index maintenance (a large concern, since index maintenance can offset the benefits of indexing altogether)

A physical design advisor can be helpful for choosing which attributes to index. With an input of database stats and planned workload, it gives an output of recommended indexes.

SQL syntax

CREATE INDEX index_name
ON table_name (column_name)
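
One way to watch an index take effect is to ask the database for its query plan before and after creating it. The sketch below uses Python’s built-in sqlite3 module. The Student table, its data, and the index name `student_sid` are made up for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sID INTEGER, sName TEXT, GPA REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(i, f"student{i}", 2.0 + (i % 20) / 10) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT sName FROM Student WHERE sID = 42"

plan_before = plan(query)   # no index yet: expect a full scan of the table
conn.execute("CREATE INDEX student_sid ON Student (sID)")
plan_after = plan(query)    # the plan should now mention student_sid

print(plan_before)
print(plan_after)
```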

Constraints and Triggers

Constraints and triggers for relational databases are part of the SQL standard. Constraints, as the name suggests, constrain the allowable states of the database. Triggers monitor changes to the database. When changes occur, triggers check conditions and initiate actions.

Integrity constraints impose restrictions on allowable data, beyond the restrictions imposed by structure or type. There are several different types of integrity constraints, including non-null constraints, key constraints, attribute-based and tuple-based constraints, general assertions, and referential integrity constraints (also known as foreign key constraints).
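
Here is a sketch of most of those constraint types in SQLite, driven from Python’s built-in sqlite3 module. The tables, the data, and the `rejected` helper are made up for the example. General assertions are left out because SQLite does not support them, and foreign-key checking has to be switched on explicitly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.executescript("""
CREATE TABLE College (cName TEXT PRIMARY KEY);
CREATE TABLE Student (
    sID   INTEGER PRIMARY KEY,                       -- key constraint
    sName TEXT NOT NULL,                             -- non-null constraint
    GPA   REAL CHECK (GPA >= 0.0 AND GPA <= 4.0),    -- attribute-based constraint
    cName TEXT REFERENCES College(cName)             -- referential integrity
);
INSERT INTO College VALUES ('Stanford');
INSERT INTO Student VALUES (1, 'Amy', 3.9, 'Stanford');
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

violations = {
    "non-null": rejected("INSERT INTO Student VALUES (2, NULL, 3.0, 'Stanford')"),
    "key": rejected("INSERT INTO Student VALUES (1, 'Bob', 3.0, 'Stanford')"),
    "attribute-based": rejected("INSERT INTO Student VALUES (3, 'Cal', 4.7, 'Stanford')"),
    "referential": rejected("INSERT INTO Student VALUES (4, 'Dee', 3.2, 'Nowhere')"),
}
valid_rejected = rejected("INSERT INTO Student VALUES (5, 'Eve', 3.1, 'Stanford')")
print(violations, valid_rejected)
```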

Triggers are also called event-condition-action (ECA) rules. When an event occurs, check a condition; if it’s true, perform an action. Triggers can be used to move logic from the application to the database itself. You might think of them as a special type of stored procedure.

Triggers vary by implementation. (In other words, MySQL, SQLite and PostgreSQL handle triggers differently.)

A trigger definition names when it fires and, optionally, a condition:
BEFORE | AFTER | INSTEAD OF events
WHEN (condition)

Triggers can reference variables from the old and new tuples. For example, you might include the following line if you need to refer to the old and new tuples within the trigger:
OLD as old, NEW as new

Triggers can be row-level (using the “FOR EACH ROW” keywords) or statement-level (without them).
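
Here is a small row-level trigger with an event, a condition, and an action, written for SQLite and run through Python’s built-in sqlite3 module. The tables and the trigger name are made up for the example. Note the SQLite-specific behavior: OLD and NEW are available without any renaming clause, and SQLite only supports row-level triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL);
CREATE TABLE GpaDrop (sID INTEGER, oldGPA REAL, newGPA REAL);

-- Event: an update of GPA. Condition: the GPA went down. Action: log it.
CREATE TRIGGER LogGpaDrop
AFTER UPDATE OF GPA ON Student
FOR EACH ROW
WHEN NEW.GPA < OLD.GPA
BEGIN
    INSERT INTO GpaDrop VALUES (OLD.sID, OLD.GPA, NEW.GPA);
END;

INSERT INTO Student VALUES (1, 3.9), (2, 3.0);
UPDATE Student SET GPA = 3.5 WHERE sID = 1;  -- a drop: the condition holds
UPDATE Student SET GPA = 3.2 WHERE sID = 2;  -- a rise: the condition fails
""")

logged = conn.execute("SELECT * FROM GpaDrop").fetchall()
print(logged)
```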

Some tricky issues with triggers include:
– Knowing whether to use a row-level or statement-level trigger.
– When multiple triggers are activated at the same time, which one should go first?
– Trigger actions that activate other triggers (chaining). Also: self-triggering, cycles, nested invocations.
– Conditions in the WHEN clause vs. as part of the action clause.

