What is XML Schema?
XML Schema is an XML-based language used to create XML-based languages and data
models. An XML schema defines element and attribute names for a class of XML
documents. The schema also specifies the structure that those documents must
adhere to and the type of content that each element can hold. XML documents that
attempt to adhere to an XML schema are said to be instances of that schema. If
they correctly adhere to the schema, then they are valid instances. This is not
the same as being well formed. A well-formed XML document follows all the syntax
rules of XML, but it does not necessarily adhere to any particular schema. So, an
XML document can be well formed without being valid, but it cannot be valid
unless it is well formed.
The Power of XML Schema:
You may already have some experience with DTDs. DTDs are similar to XML schemas
in that they are used to create classes of XML documents. DTDs were around long
before the advent of XML. They were originally created to define languages based
on SGML, the parent of XML. Although DTDs are still common, XML Schema is a much
more powerful language.
Limitations of DTD:
1. DTDs do not have built-in data types.
2. DTDs do not support user-derived data types.
3. DTDs allow only limited control over cardinality (the number of occurrences
of an element within its parent).
4. DTDs do not support namespaces or any simple way of reusing or importing
other schemas.
Form of an XML Schema Definition:
An XML Schema is an XML document.
First item: XML declaration
<?xml version="1.0"?>
XML comments and processing instructions are allowed.
Root element: schema with a namespace declaration.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- schema rules go here -->
</xs:schema>
Possible namespace prefixes: xs, xsd, or none.
Element Specification
Elements are declared using an
element named xs:element with an attribute that gives the name of the element
being defined. The type of the content of the new element can be specified by
another attribute or by the content of the xs:element definition. Element
declarations can be one of two sorts.
Simple Type
Content of these elements can be text
only.
Examples
<xs:element name="item" type="xs:string"/>
<xs:element name="price" type="xs:decimal"/>
The values xs:string and xs:decimal
are two of the 44 simple types predefined in the XML Schema language.
Complex Type
Element content can contain other
elements or the element can have attributes (or both).
Example
<xs:element name="location">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="state" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The element xs:sequence is one of
several ways to combine elements in the content.
Corresponding DTD:
<!ELEMENT location (city, state)>
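To make the difference between valid and merely well formed concrete, here is a sketch of two instance documents for the location element declared above. The city and state values are made up for illustration, and the sketch assumes the schema has no target namespace. The first document is a valid instance:

```xml
<?xml version="1.0"?>
<location>
  <city>Iowa City</city>
  <state>Iowa</state>
</location>
```

The second document is well formed but not valid, because xs:sequence requires city to come before state:

```xml
<?xml version="1.0"?>
<location>
  <state>Iowa</state>
  <city>Iowa City</city>
</location>
```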
Other xs:element Attributes
An xs:element element may also have
attributes that specify the number of occurrences of the element at this
position in the sequence.
minOccurs="0" // default = 1
maxOccurs="5" // default = 1 (must not be less than minOccurs)
maxOccurs="unbounded" // no upper limit
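As a minimal sketch of how these attributes combine, the fragment below declares a hypothetical order element. The element names order, customer, note, item, and tag are invented for illustration, and the fragment is assumed to sit inside an xs:schema element.

```xml
<xs:element name="order">
  <xs:complexType>
    <xs:sequence>
      <!-- neither attribute given: exactly one customer -->
      <xs:element name="customer" type="xs:string"/>
      <!-- optional: zero or one note -->
      <xs:element name="note" type="xs:string" minOccurs="0"/>
      <!-- between one and five items -->
      <xs:element name="item" type="xs:string" maxOccurs="5"/>
      <!-- any number of tags, including none -->
      <xs:element name="tag" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

In DTD terms, the note, item, and tag declarations correspond roughly to note?, a bounded repetition that a DTD can only write out longhand, and tag*.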
Mixed Content
Mixed content refers to the situation
where an element has both text and elements in its content. Because of the
sub-elements, the element being defined must be a complex type. To allow mixed
content in an element definition, add the attribute mixed to the
xs:complexType start tag:
mixed="true"
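A short sketch of a mixed-content declaration follows. The element names para and em are hypothetical, and the fragment is assumed to sit inside an xs:schema element with no target namespace.

```xml
<xs:element name="para">
  <xs:complexType mixed="true">
    <xs:sequence>
      <!-- any number of em children, with text allowed around them -->
      <xs:element name="em" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

An instance such as the following would then be accepted:

```xml
<para>Schemas are <em>much</em> more expressive than DTDs.</para>
```

Without the mixed attribute on xs:complexType, the text surrounding em would make this instance invalid.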
This attribute has a default value of
"false".
Simple Types
XML Schema provides a rich set of predefined simple types along with a mechanism
to customize these types to create an accurate specification of XML documents.
The predefined types can be classified into several groups: numeric, date and
time, XML types, string, Boolean, URIs, and binary data.
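As a minimal sketch of the customization mechanism, a new simple type can be derived by restricting a predefined one. The names percentType and percent are hypothetical, and the fragment is assumed to sit inside an xs:schema element with no target namespace.

```xml
<!-- an integer limited to the range 0 through 100 -->
<xs:simpleType name="percentType">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>

<!-- an element whose content must be a percentType value -->
<xs:element name="percent" type="percentType"/>
```

With this declaration, <percent>42</percent> would be valid, while <percent>150</percent> and <percent>high</percent> would not.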
Unions
A simple type can be defined as the
(disjoint) union of two existing simple types. The element xs:union has an
attribute memberTypes whose value is a space-separated list of simple types
that have already been defined.
Example
Suppose we want to store exam scores,
but in some instances, the grade may not be available.
1. Define a type representing a missing score.
<xs:simpleType name="noScoreType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="none"/>
  </xs:restriction>
</xs:simpleType>
2. Define a union type of integer scores and missing scores.
<xs:simpleType name="scoreOrNoType">
  <xs:union memberTypes="xs:integer noScoreType"/>
</xs:simpleType>
3. Define a list of the union type.
<xs:simpleType name="scoreOrNoList">
  <xs:list itemType="scoreOrNoType"/>
</xs:simpleType>
4. Define a type whose values can be a list of scores (or none) or can be a date
on which we can expect the grades to be made available.
<xs:simpleType name="scoresOrDateType">
  <xs:union memberTypes="xs:date scoreOrNoList"/>
</xs:simpleType>
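To see what these four definitions accept, the sketch below declares an element of the final union type. The element name scores is hypothetical, and the declaration is assumed to sit in the same schema as the types above, with no target namespace.

```xml
<xs:element name="scores" type="scoresOrDateType"/>
```

Each line below stands for the root of a separate instance document. The score and date values are made up for illustration.

```xml
<!-- valid: a list mixing integer scores and the token none -->
<scores>85 none 92</scores>

<!-- valid: a single date, matched by the xs:date member of the union -->
<scores>2024-05-01</scores>

<!-- not valid: neither a single date nor a list of scores -->
<scores>85 2024-05-01</scores>
```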