What is a schema?

A schema is a document that describes the structure of an XML document and defines how a set of XML tags can be used. This helps ensure that the elements and attributes in an XML document are used properly.

Schemas can be used to:

provide users of an XML document with a list of the elements and attributes that can be used in that document
indicate where elements and attributes can be used in a document, and the order in which elements must appear
provide information about a document that is both human- and machine-readable
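
To make these points concrete, here is a minimal sketch of a small XML document that carries its own schema, written as a DTD inside the document. The vocabulary (jobs, job, title, description, and the id attribute) is invented for illustration and is not part of any standard.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical vocabulary: element and attribute names are illustrative only -->
<!DOCTYPE jobs [
  <!-- A jobs list holds one or more job elements -->
  <!ELEMENT jobs (job+)>
  <!-- Each job holds a title followed by a description, in that order -->
  <!ELEMENT job (title, description)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!-- Every job must carry an id attribute -->
  <!ATTLIST job id ID #REQUIRED>
]>
<jobs>
  <job id="j101">
    <title>Reference Librarian</title>
    <description>Staffs the reference desk and answers research questions.</description>
  </job>
</jobs>
```

The declarations between the square brackets list the allowed elements and attributes and the order they appear in, and they can be read by both people and software.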

There are many different languages available for writing schemas, but the three most popular are Document Type Definition (DTD), XML Schema Definition (XSD), and RELAX NG. Each schema language offers different features and uses a different syntax, so the choice of language depends on the features an author needs for their XML document.
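
As a rough taste of how much the syntax can differ, the sketch below describes a hypothetical job element in RELAX NG’s XML syntax. The element and attribute names are assumptions made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical job element; names are illustrative only -->
<element name="job" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- The id attribute must be present -->
  <attribute name="id"/>
  <!-- title is required -->
  <element name="title"><text/></element>
  <!-- closingDate may be left out -->
  <optional>
    <element name="closingDate"><text/></element>
  </optional>
</element>
```

In a DTD, the same content rule would be written as <!ELEMENT job (title, closingDate?)>, which shows how two languages can express one idea in quite different ways.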

Why use a schema?

There are many reasons you might want to use a schema. For example, suppose you’re in charge of regularly publishing a list of job openings. To accomplish this, you use XML to mark up plain text documents containing job information, which will later be displayed on a website. If you need to be out of the office and someone else becomes responsible for marking up and posting the job openings, a schema will make it easier for that person to understand how the document should be structured.

Additionally, the schema can be used to check that person’s work: validating their XML markup against the schema confirms that the elements and attributes are being used properly.

Validating XML

Validating XML means checking that the elements and attributes in a document are used according to the rules set out in a schema. There are a number of different ways to validate an XML document. If you’re using an XML editor like oXygen XML Editor, the schema will typically be detected automatically and used to indicate errors in a document. If you’re working in a code editor such as Notepad++ or Brackets, you’ll likely need to use an online validation service, such as the site XML Validation, to check the validity of your XML code.
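
The sketch below shows the kind of mistake validation is meant to catch. It assumes a hypothetical job vocabulary with the rules written as a DTD at the top of the document. The document is well-formed XML, but a validating parser would be expected to flag the two problems marked in the comments.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jobs [
  <!ELEMENT jobs (job+)>
  <!ELEMENT job (title, description)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!ATTLIST job id ID #REQUIRED>
]>
<jobs>
  <!-- Problem 1: the required id attribute is missing from job -->
  <job>
    <!-- Problem 2: description comes before title, but the rules require title first -->
    <description>Maintains the library website.</description>
    <title>Web Developer</title>
  </job>
</jobs>
```

Adding an id attribute to job and moving title above description would make the document valid against these rules.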

Creating a schema

Schemas can be created using the same software used to create XML documents, such as XML editors, code editors and plain text editors. If you’re using an XML editor to create a schema, you can choose which language you’ll be creating the schema in, and the XML editor will help make sure you’re following that schema language’s syntax. If you’re working with a text or code editor, the main thing to remember is to save the document with the file extension that matches the schema language (conventionally .dtd for DTD, .xsd for XML Schema, and .rng for RELAX NG’s XML syntax) so that editors and validators can recognize the schema.
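
The file name matters because an XML document usually points to its schema by name. As a minimal sketch, a document that uses a DTD saved as jobs.dtd (a hypothetical file name) in the same folder might begin like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- jobs.dtd is a hypothetical file; SYSTEM gives its location relative to this document -->
<!DOCTYPE jobs SYSTEM "jobs.dtd">
<jobs>
  <job id="j101">
    <title>Reference Librarian</title>
    <description>Staffs the reference desk and answers research questions.</description>
  </job>
</jobs>
```

If the schema file were renamed or moved, the reference in the DOCTYPE line would need to be updated to match.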

What to do before creating a schema

Before creating a schema, you’ll need to determine which elements and attributes it will include and how documents that use the schema should be structured. Working out that structure in advance makes the schema much easier to write.

It might help to sketch a diagram of your XML document that notes which elements are used and where, which attributes each element takes, and the order in which child elements appear.
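
One possible way to make such a sketch is to write a skeleton document and annotate it with comments. The example below is a planning aid only, and the element and attribute names are hypothetical.

```xml
<!-- Planning sketch only; names are hypothetical -->
<!-- jobs: root element, contains one or more job elements -->
<jobs>
  <!-- job: repeatable; id attribute is required -->
  <job id="j101">
    <!-- title: required, comes first -->
    <title>...</title>
    <!-- department: required, comes second -->
    <department>...</department>
    <!-- description: required, comes third -->
    <description>...</description>
    <!-- closingDate: optional, comes last -->
    <closingDate>...</closingDate>
  </job>
</jobs>
```

Each comment in the sketch becomes a rule to express when you write the schema itself.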

The schema languages covered in this course are:

Document Type Definition (DTD)
XML Schema Definition (XSD)
RELAX NG

As mentioned previously, these are the three primary schema languages used with XML. While these languages share a number of things in common, they also have their differences.

Schema language similarities and differences

The following are characteristics the three main schema languages have in common:

All let you declare elements and attributes
All allow you to indicate the order of elements and child elements, as well as how many times those elements can appear in a document
All allow you to indicate whether elements and attributes are required or optional
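
As an illustration of these shared capabilities, the sketch below expresses all three in XML Schema (XSD) for a hypothetical job listing. The element and attribute names are assumptions chosen for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="jobs">
    <xs:complexType>
      <xs:sequence>
        <!-- job must appear at least once and may repeat without limit -->
        <xs:element name="job" maxOccurs="unbounded">
          <xs:complexType>
            <!-- xs:sequence fixes the order of the child elements -->
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="description" type="xs:string"/>
              <!-- minOccurs="0" makes this element optional -->
              <xs:element name="closingDate" type="xs:date" minOccurs="0"/>
            </xs:sequence>
            <!-- use="required" makes the attribute mandatory -->
            <xs:attribute name="id" type="xs:ID" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

DTD and RELAX NG can state the same rules, each with its own notation for order, repetition and optionality.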

However, there are some differences between the three languages:

Each schema language has its own syntax
Certain features available in one language may not be available in the others (for example, DTD lets you declare entities, but the other two languages do not)
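
To illustrate the entity feature, here is a minimal sketch of a DTD that declares an entity and a document that uses it. The entity name org, its replacement text, and the element names are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE job [
  <!ELEMENT job (title, employer)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT employer (#PCDATA)>
  <!-- Declares an entity named org with some replacement text -->
  <!ENTITY org "Example Public Library">
]>
<job>
  <title>Archivist</title>
  <!-- The reference to org below is replaced by the entity text when the document is parsed -->
  <employer>&org;</employer>
</job>
```

An entity like this lets you define a piece of repeated text once and reuse it throughout a document.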