Exploring XML Syntax

As mentioned previously, the key to creating XML documents is to stick to the rules. As long as you follow a few simple requirements, you can create XML for nearly any purpose imaginable. Let's learn more about these rules, and then use them as a guide when creating XML markup for the job listings document.

Elements

The most basic unit in an XML document is an element. An element is a container that can hold data or other elements. Elements are generally delimited by start-tags and end-tags, both of which are enclosed in angle brackets. End-tags differ from start-tags in that they include a forward slash character (/) in front of the element name. Following is an example of an element:


<my-element>This text is contained within an element.</my-element>

There is another type of element called an empty element. These elements have no content between the opening and closing tags. Instead of writing out both tags, the empty element can be abbreviated by using a single tag and putting the slash at the end of the tag name. The tags <break></break> can be written simply as <break />. Empty elements are often used as a marker, such as the HTML <img /> element, which denotes the location of an image in a web page. In XML, empty elements are sometimes used to give commands to certain XML processors or as containers to store information only in attributes.
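
One way to confirm that the two forms are equivalent is to hand both to an XML processor. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module as the processor (this tutorial does not otherwise depend on it); the <image> element and its source attribute are illustrative names only.

```python
import xml.etree.ElementTree as ET

# Parse the long form and the abbreviated form of the same empty element.
long_form = ET.fromstring("<break></break>")
short_form = ET.fromstring("<break />")

# Both yield an element named "break" with no text and no child elements.
for element in (long_form, short_form):
    print(element.tag, element.text, len(element))

# An empty element can still store information in its attributes.
# The element and attribute names below are hypothetical.
image = ET.fromstring('<image source="logo.png" />')
print(image.get("source"))
```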

Element Relationships

As mentioned, XML elements can contain other elements. This makes XML well suited to storing information that can be structured as an outline or a tree-like hierarchy. In a diagram of a family tree, you can easily see the relationships between parents, children, and siblings. A diagram of an XML document shows the same relationships. The recipe XML file shown above could be visualized by the following diagram:

Outline of the recipe XML document discussed earlier, broken down into a diagram.  The recipe element is at the top of the chart, and branching out from recipe are the elements dish-name, ingredients, and directions.  Branching out from the ingredients element are three ingredient elements, and branching out from the directions element are three step elements.

We often refer to elements by their relationship or function in an XML document:

    • Document or root element - An element that encloses all other elements in a document. In the example above, <recipe> is the document element. It is the only element that does not have a parent. Each XML document must have one and only one document element.
    • Child elements & parent elements - Child elements are contained within a parent element. Each child can have only one parent, but a parent can have more than one child. In the above example, <ingredient> is a child of <ingredients>.
    • Sibling elements - Elements that are on the same level and share the same parent. The elements <dish-name> and <ingredients> are siblings in the example above.
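
These relationships can also be inspected programmatically. The sketch below assumes Python's standard xml.etree.ElementTree module and rebuilds the recipe structure from the diagram description; the text inside each element is placeholder content, not the original recipe.

```python
import xml.etree.ElementTree as ET

# Same shape as the recipe diagram; the element text is placeholder content.
recipe_xml = """
<recipe>
  <dish-name>Placeholder dish</dish-name>
  <ingredients>
    <ingredient>first ingredient</ingredient>
    <ingredient>second ingredient</ingredient>
    <ingredient>third ingredient</ingredient>
  </ingredients>
  <directions>
    <step>first step</step>
    <step>second step</step>
    <step>third step</step>
  </directions>
</recipe>
"""

root = ET.fromstring(recipe_xml)
print("Document element:", root.tag)

# The children of the document element are siblings of one another.
children = [child.tag for child in root]
print("Children of recipe:", children)

# ElementTree does not store parent links, so build a child-to-parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}
ingredients = root.find("ingredients")
print("Parent of an ingredient:", parent_of[ingredients[0]].tag)

# The document element is the only element without a parent.
print("recipe has a parent:", root in parent_of)

# A second top-level element is rejected by the parser.
try:
    ET.fromstring("<recipe/><recipe/>")
    second_root_accepted = True
except ET.ParseError:
    second_root_accepted = False
print("Two document elements accepted:", second_root_accepted)
```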

Properly Nesting Elements

When we're writing an XML document, we need to make sure that every element other than the document element has one and only one parent element, and that it does not overlap with its siblings. For example, the following code will cause an error:


<last-name>Jones<address>1017 S West St.</last-name></address>

Notice that <last-name> and <address> overlap each other. To fix this, you should close the <last-name> element before beginning <address>, as demonstrated in the following code:


<last-name>Jones</last-name><address>1017 S West St.</address>

This makes them proper sibling elements, so an XML processor can read them without errors.
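
To see how a processor treats the two versions, the sketch below assumes Python's standard xml.etree.ElementTree module. The helper is_well_formed and the enclosing <person> element are hypothetical additions: the wrapper is there only because a complete document needs a single document element around the two siblings.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# <last-name> is closed while <address> is still open, so the tags overlap.
overlapping = (
    "<person><last-name>Jones<address>1017 S West St."
    "</last-name></address></person>"
)

# <last-name> is closed before <address> begins, so the two are siblings.
nested = (
    "<person><last-name>Jones</last-name>"
    "<address>1017 S West St.</address></person>"
)

print("overlapping:", is_well_formed(overlapping))
print("nested:", is_well_formed(nested))
```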

Valid Element Names

There are some limitations on the names that can be chosen for elements. A name can contain upper- and lower-case letters, numbers, and the punctuation listed below, but it must begin with a letter or an underscore and cannot contain spaces:

Character     Example
underscore    <vehicle_type>
hyphen        <listing-id>
period        <font.size>

Other punctuation should be avoided, as many punctuation marks serve special purposes in XML.
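
A quick way to check a candidate name is to let a parser try it. The following sketch assumes Python's standard xml.etree.ElementTree module; the helper name_is_accepted and the three rejected names are made up for illustration. It also shows a rule worth remembering: a name cannot begin with a digit or contain a space.

```python
import xml.etree.ElementTree as ET


def name_is_accepted(name: str) -> bool:
    """Try to parse an empty element with the given name."""
    try:
        ET.fromstring(f"<{name}/>")
    except ET.ParseError:
        return False
    return True


candidates = [
    "vehicle_type",  # underscore
    "listing-id",    # hyphen
    "font.size",     # period
    "2nd-listing",   # begins with a digit
    "job listing",   # contains a space
    "salary$",       # punctuation that is not allowed in names
]

results = {name: name_is_accepted(name) for name in candidates}
for name, accepted in results.items():
    print(f"{name!r}: {'accepted' if accepted else 'rejected'}")
```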

NOTE: It is possible to use non-English characters in element names, as long as they are supported by Unicode. Keep in mind, however, that others using your XML document may find such tags difficult to understand and may not have the fonts needed to display them. To learn more, read this W3C FAQ article on using non-English tags in XML.

Case Sensitivity

As previously mentioned, you can use upper- and lower-case letters in XML element names — however, one important thing to keep in mind is that XML is case-sensitive. If you use capital letters in the start-tag, you must use the same capitalization in the end-tag. For example, the following code example will cause an error because, to the XML parser, <first-step> is an entirely different element than </First-step>:


<first-step>A First Step</First-step>

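The error can be reproduced with any XML parser. This sketch assumes Python's standard xml.etree.ElementTree module; the exact wording of the error message comes from the underlying parser and may differ in other tools.

```python
import xml.etree.ElementTree as ET

# The end-tag uses different capitalization, so it does not match the start-tag.
try:
    ET.fromstring("<first-step>A First Step</First-step>")
    mismatch_error = None
except ET.ParseError as error:
    mismatch_error = str(error)

print("Parser reported:", mismatch_error)

# With matching capitalization the element parses normally.
matched = ET.fromstring("<first-step>A First Step</first-step>")
print(matched.tag, "->", matched.text)
```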

Now that we know the rules of XML syntax, let's start marking up job_postings.xml.

Adding the Document Element

The first element you'll be adding to the job listings document is the document element.  As discussed previously, it encloses all other elements inside an XML document, so it makes sense to add this element first.  

Before doing that, let's take a quick look at job_postings.xml. You'll notice that the document has a heading at the top; we can replace this with the document element. The document element <midwest-job-listing> will suit our needs and succinctly describes the document's contents. Let's remove the document's heading, then add the document element to job_postings.xml.

NOTE: XML authors are advised to use human-readable element names such as the one above. Abbreviated tag names such as <mdw-job-lst> or <mjl> may mean less work typing, but they will make it much harder for others to understand your markup.

  1. To select the heading for the document,

    Press & Drag UNIVERSITY OF THE MIDWEST JOB POSTINGS

  2. To delete the text, on the keyboard, press:

    Backspace key

  3. To add the opening tag for the document element, add the highlighted code to the top of job_postings.xml:

<midwest-job-listing> (12/23/2018)

Instructional Technologist (37483)

$25-$30 an hour

Full-Time Faculty

  4. Scroll to the bottom of the document.
  5. To add the closing tag for the document element, add the highlighted code to the bottom of job_postings.xml:

AVAILABILITY:
Days only.

Starts immediately

</midwest-job-listing>

  6. To save job_postings.xml, in the menu bar,

    Click File, Click Save

XML Declaration

Although it is not required, it's typically a good idea to include an XML declaration with your documents. The declaration serves to notify people and applications that you're using the XML format, as well as what specific version of XML and what type of encoding your document uses. It looks a lot like an element, but it isn't actually considered one by XML standards. If you include one in your document, it must be the first thing in that document — typically placed before the document element, with no whitespace or other characters before the declaration.

The XML declaration you're about to add indicates that you're using XML 1.0 (the most widely used version of the standard) and that your document is encoded in UTF-8, an encoding of the Unicode character set. Unicode is a standard that allows you to use characters from a wide variety of writing systems (e.g. Korean, Greek, Arabic, Cyrillic) in your document. Because of this, it should be favored over narrower character sets that only allow for English or Western characters. If you do not specify a character encoding, XML processors assume a Unicode encoding by default (UTF-8, or UTF-16 when the file begins with a byte order mark).
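
The effect of the declaration, and of its position, can be checked with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the Greek sample text is made up for illustration and is not part of job_postings.xml.

```python
import xml.etree.ElementTree as ET

declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
body = "<midwest-job-listing>Ελληνικά</midwest-job-listing>"

# Encode the document as UTF-8 bytes, the way it would be stored on disk.
with_declaration = (declaration + body).encode("utf-8")
without_declaration = body.encode("utf-8")

# Both parse to the same text: UTF-8 is assumed when no encoding is declared.
text_with = ET.fromstring(with_declaration).text
text_without = ET.fromstring(without_declaration).text
print(text_with == text_without)

# The declaration must be the very first thing in the document.
# A single leading space is enough for the parser to reject it.
try:
    ET.fromstring(b" " + with_declaration)
    leading_space_accepted = True
except ET.ParseError:
    leading_space_accepted = False
print("Leading whitespace accepted:", leading_space_accepted)
```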

Let's include the XML declaration in job_postings.xml.

  1. To position the cursor,

    Click to the left of <midwest-job-listing>, then press: Enter, Up Arrow

  2. To add the declaration to the document, add the highlighted code to the beginning of job_postings.xml:


<?xml version="1.0" encoding="UTF-8"?>
<midwest-job-listing> (12/23/2018)

Instructional Technologist (37483)

$25-$30 an hour
