As mentioned previously, the key to creating XML documents is to stick to the rules. As long as you follow a few simple requirements, you can create XML for nearly any purpose imaginable. Let's learn more about these rules, and then use them as a guide when creating XML markup for the job listings document.
Elements
The most basic unit in an XML document is an element. An element is a container that can hold data or other elements. Elements are generally delimited by start-tags and end-tags, both of which are enclosed in angle brackets. End-tags differ from start-tags in that they include a forward slash character (/) in front of the element name. Following is an example of an element:
<my-element>This text is contained within an element.</my-element>
There is another type of element called an empty element. Empty elements have no content between the start-tag and end-tag. Instead of writing out both tags, you can abbreviate an empty element to a single tag with a forward slash just before the closing angle bracket. The tags <break></break> can be written simply as <break />. Empty elements are often used as markers, such as the HTML <img /> element, which denotes the location of an image in a web page. In XML, empty elements are sometimes used to give commands to certain XML processors or as containers that store information only in attributes.
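One way to see these rules in action is to hand the markup to an XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as the parser (any XML parser would do). It reads the element from the earlier example and compares the two ways of writing an empty element.

```python
import xml.etree.ElementTree as ET

# An element with text content between its start-tag and end-tag.
element = ET.fromstring(
    "<my-element>This text is contained within an element.</my-element>"
)
print(element.tag)   # the element name
print(element.text)  # the text held by the element

# The long and the abbreviated form of an empty element.
long_form = ET.fromstring("<break></break>")
short_form = ET.fromstring("<break />")


def is_empty(elem):
    """Return True if the element has no text and no child elements."""
    return not elem.text and len(elem) == 0


print(is_empty(long_form), is_empty(short_form))
```

To the parser, both spellings of the break element produce the same thing: an element with a name and nothing inside it.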
Element Relationships
As mentioned earlier, XML elements can contain other elements. This makes XML ideal for storing information that can be structured like an outline or a tree-like hierarchy. If you imagine a diagram of a family tree, you can easily see the relationships between parents, children, and siblings. A diagram of an XML document shows the same relationships. The recipe XML file shown above could be visualized by the following diagram:
We often refer to elements by their relationship or function in an XML document:
- Document or root element - An element that encloses all other elements in a document. In the example above, <recipe> is the document element. It is the only element that does not have a parent. Each XML document must have one and only one document element.
- Child elements & parent elements - Child elements are contained within a parent element. Each child can have only one parent, but a parent can have more than one child. In the above example, <ingredient> is a child of <ingredients>.
- Sibling elements - Elements that are on the same level and share the same parent. The elements <dish-name> and <ingredients> are siblings in the example above.
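These relationships can also be explored in code. The sketch below assumes Python's standard xml.etree.ElementTree module and a small stand-in recipe document. Only the element names <recipe>, <dish-name>, <ingredients>, and <ingredient> come from the text; the content inside them is a hypothetical placeholder. ElementTree does not store a link from a child to its parent, so the sketch builds that lookup itself.

```python
import xml.etree.ElementTree as ET

# Stand-in recipe document; the text content is a hypothetical placeholder.
RECIPE = """<recipe>
  <dish-name>Pancakes</dish-name>
  <ingredients>
    <ingredient>flour</ingredient>
    <ingredient>milk</ingredient>
  </ingredients>
</recipe>"""

root = ET.fromstring(RECIPE)

# Map every child element to its parent. The document element never
# appears as a key because it has no parent.
parent_of = {child: parent for parent in root.iter() for child in parent}


def siblings(elem):
    """Return the other elements that share elem's parent."""
    parent = parent_of.get(elem)
    return [] if parent is None else [c for c in parent if c is not elem]


ingredient = root.find("ingredients/ingredient")
dish_name = root.find("dish-name")

print(root.tag)                              # the document element
print(parent_of[ingredient].tag)             # parent of <ingredient>
print([s.tag for s in siblings(dish_name)])  # siblings of <dish-name>
```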
Properly Nesting Elements
When we're writing an XML document, we need to make sure that each element has one and only one parent element, and that it does not overlap with its siblings. For example, the following code will cause an error:
<last-name>Jones<address>1017 S West St.</last-name></address>
Notice that <last-name> and <address> overlap each other. To fix this, you should close the <last-name> element before beginning <address>, as demonstrated in the following code:
<last-name>Jones</last-name><address>1017 S West St.</address>
This makes them into proper sibling elements, and won't cause any errors when processed by an XML processor.
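To check the difference between overlapping and properly nested elements, you can run both versions through a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module. The enclosing <person> element is a hypothetical wrapper added only because a parser expects a single document element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the parser accepts the markup without an error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# <person> is a hypothetical wrapper so each sample has one document element.
overlapping = (
    "<person><last-name>Jones<address>1017 S West St."
    "</last-name></address></person>"
)
nested = (
    "<person><last-name>Jones</last-name>"
    "<address>1017 S West St.</address></person>"
)

print(is_well_formed(overlapping))  # overlapping elements are rejected
print(is_well_formed(nested))       # proper siblings are accepted
```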
Valid Element Names
There are some limitations on the names that can be chosen for elements. A name cannot begin with a number, a hyphen, or a period. Otherwise, you can generally use upper- and lower-case letters, numbers, and the following punctuation:
Character | Example |
underscore | <vehicle_type> |
hyphen | <listing-id> |
period | <font.size> |
Other punctuation should be avoided, as many punctuation marks serve special purposes in XML.
NOTE: It is possible to use non-English characters in element names, so long as they are supported by Unicode — however, keep in mind that it might be difficult for others using your XML document to understand tags in non-English characters, and they may not have the correct fonts needed for displaying non-English characters. To learn more about using non-English characters in XML documents, read this W3C FAQ article on using non-English tags in XML.
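If you are unsure whether a name is acceptable, a quick test is to ask a parser. The sketch below assumes Python's standard xml.etree.ElementTree module. It wraps each candidate name in an empty element and reports whether the parser accepts it as that element's name. The candidate names beyond the three in the table are hypothetical examples.

```python
import xml.etree.ElementTree as ET


def parses_as_element_name(name):
    """Return True if <name /> parses as an empty element called name."""
    try:
        return ET.fromstring(f"<{name} />").tag == name
    except ET.ParseError:
        return False


candidates = [
    "vehicle_type",  # underscore
    "listing-id",    # hyphen
    "font.size",     # period
    "prénom",        # non-English letter
    "2nd-listing",   # begins with a number
    "-listing",      # begins with a hyphen
    "price$",        # other punctuation
]

for name in candidates:
    print(name, parses_as_element_name(name))
```

This is only a rough check of single names, not a full implementation of the naming rules in the XML specification.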
Case Sensitivity
As previously mentioned, you can use upper- and lower-case letters in XML element names. One important thing to keep in mind, however, is that XML is case-sensitive. If you use capital letters in the start-tag, you must use the same capitalization in the end-tag. For example, the following code will cause an error because, to the XML parser, <first-step> and <First-step> are entirely different elements, so the end-tag </First-step> does not match the start-tag:
<first-step>A First Step</First-step>
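The effect of case sensitivity is easy to confirm with a parser. The following minimal sketch, again assuming Python's standard xml.etree.ElementTree module, parses the mismatched example alongside a corrected version, then compares the two spellings of the name directly.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the parser accepts the markup without an error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


mismatched = "<first-step>A First Step</First-step>"
matched = "<first-step>A First Step</first-step>"

print(is_well_formed(mismatched))  # end-tag capitalization differs
print(is_well_formed(matched))     # start-tag and end-tag agree

# The two spellings are treated as two different element names.
lower = ET.fromstring("<first-step />")
upper = ET.fromstring("<First-step />")
print(lower.tag == upper.tag)
```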
Now that we know the rules of XML syntax, let's start marking up job_postings.xml.
Adding the Document Element
The first element you'll be adding to the job listings document is the document element. As discussed previously, it encloses all other elements inside an XML document, so it makes sense to add this element first.
Before doing that, let's take a quick look at job_postings.xml. You'll notice that the document has a heading at the top, which we can replace with the document element. The document element <midwest-job-listing> will suit our needs and succinctly describes the document's contents. Let's remove the document's heading, then add the document element to job_postings.xml.
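As a rough illustration of the result, the sketch below builds a document that consists only of the <midwest-job-listing> document element and confirms that a parser accepts it. It assumes Python's standard xml.etree.ElementTree module. The placeholder text stands in for the actual contents of job_postings.xml, which are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Create the document element that will enclose everything else.
root = ET.Element("midwest-job-listing")

# Hypothetical placeholder for the unmarked job listing text.
root.text = "\n(job listing text goes here)\n"

# Serialize to a string, then parse it back to confirm it is well-formed.
document = ET.tostring(root, encoding="unicode")
print(document)

reparsed = ET.fromstring(document)
print(reparsed.tag)
```

Every element added later would go between this start-tag and end-tag.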