XML, or eXtensible Markup Language, is a markup language that is designed for storing and sharing information across different computing platforms. XML provides structure to documents that can then be read and displayed by a program.
A Brief History of XML
The history of XML starts with the development of Standard Generalized Markup Language (SGML) in the 1970s, an attempt to create a language for marking up documents with structural tags so that they would be machine-readable. SGML became an international standard in 1986. After SGML came Hypertext Markup Language (HTML), developed by Tim Berners-Lee to transmit documents over the World Wide Web. It was released in 1993, originally as an application of SGML.
However, SGML and HTML had their limitations. SGML was a little too complex for general use, and required specialized programs to read the markup – and while HTML was easier to use, its main focus was displaying information, not providing meaningful structure to the information. This is where XML comes in!
XML aimed to be a human-readable markup language that’s easy to work with and that gives documents enough structure to be machine-readable. XML was officially made a recommendation of the World Wide Web Consortium (W3C) in 1998.
XML in action
XML provides a general set of rules that authors can use to create their own markup languages, that is, their own methods of structuring or formatting a document. Because all XML files follow the same general specification, they can be used with a wide variety of software and computing platforms, and shared with others who may have data similar to yours.
For example, suppose you had a large collection of recipes that you wanted to store on a computer. You could simply create text files that looked like the following:
Extreme Fajitas
Ingredients
1 lb. sliced flank steak
1 large yellow onion, sliced
2 bell peppers cut into half-inch wide strips
1 package Extreme Fajitas marinade
Directions
Let all ingredients marinate for at least two hours.
Pre-heat fajita pan in 350 degree oven. Place ingredients
in pan and cook for 20 minutes.
Remove from oven. Serve sizzling with tortillas,
tomatoes, guacamole, lettuce and sour cream.
There isn't any information in the document that meaningfully indicates the document's structure to a computer — and as a result, it would be difficult for a computer to take this information and make use of it. However, using the rules of XML, we could create a markup language that would indicate the different parts of a recipe in order to make it machine-readable as well as human-readable. The following example shows what such a markup language might look like:
<recipe>
<dish-name>Extreme Fajitas</dish-name>
<ingredients>
<ingredient>1 lb. sliced flank steak</ingredient>
<ingredient>1 large yellow onion, sliced</ingredient>
<ingredient>2 bell peppers cut into half-inch wide strips</ingredient>
<ingredient>1 package Extreme Fajitas marinade</ingredient>
</ingredients>
<directions>
<step>Let all ingredients marinate for at least two hours. </step>
<step>Pre-heat fajita pan in 350 degree oven. Place ingredients in pan and cook for 20 minutes.</step>
<step>Remove from oven. Serve sizzling with tortillas, tomatoes, guacamole, lettuce and sour cream.</step>
</directions>
</recipe>
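To see what "machine-readable" means in practice, here is a minimal sketch that reads the recipe markup with Python's standard-library `xml.etree.ElementTree` module. The choice of Python and the helper name `summarize_recipe` are assumptions made for illustration; any XML processor could do the same job.

```python
import xml.etree.ElementTree as ET

# The recipe markup from the example above, stored as a string.
RECIPE_XML = """<recipe>
  <dish-name>Extreme Fajitas</dish-name>
  <ingredients>
    <ingredient>1 lb. sliced flank steak</ingredient>
    <ingredient>1 large yellow onion, sliced</ingredient>
    <ingredient>2 bell peppers cut into half-inch wide strips</ingredient>
    <ingredient>1 package Extreme Fajitas marinade</ingredient>
  </ingredients>
  <directions>
    <step>Let all ingredients marinate for at least two hours.</step>
    <step>Pre-heat fajita pan in 350 degree oven. Place ingredients in pan and cook for 20 minutes.</step>
    <step>Remove from oven. Serve sizzling with tortillas, tomatoes, guacamole, lettuce and sour cream.</step>
  </directions>
</recipe>"""


def summarize_recipe(xml_text):
    """Hypothetical helper: pull the named parts out of a recipe document."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("dish-name"),
        "ingredients": [el.text.strip() for el in root.iter("ingredient")],
        "steps": [el.text.strip() for el in root.iter("step")],
    }


summary = summarize_recipe(RECIPE_XML)
print(summary["name"])
for ingredient in summary["ingredients"]:
    print(" -", ingredient)
```

Because every part of the recipe is labeled, the program can ask for "the ingredients" by name instead of guessing where they begin and end in a block of plain text.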
If you’re even a little familiar with HTML, you’ll notice that HTML and XML share some similarities (both types of documents are built similarly, both use similar building blocks like elements, tags, and so on) – however, they also have some pretty big differences.
First off, XML’s main purpose is for structuring and describing data, while HTML’s main focus is displaying data. While HTML documents do have structure, that structure only focuses on the organization and appearance of the data, not what each piece of data actually means. For example, if we were to mark up our recipe with HTML, the elements used would be very different. Instead of indicating what a piece of information means and how it relates to the rest of the document, the markup would be describing how the information contained should look.
Unlike HTML, which has a set of predefined tags used for specific purposes (<p> for paragraphs, <h2> for second-level headings, and so on), XML has no predefined tags. This allows authors to come up with their own tags and structure to fit the needs of their data. For recipes, this means we can come up with whatever tags we need to mark up the recipes we’ve got in our collection, like <dish-name> and <ingredients>.
Additionally, XML is extensible, meaning that you can add and remove elements to meet your document’s needs without breaking how an XML processor reads your content. (In comparison, the standard HTML elements are predefined by the HTML specification, and a document author can’t extend that vocabulary to describe their own kind of data.) For example, if you needed to add an element for optional ingredients to your recipe markup language, you could easily do that, and it wouldn't affect how the rest of your content is read by an XML processor.
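The sketch below illustrates that point using Python's standard-library `xml.etree.ElementTree`. The element name `<optional-ingredient>`, its content, and the two helper functions are hypothetical, chosen only for this example: code written for the original markup keeps returning the same ingredients after the new element is added.

```python
import xml.etree.ElementTree as ET

# A shortened version of the recipe, before and after extending the markup.
ORIGINAL = """<recipe>
  <dish-name>Extreme Fajitas</dish-name>
  <ingredients>
    <ingredient>1 lb. sliced flank steak</ingredient>
    <ingredient>1 large yellow onion, sliced</ingredient>
  </ingredients>
</recipe>"""

# <optional-ingredient> is a made-up element added for illustration.
EXTENDED = """<recipe>
  <dish-name>Extreme Fajitas</dish-name>
  <ingredients>
    <ingredient>1 lb. sliced flank steak</ingredient>
    <ingredient>1 large yellow onion, sliced</ingredient>
    <optional-ingredient>sour cream</optional-ingredient>
  </ingredients>
</recipe>"""


def required_ingredients(xml_text):
    """Code written for the original markup: reads only <ingredient> elements."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("ingredient")]


def optional_ingredients(xml_text):
    """Newer code that also understands the added element."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("optional-ingredient")]


# The older function gives the same answer for both documents.
print(required_ingredients(ORIGINAL) == required_ingredients(EXTENDED))
print(optional_ingredients(EXTENDED))
```

Older code simply never looks for the new element, so it is unaffected, while newer code can opt in to reading it.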
XML separates data from presentation, so the same data can be displayed or formatted in different ways to meet specific needs. XML data can also be transformed into other file types, such as an HTML document or a PDF, using tools like Extensible Stylesheet Language Transformations (XSLT), or displayed as part of an existing website using JavaScript or PHP.
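As a rough illustration of turning XML data into a presentation format, the sketch below builds an HTML fragment from a shortened recipe. It is not XSLT: it uses Python's standard-library `xml.etree.ElementTree` as a stand-in, and the function name `recipe_to_html` and the choice of HTML elements are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A shortened version of the recipe markup used earlier.
RECIPE_XML = """<recipe>
  <dish-name>Extreme Fajitas</dish-name>
  <ingredients>
    <ingredient>1 lb. sliced flank steak</ingredient>
    <ingredient>1 large yellow onion, sliced</ingredient>
  </ingredients>
  <directions>
    <step>Let all ingredients marinate for at least two hours.</step>
  </directions>
</recipe>"""


def recipe_to_html(xml_text):
    """Hypothetical helper: map recipe elements onto HTML display elements."""
    recipe = ET.fromstring(xml_text)

    article = ET.Element("article")
    ET.SubElement(article, "h2").text = recipe.findtext("dish-name")

    # Ingredients become an unordered list.
    ingredient_list = ET.SubElement(article, "ul")
    for ingredient in recipe.iter("ingredient"):
        ET.SubElement(ingredient_list, "li").text = ingredient.text

    # Steps become an ordered list, since their sequence matters.
    step_list = ET.SubElement(article, "ol")
    for step in recipe.iter("step"):
        ET.SubElement(step_list, "li").text = step.text

    return ET.tostring(article, encoding="unicode", method="html")


html = recipe_to_html(RECIPE_XML)
print(html)
```

The XML source stays unchanged; only the mapping function decides how it looks, so a different function could produce a different layout from the same data.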
Why XML, instead of something else?
You might be thinking, why should I use XML to mark up my data? Let's go back to the example of the recipe collection. If you wanted, you could create a Microsoft Access database that holds your recipe collection, which you could then search through to find specific recipes. However, if you wanted to share your collection, only people who use Access could open and use your database. With XML, anyone with an XML processor can make use of your recipe collection, and even add your recipes to their own. You can also add new elements to your recipes without having to rework your entire collection.
In summary, XML is...
- Flexible: you can make XML documents that will suit whatever your needs are
- Shareable: allows for sharing of information in a non-proprietary format
- Easily readable: both by humans and XML processors
Now that you know a little bit more about XML, you're ready to start learning about how to create XML.