Hey y'all! If you're here, then you must be as excited about XML 1.0 as I am! I'm your go-to gal for all the latest and greatest news about XML 1.0. From the latest updates to quirky little tidbits, I'll have it all for you. So if you're looking for a good laugh, some cool knowledge, or just to stay up-to-date on XML 1.0, then follow me for my next blogs!
XML Documents
XML documents are textual objects that are well-formed according to the standard.
Each XML document has both a logical and a physical structure.
- Physically, a document is composed of storage units called entities.
- Logically, a document is composed of declarations, elements, comments, processing instructions, and other structural components.
Well-Formedness
Roughly, well-formedness means the following:
- A single top-level element called the root element contains all the other elements.
- Each start-tag has a corresponding end-tag.
- Elements are properly nested and do not overlap.
- Each parsed entity referenced in the document, directly or indirectly, is itself well-formed.
Many other constraints, the so-called well-formedness constraints, must also be fulfilled.
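To see these rules in action, here is a minimal sketch that asks Python's standard `xml.etree.ElementTree` parser whether a string is well-formed. The helper name `is_well_formed` and the sample strings are my own, chosen only for illustration; the parser checks well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule from the list above.
samples = {
    "single root, properly nested": "<a><b>text</b></a>",
    "two top-level elements": "<a/><b/>",
    "missing end-tag": "<a><b></a>",
    "overlapping elements": "<a><b></a></b>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample should be accepted; the other three each break one of the rules listed above.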
Elements
Each element is either delimited by a start-tag and an end-tag, or made up of a single empty-element tag.
Examples:
<author>Sir Arthur Conan Doyle</author>
<message xml:lang="en">Hello, World!</message>
<img src="logo.png" alt="Logo"/>
The name specified in a start-tag, end-tag, or empty-element tag is called the element type.
- Well-formedness constraint: the name in an element's end-tag must match the element type in the start-tag.
- The start-tag and the end-tag surround the content of the element.
- Elements may have a set of name-value pairs called attribute specifications.
An element with no content is said to be empty.
The representation of an empty element is either a start-tag immediately followed by an end-tag, or an empty-element tag.
Example:
<elem></elem>
or
<elem/>
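A quick way to convince yourself that the two spellings are equivalent is to parse both and compare the results. This is a small sketch using Python's standard `xml.etree.ElementTree`; the helper `is_empty` is a name I made up for the example.

```python
import xml.etree.ElementTree as ET

# The two representations of an empty element.
pair = ET.fromstring("<elem></elem>")
single = ET.fromstring("<elem/>")


def is_empty(element: ET.Element) -> bool:
    """An element is empty if it has no text and no child elements."""
    return element.text is None and len(element) == 0


print(is_empty(pair), is_empty(single))
print(pair.tag == single.tag)
```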
Special Characters
The '&' and the '<' characters must not appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section.
- If they are needed elsewhere, they must be specified using either character references or the &amp; and &lt; entity references.
- The '>' character may be represented using the &gt; entity reference.
- To allow attribute values to contain both single and double quotes, the apostrophe (') and the double-quote (") characters may be specified as &apos; and &quot;, respectively.
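In practice you rarely escape these characters by hand. As a hedged sketch, Python's standard `xml.sax.saxutils` module offers `escape` for character data and `quoteattr` for attribute values; the element name `m`, the attribute name `note`, and the sample strings below are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

text = "if (0 < n && n <= 10)"
value = 'it\'s "ok"'  # contains both kinds of quotes

escaped_text = escape(text)      # replaces &, < and > with entity references
quoted_value = quoteattr(value)  # adds the surrounding quotes and escapes as needed

document = f"<m note={quoted_value}>{escaped_text}</m>"
print(document)

# Parsing the document gives back the original strings.
parsed = ET.fromstring(document)
print(parsed.text == text, parsed.get("note") == value)
```

Note that `quoteattr` only falls back to &quot; when the value contains both quote characters; otherwise it simply picks the delimiter that does not clash.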
Markup Constructs
Start-Tags, End-Tags, and Empty-Element Tags
Start-tag:
<title>
or
<title xml:lang="en">
End-tag:
</title>
Empty-element tag:
<br/>
or
<hr />
or
<img src="logo.png" alt="Logo"/>
Well-formedness constraint: an attribute name must not appear more than once in the same start-tag or empty-element tag.
The order of attribute specifications in a start-tag or empty-element tag is not significant.
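Both attribute rules are easy to check with a parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the file names in the duplicate-attribute case are made up for the example.

```python
import xml.etree.ElementTree as ET

# Same attributes, different order.
first = ET.fromstring('<img src="logo.png" alt="Logo"/>')
second = ET.fromstring('<img alt="Logo" src="logo.png"/>')
same_attributes = first.attrib == second.attrib  # dict comparison ignores order

# The same attribute name twice in one tag violates well-formedness.
try:
    ET.fromstring('<img src="a.png" src="b.png"/>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True

print(same_attributes, duplicate_rejected)
```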
Character References
In text, attribute values, and literal entity values, Unicode characters may also be expressed using character references of the form:
- &#nnnn; where nnnn is a sequence of decimal digits representing the code point.
- Examples: &#169; (©), &#9775; (☯), &#128570; (😺)
- &#xhhhh; where hhhh is a sequence of hexadecimal digits representing the code point.
- Examples: &#xA9; (©), &#x262F; (☯), &#x1F63A; (😺)
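The decimal and hexadecimal forms are just two spellings of the same code point. Here is a small sketch with Python's standard `xml.etree.ElementTree` that parses both forms and also builds references from characters; `char_ref` is a helper name I invented for the example.

```python
import xml.etree.ElementTree as ET

# The same three characters, written as decimal and as hexadecimal references.
decimal = ET.fromstring("<t>&#169; &#9775; &#128570;</t>").text
hexadecimal = ET.fromstring("<t>&#xA9; &#x262F; &#x1F63A;</t>").text
print(decimal == hexadecimal)


def char_ref(ch: str, hex_form: bool = False) -> str:
    """Build a character reference for a single character."""
    code_point = ord(ch)
    return f"&#x{code_point:X};" if hex_form else f"&#{code_point};"


print(char_ref("\u00a9"), char_ref("\u00a9", hex_form=True))
```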
Entity References
An entity reference refers to the content of a named entity.
- Reference to a parsed general entity: &name;
- Examples: &amp;, &Aacute;, &copyright;
- Parameter-entity reference: %name;
- Examples: %inline;, %ImgAlign;
- Entity references are discussed later with the physical structures.
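As a rough illustration of general entity references, the sketch below declares an internal entity in the document type declaration and lets Python's standard `xml.etree.ElementTree` expand it. The entity name `copyright` mirrors the example above, but its replacement text and the `note` element are hypothetical. The second part shows that a name such as `Aacute` is not predefined in XML itself, so referencing it without a declaration is rejected.

```python
import xml.etree.ElementTree as ET

# &amp; is predefined; &copyright; is declared in the internal subset.
doc = """<!DOCTYPE note [
  <!ENTITY copyright "All rights reserved.">
]>
<note>Fish &amp; chips. &copyright;</note>"""

text = ET.fromstring(doc).text
print(text)

# An entity that is neither predefined nor declared cannot be resolved.
try:
    ET.fromstring("<note>&Aacute;</note>")
    undeclared_rejected = False
except ET.ParseError:
    undeclared_rejected = True

print(undeclared_rejected)
```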
Comments
Comments may appear anywhere in a document outside other markup.
- The exception is the document type declaration, where comments may appear only at the places the grammar allows.
Example:
<!-- This is a comment -->
Processing Instructions
Processing instructions (PIs) pass instructions to applications.
Example:
<?xml-stylesheet type="text/css" href="style.css"?>
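An application sees a processing instruction as a target name plus a data string. As a sketch, Python's standard `xml.dom.minidom` exposes both; the `doc` root element is a placeholder I added so the example is a complete document.

```python
from xml.dom import minidom

document = minidom.parseString(
    '<?xml-stylesheet type="text/css" href="style.css"?><doc/>'
)

# The processing instruction is the first child of the document node.
pi = document.firstChild
is_pi = pi.nodeType == minidom.Node.PROCESSING_INSTRUCTION_NODE

print(is_pi)
print(pi.target)  # the name right after "<?"
print(pi.data)    # everything after the target, up to "?>"
```

The parser does not interpret the data part. What looks like attributes here is only a convention that the consuming application has to parse on its own.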
CDATA Sections
CDATA sections may occur anywhere character data may occur.
- They are used to escape blocks of text containing characters which would otherwise be recognized as markup.
- Within a CDATA section, only the ']]>' string is recognized as markup.
Example:
<![CDATA[if (0 < n && n <= 10)]]>
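A CDATA section is purely a convenience for the author: after parsing, the application receives the same text it would get from escaped character data. Here is a minimal sketch with Python's standard `xml.etree.ElementTree`; the `code` wrapper element is hypothetical.

```python
import xml.etree.ElementTree as ET

# The same content, once in a CDATA section and once with entity references.
with_cdata = ET.fromstring("<code><![CDATA[if (0 < n && n <= 10)]]></code>").text
with_refs = ET.fromstring("<code>if (0 &lt; n &amp;&amp; n &lt;= 10)</code>").text

print(with_cdata)
print(with_cdata == with_refs)
```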
XML Declaration
XML documents should begin with an XML declaration which specifies the version of XML being used.
- The character encoding must be specified if the encoding used is not UTF-8 or UTF-16, unless an encoding is determined by a higher-level protocol (e.g., HTTP).
Examples:
<?xml version="1.0"?>
<?xml version='1.0' encoding='UTF-8'?>
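To show why the encoding declaration matters, the sketch below feeds ISO-8859-1 bytes to Python's standard `xml.etree.ElementTree`, once with and once without a declaration. The `name` element and its content are made up for the example, and no higher-level protocol supplies the encoding here.

```python
import xml.etree.ElementTree as ET

# ISO-8859-1 bytes with a matching encoding declaration.
declared = (
    "<?xml version='1.0' encoding='ISO-8859-1'?><name>Jos\u00e9</name>"
).encode("iso-8859-1")
name = ET.fromstring(declared).text
print(name == "Jos\u00e9")

# The same bytes without a declaration are read as UTF-8 and rejected,
# because the byte 0xE9 does not start a valid UTF-8 sequence here.
undeclared = "<name>Jos\u00e9</name>".encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    undeclared_rejected = False
except ET.ParseError:
    undeclared_rejected = True

print(undeclared_rejected)
```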
Conclusion
Well, that wraps up our discussion on XML 1.0. I hope you all have a better understanding of the language and its capabilities. Before you go, I leave you with one last thought: XML 1.0 has been around for over two decades, but it still has plenty of room for improvement. So, if you want to stay up-to-date on the latest and greatest XML developments, be sure to follow me for my next blog. Until then, happy coding!