XML 1.0

Hey y'all! If you're here, then you must be as excited about XML 1.0 as I am! I'm your go-to gal for all the latest and greatest news about XML 1.0. From the latest updates to quirky little tidbits, I'll have it all for you. So if you're looking for a good laugh, some cool knowledge, or just to stay up-to-date on XML 1.0, then follow me for my next blogs!

XML Documents

XML documents are textual objects that are well-formed according to the standard.

Each XML document has both a logical and a physical structure.

  • Physically, a document is composed of storage units called entities.
  • Logically, a document is composed of declarations, elements, comments, processing instructions, and other structural components.

Well-Formedness

Roughly, a document is well-formed when the following hold:

  • A single top-level element called the root element contains all the other elements.
  • Each start-tag has a corresponding end-tag.
  • Elements are properly nested and do not overlap.
  • Each entity referenced in the document is well-formed.

The standard also defines many other rules, the so-called well-formedness constraints, that must be fulfilled.
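
To see these rules in action, here's a minimal sketch that assumes Python and its standard-library `xml.etree.ElementTree` parser (my choice for illustration, nothing the XML standard requires). The helper name `is_well_formed` is made up for this post; it simply reports whether the parser accepts the text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One root element, everything properly nested.
print(is_well_formed("<book><title>Emma</title></book>"))  # True
# Two top-level elements instead of a single root.
print(is_well_formed("<a/><b/>"))                           # False
# <b> has no corresponding end-tag.
print(is_well_formed("<a><b></a>"))                         # False
# Elements overlap instead of nesting.
print(is_well_formed("<a><b></a></b>"))                     # False
```

A non-validating parser like this one checks well-formedness only; it does not check the document against a DTD or schema.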

Elements

Each element is either delimited by a start-tag and an end-tag, or consists of a single empty-element tag.

Examples:

<author>Sir Arthur Conan Doyle</author>
<message xml:lang="en">Hello, World!</message>
<img src="logo.png" alt="Logo"/>

The name specified in a start-tag, end-tag, or empty-element tag is called the element type.

  • Well-formedness constraint: the name in an element's end-tag must match the element type in the start-tag.
  • The start-tag and the end-tag surround the content of the element.
  • Elements may have a set of name-value pairs called attribute specifications.
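
If you want to poke at these pieces yourself, here's a small sketch, again assuming Python's standard-library `xml.etree.ElementTree`. It reads the element type, the content, and the attribute specifications back out of two of the examples above.

```python
import xml.etree.ElementTree as ET

# An element with a start-tag, content, and an end-tag.
author = ET.fromstring("<author>Sir Arthur Conan Doyle</author>")
print(author.tag)   # the element type: author
print(author.text)  # the content: Sir Arthur Conan Doyle

# An empty-element tag carrying two attribute specifications.
img = ET.fromstring('<img src="logo.png" alt="Logo"/>')
print(img.tag)
print(img.attrib)   # name-value pairs as a dictionary
```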

An element with no content is said to be empty.

  • The representation of an empty element is either a start-tag immediately followed by an end-tag, or an empty-element tag.

    • Example:

      <elem></elem>
      

      or

      <elem/>
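
A quick way to convince yourself the two forms are equivalent, sketched with Python's standard-library `xml.etree.ElementTree` (an assumption of this example, any conforming parser should agree): both parse to an element with no text and no children, and both serialize the same way.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<elem></elem>")
short_form = ET.fromstring("<elem/>")

# Neither form has text content or child elements.
print(long_form.text, len(long_form))
print(short_form.text, len(short_form))

# Serializing them again produces identical output.
same = ET.tostring(long_form, encoding="unicode") == ET.tostring(short_form, encoding="unicode")
print(same)  # True
```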
      

Special Characters

The '&' and the '<' characters must not appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section.

  • If they are needed elsewhere, they must be specified using either character references or the &amp; and &lt; entity references.
  • The '>' character may be represented using the &gt; entity reference.
  • To allow attribute values to contain both single and double quotes, the ' and " characters may be specified as &apos; and &quot;, respectively.
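
In practice you rarely escape by hand. As a sketch, assuming Python's standard library, `xml.sax.saxutils.escape` replaces '&', '<', and '>' for you, and its optional second argument lets you add the two quote characters for attribute values. The element names `code` and `note` are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Character data: escape() handles &, < and >.
raw = "if (0 < n && n <= 10)"
escaped = escape(raw)
print(escaped)  # if (0 &lt; n &amp;&amp; n &lt;= 10)

# Parsing the escaped text gives back the original string.
round_trip = ET.fromstring(f"<code>{escaped}</code>").text
print(round_trip == raw)  # True

# Attribute values: additionally map both quote characters.
value = 'He said "it\'s fine"'
attr = escape(value, {'"': "&quot;", "'": "&apos;"})
note = ET.fromstring(f'<note text="{attr}"/>')
print(note.get("text") == value)  # True
```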

Markup Constructs

Start-Tags, End-Tags, and Empty-Element Tags

Start-tag:

<title>

or

<title xml:lang="en">

End-tag:

</title>

Empty-element tag:

<br/>

or

<hr />

or

<img src="logo.png" alt="Logo"/>

Well-formedness constraint: an attribute name must not appear more than once in the same start-tag or empty-element tag.

The order of attribute specifications in a start-tag or empty-element tag is not significant.
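
Both attribute rules are easy to check. Here's a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`: reordering attributes yields the same set of name-value pairs, while repeating an attribute name makes the parser reject the document.

```python
import xml.etree.ElementTree as ET

# Same attributes, different order: the parsed results compare equal.
first = ET.fromstring('<img src="logo.png" alt="Logo"/>')
second = ET.fromstring('<img alt="Logo" src="logo.png"/>')
print(first.attrib == second.attrib)  # True

# The same attribute name twice violates a well-formedness constraint.
try:
    ET.fromstring('<img src="a.png" src="b.png"/>')
    duplicate_rejected = False
except ET.ParseError as err:
    duplicate_rejected = True
    print("rejected:", err)
```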

Character References

In text, attribute values, and literal entity values, Unicode characters may also be expressed using character references of the form:

  • &#nnnn; where nnnn is a sequence of decimal digits representing the code point.
    • Examples: &#169; (©), &#9775; (☯), &#128570; (😺)
  • &#xhhhh; where hhhh is a sequence of hexadecimal digits representing the code point.
    • Examples: &#xA9; (©), &#x262F; (☯), &#x1F63A; (😺)
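
Decimal and hexadecimal references are just two spellings of the same code point. A short sketch, assuming Python's standard-library `xml.etree.ElementTree` and a throwaway `<p>` element, shows both forms decoding to identical characters:

```python
import xml.etree.ElementTree as ET

# The same three characters, written with decimal and hexadecimal references.
decimal = ET.fromstring("<p>&#169; &#9775; &#128570;</p>").text
hexadecimal = ET.fromstring("<p>&#xA9; &#x262F; &#x1F63A;</p>").text

print(decimal == hexadecimal)  # True

# List the code points the parser produced.
code_points = [hex(ord(ch)) for ch in decimal.split()]
print(code_points)  # ['0xa9', '0x262f', '0x1f63a']
```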

Entity References

An entity reference refers to the content of a named entity.

  • Reference to a parsed general entity: &name;
    • Examples: &amp;, &Aacute;, &copyright;
  • Parameter-entity reference: %name;
    • Examples: %inline;, %ImgAlign;
  • Entity references are discussed later with the physical structures.
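
Until we get to the physical structures, here's a small preview sketch, assuming Python's standard-library `xml.etree.ElementTree`. The `footer` element and the replacement text of the `copyright` entity are invented for the example. A general entity declared in the document type declaration is expanded wherever it is referenced, &amp; is predefined, and a reference to an entity that was never declared is rejected.

```python
import xml.etree.ElementTree as ET

# Declare a general entity in the internal DTD subset, then reference it.
doc = """<!DOCTYPE footer [
  <!ENTITY copyright "Copyright 2024 Example Corp.">
]>
<footer>&copyright; Tom &amp; Jerry</footer>"""

expanded = ET.fromstring(doc).text
print(expanded)  # Copyright 2024 Example Corp. Tom & Jerry

# &Aacute; is not predefined in XML, so without a declaration it is an error.
try:
    ET.fromstring("<p>&Aacute;</p>")
    undeclared_rejected = False
except ET.ParseError as err:
    undeclared_rejected = True
    print("rejected:", err)
```

Only five entities are predefined in XML: amp, lt, gt, apos, and quot. Names like Aacute come from DTDs such as the XHTML ones.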

Comments

Comments may appear anywhere in a document outside other markup.

  • In addition, they may appear within the document type declaration, but only at the places the grammar allows.

Example:

<!-- This is a comment -->
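
Here's a quick sketch of where comments are and aren't accepted, assuming Python's standard-library `xml.etree.ElementTree`; the helper name `parses` is made up for this post. The last case relies on a rule not mentioned above: the XML grammar does not allow the string "--" inside a comment.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the parser accepts `text`."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Before the root, inside content, and after the root are all fine.
print(parses("<!-- before --><a><!-- inside -->x</a><!-- after -->"))  # True
# Inside a tag counts as "inside other markup".
print(parses("<a <!-- inside a tag --> />"))                           # False
# A double hyphen may not occur within a comment.
print(parses("<a><!-- double -- hyphen --></a>"))                      # False
```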

Processing Instructions

Processing instructions pass information to applications; the XML processor hands them on without interpreting them.

Example:

<?xml-stylesheet type="text/css" href="style.css"?>

CDATA Sections

CDATA sections may occur anywhere character data may occur.

  • They are used to escape blocks of text containing characters which would otherwise be recognized as markup.
  • Within a CDATA section, only the ']]>' string is recognized as markup.

Example:

<![CDATA[if (0 < n && n <= 10)]]>
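
A CDATA section is only a different way of writing the same character data. As a sketch, assuming Python's standard-library `xml.etree.ElementTree` and a made-up `<code>` wrapper element, the CDATA version and the entity-reference version parse to the same text:

```python
import xml.etree.ElementTree as ET

# The same content, once in a CDATA section and once with entity references.
with_cdata = ET.fromstring("<code><![CDATA[if (0 < n && n <= 10)]]></code>")
with_refs = ET.fromstring("<code>if (0 &lt; n &amp;&amp; n &lt;= 10)</code>")

print(with_cdata.text)                    # if (0 < n && n <= 10)
print(with_cdata.text == with_refs.text)  # True
```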

XML Declaration

XML documents should begin with an XML declaration which specifies the version of XML being used.

  • The character encoding must be specified if the encoding used is not UTF-8 or UTF-16, unless an encoding is determined by a higher-level protocol (e.g., HTTP).

Examples:

<?xml version="1.0"?>
<?xml version='1.0' encoding='UTF-8'?>
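
To see why the encoding declaration matters, here's a small sketch assuming Python's standard-library `xml.etree.ElementTree`; the `name` element and its content are invented for the example. The same ISO-8859-1 bytes parse correctly when the declaration names the encoding, and are rejected when the parser has to assume UTF-8.

```python
import xml.etree.ElementTree as ET

# Bytes encoded as ISO-8859-1, with a declaration that says so.
declared = "<?xml version='1.0' encoding='ISO-8859-1'?><name>José</name>".encode("iso-8859-1")
decoded_name = ET.fromstring(declared).text
print(decoded_name)  # José

# The same element without a declaration: the parser assumes UTF-8,
# and the lone 0xE9 byte is not valid UTF-8.
undeclared = "<name>José</name>".encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError as err:
    undeclared_ok = False
    print("rejected:", err)
```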

Conclusion

Well, that wraps up our discussion on XML 1.0. I hope you all have a better understanding of the language and its capabilities. Before you go, I leave you with one last thought: XML 1.0 has been around for over two decades, but it still has plenty of room for improvement. So, if you want to stay up-to-date on the latest and greatest XML developments, be sure to follow me for my next blog. Until then, happy coding!
