Glossary of terms

The following terms are used in an XML and COBOL context.

Array

A COBOL table, that is, a data item described with the OCCURS clause.

Caching

Caching is a means of improving performance by keeping loaded XSLT stylesheets, templates, and schema documents in memory so that they can be reused without being reloaded. If the application dynamically generates new copies of such documents, it can disable caching permanently or selectively. Caching is enabled by default when an application starts.
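
The caching behavior described above can be sketched as follows. This is a minimal illustration in Python, not the XML Extensions implementation; the DocumentCache class, its methods, and the fake_loader function are hypothetical names introduced only for this example.

```python
class DocumentCache:
    """Keeps loaded documents in memory, keyed by path (illustrative only)."""

    def __init__(self, loader):
        self._loader = loader   # function that loads a document from a path
        self._store = {}        # path -> loaded document
        self.enabled = True     # caching is on by default

    def get(self, path):
        if not self.enabled:
            return self._loader(path)            # caching disabled: always reload
        if path not in self._store:
            self._store[path] = self._loader(path)
        return self._store[path]                 # reuse the copy held in memory

    def disable(self):
        self.enabled = False
        self._store.clear()


loads = []  # records every real load so the effect of caching is visible

def fake_loader(path):
    loads.append(path)
    return f"<parsed {path}>"

cache = DocumentCache(fake_loader)
cache.get("style.xsl")
cache.get("style.xsl")
loads_with_cache = len(loads)      # one load serves both requests

cache.disable()
cache.get("style.xsl")
cache.get("style.xsl")
loads_after_disable = len(loads)   # each request now reloads the document
print(loads_with_cache, loads_after_disable)
```

Disabling the cache matters when documents are regenerated at run time, because a cached copy would otherwise hide the newer version.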

COBOL data structure

A COBOL data structure is a COBOL data item. In general, it is a group data item, but in some cases it may be a single elementary data item. The Enterprise Developer compiler's XMLGEN directive generates a file containing an XML representation of a COBOL data structure. This map can be used to move data in either direction at run time. Extensible Stylesheet Language Transformations (XSLT) of the XML data representation can be used to match XML element names to COBOL data-names in cases where the names differ.

Document Type Definition (DTD)

The document type definition occurs between the XML header and the first element of an XML document. It optionally declares the document structure and entities. Declared entities may be referenced in the document.
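
As an illustration of a DTD that declares both structure and an entity, the following Python sketch parses a small document whose DTD sits between the XML header and the first element. The document content (the order and customer elements and the company entity) is invented for this example.

```python
import xml.etree.ElementTree as ET

# The DTD appears between the XML header and the first element.
document = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ELEMENT order (customer)>
  <!ELEMENT customer (#PCDATA)>
  <!ENTITY company "Example Corp">
]>
<order><customer>&company;</customer></order>"""

root = ET.fromstring(document)

# The reference &company; is replaced by the text declared in the DTD.
customer_text = root.find("customer").text
print(customer_text)
```

Note that this parser expands the declared entity but does not validate the document against the element declarations.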


DOM

An acronym for Document Object Model. XML documents are parsed and stored in the DOM for processing.
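
A brief sketch of parsing a document into a DOM and then processing it, using Python's standard xml.dom.minidom module. The sample document is invented for illustration, and this is not the parser that XML Extensions uses internally.

```python
from xml.dom.minidom import parseString

# Parse the text into an in-memory DOM tree.
dom = parseString("<order id='42'><item>Widget</item><item>Gadget</item></order>")

# Process the tree: read the root element, an attribute, and child text.
root = dom.documentElement
order_id = root.getAttribute("id")
item_names = [node.firstChild.data for node in root.getElementsByTagName("item")]
print(root.tagName, order_id, item_names)
```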

External XSLT stylesheet

An XSLT stylesheet that is provided by the user and referenced as a parameter in the XML EXPORT FILE/TEXT, XML IMPORT FILE/TEXT, or XML TRANSFORM FILE/TEXT statements. See also XSLT stylesheet.


HTML

An acronym for Hypertext Markup Language. A text description language related to SGML; it mixes text format markup with plain text content to describe formatted text. HTML is ubiquitous as the source language for Web pages on the Internet. Starting with HTML 4.0, the Unicode Standard functions as the reference character set for HTML content. See also SGML, XHTML, and XML.


iconv

A character conversion library available on some UNIX systems for converting between Unicode characters and local characters. When an iconv library is available, the MF_XMLEXT_LOCAL_ENCODING environment variable may specify the name of a conversion supported by that iconv library, and the xmlif library will use that conversion. Otherwise, the only conversions supported are "rmlatin1" and "rmlatin9".
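
The kind of conversion involved can be illustrated with Python's built-in codecs. This sketch does not call iconv or the xmlif library. It also assumes, without confirmation from this glossary, that "rmlatin1" and "rmlatin9" behave like ISO 8859-1 (Latin-1) and ISO 8859-15 (Latin-9).

```python
# Convert Unicode text to a single-byte local encoding and back.
text = "caf\u00e9"                       # "café"
local_bytes = text.encode("iso8859-1")   # one byte per character
round_trip = local_bytes.decode("iso8859-1")

# The euro sign exists in Latin-9 but not in Latin-1.
euro = "\u20ac"
euro_latin9 = euro.encode("iso8859-15")
try:
    euro.encode("iso8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

print(local_bytes, euro_latin9, euro_fits_latin1)
```

The choice of local encoding therefore determines which Unicode characters can be represented at all.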

Model files

XML document files created by the XMLGEN compiler directive. The model file is usually named the same as the COBOL program source file but with an extension of .xml.

Schema valid XML document

An XML document that conforms to a particular XML schema.


SGML

An acronym for Standard Generalized Markup Language. A standard framework, defined in ISO 8879, for defining particular text markup languages. The SGML framework allows for mixing structural tags that describe format with the plain text content of documents, so that formatted text can be fully described in a plain text stream of data. See also HTML and XML.

Structured document

The term "structured document" describes the concept that a document can contain content, such as words, numbers, and pictures, as well as information describing the role of content elements and substructures. Adding "structure" to documents facilitates searching, sorting, and a variety of other operations on an electronic document. The benefits of adding structure to electronic documents include portability, reusability, inter-system operability, ease of storage and retrieval, longevity, quick access, and low distribution costs. XML is a set of rules for structuring a document using hierarchical markup. See also XML.

Stylesheet

See XSLT stylesheet.

UNC

An acronym for Universal Naming Convention. UNC is a filename format that is used to specify the location of files, folders, and resources on a local area network (LAN). For example, a UNC address may look something like this:


UNC can also be used to identify peripheral devices shared on the network, including scanners and printers. It gives each shared resource a unique address, which allows operating systems that support UNC (such as Windows) to locate specific resources directly.
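
The parts of a UNC name can be picked apart with Python's pathlib, as in this sketch. The server, share, and file names are hypothetical and chosen only for illustration.

```python
from pathlib import PureWindowsPath

# Hypothetical UNC name: \\server\share followed by a path on that share.
unc = PureWindowsPath(r"\\fileserver\shared\docs\order.xml")

share = unc.drive           # the server and share portion
folders = unc.parts[1:-1]   # directories below the share
filename = unc.name         # the file itself
print(share, folders, filename)
```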


Unicode

Unicode was developed to support the worldwide interchange, processing, and display of the world's diverse languages and technical disciplines. Unicode is a character coding system that assigns a unique number to each character in each of the world's principal written languages. There are several ways to represent a sequence of such characters, or their integer values, as a sequence of bytes. The two most obvious encodings store Unicode text as either 2- or 4-byte sequences. The official terms for these encodings are UCS-2 and UCS-4, respectively. At the time this glossary was written, the current version of the Unicode Standard, developed by the Unicode Consortium, was v4.0.0. For an alternative encoding of Unicode, see also UTF-8.
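
The difference between a character's number and its byte representation can be seen in a short Python sketch. Python does not offer codecs named UCS-2 or UCS-4, so the example uses UTF-16 and UTF-32 in big-endian form, which produce the same fixed-width bytes for the two sample characters used here.

```python
text = "A\u20ac"   # the letter A and the euro sign

# Each character has one unique number (code point).
code_points = [ord(ch) for ch in text]

# The same two characters stored as 2-byte and as 4-byte sequences.
two_byte = text.encode("utf-16-be").hex()
four_byte = text.encode("utf-32-be").hex()
print(code_points, two_byte, four_byte)
```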


URL

An acronym for Uniform Resource Locator, which is a unique identifier (address) of a specific resource, or file, that is available on the World Wide Web (WWW) and other Internet resources. The URL contains the protocol (the method of access) to be used to access the file resource (for example, http:// for World Wide Web pages, ftp:// for file transfers, mailto: for e-mail, and so forth), the domain name that identifies a specific host computer on the Internet for the file, and the path that specifies the location of the file on that computer.

A URL is a type of URI (Uniform Resource Identifier, formerly called Universal Resource Identifier).

For XML Extensions purposes, a filename specification is considered to be a URL if it begins with http://, https://, or file://.
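
The prefix rule above can be expressed as a simple test. This sketch is illustrative only; the is_url function is a hypothetical helper, and treating the prefixes as case-sensitive is an assumption rather than documented behavior.

```python
# Prefixes that mark a filename specification as a URL, per the rule above.
URL_PREFIXES = ("http://", "https://", "file://")

def is_url(filename: str) -> bool:
    """Return True if the specification begins with one of the URL prefixes."""
    return filename.startswith(URL_PREFIXES)

print(is_url("https://example.com/orders.xml"))
print(is_url("orders.xml"))
```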


UTF-8

UTF stands for Unicode Transformation Format. UTF-8 is an encoding scheme (that is, a method of mapping the Unicode code points to a digital representation), which is commonly used under UNIX-style operating systems and in XML documents. UTF-8 is defined in ISO 10646-1:2000 Annex D and is also described in RFC 2279, as well as section 3.8 of the Unicode 3.0 standard. It is a variable-length encoding scheme that, as described in RFC 2279, uses from 1 to 6 bytes per character; the later RFC 3629 restricts this to 1 to 4 bytes. See also Unicode.
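
The variable-length property can be observed directly. In this Python sketch the four sample characters are arbitrary choices from different code point ranges.

```python
# A Latin letter, an accented letter, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')}")

# Number of bytes each character occupies in UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in samples]
print(lengths)
```

Characters in the ASCII range keep their single-byte values, which is one reason the encoding is common on UNIX-style systems.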

Valid XML document

See Schema valid XML document.

Well-formed XML document

A well-formed XML document is one that conforms to the syntax requirements of XML. A well-formed XML document may or may not be a valid document with respect to a particular XML schema.
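
A well-formedness check needs only a parser, not a schema, as this Python sketch shows. The is_well_formed function is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML; no schema is consulted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<order><item/></order>"))        # properly nested
print(is_well_formed("<order><item></order>"))         # <item> is never closed
```

Deciding whether a well-formed document is also valid requires a separate step that compares it against a particular schema.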


XHTML

An acronym for Extensible HyperText Markup Language. When HTML 4.0 is expressed as XML, it is called XHTML. See also HTML.


XML

An acronym for Extensible Markup Language. A subset of SGML constituting a particular text markup language for interchange of structured data. The Unicode Standard is the reference character set for XML content. See also Unicode.

XML Instance document

XML instance documents (.xml files) are built using the elements declared in a schema, and also contain data. You use XML syntax extensions to read from and write to XML instance documents.

XML schema

An XML schema (usually an .xsd file) is made up in part of element definitions. These definitions dictate the structure and content of an XML instance document that references the schema. XML elements can have attributes associated with them. These attributes further describe the data contained in the element.
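
Because a schema is itself an XML document, its element and attribute definitions can be listed with an ordinary parser. The schema text in this Python sketch is invented for illustration, and the code only inspects the schema; it does not validate an instance document.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"   # XML Schema namespace

schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:integer"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# Element definitions dictate structure; attributes further describe an element.
element_names = [e.get("name") for e in schema.iter(XS + "element")]
attribute_names = [a.get("name") for a in schema.iter(XS + "attribute")]
print(element_names, attribute_names)
```

An instance document conforming to this schema would have an order root element with an id attribute and a customer child.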

XSL

An acronym for Extensible Stylesheet Language. A W3C standard defining XSLT stylesheets for (and in) XML. See also XSLT and W3C.


XSLT

An acronym for Extensible Stylesheet Language Transformations. XSLT is the "Transformations" part of the Extensible Stylesheet Language (XSL). A W3C standard, it is used to transform XML documents to other formats, including HTML, other forms of XML, and plain text. It supports complex processing of an XML document's data, such as selecting, reordering, and renaming elements. See also XSL and W3C.

XSLT stylesheet

An XML document that is written in Extensible Stylesheet Language Transformations (XSLT). Note that XSLT stylesheets should not be confused with Cascading Style Sheets (CSS), which are a simple method for adding style, such as fonts, color, and spacing, to a document for final output to a browser; cascading style sheets are closely related to HTML and XHTML.


W3C

An acronym for World Wide Web Consortium. The main standards body for the World Wide Web (WWW). W3C works with the global community to establish international standards for client and server protocols that enable online commerce and communications on the Internet.