Advanced Java and Web Technologies for JNTUK


This article explains the Document Type Definition (DTD), which is one way to specify the high-level syntax of an XML document. We will learn how to create DTDs and how to use them with XML documents.

 

Introduction

 

A Document Type Definition (DTD) is a set of structural rules called declarations, which the tags, attributes and entities in an XML document should follow. A document can be tested against the DTD to determine whether it conforms to the rules that the DTD describes.

 

A DTD can be embedded in the XML document, in which case it is called an internal DTD, or it can be specified in a separate file that can be linked to several XML documents, in which case it is called an external DTD.

 

Syntactically, a DTD is a sequence of declarations, each of which has the form of a markup declaration as shown below:

<!keyword … >

 

The keyword can be any one of the following four keywords:

  1. ELEMENT – which defines a tag
  2. ATTLIST – which defines the attributes of a tag
  3. ENTITY – which defines an entity
  4. NOTATION – which defines data type notations
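
As a quick preview of the four keywords, the sketch below declares one of each inside a small document. The names note, lang, college and jpeg are made up for illustration and do not come from any real vocabulary.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE note [
  <!-- ELEMENT: declares the note tag, whose content is plain text -->
  <!ELEMENT note (#PCDATA)>
  <!-- ATTLIST: declares an optional lang attribute for note -->
  <!ATTLIST note lang CDATA #IMPLIED>
  <!-- ENTITY: declares a named piece of replacement text -->
  <!ENTITY college "Example Institute of Technology">
  <!-- NOTATION: gives a name to a non-XML data format -->
  <!NOTATION jpeg SYSTEM "image/jpeg">
]>
<note lang="en">Welcome to &college;.</note>
```

Each kind of declaration is explained in the sections that follow.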

 

Declaring Elements

 

Each element declaration in a DTD specifies the structure of one category of elements. The declaration provides the element name along with the specification of the structure of that element.

 

An XML document can be thought of as a tree. An element is an internal node or a leaf node in the tree. The form of an element declaration for elements that contain other elements is as shown below:

<!ELEMENT  element_name (list of names of child elements)>

 

For example, the declaration of a student element can be created as shown below:

<!ELEMENT  student  (name, regdno, branch, section)>

 

Multiple occurrences of the child elements can be specified using the child element specification modifiers which are given below:

 

  Modifier   Meaning
  +          One or more occurrences
  *          Zero or more occurrences
  ?          Zero or one occurrence

 

For example, consider the modified declaration of the above student element:

<!ELEMENT  student  (name, regdno, branch, section?, email*)>

 

In the above declaration, the section element can occur zero or one time and the email element can occur zero or more times.
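
A document that follows this declaration might look like the sketch below. The email addresses are placeholders, and the (#PCDATA) declarations for the child elements are added only so that the example is complete.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE student [
  <!ELEMENT student (name, regdno, branch, section?, email*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT regdno (#PCDATA)>
  <!ELEMENT branch (#PCDATA)>
  <!ELEMENT section (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<student>
  <name>K.Ramesh</name>
  <regdno>12PA1A0501</regdno>
  <branch>CSE</branch>
  <!-- section is left out, which the ? modifier allows -->
  <!-- email appears twice, which the * modifier allows -->
  <email>ramesh@example.com</email>
  <email>k.ramesh@example.org</email>
</student>
```

The child elements must still appear in the order given in the declaration, because they are separated by commas.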

 

The leaf nodes of a DTD specify the data types of the content of their parent nodes, which are elements. Generally the content of a leaf node is PCDATA, which stands for parsed character data. It is a string of any printable characters except < and &, which have to be written as &lt; and &amp; in the document.

 

Two other content types that can be specified are EMPTY and ANY. The EMPTY type specifies that the element has no content, and the ANY type specifies that the element may contain any content, that is, text and any elements declared in the DTD.

 

For example, the leaf element declaration is as shown below:

<!ELEMENT  element_name  (#PCDATA)>
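
The three content types can be seen together in the small sketch below. The element names remarks, remark and separator are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE remarks [
  <!-- ANY: text and any declared elements are allowed -->
  <!ELEMENT remarks ANY>
  <!-- PCDATA: text only -->
  <!ELEMENT remark (#PCDATA)>
  <!-- EMPTY: no content at all -->
  <!ELEMENT separator EMPTY>
]>
<remarks>
  Free text is allowed here because remarks is declared as ANY.
  <remark>Fees paid &amp; receipt issued</remark>
  <separator/>
  <remark>Marks &lt; 40 in one subject</remark>
</remarks>
```

Note that the characters & and < inside the remark elements are written as &amp; and &lt;, since they cannot appear directly in parsed character data.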

 

Declaring Attributes

 

The attributes of an element are declared separately from the element declaration. The declaration of an attribute is as shown below:

<!ATTLIST  element_name  attribute_name  attribute_type  [default_value]>

 

If more than one attribute is declared for a given element, such declarations can be combined as shown below:

<!ATTLIST  element_name
      attribute_name_1  attribute_type  default_value_1
      attribute_name_2  attribute_type  default_value_2
      ...
      attribute_name_n  attribute_type  default_value_n
>

 

There are ten different attribute types. Among them, the most frequently used type is CDATA, which specifies character data (any string of characters except < and &).

 

The default value of an attribute can be an actual value or a requirement for the value of the attribute in the XML document. The possible default values for an attribute are given below:

 

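  Default value    Meaning
  A value          The quoted value is used when the attribute is not specified in an element
  #FIXED value     The attribute always has the quoted value, which cannot be changed in the document
  #REQUIRED        The attribute must be specified in every element; no default value is given
  #IMPLIED         The attribute is optional; no default value is given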
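
The sketch below combines several attribute declarations for one element, using each kind of default value once. The attribute names email, country and college and all of the values are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE students [
  <!ELEMENT students (student+)>
  <!ELEMENT student (#PCDATA)>
  <!ATTLIST student
        id       CDATA  #REQUIRED
        email    CDATA  #IMPLIED
        country  CDATA  "India"
        college  CDATA  #FIXED "Example College"
  >
]>
<students>
  <!-- Only the required id is given; country defaults to "India" -->
  <student id="1">K.Ramesh</student>
  <!-- The optional email is given and the default country is overridden -->
  <student id="2" email="suresh@example.com" country="Nepal">P.Suresh</student>
</students>
```

Leaving out id, or giving college any value other than the fixed one, would make the document invalid.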

 

Declaring Entities

 

Entities can be defined so that they can be referenced anywhere in the content of an XML document, in which case they are called general entities. All predefined entities are general entities. Entities can also be defined so that they can be referenced only in DTDs. Such entities are called parameter entities.

 

An entity declaration is as shown below:

<!ENTITY  [%]  entity_name  "entity_value">

 

The optional percentage sign (%) when present in the entity declaration denotes a parameter entity rather than a general entity.
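
As a rough illustration of a parameter entity, the sketch below shows the contents of an external DTD file (the file name student.dtd and the entity name text are assumptions). The entity is referenced with % instead of & and saves repeating the same content specification. The example is written as an external DTD because, inside an internal DTD, a parameter entity reference may not be used within another declaration.

```xml
<!-- student.dtd (hypothetical external DTD file) -->

<!-- Parameter entity: note the % and the space after it -->
<!ENTITY % text "(#PCDATA)">

<!ELEMENT student (name, branch, section)>

<!-- Each reference %text; is replaced by (#PCDATA) -->
<!ELEMENT name %text;>
<!ELEMENT branch %text;>
<!ELEMENT section %text;>
```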

 

When a document includes a large number of references to the phrase HyperText Markup Language, the phrase can be defined as an entity as shown below:

<!ENTITY  html  "HyperText Markup Language">

 

Any XML document that includes the DTD containing the above declaration can specify the complete name with just the reference &html;
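
For instance, a small document that declares and uses this entity could look like the sketch below. The article element is made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article [
  <!ELEMENT article (#PCDATA)>
  <!-- General entity: can be referenced in the document content -->
  <!ENTITY html "HyperText Markup Language">
]>
<article>&html; is used to create web pages. &html; documents are displayed by browsers.</article>
```

When the document is parsed, each occurrence of &html; is replaced by the text HyperText Markup Language.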

 

When the text of an entity is longer than a few words, it is usually kept in a separate file outside the DTD. Such an entity is called an external text entity. The declaration of an external entity is shown below:

<!ENTITY  entity_name  SYSTEM  "file_location">

 

The keyword SYSTEM specifies that the definition of the entity is in a different file, which is specified as the string following SYSTEM.
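
A minimal sketch of an external text entity is given below. The file name about.txt and the element name article are assumptions; the file is expected to sit in the same folder as the document and to contain only text.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article [
  <!ELEMENT article (#PCDATA)>
  <!-- about.txt is a hypothetical text file next to this document -->
  <!ENTITY about SYSTEM "about.txt">
]>
<article>&about;</article>
```

Whether the contents of the file are actually pulled in depends on the parser, since some parsers are configured not to load external entities for security reasons.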

 

A Sample DTD

 

A Document Type Definition stored in its own file is usually saved with the extension .dtd, and a normal XML file is saved with the extension .xml.

 

Below is an example DTD which contains the specification for storing the details of students:

<?xml version="1.0" encoding="utf-8" ?>
<!-- students.dtd - DTD file -->
<!ELEMENT students (student+)>
<!ELEMENT student (name, branch, section, regdno)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT branch (#PCDATA)>
<!ELEMENT section (#PCDATA)>
<!ELEMENT regdno (#PCDATA)>

<!ATTLIST student id CDATA #REQUIRED>

 

The XML code that conforms to the above DTD is given below:

<?xml version="1.0" encoding="utf-8"?>
<!-- student.xml - XML file -->
<students>
   <student id="1">
      <name>K.Ramesh</name>
      <branch>CSE</branch>
      <section>A</section>
      <regdno>12PA1A0501</regdno>
   </student>
</students>

 

Internal and External DTDs

 

A DTD can be placed within the XML file or in a separate file. If the DTD is placed within the XML document, it is called an internal DTD. An internal DTD is specified in a DOCTYPE declaration, which is placed after the XML declaration and before the root element, as shown below:

<!DOCTYPE  root-element  [  —DTD text— ]>

 

Below is an example of an internal DTD:

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE students[
<!ELEMENT students (student+)>
<!ELEMENT student (name, branch, section, regdno)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT branch (#PCDATA)>
<!ELEMENT section (#PCDATA)>
<!ELEMENT regdno (#PCDATA)>

<!ATTLIST student id CDATA #REQUIRED>
]>

<students>
   <student id="1">
      <name>K.Ramesh</name>
      <branch>CSE</branch>
      <section>A</section>
      <regdno>12PA1A0501</regdno>
   </student>
</students>

 

If the DTD is written separately in another file, it is called an external DTD. An external DTD is linked with an XML document as shown below:

<!DOCTYPE  root-element  SYSTEM  "filename.dtd">

 

Below is an example of an external DTD:

<?xml version="1.0" encoding="utf-8" ?>
<!-- students.dtd - DTD file -->
<!ELEMENT students (student+)>
<!ELEMENT student (name, branch, section, regdno)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT branch (#PCDATA)>
<!ELEMENT section (#PCDATA)>
<!ELEMENT regdno (#PCDATA)>
<!ATTLIST student id CDATA #REQUIRED>

 

<?xml version="1.0" encoding="utf-8"?>
<!-- student.xml - XML file -->

<!DOCTYPE  students  SYSTEM  "students.dtd">
<students>
   <student id="1">
      <name>K.Ramesh</name>
      <branch>CSE</branch>
      <section>A</section>
      <regdno>12PA1A0501</regdno>
   </student>
</students>

 

An XML document that is linked to a DTD (internal or external) and conforms to all of its declarations, as checked by a validating XML parser, is known as a valid XML document.
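
To make the difference between well-formed and valid clearer, the sketch below shows a document that is well-formed XML but is not valid against students.dtd. The comments mark the places where it breaks the declarations.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE students SYSTEM "students.dtd">
<students>
   <!-- Problem 1: the required id attribute is missing -->
   <student>
      <name>K.Ramesh</name>
      <!-- Problem 2: section appears before branch -->
      <section>A</section>
      <branch>CSE</branch>
      <!-- Problem 3: the regdno element is missing -->
   </student>
</students>
```

A validating parser would be expected to report errors for this document, while a non-validating parser would typically accept it because every tag is properly nested and closed.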


Suryateja Pericherla

Suryateja Pericherla is at present a Research Scholar (full-time Ph.D.) in the Dept. of Computer Science & Systems Engineering at Andhra University, Visakhapatnam. He previously worked as an Associate Professor in the Dept. of CSE at Vishnu Institute of Technology, India.

He has 11+ years of teaching experience and is an individual researcher whose research interests are Cloud Computing, Internet of Things, Computer Security, Network Security and Blockchain.

He is a member of professional societies like IEEE, ACM, CSI and ISCA. He published several research papers which are indexed by SCIE, WoS, Scopus, Springer and others.
