Introduction

The course consists of five chapters. The first chapter is purely theoretical. The other four combine a theoretical and a practical dimension, each presented as a series of learning activities. The five chapters are:

Context and problem

This chapter reviews databases, their structure, and their applications. It then presents use cases in which relying on a database can degrade system performance and flexibility. Together, these two parts formalize the problem that semi-structured data attempts to address.

Documents and hyper-documents

This chapter introduces the concepts of Document and Hyperdocument. It also presents document modeling and document classes. In this context, a document is a physical organization of data (in free form, list, tree, or forest), while the document class defines the conceptual aspect of the document (report, letter, article, book, etc.). These documents are an essential element for data exchange on the web; they thus constitute an ideal use case for the application of semi-structured data.

XML Kernel

After introducing the challenges of semi-structured data, the concepts of Document and Hyperdocument, and the applicability of semi-structured data, this chapter presents XML as an essential technology for physically implementing semi-structured data.
The term "XML Kernel" refers to the XML language and all the technologies directly linked to it, which together constitute the core language and technologies. This chapter presents the XML language (the data aspect) and the DTD and XSD languages (the schema aspect).
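To make the data/schema distinction concrete, here is a minimal sketch in Python. The `letter` document, its element names, and the `is_well_formed` helper are hypothetical and were chosen only for illustration. The internal DTD plays the schema role and the elements below it carry the data. Note that Python's standard library only checks well-formedness; it does not validate a document against its DTD or an XSD, so a validating parser would be needed for that step.

```python
import xml.etree.ElementTree as ET

# Hypothetical example document: the DOCTYPE block is the schema aspect (DTD),
# the <letter> element and its children are the data aspect (XML).
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter (sender, recipient, body)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT recipient (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<letter>
  <sender>Alice</sender>
  <recipient>Bob</recipient>
  <body>Hello from a semi-structured document.</body>
</letter>"""


def is_well_formed(text):
    """Return True if the text parses as XML (well-formedness only, no DTD validation)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Parse the document and list the child elements in document order.
root = ET.fromstring(DOCUMENT)
child_tags = [child.tag for child in root]

print(root.tag, child_tags)
print(is_well_formed(DOCUMENT), is_well_formed("<a><b></a>"))
```

The second call to `is_well_formed` receives mismatched tags, which is the kind of error a non-validating parser can still detect.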

XML Galaxy

The goal of this chapter is to introduce students to the technologies that surround XML. It focuses particularly on the DOM and the XPath language. The DOM represents a document as a tree of nodes. This representation allows users to traverse the document to extract data, and it also makes modification easier, since nodes can be inserted at the desired positions. The XML ecosystem also offers SAX, an event-based method for traversing an XML document without building a DOM tree.
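The three approaches named above can be sketched side by side in Python. This is only an illustration under stated assumptions: the `library` document and the `TitleCollector` class are hypothetical, and `xml.etree.ElementTree` supports only a limited subset of XPath rather than the full language.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical sample document, not taken from the course material.
XML_TEXT = (
    "<library>"
    "<book id='b1'><title>XML Basics</title></book>"
    "<book id='b2'><title>XPath in Practice</title></book>"
    "</library>"
)

# DOM: the whole document is loaded as a tree of nodes that can be traversed.
dom = xml.dom.minidom.parseString(XML_TEXT)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]

# DOM modification: insert a new <book> node before the first existing child.
new_book = dom.createElement("book")
new_book.setAttribute("id", "b0")
root_node = dom.documentElement
root_node.insertBefore(new_book, root_node.firstChild)
dom_ids = [b.getAttribute("id") for b in dom.getElementsByTagName("book")]

# XPath (limited subset supported by ElementTree): select by attribute value.
tree_root = ET.fromstring(XML_TEXT)
xpath_titles = [t.text for t in tree_root.findall("./book[@id='b2']/title")]


# SAX: the parser emits events while reading; no tree is kept in memory.
class TitleCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleCollector()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)

print(dom_titles, dom_ids, xpath_titles, handler.titles)
```

The DOM and SAX parts extract the same titles; the difference is that the DOM keeps the full tree available for later modification, while SAX processes the document in a single pass.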

XML Databases

This chapter explores the possibility of using XML to build flexible databases that are easy to process and exchange. To achieve this, we revisit the three levels of abstraction (conceptual, logical, and physical) to define the transition from a conceptual data model (CDM) to the physical XML model via the hierarchical logical model.
The chapter also explores the XQuery language, which allows more sophisticated queries over XML databases than XPath alone.
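To give a sense of what such a query does, the sketch below emulates a small XQuery FLWOR expression in plain Python. Python's standard library has no XQuery engine, so this is an emulation and not XQuery execution; the `library` document, its prices, and the query itself are hypothetical. The query is shown as a comment, followed by Python code that filters, sorts, and constructs new elements in the same way.

```python
import xml.etree.ElementTree as ET

# Hypothetical XQuery expression being emulated (not executed here):
#
#   for $b in /library/book
#   where $b/price > 20
#   order by $b/title
#   return <item>{ $b/title/text() }</item>

# Hypothetical sample data.
XML_TEXT = """<library>
  <book><title>XQuery Notes</title><price>35</price></book>
  <book><title>DTD Primer</title><price>15</price></book>
  <book><title>DOM Handbook</title><price>28</price></book>
</library>"""

root = ET.fromstring(XML_TEXT)

# "for ... where": iterate over the books and keep those priced above 20.
selected = [b for b in root.findall("book") if float(b.findtext("price")) > 20]

# "order by": sort the selected books by title.
selected.sort(key=lambda b: b.findtext("title"))

# "return": construct a new <item> element for each selected book.
items = []
for book in selected:
    item = ET.Element("item")
    item.text = book.findtext("title")
    items.append(item)

result = [ET.tostring(item, encoding="unicode") for item in items]
print(result)
```

Filtering, ordering, and building new elements in one expression is what goes beyond a path-selection language such as XPath.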

Last modified: Monday, 23 February 2026, 10:59 PM