Manuscript Archival and Publication System

Architecture

Design Features

Manu is a presentation tool and contains as few back-end administrative interfaces as possible. Other Fedora applications handle back-end collection development and maintenance (see Elated or Fez for examples).

Manu requires no independent MySQL or other database. The goal is to leverage Fedora's design rather than develop an entire vertical application. Manu does have a few configuration files, but the overall goal is to avoid unnecessary complexity on the front end.

Manu contains as little code as possible. It seeks to be the starting point for other code that enables sophisticated search and presentation capabilities; starting with an elaborate design would make organic growth more difficult. Though written in Java, it is designed with the model of community-driven PHP apps in mind. Java was chosen over PHP because of the Lucene search engine, which does not exist for PHP; nevertheless, PHP programmers should find this app easy to follow.

Fedora Services

Manu defines a set of reusable "behaviors" or "views" (Fedora disseminators) for use with TEI-based manuscripts.

About Fedora "Disseminators" / Fedora Behavior Definitions

If you are unfamiliar with Fedora, this means that Manu includes code that is attached to TEI Manuscript objects in Fedora itself. This code is packaged separately from Manu and allows TEI Manuscript objects to do things a TEI manuscript might need to do (such as provide a Table of Contents for itself). Most importantly, these behaviors can be used within any other application, not just Manu.

This behavior is called a Fedora Behavior Definition, or BDEF.

TEI Manuscripts have the following methods.

| Fedora PID | Method | Requires | Description |
|---|---|---|---|
| manu:TOCBDEF | getTOC | | returns a Table-of-Contents for the Manuscript |
| manu:TOCBDEF | getTOCView | | returns a Table-of-Contents for the Manuscript, but also includes the TEI content for each section; for tasks like generating text-search indices for manuscripts, this can be a useful shortcut to requesting sections one at a time |
| manu:TOCBDEF | getSection | sectionNumber | returns the designated section of the TEI manuscript (sectionNumbers can be seen in the Table-of-Contents view) as well as a list of the IDs of all JPEG images |
| manu:BasicBDEF | getFullImage | dsid | returns a full-resolution image |
| manu:BasicBDEF | getThumbmail | dsid | returns a thumbnail image |
| manu:BasicBDEF | getTEI | | returns the TEI manuscript |
| manu:PageBDEF | getPage | pageNumber | returns a page from the TEI manuscript |
| manu:CoverImageDEF | getFrontImage | | returns the "front" image of a manuscript |
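
As a rough illustration of how another application might call one of the methods listed above, the sketch below builds a dissemination URL. It assumes a Fedora 2.x repository that exposes the API-A-LITE "get" URL form (base URL, then /get/, object PID, BDEF PID, and method name, with method parameters as a query string). The base URL http://localhost:8080/fedora and the object PID demo:ms1 are hypothetical placeholders, and the class name DisseminationUrl is invented for this example; none of them come from Manu itself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: assembles a Fedora 2.x API-A-LITE style dissemination URL.
// The URL layout is an assumption about the target repository version.
class DisseminationUrl {

    // baseUrl   e.g. "http://localhost:8080/fedora" (hypothetical)
    // objectPid PID of the TEI Manuscript object (hypothetical in main below)
    // bdefPid   one of the BDEF PIDs from the table, e.g. "manu:TOCBDEF"
    // method    method name from the table, e.g. "getSection"
    // params    values from the "Requires" column, e.g. sectionNumber
    static String build(String baseUrl, String objectPid, String bdefPid,
                        String method, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/get/").append(objectPid)
                .append('/').append(bdefPid)
                .append('/').append(method);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("sectionNumber", "3");
        // Prints the URL a client would request for section 3 of the object.
        System.out.println(build("http://localhost:8080/fedora", "demo:ms1",
                "manu:TOCBDEF", "getSection", params));
    }
}
```

Because the methods are addressed only by BDEF PID and method name, the same helper would work for any object that subscribes to these behaviors, which is the point of packaging them separately from Manu.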

Fedora Content Model

To support these services, a TEI Manuscript object in Fedora should have the following content items. (Note that the Fedora IDs of these items do not matter, as Manu uses the Fedora disseminators listed above to "get at" the content.)

| Fedora Datastream | Description |
|---|---|
| TEI manuscript transcript | There can be only one transcript. |
| Set of high-resolution JPEG images | There can be any number of these. |
| XML Table-of-Contents file | This is an XML file that defines the Table of Contents display for the manuscript. The Table of Contents also defines how the document is indexed for search purposes: each section of the document is indexed separately, so that searches can return more specific results about where search terms were found. Sections are defined in this file using XPath, and the accompanying JPEG images are referenced with each. The manuServices installation kit contains a DTD for this file in manuServices/misc/dtd/toc-mapping.dtd |
| XML Paging file | This file indicates which JPEG images go with which pages in the TEI transcript. Page breaks should be marked up with the TEI <pb/> tag for paging to work. The manuServices installation kit contains a DTD for this file in manuServices/misc/dtd/page-mapping.dtd |
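
To make the role of XPath and <pb/> in the content model more concrete, here is a minimal sketch that selects one section of a transcript with an XPath expression and counts the page breaks inside it. It does not read the toc-mapping or page-mapping files, whose formats are defined by the DTDs named above and are not reproduced here. The sample TEI fragment, the expression //div[@n='2'], and the class name TeiSectionSketch are all made up for illustration, and the sample assumes a transcript without an XML namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Sketch only: shows XPath-based section selection and <pb/> counting
// on an invented, namespace-free TEI fragment.
class TeiSectionSketch {

    // Hypothetical transcript: two sections, three page breaks.
    static final String SAMPLE =
          "<TEI.2><text><body>"
        + "<div n=\"1\"><pb n=\"1\"/><p>First section.</p></div>"
        + "<div n=\"2\"><pb n=\"2\"/><p>Second section,</p>"
        + "<pb n=\"3\"/><p>continued.</p></div>"
        + "</body></text></TEI.2>";

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse transcript", e);
        }
    }

    // Returns the first node matching the XPath expression, or null.
    static Node selectSection(Document tei, String xpathExpr) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpr, tei, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + xpathExpr, e);
        }
    }

    // Counts <pb/> elements at or below the given node.
    static int countPageBreaks(Node context) {
        try {
            NodeList breaks = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(".//pb", context, XPathConstants.NODESET);
            return breaks.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not count page breaks", e);
        }
    }

    public static void main(String[] args) {
        Document tei = parse(SAMPLE);
        Node section = selectSection(tei, "//div[@n='2']");
        System.out.println(section.getTextContent());
        System.out.println(countPageBreaks(section));
    }
}
```

A transcript with no <pb/> elements would yield a count of zero here, which is one way to see why paging depends on that markup being present.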