EPUB-ptc Module

1 Introduction

The EPUB-ptc module recognizes and validates the EPUB format.

The module is invoked with the following command-line option:

    jhove ... -m EPUB-ptc ...
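For scripted use, the invocation can be wrapped in a few lines of Python. This is a minimal sketch, not part of JHOVE: the function names are hypothetical, it assumes a "jhove" launcher is on the PATH, and it passes only the "-m EPUB-ptc" option followed by one file.

```python
import shutil
import subprocess


def build_jhove_command(epub_path, executable="jhove"):
    """Return the argument list that selects the EPUB-ptc module for one file."""
    # -m selects the module by name; the file to examine follows the options.
    return [executable, "-m", "EPUB-ptc", epub_path]


def run_jhove(epub_path):
    """Run JHOVE if a launcher is on the PATH and return its report, else None."""
    launcher = shutil.which("jhove")  # assumption: the launcher is named "jhove"
    if launcher is None:
        return None
    result = subprocess.run(build_jhove_command(epub_path, launcher),
                            capture_output=True, text=True, check=False)
    return result.stdout
```

Keeping the command construction separate from the call makes it easy to log or adjust the arguments before running them.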

The EPUB-ptc JHOVE module is a wrapper around the official EPUBCheck tool. Visit the EPUB Specifications and Projects page for more information on the EPUB format.

2 Coverage

The EPUB-ptc module recognizes and validates the following public profiles:

Important note: The internal EPUB metadata reveals only its major version, which is currently either EPUB2 or EPUB3. This is taken from the "version" attribute of the "package" element in the OPF XML. The EPUBCheck tool validates the file against the specification of the most recent point release associated with that major version, and it is this number that appears as the "version" in the JHOVE report. In other words, all EPUB2 files validate as 2.0.1 and all EPUB3 files as 3.2 until a new EPUB version is released along with an updated EPUBCheck tool. Additional metadata provided as properties in the report (detailed below) indicates the presence of features that may depend on the e-book software's support for them.
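The mapping from declared major version to reported version can be sketched as follows. This is an illustration only, not the module's code: the function name and lookup table are hypothetical, and the two point releases are the ones named in the note above.

```python
import xml.etree.ElementTree as ET

# Point releases named in the note above; they change when a new EPUBCheck ships.
REPORTED_VERSION = {"2": "2.0.1", "3": "3.2"}


def reported_version(opf_xml):
    """Map the OPF package "version" attribute to the version shown in the report."""
    package = ET.fromstring(opf_xml)        # the root element of the OPF is <package>
    declared = package.get("version", "")   # e.g. "2.0" or "3.0"
    major = declared.split(".")[0]
    return REPORTED_VERSION.get(major)      # None when the major version is unknown


opf = '<package xmlns="http://www.idpf.org/2007/opf" version="3.0"/>'
print(reported_version(opf))  # prints 3.2
```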

3 Well-Formedness

EPUBs are a form of ZIP archive file. Well-formed status is based on evaluation of both the archive file as a whole and the presence of specific files within it. The JHOVE report outputs the messages as provided by EPUBCheck; a full list of these can be seen in the EPUBCheck code base. In addition to this list, an undefined failure within EPUBCheck is reported as a FATAL error; this happens, for example, if a non-EPUB file is passed into the module. A message will cause a status of "Not Well Formed" if either:

  1. it has a severity level of FATAL
  2. it is a package-related message (starts with "PKG-") with a severity level of ERROR.
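To make the idea of archive-level checks concrete, here is a minimal sketch of three container checks. It is not EPUBCheck's implementation and covers far less: it assumes only the general EPUB container rules (a "mimetype" entry first in the archive containing "application/epub+zip", and a "META-INF/container.xml" entry). The function name and problem strings are hypothetical.

```python
import io
import zipfile


def container_problems(data):
    """Return a list of basic container problems found in EPUB bytes."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return ["not a ZIP archive"]
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        # The container format expects "mimetype" as the first entry.
        if not names or names[0] != "mimetype":
            problems.append("mimetype is not the first entry")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("mimetype has unexpected content")
        # The container file points at the OPF package document.
        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
    return problems
```

An empty list means only that these three checks passed; the full set of conditions is the one defined by the EPUBCheck messages.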

The following are some of the key criteria that must be met for an EPUB object to be considered Well Formed in the JHOVE report. For the full list, please refer to the EPUBCheck messages link provided above:

4 Validity

As with the "Well Formed" status, the criteria that determine "Validity" are based on the messages output by the EPUBCheck module. If the EPUB has a status of "Well Formed" but contains one or more messages with a severity level of "ERROR", the EPUB is labelled as "Not Valid". As mentioned in the previous section, the exception is package ERRORs (PKG-*), which always result in a "Not Well Formed" assignment.
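The rules from this section and the previous one can be summarized in a short sketch. It is an illustration under stated assumptions, not the module's code: the function name is hypothetical, and messages are modelled as simple (message ID, severity) pairs.

```python
def jhove_status(messages):
    """Return (well_formed, valid) for a list of (message_id, severity) pairs."""
    def breaks_well_formedness(msg_id, severity):
        # FATAL always fails; ERROR fails well-formedness only for PKG- messages.
        return severity == "FATAL" or (
            severity == "ERROR" and msg_id.startswith("PKG-"))

    well_formed = not any(breaks_well_formedness(m, s) for m, s in messages)
    # Any remaining ERROR makes a well-formed EPUB "Not Valid".
    valid = well_formed and not any(s == "ERROR" for _, s in messages)
    return well_formed, valid
```

For example, a single non-package ERROR yields (True, False), while a single PKG- ERROR yields (False, False). Messages of lower severity, such as warnings, affect neither flag in this sketch.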

5 Representation Information

The MIME type is reported as: application/epub+zip

In addition to the standard JHOVE representation information, the following EPUB-specific properties are reported:

5.1 Profiles

6 Additional Module Properties

7 Troubleshooting

The EPUB JHOVE module uses the EPUBCheck tool. This means it inherits a problem in which the thread stack size is too small to process the EPUB in certain situations. This will likely manifest as a "StackOverflowError" in the console. Using a 32-bit JVM instead of a 64-bit one can cause this error. The workaround is to increase the thread stack size by adding "-Xss1024k" as a parameter on the java command. To do this, open the jhove[.bat] file and manually modify the java command as follows:

    java -Xss1024k -classpath "%CP%" Jhove -c "%CONFIG%" %*

If using the JHOVE GUI, open the jhove-gui[.bat] file and modify the following line to fix this issue:

    java -Xss1024k -classpath "%CP%" JhoveView -c "%CONFIG%" %*
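If the launcher scripts need to be patched on several machines, the manual edit can be expressed as a small text transformation. This is a sketch only: the function name is hypothetical, and it assumes the unpatched line starts with "java " and otherwise matches the lines shown above without the "-Xss1024k" parameter.

```python
def add_stack_size(command_line, size="1024k"):
    """Insert -Xss<size> after the leading "java" token unless -Xss is already set."""
    head, _, rest = command_line.partition(" ")
    if head != "java" or not rest or "-Xss" in rest:
        return command_line  # not a java command line, or already patched
    return f"{head} -Xss{size} {rest}"
```

Applying the function twice leaves the line unchanged, so it is safe to run against a script that has already been edited by hand.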