WARC-kb Module

1 Introduction

The WARC-kb module recognizes and validates the WARC (Web ARChive) format [WARC]. It validates only the WARC file format and the WARC headers, not the payload of the WARC records. The module uses the JWAT library for WARC parsing, including the parsing of gzip-compressed WARC files (.warc.gz).

The module is invoked with the following command-line option:

    jhove ... -m WARC-kb ...

The WARC-kb module recognizes ISO 28500:2009.

This module has no configurable parameters.

2 Coverage

The WARC-kb module recognizes and validates the following profiles:

3 Well-Formedness

The WARC-kb module doesn't check well-formedness.

4 Validity

The WARC-kb module validates only the WARC file format and the WARC headers. It doesn't check the payload of the WARC records.
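To illustrate what header-only validation means, the following Python sketch checks the first record of a WARC stream: it reads the version line, collects the named header fields, verifies that the four fields ISO 28500 makes mandatory for every record are present, and then skips the payload without reading it. This is not the JWAT implementation. The helper name check_warc_header and the wording of the problem messages are assumptions made for illustration, and the sketch simplifies the format (field names are matched case-sensitively and folded continuation lines are not handled).

```python
import io

# Named fields that ISO 28500 requires in every WARC record.
MANDATORY_FIELDS = ("WARC-Type", "WARC-Record-ID", "WARC-Date", "Content-Length")


def check_warc_header(stream):
    """Return a list of problems found in the first record header of `stream`.

    Hypothetical helper for illustration only; not part of JHOVE or JWAT.
    `stream` must be a seekable binary stream. The payload is skipped,
    never inspected.
    """
    problems = []

    # A record starts with a version line such as "WARC/1.0".
    version = stream.readline().rstrip(b"\r\n")
    if not version.startswith(b"WARC/"):
        return ["missing WARC version line"]

    # Named fields follow, one per line, until the first blank line.
    fields = {}
    for raw in iter(stream.readline, b""):
        line = raw.rstrip(b"\r\n")
        if not line:
            break
        name, sep, value = line.partition(b":")
        if not sep:
            problems.append("malformed header line: %r" % line)
            continue
        key = name.decode("utf-8", errors="replace").strip()
        fields[key] = value.decode("utf-8", errors="replace").strip()

    for name in MANDATORY_FIELDS:
        if name not in fields:
            problems.append("missing mandatory field: " + name)

    # Content-Length gives the payload size; use it only to skip ahead.
    length = fields.get("Content-Length")
    if length is not None:
        if length.isascii() and length.isdigit():
            stream.seek(int(length), io.SEEK_CUR)
        else:
            problems.append("Content-Length is not a non-negative integer")
    return problems


# Example: a record whose header lacks WARC-Record-ID.
record = (
    b"WARC/1.0\r\n"
    b"WARC-Type: warcinfo\r\n"
    b"WARC-Date: 2014-01-01T00:00:00Z\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello\r\n\r\n"
)
print(check_warc_header(io.BytesIO(record)))
# prints ['missing mandatory field: WARC-Record-ID']
```

The payload bytes ("hello") never influence the result, which mirrors the scope described above: only the file format and the headers are examined.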

5 Representation Information

The MIME type is reported as: application/warc [application/warc, application/warc-fields].

In addition to the standard JHOVE representation information, the following WARC-specific properties are reported:

6 Additional Module Properties