Processors read images from sources, decode them, transform them according to request arguments, and encode and write derivative images back to the client. Processors can be selected in different ways per-request.
Different processors use different underlying codecs and image processing engines, which may have different quality, compatibility, dependency, performance, and licensing characteristics. The ability to choose among different processors is intended to make it straightforward to add support for new image formats; improve support for existing image formats via the substitution of better codecs; and decouple the image server implementation from any one codec.
Different processors support different source formats. A table of supported formats is displayed in the Control Panel, as well as in the Supported Source Formats table below. A list of output formats supported for a given source format is contained within the response to an information request (such as /iiif/3/:identifier/info.json).
The processor.selection_strategy configuration key controls how a processor is selected on a per-request basis. The available strategies are:
AutomaticSelectionStrategy
ManualSelectionStrategy: Processors are assigned using the processor.ManualSelectionStrategy.{format} and processor.ManualSelectionStrategy.fallback keys in the application configuration. This strategy offers more control, but requires knowing which processors support which source formats, and may require testing different processors to find the one that best meets a given use case.
Processors ultimately read images from sources, of which there are two main types: those that can supply files (FileSources), and those that can supply streams (StreamSources). Correspondingly, there are two types of processors: those that can read from files (FileProcessors), and those that can read from streams (StreamProcessors). These distinctions are important because they influence how data flows through the processing pipeline, which in turn influences performance.
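Returning to the manual selection strategy described above: it amounts to a per-format lookup with a fallback. The following Python sketch illustrates the idea only. The dictionary stands in for the application configuration, the format suffixes (jp2, pdf) and the processor assignments are arbitrary examples, and the function name select_processor is hypothetical rather than part of the server.

```python
# Hypothetical stand-in for the application configuration. Only the key
# patterns come from the documentation; the format suffixes and assigned
# processors are illustrative assumptions.
CONFIG = {
    "processor.selection_strategy": "ManualSelectionStrategy",
    "processor.ManualSelectionStrategy.jp2": "OpenJpegProcessor",
    "processor.ManualSelectionStrategy.pdf": "PdfBoxProcessor",
    "processor.ManualSelectionStrategy.fallback": "Java2dProcessor",
}


def select_processor(config, source_format):
    """Return the processor assigned to a format, or else the fallback."""
    prefix = "processor.ManualSelectionStrategy."
    assigned = config.get(prefix + source_format)
    if assigned:
        return assigned
    # No format-specific assignment, so use the fallback key.
    return config[prefix + "fallback"]
```

With this configuration, select_processor(CONFIG, "jp2") yields the processor assigned to that format, while a format without its own key falls through to the fallback.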
The stream retrieval strategy (processor.stream_retrieval_strategy) controls how content is fed to stream-based processors from stream-based sources. The available strategies are:
StreamStrategy
DownloadStrategy
CacheStrategy: Preferable to DownloadStrategy if you can spare the disk space.
The fallback retrieval strategy (processor.fallback_retrieval_strategy) controls how an incompatible StreamSource/FileProcessor combination is dealt with. The available strategies are:
DownloadStrategy
CacheStrategy: Preferable to DownloadStrategy if you can spare the disk space.
AbortStrategy
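How the two retrieval-strategy keys interact can be sketched as a small decision function. This is an illustration in Python under stated assumptions, not the server's implementation: the function and parameter names are hypothetical, file-based sources are assumed to be read directly, and AbortStrategy is assumed to make the request fail.

```python
def resolve_retrieval(source_supplies_files, processor_reads_streams,
                      stream_strategy, fallback_strategy):
    """Pick how source content reaches the processor (illustrative only)."""
    if source_supplies_files:
        # Assumption: a file-based source is read directly. The feature
        # table lists file reading for every processor.
        return "read file directly"
    if processor_reads_streams:
        # StreamSource + StreamProcessor:
        # governed by processor.stream_retrieval_strategy.
        return stream_strategy
    # StreamSource + file-only processor:
    # governed by processor.fallback_retrieval_strategy.
    if fallback_strategy == "AbortStrategy":
        # Assumption: aborting means the request cannot be served.
        raise RuntimeError("incompatible source/processor combination")
    return fallback_strategy
```

For example, a stream-based source paired with a file-only processor and a fallback of DownloadStrategy resolves to DownloadStrategy, whereas the same pairing with AbortStrategy raises an error.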
Feature | Java 2D | JAI | Kakadu Native | OpenJPEG | Grok | FFmpeg | PDFBox | TurboJPEG |
---|---|---|---|---|---|---|---|---|
Reading from files | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Reading from sequential streams | ✓ | ✓ | ✓ | × | × | × | ✓ | ✓ |
Reading from seekable streams | ✓ | ✓ | ✓ | × | × | × | × | × |
Redaction | ✓ | × | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Tiled reading | ✓ | ✓ | ✓ | ✓ | ✓ | N/A | N/A | N/A |
Multiresolution reading | ✓ | ✓ | ✓ | ✓ | ✓ | N/A | N/A | N/A |
Metadata | ✓ | ✓ | ✓ | ✓ | ✓ | × | ✓ | ✓ |
ICC profiles | ✓* | ✓* | ✓** | ✓** | ✓** | N/A | × | ✓** |
Selectable resample filters | ✓ | × | ×*** | ✓ | ✓ | ✓ | ✓ | ✓ |
✓* Copied into derivative images.
✓** Derivative image pixel data is modified according to the ICC profile.
×*** No, because the scaling quality is already near-optimal.
Source format | Java 2D | JAI | Kakadu Native | OpenJPEG | Grok | PDFBox | TurboJPEG |
---|---|---|---|---|---|---|---|
BMP | ✓ | ✓ | × | × | × | × | × |
GIF | ✓ | × | × | × | × | × | × |
JPEG | ✓ | CMYK/YCCK not supported | × | × | × | × | ✓ |
JPEG2000 | × | × | ✓ | ✓ | ✓ | × | × |
PDF | × | × | × | × | × | ✓ | × |
PNG | ✓ | ✓ | × | × | × | × | × |
TIFF | ✓ | ✓ | × | × | × | × | × |
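
When choosing a processor manually, the table above can be treated as a lookup from source format to candidate processors. The Python sketch below transcribes it into a dictionary. The names SUPPORTED and processors_for are hypothetical, and the row whose only supporting processor is PDFBox is assumed to be the PDF row.

```python
# Transcribed from the supported source formats table. Each format maps to
# the processors marked as supporting it.
SUPPORTED = {
    "BMP": ["Java 2D", "JAI"],
    "GIF": ["Java 2D"],
    "JPEG": ["Java 2D", "JAI", "TurboJPEG"],  # JAI: CMYK/YCCK not supported
    "JPEG2000": ["Kakadu Native", "OpenJPEG", "Grok"],
    "PDF": ["PDFBox"],  # assumed label for the PDFBox-only row
    "PNG": ["Java 2D", "JAI"],
    "TIFF": ["Java 2D", "JAI"],
}


def processors_for(source_format):
    """Return the candidate processors for a format (case-insensitive)."""
    return SUPPORTED.get(source_format.upper(), [])
```

A format that does not appear in the table returns an empty list, which signals that none of the listed processors is known to read it.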
Java2dProcessor uses the Java Image I/O and Java 2D libraries to read and process images in a native-Java way. It is a good all-around processor with no dependencies.
This processor has been written to exploit the Image I/O readers as efficiently as possible. Special attention has been paid to its handling of tiled images, such as tile-encoded TIFFs, for which it reads only the necessary tiles for a given request. It is also able to read the sub-images contained within pyramidal TIFF images.
By default, this processor uses the format-specific Image I/O plugins bundled with the JDK. Other plugins can be used instead by setting the processor.imageio.{format}.reader and/or processor.imageio.{format}.writer configuration keys to the fully-qualified class name of a plugin reader and writer, respectively.
Recognized plugins are logged at startup.
This processor can read from files and streams. Seekable streams will be more efficient than sequential streams when working with pyramidal TIFF source images.
This processor relies on the JDK's Image I/O JPEG plugin. This plugin implements the JFIF standard strictly and is known to fail to read certain images with error messages such as "Inconsistent metadata read from stream." If this turns out to be a problem, try updating to the latest JRE version to see whether that fixes it. As a last resort, consider using TurboJpegProcessor instead.
Java Advanced Imaging (JAI) is a sophisticated image processing library developed by Sun Microsystems until the mid-2000s. It offers several advantages over Java 2D that make it ideal for an image server: a pull-based rendering pipeline that can reduce memory usage, and efficient region-of-interest decoding with some formats.
By default, this processor uses the format-specific Image I/O plugins bundled with the JDK. Other plugins can be used instead by setting the processor.imageio.{format}.reader and/or processor.imageio.{format}.writer configuration keys to the fully-qualified class name of a plugin reader and writer, respectively.
Recognized plugins are logged at startup.
This processor can read from files and streams. Seekable streams are more efficient than sequential streams when working with pyramidal TIFF source images.
Development on JAI ended a long time ago. Given that supporting JAI is likely to become more problematic as time goes on, this processor should be considered deprecated, and it may be removed in a future release.
When using this processor, it is normal to see the following log message:
Error: Could not find mediaLib accelerator wrapper classes. Continuing in pure Java mode.
This is harmless and expected when there is no mediaLib JAR on the classpath. Add the -Dcom.sun.media.jai.disableMediaLib=true VM option to suppress it.
TurboJpegProcessor uses the Java binding of the high-level TurboJPEG API on top of the libjpeg-turbo library to read and write JPEG images. It uses the same image processing engine as Java2dProcessor.
The design of the TurboJPEG Java binding is somewhat unfortunate in that it requires JPEG image data to be buffered fully in memory before it can be read or written, which costs time and RAM. However, overall performance is still significantly better than Java2dProcessor thanks to the much faster coding performance of libjpeg-turbo compared to the JDK's Image I/O JPEG plugin.
libjpeg(-turbo) is also more lenient than Java2dProcessor when reading malformed JPEGs, JPEGs with mismatching color profiles, and other quirky files.
This processor requires libjpeg-turbo 2.0.2 to be installed. Other 2.0.x versions may work, but are untested. libjpeg-turbo must be compiled with Java support (which it often isn't when installed via package managers). As of version 2.0.2, this just involves adding the -DWITH_JAVA=1 argument to the cmake command.
Kakadu is widely considered one of the fastest CPU-based JPEG2000 codecs. Compared to the KakaduDemoProcessor from previous versions, this processor calls directly into the Kakadu library to decode JPEG2000 source images, and because of that:
libjpeg-turbo, if available, is used for writing JPEGs. Otherwise, and for other formats, Image I/O is used. See the TurboJpegProcessor section for information on installing libjpeg-turbo. The effective writer is logged during processing:
DEBUG e.i.l.c.p.WriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
This processor can read from files and streams. Seekable streams are more efficient than sequential streams.
This processor must be able to locate the Kakadu JNI binding and shared library. The extracted release archive contains a folder named deps, which contains compiled binaries for several platforms. Copy the files from the platform-specific lib folder into one of the locations on the Java library path, which are logged at application startup in a message that looks like:
INFO e.i.l.c.ApplicationContextListener - Java library path: .....
For Windows, you may also need to install Microsoft Visual C++ Redistributable if it isn't already installed.
This processor was developed using a Kakadu Public Service License and may not be used commercially. See the Kakadu Software License Terms and Conditions for detailed terms.
OpenJpegProcessor uses the opj_decompress tool from the open-source OpenJPEG project to decode JPEG2000 source images. All other operations are performed using Java 2D, and basic image characteristics are acquired using custom code.
libjpeg-turbo, if available, is used for writing JPEGs. Otherwise, and for other formats, Image I/O is used. See the TurboJpegProcessor section for information on installing libjpeg-turbo. The effective writer is logged during processing:
DEBUG e.i.l.c.p.WriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
To use this processor, OpenJPEG must be installed. The OpenJPEG binaries will automatically be detected if they are on the path; otherwise, set the OpenJpegProcessor.path_to_binaries configuration key to the absolute path of the containing directory. The LD_LIBRARY_PATH environment variable will also need to be set to locate the OpenJPEG shared library.
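Before starting the server, it can be useful to check whether the decoder binary is discoverable. The Python sketch below mimics the lookup order described above (an explicitly configured directory first, otherwise the path) using the standard library's shutil.which. The function name find_binary and its parameters are hypothetical, and this is not how the server itself performs detection.

```python
import shutil


def find_binary(name, configured_dir=None):
    """Return the full path to an executable, or None if it is not found."""
    if configured_dir:
        # Look only in the explicitly configured directory, analogous to
        # setting a path_to_binaries configuration key.
        return shutil.which(name, path=configured_dir)
    # Otherwise search the directories listed in the PATH variable.
    return shutil.which(name)
```

Calling find_binary("opj_decompress") returns the tool's location when it is on the path, and None otherwise.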
This processor can only read from files. For use with sources other than FilesystemSource, one of the fallback retrieval strategies must be used.
GrokProcessor uses the grk_decompress tool from the Grok library to decode JPEG2000 source images. All other operations are performed using Java 2D, and basic image characteristics are acquired using custom code.
libjpeg-turbo, if available, is used for writing JPEGs. Otherwise, and for other formats, Image I/O is used. See the TurboJpegProcessor section for information on installing libjpeg-turbo. The effective writer is logged during processing:
DEBUG e.i.l.c.p.WriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
To use this processor, Grok must be installed. The Grok binaries will automatically be detected if they are on the path; otherwise, set the GrokProcessor.path_to_binaries configuration key to the absolute path of the containing directory. The LD_LIBRARY_PATH environment variable will also need to be set to locate the Grok shared library.
This processor can only read from files. For use with sources other than FilesystemSource, one of the fallback retrieval strategies must be used.
FfmpegProcessor uses the FFmpeg tool to extract still frames from video files.
It has been tested with FFmpeg version 2.8. Other versions may or may not work.
FFmpeg is used only for frame extraction. All subsequent steps are handled by Java 2D.
libjpeg-turbo, if available, is used for writing JPEGs. Otherwise, and for other formats, Image I/O is used. See the TurboJpegProcessor section for information on installing libjpeg-turbo. The effective writer is logged during processing:
DEBUG e.i.l.c.p.WriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
This processor can only read from files. For use with sources other than FilesystemSource, one of the fallback retrieval strategies must be used, which may be painful due to the large size of video files.
This processor supports a time offset in seconds, encoded in the meta-identifier, for requesting the frame at a particular second. (The offset is treated as a page number.) For example, to request the frame closest to the 30-second offset using the StandardMetaIdentifierTransformer with default configuration:
http://example.org/iiif/3/video.mp4;30/full/max/0/default.jpg
When a second offset is not present, the first frame is returned.
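Client code that requests frames can assemble such URLs programmatically. The Python sketch below follows the pattern of the example URL above. The function name frame_url and its parameters are hypothetical, the ";" delimiter is taken from the example rather than from a specification, and the identifier is percent-encoded on the assumption that it may contain reserved characters such as slashes.

```python
from urllib.parse import quote


def frame_url(base, identifier, second=None):
    """Build a full-size, unrotated JPEG request URL for a video frame."""
    # Percent-encode the identifier so characters like "/" stay in one
    # path segment.
    ident = quote(identifier, safe="")
    if second is not None:
        # Append the offset in seconds, as in the example URL.
        ident = f"{ident};{second}"
    return f"{base}/{ident}/full/max/0/default.jpg"
```

Omitting the second argument produces a URL without an offset, which corresponds to requesting the first frame.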
Previous versions supported a time query argument. This still works in version 5.0, but is deprecated.
PdfBoxProcessor uses the Apache PDFBox library to read and rasterize PDF files. This is a pure-Java library that is bundled in and has no dependencies.
As PDF is a vector format, PdfBoxProcessor will convert to a raster (pixel) image and use a Java 2D pipeline to transform it according to the request arguments. The size of the base raster image, corresponding to a scale of 1, is configurable with the processor.dpi configuration option. When a scale of ≤ 50% or ≥ 200% is requested, a fraction or multiple of this is used, respectively, in order to improve efficiency at small scales, and detail at large scales.
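The text does not say which fractions or multiples are used. Purely as an illustration, the Python sketch below assumes power-of-two steps, which is one scheme consistent with the 50% and 200% thresholds. The function name effective_dpi and the base value of 150 in the usage note are hypothetical, and the actual processor may behave differently.

```python
def effective_dpi(base_dpi, scale):
    """Return an assumed rasterization DPI for a requested scale.

    Assumption: the base DPI is halved for each full halving of the scale
    and doubled for each full doubling.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    factor = 1.0
    # Step down while the requested scale is at most half the current factor.
    while scale <= factor / 2:
        factor /= 2
    # Step up while the requested scale is at least double the current factor.
    while scale >= factor * 2:
        factor *= 2
    return base_dpi * factor
```

Under these assumptions and a base of 150, scales between 50% and 200% (exclusive) rasterize at 150, a scale of 0.5 rasterizes at 75, and a scale of 2 rasterizes at 300.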
libjpeg-turbo, if available, is used for writing JPEGs. Otherwise, and for other formats, Image I/O is used. See the TurboJpegProcessor section for information on installing libjpeg-turbo. The effective writer is logged during processing:
DEBUG e.i.l.c.p.WriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
This processor can read similarly well from all sources.
This processor supports a meta-identifier-encoded page number for requesting a particular page of a PDF. For example, to request page 2 using the StandardMetaIdentifierTransformer with default configuration:
http://example.org/iiif/3/document.pdf;2/full/max/0/default.jpg
When a page number is not present, the first page is returned.
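To make the meta-identifier convention concrete, the Python sketch below splits a value such as document.pdf;2 into its identifier and page number. It handles only the simple form shown in the example above. The function name split_meta_identifier is hypothetical, and the real StandardMetaIdentifierTransformer may accept additional components or a differently configured delimiter.

```python
def split_meta_identifier(meta_identifier):
    """Split "identifier;page" into (identifier, page).

    The page is None when no numeric suffix is present, in which case the
    first page would be returned by the server.
    """
    identifier, sep, suffix = meta_identifier.rpartition(";")
    if sep and suffix.isascii() and suffix.isdigit():
        return identifier, int(suffix)
    # No delimiter, or the suffix is not a plain number: treat the whole
    # string as the identifier.
    return meta_identifier, None
```

A value without a numeric suffix is returned unchanged with a page of None.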
Previous versions supported a page query argument. This still works in version 5.0, but is deprecated.