Cantaloupe offers a sophisticated and highly customizable caching subsystem that is capable of meeting a variety of needs while remaining easy to use. Three tiers of cache are available:
Cantaloupe can provide caching hints to clients using a
Cache-Control response header, which is configurable via the
cache.client.* keys in the configuration file. To enable this header, set the
cache.client.enabled key to
The default settings look something like this:
These are reasonable defaults that tell clients they can keep cached images for 30 days (2592000 seconds).
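The arithmetic behind that figure can be checked with a short sketch. The helper name `cache_control_value` and the `public` directive are illustrative assumptions rather than Cantaloupe settings; only the `max-age` directive, expressed in seconds, is taken from standard HTTP.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def cache_control_value(days: int, public: bool = True) -> str:
    """Build a Cache-Control header value for a lifetime given in days.

    Hypothetical helper for illustration; not part of Cantaloupe.
    """
    max_age = days * SECONDS_PER_DAY
    visibility = "public" if public else "private"
    return f"{visibility}, max-age={max_age}"


print(cache_control_value(30))  # public, max-age=2592000
```

Changing the number of days is a matter of recomputing the seconds value and updating the corresponding configuration key.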
In a typical image server configuration, source images will be served from a local filesystem using FilesystemResolver. There, they are already as local as they can be, so there would be no point in caching them (although a derivative cache could still be of great benefit).
As explained in the Resolvers section, though, images do not have to be served from a local filesystem; they can also be served from a remote web server, cloud storage, or another non-filesystem source. The source cache can be beneficial when one of these sources performs more poorly than is ideal. Setting
FilesystemCache will cause all source images from non-FilesystemResolvers to be automatically downloaded and stored in the source cache.
Another reason for a source cache is to work around the incompatibility between certain processors and resolvers. Some processors are only capable of reading source images located on the filesystem. By setting
CacheStrategy, and then configuring FilesystemCache, the source cache will be utilized to deal with incompatible processor/resolver situations by automatically pre-downloading source images. This makes it possible to use something like OpenJpegProcessor with AmazonS3Resolver.
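The pre-downloading idea can be illustrated with a minimal sketch. It is not Cantaloupe's implementation: the function `ensure_local_copy`, the `fetch` callback, and the use of an MD5 hash as the file name are all assumptions made for the example. It shows how a remote image might be copied to the filesystem once, so that a reader that only accepts file paths can then open it.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_local_copy(identifier: str,
                      fetch: Callable[[str], bytes],
                      cache_root: Path) -> Path:
    """Return a filesystem path for `identifier`, downloading it on first use.

    `fetch` is a hypothetical callback standing in for a remote source.
    """
    # Hash the identifier so that arbitrary identifiers map to safe file names.
    name = hashlib.md5(identifier.encode("utf-8")).hexdigest()
    target = cache_root / name
    if target.exists():
        return target  # cache hit: no remote access needed

    cache_root.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename it into place,
    # so that readers never observe a partially written file.
    fd, tmp_name = tempfile.mkstemp(dir=cache_root)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(fetch(identifier))
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if the download or rename failed.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target
```

The second call for the same identifier returns the existing file without contacting the remote source, which is the property that makes the pairing of a file-only processor with a remote resolver workable.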
Ideally, all cloud services and so on would offer faster-than-light-latency seekable-stream access, all image readers would be able to read from them as efficiently as from the local filesystem, and there would be no need to deal with the added complexity of a source cache. But that is not the reality. Cantaloupe tries to keep things simple by integrating the source cache into the larger caching architecture, so all of the information about modes of operation and maintenance applies to both the source and derivative caches.
Note that unlike the derivative cache, there is only one available source cache implementation—FilesystemCache—and it will be used independently of the derivative cache.
The derivative cache caches post-processed images in order to spare the computational expense of processing the same image request over and over again. Derivative caches are pluggable, in order to enable different cache stores.
Derivative caching is recommended in production, as it will greatly reduce load on the server and improve response times accordingly. There are other ways of caching derivatives, such as by using a caching reverse proxy, but the built-in derivative cache is custom-tailored for this application and easy enough to set up.
Derivative caching is disabled by default. To enable it, set
cache.derivative to the name of a cache, such as FilesystemCache.
The source and derivative caches can be configured to operate in one of two ways:
cache.server.resolve_first = true)
cache.server.resolve_first = false)
Because cached content is not automatically deleted after expiring, there is likely to be a certain amount of expired content taking up space in the cache at any given time. Without periodic maintenance, the amount can only grow. If this is a problem, it can be dealt with manually or automatically.
To purge expired content only, start Cantaloupe with the
To purge all content, start Cantaloupe with the
Caches are careful not to leave miscellaneous detritus (like temp files) lying around. In case anything slips through, the above commands will take care of it. To only clean the cache while leaving all content alone, expired or not, start Cantaloupe with the
When Cantaloupe is started with any of these arguments, it runs in a special mode in which the web server is not started, and it exits when the task is done. Thus, any of these tasks can be run in a separate process, on the live cache store, while the main server instance remains running.
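To make the notion of "expired" content concrete, the sketch below walks a directory tree and removes files older than a time-to-live. It is not Cantaloupe's purge logic: it assumes that expiry is judged by comparing a file's modification time with a TTL in seconds, and the function name `purge_expired` is hypothetical.

```python
import os
import time
from pathlib import Path
from typing import Optional


def purge_expired(cache_root: Path, ttl_seconds: float,
                  now: Optional[float] = None) -> int:
    """Delete files under cache_root older than ttl_seconds; return the count."""
    now = time.time() if now is None else now
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for filename in filenames:
            path = Path(dirpath) / filename
            try:
                # A file is treated as expired once its age exceeds the TTL.
                if now - path.stat().st_mtime > ttl_seconds:
                    path.unlink()
                    removed += 1
            except FileNotFoundError:
                # Another process may have removed the file already.
                pass
    return removed
```

Tolerating files that vanish mid-walk matters when such a task runs alongside a live server that is reading and writing the same store.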
Since version 2.2, a "cache worker" is available that will periodically clean and purge expired items from the cache in a low-priority background thread. (See the
cache.server.worker.* configuration options.)
FilesystemCache caches content in a filesystem hierarchy. The location of the root directory is configurable, as is the "time-to-live" of the cache files, with the following options:
Some filesystems have per-directory file count limits, or thresholds beyond which performance starts to degrade. To work around this, cache files are stored in subdirectory trees consisting of leading fragments of MD5 identifier checksums. Ultimately, the cache structure looks like:
FilesystemCache is process-safe: it is safe to point multiple server instances at the same cache directory.
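The checksum-based subdirectory scheme described above can be sketched as follows. The depth of three levels, the two-character fragment length, and the use of the full checksum as the file name are assumptions chosen for illustration; the actual values used by FilesystemCache may differ.

```python
import hashlib
from pathlib import PurePosixPath


def cache_path(root: str, identifier: str,
               depth: int = 3, fragment_len: int = 2) -> PurePosixPath:
    """Map an identifier to a nested path built from its MD5 checksum.

    `depth` and `fragment_len` are hypothetical parameters for this sketch.
    """
    digest = hashlib.md5(identifier.encode("utf-8")).hexdigest()
    # Leading fragments of the checksum become nested directory names,
    # which spreads files evenly across many small directories.
    fragments = [digest[i * fragment_len:(i + 1) * fragment_len]
                 for i in range(depth)]
    return PurePosixPath(root, *fragments, digest)


print(cache_path("/var/cache/cantaloupe", "abc"))
```

With two hexadecimal characters per level, each directory holds at most 256 subdirectories, which keeps per-directory counts well below typical filesystem thresholds.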
JdbcCache caches derivative images and metadata in relational database tables. To use this cache, a JDBC driver for your database must be installed on the classpath.
JdbcCache has been tested with H2 1.4. It is known not to work with the official PostgreSQL driver, as of version 9.4.1207. Other databases may work, but are untested.
JdbcCache can be configured with the following options:
JdbcCache will not create its schema automatically—this must be done manually using the following commands, which may have to be altered slightly for your particular database:
JdbcCache uses write transactions and is process-safe: it is safe to point multiple server instances at the same database tables.
AmazonS3Cache caches derivative images and metadata into an Amazon Simple Storage Service (S3) bucket. It can be configured with the following options:
us-east-1. Can be commented out or left blank to use a default region. (See S3 Regions.)
AzureStorageCache caches derivative images and metadata into a Microsoft Azure Storage container. It can be configured with the following options: