Some image formats support embedded metadata, which may contain information about image characteristics, as well as user- and/or device-supplied information about authoring, copyright, camera settings, and so on. This metadata may be encoded in a standard cross-format encoding like EXIF, IPTC IIM, or XMP, or it may use an encoding that is specific to a particular image format. More than one encoding may be present in the same file.
For each image request, an entirely new image is generated and returned to the client. In some cases you may want this image to carry some or all of the source image's metadata, or to have new metadata added to it.
Format | EXIF | IPTC IIM | XMP | Native |
---|---|---|---|---|
GIF | × | × | ✓ | × |
JPEG | ✓ | ✓ | ✓ | × |
JPEG2000 | × | ✓ | ✓ | × |
PNG | × | × | ✓ | ✓ |
TIFF | Stored in baseline IFD and several sub-IFDs | Stored in sub-IFD 33723 | Stored in sub-IFD 700 | Baseline IFD tags |
Not all processors are metadata-aware; see the table of processor-supported features.
Metadata can be returned in IIIF Image API information responses using an extra_iiifn_information_response_keys() delegate method, where n is the IIIF Image API major version (2 or 3). (This requires Cantaloupe 5.0.3 or later.) An example follows:
def extra_iiif2_information_response_keys(options = {})
  extra_information_response_keys
end

def extra_iiif3_information_response_keys(options = {})
  extra_information_response_keys
end

# Shared by both API versions: exposes the source image's metadata.
def extra_information_response_keys
  {
    'exif' => context.dig('metadata', 'exif'),
    'iptc' => context.dig('metadata', 'iptc'),
    'xmp'  => context.dig('metadata', 'xmp_string')
  }
end
This will cause keys such as the following to be added to an information response:
{
  "@context": "http://iiif.io/api/image/2/context.json",
  "@id": "http://localhost:8182/iiif/2/metadata.jpg",
  "protocol": "http://iiif.io/api/image",
  "width": 64,
  "height": 56,
  ...
  "xmp": "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" ...",
  "iptc": [
    {
      "City": "Urbana"
    },
    {
      "ApplicationRecordVersion": 4
    }
  ],
  "exif": {
    "tagSet": "Baseline TIFF",
    "fields": {
      "Orientation": 1,
      "XResolution": {
        "numerator": 72,
        "denominator": 1
      },
      "YResolution": {
        "numerator": 72,
        "denominator": 1
      },
      "ResolutionUnit": 2,
      "DateTime": "2015:12:31 12:42:48",
      "EXIFIFD": {
        "tagSet": "EXIF",
        "fields": {
          "ExposureTime": {
            "numerator": 1,
            "denominator": 40
          },
          "FNumber": {
            "numerator": 11,
            "denominator": 5
          },
          "PhotographicSensitivity": 40,
          "ExifVersion": "MDIyMQ==",
          "DateTimeOriginal": "2015:12:31 12:42:48",
          "DateTimeDigitized": "2015:12:31 12:42:48"
        }
      }
    }
  }
}
Note that the view is into the low-level metadata structures without any higher-level categorization. For example, EXIF metadata may be present in the exif key, but it may also be present in the XMP string.
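Because the `exif` value mirrors the nesting of the underlying IFDs, a field such as `PhotographicSensitivity` sits one level below the baseline fields. The following is a minimal sketch of a recursive lookup, assuming the structure shown in the sample response above (a hash with `tagSet` and `fields` keys, where a sub-IFD appears as a nested hash of the same shape). `find_exif_field` is a hypothetical helper name, not part of the delegate API, and the sketch assumes the values arrive as plain Ruby hashes.

```ruby
# Hypothetical helper: depth-first search for a named field in the nested
# EXIF structure exposed under context['metadata']['exif'].
def find_exif_field(ifd, name)
  return nil unless ifd.is_a?(Hash) && ifd['fields'].is_a?(Hash)

  fields = ifd['fields']
  return fields[name] if fields.key?(name)

  # Sub-IFDs (such as EXIFIFD) are nested hashes with their own "fields".
  # Rational values are also hashes, but they have no "fields" key.
  fields.each_value do |value|
    next unless value.is_a?(Hash) && value.key?('fields')

    found = find_exif_field(value, name)
    return found unless found.nil?
  end
  nil
end

# Sample data abridged from the information response above.
exif = {
  'tagSet' => 'Baseline TIFF',
  'fields' => {
    'Orientation' => 1,
    'XResolution' => { 'numerator' => 72, 'denominator' => 1 },
    'EXIFIFD' => {
      'tagSet' => 'EXIF',
      'fields' => {
        'PhotographicSensitivity' => 40,
        'DateTimeOriginal' => '2015:12:31 12:42:48'
      }
    }
  }
}

find_exif_field(exif, 'Orientation')             # found at the top level
find_exif_field(exif, 'PhotographicSensitivity') # found inside EXIFIFD
```

The same information might also be present in the XMP string, so a lookup like this only covers the EXIF view of the metadata.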
EXIF string values ending with == are typically base64-encoded binary data.
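As a rough illustration of how such raw values might be interpreted, the sketch below decodes a base64 string and converts a numerator/denominator pair into a Ruby `Rational`. Both helper names are hypothetical, and the sample values are taken from the response shown earlier. Since an ordinary text value can happen to be valid base64, a helper like this is probably best applied only to fields known to hold binary data.

```ruby
# Hypothetical helpers for interpreting raw EXIF values.

# Decodes a base64 string. The 'm0' directive is strict base64 (RFC 4648) and
# raises ArgumentError on invalid input, in which case the value is returned
# unchanged.
def decode_exif_binary(value)
  value.unpack1('m0')
rescue ArgumentError
  value
end

# Converts a { "numerator" => n, "denominator" => d } hash into a Rational.
# A zero denominator raises ZeroDivisionError.
def exif_rational(field)
  Rational(field['numerator'], field['denominator'])
end

exif_version = decode_exif_binary('MDIyMQ==')                            # => "0221"
exposure     = exif_rational('numerator' => 1, 'denominator' => 40)      # => (1/40)
f_number     = exif_rational('numerator' => 11, 'denominator' => 5).to_f # => 2.2
```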
While several different source metadata formats can be read, only XMP can be written. This simplifies the metadata API and reduces crosswalking challenges, as XMP is flexible enough to express EXIF and IIM with no information loss. XMP can also be embedded into many image formats and is widely supported by imaging software.
XMP is serialized as RDF/XML. A very simple XMP packet might look like:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:aux="http://ns.adobe.com/exif/1.0/aux/"
      xmlns:xmp="http://ns.adobe.com/xap/1.0/"
      xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
    <aux:Lens>5.4-10.8mm</aux:Lens>
    <aux:FlashCompensation>0/1</aux:FlashCompensation>
    <aux:Firmware>Firmware Version 1.00</aux:Firmware>
    <aux:OwnerName>Ansel Adams</aux:OwnerName>
    <xmp:CreateDate>2002-07-14T09:01:42</xmp:CreateDate>
    <xmp:ModifyDate>2002-07-14T09:01:42</xmp:ModifyDate>
    <xmp:CreatorTool>Photos 1.5</xmp:CreatorTool>
    <photoshop:DateCreated>2002-07-14T09:01:42</photoshop:DateCreated>
    <dc:subject>
      <rdf:Bag>
        <rdf:li>Mountains</rdf:li>
        <rdf:li>Scenery</rdf:li>
        <rdf:li>Landscapes</rdf:li>
      </rdf:Bag>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
Note that the XMP standard is quite complex and is based on RDF, which itself is complex, so what follows is not implied to be a replacement for the several hundred pages of reading needed to fully grasp those respective standards.
The above XMP packet can be copied verbatim from a source image into derivative images quite easily:
def metadata(options = {})
  context['metadata']['xmp_string']
end
Continuing with the example above, we assume that the source image's IPTC metadata contains a copyright statement that we'd like to copy into derivative images' XMP data.
According to the IPTC IIM specification, the tag most likely to contain a copyright statement is CopyrightNotice (p. 39). We will choose to translate this into a Dublin Core rights element in XMP.
java_import java.io.StringWriter

def metadata(options = {})
  metadata = context['metadata']
  iptc     = metadata['iptc']
  # The source image's XMP as a Jena model (nil if the image has no XMP).
  model    = metadata['xmp_model']
  if iptc && model
    # In IPTC IIM terminology, the basic data structure is the "data set."
    # IIM data may contain multiple data sets, each identified by a "tag" and
    # having a "data field" containing the value. This structure is translated
    # into the delegate context as an array of one-element hashes representing
    # the data sets, each with a tag name key and data field value.
    data_set = iptc.find { |ds| ds.keys.first == 'CopyrightNotice' }
    if data_set
      copyright = data_set['CopyrightNotice']
      # Add a custom dc:rights property.
      res  = model.createResource
      obj  = model.createLiteral(copyright, false)
      prop = model.createProperty('http://purl.org/dc/elements/1.1/rights')
      stmt = model.createStatement(res, prop, obj)
      model.add(stmt)
      # Write the model to XML and return it.
      writer = StringWriter.new
      begin
        model.write(writer)
        return writer.toString
      ensure
        writer.close
      end
    end
  end
  # No IPTC copyright notice or no XMP model to add it to.
  nil
end
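The array of one-element hashes described in the comments above can be awkward to query when a tag repeats, as IIM allows for tags such as Keywords. The following plain-Ruby sketch groups the data sets by tag name; it assumes the shape shown in the earlier information response (for example `[{"City" => "Urbana"}, {"ApplicationRecordVersion" => 4}]`). The helper names are hypothetical and the sample values are made up for illustration.

```ruby
# Hypothetical helper: collapse the IPTC data sets into a hash that maps each
# tag name to an array of its values, preserving order and repeated tags.
def group_iptc_data_sets(iptc)
  (iptc || []).each_with_object({}) do |data_set, grouped|
    data_set.each do |tag, value|
      (grouped[tag] ||= []) << value
    end
  end
end

# Hypothetical helper: first value for a tag, or nil when the tag is absent.
def first_iptc_value(iptc, tag)
  group_iptc_data_sets(iptc).fetch(tag, []).first
end

# Illustrative data only; these values do not come from a real image.
iptc = [
  { 'City' => 'Urbana' },
  { 'Keywords' => 'Mountains' },
  { 'Keywords' => 'Scenery' },
  { 'CopyrightNotice' => 'Copyright Example Org' }
]

group_iptc_data_sets(iptc)['Keywords']    # => ["Mountains", "Scenery"]
first_iptc_value(iptc, 'CopyrightNotice') # => "Copyright Example Org"
first_iptc_value(nil, 'CopyrightNotice')  # => nil
```

Grouping into arrays rather than a flat hash avoids silently dropping all but the last value of a repeated tag.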
In this example, we want to remove any dc:title property that may be present in the source XMP, and add our own. We use Jena, which is bundled into the application, to do this.
java_import java.io.StringWriter

def metadata(options = {})
  metadata = context['metadata']
  model    = metadata['xmp_model']
  if model
    # Search for dc:title properties.
    prop = model.createProperty('http://purl.org/dc/elements/1.1/title')
    it   = model.listStatements(nil, prop, nil)
    # Remove them.
    it.removeNext while it.hasNext
    # Add a custom dc:title property.
    res  = model.createResource
    obj  = model.createLiteral('Hello world', false)
    stmt = model.createStatement(res, prop, obj)
    model.add(stmt)
    # Write the model to XML and return it.
    writer = StringWriter.new
    begin
      model.write(writer)
      return writer.toString
    ensure
      writer.close
    end
  end
  nil
end