This information is translated from this Japanese blog post, which was written in November 2013. I recently came across the article in connection with the AutoIt script originally written by Sven Neuhaus.
In January 2014, RICOH added Photo Sphere XMP metadata to THETA images, opening the way for third-party applications to edit the XMP metadata. Original announcement in Japanese.
Photo Sphere XMP is Google's extension of Adobe's metadata format, XMP.
Introduction
Although the orientation is adjusted properly when an image is viewed in the official RICOH THETA viewer, the JPEG itself is not corrected for orientation. It looks like this.
If you just upload photos taken with the THETA to the official theta360.com website, there is no particular problem. However, if you want to bring the image to an Oculus, you need access to the orientation information.
The official RICOH THETA Windows viewer can correct the orientation given nothing but this JPEG, so the data must be somewhere in the JPEG.
EXIF Info
This is what the data looks like when I open it in ExifReader (when there is GPS information).
Note: This was originally in Japanese. The English translation may not match the actual data.
File name: R0010004.JPG
Exif : Exif
▼ Main information
Title:
Manufacturer name: RICOH
Model: RICOH THETA
Image Orientation: Upper Left
Width resolution: 72/1
Height resolution: 72/1
Resolution Unit: Inch
Software: RICOH THETA Ver 1.02
Modified date and time: 2013:11:09 17:40:42
YCbCrPositioning: Match
Copyright :
Exif information offset: 434
GPS information offset: 904
▼ Sub Information
Exposure time: 1/30 sec
Lens F value : F 2.1
Exposure control mode: Program AE
ISO sensitivity : 800
Unknown (8830) 3, 1: 1
Exif version: 0230
Original shooting date and time: 2013:11:09 17:40:42
Digitization date and time: 2013:11:09 17:40:42
Meaning of each component: YCbCr
Image compression ratio: 320/100 (bit / pixel)
Lens aperture value: F 2.1
Brightness of object: EV - 1.5
Exposure correction amount: EV 0.0
Open F value : F 2.1
Auto exposure metering mode: split photometry
Light source: unknown
Flash: Off
Lens focal length: 0.75 (mm)
Camera internal information: RICOH Format [.............]
User Comment:
Version of FlashPix: 0100
Color space information: sRGB
Image width: 3584
Image Height: 1792
ExifR 98 Extended information: 58224
Shooting mode: Auto
White balance mode: Auto
Focal length of lens (35 mm): 6 (mm)
Scene shooting type: standard
Sharpness: Standard
▼ GPS information
GPS Tag Version: 2, 3, 0, 0
Latitude (N / S): N
Latitude (numerical value): 34 ° ****. ** [DMS]
Longitude (E / W): E
Longitude (numerical value): 135 ° ****. ** [DMS]
Altitude standard: Altitude above sea level
Altitude (numerical value): 2148/100 m
GPS time (UTC): 08:40:37
Direction reference of photographed image: True azimuth
Direction of photographed image: 22.50°
Geodetic system: WGS 84
Time stamp: 2013:11:09
TOKYO Datum system conversion latitude: 34 /**/**.*** [DMS]
TOKYO Datum system converted longitude: 135 /**/**.*** [DMS]
▼ ExifR 98 information
Compatibility identifier: R 98
Version: 0100
▼ Thumbnail Information
Type of compression: OLDJPEG
Width resolution: 72/1
Height resolution: 72/1
Resolution Unit: Inch
JPEGInterchangeFormat: 58356
JPEGInterchangeFormatLength: 3225
There is a “direction of image taken” field (GPSImgDirection), but there is no acceleration (tilt) information. It must be stored in some undocumented format inside “camera internal information”.
Accessing Manufacturer Info
I first compared a lot of pictures to see which parts of the data changed. As this was inefficient, I changed my approach a bit and looked at the Windows version of the application instead. Looking closely, I found that the RICOH desktop application was made with Adobe AIR. I was able to decompile SphericalViewer.swf with JPEXS Free Flash Decompiler.
I first looked at jp.co.ricoh.Exif.RicohIFDEntry and jp.co.ricoh.receptor.entities.EquirectangularImage, where I found the following tags:
- ZenithEs (TagId = 0x0003)
- Zenith (TagId = 0x0006)
- CompassEs (TagId = 0x0004)
- Compass (TagId = 0x0007)
These tags appear to be stored in an IFD (Image File Directory), the structure that Exif inherits from TIFF. At first glance the format is nothing unusual, but I do not know whether it differs from the standard in places. I will only describe the format around these tags.
Basically, everything is big endian.
Entry
= TagId(uint16) TypeId(uint16) NumData(uint32) Offset(uint32) (when the data is large)
| TagId(uint16) TypeId(uint16) NumData(uint32) Data{NumData} (otherwise)
TypeId
= 0x0005 (unsigned rational)
| 0x000a (signed rational)
Data
= a(uint32) b(uint32) (unsigned rational, a/b)
| a(int32) b(int32) (signed rational, a/b)
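When the data is small (4 bytes or less in total), it is stored inline. When it is larger, the field holds an offset into the file instead. For some reason the actual data starts at that offset plus 12; presumably the offset is relative to the TIFF header, which begins at byte 12 when the Exif APP1 segment immediately follows the JPEG SOI marker. In the files I looked at, ZenithEs and CompassEs were defined:
- ZenithEs: signed rational, NumData = 2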
- CompassEs: unsigned rational, NumData = 1
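The entry layout above can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions: the helper names parse_entry and read_rationals are made up here, the buffer is synthetic rather than taken from a real THETA file, and only the out-of-line (offset) form is handled, since a rational takes 8 bytes and so can never fit inline.

```python
import struct

# Type IDs named in the text; a rational is two 32-bit integers (8 bytes).
UNSIGNED_RATIONAL = 0x0005
SIGNED_RATIONAL = 0x000A
DATA_BASE = 12  # the "+12" adjustment observed in the text


def parse_entry(buf, pos):
    """Decode one 12-byte big-endian entry at buf[pos:pos + 12]."""
    tag_id, type_id, num_data, value = struct.unpack('>HHII', buf[pos:pos + 12])
    return tag_id, type_id, num_data, value


def read_rationals(buf, type_id, num_data, offset):
    """Read num_data rationals stored out of line at offset + DATA_BASE."""
    fmt = '>ii' if type_id == SIGNED_RATIONAL else '>II'
    start = offset + DATA_BASE
    values = []
    for i in range(num_data):
        a, b = struct.unpack(fmt, buf[start + 8 * i:start + 8 * i + 8])
        values.append(a / b)
    return values


# Synthetic buffer (not real THETA data): one ZenithEs entry at position 0
# whose offset field is 4, so the data sits at byte 4 + 12 = 16.
entry = struct.pack('>HHII', 0x0003, SIGNED_RATIONAL, 2, 4)
data = struct.pack('>iiii', 45, 10, -30, 10)  # 45/10 and -30/10
buf = entry + b'\x00' * 4 + data

tag_id, type_id, num_data, offset = parse_entry(buf, 0)
zenith = read_rationals(buf, type_id, num_data, offset)
print(hex(tag_id), zenith)
```

Walking a real file this way would also require locating the start of the IFD, which this sketch does not attempt.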
The valid range of each value, all in degrees, can be read off the error-checking code.
- 0 <= ZenithEs [0] <= 360
- -90 <= ZenithEs [1] <= 90
- 0 <= CompassEs <= 360
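As a small sanity check, the ranges above can be applied to values read from a file before using them. The function name check_ranges is hypothetical and is not taken from the decompiled viewer; it only restates the three inequalities.

```python
def check_ranges(zenith_es, compass_es):
    """Return True if the angles (in degrees) fall inside the listed ranges."""
    zenith_0, zenith_1 = zenith_es
    return (
        0 <= zenith_0 <= 360
        and -90 <= zenith_1 <= 90
        and 0 <= compass_es <= 360
    )


print(check_ranges((4.5, -3.0), 22.5))
print(check_ranges((361.0, 0.0), 22.5))
```

A value outside these ranges most likely means the signature matched unrelated bytes, so it is safer to discard the result than to use it.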
These values are passed to a class called Tilt3D, where they are named ZenithX (ZenithEs[0]), ZenithY (ZenithEs[1]), ZenithZ (0) and Compass (CompassEs). I have not yet examined the coordinate system and so on, but it seems that a rotation matrix is built using only ZenithX and ZenithY.
m =
cos(zY) -sin(zY)*cos(zX) -sin(zY)*sin(zX)
sin(zY) cos(zY)*cos(zX) cos(zY)*sin(zX)
0 sin(zX) cos(zX)
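To experiment with this matrix, here is a minimal Python 3 sketch that builds it exactly as written above. Two things are assumptions: the function name tilt_matrix is made up, and the angles are taken to be in degrees and converted to radians before use. Note that, as transcribed, every row has unit length but the third row is not orthogonal to the first two unless ZenithX is zero, so a sign may have been lost somewhere between the decompiled code and this text. Check against the viewer's behaviour before relying on it.

```python
import math


def tilt_matrix(zenith_x_deg, zenith_y_deg):
    """Build the 3x3 matrix exactly as transcribed in the text."""
    zx = math.radians(zenith_x_deg)  # assumption: inputs are degrees
    zy = math.radians(zenith_y_deg)
    return [
        [math.cos(zy), -math.sin(zy) * math.cos(zx), -math.sin(zy) * math.sin(zx)],
        [math.sin(zy), math.cos(zy) * math.cos(zx), math.cos(zy) * math.sin(zx)],
        [0.0, math.sin(zx), math.cos(zx)],
    ]


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


identity = tilt_matrix(0.0, 0.0)   # zero tilt gives the identity matrix
tilted = tilt_matrix(30.0, 20.0)

# Rows 0 and 1 are orthogonal, but row 2 is not orthogonal to row 0.
print(dot(tilted[0], tilted[1]))
print(dot(tilted[0], tilted[2]))
```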
Retrieving the values
It would be nice to parse the IFD properly, but since that looks troublesome, I will write a method that retrieves the values the easy way. The tag entries described earlier form a relatively unique signature in the binary, so it seems good enough to search for that signature and read the value it points to.
#!/usr/bin/env python3
# Byte-string literals keep this runnable on Python 2.7 as well.
import struct

def find_data(s, tag):
    """Return the index just past the first occurrence of tag in s."""
    ix = s.find(tag)
    if ix < 0:
        raise Exception('Cannot find tag')
    return ix + len(tag)

def parse_u_rational(s):
    """Decode a big-endian unsigned rational (a/b) from 8 bytes."""
    a, b = struct.unpack('>II', s)
    return float(a) / float(b)

def parse_s_rational(s):
    """Decode a big-endian signed rational (a/b) from 8 bytes."""
    a, b = struct.unpack('>ii', s)
    return float(a) / float(b)

def get_angles(path):
    with open(path, 'rb') as f:
        head = f.read(10 * 1000)  # take long enough header
    # Find CompassEs
    ix = find_data(head, b'\x00\x04\x00\x05\x00\x00\x00\x01')  # search CompassEs,UnsignedRational,1
    offset = struct.unpack('>I', head[ix : ix + 4])[0] + 12
    compass = parse_u_rational(head[offset : offset + 8])
    # Find ZenithEs
    ix = find_data(head, b'\x00\x03\x00\x0a\x00\x00\x00\x02')  # search ZenithEs,SignedRational,2
    offset = struct.unpack('>I', head[ix : ix + 4])[0] + 12
    zenith_x = parse_s_rational(head[offset : offset + 8])
    zenith_y = parse_s_rational(head[offset + 8 : offset + 16])
    return {
        'zenith_x': zenith_x,
        'zenith_y': zenith_y,
        'compass': compass
    }
For the file I am using, the values look usable. To actually use them you will need to examine the coordinate system a little more, but I think that is relatively easy.
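If no THETA image is at hand, the signature search can still be exercised on hand-built bytes. The sketch below is a Python 3 illustration under stated assumptions: rationals_after is a made-up helper that follows the same steps as the script (find the signature, read the offset, add 12, decode the rationals), and the buffer is synthetic and only imitates the layout described in this post.

```python
import struct

COMPASS_SIG = b'\x00\x04\x00\x05\x00\x00\x00\x01'  # CompassEs, unsigned rational, 1
ZENITH_SIG = b'\x00\x03\x00\x0a\x00\x00\x00\x02'   # ZenithEs, signed rational, 2


def rationals_after(head, signature, fmt, count):
    """Find signature, follow its offset (+12), and decode count rationals."""
    ix = head.find(signature)
    if ix < 0:
        raise ValueError('signature not found')
    ix += len(signature)
    offset = struct.unpack('>I', head[ix:ix + 4])[0] + 12
    values = []
    for i in range(count):
        a, b = struct.unpack(fmt, head[offset + 8 * i:offset + 8 * i + 8])
        values.append(a / b)
    return values


# Hand-built bytes standing in for a file header (NOT real THETA data).
padding = b'\xff' * 12                                 # bytes 0..11
compass_entry = COMPASS_SIG + struct.pack('>I', 24)    # data at 24 + 12 = 36
zenith_entry = ZENITH_SIG + struct.pack('>I', 32)      # data at 32 + 12 = 44
compass_data = struct.pack('>II', 2250, 100)           # 2250/100
zenith_data = struct.pack('>iiii', 1800, 10, -45, 10)  # 1800/10 and -45/10
head = padding + compass_entry + zenith_entry + compass_data + zenith_data

compass = rationals_after(head, COMPASS_SIG, '>II', 1)
zenith = rationals_after(head, ZENITH_SIG, '>ii', 2)
print(compass, zenith)
```

Because the search takes the first match anywhere in the header, a real file could in principle contain the same eight bytes elsewhere; range-checking the decoded angles is a cheap guard against that.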
Summary
Some other interesting tags also seem to be defined, so anyone who is interested may want to investigate them.
- HDRType
- HDRData
- AbnormalAcc
Notes
1: I also tried exiftool, which supports a number of proprietary formats, but the THETA was too new for it to handle.
2: JPEXS Free Flash Decompiler (GitHub: jindrapetrik/jpexs-decompiler) was useful.
Implementations
- Like Silk: RICOH THETA 360 - Find zenith and compass information from EXIF data of the RICOH THETA 360
- RICOH THETA AutoIt Script for batch processing using the Ricoh Theta for Windows GUI application.