Figure out Sony lens distortion parameters
Sony saves lens distortion parameters as a list of numbers, for example:
`LensDistortion Table: {"unk1":[30000000,20197756],"unk2":0,"unk3":200.0,"unk4":[770,1534,2287,3022,3737,4423,5082,5706,6297,6852]}`
These numbers can be used to undistort the video instead of a lens profile (this is what Catalyst Browse does). The problem is to figure out how to use these numbers to undistort the image.
When plotted it looks like this:

`gyro2bb --dump Sony_file.mp4 | grep LensDistortion` can be used to dump these values from a file:
https://github.com/AdrianEddy/telemetry-parser/releases
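For anyone poking at this, here's a minimal sketch of loading one of those dumped tables into a typed struct. It assumes serde / serde_json with the derive feature, and the field names are simply whatever telemetry-parser dumps; their meaning is exactly what this issue is trying to figure out:

```rust
use serde::Deserialize;

// Field names exactly as dumped by telemetry-parser; their meaning is unknown at this point.
#[derive(Debug, Deserialize)]
struct LensDistortionTable {
    unk1: [u64; 2],
    unk2: i64,
    unk3: f64,
    unk4: Vec<i64>,
}

fn main() -> Result<(), serde_json::Error> {
    let raw = r#"{"unk1":[30000000,20197756],"unk2":0,"unk3":200.0,"unk4":[770,1534,2287,3022,3737,4423,5082,5706,6297,6852]}"#;
    let table: LensDistortionTable = serde_json::from_str(raw)?;
    println!("{table:#?}");
    Ok(())
}
```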
Probably the same format as in their stills EXIF: https://stannum.io/blog/0PwljB
Edit: Hmm, only 10 parameters in your list, not the 11 I'd expect from an APS-C camera. Although, since video is a 16:9 crop from the sensor, maybe the outermost parameter for 3:2 is N/A for a 16:9 crop?
Was this an APS-C camera or FF?
That was a full-frame camera (A7S3). I can send you some sample files from different Sony cameras with that metadata if you want to take a look.
I'll poke at my A7M4 later this week. Only 10 parameters from an FF camera not in APS-C crop mode seems a little strange to me, unless Sony is somehow interpolating the spline differently in video mode.
Which hints to me that the lens might be reporting even more detailed distortion data/a different format, which would explain why I wasn't ever able to easily find any correlation between logic analyzer traces and EXIF distortion data. (I've been poking at the E-mount electronic protocols on and off for a few years. Mostly off due to lack of time/motivation lately - https://www.dyxum.com/dforum/emount-electronic-protocol-reverse-engineering_topic119522.html )
FYI, parsing of this tag is implemented here: https://github.com/AdrianEddy/telemetry-parser/blob/master/src/sony/rtmd_tags.rs#L483-L501
Thanks. Will maaaaybe poke at this tonight.
As a side note: Since apparently Sony's embedded distortion profiles might be sufficient, is there a reason you do not support lensfun profiles?
I'd never heard of lensfun; it looks like a pretty solid database, thanks. It would be nice to support it, so I'll create a separate issue for that. However, this LensDistortion embedded metadata also supports zoom lenses, because it's stored per frame and changes when you zoom the lens during recording.
Hmm, the 24-105G could be fun, since distortion correction is mandatory for this lens (it has already been corrected) - but I'm guessing for rolling shutter compensation you might want to undo the distortion.
For a 24-105G on A7M4, zoomed all the way out at 24mm, the EXIF in a raw has:
Distortion Correction : Auto fixed by lens
Distortion Correction Setting : Auto
Distortion Corr Params : 16 18 0 -29 -65 -115 -172 -241 -314 -397 -478 -564 -645 -728 -801 -870 -927
The video dump has:
LensDistortion Enabled LensDistortion bool : true
LensDistortion Data LensDistortion Table : {"unk1":[24700000,19335340],"unk2":0,"unk3":200.0,"unk4":[895,1785,2663,3520,4355,5158,5925,6650,7329,7958]}
In your example my first thought was that unk1 was somehow connected to the sensor resolution, since yours is roughly a 3:2 ratio - but mine is clearly not a 3:2 ratio.
One thing I question from Mr Galka's efforts - if all of the offsets are in relation to the maximum value in the array, then why is Sony storing them in signed ints in the ARW variation? Why not just store "offset from maximum value"? That doesn't seem quite correct to me. Could it be that multiplications greater than 1 are allowed, but only so long as they are far enough away from an edge not to cross an image border?
This definitely needs some weekend time...
Edit: @AdrianEddy what lens was on the A7S3 when you tested? And can you take a single raw shot at the same focal length? No need to actually post the ARW, just pulling those three distortion-related fields using exiftool should be enough for now.
Let me guess, that was the Tamron 28-200 at 28mm?
30000000/20197756 = 1.4853
24700000/19335340 = 1.2774 (This is what a 24-105G reports at 24mm)
24*(1.4853/1.2774) = a shade under 28
It isn't quite tracking focal length. Compare an FE85 to the 24-105G at 24: a 3.5x ratio of focal lengths, but the scale factor ratio is only 3.1x. Or the 24-105G at 24 vs 105: a scale factor ratio of 3.72, but an actual focal length ratio of 4.375.
The slope of the list of 10 parameters seems to decrease with focal length
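A quick reproduction of the arithmetic above, for anyone following along (the two pairs are the unk1 values quoted earlier; nothing here beyond the ratios):

```rust
fn main() {
    let a7s3_tamron_28mm = 30_000_000.0 / 20_197_756.0;  // ≈ 1.4853
    let a7m4_24105_24mm  = 24_700_000.0 / 19_335_340.0;  // ≈ 1.2774
    // If the first unk1 value scaled exactly with focal length, this would be 28:
    println!("{:.2}", 24.0 * (a7s3_tamron_28mm / a7m4_24105_24mm)); // ≈ 27.90, "a shade under 28"
}
```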
Yes that was Tamron 28-200 at 28mm
So I took out my A7S3 to take the RAW and the weirdest thing happened - the SD card started burning and it fell apart inside the body. Now my A7S3 doesn't read any SD cards...
EDIT: After removing the battery it started working again. I sent you sample files on Discord.
OK, will take a look later today. Hopefully I'll have some time to poke at this, I just realized I've been procrastinating way too much on other stuff in favor of poking at gyroflow. :)
Sounds like they are using a resettable fuse in the case of a short circuit on the SD card, or if you had the miraculous ability to see kernel logs of the camera's OS you'd see that the SD card errors caused a kernel panic. :) Either of these would be fixed with a battery pull.
My poking last night has me really scratching my head here. I should see if a lens which doesn't provide distortion info (such as a Canon lens on a Metabones adapter) puts some placeholder "no-op" info here... Unlikely but having a "no-op" reference would be very useful.
So, I realize I was actually semi-misled, and ignoring the obvious here:
The first number in "unk1" is almost exactly proportional to focal length. It's not quite - sometimes it's a bit high, sometimes a bit low, but really close. For example, 24700000 for 24mm. If you hold zoom position constant and move focus, this number never changes.
The second number is usually around 20000000 - This DOES vary with focus position slightly, and I am wondering if it happens to be a scaling factor used by focus breathing compensation. It would be interesting to compare a lens known for heavy focus breathing and a lens known for good focus breathing performance to see what happens when moving focus.
I haven't been poking at this much lately (busy with other stuff), but will hopefully be able to poke at this more in the coming weeks.
> It would be interesting to compare a lens known for heavy focus breathing and a lens known for good focus breathing performance to see what happens when moving focus.
@Entropy512 I have a few of the lenses on the Focus Breathing Compensation compatibility list (SEL20F18G, SEL50F12GM, SEL70200GM2), happy to submit clips from my Sony A1 or A7 IV. Feel free to let me know if there's anything I can do to help!
@sturmen I'm slowly getting to figuring out more Sony stuff; sample files with Breathing Compensation will be very useful. Also, if you have some OSS lenses, that would be helpful too.
Fantastic! Do you want me to run through any particular situations/actions? Here's what I was thinking:
- ILCE-7M4 + SEL50F12GM manually controlled focus rack for lens comp data
- ILCE-7M4 + SEL70200GM2 fixed focus/fixed zoom handheld shot for OSS data (Does lens comp being enabled matter here?)
Sounds good. Disable lens compensation in camera for these tests, or, if you want, record one with it on and one with it off.
Here are the clips: http://gofile.me/5i0cz/XyglPmACo
The camera forced on lens comp for a lot of them. I narrate the settings, and I'm pretty sure I got the narration mostly right.
Thanks! Looks like when the breathing compensation is enabled, it does it in the camera, so there's nothing to do in post on our side.
I would think that when it's off, it's saved in the metadata so it can be applied later, but all these clips have the "Lens breathing compensation" option in Catalyst Browse disabled, so maybe that's not the case? Could you try to find some settings that make the "Lens breathing compensation" available in Catalyst?
Do you have the latest firmware in the camera and lenses?
Looks like the "Lens Breathing in Post" functionality is limited to only some of their cameras: https://support.d-imaging.sony.co.jp/www/cscs/function/catalyst.php?fnc=1011&area=gb&lang=en
Unfortunately I only own the α7 IV (ILCE-7M4) and the α1 (ILCE-1), so we'll need to recruit someone else.
I have an A7 IV and an RX100 VII. What can I do to help move this forward? Please elaborate and be specific, and I'll get you anything you need.
> Looks like the "Lens Breathing in Post" functionality is limited to only some of their cameras: https://support.d-imaging.sony.co.jp/www/cscs/function/catalyst.php?fnc=1011&area=gb&lang=en
> Unfortunately I only own the α7 IV (ILCE-7M4) and the α1 (ILCE-1), so we'll need to recruit someone else.
I have cameras and lenses for testing; what's needed now is reverse-engineering work.
Some additional information:
LensDistortion Data: {"focal_length_nm":10400000,"effective_sensor_height_nm":10521644,"unk1":0,"coeff_scale":200.0,"coeffs":[1151,2281,3382,4441,5423,6341,7196,7980,8683,9303]}
Imager 0xe405 Sensor pixel size : (4896, 2760)
Imager 0xe407 Pixel pitch : (3910, 3910)
Imager 0xe409 Sensor crop origin : (1, 3)
Imager 0xe40a Sensor crop size : (4893, 2752)
Imager 0xe40c First sample timestamp : 34182 us
Imager 0xe40d Exposure time : 15621 us
Imager 0xe40e Frame readout time : 32199 us
To get polynomial coefficients, take effective_sensor_height_nm, divide it by the pixel pitch, then multiply by (video resolution / sensor crop size). Divide that range into 10 even steps and use it as the X axis for polyfit. For the Y axis, use coeffs / coeff_scale with (0, 0) prepended as the first element, and fit a 6-degree polynomial to that.
By using (focal_length_nm / effective_sensor_height_nm) * video_height we get the pixel focal length for the camera intrinsic matrix.
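To make that concrete, here is the same recipe applied to the dump above, as a rough sanity check only. The 3840x2160 output height is an assumption (it's also what the hard-coded 2160 in the snippet below implies):

```rust
fn main() {
    // Values from the metadata dump above
    let focal_length_nm = 10_400_000.0_f64;
    let effective_sensor_height_nm = 10_521_644.0_f64;
    let pixel_pitch_nm = 3_910.0_f64;  // Imager "Pixel pitch"
    let crop_height_px = 2_752.0_f64;  // Imager "Sensor crop size", height
    let video_height_px = 2_160.0_f64; // assumed output height

    // Effective sensor height expressed in output-video pixels; this is the range
    // that gets split into 10 even steps for the polyfit X axis
    let range_px = effective_sensor_height_nm / pixel_pitch_nm * (video_height_px / crop_height_px);
    // Pixel focal length for the camera intrinsic matrix
    let fx = focal_length_nm / effective_sensor_height_nm * video_height_px;
    println!("X-axis range ≈ {range_px:.0} px, fx ≈ {fx:.0} px"); // ≈ 2112 px and ≈ 2135 px
}
```

Note that the snippet below additionally puts the X values on a 1/1000 scale, which lines up with the `length(p) / 1000.0` normalization in the GLSL excerpt further down.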
```rust
// Pixel pitch and sensor crop size from the Imager tag group
let pixel_pitch = tag_map.get(&GroupId::Imager).and_then(|x| x.get_t(TagId::PixelPitch) as Option<&(u32, u32)>).cloned();
let crop_size = tag_map.get(&GroupId::Imager).and_then(|x| x.get_t(TagId::CaptureAreaSize) as Option<&(u32, u32)>).cloned();

if let Some(md) = tag_map.get(&GroupId::Custom("LensDistortion".into())) {
    if let Some(v) = md.get_t(TagId::Enabled) as Option<&bool> {
        // "LensDistortion Enabled" flag, not used further here
    }
    if let Some(v) = md.get_t(TagId::Data) as Option<&serde_json::Value> {
        telemetry_parser::try_block!({
            // Least-squares fit: returns coefficients c0..cN such that
            // y_values[i] ≈ c0 + c1*x_values[i] + ... + cN*x_values[i]^N
            pub fn polyfit(y_values: &[f64], x_values: &[f64], polynomial_degree: usize) -> ::core::result::Result<Vec<f64>, &'static str> {
                use nalgebra as na;
                let number_of_columns = polynomial_degree + 1;
                let number_of_rows = x_values.len();
                // Vandermonde matrix built from the sample positions
                let mut a = na::DMatrix::zeros(number_of_rows, number_of_columns);
                for (row, &x) in x_values.iter().enumerate() {
                    a[(row, 0)] = 1.0;
                    for col in 1..number_of_columns {
                        a[(row, col)] = x.powf(col as f64);
                    }
                }
                let b = na::DVector::from_row_slice(y_values);
                // Solve the least-squares problem with SVD
                let decomp = na::SVD::new(a, true, true);
                match decomp.solve(&b, 1e-18f64) {
                    Ok(mat) => Ok(mat.data.into()),
                    Err(error) => Err(error),
                }
            }

            let focal_length_nm = v.get("focal_length_nm")?.as_f64()?;
            let effective_sensor_height_nm = v.get("effective_sensor_height_nm")?.as_f64()?;
            let coeff_scale = v.get("coeff_scale")?.as_f64()?;
            // Y axis: coeffs divided by coeff_scale, with a (0, 0) point prepended below
            let mut coeffs: Vec<f64> = v.get("coeffs")?.as_array()?.into_iter().filter_map(|x| Some(x.as_f64()? / coeff_scale.max(1.0))).collect();

            // X axis: effective sensor height expressed in output-video pixels (2160 assumed),
            // divided into 10 even steps and put on a 1/1000 scale
            let video_scaler = 2160.0 / crop_size.unwrap().1 as f64;
            let step = (effective_sensor_height_nm as f64 / pixel_pitch.unwrap().1 as f64 * video_scaler).round() as usize;
            let x: Vec<f64> = (0..=10).map(|i| i as f64 * step as f64 / 10000.0).collect();
            coeffs.insert(0, 0.0);
            let y = coeffs;

            // Pixel focal length for the camera intrinsic matrix
            let pixel_focal_length = (focal_length_nm as f64 / effective_sensor_height_nm as f64) * 2160.0;

            // Fit the 6-degree polynomial and print it alongside the pixel focal length
            let fit = polyfit(&x, &y, 6).unwrap();
            let fit = format!("{:.5} {:.5} {:.5} {:.5} {:.5} {:.5} {:.5}", fit[0], fit[1], fit[2], fit[3], fit[4], fit[5], fit[6]).replace(".", ",");
            println!("{pixel_focal_length:.3} px, {}", fit);
        });
    }
}
```
This is what I have, and this is what Catalyst does.
The calculated values from polyfit are the polynomial coefficients, and they are exactly the same as the ones calculated by Catalyst's code.
So that part I have correct, but I didn't get any further.
Catalyst also calculates it twice: once to undistort and once to distort.
I think the only difference is the swapped x and y arguments to polyfit.
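If that reading is right, the two directions would just be the two argument orders, reusing the polyfit helper from the snippet above (a sketch only; which order is "undistort" and which is "distort" is still the open question):

```rust
// Sketch: the same data fitted in both directions, per the "swapped x and y" observation above
let fit_one = polyfit(&x, &y, 6).unwrap(); // the order used above; matches Catalyst's calculated values
let fit_two = polyfit(&y, &x, 6).unwrap(); // arguments swapped; presumably the inverse mapping
```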
I think the numbers in the metadata are distances in nanometers from the center of the lens. So we need to fit a polynomial to these values first; only that gives us the actual coefficients. That's why we also need to use the pixel pitch, physical sensor size, video size and crop area in these calculations.
Also, some information is in the GLSL shaders, which are stored in plain text in the svmulib binary:
```glsl
vec2 p = point;
float l = length(p) / 1000.0;        // normalized radius from the image center
float y[6];
float a = 0.0;
for (int i = 0; i < 6; i++) {
    y[i] = pow(l, 6.0 - float(i));   // powers l^6 .. l^1 (no constant term)
    a += (y[i] * k_y2a[i]);          // k_y2a holds the 6 polynomial coefficients
}
a = clamp(a, 0.0, 89.0);             // 'a' is treated as an angle in degrees, kept below 90
float l_cor = f_real * tan(radians(a));
float scale = l_cor / l;
return p * scale;
```
However, it can't be used directly with what we have (I tried), and I think it also handles the camera intrinsic matrix in a different way than Gyroflow does, so I wouldn't rely on that code too much.
What's left is to make some sense of the calculated polynomials and add a new lens model here which will use these polynomial coefficients.
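For reference, here is a rough Rust transcription of the shader math above that a new lens model could start from. The naming is entirely my assumption (this is not Gyroflow's actual lens-model API): `k_y2a` is taken to be the six fitted coefficients for the powers l^6 down to l^1, and `f_real` a focal length in the same normalized units as `l`.

```rust
// Sketch only: mirrors the GLSL excerpt above.
fn undistort_point(point: (f64, f64), k_y2a: &[f64; 6], f_real: f64) -> (f64, f64) {
    let (px, py) = point;
    // Normalized radius from the image center, as in the shader (length(p) / 1000.0)
    let l = (px * px + py * py).sqrt() / 1000.0;
    // Polynomial in l with coefficients for powers 6..1 (no constant term)
    let mut a = 0.0;
    for i in 0..6 {
        a += l.powi(6 - i as i32) * k_y2a[i];
    }
    // The shader treats `a` as an angle in degrees and clamps it below 90
    let a = a.clamp(0.0, 89.0);
    let l_cor = f_real * a.to_radians().tan();
    // Guard the center pixel; the shader divides by l directly
    let scale = if l > 0.0 { l_cor / l } else { 1.0 };
    (px * scale, py * scale)
}
```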
/bounty $1000
$1,000 bounty created by AdrianEddy
If you start working on this, comment /attempt #44 to notify everyone
To claim this bounty, submit a pull request that includes the text /claim #44 somewhere in its body
Before proceeding, please make sure you can receive payouts in your country
Payment arrives in your account 2-5 days after the bounty is rewarded
You keep 100% of the bounty award
Thank you for contributing to gyroflow/gyroflow!
| Attempt | Started (GMT+0) | Solution |
|---|---|---|
| @ap172x | Sep 27, 2023, 10:30:28 AM | WIP |
> I have cameras and lenses for testing; what's needed now is reverse-engineering work.
I'd like to try out this bounty and reverse engineer a solution. It might take me a few days though, and I don't have the cameras or lenses. Is it okay if I still give this a shot, @AdrianEddy? I'm a problem solver; just give me a starting point and I'll find a way.
@ap172x Sure, feel free to take this. A lot of details are explained in this thread and on Discord under "IBIS Sony support". I attached a Ghidra project for reverse engineering. Sample Sony files are here: https://docs.gyroflow.xyz/app/readme/test-files and in the IBIS issue: https://github.com/gyroflow/gyroflow/issues/727
Feel free to ask me on Discord for any additional info.
> it might take me a few days though
That's a very optimistic assumption; I've been working on this for weeks at a time over the past year. But good luck, and I wish you a fun ride!
@AdrianEddy I'm up for the challenge and excited to get started. I've joined the gyroflow discord. Thank you for this opportunity! /attempt #44
> What's left is to make some sense of the calculated polynomials
Hey, it sounds like you have 10 points with x defined by the metadata and y = k * dy, where k is the point index starting from 0 and dy is a step size. So the first question is building a function that gives y at any point in between. That is called interpolation, and it can be done with a cubic spline in this case. A spline is a series of 3rd-degree polynomials, each fitting the gap between two consecutive points; one does not need polynomials of higher order to do so. I am sure you can find a Rust library for working with splines, and the cubic spline is one of the most common types.
Now the second question is what this curve means and how to use it. Here I can help much less, and you seem to have already found a lot. But one comment: if you're using y(x) for distortion, that's fine. But if you're using y'(x) for distortion, we may need a different interpolation method than a cubic spline, to make sure the first derivative is smooth enough.
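To illustrate the suggestion, here is a minimal hand-rolled natural cubic spline for evenly spaced knots (no external crate; whether the tabulated values belong on the X or the Y axis is the open question above, so treat the axis assignment as an assumption):

```rust
/// Natural cubic spline through the points (i * h, ys[i]) for i = 0..ys.len()-1.
/// Returns a closure that evaluates the spline at any position in [0, (ys.len()-1) * h].
fn cubic_spline(ys: Vec<f64>, h: f64) -> impl Fn(f64) -> f64 {
    assert!(ys.len() >= 2, "need at least two points");
    let n = ys.len() - 1;
    // Second derivatives m[i] at the knots; natural boundary: m[0] = m[n] = 0.
    // Interior rows: m[i-1] + 4*m[i] + m[i+1] = 6*(y[i-1] - 2*y[i] + y[i+1]) / h^2,
    // solved with the Thomas (tridiagonal) algorithm.
    let mut m = vec![0.0; n + 1];
    if n >= 2 {
        let mut c = vec![0.0; n]; // modified super-diagonal
        let mut d = vec![0.0; n]; // modified right-hand side
        for i in 1..n {
            let rhs = 6.0 * (ys[i - 1] - 2.0 * ys[i] + ys[i + 1]) / (h * h);
            let denom = 4.0 - c[i - 1]; // c[0] = 0 covers the first row
            c[i] = 1.0 / denom;
            d[i] = (rhs - d[i - 1]) / denom;
        }
        for i in (1..n).rev() {
            m[i] = d[i] - c[i] * m[i + 1];
        }
    }
    move |pos: f64| {
        // Clamp to the last segment so evaluation at the right endpoint works
        let i = ((pos / h).floor().max(0.0) as usize).min(n - 1);
        let t = pos - i as f64 * h;
        let b = (ys[i + 1] - ys[i]) / h - h * (2.0 * m[i] + m[i + 1]) / 6.0;
        ys[i] + b * t + m[i] / 2.0 * t * t + (m[i + 1] - m[i]) / (6.0 * h) * t * t * t
    }
}
```

Usage would be something like `let spline = cubic_spline(values_with_leading_zero, step); let v = spline(radius);` (hypothetical names), which could then be compared against the fitted polynomial to see where, and by how much, the two disagree.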