MediaPipeUnityPlugin
How to get Iris_depth
How can I get Iris_depth?
And how do I run it at the same time as holistic tracking?
((distance between the irises
+ distance between the 2 ears
+ distance between the 2 nostrils
) / 3)
- (the normal value of the formula when someone stands in front of the camera)
Do you want something else?
Thanks. I'm looking for a way to get const float left_iris_depth and right_iris_depth in a holistic scene.
iris_to_render_data_calculator.cc, line 188:
if (cc->Inputs().HasTag(kLeftIrisDepthTag) &&
    !cc->Inputs().Tag(kLeftIrisDepthTag).IsEmpty()) {
  const float left_iris_depth =
      cc->Inputs().Tag(kLeftIrisDepthTag).Get<float>();
  if (!std::isinf(left_iris_depth)) {
    line = "Left : ";
    absl::StrAppend(&line, ":", std::round(left_iris_depth / 10), " cm");
    lines.emplace_back(line);
  }
}
if (cc->Inputs().HasTag(kRightIrisDepthTag) &&
    !cc->Inputs().Tag(kRightIrisDepthTag).IsEmpty()) {
  const float right_iris_depth =
      cc->Inputs().Tag(kRightIrisDepthTag).Get<float>();
  if (!std::isinf(right_iris_depth)) {
    line = "Right : ";
    absl::StrAppend(&line, ":", std::round(right_iris_depth / 10), " cm");
    lines.emplace_back(line);
  }
}
For the left eye, use the distance between landmark indices 33 and 133; for the right eye, the distance between landmark indices 362 and 263.
@k0hh2 It looks like you are looking at iris_to_render_data_calculator.cc, but it is actually implemented in iris_to_depth_calculator.cc.
Currently, the sample app does not support the calculation of iris depths.
To calculate them, you need to implement it in C# or build native libraries with IrisToDepthCalculator and modify the graph config to use it.
Instructions on how to run Iris Tracking with Holistic can be found in https://github.com/homuler/MediaPipeUnityPlugin/issues/322#issuecomment-949286902.
@homuler So that's it. Thank you very much. Are there any plans to implement depth in the sample app in the future? I will try it myself, but I find it difficult.
I know how to run iris tracking with holistic. Thank you.
For the left eye, use the distance between landmark indices 33 and 133; for the right eye, the distance between landmark indices 362 and 263.
I'm not sure: is this a way to calculate iris_depth from the landmarks? Thank you.
Are there any plans to implement depth in the sample app in the future?
If you want, I'll treat this issue as a feature request. But it's not as difficult as it looks.
If we know the FoV and the focal length of the camera (these values need to be given beforehand or calculated somehow), we can calculate the focal length in pixels, since we know the resolution of the image.
Based on the landmark data, we can calculate the diameter of the iris in pixels, and if we make a reasonable assumption about the actual diameter of the iris (MediaPipe assumes that the average size of a human iris is 11.8mm), we can calculate millimeters per pixel (e.g. 11.8 / iris_in_pixels).
Finally, if you consider the focal length in pixels, you can calculate the distance from the lens to the iris (i.e. focal_length_in_pixels * mms_per_pixel).
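The three steps above can be sketched as follows. This is a minimal sketch in C++ (to match the MediaPipe calculator quoted earlier; it ports directly to C#), not the code of IrisToDepthCalculator. It assumes a pinhole camera whose horizontal FoV is known, and the function names FocalLengthInPixels and IrisDepthMm are hypothetical.

```cpp
#include <cmath>

// Average human iris diameter assumed by MediaPipe, as stated above.
constexpr double kIrisDiameterMm = 11.8;

// Focal length in pixels from the horizontal field of view (pinhole model).
// Assumption: fov_deg is the horizontal FoV of the image actually being
// processed; it must be supplied by the caller.
double FocalLengthInPixels(double fov_deg, double image_width_px) {
  const double kPi = std::acos(-1.0);
  const double half_fov_rad = fov_deg * kPi / 360.0;  // (fov / 2) in radians
  return (image_width_px / 2.0) / std::tan(half_fov_rad);
}

// Distance from the lens to the iris in millimeters:
// focal_length_in_pixels * mms_per_pixel.
double IrisDepthMm(double focal_length_px, double iris_diameter_px) {
  const double mm_per_pixel = kIrisDiameterMm / iris_diameter_px;
  return focal_length_px * mm_per_pixel;
}
```

For example, with a 90 degree horizontal FoV and a 1280 px wide image the focal length is 640 px, so an iris that measures 11.8 px across would be about 640 mm away.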
@homuler I'd like you to treat this issue as a feature request 🙏
Thank you for explaining in detail. You can find the diameter of the iris from iris_landmark and know the distance to the camera from there. I'm not sure about the FoV and focal length of the camera due to lack of study.
Hi,
First of all, thank you very much for this plugin.
I am trying to get the iris depth using your plugin. I can launch your project on a device and in the Unity Editor, and I created an empty project with your plugin to understand how the iris tracking works.
However, I am using the event OnFaceLandmarksWithIrisOutput, which gives me OutputEventArgs<NormalizedLandmarkList>, and I don't really understand what this list of values means.
If I understand your explanation here correctly, I need to calculate the diameter of the iris in pixels based on the landmark data. I think OnFaceLandmarksWithIrisOutput and the landmark data are related, but I am stuck at this point.
Can you give me some help?
Hello,
I figured everything out and want to share if anyone needs help.
First, we get a list of landmark positions for the face and iris detected in the OnFaceLandmarksWithIrisOutput event. In order to split up the large list of landmarks, we know that the first 468 values are for the face. Then there are 5 values for the first (left) iris, and the next 5 values are for the second iris. In the MediaPipe source code we can find these values here: Annotation\FaceLandmarkListWithIrisAnnotation
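As a rough illustration of that split, here is a minimal C++ sketch. The Landmark struct is a hypothetical stand-in for NormalizedLandmark, and SplitLandmarks is an illustrative name, not a plugin API; the only facts taken from the description above are the counts (468 face points, then 5 points per iris, left first).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a normalized landmark.
struct Landmark {
  float x, y, z;
};

struct FaceWithIris {
  std::vector<Landmark> face;        // first 468 landmarks
  std::vector<Landmark> left_iris;   // next 5 landmarks
  std::vector<Landmark> right_iris;  // last 5 landmarks
};

constexpr std::size_t kFaceCount = 468;
constexpr std::size_t kIrisCount = 5;

// Splits the combined list into face, left iris and right iris.
FaceWithIris SplitLandmarks(const std::vector<Landmark>& all) {
  if (all.size() < kFaceCount + 2 * kIrisCount) {
    throw std::invalid_argument("expected at least 478 landmarks");
  }
  const Landmark* p = all.data();
  FaceWithIris out;
  out.face.assign(p, p + kFaceCount);
  out.left_iris.assign(p + kFaceCount, p + kFaceCount + kIrisCount);
  out.right_iris.assign(p + kFaceCount + kIrisCount,
                        p + kFaceCount + 2 * kIrisCount);
  return out;
}
```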
To calculate the depth with these landmarks, I follow these steps:
- We get the image size from the webcamtexture.
- We calculate the width and height of the iris using our landmarks and multiply them by the image size to convert them to mm. The iris diameter is calculated by summing these values and dividing by 2 to get their average.
- For each eye, we calculate the depth:
  - Using the Unity method Vector2.Distance(), we calculate the distance between the iris center and the image center.
  - We calculate the distance between the displayed eye and the focal point with the Pythagorean theorem: sqrt(distance_eye_image^2 + focal_length^2).
  - MediaPipe assumes that the average size of the human iris is 11.8mm. With this assumption, we calculate the ratio mm per px.
  - The distance iris/camera in mm is the distance iris/camera in pixels calculated with the Pythagorean theorem above, multiplied by the ratio mm per px.
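The steps above can be sketched as follows. This is a minimal C++ sketch under stated assumptions, not the poster's actual Unity code: Vec2, IrisPx and the function names are hypothetical, the focal length is assumed to be already expressed in pixels, and the caller is assumed to know which iris landmarks are the center and the left/right/top/bottom edge points.

```cpp
#include <cmath>

struct Vec2 {
  double x, y;
};

// Euclidean distance, the equivalent of Unity's Vector2.Distance().
double Distance(Vec2 a, Vec2 b) { return std::hypot(a.x - b.x, a.y - b.y); }

// Normalized landmark coordinates (0..1) scaled by the image size give pixels.
Vec2 ToPixels(Vec2 normalized, double image_w, double image_h) {
  return {normalized.x * image_w, normalized.y * image_h};
}

// The five iris points, already converted to pixels.
struct IrisPx {
  Vec2 center, left, right, top, bottom;
};

// Average of the horizontal and vertical extents of the iris, in pixels.
double IrisDiameterPx(const IrisPx& iris) {
  return (Distance(iris.left, iris.right) + Distance(iris.top, iris.bottom)) / 2.0;
}

// Distance iris/camera in mm: the pixel distance from the focal point to the
// displayed iris (Pythagorean theorem), multiplied by the mm-per-px ratio.
double IrisDepthFromRayMm(const IrisPx& iris, double image_w, double image_h,
                          double focal_length_px) {
  const Vec2 image_center{image_w / 2.0, image_h / 2.0};
  const double offset_px = Distance(iris.center, image_center);
  const double ray_px = std::hypot(offset_px, focal_length_px);
  const double mm_per_px = 11.8 / IrisDiameterPx(iris);  // 11.8 mm iris assumption
  return ray_px * mm_per_px;
}
```

When the iris sits at the image center the result reduces to focal_length_px * mm_per_px; the Pythagorean term only matters for eyes far from the center.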
I would like to thank you again for your API and your explanations. It allowed me to do what I wanted to do.
Very helpful explanation, thank you! Just a comment: multiplying the iris size (in landmark units) by the image size (in pixels) probably gives the iris size in pixels rather than in mm, as suggested in the second point.
Hi there, I have two follow-up questions if I may:
- Is the focal length accessible through Unity or is this external information we need to input manually?
- I understand the X and Y positions of the eye's landmarks are in units of the image size. But what about the Z positions? Are they in units of the camera's focal length? I'm struggling to find that information in either MediaPipe or MediaPipeUnityPlugin, and that would be very helpful for precise depth and angle estimates.
Many thanks!!
Hi, I will try my best to answer:
- Unity doesn't give you the focal length. You can't get it without Unity either, because the phone manufacturers don't allow access to this data. You can get it if you take a picture with your phone and read the EXIF data, which has a FocalLength property. But because of inaccuracies, I created a simple application that shows me the calculated distance; I adjust the "focallength" up or down until it matches the real distance, and I take that value as focalLength.
- I also tried to understand the Z positions and tried to use them to find the distance, but it did not work. I found this paper that explains: "z-coordinates are relative to the face center of mass and are scaled proportionally to the face width".
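The manual calibration described in the first point can also be done in closed form, if one assumes the simple relation depth_mm = focal_length_px * 11.8 / iris_diameter_px. The C++ sketch below is only an illustration of that idea: CalibrationSample and CalibrateFocalLengthPx are hypothetical names, and each sample is assumed to pair a distance measured by hand (e.g. with a ruler) with the iris diameter in pixels observed at that distance.

```cpp
#include <stdexcept>
#include <vector>

// One calibration measurement (hypothetical type).
struct CalibrationSample {
  double measured_distance_mm;  // real camera-to-eye distance, measured by hand
  double iris_diameter_px;      // iris diameter in the image at that distance
};

// Solves depth = focal_px * 11.8 / iris_px for focal_px on each sample and
// returns the average over all samples.
double CalibrateFocalLengthPx(const std::vector<CalibrationSample>& samples) {
  if (samples.empty()) {
    throw std::invalid_argument("at least one calibration sample is required");
  }
  const double kIrisDiameterMm = 11.8;  // average iris size assumed by MediaPipe
  double sum = 0.0;
  for (const CalibrationSample& s : samples) {
    sum += s.measured_distance_mm * s.iris_diameter_px / kIrisDiameterMm;
  }
  return sum / static_cast<double>(samples.size());
}
```

Averaging a few samples taken at different distances should reduce the effect of landmark jitter on the estimate.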
This is the code I use to get the data I need on Android:
// Requires: using System.Linq; using UnityEngine;
// monitorSizeM and the extension helpers used below (GetFromConstantName,
// ConvertFromSizeF, ConvertFromSize, ReadVector3, ReadQuaternion) are defined elsewhere and not shown here.
var unityPlayer = new AndroidJavaClass("com.unity3d.player.UnityPlayer");
var activity = unityPlayer.GetStatic<AndroidJavaObject>("currentActivity");
var context = activity.Call<AndroidJavaObject>("getApplicationContext");
var displayMetrics = activity.Call<AndroidJavaObject>("getResources").Call<AndroidJavaObject>("getDisplayMetrics");
var dpi = new Vector2(displayMetrics.Get<float>("xdpi"), displayMetrics.Get<float>("ydpi"));
var pixels = new Vector2(displayMetrics.Get<int>("widthPixels"), displayMetrics.Get<int>("heightPixels"));
// Screen size in meters: pixels / dpi gives inches, * 25.4 gives mm, / 1000 gives m.
monitorSizeM = pixels * 25.4f / (dpi * 1000);
var cameraManager = context.Call<AndroidJavaObject>("getSystemService", context.GetStatic<string>("CAMERA_SERVICE"));
Debug.LogFormat("cameraManager : {0}", cameraManager);
var cameraIDs = cameraManager.Call<string[]>("getCameraIdList");
Debug.LogFormat("cameraIDs : {0}", string.Join(" | ", cameraIDs));
cameraCharacteristics = cameraIDs.Select((cameraID) => CameraCharacteristics.FromCameraID(cameraManager, cameraID)).ToArray();
////////////////////////////////////////////////////////////////
public static CameraCharacteristics[] cameraCharacteristics;
public enum PreferFacing { Front = 0, Back = 1, None = 2 }
public struct CameraCharacteristics
{
    public string CamerID;
    public PreferFacing preferFacing;
    public Vector2Int SensorSizePx;
    public Vector2 SensorSizeMM;
    public Vector3 LensPosition;
    public Quaternion LensRotation;
    public float[] FocalLengths;

    // Size on the sensor in mm, from a size in sensor pixels.
    public Vector2 PixelToMM(Vector2 px) => px * SensorSizeMM / SensorSizePx;
    // Size on the sensor in mm, from a size normalized to the sensor dimensions.
    public Vector2 NormalizedToMM(Vector2 px) => px * SensorSizeMM;
    public Vector2 DistanceMM(int findex, Vector2 px, Vector2 realSize) => DistanceMM(FocalLengths[findex], PixelToMM(px), realSize);
    public Vector2 NormalizedToDistanceMM(int findex, Vector2 px, Vector2 realSize) => DistanceMM(FocalLengths[findex], px * SensorSizeMM, realSize);

    // Thin-lens object distance: f + f * (real size / size of the image on the sensor).
    public static float DistanceMM(in float focalLength, in float lensSizeMM, in float realSizeMM)
    {
        return focalLength + (focalLength * realSizeMM / lensSizeMM);
    }

    public static Vector2 DistanceMM(in float focalLength, in Vector2 lensSizeMM, in Vector2 realSizeMM)
    {
        return new Vector2(DistanceMM(focalLength, lensSizeMM.x, realSizeMM.x), DistanceMM(focalLength, lensSizeMM.y, realSizeMM.y));
    }

#if UNITY_ANDROID
    // Reads the camera2 characteristics for one camera ID.
    public static CameraCharacteristics FromCameraID(AndroidJavaObject cameraManager, string cameraID)
    {
        var CameraCharacteristics = cameraManager.Call<AndroidJavaObject>("getCameraCharacteristics", cameraID);
        CameraCharacteristics data;
        data.CamerID = cameraID;
        data.preferFacing = (PreferFacing)CameraCharacteristics.GetFromConstantName<AndroidJavaObject>("LENS_FACING").Call<int>("intValue");
        data.FocalLengths = CameraCharacteristics.GetFromConstantName<float[]>("LENS_INFO_AVAILABLE_FOCAL_LENGTHS");
        data.SensorSizeMM = CameraCharacteristics.GetFromConstantName<AndroidJavaObject>("SENSOR_INFO_PHYSICAL_SIZE").ConvertFromSizeF();
        data.SensorSizePx = CameraCharacteristics.GetFromConstantName<AndroidJavaObject>("SENSOR_INFO_PIXEL_ARRAY_SIZE").ConvertFromSize();
        data.LensPosition = CameraCharacteristics.GetFromConstantName<float[]>("LENS_POSE_TRANSLATION").ReadVector3();
        data.LensRotation = CameraCharacteristics.GetFromConstantName<float[]>("LENS_POSE_ROTATION").ReadQuaternion();
        return data;
    }
#endif
}
I am currently still stuck on the iris data from the normalized landmarks. I am not sure what unit they are in. Is it a fraction of the camera sensor's pixel dimensions? It seems so, in which case we don't need SensorSizePx to calculate from the normalized iris.