
How to get Iris_depth

Open k0hh2 opened this issue 4 years ago • 14 comments

How can I get Iris_depth?

And how do I run it at the same time as holistic tracking?

k0hh2 avatar Oct 24 '21 01:10 k0hh2

((distance between the irises
+ distance between the 2 ears
+ distance between the 2 nostrils
) / 3)
- (the normal value of this formula when someone stands in front of the camera)

Do you want something else?

ehsanwwe avatar Nov 02 '21 11:11 ehsanwwe

Thanks. I'm looking for a way to get const float left_iris_depth and right_iris_depth in a holistic scene.

iris_to_render_data_calculator.cc, line 188:


  if (cc->Inputs().HasTag(kLeftIrisDepthTag) &&
      !cc->Inputs().Tag(kLeftIrisDepthTag).IsEmpty()) {
    const float left_iris_depth =
        cc->Inputs().Tag(kLeftIrisDepthTag).Get<float>();
    if (!std::isinf(left_iris_depth)) {
      line = "Left : ";
      absl::StrAppend(&line, ":", std::round(left_iris_depth / 10), " cm");
      lines.emplace_back(line);
    }
  }
  if (cc->Inputs().HasTag(kRightIrisDepthTag) &&
      !cc->Inputs().Tag(kRightIrisDepthTag).IsEmpty()) {
    const float right_iris_depth =
        cc->Inputs().Tag(kRightIrisDepthTag).Get<float>();
    if (!std::isinf(right_iris_depth)) {
      line = "Right : ";
      absl::StrAppend(&line, ":", std::round(right_iris_depth / 10), " cm");
      lines.emplace_back(line);
    }
  }

k0hh2 avatar Nov 03 '21 02:11 k0hh2

For the left eye, use the distance between landmark indices 33 and 133; for the right eye, the distance between landmark indices 362 and 263.

ehsanwwe avatar Nov 06 '21 10:11 ehsanwwe

@k0hh2 It looks like you are looking at iris_to_render_data_calculator.cc, but it is actually implemented in iris_to_depth_calculator.cc.

Currently, the sample app does not support the calculation of iris depths. To calculate them, you need to implement it in C# or build native libraries with IrisToDepthCalculator and modify the graph config to use it. Instructions on how to run Iris Tracking with Holistic can be found in https://github.com/homuler/MediaPipeUnityPlugin/issues/322#issuecomment-949286902.

homuler avatar Nov 06 '21 13:11 homuler

@homuler So that's it. Thank you very much. Are there any plans to implement depth in the sample app in the future? I will try it myself, but I find it difficult.

I know how to run iris tracking together with holistic. Thank you.

k0hh2 avatar Nov 07 '21 05:11 k0hh2

For the left eye, use the distance between landmark indices 33 and 133; for the right eye, the distance between landmark indices 362 and 263.

I'm not sure: is this a way to calculate iris_depth from the landmarks? Thank you.

k0hh2 avatar Nov 07 '21 05:11 k0hh2

Are there any plans to implement depth in the sample app in the future?

If you want, I'll treat this issue as a feature request. But it's not so difficult as it looks.

If we know the FoV and the focal length of the camera (these values need to be given beforehand or calculated somehow), we can calculate the focal length in pixels, since we know the resolution of the image. Based on the landmark data, we can also calculate the diameter of the iris in pixels, and if we make a reasonable assumption about the actual diameter of the iris (MediaPipe assumes that the average size of a human iris is 11.8mm), we can calculate millimeters per pixel (e.g. 11.8 / iris_in_pixels).

Finally, if you consider the focal length in pixels, you can calculate the distance from the lens to the iris (i.e. focal_length_in_pixels * mms_per_pixel).
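A minimal C++ sketch of this calculation may help. It assumes a simple pinhole camera and derives the focal length in pixels from the horizontal FoV and the image width; that derivation, and the function names FocalLengthPx and IrisDepthMm, are illustrative assumptions and not part of MediaPipe or the plugin.

```cpp
#include <cassert>
#include <cmath>

// Average human iris diameter assumed by MediaPipe, in millimeters.
constexpr double kIrisDiameterMm = 11.8;

// Assumption: pinhole model, so f_px = (width / 2) / tan(fov / 2).
// horizontal_fov_rad is the horizontal field of view in radians.
double FocalLengthPx(double horizontal_fov_rad, double image_width_px) {
  assert(horizontal_fov_rad > 0.0 && image_width_px > 0.0);
  return (image_width_px / 2.0) / std::tan(horizontal_fov_rad / 2.0);
}

// Distance from the lens to the iris in millimeters:
// focal_length_in_pixels * mms_per_pixel, as described above.
double IrisDepthMm(double focal_length_px, double iris_diameter_px) {
  assert(iris_diameter_px > 0.0);
  const double mm_per_pixel = kIrisDiameterMm / iris_diameter_px;
  return focal_length_px * mm_per_pixel;
}
```

For example, with a 90 degree horizontal FoV and a 1000 px wide image, the focal length is about 500 px, so an iris that is 11.8 px wide would be roughly 500 mm from the lens.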

homuler avatar Nov 07 '21 11:11 homuler

@homuler Please treat this issue as a feature request 🙏

Thank you for explaining in detail. So you can find the diameter of the iris from iris_landmark and work out the distance to the camera from there. I'm not sure about the FoV and focal length of the camera, as I haven't studied them yet.

k0hh2 avatar Nov 09 '21 03:11 k0hh2

Hi,

First of all, thank you very much for this plugin.

I am trying to get the iris depth using your plugin. I can launch your project on a device and in the Unity Editor, and I created an empty project with your plugin to understand how the iris tracking works.

However, I am using the OnFaceLandmarksWithIrisOutput event, which gives me OutputEventArgs<NormalizedLandmarkList>, and I don't really understand what this list of values means. If I understand your explanation here correctly, I need to calculate the diameter of the iris in pixels based on the landmark data. I think OnFaceLandmarksWithIrisOutput and the landmark data are related, but I am stuck at this point.

Can you give me some help?

Morgane-Tilak avatar Oct 07 '22 13:10 Morgane-Tilak

Hello,

I figured everything out and want to share if anyone needs help.

First, we get a list of landmark positions for the face and iris detected in the OnFaceLandmarksWithIrisOutput event. In order to split up the large list of landmarks, we know that the first 468 values are for the face. Then there are 5 values for the first (left) iris, and the next 5 values are for the second iris. In the MediaPipe source code we can find these values here: Annotation\FaceLandmarkListWithIrisAnnotation
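The split described above can be sketched in C++ as follows. The Landmark and FaceWithIris structs and the SplitLandmarks function are hypothetical stand-ins for illustration; only the counts (468 face landmarks, then 5 per iris) come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a normalized landmark.
struct Landmark {
  float x, y, z;
};

struct FaceWithIris {
  std::vector<Landmark> face;
  std::vector<Landmark> left_iris;   // first iris in the list
  std::vector<Landmark> right_iris;  // second iris in the list
};

constexpr int kFaceCount = 468;  // face landmarks come first
constexpr int kIrisCount = 5;    // then 5 landmarks per iris

// Splits the combined list into face, first iris and second iris.
FaceWithIris SplitLandmarks(const std::vector<Landmark>& all) {
  assert(all.size() == static_cast<std::size_t>(kFaceCount + 2 * kIrisCount));
  const auto first = all.begin();
  FaceWithIris out;
  out.face.assign(first, first + kFaceCount);
  out.left_iris.assign(first + kFaceCount, first + kFaceCount + kIrisCount);
  out.right_iris.assign(first + kFaceCount + kIrisCount, all.end());
  return out;
}
```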

To calculate the depth with these landmarks, I follow these steps:

  • We get the image size from the webcamtexture.

  • We calculate the width and height of the iris using our landmarks and multiply them by the image size to convert them to mm. The iris diameter is the average of these two values (their sum divided by 2).

  • For each eye, we calculate the depth:

  1. Using the Unity method Vector2.Distance(), we calculate the distance between the iris center and the image center.
  2. We calculate the distance between the displayed eye and the focal point with the Pythagorean theorem: sqrt(distance_eye_image^2 + focal_length^2).
  3. MediaPipe assumes that the average size of the human iris is 11.8mm. With this assumption, we calculate the ratio of mm per px.

The iris-to-camera distance in mm is the iris-to-camera distance in pixels, calculated with the Pythagorean theorem above, multiplied by the ratio of mm per px.
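As a rough illustration, the steps above could look like this in C++. This is a sketch under stated assumptions: the focal length is already known in pixels, landmark x and y are fractions of the image width and height, and the caller picks which four iris landmarks are the left, right, top and bottom edges. Point2 and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Point2 {
  double x, y;
};

// Average human iris diameter assumed by MediaPipe, in millimeters.
constexpr double kIrisDiameterMm = 11.8;

double Distance(Point2 a, Point2 b) { return std::hypot(a.x - b.x, a.y - b.y); }

// Converts a normalized landmark (0..1) to pixel coordinates.
Point2 ToPixels(Point2 n, double width_px, double height_px) {
  return {n.x * width_px, n.y * height_px};
}

// Average of the iris width and height, in pixels.
double IrisDiameterPx(Point2 left, Point2 right, Point2 top, Point2 bottom,
                      double width_px, double height_px) {
  const double w = Distance(ToPixels(left, width_px, height_px),
                            ToPixels(right, width_px, height_px));
  const double h = Distance(ToPixels(top, width_px, height_px),
                            ToPixels(bottom, width_px, height_px));
  return (w + h) / 2.0;
}

// Steps 1 to 3: offset from the image center, Pythagorean theorem with the
// focal length (both in pixels), then scale by mm per px.
double IrisDistanceMm(Point2 iris_center, double iris_diameter_px,
                      double focal_length_px, double width_px, double height_px) {
  assert(iris_diameter_px > 0.0);
  const Point2 center_px = ToPixels(iris_center, width_px, height_px);
  const Point2 image_center{width_px / 2.0, height_px / 2.0};
  const double offset_px = Distance(center_px, image_center);
  const double ray_px = std::hypot(offset_px, focal_length_px);
  return ray_px * (kIrisDiameterMm / iris_diameter_px);
}
```

When the iris sits exactly at the image center, the offset is zero and the result reduces to focal_length_px multiplied by mm per px.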

I would like to thank you again for your API and your explanations. It allowed me to do what I wanted to do.

Morgane-Tilak avatar May 26 '23 08:05 Morgane-Tilak

Very helpful explanation, thank you! Just a comment: multiplying the iris size (in landmark units) by the image size (in pixels) probably gives the iris size in pixels rather than in mm, as suggested in the second point.

mariomorvan avatar Aug 29 '23 14:08 mariomorvan

Hi there, I have two follow-up questions if I may:

  • Is the focal length accessible through Unity or is this external information we need to input manually?

  • I understand the X and Y positions of the eye's landmarks are in units of the image size. But what about the Z positions? Are they in units of the camera's focal length? I'm struggling to find that information in either MediaPipe or MediaPipeUnityPlugin, and that would be very helpful for precise depth and angle estimates.

Many thanks!!

mariomorvan avatar Aug 29 '23 17:08 mariomorvan

Hi, I will try my best to answer:

  • Unity doesn't give you the focal length. You can't get it outside Unity either, because the phone manufacturers don't allow access to this data. You can take a picture with your phone and read its EXIF data, which has a FocalLength property. But because of inaccuracies, I created a simple application that displays the calculated distance, adjusted the focal length up or down until the calculated distance matched the real distance, and took that value as focalLength.
  • I also tried to understand the Z positions and to use them to find the distance, but it did not work. I found this paper, which explains that "z-coordinates are relative to the face center of mass and are scaled proportionally to the face width".
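The manual tuning described in the first point can also be approximated in closed form. This is a different technique from the trial-and-error adjustment above, and a sketch only: it assumes the iris is near the image center and that one measurement at a known distance is available. Both function names are hypothetical.

```cpp
#include <cassert>

// Average human iris diameter assumed by MediaPipe, in millimeters.
constexpr double kIrisDiameterMm = 11.8;

// Inverts distance_mm = focal_length_px * (11.8 / iris_diameter_px):
// measure the iris in pixels at a known real distance, then solve for
// the focal length in pixels.
double CalibrateFocalLengthPx(double known_distance_mm, double iris_diameter_px) {
  assert(known_distance_mm > 0.0 && iris_diameter_px > 0.0);
  return known_distance_mm * iris_diameter_px / kIrisDiameterMm;
}

// Converts a focal length in mm (for example from EXIF) to pixels.
// Assumption: the image spans the full sensor width, with no crop.
double FocalLengthMmToPx(double focal_length_mm, double sensor_width_mm,
                         double image_width_px) {
  assert(sensor_width_mm > 0.0);
  return focal_length_mm * image_width_px / sensor_width_mm;
}
```

Averaging the calibrated value over several known distances would likely reduce the effect of landmark jitter.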

Morgane-Tilak avatar Aug 30 '23 09:08 Morgane-Tilak

This is the code I use to get the various data needed on Android:


		// Requires "using System.Linq;" and "using UnityEngine;". Android only.
		var unityPlayer = new AndroidJavaClass("com.unity3d.player.UnityPlayer");
		var activity = unityPlayer.GetStatic<AndroidJavaObject>("currentActivity");
		var context = activity.Call<AndroidJavaObject>("getApplicationContext");
		
		var displayMetrics = activity.Call<AndroidJavaObject>("getResources").Call<AndroidJavaObject>("getDisplayMetrics");

		var dpi = new Vector2(displayMetrics.Get<float>("xdpi"),displayMetrics.Get<float>("ydpi"));
		var pixels = new Vector2(displayMetrics.Get<int>("widthPixels"),displayMetrics.Get<int>("heightPixels"));

		// Physical screen size in meters: pixels / dpi is inches, 25.4 mm per inch, 1000 mm per meter.
		monitorSizeM = pixels * 25.4f / (dpi * 1000);

		var cameraManager = context.Call<AndroidJavaObject>("getSystemService",context.GetStatic<string>("CAMERA_SERVICE"));
		Debug.LogFormat("cameraManager : {0}",cameraManager);

		var cameraIDs = cameraManager.Call<string[]>("getCameraIdList");
		Debug.LogFormat("cameraIDs : {0}",string.Join(" | ",cameraIDs));

		// Read the characteristics of every camera on the device.
		cameraCharacteristics = cameraIDs.Select((cameraID) => CameraCharacteristics.FromCameraID(cameraManager,cameraID)).ToArray();

////////////////////////////////////////////////////////////////

	public static CameraCharacteristics[] cameraCharacteristics;
	public enum PreferFacing { Front = 0,Back = 1,None = 2 }
	public struct CameraCharacteristics
	{
		public string CamerID;
		public PreferFacing preferFacing;
		public Vector2Int SensorSizePx;
		public Vector2 SensorSizeMM;
		public Vector3 LensPosition;
		public Quaternion LensRotation;
		public float[] FocalLengths;

		public Vector2 PixelToMM(Vector2 px) => px * SensorSizeMM / SensorSizePx;
		public Vector2 NormalizedToMM(Vector2 px) => px * SensorSizeMM;

		public Vector2 DistanceMM(int findex,Vector2 px,Vector2 realSize) => DistanceMM(FocalLengths[findex],PixelToMM(px),realSize);
		public Vector2 NormalizedToDistanceMM(int findex,Vector2 px,Vector2 realSize) => DistanceMM(FocalLengths[findex],px * SensorSizeMM,realSize);

		public static float DistanceMM(in float focalLength,in float lensSizeMM,in float realSizeMM)
		{
			return focalLength + (focalLength * realSizeMM / lensSizeMM);
		}
		
		public static Vector2 DistanceMM(in float focalLength,in Vector2 lensSizeMM,in Vector2 realSizeMM)
		{
			return new Vector2(DistanceMM(focalLength,lensSizeMM.x,realSizeMM.x),DistanceMM(focalLength,lensSizeMM.y,realSizeMM.y));
		}

#if UNITY_ANDROID
		// GetFromConstantName, ConvertFromSizeF, ConvertFromSize, ReadVector3 and ReadQuaternion
		// are helper extension methods that are not shown here (they are not part of the Unity API).
		public static CameraCharacteristics FromCameraID(AndroidJavaObject cameraManager,string cameraID)
		{
			// Named "characteristics" so the local does not shadow the struct type name.
			var characteristics = cameraManager.Call<AndroidJavaObject>("getCameraCharacteristics",cameraID);

			CameraCharacteristics data;

			data.CamerID = cameraID;
			data.preferFacing = (PreferFacing)characteristics.GetFromConstantName<AndroidJavaObject>("LENS_FACING").Call<int>("intValue");
			data.FocalLengths = characteristics.GetFromConstantName<float[]>("LENS_INFO_AVAILABLE_FOCAL_LENGTHS");
			data.SensorSizeMM = characteristics.GetFromConstantName<AndroidJavaObject>("SENSOR_INFO_PHYSICAL_SIZE").ConvertFromSizeF();
			data.SensorSizePx = characteristics.GetFromConstantName<AndroidJavaObject>("SENSOR_INFO_PIXEL_ARRAY_SIZE").ConvertFromSize();
			data.LensPosition = characteristics.GetFromConstantName<float[]>("LENS_POSE_TRANSLATION").ReadVector3();
			data.LensRotation = characteristics.GetFromConstantName<float[]>("LENS_POSE_ROTATION").ReadQuaternion();

			return data;
		}
#endif
	}

I am still stuck on the iris data from the normalized landmarks. I am not sure what unit they are in: is it a fraction of the camera's pixel sensor? It seems so, in which case we don't need SensorSizePx to calculate from the normalized iris.
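For comparison, here is the same distance formula as a small C++ sketch, under two loud assumptions that may not hold on a real device: the normalized landmark values are fractions of the image width, and the image covers the full physical sensor width with no crop or scaling. The function names are hypothetical.

```cpp
#include <cassert>

// Assumption: normalized_size is a fraction (0..1) of the image width, and
// the image spans the whole sensor, so the size on the sensor in mm is the
// fraction multiplied by the physical sensor width.
double NormalizedToSensorMm(double normalized_size, double sensor_width_mm) {
  return normalized_size * sensor_width_mm;
}

// Thin-lens object distance, the same expression as DistanceMM above:
// f + f * real_size / size_on_sensor (all lengths in mm).
double ThinLensDistanceMm(double focal_length_mm, double size_on_sensor_mm,
                          double real_size_mm) {
  assert(size_on_sensor_mm > 0.0);
  return focal_length_mm + focal_length_mm * real_size_mm / size_on_sensor_mm;
}
```

For instance, an iris spanning 1% of a 6.4 mm wide sensor is 0.064 mm on the sensor; with a 4 mm lens and an 11.8 mm real iris this gives 4 + 4 * 11.8 / 0.064 = 741.5 mm. If the camera preview is cropped or scaled relative to the sensor, the first assumption fails and the result will be off by that factor.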

Thaina avatar Dec 03 '23 18:12 Thaina