🐛 Front camera doesn't recognize text on iOS
What were you trying to do?
Hi, I am implementing text recognition using VisionCamera's frame processors, with the aarongrider/vision-camera-ocr plugin.
Reproducible Code
```jsx
import React, {useEffect, useState} from 'react';
import 'react-native-reanimated';
import {Text, View, PixelRatio, TouchableHighlight} from 'react-native';
import {scanOCR} from 'vision-camera-ocr';
import {
  Camera,
  useCameraDevices,
  useFrameProcessor,
} from 'react-native-vision-camera';
import {runOnJS} from 'react-native-reanimated';

const ScanView = ({navigation}) => {
  const [hasPermission, setHasPermission] = useState(false);
  const [cameraPosition, setCameraPosition] = useState('back');
  const [ocr, setOcr] = useState();
  const [pixelRatio, setPixelRatio] = useState(1);
  const devices = useCameraDevices();
  const device = devices[cameraPosition];

  const frameProcessor = useFrameProcessor(frame => {
    'worklet';
    const data = scanOCR(frame);
    runOnJS(setOcr)(data?.result?.text);
  }, []);

  const onSwitchCamera = () => {
    setCameraPosition(p => (p === 'back' ? 'front' : 'back'));
  };

  useEffect(() => {
    (async () => {
      const status = await Camera.requestCameraPermission();
      setHasPermission(status === 'authorized');
    })();
  }, []);

  return device !== undefined && hasPermission ? (
    <View style={{flex: 1}}>
      <Camera
        style={{flex: 1}}
        frameProcessor={frameProcessor}
        device={device}
        isActive={true}
        frameProcessorFps={5}
        onLayout={event => {
          setPixelRatio(
            event.nativeEvent.layout.width /
              PixelRatio.getPixelSizeForLayoutSize(
                event.nativeEvent.layout.width,
              ),
          );
        }}
      />
      <TouchableHighlight
        onPress={onSwitchCamera}
        underlayColor="#0877a1"
        style={{backgroundColor: '#005F83', height: 60}}>
        <Text
          style={{
            color: 'white',
            textAlign: 'center',
            lineHeight: 60,
            fontSize: 24,
          }}>
          Switch Camera
        </Text>
      </TouchableHighlight>
    </View>
  ) : (
    <View />
  );
};

export default ScanView;
```
What happened instead?
At first, the rear camera only recognized text in landscape mode, but after I upgraded iOS to 15.5 it also worked in portrait mode on the rear camera (see https://github.com/aarongrider/vision-camera-ocr/issues/6). However, the front camera still produces the following error and does not recognize any text:
2022-06-23 16:54:46.978811+0900 OCRTest[1530:771941] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:47.212998+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:47.245639+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:47.284475+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:47.317643+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
I saw a similar issue on the VisionCamera GitHub, but the VisionCamera side said it was a plugin problem, while the OCR plugin side says to contact VisionCamera.
https://github.com/aarongrider/vision-camera-ocr/issues/6

What am I supposed to do?
Relevant log output
2022-06-23 16:54:42.313559+0900 OCRTest[1530:771946] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.347084+0900 OCRTest[1530:771946] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.380720+0900 OCRTest[1530:802277] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.413700+0900 OCRTest[1530:802277] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.441685+0900 OCRTest[1530:802277] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.474329+0900 OCRTest[1530:775074] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.707918+0900 OCRTest[1530:773694] [native] VisionCamera.invokeOnFrameProcessorPerformanceSuggestionAvailable(currentFps:suggestedFps:): Frame Processor Performance Suggestion available!
2022-06-23 16:54:42.741557+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.775176+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.808394+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.847382+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.880107+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.913682+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.941626+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:42.975107+0900 OCRTest[1530:773694] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
2022-06-23 16:54:43.209156+0900 OCRTest[1530:775074] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
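
(Note: the "Frame Processor Performance Suggestion available" line above hints that scanOCR cannot keep up at 5 fps on this device. A minimal sketch of feeding that suggestion back into the component from the repro code, assuming this 2.x release exposes the `onFrameProcessorPerformanceSuggestionAvailable` prop that the native log refers to; permission handling is omitted, see the repro code above.)

```jsx
import React, {useState} from 'react';
import {Text, View} from 'react-native';
import {runOnJS} from 'react-native-reanimated';
import {scanOCR} from 'vision-camera-ocr';
import {
  Camera,
  useCameraDevices,
  useFrameProcessor,
} from 'react-native-vision-camera';

const TunedScanView = () => {
  const [ocr, setOcr] = useState('');
  // Start at the 5 fps used in the repro code and let the native suggestion adjust it.
  const [processorFps, setProcessorFps] = useState(5);
  const devices = useCameraDevices();
  const device = devices.back;

  const frameProcessor = useFrameProcessor(frame => {
    'worklet';
    const data = scanOCR(frame);
    runOnJS(setOcr)(data?.result?.text ?? '');
  }, []);

  if (device == null) {
    return null;
  }
  return (
    <View style={{flex: 1}}>
      <Camera
        style={{flex: 1}}
        device={device}
        isActive={true}
        frameProcessor={frameProcessor}
        frameProcessorFps={processorFps}
        onFrameProcessorPerformanceSuggestionAvailable={suggestion => {
          // Adopt whatever FPS the native side suggests, so scanOCR stops
          // exceeding its per-frame time budget (the dropped-frame warnings above).
          setProcessorFps(suggestion.suggestedFrameProcessorFps);
        }}
      />
      <Text>{ocr}</Text>
    </View>
  );
};

export default TunedScanView;
```

Lowering frameProcessorFps, manually or via the suggestion, reduces how often scanOCR runs per second and usually silences the dropped-frame warnings, although it does not by itself explain why the front camera returns no text.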
Device
iPhone 8 Plus (iOS 15.5)
VisionCamera Version
2.13.5
Additional information
- [ ] I am using Expo
- [X] I have read the Troubleshooting Guide
- [X] I agree to follow this project's Code of Conduct
- [X] I searched for similar issues in this repository and found none.
Hmm, maybe you chose a format that is not supported? Try selecting a different format and log the colorspace as well as the video dimensions
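
(For the "log the video dimensions" part, a minimal sketch of a drop-in replacement for the frame processor in the repro code above, assuming the 2.x Frame exposes `width`, `height`, `bytesPerRow` and `planesCount`. I am not aware of a JS-side colorspace field in 2.x, so the colorspace itself would have to be logged from native code.)

```jsx
const frameProcessor = useFrameProcessor(frame => {
  'worklet';
  // Dump what the frame processor actually receives, throttled by frameProcessorFps,
  // so an unsupported format/resolution on the front camera would show up here.
  console.log(
    `frame: ${frame.width}x${frame.height}, ` +
      `bytesPerRow=${frame.bytesPerRow}, planes=${frame.planesCount}`,
  );
  const data = scanOCR(frame);
  runOnJS(setOcr)(data?.result?.text);
}, []);
```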
@mrousavy I don't know why, but after a long time I tried another device model and it worked. However, when text is recognized by the front camera, it now comes back reversed (e.g. ">>>>wOaHMMO3Ya>>"). Is there any way to make the text human-readable (e.g. "PMKORJU<<BYEON<<<")?
Ah, you need to flip the frame before passing it to the frame processor in that case. That's because the selfie camera is flipped by VisionCamera to match the style of Snapchat, Insta and TikTok (what people are used to)
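
(The proper fix is to mirror the pixel buffer inside the native OCR plugin before recognition, which requires native code. As a crude JS-only stopgap, and only under the assumption that the front-camera result merely comes back in reversed character order as in the ">>>>wOaHMMO3Ya>>" example, the recognized string could be un-mirrored after the fact; genuinely mirrored glyph shapes may still be misrecognized. `unmirrorText` and `MIRRORED_CHARS` below are hypothetical helpers, not part of either library.)

```jsx
// Hypothetical mapping of characters whose mirror image is another character;
// extend as needed for the symbols that matter in your text (e.g. MRZ fillers).
const MIRRORED_CHARS = {'<': '>', '>': '<'};

// Reverse the reading order of front-camera OCR results and swap mirrored symbols.
const unmirrorText = (text, cameraPosition) => {
  if (cameraPosition !== 'front' || !text) {
    return text;
  }
  return text
    .split('')
    .reverse()
    .map(ch => MIRRORED_CHARS[ch] || ch)
    .join('');
};

// Usage: apply on the JS side, e.g. when rendering the result from the repro code:
// <Text>{unmirrorText(ocr, cameraPosition)}</Text>
```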
@mrousavy Can you provide a code snippet on how to flip the frame before handing it over to the frame processor?
Closing as this is a stale issue - this might have been fixed with the full rewrite in VisionCamera V3 (🥳) - if not, please create a new issue.
(also this is not really related to VisionCamera - your model might not be trained on mirrored data, or the pixelFormat is not supported (YUV_420-8bit vs YUV_420-10bit HDR))
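
(For anyone checking the pixel-format theory on V3: a minimal sketch, assuming the installed 3.x release exposes a `pixelFormat` prop on `<Camera>` and a `pixelFormat` field on the Frame; verify both against the V3 docs for your exact version.)

```jsx
const frameProcessor = useFrameProcessor(frame => {
  'worklet';
  // Log the buffer the plugin would receive, to confirm it is plain YUV
  // rather than an unsupported 10-bit HDR format.
  console.log(`got ${frame.width}x${frame.height} ${frame.pixelFormat} frame`);
  // ...run OCR here once the format looks right
}, []);

<Camera
  style={{flex: 1}}
  device={device}
  isActive={true}
  pixelFormat="yuv" // request an 8-bit YUV buffer instead of 10-bit HDR
  frameProcessor={frameProcessor}
/>;
```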