camerakit-android icon indicating copy to clipboard operation
camerakit-android copied to clipboard

Camera preview works only first time,

Open darwinfrancis opened this issue 5 years ago • 21 comments

Is this a bug report?

Yes

Have you read the Contributing Guidelines?

Yes

Environment

Android

Steps to Reproduce

(Write your steps here:)

  1. Create fragment A and Integrate camera kit to fragment A

  2. open fragment A

  3. move to Fragment B on button click

  4. then come back to fragment A

    implementation 'com.camerakit:camerakit:1.0.0-beta3.10' implementation 'com.camerakit:jpegkit:0.1.0' implementation 'org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.3.0' implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-android:1.0.0'

Expected Behavior

The camera preview should work as normal.

(Write what you thought would happen.)

The issue may be related to error while closing the camera.

Actual Behavior

The camera preview not working, it show black screen only. There are no error or warning logs on log cat.

The below scenario describes the issue in detail I opened the camera kit fragment and preview works fine then come back to home screen and opened other fragment say Fragment B (which contains a different camera implementation not the camera kit) then it show the following error

`W/CameraBase: An error occurred while connecting to camera 0: Status(-8): '8: connectHelper:1677: Too many cameras already open, cannot open camera "0"'

W/System.err: java.lang.RuntimeException: Fail to connect to camera service`

Reproducible Demo

(Paste the link to an example project and exact instructions to reproduce the issue.)

code.zip

darwinfrancis avatar Feb 19 '19 11:02 darwinfrancis

I have been running into what may be the exact same problem - when I use it in my app the screen goes blank after rotation and then I don't get it back until I restart the app. I've documented the problem over on the ( apparently entirely unattended ) Spectrum Chat and got as far as the camera not being released.

Specifically what I found was that onPause and onStop were called but stopPreview and closeCamera in CameraPreview.kt did not appear to be hit at all. I see the problem both in emulators and on my phone, all of which on a new enough version of Android to be using the Camera2 api.

I suspect the root of this problem may be the same as #468, in which case the fix for that might be useful here.

glenatron avatar Feb 19 '19 15:02 glenatron

Adding some logging in CameraPreview.kt to onPause and onStop the pattern is the same:

    fun pause() {
        Log.i("CameraPreview.kt", "Pause called");
        GlobalScope.launch(cameraDispatcher) {
            runBlocking {
                Log.i("CameraPreview.kt", "Pause initiated, stop preview state is "+lifecycleState)
                lifecycleState = LifecycleState.PAUSED
                stopPreview()
            }
        }
    } 

In my logs I see this:

2019-02-19 16:40:55.509 24555-24555/com.myapp.app I/CameraFragment: Initiate Shutdown 2019-02-19 16:40:55.509 24555-24555/com.myapp.app I/CameraPreview.kt: Pause called 2019-02-19 16:40:55.509 24555-24555/com.myapp.app I/CameraPreview.kt: Stop called

Then nothing more. I don't know a lot about Kotlin, but it looks like the inner call is never actually happening.

glenatron avatar Feb 19 '19 16:02 glenatron

Interestingly nothing inside the GlobalScope.launch call seems to happen, but if I put the content of the stopPreview() function into the pause function, not only does it start to work but from that point forward the GlobalScope.launch calls also work.

fun pause() {
    Log.i("CameraPreview.kt", "Pause called");
    GlobalScope.launch(cameraDispatcher) {
        Log.i(TAG, "Pause scope launched, runblocking call ahead.")
        runBlocking {
            Log.i("CameraPreview.kt", "Pause initiated, stop preview state is "+lifecycleState)
            lifecycleState = LifecycleState.PAUSED
            stopPreview()
        }
    }
    cameraState = CameraState.PREVIEW_STOPPING
    cameraApi.stopPreview()
}

Obviously this is not a practical fix for the problem, I'm sure those co-routines exist for a reason, but someone with a little more Kotlin savvy might find it a useful guide to what is going wrong.

glenatron avatar Feb 20 '19 16:02 glenatron

as per my understanding that is coroutines implementation that is being caused of this unusual behaviour.

UmarBhutta avatar Feb 20 '19 17:02 UmarBhutta

Removing runBlocking as recommended on SO doesn't seem to have a terrible effect, certainly no worse.

glenatron avatar Feb 20 '19 23:02 glenatron

Same problem Camera open only first time and

private View.OnTouchListener onTouchCaptureImage = new View.OnTouchListener() { @Override public boolean onTouch(final View view, MotionEvent motionEvent) { switch (motionEvent.getAction()) { case MotionEvent.ACTION_DOWN: { handleViewTouchFeedback(view, motionEvent); LoggerUtil.logItem("Touch down"); cameraKitView.captureImage((cameraKitView, capturedImage) -> { LoggerUtil.logItem("Capture Camera");
imageCaptured(capturedImage); });

                break;
            }

            case MotionEvent.ACTION_UP: {
                LoggerUtil.logItem("Touch up");
                handleViewTouchFeedback(view, motionEvent);
                break;
            }
        }
        return true;
    }
};

cameraKitView.captureImage((cameraKitView, capturedImage) -> { LoggerUtil.logItem("Capture Camera");
imageCaptured(capturedImage); });

This method on work in ontouch listner

VivekRai29 avatar Mar 14 '19 08:03 VivekRai29

I have the same problem as well.

vladimirpetrovski avatar Apr 17 '19 08:04 vladimirpetrovski

looking another component or solution.

tired of this

alemdg avatar Jul 16 '19 18:07 alemdg

Any update on this?

lighthou avatar Aug 19 '19 02:08 lighthou

To add to this conversation, I've been observing this behaviour as well. I think I might have some additional useful information.

I've been observing black camera previews after granting the camera permission from a dialog. This...

  1. ...only happens on some devices. I've observed it happening on an AOS 23 Samsung, but not an AOS 28 Pixel.
  2. ...is resolved after restarting the application once permissions have been granted on the affected devices.

The cause seems to be an issue with the coroutines setup, like @glenatron highlighted. The startPreview() method is not being called within the coroutine block. Since the problem is resolved after application restart (i.e. process death) my intuition is that the root problem might be to do with the threading setup at CameraPreview#79 with newSingleThreadContext("CAMERA").

I'll continue playing around and report back.

AlgoRyan avatar Sep 11 '19 04:09 AlgoRyan

So, after some more debugging I've verified my initial instinct.

After granting the camera permission from a permissions dialog and then backgrounding and foregrounding the application a number of times the CameraPreview#cameraDispatcher looks like this...

CameraPreview Frozen Executor

After then restarting the application, reopening the camera screen, and then similarly backgrounding and foregrounding the application a number of times the CameraPreview#cameraDispatcher looks like this...

CameraPreview Working Executor

For some reason in the first scenario the executor is freezing and failing to execute any tasks, resulting in CameraPreview#startPreview() not being called and thus the preview not rendering.

To clarify, this is happening on a Samsung Galaxy S7 running Android 6.0.1 with the com.camerakit:camerakit:1.0.0-beta3.11 version set up very much according to the documented guides here on the repo.

Could you please investigate and prioritise a fix for this? Granting camera permissions is the initial flow for any camera screen, and failing here creates a bad first impression for users.

AlgoRyan avatar Sep 11 '19 06:09 AlgoRyan

The scenario described above by @AlgoRyan is the result of a deadlock against the @Synchronized open(facing: CameraFacing) method on the Camera2 class. I'm still mentally scaffolding the coroutines, handlers, and @Synchronized usages over ManageCameraApi and Camera2 to determine as to why it's occurring, but the Camera2 instance's monitor appears to be locked at the point where open() is called and so the CAMERA thread blocks on that call as the monitor is already acquired and doesn't ever appear to be released.

UPDATE: I am mistaken. Realized that the debugger was just flat out stopping (as in disconnecting) at the point where I hit these calls. Some reinstallation of lldb and adb fixed the issues. There is a deadlock though, see below.

nullptr2this avatar Sep 12 '19 17:09 nullptr2this

Some more information. The TL;DR of it is that coroutines are being suspended and there is no guarantee that they will be resumed, which is causing the single threaded executor that backs the coroutine dispatcher to block forever and queue up any work scheduled thereafter. That is why so many of us do not see startPreview called in CameraPreview.resume()

I believe this ultimately has to do with the way in which the camera is released and some missing handling in how the camera is acquired. Here is what I see happening when the blank preview screen issue occurs. All code references are for SHA ce5f8ca720e7849521f749e26a3de6f5a2c9e7d4 which is the v1.0.0-beta-3.11 tag.

CameraPreview has start(), resume(), stop(), and pause() methods which launch coroutines from a global scope and immediately delegate to private helper methods.

Screen Shot 2019-09-12 at 2 59 44 PM

Those private helper methods are suspending methods which immediately suspend the current co-routine and capture a reference to resulting Continuation.

Screen Shot 2019-09-12 at 3 01 16 PM

Calls are then delegated down the stack into the ManagedCameraApi and Camera API classes where handlers are used to run operations against the camera manager.

ManagedCameraApi Screen Shot 2019-09-12 at 3 08 33 PM

CameraApi2 Screen Shot 2019-09-12 at 3 09 32 PM

CameraApi2, upon opening the camera signals CameraPreview through the CameraEvents.onCameraOpened callback method.

Screen Shot 2019-09-12 at 3 11 18 PM

CameraPreview then resumes the Continuation it captured at the beginning of the start callstack and everything continues on as planned.

Screen Shot 2019-09-12 at 3 13 29 PM

However, the Continuation is only resumed if the camera is actually opened. Here are the 3 callbacks from the attempt to open the camera. Only 2 callback to the CameraPreview (onOpened and onDisconnected).

Screen Shot 2019-09-12 at 3 17 43 PM

And of those 2 callback implementations, only the onCameraOpened callback resumes the continuation.

Screen Shot 2019-09-12 at 3 18 52 PM

So only in the event that the camera is successfully opened will the Continuation be resumed, and as a result the single threaded dispatcher will be deadlocked for the remainder of its lifetime. This explains @AlgoRyan 's debugger variable watch output with the dispatcher queuing up tasks.

This is what causes the start, resume, stop, and pause methods to all stop at the point before their actual co-routine blocks are run which is in turn why you don't see startPreview called. The coroutine blocks are being scheduled, they are just blocked behind the abandoned Continuation.

So that's one issue, but I believe the main issue is the way in device availability is currently managed. Prior to making the openCamera call in Camera2, a callback is registered with CameraManager.whenDeviceAvailable. This callback waits for the target device to become available and then calls back to trigger the open camera call.

Screen Shot 2019-09-12 at 3 29 51 PM

Looking at the implementation of this registered wait, there is no handling for the device being flagged as unavailable and there is no timeout on waiting for the device to become available. So if the device in use by another application and hasn't been freed properly or wasn't freed properly by our application or CameraKit, then an attempt to open the camera is never made and that Continuation is again never resumed.

Screen Shot 2019-09-12 at 3 30 43 PM

In cases where I experience this issue, onCameraUnavailable is always called with the target device id that I am waiting for.

nullptr2this avatar Sep 12 '19 19:09 nullptr2this

As a follow up, I don't believe that this is OS version or device specific, but rather think that there might be a racy condition on releasing the camera through a full start/stop lifecycle procession.

More on that as I dig through it.

nullptr2this avatar Sep 12 '19 19:09 nullptr2this

I have found the issue that causes camera release to fail and ultimately results in the subsequent failures to start preview described above.

TL;DR -> Much like the issue described above there is a condition that causes a continuation to be abandoned, which blocks the coroutine dispatcher and effectively prevents onPause and onStop from shutting down preview and releasing the camera after that first initial successful setup of a camera preview.

As above, all code references are for SHA ce5f8ca which is the v1.0.0-beta-3.11 tag.

And here we go.

There are two pathways in the code that lead to starting preview. The first, and probably most common, is through the normal lifecycle invocation of CameraPreview.resume(). The other is via the onSurfaceReady callback method of the CameraSurfaceTextureListener set on the CameraPreview instance. That callback delegates to the CameraPreview.resume().

Screen Shot 2019-09-13 at 10 09 58 AM Screen Shot 2019-09-13 at 10 10 09 AM

Both pathways end up invoking camera preview startup via the CameraPreview.startPreview() private helper method, which like other private helper methods in this class is a suspending method that immediately suspends the coroutine and captures the resulting Continuation for later use.

Screen Shot 2019-09-13 at 10 13 07 AM

There are 2 ways in which that captured Continuation are resumed. The first is when the surface is not yet ready and the startPreview method quickly jumps to resume the Continuation with an exception to halt the attempt to start preview (for obvious reasons)

Screen Shot 2019-09-13 at 10 16 14 AM ... some stuff being executed if the condition is true...

... and the else block for when the condition is false ... Screen Shot 2019-09-13 at 10 16 22 AM

The other is when the preview is successfully started by the Camera API. Much like the pathway to resume the Continuation that is captured when opening the camera, the Continuation captured when starting preview is cached and then resumed when the Camera API calls back to the CameraPreview via the CameraEvents interface.

Screen Shot 2019-09-13 at 10 18 06 AM

The only place the camera event callback for preview started is called in the Camera2 implementation here, and it is called conditionally (remember that the implementation of this callback is responsible for resuming Continuation captured from suspending the startPreview co-routine):

Screen Shot 2019-09-13 at 10 34 50 AM

Well, now you might say ok, so the callback isn't fired if preview is already started, but we shouldn't start preview again if it is already started anyway. I agree. There are, however, event sequences that make it possible for this to happen, which is to what we will now turn our attention.

Let's return to the 2 pathways that can result in a call to startPreview (which is responsible for suspending co-routine execution during the preview startup process). We have resume() being called by the lifecycle, and resume() being called by the onSurfaceReady callback. There are gates setup in each to attempt to halt progress and abort if the implementation is not in the proper state at the time each is called.

The actual startPreview implementation will abort and resume the captured Continuation if the the surface is null or the attributes are null. The onSurfaceReady callback implementation gates it's execution of CameraPreview.resume() on whether or not the lifecycle is resumed or started. Here in lies the problem. There is no synchronization between this callback (either through locking or relying on the serialized work queue provided by the single threaded co-routine dispatcher) and it appears the assumption is that if the lifecycle has been updated to resumed or started state that the lifecycle based pathway has failed because at the time the surface must have been null. However, this is not the case.

The following event sequence results in a second call to startPreview in the CameraPreview class and the Continuation it captures is never resumed because the Camera2 implementation fails to make the preview started callback that will resume it. This abandoned Continuation blocks the dispatcher from executing the co-routine actions in onPause and onStop that are responsible for shutting down preview and releasing the camera device.

  • CameraPreview.start() is called by the lifecycle, coroutine to open the camera is created and queued for execution on the dispatcher. start() method returns after splitting off the coroutine.
  • CameraPreview.resume() is called called by the lifecycle, coroutine to start preview is created and queued for execution on the dispatcher behind the work to open the camera. resume() returns and the lifecycle continues.
  • The coroutine to open the camera suspends, updates the lifecycle state to STARTED, and enters the wait state for the camera device to be reported as ready.
  • onSurfaceReady callback is called to alert the implementation that there is actually an existing surface. The surface is captured by the CameraPreview instance and is now NON-NULL. Because the lifecycle state is STARTED, a call to resume() is made and another coroutine action to start preview is enqueued on the dispatcher behind the original action that as enqueued by the lifecycle.
  • The camera device becomes available and is opened, the open camera coroutine suspension is resumed via the onCameraOpened callback to CameraPreview and the coroutine dispatcher continues executing enqueued actions. The attributes of the device are captured and are now NON-NULL.
  • The first action to start preview is executed by the co-routine dispatcher. The coroutine is suspended and the start preview continuation is captured. The check to see if the surface exists and attributes exist succeeds because the surface callback has already executed and captured the surface and the camera device has been opened and its attributes captured. The Camera2 impl starts a capture session, tracks that preview has begun, calls back to CameraPreview to notify it that the preview has started. The start preview continuation is resumed.
  • The coroutine dispatcher continues executing queued actions and so reaches the second enqueued action to start preview. Because we have already captured the surface and the camera attributes, the checks pass and the Camera2 instance is told again to start preview. There are no checks in Camera2 to see if a capture session is already running so it proceeds to create another one. When the first capture from this second session is completed, the call back to alert CameraPreview instance that preview successfully started is bypassed because it was already started once before and a flag indicates that the callback should not be executed again.
  • The second captured start preview continuation is abandoned and leaves the coroutine dispatcher blocked waiting for it's current suspension to be resumed.
  • Subsequent calls to pause and stop are made, but their enqueued coroutine actions are never executed because they are enqueued behind the stalled second start preview continuation which is never resumed. This causes the camera device to remain open when it should be released.

nullptr2this avatar Sep 13 '19 15:09 nullptr2this

@austinkettner @dwillmc @emersoncloud I see that the use of coroutines has been removed from the pause() and stop() methods on CameraPreview as part of PR 498 (https://github.com/CameraKit/camerakit-android/pull/498). Presumably they were removed to unblock the clean up of capture sessions and the hold on the camera device, but I don't think that their removal cleans up the actual issue causing the co-routines to block in the first place. See my last 2 comments for analysis of these conditions. They still exist and, though they don't seem to impact the clean up process at this time, they are still there and could be potential bug land mines for future efforts.

This issue of the surface ready/created callback duplicating start preview calls seems to occur more on older versions of the OS. I only see it happen once in a blue moon on API 28, but it happens every time on API 23 (running in an emulator; intermittent on devices).

UPDATED: Actually after some testing, with the PR 498 code, the blocking issue is actually problematic if the user backgrounds the activity because the start/resume coroutines that would re-setup the camera are blocked and the camera will not be acquired and preview will not be started, however the surface, because it still exists, will show the last captured from from the previous capture session.

nullptr2this avatar Sep 13 '19 16:09 nullptr2this

Issue is heavily exacerbated between versions v1.0.0-beta3.10 and v1.0.0-beta3.11 by the update made in this commit c723e984aaff6423b4e7dbf0eb2746ba577908f5, which opens the window a much larger window during which onSurfaceReady can set the surface reference while a lifecycle is already queued and potentially creating a lifecycle state that allows onSurfaceReady to proceed to call resume().

nullptr2this avatar Sep 13 '19 20:09 nullptr2this

Any workaround or solution? The library is now basically useless.

EndlezzCoding avatar Sep 18 '19 10:09 EndlezzCoding

No work around that I've been able to find. I have a few ideas for quick type fixes, but have been struggling for time to put together a PR. This issue is on the radar at work so I might soon have some actual work time to try to put together a fix.

nullptr2this avatar Sep 18 '19 12:09 nullptr2this

If you don't mind creating a fork of the code I just did that and removed the runblocking section because I needed a working implementation. That has been running fine for me for the last six months or so. I had to hack around a bunch of other stuff to get it reliable ( the library is quite bad at recovering from the camera disconnecting, which happens fairly often ) and I don't think the changes I made are suited to general use ( they work for what I'm doing but I doubt they'd work for everone ) which is why I haven't published them.

glenatron avatar Sep 19 '19 00:09 glenatron

You should use image.close() or pipe will be broken

RandGor avatar Aug 12 '20 16:08 RandGor