
Feature: Allow capturing a specific window only

Open zawlin opened this issue 2 years ago • 16 comments

It should also work when the window is minimized, as in OBS Studio.

Relevant parts of the OBS implementation:

https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/window-capture.c#L563

https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/dc-capture.c#L158

zawlin avatar Aug 08 '22 06:08 zawlin

I have figured out how to do this. Here's a minimal example. It's not very fast though, around 50-60 fps. I wonder if it's possible to make it faster. It also produces some weird corrupted bits on Calculator and Notepad, although it works without any issues on the 3D games I have tested.

import cv2
import numpy as np
import win32gui
import win32ui
import win32con
import ctypes

def set_dpi_awareness():
    awareness = ctypes.c_int()
    errorCode = ctypes.windll.shcore.GetProcessDpiAwareness(
        0, ctypes.byref(awareness))
    errorCode = ctypes.windll.shcore.SetProcessDpiAwareness(2)
    success = ctypes.windll.user32.SetProcessDPIAware()

def capture( win_title='', win_cls = None):
    hwnd = win32gui.FindWindow(win_cls, win_title)
    x1, y1, x2, y2 = win32gui.GetClientRect(hwnd)
    width = x2-x1
    height = y2-y1

    wx1, wy1, wx2, wy2 = win32gui.GetWindowRect(hwnd)
    # normalize to origin
    wx1, wx2 = wx1-wx1, wx2-wx1
    wy1, wy2 = wy1-wy1, wy2-wy1
    # compute border width and title height
    bw = int((wx2-x2)/2.)
    th = wy2-y2-bw
    # calc offset x and y taking into account border and titlebar, screen coordiates of client rect
    sx = bw
    sy = th

    wndc = win32gui.GetWindowDC(hwnd)
    imdc = win32ui.CreateDCFromHandle(wndc)
    # create a memory based device context
    memdc = imdc.CreateCompatibleDC()
    # create a bitmap object
    screenshot = win32ui.CreateBitmap()
    screenshot.CreateCompatibleBitmap(imdc, width, height)
    oldbmp = memdc.SelectObject(screenshot)
    # copy the screen into our memory device context
    memdc.BitBlt((0, 0), (width, height), imdc, (sx, sy), win32con.SRCCOPY)
    memdc.SelectObject(oldbmp)
    bmpinfo = screenshot.GetInfo()
    bmpstr = screenshot.GetBitmapBits(True)
    img = np.frombuffer(bmpstr, dtype='uint8')
    win32gui.DeleteObject(screenshot.GetHandle())
    imdc.DeleteDC()
    win32gui.ReleaseDC(hwnd, wndc)
    memdc.DeleteDC()
    img.shape = (height, width, 4)
    return cv2.cvtColor(img,cv2.COLOR_BGRA2BGR)

set_dpi_awareness()

im = capture('*Untitled - Notepad')
cv2.namedWindow('im',0)
cv2.imshow('im',im)
cv2.waitKey(0)
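
The border and title-bar arithmetic in the snippet above assumes the left, right and bottom borders are equally wide. As a hedged alternative sketch (not the author's code), the client area's offset inside the window DC can be derived from the window rectangle and the screen position of the client origin, which `win32gui.ClientToScreen(hwnd, (0, 0))` returns. The helper names `client_offset` and `client_offset_for_hwnd` are made up for illustration.

```python
def client_offset(window_rect, client_origin):
    """Offset of the client area's top-left corner inside the window DC.

    window_rect   -- (left, top, right, bottom) in screen coordinates
    client_origin -- screen coordinates of the client point (0, 0)
    """
    left, top, _right, _bottom = window_rect
    cx, cy = client_origin
    return cx - left, cy - top


def client_offset_for_hwnd(hwnd):
    """Windows-only wrapper (hypothetical helper); pywin32 is imported lazily."""
    import win32gui

    window_rect = win32gui.GetWindowRect(hwnd)
    client_origin = win32gui.ClientToScreen(hwnd, (0, 0))
    return client_offset(window_rect, client_origin)
```

The returned pair could be used in place of `(sx, sy)` as the BitBlt source position.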

zawlin avatar Sep 18 '22 05:09 zawlin

DC capture is slower than the Desktop Duplication API (the API that DXcam uses). However, the Desktop Duplication API is designed to capture the entire monitor with small processing overhead, so region cropping is handled client side. DXcam has an API to capture a specific region, and you can use win32gui.FindWindow and win32gui.GetClientRect to determine the window location and pass it to DXcam to do the capture.
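
A minimal sketch of that suggestion, under a few assumptions: the window sits on the primary monitor (whose origin is taken to be (0, 0)), `GetClientRect` returns client-relative coordinates so the origin is converted with `win32gui.ClientToScreen`, and the camera object exposes `width` and `height`. The names `client_region` and `grab_window` are hypothetical.

```python
def client_region(client_origin, client_size, screen_size):
    """Return (left, top, right, bottom) in screen coordinates, clamped to the screen."""
    x, y = client_origin
    w, h = client_size
    sw, sh = screen_size
    left, top = max(0, x), max(0, y)
    right, bottom = min(sw, x + w), min(sh, y + h)
    if right <= left or bottom <= top:
        raise ValueError("window is entirely off-screen")
    return left, top, right, bottom


def grab_window(title):
    """Windows-only: grab the client area of the window titled `title`."""
    import dxcam
    import win32gui

    hwnd = win32gui.FindWindow(None, title)
    if not hwnd:
        raise RuntimeError(f"window not found: {title}")
    # Client rect is relative to the client area, so left/top are always 0.
    _, _, w, h = win32gui.GetClientRect(hwnd)
    origin = win32gui.ClientToScreen(hwnd, (0, 0))
    camera = dxcam.create()  # primary output
    # Assumption: the camera exposes the output resolution as width/height.
    region = client_region(origin, (w, h), (camera.width, camera.height))
    return camera.grab(region=region)  # may be None when no new frame is available
```

Anything drawn on top of the window is captured as well, since the region is cut out of the full desktop image.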

ra1nty avatar Sep 18 '22 06:09 ra1nty

Hmm..but I assume it can't handle overlap ya? It will capture everything in the region, not just the game? And if the game window is hidden by other windows, I guess it wouldn't work?

I saw some other methods here: https://learn.microsoft.com/en-us/archive/blogs/dsui_team/ways-to-capture-the-screen. It seems BitBlt is the fastest of them that can capture a specific window. Not sure about the mirror driver, but it seems to be similar to the Desktop Duplication API and has similar drawbacks.

zawlin avatar Sep 18 '22 11:09 zawlin

If you need to handle overlap, i.e. not be bothered by it, there's the "thumbnail" API from the DWM. It gives you a full-resolution picture of any window. I don't know if DXcam implements that.

https://learn.microsoft.com/en-us/windows/win32/dwm/thumbnail-ovw

(no, it's not for generating a thumbnail of your own window. it's specifically for getting a view of any window.)

crackwitz avatar Sep 18 '22 13:09 crackwitz

I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp

zawlin avatar Sep 19 '22 02:09 zawlin

> I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp

This is the newer Windows Graphics Capture API. To my knowledge, it would be the best API to use. However, it seems to require a non-trivial amount of work to use from Python (not as simple as using desktop duplication, at least). I haven't had a chance to look into it in depth.

ra1nty avatar Sep 19 '22 05:09 ra1nty

> Hmm..but I assume it can't handle overlap ya? It will capture everything in the region, not just the game? And if the game window is hidden by other windows, I guess it wouldn't work?
>
> I saw some other methods here: https://learn.microsoft.com/en-us/archive/blogs/dsui_team/ways-to-capture-the-screen. It seems BitBlt is the fastest of them that can capture a specific window. Not sure about the mirror driver, but it seems to be similar to the Desktop Duplication API and has similar drawbacks.

Yes, it can't handle overlap. And in my own tests BitBlt (DC capture) is way slower than the Desktop Duplication API. The best I can do with BitBlt is <70 fps.
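
For comparing capture methods on the same machine, a small timing harness along these lines may help. It is only a sketch: `measure_fps` is a hypothetical helper, and `grab` stands for any zero-argument capture callable (for example a BitBlt wrapper or a DXcam `grab`) that returns `None` when no frame was produced.

```python
import time


def measure_fps(grab, frames=200):
    """Call `grab` `frames` times and return successfully captured frames per second."""
    start = time.perf_counter()
    captured = 0
    for _ in range(frames):
        if grab() is not None:  # skip calls that produced no frame
            captured += 1
    elapsed = time.perf_counter() - start
    return captured / max(elapsed, 1e-9)  # guard against a zero interval
```

Counting only non-`None` results matters for APIs that return nothing when the screen has not changed.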

ra1nty avatar Sep 19 '22 05:09 ra1nty

> > I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp
>
> This is the newer Windows Graphics Capture API. To my knowledge, it would be the best API to use. However, it seems to require a non-trivial amount of work to use from Python (not as simple as using desktop duplication, at least). I haven't had a chance to look into it in depth.

I think it might be easier to wrap libobs-winrt.dll via ctypes and avoid making most of the API calls from Python. The OBS library may need some modification to make it as simple as possible to wrap. Anyway, if I figure it out, I will post a snippet here.

zawlin avatar Sep 19 '22 07:09 zawlin

It seems someone has figured it out in a somewhat questionable project. I took the relevant parts out and made a simple usage example here.

I am not sure how the author did the bindings, whether manually or generated, since I couldn't find any trace of the wrapped code (the rotypes stuff) in any other public repositories. But it seems to be incomplete, as it's missing the APIs for removing the yellow border that shows up in graphics capture. I also found more official-looking bindings for WinRT here, which seem to have everything needed for this functionality.

zawlin avatar Sep 21 '22 05:09 zawlin

That seems like a full-featured binding. BTW, on Windows 10 you cannot remove the yellow border when using the Windows.Graphics.Capture API. On Windows 11 it can be removed. If you are on Windows 10 and don't want the border, then DC capture and the Desktop Duplication API are the only choices.
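
The constraints mentioned so far in this thread can be summarized as a small decision helper. This is a sketch and not part of DXcam: `pick_capture_method` and its return strings are hypothetical, and it assumes Windows 11 is identified by build number 22000 or later (on Windows, `sys.getwindowsversion().build` reports the build).

```python
WINDOWS_11_FIRST_BUILD = 22000  # assumption: builds below this are Windows 10


def pick_capture_method(build, need_borderless, window_may_be_covered):
    """Pick a capture method name from the trade-offs discussed in this thread."""
    # Windows Graphics Capture is preferred unless its yellow border is a problem.
    if not need_borderless or build >= WINDOWS_11_FIRST_BUILD:
        return "windows_graphics_capture"
    # Desktop duplication grabs whatever is on screen, including overlapping windows.
    if window_may_be_covered:
        return "dc_capture"
    return "desktop_duplication"
```

For example, `pick_capture_method(19045, True, True)` selects DC capture, matching the Windows 10 case described above.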

Thanks for the find! I will take a look to see if I can borrow the bindings and make the win capture api available in dxcam when I have free time. Meanwhile, feel free to submit a PR : )

ra1nty avatar Sep 21 '22 06:09 ra1nty

> I have figured out how to do this. Here's a minimal example. It's not very fast though, around 50-60 fps. I wonder if it's possible to make it faster. It also produces some weird corrupted bits on Calculator and Notepad, although it works without any issues on the 3D games I have tested.

After some modifications, namely blindly trusting that the window size will not change during use, this works well for me: about 100 fps on average, with 55 fps lows and 130 fps peaks. I also dedicate a thread to capturing in a loop, writing the latest image to a lock-protected buffer from which the newest image can be pulled. This serves my needs and can hopefully help others looking for a solution until further work is done on the project. Note: the constructor is a bit messy as I have been playing with a few different implementations, and the window border is not properly cropped.

import copy
import time
from threading import Thread, Lock

import cv2 as cv
import numpy as np
import win32con
import win32gui
import win32ui


class WindowCapture:

    # constructor
    def __init__(self, window_name):
        self.__lock = Lock()
        self.__newestImage = np.zeros((100, 100, 3), dtype=np.uint8)
        self.__intermediaryImage = np.zeros((100, 100, 3), dtype=np.uint8)

        # find the handle for the window we want to capture
        self.hwnd = win32gui.FindWindow(None, window_name)
        self.window_name = window_name
        if not self.hwnd:
            raise Exception('Window not found: {}'.format(window_name))

        # get the window size (assumed not to change while capturing)
        window_rect = win32gui.GetWindowRect(self.hwnd)
        self.w = window_rect[2] - window_rect[0]
        self.h = window_rect[3] - window_rect[1]
        print(f"self.w: {self.w}; self.h: {self.h}")

        # account for the window border and titlebar and cut them off
        border_pixels = 8
        titlebar_pixels = 30
        #self.w = self.w - (border_pixels * 2)
        #self.h = self.h - titlebar_pixels - border_pixels
        self.cropped_x = border_pixels
        self.cropped_y = titlebar_pixels

        # set the cropped coordinates offset so we can translate screenshot
        # images into actual screen positions
        self.offset_x = window_rect[0] + self.cropped_x
        self.offset_y = window_rect[1] + self.cropped_y

        # start the capture thread only after hwnd, w and h have been set
        t1 = Thread(target=self.__doWork)
        t1.start()

    def get_screenshot(self):
        wndc = win32gui.GetWindowDC(self.hwnd)
        imdc = win32ui.CreateDCFromHandle(wndc)
        # create a memory based device context
        memdc = imdc.CreateCompatibleDC()
        # create a bitmap object
        screenshot = win32ui.CreateBitmap()
        screenshot.CreateCompatibleBitmap(imdc, self.w, self.h)
        oldbmp = memdc.SelectObject(screenshot)
        # copy the window contents into our memory device context
        memdc.BitBlt((0, 0), (self.w, self.h), imdc, (0, 0), win32con.SRCCOPY)
        memdc.SelectObject(oldbmp)
        bmpstr = screenshot.GetBitmapBits(True)
        img = np.frombuffer(bmpstr, dtype='uint8')
        # release the GDI objects created for this frame
        win32gui.DeleteObject(screenshot.GetHandle())
        imdc.DeleteDC()
        win32gui.ReleaseDC(self.hwnd, wndc)
        memdc.DeleteDC()
        img.shape = (self.h, self.w, 4)
        return cv.cvtColor(img, cv.COLOR_BGRA2BGR)

    def __doWork(self):
        loop_time = 0
        fps = 0
        while True:
            try:
                self.__intermediaryImage = self.get_screenshot()
                self.__lock.acquire()
                self.__newestImage = self.__intermediaryImage
                self.__lock.release()
            except Exception as ex:
                print(ex, flush=True)
                continue
            try:
                fps = 1 / (time.time() - loop_time)
            except ZeroDivisionError:
                pass
            print(f'Raw FPS {fps}', flush=True)
            loop_time = time.time()

    def GetLatestImage(self):
        self.__lock.acquire()
        copyImage = copy.copy(self.__newestImage)
        self.__lock.release()
        return copyImage


# untested insertion of my main
if __name__ == '__main__':
    windowCap = WindowCapture('Spotify Premium')
    loop_time = 0

    while True:
        try:
            fps = 1 / (time.time() - loop_time)
        except ZeroDivisionError:
            pass
        loop_time = time.time()
        #print(f'FPS {fps}', flush=True)
        img = windowCap.GetLatestImage()
        if img is None:
            continue
        #time.sleep(5//100)
        cv.imshow("hi", img)
        cv.waitKey(1)

JustinHenderson98 avatar Oct 04 '22 00:10 JustinHenderson98

> self.__intermediaryImage = self.get_screenshot()
> self.__lock.acquire()
> self.__newestImage = self.__intermediaryImage
> self.__lock.release()

That is pointless, and so is the locking in GetLatestImage. Python variables are references; the assignment just rebinds a reference to the object, which requires no lock at all. Drop the locking.

Your get_screenshot always creates a new object. Nothing in your code ever "writes into" these objects after they've been created and returned from that function.
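
A minimal sketch of the lock-free handoff described above, assuming CPython (where rebinding an attribute is atomic with respect to other threads) and assuming `grab` returns a brand-new object on every call that nobody mutates afterwards. The class name `LatestFrame` and its methods are hypothetical.

```python
import threading


class LatestFrame:
    """Publish the newest frame from a worker thread without a lock."""

    def __init__(self, grab):
        self._grab = grab      # zero-argument callable returning a new frame object
        self._latest = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        while not self._stop.is_set():
            frame = self._grab()    # always a fresh object
            self._latest = frame    # rebinding a reference; no lock taken

    def get(self):
        # Readers see either the previous frame or the new one, never a mix,
        # because frames are replaced wholesale and never written in place.
        return self._latest

    def stop(self):
        self._stop.set()
        self._thread.join()
```

If a consumer wants to modify a frame in place, it should copy it first, since the same object may be handed to several readers.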

crackwitz avatar Oct 04 '22 10:10 crackwitz

Didn't read through the entire thread, but DirectX Desktop Duplication, BitBlt and the Windows Graphics Capture API are all completely different capture methods that serve their own purposes and have their own limitations.

Summary from my experience on AutoSplit where I had to implement BitBlt and WGC myself: https://github.com/Avasam/Auto-Split#capture-method

Implementation details if you need some inspiration: https://github.com/Avasam/Auto-Split/tree/2.0.0/src/capture_method

Avasam avatar Oct 25 '22 09:10 Avasam

> > I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp
>
> This is the newer Windows Graphics Capture API. To my knowledge, it would be the best API to use. However, it seems to require a non-trivial amount of work to use from Python (not as simple as using desktop duplication, at least). I haven't had a chance to look into it in depth.

@ra1nty have you worked on it?

lucasmonstrox avatar Feb 13 '23 04:02 lucasmonstrox

This issue is tagged as "help wanted"

crackwitz avatar Feb 13 '23 08:02 crackwitz

> It should also work when the window is minimized, as in OBS Studio.
>
> Relevant parts of the OBS implementation:
>
> https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/window-capture.c#L563
>
> https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/dc-capture.c#L158

Have you implemented D3D11-based capture of a specified window while it is in the background? If so, could you share your code?

xiaobaixuejava avatar Mar 30 '24 01:03 xiaobaixuejava