mint-lab/cv_tutorial: Computer Vision Tutorial

Computer Vision Tutorial

Computer Vision Tutorial covers both classical theories and techniques and recent ML/DL-based methods for computer vision. The classical part includes image processing, camera projection models, camera calibration, and pose estimation. The ML/DL part deals with object categorization (and backbone networks) and its extensions such as object detection and instance segmentation. The tutorial also covers further topics such as multi-object tracking, structure-from-motion, and NeRF.

This tutorial was created and is maintained to teach the Computer Vision course (109079) to undergraduate CSE students at SEOULTECH.

This tutorial contains short code examples written in Python with OpenCV and PyTorch.

:bulb: Some code examples help readers understand the inner workings of algorithms (e.g. how it works).
:wrench: Some code examples show the usage and applications of OpenCV functions (e.g. how to use it).
:camera: Some code examples came from my 3D Computer Vision Tutorial, 3dv_tutorial.

Lecture Slides

Section 1. Introduction
Section 2. Image Editing: Learning OpenCV
Section 3. Image Processing
Section 4. Color
Section 5. Image Formation
Section 6. Image Geometry
Section 7. Solving Problems
Section 8. Image Correspondence
Section 9. Image Classification: CNN Backbones
Section 10. Object Detection
Section 11. Object Tracking
Advanced Topic 1. 3D Vision
Advanced Topic 2. ViT, CLIP, and More

Example Codes

Section 1. Introduction [slides]
- Note) How to install prerequisite packages in Python: pip install -r requirements.txt
Section 2. Image Editing: Learning OpenCV [slides]
- OpenCV Image Representation
  - Image creation: image_creation.py :bulb:
- OpenCV Image and Video Input/Output
  - Image file viewer: image_viewer.py :wrench:
  - Image format converter: image_converter.py :wrench:
  - Video file player: video_player.py :wrench:
  - Video format converter: video_converter.py :wrench:
- OpenCV Drawing Functions
  - Shape drawing: shape_drawing.py :wrench:
- OpenCV High-level GUI
  - (Handling keyboard events) Video file player with frame navigation: video_player+navigation.py :wrench:
  - (Handling mouse events) Free drawing: free_drawing.py :wrench:
- Image Editing
  - Negative image and flip: negative_image_and_flip.py :bulb:
  - Intensity transformation with contrast and brightness: intensity_transformation.py :bulb:
  - (Image addition) Alpha blending: alpha_blending.py :bulb:
  - (Image addition) Background extraction: background_extraction.py :bulb:
  - (Image subtraction) Image difference: image_difference.py :bulb:
  - (Image subtraction) Background subtraction: background_subtraction.py :bulb:
  - (Image crop) Image file viewer with the zoom window: image_viewer+zoom.py :bulb:
  - Image resize with backward value copy: image_resize.py :bulb:
  - Image rotation with backward/forward value copy: image_rotation.py :bulb:
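As a rough illustration of the pixel-wise operations listed under Image Editing, the sketch below implements a negative image, a contrast/brightness transformation, and alpha blending with NumPy only. The function names and the formula `contrast * img + brightness` are assumptions made for this sketch; the scripts in this repository may define these operations differently.

```python
import numpy as np

def negative(img):
    """Invert an 8-bit image: bright pixels become dark and vice versa."""
    return 255 - img

def adjust_intensity(img, contrast=1.0, brightness=0.0):
    """Apply out = contrast * img + brightness (assumed form), clipped to [0, 255]."""
    out = contrast * img.astype(np.float64) + brightness
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

def alpha_blend(img1, img2, alpha=0.5):
    """Blend two same-sized images: out = alpha * img1 + (1 - alpha) * img2."""
    out = alpha * img1.astype(np.float64) + (1 - alpha) * img2.astype(np.float64)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Usage on a tiny synthetic 8-bit image
img = np.array([[0, 100, 200]], dtype=np.uint8)
brighter = adjust_intensity(img, contrast=1.5, brightness=10)
blended = alpha_blend(img, np.zeros_like(img), alpha=0.5)
```

The computation is done in floating point and clipped before converting back to `uint8`, because arithmetic performed directly on `uint8` arrays wraps around on overflow.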
Section 3. Image Processing [slides]
- Intensity Transformation
  - Image histogram: histogram.py :bulb:
  - Contrast stretching with min-max stretching: contrast_stretching.py :bulb:
  - Histogram equalization: histogram_equalization.py :wrench:
- Image Segmentation
  - Thresholding: thresholding.py :wrench:
- Image Filtering
  - Image filtering with various kernels: image_filtering.py :bulb:
  - Median filter: median_filter.py :wrench:
  - Sobel edge detection: Sobel_edge.py :bulb:
  - Canny edge detection: Canny_edge.py :wrench:
  - Bilateral filter: bilateral_filter.py :wrench:
- Morphological Operations
  - Morphological operations with various operations and kernels: morpology.py :wrench:
  - Application) Background subtraction (foreground extraction): background_subtraction.py :wrench:
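The two intensity transformations listed above can be sketched in a few lines of NumPy. This is a minimal version for 8-bit grayscale images with hypothetical function names; the equalization here maps each intensity through the normalized cumulative histogram, which is a simple variant and may differ slightly from the output of OpenCV's built-in function.

```python
import numpy as np

def minmax_stretch(img):
    """Linearly map the range [min, max] of an 8-bit image to [0, 255]."""
    lo, hi = int(img.min()), int(img.max())
    if hi == lo:                      # Flat image: nothing to stretch
        return img.copy()
    out = (img.astype(np.float64) - lo) * 255.0 / (hi - lo)
    return np.rint(out).astype(np.uint8)

def equalize_hist(img):
    """Equalize an 8-bit grayscale image with its normalized cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)   # Count of each intensity
    cdf = np.cumsum(hist) / img.size                 # Values in (0, 1]
    lut = np.rint(255 * cdf).astype(np.uint8)        # Lookup table: old -> new intensity
    return lut[img]

# Usage on a low-contrast synthetic image
img = np.array([[50, 100], [150, 200]], dtype=np.uint8)
stretched = minmax_stretch(img)
equalized = equalize_hist(img)
```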
Section 4. Color [slides]
- Color space conversion: color_bgr2hsv.py :wrench:
- Color histogram equalization: histogram_equalization+color.py :bulb:
Section 5. Image Formation [slides]
- Getting Started with 2D
  - 3D rotation conversion: 3d_rotation_conversion.py :camera:
- Pinhole Camera Model
  - Object localization: object_localization.py :camera:
  - Image formation: image_formation.py :camera:
- Geometric Distortion Models
  - Geometric distortion visualization: distortion_visualization.py :camera:
  - Geometric distortion correction: distortion_correction.py :camera: [result video]
- Camera Calibration
  - Camera calibration: camera_calibration.py :camera:
- Absolute Camera Pose Estimation (a.k.a. perspective-n-point; PnP)
  - Pose estimation (chessboard): pose_estimation_chessboard.py :camera: [result video]
  - Pose estimation (book): pose_estimation_book1.py :camera:
  - Pose estimation (book) with camera calibration: pose_estimation_book2.py :camera:
  - Pose estimation (book) with camera calibration without initial $K$: pose_estimation_book3.py :camera: [result video]
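The examples in this section build on the pinhole camera model, which maps a 3D point $X$ to a pixel $x$ through $x \sim K (R X + t)$. The sketch below shows this projection in NumPy, ignoring lens distortion. The function name and the sample values of $K$, $R$, and $t$ are assumptions chosen for illustration, not values from the repository.

```python
import numpy as np

def project_points(X, K, R, t):
    """Project Nx3 world points to Nx2 pixels with x ~ K (R X + t), no distortion."""
    Xc = X @ R.T + t               # World coordinates -> camera coordinates
    x = Xc @ K.T                   # Camera coordinates -> homogeneous pixel coordinates
    return x[:, :2] / x[:, 2:3]    # Divide by depth to get pixel coordinates

# Hypothetical camera: focal length 1000 px, principal point (320, 240)
K = np.array([[1000., 0., 320.],
              [0., 1000., 240.],
              [0., 0., 1.]])
R = np.eye(3)                      # Camera aligned with the world axes
t = np.zeros(3)                    # Camera at the world origin

X = np.array([[0., 0., 5.],        # A point on the optical axis
              [1., 0., 5.]])       # A point 1 unit to the right at the same depth
pixels = project_points(X, K, R, t)
```

A point on the optical axis projects to the principal point, and the horizontal offset in pixels equals the focal length times the lateral offset divided by the depth.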
Section 6. Image Geometry [slides]
- Planar Homography
  - Perspective distortion correction: perspective_correction.py :camera:
  - Planar image stitching: image_stitching.py :camera:
  - 2D video stabilization: video_stabilization.py :camera: [result video]
- Triangulation
  - Triangulation: triangulation.py :camera:
Section 7. Solving Problems [slides]
- Solving Linear Equations in 3D Vision
  - Affine transformation estimation: affine_estimation_implement :camera:
  - Planar homography estimation: homography_estimation_implement :camera:
    - Appendix) Image warping using homography: image_warping_implement.py :camera:
  - Triangulation: triangulation_implement.py :camera:
- Solving Nonlinear Equations in 3D Vision
  - Absolute camera pose estimation: pose_estimation_implement.py :camera:
  - Camera calibration: camera_calibration_implement.py :camera:
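Both affine and homography estimation reduce to linear problems once the point correspondences are stacked into a matrix. The following is a minimal sketch, assuming exact or low-noise correspondences and no coordinate normalization: the affine matrix is solved by least squares and the homography by the direct linear transformation (DLT) with SVD. The function names are hypothetical and the code is not taken from the repository's scripts.

```python
import numpy as np

def estimate_affine(src, dst):
    """Estimate the 2x3 affine matrix A with dst ~ A @ [x, y, 1] by least squares."""
    M = np.hstack([src, np.ones((len(src), 1))])     # N x 3 matrix of [x, y, 1]
    At, *_ = np.linalg.lstsq(M, dst, rcond=None)     # Solve M @ A.T = dst
    return At.T

def estimate_homography(src, dst):
    """Estimate the 3x3 homography H with dst ~ H @ [x, y, 1] by DLT (4+ points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence in the 9 entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    H = Vt[-1].reshape(3, 3)       # Right singular vector of the smallest singular value
    return H / H[2, 2]             # Fix the scale (assumes H[2, 2] is not zero)

# Usage: recover a known affine transformation from four correspondences
src = np.array([[0., 0.], [100., 0.], [100., 100.], [0., 100.]])
A_true = np.array([[1.2, -0.3, 5.], [0.4, 0.9, -2.]])
dst = src @ A_true[:, :2].T + A_true[:, 2]
A = estimate_affine(src, dst)
```

For noisy real data, normalizing the point coordinates before DLT usually improves numerical stability; that step is omitted here to keep the sketch short.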
Section 8. Image Correspondence [slides]
- Feature Points and Descriptors
  - Harris corner
  - Feature point comparison
- Feature Matching and Tracking
  - Feature matching comparison
  - Feature tracking with KLT tracker
- Outlier Rejection
  - Line fitting with RANSAC: line_fitting_ransac.py :camera:
  - Planar homography estimation with RANSAC
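The idea behind RANSAC can be sketched with line fitting: repeatedly pick two points at random, build the line through them, count the points within a distance threshold, and keep the line with the most inliers. The code below is a minimal sketch with a hypothetical function name and arbitrarily chosen defaults for the iteration count and threshold; it is not the repository's line_fitting_ransac.py.

```python
import numpy as np

def fit_line_ransac(points, n_iter=200, threshold=1.0, seed=0):
    """Fit a line a*x + b*y + c = 0 (unit normal) to Nx2 points with RANSAC."""
    rng = np.random.default_rng(seed)
    best_line, best_count = None, -1
    for _ in range(n_iter):
        # 1) Sample a minimal set: two distinct points define a line
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        a, b = q[1] - p[1], p[0] - q[0]        # Normal vector of the line p-q
        norm = np.hypot(a, b)
        if norm < 1e-12:                       # Skip coincident points
            continue
        a, b = a / norm, b / norm
        c = -(a * p[0] + b * p[1])
        # 2) Count inliers by point-to-line distance
        dist = np.abs(points @ np.array([a, b]) + c)
        count = int(np.sum(dist < threshold))
        # 3) Keep the hypothesis with the largest consensus
        if count > best_count:
            best_line, best_count = (a, b, c), count
    return best_line, best_count

# Usage: 50 points on y = 2x + 1 plus 10 outliers far from that line
xs = np.arange(50, dtype=np.float64)
inliers = np.column_stack([xs, 2 * xs + 1])
ks = np.arange(10, dtype=np.float64)
outliers = np.column_stack([5 * ks, 2 * (5 * ks) + 1 + 40 + 3 * ks])
line, count = fit_line_ransac(np.vstack([inliers, outliers]))
```

In practice the best line is usually refined afterwards with a least-squares fit on its inliers; that refinement is left out of this sketch.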
Section 9. Image Classification: CNN Backbones
Section 10. Object Detection
Section 11. Object Tracking

Authors

Sunglok Choi

Acknowledgements

The authors thank the following contributors and projects.

ImageProcessingPlace.com for test images (lena.tif, baboon.tif, and peppers.tif)
MOTChallenge for test images (PETS09-S2L1-raw.webm)
Wikipedia for a test image (salt_and_pepper.png)
OpenCV for a test image (sudoku.png)
