[WebRTC] A Brief Analysis of the Classes in the Video Capture Module

1. Classes in the video capture module

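WebRTC is mainly a framework implemented in C++, so it is worth analyzing the inheritance relationships among its classes. The code of the video capture module lives in modules/video_capture, which contains many files; this post goes through them one by one. Note that the capture module is tightly coupled to the platform: different operating systems capture video in different ways, and here I cover the Windows platform. Android and Windows are both widely used in RTC scenarios, but Android also involves the JNI layer, which makes it more complex than Windows.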

1.1 The base video capture module (VideoCaptureModule)

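VideoCaptureModule is declared in modules/video_capture/video_capture.h and is the most basic class of the video capture module. It declares a large number of pure virtual functions, which concrete subclasses of VideoCaptureModule must implement. VideoCaptureModule mainly defines functions related to device information, capture control, and rotation (which matters for rendering).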

class VideoCaptureModule : public RefCountInterface {
 public:
  // Interface for receiving information about available camera devices.
  class DeviceInfo {
   public:
    // Number of capture devices, e.g. Camera1 and Camera2 on an Android device.
    virtual uint32_t NumberOfDevices() = 0;

    // Returns the available capture devices.
    // deviceNumber        - Index of the capture device.
    // deviceNameUTF8      - Friendly name of the capture device (for example,
    //                       a model name that users can recognize).
    // deviceUniqueIdUTF8  - Unique name of the capture device if it exists.
    //                       Otherwise same as deviceNameUTF8.
    // productUniqueIdUTF8 - Unique product id if it exists.
    //                       Null terminated otherwise.
    virtual int32_t GetDeviceName(uint32_t deviceNumber,
                                  char* deviceNameUTF8,
                                  uint32_t deviceNameLength,
                                  char* deviceUniqueIdUTF8,
                                  uint32_t deviceUniqueIdUTF8Length,
                                  char* productUniqueIdUTF8 = 0,
                                  uint32_t productUniqueIdUTF8Length = 0) = 0;

    // Returns the number of capabilities of this device.
    virtual int32_t NumberOfCapabilities(const char* deviceUniqueIdUTF8) = 0;

    // Gets the capabilities of the named device.
    virtual int32_t GetCapability(const char* deviceUniqueIdUTF8,
                                  uint32_t deviceCapabilityNumber,
                                  VideoCaptureCapability& capability) = 0;

    // Gets clockwise angle the captured frames should be rotated in order
    // to be displayed correctly on a normally rotated display.
    // Author's note: this presumably exists for rendering, where a frame may
    // need to be rotated (for example by 180°) before it appears upright.
    virtual int32_t GetOrientation(const char* deviceUniqueIdUTF8,
                                   VideoRotation& orientation) = 0;

    // Gets the capability that best matches the requested width, height and
    // frame rate.
    // Returns the deviceCapabilityNumber on success.
    virtual int32_t GetBestMatchedCapability(
        const char* deviceUniqueIdUTF8,
        const VideoCaptureCapability& requested,
        VideoCaptureCapability& resulting) = 0;

    // Displays the OS/capture-device-specific settings dialog.
    virtual int32_t DisplayCaptureSettingsDialogBox(
        const char* deviceUniqueIdUTF8,
        const char* dialogTitleUTF8,
        void* parentWindow,
        uint32_t positionX,
        uint32_t positionY) = 0;

    virtual ~DeviceInfo() {}
  };

  // Register capture data callback
  virtual void RegisterCaptureDataCallback(
      rtc::VideoSinkInterface<VideoFrame>* dataCallback) = 0;
  virtual void RegisterCaptureDataCallback(
      RawVideoSinkInterface* dataCallback) = 0;

  // Remove capture data callback
  virtual void DeRegisterCaptureDataCallback() = 0;

  // Start capture device
  virtual int32_t StartCapture(const VideoCaptureCapability& capability) = 0;
  // Stop capture device
  virtual int32_t StopCapture() = 0;

  // Returns the name of the device used by this module.
  virtual const char* CurrentDeviceName() const = 0;

  // Returns true if the capture device is running
  virtual bool CaptureStarted() = 0;

  // Gets the current configuration.
  virtual int32_t CaptureSettings(VideoCaptureCapability& settings) = 0;

  // Set the rotation of the captured frames.
  // If the rotation is set to the same as returned by
  // DeviceInfo::GetOrientation the captured frames are
  // displayed correctly if rendered.
  virtual int32_t SetCaptureRotation(VideoRotation rotation) = 0;

  // Tells the capture module whether to apply the pending rotation. By default,
  // the rotation is applied and the generated frame is upright. When set to
  // false, generated frames will carry the rotation information from
  // SetCaptureRotation. Return value indicates whether this operation succeeds.
  virtual bool SetApplyRotation(bool enable) = 0;

  // Return whether the rotation is applied or left pending.
  virtual bool GetApplyRotation() = 0;

 protected:
  ~VideoCaptureModule() override {}
};

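In terms of inheritance, VideoCaptureModule derives from RefCountInterface, which is declared in api/ref_count.h. It is one of the most important base classes in WebRTC because it handles memory management. The class declares two const functions, AddRef() and Release(). AddRef() adds a reference to the current object (it increments the reference count); Release() decrements the reference count by 1, and if the count reaches 0 the object is destroyed.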

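A simple way to understand this class: while a program runs, many places may need to use the same object, so the object keeps a count of its users. Each new user increments the counter, and the object may be freed only when the counter drops to 0. This matters a great deal for memory management: without it, an object could easily be freed at the wrong moment and leave dangling pointers (use-after-free), or never be freed at all, which leaks memory. The concept is comparable to AVBufferRef in FFmpeg. The difference is that WebRTC implements it through inheritance, whereas FFmpeg structs hold an AVBufferRef as a member.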

// Refcounted objects should implement the following informal interface:
//
// void AddRef() const;
// RefCountReleaseStatus Release() const;
//
// You may access members of a reference-counted object, including the AddRef()
// and Release() methods, only if you already own a reference to it, or if
// you're borrowing someone else's reference. (A newly created object is a
// special case: the reference count is zero on construction, and the code that
// creates the object should immediately call AddRef(), bringing the reference
// count from zero to one, e.g., by constructing an rtc::scoped_refptr).
//
// AddRef() creates a new reference to the object.
//
// Release() releases a reference to the object; the caller now has one less
// reference than before the call. Returns kDroppedLastRef if the number of
// references dropped to zero because of this (in which case the object destroys
// itself). Otherwise, returns kOtherRefsRemained, to signal that at the precise
// time the caller's reference was dropped, other references still remained (but
// if other threads own references, this may of course have changed by the time
// Release() returns).
//
// The caller of Release() must treat it in the same way as a delete operation:
// Regardless of the return value from Release(), the caller mustn't access the
// object. The object might still be alive, due to references held by other
// users of the object, but the object can go away at any time, e.g., as the
// result of another thread calling Release().
//
// Calling AddRef() and Release() manually is discouraged. It's recommended to
// use rtc::scoped_refptr to manage all pointers to reference counted objects.
// Note that rtc::scoped_refptr depends on compile-time duck-typing; formally
// implementing the below RefCountInterface is not required.

enum class RefCountReleaseStatus { kDroppedLastRef, kOtherRefsRemained };

// Interfaces where refcounting is part of the public api should
// inherit this abstract interface. The implementation of these
// methods is usually provided by the RefCountedObject template class,
// applied as a leaf in the inheritance tree.

class RefCountInterface {
 public:
  virtual void AddRef() const = 0;
  virtual RefCountReleaseStatus Release() const = 0;

  // Non-public destructor, because Release() has exclusive responsibility for
  // destroying the object.
 protected:
  virtual ~RefCountInterface() {}
};
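
The contract described in these comments can be tried out with a small, self-contained sketch. Everything in it is an assumption made for illustration: `ReleaseStatus`, `SimpleRefCounted`, `Tracker`, and `DemoRefCount` are hypothetical names, and the counter is a plain `std::atomic<int>` rather than WebRTC's `RefCountedObject`.

```cpp
#include <atomic>
#include <cassert>

enum class ReleaseStatus { kDroppedLastRef, kOtherRefsRemained };

// Hypothetical stand-in for a ref-counted base class (not WebRTC code).
class SimpleRefCounted {
 public:
  void AddRef() const { count_.fetch_add(1, std::memory_order_relaxed); }

  // Drops one reference; the object deletes itself when the count hits zero.
  ReleaseStatus Release() const {
    if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      delete this;
      return ReleaseStatus::kDroppedLastRef;
    }
    return ReleaseStatus::kOtherRefsRemained;
  }

 protected:
  // Non-public destructor: only Release() may destroy the object.
  virtual ~SimpleRefCounted() = default;

 private:
  mutable std::atomic<int> count_{0};  // Zero on construction.
};

// Sets *destroyed to true when the object is deleted.
class Tracker : public SimpleRefCounted {
 public:
  explicit Tracker(bool* destroyed) : destroyed_(destroyed) {}

 private:
  ~Tracker() override { *destroyed_ = true; }
  bool* destroyed_;
};

// Returns true if the object is destroyed exactly when the last reference
// is released, and not before.
bool DemoRefCount() {
  bool destroyed = false;
  Tracker* t = new Tracker(&destroyed);
  t->AddRef();  // 0 -> 1
  t->AddRef();  // 1 -> 2
  bool first = t->Release() == ReleaseStatus::kOtherRefsRemained && !destroyed;
  bool second = t->Release() == ReleaseStatus::kDroppedLastRef && destroyed;
  return first && second;
}
```

In real code the manual AddRef()/Release() calls would be hidden behind a smart pointer, which is what the comments above recommend.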

1.2 The video capture factory class (VideoCaptureFactory)

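VideoCaptureFactory, the factory class for VideoCapture, is declared in modules/video_capture/video_capture_factory.h. It provides Create() and CreateDeviceInfo(). Create() creates a video capture module object of type VideoCaptureModule, and CreateDeviceInfo() creates a device-information object of type DeviceInfo. Both objects can also be created with additional settings passed through VideoCaptureOptions.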

class RTC_EXPORT VideoCaptureFactory {
 public:
  // Create a video capture module object
  // id - unique identifier of this video capture module object.
  // deviceUniqueIdUTF8 - name of the device.
  //                      Available names can be found by using GetDeviceName
  static rtc::scoped_refptr<VideoCaptureModule> Create(
      const char* deviceUniqueIdUTF8);
  static rtc::scoped_refptr<VideoCaptureModule> Create(
      VideoCaptureOptions* options,
      const char* deviceUniqueIdUTF8);

  static VideoCaptureModule::DeviceInfo* CreateDeviceInfo();
  static VideoCaptureModule::DeviceInfo* CreateDeviceInfo(
      VideoCaptureOptions* options);

 private:
  ~VideoCaptureFactory();
};
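
The factory idea (static functions that hide which concrete class gets created) can be sketched without any WebRTC headers. `CaptureModule`, `FakePlatformCapture`, and `CaptureFactory` below are hypothetical names, and `std::shared_ptr` stands in for `rtc::scoped_refptr` purely to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for VideoCaptureModule (not the WebRTC class).
class CaptureModule {
 public:
  virtual ~CaptureModule() = default;
  virtual const char* CurrentDeviceName() const = 0;
};

// Hypothetical platform implementation hidden behind the factory.
class FakePlatformCapture : public CaptureModule {
 public:
  explicit FakePlatformCapture(std::string id) : id_(std::move(id)) {}
  const char* CurrentDeviceName() const override { return id_.c_str(); }

 private:
  std::string id_;
};

// Static-only factory: callers never name the concrete class.
class CaptureFactory {
 public:
  CaptureFactory() = delete;

  // Assumption: a null or empty device id yields nullptr.
  static std::shared_ptr<CaptureModule> Create(const char* device_unique_id) {
    if (device_unique_id == nullptr || *device_unique_id == '\0') {
      return nullptr;
    }
    return std::make_shared<FakePlatformCapture>(device_unique_id);
  }
};
```

A caller would write `auto module = CaptureFactory::Create("cam0");` and work only with the `CaptureModule` interface afterwards.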

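The VideoCaptureOptions class used above is declared in modules/video_capture/video_capture_options.h. It defines parameters for video capture, including the initialization status, whether V4L2 is allowed (on Linux), and whether PipeWire is allowed.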

#if defined(WEBRTC_USE_PIPEWIRE)
namespace videocapturemodule {
class PipeWireSession;
}
#endif

// An object that stores initialization parameters for video capturers
class RTC_EXPORT VideoCaptureOptions {
 public:
  VideoCaptureOptions();
  VideoCaptureOptions(const VideoCaptureOptions& options);
  VideoCaptureOptions(VideoCaptureOptions&& options);
  ~VideoCaptureOptions();

  VideoCaptureOptions& operator=(const VideoCaptureOptions& options);
  VideoCaptureOptions& operator=(VideoCaptureOptions&& options);
  // Initialization status
  enum class Status {
    SUCCESS,
    UNINITIALIZED,
    UNAVAILABLE,
    DENIED,
    ERROR,
    MAX_VALUE = ERROR
  };
  // Callback interface
  class Callback {
   public:
    // Pure virtual callback that reports the status of initializing the
    // VideoCaptureOptions.
    virtual void OnInitialized(Status status) = 0;

   protected:
    virtual ~Callback() = default;
  };
  // Initialization
  void Init(Callback* callback);

#if defined(WEBRTC_LINUX)
  // V4L2 (Video4Linux2) is the Linux kernel framework for video device drivers.
  bool allow_v4l2() const { return allow_v4l2_; }
  void set_allow_v4l2(bool allow) { allow_v4l2_ = allow; }
#endif

#if defined(WEBRTC_USE_PIPEWIRE)
  bool allow_pipewire() const { return allow_pipewire_; }
  void set_allow_pipewire(bool allow) { allow_pipewire_ = allow; }
  void set_pipewire_fd(int fd) { pipewire_fd_ = fd; }
  rtc::scoped_refptr<videocapturemodule::PipeWireSession> pipewire_session();
#endif

 private:
#if defined(WEBRTC_LINUX)
  bool allow_v4l2_ = false;
#endif
#if defined(WEBRTC_USE_PIPEWIRE)
  bool allow_pipewire_ = false;
  int pipewire_fd_ = kInvalidPipeWireFd;
  rtc::scoped_refptr<videocapturemodule::PipeWireSession> pipewire_session_;
#endif
};
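
The Init()/Callback pairing can be illustrated with a reduced sketch. `Options` and `StatusRecorder` are hypothetical, the enum values are renamed, and the synchronous, always-successful `Init()` is an assumption made for the example; it says nothing about how WebRTC actually completes initialization.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for VideoCaptureOptions (not WebRTC code).
class Options {
 public:
  enum class Status { kSuccess, kUninitialized, kUnavailable, kDenied, kError };

  class Callback {
   public:
    virtual void OnInitialized(Status status) = 0;

   protected:
    virtual ~Callback() = default;
  };

  // Assumption: initialization completes synchronously and always succeeds.
  void Init(Callback* callback) {
    if (callback != nullptr) callback->OnInitialized(Status::kSuccess);
  }
};

// Records the last status reported through the callback.
class StatusRecorder : public Options::Callback {
 public:
  void OnInitialized(Options::Status status) override { last_ = status; }
  Options::Status last() const { return last_; }

 private:
  Options::Status last_ = Options::Status::kUninitialized;
};
```

The caller owns the callback object and learns the outcome only through OnInitialized(), which is why the interface has a protected destructor: the options object never deletes it.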

1.3 The device information implementation (DeviceInfoImpl)

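Before capturing video we need to look at device initialization. DeviceInfoImpl is a subclass of DeviceInfo (nested in VideoCaptureModule) and implements some of its pure virtual functions; it is declared in modules/video_capture/device_info_impl.h. The functions it implements cover counting a device's capabilities, getting a capability, finding the best-matching capability, and getting the orientation. It does not implement every function of its parent because the concrete code depends on the platform (Windows, Linux, Android, and so on), so the rest is left to further subclasses.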

class DeviceInfoImpl : public VideoCaptureModule::DeviceInfo {
 public:
  DeviceInfoImpl();
  ~DeviceInfoImpl(void) override;
  // Gets the number of capabilities of the device
  int32_t NumberOfCapabilities(const char* deviceUniqueIdUTF8) override;
  // Gets a capability
  int32_t GetCapability(const char* deviceUniqueIdUTF8,
                        uint32_t deviceCapabilityNumber,
                        VideoCaptureCapability& capability) override;
  // Gets the best-matching capability
  int32_t GetBestMatchedCapability(const char* deviceUniqueIdUTF8,
                                   const VideoCaptureCapability& requested,
                                   VideoCaptureCapability& resulting) override;
  // Gets the orientation (rotation angle)
  int32_t GetOrientation(const char* deviceUniqueIdUTF8,
                         VideoRotation& orientation) override;

 protected:
  /* Initialize this object*/

  virtual int32_t Init() = 0;
  /*
   * Fills the member variable _captureCapabilities with capabilities for the
   * given device name.
   */
  virtual int32_t CreateCapabilityMap(const char* deviceUniqueIdUTF8)
      RTC_EXCLUSIVE_LOCKS_REQUIRED(_apiLock) = 0;

 protected:
  // Data members
  // The list of capabilities
  typedef std::vector<VideoCaptureCapability> VideoCaptureCapabilities;
  VideoCaptureCapabilities _captureCapabilities RTC_GUARDED_BY(_apiLock);
  Mutex _apiLock;
  char* _lastUsedDeviceName RTC_GUARDED_BY(_apiLock);
  uint32_t _lastUsedDeviceNameLength RTC_GUARDED_BY(_apiLock);
};
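
To make "best-matching capability" concrete, here is a minimal sketch of one possible matching rule. It is not WebRTC's heuristic: the `Capability` struct, `AbsDiff`, `BestMatchIndex`, and the cost function (sum of absolute differences in width, height, and frame rate) are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, reduced capability description (not VideoCaptureCapability).
struct Capability {
  int32_t width;
  int32_t height;
  int32_t max_fps;
};

int64_t AbsDiff(int64_t a, int64_t b) { return a > b ? a - b : b - a; }

// Returns the index of the capability closest to `requested`, or -1 if the
// list is empty. Assumption: "closest" means the smallest sum of absolute
// differences in width, height and frame rate.
int32_t BestMatchIndex(const std::vector<Capability>& caps,
                       const Capability& requested) {
  int32_t best = -1;
  int64_t best_cost = 0;
  for (std::size_t i = 0; i < caps.size(); ++i) {
    const int64_t cost = AbsDiff(caps[i].width, requested.width) +
                         AbsDiff(caps[i].height, requested.height) +
                         AbsDiff(caps[i].max_fps, requested.max_fps);
    if (best < 0 || cost < best_cost) {
      best = static_cast<int32_t>(i);
      best_cost = cost;
    }
  }
  return best;
}
```

Like GetBestMatchedCapability(), the sketch reports the winner as an index into the capability list, so a caller can look up the full entry afterwards.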

1.4 The video capture implementation (VideoCaptureImpl)

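Next, the implementation of VideoCapture. The implementing class is VideoCaptureImpl, declared in modules/video_capture/video_capture_impl.h, and its parent is VideoCaptureModule. It implements creating the VideoCapture module, creating DeviceInfo, handling rotation, registering data callbacks, and processing incoming frames.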

namespace videocapturemodule {
// Class definitions
class RTC_EXPORT VideoCaptureImpl : public VideoCaptureModule {
 public:
  /*
   *   Create a video capture module object
   *
   *   id              - unique identifier of this video capture module object
   *   deviceUniqueIdUTF8 -  name of the device. Available names can be found by
   * using GetDeviceName
   */
  // Creates the VideoCaptureModule
  static rtc::scoped_refptr<VideoCaptureModule> Create(
      const char* deviceUniqueIdUTF8);
  static rtc::scoped_refptr<VideoCaptureModule> Create(
      VideoCaptureOptions* options,
      const char* deviceUniqueIdUTF8);
  // Creates the DeviceInfo
  static DeviceInfo* CreateDeviceInfo();
  static DeviceInfo* CreateDeviceInfo(VideoCaptureOptions* options);

  // Helpers for converting between (integral) degrees and
  // VideoRotation values.  Return 0 on success.
  static int32_t RotationFromDegrees(int degrees, VideoRotation* rotation);
  static int32_t RotationInDegrees(VideoRotation rotation, int* degrees);

  // Call backs
  // Registers the callback for captured data.
  // Two sink types are supported: VideoSinkInterface and RawVideoSinkInterface.
  void RegisterCaptureDataCallback(
      rtc::VideoSinkInterface<VideoFrame>* dataCallback) override;
  virtual void RegisterCaptureDataCallback(
      RawVideoSinkInterface* dataCallback) override;
  // Removes the callback for captured data
  void DeRegisterCaptureDataCallback() override;
  // Sets the capture rotation
  int32_t SetCaptureRotation(VideoRotation rotation) override;
  bool SetApplyRotation(bool enable) override;
  bool GetApplyRotation() override;
  // Gets the name of the current device
  const char* CurrentDeviceName() const override;

  // `capture_time` must be specified in NTP time format in milliseconds.
  // Handles a frame obtained from the camera (the core function).
  int32_t IncomingFrame(uint8_t* videoFrame,
                        size_t videoFrameLength,
                        const VideoCaptureCapability& frameInfo,
                        int64_t captureTime = 0);

  // Platform-dependent capture functions
  int32_t StartCapture(const VideoCaptureCapability& capability) override;
  int32_t StopCapture() override;
  bool CaptureStarted() override;
  int32_t CaptureSettings(VideoCaptureCapability& /*settings*/) override;

 protected:
  VideoCaptureImpl();
  ~VideoCaptureImpl() override;

  // Calls to the public API must happen on a single thread.
  // This SequenceChecker verifies that calls to the API do not violate the
  // single-thread contract, which helps prevent race conditions.
  SequenceChecker api_checker_;
  // RaceChecker for members that can be accessed on the API thread while
  // capture is not happening, and on a callback thread otherwise.
  // It is used to detect data races on those members.
  rtc::RaceChecker capture_checker_;
  // current Device unique name;
  // RTC_GUARDED_BY(api_checker_) means the member may only be accessed on the
  // thread (sequence) that api_checker_ is bound to.
  char* _deviceUniqueId RTC_GUARDED_BY(api_checker_);
  Mutex api_lock_;
  // Should be set by platform dependent code in StartCapture.
  VideoCaptureCapability _requestedCapability RTC_GUARDED_BY(api_checker_);

 private:
  void UpdateFrameCount();
  uint32_t CalculateFrameRate(int64_t now_ns);
  int32_t DeliverCapturedFrame(VideoFrame& captureFrame)
      RTC_EXCLUSIVE_LOCKS_REQUIRED(api_lock_);
  void DeliverRawFrame(uint8_t* videoFrame,
                       size_t videoFrameLength,
                       const VideoCaptureCapability& frameInfo,
                       int64_t captureTime)
      RTC_EXCLUSIVE_LOCKS_REQUIRED(api_lock_);

  // last time the module process function was called.
  int64_t _lastProcessTimeNanos RTC_GUARDED_BY(capture_checker_);
  // last time the frame rate callback function was called.
  int64_t _lastFrameRateCallbackTimeNanos RTC_GUARDED_BY(capture_checker_);

  rtc::VideoSinkInterface<VideoFrame>* _dataCallBack RTC_GUARDED_BY(api_lock_);
  RawVideoSinkInterface* _rawDataCallBack RTC_GUARDED_BY(api_lock_);

  int64_t _lastProcessFrameTimeNanos RTC_GUARDED_BY(capture_checker_);
  // timestamp for local captured frames
  int64_t _incomingFrameTimesNanos[kFrameRateCountHistorySize] RTC_GUARDED_BY(
      capture_checker_);
  // Set if the frame should be rotated by the capture module.
  VideoRotation _rotateFrame RTC_GUARDED_BY(api_lock_);

  // Indicate whether rotation should be applied before delivered externally.
  bool apply_rotation_ RTC_GUARDED_BY(api_lock_);
};
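
The two rotation helpers declared above convert between integer degrees and an enum and return 0 on success. A minimal sketch of that behavior follows, assuming only the four right-angle rotations are valid. The `Rotation` enum is a local stand-in defined for this example, and returning -1 on failure is an assumption (the header only states that 0 means success).

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the rotation enum, for illustration only.
enum Rotation {
  kRotation0 = 0,
  kRotation90 = 90,
  kRotation180 = 180,
  kRotation270 = 270
};

// Returns 0 on success, -1 if `degrees` is not 0, 90, 180 or 270.
int32_t RotationFromDegrees(int degrees, Rotation* rotation) {
  switch (degrees) {
    case 0:   *rotation = kRotation0;   return 0;
    case 90:  *rotation = kRotation90;  return 0;
    case 180: *rotation = kRotation180; return 0;
    case 270: *rotation = kRotation270; return 0;
    default:  return -1;
  }
}

// Returns 0 on success, -1 if `rotation` holds an unknown value.
int32_t RotationInDegrees(Rotation rotation, int* degrees) {
  switch (rotation) {
    case kRotation0:   *degrees = 0;   return 0;
    case kRotation90:  *degrees = 90;  return 0;
    case kRotation180: *degrees = 180; return 0;
    case kRotation270: *degrees = 270; return 0;
  }
  return -1;
}
```

Reporting the result through an out-parameter and the status through the return value mirrors the style of the declarations in the header.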

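The declaration above involves two classes, VideoSinkInterface and RawVideoSinkInterface, both of which can be used to process captured video frames. VideoSinkInterface is declared in api/video/video_sink_interface.h, and RawVideoSinkInterface in modules/video_capture/raw_video_sink_interface.h. Their core functions are OnFrame() and OnRawFrame() respectively; both are callbacks (functions with the On prefix are usually callbacks).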

// Declared in modules/video_capture/raw_video_sink_interface.h
class RawVideoSinkInterface {
 public:
  virtual ~RawVideoSinkInterface() = default;
  // Handles a raw video frame
  virtual int32_t OnRawFrame(uint8_t* videoFrame,
                             size_t videoFrameLength,
                             const webrtc::VideoCaptureCapability& frameInfo,
                             VideoRotation rotation,
                             int64_t captureTime) = 0;
};

// Declared in api/video/video_sink_interface.h
template <typename VideoFrameT>
class VideoSinkInterface {
 public:
  virtual ~VideoSinkInterface() = default;
  // Handles a video frame
  virtual void OnFrame(const VideoFrameT& frame) = 0;

  // Should be called by the source when it discards the frame due to rate
  // limiting.
  virtual void OnDiscardedFrame() {}

  // Called on the network thread when video constraints change.
  // TODO(crbug/1255737): make pure virtual once downstream project adapts.
  virtual void OnConstraintsChanged(
      const webrtc::VideoTrackSourceConstraints& /* constraints */) {}
};
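
The register/deliver/deregister cycle around a sink can be sketched as follows. `Frame`, `SinkInterface`, `CountingSink`, and `FakeCapturer` are hypothetical stand-ins written for this example, and `IncomingFrame()` here simply forwards the frame to the registered sink without any conversion or rotation.

```cpp
#include <cassert>

struct Frame {  // Hypothetical frame type.
  int width;
  int height;
};

template <typename FrameT>
class SinkInterface {
 public:
  virtual ~SinkInterface() = default;
  virtual void OnFrame(const FrameT& frame) = 0;
};

// A sink that only counts the frames it receives.
class CountingSink : public SinkInterface<Frame> {
 public:
  void OnFrame(const Frame& frame) override {
    ++frames_;
    last_width_ = frame.width;
  }
  int frames() const { return frames_; }
  int last_width() const { return last_width_; }

 private:
  int frames_ = 0;
  int last_width_ = 0;
};

// Hypothetical capturer holding at most one registered sink.
class FakeCapturer {
 public:
  void RegisterCaptureDataCallback(SinkInterface<Frame>* sink) { sink_ = sink; }
  void DeRegisterCaptureDataCallback() { sink_ = nullptr; }

  // Simulates one captured frame arriving from the device.
  void IncomingFrame(const Frame& frame) {
    if (sink_ != nullptr) sink_->OnFrame(frame);
  }

 private:
  SinkInterface<Frame>* sink_ = nullptr;  // Not owned.
};
```

Frames delivered after DeRegisterCaptureDataCallback() are dropped, so a sink must stay registered for as long as it wants to receive data and must outlive its registration.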

1.5 The device information implementation on Windows (DeviceInfoDS)

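Older Windows systems use the DirectShow (DS) multimedia framework, while newer ones generally use Microsoft Media Foundation (MF); newer versions of Windows remain compatible with DirectShow. DeviceInfoDS is the Windows implementation of the class that handles video capture devices. It is declared in modules/video_capture/windows/device_info_ds.h.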

class DeviceInfoDS : public DeviceInfoImpl {
 public:
  // Factory function.
  static DeviceInfoDS* Create();

  DeviceInfoDS();
  ~DeviceInfoDS() override;
  // Initializes the device info
  int32_t Init() override;
  // Number of devices
  uint32_t NumberOfDevices() override;

  /*
   * Returns the available capture devices.
   */
  int32_t GetDeviceName(uint32_t deviceNumber,
                        char* deviceNameUTF8,
                        uint32_t deviceNameLength,
                        char* deviceUniqueIdUTF8,
                        uint32_t deviceUniqueIdUTF8Length,
                        char* productUniqueIdUTF8,
                        uint32_t productUniqueIdUTF8Length) override;

  /*
   * Display OS /capture device specific settings dialog
   */
  int32_t DisplayCaptureSettingsDialogBox(const char* deviceUniqueIdUTF8,
                                          const char* dialogTitleUTF8,
                                          void* parentWindow,
                                          uint32_t positionX,
                                          uint32_t positionY) override;

  // Windows specific

  /* Gets a capture device filter
   The user of this API is responsible for releasing the filter when it is not
   needed.
   */
  IBaseFilter* GetDeviceFilter(const char* deviceUniqueIdUTF8,
                               char* productUniqueIdUTF8 = NULL,
                               uint32_t productUniqueIdUTF8Length = 0);
  // Gets a Windows-specific capability
  int32_t GetWindowsCapability(
      int32_t capabilityIndex,
      VideoCaptureCapabilityWindows& windowsCapability);
  // Gets the product id
  static void GetProductId(const char* devicePath,
                           char* productUniqueIdUTF8,
                           uint32_t productUniqueIdUTF8Length);

 protected:
  // Gets the device information
  int32_t GetDeviceInfo(uint32_t deviceNumber,
                        char* deviceNameUTF8,
                        uint32_t deviceNameLength,
                        char* deviceUniqueIdUTF8,
                        uint32_t deviceUniqueIdUTF8Length,
                        char* productUniqueIdUTF8,
                        uint32_t productUniqueIdUTF8Length);
  // Creates the capability map
  int32_t CreateCapabilityMap(const char* deviceUniqueIdUTF8) override
      RTC_EXCLUSIVE_LOCKS_REQUIRED(_apiLock);

 private:
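  // ICreateDevEnum creates a device enumerator that can list all video capture
  // devices in the system.
  ICreateDevEnum* _dsDevEnum;
  // IEnumMoniker enumerates the devices and is used to iterate over the
  // device list.
  IEnumMoniker* _dsMonikerDevEnum;
  // CoUninitialize is the Windows API function that closes the COM library.
  // This flag presumably records whether CoUninitialize must be called when
  // the object is destroyed.
  bool _CoUninitializeIsRequired;
  // VideoCaptureCapabilityWindows wraps a capture capability on Windows
  // (resolution, frame rate, format, and so on). This vector stores the
  // available Windows capture capabilities.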
  std::vector<VideoCaptureCapabilityWindows> _captureCapabilitiesWindows;
};

VideoCaptureCapabilityWindows, used above, is declared as follows:

struct VideoCaptureCapabilityWindows : public VideoCaptureCapability {
  uint32_t directShowCapabilityIndex;
  bool supportFrameRateControl;
  VideoCaptureCapabilityWindows() {
    directShowCapabilityIndex = 0;
    supportFrameRateControl = false;
  }
};

1.6 Receiving captured data on Windows (CaptureInputPin and CaptureSinkFilter)

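Two classes receive captured data on Windows: CaptureInputPin and CaptureSinkFilter. CaptureInputPin connects directly to a DirectShow pin and is the lower-level of the two, while CaptureSinkFilter processes the frame data received through that pin and also manages connections, so it sits one level higher. CaptureInputPin is declared in modules/video_capture/windows/sink_filter_ds.h and implements a series of functions that interface with Windows, such as connecting, querying, and receiving data.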

// Input pin for camera input
// Implements IMemInputPin, IPin.
class CaptureInputPin : public IMemInputPin, public IPin {
 public:
  CaptureInputPin(CaptureSinkFilter* filter);
  // Sets the requested capability
  HRESULT SetRequestedCapability(const VideoCaptureCapability& capability);

  // Notifications (callbacks) from the filter.
  void OnFilterActivated();
  void OnFilterDeactivated();

 protected:
  virtual ~CaptureInputPin();

 private:
  CaptureSinkFilter* Filter() const;
  // Attempts a connection
  HRESULT AttemptConnection(IPin* receive_pin, const AM_MEDIA_TYPE* media_type);
  std::vector<AM_MEDIA_TYPE*> DetermineCandidateFormats(
      IPin* receive_pin,
      const AM_MEDIA_TYPE* media_type);
  void ClearAllocator(bool decommit);
  HRESULT CheckDirection(IPin* pin) const;

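  // IUnknown
  /*
    STDMETHOD is a macro that typically expands to
    virtual HRESULT STDMETHODCALLTYPE, where STDMETHODCALLTYPE is a
    calling-convention macro that makes the function use stdcall, the standard
    calling convention for COM interfaces.
  */
  // Queries whether this object supports a given interface (for this pin,
  // for example IPin or IMemInputPin).
  // REFIID is a reference to an IID (interface identifier).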
  STDMETHOD(QueryInterface)(REFIID riid, void** ppv) override;

  // clang-format off
  // clang isn't sure what to do with the longer STDMETHOD() function
  // declarations.

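  // IPin
  // Connect() establishes a connection between two DirectShow filters;
  // receive_pin points to the IPin interface of the filter on the receiving
  // end of the data stream.
  STDMETHOD(Connect)(IPin* receive_pin,
                     const AM_MEDIA_TYPE* media_type) override;
  // Accepts a connection from another pin.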
  STDMETHOD(ReceiveConnection)(IPin* connector,
                               const AM_MEDIA_TYPE* media_type) override;
  STDMETHOD(Disconnect)() override;
  STDMETHOD(ConnectedTo)(IPin** pin) override;
  STDMETHOD(ConnectionMediaType)(AM_MEDIA_TYPE* media_type) override;
  STDMETHOD(QueryPinInfo)(PIN_INFO* info) override;
  STDMETHOD(QueryDirection)(PIN_DIRECTION* pin_dir) override;
  STDMETHOD(QueryId)(LPWSTR* id) override;
  STDMETHOD(QueryAccept)(const AM_MEDIA_TYPE* media_type) override;
  STDMETHOD(EnumMediaTypes)(IEnumMediaTypes** types) override;
  STDMETHOD(QueryInternalConnections)(IPin** pins, ULONG* count) override;
  STDMETHOD(EndOfStream)() override;
  STDMETHOD(BeginFlush)() override;
  STDMETHOD(EndFlush)() override;
  STDMETHOD(NewSegment)(REFERENCE_TIME start, REFERENCE_TIME stop,
                        double rate) override;

  // IMemInputPin
  STDMETHOD(GetAllocator)(IMemAllocator** allocator) override;
  STDMETHOD(NotifyAllocator)(IMemAllocator* allocator, BOOL read_only) override;
  STDMETHOD(GetAllocatorRequirements)(ALLOCATOR_PROPERTIES* props) override;
  STDMETHOD(Receive)(IMediaSample* sample) override;
  STDMETHOD(ReceiveMultiple)(IMediaSample** samples, long count,
                             long* processed) override;
  STDMETHOD(ReceiveCanBlock)() override;
  // clang-format on

  SequenceChecker main_checker_;
  SequenceChecker capture_checker_;

  VideoCaptureCapability requested_capability_ RTC_GUARDED_BY(main_checker_);
  // Accessed on the main thread when Filter()->IsStopped() (capture thread not
  // running), otherwise accessed on the capture thread.
  VideoCaptureCapability resulting_capability_;
  DWORD capture_thread_id_ = 0;
  rtc::scoped_refptr<IMemAllocator> allocator_ RTC_GUARDED_BY(main_checker_);
  rtc::scoped_refptr<IPin> receive_pin_ RTC_GUARDED_BY(main_checker_);
  std::atomic_bool flushing_{false};
  std::atomic_bool runtime_error_{false};
  // Holds a referenceless pointer to the owning filter, the name and
  // direction of the pin. The filter pointer can be considered const.
  PIN_INFO info_ = {};
  AM_MEDIA_TYPE media_type_ RTC_GUARDED_BY(main_checker_) = {};
};

CaptureSinkFilter is declared in the same header, modules/video_capture/windows/sink_filter_ds.h.

// Implement IBaseFilter (including IPersist and IMediaFilter).
class CaptureSinkFilter : public IBaseFilter {
 public:
  CaptureSinkFilter(VideoCaptureImpl* capture_observer);
  // Sets the requested capability
  HRESULT SetRequestedCapability(const VideoCaptureCapability& capability);

  // Called on the capture thread.
  // Processes a frame obtained from the camera.
  void ProcessCapturedFrame(unsigned char* buffer,
                            size_t length,
                            const VideoCaptureCapability& frame_info);

  void NotifyEvent(long code, LONG_PTR param1, LONG_PTR param2);
  bool IsStopped() const;

  // IUnknown
  // Interface query: lets a client request another interface of the object.
  // REFIID is a reference to an interface ID.
  STDMETHOD(QueryInterface)(REFIID riid, void** ppv) override;

  // IPersist
  // IPersist method that gets the class ID of the object. A class ID is a
  // globally unique identifier (GUID) that identifies the object's class.
  // CLSID is an alias for the GUID type; clsid points to the GUID that
  // receives the class ID.
  STDMETHOD(GetClassID)(CLSID* clsid) override;

  // IMediaFilter.
  // Gets the current state of the media filter. FILTER_STATE is an enum
  // describing the filter state (stopped, paused, or running).
  STDMETHOD(GetState)(DWORD msecs, FILTER_STATE* state) override;
  // Sets the sync source. IReferenceClock is an interface that provides
  // timestamps and synchronization.
  STDMETHOD(SetSyncSource)(IReferenceClock* clock) override;
  // Gets the current sync source
  STDMETHOD(GetSyncSource)(IReferenceClock** clock) override;
  // Pauses processing of the media stream
  STDMETHOD(Pause)() override;
  // Starts processing the media stream
  STDMETHOD(Run)(REFERENCE_TIME start) override;
  // Stops processing of the media stream
  STDMETHOD(Stop)() override;

  // IBaseFilter
  // Enumerates all pins of the filter
  STDMETHOD(EnumPins)(IEnumPins** pins) override;
  // Finds a specific pin by its identifier
  STDMETHOD(FindPin)(LPCWSTR id, IPin** pin) override;
  // Queries information about the filter
  STDMETHOD(QueryFilterInfo)(FILTER_INFO* info) override;
  // Adds the filter to a filter graph
  STDMETHOD(JoinFilterGraph)(IFilterGraph* graph, LPCWSTR name) override;
  // Queries the vendor information of the filter
  STDMETHOD(QueryVendorInfo)(LPWSTR* vendor_info) override;

 protected:
  virtual ~CaptureSinkFilter();

 private:
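  /*
    WebRTC's own description of SequenceChecker:
    SequenceChecker is a helper class used to help verify that some methods of
    a class are called on the same task queue or thread. A SequenceChecker is
    bound to the task queue it was created on if it was created on one;
    otherwise it is bound to a thread.
  */
  SequenceChecker main_checker_;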
  const rtc::scoped_refptr<ComRefCount<CaptureInputPin>> input_pin_;
  VideoCaptureImpl* const capture_observer_;
  FILTER_INFO info_ RTC_GUARDED_BY(main_checker_) = {};
  // Set/cleared in JoinFilterGraph. The filter must be stopped (no capture)
  // at that time, so no lock is required. While the state is not stopped,
  // the sink will be used from the capture thread.
  IMediaEventSink* sink_ = nullptr;
  FILTER_STATE state_ RTC_GUARDED_BY(main_checker_) = State_Stopped;
};
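
The idea behind SequenceChecker can be approximated with a much smaller, thread-only checker. `ThreadChecker` and `GuardedCounter` below are hypothetical and cover only the "bound to a thread" case; task queues are outside the scope of this sketch.

```cpp
#include <cassert>
#include <thread>

// Hypothetical, thread-only approximation of a sequence checker.
class ThreadChecker {
 public:
  // Binds the checker to the thread that constructs it.
  ThreadChecker() : owner_(std::this_thread::get_id()) {}

  // True only when called on the thread the checker is bound to.
  bool IsCurrent() const { return std::this_thread::get_id() == owner_; }

 private:
  const std::thread::id owner_;
};

// A class whose state may only be touched on its construction thread.
class GuardedCounter {
 public:
  // Returns the new value, or -1 if called from the wrong thread.
  int Increment() {
    if (!checker_.IsCurrent()) return -1;
    return ++value_;
  }

 private:
  ThreadChecker checker_;
  int value_ = 0;
};
```

Calling Increment() from any thread other than the one that constructed the GuardedCounter makes IsCurrent() return false, so the call is rejected instead of racing on value_.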

1.7 Performing capture on Windows (VideoCaptureDS)

The class that performs capture on Windows is VideoCaptureDS, whose parent is VideoCaptureImpl. VideoCaptureDS mainly implements the two functions that start and stop capture, StartCapture() and StopCapture().

class VideoCaptureDS : public VideoCaptureImpl {
 public:
  VideoCaptureDS();

  virtual int32_t Init(const char* deviceUniqueIdUTF8);

  /*************************************************************************
   *
   *   Start/Stop
   *
   *************************************************************************/
  int32_t StartCapture(const VideoCaptureCapability& capability) override;
  int32_t StopCapture() override;

  /**************************************************************************
   *
   *   Properties of the set device
   *
   **************************************************************************/

  bool CaptureStarted() override;
  int32_t CaptureSettings(VideoCaptureCapability& settings) override;

 protected:
  ~VideoCaptureDS() override;

  // Help functions

  int32_t SetCameraOutput(const VideoCaptureCapability& requestedCapability);
  int32_t DisconnectGraph();
  HRESULT ConnectDVCamera();

  DeviceInfoDS _dsInfo RTC_GUARDED_BY(api_checker_);

  IBaseFilter* _captureFilter RTC_GUARDED_BY(api_checker_);
  IGraphBuilder* _graphBuilder RTC_GUARDED_BY(api_checker_);
  IMediaControl* _mediaControl RTC_GUARDED_BY(api_checker_);
  rtc::scoped_refptr<CaptureSinkFilter> sink_filter_
      RTC_GUARDED_BY(api_checker_);
  IPin* _inputSendPin RTC_GUARDED_BY(api_checker_);
  IPin* _outputCapturePin RTC_GUARDED_BY(api_checker_);

  // Microsoft DV interface (external DV cameras)
  IBaseFilter* _dvFilter RTC_GUARDED_BY(api_checker_);
  IPin* _inputDvPin RTC_GUARDED_BY(api_checker_);
  IPin* _outputDvPin RTC_GUARDED_BY(api_checker_);
};
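
The split between a shared base and a platform class that supplies StartCapture()/StopCapture() can be sketched without DirectShow. `Capability`, `CaptureBase`, and `FakePlatformCapture` are hypothetical, and rejecting a second start with -1 is an assumption made for the example, not documented behavior of VideoCaptureDS.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced capability description.
struct Capability {
  int32_t width = 0;
  int32_t height = 0;
  int32_t max_fps = 0;
};

// Plays the role of the shared base: declares the platform-specific hooks.
class CaptureBase {
 public:
  virtual ~CaptureBase() = default;
  virtual int32_t StartCapture(const Capability& capability) = 0;
  virtual int32_t StopCapture() = 0;
  virtual bool CaptureStarted() = 0;
  virtual int32_t CaptureSettings(Capability& settings) = 0;
};

// Plays the role of a platform class; it only tracks state and makes no
// operating system calls.
class FakePlatformCapture : public CaptureBase {
 public:
  int32_t StartCapture(const Capability& capability) override {
    if (started_) return -1;  // Assumption: a second start is an error.
    requested_ = capability;
    started_ = true;
    return 0;
  }
  int32_t StopCapture() override {
    started_ = false;
    return 0;
  }
  bool CaptureStarted() override { return started_; }
  int32_t CaptureSettings(Capability& settings) override {
    settings = requested_;
    return 0;
  }

 private:
  bool started_ = false;
  Capability requested_;
};
```

Code that drives capture only needs the base interface, so swapping in another platform class would not change the calling side.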

2. Data flow diagram

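Based on the UML diagram, I drew a data flow diagram for video frame capture on Windows. It breaks down into roughly four steps:
(1) Get video frame data from the camera (low-level implementation).
(2) Process the video frames obtained from the camera.
(3) Process the video frames obtained from the Windows layer and feed them into the frame handler. Frames coming from the lower layer can be either a RawFrame or a Frame: a RawFrame is usually a frame whose format has not been converted, such as MJPG, while a Frame has usually been converted, for example to I420.
(4) Handle the captured frames. A captured frame can be sent to the renderer for local display, or to the encoding pipeline to be encoded and then sent to the remote end. In most cases both happen at once, so the frame has to be delivered to two different modules in a multicast fashion.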
[Figure: data flow diagram of video frame capture on Windows]
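
The multicast delivery mentioned in step (4), where one captured frame goes to both the renderer and the encoder, can be sketched as a small fan-out sink. `Frame`, `FrameSink`, `FrameBroadcaster`, and `RecordingSink` are hypothetical names; this is a minimal illustration of the idea and does not reproduce any WebRTC class.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Frame {  // Hypothetical frame type carrying only an id.
  int id;
};

class FrameSink {
 public:
  virtual ~FrameSink() = default;
  virtual void OnFrame(const Frame& frame) = 0;
};

// Forwards every incoming frame to all registered sinks.
class FrameBroadcaster : public FrameSink {
 public:
  void AddSink(FrameSink* sink) { sinks_.push_back(sink); }
  void RemoveSink(FrameSink* sink) {
    sinks_.erase(std::remove(sinks_.begin(), sinks_.end(), sink),
                 sinks_.end());
  }
  void OnFrame(const Frame& frame) override {
    for (FrameSink* sink : sinks_) sink->OnFrame(frame);
  }

 private:
  std::vector<FrameSink*> sinks_;  // Not owned.
};

// Stands in for a renderer or an encoder: remembers the ids it has seen.
class RecordingSink : public FrameSink {
 public:
  void OnFrame(const Frame& frame) override { ids_.push_back(frame.id); }
  const std::vector<int>& ids() const { return ids_; }

 private:
  std::vector<int> ids_;
};
```

Because the broadcaster is itself a sink, it could be registered as the single data callback of a capturer while the renderer and encoder attach to the broadcaster.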