CVPR 2021 Paper List (English-Chinese Parallel) CVPR 2021 论文列表（中英对照）

Scale-Localized Abstract Reasoning 尺度局部化抽象推理
How Does Topology Influence Gradient Propagation and Model Performance of 拓扑如何影响梯度传播和模型性能
AdaBins Depth Estimation Using Adaptive Bins 使用自适应 bin 的 AdaBins 深度估计
Deep Burst Super-Resolution 深度连拍超分辨率
Euro-PVI Pedestrian Vehicle Interactions in Dense Urban Centers 密集城市中心的 Euro-PVI 行人车辆互动
View Generalization for Single Image Textured 3D Models 单图像纹理 3D 模型的视角泛化
MetaHTR Towards Writer-Adaptive Handwritten Text Recognition MetaHTR 迈向书写者自适应的手写文本识别
More Photos Are All You Need Semi-Supervised Learning for Fine-Grained 更多照片就是你所需要的一切：细粒度的半监督学习
Vectorization and Rasterization Self-Supervised Learning for Sketch and Handwriting 草图和手写的矢量化和光栅化自我监督学习
Quantum Permutation Synchronization 量子置换同步
Behavior-Driven Synthesis of Human Dynamics 人类动力学的行为驱动综合
Understanding Object Dynamics for Interactive Image-to-Video Synthesis 了解交互式图像到视频合成的对象动力学
Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning 用于基于自我训练的转导零样本学习的硬度采样
Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions 使用关系布局进行人与物体交互的分层视频预测
Convolutional Dynamic Alignment Networks for Interpretable Classifications 用于可解释分类的卷积动态对齐网络
Towards Part-Based Understanding of RGB-D Scans 对 RGB-D 扫描的基于部件的理解
InverseForm A Loss Function for Structured Boundary-Aware Segmentation InverseForm 用于结构化边界感知分割的损失函数
Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need? 无需元学习的小样本分割：一个好的转导推理就是你所需要的一切？
OCONet Image Extrapolation by Object Completion 通过对象完成进行 OCONet 图像外推
Neural Deformation Graphs for Globally-Consistent Non-Rigid Reconstruction 用于全局一致非刚性重建的神经变形图
GAIA A Transfer Learning System of Object Detection That Fits GAIA 适合的目标检测迁移学习系统
Asymmetric Metric Learning for Knowledge Transfer 知识转移的不对称度量学习
Fine-Grained Angular Contrastive Learning With Coarse Labels 带有粗标签的细粒度角度对比学习
Limitations of Post-Hoc Feature Alignment for Robustness 事后特征对齐对鲁棒性的限制
FBI-Denoiser Fast Blind Image Denoiser for Poisson-Gaussian Noise 用于泊松-高斯噪声的 FBI-Denoiser 快速盲图像降噪器
Deep Lesion Tracker Monitoring Lesions in 4D Longitudinal Imaging Studies 深度病变跟踪器监测 4D 纵向成像研究中的病变
Exponential Moving Average Normalization for Self-Supervised and Semi-Supervised Learning 自我监督和半监督学习的指数移动平均归一化
Extreme Rotation Estimation Using Dense Correlation Volumes 使用密集相关体积的极端旋转估计
Rethinking Graph Neural Architecture Search From Message-Passing 从消息传递重新思考图神经架构搜索
Revisiting Superpixels for Active Learning in Semantic Segmentation With Realistic 用真实的语义分割重新审视超像素以进行主动学习
Semantic Scene Completion via Integrating Instances and Scene In-the-Loop 通过集成实例和场景在环完成语义场景
Debiased Subjective Assessment of Real-World Image Enhancement 真实世界图像增强的去偏主观评估
Normal Integration via Inverse Plane Fitting With Minimum Point-to-Plane Distance 通过具有最小点到平面距离的逆平面拟合进行法线积分
ReMix Towards Image-to-Image Translation With Limited Data ReMix 以有限的数据实现图像到图像的转换
Sequential Graph Convolutional Network for Active Learning 用于主动学习的序列图卷积网络
MP3 A Unified Model To Map Perceive Predict and Plan MP3 映射感知预测和计划的统一模型
Architectural Adversarial Robustness The Case for Deep Pursuit 架构对抗鲁棒性深度追求的案例
Deep Perceptual Preprocessing for Video Coding 视频编码的深度感知预处理
Ensembling With Deep Generative Views 使用深度生成视图进行集成
To the Point Efficient 3D Object Detection in the Range 范围内的点高效 3D 对象检测
Truly Shift-Invariant Convolutional Neural Networks 真正的移位不变卷积神经网络
BasicVSR The Search for Essential Components in Video Super-Resolution and BasicVSR 视频超分辨率中基本组件的探索
GLEAN Generative Latent Bank for Large-Factor Image Super-Resolution 用于大因子图像超分辨率的 GLEAN Generative Latent Bank
Pi-GAN Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis 用于 3D 感知图像合成的 Pi-GAN 周期性隐式生成对抗网络
Adaptive Convolutions for Structure-Aware Style Transfer 结构感知风格迁移的自适应卷积
A Closer Look at Fourier Spectrum Discrepancies for CNN-Generated Images 仔细研究 CNN 生成图像的傅立叶谱差异
Learning Discriminative Prototypes With Dynamic Time Warping 学习具有动态时间规整的判别原型
Towards Robust Classification Model by Counterfactual and Invariant Data Generation 通过反事实和不变数据生成实现稳健的分类模型
Your Flamingo is My Bird Fine-Grained or Not 你的火烈鸟是我的鸟：细粒度与否
Conceptual 12M Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual 概念 12M 推送 Web 规模的图像文本预训练以识别长尾视觉
DexYCB A Benchmark for Capturing Hand Grasping of Objects DexYCB 用于捕获抓握物体的基准
On Focal Loss for Class-Posterior Probability Estimation A Theoretical Perspective 关于类后验概率估计的焦点损失的理论视角
Semi-Supervised Synthesis of High-Resolution Editable Textures for 3D Humans 用于 3D 人体的高分辨率可编辑纹理的半监督合成
Transformer Interpretability Beyond Attention Visualization 超越注意力可视化的 Transformer 可解释性
How Privacy-Preserving Are Line Clouds Recovering Scene Details From 3D 线云的隐私保护程度如何：从 3D 中恢复场景细节
Adaptive Image Transformer for One-Shot Object Detection 用于单样本目标检测的自适应图像 Transformer
AQD Towards Accurate Quantized Object Detection AQD 迈向准确的量化目标检测
Blind Deblurring for Saturated Images 饱和图像的盲去模糊
Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D 通过语义聚合和自适应 2D-1D 进行相机空间手网格恢复
Class-Aware Robust Adversarial Training for Object Detection 用于对象检测的类感知鲁棒对抗训练
Contrastive Neural Architecture Search With Neural Architecture Comparators 使用神经架构比较器进行对比神经架构搜索
DECOR-GAN 3D Shape Detailization by Conditional Refinement 条件细化的 DECOR-GAN 3D 形状细节化
Deep Analysis of CNN-Based Spatio-Temporal Representations for Action Recognition 深度分析基于 CNN 的时空表示以进行动作识别
Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity 利用跨层统计自相似性进行深度纹理识别
Delving Deep Into Many-to-Many Attention for Few-Shot Video Object Segmentation 深入研究小样本视频对象分割的多对多注意力
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning 通过组合对比学习提炼视听知识
Distilling Knowledge via Knowledge Review 通过知识回顾提炼知识
DualAST Dual Style-Learning Networks for Artistic Style Transfer 用于艺术风格迁移的 DualAST 双风格学习网络
Dynamic Region-Aware Convolution 动态区域感知卷积
ECKPN Explicit Class Knowledge Propagation Network for Transductive Few-Shot Learning 用于转导小样本学习的 ECKPN 显式类知识传播网络
Efficient Object Embedding for Spliced Image Retrieval 用于拼接图像检索的高效对象嵌入
Equivariant Point Network for 3D Point Cloud Analysis 用于 3D 点云分析的等变点网络
Exploring Simple Siamese Representation Learning 探索简单的孪生表示学习
FS-Net Fast Shape-Based Network for Category-Level 6D Object Pose Estimation 用于类别级 6D 对象姿态估计的 FS-Net 快速基于形状的网络
GeoSim Realistic Video Simulation via Geometry-Aware Composition for Self-Driving GeoSim 通过几何感知组合实现自动驾驶的真实视频模拟
High-Fidelity Face Tracking for AR/VR via Deep Lighting Adaptation 通过深度照明适应实现 AR/VR 的高保真人脸跟踪
Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles 具有特定于动词的语义角色的类人可控图像字幕
Hybrid Rotation Averaging A Fast and Robust Rotation Averaging Approach 混合旋转平均一种快速且稳健的旋转平均方法
I3Net Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors I3Net 隐式实例不变网络适用于单阶段目标检测器
Indoor Lighting Estimation Using an Event Camera 使用事件相机进行室内照明估计
Jigsaw Clustering for Unsupervised Visual Representation Learning 用于无监督视觉表示学习的拼图聚类
Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification 用于无监督人员重新识别的联合生成和对比学习
Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification 为纹理不敏感的行人重识别学习 3D 形状特征
Learning a Non-Blind Deblurring Network for Night Blurry Images 学习用于夜间模糊图像的非盲去模糊网络
Learning Continuous Image Representation With Local Implicit Image Function 使用局部隐式图像函数学习连续图像表示
Learning Feature Aggregation for Deep 3D Morphable Models 深度 3D 可变形模型的学习特征聚合
Learning Student Networks in the Wild 在真实场景中学习学生网络
Learning the Best Pooling Strategy for Visual Semantic Embedding 学习视觉语义嵌入的最佳池化策略
Localizing Visual Sounds the Hard Way 以困难的方式定位视觉声音
MagDR Mask-Guided Detection and Reconstruction for Defending Deepfakes 用于防御 Deepfake 的 MagDR 掩模引导检测和重建
Model-Based 3D Hand Reconstruction via Self-Supervised Learning 通过自我监督学习进行基于模型的 3D 手部重建
MonoRUn Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation MonoRUn 通过重构和不确定性传播的单目 3D 对象检测
Neural Feature Search for RGB-Infrared Person Re-Identification 用于 RGB-红外行人重识别的神经特征搜索
One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking 通过多样性引导搜索空间收缩的 One-Shot 神经集成架构搜索
Pareto Self-Supervised Training for Few-Shot Learning 用于小样本学习的帕累托自监督训练
Perceptual Indistinguishability-Net PI-Net Facial Image Obfuscation With Manipulable Semantics 具有可操作语义的面部图像混淆
Points As Queries Weakly Semi-Supervised Object Detection by Points 点作为查询：基于点的弱半监督目标检测
Predicting Human Scanpaths in Visual Question Answering 在视觉问答中预测人类扫描路径
Pre-Trained Image Processing Transformer 预训练的图像处理 Transformer
Progressive Semantic-Aware Style Transformation for Blind Face Restoration 用于盲人脸恢复的渐进式语义感知风格转换
PSD Principled Synthetic-to-Real Dehazing Guided by Physical Priors 由物理先验引导的 PSD 原则合成到真实去雾
Reformulating HOI Detection As Adaptive Set Prediction 将 HOI 检测重新定义为自适应集预测
Robust and Accurate Object Detection via Adversarial Learning 通过对抗性学习进行稳健且准确的目标检测
Robust Representation Learning With Feedback for Single Image Deraining 具有反馈的鲁棒表示学习，用于单幅图像去雨
S2R-DepthNet Learning a Generalizable Depth-Specific Structural Representation S2R-DepthNet 学习可泛化的深度特定结构表示
Scale-Aware Automatic Augmentation for Object Detection 用于对象检测的规模感知自动增强
Scan2Cap Context-Aware Dense Captioning in RGB-D Scans RGB-D 扫描中的 Scan2Cap 上下文感知密集字幕
Scene Text Telescope Text-Focused Scene Image Super-Resolution 场景文本望远镜文本聚焦场景图像超分辨率
Semantic Audio-Visual Navigation 语义视听导航
Semi-Supervised Domain Adaptation Based on Dual-Level Domain Mixing for Semantic 基于双级域混合的语义半监督域自适应
Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision 具有交叉伪监督的半监督语义分割
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection 用于场景边界检测的镜头对比自监督学习
The Lottery Tickets Hypothesis for Supervised and Self-Supervised Pre-Training in 监督和自我监督预训练的彩票假说
Topological Planning With Transformers for Vision-and-Language Navigation 使用 Transformer 进行视觉和语言导航的拓扑规划
Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning 面向弱监督密集事件字幕的桥接事件字幕和句子定位器
Transformer Tracking Transformer 跟踪
Triple-Cooperative Video Shadow Detection 三重协作视频阴影检测
Wasserstein Contrastive Representation Distillation Wasserstein 对比表示蒸馏法
Wide-Baseline Relative Camera Pose Estimation With Directional Learning 具有方向学习的宽基线相对相机姿态估计
You Only Look One-Level Feature 你只看一级特征
(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network 稀疏语义分割网络中具有自适应特征选择的注意力特征融合
Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point 点中基于投票的 3D 对象检测的回溯代表点
Boundary IoU Improving Object-Centric Image Segmentation Evaluation 边界 IoU 改进以对象为中心的图像分割评估
Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection 学习与细粒度新奇检测一致的深度分类器
Learning To Filter Siamese Relation Network for Robust Tracking 学习过滤孪生关系网络以实现鲁棒跟踪
Light Field Super-Resolution With Zero-Shot Learning 具有零样本学习的光场超分辨率
Memory-Efficient Network for Large-Scale Video Compressive Sensing 用于大规模视频压缩感知的内存高效网络
Modular Interactive Video Object Segmentation Interaction-to-Mask Propagation and Difference-Aware Fusion 模块化交互式视频对象分割 Interaction-to-Mask 传播和差异感知融合
Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up 通过集成自顶向下和自底向上的单目 3D 多人姿势估计
Multi-View 3D Reconstruction of a Texture-Less Smooth Surface of Unknown 未知的无纹理光滑表面的多视图 3D 重建
NBNet Noise Basis Learning for Image Denoising With Subspace Projection 基于子空间投影的图像去噪的 NBNet 噪声基础学习
Style-Aware Normalized Loss for Improving Arbitrary Style Transfer 用于改进任意风格迁移的风格感知归一化损失
Semantic-Aware Knowledge Distillation for Few-Shot Class-Incremental Learning 少样本增量学习的语义感知知识蒸馏
Navigating the GAN Parameter Space for Semantic Image Editing 为语义图像编辑导航 GAN 参数空间
Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion 特征级协作：光流、立体深度和相机运动的联合无监督学习
Test-Time Fast Adaptation for Dynamic Scene Deblurring via Meta-Auxiliary Learning 基于元辅助学习的动态场景去模糊测试时间快速适应
Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes 新颖场景稀疏视图的学习型视图合成
PiCIE Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering 在聚类中使用不变性和等变性进行 PiCIE 无监督语义分割
Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video 超越静态特征，用于视频时间一致的3D人类姿势和形状估计
Meta Batch-Instance Normalization for Generalizable Person Re-Identification 用于可泛化行人重识别的元批量-实例归一化
RobustNet Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective RobustNet 通过实例选择改进城市场景分割中的域泛化
Shared Cross-Modal Trajectory Prediction for Autonomous Driving 自动驾驶的共享跨模态轨迹预测
VaB-AL Incorporating Class Imbalance and Difficulty With Variational Bayes for VaB-AL 将类不平衡和难度与变分贝叶斯相结合
VITON-HD High-Resolution Virtual Try-On via Misalignment-Aware Normalization VITON-HD 高分辨率虚拟试穿通过错位感知归一化
Mask-ToF Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging 学习微透镜掩模以在飞行时间成像中进行飞行像素校正
Probabilistic Embeddings for Cross-Modal Retrieval 跨模态检索的概率嵌入
Multi-Label Learning From Single Positive Labels 从单个正标签进行多标签学习
Correlated Input-Dependent Label Noise in Large-Scale Image Classification 大规模图像分类中的相关输入相关标签噪声
Differentiable Patch Selection for Image Recognition 图像识别的可微分块选择
SMPLicit Topology-Aware Generative Model for Clothed People SMPLicit 着衣人体的拓扑感知生成模型
Zillow Indoor Dataset Annotated Floor Plans With 360deg Panoramas and 3D Room Layouts Zillow 室内数据集带有 360 度全景图和 3D 房间布局的带注释的平面图
Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation 具有连续速率自适应的非对称增益深度图像压缩
GANmut Learning Interpretable Conditional Space for Gamut of Emotions 学习情绪范围的可解释条件空间
Geus Part-Aware Panoptic Segmentation Geus 部分感知全景分割
Animating Pictures With Eulerian Motion Fields 用欧拉运动场动画图片
Composing Photos Like a Photographer 像摄影师一样构图
Disentangling Label Distribution for Long-Tailed Visual Recognition 解开长尾视觉识别的标签分布
Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification 细粒度形貌互学习换衣人再识别
LiDAR-Based Panoptic Segmentation via Dynamic Shifting Network 基于 LiDAR 的动态移动网络全景分割
LPSNet A Lightweight Solution for Fast Panoptic Segmentation LPSNet 一种用于快速全景分割的轻量级解决方案
Panoramic Image Reflection Removal 全景图像反射去除
Reinforced Attention for Few-Shot Learning and Beyond 用于小样本学习及其他任务的强化注意力
StereoPIFu Depth Aware Clothed Human Digitization via Stereo Vision StereoPIFu 深度感知穿衣人体数字化通过立体视觉
Student-Teacher Learning From Clean Inputs to Noisy Inputs 从干净输入到噪声输入的师生学习
StyleMix Separating Content and Style for Enhanced Data Augmentation StyleMix 分离内容和样式以增强数据增强
Transformation Driven Visual Reasoning 转换驱动的视觉推理
VLN BERT A Recurrent Vision-and-Language BERT for Navigation VLN BERT 用于导航的循环视觉和语言 BERT
DSRNA Differentiable Search of Robust Neural Architectures 稳健神经架构的 DSRNA 可微搜索
Image Change Captioning by Learning From an Auxiliary Task 通过辅助任务学习图像更改字幕
Affordance Transfer Learning for Human-Object Interaction Detection 人与物体交互检测的可供性迁移学习
BiCnet-TKS Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification BiCnet-TKS 学习用于视频人物重新识别的高效时空表示
Coordinate Attention for Efficient Mobile Network Design 用于高效移动网络设计的坐标注意力
Detecting Human-Object Interaction via Fabricated Compositional Learning 通过虚构的组合学习检测人与物体的交互
Exploring Data-Efficient 3D Scene Understanding With Contrastive Scene Contexts 使用对比场景上下文探索数据高效的 3D 场景理解
Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object 跨域弱监督对象的信息一致对应挖掘
Towards High Fidelity Face Relighting With Realistic Shadows 使用逼真的阴影实现高保真面部重新照明
Visualizing Adapted Knowledge in Domain Transfer 可视化领域迁移中的适应知识
Three Ways To Improve Semantic Segmentation With Self-Supervised Depth Estimation 使用自监督深度估计改进语义分割的三种方法
DARCNN Domain Adaptive Region-Based Convolutional Neural Network for Unsupervised Instance 用于无监督实例的 DARCNN 域自适应区域卷积神经网络
A2-FPN Attention Aggregation Based Feature Pyramid Network for Instance Segmentation 用于实例分割的基于 A2-FPN 注意力聚合的特征金字塔网络
AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries AdCo：有效学习自训练的负面对手的无监督表示的对抗对比
Bidirectional Projection Network for Cross Dimension Scene Understanding 用于跨维度场景理解的双向投影网络
Dense Relation Distillation With Context-Aware Aggregation for Few-Shot Object Detection 具有上下文感知聚合的密集关系蒸馏，用于小样本目标检测
Distilling Causal Effect of Data in Class-Incremental Learning 在类增量学习中提取数据的因果效应
Efficient Deformable Shape Correspondence via Multiscale Spectral Manifold Wavelets Preservation 基于多尺度谱流形小波保存的高效可变形形状对应
FVC A New Framework Towards Deep Video Compression in Feature Space FVC 一个面向特征空间深度视频压缩的新框架
Learning Cross-Modal Retrieval With Noisy Labels 学习带噪声标签的跨模态检索
Learning Position and Target Consistency for Memory-Based Video Object Segmentation 基于内存的视频对象分割的学习位置和目标一致性
Model-Aware Gesture-to-Gesture Translation 模型感知手势到手势转换
Pseudo 3D Auto-Correlation Network for Real Image Denoising 用于真实图像去噪的伪 3D 自相关网络
Safe Local Motion Planning With Self-Supervised Freespace Forecasting 具有自我监督自由空间预测的安全局部运动规划
SAIL-VOS 3D A Synthetic Dataset and Baselines for Object Detection SAIL-VOS 3D 目标检测的合成数据集和基线
Self-Supervised 3D Mesh Reconstruction From Single Images 从单幅图像进行自我监督 3D 网格重建
SimPLE Similar Pseudo Label Exploitation for Semi-Supervised Classification 用于半监督分类的 SimPLE 相似伪标签开发
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds A Dataset 迈向城市规模 3D 点云的语义分割数据集
Wide-Depth-Range 6D Object Pose Estimation in Space 空间中的宽深度范围 6D 对象姿态估计
A Multiplexed Network for End-to-End Multilingual OCR 用于端到端多语言 OCR 的多路复用网络
Brain Image Synthesis With Unsupervised Multivariate Canonical CSCl4Net 使用无监督多元规范 CSCl4Net 进行脑图像合成
Cross-View Regularization for Domain Adaptive Panoptic Segmentation 域自适应全景分割的跨视图正则化
Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging 用于光谱压缩成像的深高斯尺度混合先验
DeepLM Large-Scale Nonlinear Least Squares on Deep Learning Frameworks Using DeepLM 在深度学习框架上使用的大规模非线性最小二乘法
DI-Fusion Online Implicit 3D Reconstruction With Deep Priors 具有深度先验的 DI-Fusion 在线隐式 3D 重建
Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling 个性化几何和纹理建模的小样本人体运动传输
FSDR Frequency Space Domain Randomization for Domain Generalization 用于域泛化的 FSDR 频域域随机化
Geo-FARM Geodesic Factor Regression Model for Misaligned Pre-Shape Responses in Geo-FARM 测地线因子回归模型，用于未对齐的预形状响应
Group Whitening Balancing Learning Efficiency and Representational Capacity 组白化：平衡学习效率和表征能力
Look Before You Leap Learning Landmark Features for One-Stage Visual Grounding 三思而后行：学习用于单阶段视觉定位的地标特征
Memory Oriented Transfer Learning for Semi-Supervised Image Deraining 半监督图像去雨的面向记忆的迁移学习
MetaSets Meta-Learning on Point Sets for Generalizable Representations MetaSets 基于点集的元学习用于泛化表示
MetricOpt Learning To Optimize Black-Box Evaluation Metrics MetricOpt 学习优化黑盒评估指标
MOS Towards Scaling Out-of-Distribution Detection for Large Semantic Space MOS 面向扩展大语义空间的分布外检测
MultiBodySync Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization MultiBodySync 通过 3D 扫描同步进行多体分割和运动估计
Neighbor2Neighbor Self-Supervised Denoising From Single Noisy Images Neighbor2Neighbor 从单个噪声图像中进行自我监督去噪
Predator Registration of 3D Point Clouds With Low Overlap Predator 低重叠 3D 点云的配准
Revisiting Knowledge Distillation An Inheritance and Exploration Framework 重温知识蒸馏的继承和探索框架
S3 Learnable Sparse Signal Superdensity for Guided Depth Estimation 用于引导深度估计的 S3 可学习稀疏信号超密度
Searching by Generating Flexible and Efficient One-Shot NAS With Architecture 通过生成具有架构的灵活高效的 One-Shot NAS 进行搜索
Seeing Out of the Box End-to-End Pre-Training for Vision-Language Representation 跳出框外看：视觉语言表示的端到端预训练
Self-Supervised Motion Learning From Static Images 从静态图像中进行自我监督运动学习
Self-Supervised Video Representation Learning by Context and Motion Decoupling 基于上下文和运动解耦的自监督视频表示学习
Video Rescaling Networks With Joint Optimization Strategies for Downscaling and Upscaling 具有用于缩小和放大的联合优化策略的视频重新缩放网络
VS-Net Voting With Segmentation for Visual Localization 视觉定位的分割投票
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework 当年龄不变的人脸识别遇到人脸年龄合成：一个多任务学习框架
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation 用于语言查询视频演员分割的协作时空建模
Learning the Non-Differentiable Optimization for Blind Super-Resolution 学习盲超分辨率的不可微优化
ATSO Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation ATSO 异步师生优化半监督图像分割
Self-Supervised Multi-Frame Monocular Scene Flow 自监督多帧单目场景流
Progressive Semantic Segmentation 渐进式语义分割
Exemplar-Based Open-Set Panoptic Segmentation Network 基于示例的开放集全景分割网络
Self-Supervised Video GANs Learning for Appearance Consistency and Motion Coherency 自我监督视频 GAN 学习外观一致性和运动连贯性
3D Shape Generation With Grid-Based Implicit Functions 使用基于网格的隐式函数生成 3D 形状
Shape From Sky Polarimetric Normal Recovery Under the Sky 天空下的极化法线恢复形状
Optimal Quantization Using Scaled Codebook 使用缩放码本的最佳量化
Depth Completion With Twin Surface Extrapolation at Occlusion Boundaries 在遮挡边界处使用双曲面外推的深度补全
Passive Inter-Photon Imaging 被动光子间成像
Multi-Target Domain Adaptation With Collaborative Consistency Learning 具有协作一致性学习的多目标域适应
Facial Action Unit Detection With Transformers 使用 Transformer 进行面部动作单元检测
Learning High Fidelity Depths of Dressed Humans by Watching Social 通过观看社交来学习穿着打扮的人的高保真深度
KeypointDeformer Unsupervised 3D Keypoint Discovery for Shape Control 用于形状控制的无监督 3D 关键点发现
CAMERAS Enhanced Resolution and Sanity Preserving Class Activation Mapping for Image Saliency 图像显著性的增强分辨率和健全性保留类激活映射
MeanShift Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking MeanShift 极快模式搜索及其在分割和目标跟踪中的应用
NewtonianVAE Proportional Control and Goal Identification From Pixels via Physical NewtonianVAE 比例控制和通过物理从像素识别目标
Quantifying Explainers of Graph Neural Networks in Computational Pathology 量化计算病理学中图神经网络的解释器
UV-Net Learning From Boundary Representations UV-Net 从边界表示中学习
Mining Better Samples for Contrastive Learning of Temporal Correspondence 挖掘更好的样本用于时间对应的对比学习
Few-Shot Open-Set Recognition by Transformation Consistency 基于变换一致性的 Few-Shot 开集识别
Interpolation-Based Semi-Supervised Learning for Object Detection 用于目标检测的基于插值的半监督学习
Memory-Guided Unsupervised Image-to-Image Translation 记忆引导的无监督图像到图像转换
Audio-Driven Emotional Video Portraits 音频驱动的情感视频肖像
Calibrated RGB-D Salient Object Detection 校准的 RGB-D 显着目标检测
Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling 通过多评估者一致性建模学习校准的医学图像分割
Refine Myself by Teaching Myself Feature Refinement via Self-Knowledge Distillation 通过自我知识蒸馏自学特征细化来细化自己
Intentonomy A Dataset and Study Towards Human Intent Understanding Intentonomy 一个数据集和对人类意图理解的研究
IoU Attack Towards Temporally Coherent Black-Box Adversarial Attack for Visual IoU 攻击对视觉的时间相干黑盒对抗攻击
Leveraging Line-Point Consistence To Preserve Structures for Wide Parallax Image 利用线点一致性保留宽视差图像的结构
Scalability vs. Utility Do We Have To Sacrifice One for 可扩展性与实用性我们是否必须为此牺牲一个？
Learning Compositional Representation for 4D Captures With Neural ODE 使用神经 ODE 学习 4D 捕获的组合表示
Learning Optical Flow From a Few Matches 从少量匹配中学习光流
Regressive Domain Adaptation for Unsupervised Keypoint Detection 无监督关键点检测的回归域自适应
Robust Reference-Based Super-Resolution via C2-Matching 通过 C2 匹配实现鲁棒的基于参考的超分辨率
Saliency-Guided Image Translation 显着性引导的图像翻译
EffiScene Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation 用于光流、深度、相机姿势和运动分割的无监督联合学习的高效逐像素刚性推断
Harmonious Semantic Line Detection via Maximal Weight Clique Selection 基于最大权团选择的和谐语义线检测
Teachers Do More Than Teach Compressing Image-to-Image Models 教师所做的不仅仅是教授压缩图像到图像模型
Amalgamating Knowledge From Heterogeneous Graph Neural Networks 融合来自异构图神经网络的知识
Cross-Modal Center Loss for 3D Cross-Modal Retrieval 用于 3D 跨模态检索的跨模态中心损失
Locate Then Segment A Strong Pipeline for Referring Image Segmentation 定位然后分割一个强大的管道用于参考图像分割
Turning Frequency to Resolution Video Super-Resolution via Event Cameras 通过事件摄像机将频率转换为分辨率视频超分辨率
Practical Single-Image Super-Resolution Using Look-Up Table 使用查找表的实用单图像超分辨率
Tackling the Ill-Posedness of Super-Resolution Through Adaptive Target Generation 通过自适应目标生成解决超分辨率的不适定性
Towards Open World Object Detection 迈向开放世界目标检测
Joint Deep Model-Based MR Image and Coil Sensitivity Reconstruction Network 基于联合深度模型的 MR 图像和线圈灵敏度重建网络
Fair Feature Distillation for Visual Recognition 视觉识别的公平特征蒸馏
Time Adaptive Recurrent Neural Network 时间自适应递归神经网络
Coarse-Fine Networks for Temporal Activity Detection in Videos 用于视频中时间活动检测的粗细网络
In the Light of Feature Distributions Moment Matching for Neural 根据神经网络的特征分布矩匹配
Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning 无监督深度度量学习的相对阶分析和优化
Blur Noise and Compression Robust Generative Adversarial Networks 模糊噪声和压缩鲁棒生成对抗网络
Unsupervised Learning of Depth and Depth-of-Field Effect From Natural Images with Aperture Rendering Generative Adversarial Networks 使用孔径渲染生成对抗网络从自然图像中进行深度和景深效应的无监督学习
Guided Integrated Gradients An Adaptive Path Method for Removing Noise 引导积分梯度一种自适应路径去除噪声的方法
High-Fidelity Neural Human Motion Transfer From Monocular Video 单目视频的高保真神经人体运动传输
Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single 批量归一化单的快速贝叶斯不确定性估计和减少
Zero-Shot Single Image Restoration Through Controlled Perturbation of Koschmieders Model 通过 Koschmieders 模型的受控扰动实现零样本单图像恢复
MAZE Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation 使用零阶梯度估计的 MAZE 无数据模型窃取攻击
Differentiable SLAM-Net Learning Particle SLAM for Visual Navigation 用于视觉导航的微分 SLAM-Net 学习粒子 SLAM
Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces 一般表面光度立体的未校准神经逆向渲染
Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers 具有重叠双层的深度遮挡感知实例分割
Neural Lumigraph Rendering 神经 Lumigraph 渲染
Hierarchical Lovasz Embeddings for Proposal-Free Panoptic Segmentation 用于无提议全景分割的分层 Lovasz 嵌入
How Transferable Are Reasoning Patterns in VQA VQA 中的推理模式如何转移
Roses Are Red Violets Are Blue… but Should VQA Expect 玫瑰是红紫罗兰是蓝色的……但 VQA 应该期待吗
Neural Response Interpretation Through the Lens of Critical Pathways 从关键路径的角度解读神经反应
Differentiable Diffusion for Dense Depth Estimation From Multi-View Images 多视图图像密集深度估计的微分扩散
UniT Unified Knowledge Transfer for Any-Shot Object Detection and Segmentation 用于 Any-Shot 对象检测和分割的 UniT 统一知识转移
Neural Side-by-Side Predicting Human Preferences for No-Reference Super-Resolution Evaluation 用于无参考超分辨率评估的神经并排预测人类偏好
Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking 用于实时多目标跟踪的多轨道池的判别外观建模
DriveGAN Towards a Controllable High-Quality Neural Simulation DriveGAN 迈向可控的高质量神经仿真
Embedding Transfer With Label Relaxation for Improved Metric Learning 嵌入带有标签松弛的迁移以改进度量学习
Exploiting Spatial Dimensions of Latent in GAN for Real-Time Image 在实时图像中利用 GAN 中潜在的空间维度
High-Quality Stereo Image Restoration From Double Refraction 双折射的高质量立体图像恢复
HOTR End-to-End Human-Object Interaction Detection With Transformers HOTR 使用 Transformer 的端到端人与物体交互检测
Improving Accuracy of Binary Neural Networks Using Unbalanced Activation Distribution 使用不平衡激活分布提高二值神经网络的准确性
IronMask Modular Architecture for Protecting Deep Face Template IronMask 模块化架构保护深面模板
Joint Negative and Positive Learning for Noisy Labels 噪声标签的联合消极和积极学习
KOALAnet Blind Super-Resolution Using Kernel-Oriented Adaptive Local Adjustment 使用面向内核的自适应局部调整的 KOALAnet 盲超分辨率
LaPred Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents 动态智能体多模态未来轨迹的车道感知预测
Not Just Compete but Collaborate Local Image-to-Image Translation via Cooperative 不仅仅是竞争，而是通过合作进行本地图像到图像的翻译
Prototype-Guided Saliency Feature Learning for Person Search 用于人员搜索的原型引导显着性特征学习
Quality-Agnostic Image Recognition via Invertible Decoder 通过可逆解码器的质量不可知图像识别
SetVAE Learning Hierarchical Composition for Generative Modeling of Set-Structured Data 用于集合结构数据生成建模的 SetVAE 学习分层组合
Task-Aware Variational Adversarial Active Learning 任务感知变分对抗主动学习
XProtoNet Diagnosis in Chest Radiography With Global and Local Explanations XProtoNet 具有全局和局部解释的胸部 X 光诊断
FlowStep3D Model Unrolling for Self-Supervised Scene Flow Estimation 用于自监督场景流估计的 FlowStep3D 模型展开
How To Exploit the Transferability of Learned Image Compression to 如何利用学习图像压缩的可迁移性
Cuboids Revisited Learning Robust 3D Shape Fitting to Single RGB Images 重新审视长方体：学习对单个 RGB 图像的稳健 3D 形状拟合
T-vMF Similarity for Regularizing Intra-Class Feature Distribution 用于正则化类内特征分布的 T-vMF 相似性
Learning Monocular 3D Reconstruction of Articulated Categories From Motion 从运动中学习铰接类别的单目 3D 重建
MoViNets Mobile Video Networks for Efficient Video Recognition 用于高效视频识别的 MoViNets 移动视频网络
ClassSR A General Framework to Accelerate Super-Resolution Networks by Data ClassSR 一个通过数据加速超分辨率网络的通用框架
Robust Consistent Video Depth Estimation 稳健一致的视频深度估计
Interpretable Social Anchors for Human Trajectory Forecasting in Crowds 人群中人类轨迹预测的可解释社会锚
Weakly-Supervised Physically Unconstrained Gaze Estimation 弱监督物理无约束注视估计
Rethinking Style Transfer From Pixels to Parameterized Brushstrokes 重新思考从像素到参数化笔触的风格转换
QPP Real-Time Quantization Parameter Prediction for Deep Neural Networks 深度神经网络的 QPP 实时量化参数预测
Hierarchical Motion Understanding via Motion Programs 通过运动程序理解分层运动
GrooMeD-NMS Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection GrooMeD-NMS 用于单目 3D 目标检测的分组数学可微分 NMS
Controllable Image Restoration for Under-Display Camera in Smartphones 智能手机屏下摄像头的可控图像恢复
Single-View Robot Pose and Joint Angle Estimation via Render 通过渲染进行单视图机器人姿态和关节角度估计
IMODAL Creating Learnable User-Defined Deformation Models IMODAL 创建可学习的用户定义变形模型
LipSync3D Data-Efficient Learning of Personalized 3D Talking Faces From Video LipSync3D 数据高效学习视频中的个性化 3D 会说话的面孔
Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency 具有方向上下文感知一致性的半监督语义分割
CoCoNets Continuous Contrastive 3D Scene Representations CoCoNets 连续对比 3D 场景表示
Restoring Extremely Dark Images in Real Time 实时恢复极暗图像
BRepNet A Topological Message Passing System for Solid Models BRepNet 实体模型的拓扑消息传递系统
General Multi-Label Image Classification With Transformers 使用 Transformer 的通用多标签图像分类
Pulsar Efficient Sphere-Based Neural Rendering Pulsar 高效的基于球体的神经渲染
Moing Semantic Palette Guiding Scene Generation With Class Proportions Moing 语义调色板用类比例指导场景生成
MongeNet Efficient Sampler for Geometric Deep Learning 用于几何深度学习的 MongeNet 高效采样器
3D Video Stabilization With Depth Estimation by CNN-Based Optimization 通过基于 CNN 的优化进行深度估计的 3D 视频稳定
Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation 弱和半监督语义分割的反对抗操纵属性
BBAM Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation 用于弱监督语义和实例分割的 BBAM 边界框属性图
Blocks-World Cameras 积木世界相机
CoSMo Content-Style Modulation for Image Retrieval With Text Feedback 带有文本反馈的图像检索的 CoSMo 内容样式调制
Depth Completion Using Plane-Residual Representation 使用平面残差表示的深度补全
DRANet Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation 用于无监督跨域适应的 DRANet 解开表示和适应网络
Iterative Filter Adaptive Network for Single Image Defocus Deblurring 用于单图像散焦去模糊的迭代滤波器自适应网络
Large-Scale Localization Datasets in Crowded Indoor Spaces 拥挤室内空间中的大规模定位数据集
Looking Into Your Speech Learning Cross-Modal Affinity for Audio-Visual Speech Separation 深入你的语音：学习用于视听语音分离的跨模态亲和力
Network Quantization With Element-Wise Gradient Scaling 使用逐元素梯度缩放的网络量化
PatchMatch-Based Neighborhood Consensus for Semantic Correspondence 基于 PatchMatch 的语义对应邻域共识
Railroad Is Not a Train Saliency As Pseudo-Pixel Supervision for 铁路不是火车：将显著性作为伪像素监督
Regularization Strategy for Point Cloud via Rigidly Mixed Sample 基于刚性混合样本的点云正则化策略
Relevance-CAM Your Model Already Knows Where To Look Relevance-CAM 你的模型已经知道去哪里找
Restore From Restored Video Restoration With Pseudo Clean Video 使用伪干净视频从恢复的视频恢复中恢复
Rotation-Only Bundle Adjustment 仅旋转捆绑调整
SIPSA-Net: Shift-Invariant Pan Sharpening With Moving Object Alignment for Satellite SIPSA-Net：用于卫星的具有移动对象对齐的移位不变全色锐化
Video Prediction Recalling Long-Term Motion Context via Memory Alignment Learning 通过记忆对齐学习回忆长期运动上下文的视频预测
Less Is More ClipBERT for Video-and-Language Learning via Sparse Sampling 少即是多 ClipBERT 通过稀疏采样进行视频和语言学习
Picasso A CUDA-Based Library for Deep Learning Over 3D Meshes Picasso 基于 CUDA 的 3D 网格深度学习库
Robust Reflection Removal With Reflection-Free Flash-Only Cues 使用无反射仅闪光线索的鲁棒反射去除
2D or not 2D Adaptive 3D Convolution Selection for Efficient Video Recognition 用于高效视频识别的 2D 或非 2D 自适应 3D 卷积选择
3D Human Action Representation Learning via Cross-View Consistency Pursuit 通过跨视图一致性追求的 3D 人类行为表示学习
Action Shuffle Alternating Learning for Unsupervised Action Segmentation 无监督动作分割的动作洗牌交替学习
Adaptive Prototype Learning and Allocation for Few-Shot Segmentation 小样本分割的自适应原型学习和分配
Anchor-Constrained Viterbi for Set-Supervised Action Segmentation 用于集合监督动作分割的锚约束维特比
ARVo Learning All-Range Volumetric Correspondence for Video Deblurring ARVo 学习用于视频去模糊的全范围体积对应
Beyond Max-Margin Class Margin Equilibrium for Few-Shot Object Detection 超越最大间隔：用于少样本目标检测的类间隔均衡
Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene 具有自适应消息传递的无偏场景二分图网络
Causal Hidden Markov Model for Time Series Disease Forecasting 时间序列疾病预测的因果隐马尔可夫模型
Combined Depth Space Based Architecture Search for Person Re-Identification 用于人员重新识别的基于组合深度空间的架构搜索
Continuous Face Aging via Self-Estimated Residual Age Embedding 通过自估计剩余年龄嵌入进行连续人脸老化
Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation 用于半监督域自适应的跨域自适应聚类
CutPaste Self-Supervised Learning for Anomaly Detection and Localization 用于异常检测和定位的 CutPaste 自监督学习
D2IM-Net Learning Detail Disentangled Implicit Fields From Single Images D2IM-Net 从单个图像中学习细节解耦的隐式场
DeepI2P Image-to-Point Cloud Registration via Deep Classification 基于深度分类的 DeepI2P 图像到点云配准
Diverse Part Discovery Occluded Person Re-Identification With Part-Aware Transformer 多样化部件发现：使用部件感知 Transformer 的遮挡行人重识别
Domain Consensus Clustering for Universal Domain Adaptation 通用域自适应的域共识聚类
Dual-Stream Multiple Instance Learning Network for Whole Slide Image Classification 用于全切片图像分类的双流多实例学习网络
Dynamic Class Queue for Large Scale Face Recognition in the 用于大规模人脸识别的动态类队列
Dynamic Domain Adaptation for Efficient Inference 高效推理的动态域适应
Dynamic Slimmable Network 动态可精简网络
Dynamic Transfer for Multi-Source Domain Adaptation 多源域自适应的动态迁移
Ego-Exo Transferring Visual Representations From Third-Person to First-Person Videos Ego-Exo 将视觉表征从第三人称视频转移到第一人称视频
Exploring Adversarial Fake Images on Face Manifold 探索人脸流形上的对抗性假图像
Exploring intermediate representation for monocular vehicle pose estimation 探索用于单目车辆姿态估计的中间表示
FaceInpainter High Fidelity Face Adaptation to Heterogeneous Domains FaceInpainter 高保真人脸适应异构域
Few-Shot Object Detection via Classification Refinement and Distractor Retreatment 通过分类细化和 Distractor Retreatment 进行 Few-Shot 目标检测
Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face 人脸单中心损失监督的频率感知判别特征学习
From Synthetic to Real Unsupervised Domain Adaptation for Animal Pose 从合成到真实：用于动物姿态的无监督域适应
Fully Convolutional Networks for Panoptic Segmentation 用于全景分割的全卷积网络
Generalized Focal Loss V2 Learning Reliable Localization Quality Estimation for Dense Object Detection Generalized Focal Loss V2 学习用于密集对象检测的可靠定位质量估计
Generalizing to the Open World Deep Visual Odometry With Online Adaptation 推广到开放世界：具有在线适应的深度视觉里程计
HCRF-Flow Scene Flow From Point Clouds With Continuous High-Order CRFs HCRF-Flow 使用连续高阶 CRF 从点云估计场景流
Hilbert Sinkhorn Divergence for Optimal Transport 用于最优传输的 Hilbert Sinkhorn 散度
HybrIK A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation HybrIK 用于 3D 人体姿势和形状估计的混合解析-神经逆运动学解决方案
Image-to-Image Translation via Hierarchical Style Disentanglement 通过分层样式解缠结实现图像到图像的转换
Involution Inverting the Inherence of Convolution for Visual Recognition Involution 反转卷积的内在特性以用于视觉识别
Learning Invariant Representations and Risks for Semi-Supervised Domain Adaptation 半监督域适应的学习不变表示和风险
Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression 学习用于不确定性感知回归的概率序数嵌入
Learning To Identify Correct 2D-2D Line Correspondences on Sphere 学习识别球体上正确的 2D-2D 线对应关系
LiDAR R-CNN An Efficient and Universal 3D Object Detector LiDAR R-CNN 一种高效且通用的 3D 物体检测器
Lighting Reflectance and Geometry Estimation From 360° Panoramic Stereo 360 度全景立体照明反射率和几何估计
Meta-Mining Discriminative Samples for Kinship Verification 用于亲属关系验证的元挖掘判别样本
MetaSAug Meta Semantic Augmentation for Long-Tailed Visual Recognition MetaSAug 用于长尾视觉识别的元语义增强
Model-Contrastive Federated Learning 模型对比联邦学习
Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic 用于动态时空视图合成的神经场景流场
NPAS A Compiler-Aware Framework of Unified Network Pruning and Architecture NPAS 一个编译器感知的统一网络修剪和架构框架
On Feature Normalization and Data Augmentation 关于特征归一化和数据增强
OpenRooms An Open Framework for Photorealistic Indoor Scene Datasets OpenRooms 一个用于逼真的室内场景数据集的开放框架
Point Cloud Upsampling via Disentangled Refinement 通过分离细化进行点云上采样
PointFlow Flowing Semantics Through Points for Aerial Image Segmentation 用于航空图像分割的 PointFlow 流语义通过点
PointNetLK Revisited 重访PointNetLK
Pose Recognition With Cascade Transformers 使用级联 Transformer 进行姿势识别
POSEFusion Pose-Guided Selective Fusion for Single-View Human Volumetric Capture POSEFusion 用于单视图人体体积捕获的姿势引导选择性融合
Probabilistic Model Distillation for Semantic Correspondence 语义对应的概率模型蒸馏
Progressive Domain Expansion Network for Single Domain Generalization 单域泛化的渐进域扩展网络
Progressive Stage-Wise Learning for Unsupervised Feature Representation Enhancement 无监督特征表示增强的渐进式阶段学习
QAIR Practical Query-Efficient Black-Box Attacks for Image Retrieval 用于图像检索的 QAIR 实用高效查询黑盒攻击
Ranking Neural Checkpoints 对神经检查点进行排名
Representing Videos As Discriminative Sub-Graphs for Action Recognition 将视频表示为动作识别的判别子图
Searching for Fast Model Families on Datacenter Accelerators 在数据中心加速器上搜索快速模型系列
SelfDoc Self-Supervised Document Representation Learning SelfDoc 自我监督文档表示学习
Self-Point-Flow Self-Supervised Scene Flow Estimation From Point Clouds With Optimal 具有最优点云的自点流自监督场景流估计
Self-Supervised Video Hashing via Bidirectional Transformers 通过双向 Transformer 的自监督视频哈希
Semantic Segmentation With Generative Models Semi-Supervised Learning and Strong Out-of-Domain Generalization 生成模型的语义分割：半监督学习和强大的域外泛化
Spatial Assembly Networks for Image Representation Learning 用于图像表示学习的空间组装网络
Spatial Feature Calibration and Temporal Fusion for Effective One-Stage Video Instance Segmentation 有效单阶段视频实例分割的空间特征校准和时间融合
Spherical Confidence Learning for Face Recognition 人脸识别的球形置信度学习
Surrogate Gradient Field for Latent Space Manipulation 潜在空间操作的替代梯度场
Temporal Action Segmentation From Timestamp Supervision 来自时间戳监督的时间动作分割
The Heterogeneity Hypothesis Finding Layer-Wise Differentiated Network Architectures 发现分层差异化网络架构的异质性假设
Three Birds with One Stone Multi-Task Temporal Action Detection via 一石三鸟：多任务时间动作检测
Toward Accurate and Realistic Outfits Visualization With Attention to Details 注重细节的准确和现实的服装可视化
Towards Compact CNNs via Collaborative Compression 通过协作压缩实现紧凑型 CNN
Transferable Semantic Augmentation for Domain Adaptation 用于域适应的可转移语义增强
Transformation Invariant Few-Shot Object Detection 变换不变的少样本目标检测
UAV-Human A Large Benchmark for Human Behavior Understanding With Unmanned Aerial Vehicles UAV-Human 通过无人机进行人类行为理解的大型基准
Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection 不确定性感知联合显著目标和伪装目标检测
VirFace Enhancing Face Recognition via Unlabeled Shallow Data VirFace 通过未标记的浅层数据增强人脸识别
Virtual Fully-Connected Layer Training a Large-Scale Face Recognition Dataset With 虚拟全连接层训练大规模人脸识别数据集
Domain Adaptation With Auxiliary Target Domain-Oriented Classifier 具有辅助目标面向域分类器的域自适应
Flow-Based Kernel Prior With Application to Blind Super-Resolution 基于流的内核先验应用于盲超分辨率
High-Resolution Photorealistic Image Translation in Real-Time A Laplacian Pyramid Translation 实时高分辨率真实感图像翻译拉普拉斯金字塔翻译
OPANAS One-Shot Path Aggregation Network Architecture Search for Object Detection 用于对象检测的 OPANAS One-Shot Path Aggregation 网络架构搜索
PPR10K A Large-Scale Portrait Photo Retouching Dataset With Human-Region Mask and Group-Level Consistency PPR10K 具有人体区域掩码和组级一致性的大规模人像照片修饰数据集
RangeIoUDet Range Image Based Real-Time 3D Object Detector Optimized by RangeIoUDet 基于范围图像的实时 3D 对象检测器
4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis 具有可靠性分析的 4D 高光谱光声数据恢复
Image Inpainting Guided by Coherence Priors of Semantics and Textures 由语义和纹理的相干先验引导的图像修复
Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets 迈向有效注释大规模图像分类数据集的良好实践
Shape and Material Capture at Home 在家中捕捉形状和材料
Monocular Depth Estimation via Listwise Ranking Using the Plackett-Luce Model 使用 Plackett-Luce 模型通过 Listwise Ranking 进行单目深度估计
Building Reliable Explanations of Unreliable Neural Networks Locally Smoothing Perspective 从局部平滑视角构建不可靠神经网络的可靠解释
Anycost GANs for Interactive Image Synthesis and Editing 用于交互式图像合成和编辑的 Anycost GAN
COMPLETER Incomplete Multi-View Clustering via Contrastive Prediction 通过对比预测完成不完全多视图聚类
Drafting and Revision Laplacian Pyramid Network for Fast High-Quality Artistic 用于快速高质量艺术的拉普拉斯金字塔网络的起草和修订
End-to-End Human Pose and Mesh Reconstruction with Transformers 使用 Transformer 进行端到端人体姿势和网格重建
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization 学习无锚时间动作定位的显著边界特征
MOOD Multi-Level Out-of-Distribution Detection MOOD 多级分布外检测
Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo 平面扫描立体多视图多人 3D 姿态估计
Point2Skeleton Learning Skeletal Representations from Point Clouds Point2Skeleton 从点云中学习骨骼表示
Real-Time High-Resolution Background Matting 实时高分辨率背景抠图
Reciprocal Landmark Detection and Tracking With Extremely Few Annotations 具有极少注释的互惠地标检测和跟踪
Rich Context Aggregation With Reflection Prior for Glass Surface Detection 用于玻璃表面检测的具有反射先验的丰富上下文聚合
Scene-Intuitive Agent for Remote Embodied Visual Grounding 用于远程具身视觉定位的场景直觉智能体
Vx2Text End-to-End Learning of Video-Based Text Generation From Multimodal Inputs Vx2Text 从多模式输入中端到端学习基于视频的文本生成
What Can Style Transfer and Paintings Do for Model Robustness 样式迁移和绘画可以为模型的稳健性做些什么
AutoInt Automatic Integration for Fast Neural Volume Rendering AutoInt 用于快速神经体积渲染的自动积分
Region-Aware Adaptive Instance Normalization for Image Harmonization 用于图像协调的区域感知自适应实例归一化
3D-to-2D Distillation for Indoor Scene Parsing 用于室内场景解析的 3D 到 2D 蒸馏
Adaptive Aggregation Networks for Class-Incremental Learning 用于类增量学习的自适应聚合网络
Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval 用于跨域视觉语言检索的自适应跨模态原型
Anti-Aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation 少样本语义分割的抗锯齿语义重建
Cluster-Wise Hierarchical Generative Model for Deep Amortized Clustering 用于深度摊销聚类的分簇层次生成模型
Content-Aware GAN Compression 内容感知 GAN 压缩
Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding 用于时间句子定位的上下文感知双仿射定位网络
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting 跨模态协同表示学习和大规模 RGBT 人群计数基准
Deep Dual Consecutive Network for Human Pose Estimation 用于人体姿态估计的深度对偶连续网络
Deep Implicit Moving Least-Squares Functions for 3D Reconstruction 用于 3D 重建的深度隐式移动最小二乘函数
Deep Learning in Latent Space for Video Prediction and Compression 用于视频预测和压缩的潜在空间中的深度学习
DeepMetaHandles Learning Deformation Meta-Handles of 3D Meshes With Biharmonic Coordinates DeepMetaHandles 学习具有双调和坐标的 3D 网格的变形元句柄
DeFLOCNet Deep Image Editing via Flexible Low-Level Controls DeFLOCNet 通过灵活的低级控制进行深度图像编辑
Discovering Hidden Physics Behind Transport Dynamics 发现运输动力学背后的隐藏物理
DivCo Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network 通过对比生成对抗网络进行 DivCo 不同条件图像合成
Exploit Visual Dependency Relations for Semantic Segmentation 利用视觉依赖关系进行语义分割
Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation 探索和提炼后验和先验知识的放射学报告生成
FedDG Federated Domain Generalization on Medical Image Segmentation via Episodic FedDG 基于Episodic的医学图像分割的联邦域泛化
From Shadow Generation To Shadow Removal 从阴影生成到阴影去除
Fully Convolutional Scene Graph Generation 全卷积场景图生成
Fully Understanding Generic Objects Modeling Segmentation and Reconstruction 全面理解通用对象建模分割与重构
Generic Perceptual Loss for Modeling Structured Output Dependencies 用于建模结构化输出依赖关系的通用感知损失
Goal-Oriented Gaze Estimation for Zero-Shot Learning 零样本学习的面向目标的注视估计
iMiGUE An Identity-Free Video Dataset for Micro-Gesture Understanding and Emotion Analysis iMiGUE 用于微手势理解和情感分析的无身份视频数据集
Inception Convolution With Efficient Dilation Search 具有高效膨胀搜索的 Inception 卷积
Invertible Denoising Network A Light Solution for Real Noise Removal 可逆去噪网络一种用于真正去噪的轻型解决方案
Learnable Motion Coherence for Correspondence Pruning 对应剪枝的可学习运动相干性
Learning To Warp for Style Transfer 学习变形以进行风格迁移
Mask-Embedded Discriminator With Region-Based Semantic Regularization for Semi-Supervised Class-Conditional Image Synthesis 用于半监督类条件图像合成的具有基于区域语义正则化的掩码嵌入鉴别器
Multimodal Motion Prediction With Stacked Transformers 使用堆叠 Transformer 的多模态运动预测
Multi-Shot Temporal Event Localization A Benchmark 多镜头时间事件定位基准
Neighborhood Normalization for Robust Geometric Feature Learning 稳健几何特征学习的邻域归一化
No Frame Left Behind Full Video Action Recognition 全视频动作识别不留帧
Noise-Resistant Deep Metric Learning With Ranking-Based Instance Selection 具有基于排名的实例选择的抗噪声深度度量学习
One Thing One Click A Self-Training Approach for Weakly Supervised 一物一点击：一种弱监督的自训练方法
Orthogonal Over-Parameterized Training 正交过参数化训练
PD-GAN Probabilistic Diverse GAN for Image Inpainting 用于图像修复的 PD-GAN Probabilistic Diverse GAN
PluckerNet Learn To Register 3D Line Reconstructions PluckerNet 学习注册 3D 线重建
PointGuard Provably Robust 3D Point Cloud Classification PointGuard 可证明稳健的 3D 点云分类
RankDetNet Delving Into Ranking Constraints for Object Detection RankDetNet 深入研究对象检测的排名约束
Rank-One Prior Toward Real-Time Scene Recovery 秩一先验：迈向实时场景恢复
Refer-It-in-RGBD A Bottom-Up Approach for 3D Visual Grounding in RGBD Refer-It-in-RGBD 一种自下而上的方法，用于 RGBD 中的 3D 视觉定位
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding 弱监督视觉定位的关系感知实例细化
Retinex-Inspired Unrolling With Cooperative Prior Architecture Search for Low-Light Image Enhancement Retinex 启发的展开与协作先验架构搜索，用于低光图像增强
Semi-Supervised 3D Hand-Object Poses Estimation With Interactions in Time 半监督 3D 手对象姿势估计与时间交互
SG-Net Spatial Granularity Network for One-Stage Video Instance Segmentation 用于单阶段视频实例分割的 SG-Net 空间粒度网络
Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation 平滑解开的潜在样式空间以进行无监督的图像到图像转换
Source-Free Domain Adaptation for Semantic Segmentation 语义分割的无源域自适应
Spatial-Phase Shallow Learning Rethinking Face Forgery Detection in Frequency Domain 空间-相位浅层学习：重新思考频域中的人脸伪造检测
Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos 视频中人物重新识别的时空相关和拓扑学习
Spatiotemporal Registration for Event-Based Visual Odometry 基于事件的视觉里程计的时空配准
The Blessings of Unlabeled Background in Untrimmed Videos 未修剪视频中未标记背景的好处
Towards Unified Surgical Skill Assessment 迈向统一的手术技能评估
Unsupervised Part Segmentation Through Disentangling Appearance and Shape 通过解开外观和形状进行无监督零件分割
Watching You Global-Guided Reciprocal Learning for Video-Based Person Re-Identification 注视着你：用于基于视频的行人重识别的全局引导互惠学习
Weakly Supervised Instance Segmentation for Videos With Temporal Mask Consistency 具有时间掩模一致性的视频的弱监督实例分割
Zero-Shot Adversarial Quantization 零样本对抗量化
CLCC Contrastive Learning for Color Constancy CLCC 颜色恒常性对比学习
Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks 使用对极时空网络进行多视图深度估计
Radar-Camera Pixel Depth Association for Depth Completion 用于深度补全的雷达-相机像素深度关联
Bridging the Visual Gap Wide-Range Image Blending 弥合视觉差距：大范围图像混合
CGA-Net Category Guided Aggregation for Point Cloud Semantic Segmentation 用于点云语义分割的 CGA-Net 类别引导聚合
Dual-GAN Joint BVP and Noise Modeling for Remote Physiological Measurement 用于远程生理测量的双 GAN 联合 BVP 和噪声建模
Large-Capacity Image Steganography Based on Invertible Neural Networks 基于可逆神经网络的大容量图像隐写术
MASA-SR Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution 基于参考的图像超分辨率的 MASA-SR 匹配加速和空间适应
Metadata Normalization 元数据规范化
Omnimatte Associating Objects and Their Effects in Video Omnimatte 关联对象及其在视频中的效果
Personalized Outfit Recommendation With Learnable Anchors 具有可学习锚的个性化服装推荐
Taskology Utilizing Task Relations at Scale 大规模利用任务关系的任务学
Action Unit Memory Network for Weakly Supervised Temporal Action Localization 用于弱监督时间动作定位的动作单元记忆网络
Conditional Bures Metric for Domain Adaptation 域适应的条件 Bures 度量
Diffusion Probabilistic Models for 3D Point Cloud Generation 用于 3D 点云生成的扩散概率模型
Generalizing Face Forgery Detection With High-Frequency Features 用高频特征推广人脸伪造检测
Intelligent Carpet Inferring 3D Human Pose From Tactile Signals 从触觉信号中推断出 3D 人体姿势的智能地毯
M3DSSD Monocular 3D Single Stage Object Detector M3DSSD 单目 3D 单级物体检测器
Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement 使用 StyleGAN 和感知细化的归一化头像合成
Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation 重新思考自下而上人体姿势估计的热图回归
Scalable Differential Privacy With Sparse Network Finetuning 具有稀疏网络微调的可扩展差分隐私
Self-Supervised Pillar Motion Learning for Autonomous Driving 用于自动驾驶的自监督柱体运动学习
Stay Positive Non-Negative Image Synthesis for Augmented Reality 保持为正：用于增强现实的非负图像合成
UPFlow Upsampling Pyramid for Unsupervised Optical Flow Learning 用于无监督光流学习的 UPFlow 上采样金字塔
Learning Normal Dynamics in Videos With Meta Prototype Network 使用元原型网络学习视频中的正常动态
Learning Semantic Person Image Generation by Region-Adaptive Normalization 通过区域自适应归一化学习语义人物图像生成
Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned 从未对齐的人类多模态情感识别的渐进模态强化
Residential Floor Plan Recognition and Reconstruction 住宅平面图识别和重建
Simultaneously Localize Segment and Rank the Camouflaged Objects 同时定位分段并对伪装对象进行排序
Towards Evaluating and Training Verifiably Robust Neural Networks 评估和训练可验证的鲁棒神经网络
Activate or Not Learning Customized Activation 激活与否：学习自定义激活
CapsuleRRT Relationships-Aware Regression Tracking via Capsules 通过 Capsule 的 CapsuleRRT 关系感知回归跟踪
Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center 具有光度对齐和类别中心的粗到细域自适应语义分割
Context Modeling in 3D Human Pose Estimation A Unified Perspective 3D 人体姿势估计中的上下文建模一个统一的视角
Delving Into Localization Errors for Monocular 3D Object Detection 深入研究单目 3D 目标检测的定位错误
IQDet Instance-Wise Quality Distribution Sampling for Object Detection 用于对象检测的 IQDet 实例质量分布采样
MUST-GAN Multi-Level Statistics Transfer for Self-Driven Person Image Generation 用于自驱动人图像生成的 MUST-GAN 多级统计传输
Pixel Codec Avatars 像素编解码器头像
SCALE Modeling Clothed Humans with a Surface Codec of Articulated SCALE 使用铰接式表面编解码器对穿着衣服的人进行建模
Simulating Unknown Target Models for Query-Efficient Black-Box Attacks 为查询高效的黑盒攻击模拟未知目标模型
Weakly Supervised Action Selection Learning in Video 视频中的弱监督动作选择学习
Generative Classifiers as a Basis for Trustworthy Image Classification 生成分类器作为可信图像分类的基础
Efficient Multi-Stage Video Denoising With Recurrent Spatio-Temporal Fusion 循环时空融合的高效多级视频去噪
MultiLink Multi-Class Structure Recovery via Agglomerative Clustering and Model Selection 通过凝聚聚类和模型选择的 MultiLink 多类结构恢复
SurFree A Fast Surrogate-Free Black-Box Attack SurFree 快速无代理黑盒攻击
Gradient Forward-Propagation for Large-Scale Temporal Video Modelling 用于大规模时间视频建模的梯度前向传播
Magic Layouts Structural Prior for Component Detection in User Interface Magic Layouts Structural Prior 用于用户界面中的组件检测
Open World Compositional Zero-Shot Learning 开放世界组合零样本学习
FCPose Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions FCPose 具有动态实例感知卷积的全卷积多人姿势估计
Generative Interventions for Causal Learning 因果学习的生成性干预
KRISP Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA 为开放领域知识为基础的 VQA 集成隐式和符号知识的 KRISP
A 3D GAN for Improved Large-Pose Facial Recognition 用于改进大姿势面部识别的 3D GAN
NeRF in the Wild Neural Radiance Fields for Unconstrained Photo NeRF in the Wild: 无约束照片集的神经辐射场
Permute Quantize and Fine-Tune Efficient Compression of Neural Networks 置换量化和微调神经网络的有效压缩
Visual Navigation With Spatial Attention 具有空间注意力的视觉导航
How Robust Are Randomized Smoothing Based Defenses to Data Poisoning 基于随机平滑的数据中毒防御的鲁棒性如何
Camouflaged Object Segmentation With Distraction Mining 分心挖掘的伪装对象分割
Depth-Aware Mirror Segmentation 深度感知镜面分割
Image Super-Resolution With Non-Local Sparse Attention 具有非局部稀疏注意的图像超分辨率
PixMatch Unsupervised Domain Adaptation via Pixelwise Consistency Training PixMatch 通过像素一致性训练进行无监督域适应
Playable Video Generation 可播放视频生成
Connecting What To Say With Where To Look by Modeling 通过建模将要说的内容与看的地方联系起来
MagFace A Universal Representation for Face Recognition and Quality Assessment MagFace 人脸识别和质量评估的通用表示
StEP Style-Based Encoder Pre-Training for Multi-Modal Image Synthesis 用于多模态图像合成的 StEP 基于样式的编码器预训练
Real-Time Sphere Sweeping Stereo From Multiview Fisheye Images 从多视图鱼眼图像中进行实时球体扫描立体
An Alternative Probabilistic Interpretation of the Huber Loss Huber损失的另一种概率解释
Physically-Aware Generative Network for 3D Shape Modeling 用于 3D 形状建模的物理感知生成网络
HDMapGen A Hierarchical Graph Generative Model of High Definition Maps HDMapGen 一种高精地图的层次图生成模型
Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution 通过内容自适应多分辨率将单目深度估计模型提升到高分辨率
PVGNet A Bottom-Up One-Stage 3D Object Detector With Integrated Multi-Level PVGNet 一种自下而上的单级 3D 对象检测器，具有集成的多级
VSPW A Large-scale Dataset for Video Scene Parsing in the Wild VSPW 用于野外视频场景解析的大规模数据集
Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent 通过稀疏和解开潜在的排斥吸引的连续语义分割
Thinking Fast and Slow Efficient Text-to-Visual Retrieval With Transformers 快思考与慢思考：使用 Transformer 的高效文本到视觉检索
DeepSurfels Learning Online Appearance Fusion DeepSurfels 学习在线外观融合
LEAP Learning Articulated Occupancy of People LEAP 学习铰接式人员占用
Convolutional Hough Matching Networks 卷积霍夫匹配网络
GATSBI Generative Agent-Centric Spatio-Temporal Object Interaction GATSBI 生成代理中心时空对象交互
Generalized Domain Adaptation 广义域适应
Affect2MM Affective Analysis of Multimedia Content Using Emotion Causality 使用情感因果关系的多媒体内容的 Affect2MM 情感分析
Spoken Moments Learning Joint Audio-Visual Representations From Video Descriptions 口语时刻：从视频描述中学习联合视听表示
Wasserstein Barycenter for Multi-Source Domain Adaptation 用于多源域适应的 Wasserstein 重心
Learning Asynchronous and Sparse Human-Object Interaction in Videos 学习视频中的异步和稀疏人与对象交互
Audio-Visual Instance Discrimination with Cross-Modal Agreement 具有跨模态一致性的视听实例判别
Robust Audio-Visual Instance Discrimination 鲁棒的视听实例判别
Neural Surface Maps 神经表面图
Extreme Low-Light Environment-Driven Image Denoising Over Permanently Shadowed Lunar Regions 永久阴影月球区域的极低光环境驱动图像去噪
Background Splitting Finding Rare Classes in a Sea of Background 背景拆分在背景海中寻找稀有类
On Self-Contact and Human Pose 关于自我接触和人体姿势
Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences 在 RGB-D 序列中查看对象背后的 3D 多对象跟踪
Multi-Person Implicit Reconstruction From a Single Image 从单个图像进行多人隐式重建
FixBi Bridging Domain Spaces for Unsupervised Domain Adaptation FixBi 桥接域空间用于无监督域适应
Learning Graph Embeddings for Compositional Zero-Shot Learning 用于组合零样本学习的学习图嵌入
Polygonal Point Set Tracking 多边形点集跟踪
Reducing Domain Gap by Reducing Style Bias 通过减少风格偏见来减少领域差距
Interventional Video Grounding With Dual Contrastive Learning 具有双重对比学习的干预式视频定位
Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction 分而治之的车道感知多样化轨迹预测
All Labels Are Not Created Equal Enhancing Semi-Supervision via Label 并非所有标签生而平等：通过标签增强半监督
House-GAN Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent 面向智能计算代理的 House-GAN 生成对抗布局细化网络
Neural Prototype Trees for Interpretable Fine-Grained Image Recognition 用于可解释的细粒度图像识别的神经原型树
Hyperdimensional Computing as a Framework for Systematic Aggregation of Image 超维计算作为图像系统聚合的框架
Pedestrian and Ego-Vehicle Trajectory Prediction From Monocular Camera 单目相机的行人和自我车辆轨迹预测
Discovering Relationships Between Object Categories via Universal Canonical Maps 通过通用规范映射发现对象类别之间的关系
Body2Hands Learning To Infer 3D Hands From Conversational Gesture Body Dynamics Body2Hands 学习从会话手势身体动力学推断 3D 手
Clusformer A Transformer Based Clustering Approach to Unsupervised Large-Scale Face Clusformer 一种基于 Transformer 的无监督大规模人脸聚类方法
Dictionary-Guided Scene Text Recognition 字典引导的场景文本识别
FAPIS A Few-Shot Anchor-Free Part-Based Instance Segmenter FAPIS 一种少样本无锚基于部件的实例分割器
Lipstick Ain't Enough Beyond Color Matching for In-the-Wild Makeup Transfer 仅有口红还不够：超越颜色匹配的野外妆容迁移
Controlling the Rain From Removal to Rendering 控制雨水从移除到渲染
M3P Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training 通过多任务多语言多模式预训练的 M3P 学习通用表示
RfD-Net Point Scene Understanding by Semantic Instance Reconstruction 通过语义实例重构的 RfD-Net 点场景理解
GIRAFFE Representing Scenes As Compositional Generative Neural Feature Fields 将场景表示为组合生成神经特征场的 GIRAFFE
HyperSeg Patch-Wise Hypernetwork for Real-Time Semantic Segmentation 用于实时语义分割的 HyperSeg Patch-Wise 超网络
Augmentation Strategies for Learning With Noisy Labels 使用嘈杂标签进行学习的增强策略
Counterfactual VQA A Cause-Effect Look at Language Bias 反事实 VQA 对语言偏见的因果观察
HVPR Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection 用于单级 3D 对象检测的 HVPR 混合体素点表示
Permuted AdaIN Reducing the Bias Towards Global Statistics in Image 置换 AdaIN 减少图像中对全局统计的偏向
Automated Log-Scale Quantization for Low-Cost Deep Neural Networks 低成本深度神经网络的自动对数尺度量化
Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation 弱监督语义分割的背景感知池化和噪声感知损失
Few-Shot Image Generation via Cross-Domain Correspondence 通过跨域对应进行少样本图像生成
A Quasiconvex Formulation for Radial Cameras 径向相机的拟凸公式
Protecting Intellectual Property of Generative Adversarial Networks From Ambiguity Attacks 保护生成对抗网络的知识产权免受歧义攻击
Neural Auto-Exposure for High-Dynamic Range Object Detection 用于高动态范围目标检测的神经自动曝光
Bilinear Parameterization for Non-Separable Singular Value Penalties 不可分离奇异值惩罚的双线性参数化
Multi-Objective Interpolation Training for Robustness To Label Noise 面向标签噪声鲁棒性的多目标插值训练
Neural Scene Graphs for Dynamic Scenes 动态场景的神经场景图
SDD-FIQA Unsupervised Face Image Quality Assessment With Similarity Distribution Distance 具有相似分布距离的 SDD-FIQA 无监督人脸图像质量评估
Neural Camera Simulators 神经相机模拟器
Fast Sinkhorn Filters Using Matrix Scaling for Non-Rigid Shape Correspondence 使用矩阵缩放的非刚性形状对应的快速 Sinkhorn 滤波器
Synthesize-It-Classifier Learning a Generative Classifier Through Recurrent Self-Analysis Synthesize-It-Classifier 通过循环自我分析学习生成分类器
3D Object Detection With Pointformer 使用 Pointformer 进行 3D 对象检测
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization 用于时空动作定位的 Actor-Context-Actor 关系网络
Dual Pixel Exploration Simultaneous Depth Estimation and Image Restoration 双像素探索同时深度估计和图像恢复
Unveiling the Potential of Structure Preserving for Weakly Supervised Object 揭示弱监督对象结构保持的潜力
Variational Relational Point Completion Network 变分关系点完成网络
VideoMoCo Contrastive Video Representation Learning With Temporally Adversarial Examples VideoMoCo 具有时间对抗性示例的对比视频表示学习
Generalization on Unseen Domains via Inference-Time Label-Preserving Target Projections 通过推理时间标签保留目标投影对看不见的域进行泛化
PGT A Progressive Method for Training Models on Long Videos PGT 一种在长视频上训练模型的渐进方法
Quasi-Dense Similarity Learning for Multiple Object Tracking 多目标跟踪的准密集相似性学习
Recorrupted-to-Recorrupted Unsupervised Deep Learning for Image Denoising 用于图像去噪的 Recorrupted-to-Recorrupted 无监督深度学习
TearingNet Point Cloud Autoencoder To Learn Topology-Friendly Representations 用于学习拓扑友好表示的 TearingNet 点云自动编码器
Function4D Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD 来自非常稀疏的消费者 RGBD 的 Function4D 实时人体体积捕获
LAFEAT Piercing Through Adversarial Defenses With Latent Features LAFEAT 穿透具有潜在特征的对抗性防御
Landmark Regularization Ranking Guided Super-Net Training in Neural Architecture Search 地标正则化排名引导神经架构搜索中的超网训练
Lite-HRNet A Lightweight High-Resolution Network Lite-HRNet 轻量级高分辨率网络
Mask Guided Matting via Progressive Refinement Network 通过渐进式细化网络进行掩码引导抠图
Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner 以对比方式对稀疏神经网络进行微创手术
PCLs Geometry-Aware Neural Reconstruction of 3D Pose With Perspective Crop 具有透视裁剪的 3D 姿势的几何感知神经重建
pixelNeRF Neural Radiance Fields From One or Few Images 来自一个或几个图像的 pixelNeRF 神经辐射场
Real-Time Selfie Video Stabilization 实时自拍视频稳定
Transitional Adaptation of Pretrained Models for Visual Storytelling 视觉叙事预训练模型的过渡适应
Multimodal Contrastive Training for Visual Representation Learning 视觉表征学习的多模态对比训练
Multiple Instance Active Learning for Object Detection 用于对象检测的多实例主动学习
Perception Matters Detecting Perception Failures of VQA Models Using Metamorphic 感知很重要使用 Metamorphic 检测 VQA 模型的感知失败
Robust Instance Segmentation Through Reasoning About Multi-Object Occlusion 基于多对象遮挡推理的鲁棒实例分割
SimPoE Simulated Character Control for 3D Human Pose Estimation 用于 3D 人体姿势估计的 SimPoE 模拟角色控制
STaR Self-Supervised Tracking and Reconstruction of Rigid Objects in Motion 运动中刚体的自监督跟踪与重建
Counterfactual Zero-Shot and Open-Set Visual Recognition 反事实零样本和开集视觉识别
Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation 用于 Few-Shot 无监督域适应的原型跨域自监督学习
Semi-Supervised Video Deraining With Dynamical Rain Generator 使用动态雨水发生器的半监督视频去雨
Re-Labeling ImageNet From Single to Multi-Labels From Global to Localized 将 ImageNet 从单标签重新标记到多标签，从全局到本地化
Adaptive Weighted Discriminator for Training Generative Adversarial Networks 用于训练生成对抗网络的自适应加权鉴别器
Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces 使用一维子空间联合的分布外检测
Multi-Stage Progressive Image Restoration 多阶段渐进式图像恢复
Neural Descent for Visual 3D Human Pose and Shape 视觉 3D 人体姿势和形状的神经下降
Open-Vocabulary Object Detection Using Captions 使用字幕的开放词汇对象检测 https://zhuanlan.zhihu.com/p/419255664
CorrNet3D Unsupervised End-to-End Learning of Dense Correspondence for 3D Point CorrNet3D 3D 点密集对应的无监督端到端学习
Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval 用于跨模态视频时刻检索的多模态关系图
Pushing It Out of the Way Interactive Visual Navigation 将其推开交互式视觉导航
Hyper-LifelongGAN Scalable Lifelong Learning for Image Conditioned Generation 用于图像条件生成的 Hyper-LifelongGAN 可扩展终身学习
Mutual Graph Learning for Camouflaged Object Detection 用于伪装目标检测的互图学习
Unbalanced Feature Transport for Exemplar-Based Image Translation 基于样本的图像翻译的不平衡特征传输
ABMDRNet Adaptive-Weighted Bi-Directional Modality Difference Reduction Network for RGB-T Semantic Segmentation 用于 RGB-T 语义分割的 ABMDRNet 自适应加权双向模态差异减少网络
Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution 通过概率溯因和执行的抽象时空推理
Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid 支持-查询互导和混合的精确少样本目标检测
ACRE Abstract Causal REasoning Beyond Covariation ACRE 超越协变的抽象因果推理
Attention-Guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency 通过压缩感知显著性深度重建的注意力引导图像压缩
Body Meshes as Points 体网格作为点
Coarse-To-Fine Person Re-Identification With Auxiliary-Domain Classification and Second-Order Information Bottleneck 辅助域分类和二阶信息瓶颈的粗到细人员再识别
CoLA Weakly-Supervised Temporal Action Localization With Snippet Contrastive Learning CoLA 基于片段对比学习的弱监督时间动作定位
Confluent Vessel Trees With Accurate Bifurcations 具有精确分叉的汇合血管树
Cross-Modal Contrastive Learning for Text-to-Image Generation 用于文本到图像生成的跨模态对比学习
Cross-View Cross-Scene Multi-View Crowd Counting 跨视图跨场景多视图人群统计
Cross-View Gait Recognition With Deep Universal Linear Embeddings 具有深度通用线性嵌入的跨视图步态识别
Data-Free Knowledge Distillation for Image Super-Resolution 图像超分辨率的无数据知识蒸馏
DatasetGAN Efficient Labeled Data Factory With Minimal Human Effort DatasetGAN 高效标记数据工厂，只需最少的人力
DCNAS Densely Connected Neural Architecture Search for Semantic Image Segmentation 用于语义图像分割的 DCNAS 密集连接神经架构搜索
Deep Stable Learning for Out-of-Distribution Generalization 分布外泛化的深度稳定学习
DeepACG Co-Saliency Detection via Semantic-Aware Contrast Gromov-Wasserstein Distance 通过语义感知对比 Gromov-Wasserstein 距离的 DeepACG 共显著性检测
Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy 通过动态卷积和 MOT 哲学实现 Distractor-Aware 快速跟踪
Distribution Alignment A Unified Framework for Long-Tail Visual Recognition 分布对齐长尾视觉识别的统一框架
Diversifying Sample Generation for Accurate Data-Free Quantization 多样化样本生成以实现准确的无数据量化
DoDNet Learning To Segment Multi-Organ and Tumors From Multiple Partially DoDNet 学习从多个部分中分割多器官和肿瘤
Domain-Robust VQA With Diverse Datasets and Methods but No Target 具有多种数据集和方法但没有目标的领域鲁棒 VQA
DualGraph A Graph-Based Method for Reasoning About Label Noise DualGraph 一种基于图的标签噪声推理方法
EDNet Efficient Disparity Estimation With Cost Volume Combination and Attention-Based 结合成本量和基于注意力的 EDNet 高效视差估计
Event-Based Synthetic Aperture Imaging With a Hybrid Network 具有混合网络的基于事件的合成孔径成像
Explicit Knowledge Incorporation for Visual Reasoning 视觉推理的显性知识整合
Exploiting Edge-Oriented Reasoning for 3D Point-Based Scene Graph Analysis 利用面向边缘的推理进行 3D 基于点的场景图分析
Few-Shot Incremental Learning With Continually Evolved Classifiers 使用不断进化的分类器进行少样本增量学习
Flow-Guided One-Shot Talking Face Generation With a High-Resolution Audio-Visual Dataset 具有高分辨率视听数据集的流引导式一次性说话人脸生成
Generating Manga From Illustrations via Mimicking Manga Creation Workflow 通过模仿漫画创作工作流程从插图生成漫画
Hallucination Improves Few-Shot Object Detection 幻觉改善了少样本目标检测
Holistic 3D Scene Understanding From a Single Image With Implicit 从单幅图像中理解整体 3D 场景
iVPF Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression 用于高效无损压缩的 iVPF 数值可逆保体积流
Keypoint-Graph-Driven Learning Framework for Object Pose Estimation 用于对象姿态估计的关键点图驱动学习框架
Learning a Facial Expression Embedding Disentangled From Identity 学习与身份分离的面部表情嵌入
Learning a Self-Expressive Network for Subspace Clustering 学习用于子空间聚类的自我表达网络
Learning by Watching 边看边学
Learning Temporal Consistency for Low Light Video Enhancement From Single 从单次学习低光视频增强的时间一致性
Learning Tensor Low-Rank Prior for Hyperspectral Image Reconstruction 学习高光谱图像重建的张量低秩先验
Learning To Aggregate and Personalize 3D Face From In-the-Wild Photo 学习从野外照片中聚合和个性化 3D 人脸
Learning To Restore Hazy Video A New Real-World Dataset and A New Method 学习恢复有雾视频：一个新的真实世界数据集和一种新方法
MR Image Super-Resolution With Squeeze and Excitation Reasoning Attention Network 具有挤压和激发推理注意网络的 MR 图像超分辨率
Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations 使用活动特定特征和活动相关性的多标签活动识别
Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos 用于视频中时序语言定位的多阶段聚合 Transformer 网络
Neural Architecture Search With Random Labels 带有随机标签的神经架构搜索
No Shadow Left Behind: Removing Objects and their Shadows using Approximate Lighting and Geometry 不留下阴影：使用近似照明和几何图形去除物体及其阴影
Objects Are Different Flexible Monocular 3D Object Detection 物体不同灵活的单目 3D 物体检测
Open-Book Video Captioning With Retrieve-Copy-Generate Network 使用检索-复制-生成网络的开放式视频字幕
Person Re-Identification Using Heterogeneous Local Graph Attention Networks 使用异构局部图注意网络的人员重新识别
PhySG Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing 使用球形高斯函数进行基于物理的材料编辑的 PhySG 逆向渲染
Physics-Based Iterative Projection Complex Neural Network for Phase Retrieval in 用于相位检索的基于物理的迭代投影复杂神经网络
PISE Person Image Synthesis and Editing With Decoupled GAN 使用解耦 GAN 的 PISE 人物图像合成和编辑
Point Cloud Instance Segmentation Using Probabilistic Embeddings 使用概率嵌入的点云实例分割
Posterior Promoted GAN With Distribution Discriminator for Unsupervised Image Synthesis 用于无监督图像合成的具有分布鉴别器的后置促进 GAN
Prototype Completion With Primitive Knowledge for Few-Shot Learning 利用基元知识进行原型补全的小样本学习
Prototypical Pseudo Label Denoising and Target Structure Learning for Domain 域的原型伪标签去噪和目标结构学习
PSRR-MaxpoolNMS Pyramid Shifted MaxpoolNMS With Relationship Recovery 金字塔转移 MaxpoolNMS 与关系恢复
RefineMask Towards High-Quality Instance Segmentation With Fine-Grained Features RefineMask 实现具有细粒度特征的高质量实例分割
Refining Pseudo Labels With Clustering Consensus Over Generations for Unsupervised 无监督的多代聚类共识提炼伪标签
Repetitive Activity Counting by Sight and Sound 通过视觉和听觉计算重复活动
Rethinking Class Relations Absolute-Relative Supervised and Unsupervised Few-Shot Learning 重新思考类关系绝对相对监督和无监督的小样本学习
Robust Bayesian Neural Networks by Spectral Expectation Bound Regularization 基于谱期望界正则化的鲁棒贝叶斯神经网络
RPN Prototype Alignment for Domain Adaptive Object Detector 域自适应对象检测器的 RPN 原型对齐
RSTNet Captioning With Adaptive Attention on Visual and Non-Visual Words 对视觉和非视觉词进行自适应注意的字幕
Self-Guided and Cross-Guided Learning for Few-Shot Segmentation 小样本分割的自引导和交叉引导学习
Sketch2Model View-Aware 3D Modeling From Single Free-Hand Sketches 从单幅手绘草图中进行可视化 3D 建模
Sparse Multi-Path Corrections in Fringe Projection Profilometry 条纹投影轮廓测量中的稀疏多路径校正
SRDAN Scale-Aware and Range-Aware Domain Adaptation Network for Cross-Dataset 3D 跨数据集 3D 的尺度感知和范围感知域适应网络
Stochastic Whitening Batch Normalization 随机白化批量归一化
Temporal Query Networks for Fine-Grained Video Understanding 用于细粒度视频理解的时间查询网络
TSGCNet Discriminative Geometric Feature Learning With Two-Stream Graph Convolutional Network TSGCNet 判别几何特征学习与两流图卷积网络
UnrealPerson An Adaptive Pipeline Towards Costless Person Re-Identification UnrealPerson 一种面向无成本人员重新识别的自适应管道
Unsupervised 3D Shape Completion Through GAN Inversion 通过 GAN 反转完成无监督 3D 形状
User-Guided Line Art Flat Filling With Split Filling Mechanism 具有分割填充机制的用户引导线艺术平面填充
Variational Pedestrian Detection 变分行人检测
VarifocalNet An IoU-Aware Dense Object Detector VarifocalNet 一种 IoU 感知的密集对象检测器
View-Guided Point Cloud Completion 视图引导点云完成
VinVL Revisiting Visual Representations in Vision-Language Models VinVL 重新审视视觉语言模型中的视觉表示
We Are More Than Our Joints Predicting How 3D Bodies 我们不仅仅是预测 3D 身体的关节
3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation Diagnosis 用于胰腺肿块分割诊断的 3D 图形解剖几何集成网络
Camera Pose Matters Improving Depth Prediction by Mitigating Pose Distribution 相机姿势很重要，通过减轻姿势分布来改善深度预测
Cascaded Prediction Network via Segment Tree for Temporal Video Grounding 基于线段树的级联预测网络，用于时序视频定位
Deep Lucas-Kanade Homography for Multimodal Image Alignment 用于多模态图像对齐的深度 Lucas-Kanade 单应性
Distribution-Aware Adaptive Multi-Bit Quantization 分布感知自适应多比特量化
Few-Shot 3D Point Cloud Semantic Segmentation Few-Shot 3D 点云语义分割
Graph-Based High-Order Relation Discovery for Fine-Grained Recognition 用于细粒度识别的基于图的高阶关系发现
Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for 学习通过基于记忆的多源元学习来概括看不见的领域
Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information 通过对比交叉视图互信息学习视图解耦的人体姿态表示
Multi-Attentional Deepfake Detection 多注意 Deepfake 检测
PhD Learning Learning With Pompeiu-Hausdorff Distances for Video-Based Vehicle Re-Identification PhD Learning：使用 Pompeiu-Hausdorff 距离进行基于视频的车辆重新识别
Prior Based Human Completion 基于先验的人体补全
Self-Generated Defocus Blur Detection via Dual Adversarial Discriminators 通过双对抗判别器自生成散焦模糊检测
Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction from Raw Point Clouds 用于形状建模和从原始点云重建的表面自相似性的符号不可知隐式学习
Spk2ImgNet Learning To Reconstruct Dynamic Scene From Continuous Spike Stream Spk2ImgNet 学习从连续尖峰流中重建动态场景
Unpaired Image-to-Image Translation via Latent Energy Transport 通过潜在能量传输的未配对图像到图像转换
Weakly Supervised Video Salient Object Detection 弱监督视频显著目标检测
Simpler Certified Radius Maximization by Propagating Covariances 通过传播协方差实现更简单的认证半径最大化
A Deep Emulator for Secondary Motion of 3D Characters 用于 3D 角色二次运动的深度仿真器
Deep Compositional Metric Learning 深度组合度量学习
Deep Convolutional Dictionary Learning for Image Denoising 用于图像去噪的深度卷积字典学习
Deep Implicit Templates for 3D Shape Representation 用于 3D 形状表示的深层隐式模板
Group-aware Label Transfer for Domain Adaptive Person Re-identification 用于域自适应人员重新识别的组感知标签传输
High-Speed Image Reconstruction Through Short-Term Plasticity for Spiking Cameras 脉冲相机的短期可塑性高速图像重建
Improving Multiple Object Tracking With Single Object Tracking 使用单个对象跟踪改进多对象跟踪
Patchwise Generative ConvNet: Training Energy-Based Models from a Single Natural Image for Internal Learning Patchwise Generative ConvNet：从单个自然图像训练基于能量的模型进行内部学习
Regularizing Neural Networks via Adversarial Model Perturbation 通过对抗模型扰动正则化神经网络
Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers 使用 Transformer 从序列到序列的角度重新思考语义分割
SE-SSD Self-Ensembling Single-Stage Object Detector From Point Cloud 来自点云的 SE-SSD 自集成单级目标检测器
Single Image Reflection Removal With Absorption Effect 具有吸收效果的单图像反射去除
The Spatially-Correlative Loss for Various Image Translation Tasks 各种图像翻译任务的空间相关损失
Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning 基于多引导双边学习的超高清图像去雾
Unsupervised Disentanglement of Linear-Encoded Facial Semantics 线性编码面部语义的无监督解开
Zero-Shot Instance Segmentation 零样本实例分割
DAP Detection-Aware Pre-Training With Weak Supervision 弱监督下的 DAP 检测感知预训练
Glance and Gaze Inferring Action-Aware Points for One-Stage Human-Object Interaction Glance 和 Gaze 推断用于单阶段人-物交互的动作感知点
Improving Calibration for Long-Tailed Recognition 改进长尾识别的校准
Neighborhood Contrastive Learning for Novel Class Discovery 新类发现的邻域对比学习
OpenMix Reviving Known Knowledge for Discovering Novel Visual Categories in OpenMix 恢复已知知识以发现新的视觉类别
Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes 动态场景中的滚动快门校正和去模糊
CoCosNet v2 Full-Resolution Correspondence Learning for Image Translation 图像翻译的全分辨率对应学习
Cross-MPI Cross-Scale Stereo for Image Super-Resolution Using Multiplane Images 使用多平面图像的图像超分辨率的跨尺度立体
Decoupled Dynamic Filter Networks 解耦动态滤波器网络
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing 用于实例感知人类语义解析的可微多粒度人类表征学习
Effective Sparsification of Neural Networks With Global Sparsity Constraint 具有全局稀疏约束的神经网络的有效稀疏化
Embracing Uncertainty Decoupling and De-Bias for Robust Temporal Grounding 拥抱不确定性：通过解耦和去偏实现鲁棒的时序定位
Face Forensics in the Wild 真实场景下的人脸取证
Graph-Based High-Order Relation Modeling for Long-Term Action Recognition 用于长期动作识别的基于图的高阶关系建模
Human De-Occlusion Invisible Perception and Recovery for Humans 人类去遮挡隐形感知和人类恢复
Image De-Raining via Continual Learning 通过持续学习进行图像去雨
Image Restoration for Under-Display Camera 屏下摄像头的图像修复
Improving Sign Language Translation With Monolingual Data by Sign Back-Translation 通过手语反向翻译改进单语数据的手语翻译
Instant-Teaching An End-to-End Semi-Supervised Object Detection Framework 即时教学端到端半监督目标检测框架
Learning Placeholders for Open-Set Recognition 学习开放集识别的占位符
Mesoscopic Photogrammetry With an Unstabilized Phone Camera 使用不稳定的手机摄像头的细观摄影测量
Monocular 3D Object Detection An Extrinsic Parameter Free Approach 单目 3D 对象检测一种无外在参数的方法
Monocular Real-Time Full Body Capture With Inter-Part Correlations 具有部件间相关性的单目实时全身捕捉
NeRD Neural 3D Reflection Symmetry Detector NeRD 神经 3D 反射对称检测器
Panoptic-PolarNet Proposal-Free LiDAR Point Cloud Panoptic Segmentation Panoptic-PolarNet Proposal-Free LiDAR 点云全景分割
Patch2Pix Epipolar-Guided Pixel-Level Correspondences Patch2Pix 对极引导的像素级对应
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation 通过隐式模块化视听表示生成姿态可控的说话人脸
Positive Sample Propagation Along the Audio-Visual Event Line 沿视听事件线的正样本传播
Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation 无监督视频多对象分割的目标感知对象发现和关联
TransFill Reference-Guided Image Inpainting by Merging Multiple Color and Spatial 通过合并多种颜色和空间的 TransFill 参考引导图像修复
UC2 Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training UC2 通用跨语言跨模式视觉和语言预训练
A Second-Order Approach to Learning With Instance-Dependent Label Noise 一种基于实例的标签噪声学习的二阶方法
Complementary Relation Contrastive Distillation 互补关系对比蒸馏
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation 用于 LiDAR 分割的圆柱形和非对称 3D 卷积网络
Face Forgery Detection by 3D Decomposition 通过 3D 分解进行人脸伪造检测
Fourier Contour Embedding for Arbitrary-Shaped Text Detection 用于任意形状文本检测的傅里叶轮廓嵌入
Learning Neural Representation of Camera Pose with Matrix Representation of 用矩阵表示学习相机姿态的神经表示
Learning Statistical Texture for Semantic Segmentation 学习语义分割的统计纹理
Learning the Superpixel in a Non-Iterative and Lifelong Manner 以非迭代和终生的方式学习超像素
One Shot Face Swapping on Megapixels 百万像素级的单样本换脸
Prototype Augmentation and Self-Supervision for Incremental Learning 增量学习的原型增强和自我监督
RGB-D Local Implicit Function for Depth Completion of Transparent Objects 用于透明对象深度补全的 RGB-D 局部隐函数
Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning 用于 Few-Shot 类增量学习的自我提升原型改进
Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection 面向样本数稳定的小样本目标检测的语义关系推理
SOON Scenario Oriented Object Navigation With Graph-Based Exploration SOON 基于图形探索的面向场景的对象导航
Spatially-Varying Outdoor Lighting Estimation From Intrinsics 从本征分量估计空间变化的户外照明
VIGOR Cross-View Image Geo-Localization Beyond One-to-One Retrieval 超越一对一检索的 VIGOR 跨视图图像地理定位
WebFace260M A Benchmark Unveiling the Power of Million-Scale Deep Face WebFace260M 揭示百万级深度人脸功能的基准测试
Where and What Examining Interpretable Disentangled Representations 检查可解释的解耦表示的地点和内容
Fusing the Old with the New Learning Relative Camera Pose 融合旧的与新的学习相对相机姿势
Kaleido-BERT Vision-Language Pre-Training on Fashion Domain 时尚领域视觉语言预训练
The Translucent Patch A Physical and Universal Attack on Object 半透明补丁对物体的物理和通用攻击
End-to-End Human Object Interaction Detection With HOI Transformer 使用 HOI Transformer 进行端到端人类对象交互检测
Learning To Reconstruct High Speed and High Dynamic Range Videos 学习重建高速和高动态范围的视频
Progressive Temporal Feature Alignment Network for Video Inpainting 用于视频修复的渐进式时间特征对齐网络
Stylized Neural Painting 程式化的神经绘画
Leveraging the Availability of Two Cameras for Illuminant Estimation 利用两个相机的可用性进行光源估计
IIRC Incremental Implicitly-Refined Classification IIRC 增量隐式细化分类
Sequence-to-Sequence Contrastive Learning for Text Recognition 用于文本识别的序列到序列对比学习
Adaptive Consistency Regularization for Semi-Supervised Transfer Learning 半监督迁移学习的自适应一致性正则化
LQF Linear Quadratic Fine-Tuning LQF 线性二次微调
ArtEmis Affective Language for Visual Art ArtEmis 视觉艺术的情感语言
HistoGAN Controlling Colors of GAN-Generated and Real Images via Color Histograms HistoGAN 通过颜色直方图控制 GAN 生成图像和真实图像的颜色
Learning Multi-Scale Photo Exposure Correction 学习多尺度照片曝光校正
Objectron A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations Objectron 带有姿势注释的野外以对象为中心的视频的大规模数据集
Object Classification From Randomized EEG Trials 随机脑电图试验的对象分类
Unsupervised Multi-Source Domain Adaptation Without Access to Source Data 无需访问源数据的无监督多源域自适应
Learning Decision Trees Recurrently Through Communication 通过交流循环学习决策树
img2pose Face Alignment and Detection via 6DoF Face Pose Estimation img2pose 通过 6DoF 人脸姿态估计进行人脸对齐和检测
Learning Optical Flow From Still Images 从静止图像中学习光流
RPSRNet End-to-End Trainable Rigid Point Set Registration Network Using Barnes-Hut 使用 Barnes-Hut 表示的端到端可训练刚性点集注册网络
Denoise and Contrast for Category Agnostic Shape Completion 类别不可知形状完成的去噪和对比度
Understanding and Simplifying Perceptual Distances 理解和简化感知距离
Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map 道路动态和成本图的自监督同时多步预测
ArtFlow Unbiased Image Style Transfer via Reversible Neural Flows ArtFlow 通过可逆神经流实现无偏图像风格转移
Learning Deep Latent Variable Models by Short-Run MCMC Inference With 通过短期 MCMC 推理学习深度潜在变量模型
MIST Multiple Instance Spatial Transformer MIST 多实例空间转换器
Image Generators With Conditionally-Independent Pixel Synthesis 具有条件独立像素合成的图像生成器
SpinNet Learning a General Surface Descriptor for 3D Point Cloud SpinNet 学习 3D 点云的通用表面描述符
Self-Supervised Augmentation Consistency for Adapting Semantic Segmentation 自适应语义分割的自监督增强一致性
Partition-Guided GANs 分区引导 GAN
Variational Transformer Networks for Layout Generation 用于布局生成的变分变换器网络
Dogfight Detecting Drones From Drones Videos 从无人机视频中检测无人机的混战
Adversarial Robustness Across Representation Spaces 跨表示空间的对抗鲁棒性
4D Panoptic LiDAR Segmentation 4D 全景 LiDAR 分割
Binary TTC A Temporal Geofence for Autonomous Navigation 用于自主导航的二进制 TTC 时间地理围栏
Polka Lines Learning Structured Illumination and Reconstruction for Active Stereo Polka Lines 学习结构化照明和重建主动立体
What if We Only Use Real Datasets for Scene Text 如果我们只对场景文本使用真实数据集会怎样
What's in the Image Explorable Decoding of Compressed Images 图像中有什么：压缩图像的可探索解码
Binary Graph Neural Networks 二元图神经网络
GMOT-40 A Benchmark for Generic Multiple Object Tracking GMOT-40 通用多目标跟踪的基准
Learning Scalable ℓ∞-Constrained Near-Lossless Image Compression via Joint Lossy Image 通过联合有损图像学习可扩展 ℓ∞ 约束的近无损图像压缩
Person30K A Dual-Meta Generalization Network for Person Re-Identification Person30K 用于人员重新识别的双元泛化网络
PointDSC Robust Point Cloud Registration Using Deep Spatial Consistency 使用深度空间一致性的 PointDSC 鲁棒点云配准
Riggable 3D Face Reconstruction via In-Network Optimization 通过网络内优化进行可操纵的 3D 人脸重建
Unsupervised Multi-Source Domain Adaptation for Person Re-Identification 用于人员重新识别的无监督多源域自适应
UnsupervisedRR Unsupervised Point Cloud Registration via Differentiable Rendering UnsupervisedRR 通过可微渲染进行无监督点云注册
Rainbow Memory Continual Learning With a Memory of Diverse Samples Rainbow Memory 持续学习与不同样本的记忆
Efficient Initial Pose-Graph Generation for Global SfM 全局 SfM 的高效初始位姿图生成
ReAgent Point Cloud Registration Using Imitation and Reinforcement Learning 使用模仿和强化学习的 ReAgent 点云注册
Fostering Generalization in Single-View 3D Reconstruction by Learning a Hierarchy 通过学习层次结构促进单视图 3D 重建中的泛化
Learning Semantic-Aware Dynamics for Video Prediction 学习视频预测的语义感知动力学
Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression 用于不确定性校准和自适应压缩的贝叶斯嵌套神经网络
Towards Accurate 3D Human Motion Prediction From Incomplete Observations 从不完整的观察中实现准确的 3D 人体运动预测
SuperMix Supervising the Mixing Data Augmentation SuperMix 监督混合数据增强
Digital Gimbal End-to-End Deep Image Stabilization With Learnable Exposure Times 具有可学习曝光时间的数字云台端到端深度图像稳定
A Hyperbolic-to-Hyperbolic Graph Convolutional Network 双曲到双曲图卷积网络
Dynamic Head Unifying Object Detection Heads With Attentions 动态头部将物体检测头部与注意力统一起来
FBNetV3 Joint Architecture-Recipe Search Using Predictor Pretraining 使用预测器预训练的 FBNetV3 架构-训练配方联合搜索
General Instance Distillation for Object Detection 对象检测的通用实例蒸馏
Generalizable Person Re-Identification With Relevance-Aware Mixture of Experts 具有相关性意识的专家混合的可概括的人重新识别
Learning a Proposal Classifier for Multiple Object Tracking 学习用于多对象跟踪的建议分类器
Learning Affinity-Aware Upsampling for Deep Image Matting 学习用于深度图像抠图的 Affinity-Aware 上采样
Progressive Contour Regression for Arbitrary-Shape Scene Text Detection 任意形状场景文本检测的渐进轮廓回归
SPSG Self-Supervised Photometric Scene Generation From RGB-D Scans SPSG 从 RGB-D 扫描生成自监督光度场景
UP-DETR Unsupervised Pre-Training for Object Detection With Transformers UP-DETR 无监督预训练使用 Transformer 进行目标检测
Nearest Neighbor Matching for Deep Clustering 深度聚类的最近邻匹配
Soft-IntroVAE Analyzing and Improving the Introspective Variational Autoencoder Soft-IntroVAE 分析和改进内省变分自编码器
Cloud2Curve Generation and Vectorization of Parametric Sketches 参数草图的生成和矢量化
Square Root Bundle Adjustment for Large-Scale Reconstruction 大规模重建的平方根束调整
3D AffordanceNet A Benchmark for Visual Object Affordance Understanding 3D AffordanceNet 视觉对象可供性理解的基准
Are Labels Always Necessary for Classifier Accuracy Evaluation 分类器精度评估是否总是需要标签
Deep Homography for Efficient Stereo Image Compression 用于高效立体图像压缩的深度单应性
Deformed Implicit Field Modeling 3D Shapes With Learned Dense Correspondence 具有学习密集对应关系的变形隐式场建模 3D 形状
LAU-Net Latitude Adaptive Upscaling Network for Omnidirectional Image Super-Resolution 用于全向图像超分辨率的 LAU-Net 纬度自适应放大网络
LiBRe A Practical Bayesian Approach to Adversarial Detection LiBRe 对抗性检测的实用贝叶斯方法
PML Progressive Margin Loss for Long-Tailed Age Classification 长尾年龄分类的 PML 渐进式边际损失
Sketch Ground and Refine Top-Down Dense Video Captioning 草拟、定位与细化：自上而下的密集视频字幕
Spatially-Invariant Style-Codes Controlled Makeup Transfer 空间不变样式代码控制的化妆转移
Unbiased Mean Teacher for Cross-Domain Object Detection 跨域目标检测的无偏平均教师
Variational Prototype Learning for Deep Face Recognition 用于深度人脸识别的变分原型学习
VirTex Learning Visual Representations From Textual Annotations VirTex 从文本注释中学习视觉表示
Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition 用于 3D 形状和 SVBRDF 采集的深度偏振成像
Pixel-Wise Anomaly Detection in Complex Driving Scenes 复杂驾驶场景中的逐像素异常检测
BASAR Black-Box Attack on Skeletal Action Recognition BASAR 对骨骼动作识别的黑盒攻击
CDFI Compression-Driven Network Design for Frame Interpolation 用于帧插值的 CDFI 压缩驱动网络设计
Deeply Shape-Guided Cascade for Instance Segmentation 用于实例分割的深度形状引导级联
Diverse Branch Block Building a Convolution as an Inception-Like Unit 将卷积构建为类 Inception 单元的多样化分支块
Globally Optimal Relative Pose Estimation With Gravity Prior 具有重力先验的全局最优相对姿态估计
HR-NAS Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers HR-NAS 使用轻量级 Transformer 搜索高效的高分辨率神经架构
RepVGG Making VGG-Style ConvNets Great Again RepVGG 让 VGG 风格的 ConvNets 再次伟大
On Robustness and Transferability of Convolutional Neural Networks 关于卷积神经网络的鲁棒性和可迁移性
Fast and Accurate Model Scaling 快速准确的模型缩放
Learning Spatially-Variant MAP Models for Non-Blind Image Deblurring 学习用于非盲图像去模糊的空间变体 MAP 模型
Robust Neural Routing Through Space Partitions for Camera Relocalization in 用于相机重定位的空间分区的鲁棒神经路由
Stochastic Image-to-Video Synthesis Using cINNs 使用 cINN 的随机图像到视频合成
PLOP Learning Without Forgetting for Continual Semantic Segmentation PLOP 学习不忘记持续语义分割
Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation 无监督域自适应的跨域梯度差异最小化
Adversarial Laser Beam Effective Physical-World Attack to DNNs in a 对抗性激光束对 DNN 的有效物理世界攻击
EventZoom Learning To Denoise and Super Resolve Neuromorphic Events EventZoom 学习去噪和超级解决神经形态事件
SLADE A Self-Training Framework for Distance Metric Learning SLADE 距离度量学习的自我训练框架
TransNAS-Bench-101 Improving Transferability and Generalizability of Cross-Task Neural Architecture Search TransNAS-Bench-101 提高跨任务神经架构搜索的可迁移性和泛化性
How2Sign A Large-Scale Multimodal Dataset for Continuous American Sign Language How2Sign 用于连续美国手语的大规模多模态数据集
Adaptive Methods for Real-World Domain Generalization 现实世界领域泛化的自适应方法
Compatibility-Aware Heterogeneous Visual Search 兼容感知异构视觉搜索
SSTVOS Sparse Spatiotemporal Transformers for Video Object Segmentation 用于视频对象分割的 SSTVOS 稀疏时空变换器
Masksembles for Uncertainty Estimation 用于不确定性估计的 Masksembles
Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings 通过对抗仿射子空间嵌入保护隐私的图像特征
DeepVideoMVS Multi-View Stereo on Video With Recurrent Spatio-Temporal Fusion DeepVideoMVS 循环时空融合视频多视图立体
Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative 通过学习离散生成对 3D 点云进行自我监督学习
ManipulaTHOR A Framework for Visual Object Manipulation ManipulaTHOR 视觉对象操作框架
NeuroMorph Unsupervised Shape Interpolation and Correspondence in One Go NeuroMorph 一次性无监督形状插值和对应
Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball 使用感知球上的对抗性扰动来解释分类器
From Points to Multi-Object 3D Reconstruction 从点到多对象 3D 重建
Learning Goals From Failure 从失败中学习目标
How Well Do Self-Supervised Models Transfer 自监督模型的迁移效果如何
Taming Transformers for High-Resolution Image Synthesis 驯服 Transformer 用于高分辨率图像合成
Adversarially Adaptive Normalization for Single Domain Generalization 单域泛化的对抗性自适应归一化
Generalized Few-Shot Object Detection Without Forgetting 不遗忘的广义小样本目标检测
Group Collaborative Learning for Co-Salient Object Detection 共显著目标检测的组协作学习
Learning Triadic Belief Dynamics in Nonverbal Communication From Videos 从视频中学习非语言交流中的三元信念动态
Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud 用于点云中时空建模的 Point 4D Transformer 网络
Rethinking BiSeNet for Real-Time Semantic Segmentation 重新思考用于实时语义分割的 BiSeNet
SCF-Net Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation 用于大规模点云分割的 SCF-Net 学习空间上下文特征
Dual Attention Guided Gaze Target Detection in the Wild 野外双注意引导注视目标检测
LiDAR-Aug A General Rendering-Based Augmentation Framework for 3D Object Detection LiDAR-Aug 用于 3D 对象检测的通用基于渲染的增强框架
Read Like Humans Autonomous Bidirectional and Iterative Language Modeling for 像人类一样阅读自主双向和迭代语言建模
Reconstructing 3D Human Pose by Watching Humans in the Mirror 通过观察镜子中的人来重建 3D 人体姿势
Cross-Domain Similarity Learning for Face Recognition in Unseen Domains 未见领域人脸识别的跨领域相似性学习
3D CNNs With Adaptive Temporal Feature Resolutions 具有自适应时间特征分辨率的 3D CNN
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning 无监督时空表示学习的大规模研究
Encoder Fusion Network With Co-Attention Embedding for Referring Image Segmentation 用于参考图像分割的具有共同注意嵌入的编码器融合网络
MIST Multiple Instance Self-Training Framework for Video Anomaly Detection 用于视频异常检测的 MIST 多实例自训练框架
Optimal Gradient Checkpoint Search for Arbitrary Computation Graphs 任意计算图的最优梯度检查点搜索
Recurrent Multi-View Alignment Network for Unsupervised Surface Registration 用于无监督表面配准的循环多视图对齐网络
Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip 通过动态跳过去除显示屏不足相机中的衍射图像伪影
Semantic-Aware Video Text Detection 语义感知视频文本检测
Siamese Natural Language Tracker Tracking by Natural Language Descriptions With Siamese Trackers 连体自然语言追踪器：使用连体追踪器通过自然语言描述进行追踪
Anticipating Human Actions by Correlating Past With the Future With Jaccard similarity measures 通过 Jaccard 相似性度量将过去与未来相关联来预测人类行为
AIFit Automatic 3D Human-Interpretable Feedback Models for Fitness Training 用于健身训练的 AIFit 自动 3D 人类可解释反馈模型
StickyPillars Robust and Efficient Feature Matching on Point Clouds Using StickyPillars 在点云上使用鲁棒和高效的特征匹配
Global Transport for Fluid Reconstruction With Learned Self-Supervision 具有学习式自监督的流体重建全局输运
A Multi-Task Network for Joint Specular Highlight Detection and Removal 用于联合镜面高光检测和去除的多任务网络
Auto-Exposure Fusion for Single-Image Shadow Removal 用于单张图像阴影去除的自动曝光融合
Double Low-Rank Representation With Projection Distance Penalty for Clustering 用于聚类的具有投影距离惩罚的双低秩表示
Learning to Track Instances without Video Annotations 学习在没有视频注释的情况下跟踪实例
Partial Feature Selection and Alignment for Multi-Source Domain Adaptation 多源域自适应的部分特征选择和对齐
Robust Point Cloud Registration Framework Based on Deep Graph Matching 基于深度图匹配的鲁棒点云配准框架
STMTrack Template-Free Visual Tracking With Space-Time Memory Networks STMTrack 使用时空记忆网络的无模板视觉跟踪
Transferable Query Selection for Active Domain Adaptation 主动域适应的可转移查询选择
Unsupervised Pre-Training for Person Re-Identification 人员重新识别的无监督预训练
Polarimetric Normal Stereo 偏振法线立体视觉
Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction 用于单目 4D 面部头像重建的动态神经辐射场
Single-Shot Freestyle Dance Reenactment 单发自由式舞蹈重演
Multiple Instance Captioning Learning Representations From Histopathology Textbooks and Articles 来自组织病理学教科书和文章的多实例字幕学习表示
Incremental Few-Shot Instance Segmentation 增量少样本实例分割
Deep Graph Matching Under Quadratic Constraint 二次约束下的深度图匹配
Global2Local Efficient Structure Search for Video Action Segmentation 用于视频动作分割的 Global2Local 高效结构搜索
High-Fidelity and Arbitrary Face Editing 高保真任意人脸编辑
Information Bottleneck Disentanglement for Identity Swapping 身份交换的信息瓶颈解耦
Isometric Multi-Shape Matching 等距多形状匹配
Network Pruning via Performance Maximization 通过性能最大化进行网络修剪
Privacy-Preserving Collaborative Learning With Automatic Transformation Search 使用自动转换搜索保护隐私的协作学习
Representative Batch Normalization With Feature Calibration 具有特征校准的代表性批量标准化
Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression 远程体现引用表达的空间和对象感知知识推理
VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency 具有跨模态一致性的 VisualVoice 视听语音分离
WOAD Weakly Supervised Online Action Detection in Untrimmed Videos 未修剪视频中的 WOAD 弱监督在线动作检测
A Peek Into the Reasoning of Neural Networks Interpreting With Structural Visual Concepts 窥探神经网络的推理：用结构视觉概念解释
Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On 高度逼真的虚拟试穿的解开循环一致性
OTA Optimal Transport Assignment for Object Detection 用于目标检测的 OTA 最优传输分配
Parser-Free Virtual Try-On via Distilling Appearance Flows 通过蒸馏外观流进行无解析器虚拟试穿
Video Object Segmentation Using Global and Instance Embedding Learning 使用全局和实例嵌入学习的视频对象分割
OSTeC One-Shot Texture Completion OSTeC One-Shot 纹理完成
Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression 通过分离关键点回归进行自下而上的人体姿势估计
Cross Modal Focal Loss for RGBD Face Anti-Spoofing RGBD 人脸反欺骗的交叉模态焦点损失
Anomaly Detection in Video via Self-Supervised and Multi-Task Learning 通过自我监督和多任务学习的视频异常检测
Privacy Preserving Localization and Mapping From Uncalibrated Cameras 未校准相机的隐私保护定位和映射
Neural Reprojection Error Merging Feature Learning and Camera Pose Estimation 融合特征学习和相机姿态估计的神经重投影误差
Simple Copy-Paste Is a Strong Data Augmentation Method for Instance 简单的复制粘贴是一种强大的实例数据增强方法
FrameExit Conditional Early Exiting for Efficient Video Recognition 用于高效视频识别的 FrameExit 条件提前退出
Learning Graphs for Knowledge Transfer With Limited Labels 具有有限标签的知识转移学习图
OBoW Online Bag-of-Visual-Words Generation for Self-Supervised Learning 用于自我监督学习的 OBoW 在线视觉词袋生成
Polygonal Building Extraction by Frame Field Learning 框架场学习的多边形建筑物提取
The Lottery Ticket Hypothesis for Object Recognition 对象识别的彩票假说
Weakly Supervised Learning of Rigid 3D Scene Flow 刚性 3D 场景流的弱监督学习
Mixed-Privacy Forgetting in Deep Networks 深度网络中的混合隐私遗忘
AlphaMatch Improving Consistency for Semi-Supervised Learning With Alpha-Divergence AlphaMatch 通过 Alpha-Divergence 提高半监督学习的一致性
Cluster Split Fuse and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation 聚类、拆分、融合与更新：开放复合域自适应语义分割的元学习
KeepAugment A Simple Information-Preserving Data Augmentation Approach KeepAugment 一种简单的信息保存数据增强方法
MaxUp Lightweight Adversarial Training With Data Augmentation Improves Neural Network 带有数据增强的 MaxUp 轻量级对抗训练改进了神经网络
Mitigating Face Recognition Bias via Group Adaptive Classifier 通过组自适应分类器减轻人脸识别偏差
Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning 基于渐进式感受野分量推理的全监督点云分割
PoseAug A Differentiable Pose Augmentation Framework for 3D Human Pose PoseAug 用于 3D 人体姿势的可微分姿势增强框架
PLADE-Net Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation With PLADE-Net 实现自我监督单视图深度估计的像素级精度
Panoptic Segmentation Forecasting 全景分割预测
ContactOpt Optimizing Contact To Improve Grasps ContactOpt 优化接触以改进抓取
Depth From Camera Motion and Object Detection 相机运动和物体检测的深度
StylePeople A Generative Model of Fullbody Human Avatars StylePeople 全身人体化身的生成模型
AGQA A Benchmark for Compositional Spatio-Temporal Reasoning AGQA 组合时空推理的基准
Capsule Network Is Not More Robust Than Convolutional Network 胶囊网络并不比卷积网络更健壮
DOTS Decoupling Operation and Topology in Differentiable Architecture Search 可微架构搜索中的 DOTS 解耦操作和拓扑
Interpreting Super-Resolution Networks With Local Attribution Maps 使用本地归因图解释超分辨率网络
Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction 域外人体网格重建的双层在线适应
AutoDO Robust AutoAugment for Biased Data With Label Noise via AutoDO 通过可扩展的概率隐式微分，对带标签噪声的有偏数据进行鲁棒的自动增强
Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion 用于统一单目深度预测和完成的稀疏辅助网络
Beyond Bounding-Box Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection 超越边界框：面向定向和密集排列目标检测的凸包特征自适应
Distilling Object Detectors via Decoupled Features 通过解耦特征提取目标检测器
Graph Attention Tracking 图注意力跟踪
Intrinsic Image Harmonization 内在图像协调
Inverse Simulation Reconstructing Dynamic Geometry of Clothed Humans via Optimal Control 逆向仿真：通过最优控制重构穿衣人的动态几何
Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and 统一训练协同训练的长尾多标签视觉识别
MetaCorrection: Domain-Aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation MetaCorrection：语义分割中无监督域自适应的域感知元损失校正
Multi-Institutional Collaborations for Improving Deep Learning-Based Magnetic Resonance Image Reconstruction 多机构合作改进基于深度学习的磁共振图像重建
Multispectral Photometric Stereo for Spatially-Varying Spectral Reflectances: A Well Posed Problem? 用于空间变化光谱反射率的多光谱光度立体：一个适定问题？
Online Multiple Object Tracking With Cross-Task Synergy 具有跨任务协同作用的在线多对象跟踪
Positive-Unlabeled Data Purification in the Wild for Object Detection 用于对象检测的野外无标记数据纯化
SSAN Separable Self-Attention Network for Video Representation Learning 用于视频表示学习的 SSAN 可分离自注意力网络
Strengthen Learning Tolerance for Weakly Supervised Object Localization 加强对弱监督目标定位的学习容忍度
Rotation Equivariant Siamese Networks for Tracking 用于跟踪的旋转等变连体网络
Human POSEitioning System HPS: 3D Human Pose Estimation and Self-Localization in Large Scenes from Body-Mounted Sensors 人体定位系统 HPS：利用穿戴式传感器在大场景中进行 3D 人体姿势估计和自我定位
NormalFusion Real-Time Acquisition of Surface Normals for High-Resolution RGB-D Scanning NormalFusion 实时获取高分辨率 RGB-D 扫描的表面法线
Skip-Convolutions for Efficient Video Processing 用于高效视频处理的跳过卷积
Representation Learning via Global Temporal Alignment and Cycle-Consistency 通过全局时间对齐和循环一致性进行表示学习
Lips Don't Lie A Generalisable and Robust Approach To Face Forgery Detection 嘴唇不会说谎：一种通用且稳健的人脸伪造检测方法
Heterogeneous Grid Convolution for Adaptive Efficient and Controllable Computation 用于自适应高效可控计算的异构网格卷积
Monte Carlo Scene Search for 3D Scene Understanding 蒙特卡洛场景搜索以了解 3D 场景
Contrastive Embedding for Generalized Zero-Shot Learning 广义零样本学习的对比嵌入
Learning To Fuse Asymmetric Feature Maps in Siamese Trackers 学习在连体追踪器中融合不对称特征图
ReDet A Rotation-Equivariant Detector for Aerial Object Detection ReDet 用于航空物体检测的旋转等变检测器
Rethinking Channel Dimensions for Efficient Model Design 重新思考高效模型设计的通道维度
Crossing Cuts Polygonal Puzzles Models and Solvers 交叉切割多边形拼图模型和求解器
Learning by Aligning Videos in Time 通过及时对齐视频来学习
Track Check Repeat An EM Approach to Unsupervised Tracking 跟踪检查重复 EM 方法进行无监督跟踪
Generalizable Pedestrian Detection: The Elephant in the Room 可泛化的行人检测：房间里的大象
Populating 3D Scenes by Learning Human-Scene Interaction 通过学习人景交互来填充 3D 场景
Sewer-ML A Multi-Label Sewer Defect Classification Dataset and Benchmark Sewer-ML 多标签下水道缺陷分类数据集和基准
Patch-NetVLAD Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition 用于地方识别的局部全局描述符的 Patch-NetVLAD 多尺度融合
ChallenCap Monocular 3D Capture of Challenging Human Performances Using Multi-Modal ChallenCap 使用多模态对具有挑战性的人类行为进行单目 3D 捕捉
Checkerboard Context Model for Efficient Learned Image Compression 用于高效学习图像压缩的棋盘上下文模型
Context-Aware Layout to Image Generation With Enhanced Object Appearance 具有增强对象外观的图像生成的上下文感知布局
DiNTS Differentiable Neural Network Topology Search for 3D Medical Image 用于 3D 医学图像的 DiNTS 可微神经网络拓扑搜索
DyCo3D Robust Instance Segmentation of 3D Point Clouds Through Dynamic DyCo3D 通过动态对 3D 点云进行鲁棒实例分割
FFB6D A Full Flow Bidirectional Fusion Network for 6D Pose Estimation FFB6D 6D 姿势估计的全流双向融合网络
ForgeryNet A Versatile Benchmark for Comprehensive Forgery Analysis ForgeryNet 用于综合伪造分析的多功能基准
Learnable Graph Matching Incorporating Graph Partitioning With Deep Feature Learning 将图分区与深度特征学习相结合的可学习图匹配
MOST A Multi-Oriented Scene Text Detector With Localization Refinement MOST 具有定位细化的多向场景文本检测器
Multi-Source Domain Adaptation With Collaborative Learning for Semantic Segmentation 多源域自适应与语义分割的协作学习
Partial Person Re-Identification With Part-Part Correspondence Learning 部分人重新识别与部分对应学习
Towards Fast and Accurate Real-World Depth Super-Resolution Benchmark Dataset and Baseline 迈向快速准确的真实世界深度超分辨率基准数据集和基线
A Sliced Wasserstein Loss for Neural Texture Synthesis 用于神经纹理合成的切片 Wasserstein 损失
Natural Adversarial Examples 自然对抗的例子
Unsupervised Learning of 3D Object Categories From Videos in the Wild 从野外视频中无监督学习 3D 对象类别
Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps 使用基于可靠性的注意力图引导交互式视频对象分割
Neural Cellular Automata Manifold 神经元胞自动机流形
Trajectory Prediction With Latent Belief Energy-Based Model 基于潜在信念能量模型的轨迹预测
Back to Event Basics Self-Supervised Learning of Image Reconstruction for 回到事件基础图像重建的自监督学习
Beyond Image to Depth: Improving Depth Prediction Using Echoes Beyond Image to Depth：使用 Echoes 改进深度预测
Bridge To Answer: Structure-Aware Graph Interaction Network for Video Question Answering Bridge To Answer：用于视频问答的结构感知图交互网络
Improving Unsupervised Image Clustering With Robust Learning 通过鲁棒学习改进无监督图像聚类
Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised 在半监督中使用重用门函数学习动态网络
Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders 通过消息传递自动编码器进行无监督双曲线表示学习
Dual Contradistinctive Generative Autoencoder 双对比生成自动编码器
Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging 快速全局最优旋转平均的旋转坐标下降
Neural Parts Learning Expressive 3D Shape Abstractions With Invertible Neural 使用可逆神经元学习富有表现力的 3D 形状抽象的神经部分
AGORA Avatars in Geography Optimized for Regression Analysis 针对回归分析进行优化的地理 AGORA Avatars
LayoutGMN Neural Graph Matching for Structural Layout Similarity 用于结构布局相似性的 LayoutGMN 神经图匹配
SOLD2 Self-Supervised Occlusion-Aware Line Description and Detection SOLD2 自监督遮挡感知线描述和检测
Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE 使用分层 VQ-VAE 生成用于图像修复的多样化结构
Neural Body Implicit Neural Representations With Structured Latent Codes for 具有结构化潜在代码的神经体隐式神经表示
Temporal-Relational CrossTransformers for Few-Shot Action Recognition 用于少量动作识别的时间关系交叉变换器
Black-Box Explanation of Object Detectors via Saliency Maps 通过显着图对目标检测器进行黑盒解释
Learning To Predict Visual Attributes in the Wild 在野外学习预测视觉属性
Meta Pseudo Labels 元伪标签
Adversarial Imaging Pipelines 对抗性成像管道
Deep Multi-Task Learning for Joint Localization Perception and Prediction 用于联合定位感知和预测的深度多任务学习
Inverting Generative Adversarial Renderer for Face Reconstruction 用于人脸重建的反向生成对抗渲染器
Recognizing Actions in Videos From Unseen Viewpoints 从看不见的角度识别视频中的动作
SliceNet Deep Dense Depth Estimation From a Single Indoor Panorama Using a Slice-Based Representation 使用基于切片的表示从单个室内全景进行深度密集深度估计
CoMoGAN Continuous Model-Guided Image-to-Image Translation CoMoGAN 连续模型引导的图像到图像转换
Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks 针对视频识别网络的空中对抗闪烁攻击
CompositeTasking Understanding Images by Spatial Composition of Tasks CompositeTasking 通过任务的空间组合理解图像
Improving Panoptic Segmentation at All Scales 在所有尺度上改进全景分割
A Functional Approach to Rotation Equivariant Non-Linearities for Tensor Field 张量场旋转等变非线性的一种函数方法
Labeled From Unlabeled Exploiting Unlabeled Data for Few-Shot Deep HDR Deghosting 从未标记利用未标记数据进行标记以进行少量深度 HDR 去重影
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving 用于端到端自动驾驶的多模态融合 Transformer
Lifelong Person Re-Identification via Adaptive Knowledge Accumulation 通过自适应知识积累进行终身人员重新识别
D-NeRF Neural Radiance Fields for Dynamic Scenes 动态场景的 D-NeRF 神经辐射场
BABEL Bodies Action and Behavior With English Labels 带有英文标签的身体动作和行为
Multi-Scale Aligned Distillation for Low-Resolution Detection 用于低分辨率检测的多尺度对齐蒸馏
Offboard 3D Object Detection From Point Cloud Sequences 从点云序列进行离线 3D 对象检测
PQA Perceptual Question Answering PQA 感知问答
PU-GCN Point Cloud Upsampling Using Graph Convolutional Networks 使用图卷积网络的 PU-GCN 点云上采样
Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar 使用互补激光雷达在雾天进行稳健的多模式车辆检测
Roof-GAN Learning To Generate Roof Geometry and Relations for Residential Roof-GAN 学习为住宅生成屋顶几何形状和关系
Spatiotemporal Contrastive Video Representation Learning 时空对比视频表示学习
DetectoRS Detecting Objects With Recursive Feature Pyramid and Switchable Atrous DetectoRS 使用递归特征金字塔和可切换的 Atrous 检测对象
Uncertainty-Guided Model Generalization to Unseen Domains 对未知领域的不确定性引导模型泛化
VIP-DeepLab Learning Visual Perception With Depth-Aware Video Panoptic Segmentation VIP-DeepLab 通过深度感知视频全景分割学习视觉感知
Temporal Context Aggregation Network for Temporal Action Proposal Refinement 用于时间行动建议细化的时间上下文聚合网络
3DCaricShop: A Dataset and a Baseline Method for Single-View 3D Caricature Face Reconstruction 3DCaricShop：用于单视图 3D 漫画人脸重建的数据集和基线方法
Boosting Video Representation Learning With Multi-Faceted Integration 通过多方面集成促进视频表示学习
Effective Snapshot Compressive-Spectral Imaging via Deep Denoising and Total Variation 通过深度去噪和总变化进行有效的快照压缩光谱成像
Scene Essence 场景精华
Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation 通过双边增强对真实点云场景进行语义分割
DAT Training Deep Networks Robust To Label-Noise by Matching the DAT 训练深度网络通过匹配
Focus on Local Detecting Lane Marker From Bottom Up via Key Point 专注于通过关键点自下而上检测车道标记
DyGLIP A Dynamic Graph Model With Link Prediction for Accurate Multi-Camera Multiple Object Tracking 用于精确多摄像机多目标跟踪的具有链路预测的动态图模型
Removing Raindrops and Rain Streaks in One Go 一次性去除雨滴和雨痕
VoxelContext-Net An Octree Based Framework for Point Cloud Compression VoxelContext-Net 基于八叉树的点云压缩框架
Learning Complete 3D Morphable Face Models From Images and Videos 从图像和视频中学习完整的 3D 可变形人脸模型
Monocular Reconstruction of Neural Face Reflectance Fields 神经人脸反射场的单目重建
Exploiting Refining Depth Distributions With Triangulation Light Curtains 利用三角光幕精炼深度分布
Home Action Genome: Cooperative Compositional Action Understanding 家庭行动基因组：合作组成行动理解
ANR Articulated Neural Rendering for Virtual Avatars ANR：虚拟头像的铰接式神经渲染
Pixel-Aligned Volumetric Avatars 像素对齐的体积化身
Learning Delaunay Surface Elements for Mesh Reconstruction 学习用于网格重建的 Delaunay 曲面元素
Single Image Depth Prediction With Wavelet Decomposition 小波分解的单幅图像深度预测
Fair Attribute Classification Through Latent Space De-Biasing 通过潜在空间去偏的公平属性分类
Universal Spectral Adversarial Attacks for Deformable Shapes 可变形形状的通用光谱对抗攻击
Learning To Count Everything 学习计算一切
Categorical Depth Distribution Network for Monocular 3D Object Detection 用于单目 3D 目标检测的分类深度分布网络
DeRF Decomposed Radiance Fields DeRF 分解辐射场
Im2Vec Synthesizing Vector Graphics Without Vector Supervision Im2Vec 在没有矢量监督的情况下合成矢量图形
TesseTrack End-to-End Learnable Multi-Person Articulated 3D Pose Tracking TesseTrack 端到端可学习多人关节式 3D 姿势跟踪
SelfAugment Automatic Augmentation Policies for Self-Supervised Learning SelfAugment 用于自我监督学习的自动增强策略
Every Annotation Counts Multi-Label Deep Supervision for Medical Image Segmentation 医学图像分割的每个注释都计入多标签深度监督
PANDA Adapting Pretrained Features for Anomaly Detection and Segmentation PANDA 调整预训练特征以进行异常检测和分割
3D Spatial Recognition Without Spatially Labeled 3D 3D 空间识别，无需空间标记 3D
Adaptive Consistency Prior Based Deep Network for Image Denoising 基于自适应一致性先验的图像去噪深度网络
Flow Guided Transformable Bottleneck Networks for Motion Retargeting 用于运动重定向的流引导可转换瓶颈网络
Learning From the Master Distilling Cross-Modal Advanced Knowledge for Lip Reading 向大师学习：提炼唇读的跨模态高级知识
Reciprocal Transformations for Unsupervised Video Object Segmentation 无监督视频对象分割的倒数变换
On the Difficulty of Membership Inference Attacks 关于成员推断攻击的难度
Encoding in Style A StyleGAN Encoder for Image-to-Image Translation 用于图像到图像转换的 StyleGAN 编码器
Stable View Synthesis 稳定的视图合成
Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot 探索 Few-Shot 的不变和等变表示的互补优势
Enriching ImageNet With Human Similarity Judgments and Psychological Embeddings 用人类相似性判断和心理嵌入丰富 ImageNet
End-to-End High Dynamic Range Camera Pipeline Optimization 端到端高动态范围相机流程优化
Spatially Consistent Representation Learning 空间一致表示学习
Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation 多目标领域适应的课程图联合教学
DeFMO Deblurring and Shape Recovery of Fast Moving Objects 快速移动物体的 DeFMO 去模糊和形状恢复
Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition 用于有效面部表情识别的特征分解和重建学习
Gaussian Context Transformer 高斯上下文转换器
Learning an Explicit Weighting Scheme for Adapting Complex HSI Noise 学习适应复杂 HSI 噪声的显式加权方案
Uncertainty Reduction for Model Adaptation in Semantic Segmentation 语义分割中模型适应的不确定性降低
Visual Semantic Role Labeling for Video Understanding 用于视频理解的视觉语义角色标签
Learning-Based Image Registration With Meta-Regularization 元正则化的基于学习的图像配准
Learning To Relate Depth and Semantics for Unsupervised Domain Adaptation 学习关联深度和语义以进行无监督域适应
LOHO Latent Optimization of Hairstyles via Orthogonalization 通过正交化对发型进行 LOHO 潜在优化
StyleMeUp Towards Style-Agnostic Sketch-Based Image Retrieval StyleMeUp 迈向与风格无关的基于草图的图像检索
SCANimate Weakly Supervised Learning of Skinned Clothed Avatar Networks SCANimate 弱监督学习的蒙皮阿凡达网络
Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking 多目标跟踪的概率 Tracklet 评分和修复
Multiresolution Knowledge Distillation for Anomaly Detection 用于异常检测的多分辨率知识蒸馏
Revamping Cross-Modal Recipe Retrieval With Hierarchical Transformers and Self-Supervised Learning 使用分层 Transformer 和自我监督学习改进跨模态食谱检索
Affective Processes: Stochastic Modelling of Temporal Context for Emotion and Facial Expression Recognition 情感过程：用于情感和面部表情识别的时间上下文的随机建模
Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual 通过用于虚拟的生成 3D 服装模型的自我监督碰撞处理
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation 用于无监督动作分割的时间加权分层聚类
Back to the Feature Learning Robust Camera Localization From Pixels 回到从像素学习鲁棒相机定位的特征
Domain-Independent Dominance of Adaptive Methods 自适应方法的领域无关优势
Information-Theoretic Segmentation by Inpainting Error Maximization 通过修复误差最大化的信息论分割
Improved Handling of Motion Blur in Online Object Detection 在线对象检测中改进的运动模糊处理
Invisible Perturbations Physical Adversarial Examples Exploiting the Rolling Shutter Effect 不可见的扰动物理对抗示例利用滚动快门效应
Unsupervised Human Pose Estimation Through Transforming Shape Templates 通过变换形状模板进行无监督人体姿势估计
CASTing Your Model Learning To Localize Improves Self-Supervised Representations 铸造你的模型学习本地化改进自我监督的表示
Probabilistic 3D Human Shape and Pose Estimation From Multiple Unconstrained 来自多个无约束的概率 3D 人体形状和姿势估计
Look Before You Speak Visually Contextualized Utterances 在你说视觉语境化的话语之前先看看
Multi-Perspective LSTM for Joint Visual Representation Learning 用于联合视觉表示学习的多视角 LSTM
Achieving Robustness in Classification Using Optimal Transport With Hinge Regularization 使用带有铰链正则化的最优传输实现分类的鲁棒性
Single Pair Cross-Modality Super Resolution 单对跨模态超分辨率
Introvert: Human Trajectory Prediction via Conditional 3D Attention 通过条件 3D 注意预测人类轨迹
Spatially-Adaptive Pixelwise Networks for Fast Image Translation 用于快速图像翻译的空间自适应像素网络
Efficient Conditional GAN Transfer With Knowledge Propagation Across Classes 跨类知识传播的高效条件 GAN 迁移
Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation 用于无监督域适应的基于实例级亲和力的迁移
Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression 通过增加动态范围和抑制来增强夜间能见度
Dive Into Ambiguity Latent Distribution Mining and Pairwise Uncertainty Estimation 潜入歧义潜在分布挖掘和成对不确定性估计
Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment 用于统一美学评估的分层布局感知图卷积网络
CFNet Cascade and Fused Cost Volume for Robust Stereo Matching 用于稳健立体匹配的 CFNet 级联和融合成本体积
Closed-Form Factorization of Latent Semantics in GANs GAN 中潜在语义的封闭式分解
DCT-Mask Discrete Cosine Transform Mask Representation for Instance Segmentation 用于实例分割的 DCT-Mask 离散余弦变换掩码表示
Learning To Segment Actions From Visual and Language Instructions via Differentiable Weak Sequence Alignment 学习通过可微弱序列对齐从视觉和语言指令中分割动作
S2-BNN Bridging the Gap Between Self-Supervised Real and 1-Bit Neural S2-BNN 弥合自我监督真实和 1 位神经之间的差距
Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes 具有 10^7 个节点的大规模图上的结构感知人脸聚类
Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation 面向弱监督全景分割的联合事物挖掘
Training Generative Adversarial Networks in One Stage 在一个阶段训练生成对抗网络
Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing 可验证性和可预测性：解释用于点云处理的网络架构的实用程序
SSN Soft Shadow Network for Image Compositing 用于图像合成的 SSN 软阴影网络
Continual Learning via Bit-Level Information Preserving 通过比特级信息保存进行持续学习
Fingerspelling Detection in American Sign Language 美国手语中的手指拼写检测
GLAVNet Global-Local Audio-Visual Cues for Fine-Grained Material Recognition 用于细粒度材料识别的 GLAVNet 全局-局部视听线索
Learning by Planning Language-Guided Global Image Editing 通过规划语言引导的全局图像编辑来学习
Lifting 2D StyleGAN for 3D-Aware Face Generation 提升 2D StyleGAN 以生成 3D 感知人脸
Self-Supervised Visibility Learning for Novel View Synthesis 新视图合成的自监督可见性学习
SGCN Sparse Graph Convolution Network for Pedestrian Trajectory Prediction 用于行人轨迹预测的 SGCN 稀疏图卷积网络
Skeleton Merger An Unsupervised Aligned Keypoint Detector Skeleton Merger 无监督对齐关键点检测器
StablePose Learning 6D Object Poses From Geometrically Stable Patches StablePose 从几何稳定的补丁中学习 6D 对象姿势
ZeroScatter: Domain Transfer for Long Distance Imaging and Vision Through Scattering Media ZeroScatter：通过散射介质进行远距离成像和视觉的域转移
clDice - A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation clDice - 一种用于管状结构分割的新型拓扑保持损失函数
Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled 学习空间语义关系用于有限标记的面部属性识别
Open Domain Generalization with Domain-Augmented Meta-Learning 具有域增强元学习的开放域泛化
SiamMOT Siamese Multi-Object Tracking SiamMOT 连体多目标追踪
Motion Representations for Articulated Animation 关节动画的运动表示
On Learning the Geodesic Path for Incremental Learning 关于学习增量学习的测地线路径
Combining Semantic Guidance and Deep Reinforcement Learning for Generating Human 结合语义指导和深度强化学习来生成人类
DISCO Dynamic and Invariant Sensitive Channel Obfuscation for Deep Neural Networks DISCO 用于深度神经网络的动态和不变敏感通道混淆
Rectification-Based Knowledge Retention for Continual Learning 持续学习的基于修正的知识保留
Semi-Supervised Action Recognition With Temporal Contrastive Learning 时间对比学习的半监督动作识别
TextOCR Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text 面向任意形状场景文本的大规模端到端推理的 TextOCR
Understanding Failures of Deep Networks via Robust Feature Extraction 通过鲁棒特征提取了解深度网络的故障
Deep Animation Video Interpolation in the Wild 野外深度动画视频插值
Adversarial Generation of Continuous Images 连续图像的对抗性生成
HDR Environment Map Estimation for Real-Time Augmented Reality 实时增强现实的 HDR 环境地图估计
SRWarp Generalized Image Super-Resolution under Arbitrary Transformation SRWarp 任意变换下的广义图像超分辨率
AdaStereo A Simple and Efficient Approach for Adaptive Stereo Matching AdaStereo 一种简单有效的自适应立体匹配方法
AdderSR Towards Energy Efficient Image Super-Resolution AdderSR 迈向节能图像超分辨率
Co-Grounding Networks With Semantic Attention for Referring Expression Comprehension in Videos 具有语义注意的共同接地网络用于在视频中引用表达理解
Communication Efficient SGD via Gradient Sampling With Bayes Prior 通过使用贝叶斯先验的梯度采样实现高效的 SGD
Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation 用于面部动作单元强度估计的动态概率图卷积
Hybrid Message Passing With Performance-Driven Structures for Facial Action Unit Detection 用于面部动作单元检测的性能驱动结构的混合消息传递
Mesh Saliency: An Independent Perceptual Measure or a Derivative of Image Saliency? Mesh Saliency：一种独立的感知度量还是图像显着性的导数？
Pareidolia Face Reenactment Pareidolia 面部重演
Spatio-temporal Contrastive Domain Adaptation for Action Recognition 动作识别的时空对比域自适应
Towards Diverse Paragraph Captioning for Untrimmed Videos 迈向未修剪视频的多样化段落字幕
Tree-Like Decision Distillation 树状决策蒸馏
Bottleneck Transformers for Visual Recognition 视觉识别的瓶颈转换器
NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis NeRV：用于重新照明和视图合成的神经反射率和可见性场
Improving Multiple Pedestrian Tracking by Track Management and Occlusion Handling 通过跟踪管理和遮挡处理改进多行人跟踪
Right for the Right Concept Revising Neuro-Symbolic Concepts by Interacting 正确概念的权利通过互动修改神经符号概念
Using Shape To Categorize Low-Shot Learning With an Explicit Shape 使用形状对具有显式形状的 Low-Shot 学习进行分类
SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping SMURF：具有全图像变形的自学多帧无监督 RAFT
A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification 用于细粒度分类的半监督学习的现实评估
ArtCoder An End-to-End Method for Generating Scanning-Robust Stylized QR Codes ArtCoder 一种生成扫描功能强大的风格化二维码的端到端方法
BCNet Searching for Network Width With Bilaterally Coupled Network BCNet 用双边耦合网络搜索网络宽度
Prioritized Architecture Sampling With Monto-Carlo Tree Search 使用 Monto-Carlo 树搜索的优先架构采样
The Affective Growth of Computer Vision 计算机视觉的情感增长
Energy-Based Learning for Scene Graph Generation 用于场景图生成的基于能量的学习
Gated Spatio-Temporal Attention-Guided Video Deblurring 门控时空注意力引导视频去模糊
AutoFlow Learning a Better Training Set for Optical Flow AutoFlow 学习更好的光流训练集
Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal 深度敏感注意力和自动多模态的深度 RGB-D 显着性检测
Deep Video Matting via Spatio-Temporal Alignment and Aggregation 通过时空对齐和聚合进行深度视频抠图
Dynamic Metric Learning Towards a Scalable Metric Space To Accommodate 动态度量学习迈向可扩展的度量空间以适应
FSCE Few-Shot Object Detection via Contrastive Proposal Encoding 通过对比建议编码的 FSCE 少镜头目标检测
HoHoNet 360 Indoor Holistic Understanding With Latent Horizontal Features 具有潜在水平特征的 HoHoNet 360 室内整体理解
Improving the Efficiency and Robustness of Deepfakes Detection Through Precise 通过精确度提高 Deepfake 检测的效率和稳健性
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer 通过分而治之的室内全景平面 3D 重建
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning 使用深度强化学习的引用表达基础的迭代收缩
Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution 通过单深度超分辨率的跨任务知识转移学习场景结构指导
Learning View Selection for 3D Scenes 学习 3D 场景的视图选择
Lesion-Aware Transformers for Diabetic Retinopathy Grading 用于糖尿病视网膜病变分级的病变感知 Transformer
LoFTR Detector-Free Local Feature Matching With Transformers LoFTR 使用 Transformer 的无检测器局部特征匹配
NeuralRecon Real-Time Coherent 3D Reconstruction From Monocular Video 基于单目视频的 NeuralRecon 实时相干 3D 重建
RSN Range Sparse Net for Efficient Accurate LiDAR 3D Object Detection 用于高效准确 LiDAR 3D 对象检测的 RSN 范围稀疏网络
Semantic Image Matting 语义图像抠图
Soteria Provable Defense Against Privacy Leakage in Federated Learning From Soteria 可证明的联邦学习中的隐私泄露防御
Sparse R-CNN End-to-End Object Detection With Learnable Proposals 具有可学习建议的稀疏 R-CNN 端到端对象检测
Task Programming Learning Data Efficient Behavior Representations 任务编程学习数据高效行为表示
Tuning IR-Cut Filter for Illumination-Aware Spectral Reconstruction From RGB 调整 IR-Cut 滤光片以从 RGB 进行照明感知光谱重建
Tracking Pedestrian Heads in Dense Crowd 跟踪密集人群中的行人头部
NeuralHumanFVV Real-Time Neural Volumetric Human Performance Rendering Using RGB Cameras 使用 RGB 相机的 NeuralHumanFVV 实时神经体积人体性能渲染
TrafficSim Learning To Simulate Realistic Multi-Agent Behaviors TrafficSim 学习模拟现实的多智能体行为
Learning the Predictability of the Future 学习未来的可预测性
Fast End-to-End Learning on Protein Surfaces 蛋白质表面的快速端到端学习
Keep Your Eyes on the Lane Real-Time Attention-Guided Lane Detection 关注车道实时注意力引导车道检测
The Neural Tangent Link Between CNN Denoisers and Non-Local Filters CNN 降噪器和非局部滤波器之间的神经切线链接
Knowledge Evolution in Neural Networks 神经网络中的知识进化
Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification 用于半监督图像分类的自监督 Wasserstein 伪标记
Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks 用于密集预测任务的密集连接多扩张卷积网络
Event-Based Bispectral Photometry Using Temporally Modulated Illumination 使用时间调制照明的基于事件的双谱光度测量
Neural Geometric Level of Detail Real-Time Rendering With Implicit 3D 具有隐式 3D 的神经几何细节实时渲染
QPIC Query-Based Pairwise Human-Object Interaction Detection With Image-Wide Contextual Information 基于图像范围上下文信息的基于 QPIC 查询的成对人-对象交互检测
CodedStereo Learned Phase Masks for Large Depth-of-Field Stereo 用于大景深立体视觉的 CodedStereo 学习相位模板
Diverse Semantic Image Synthesis via Probability Distribution Modeling 基于概率分布建模的多样化语义图像合成
Equalization Loss v2 A New Gradient Balance Approach for Long-Tailed Equalization Loss v2 一种新的长尾梯度平衡方法
HumanGPS Geodesic PreServing Feature for Dense Human Correspondences 用于密集人体对应的 HumanGPS 测地线保持特征
Mirror3D Depth Refinement for Mirror Surfaces 镜面的深度细化
OTCE A Transferability Metric for Cross-Domain Cross-Task Representations OTCE 跨域跨任务表示的可转移性指标
Practical Wide-Angle Portraits Correction With Deep Structured Models 深度结构化模型的实用广角人像校正
SceneGen Learning To Generate Realistic Traffic Scenes SceneGen 学习生成逼真的交通场景
Learned Initializations for Optimizing Coordinate-Based Neural Representations 用于优化基于坐标的神经表示的学习初始化
Humble Teachers Teach Better Students for Semi-Supervised Object Detection 谦虚的老师教更好的学生进行半监督目标检测
Layerwise Optimization by Gradient Decomposition for Continual Learning 通过梯度分解进行分层优化以进行持续学习
Learning Camera Localization via Dense Scene Matching 通过密集场景匹配学习相机定位
Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction 从时空描述符中学习并行密集对应，以实现高效和稳健的 4D 重建
Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms 利用大规模弱标记数据在乳房 X 线照片中进行半监督肿块检测
Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation 看得更近以分割得更好：实例分割的边界补丁细化
Manifold Regularized Dynamic Network Pruning 流形正则化动态网络修剪
Mutual CRF-GNN for Few-Shot Learning 用于 Few-Shot 学习的相互 CRF-GNN
SKFAC Training Neural Networks With Faster Kronecker-Factored Approximate Curvature SKFAC 训练具有更快克罗内克因子近似曲率的神经网络
HITNet Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching 用于实时立体匹配的 HITNet 分层迭代瓦片细化网络
EnD Entangling and Disentangling Deep Representations for Bias Correction EnD 用于偏差校正的深度表示纠缠与解缠
RAFT-3D Scene Flow Using Rigid-Motion Embeddings 使用刚体运动嵌入的 RAFT-3D 场景流
Tangent Space Backpropagation for 3D Transformation Groups 3D 变换组的切线空间反向传播
Consensus Maximisation Using Influences of Monotone Boolean Functions 使用单调布尔函数影响的共识最大化
Nutrition5k Towards Automatic Nutritional Understanding of Generic Food Nutrition5k 迈向对普通食品的自动营养理解
BoxInst High-Performance Instance Segmentation With Box Annotations BoxInst 使用框注释的高性能实例分割
Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks 视听融合能否增强多模态攻击下的鲁棒性
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation 探测物体视觉接地和声音分离的循环协同学习
Farewell to Mutual Information Variational Distillation for Cross-Modal Person Re-Identification 告别互信息变分蒸馏用于跨模态人员重新识别
Probabilistic Selective Encryption of Convolutional Neural Networks for Hierarchical Services 用于分层服务的卷积神经网络的概率选择性加密
Unsupervised Object Detection With LIDAR Clues 使用 LIDAR 线索进行无监督目标检测
Modeling Multi-Label Action Dependencies for Temporal Action Localization 为时间动作本地化建模多标签动作依赖关系
Coming Down to Earth Satellite-to-Street View Synthesis for Geo-Localization 用于地理定位的地球卫星到街景合成
Post-Hoc Uncertainty Calibration for Domain Drift Scenarios 域漂移场景的事后不确定性校准
FaceSec A Fine-Grained Robustness Evaluation Framework for Face Recognition Systems FaceSec 人脸识别系统的细粒度鲁棒性评估框架
SMD-Nets Stereo Mixture Density Networks SMD-Nets 立体混合密度网络
Automatic Correction of Internal Units in Generative Neural Networks 生成神经网络内部单元的自动校正
Explore Image Deblurring via Encoded Blur Kernel Space 通过编码模糊内核空间探索图像去模糊
SSLayout360 Semi-Supervised Indoor Layout Estimation From 360deg Panorama SSLayout360 360 度全景的半监督室内布局估计
Repurposing GANs for One-Shot Semantic Part Segmentation 将 GAN 重新用于一次性语义部分分割
Reconsidering Representation Alignment for Multi-View Clustering 重新考虑多视图聚类的表示对齐
Data-Free Model Extraction 无数据模型提取
Learning Accurate Dense Correspondences and When To Trust Them 学习准确的密集对应以及何时信任它们
Unsupervised Learning for Robust Fitting A Reinforcement Learning Approach 用于鲁棒拟合的无监督学习强化学习方法
Regularizing Generative Adversarial Networks Under Limited Data 在有限数据下规范生成对抗网络
Learning Better Visual Dialog Agents With Pretrained Visual-Linguistic Representation 使用预训练的视觉语言表示学习更好的视觉对话代理
ColorRL Reinforced Coloring for End-to-End Instance Segmentation 用于端到端实例分割的 ColorRL 增强着色
Time Lens Event-Based Video Frame Interpolation 基于时间镜头事件的视频帧插值
Found a Reason for me Weakly-supervised Grounded Visual Question Answering 为我找到了一个理由弱监督接地视觉问答
Joint Learning of 3D Shape Retrieval and Deformation 3D 形状检索和变形的联合学习
Uncertainty-Aware Camera Pose Estimation From Points and Lines 基于点和线的不确定性感知相机姿态估计
There Is More Than Meets the Eye Self-Supervised Multi-Object Detection 自我监督的多目标检测远不止于此
The Multi-Temporal Urban Development SpaceNet Dataset 多时相城市发展 SpaceNet 数据集
Benchmarking Representation Learning for Natural World Image Collections 自然世界图像集合的表示学习基准评测
Read and Attend: Temporal Localisation in Sign Language Videos 阅读并参加：手语视频中的时间本地化
Scaling Local Self-Attention for Parameter Efficient Visual Backbones 为参数有效的视觉骨干缩放局部自我注意
Efficient Feature Transformations for Discriminative and Generative Continual Learning 判别式和生成式持续学习的高效特征转换
CRFace Confidence Ranker for Model-Agnostic Face Detection Refinement CRFace Confidence Ranker 用于模型无关的人脸检测细化
Plan2Scene Converting Floorplans to 3D Scenes Plan2Scene 将平面图转换为 3D 场景
Continual Adaptation of Visual Representations via Domain Randomization and Meta-Learning 通过域随机化和元学习持续适应视觉表示
VDSM Unsupervised Video Disentanglement With State-Space Modeling and Deep Mixtures 使用状态空间建模和深度混合的 VDSM 无监督视频解缠结
MeGA-CDA Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object 类别感知无监督域自适应对象的 MeGA-CDA 内存引导注意
Can We Characterize Tasks Without Labels or Features 我们可以在没有标签或特征的情况下表征任务吗
A Generalized Loss Function for Crowd Counting and Localization 人群计数和定位的广义损失函数
Self-Attention Based Text Knowledge Mining for Text Detection 用于文本检测的基于自注意力的文本知识挖掘
CanonPose Self-Supervised Monocular 3D Human Pose Estimation in the Wild CanonPose 在野外自我监督的单目 3D 人体姿态估计
3DIoUMatch Leveraging IoU Prediction for Semi-Supervised 3D Object Detection 3DIoUMatch 利用 IoU 预测进行半监督 3D 对象检测
A Self-Boosting Framework for Automated Radiographic Report Generation 用于自动生成射线照相报告的自增强框架
ACTION-Net Multipath Excitation for Action Recognition 用于动作识别的 ACTION-Net 多路径激励
Adaptive Class Suppression Loss for Long-Tail Object Detection 长尾目标检测的自适应类抑制损失
AdvSim Generating Safety-Critical Scenarios for Self-Driving Vehicles AdvSim 为自动驾驶汽车生成安全关键场景
AttentiveNAS Improving Neural Architecture Search via Attentive Sampling AttentiveNAS 通过 Attentive Sampling 改进神经架构搜索
Automatic Vertebra Localization and Identification in CT by Spine Rectification 脊柱矫正在 CT 中的椎体自动定位和识别
Bi-GCN Binary Graph Convolutional Network Bi-GCN 二值图卷积网络
Birds of a Feather Capturing Avian Shape Models From Images 从图像中捕捉鸟类形状模型的羽毛鸟
Combinatorial Learning of Graph Edit Distance via Dynamic Embedding 基于动态嵌入的图编辑距离组合学习
Contrastive Learning Based Hybrid Networks for Long-Tailed Image Classification 基于对比学习的长尾图像分类混合网络
Convolutional Neural Network Pruning With Structural Redundancy Reduction 具有结构冗余减少的卷积神经网络修剪
Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection 用于半监督目标检测的数据不确定性引导多阶段学习
Deep Two-View Structure-From-Motion Revisited 重新审视运动的深度双视图结构
Delving into Data Effectively Substitute Training for Black-box Attack 深入研究数据有效地替代了黑盒攻击的训练
Dense Contrastive Learning for Self-Supervised Visual Pre-Training 用于自我监督视觉预训练的密集对比学习
Depth-Conditioned Dynamic Message Propagation for Monocular 3D Object Detection 用于单目 3D 目标检测的深度条件动态消息传播
Domain-Specific Suppression for Adaptive Object Detection 自适应对象检测的域特定抑制
Dual Attention Suppression Attack Generate Adversarial Camouflage in Physical World 双重注意力抑制攻击在物理世界中产生对抗伪装
End-to-End Object Detection With Fully Convolutional Network 使用全卷积网络的端到端目标检测
End-to-End Video Instance Segmentation With Transformers 使用 Transformer 进行端到端视频实例分割
Enhancing the Transferability of Adversarial Attacks Through Variance Tuning 通过方差调整增强对抗性攻击的可转移性
EvDistill Asynchronous Events To End-Task Learning via Bidirectional Reconstruction-Guided Cross-Modal EvDistill 异步事件通过双向重构引导的跨模态完成任务学习
Exploring Sparsity in Image Super-Resolution for Efficient Inference 探索图像超分辨率中的稀疏性以实现高效推理
FAIEr Fidelity and Adequacy Ensured Image Caption Evaluation FAIEr 保真度和充分性确保图像说明评估
FESTA Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds 通过场景点云的时空注意进行 FESTA 流估计
From Rain Generation to Rain Removal 从造雨到除雨
From Semantic Categories to Fixations A Novel Weakly-Supervised Visual-Auditory Saliency 从语义类别到注视一种新的弱监督视听显着性
GDR-Net Geometry-Guided Direct Regression Network for Monocular 6D Object Pose 用于单目 6D 对象姿态的 GDR-Net 几何引导直接回归网络
Glancing at the Patch Anomaly Localization With Global and Local 从全局和局部看补丁异常定位
Gradient-Based Algorithms for Machine Teaching 基于梯度的机器教学算法
Hijack-GAN Unintended-Use of Pretrained Black-Box GANs Hijack-GAN 对预训练黑盒 GAN 的意外使用
HLA-Face Joint High-Low Adaptation for Low Light Face Detection 低光人脸检测联合高低自适应
IBRNet Learning Multi-View Image-Based Rendering IBRNet 学习多视图基于图像的渲染
Image Inpainting With External-Internal Learning and Monochromic Bottleneck 具有内外学习和单色瓶颈的图像修复
IMAGINE Image Synthesis by Image-Guided Model Inversion 通过图像引导模型反演 IMAGINE 图像合成
Implicit Feature Alignment Learn To Convert Text Recognizer to Text Spotter 隐式特征对齐学习将文本识别器转换为文本定位器
Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship 通过结合几何关系改进基于 OCR 的图像描述
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation 通过对比知识蒸馏改进弱监督的视觉定位
Learning Compositional Radiance Fields of Dynamic Human Heads 学习动态人头的组成辐射场
Learning Fine-Grained Segmentation of 3D Shapes Without Part Labels 学习没有零件标签的 3D 形状的细粒度分割
LED2-Net Monocular 360deg Layout Estimation via Differentiable Depth Rendering 基于可微深度渲染的 LED2-Net 单目 360 度布局估计
Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration 用于 3D 人体网格配准的局部感知分段变换场
MaX-DeepLab End-to-End Panoptic Segmentation With Mask Transformers MaX-DeepLab 端到端全景分割与掩模转换器
MetaSCI Scalable and Adaptive Reconstruction for Video Compressive Sensing 用于视频压缩感知的 MetaSCI 可扩展和自适应重建
Multi-Decoding Deraining Network and Quasi-Sparsity Based Training 多解码去雨网络和基于准稀疏的训练
Multiple Object Tracking With Correlation Learning 使用相关学习进行多对象跟踪
One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing 用于视频会议的 One-Shot Free-View 神经说话头合成
ORDisCo Effective and Efficient Usage of Incremental Unlabeled Data for ORDisCo 有效和高效地使用增量未标记数据
PatchmatchNet Learned Multi-View Patchmatch Stereo PatchmatchNet 学习多视图 Patchmatch Stereo
PAUL Procrustean Autoencoder for Unsupervised Lifting 用于无监督提升的 PAUL Procrustean 自动编码器
PointAugmenting Cross-Modal Augmentation for 3D Object Detection 用于 3D 对象检测的 PointAugmenting 跨模态增强
ProSelfLC Progressive Self Label Correction for Training Robust Deep Neural ProSelfLC 渐进式自我标签校正用于训练鲁棒深度神经
Prototype-Supervised Adversarial Network for Targeted Attack of Deep Hashing 用于深度哈希目标攻击的原型监督对抗网络
Pseudo Facial Generation With Extreme Poses for Face Recognition 具有极端姿势的人脸识别伪面部生成
PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization PWCLO-Net：在 3D 点云中使用分层嵌入掩模优化的深度 LiDAR 里程计
Removing the Background by Adding the Background Towards Background Robust 通过向背景添加背景来去除背景
Repopulating Street Scenes 重新填充街景
Representative Forgery Mining for Fake Face Detection 用于假人脸检测的代表性伪造挖掘
Rethinking and Improving the Robustness of Image Style Transfer 重新思考和提高图像风格迁移的鲁棒性
Rich Features for Perceptual Quality Assessment of UGC Videos 丰富的 UGC 视频感知质量评估功能
RSG A Simple but Effective Module for Learning Imbalanced Datasets RSG 一个简单但有效的学习不平衡数据集的模块
Scaled-YOLOv4 Scaling Cross Stage Partial Network Scaled-YOLOv4 缩放跨阶段部分网络
Scene Text Retrieval via Joint Text Detection and Similarity Learning 通过联合文本检测和相似性学习进行场景文本检索
Scene-Aware Generative Network for Human Motion Synthesis 用于人体运动合成的场景感知生成网络
Seesaw Loss for Long-Tailed Instance Segmentation 长尾实例分割的跷跷板损失
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal 半监督时间行动建议的自我监督学习
Single-Stage Instance Shadow Detection With Bidirectional Relation Learning 具有双向关系学习的单阶段实例阴影检测
Structured Multi-Level Interaction Network for Video Moment Localization via Language 通过语言进行视频时刻定位的结构化多级交互网络
Structured Scene Memory for Vision-Language Navigation 视觉语言导航的结构化场景记忆
SwiftNet Real-Time Video Object Segmentation SwiftNet 实时视频对象分割
Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes 在 3D 场景中合成长期 3D 人体运动和交互
T2VLAD Global-Local Sequence Alignment for Text-Video Retrieval 用于文本视频检索的 T2VLAD 全局-局部序列对齐
TDN Temporal Difference Networks for Efficient Action Recognition 用于高效动作识别的 TDN 时间差分网络
Towards More Flexible and Accurate Object Tracking With Natural Language 使用自然语言实现更灵活、更准确的对象跟踪
Towards Real-World Blind Face Restoration With Generative Facial Prior 通过生成面部先验实现真实世界的盲人脸修复
Training Networks in Null Space of Feature Covariance for Continual 连续特征协方差零空间中的训练网络
Transformer Meets Tracker Exploiting Temporal Context for Robust Visual Tracking Transformer 遇上跟踪器，利用时间上下文实现强大的视觉跟踪
Troubleshooting Blind Image Quality Models in the Wild 在野外对盲图像质量模型进行故障排除
Understanding the Behaviour of Contrastive Loss 了解对比损失的行为
Understanding the Robustness of Skeleton-Based Action Recognition Under Adversarial Attack 了解对抗性攻击下基于骨架的动作识别的鲁棒性
Unsupervised Degradation Representation Learning for Blind Super-Resolution 用于盲超分辨率的无监督退化表示学习
Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination 通过跨级实例组判别进行无监督特征学习
Unsupervised Visual Attention and Invariance for Reinforcement Learning 用于强化学习的无监督视觉注意和不变性
Unsupervised Visual Representation Learning by Tracking Patches in Video 通过跟踪视频中的补丁进行无监督视觉表示学习
Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images 基于显着图像的类不可知学习的弱监督实例分割
When Human Pose Estimation Meets Robustness Adversarial Algorithms and Benchmarks 当人体姿势估计遇到鲁棒性对抗算法和基准时
The Temporal Opportunist Self-Supervised Multi-Frame Monocular Depth 时间机会主义自监督多帧单目深度
NeuralFusion Online Depth Fusion in Latent Space 潜在空间中的 NeuralFusion 在线深度融合
CReST A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning CReST 用于不平衡半监督学习的类再平衡自我训练框架
Improved Image Matting via Real-Time User Clicks and Uncertainty Estimation 通过实时用户点击和不确定性估计改进图像抠图
MetaAlign Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation MetaAlign 用于无监督域适应的协调域对齐和分类
PV-RAFT Point-Voxel Correlation Fields for Scene Flow Estimation of Point 用于点场景流估计的 PV-RAFT 点体素相关场
Shallow Feature Matters for Weakly Supervised Object Localization 弱监督对象定位的浅特征很重要
Unsupervised Real-World Image Super Resolution via Domain-Distance Aware Training 通过域距离感知训练的无监督真实世界图像超分辨率
Visual Room Rearrangement 视觉室重新布置
Autoregressive Stylized Motion Synthesis With Generative Flow 具有生成流的自回归程式化运动合成
Cycle4Completion Unpaired Point Cloud Completion Using Cycle Transformation With Missing Cycle4Completion 未配对点云完成使用带有缺失的循环变换
Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark 检测跟踪和计数在人群中遇到无人机：基准
Learning Progressive Point Embeddings for 3D Point Cloud Generation 学习用于 3D 点云生成的渐进式点嵌入
PMP-Net Point Cloud Completion by Learning Multi-Step Point Moving Paths 通过学习多步点移动路径完成 PMP-Net 点云
Seeking the Shape of Sound An Adaptive Framework for Learning 寻找声音的形状自适应学习框架
Holistic 3D Human and Scene Mesh Estimation From Single View 单一视图的整体 3D 人体和场景网格估计
Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical 使用分层的实例分割中长尾的无监督发现
Backdoor Attacks Against Deep Learning Systems in the Physical World 物理世界中针对深度学习系统的后门攻击
Few-Shot Classification With Feature Map Reconstruction Networks 使用特征图重建网络进行 Few-Shot 分类
Separating Skills and Concepts for Novel Visual Question Answering 新颖视觉问答的分离技巧和概念
Deep Active Surface Models 深层活动表面模型
Co-Attention for Conditioned Image Matching 条件图像匹配的共同注意
Neural Splines Fitting 3D Surfaces With Infinitely-Wide Neural Networks 使用无限宽神经网络拟合 3D 表面的神经样条
MonoRec Semi-Supervised Dense Reconstruction in Dynamic Environments From a Single 动态环境中的 MonoRec 半监督密集重建
NeX Real-Time View Synthesis With Neural Basis Expansion 具有神经基础扩展的 NeX 实时视图合成
DeFlow: Learning Complex Image Degradations From Unpaired Data With Conditional DeFlow 从具有条件的未配对数据中学习复杂的图像退化
Learning To Associate Every Segment for Video Panoptic Segmentation 学习关联视频全景分割的每个片段
On Semantic Similarity in Video Retrieval 视频检索中的语义相似度
Adversarial Robustness Under Long-Tailed Distribution 长尾分布下的对抗鲁棒性
Boosting Ensemble Accuracy by Revisiting Ensemble Diversity Metrics 通过重新审视集成多样性指标来提高集成精度
Contrastive Learning for Compact Single Image Dehazing 紧凑型单幅图像去雾的对比学习
DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation DANNet：一种用于无监督夜间语义分割的单阶段域适应网络
De-Rendering the Worlds Revolutionary Artefacts 对世界革命性文物进行去渲染
Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification 发现用于可见红外人员重新识别的跨模态细微差别
Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation 用于弱监督语义分割的嵌入式判别注意机制
Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing 探索弱监督视听视频解析的异构线索
Fashion IQ A New Dataset Towards Retrieving Images by Natural Fashion IQ 一个通过自然检索图像的新数据集
Forecasting Irreversible Disease via Progression Learning 通过渐进学习预测不可逆疾病
Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction 用于大规模视频预测的贪婪分层变分自编码器
Improving the Transferability of Adversarial Samples With Adversarial Transformations 通过对抗性转换提高对抗性样本的可转移性
Incremental Learning via Rate Reduction 通过降速进行增量学习
MotionRNN A Flexible Model for Video Prediction With Spacetime-Varying Motions MotionRNN 一种灵活的时空运动视频预测模型
Progressive Unsupervised Learning for Visual Object Tracking 视觉对象跟踪的渐进式无监督学习
SceneGraphFusion Incremental 3D Scene Graph Prediction From RGB-D Sequences SceneGraphFusion 基于 RGB-D 序列的增量 3D 场景图预测
StyleSpace Analysis Disentangled Controls for StyleGAN Image Generation 用于 StyleGAN 图像生成的 StyleSpace 分析分离控制
Towards Long-Form Video Understanding 迈向长视频理解
Track To Detect and Segment: An Online Multi-Object Tracker 跟踪检测和分割在线多对象跟踪器
Deep Denoising of Flash and No-Flash Pairs for Photography in 用于摄影的 Flash 和非 Flash 对的深度去噪
SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition 基于点云的位置识别的自注意力和方向编码网络
TediGAN Text-Guided Diverse Face Image Generation and Manipulation TediGAN 文本引导的多样化人脸图像生成和操作
Space-Time Neural Irradiance Fields for Free-Viewpoint Video 自由视点视频的时空神经辐照场
A Dual Iterative Refinement Method for Non-Rigid Shape Matching 一种非刚性形状匹配的双重迭代细化方法
NeuTex Neural Texture Mapping for Volumetric Neural Rendering 用于体积神经渲染的 NeuTex 神经纹理映射
Dynamic Weighted Learning for Unsupervised Domain Adaptation 无监督域自适应的动态加权学习
Improving Transferability of Adversarial Patches on Face Recognition With Generative 使用生成提高人脸识别对抗性补丁的可迁移性
NExT-QA Next Phase of Question-Answering to Explaining Temporal Actions NExT-QA 下一阶段的问答解释时间行为
Space-Time Distillation for Video Super-Resolution 视频超分辨率的时空蒸馏
You See What I Want You To See: Exploring Targeted Black-Box Transferability Attack for Hash-based Image Retrieval Systems 你看到我想让你看到的：探索基于哈希的图像检索系统的目标黑盒可转移性攻击
DG-Font Deformable Generative Networks for Unsupervised Font Generation 用于无监督字体生成的 DG-Font 可变形生成网络
Efficient Regional Memory Network for Video Object Segmentation 用于视频对象分割的高效区域记忆网络
Exploiting Aliasing for Manga Restoration 利用别名来恢复漫画
Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification 用于 3D 生成、重建和分类的无序点集的基于深度能量的学习
Propagate Yourself Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning 传播自己探索无监督视觉表示学习的像素级一致性
Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation 用于少镜头语义分割的尺度感知图神经网络
Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion 用于点云补全的具有对抗性渲染的基于样式的点生成器
End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution 联合图像去马赛克去噪和超分辨率的端到端学习
Invertible Image Signal Processing 可逆图像信号处理
MobileDets Searching for Object Detection Architectures for Mobile Accelerators MobileDets 为移动加速器搜索对象检测架构
Seeing in Extra Darkness Using a Deep-Red Flash 使用深红色闪光灯在额外的黑暗中看到
A Fourier-Based Framework for Domain Generalization 基于傅里叶的域泛化框架
Adaptive Rank Estimate in Robust Principal Component Analysis 稳健主成分分析中的自适应秩估计
Bilateral Grid Learning for Stereo Matching Networks 立体匹配网络的双边网格学习
Consistent Instance False Positive Improves Fairness in Face Recognition Consistent Instance False Positive 提高了人脸识别的公平性
Deep Gradient Projection Networks for Pan-sharpening 用于全色锐化的深度梯度投影网络
Discrimination-Aware Mechanism for Fine-Grained Representation Learning 细粒度表示学习的判别感知机制
Faster Meta Update Strategy for Noise-Robust Deep Learning 用于抗噪深度学习的更快元更新策略
Generative Hierarchical Features From Synthesizing Images 来自合成图像的生成层次特征
Graph Stacked Hourglass Networks for 3D Human Pose Estimation 用于 3D 人体姿势估计的图形堆叠沙漏网络
Inferring CAD Modeling Sequences Using Zone Graphs 使用区域图推断 CAD 建模序列
Layer-Wise Searching for 1-Bit Detectors 逐层搜索 1 位检测器
Layout-Guided Novel View Synthesis From a Single Indoor Panorama 来自单个室内全景的布局引导的新视图合成
Learning Dynamic Alignment via Meta-Filter for Few-Shot Learning 通过元过滤器学习动态对齐以进行 Few-Shot 学习
Line Segment Detection Using Transformers Without Edges 使用无边缘变压器的线段检测
Linear Semantics in Generative Adversarial Networks 生成对抗网络中的线性语义
PAConv Position Adaptive Convolution With Dynamic Kernel Assembling on Point PAConv 位置自适应卷积与点上动态内核组装
Positional Encoding As Spatial Inductive Bias in GANs 位置编码作为 GAN 中的空间归纳偏差
ReNAS Relativistic Evaluation of Neural Architecture Search 神经架构搜索的 ReNAS 相对论评估
Rethinking Text Segmentation A Novel Dataset and a Text-Specific Refinement 重新思考文本分割：一个新的数据集和文本特定的细化
SUTD-TrafficQA A Question Answering Benchmark and an Efficient Network for SUTD-TrafficQA 问答基准和高效网络
Temporal Modulation Network for Controllable Space-Time Video Super-Resolution 用于可控时空视频超分辨率的时间调制网络
Towards Accurate Text-Based Image Captioning With Content Diversity Exploration 通过内容多样性探索实现基于文本的准确图像字幕
ViPNAS Efficient Video Pose Estimation via Neural Architecture Search 通过神经架构搜索的 ViPNAS 高效视频姿势估计
Visually Informed Binaural Audio Generation without Binaural Audios 无需双耳音频的直观双耳音频生成
Wide-Baseline Multi-Camera Calibration Using Person Re-Identification 使用人员重新识别的宽基线多相机校准
Intra-Inter Camera Similarity for Unsupervised Person Re-Identification 无监督人员重新识别的内部相机相似性
Learnable Companding Quantization for Accurate Low-Bit Neural Networks 用于精确低位神经网络的可学习压扩量化
Alpha-Refine Boosting Tracking Performance by Precise Bounding Box Estimation Alpha-Refine 通过精确边界框估计提高跟踪性能
Anchor-Free Person Search 无锚人搜索
DER Dynamically Expandable Representation for Class Incremental Learning 用于类增量学习的 DER 动态可扩展表示
Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching 用于图像-文本匹配的离散连续动作空间策略基于梯度的注意
FP-NAS Fast Probabilistic Neural Architecture Search FP-NAS 快速概率神经架构搜索
LightTrack Finding Lightweight Neural Networks for Object Tracking via One-Shot LightTrack 通过 One-Shot 寻找用于对象跟踪的轻量级神经网络
Online Learning of a Probabilistic and Adaptive Scene Representation 概率和自适应场景表示的在线学习
Positive-Congruent Training Towards Regression-Free Model Updates 面向无回归模型更新的正一致训练
Primitive Representation Learning for Scene Text Recognition 场景文本识别的原始表示学习
Self-Aligned Video Deraining With Transmission-Depth Consistency 具有传输深度一致性的自对准视频去雨
Unsupervised Hyperbolic Metric Learning 无监督双曲度量学习
3D-MAN 3D Multi-Frame Attention Network for Object Detection 用于对象检测的 3D-MAN 3D 多帧注意力网络
A Circular-Structured Representation for Visual Emotion Distribution Learning 视觉情绪分布学习的循环结构表示
Beyond Short Clips End-to-End Video-Level Learning With Collaborative Memories 超越短片的端到端视频级学习与协作记忆
Bottom-Up Shift and Reasoning for Referring Image Segmentation 参考图像分割的自下而上转换和推理
Capturing Omni-Range Context for Omnidirectional Segmentation 捕获全方位上下文以进行全方位分割
Causal Attention for Vision-Language Tasks 视觉语言任务的因果注意
CausalVAE Disentangled Representation Learning via Neural Structural Causal Models CausalVAE 通过神经结构因果模型解开表示学习
CondenseNet V2 Sparse Feature Reactivation for Deep Networks 深度网络的 CondenseNet V2 稀疏特征重新激活
CT-Net Complementary Transfering Network for Garment Transfer With Arbitrary Geometric CT-Net 互补传输网络，用于任意几何的服装传输
Deep Optimized Priors for 3D Shape Modeling and Reconstruction 用于 3D 形状建模和重建的深度优化先验
Defending Multimodal Fusion Models Against Single-Source Adversaries 防御多模式融合模型对抗单源对手
Dense Label Encoding for Boundary Discontinuity Free Rotation Detection 用于边界不连续自由旋转检测的密集标签编码
Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes 发现超越二进制属性的 GAN 的可解释潜在空间方向
DSC-PoseNet Learning 6DoF Object Pose Estimation via Dual-Scale Consistency
DyStaB Unsupervised Object Segmentation via Dynamic-Static Bootstrapping 基于动态-静态自举的 DyStaB 无监督对象分割
End-to-End Rotation Averaging With Multi-Source Propagation 多源传播的端到端旋转平均
Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods 通过结构化随机准牛顿方法增强曲率信息
Exploiting Semantic Embedding and Visual Feature for Facial Action Unit 利用面部动作单元的语义嵌入和视觉特征
Few-Shot Transformation of Common Actions Into Time and Space 常见动作到时间和空间的小范围转换
GAN Prior Embedded Network for Blind Face Restoration in the Wild 用于野外盲人脸修复的 GAN 先验嵌入式网络
HourNAS Extremely Fast Neural Architecture Search Through an Hourglass Lens HourNAS 通过沙漏透镜实现极快的神经架构搜索
Instance Localization for Self-Supervised Detection Pretraining 自监督检测预训练的实例定位
Interactive Self-Training With Mean Teachers for Semi-Supervised Object Detection 与普通教师进行交互式自我训练以进行半监督目标检测
Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-ID 无监督的联合抗噪学习和元相机移位适应
KSM Fast Multiple Task Adaption via Kernel-Wise Soft Mask Learning 基于 Kernel-Wise Soft Mask 学习的 KSM 快速多任务适应
L2M-GAN Learning To Manipulate Latent Space Semantics for Facial Attribute Editing L2M-GAN 学习操纵面部属性的潜在空间语义
LASR Learning Articulated Shape Reconstruction From a Monocular Video LASR 从单目视频中学习关节形状重建
LayoutTransformer Scene Layout Generation With Conceptual and Spatial Diversity 具有概念和空间多样性的 LayoutTransformer 场景布局生成
Learning Dynamics via Graph Neural Networks for Human Pose Estimation 通过图神经网络学习动力学用于人体姿势估计
Learning To Segment Rigid Motions From Two Frames 学习从两帧中分割刚性运动
Mol2Image Improved Conditional Flow Models for Molecule to Image Synthesis Mol2Image 改进了分子到图像合成的条件流模型
NetAdaptV2 Efficient Neural Architecture Search With Fast Super-Network Training and NetAdaptV2 具有快速超级网络训练的高效神经架构搜索和
Partially View-Aligned Representation Learning With Noise-Robust Contrastive Loss 具有抗噪对比损失的部分视图对齐表示学习
Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation 用于场景图生成的语义歧义概率建模
Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow 使用外观流进行鱼眼图像校正的渐进互补网络
Projecting Your View Attentively Monocular Road Scene Layout Estimation via Cross-view Transformation 通过跨视图变换来专注地投影您的视图单目道路场景布局估计
S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling S3：用于 3D 人体建模的神经形状、骨架和蒙皮字段
SelfSAGCN Self-Supervised Semantic Alignment for Graph Convolution Network SelfSAGCN 图卷积网络的自监督语义对齐
Self-Supervised Geometric Perception 自我监督的几何感知
Self-Supervised Learning of Depth Inference for Multi-View Stereo 多视图立体深度推理的自监督学习
Single-View 3D Object Reconstruction From Shape Priors in Memory 从内存中的形状先验重建单视图 3D 对象
Slimmable Compressive Autoencoders for Practical Neural Image Compression 用于实际神经图像压缩的 Slimmable Compressive Autoencoders
ST3D Self-Training for Unsupervised Domain Adaptation on 3D Object Detection 用于 3D 对象检测的无监督域自适应的 ST3D 自训练
StruMonoNet Structure-Aware Monocular 3D Prediction StruMonoNet 结构感知单目 3D 预测
TAP Text-Aware Pre-Training for Text-VQA and Text-Caption TAP Text-Aware Pre-Training for Text-VQA 和 Text-Caption
Towards Improving the Consistency Efficiency and Flexibility of Differentiable Neural 提高可微神经网络的一致性效率和灵活性
Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection 用于弱监督时间动作检测的不确定性引导协同训练
A Decomposition Model for Stereo Matching 立体匹配的分解模型
Cross-Iteration Batch Normalization 交叉迭代批量标准化
Joint-DetNAS Upgrade Your Detector With NAS Pruning and Dynamic Distillation Joint-DetNAS 通过 NAS 修剪和动态蒸馏升级您的探测器
Jo-SRC A Contrastive Approach for Combating Noisy Labels Jo-SRC 一种对抗噪声标签的对比方法
Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation 用于弱监督语义分割的非显着区域对象挖掘
Adversarial Invariant Learning 对抗不变学习
Closing the Loop Joint Rain Generation and Removal via Disentangled 通过 Disentangled 关闭循环联合雨水生成和去除
DeepTag An Unsupervised Deep Learning Method for Motion Tracking on DeepTag 一种用于运动跟踪的无监督深度学习方法
Hierarchical and Partially Observable Goal-Driven Policy Learning With Goals Relational 具有目标关系的分层和部分可观察的目标驱动的政策学习
Linguistic Structures As Weak Supervision for Visual Scene Graph Generation 语言结构作为视觉场景图生成的弱监督
Shelf-Supervised Mesh Prediction in the Wild 野外货架监督网格预测
TPCN Temporal Point Cloud Networks for Motion Forecasting 用于运动预测的 TPCN 时间点云网络
i3DMM Deep Implicit 3D Morphable Model of Human Heads i3DMM 人头深度隐式 3D 可变形模型
Complete Label A Domain Adaptation Approach to Semantic Segmentation 语义分割的完整标签域自适应方法
Iso-Points Optimizing Neural Implicit Surfaces With Hybrid Representations 用混合表示优化神经隐式表面的等点
Center-Based 3D Object Detection and Tracking 基于中心的 3D 对象检测和跟踪
ID-Unet Iterative Soft and Hard Deformation for View Synthesis 用于视图合成的 ID-Unet 迭代软硬变形
Learning To Recommend Frame for Interactive Video Object Segmentation in 学习推荐帧用于交互式视频对象分割
Learning To Recover 3D Scene Shape From a Single Image 学习从单个图像中恢复 3D 场景形状
See Through Gradients Image Batch Recovery via GradInversion 通过 GradInversion 透视梯度图像批量恢复
Towards Efficient Tensor Decomposition-Based DNN Model Compression With Optimization Framework 使用优化框架实现基于张量分解的高效 DNN 模型压缩
Towards Extremely Compact RNNs for Video Recognition With Fully Decomposed 面向具有完全分解的视频识别的极其紧凑的 RNN
Patch-VQ Patching Up the Video Quality Problem Patch-VQ 修补视频质量问题
RaScaNet Learning Tiny Models by Raster-Scanning Images RaScaNet 通过光栅扫描图像学习微型模型
Pose-Guided Human Animation From a Single Image in the Wild 来自野外单个图像的姿势引导人体动画
Divergence Optimization for Noisy Universal Domain Adaptation 噪声通用域自适应的散度优化
