This column collects and archives papers in computer vision. Date: May 11, 2021. Source: Paper Digest.
1, TITLE: Multiple Domain Experts Collaborative Learning: Multi-Source Domain Generalization For Person Re-Identification
AUTHORS: SHIJIE YU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To make ReID more practical and generalizable, we formulate person re-identification as a Domain Generalization (DG) problem and propose a novel training framework, named Multiple Domain Experts Collaborative Learning (MD-ExCo).
2, TITLE: How to Calibrate Your Event Camera
AUTHORS: Manasi Muglikar ; Mathias Gehrig ; Daniel Gehrig ; Davide Scaramuzza
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a generic event camera calibration framework using image reconstruction.
3, TITLE: Self-Guided Instance-Aware Network for Depth Completion and Enhancement
AUTHORS: Zhongzhen Luo ; Fengjia Zhang ; Guoyi Fu ; Jiajie Xu
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To address these problems, we propose a novel self-guided instance-aware network (SG-IANet) that: (1) utilizes a self-guided mechanism to extract the instance-level features needed for depth restoration, (2) exploits geometric and context information in network learning to conform to the underlying constraints for edge clarity and structure consistency, (3) regularizes the depth estimation and mitigates the impact of noise by instance-aware learning, and (4) trains with synthetic data only, using domain randomization to bridge the reality gap.
4, TITLE: Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey
AUTHORS: FEIFEI SHAO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, in this paper, we treat WSOL as a sub-task of WSOD and provide a comprehensive survey of the recent achievements of WSOD. Then, we introduce the widely used datasets and evaluation metrics of WSOD.
5, TITLE: Recent Standard Development Activities on Video Coding for Machines
AUTHORS: WEN GAO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we will address the recent development activities in the MPEG VCM group.
6, TITLE: Anticipating Human Actions By Correlating Past with The Future with Jaccard Similarity Measures
AUTHORS: Basura Fernando ; Samitha Herath
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a framework for early action recognition and anticipation by correlating past features with the future using three novel similarity measures called Jaccard vector similarity, Jaccard cross-correlation and Jaccard Frobenius inner product over covariances.
7, TITLE: Social-IWSTCNN: A Social Interaction-Weighted Spatio-Temporal Convolutional Neural Network for Pedestrian Trajectory Prediction in Urban Traffic Scenarios
AUTHORS: Chi Zhang ; Christian Berger ; Marco Dozza
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper, we present the Social Interaction-Weighted Spatio-Temporal Convolutional Neural Network (Social-IWSTCNN), which includes both the spatial and the temporal features.
8, TITLE: Detecting Biological Locomotion in Video: A Computational Approach
AUTHORS: Soo Min Kang ; Richard P. Wildes
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this report, we refer to the locomotion of general biological species as biolocomotion.
9, TITLE: Disentangled Face Attribute Editing Via Instance-Aware Latent Space Search
AUTHORS: Yuxuan Han ; Jiaolong Yang ; Ying Fu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel framework (IALS) that performs Instance-Aware Latent-Space Search to find semantic directions for disentangled attribute editing.
10, TITLE: Occlusion Aware Kernel Correlation Filter Tracker Using RGB-D
AUTHORS: Srishti Yadav
CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO]
HIGHLIGHT: We believe this work will set the basis for a better understanding of the effectiveness of kernel-based correlation filter trackers and to further define some of its possible advantages in tracking.
11, TITLE: Unsupervised Part Segmentation Through Disentangling Appearance and Shape
AUTHORS: Shilong Liu ; Lei Zhang ; Xiao Yang ; Hang Su ; Jun Zhu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We study the problem of unsupervised discovery and segmentation of object parts, which, as an intermediate local representation, are capable of finding intrinsic object structure and providing more explainable recognition results.
12, TITLE: SB-GCN: Structured BREP Graph Convolutional Network for Automatic Mating of CAD Assemblies
AUTHORS: BENJAMIN JONES et al.
CATEGORY: cs.CV [cs.CV, cs.GR, cs.LG, I.3.5; I.2.10]
HIGHLIGHT: We propose SB-GCN, a representation learning scheme on BREPs that retains the topological structure of parts, and use these learned representations to predict CAD type mates.
13, TITLE: Pattern Detection in The Activation Space for Identifying Synthesized Content
AUTHORS: CELIA CINTAS et al.
CATEGORY: cs.CV [cs.CV, cs.CR, cs.LG]
HIGHLIGHT: We propose SubsetGAN to identify generated content by detecting a subset of anomalous node-activations in the inner layers of pre-trained neural networks.
14, TITLE: AutoReCon: Neural Architecture Search-based Reconstruction for Data-free Compression
AUTHORS: Baozhou Zhu ; Peter Hofstee ; Johan Peltenburg ; Jinho Lee ; Zaid Alars
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: Specifically, we propose the AutoReCon method, which is a neural architecture search-based reconstruction method.
15, TITLE: Edge Detection for Satellite Images Without Deep Networks
AUTHORS: Joshua Abraham ; Calden Wloka
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Recent approaches to satellite image analysis have largely emphasized deep learning methods.
16, TITLE: Improving Sign Language Translation with Monolingual Data By Sign Back-Translation
AUTHORS: Hao Zhou ; Wengang Zhou ; Weizhen Qi ; Junfu Pu ; Houqiang Li
CATEGORY: cs.CV [cs.CV, cs.CL]
HIGHLIGHT: To tackle this parallel data bottleneck, we propose a sign back-translation (SignBT) approach, which incorporates massive spoken language texts into SLT training.
17, TITLE: Learning A Model-Driven Variational Network for Deformable Image Registration
AUTHORS: XI JIA et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: To address this whilst retaining the fast inference speed of deep learning, we propose VR-Net, a novel cascaded variational network for unsupervised deformable image registration.
18, TITLE: FINNger -- Applying Artificial Intelligence to Ease Math Learning for Children
AUTHORS: Rafael Baldasso Audibert ; Vinicius Marinho Maschio
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: With this work, we create the basis for an intuitive application that builds on the ease with which children use technological applications, aiming to shrink the gap between a fun, enjoyable activity and one that improves children's knowledge and ability to understand concepts at a young age, using a novel convolutional neural network named FINNger.
19, TITLE: Predicting Invasive Ductal Carcinoma Using A Reinforcement Sample Learning Strategy Using Deep Learning
AUTHORS: Rushabh Patel
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, eess.IV]
HIGHLIGHT: We are proposing a method for Invasive ductal carcinoma that will use convolutional neural networks (CNN) on mammograms to assist radiologists in diagnosing the disease.
20, TITLE: Learning to Detect Fortified Areas
AUTHORS: Allan Grønlund ; Jonas Tranberg
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper we consider the problem of classifying which areas of a given surface are fortified by, for instance, roads, sidewalks, parking spaces, paved driveways and terraces.
21, TITLE: Sli2Vol: Annotate A 3D Volume from A Single Slice with Self-Supervised Learning
AUTHORS: Pak-Hei Yeung ; Ana I. L. Namburete ; Weidi Xie
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: The objective of this work is to segment arbitrary structures of interest (SOI) in 3D volumes by annotating only a single slice (i.e., semi-automatic 3D segmentation).
22, TITLE: Aggregating Nested Transformers
AUTHORS: Zizhao Zhang ; Han Zhang ; Long Zhao ; Ting Chen ; Tomas Pfister
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we explore the idea of nesting basic local transformers on non-overlapping image blocks and aggregating them in a hierarchical manner.
23, TITLE: Enhance to Read Better: An Improved Generative Adversarial Network for Handwritten Document Image Enhancement
AUTHORS: Sana Khamekhem Jemni ; Mohamed Ali Souibgui ; Yousri Kessentini ; Alicia Fornés
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an end to end architecture based on Generative Adversarial Networks (GANs) to recover the degraded documents into a clean and readable form.
24, TITLE: Spatio-Contextual Deep Network Based Multimodal Pedestrian Detection For Autonomous Driving
AUTHORS: Kinjal Dasgupta ; Arindam Das ; Sudip Das ; Ujjwal Bhattacharya ; Senthil Yogamani
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper proposes an end-to-end multimodal fusion model for pedestrian detection using RGB and thermal images.
25, TITLE: PSGAN++: Robust Detail-Preserving Makeup Transfer and Removal
AUTHORS: SI LIU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we address the makeup transfer and removal tasks simultaneously, which aim to transfer the makeup from a reference image to a source image and remove the makeup from the with-makeup image respectively.
26, TITLE: Performance Analysis of A Foreground Segmentation Neural Network Model
AUTHORS: Joel Tomás Morais ; António Ramires Fernandes ; André Leite Ferreira ; Bruno Faria
CATEGORY: cs.CV [cs.CV, I.4.6]
HIGHLIGHT: We present an ablation study of FgSegNet_v2, analysing its three stages: (i) Encoder, (ii) Feature Pooling Module and (iii) Decoder.
27, TITLE: Context-aware Cross-level Fusion Network for Camouflaged Object Detection
AUTHORS: Yujia Sun ; Geng Chen ; Tao Zhou ; Yi Zhang ; Nian Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel Context-aware Cross-level Fusion Network (C2F-Net) to address the challenging COD task.
28, TITLE: Low Resolution Information Also Matters: Learning Multi-Resolution Representations for Person Re-Identification
AUTHORS: Guoqing Zhang ; Yuhao Chen ; Weisi Lin ; Arun Chandran ; Xuan Jing
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we explore the influence of resolutions on feature extraction and develop a novel method for cross-resolution person re-ID called Multi-Resolution Representations Joint Learning (MRJL).
29, TITLE: KLIEP-based Density Ratio Estimation for Semantically Consistent Synthetic to Real Images Adaptation in Urban Traffic Scenes
AUTHORS: Artem Savkin ; Federico Tombari
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle this issue, we propose a density prematching strategy using a KLIEP-based density ratio estimation procedure.
30, TITLE: Unsupervised Video Summarization Via Multi-source Features
AUTHORS: Hussain Kanafani ; Junaid Ahmed Ghauri ; Sherzod Hakimov ; Ralph Ewerth
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Therefore, we propose the incorporation of multiple feature sources with chunk and stride fusion to provide more information about the visual content.
31, TITLE: Style Similarity As Feedback for Product Design
AUTHORS: Mathew Schwartz ; Tomer Weiss ; Esra Ataer-Cansizoglu ; Jae-Woo Choi
CATEGORY: cs.CV [cs.CV, cs.GR, cs.LG]
HIGHLIGHT: We propose a designer in-the-loop workflow that mirrors methods of displaying similar products to consumers browsing e-commerce websites.
32, TITLE: Using The Overlapping Score to Improve Corruption Benchmarks
AUTHORS: Alfred Laugros ; Alice Caplier ; Matthieu Ospici
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we propose a metric called corruption overlapping score, which can be used to reveal flaws in corruption benchmarks.
33, TITLE: Dynamic Probabilistic Pruning: A General Framework for Hardware-constrained Pruning at Different Granularities
AUTHORS: Lizeth Gonzalez-Carabarin ; Iris A. M. Huijben ; Bastiaan S. Veeling ; Alexandre Schmid ; Ruud J. G. van Sloun
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: Here we propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps), while retaining efficient memory organization (e.g. pruning exactly k-out-of-n weights for every output neuron, or pruning exactly k-out-of-n kernels for every feature map).
34, TITLE: The Nonlinearity Coefficient -- A Practical Guide to Neural Architecture Design
AUTHORS: George Philipp
CATEGORY: cs.LG [cs.LG, cs.CV, cs.NE]
HIGHLIGHT: In this work, we present a different and complementary approach to architecture design, which we term 'zero-shot architecture design' (ZSAD).
35, TITLE: Towards An IMU-based Pen Online Handwriting Recognizer
AUTHORS: MOHAMAD WEHBI et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper we present an online handwriting recognition system for word recognition, based on inertial measurement units (IMUs), for digitizing text written on paper.
36, TITLE: Calibrated Prediction in and Out-of-domain for State-of-the-art Saliency Modeling
AUTHORS: Akis Linardos ; Matthias Kümmerer ; Ori Press ; Matthias Bethge
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: We conduct a large-scale transfer learning study which tests different ImageNet backbones, always using the same read out architecture and learning protocol adopted from DeepGaze II.
37, TITLE: Blurs Make Results Clearer: Spatial Smoothings to Improve Accuracy, Uncertainty, and Robustness
AUTHORS: Namuk Park ; Songkuk Kim
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, stat.ML]
HIGHLIGHT: To alleviate this issue, we propose spatial smoothing, a method that ensembles neighboring feature map points of CNNs.
38, TITLE: Adversarial Robustness Against Multiple $l_p$-threat Models at The Price of One and How to Quickly Fine-tune Robust Models to Another Threat Model
AUTHORS: Francesco Croce ; Matthias Hein
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: In this paper we develop a simple and efficient training scheme to achieve adversarial robustness against the union of $l_p$-threat models.
39, TITLE: Predict Then Interpolate: A Simple Algorithm to Learn Stable Classifiers
AUTHORS: Yujia Bao ; Shiyu Chang ; Regina Barzilay
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CL, cs.CV]
HIGHLIGHT: We propose Predict then Interpolate (PI), a simple algorithm for learning correlations that are stable across environments.
40, TITLE: Graph Self Supervised Learning: The BT, The HSIC, and The VICReg
AUTHORS: Sayan Nag
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CG, cs.CV, stat.ML]
HIGHLIGHT: In this paper, we have used a graph-based self-supervised learning strategy with different loss functions (Barlow Twins [7], HSIC [4], VICReg [1]) which have shown promising results when applied with CNNs previously.
41, TITLE: On The Advantages of Multiple Stereo Vision Camera Designs for Autonomous Drone Navigation
AUTHORS: Rui Pimentel de Figueiredo ; Jakob Grimm Hansen ; Jonas Le Fevre ; Martim Brandão ; Erdal Kayacan
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: In this work we showcase the design and assessment of the performance of a multi-camera UAV, when coupled with state-of-the-art planning and mapping algorithms for autonomous navigation.
42, TITLE: SimNet: Learning Reactive Self-driving Simulations from Real-world Observations
AUTHORS: LUCA BERGAMINI et al.
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG]
HIGHLIGHT: In this work, we present a simple end-to-end trainable machine learning system capable of realistically simulating driving experiences.
43, TITLE: What Data Do We Need for Training An AV Motion Planner?
AUTHORS: LONG CHEN et al.
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG]
HIGHLIGHT: We present experiments using up to 1000 hours worth of expert demonstration and find that training with 10x lower-quality data outperforms 1x AV-grade data in terms of planner performance.
44, TITLE: Smile Like You Mean It: Driving Animatronic Robotic Face with Learned Models
AUTHORS: Boyuan Chen ; Yuhang Hu ; Lianfeng Li ; Sara Cummings ; Hod Lipson
CATEGORY: cs.RO [cs.RO, cs.AI, cs.CV, cs.HC, cs.LG]
HIGHLIGHT: We addressed this challenge by designing a physical animatronic robotic face with soft skin and by developing a vision-based self-supervised learning framework for facial mimicry.
45, TITLE: Towards Transparent Application of Machine Learning in Video Processing
AUTHORS: Luka Murn ; Marc Gorriz Blanch ; Maria Santamaria ; Fiona Rivera ; Marta Mrak
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, cs.MM]
HIGHLIGHT: The aim of this work is to understand and optimise learned models in video processing applications so systems that incorporate them can be used in a more trustworthy manner.
46, TITLE: Permutation Invariance and Uncertainty in Multitemporal Image Super-resolution
AUTHORS: Diego Valsesia ; Enrico Magli
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we show how building a model that is fully invariant to temporal permutation significantly improves performance and data efficiency.
47, TITLE: CBANet: Towards Complexity and Bitrate Adaptive Deep Image Compression Using A Single Network
AUTHORS: Jinyang Guo ; Dong Xu ; Guo Lu
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we propose a new deep image compression framework called Complexity and Bitrate Adaptive Network (CBANet), which aims to learn one single network to support variable bitrate coding under different computational complexity constraints.
48, TITLE: Weighing Features of Lung and Heart Regions for Thoracic Disease Classification
AUTHORS: JIANSHENG FANG et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: Inspired by this, we propose a novel deep learning framework to explore discriminative information from lung and heart regions.