| 2024-04-11 |
Connecting NeRFs, Images, and Text |
Francesco Ballerini et.al. |
2404.07993v1 |
null |
| 2024-04-11 |
GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh |
Jing Wen et.al. |
2404.07991v1 |
null |
| 2024-04-11 |
WaveMo: Learning Wavefront Modulations to See Through Scattering |
Mingyang Xie et.al. |
2404.07985v1 |
null |
| 2024-04-11 |
Gaga: Group Any Gaussians via 3D-aware Memory Bank |
Weijie Lyu et.al. |
2404.07977v1 |
null |
| 2024-04-11 |
FusionMamba: Efficient Image Fusion with State Space Model |
Siran Peng et.al. |
2404.07932v1 |
null |
| 2024-04-11 |
HGRN2: Gated Linear RNNs with State Expansion |
Zhen Qin et.al. |
2404.07904v1 |
link |
| 2024-04-11 |
Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget |
Glen Neville et.al. |
2404.07902v1 |
null |
| 2024-04-11 |
Auditing health-related recommendations in social media: A Case Study of Abortion on YouTube |
Mohammed Lahsaini et.al. |
2404.07896v1 |
null |
| 2024-04-11 |
Typical blocks of the category $\mathcal O$ and Whittaker modules for Takiff superalgebras |
Chih-Whi Chen et.al. |
2404.07894v1 |
null |
| 2024-04-11 |
Context-aware Video Anomaly Detection in Long-Term Datasets |
Zhengye Yang et.al. |
2404.07887v1 |
null |
| 2024-04-10 |
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion |
Jaidev Shriram et.al. |
2404.07199v1 |
null |
| 2024-04-10 |
GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA |
Bingyi Zhang et.al. |
2404.07188v1 |
null |
| 2024-04-10 |
Adinkras and Pure Spinors |
Richard Eager et.al. |
2404.07167v1 |
null |
| 2024-04-10 |
Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations |
Ofir Shifman et.al. |
2404.07153v1 |
null |
| 2024-04-10 |
Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization |
Michael Kohler et.al. |
2404.07128v1 |
null |
| 2024-04-10 |
Measuring proximity to standard planes during fetal brain ultrasound scanning |
Chiara Di Vece et.al. |
2404.07124v1 |
null |
| 2024-04-10 |
"My toxic trait is thinking I'll remember this": gaps in the learner experience of video tutorials for feature-rich software |
Ian Drosos et.al. |
2404.07114v1 |
null |
| 2024-04-10 |
The generic dual of p-adic groups and applications |
Chris Jantzen et.al. |
2404.07111v1 |
null |
| 2024-04-10 |
Learning Priors for Non Rigid SfM from Casual Videos |
Yoni Kasten et.al. |
2404.07097v1 |
null |
| 2024-04-10 |
VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning |
Alexandros Xenos et.al. |
2404.07078v1 |
link |
| 2024-04-09 |
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering |
Juhong Min et.al. |
2404.06511v1 |
null |
| 2024-04-10 |
Reconstructing Hand-Held Objects in 3D |
Jane Wu et.al. |
2404.06507v2 |
null |
| 2024-04-09 |
A Machine Learning Framework for the Prediction of Grain Boundary Segregation in Chemically Complex Environments |
Doruk Aksoy et.al. |
2404.06499v1 |
null |
| 2024-04-10 |
Flying with Photons: Rendering Novel Views of Propagating Light |
Anagh Malik et.al. |
2404.06493v2 |
null |
| 2024-04-09 |
Uncovering Tidal Treasures: Automated Classification of Faint Tidal Features in DECaLS Data |
Alexander J. Gordon et.al. |
2404.06487v1 |
null |
| 2024-04-09 |
RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos |
Bochao Zou et.al. |
2404.06483v1 |
null |
| 2024-04-09 |
Laue Indexing with Optimal Transport |
Tomasz Kacprzak et.al. |
2404.06478v1 |
link |
| 2024-04-09 |
A comparative analysis of deep learning models for lung segmentation on X-ray images |
Weronika Hryniewska-Guzik et.al. |
2404.06455v1 |
link |
| 2024-04-09 |
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding |
Yash Mehan et.al. |
2404.06442v1 |
null |
| 2024-04-09 |
ClassiPyGRB: Machine Learning-Based Classification and Visualization of Gamma Ray Bursts using t-SNE |
Keneth Garcia-Cifuentes et.al. |
2404.06439v1 |
null |
| 2024-04-08 |
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding |
Bo He et.al. |
2404.05726v1 |
null |
| 2024-04-08 |
Predicting Overtakes in Trucks Using CAN Data |
Talha Hanif Butt et.al. |
2404.05723v1 |
null |
| 2024-04-08 |
Case Study: Neural Network Malware Detection Verification for Feature and Image Datasets |
Preston K. Robinette et.al. |
2404.05703v1 |
null |
| 2024-04-08 |
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding |
Ahmad Idrissi-Yaghir et.al. |
2404.05694v1 |
null |
| 2024-04-08 |
Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery |
Ionut M. Motoi et.al. |
2404.05693v1 |
null |
| 2024-04-08 |
AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation |
Jiannan Ge et.al. |
2404.05667v1 |
null |
| 2024-04-08 |
Oblique photons, plasmons, and current-plasmons in relativistic plasmas and their topological implications |
Hong Qin et.al. |
2404.05636v1 |
null |
| 2024-04-08 |
AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets |
Pietro Lesci et.al. |
2404.05623v1 |
null |
| 2024-04-08 |
Experimental observation of a time rondeau crystal: Temporal Disorder in Spatiotemporal Order |
Leo Joon Il Moon et.al. |
2404.05620v1 |
null |
| 2024-04-08 |
Self-Explainable Affordance Learning with Embodied Caption |
Zhipeng Zhang et.al. |
2404.05603v1 |
null |
| 2024-04-05 |
On classification of global dynamics for energy-critical equivariant harmonic map heat flows and radial nonlinear heat equation |
Kihyun Kim et.al. |
2404.04247v1 |
null |
| 2024-04-05 |
Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism |
Trilokesh Ranjan Sarkar et.al. |
2404.04245v1 |
null |
| 2024-04-05 |
player2vec: A Language Modeling Approach to Understand Player Behavior in Games |
Tianze Wang et.al. |
2404.04234v1 |
null |
| 2024-04-05 |
Deep-learning Segmentation of Small Volumes in CT images for Radiotherapy Treatment Planning |
Jianxin Zhou et.al. |
2404.04202v1 |
null |
| 2024-04-05 |
SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers |
Weile Li et.al. |
2404.04179v1 |
link |
| 2024-04-05 |
Noisy Label Processing for Classification: A Survey |
Mengting Li et.al. |
2404.04159v1 |
null |
| 2024-04-05 |
Improving Detection in Aerial Images by Capturing Inter-Object Relationships |
Botao Ren et.al. |
2404.04140v1 |
null |
| 2024-04-05 |
Label Propagation for Zero-shot Classification with Vision-Language Models |
Vladan Stojnić et.al. |
2404.04072v1 |
link |
| 2024-04-05 |
VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots |
Akhil Padmanabha et.al. |
2404.04066v1 |
null |
| 2024-04-05 |
Phase Binarization in Mutually Synchronized Bias Field-free Spin Hall Nano-oscillators for Reservoir Computing |
Sourabh Manna et.al. |
2404.04023v1 |
null |
| 2024-04-04 |
OW-VISCap: Open-World Video Instance Segmentation and Captioning |
Anwesa Choudhuri et.al. |
2404.03657v1 |
null |
| 2024-04-04 |
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation |
Shuting He et.al. |
2404.03645v1 |
link |
| 2024-04-04 |
On the Efficiency of Convolutional Neural Networks |
Andrew Lavin et.al. |
2404.03617v1 |
null |
| 2024-04-04 |
Creator Hearts: Investigating the Impact Positive Signals from YouTube Creators in Shaping Comment Section Behavior |
Frederick Choi et.al. |
2404.03612v1 |
null |
| 2024-04-04 |
InsectMamba: Insect Pest Classification with State Space Model |
Qianning Wang et.al. |
2404.03611v1 |
null |
| 2024-04-04 |
DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images |
Zhou Jie et.al. |
2404.03595v1 |
link |
| 2024-04-04 |
Alzheimer's disease detection in PSG signals |
Lorena Gallego-Viñarás et.al. |
2404.03549v1 |
null |
| 2024-04-04 |
Towards Transcranial 3D Ultrasound Localization Microscopy of the Nonhuman Primate Brain |
Paul Xing et.al. |
2404.03547v1 |
null |
| 2024-04-04 |
Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models |
Siyuan Mei et.al. |
2404.03541v1 |
null |
| 2024-04-05 |
A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data |
Iqra Bano et.al. |
2404.03493v2 |
null |
| 2024-04-03 |
LidarDM: Generative LiDAR Simulation in a Generated World |
Vlas Zyrianov et.al. |
2404.02903v1 |
null |
| 2024-04-03 |
Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds |
Kamalika Chaudhuri et.al. |
2404.02866v1 |
link |
| 2024-04-03 |
Semisimple Algebras of Vector Fields on $\mathbb{C}^{3}$ |
Sajid Ali et.al. |
2404.02847v1 |
null |
| 2024-04-03 |
GPU-Accelerated RSF Level Set Evolution for Large-Scale Microvascular Segmentation |
Meher Niger et.al. |
2404.02813v1 |
null |
| 2024-04-03 |
Generative-Contrastive Heterogeneous Graph Neural Network |
Yu Wang et.al. |
2404.02810v1 |
null |
| 2024-04-03 |
FPT: Feature Prompt Tuning for Few-shot Readability Assessment |
Ziyang Wang et.al. |
2404.02772v1 |
link |
| 2024-04-03 |
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement |
Hao Wu et.al. |
2404.02755v1 |
null |
| 2024-04-03 |
Terraced Compression Method with Automated Threshold Selection for Multidimensional Image Clustering of Heterogeneous Bodies |
Jiatong Li et.al. |
2404.02744v1 |
null |
| 2024-04-03 |
Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss |
Yunfan Lu et.al. |
2404.02731v1 |
link |
| 2024-04-03 |
Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM |
Zhe Liu et.al. |
2404.02706v1 |
null |
| 2024-04-02 |
Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models |
Zeyu Yang et.al. |
2404.02148v1 |
link |
| 2024-04-02 |
Multiparametric quantification and visualization of liver fat using ultrasound |
Jihye Baek et.al. |
2404.02143v1 |
null |
| 2024-04-03 |
ResNet with Integrated Convolutional Block Attention Module for Ship Classification Using Transfer Learning on Optical Satellite Imagery |
Ryan Donghan Kwon et.al. |
2404.02135v2 |
null |
| 2024-04-02 |
ViTamin: Designing Scalable Vision Models in the Vision-Language Era |
Jieneng Chen et.al. |
2404.02132v1 |
link |
| 2024-04-02 |
ImageNot: A contrast with ImageNet preserves model rankings |
Olawale Salaudeen et.al. |
2404.02112v1 |
null |
| 2024-04-02 |
CameraCtrl: Enabling Camera Control for Text-to-Video Generation |
Hao He et.al. |
2404.02101v1 |
link |
| 2024-04-02 |
Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows |
Grace Guo et.al. |
2404.02081v1 |
null |
| 2024-04-02 |
Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation |
Hui Xiao et.al. |
2404.02065v1 |
null |
| 2024-04-02 |
Long-context LLMs Struggle with Long In-context Learning |
Tianle Li et.al. |
2404.02060v1 |
link |
| 2024-04-02 |
Deconstructing In-Context Learning: Understanding Prompts via Corruption |
Namrata Shivagunde et.al. |
2404.02054v1 |
link |
| 2024-03-29 |
Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations |
Jaisidh Singh et.al. |
2403.20312v1 |
link |
| 2024-03-29 |
Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation |
Fangxu Yu et.al. |
2403.20289v1 |
link |
| 2024-03-29 |
Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges |
Shreyasi Pathak et.al. |
2403.20260v1 |
null |
| 2024-03-29 |
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions |
Runhao Zeng et.al. |
2403.20254v1 |
null |
| 2024-03-29 |
Latent Embedding Clustering for Occlusion Robust Head Pose Estimation |
José Celestino et.al. |
2403.20251v1 |
null |
| 2024-03-29 |
Long-Tailed Anomaly Detection with Learnable Class Names |
Chih-Hui Ho et.al. |
2403.20236v1 |
null |
| 2024-04-02 |
Artificial Neural Networks-based Real-time Classification of ENG Signals for Implanted Nerve Interfaces |
Antonio Coviello et.al. |
2403.20234v2 |
null |
| 2024-03-29 |
MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark |
Sanghyun Woo et.al. |
2403.20225v1 |
null |
| 2024-03-29 |
Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
Yazheng Yang et.al. |
2403.20208v1 |
null |
| 2024-03-29 |
The Future of Combating Rumors? Retrieval, Discrimination, and Generation |
Junhao Xu et.al. |
2403.20204v1 |
null |
| 2024-03-28 |
RSMamba: Remote Sensing Image Classification with State Space Model |
Keyan Chen et.al. |
2403.19654v1 |
link |
| 2024-03-28 |
Square patterns in dynamical orbits |
Vefa Goksel et.al. |
2403.19642v1 |
null |
| 2024-03-28 |
Siamese Vision Transformers are Scalable Audio-visual Learners |
Yan-Bo Lin et.al. |
2403.19638v1 |
null |
| 2024-03-28 |
Four-dimensional gradient Ricci solitons with (half) nonnegative isotropic curvature |
Huai-Dong Cao et.al. |
2403.19627v1 |
null |
| 2024-03-28 |
Top-$k$ Classification and Cardinality-Aware Prediction |
Anqi Mao et.al. |
2403.19625v1 |
null |
| 2024-03-28 |
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents |
Zeren Chen et.al. |
2403.19622v1 |
null |
| 2024-03-28 |
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects |
Avinash Ummadisingu et.al. |
2403.19607v1 |
null |
| 2024-03-28 |
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model |
Zhicai Wang et.al. |
2403.19600v1 |
link |
| 2024-03-28 |
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models |
Aimon Rahman et.al. |
2403.19593v1 |
null |
| 2024-03-28 |
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation |
Zhongliang Zhou et.al. |
2403.19584v1 |
null |
| 2024-03-27 |
MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering |
Guoxing Sun et.al. |
2403.18820v1 |
null |
| 2024-03-27 |
Breaking the Limitations with Sparse Inputs by Variational Frameworks (BLIss) in Terahertz Super-Resolution 3D Reconstruction |
Yiyao Zhang et.al. |
2403.18776v1 |
null |
| 2024-03-27 |
CaT: Constraints as Terminations for Legged Locomotion Reinforcement Learning |
Elliot Chane-Sane et.al. |
2403.18765v1 |
null |
| 2024-03-27 |
A vascular synthetic model for improved aneurysm segmentation and detection via Deep Neural Networks |
Rafic Nader et.al. |
2403.18734v1 |
null |
| 2024-03-27 |
Contrastive Learning with Orthonormal Anchors (CLOA) |
Huanran Li et.al. |
2403.18699v1 |
null |
| 2024-03-27 |
Annolid: Annotate, Segment, and Track Anything You Need |
Chen Yang et.al. |
2403.18690v1 |
null |
| 2024-03-27 |
InceptionTime vs. Wavelet -- A comparison for time series classification |
Daniel Klenkert et.al. |
2403.18687v1 |
null |
| 2024-03-27 |
TransFusion: Contrastive Learning with Transformers |
Huanran Li et.al. |
2403.18681v1 |
null |
| 2024-03-28 |
FluxGAT: Integrating Flux Sampling with Graph Neural Networks for Unbiased Gene Essentiality Classification |
Kieren Sharma et.al. |
2403.18666v2 |
null |
| 2024-03-27 |
Indecomposable set-theoretical solutions to the Yang-Baxter equation of size $p^2$ |
Carsten Dietzel et.al. |
2403.18653v1 |
null |
| 2024-03-26 |
Efficient Video Object Segmentation via Modulated Cross-Attention Memory |
Abdelrahman Shaker et.al. |
2403.17937v1 |
link |
| 2024-03-26 |
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis |
Muhammad Hamza Mughal et.al. |
2403.17936v1 |
null |
| 2024-03-26 |
OmniVid: A Generative Framework for Universal Video Understanding |
Junke Wang et.al. |
2403.17935v1 |
link |
| 2024-03-26 |
Track Everything Everywhere Fast and Robustly |
Yunzhou Song et.al. |
2403.17931v1 |
null |
| 2024-03-26 |
FastCAR: Fast Classification And Regression Multi-Task Learning via Task Consolidation for Modelling a Continuous Property Variable of Object Classes |
Anoop Kini et.al. |
2403.17926v1 |
null |
| 2024-03-26 |
The Need for Speed: Pruning Transformers with One Recipe |
Samir Khaki et.al. |
2403.17921v1 |
link |
| 2024-03-26 |
TC4D: Trajectory-Conditioned Text-to-4D Generation |
Sherwin Bahmani et.al. |
2403.17920v1 |
null |
| 2024-03-26 |
AgentStudio: A Toolkit for Building General Virtual Agents |
Longtao Zheng et.al. |
2403.17918v1 |
null |
| 2024-03-26 |
Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos |
Akshay Paruchuri et.al. |
2403.17915v1 |
null |
| 2024-03-26 |
Hierarchical Multi-label Classification for Fine-level Event Extraction from Aviation Accident Reports |
Xinyu Zhao et.al. |
2403.17914v1 |
null |
| 2024-03-25 |
DBPF: A Framework for Efficient and Robust Dynamic Bin-Picking |
Yichuan Li et.al. |
2403.16786v1 |
null |
| 2024-03-25 |
C-arm inverse geometry CT for 3D cardiac chamber mapping |
Jordan M. Slagowski et.al. |
2403.16779v1 |
null |
| 2024-03-25 |
Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases |
Sophie Starck et.al. |
2403.16776v1 |
null |
| 2024-03-25 |
As Good As A Coin Toss: Human detection of AI-generated images, videos, audio, and audiovisual stimuli |
Di Cooke et.al. |
2403.16760v1 |
null |
| 2024-03-25 |
Creating a Digital Twin of Spinal Surgery: A Proof of Concept |
Jonas Hein et.al. |
2403.16736v1 |
null |
| 2024-03-25 |
A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models |
Nils Ingelhag et.al. |
2403.16730v1 |
null |
| 2024-03-25 |
One-Shot Domain Incremental Learning |
Yasushi Esaki et.al. |
2403.16707v1 |
null |
| 2024-03-25 |
Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer |
Dominik Müller et.al. |
2403.16695v1 |
null |
| 2024-03-25 |
DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks |
Dominik Müller et.al. |
2403.16678v1 |
link |
| 2024-03-25 |
FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression |
Alireza Furutanpey et.al. |
2403.16677v1 |
null |
| 2024-03-25 |
A Novel Loss Function-based Support Vector Machine for Binary Classification |
Yan Li et.al. |
2403.16654v1 |
null |
| 2024-03-25 |
Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution |
Qingping Zheng et.al. |
2403.16643v1 |
null |
| 2024-03-25 |
Multi-Scale Texture Loss for CT denoising with GANs |
Francesco Di Feola et.al. |
2403.16640v1 |
link |
| 2024-03-25 |
AI-Generated Video Detection via Spatio-Temporal Anomaly Learning |
Jianfa Bai et.al. |
2403.16638v1 |
null |
| 2024-03-25 |
Distributed collaborative anomalous sound detection by embedding sharing |
Kota Dohi et.al. |
2403.16610v1 |
null |
| 2024-03-25 |
EDUE: Expert Disagreement-Guided One-Pass Uncertainty Estimation for Medical Image Segmentation |
Kudaibergen Abutalip et.al. |
2403.16594v1 |
null |
| 2024-03-22 |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models |
Yuzhang Shang et.al. |
2403.15388v1 |
null |
| 2024-03-22 |
Time-efficient, high-resolution 3T whole-brain relaxometry using Cartesian 3D MR-STAT with CSF suppression |
Hongyan Liu et.al. |
2403.15379v1 |
null |
| 2024-03-22 |
Long-CLIP: Unlocking the Long-Text Capability of CLIP |
Beichen Zhang et.al. |
2403.15378v1 |
null |
| 2024-03-22 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding |
Yi Wang et.al. |
2403.15377v1 |
null |
| 2024-03-22 |
Cascading Blackout Severity Prediction with Statistically-Augmented Graph Neural Networks |
Joe Gorka et.al. |
2403.15363v1 |
null |
| 2024-03-22 |
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series |
Badri N. Patro et.al. |
2403.15360v1 |
null |
| 2024-03-22 |
Ultrasound Imaging based on the Variance of a Diffusion Restoration Model |
Yuxin Zhang et.al. |
2403.15316v1 |
null |
| 2024-03-22 |
Global Control for Local SO(3)-Equivariant Scale-Invariant Vessel Segmentation |
Patryk Rygiel et.al. |
2403.15314v1 |
null |
| 2024-03-22 |
Quantum-inspired classification via efficient simulation of Helstrom measurement |
Wooseop Hwang et.al. |
2403.15308v1 |
null |
| 2024-03-22 |
Reconnaissance ultracool spectra in the Euclid Deep Fields |
Jerry Jun-Yan Zhang et.al. |
2403.15288v1 |
null |
| 2024-03-21 |
Language Repository for Long Video Understanding |
Kumara Kahatapitiya et.al. |
2403.14622v1 |
link |
| 2024-03-22 |
Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion |
Xiang Fan et.al. |
2403.14617v2 |
null |
| 2024-03-21 |
Explorative Inbetweening of Time and Space |
Haiwen Feng et.al. |
2403.14611v1 |
null |
| 2024-03-21 |
ReNoise: Real Image Inversion Through Iterative Noising |
Daniel Garibi et.al. |
2403.14602v1 |
null |
| 2024-03-21 |
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model |
Zheng Zhang et.al. |
2403.14598v1 |
link |
| 2024-03-21 |
Large Language Models for Multi-Choice Question Classification of Medical Subjects |
Víctor Ponce-López et.al. |
2403.14582v1 |
null |
| 2024-03-21 |
DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video |
Narek Tumanyan et.al. |
2403.14548v1 |
null |
| 2024-03-21 |
Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images |
Tom Burgert et.al. |
2403.14547v1 |
null |
| 2024-03-21 |
Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets |
Ahmet Alp Kindiroglu et.al. |
2403.14534v1 |
link |
| 2024-03-21 |
Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration |
Chenyang Li et.al. |
2403.14523v1 |
null |
| 2024-03-21 |
Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting |
Alicia Durrer et.al. |
2403.14499v1 |
link |
| 2024-03-20 |
TimeRewind: Rewinding Time with Image-and-Events Video Diffusion |
Jingxi Chen et.al. |
2403.13800v1 |
null |
| 2024-03-20 |
Hierarchical NeuroSymbolic Approach for Action Quality Assessment |
Lauren Okamoto et.al. |
2403.13798v1 |
null |
| 2024-03-20 |
Bridge the Modality and Capacity Gaps in Vision-Language Model Selection |
Chao Yi et.al. |
2403.13797v1 |
null |
| 2024-03-20 |
The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency and Usability in AI |
Matt White et.al. |
2403.13784v1 |
null |
| 2024-03-20 |
Gradings on associative triple systems of the second kind |
Alberto Daza-Garcia et.al. |
2403.13775v1 |
null |
| 2024-03-20 |
Towards Principled Representation Learning from Videos for Reinforcement Learning |
Dipendra Misra et.al. |
2403.13765v1 |
null |
| 2024-03-20 |
Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model |
Diwei Wang et.al. |
2403.13756v1 |
null |
| 2024-03-20 |
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation |
Fu-Yun Wang et.al. |
2403.13745v1 |
null |
| 2024-03-20 |
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes |
Yifan Chen et.al. |
2403.13724v1 |
null |
| 2024-03-20 |
Improving the Adaptive Moment Estimation (ADAM) stochastic optimizer through an Implicit-Explicit (IMEX) time-stepping approach |
Abhinab Bhattacharjee et.al. |
2403.13704v1 |
null |
| 2024-03-19 |
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression |
Zhuoshi Pan et.al. |
2403.12968v1 |
null |
| 2024-03-19 |
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation |
Shuai Yang et.al. |
2403.12962v1 |
link |
| 2024-03-19 |
WHAC: World-grounded Humans and Cameras |
Wanqi Yin et.al. |
2403.12959v1 |
null |
| 2024-03-19 |
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation |
Rajeev Yasarla et.al. |
2403.12953v1 |
null |
| 2024-03-19 |
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models |
Elaine Sui et.al. |
2403.12952v1 |
link |
| 2024-03-19 |
Legendrian loops and cluster modular groups |
James Hughes et.al. |
2403.12951v1 |
null |
| 2024-03-19 |
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers |
Vidhi Jain et.al. |
2403.12943v1 |
null |
| 2024-03-19 |
Contextual AD Narration with Interleaved Multimodal Sequence |
Hanlin Wang et.al. |
2403.12922v1 |
null |
| 2024-03-19 |
Semantic Layering in Room Segmentation via LLMs |
Taehyeon Kim et.al. |
2403.12920v1 |
null |
| 2024-03-19 |
Yell At Your Robot: Improving On-the-Fly from Language Corrections |
Lucy Xiaoyang Shi et.al. |
2403.12910v1 |
null |
| 2024-03-18 |
Time Series Compression using Quaternion Valued Neural Networks and Quaternion Backpropagation |
Johannes Pöppelbaum et.al. |
2403.11722v1 |
null |
| 2024-03-18 |
Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing |
Juan Zhang et.al. |
2403.11700v1 |
null |
| 2024-03-18 |
A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos |
Zhengzheng Tu et.al. |
2403.11699v1 |
null |
| 2024-03-18 |
Object Segmentation-Assisted Inter Prediction for Versatile Video Coding |
Zhuoyuan Li et.al. |
2403.11694v1 |
null |
| 2024-03-19 |
MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation |
Haoyu Zhao et.al. |
2403.11689v2 |
null |
| 2024-03-18 |
Better (pseudo-)labels for semi-supervised instance segmentation |
François Porcher et.al. |
2403.11675v1 |
null |
| 2024-03-19 |
WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising |
Haoyu Zhao et.al. |
2403.11672v2 |
null |
| 2024-03-18 |
Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection |
Julia Wolleb et.al. |
2403.11667v1 |
null |
| 2024-03-18 |
Combining Local and Global Perception for Autonomous Navigation on Nano-UAVs |
Lorenzo Lamberti et.al. |
2403.11661v1 |
null |
| 2024-03-18 |
LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model |
Yuxin Cao et.al. |
2403.11656v1 |
null |
| 2024-03-15 |
Strong and Controllable Blind Image Decomposition |
Zeyu Zhang et.al. |
2403.10520v1 |
link |
| 2024-03-15 |
Frozen Feature Augmentation for Few-Shot Image Classification |
Andreas Bär et.al. |
2403.10519v1 |
null |
| 2024-03-15 |
VideoAgent: Long-form Video Understanding with Large Language Model as Agent |
Xiaohan Wang et.al. |
2403.10517v1 |
null |
| 2024-03-15 |
Surveyor: Facilitating Discovery Within Video Games for Blind and Low Vision Players |
Vishnu Nair et.al. |
2403.10512v1 |
null |
| 2024-03-15 |
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study |
Chenguang Wang et.al. |
2403.10499v1 |
link |
| 2024-03-15 |
Joint Multimodal Transformer for Dimensional Emotional Recognition in the Wild |
Paul Waligora et.al. |
2403.10488v1 |
null |
| 2024-03-15 |
Tensor Star Decomposition |
Wuyang Zhou et.al. |
2403.10481v1 |
null |
| 2024-03-15 |
Using an LLM to Turn Sign Spottings into Spoken Language Sentences |
Ozge Mercanoglu Sincan et.al. |
2403.10434v1 |
null |
| 2024-03-15 |
Neural Networks Hear You Loud And Clear: Hearing Loss Compensation Using Deep Neural Networks |
Peter Leer et.al. |
2403.10420v1 |
null |
| 2024-03-15 |
A comparative study on machine learning approaches for rock mass classification using drilling data |
Tom F. Hansen et.al. |
2403.10404v1 |
null |
| 2024-03-14 |
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models |
Akhil Kedia et.al. |
2403.09635v1 |
link |
| 2024-03-14 |
Generalized Predictive Model for Autonomous Driving |
Jiazhi Yang et.al. |
2403.09630v1 |
link |
| 2024-03-14 |
From the Conformal Anomaly to the Virasoro Algebra |
Sid Maibach et.al. |
2403.09628v1 |
null |
| 2024-03-14 |
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding |
Guo Chen et.al. |
2403.09626v1 |
link |
| 2024-03-14 |
Score-Guided Diffusion for 3D Human Recovery |
Anastasis Stathopoulos et.al. |
2403.09623v1 |
link |
| 2024-03-14 |
PosSAM: Panoptic Open-vocabulary Segment Anything |
Vibashan VS et.al. |
2403.09620v1 |
null |
| 2024-03-14 |
Explore In-Context Segmentation via Latent Diffusion Models |
Chaoyang Wang et.al. |
2403.09616v1 |
null |
| 2024-03-14 |
Compute-first optical detection for noise-resilient visual perception |
Jungmin Kim et.al. |
2403.09612v1 |
null |
| 2024-03-14 |
Mixture of Mixups for Multi-label Classification of Rare Anuran Sounds |
Ilyass Moummad et.al. |
2403.09598v1 |
link |
| 2024-03-14 |
DungeonMaker: Embedding Tangible Creation and Destruction in Hybrid Board Games through Personal Fabrication Technology |
Evgeny Stemasov et.al. |
2403.09592v1 |
null |
| 2024-03-13 |
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis |
Enric Corona et.al. |
2403.08764v1 |
null |
| 2024-03-13 |
Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches |
Yun Xin Teoh et.al. |
2403.08761v1 |
null |
| 2024-03-13 |
MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning |
Jialv Zou et.al. |
2403.08760v1 |
link |
| 2024-03-13 |
Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI |
Shihan Qiu et.al. |
2403.08758v1 |
null |
| 2024-03-13 |
DAM: Dynamic Adapter Merging for Continual Video QA Learning |
Feng Cheng et.al. |
2403.08755v1 |
link |
| 2024-03-13 |
Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI |
Shihan Qiu et.al. |
2403.08749v1 |
null |
| 2024-03-13 |
Torsion pairs, t-structures, and co-t-structures for completions of discrete cluster categories |
Sofia Franchini et.al. |
2403.08735v1 |
null |
| 2024-03-13 |
Euclid: Testing photometric selection of emission-line galaxy targets |
M. S. Cagliari et.al. |
2403.08726v1 |
null |
| 2024-03-13 |
Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment |
Paraskevas Pegios et.al. |
2403.08700v1 |
null |
| 2024-03-13 |
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention |
Heejune Sheen et.al. |
2403.08699v1 |
null |
| 2024-03-12 |
OPEN TEACH: A Versatile Teleoperation System for Robotic Manipulation |
Aadhithya Iyer et.al. |
2403.07870v1 |
null |
| 2024-03-12 |
TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation |
Shivin Dass et.al. |
2403.07869v1 |
null |
| 2024-03-12 |
Iterative Graph Neural Network Enhancement via Frequent Subgraph Mining of Explanations |
Harish G. Naik et.al. |
2403.07849v1 |
null |
| 2024-03-12 |
When Eye-Tracking Meets Machine Learning: A Systematic Review on Applications in Medical Image Analysis |
Sahar Moradizeyveh et.al. |
2403.07834v1 |
null |
| 2024-03-12 |
DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies |
William Xie et.al. |
2403.07832v1 |
null |
| 2024-03-12 |
A geometric model for the module category of a string algebra |
Karin Baur et.al. |
2403.07810v1 |
null |
| 2024-03-12 |
BraSyn 2023 challenge: Missing MRI synthesis and the effect of different learning objectives |
Ivo M. Baltruschat et.al. |
2403.07800v1 |
null |
| 2024-03-12 |
A robust SVM-based approach with feature selection and outliers detection for classification problems |
Marta Baldomero-Naranjo et.al. |
2403.07753v1 |
null |
| 2024-03-12 |
Vision-based Vehicle Re-identification in Bridge Scenario using Flock Similarity |
Chunfeng Zhang et.al. |
2403.07752v1 |
null |
| 2024-03-12 |
Harnessing two-photon dissipation for enhanced quantum measurement and control |
Antoine Marquet et.al. |
2403.07744v1 |
null |
| 2024-03-11 |
Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling |
Wele Gedara Chaminda Bandara et.al. |
2403.06978v1 |
link |
| 2024-03-12 |
VideoMamba: State Space Model for Efficient Video Understanding |
Kunchang Li et.al. |
2403.06977v2 |
link |
| 2024-03-11 |
Memory-based Adapters for Online 3D Scene Perception |
Xiuwei Xu et.al. |
2403.06974v1 |
null |
| 2024-03-11 |
Explainable Transformer Prototypes for Medical Diagnoses |
Ugur Demir et.al. |
2403.06961v1 |
link |
| 2024-03-11 |
Quadruped-Frog: Rapid Online Optimization of Continuous Quadruped Jumping |
Guillaume Bellegarda et.al. |
2403.06954v1 |
null |
| 2024-03-11 |
Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer |
Siddhant Satyanaik et.al. |
2403.06953v1 |
null |
| 2024-03-11 |
Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge |
Yuting Zhang et.al. |
2403.06947v1 |
link |
| 2024-03-11 |
Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction |
Qing Xiao et.al. |
2403.06940v1 |
null |
| 2024-03-11 |
FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks |
Muhammad Saif Ullah Khan et.al. |
2403.06904v1 |
null |
| 2024-03-11 |
Benign overfitting in leaky ReLU networks with moderate input dimension |
Kedar Karhadkar et.al. |
2403.06903v1 |
null |
| 2024-03-08 |
Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos |
Tarun Kalluri et.al. |
2403.05535v1 |
null |
| 2024-03-08 |
Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets |
Lorenzo Brigato et.al. |
2403.05532v1 |
null |
| 2024-03-08 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context |
Machel Reid et.al. |
2403.05530v1 |
null |
| 2024-03-08 |
Take Your Best Shot: Sampling-Based Next-Best-View Planning for Autonomous Photography & Inspection |
Shijie Gao et.al. |
2403.05477v1 |
null |
| 2024-03-08 |
Will GPT-4 Run DOOM? |
Adrian de Wynter et.al. |
2403.05468v1 |
null |
| 2024-03-08 |
Evaluating AI and Human Authorship Quality in Academic Writing through Physics Essays |
Will Yeadon et.al. |
2403.05458v1 |
null |
| 2024-03-08 |
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models |
Yabo Zhang et.al. |
2403.05438v1 |
link |
| 2024-03-08 |
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors |
Anindya Mondal et.al. |
2403.05435v1 |
null |
| 2024-03-08 |
Infinite Translation Surfaces in the Wild |
Vincent Delecroix et.al. |
2403.05424v1 |
null |
| 2024-03-08 |
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery |
Mubashir Noman et.al. |
2403.05419v1 |
link |
| 2024-03-07 |
DeepSee: Multidimensional Visualizations of Seabed Ecosystems |
Adam Coscia et.al. |
2403.04761v1 |
link |
| 2024-03-07 |
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries |
Adam Coscia et.al. |
2403.04760v1 |
link |
| 2024-03-07 |
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts |
Adam Coscia et.al. |
2403.04758v1 |
link |
| 2024-03-07 |
Preliminary Guidelines For Combining Data Integration and Visual Data Analysis |
Adam Coscia et.al. |
2403.04757v1 |
link |
| 2024-03-07 |
Photonic probabilistic machine learning using quantum vacuum noise |
Seou Choi et.al. |
2403.04731v1 |
null |
| 2024-03-07 |
Analysis of Systems' Performance in Natural Language Processing Competitions |
Sergio Nava-Muñoz et.al. |
2403.04693v1 |
null |
| 2024-03-07 |
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios |
Qilang Ye et.al. |
2403.04640v1 |
link |
| 2024-03-07 |
Scalable, Simulation-Guided Compliant Tactile Finger Design |
Yuxiang Ma et.al. |
2403.04638v1 |
null |
| 2024-03-08 |
Pix2Gif: Motion-Guided Diffusion for GIF Generation |
Hitesh Kandala et.al. |
2403.04634v2 |
null |
| 2024-03-07 |
MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder |
Lei Li et.al. |
2403.04626v1 |
null |
| 2024-03-06 |
3D Diffusion Policy |
Yanjie Ze et.al. |
2403.03954v1 |
link |
| 2024-03-06 |
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL |
Jesse Farebrother et.al. |
2403.03950v1 |
null |
| 2024-03-06 |
Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation |
Marcel Torne et.al. |
2403.03949v1 |
null |
| 2024-03-06 |
DART: Implicit Doppler Tomography for Radar Novel View Synthesis |
Tianshu Huang et.al. |
2403.03896v1 |
null |
| 2024-03-06 |
Joint multi-task learning improves weakly-supervised biomarker prediction in computational pathology |
Omar S. M. El Nahhas et.al. |
2403.03891v1 |
link |
| 2024-03-06 |
Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation |
Xiao Ma et.al. |
2403.03890v1 |
null |
| 2024-03-06 |
Decoupled Vertical Federated Learning for Practical Training on Vertically Partitioned Data |
Avi Amalanshu et.al. |
2403.03871v1 |
null |
| 2024-03-06 |
X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification |
Hanzi Xu et.al. |
2403.03863v1 |
link |
| 2024-03-06 |
ProxNF: Neural Field Proximal Training for High-Resolution 4D Dynamic Image Reconstruction |
Luke Lozenski et.al. |
2403.03860v1 |
null |
| 2024-03-06 |
MedMamba: Vision Mamba for Medical Image Classification |
Yubiao Yue et.al. |
2403.03849v1 |
link |
| 2024-03-05 |
Extension Theory and Fermionic Strongly Fusion 2-Categories |
Thibault D. Décoppet et.al. |
2403.03211v1 |
null |
| 2024-03-05 |
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis |
Patrick Esser et.al. |
2403.03206v1 |
null |
| 2024-03-05 |
Behavior Generation with Latent Actions |
Seungjae Lee et.al. |
2403.03181v1 |
link |
| 2024-03-05 |
Deep-Learned Compression for Radio-Frequency Signal Classification |
Armani Rodriguez et.al. |
2403.03150v1 |
null |
| 2024-03-05 |
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization |
Yuxin Guo et.al. |
2403.03145v1 |
link |
| 2024-03-05 |
Motion-Corrected Moving Average: Including Post-Hoc Temporal Information for Improved Video Segmentation |
Robert Mendel et.al. |
2403.03120v1 |
null |
| 2024-03-05 |
Equilibria in Two-Stage Facility Location with Atomic Clients |
Simon Krogmann et.al. |
2403.03114v1 |
null |
| 2024-03-05 |
Galaxies in the Zone of Avoidance: Misclassifications using machine learning tools |
P. Marchant Cortés et.al. |
2403.03098v1 |
null |
| 2024-03-05 |
Collective self-caging of active filaments in virtual confinement |
Maximilian Kurjahn et.al. |
2403.03093v1 |
null |
| 2024-03-05 |
A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives |
Simone Alberto Peirone et.al. |
2403.03037v1 |
null |
| 2024-03-03 |
Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model |
Rui Yang et.al. |
2403.01362v1 |
null |
| 2024-03-02 |
Improve Cost Efficiency of Active Learning over Noisy Dataset |
Zan-Kai Chong et.al. |
2403.01346v1 |
null |
| 2024-03-02 |
An eternal hypersurface flow arising in centro-affine geometry |
Xinjie Jiang et.al. |
2403.01340v1 |
null |
| 2024-03-02 |
Image-Based Dietary Assessment: A Healthy Eating Plate Estimation System |
Assylzhan Izbassar et.al. |
2403.01310v1 |
null |
| 2024-03-02 |
VNLP: Turkish NLP Package |
Meliksah Turker et.al. |
2403.01309v1 |
null |
| 2024-03-02 |
Towards a classification of $p^2$-discriminant ideal twins over number fields |
Alyson Deines et.al. |
2403.01287v1 |
null |
| 2024-03-02 |
$π$-systems and the Embedding problem for rank $2$ Kac-Moody Lie algebras |
Irfan Habib et.al. |
2403.01285v1 |
null |
| 2024-03-02 |
Fast Low-parameter Video Activity Localization in Collaborative Learning Environments |
Venkatesh Jatla et.al. |
2403.01281v1 |
null |
| 2024-03-02 |
Rigidity results for group von Neumann algebras with diffuse center |
Ionuţ Chifan et.al. |
2403.01280v1 |
null |
| 2024-03-02 |
Can a Confident Prior Replace a Cold Posterior? |
Martin Marek et.al. |
2403.01272v1 |
link |
| 2024-02-29 |
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers |
Tsai-Shien Chen et.al. |
2402.19479v1 |
null |
| 2024-02-29 |
Towards Generalizable Tumor Synthesis |
Qi Chen et.al. |
2402.19470v1 |
null |
| 2024-02-29 |
Humanoid Locomotion as Next Token Prediction |
Ilija Radosavovic et.al. |
2402.19469v1 |
null |
| 2024-03-01 |
TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning |
Kate Sanders et.al. |
2402.19467v2 |
null |
| 2024-02-29 |
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models |
Frederik Kunstner et.al. |
2402.19449v1 |
null |
| 2024-02-29 |
Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems |
Quentin Raymondaud et.al. |
2402.19443v1 |
null |
| 2024-02-29 |
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation |
Jonathan Yang et.al. |
2402.19432v1 |
null |
| 2024-02-29 |
PaECTER: Patent-level Representation Learning using Citation-informed Transformers |
Mainak Ghosh et.al. |
2402.19411v1 |
null |
| 2024-02-29 |
Navigating Hallucinations for Reasoning of Unintentional Activities |
Shresth Grover et.al. |
2402.19405v1 |
null |
| 2024-02-29 |
A Newborn AGN in a Starforming Galaxy |
P. Arévalo et.al. |
2402.19403v1 |
null |
| 2024-02-28 |
Time-efficient filtering of polarimetric data by checking physical realizability of experimental Mueller matrices |
Tatiana Novikova et.al. |
2402.18555v1 |
null |
| 2024-02-28 |
Selection of appropriate multispectral camera exposure settings and radiometric calibration methods for applications in phenotyping and precision agriculture |
Vaishali Swaminathan et.al. |
2402.18553v1 |
null |
| 2024-02-28 |
Implicit Bias of Next-Token Prediction |
Christos Thrampoulidis et.al. |
2402.18551v1 |
null |
| 2024-02-28 |
Defect Detection in Tire X-Ray Images: Conventional Methods Meet Deep Structures |
Andrei Cozma et.al. |
2402.18527v1 |
null |
| 2024-02-28 |
Do galaxy mergers prefer under-dense environments? |
U. Sureshkumar et.al. |
2402.18520v1 |
null |
| 2024-02-28 |
Log Neural Controlled Differential Equations: The Lie Brackets Make a Difference |
Benjamin Walker et.al. |
2402.18512v1 |
null |
| 2024-02-28 |
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling |
Mahdi Karami et.al. |
2402.18508v1 |
null |
| 2024-02-28 |
Detection of Micromobility Vehicles in Urban Traffic Videos |
Khalil Sabri et.al. |
2402.18503v1 |
link |
| 2024-02-28 |
Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification |
Garima Chhikara et.al. |
2402.18502v1 |
null |
| 2024-02-28 |
ROG$_{PL}$: Robust Open-Set Graph Learning via Region-Based Prototype Learning |
Qin Zhang et.al. |
2402.18495v1 |
null |
| 2024-02-27 |
Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning |
Xiaoyu Zhang et.al. |
2402.17768v1 |
null |
| 2024-02-27 |
Towards Optimal Learning of Language Models |
Yuxian Gu et.al. |
2402.17759v1 |
null |
| 2024-02-27 |
An Eye Gaze Heatmap Analysis of Uncertainty Head-Up Display Designs for Conditional Automated Driving |
Michael A. Gerber et.al. |
2402.17751v1 |
null |
| 2024-02-27 |
Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation |
Tatsuhiro Onodera et.al. |
2402.17750v1 |
link |
| 2024-02-27 |
Linking Order to Strength in Metals |
Nicolas Argibay et.al. |
2402.17728v1 |
null |
| 2024-02-27 |
MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation |
Hanan Gani et.al. |
2402.17725v1 |
link |
| 2024-02-27 |
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners |
Yazhou Xing et.al. |
2402.17723v1 |
null |
| 2024-02-27 |
Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers |
Yiwei Lu et.al. |
2402.17710v1 |
null |
| 2024-02-27 |
NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents |
Tamara Czinczoll et.al. |
2402.17682v1 |
null |
| 2024-02-27 |
MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning |
Huiyu Xiong et.al. |
2402.17680v1 |
null |
| 2024-02-26 |
Open Your Ears to Take a Look: A State-of-the-Art Report on the Integration of Sonification and Visualization |
Kajetan Enge et.al. |
2402.16558v1 |
null |
| 2024-02-26 |
LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification |
Yiping Song et.al. |
2402.16515v1 |
null |
| 2024-02-26 |
Photonic Neural Network Fabricated on Thin Film Lithium Niobate for High-Fidelity and Power-Efficient Matrix Computation |
Yong Zheng et.al. |
2402.16513v1 |
null |
| 2024-02-26 |
Intelligent Known and Novel Aircraft Recognition -- A Shift from Classification to Similarity Learning for Combat Identification |
Ahmad Saeed et.al. |
2402.16486v1 |
null |
| 2024-02-26 |
Edge Detectors Can Make Deep Convolutional Neural Networks More Robust |
Jin Ding et.al. |
2402.16479v1 |
null |
| 2024-02-26 |
Autonomous Integration of TSN-unaware Applications with QoS Requirements in TSN Networks |
Moritz Fluechter et.al. |
2402.16454v1 |
null |
| 2024-02-26 |
Retrouver l'inventeur-auteur : la levée d'homonymies d'autorat entre les brevets et les publications scientifiques |
David Reymond et.al. |
2402.16440v1 |
null |
| 2024-02-26 |
Improving behavior based authentication against adversarial attack using XAI |
Dong Qin et.al. |
2402.16430v1 |
null |
| 2024-02-26 |
Adaptive Online Learning of Separable Path Graph Transforms for Intra-prediction |
Wen-Yang Lu et.al. |
2402.16371v1 |
null |
| 2024-02-26 |
DEYO: DETR with YOLO for End-to-End Object Detection |
Haodong Ouyang et.al. |
2402.16370v1 |
null |
| 2024-02-26 |
SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation |
Hendrik Möller et.al. |
2402.16368v1 |
link |
| 2024-02-26 |
An Integrated Data Processing Framework for Pretraining Foundation Models |
Yiding Sun et.al. |
2402.16358v1 |
link |
| 2024-02-26 |
What Text Design Characterizes Book Genres? |
Daichi Haraguchi et.al. |
2402.16356v1 |
null |
| 2024-02-23 |
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends |
Abolfazl Younesi et.al. |
2402.15490v1 |
null |
| 2024-02-23 |
Retinotopic Mapping Enhances the Robustness of Convolutional Neural Networks |
Jean-Nicolas Jérémie et.al. |
2402.15480v1 |
null |
| 2024-02-23 |
FAIR: Filtering of Automatically Induced Rules |
Divya Jyoti Bajpai et.al. |
2402.15472v1 |
null |
| 2024-02-23 |
GROS: A General Robust Aggregation Strategy |
Alejandro Cholaquidis et.al. |
2402.15442v1 |
null |
| 2024-02-23 |
Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales |
Shuren Qi et.al. |
2402.15430v1 |
link |
| 2024-02-23 |
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation |
Yi Zhang et.al. |
2402.15429v1 |
link |
| 2024-02-23 |
Understanding Entrainment in Human Groups: Optimising Human-Robot Collaboration from Lessons Learned during Human-Human Collaboration |
Eike Schneiders et.al. |
2402.15427v1 |
null |
| 2024-02-23 |
PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning |
Simon Holk et.al. |
2402.15420v1 |
null |
| 2024-02-23 |
G-RepsNet: A Fast and General Construction of Equivariant Networks for Arbitrary Matrix Groups |
Sourya Basu et.al. |
2402.15413v1 |
null |
| 2024-02-23 |
A Universal Method for Solar Filament Detection from H-alpha Observations using Semi-supervised Deep Learning |
Andrea Diercke et.al. |
2402.15407v1 |
null |
| 2024-02-22 |
Link Prediction under Heterophily: A Physics-Inspired Graph Neural Network Approach |
Andrea Giuseppe Di Francesco et.al. |
2402.14802v1 |
null |
| 2024-02-22 |
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis |
Willi Menapace et.al. |
2402.14797v1 |
null |
| 2024-02-22 |
Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models |
Yixuan Ren et.al. |
2402.14780v1 |
null |
| 2024-02-22 |
Zero-Shot Pediatric Tuberculosis Detection in Chest X-Rays using Self-Supervised Learning |
Daniel Capellán-Martín et.al. |
2402.14741v1 |
null |
| 2024-02-22 |
Solitons of the mean curvature flow in $\mathbb{S}^2\times\mathbb{R}$ |
Rafael López et.al. |
2402.14727v1 |
null |
| 2024-02-22 |
A Transformer Model for Boundary Detection in Continuous Sign Language |
Razieh Rastgoo et.al. |
2402.14720v1 |
null |
| 2024-02-22 |
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks |
Somnath Banerjee et.al. |
2402.14702v1 |
null |
| 2024-02-22 |
Big data analytics to classify earthwork-related locations: A Chengdu study |
Lei Yu et.al. |
2402.14698v1 |
null |
| 2024-02-22 |
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off |
Futa Waseda et.al. |
2402.14648v1 |
null |
| 2024-02-22 |
Distributed Radiance Fields for Edge Video Compression and Metaverse Integration in Autonomous Driving |
Eugen Šlapak et.al. |
2402.14642v1 |
null |
| 2024-02-21 |
A Simple and Yet Fairly Effective Defense for Graph Neural Networks |
Sofiane Ennadir et.al. |
2402.13987v1 |
link |
| 2024-02-21 |
On modular representations of inner forms of $\mathrm{GL}_n$ over a local non-archimedean field |
Johannes Droschl et.al. |
2402.13969v1 |
null |
| 2024-02-21 |
New directions in algebraic statistics: Three challenges from 2023 |
Yulia Alexandr et.al. |
2402.13961v1 |
null |
| 2024-02-21 |
On the topological classification of complex plane curve singularities |
Alberto Fernández-Hernández et.al. |
2402.13941v1 |
null |
| 2024-02-21 |
Verifying message-passing neural networks via topology-based bounds tightening |
Christopher Hojny et.al. |
2402.13937v1 |
null |
| 2024-02-21 |
Tumor segmentation on whole slide images: training or prompting? |
Huaqian Wu et.al. |
2402.13932v1 |
null |
| 2024-02-21 |
BenchCloudVision: A Benchmark Analysis of Deep Learning Approaches for Cloud Detection and Segmentation in Remote Sensing Imagery |
Loddo Fabio et.al. |
2402.13918v1 |
link |
| 2024-02-21 |
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach |
Mohammad Amaz Uddin et.al. |
2402.13871v1 |
null |
| 2024-02-21 |
RFI-DRUnet: Restoring dynamic spectra corrupted by radio frequency interference -- Application to pulsar observations |
Xiao Zhang et.al. |
2402.13867v1 |
null |
| 2024-02-21 |
What we can learn from TikTok through its Research API |
Francesco Corso et.al. |
2402.13855v1 |
null |
| 2024-02-20 |
Video ReCap: Recursive Captioning of Hour-Long Videos |
Md Mohaiminul Islam et.al. |
2402.13250v1 |
null |
| 2024-02-20 |
SMORE: Similarity-based Hyperdimensional Domain Adaptation for Multi-Sensor Time Series Classification |
Junyao Wang et.al. |
2402.13233v1 |
null |
| 2024-02-20 |
A Touch, Vision, and Language Dataset for Multimodal Alignment |
Letian Fu et.al. |
2402.13232v1 |
null |
| 2024-02-20 |
NeRF Solves Undersampled MRI Reconstruction |
Tae Jun Jang et.al. |
2402.13226v1 |
null |
| 2024-02-20 |
VideoPrism: A Foundational Visual Encoder for Video Understanding |
Long Zhao et.al. |
2402.13217v1 |
null |
| 2024-02-20 |
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena |
Marco Gaido et.al. |
2402.13208v1 |
null |
| 2024-02-20 |
A novel image correction method for cloud-affected observations with Imaging Atmospheric Cherenkov Telescopes |
Natalia Żywucka et.al. |
2402.13190v1 |
null |
| 2024-02-20 |
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing |
Jianhong Bai et.al. |
2402.13185v1 |
null |
| 2024-02-20 |
DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models |
Norman Di Palo et.al. |
2402.13181v1 |
null |
| 2024-02-20 |
3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data |
Zhi-Yi Lin et.al. |
2402.13172v1 |
null |
| 2024-02-19 |
Short-Period Variables in TESS Full-Frame Image Light Curves Identified via Convolutional Neural Networks |
Greg Olmschenk et.al. |
2402.12369v1 |
null |
| 2024-02-19 |
The first all-sky survey of star-forming galaxies with eROSITA: Scaling relations and a population of X-ray luminous starbursts |
E. Kyritsis et.al. |
2402.12367v1 |
null |
| 2024-02-19 |
An Adversarial Approach to Evaluating the Robustness of Event Identification Models |
Obai Bahwal et.al. |
2402.12338v1 |
null |
| 2024-02-19 |
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models |
Christian Schlarmann et.al. |
2402.12336v1 |
link |
| 2024-02-19 |
Generating Survival Interpretable Trajectories and Data |
Andrei V. Konstantinov et.al. |
2402.12331v1 |
null |
| 2024-02-19 |
Asymptotic Gaussian Fluctuations of Eigenvectors in Spectral Clustering |
Hugo Lebeau et.al. |
2402.12302v1 |
null |
| 2024-02-19 |
Time-periodic behaviour in one- and two-dimensional interacting particle systems |
Jonas Köppl et.al. |
2402.12300v1 |
null |
| 2024-02-19 |
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports |
Felix J. Dorfner et.al. |
2402.12298v1 |
null |
| 2024-02-19 |
Revisiting registration-based synthesis: A focus on unsupervised MR image synthesis |
Savannah P. Hays et.al. |
2402.12288v1 |
null |
| 2024-02-19 |
Significance of Chirp MFCC as a Feature in Speech and Audio Applications |
S. Johanan Joysingh et.al. |
2402.12239v1 |
null |
| 2024-02-16 |
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter |
Junfei Xiao et.al. |
2402.10896v1 |
null |
| 2024-02-16 |
Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning |
Chia-Ling Tsai et.al. |
2402.10894v1 |
null |
| 2024-02-16 |
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation |
Ziyang Wang et.al. |
2402.10887v1 |
link |
| 2024-02-16 |
Control Color: Multimodal Diffusion-based Interactive Image Colorization |
Zhexin Liang et.al. |
2402.10855v1 |
null |
| 2024-02-16 |
HistoSegCap: Capsules for Weakly-Supervised Semantic Segmentation of Histological Tissue Type in Whole Slide Images |
Mobina Mansoori et.al. |
2402.10851v1 |
null |
| 2024-02-16 |
FedD2S: Personalized Data-Free Federated Knowledge Distillation |
Kawa Atapour et.al. |
2402.10846v1 |
null |
| 2024-02-16 |
Pedipulate: Enabling Manipulation Skills using a Quadruped Robot's Leg |
Philip Arm et.al. |
2402.10837v1 |
null |
| 2024-02-16 |
GAN-driven Electromagnetic Imaging of 2-D Dielectric Scatterers |
Ehtasham Naseer et.al. |
2402.10831v1 |
null |
| 2024-02-16 |
Structure results for torus fixed loci |
Jarod Alper et.al. |
2402.10823v1 |
null |
| 2024-02-16 |
Training Class-Imbalanced Diffusion Model Via Overlap Optimization |
Divin Yan et.al. |
2402.10821v1 |
link |
| 2024-02-15 |
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling |
Raunaq Bhirangi et.al. |
2402.10211v1 |
null |
| 2024-02-15 |
FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients |
Xinchi Qiu et.al. |
2402.10191v1 |
null |
| 2024-02-15 |
Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning |
Euclid Collaboration et.al. |
2402.10187v1 |
link |
| 2024-02-15 |
DeepSRGM -- Sequence Classification and Ranking in Indian Classical Music with Deep Learning |
Sathwik Tejaswi Madhusudhan et.al. |
2402.10168v1 |
null |
| 2024-02-15 |
Holographic covering and the fortuity of black holes |
Chi-Ming Chang et.al. |
2402.10129v1 |
null |
| 2024-02-15 |
Classification Diffusion Models |
Shahar Yadin et.al. |
2402.10095v1 |
null |
| 2024-02-15 |
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations |
Benedikt Alkin et.al. |
2402.10093v1 |
link |
| 2024-02-15 |
GraphCBAL: Class-Balanced Active Learning for Graph Neural Networks via Reinforcement Learning |
Chengcheng Yu et.al. |
2402.10074v1 |
null |
| 2024-02-15 |
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence |
Weixiang Zhao et.al. |
2402.10073v1 |
null |
| 2024-02-15 |
NYCTALE: Neuro-Evidence Transformer for Adaptive and Personalized Lung Nodule Invasiveness Prediction |
Sadaf Khademi et.al. |
2402.10066v1 |
null |
| 2024-02-14 |
LL-GABR: Energy Efficient Live Video Streaming Using Reinforcement Learning |
Adithya Raman et.al. |
2402.09392v1 |
null |
| 2024-02-14 |
GraSSRep: Graph-Based Self-Supervised Learning for Repeat Detection in Metagenomic Assembly |
Ali Azizpour et.al. |
2402.09381v1 |
link |
| 2024-02-14 |
Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge |
Jiancheng Yang et.al. |
2402.09372v1 |
null |
| 2024-02-14 |
Magic-Me: Identity-Specific Video Customized Diffusion |
Ze Ma et.al. |
2402.09368v1 |
null |
| 2024-02-14 |
Small instanton-induced flavor invariants and the axion potential |
Ravneet Bedi et.al. |
2402.09361v1 |
null |
| 2024-02-14 |
Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy |
Brice Rauby et.al. |
2402.09359v1 |
null |
| 2024-02-14 |
DoRA: Weight-Decomposed Low-Rank Adaptation |
Shih-Yang Liu et.al. |
2402.09353v1 |
null |
| 2024-02-14 |
Irreducible representations of the crystallisation of the $C^{*}$-algebra $C(SU_{q}(n+1))$ |
Manabendra Giri et.al. |
2402.09347v1 |
null |
| 2024-02-14 |
Registration of Longitudinal Spine CTs for Monitoring Lesion Growth |
Malika Sanhinova et.al. |
2402.09341v1 |
null |
| 2024-02-14 |
Stability and Multigroup Fairness in Ranking with Uncertain Predictions |
Siddartha Devic et.al. |
2402.09326v1 |
null |
| 2024-02-13 |
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation |
Luke Melas-Kyriazi et.al. |
2402.08682v1 |
null |
| 2024-02-13 |
A Convergence Analysis of Approximate Message Passing with Non-Separable Functions and Applications to Multi-Class Classification |
Burak Çakmak et.al. |
2402.08676v1 |
null |
| 2024-02-13 |
Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback |
Jenny Zhang et.al. |
2402.08662v1 |
null |
| 2024-02-13 |
BdSLW60: A Word-Level Bangla Sign Language Dataset |
Husne Ara Rubaiyeat et.al. |
2402.08635v1 |
link |
| 2024-02-13 |
Convolutional Neural Networks Towards Facial Skin Lesions Detection |
Reza Sarshar et.al. |
2402.08592v1 |
null |
| 2024-02-13 |
Totally geodesic submanifolds and polar actions on Stiefel manifolds |
Claudio Gorodski et.al. |
2402.08585v1 |
null |
| 2024-02-13 |
Motion-Adaptive Inference for Flexible Learned B-Frame Compression |
M. Akin Yilmaz et.al. |
2402.08550v1 |
null |
| 2024-02-13 |
Approximately Piecewise E(3) Equivariant Point Networks |
Matan Atzmon et.al. |
2402.08529v1 |
null |
| 2024-02-13 |
Reduced-order modeling of the dynamics of an inverted flag from experimental data |
Zhenwei Xu et.al. |
2402.08504v1 |
null |
| 2024-02-13 |
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models |
Shaeke Salman et.al. |
2402.08473v1 |
null |
| 2024-02-13 |
Wavefront Randomization Improves Deconvolution |
Amit Kohli et.al. |
2402.07900v2 |
null |
| 2024-02-12 |
Detection of Spider Mites on Labrador Beans through Machine Learning Approaches Using Custom Datasets |
Violet Liu et.al. |
2402.07895v1 |
null |
| 2024-02-12 |
Perfect stable regularity lemma and slice-wise stable hypergraphs |
Artem Chernikov et.al. |
2402.07870v1 |
null |
| 2024-02-12 |
On Computationally Efficient Multi-Class Calibration |
Parikshit Gopalan et.al. |
2402.07821v1 |
null |
| 2024-02-12 |
A Benchmark Grocery Dataset of Realworld Point Clouds From Single View |
Shivanand Venkanna Sheshappanavar et.al. |
2402.07819v1 |
null |
| 2024-02-12 |
Fixation for $\mathcal{U}$-Ising and $\mathcal{U}$-voter dynamics with frozen vertices |
Laure Marêché et.al. |
2402.07807v1 |
null |
| 2024-02-12 |
Estimation of non-uniform blur using a patch-based regression convolutional neural network (CNN) |
Luis G. Varela et.al. |
2402.07796v1 |
null |
| 2024-02-12 |
"Layer-by-layer" Unsupervised Clustering of Statistically Relevant Fluctuations in Noisy Time-series Data of Complex Dynamical Systems |
Matteo Becchi et.al. |
2402.07786v1 |
null |
| 2024-02-12 |
Solving parameter-dependent semi-algebraic systems |
Louis Gaillard et.al. |
2402.07782v1 |
null |
| 2024-02-12 |
Observations of the new meteor shower from comet 46P/Wirtanen |
D. Vida et.al. |
2402.07769v1 |
null |
| 2024-02-09 |
A two-stage algorithm in evolutionary product unit neural networks for classification |
Antonio J. Tallón-Ballesteros et.al. |
2402.06622v1 |
null |
| 2024-02-09 |
Image-based Deep Learning for the time-dependent prediction of fresh concrete properties |
Max Meyer et.al. |
2402.06611v1 |
null |
| 2024-02-09 |
SAE: Single Architecture Ensemble Neural Networks |
Martin Ferianc et.al. |
2402.06580v1 |
null |
| 2024-02-09 |
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning |
Amir Ziai et.al. |
2402.06560v1 |
link |
| 2024-02-09 |
Self Supervised Learning for Improved Calibrationless Radial MRI with NLINV-Net |
Moritz Blumenthal et.al. |
2402.06550v1 |
null |
| 2024-02-09 |
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA |
Marek Šuppa et.al. |
2402.06549v1 |
null |
| 2024-02-09 |
Feature Density Estimation for Out-of-Distribution Detection via Normalizing Flows |
Evan D. Cook et.al. |
2402.06537v1 |
null |
| 2024-02-09 |
Refining Myocardial Infarction Detection: A Novel Multi-Modal Composite Kernel Strategy in One-Class Classification |
Muhammad Uzair Zahid et.al. |
2402.06530v1 |
null |
| 2024-02-09 |
Flexible infinite-width graph convolutional networks and the importance of representation learning |
Ben Anson et.al. |
2402.06525v1 |
null |
| 2024-02-09 |
Dynamic swarms regulate the morphology and distribution of soft membrane domains |
Aakanksha Gubbala et.al. |
2402.06518v1 |
null |
| 2024-02-08 |
Classifying Nodes in Graphs without GNNs |
Daniel Winter et.al. |
2402.05934v1 |
link |
| 2024-02-08 |
An Interactive Agent Foundation Model |
Zane Durante et.al. |
2402.05929v1 |
null |
| 2024-02-08 |
Point-VOS: Pointing Up Video Object Segmentation |
Idil Esen Zulfikar et.al. |
2402.05917v1 |
null |
| 2024-02-08 |
A Survey on Detection, Classification, and Tracking of Aerial Threats using Radar and Communications Systems |
Wahab Khawaja et.al. |
2402.05909v1 |
null |
| 2024-02-09 |
Large Language Model Meets Graph Neural Network in Knowledge Distillation |
Shengxiang Hu et.al. |
2402.05894v2 |
null |
| 2024-02-08 |
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data |
Shufan Li et.al. |
2402.05892v1 |
null |
| 2024-02-08 |
CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion |
Shoubin Yu et.al. |
2402.05889v1 |
null |
| 2024-02-08 |
Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers |
Onur G. Guleryuz et.al. |
2402.05887v1 |
link |
| 2024-02-08 |
GET-Tok: A GenAI-Enriched Multimodal TikTok Dataset Documenting the 2022 Attempted Coup in Peru |
Gabriela Pinto et.al. |
2402.05882v1 |
link |
| 2024-02-08 |
You've Got to Feel It To Believe It: Multi-Modal Bayesian Inference for Semantic and Property Prediction |
Parker Ewen et.al. |
2402.05872v1 |
null |
| 2024-02-07 |
Edu-ConvoKit: An Open-Source Library for Education Conversation Data |
Rose E. Wang et.al. |
2402.05111v1 |
link |
| 2024-02-07 |
Moduli Parameters of Complex Singularities with Non-Degenerate Newton Boundary |
Janko Boehm et.al. |
2402.05093v1 |
null |
| 2024-02-07 |
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation |
Ziyang Wang et.al. |
2402.05079v1 |
link |
| 2024-02-07 |
Arbitrary Scale Super-Resolution Assisted Lunar Crater Detection in Satellite Images |
Atal Tewari et.al. |
2402.05068v1 |
null |
| 2024-02-07 |
Efficient Multi-Resolution Fusion for Remote Sensing Data with Label Uncertainty |
Hersh Vakharia et.al. |
2402.05045v1 |
link |
| 2024-02-07 |
PAC Learnability under Explanation-Preserving Graph Perturbations |
Xu Zheng et.al. |
2402.05039v1 |
null |
| 2024-02-07 |
Strong convexity-guided hyper-parameter optimization for flatter losses |
Rahul Yedida et.al. |
2402.05025v1 |
null |
| 2024-02-07 |
Example-based Explanations for Random Forests using Machine Unlearning |
Tanmay Surve et.al. |
2402.05007v1 |
null |
| 2024-02-07 |
Randomized Confidence Bounds for Stochastic Partial Monitoring |
Maxime Heuillet et.al. |
2402.05002v1 |
null |
| 2024-02-07 |
Beyond explaining: XAI-based Adaptive Learning with SHAP Clustering for Energy Consumption Prediction |
Tobias Clement et.al. |
2402.04982v1 |
null |
| 2024-02-06 |
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters |
Quan Sun et.al. |
2402.04252v1 |
link |
| 2024-02-06 |
The spectrum of excisive functors |
Gregory Arone et.al. |
2402.04244v1 |
null |
| 2024-02-06 |
A classification of nonzero skew immaculate functions |
Sarah Mason et.al. |
2402.04219v1 |
null |
| 2024-02-06 |
Resource-Aware Hierarchical Federated Learning in Wireless Video Caching Networks |
Md Ferdous Pervej et.al. |
2402.04216v1 |
null |
| 2024-02-06 |
"Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors |
Lin Guan et.al. |
2402.04210v1 |
null |
| 2024-02-06 |
3D Volumetric Super-Resolution in Radiology Using 3D RRDB-GAN |
Juhyung Ha et.al. |
2402.04171v1 |
null |
| 2024-02-06 |
Human Emotions Analysis and Recognition Using EEG Signals in Response to 360$^\circ$ Videos |
Haseeb ur Rahman Abbasi et.al. |
2402.04142v1 |
null |
| 2024-02-06 |
Hierarchical Delay Attribution Classification using Unstructured Text in Train Management Systems |
Anton Borg et.al. |
2402.04108v1 |
null |
| 2024-02-06 |
Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction |
Shijun Liang et.al. |
2402.04097v1 |
null |
| 2024-02-06 |
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation |
Zhengbo Wang et.al. |
2402.04087v1 |
link |
| 2024-02-05 |
Multiclass Classification Procedure for Detecting Attacks on MQTT-IoT Protocol |
Hector Alaiz-Moreton et.al. |
2402.03270v1 |
null |
| 2024-02-05 |
Security Advice for Parents and Children About Content Filtering and Circumvention as Found on YouTube and TikTok |
Ran Elgedawy et.al. |
2402.03255v1 |
null |
| 2024-02-05 |
JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching |
Antoine Magron et.al. |
2402.03242v1 |
link |
| 2024-02-05 |
FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition |
Xiaohu Huang et.al. |
2402.03241v1 |
null |
| 2024-02-05 |
IGUANe: a 3D generalizable CycleGAN for multicenter harmonization of brain MR images |
Vincent Roca et.al. |
2402.03227v1 |
null |
| 2024-02-05 |
English Prompts are Better for NLI-based Zero-Shot Emotion Classification than Target-Language Prompts |
Patrick Barreiß et.al. |
2402.03223v1 |
null |
| 2024-02-05 |
"Define Your Terms" : Enhancing Efficient Offensive Speech Classification with Definition |
Huy Nghiem et.al. |
2402.03221v1 |
link |
| 2024-02-05 |
Isotropy, Clusters, and Classifiers |
Timothee Mickus et.al. |
2402.03191v1 |
null |
| 2024-02-06 |
Cool-chic video: Learned video coding with 800 parameters |
Thomas Leguay et.al. |
2402.03179v2 |
null |
| 2024-02-05 |
Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings |
Gonçalo Gomes et.al. |
2402.03172v1 |
link |
| 2024-02-02 |
From gas to stars: MUSEings on the internal evolution of IC 1613 |
S. Taibi et.al. |
2402.01631v1 |
null |
| 2024-02-02 |
Truncation technique for variational quantum eigensolver for Molecular Hamiltonians |
Qidong Xu et.al. |
2402.01630v1 |
null |
| 2024-02-02 |
L2G2G: a Scalable Local-to-Global Network Embedding with Graph Autoencoders |
Ruikang Ouyang et.al. |
2402.01614v1 |
link |
| 2024-02-02 |
Immersive Video Compression using Implicit Neural Representations |
Ho Man Kwan et.al. |
2402.01596v1 |
link |
| 2024-02-02 |
NeuroCine: Decoding Vivid Video Sequences from Human Brain Activities |
Jingyuan Sun et.al. |
2402.01590v1 |
null |
| 2024-02-02 |
Boximator: Generating Rich and Controllable Motions for Video Synthesis |
Jiawei Wang et.al. |
2402.01566v1 |
null |
| 2024-02-02 |
Deep Continuous Networks |
Nergis Tomen et.al. |
2402.01557v1 |
link |
| 2024-02-02 |
SLYKLatent, a Learning Framework for Facial Features Estimation |
Samuel Adebayo et.al. |
2402.01555v1 |
null |
| 2024-02-02 |
Advancing Brain Tumor Inpainting with Generative Models |
Ruizhi Zhu et.al. |
2402.01509v1 |
null |
| 2024-02-02 |
Di-NeRF: Distributed NeRF for Collaborative Learning with Unknown Relative Poses |
Mahboubeh Asadi et.al. |
2402.01485v1 |
null |
| 2024-02-01 |
We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline |
Simar Kareer et.al. |
2402.00868v1 |
link |
| 2024-02-01 |
Deep Room Impulse Response Completion |
Jackie Lin et.al. |
2402.00859v1 |
null |
| 2024-02-01 |
Early Time Classification with Accumulated Accuracy Gap Control |
Liran Ringel et.al. |
2402.00857v1 |
link |
| 2024-02-01 |
BootsTAP: Bootstrapped Training for Tracking-Any-Point |
Carl Doersch et.al. |
2402.00847v1 |
link |
| 2024-02-01 |
Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering |
Pinxin Liu et.al. |
2402.00827v1 |
null |
| 2024-02-01 |
Examining the Influence of Digital Phantom Models in Virtual Imaging Trials for Tomographic Breast Imaging |
Amar Kavuri et.al. |
2402.00812v1 |
null |
| 2024-02-01 |
ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models |
Zhixue Zhao et.al. |
2402.00794v1 |
link |
| 2024-02-01 |
Distinguishing the Indistinguishable: Human Expertise in Algorithmic Prediction |
Rohan Alur et.al. |
2402.00793v1 |
link |
| 2024-02-02 |
CroissantLLM: A Truly Bilingual French-English Language Model |
Manuel Faysse et.al. |
2402.00786v2 |
link |
| 2024-02-01 |
Hybrid Quantum Vision Transformers for Event Classification in High Energy Physics |
Eyup B. Unlu et.al. |
2402.00776v1 |
null |
| 2024-01-31 |
Classification-Oriented Semantic Wireless Communications |
Emrecan Kutay et.al. |
2401.18069v1 |
null |
| 2024-01-31 |
Rank Supervised Contrastive Learning for Time Series Classification |
Qianying Ren et.al. |
2401.18057v1 |
null |
| 2024-01-31 |
Variable selection for Naïve Bayes classification |
Rafael Blanquero et.al. |
2401.18039v1 |
null |
| 2024-01-31 |
Optimizing contrastive learning for cortical folding pattern detection |
Aymeric Gaudin et.al. |
2401.18035v1 |
null |
| 2024-01-31 |
A Neural Enhancement Post-Processor with a Dynamic AV1 Encoder Configuration Strategy for CLIC 2024 |
Darren Ramsook et.al. |
2401.18021v1 |
null |
| 2024-01-31 |
EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation |
Jonathan W. Kim et.al. |
2401.18006v1 |
null |
| 2024-01-31 |
Unsupervised Learning of Topological Non-Abelian Braiding in Non-Hermitian Bands |
Yang Long et.al. |
2401.17968v1 |
null |
| 2024-01-31 |
Error-Tolerant E-Discovery Protocols |
Jinshuo Dong et.al. |
2401.17952v1 |
null |
| 2024-01-31 |
HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction |
Harvie Zhang et.al. |
2401.17948v1 |
null |
| 2024-01-31 |
Probabilistic Photonic Computing with Chaotic Light |
Frank Brückerhoff-Plückelmann et.al. |
2401.17915v1 |
null |
| 2024-01-30 |
The SRG/eROSITA all-sky survey: Hard X-ray selected Active Galactic Nuclei |
Sophia G. H. Waddell et.al. |
2401.17306v1 |
null |
| 2024-01-30 |
Compact white-dwarf binaries in the combined SRG/eROSITA/SDSS eFEDS survey |
A. Schwope et.al. |
2401.17304v1 |
null |
| 2024-01-30 |
Searching for X-ray counterparts of unassociated Fermi-LAT sources and rotation-powered pulsars with SRG/eROSITA |
Martin G. F. Mayer et.al. |
2401.17295v1 |
null |
| 2024-01-30 |
X-ray AGNs with SRG/eROSITA: Multi-wavelength observations reveal merger triggering and post-coalescence circumnuclear blowout |
Robert W. Bickley et.al. |
2401.17277v1 |
null |
| 2024-01-30 |
ReacLLaMA: Merging chemical and textual information in chemical reactivity AI models |
Aline Hartgers et.al. |
2401.17267v1 |
null |
| 2024-01-30 |
SLIC: A Learned Image Codec Using Structure and Color |
Srivatsa Prativadibhayankaram et.al. |
2401.17246v1 |
link |
| 2024-01-31 |
Faster coloring and embedding in dense hypergraphs via stability |
Jianfeng Hou et.al. |
2401.17219v2 |
null |
| 2024-01-31 |
GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear |
Robert Konrad et.al. |
2401.17217v2 |
null |
| 2024-01-30 |
Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers |
Lei Xu et.al. |
2401.17196v1 |
null |
| 2024-01-30 |
GraphViz2Vec: A Structure-aware Feature Generation Model to Improve Classification in GNNs |
Shraban Kumar Chatterjee et.al. |
2401.17178v1 |
null |
| 2024-01-29 |
Computer Vision for Primate Behavior Analysis in the Wild |
Richard Vogg et.al. |
2401.16424v1 |
null |
| 2024-01-29 |
Synchformer: Efficient Synchronization from Sparse Cues |
Vladimir Iashin et.al. |
2401.16423v1 |
null |
| 2024-01-29 |
Strategic Usage in a Multi-Learner Setting |
Eliot Shekhtman et.al. |
2401.16422v1 |
null |
| 2024-01-29 |
ReTaSA: A Nonparametric Functional Estimation Approach for Addressing Continuous Target Shift |
Hwanwoo Kim et.al. |
2401.16410v1 |
null |
| 2024-01-29 |
Is K-fold cross validation the best model selection method for Machine Learning? |
Juan M Gorriz et.al. |
2401.16407v1 |
null |
| 2024-01-29 |
Zero-shot Imitation Policy via Search in Demonstration Dataset |
Federico Malato et.al. |
2401.16398v1 |
null |
| 2024-01-29 |
Ovarian Cancer Diagnostics using Wavelet Packet Scaling Descriptors |
Raymond J. Hinton Jr. et.al. |
2401.16396v1 |
null |
| 2024-01-29 |
Evaluation of pseudo-healthy image reconstruction for anomaly detection with deep generative models: Application to brain FDG PET |
Ravi Hassanaly et.al. |
2401.16363v1 |
link |
| 2024-01-29 |
Curriculum-Based Reinforcement Learning for Quadrupedal Jumping: A Reference-free Design |
Vassil Atanassov et.al. |
2401.16337v1 |
null |
| 2024-01-29 |
Making the unmodulated Pyramid wavefront sensor smart. Closed-loop demonstration of neural network wavefront reconstruction with MagAO-X |
Rico Landman et.al. |
2401.16325v1 |
null |
| 2024-01-26 |
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities |
Chaochao Lu et.al. |
2401.15071v1 |
null |
| 2024-01-26 |
Deep learning-based approach for tomato classification in complex scenes |
Mikael A. Mousse et.al. |
2401.15055v1 |
null |
| 2024-01-26 |
Non-Unitary $3 \times 3$ Mixing in Majorana Neutrinos and Vector-like Quark Models |
Pedro M. F. Pereira et.al. |
2401.15049v1 |
null |
| 2024-01-26 |
Machine learning-based analysis of glioma tissue sections: a review |
Jan-Philipp Redlich et.al. |
2401.15022v1 |
null |
| 2024-01-26 |
Enhancement of a Text-Independent Speaker Verification System by using Feature Combination and Parallel-Structure Classifiers |
Kerlos Atia Abdalmalak et.al. |
2401.15018v1 |
null |
| 2024-01-26 |
Graph-based Active Learning for Entity Cluster Repair |
Victor Christen et.al. |
2401.14992v1 |
null |
| 2024-01-26 |
Stokes graphs of the Rabi problem with real parameters |
René Langøen et.al. |
2401.14991v1 |
null |
| 2024-01-26 |
Minimum-dissipation principle for synchronised stochastic oscillators far from equilibrium |
Jan Meibohm et.al. |
2401.14982v1 |
null |
| 2024-01-26 |
Microwave lymphedema assessment using deep learning with contour assisted backprojection |
Yuyi Chang et.al. |
2401.14970v1 |
null |
| 2024-01-26 |
Hold Tight: Identifying Behavioral Patterns During Prolonged Work in VR through Video Analysis |
Verena Biener et.al. |
2401.14920v1 |
null |
| 2024-01-25 |
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities |
Yiyuan Zhang et.al. |
2401.14405v1 |
link |
| 2024-01-25 |
Adaptive Mobile Manipulation for Articulated Objects In the Open World |
Haoyu Xiong et.al. |
2401.14403v1 |
null |
| 2024-01-25 |
Range-Agnostic Multi-View Depth Estimation With Keyframe Selection |
Andrea Conti et.al. |
2401.14401v1 |
link |
| 2024-01-25 |
Rethinking Patch Dependence for Masked Autoencoders |
Letian Fu et.al. |
2401.14391v1 |
null |
| 2024-01-25 |
Smooth Ranking SVM via Cutting-Plane Method |
Erhan Can Ozcan et.al. |
2401.14388v1 |
link |
| 2024-01-25 |
Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs |
Michael R. H. Vorndran et.al. |
2401.14387v1 |
link |
| 2024-01-25 |
A Comparative Analysis of Noise Reduction Methods in Sentiment Analysis on Noisy Bengali Texts |
Kazi Toufique Elahi et.al. |
2401.14360v1 |
link |
| 2024-01-25 |
Computing Derivations on Nilpotent Quadratic Lie Algebras |
Pilar Benito et.al. |
2401.14348v1 |
null |
| 2024-01-25 |
Class-attribute Priors: Adapting Optimization to Heterogeneity and Fairness Objective |
Xuechen Zhang et.al. |
2401.14343v1 |
null |
| 2024-01-25 |
Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition |
Dichao Liu et.al. |
2401.14336v1 |
link |
| 2024-01-24 |
Tyche: Stochastic In-Context Learning for Medical Image Segmentation |
Marianne Rakic et.al. |
2401.13650v1 |
null |
| 2024-01-24 |
Quantifying the Impact of Frame Preemption on Combined TSN Shapers |
Rubi Debnath et.al. |
2401.13631v1 |
null |
| 2024-01-24 |
Can overfitted deep neural networks in adversarial training generalize? -- An approximation viewpoint |
Zhongjie Shi et.al. |
2401.13624v1 |
null |
| 2024-01-24 |
FLLIC: Functionally Lossless Image Compression |
Xi Zhang et.al. |
2401.13616v1 |
null |
| 2024-01-24 |
Enhancing Image Retrieval : A Comprehensive Study on Photo Search using the CLIP Mode |
Naresh Kumar Lahajal et.al. |
2401.13613v1 |
null |
| 2024-01-24 |
Prompt Weight Experiments for LLM Instruction Fine-Tuning |
Mathew Huerta-Enochian et.al. |
2401.13586v1 |
null |
| 2024-01-24 |
WPDA: Frequency-based Backdoor Attack with Wavelet Packet Decomposition |
Zhengyao Song et.al. |
2401.13578v1 |
null |
| 2024-01-24 |
CNN architecture extraction on edge GPU |
Peter Horvath et.al. |
2401.13575v1 |
null |
| 2024-01-24 |
Benchmarking the Fairness of Image Upsampling Methods |
Mike Laszkiewicz et.al. |
2401.13555v1 |
null |
| 2024-01-24 |
PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition |
Otto Brookes et.al. |
2401.13554v1 |
null |
| 2024-01-23 |
SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI |
Hanxue Gu et.al. |
2401.12974v1 |
null |
| 2024-01-23 |
On the Efficacy of Text-Based Input Modalities for Action Anticipation |
Apoorva Beedu et.al. |
2401.12972v1 |
null |
| 2024-01-23 |
The role of environment and AGN feedback in quenching local galaxies: Comparing cosmological hydrodynamical simulations to the SDSS |
Paul H. Goubert et.al. |
2401.12953v1 |
null |
| 2024-01-23 |
Lumiere: A Space-Time Diffusion Model for Video Generation |
Omer Bar-Tal et.al. |
2401.12945v1 |
null |
| 2024-01-23 |
Long-range three-dimensional tracking of nanoparticles using interferometric scattering (iSCAT) microscopy |
Kiarash Kasaian et.al. |
2401.12939v1 |
null |
| 2024-01-23 |
Neural deformation fields for template-based reconstruction of cortical surfaces from MRI |
Fabian Bongratz et.al. |
2401.12938v1 |
null |
| 2024-01-23 |
Segmentation of tibiofemoral joint tissues from knee MRI using MtRA-Unet and incorporating shape information: Data from the Osteoarthritis Initiative |
Akshay Daydar et.al. |
2401.12932v1 |
null |
| 2024-01-23 |
pyAKI - An Open Source Solution to Automated KDIGO classification |
Christian Porschen et.al. |
2401.12930v1 |
null |
| 2024-01-23 |
Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection |
Ankan Kar et.al. |
2401.12924v1 |
null |
| 2024-01-23 |
Advancing Glitch Classification in Gravity Spy: Multi-view Fusion with Attention-based Machine Learning for Advanced LIGO's Fourth Observing Run |
Yunan Wu et.al. |
2401.12913v1 |
null |
| 2024-01-22 |
Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition |
Haz Sameen Shahgir et.al. |
2401.12210v1 |
null |
| 2024-01-22 |
Unsupervised Machine Learning for the Classification of Astrophysical X-ray Sources |
Víctor Samuel Pérez-Díaz et.al. |
2401.12203v1 |
link |
| 2024-01-22 |
OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics |
Peiqi Liu et.al. |
2401.12202v1 |
null |
| 2024-01-22 |
In-Context Learning for Extreme Multi-Label Classification |
Karel D'Oosterlinck et.al. |
2401.12178v1 |
null |
| 2024-01-22 |
Broiler-Net: A Deep Convolutional Framework for Broiler Behavior Analysis in Poultry Houses |
Tahereh Zarrat Ehsan et.al. |
2401.12176v1 |
link |
| 2024-01-22 |
VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games |
He Zhang et.al. |
2401.12133v1 |
link |
| 2024-01-22 |
Evaluation of QCNN-LSTM for Disability Forecasting in Multiple Sclerosis Using Sequential Multisequence MRI |
John D. Mayfield et.al. |
2401.12132v1 |
null |
| 2024-01-22 |
Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy |
Will LeVine et.al. |
2401.12129v1 |
link |
| 2024-01-22 |
Measures of the Capital Network of the U.S. Economy |
Ben Klemens et.al. |
2401.12118v1 |
null |
| 2024-01-22 |
A quantitative version of the Steinhaus theorem |
Alex Iosevich et.al. |
2401.12112v1 |
null |
| 2024-01-19 |
Classifying affine structures with focus-focus singularities |
Xiudi Tang et.al. |
2401.10881v1 |
null |
| 2024-01-19 |
Motion Consistency Loss for Monocular Visual Odometry with Attention-Based Deep Learning |
André O. Françani et.al. |
2401.10857v1 |
null |
| 2024-01-19 |
Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models |
Mia Mohammad Imran et.al. |
2401.10845v1 |
null |
| 2024-01-19 |
Understanding Video Transformers via Universal Concept Discovery |
Matthew Kowal et.al. |
2401.10831v1 |
null |
| 2024-01-19 |
Long-Term Monitoring of the Oe Star VES 735: Ope! Not So Quiet After All |
Brandon Marshall et.al. |
2401.10829v1 |
null |
| 2024-01-19 |
ActAnywhere: Subject-Aware Video Background Generation |
Boxiao Pan et.al. |
2401.10822v1 |
null |
| 2024-01-19 |
RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision |
Fernando Pérez-García et.al. |
2401.10815v1 |
null |
| 2024-01-19 |
Learning to Visually Connect Actions and their Effects |
Eric Peh et.al. |
2401.10805v1 |
null |
| 2024-01-19 |
Endovascular Detection of Catheter-Thrombus Contact by Vacuum Excitation |
Jared Lawson et.al. |
2401.10804v1 |
null |
| 2024-01-19 |
TDC-less Direct Time-of-Flight Imaging Using Spiking Neural Networks |
Jack MacLean et.al. |
2401.10793v1 |
null |
| 2024-01-18 |
Simultaneous Tactile Estimation and Control for Extrinsic Dexterity |
Antonia Bronars et.al. |
2401.10230v1 |
null |
| 2024-01-18 |
OMG-Seg: Is One Model Good Enough For All Segmentation? |
Xiangtai Li et.al. |
2401.10229v1 |
link |
| 2024-01-18 |
RAP-SAM: Towards Real-Time All-Purpose Segment Anything |
Shilin Xu et.al. |
2401.10228v1 |
link |
| 2024-01-18 |
Towards Language-Driven Video Inpainting via Multimodal Large Language Models |
Jianzong Wu et.al. |
2401.10226v1 |
null |
| 2024-01-18 |
Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions |
Namitha Padmanabhan et.al. |
2401.10217v1 |
null |
| 2024-01-18 |
Transfer Learning in Human Activity Recognition: A Survey |
Sourish Gunesh Dhekane et.al. |
2401.10185v1 |
null |
| 2024-01-18 |
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild |
Andreas Engelhardt et.al. |
2401.10171v1 |
null |
| 2024-01-19 |
Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation |
Changgu Chen et.al. |
2401.10150v2 |
null |
| 2024-01-18 |
Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study |
Alejandro Galán-Cuenca et.al. |
2401.10129v1 |
null |
| 2024-01-18 |
Sub2Full: split spectrum to boost OCT despeckling without clean data |
Lingyun Wang et.al. |
2401.10128v1 |
link |
| 2024-01-17 |
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model |
Lianghui Zhu et.al. |
2401.09417v1 |
link |
| 2024-01-17 |
Vlogger: Make Your Dream A Vlog |
Shaobin Zhuang et.al. |
2401.09414v1 |
link |
| 2024-01-17 |
Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text |
Mazal Bethany et.al. |
2401.09407v1 |
null |
| 2024-01-17 |
Élivágar: Efficient Quantum Circuit Search for Classification |
Sashwat Anagolum et.al. |
2401.09393v1 |
null |
| 2024-01-17 |
Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid |
Luchuan Song et.al. |
2401.09386v1 |
link |
| 2024-01-17 |
New relations of pod partition and its connection with other partition functions |
Hemjyoti Nath et.al. |
2401.09374v1 |
null |
| 2024-01-17 |
To deform or not: treatment-aware longitudinal registration for breast DCE-MRI during neoadjuvant chemotherapy via unsupervised keypoints detection |
Luyi Han et.al. |
2401.09336v1 |
link |
| 2024-01-17 |
Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora |
Diana Davila Gordillo et.al. |
2401.09333v1 |
null |
| 2024-01-17 |
Spectral Distribution Complexity of the Surface Fibrillatory Waves Predicts Post-Catheter Ablation Relapse in Persistent Atrial Fibrillation |
Pilar Escribano et.al. |
2401.09297v1 |
null |
| 2024-01-17 |
T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis |
Yoonjin Chung et.al. |
2401.09294v1 |
null |
| 2024-01-16 |
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers |
Jiu Feng et.al. |
2401.08415v1 |
null |
| 2024-01-16 |
Faster ISNet for Background Bias Mitigation on Deep Neural Networks |
Pedro R. A. S. Bassi et.al. |
2401.08409v1 |
null |
| 2024-01-16 |
Training and Comparison of nnU-Net and DeepMedic Methods for Autosegmentation of Pediatric Brain Tumors |
Arastoo Vossough et.al. |
2401.08404v1 |
null |
| 2024-01-16 |
High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering |
Xin Ming et.al. |
2401.08398v1 |
null |
| 2024-01-16 |
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models |
Zongxin Yang et.al. |
2401.08392v1 |
link |
| 2024-01-16 |
We don't need no labels: Estimating post-deployment model performance under covariate shift without ground truth |
Jakub Białek et.al. |
2401.08348v1 |
null |
| 2024-01-16 |
Learn What You Need in Personalized Federated Learning |
Kexin Lv et.al. |
2401.08327v1 |
link |
| 2024-01-16 |
Application of LLM Agents in Recruitment: A Novel Framework for Resume Screening |
Chengguang Gan et.al. |
2401.08315v1 |
null |
| 2024-01-16 |
Central extensions of restricted Lie superalgebras and classification of $p$-nilpotent Lie superalgebras in dimension $4$ |
Sofiane Bouarroudj et.al. |
2401.08313v1 |
null |
| 2024-01-16 |
Evaluating online elasticity estimation of soft objects using standard robot grippers |
Shubhan P. Patni et.al. |
2401.08298v1 |
null |
| 2024-01-16 |
Multitask Learning in Minimally Invasive Surgical Vision: A Review |
Oluwatosin Alabi et.al. |
2401.08256v1 |
null |
| 2024-01-16 |
Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization |
Chongzhi Zhang et.al. |
2401.08232v1 |
null |
| 2024-01-16 |
Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets |
Hang Chen et.al. |
2401.08221v1 |
link |
| 2024-01-16 |
Ship Detection in SAR Images with Human-in-the-Loop |
Hecheng Jia et.al. |
2401.08213v1 |
null |
| 2024-01-16 |
ModelNet-O: A Large-Scale Synthetic Dataset for Occlusion-Aware Point Cloud Classification |
Zhongbin Fang et.al. |
2401.08210v1 |
link |
| 2024-01-12 |
Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements |
Anton Voronov et.al. |
2401.06766v1 |
null |
| 2024-01-12 |
Classification of singularities of cluster algebras of finite type II: coefficients |
Angélica Benito et.al. |
2401.06758v1 |
null |
| 2024-01-12 |
Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction |
Muhammad Naveed Riaz et.al. |
2401.06757v1 |
null |
| 2024-01-12 |
Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection |
Muhammad Tayyab Zamir et.al. |
2401.06752v1 |
null |
| 2024-01-12 |
Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part II: Spatial and Tonal Data Optimization |
Niklas Kämper et.al. |
2401.06747v1 |
null |
| 2024-01-12 |
Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part I: Homogeneous Diffusion Inpainting |
Niklas Kämper et.al. |
2401.06744v1 |
null |
| 2024-01-12 |
Complexity Classification of Product State Problems for Local Hamiltonians |
John Kallaugher et.al. |
2401.06725v1 |
null |
| 2024-01-12 |
Obstacle-Aware Positioning of a Mobile Robotic Platform for 6G Networks |
Alexandre Costa et.al. |
2401.06717v1 |
null |
| 2024-01-12 |
Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text |
Muskan Garg et.al. |
2401.06709v1 |
null |
| 2024-01-12 |
On the existence of charged electrostatic black holes in arbitrary topology |
Martin Reiris et.al. |
2401.06702v1 |
null |
| 2024-01-11 |
Distilling Vision-Language Models on Millions of Videos |
Yue Zhao et.al. |
2401.06129v1 |
null |
| 2024-01-11 |
Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors |
Jack Saunders et.al. |
2401.06126v1 |
null |
| 2024-01-11 |
Gaussian Shadow Casting for Neural Characters |
Luis Bolanos et.al. |
2401.06116v1 |
null |
| 2024-01-11 |
A Closer Look at AUROC and AUPRC under Class Imbalance |
Matthew B. A. McDermott et.al. |
2401.06091v1 |
link |
| 2024-01-12 |
LEGO:Language Enhanced Multi-modal Grounding Model |
Zhaowei Li et.al. |
2401.06071v2 |
link |
| 2024-01-11 |
On the Power of Graph Neural Networks and Feature Augmentation Strategies to Classify Social Networks |
Walid Guettala et.al. |
2401.06048v1 |
null |
| 2024-01-11 |
RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks |
Partha Ghosh et.al. |
2401.06035v1 |
null |
| 2024-01-11 |
Attention to detail: inter-resolution knowledge distillation |
Rocío del Amor et.al. |
2401.06010v1 |
link |
| 2024-01-11 |
Sea ice detection using concurrent multispectral and synthetic aperture radar imagery |
Martin S J Rogers et.al. |
2401.06009v1 |
null |
| 2024-01-11 |
Boosting Mixed-Initiative Co-Creativity in Game Design: A Tutorial |
Solange Margarido et.al. |
2401.05999v1 |
null |
| 2024-01-10 |
Towards Online Sign Language Recognition and Translation |
Ronglai Zuo et.al. |
2401.05336v1 |
link |
| 2024-01-10 |
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video |
Kevin Cai et.al. |
2401.05314v1 |
link |
| 2024-01-10 |
Strategic Client Selection to Address Non-IIDness in HAPS-enabled FL Networks |
Amin Farajzadeh et.al. |
2401.05308v1 |
null |
| 2024-01-10 |
Frame-like Fourier expansions for finite Borel measures on $\mathbb{R}$ |
Chad Berner et.al. |
2401.05243v1 |
null |
| 2024-01-10 |
Learning effective good variables from physical data |
Giulio Barletta et.al. |
2401.05226v1 |
link |
| 2024-01-10 |
TOVAC: Tele-operated Vehicle Admission Control and Routing |
Jorge Martín-Pérez et.al. |
2401.05225v1 |
null |
| 2024-01-10 |
Do Vision and Language Encoders Represent the World Similarly? |
Mayug Maniparambil et.al. |
2401.05224v1 |
null |
| 2024-01-10 |
Exploring Vulnerabilities of No-Reference Image Quality Assessment Models: A Query-Based Black-Box Method |
Chenxi Yang et.al. |
2401.05217v1 |
null |
| 2024-01-10 |
Pre-trained Large Language Models for Financial Sentiment Analysis |
Wei Luo et.al. |
2401.05215v1 |
link |
| 2024-01-10 |
A Novel Prompt-tuning Method: Incorporating Scenario-specific Concepts into a Verbalizer |
Yong Ma et.al. |
2401.05204v1 |
null |
| 2024-01-09 |
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars |
Ronglai Zuo et.al. |
2401.04730v1 |
link |
| 2024-01-09 |
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation |
Jun Ma et.al. |
2401.04722v1 |
null |
| 2024-01-09 |
Helicoidal surfaces of prescribed mean curvature in $\mathbb{R}^3$ |
Aires Eduardo Menani Barbieri et.al. |
2401.04721v1 |
null |
| 2024-01-09 |
Low-resource finetuning of foundation models beats state-of-the-art in histopathology |
Benedikt Roth et.al. |
2401.04720v1 |
null |
| 2024-01-09 |
Jump Cut Smoothing for Talking Heads |
Xiaojuan Wang et.al. |
2401.04718v1 |
null |
| 2024-01-09 |
NIPn CHIPS |
Blaise Boissonneau et.al. |
2401.04697v1 |
null |
| 2024-01-09 |
CoordGate: Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural Networks |
Sunny Howard et.al. |
2401.04680v1 |
null |
| 2024-01-09 |
Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset |
Galib Muhammad Shahriar Himel et.al. |
2401.04666v1 |
null |
| 2024-01-09 |
DepressionEmo: A novel dataset for multilabel classification of depression emotions |
Abu Bakar Siddiqur Rahman et.al. |
2401.04655v1 |
link |
| 2024-01-09 |
Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots |
Immanuel Ampomah Mensah et.al. |
2401.04650v1 |
null |
| 2024-01-08 |
Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning |
Chen Zhao et.al. |
2401.04105v1 |
null |
| 2024-01-08 |
RudolfV: A Foundation Model by Pathologists for Pathologists |
Jonas Dippel et.al. |
2401.04079v1 |
null |
| 2024-01-08 |
Variance Reduction in Ratio Metrics for Efficient Online Experiments |
Shubham Baweja et.al. |
2401.04062v1 |
null |
| 2024-01-08 |
Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution, Challenges, and Recommendations |
Nabajeet Barman et.al. |
2401.04039v1 |
null |
| 2024-01-08 |
Blocks whose defect groups are Suzuki $2$-groups |
Charles W. Eaton et.al. |
2401.04028v1 |
null |
| 2024-01-08 |
IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification |
Abdullah Alsuhaibani et.al. |
2401.04025v1 |
null |
| 2024-01-08 |
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification |
Wentao Zhu et.al. |
2401.04023v1 |
null |
| 2024-01-08 |
Resident space object detection method based on the connection between Fourier spectrum of the video data difference frame and the linear velocity projection |
V. S. Baranova et.al. |
2401.04021v1 |
null |
| 2024-01-09 |
Recognizing Blazars Using Radio Morphology from the VLA Sky Survey |
Zhang-Liang Xie et.al. |
2401.04009v2 |
null |
| 2024-01-08 |
Calabi-Yau Varieties via Cyclic Covers, and Complex Hyperbolic Structures for their Moduli Spaces |
Chenglong Yu et.al. |
2401.04006v1 |
null |
| 2024-01-05 |
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively |
Haobo Yuan et.al. |
2401.02955v1 |
link |
| 2024-01-05 |
The Dark Energy Survey Supernova Program: Cosmological Analysis and Systematic Uncertainties |
M. Vincenzi et.al. |
2401.02945v1 |
null |
| 2024-01-05 |
Digital-analog quantum learning on Rydberg atom arrays |
Jonathan Z. Lu et.al. |
2401.02940v1 |
null |
| 2024-01-05 |
Mixing Magnetic and Electric Ehlers-Harrison transformations: The Electromagnetic Swirling Spacetime and Novel Type I Backgrounds |
José Barrientos et.al. |
2401.02924v1 |
null |
| 2024-01-05 |
Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks |
Kevin Everson et.al. |
2401.02921v1 |
null |
| 2024-01-05 |
Analytically-Driven Resource Management for Cloud-Native Microservices |
Yanqi Zhang et.al. |
2401.02920v1 |
null |
| 2024-01-05 |
Introducing Bode: A Fine-Tuned Large Language Model for Portuguese Prompt-Based Task |
Gabriel Lino Garcia et.al. |
2401.02909v1 |
null |
| 2024-01-05 |
Robust Bichromatic Classification using Two Lines |
Erwin Glazenburg et.al. |
2401.02897v1 |
null |
| 2024-01-05 |
Particle-Wise Higher-Order SPH Field Approximation for DVR |
Jonathan Fischer et.al. |
2401.02896v1 |
null |
| 2024-01-05 |
Nonlinear functional regression by functional deep neural network with kernel embedding |
Zhongjie Shi et.al. |
2401.02890v1 |
null |
| 2024-01-04 |
asimulation: Domain formation and impact on observables in resolved cosmological simulations of the (a)symmetron |
Øyvind Christiansen et.al. |
2401.02410v1 |
link |
| 2024-01-04 |
Gravitational waves from dark domain walls |
Øyvind Christiansen et.al. |
2401.02409v1 |
link |
| 2024-01-05 |
Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks |
Hartwig H. Hochmair et.al. |
2401.02404v2 |
null |
| 2024-01-04 |
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation |
Zihao Xiao et.al. |
2401.02402v1 |
null |
| 2024-01-04 |
Analyzing Misinformation Claims During the 2022 Brazilian General Election on WhatsApp, Twitter, and Kwai |
Scott A. Hale et.al. |
2401.02395v1 |
null |
| 2024-01-04 |
Image denoising and model-independent parameterization for improving IVIM MRI |
Caleb Sample et.al. |
2401.02394v1 |
null |
| 2024-01-04 |
Survey of 3D Human Body Pose and Shape Estimation Methods for Contemporary Dance Applications |
Darshan Venkatrayappa et.al. |
2401.02383v1 |
null |
| 2024-01-04 |
A novel method to enhance pneumonia detection via a model-level ensembling of CNN and vision transformer |
Sandeep Angara et.al. |
2401.02358v1 |
null |
| 2024-01-04 |
ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation |
Xinyang Pu et.al. |
2401.02326v1 |
link |
| 2024-01-04 |
Reflection physics in X-ray-emitting Symbiotic Stars |
Jesús A. Toalá et.al. |
2401.02318v1 |
null |
| 2024-01-03 |
Profinite equivariant spectra and their tensor-triangular geometry |
Scott Balchin et.al. |
2401.01878v1 |
null |
| 2024-01-03 |
A spatial mixture model for spaceborne lidar observations over mixed forest and non-forest land types |
Paul B. May et.al. |
2401.01848v1 |
null |
| 2024-01-03 |
Teaching with a companion: the case of gravity |
Iuliia Zhurakovskaia et.al. |
2401.01832v1 |
null |
| 2024-01-03 |
Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling |
Himmet Toprak Kesgin et.al. |
2401.01830v1 |
null |
| 2024-01-03 |
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions |
David Junhao Zhang et.al. |
2401.01827v1 |
link |
| 2024-01-03 |
Detours for Navigating Instructional Videos |
Kumar Ashutosh et.al. |
2401.01823v1 |
null |
| 2024-01-03 |
SENS3: Multisensory Database of Finger-Surface Interactions and Corresponding Sensations |
Jagan K. Balasubramanian et.al. |
2401.01818v1 |
null |
| 2024-01-03 |
Signal Processing in the Retina: Interpretable Graph Classifier to Predict Ganglion Cell Responses |
Yasaman Parhizkar et.al. |
2401.01813v1 |
null |
| 2024-01-03 |
Efficient Computation of Confidence Sets Using Classification on Equidistributed Grids |
Lujie Zhou et.al. |
2401.01804v1 |
null |
| 2024-01-03 |
An experimental sorting method for improving metagenomic data encoding |
Diogo Pratas et.al. |
2401.01786v1 |
null |
| 2024-01-02 |
Street Gaussians for Modeling Dynamic Urban Scenes |
Yunzhi Yan et.al. |
2401.01339v1 |
null |
| 2024-01-02 |
Classifying Words with 3-sort Automata |
Tomasz Jastrząb et.al. |
2401.01314v1 |
null |
| 2024-01-03 |
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models |
S. M Towhidul Islam Tonmoy et.al. |
2401.01313v2 |
null |
| 2024-01-02 |
Integrating Edges into U-Net Models with Explainable Activation Maps for Brain Tumor Segmentation using MR Images |
Subin Sahayam et.al. |
2401.01303v1 |
null |
| 2024-01-02 |
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy |
Nicola Novello et.al. |
2401.01268v1 |
link |
| 2024-01-02 |
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM |
Fuchen Long et.al. |
2401.01256v1 |
null |
| 2024-01-02 |
An operational approach to classifying measurement incompatibility |
Arun Kumar Das et.al. |
2401.01236v1 |
null |
| 2024-01-03 |
Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond |
Dimitrios Kollias et.al. |
2401.01219v2 |
null |
| 2024-01-02 |
FGENet: Fine-Grained Extraction Network for Congested Crowd Counting |
Hao-Yuan Ma et.al. |
2401.01208v1 |
null |
| 2024-01-02 |
Whole-examination AI estimation of fetal biometrics from 20-week ultrasound scans |
Lorenzo Venturini et.al. |
2401.01201v1 |
null |
| 2023-12-29 |
Computational Tools for Trees in Gauge Theory and Gravity |
Jacob L. Bourjaily et.al. |
2312.17745v1 |
null |
| 2023-12-29 |
Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization |
Ioanna Ntinou et.al. |
2312.17686v1 |
null |
| 2023-12-29 |
Malware Detection in IOT Systems Using Machine Learning Techniques |
Ali Mehrban et.al. |
2312.17683v1 |
null |
| 2023-12-29 |
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis |
Feng Liang et.al. |
2312.17681v1 |
null |
| 2023-12-29 |
Grasping, Part Identification, and Pose Refinement in One Shot with a Tactile Gripper |
Joyce Xin-Yan Lim et.al. |
2312.17650v1 |
null |
| 2023-12-29 |
MoD2T:Model-Data-Driven Motion-Static Object Tracking Method |
Yang Feng et.al. |
2312.17641v1 |
null |
| 2023-12-29 |
A New Explanation of the Mechanism of Hadley Circulation |
Wei Huang et.al. |
2312.17637v1 |
null |
| 2023-12-29 |
Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training |
Dongfang Li et.al. |
2312.17591v1 |
null |
| 2023-12-29 |
A Tool for the Procedural Generation of Shaders using Interactive Evolutionary Algorithms |
Elio Sasso et.al. |
2312.17587v1 |
link |
| 2023-12-29 |
Distribution-based Low-rank Embedding |
Bardia Yousefi et.al. |
2312.17579v1 |
null |
| 2023-12-28 |
A Simple LLM Framework for Long-Range Video Question-Answering |
Ce Zhang et.al. |
2312.17235v1 |
null |
| 2023-12-28 |
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency |
Yuyang Yin et.al. |
2312.17225v1 |
null |
| 2023-12-28 |
EFHQ: Multi-purpose ExtremePose-Face-HQ dataset |
Trung Tuan Dao et.al. |
2312.17205v1 |
null |
| 2023-12-28 |
One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts |
Ziheng Zhao et.al. |
2312.17183v1 |
null |
| 2023-12-28 |
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action |
Jiasen Lu et.al. |
2312.17172v1 |
null |
| 2023-12-28 |
Classification of multiplication modules over multiplication rings with finitely many minimal primes |
Volodymyr Bavula et.al. |
2312.17170v1 |
null |
| 2023-12-28 |
Securing NextG Systems against Poisoning Attacks on Federated Learning: A Game-Theoretic Solution |
Yalin E. Sagduyu et.al. |
2312.17164v1 |
null |
| 2023-12-28 |
Replica Tree-based Federated Learning using Limited Data |
Ramona Ghilea et.al. |
2312.17159v1 |
null |
| 2023-12-29 |
ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe |
Yifan Bai et.al. |
2312.17133v2 |
null |
| 2023-12-28 |
Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos |
Houlun Chen et.al. |
2312.17117v1 |
null |
| 2023-12-26 |
Microwave signal processing using an analog quantum reservoir computer |
Alen Senanian et.al. |
2312.16166v1 |
null |
| 2023-12-26 |
Large-scale Long-tailed Disease Diagnosis on Radiology Images |
Qiaoyu Zheng et.al. |
2312.16151v1 |
null |
| 2023-12-27 |
The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias |
Timo Spinde et.al. |
2312.16148v2 |
link |
| 2023-12-26 |
The non-Abelian Aharonov-Bohm effect |
P. A. Horvathy et.al. |
2312.16133v1 |
null |
| 2023-12-26 |
LangSplat: 3D Language Gaussian Splatting |
Minghan Qin et.al. |
2312.16084v1 |
null |
| 2023-12-26 |
AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts |
Yingpeng Wen et.al. |
2312.16046v1 |
null |
| 2023-12-26 |
An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification |
Hyenkyun Woo et.al. |
2312.16043v1 |
null |
| 2023-12-26 |
Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation |
Xingxing Yang et.al. |
2312.16040v1 |
null |
| 2023-12-26 |
Plug-and-Play Regularization on Magnitude with Deep Priors for 3D Near-Field MIMO Imaging |
Okyanus Oral et.al. |
2312.16024v1 |
null |
| 2023-12-26 |
Classification of positive solutions of Hardy-Sobolev equation without the finite volume constraints |
Lu Chen et.al. |
2312.16017v1 |
null |
| 2023-12-25 |
Training Convolutional Neural Networks with the Forward-Forward algorithm |
Riccardo Scodellaro et.al. |
2312.14924v2 |
null |
| 2023-12-22 |
DRStageNet: Deep Learning for Diabetic Retinopathy Staging from Fundus Images |
Yevgeniy Men et.al. |
2312.14891v1 |
null |
| 2023-12-22 |
On rate-optimal classification from non-private and from private data |
Balázs Csanád Csáji et.al. |
2312.14889v1 |
null |
| 2023-12-22 |
Classification of cubic tricirculant nut graphs |
Ivan Damnjanović et.al. |
2312.14884v1 |
null |
| 2023-12-22 |
Neural-network-based regularization methods for inverse problems in imaging |
Andreas Habring et.al. |
2312.14849v1 |
null |
| 2023-12-22 |
Classification of 3-GNDB Graphs |
Amir Hosseini et.al. |
2312.14835v1 |
null |
| 2023-12-22 |
Dreaming of Electrical Waves: Generative Modeling of Cardiac Excitation Waves using Diffusion Models |
Tanish Baranwal et.al. |
2312.14830v1 |
null |
| 2023-12-22 |
Classification of generalised higher-order Einstein-Maxwell Lagrangians |
Aimeric Colléaux et.al. |
2312.14814v1 |
null |
| 2023-12-22 |
On support vector machines under a multiple-cost scenario |
Sandra Benítez-Peña et.al. |
2312.14795v1 |
null |
| 2023-12-22 |
The Rate-Distortion-Perception-Classification Tradeoff: Joint Source Coding and Modulation via Inverse-Domain GANs |
Junli Fang et.al. |
2312.14792v1 |
null |
| 2023-12-21 |
3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera |
Christen Millerdurai et.al. |
2312.14157v1 |
null |
| 2023-12-21 |
Virtual Pets: Animatable Animal Generation in 3D Scenes |
Yen-Chi Cheng et.al. |
2312.14154v1 |
null |
| 2023-12-21 |
TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification |
Qinying Liu et.al. |
2312.14149v1 |
link |
| 2023-12-21 |
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs |
Artem Sevastopolsky et.al. |
2312.14140v1 |
null |
| 2023-12-21 |
Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach |
Qinying Liu et.al. |
2312.14138v1 |
link |
| 2023-12-21 |
Diffusion Reward: Learning Rewards via Conditional Video Diffusion |
Tao Huang et.al. |
2312.14134v1 |
null |
| 2023-12-21 |
WellFactor: Patient Profiling using Integrative Embedding of Healthcare Data |
Dongjin Choi et.al. |
2312.14129v1 |
null |
| 2023-12-21 |
VideoPoet: A Large Language Model for Zero-Shot Video Generation |
Dan Kondratyuk et.al. |
2312.14125v1 |
null |
| 2023-12-21 |
LingoQA: Video Question Answering for Autonomous Driving |
Ana-Maria Marcu et.al. |
2312.14115v1 |
link |
| 2023-12-21 |
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding |
Senqiao Yang et.al. |
2312.14074v1 |
null |
| 2023-12-20 |
Deep Learning on 3D Neural Fields |
Pierluigi Zama Ramirez et.al. |
2312.13277v1 |
null |
| 2023-12-20 |
The 1/4-BPS building blocks of brane interactions |
Ben Eckardt et.al. |
2312.13269v1 |
null |
| 2023-12-20 |
ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement |
Zixiang Wei et.al. |
2312.13265v1 |
null |
| 2023-12-20 |
Putting the p back in Prym |
Jeff Achter et.al. |
2312.13263v1 |
null |
| 2023-12-20 |
The role of data embedding in equivariant quantum convolutional neural networks |
Sreetama Das et.al. |
2312.13250v1 |
null |
| 2023-12-20 |
Enhancing Neural Training via a Correlated Dynamics Model |
Jonathan Brokman et.al. |
2312.13247v1 |
null |
| 2023-12-20 |
SISMIK for brain MRI: Deep-learning-based motion estimation and model-based motion correction in k-space |
Oscar Dabrowski et.al. |
2312.13220v1 |
null |
| 2023-12-20 |
Boost recall in QSO selection from highly imbalanced photometric datasets |
Giorgio Calderone et.al. |
2312.13194v1 |
null |
| 2023-12-20 |
Ergodic measures for periodic type $\mathbb{Z}^m$-skew-products over Interval Exchange Transformations |
Yuriy Tumarkin et.al. |
2312.13165v1 |
null |
| 2023-12-20 |
Underwater Acoustic Signal Recognition Based on Salient Features |
Minghao Chen et.al. |
2312.13143v1 |
null |
| 2023-12-19 |
Tracking Any Object Amodally |
Cheng-Yen Hsieh et.al. |
2312.12433v1 |
null |
| 2023-12-19 |
The Endoscapes Dataset for Surgical Scene Segmentation, Object Detection, and Critical View of Safety Assessment: Official Splits and Benchmark |
Aditya Murali et.al. |
2312.12429v1 |
null |
| 2023-12-19 |
Chasing Fairness in Graphs: A GNN Architecture Perspective |
Zhimeng Jiang et.al. |
2312.12369v1 |
link |
| 2023-12-19 |
Easy quantum groups |
Teo Banica et.al. |
2312.12368v1 |
null |
| 2023-12-19 |
SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Action Segmentation |
Feixiang Zhou et.al. |
2312.12347v1 |
null |
| 2023-12-19 |
On the Effectiveness of Retrieval, Alignment, and Replay in Manipulation |
Norman Di Palo et.al. |
2312.12345v1 |
null |
| 2023-12-19 |
Full-reference Video Quality Assessment for User Generated Content Transcoding |
Zihao Qi et.al. |
2312.12317v1 |
null |
| 2023-12-19 |
First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria |
Stefan Schoder et.al. |
2312.12314v1 |
null |
| 2023-12-19 |
Holography of New Conformal Higher Spin Gravities in 3d |
I. Lovrekovic et.al. |
2312.12301v1 |
null |
| 2023-12-19 |
Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation |
Junxiang Wang et.al. |
2312.12276v1 |
null |
| 2023-12-18 |
Development and Evaluation of Ensemble Learning-based Environmental Methane Detection and Intensity Prediction Models |
Reek Majumder et.al. |
2312.10879v1 |
null |
| 2023-12-18 |
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation |
Hui Fu et.al. |
2312.10877v1 |
null |
| 2023-12-17 |
Global relaxation-based LP-Newton method for multiple hyperparameter selection in support vector classification with feature selection |
Qingna Li et.al. |
2312.10848v1 |
null |
| 2023-12-17 |
Online Boosting Adaptive Learning under Concept Drift for Multistream Classification |
En Yu et.al. |
2312.10841v1 |
null |
| 2023-12-17 |
Learning to Act without Actions |
Dominik Schmidt et.al. |
2312.10812v1 |
null |
| 2023-12-17 |
Land use/land cover classification of fused Sentinel-1 and Sentinel-2 imageries using ensembles of Random Forests |
Shivam Pande et.al. |
2312.10798v1 |
null |
| 2023-12-17 |
Learning to Learn in Interactive Constraint Acquisition |
Dimos Tsouros et.al. |
2312.10795v1 |
null |
| 2023-12-17 |
Identification of Knowledge Neurons in Protein Language Models |
Divya Nori et.al. |
2312.10770v1 |
null |
| 2023-12-17 |
Multi-Label Classification of COVID-Tweets Using Large Language Models |
Aniket Deroy et.al. |
2312.10748v1 |
link |
| 2023-12-17 |
Unmasking Deepfake Faces from Videos Using An Explainable Cost-Sensitive Deep Learning Approach |
Faysal Mahmud et.al. |
2312.10740v1 |
link |
| 2023-12-15 |
Understanding Probe Behaviors through Variational Bounds of Mutual Information |
Kwanghee Choi et.al. |
2312.10019v1 |
link |
| 2023-12-15 |
Wearable Coaxially-shielded Metamaterial for Magnetic Resonance Imaging |
Xia Zhu et.al. |
2312.10018v1 |
null |
| 2023-12-15 |
On the Invertibility of Euler Integral Transforms with Hyperplanes and Quadric Hypersurfaces |
Mattie Ji et.al. |
2312.10002v1 |
null |
| 2023-12-15 |
Towards Architecture-Insensitive Untrained Network Priors for Accelerated MRI Reconstruction |
Yilin Liu et.al. |
2312.09988v1 |
null |
| 2023-12-15 |
DHFormer: A Vision Transformer-Based Attention Module for Image Dehazing |
Abdul Wasi et.al. |
2312.09955v1 |
null |
| 2023-12-15 |
Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction |
Yuanbo Hou et.al. |
2312.09952v1 |
null |
| 2023-12-15 |
LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer |
Yuxin Cao et.al. |
2312.09935v1 |
link |
| 2023-12-15 |
RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding |
Yuxin Zi et.al. |
2312.09932v1 |
null |
| 2023-12-15 |
Reliable Probabilistic Classification with Neural Networks |
Harris Papadopoulos et.al. |
2312.09912v1 |
null |
| 2023-12-15 |
TMP: Temporal Motion Propagation for Online Video Super-Resolution |
Zhengqiang Zhang et.al. |
2312.09909v1 |
null |
| 2023-12-14 |
3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting |
Zhiyin Qian et.al. |
2312.09228v1 |
null |
| 2023-12-14 |
Efficient Online Learning of Contact Force Models for Connector Insertion |
Kevin Tracy et.al. |
2312.09190v1 |
null |
| 2023-12-14 |
General Object Foundation Model for Images and Videos at Scale |
Junfeng Wu et.al. |
2312.09158v1 |
null |
| 2023-12-14 |
Evaluating Augmented Reality Communication: How Can We Teach Procedural Skill in AR? |
Manuel Rebol et.al. |
2312.09152v1 |
null |
| 2023-12-14 |
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting |
Anthony Chen et.al. |
2312.09148v1 |
null |
| 2023-12-14 |
Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training Strategy |
Junsu Kim et.al. |
2312.09139v1 |
null |
| 2023-12-14 |
Less is more -- the Dispatcher/ Executor principle for multi-task Reinforcement Learning |
Martin Riedmiller et.al. |
2312.09120v1 |
null |
| 2023-12-14 |
VideoLCM: Video Latent Consistency Model |
Xiang Wang et.al. |
2312.09109v1 |
null |
| 2023-12-14 |
FastInject: Injecting Unpaired Text Data into CTC-based ASR training |
Keqi Deng et.al. |
2312.09100v1 |
null |
| 2023-12-14 |
Agent Attention: On the Integration of Softmax and Linear Attention |
Dongchen Han et.al. |
2312.08874v1 |
link |
| 2023-12-13 |
VLAP: Efficient Video-Language Alignment via Frame Prompting and Distilling for Video Question Answering |
Xijun Wang et.al. |
2312.08367v1 |
null |
| 2023-12-13 |
Challenges and Opportunities in Implementing Negative Differential Resistance Mode Reconfigurable Field Effect Transistors |
Lephe S et.al. |
2312.08351v1 |
null |
| 2023-12-13 |
Ehancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework |
Zhuoyao Xin et.al. |
2312.08343v1 |
null |
| 2023-12-13 |
Preparing VVC for Streaming: A Fast Multi-Rate Encoding Approach |
Yiqun Liu et.al. |
2312.08330v1 |
null |
| 2023-12-13 |
Affine monoids of corank one |
Yulia Zaitseva et.al. |
2312.08316v1 |
null |
| 2023-12-13 |
VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space |
Guénolé Fiche et.al. |
2312.08291v1 |
null |
| 2023-12-13 |
PhenDiff: Revealing Invisible Phenotypes with Conditional Diffusion Models |
Anis Bourou et.al. |
2312.08290v1 |
link |
| 2023-12-13 |
On the verification of Embeddings using Hybrid Markov Logic |
Anup Shakya et.al. |
2312.08287v1 |
null |
| 2023-12-14 |
High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models |
Songchi Zhou et.al. |
2312.08274v2 |
null |
| 2023-12-13 |
Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation |
Arul Selvam Periyasamy et.al. |
2312.08268v1 |
null |
| 2023-12-12 |
diff History for Long-Context Language Agents |
Ulyana Piterbarg et.al. |
2312.07540v1 |
null |
| 2023-12-12 |
FreeInit: Bridging Initialization Gap in Video Diffusion Models |
Tianxing Wu et.al. |
2312.07537v1 |
link |
| 2023-12-12 |
WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion |
Soyong Shin et.al. |
2312.07531v1 |
null |
| 2023-12-12 |
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation |
Peng Lu et.al. |
2312.07526v1 |
link |
| 2023-12-12 |
PEEKABOO: Interactive Video Generation via Masked-Diffusion |
Yash Jain et.al. |
2312.07509v1 |
null |
| 2023-12-12 |
NAC-TCN: Temporal Convolutional Networks with Causal Dilated Neighborhood Attention for Emotion Understanding |
Alexander Mehta et.al. |
2312.07507v1 |
link |
| 2023-12-12 |
COLMAP-Free 3D Gaussian Splatting |
Yang Fu et.al. |
2312.07504v1 |
null |
| 2023-12-12 |
NearbyPatchCL: Leveraging Nearby Patches for Self-Supervised Patch-Level Multi-Class Classification in Whole-Slide Images |
Gia-Bao Le et.al. |
2312.07489v1 |
null |
| 2023-12-12 |
MinD-3D: Reconstruct High-quality 3D objects in Human Brain |
Jianxiong Gao et.al. |
2312.07485v1 |
null |
| 2023-12-12 |
Classification of retail products: From probabilistic ranking to neural networks |
Manar Mohamed Hafez et.al. |
2312.07482v1 |
null |
| 2023-12-11 |
Photorealistic Video Generation with Diffusion Models |
Agrim Gupta et.al. |
2312.06662v1 |
null |
| 2023-12-11 |
LightSim: Neural Lighting Simulation for Urban Scenes |
Ava Pun et.al. |
2312.06654v1 |
null |
| 2023-12-11 |
Beyond Classification: Definition and Density-based Estimation of Calibration in Object Detection |
Teodora Popordanoska et.al. |
2312.06645v1 |
null |
| 2023-12-11 |
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution |
Shangchen Zhou et.al. |
2312.06640v1 |
null |
| 2023-12-12 |
TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation |
Rongkun Zheng et.al. |
2312.06630v2 |
link |
| 2023-12-11 |
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism |
Georgios Milis et.al. |
2312.06613v1 |
link |
| 2023-12-11 |
Early Action Recognition with Action Prototypes |
Guglielmo Camporese et.al. |
2312.06598v1 |
null |
| 2023-12-11 |
Flexible visual prompts for in-context learning in computer vision |
Thomas Foster et.al. |
2312.06592v1 |
link |
| 2023-12-11 |
QuickQuakeBuildings: Post-earthquake SAR-Optical Dataset for Quick Damaged-building Detection |
Yao Sun et.al. |
2312.06587v1 |
null |
| 2023-12-12 |
ESO/HARPS Radial Velocities Catalog |
Mauro Barbieri et.al. |
2312.06586v2 |
null |
| 2023-12-08 |
The Long Secondary Period (LSP) Variables: Overview and Some Analysis |
John R. Percy et.al. |
2312.05255v1 |
null |
| 2023-12-08 |
Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration |
Qi-Wei Wang et.al. |
2312.05229v1 |
null |
| 2023-12-08 |
Shape Matters: Detecting Vertebral Fractures Using Differentiable Point-Based Shape Decoding |
Hellena Hempe et.al. |
2312.05220v1 |
link |
| 2023-12-08 |
Enhancing Facial Classification and Recognition using 3D Facial Models and Deep Learning |
Houting Li et.al. |
2312.05219v1 |
null |
| 2023-12-08 |
IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing |
Shaofei Wang et.al. |
2312.05210v1 |
null |
| 2023-12-08 |
Embedding theory in ML toward real-time tracking of structural dynamics through hyperspectral datasets |
Jonathan D Hollenbach et.al. |
2312.05201v1 |
null |
| 2023-12-08 |
Video-Based Rendering Techniques: A Survey |
Rafael Kuffner dos Anjos et.al. |
2312.05179v1 |
null |
| 2023-12-08 |
Enhancing Single-Frame Supervision for Better Temporal Action Localization |
Changjian Chen et.al. |
2312.05178v1 |
null |
| 2023-12-08 |
MRI Scan Synthesis Methods based on Clustering and Pix2Pix |
Giulia Baldini et.al. |
2312.05176v1 |
null |
| 2023-12-08 |
TriHuman : A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis |
Heming Zhu et.al. |
2312.05161v1 |
null |
| 2023-12-07 |
GenDeF: Learning Generative Deformation Field for Video Generation |
Wen Wang et.al. |
2312.04561v1 |
null |
| 2023-12-07 |
MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar |
Yufan Chen et.al. |
2312.04558v1 |
null |
| 2023-12-07 |
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation |
Shoufa Chen et.al. |
2312.04557v1 |
null |
| 2023-12-07 |
SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing |
Tomoki Ichikawa et.al. |
2312.04553v1 |
null |
| 2023-12-07 |
PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play |
Lili Chen et.al. |
2312.04549v1 |
null |
| 2023-12-07 |
Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? |
Aritra Dutta et.al. |
2312.04548v1 |
null |
| 2023-12-07 |
Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models |
Ivan Kapelyukh et.al. |
2312.04533v1 |
null |
| 2023-12-07 |
Camera Height Doesn't Change: Unsupervised Monocular Scale-Aware Road-Scene Depth Estimation |
Genki Kinoshita et.al. |
2312.04530v1 |
null |
| 2023-12-07 |
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models |
Ozgur Kara et.al. |
2312.04524v1 |
link |
| 2023-12-07 |
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation |
Zhiwu Qing et.al. |
2312.04483v1 |
null |
| 2023-12-06 |
OneLLM: One Framework to Align All Modalities with Language |
Jiaming Han et.al. |
2312.03700v1 |
link |
| 2023-12-07 |
Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers |
Umberto Cappellazzo et.al. |
2312.03694v2 |
null |
| 2023-12-06 |
Direct Exoplanet Detection Using Deep Convolutional Image Reconstruction (ConStruct): A New Algorithm for Post-Processing High-Contrast Images |
Trevor N. Wolf et.al. |
2312.03671v1 |
null |
| 2023-12-06 |
Annihilating branching Brownian motion |
Daniel Ahlberg et.al. |
2312.03669v1 |
null |
| 2023-12-06 |
Towards small and accurate convolutional neural networks for acoustic biodiversity monitoring |
Serge Zaugg et.al. |
2312.03666v1 |
null |
| 2023-12-06 |
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving |
Ming Nie et.al. |
2312.03661v1 |
link |
| 2023-12-06 |
Editable Stain Transformation Of Histological Images Using Unpaired GANs |
Tibor Sloboda et.al. |
2312.03647v1 |
link |
| 2023-12-06 |
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation |
Zhouxia Wang et.al. |
2312.03641v1 |
null |
| 2023-12-06 |
Training Neural Networks on RAW and HDR Images for Restoration Tasks |
Lei Luo et.al. |
2312.03640v1 |
link |
| 2023-12-07 |
Evaluation of Active Feature Acquisition Methods for Static Feature Settings |
Henrik von Kleist et.al. |
2312.03619v2 |
null |
| 2023-12-05 |
Dexterous Functional Grasping |
Ananye Agarwal et.al. |
2312.02975v1 |
null |
| 2023-12-05 |
Describing Differences in Image Sets with Natural Language |
Lisa Dunlap et.al. |
2312.02974v1 |
link |
| 2023-12-05 |
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos |
Shoukang Hu et.al. |
2312.02973v1 |
link |
| 2023-12-05 |
Detecting algorithmic bias in medical AI-models |
Jeffrey Smith et.al. |
2312.02959v1 |
null |
| 2023-12-05 |
Classification for everyone : Building geography agnostic models for fairer recognition |
Akshat Jindal et.al. |
2312.02957v1 |
null |
| 2023-12-05 |
Choroidalyzer: An open-source, end-to-end pipeline for choroidal analysis in optical coherence tomography |
Justin Engelmann et.al. |
2312.02956v1 |
null |
| 2023-12-05 |
An alternating peak-optimization method for optimal trajectory generation of quadrotor drones |
Wytze A. B. de Vries et.al. |
2312.02944v1 |
null |
| 2023-12-05 |
Fast CT anatomic localization algorithm |
Amit Oved et.al. |
2312.02941v1 |
null |
| 2023-12-05 |
Drag-A-Video: Non-rigid Video Editing with Point-based Interaction |
Yao Teng et.al. |
2312.02936v1 |
null |
| 2023-12-06 |
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation |
Jiachen Lu et.al. |
2312.02934v2 |
link |
| 2023-12-04 |
iMatching: Imperative Correspondence Learning |
Zitong Zhan et.al. |
2312.02141v1 |
null |
| 2023-12-04 |
Fast View Synthesis of Casual Videos |
Yao-Chih Lee et.al. |
2312.02135v1 |
null |
| 2023-12-04 |
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians |
Liangxiao Hu et.al. |
2312.02134v1 |
null |
| 2023-12-04 |
Hot PATE: Private Aggregation of Distributions for Diverse Task |
Edith Cohen et.al. |
2312.02132v1 |
null |
| 2023-12-04 |
Can we truly transfer an actor's genuine happiness to avatars? An investigation into virtual, real, posed and spontaneous faces |
Vitor Miguel Xavier Peres et.al. |
2312.02128v1 |
null |
| 2023-12-04 |
Cosmic star-formation history and black hole accretion history inferred from the JWST mid-infrared source counts |
Seong Jin Kim et.al. |
2312.02090v1 |
null |
| 2023-12-05 |
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence |
Yuchao Gu et.al. |
2312.02087v2 |
null |
| 2023-12-04 |
Integrating AI into CCTV Systems: A Comprehensive Evaluation of Smart Video Surveillance in Community Space |
Shanle Yao et.al. |
2312.02078v1 |
null |
| 2023-12-04 |
GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians |
Shenhan Qian et.al. |
2312.02069v1 |
null |
| 2023-12-04 |
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding |
Shuhuai Ren et.al. |
2312.02051v1 |
null |
| 2023-12-01 |
Dense Optical Tracking: Connecting the Dots |
Guillaume Le Moing et.al. |
2312.00786v1 |
null |
| 2023-12-01 |
Sequential Modeling Enables Scalable Learning for Large Vision Models |
Yutong Bai et.al. |
2312.00785v1 |
null |
| 2023-12-01 |
MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video |
Hengyi Wang et.al. |
2312.00778v1 |
null |
| 2023-12-01 |
VideoBooth: Diffusion-based Video Generation with Image Prompts |
Yuming Jiang et.al. |
2312.00777v1 |
null |
| 2023-12-01 |
Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans |
Homanga Bharadhwaj et.al. |
2312.00775v1 |
null |
| 2023-12-01 |
Explaining Knock-on Effects of Bias Mitigation |
Svetoslav Nizhnichenkov et.al. |
2312.00765v1 |
null |
| 2023-12-04 |
Deep Unlearning: Fast and Efficient Training-free Approach to Controlled Forgetting |
Sangamesh Kodge et.al. |
2312.00761v2 |
null |
| 2023-12-01 |
Mitigating Over-smoothing in Transformers via Regularized Nonlocal Functionals |
Tam Nguyen et.al. |
2312.00751v1 |
null |
| 2023-12-01 |
Tight-minimal dichotomies in Banach spaces |
Alejandra C. Cáceres-Rigo et.al. |
2312.00721v1 |
null |
| 2023-12-01 |
GIFT: Generative Interpretable Fine-Tuning Transformers |
Chinmay Savadikar et.al. |
2312.00700v1 |
link |
| 2023-11-30 |
Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living |
Dominick Reilly et.al. |
2311.18840v1 |
null |
| 2023-11-30 |
TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios |
Lihao Liu et.al. |
2311.18839v1 |
null |
| 2023-11-30 |
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models |
Zhen Xing et.al. |
2311.18837v1 |
null |
| 2023-11-30 |
ART$\boldsymbol{\cdot}$V: Auto-Regressive Text-to-Video Generation with Diffusion Models |
Wenming Weng et.al. |
2311.18834v1 |
null |
| 2023-11-30 |
MotionEditor: Editing Video Motion via Content-Aware Diffusion |
Shuyuan Tu et.al. |
2311.18830v1 |
link |
| 2023-11-30 |
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation |
Yanhui Wang et.al. |
2311.18829v1 |
null |
| 2023-11-30 |
Motion-Conditioned Image Animation for Video Editing |
Wilson Yan et.al. |
2311.18827v1 |
null |
| 2023-11-30 |
CAST: Cross-Attention in Space and Time for Video Action Recognition |
Dongho Lee et.al. |
2311.18825v1 |
link |
| 2023-11-30 |
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking |
Kaifeng Lyu et.al. |
2311.18817v1 |
link |
| 2023-11-30 |
BIOCLIP: A Vision Foundation Model for the Tree of Life |
Samuel Stevens et.al. |
2311.18803v1 |
null |
| 2023-11-30 |
Do text-free diffusion models learn discriminative visual representations? |
Soumik Mukhopadhyay et.al. |
2311.17921v2 |
null |
| 2023-11-29 |
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving |
Yuqi Wang et.al. |
2311.17918v1 |
link |
| 2023-11-29 |
HUGS: Human Gaussian Splats |
Muhammed Kocabas et.al. |
2311.17910v1 |
null |
| 2023-11-29 |
SODA: Bottleneck Diffusion Models for Representation Learning |
Drew A. Hudson et.al. |
2311.17901v1 |
null |
| 2023-11-30 |
Knowledge Pursuit Prompting for Zero-Shot Multimodal Synthesis |
Jinqi Luo et.al. |
2311.17898v2 |
null |
| 2023-11-29 |
On the geometry of tensor products over finite fields |
Stefano Lia et.al. |
2311.17896v1 |
null |
| 2023-11-29 |
Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation |
Shuangrui Ding et.al. |
2311.17893v1 |
null |
| 2023-11-29 |
TSDF-Sampling: Efficient Sampling for Neural Surface Field using Truncated Signed Distance Field |
Chaerin Min et.al. |
2311.17878v1 |
null |
| 2023-11-29 |
Enhancing Post-Hoc Explanation Benchmark Reliability for Image Classification |
Tristan Gomez et.al. |
2311.17876v1 |
null |
| 2023-11-29 |
On the Adversarial Robustness of Graph Contrastive Learning Methods |
Filippo Guerranti et.al. |
2311.17853v1 |
null |
| 2023-11-28 |
Panoptic Video Scene Graph Generation |
Jingkang Yang et.al. |
2311.17058v1 |
link |
| 2023-11-28 |
Self-Supervised Motion Magnification by Backpropagating Through Optical Flow |
Zhaoying Pan et.al. |
2311.17056v1 |
null |
| 2023-11-28 |
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training |
Pavan Kumar Anasosalu Vasu et.al. |
2311.17049v1 |
null |
| 2023-11-28 |
Jets of foliations and $b^k$-algebroids |
Francis Bischoff et.al. |
2311.17045v1 |
null |
| 2023-11-28 |
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models |
Yanwei Li et.al. |
2311.17043v1 |
link |
| 2023-11-29 |
Efficient In-Context Learning in Vision-Language Models for Egocentric Videos |
Keunwoo Peter Yu et.al. |
2311.17041v2 |
null |
| 2023-11-28 |
Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer |
Danah Yatim et.al. |
2311.17009v1 |
null |
| 2023-11-28 |
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark |
Kunchang Li et.al. |
2311.17005v1 |
link |
| 2023-11-28 |
Mirković-Vilonen Polytopes from Combinatorics |
Mario Sanchez et.al. |
2311.16979v1 |
null |
| 2023-11-28 |
Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis |
Aman Yadav et.al. |
2311.16965v1 |
null |
| 2023-11-28 |
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models |
Munan Ning et.al. |
2311.16103v2 |
link |
| 2023-11-27 |
GART: Gaussian Articulated Template Models |
Jiahui Lei et.al. |
2311.16099v1 |
null |
| 2023-11-27 |
On Bringing Robots Home |
Nur Muhammad Mahi Shafiullah et.al. |
2311.16098v1 |
link |
| 2023-11-27 |
CG-HOI: Contact-Guided 3D Human-Object Interaction Generation |
Christian Diller et.al. |
2311.16097v1 |
null |
| 2023-11-27 |
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling |
Zhe Li et.al. |
2311.16096v1 |
link |
| 2023-11-27 |
Three-dimensional $\mathbb{Z}$ topological insulators without reflection symmetry |
Alexander C. Tyner et.al. |
2311.16092v1 |
null |
| 2023-11-27 |
BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification |
Dmitri Roussinov et.al. |
2311.16083v1 |
link |
| 2023-11-27 |
ViT-Lens-2: Gateway to Omni-modal Intelligence |
Weixian Lei et.al. |
2311.16081v1 |
link |
| 2023-11-27 |
Correlated Spectral and Recurrence Variations of Cygnus X-1 |
E. M. Broadbent et.al. |
2311.16070v1 |
null |
| 2023-11-27 |
DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization |
Zhaoyang Xia et.al. |
2311.16060v1 |
link |
| 2023-11-24 |
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation |
Lingchen Meng et.al. |
2311.14671v1 |
link |
| 2023-11-24 |
JetLOV: Enhancing Jet Tree Tagging through Neural Network Learning of Optimal LundNet Variables |
Mauricio A. Diaz et.al. |
2311.14654v1 |
link |
| 2023-11-24 |
Learning in Deep Factor Graphs with Gaussian Belief Propagation |
Seth Nabarro et.al. |
2311.14649v1 |
null |
| 2023-11-24 |
Continuous football player tracking from discrete broadcast data |
Matthew J. Penn et.al. |
2311.14642v1 |
null |
| 2023-11-24 |
Emergent Topology in Many-Body Dissipative Quantum Chaos |
Antonio M. García-García et.al. |
2311.14640v1 |
null |
| 2023-11-24 |
Unsupervised high-throughput segmentation of cells and cell nuclei in quantitative phase images |
Julia Sistermanns et.al. |
2311.14639v1 |
null |
| 2023-11-24 |
ARIA: On the interaction between Architectures, Aggregation methods and Initializations in federated visual classification |
Vasilis Siomos et.al. |
2311.14625v1 |
null |
| 2023-11-24 |
Neural Style Transfer for Computer Games |
Eleftherios Ioannou et.al. |
2311.14617v1 |
null |
| 2023-11-24 |
Animate124: Animating One Image to 4D Dynamic Scene |
Yuyang Zhao et.al. |
2311.14603v1 |
null |
| 2023-11-24 |
A Metalearned Neural Circuit for Nonparametric Bayesian Inference |
Jake C. Snell et.al. |
2311.14601v1 |
link |
| 2023-11-22 |
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space |
Katja Schwarz et.al. |
2311.13570v1 |
null |
| 2023-11-22 |
Belted sum decompositions of fully augmented links |
Porter Morgan et.al. |
2311.13540v1 |
null |
| 2023-11-22 |
Learned Nonlinear Predictor for Critically Sampled 3D Point Cloud Attribute Compression |
Tam Thuc Do et.al. |
2311.13539v1 |
null |
| 2023-11-22 |
Leveraging CNNs and Ensemble Learning for Automated Disaster Image Classification |
Archit Rathod et.al. |
2311.13531v1 |
null |
| 2023-11-22 |
Applying Dimensionality Reduction as Precursor to LSTM-CNN Models for Classifying Imagery and Motor Signals in ECoG-Based BCIs |
Soham Bafana et.al. |
2311.13507v1 |
link |
| 2023-11-22 |
Current Topological and Machine Learning Applications for Bias Detection in Text |
Colleen Farrelly et.al. |
2311.13495v1 |
null |
| 2023-11-22 |
Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning |
Bhavya Mehta et.al. |
2311.13490v1 |
null |
| 2023-11-22 |
Deep-learning-based acceleration of MRI for radiotherapy planning of pediatric patients with brain tumors |
Shahinur Alam et.al. |
2311.13485v1 |
link |
| 2023-11-22 |
Solution discovery via reconfiguration for problems in P |
Mario Grobler et.al. |
2311.13478v1 |
null |
| 2023-11-22 |
Experimentation in Early-Stage Video Game Startups: Practices and Challenges |
Henry Edison et.al. |
2311.13462v1 |
null |
| 2023-11-21 |
Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models |
David Stotko et.al. |
2311.12796v1 |
null |
| 2023-11-21 |
Quantifying Impairment and Disease Severity Using AI Models Trained on Healthy Subjects |
Boyang Yu et.al. |
2311.12781v1 |
link |
| 2023-11-21 |
Swift Parameter-free Attention Network for Efficient Super-Resolution |
Cheng Wan et.al. |
2311.12770v1 |
link |
| 2023-11-22 |
Investigating Weight-Perturbed Deep Neural Networks With Application in Iris Presentation Attack Detection |
Renu Sharma et.al. |
2311.12764v2 |
link |
| 2023-11-21 |
High-resolution Image-based Malware Classification using Multiple Instance Learning |
Tim Peters et.al. |
2311.12760v1 |
link |
| 2023-11-21 |
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction |
Yuanhui Huang et.al. |
2311.12754v1 |
link |
| 2023-11-21 |
Image Transformation for IoT Time-Series Data: A Review |
Duygu Altunkaya et.al. |
2311.12742v1 |
null |
| 2023-11-21 |
Exploring Graph Classification Techniques Under Low Data Constraints: A Comprehensive Study |
Kush Kothari et.al. |
2311.12737v1 |
null |
| 2023-11-21 |
Not Just Training, Also Testing: High School Youths' Perspective-Taking through Peer Testing Machine Learning-Powered Applications |
L. Morales-Navarro et.al. |
2311.12733v1 |
null |
| 2023-11-21 |
Cascade Learning Localises Discriminant Features in Visual Scene Classification |
Junwen Wang et.al. |
2311.12704v1 |
null |
| 2023-11-20 |
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation |
Wenhao Li et.al. |
2311.12028v1 |
null |
| 2023-11-20 |
GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration |
Naoki Wake et.al. |
2311.12015v1 |
null |
| 2023-11-20 |
Evaluating Supervision Levels Trade-Offs for Infrared-Based People Counting |
David Latortue et.al. |
2311.11974v1 |
null |
| 2023-11-20 |
SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks |
Jin Ye et.al. |
2311.11969v1 |
link |
| 2023-11-20 |
Correlated Attention in Transformers for Multivariate Time Series |
Quang Minh Nguyen et.al. |
2311.11959v1 |
null |
| 2023-11-20 |
Tubular Curvature Filter: Implicit Pointwise Curvature Calculation Method for Tubular Objects |
Elifnur Sunger et.al. |
2311.11931v1 |
null |
| 2023-11-20 |
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions |
Songhao Han et.al. |
2311.11904v1 |
null |
| 2023-11-20 |
Multimodal Characterization of Emotion within Multimedia Space |
Dayo Samuel Banjo et.al. |
2311.11892v1 |
null |
| 2023-11-20 |
SniffyArt: The Dataset of Smelling Persons |
Mathias Zinnen et.al. |
2311.11888v1 |
null |
| 2023-11-20 |
Multi-Task Faces (MTF) Data Set: A Legally and Ethically Compliant Collection of Face Images for Various Classification Tasks |
Rami Haffar et.al. |
2311.11882v1 |
link |
| 2023-11-17 |
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning |
Rohit Girdhar et.al. |
2311.10709v1 |
null |
| 2023-11-17 |
SpACNN-LDVAE: Spatial Attention Convolutional Latent Dirichlet Variational Autoencoder for Hyperspectral Pixel Unmixing |
Soham Chitnis et.al. |
2311.10701v1 |
null |
| 2023-11-17 |
A note on the convergence of the Bayesian entropy estimator for exchangeable partitions |
Servet Martinez et.al. |
2311.10698v1 |
null |
| 2023-11-17 |
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections |
Lihan Zha et.al. |
2311.10678v1 |
link |
| 2023-11-17 |
3D-TexSeg: Unsupervised Segmentation of 3D Texture using Mutual Transformer Learning |
Iyyakutti Iyappan Ganapathi et.al. |
2311.10651v1 |
null |
| 2023-11-17 |
User Dynamics-Aware Edge Caching and Computing for Mobile Virtual Reality |
Mushu Li et.al. |
2311.10645v1 |
null |
| 2023-11-17 |
Image-Domain Material Decomposition for Dual-energy CT using Unsupervised Learning with Data-fidelity Loss |
Junbo Peng et.al. |
2311.10641v1 |
null |
| 2023-11-17 |
Scaling TabPFN: Sketching and Feature Selection for Tabular Prior-Data Fitted Networks |
Benjamin Feuer et.al. |
2311.10609v1 |
null |
| 2023-11-17 |
Designing Reconfigurable Intelligent Systems with Markov Blankets |
Boris Sedlak et.al. |
2311.10597v1 |
null |
| 2023-11-17 |
FOCAL: A Cost-Aware Video Dataset for Active Learning |
Kiran Kokilepersaud et.al. |
2311.10591v1 |
link |
| 2023-11-16 |
Traffic Video Object Detection using Motion Prior |
Lihao Liu et.al. |
2311.10092v1 |
null |
| 2023-11-16 |
Moduli space of rank three logarithmic connections on the projective line with three poles |
Takafumi Matsumoto et.al. |
2311.10071v1 |
null |
| 2023-11-16 |
Inherently Interpretable Time Series Classification via Multiple Instance Learning |
Joseph Early et.al. |
2311.10049v1 |
link |
| 2023-11-16 |
On the potential of Carbon-Enhanced Metal-Poor stars for Galactic Archaeology |
Aruna Goswami et.al. |
2311.10043v1 |
null |
| 2023-11-16 |
Match and Locate: low-frequency monocular odometry based on deep feature matching |
Stepan Konev et.al. |
2311.10034v1 |
null |
| 2023-11-16 |
Revolutionizing Customer Interactions: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs |
Feriel Khennouche et.al. |
2311.09976v1 |
null |
| 2023-11-16 |
From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning |
Jiansong Zhang et.al. |
2311.09974v1 |
null |
| 2023-11-16 |
VertDetect: Fully End-to-End 3D Vertebral Instance Segmentation Model |
Geoff Klein et.al. |
2311.09958v1 |
null |
| 2023-11-16 |
Harnessing Transformers: A Leap Forward in Lung Cancer Image Detection |
Amine Bechar et.al. |
2311.09942v1 |
null |
| 2023-11-17 |
A Framework for Monitoring and Retraining Language Models in Real-World Applications |
Jaykumar Kasundra et.al. |
2311.09930v2 |
null |
| 2023-11-15 |
Single-Image 3D Human Digitization with Shape-Guided Diffusion |
Badour AlBahar et.al. |
2311.09221v1 |
null |
| 2023-11-15 |
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy |
Kirill Vishniakov et.al. |
2311.09215v1 |
link |
| 2023-11-15 |
Topology of Pulsar Profiles (ToPP). I. Graph theory method and classification of the EPN |
D. Vohl et.al. |
2311.09201v1 |
null |
| 2023-11-15 |
ExpM+NF: Differentially Private Machine Learning that Surpasses DPSGD |
Robert A. Bridges et.al. |
2311.09200v1 |
null |
| 2023-11-15 |
Domain Aligned CLIP for Few-shot Classification |
Muhammad Waleed Gondal et.al. |
2311.09191v1 |
null |
| 2023-11-15 |
ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models |
Jierui Li et.al. |
2311.09182v1 |
null |
| 2023-11-15 |
RBPGAN: Recurrent Back-Projection GAN for Video Super Resolution |
Dareen Hussein et.al. |
2311.09178v1 |
null |
| 2023-11-15 |
Model Agnostic Explainable Selective Regression via Uncertainty Estimation |
Andrea Pugnana et.al. |
2311.09145v1 |
null |
| 2023-11-15 |
Explainable Text Classification Techniques in Legal Document Review: Locating Rationales without Using Human Annotated Training Text Snippets |
Christian Mahoney et.al. |
2311.09133v1 |
null |
| 2023-11-15 |
Cross-view and Cross-pose Completion for 3D Human Understanding |
Matthieu Armando et.al. |
2311.09104v1 |
null |
| 2023-11-14 |
MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation |
Ehsan Asali et.al. |
2311.08393v1 |
null |
| 2023-11-14 |
USLR: an open-source tool for unbiased and smooth longitudinal registration of brain MR |
Adrià Casamitjana et.al. |
2311.08371v1 |
link |
| 2023-11-14 |
Inverse Learning with Extremely Sparse Feedback for Recommendation |
Guanyu Lin et.al. |
2311.08302v1 |
null |
| 2023-11-14 |
Level Set KSVD |
Omer Sapir et.al. |
2311.08284v1 |
null |
| 2023-11-14 |
TENT: Connect Language Models with IoT Sensors for Zero-Shot Activity Recognition |
Yunjiao Zhou et.al. |
2311.08245v1 |
null |
| 2023-11-14 |
MCMC to address model misspecification in Deep Learning classification of Radio Galaxies |
Devina Mohan et.al. |
2311.08243v1 |
null |
| 2023-11-14 |
Learning Physics-Inspired Regularization for Medical Image Registration with Hypernetworks |
Anna Reithmeir et.al. |
2311.08239v1 |
link |
| 2023-11-14 |
Counterfactual Explanation for Regression via Disentanglement in Latent Space |
Xuan Zhao et.al. |
2311.08228v1 |
null |
| 2023-11-14 |
Uni-COAL: A Unified Framework for Cross-Modality Synthesis and Super-Resolution of MR Images |
Zhiyun Song et.al. |
2311.08225v1 |
null |
| 2023-11-14 |
Eval-GCSC: A New Metric for Evaluating ChatGPT's Performance in Chinese Spelling Correction |
Kunting Li et.al. |
2311.08219v1 |
link |
| 2023-11-13 |
GPT-4V(ision) as A Social Media Analysis Engine |
Hanjia Lyu et.al. |
2311.07547v1 |
link |
| 2023-11-13 |
mlscorecheck: Testing the consistency of reported performance scores and experiments in machine learning |
György Kovács et.al. |
2311.07541v1 |
null |
| 2023-11-13 |
FEMDA: a unified framework for discriminant analysis |
Pierre Houdouin et.al. |
2311.07518v1 |
null |
| 2023-11-13 |
Reducing the Need for Backpropagation and Discovering Better Optima With Explicit Optimizations of Neural Networks |
Jake Ryland Williams et.al. |
2311.07498v1 |
null |
| 2023-11-13 |
Towards Robotic Tree Manipulation: Leveraging Graph Representations |
Chung Hee Kim et.al. |
2311.07479v1 |
null |
| 2023-11-13 |
Temporal Performance Prediction for Deep Convolutional Long Short-Term Memory Networks |
Laura Fieback et.al. |
2311.07477v1 |
null |
| 2023-11-13 |
Masked Face Dataset Generation and Masked Face Recognition |
Rui Cai et.al. |
2311.07475v1 |
link |
| 2023-11-13 |
A Bayesian Approach to Strong Lens Finding in the Era of Wide-area Surveys |
Philip Holloway et.al. |
2311.07455v1 |
null |
| 2023-11-13 |
On the Robustness of Neural Collapse and the Neural Collapse of Robustness |
Jingtong Su et.al. |
2311.07444v1 |
null |
| 2023-11-13 |
Optimising Human-AI Collaboration by Learning Convincing Explanations |
Alex J. Chan et.al. |
2311.07426v1 |
null |
| 2023-11-10 |
Learning Human Action Recognition Representations Without Real Humans |
Howard Zhong et.al. |
2311.06231v1 |
link |
| 2023-11-10 |
Semantic-aware Video Representation for Few-shot Action Recognition |
Yutao Tang et.al. |
2311.06218v1 |
null |
| 2023-11-10 |
MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things |
Shentong Mo et.al. |
2311.06217v1 |
null |
| 2023-11-10 |
Deep learning segmentation of fibrous cap in intravascular optical coherence tomography images |
Juhwan Lee et.al. |
2311.06202v1 |
null |
| 2023-11-10 |
An Automated Pipeline for Tumour-Infiltrating Lymphocyte Scoring in Breast Cancer |
Adam J Shephard et.al. |
2311.06185v1 |
link |
| 2023-11-10 |
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers |
Saurav Sengupta et.al. |
2311.06176v1 |
null |
| 2023-11-10 |
Two vertex geometrically irreducible algebras |
Grzegorz Bobinski et.al. |
2311.06173v1 |
null |
| 2023-11-10 |
Time Scale Network: A Shallow Neural Network For Time Series Data |
Trevor Meyer et.al. |
2311.06170v1 |
null |
| 2023-11-10 |
Deep Fast Vision: A Python Library for Accelerated Deep Transfer Learning Vision Prototyping |
Fabi Prezja et.al. |
2311.06169v1 |
link |
| 2023-11-10 |
Going beyond persistent homology using persistent homology |
Johanna Immonen et.al. |
2311.06152v1 |
null |
| 2023-11-09 |
FogROS2-Sky: Optimizing Latency and Cost for Multi-Cloud Robot Applications |
Kaiyuan Chen et.al. |
2311.05600v1 |
null |
| 2023-11-09 |
A Coefficient Makes SVRG Effective |
Yida Yin et.al. |
2311.05589v1 |
link |
| 2023-11-09 |
Outlier-Robust Wasserstein DRO |
Sloan Nietert et.al. |
2311.05573v1 |
link |
| 2023-11-09 |
Exploring Emotion Expression Recognition in Older Adults Interacting with a Virtual Coach |
Cristina Palmero et.al. |
2311.05567v1 |
null |
| 2023-11-09 |
Disentangling Quantum and Classical Contributions in Hybrid Quantum Machine Learning Architectures |
Michael Kölle et.al. |
2311.05559v1 |
null |
| 2023-11-09 |
L-WaveBlock: A Novel Feature Extractor Leveraging Wavelets for Generative Adversarial Networks |
Mirat Shah et.al. |
2311.05548v1 |
null |
| 2023-11-09 |
BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis |
Hao-Bin Duan et.al. |
2311.05521v1 |
null |
| 2023-11-09 |
Dirichlet Active Learning |
Kevin Miller et.al. |
2311.05501v1 |
null |
| 2023-11-09 |
Retinal OCT Synthesis with Denoising Diffusion Probabilistic Models for Layer Segmentation |
Yuli Wu et.al. |
2311.05479v1 |
null |
| 2023-11-09 |
Robust Retraining-free GAN Fingerprinting via Personalized Normalization |
Jianwei Fei et.al. |
2311.05478v1 |
null |
| 2023-11-08 |
Towards Few-Annotation Learning in Computer Vision: Application to Image Classification and Object Detection tasks |
Quentin Bouniot et.al. |
2311.04888v1 |
null |
| 2023-11-08 |
Are foundation models efficient for medical image segmentation? |
Danielle Ferreira et.al. |
2311.04847v1 |
null |
| 2023-11-08 |
Bayesian multi-band fitting of alerts for kilonovae detection |
Biswajit Biswas et.al. |
2311.04845v1 |
null |
| 2023-11-08 |
Hierarchically Gated Recurrent Neural Network for Sequence Modeling |
Zhen Qin et.al. |
2311.04823v1 |
link |
| 2023-11-08 |
A Lightweight Architecture for Real-Time Neuronal-Spike Classification |
Muhammad Ali Siddiqi et.al. |
2311.04808v1 |
null |
| 2023-11-08 |
Determination of toxic comments and unintended model bias minimization using Deep learning approach |
Md Azim Khan et.al. |
2311.04789v1 |
null |
| 2023-11-08 |
VioLA: Aligning Videos to 2D LiDAR Scans |
Jun-Jee Chao et.al. |
2311.04783v1 |
null |
| 2023-11-08 |
FetMRQC: an open-source machine learning framework for multi-centric fetal brain MRI quality control |
Thomas Sanchez et.al. |
2311.04780v1 |
link |
| 2023-11-08 |
GCS-ICHNet: Assessment of Intracerebral Hemorrhage Prognosis using Self-Attention with Domain Knowledge Integration |
Xuhao Shan et.al. |
2311.04772v1 |
link |
| 2023-11-08 |
An attention-based deep learning network for predicting Platinum resistance in ovarian cancer |
Haoming Zhuang et.al. |
2311.04769v1 |
null |
| 2023-11-08 |
Video Instance Matting |
Jiachen Li et.al. |
2311.04212v2 |
link |
| 2023-11-07 |
JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction |
Zhongfen Deng et.al. |
2311.04196v1 |
link |
| 2023-11-07 |
Linear to circular conversion in the polarized radio emission of a magnetar |
Marcus E. Lower et.al. |
2311.04195v1 |
null |
| 2023-11-07 |
SpaDeLeF: A Dataset for Hierarchical Classification of Lexical Functions for Collocations in Spanish |
Yevhen Kostiuk et.al. |
2311.04189v1 |
null |
| 2023-11-07 |
A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis |
Dipanjyoti Paul et.al. |
2311.04157v1 |
link |
| 2023-11-07 |
Galaxy Spectra neural Network (GaSNet). II. Using Deep Learning for Spectral Classification and Redshift Predictions |
Fucheng Zhong et.al. |
2311.04146v1 |
null |
| 2023-11-07 |
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models |
Shiwei Zhang et.al. |
2311.04145v1 |
null |
| 2023-11-07 |
Modelling Sentiment Analysis: LLMs and data augmentation techniques |
Guillem Senabre Prades et.al. |
2311.04139v1 |
null |
| 2023-11-07 |
Improved Topological Preservation in 3D Axon Segmentation and Centerline Detection using Geometric Assessment-driven Topological Smoothing (GATS) |
Nina I. Shamsi et.al. |
2311.04116v1 |
null |
| 2023-11-07 |
Joint modelling of recurrent and terminal events with discretely-distributed non-parametric frailty: application on re-hospitalizations and death in heart failure patients |
Chiara Masci et.al. |
2311.04103v1 |
null |
| 2023-11-06 |
A Classification of Graphs through Quadratic Embedding Constants and Clique Graph Insights |
Edy Tri Baskoro et.al. |
2311.03342v1 |
null |
| 2023-11-06 |
Tackling Concept Shift in Text Classification using Entailment-style Modeling |
Sumegh Roychowdhury et.al. |
2311.03320v1 |
null |
| 2023-11-06 |
A Foundation Model for Music Informatics |
Minz Won et.al. |
2311.03318v1 |
link |
| 2023-11-06 |
FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data |
Lisa Weijler et.al. |
2311.03314v1 |
link |
| 2023-11-06 |
A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation |
Qitao Zhao et.al. |
2311.03312v1 |
null |
| 2023-11-06 |
Advancing Post Hoc Case Based Explanation with Feature Highlighting |
Eoin Kenny et.al. |
2311.03246v1 |
null |
| 2023-11-06 |
Machine Learning-Based Tea Leaf Disease Detection: A Comprehensive Review |
Faruk Ahmed et.al. |
2311.03240v1 |
null |
| 2023-11-06 |
Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources |
Haotian Zheng et.al. |
2311.03236v1 |
null |
| 2023-11-06 |
Segmentation of Drone Collision Hazards in Airborne RADAR Point Clouds Using PointNet |
Hector Arroyo et.al. |
2311.03221v1 |
null |
| 2023-11-06 |
Leveraging Transformers to Improve Breast Cancer Classification and Risk Assessment with Multi-modal and Longitudinal Data |
Yiqiu Shen et.al. |
2311.03217v1 |
null |
| 2023-11-03 |
LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery |
Weikang Wan et.al. |
2311.02058v1 |
null |
| 2023-11-03 |
MetaFast: Enabling Fast Metagenomic Classification via Seed Counting and Edit Distance Approximation |
Arvid E. Gollwitzer et.al. |
2311.02029v1 |
null |
| 2023-11-03 |
A Structured Pruning Algorithm for Model-based Deep Learning |
Chicago Park et.al. |
2311.02003v1 |
null |
| 2023-11-03 |
Detection of keratoconus Diseases using deep Learning |
AKM Enzam-Ul Haque et.al. |
2311.01996v1 |
null |
| 2023-11-03 |
Obtaining Explainable Classification Models using Distributionally Robust Optimization |
Sanjeeb Dash et.al. |
2311.01994v1 |
null |
| 2023-11-03 |
Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation |
Shichao Dong et.al. |
2311.01989v1 |
null |
| 2023-11-06 |
RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches |
Jiayuan Gu et.al. |
2311.01977v2 |
null |
| 2023-11-03 |
Welded graphs, Wirtinger groups and knotted punctured spheres |
Benjamin Audoux et.al. |
2311.01922v1 |
null |
| 2023-11-03 |
Contrast-Agnostic Groupwise Registration by Robust PCA for Quantitative Cardiac MRI |
Xinqi Li et.al. |
2311.01916v1 |
null |
| 2023-11-03 |
VQPy: An Object-Oriented Approach to Modern Video Analytics |
Shan Yu et.al. |
2311.01623v1 |
null |
| 2023-11-02 |
Tailoring Mixup to Data using Kernel Warping functions |
Quentin Bouniot et.al. |
2311.01434v1 |
link |
| 2023-11-02 |
Identifying Alzheimer Disease Dementia Levels Using Machine Learning Methods |
Md Gulzar Hussain et.al. |
2311.01428v1 |
null |
| 2023-11-02 |
Exploring Deep Learning Techniques for Glaucoma Detection: A Comprehensive Review |
Aized Amin Soofi et.al. |
2311.01425v1 |
null |
| 2023-11-02 |
Holistic Transfer: Towards Non-Disruptive Fine-Tuning with Partial Target Data |
Cheng-Hao Tu et.al. |
2311.01420v1 |
null |
| 2023-11-02 |
Learning to See Physical Properties with Active Sensing Motor Policies |
Gabriel B. Margolis et.al. |
2311.01405v1 |
null |
| 2023-11-02 |
Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors |
Gabriele M. Caddeo et.al. |
2311.01380v1 |
link |
| 2023-11-02 |
Deep learning based Image Compression for Microscopy Images: An Empirical Study |
Yu Zhou et.al. |
2311.01352v1 |
null |
| 2023-11-02 |
Unreading Race: Purging Protected Features from Chest X-ray Embeddings |
Tobias Weber et.al. |
2311.01349v1 |
null |
| 2023-11-02 |
Scattering Vision Transformer: Spectral Mixing Matters |
Badri N. Patro et.al. |
2311.01310v1 |
null |
| 2023-11-02 |
Hybrid-Fusion Transformer for Multisequence MRI |
Jihoon Cho et.al. |
2311.01308v1 |
null |
| 2023-11-01 |
Software Repositories and Machine Learning Research in Cyber Security |
Mounika Vanamala et.al. |
2311.00691v1 |
null |
| 2023-11-01 |
What User Behaviors Make the Differences During the Process of Visual Analytics? |
Shahin Doroudian et.al. |
2311.00690v1 |
null |
| 2023-11-01 |
Deep Learning-Based Classification of Gamma Photon Interactions in Room-Temperature Semiconductor Radiation Detectors |
Sandeep K. Chaudhuri et.al. |
2311.00682v1 |
null |
| 2023-11-01 |
Latent Space Translation via Semantic Alignment |
Valentino Maiorca et.al. |
2311.00664v1 |
link |
| 2023-11-01 |
Rediscussion of eclipsing binaries. Paper XV. The B-type supergiant system V1765 Cygni |
John Southworth et.al. |
2311.00655v1 |
null |
| 2023-11-02 |
Emergence of Collective Open-Ended Exploration from Decentralized Meta-Reinforcement Learning |
Richard Bornemann et.al. |
2311.00651v2 |
null |
| 2023-11-01 |
Understanding the Issues and Causes in WebAssembly Application Development: A Mining-based Study |
Muhammad Waseem et.al. |
2311.00646v1 |
null |
| 2023-11-01 |
A Bi-level Framework for Traffic Accident Duration Prediction: Leveraging Weather and Road Condition Data within a Practical Optimum Pipeline |
Rafat Tabassum Sukonna et.al. |
2311.00634v1 |
null |
| 2023-11-01 |
Controllable Music Production with Diffusion Models and Guidance Gradients |
Mark Levy et.al. |
2311.00613v1 |
null |
| 2023-11-01 |
A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT Images |
Ni Yao et.al. |
2311.00567v1 |
null |
| 2023-10-31 |
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders |
Srijan Das et.al. |
2310.20704v1 |
null |
| 2023-10-31 |
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction |
Xinyuan Chen et.al. |
2310.20700v1 |
null |
| 2023-10-31 |
StairNet: Visual Recognition of Stairs for Human-Robot Locomotion |
Andrew Garrett Kurbis et.al. |
2310.20666v1 |
null |
| 2023-10-31 |
Performance Improvement in Multi-class Classification via Automated Hierarchy Generation and Exploitation through Extended LCPN Schemes |
Celal Alagoz et.al. |
2310.20641v1 |
null |
| 2023-10-31 |
Deepfake detection by exploiting surface anomalies: the SurFake approach |
Andrea Ciamarra et.al. |
2310.20621v1 |
null |
| 2023-10-31 |
Enhanced Synthetic MRI Generation from CT Scans Using CycleGAN with Feature Extraction |
Saba Nikbakhsh et.al. |
2310.20604v1 |
null |
| 2023-10-31 |
Finiteness properties for Shimura curves and modified diagonal cycles |
Congling Qiu et.al. |
2310.20600v1 |
null |
| 2023-10-31 |
Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment |
Tahereh Toosi et.al. |
2310.20599v1 |
link |
| 2023-10-31 |
Tracially Complete C*-Algebras |
José R. Carrión et.al. |
2310.20594v1 |
null |
| 2023-10-31 |
Strongly Magnetized Tidal Disruption Event Disks via Stream Injection in GRMHD |
Brandon Curd et.al. |
2310.20592v1 |
null |
| 2023-10-29 |
Improved Motor Imagery Classification Using Adaptive Spatial Filters Based on Particle Swarm Optimization Algorithm |
Xiong Xiong et.al. |
2310.19202v1 |
null |
| 2023-10-29 |
Enhancing Motor Imagery Decoding in Brain Computer Interfaces using Riemann Tangent Space Mapping and Cross Frequency Coupling |
Xiong Xiong et.al. |
2310.19198v1 |
null |
| 2023-10-29 |
A Survey on Watching Social Issue Videos among YouTube and TikTok Users |
Shuo Niu et.al. |
2310.19193v1 |
null |
| 2023-10-29 |
Subjective Quality Evaluation of Point Clouds Using a Head Mounted Display |
Joao Prazeres et.al. |
2310.19179v1 |
null |
| 2023-10-29 |
Robustifying Language Models with Test-Time Adaptation |
Noah Thomas McDermott et.al. |
2310.19177v1 |
null |
| 2023-10-29 |
Predicting recovery following stroke: deep learning, multimodal data and feature selection using explainable AI |
Adam White et.al. |
2310.19174v1 |
null |
| 2023-10-29 |
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping |
Srikumar Sastry et.al. |
2310.19168v1 |
link |
| 2023-10-29 |
Unified Representation for Non-compositional and Compositional Expressions |
Ziheng Zeng et.al. |
2310.19127v1 |
null |
| 2023-10-29 |
Efficient IoT Inference via Context-Awareness |
Mohammad Mehdi Rastikerdar et.al. |
2310.19112v1 |
null |
| 2023-10-29 |
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models |
Shikhar Murty et.al. |
2310.19089v1 |
null |
| 2023-10-27 |
Addressing GAN Training Instabilities via Tunable Classification Losses |
Monica Welfert et.al. |
2310.18291v1 |
null |
| 2023-10-27 |
PlantPlotGAN: A Physics-Informed Generative Adversarial Network for Plant Disease Prediction |
Felipe A. Lopes et.al. |
2310.18268v1 |
null |
| 2023-10-27 |
MalFake: A Multimodal Fake News Identification for Malayalam using Recurrent Neural Networks and VGG-16 |
Adhish S. Sujan et.al. |
2310.18263v1 |
null |
| 2023-10-27 |
Edge AI-Based Vein Detector for Efficient Venipuncture in the Antecubital Fossa |
Edwin Salcedo et.al. |
2310.18234v1 |
null |
| 2023-10-27 |
TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis |
Ziquan Zhu et.al. |
2310.18222v1 |
null |
| 2023-10-27 |
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models |
Benjamin Feuer et.al. |
2310.18208v1 |
link |
| 2023-10-27 |
Artifact-Robust Graph-Based Learning in Digital Pathology |
Saba Heidari Gheshlaghi et.al. |
2310.18192v1 |
null |
| 2023-10-27 |
Globular clusters and bar: captured or not captured? |
Anton A. Smirnov et.al. |
2310.18172v1 |
null |
| 2023-10-27 |
Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN |
Neeraj Kumar et.al. |
2310.18169v1 |
null |
| 2023-10-27 |
DESiRED -- Dynamic, Enhanced, and Smart iRED: A P4-AQM with Deep Reinforcement Learning and In-band Network Telemetry |
Leandro C. de Almeida et.al. |
2310.18159v1 |
null |
| 2023-10-26 |
A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection |
Anas Al-lahham et.al. |
2310.17650v1 |
null |
| 2023-10-26 |
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP |
Yoshitomo Matsubara et.al. |
2310.17644v1 |
link |
| 2023-10-26 |
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models |
Tsun-Hsuan Wang et.al. |
2310.17642v1 |
null |
| 2023-10-26 |
Skew Products on the Berkovich Projective Line |
Richard A. P. Birkett et.al. |
2310.17628v1 |
null |
| 2023-10-26 |
A Survey on Transferability of Adversarial Examples across Deep Neural Networks |
Jindong Gu et.al. |
2310.17626v1 |
link |
| 2023-10-26 |
MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations |
Ajay Mandlekar et.al. |
2310.17596v1 |
null |
| 2023-10-26 |
Linear $x$-coordinate relations of triples on elliptic curves |
Jerson Caro et.al. |
2310.17592v1 |
null |
| 2023-10-26 |
A minimax optimal control approach for robust neural ODEs |
Cristina Cipriani et.al. |
2310.17584v1 |
null |
| 2023-10-26 |
BLIS-Net: Classifying and Analyzing Signals on Graphs |
Charles Xu et.al. |
2310.17579v1 |
null |
| 2023-10-26 |
Knots bounding non-isotopic ribbon disks |
Jeffrey Meier et.al. |
2310.17564v1 |
null |
| 2023-10-25 |
RDBench: ML Benchmark for Relational Databases |
Zizhao Zhang et.al. |
2310.16837v1 |
link |
| 2023-10-25 |
TD-MPC2: Scalable, Robust World Models for Continuous Control |
Nicklas Hansen et.al. |
2310.16828v1 |
null |
| 2023-10-26 |
Deep machine learning for meteor monitoring: advances with transfer learning and gradient-weighted class activation mapping |
Eloy Peña-Asensio et.al. |
2310.16826v2 |
null |
| 2023-10-25 |
Uncovering a new group of T Tauri stars in the Taurus-Auriga molecular complex from Gaia and GALEX data |
Ana Inés Gómez de Castro et.al. |
2310.16820v1 |
null |
| 2023-10-25 |
Using Diffusion Models to Generate Synthetic Labelled Data for Medical Image Segmentation |
Daniel Saragih et.al. |
2310.16794v1 |
null |
| 2023-10-25 |
Navigating Socio-Emotional Risk through Comfort-Building in a Physics Teaching Community of Practice: A Case Study |
Maggie Mahmood et.al. |
2310.16778v1 |
null |
| 2023-10-25 |
IntenDD: A Unified Contrastive Learning Approach for Intent Detection and Discovery |
Bhavuk Singhal et.al. |
2310.16761v1 |
null |
| 2023-10-25 |
Interferometric Neural Networks |
Arun Sehrawat et.al. |
2310.16742v1 |
link |
| 2023-10-25 |
A No-Reference Quality Assessment Method for Digital Human Head |
Yingjie Zhou et.al. |
2310.16732v1 |
null |
| 2023-10-25 |
Spherical Wavefront Near-Field DoA Estimation in THz Automotive Radar |
Ahmet M. Elbir et.al. |
2310.16724v1 |
null |
| 2023-10-24 |
From Posterior Sampling to Meaningful Diversity in Image Restoration |
Noa Cohen et.al. |
2310.16047v1 |
null |
| 2023-10-24 |
Finetuning Offline World Models in the Real World |
Yunhai Feng et.al. |
2310.16029v1 |
null |
| 2023-10-24 |
Human-in-the-Loop Task and Motion Planning for Imitation Learning |
Ajay Mandlekar et.al. |
2310.16014v1 |
null |
| 2023-10-24 |
CVPR 2023 Text Guided Video Editing Competition |
Jay Zhangjie Wu et.al. |
2310.16003v1 |
null |
| 2023-10-24 |
Vision-Language Pseudo-Labels for Single-Positive Multi-Label Learning |
Xin Xing et.al. |
2310.15985v1 |
link |
| 2023-10-24 |
Geometry-Aware Video Quality Assessment for Dynamic Digital Human |
Zicheng Zhang et.al. |
2310.15984v1 |
null |
| 2023-10-24 |
Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees |
Verónica Álvarez et.al. |
2310.15974v1 |
link |
| 2023-10-24 |
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection |
Manyuan Zhang et.al. |
2310.15955v1 |
null |
| 2023-10-25 |
Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles |
Xing Shen et.al. |
2310.15952v2 |
null |
| 2023-10-24 |
ShARc: Shape and Appearance Recognition for Person Identification In-the-wild |
Haidong Zhu et.al. |
2310.15946v1 |
null |
| 2023-10-23 |
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling |
Haonan Qiu et.al. |
2310.15169v1 |
null |
| 2023-10-23 |
Bitrate Ladder Prediction Methods for Adaptive Video Streaming: A Review and Benchmark |
Ahmed Telili et.al. |
2310.15163v1 |
null |
| 2023-10-23 |
Linear Representations of Sentiment in Large Language Models |
Curt Tigges et.al. |
2310.15154v1 |
null |
| 2023-10-23 |
Unlocking the Transferability of Tokens in Deep Models for Tabular Data |
Qi-Le Zhou et.al. |
2310.15149v1 |
null |
| 2023-10-23 |
When Should the FDA Inspect Pharmaceutical Manufacturing Facilities to Better Mitigate Drug Shortages? |
Daniel Kosmas et.al. |
2310.15146v1 |
null |
| 2023-10-23 |
Novel-View Acoustic Synthesis from 3D Reconstructed Rooms |
Byeongjoo Ahn et.al. |
2310.15130v1 |
link |
| 2023-10-23 |
Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models |
Gabriel Sarch et.al. |
2310.15127v1 |
null |
| 2023-10-23 |
SpVOS: Efficient Video Object Segmentation with Triple Sparse Convolution |
Weihao Lin et.al. |
2310.15115v1 |
null |
| 2023-10-23 |
The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills |
Qingxiao Zheng et.al. |
2310.15112v1 |
null |
| 2023-10-23 |
Matryoshka Diffusion Models |
Jiatao Gu et.al. |
2310.15111v1 |
null |
| 2023-10-20 |
Using Human-like Mechanism to Weaken Effect of Pre-training Weight Bias in Face-Recognition Convolutional Neural Network |
Haojiang Ying et.al. |
2310.13674v1 |
null |
| 2023-10-23 |
Explainable Depression Symptom Detection in Social Media |
Eliseo Bao Souto et.al. |
2310.13664v2 |
null |
| 2023-10-20 |
Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification |
Amr Keleg et.al. |
2310.13661v1 |
link |
| 2023-10-20 |
Optimal Transport for Measures with Noisy Tree Metric |
Tam Le et.al. |
2310.13653v1 |
null |
| 2023-10-20 |
Principal $2$-blocks with wreathed defect groups up to splendid Morita equivalence |
Shigeo Koshitani et.al. |
2310.13621v1 |
null |
| 2023-10-20 |
Skin Lesion Segmentation Improved by Transformer-based Networks with Inter-scale Dependency Modeling |
Sania Eskandari et.al. |
2310.13604v1 |
link |
| 2023-10-20 |
Classification of quantum states of light using random measurements through a multimode fiber |
Saroch Leedumrongwatthanakun et.al. |
2310.13599v1 |
null |
| 2023-10-20 |
Longer-range Contextualized Masked Autoencoder |
Taekyung Kim et.al. |
2310.13593v1 |
null |
| 2023-10-20 |
POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization |
Elahe Vahdani et.al. |
2310.13585v1 |
null |
| 2023-10-20 |
Progressive Dual Priori Network for Generalized Breast Tumor Segmentation |
Li Wang et.al. |
2310.13574v1 |
null |
| 2023-10-19 |
Putting the Object Back into Video Object Segmentation |
Ho Kei Cheng et.al. |
2310.12982v1 |
link |
| 2023-10-19 |
Variational Inference for SDEs Driven by Fractional Noise |
Rembert Daems et.al. |
2310.12975v1 |
null |
| 2023-10-19 |
Frozen Transformers in Language Models Are Effective Visual Encoder Layers |
Ziqi Pang et.al. |
2310.12973v1 |
link |
| 2023-10-19 |
Bialgebra structures on flat Lie algebras |
Amine Bahayou et.al. |
2310.12966v1 |
null |
| 2023-10-19 |
End-to-End Delay Minimization based on Joint Optimization of DNN Partitioning and Resource Allocation for Cooperative Edge Inference |
Xinrui Ye et.al. |
2310.12937v1 |
null |
| 2023-10-19 |
Digital Twin-Enabled Intelligent DDoS Detection Mechanism for Autonomous Core Networks |
Yagmur Yigit et.al. |
2310.12924v1 |
null |
| 2023-10-19 |
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning |
Juan Rocamonde et.al. |
2310.12921v1 |
null |
| 2023-10-19 |
Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey |
Oriane Siméoni et.al. |
2310.12904v1 |
link |
| 2023-10-19 |
A Markovian dynamics for $C. elegans$ behavior across scales |
Antonio C. Costa et.al. |
2310.12883v1 |
link |
| 2023-10-19 |
Perceptual Assessment and Optimization of High Dynamic Range Image Rendering |
Peibei Cao et.al. |
2310.12877v1 |
null |
| 2023-10-18 |
SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks |
Mohammadreza Salehi et.al. |
2310.12126v1 |
null |
| 2023-10-18 |
Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture |
Daniel Y. Fu et.al. |
2310.12109v1 |
null |
| 2023-10-18 |
HSTR-Net: Reference Based Video Super-resolution for Aerial Surveillance with Dual Cameras |
H. Umut Suluhan et.al. |
2310.12092v1 |
null |
| 2023-10-18 |
Chemical Analysis of the Brightest Star of the Cetus II Ultra-Faint Dwarf Galaxy Candidate |
K. B. Webber et.al. |
2310.12090v1 |
null |
| 2023-10-18 |
One-Shot Imitation Learning: A Pose Estimation Perspective |
Pietro Vitiello et.al. |
2310.12077v1 |
null |
| 2023-10-18 |
Exploring Fairness in Pre-trained Visual Transformer based Natural and GAN Generated Image Detection Systems and Understanding the Impact of Image Compression in Fairness |
Manjary P. Gangan et.al. |
2310.12076v1 |
null |
| 2023-10-18 |
Black-Box Training Data Identification in GANs via Detector Networks |
Lukman Olagoke et.al. |
2310.12063v1 |
null |
| 2023-10-19 |
Robust Class-Conditional Distribution Alignment for Partial Domain Adaptation |
Sandipan Choudhuri et.al. |
2310.12060v2 |
null |
| 2023-10-18 |
Exact and efficient solutions of the LMC Multitask Gaussian Process model |
Olivier Truffinet et.al. |
2310.12032v1 |
link |
| 2023-10-18 |
CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation |
Philipp Borchert et.al. |
2310.12024v1 |
link |
| 2023-10-17 |
DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis |
Youngjoong Kwon et.al. |
2310.11449v1 |
null |
| 2023-10-18 |
4K4D: Real-Time 4D View Synthesis at 4K Resolution |
Zhen Xu et.al. |
2310.11448v2 |
null |
| 2023-10-18 |
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models |
Yaofang Liu et.al. |
2310.11440v2 |
null |
| 2023-10-17 |
Transitive generalized toggle groups containing a cycle |
Jonathan S. Bloom et.al. |
2310.11387v1 |
null |
| 2023-10-17 |
DialogueLLM: Context and Emotion Knowledge-Tuned LLaMA Models for Emotion Recognition in Conversations |
Yazhou Zhang et.al. |
2310.11374v1 |
null |
| 2023-10-17 |
VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights |
Shanshan Xu et.al. |
2310.11368v1 |
null |
| 2023-10-17 |
Lie Group Decompositions for Equivariant Neural Networks |
Mircea Mironenco et.al. |
2310.11366v1 |
null |
| 2023-10-17 |
Hybrid quantum-classical graph neural networks for tumor classification in digital pathology |
Anupama Ray et.al. |
2310.11353v1 |
null |
| 2023-10-17 |
The effect of stemming and lemmatization on Portuguese fake news text classification |
Lucca de Freitas Santos et.al. |
2310.11344v1 |
null |
| 2023-10-17 |
Influencing factors on false positive rates when classifying tumor cell line response to drug treatment |
Priyanka Vasanthakumari et.al. |
2310.11329v1 |
null |
| 2023-10-16 |
A Survey on Video Diffusion Models |
Zhen Xing et.al. |
2310.10647v1 |
link |
| 2023-10-16 |
Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting |
Zeyu Yang et.al. |
2310.10642v1 |
link |
| 2023-10-16 |
Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models |
Kevin Black et.al. |
2310.10639v1 |
null |
| 2023-10-16 |
Efficacy of Dual-Encoders for Extreme Multi-Label Classification |
Nilesh Gupta et.al. |
2310.10636v1 |
null |
| 2023-10-16 |
Overcoming the Rayleigh limit in extremely low SNR |
Hyunsoo Choi et.al. |
2310.10633v1 |
null |
| 2023-10-16 |
Video Language Planning |
Yilun Du et.al. |
2310.10625v1 |
null |
| 2023-10-16 |
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing |
Jia-Wei Liu et.al. |
2310.10624v1 |
null |
| 2023-10-16 |
BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation |
Ji Qi et.al. |
2310.10586v1 |
null |
| 2023-10-16 |
RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets |
Zhicheng Cai et.al. |
2310.10563v1 |
link |
| 2023-10-16 |
Deep learning applied to EEG data with different montages using spatial attention |
Dung Truong et.al. |
2310.10550v1 |
null |
| 2023-10-13 |
An Unbiased Look at Datasets for Visuo-Motor Pre-Training |
Sudeep Dasari et.al. |
2310.09289v1 |
null |
| 2023-10-13 |
Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning |
Geri Skenderi et.al. |
2310.09278v1 |
null |
| 2023-10-13 |
A Hybrid Approach for Depression Classification: Random Forest-ANN Ensemble on Motor Activity Signals |
Anket Patil et.al. |
2310.09277v1 |
null |
| 2023-10-13 |
PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming |
Chufan Gao et.al. |
2310.09265v1 |
null |
| 2023-10-13 |
Political claim identification and categorization in a multilingual setting: First experiments |
Urs Zaberer et.al. |
2310.09256v1 |
null |
| 2023-10-13 |
It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models |
Lin Chen et.al. |
2310.09250v1 |
null |
| 2023-10-13 |
A Multifaceted Look at Starlink Performance |
Nitinder Mohan et.al. |
2310.09242v1 |
null |
| 2023-10-13 |
Time CNN and Graph Convolution Network for Epileptic Spike Detection in MEG Data |
Pauline Mouches et.al. |
2310.09236v1 |
null |
| 2023-10-13 |
Ultrasound Image Segmentation of Thyroid Nodule via Latent Semantic Feature Co-Registration |
Xuewei Li et.al. |
2310.09221v1 |
null |
| 2023-10-13 |
PaLI-3 Vision Language Models: Smaller, Faster, Stronger |
Xi Chen et.al. |
2310.09199v1 |
null |
| 2023-10-12 |
Octopus: Embodied Vision-Language Programmer from Environmental Feedback |
Jingkang Yang et.al. |
2310.08588v1 |
link |
| 2023-10-12 |
Is Generalized Dynamic Novel View Synthesis from Monocular Videos Possible Today? |
Xiaoming Zhao et.al. |
2310.08587v1 |
null |
| 2023-10-12 |
Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes |
Haotong Lin et.al. |
2310.08585v1 |
null |
| 2023-10-12 |
Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video |
Shashanka Venkataramanan et.al. |
2310.08584v1 |
null |
| 2023-10-12 |
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy |
Zichen Zhang et.al. |
2310.08581v1 |
null |
| 2023-10-12 |
Learning to Act from Actionless Videos through Dense Correspondences |
Po-Chen Ko et.al. |
2310.08576v1 |
null |
| 2023-10-12 |
Effective isometries of periodic shells |
Hussein Nassar et.al. |
2310.08531v1 |
null |
| 2023-10-12 |
LLM-augmented Preference Learning from Natural Language |
Inwon Kang et.al. |
2310.08523v1 |
null |
| 2023-10-12 |
Impact of time and note duration tokenizations on deep learning symbolic music modeling |
Nathan Fradet et.al. |
2310.08497v1 |
link |
| 2023-10-12 |
GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models |
Yuanchun Shen et.al. |
2310.08487v1 |
link |
| 2023-10-11 |
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models |
Yingqing He et.al. |
2310.07702v1 |
link |
| 2023-10-11 |
ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation |
Bo Peng et.al. |
2310.07697v1 |
null |
| 2023-10-11 |
Large-scale photonic computing with nonlinear disordered media |
Hao Wang et.al. |
2310.07690v1 |
null |
| 2023-10-11 |
Deep Video Inpainting Guided by Audio-Visual Self-Supervision |
Kyuyeon Kim et.al. |
2310.07663v1 |
null |
| 2023-10-11 |
Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals |
Eleonora Lopez et.al. |
2310.07648v1 |
null |
| 2023-10-11 |
Attention-Map Augmentation for Hypercomplex Breast Cancer Classification |
Eleonora Lopez et.al. |
2310.07633v1 |
null |
| 2023-10-11 |
Differentiable Euler Characteristic Transforms for Shape Classification |
Ernst Roell et.al. |
2310.07630v1 |
link |
| 2023-10-11 |
Time-Resolved Reconstruction of Motion, Force, and Stiffness using Spectro-Dynamic MRI |
Max H. C. van Riel et.al. |
2310.07622v1 |
null |
| 2023-10-11 |
Reinforcement Learning-based Knowledge Graph Reasoning for Explainable Fact-checking |
Gustav Nikopensius et.al. |
2310.07613v1 |
null |
| 2023-10-11 |
QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking |
Liangming Pan et.al. |
2310.07609v1 |
link |
| 2023-10-10 |
Convivial Solipsism as a maximally perspectival interpretation |
Herve Zwirn et.al. |
2310.06815v1 |
null |
| 2023-10-10 |
A Supervised Embedding and Clustering Anomaly Detection method for classification of Mobile Network Faults |
R. Mosayebi et.al. |
2310.06779v1 |
null |
| 2023-10-10 |
Optical assembly of nanostructures mediated by surface roughness |
Robert G. Felsted et.al. |
2310.06774v1 |
null |
| 2023-10-10 |
Uni3D: Exploring Unified 3D Representation at Scale |
Junsheng Zhou et.al. |
2310.06773v1 |
link |
| 2023-10-10 |
Improved convergence rates for some kernel random forest algorithms |
Isidoros Iakovidis et.al. |
2310.06760v1 |
null |
| 2023-10-10 |
Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks |
Marc Rußwurm et.al. |
2310.06743v1 |
link |
| 2023-10-10 |
Multi-domain improves out-of-distribution and data-limited scenarios for medical image analysis |
Ece Ozkan et.al. |
2310.06737v1 |
null |
| 2023-10-10 |
S4Sleep: Elucidating the design space of deep-learning-based sleep stage classification models |
Tiezhi Wang et.al. |
2310.06715v1 |
link |
| 2023-10-10 |
Tertiary Lymphoid Structures Generation through Graph-based Diffusion |
Manuel Madeira et.al. |
2310.06661v1 |
null |
| 2023-10-10 |
Assessing the Impact of a Supervised Classification Filter on Flow-based Hybrid Network Anomaly Detection |
Dominik Macko et.al. |
2310.06656v1 |
link |
| 2023-10-09 |
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing |
Yuren Cong et.al. |
2310.05922v1 |
null |
| 2023-10-09 |
Enumerating Calabi-Yau Manifolds: Placing bounds on the number of diffeomorphism classes in the Kreuzer-Skarke list |
Aditi Chandra et.al. |
2310.05909v1 |
null |
| 2023-10-09 |
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models |
Kaiwen Zhou et.al. |
2310.05872v1 |
null |
| 2023-10-10 |
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models |
Guangzhi Sun et.al. |
2310.05863v2 |
link |
| 2023-10-09 |
Latent Wander: an Alternative Interface for Interactive and Serendipitous Discovery of Large AV Archives |
Yuchen Yang et.al. |
2310.05835v1 |
null |
| 2023-10-09 |
Write What You Want: Applying Text-to-video Retrieval to Audiovisual Archives |
Yuchen Yang et.al. |
2310.05825v1 |
null |
| 2023-10-09 |
Dipole-Spread Function Engineering for 6D Super-Resolution Microscopy |
Tingting Wu et.al. |
2310.05810v1 |
null |
| 2023-10-09 |
A Simple Open-Loop Baseline for Reinforcement Learning Locomotion Tasks |
Antonin Raffin et.al. |
2310.05808v1 |
null |
| 2023-10-09 |
Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis |
Haoyu Zhang et.al. |
2310.05804v1 |
null |
| 2023-10-10 |
Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data |
Haitong Liu et.al. |
2310.05792v2 |
null |
| 2023-10-06 |
Exploiting Transformer Activation Sparsity with Dynamic Inference |
Mikołaj Piórczyński et.al. |
2310.04361v1 |
null |
| 2023-10-06 |
SwimXYZ: A large-scale dataset of synthetic swimming motions and videos |
Fiche Guénolé et.al. |
2310.04360v1 |
null |
| 2023-10-06 |
Large-Scale Korean Text Dataset for Classifying Biased Speech in Real-World Online Services |
Dasol Choi et.al. |
2310.04313v1 |
null |
| 2023-10-06 |
Convergent ADMM Plug and Play PET Image Reconstruction |
Florent Sureau et.al. |
2310.04299v1 |
null |
| 2023-10-06 |
A Plug-and-Play Image Registration Network |
Junhao Hu et.al. |
2310.04297v1 |
null |
| 2023-10-06 |
Towards Non-contact 3D Ultrasound for Wrist Imaging |
Antony Jerald et.al. |
2310.04296v1 |
null |
| 2023-10-06 |
Spectroscopic variability of massive pre-main-sequence stars in M17 |
A. R. Derkink et.al. |
2310.04287v1 |
null |
| 2023-10-06 |
Multi-Industry Simplex: A Probabilistic Extension of GICS |
Maksim Papenkov et.al. |
2310.04280v1 |
null |
| 2023-10-06 |
Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms |
Dennis Klau et.al. |
2310.04238v1 |
null |
| 2023-10-06 |
Written and spoken corpus of real and fake social media postings about COVID-19 |
Ng Bee Chin et.al. |
2310.04237v1 |
null |
| 2023-10-05 |
The Un-Kidnappable Robot: Acoustic Localization of Sneaking People |
Mengyu Yang et.al. |
2310.03743v1 |
null |
| 2023-10-05 |
Agent Instructs Large Language Models to be General Zero-Shot Reasoners |
Nicholas Crispino et.al. |
2310.03710v1 |
link |
| 2023-10-05 |
OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks |
Ofir Bar Tal et.al. |
2310.03707v1 |
null |
| 2023-10-05 |
Role of Spatial Coherence in Diffractive Optical Neural Networks |
Matthew J. Filipovich et.al. |
2310.03679v1 |
null |
| 2023-10-05 |
Certification of Deep Learning Models for Medical Image Segmentation |
Othmane Laousy et.al. |
2310.03664v1 |
null |
| 2023-10-05 |
Autoregressive Coefficients based Intelligent Protection of Transmission Lines Connected to Type-3 Wind Farms |
Pallav Kumar Bera et.al. |
2310.03663v1 |
null |
| 2023-10-05 |
Robustness-Guided Image Synthesis for Data-Free Quantization |
Jianhong Bai et.al. |
2310.03661v1 |
null |
| 2023-10-05 |
Balancing Autonomy and Alignment: A Multi-Dimensional Taxonomy for Autonomous LLM-powered Multi-Agent Architectures |
Thorsten Händler et.al. |
2310.03659v1 |
null |
| 2023-10-05 |
Strategic Evaluation: Subjects, Evaluators, and Society |
Benjamin Laufer et.al. |
2310.03655v1 |
null |
| 2023-10-05 |
CLEVRER-Humans: Describing Physical and Causal Events the Human Way |
Jiayuan Mao et.al. |
2310.03635v1 |
null |
| 2023-10-04 |
SemiReward: A General Reward Model for Semi-supervised Learning |
Siyuan Li et.al. |
2310.03013v1 |
link |
| 2023-10-04 |
High-dimensional SGD aligns with emerging outlier eigenspaces |
Gerard Ben Arous et.al. |
2310.03010v1 |
null |
| 2023-10-05 |
IBCL: Zero-shot Model Generation for Task Trade-offs in Continual Learning |
Pengyuan Lu et.al. |
2310.02995v2 |
link |
| 2023-10-04 |
Multiple Physics Pretraining for Physical Surrogate Models |
Michael McCabe et.al. |
2310.02994v1 |
null |
| 2023-10-04 |
UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network |
Siddhant Arora et.al. |
2310.02973v1 |
null |
| 2023-10-04 |
Fully Automatic Segmentation of Gross Target Volume and Organs-at-Risk for Radiotherapy Planning of Nasopharyngeal Carcinoma |
Mehdi Astaraki et.al. |
2310.02972v1 |
null |
| 2023-10-04 |
Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model |
Kai-Wei Chang et.al. |
2310.02971v1 |
null |
| 2023-10-05 |
Co-modeling the Sequential and Graphical Routes for Peptide Representation Learning |
Zihan Liu et.al. |
2310.02964v2 |
link |
| 2023-10-04 |
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection |
Yang Cao et.al. |
2310.02960v1 |
link |
| 2023-10-04 |
HappyFeat -- An interactive and efficient BCI framework for clinical applications |
Arthur Desbois et.al. |
2310.02948v1 |
null |
| 2023-10-03 |
DREAM: Visual Decoding from Reversing Human Visual System |
Weihao Xia et.al. |
2310.02265v1 |
null |
| 2023-10-03 |
RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving |
Tong Zhao et.al. |
2310.02262v1 |
null |
| 2023-10-03 |
Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages |
Ananya Joshi et.al. |
2310.02249v1 |
null |
| 2023-10-04 |
Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks |
Greg Yang et.al. |
2310.02244v2 |
null |
| 2023-10-03 |
MIS-AVioDD: Modality Invariant and Specific Representation for Audio-Visual Deepfake Detection |
Vinaya Sree Katamneni et.al. |
2310.02234v1 |
null |
| 2023-10-03 |
HoloNets: Spectral Convolutions do extend to Directed Graphs |
Christian Koke et.al. |
2310.02232v1 |
null |
| 2023-10-03 |
Extraction of Medication and Temporal Relation from Clinical Text by Harnessing Different Deep Learning Models |
Hangyu Tu et.al. |
2310.02229v1 |
null |
| 2023-10-03 |
Symmetry-based classification of exact flat bands in single and bilayer moiré systems |
Siddhartha Sarkar et.al. |
2310.02218v1 |
null |
| 2023-10-03 |
Learnable Data Augmentation for One-Shot Unsupervised Domain Adaptation |
Julio Ivan Davila Carrazco et.al. |
2310.02201v1 |
null |
| 2023-10-03 |
CNN photometric redshifts in the SDSS at $r\leq 20$ |
M. Treyer et.al. |
2310.02173v1 |
null |
| 2023-09-29 |
A Large Language Model Approach to Educational Survey Feedback Analysis |
Michael J. Parker et.al. |
2309.17447v1 |
null |
| 2023-10-02 |
LLM-grounded Video Diffusion Models |
Long Lian et.al. |
2309.17444v2 |
null |
| 2023-09-29 |
Classification of Potholes Based on Surface Area Using Pre-Trained Models of Convolutional Neural Network |
Chauhdary Fazeel Ahmad et.al. |
2309.17426v1 |
null |
| 2023-09-29 |
CNN-based automatic segmentation of Lumen & Media boundaries in IVUS images using closed polygonal chains |
Pavel Sinha et.al. |
2309.17406v1 |
null |
| 2023-09-29 |
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition |
Andrew Rouditchenko et.al. |
2309.17395v1 |
null |
| 2023-09-29 |
Tree Cross Attention |
Leo Feng et.al. |
2309.17388v1 |
null |
| 2023-09-29 |
Adversarial Imitation Learning from Visual Observations using Latent Information |
Vittorio Giammarino et.al. |
2309.17371v1 |
link |
| 2023-09-29 |
SpinView: General interactive visual analysis tool for multiscale computational magnetism |
Qichen Xu et.al. |
2309.17367v1 |
null |
| 2023-09-29 |
Asynchronous Graph Generators |
Christopher P. Ley et.al. |
2309.17335v1 |
null |
| 2023-09-29 |
Multi-Depth Branches Network for Efficient Image Super-Resolution |
Huiyuan Tian et.al. |
2309.17334v1 |
link |
| 2023-09-29 |
Demystifying CLIP Data |
Hu Xu et.al. |
2309.16671v2 |
link |
| 2023-09-28 |
Decaf: Monocular Deformation Capture for Face and Hand Interactions |
Soshi Shimada et.al. |
2309.16670v1 |
null |
| 2023-09-28 |
Training a Large Video Model on a Single Machine in a Day |
Yue Zhao et.al. |
2309.16669v1 |
link |
| 2023-09-28 |
Novel Deep Learning Pipeline for Automatic Weapon Detection |
Haribharathi Sivakumar et.al. |
2309.16654v1 |
null |
| 2023-09-28 |
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning |
Qiao Gu et.al. |
2309.16650v1 |
null |
| 2023-09-29 |
Mixup Your Own Pairs |
Yilei Wu et.al. |
2309.16633v2 |
link |
| 2023-09-28 |
Class Activation Map-based Weakly supervised Hemorrhage Segmentation using Resnet-LSTM in Non-Contrast Computed Tomography images |
Shreyas H Ramananda et.al. |
2309.16627v1 |
null |
| 2023-09-28 |
The twisting index in semitoric systems |
Jaume Alonso et.al. |
2309.16614v1 |
null |
| 2023-09-28 |
Exploiting Edge Features in Graphs with Fused Network Gromov-Wasserstein Distance |
Junjie Yang et.al. |
2309.16604v1 |
null |
| 2023-09-28 |
Can LLMs Effectively Leverage Structural Information for Graph Learning: When and Why |
Jin Huang et.al. |
2309.16595v1 |
null |
| 2023-09-27 |
SHACIRA: Scalable HAsh-grid Compression for Implicit Neural Representations |
Sharath Girish et.al. |
2309.15848v1 |
null |
| 2023-09-27 |
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing |
Brian Yan et.al. |
2309.15826v1 |
null |
| 2023-09-27 |
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation |
David Junhao Zhang et.al. |
2309.15818v1 |
link |
| 2023-09-27 |
Convolutional Networks with Oriented 1D Kernels |
Alexandre Kirchmeyer et.al. |
2309.15812v1 |
link |
| 2023-09-27 |
A Quantum-Classical Hybrid Block-Matching Algorithm in Noisy Environment using Dissimilarity Measure |
M. Martínez-Felipe et.al. |
2309.15792v1 |
null |
| 2023-09-27 |
Large Language Model Routing with Benchmark Datasets |
Tal Shnitzer et.al. |
2309.15789v1 |
null |
| 2023-09-27 |
One For All: Video Conversation is Feasible Without Video Instruction Tuning |
Ruyang Liu et.al. |
2309.15785v1 |
null |
| 2023-09-27 |
Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback |
Teresa Yeo et.al. |
2309.15762v1 |
null |
| 2023-09-27 |
Automated CT Lung Cancer Screening Workflow using 3D Camera |
Brian Teixeira et.al. |
2309.15750v1 |
null |
| 2023-09-27 |
Data-Driven Latent Space Representation for Robust Bipedal Locomotion Learning |
Guillermo A. Castillo et.al. |
2309.15740v1 |
null |
| 2023-09-26 |
Classification of symmetry-enriched topological quantum spin liquids |
Weicheng Ye et.al. |
2309.15118v1 |
null |
| 2023-09-26 |
Doduo: Learning Dense Visual Correspondence from Unsupervised Semantic-Aware Flow |
Zhenyu Jiang et.al. |
2309.15110v1 |
null |
| 2023-09-27 |
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models |
Yaohui Wang et.al. |
2309.15103v2 |
null |
| 2023-09-26 |
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning |
Han Lin et.al. |
2309.15091v1 |
null |
| 2023-09-26 |
Video-adverb retrieval with compositional adverb-action embeddings |
Thomas Hummel et.al. |
2309.15086v1 |
null |
| 2023-09-26 |
Challenges of building medical image datasets for development of deep learning software in stroke |
Alessandro Fontanella et.al. |
2309.15081v1 |
null |
| 2023-09-26 |
On Excess Risk Convergence Rates of Neural Network Classifiers |
Hyunouk Ko et.al. |
2309.15075v1 |
null |
| 2023-09-26 |
Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding |
Christina Kassab et.al. |
2309.15065v1 |
null |
| 2023-09-26 |
QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers |
Daniel Silver et.al. |
2309.15056v1 |
null |
| 2023-09-26 |
Thalamic nuclei segmentation from T$_1$-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts |
Brendan Williams et.al. |
2309.15053v1 |
null |
| 2023-09-25 |
Extreme Parkour with Legged Robots |
Xuxin Cheng et.al. |
2309.14341v1 |
null |
| 2023-09-25 |
Chop & Learn: Recognizing and Generating Object-State Compositions |
Nirat Saini et.al. |
2309.14339v1 |
null |
| 2023-09-25 |
Human-Assisted Continual Robot Learning with Foundation Models |
Meenal Parakh et.al. |
2309.14321v1 |
null |
| 2023-09-25 |
MUTEX: Learning Unified Policies from Multimodal Task Specifications |
Rutav Shah et.al. |
2309.14320v1 |
null |
| 2023-09-25 |
DeepMesh: Mesh-based Cardiac Motion Tracking using Deep Learning |
Qingjie Meng et.al. |
2309.14306v1 |
null |
| 2023-09-25 |
NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields |
Saeejith Nair et.al. |
2309.14293v1 |
null |
| 2023-09-25 |
CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free |
Monika Wysoczańska et.al. |
2309.14289v1 |
null |
| 2023-09-25 |
Comparison of One- Two- and Three- Dimensional CNN models for Drawing-Test-Based Diagnostics of the Parkinson's Disease |
Xuechao Wang et.al. |
2309.14288v1 |
null |
| 2023-09-26 |
Virtual Hyperspectral Images Using Symmetric Autoencoders |
Archisman Bhattacharjee et.al. |
2309.14286v2 |
null |
| 2023-09-25 |
OmniEvent: A Comprehensive, Fair, and Easy-to-Use Toolkit for Event Understanding |
Hao Peng et.al. |
2309.14258v1 |
link |
| 2023-09-22 |
Robotic Offline RL from Internet Videos via Value-Function Pre-Training |
Chethan Bhateja et.al. |
2309.13041v1 |
null |
| 2023-09-22 |
Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? |
Xiaoxiao Sun et.al. |
2309.13038v1 |
null |
| 2023-09-22 |
Encoding optimization for quantum machine learning demonstrated on a superconducting transmon qutrit |
Shuxiang Cao et.al. |
2309.13036v1 |
null |
| 2023-09-22 |
Performance Analysis of UNet and Variants for Medical Image Segmentation |
Walid Ehab et.al. |
2309.13013v1 |
null |
| 2023-09-22 |
Pursuing Counterfactual Fairness via Sequential Autoencoder Across Domains |
Yujie Lin et.al. |
2309.13005v1 |
null |
| 2023-09-22 |
Braid groups, elliptic curves, and resolving the quartic |
Peter Huxford et.al. |
2309.12999v1 |
null |
| 2023-09-22 |
License Plate Recognition Based On Multi-Angle View Model |
Dat Tran-Anh et.al. |
2309.12972v1 |
null |
| 2023-09-22 |
PI-RADS v2 Compliant Automated Segmentation of Prostate Zones Using co-training Motivated Multi-task Dual-Path CNN |
Arnab Das et.al. |
2309.12970v1 |
null |
| 2023-09-22 |
Detect Every Thing with Few Examples |
Xinyu Zhang et.al. |
2309.12969v1 |
link |
| 2023-09-22 |
Massive End-to-end Models for Short Search Queries |
Weiran Wang et.al. |
2309.12963v1 |
null |
| 2023-09-21 |
ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals |
Jeremy A. Collins et.al. |
2309.12312v1 |
null |
| 2023-09-21 |
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent |
Jianing Yang et.al. |
2309.12311v1 |
null |
| 2023-09-21 |
TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning |
Chaeyoung Jung et.al. |
2309.12306v1 |
null |
| 2023-09-22 |
PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation |
Shilin Yan et.al. |
2309.12303v2 |
link |
| 2023-09-21 |
See to Touch: Learning Tactile Dexterity through Visual Incentives |
Irmak Guzey et.al. |
2309.12300v1 |
null |
| 2023-09-21 |
The Broad Impact of Feature Imitation: Neural Enhancements Across Financial, Speech, and Physiological Domains |
Reza Khanmohammadi et.al. |
2309.12279v1 |
null |
| 2023-09-21 |
Enabling Quartile-based Estimated-Mean Gradient Aggregation As Baseline for Federated Image Classifications |
Yusen Wu et.al. |
2309.12267v1 |
null |
| 2023-09-21 |
Parallelizing non-linear sequential models over the sequence length |
Yi Heng Lim et.al. |
2309.12252v1 |
null |
| 2023-09-21 |
Adaptive Input-image Normalization for Solving Mode Collapse Problem in GAN-based X-ray Images |
Muhammad Muneeb Saad et.al. |
2309.12245v1 |
null |
| 2023-09-21 |
Model-based Clustering using Non-parametric Hidden Markov Models |
Elisabeth Gassiat et.al. |
2309.12238v1 |
null |
| 2023-09-20 |
A Large-scale Dataset for Audio-Language Representation Learning |
Luoyi Sun et.al. |
2309.11500v1 |
null |
| 2023-09-20 |
FreeU: Free Lunch in Diffusion U-Net |
Chenyang Si et.al. |
2309.11497v1 |
null |
| 2023-09-21 |
Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning |
Tianbao Xie et.al. |
2309.11489v2 |
null |
| 2023-09-20 |
First detection of CO$_2$ emission in a Centaur: JWST NIRSpec observations of 39P/Oterma |
O. Harrington Pinto et.al. |
2309.11486v1 |
null |
| 2023-09-20 |
Multi-Label Takagi-Sugeno-Kang Fuzzy System |
Qiongdan Lou et.al. |
2309.11469v1 |
null |
| 2023-09-20 |
Budget-Aware Pruning: Handling Multiple Domains with Less Parameters |
Samuel Felipe dos Santos et.al. |
2309.11464v1 |
null |
| 2023-09-20 |
AudioFool: Fast, Universal and synchronization-free Cross-Domain Attack on Speech Recognition |
Mohamad Fakih et.al. |
2309.11462v1 |
null |
| 2023-09-20 |
SkeleTR: Towards Skeleton-based Action Recognition in the Wild |
Haodong Duan et.al. |
2309.11445v1 |
null |
| 2023-09-20 |
A Systematic Review of Few-Shot Learning in Medical Imaging |
Eva Pachetti et.al. |
2309.11433v1 |
null |
| 2023-09-21 |
Video Screens for Hearing Research: Transmittance and Reflectance of Professional and Other Fabrics |
Jan Heeren et.al. |
2309.11430v2 |
null |
| 2023-09-19 |
Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context |
Rucha Deshpande et.al. |
2309.10817v1 |
null |
| 2023-09-19 |
Multisource Holography |
Grace Kuo et.al. |
2309.10816v1 |
null |
| 2023-09-19 |
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning |
Tianhua Zhang et.al. |
2309.10814v1 |
link |
| 2023-09-19 |
Semantic Text Compression for Classification |
Emrecan Kutay et.al. |
2309.10809v1 |
null |
| 2023-09-19 |
Multi-Context Dual Hyper-Prior Neural Image Compression |
Atefeh Khoshkhahtinat et.al. |
2309.10799v1 |
null |
| 2023-09-19 |
Multi-spectral Entropy Constrained Neural Compression of Solar Imagery |
Ali Zafari et.al. |
2309.10791v1 |
null |
| 2023-09-19 |
Guide Your Agent with Adaptive Multimodal Rewards |
Changyeon Kim et.al. |
2309.10790v1 |
link |
| 2023-09-19 |
Physics-Informed Machine Learning for Data Anomaly Detection, Classification, Localization, and Mitigation: A Review, Challenges, and Path Forward |
Mehdi Jabbari Zideh et.al. |
2309.10788v1 |
null |
| 2023-09-19 |
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models |
Yuan Tseng et.al. |
2309.10787v1 |
link |
| 2023-09-19 |
Context-Aware Neural Video Compression on Solar Dynamics Observatory |
Atefeh Khoshkhahtinat et.al. |
2309.10784v1 |
null |
| 2023-09-19 |
Des-q: a quantum algorithm to construct and efficiently retrain decision trees for regression and binary classification |
Niraj Kumar et.al. |
2309.09976v2 |
null |
| 2023-09-18 |
Empirical Study of Mix-based Data Augmentation Methods in Physiological Time Series Data |
Peikun Guo et.al. |
2309.09970v1 |
null |
| 2023-09-18 |
vSHARP: variable Splitting Half-quadratic ADMM algorithm for Reconstruction of inverse-Problems |
George Yiasemis et.al. |
2309.09954v1 |
null |
| 2023-09-18 |
TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey |
Zhuoyang Chen et.al. |
2309.09937v1 |
null |
| 2023-09-18 |
Algebra of Self-Replication |
Lawrence S. Moss et.al. |
2309.09931v1 |
null |
| 2023-09-18 |
Evaluating Adversarial Robustness with Expected Viable Performance |
Ryan McCoppin et.al. |
2309.09928v1 |
null |
| 2023-09-18 |
Impact of Augmented reality system on elementary school ESL learners in country side of china: Motivations, achievements, behaviors and cognitive attainment |
Ijaz Ul Haq et.al. |
2309.09894v1 |
null |
| 2023-09-18 |
Not Enough Labeled Data? Just Add Semantics: A Data-Efficient Method for Inferring Online Health Texts |
Joseph Gatto et.al. |
2309.09877v1 |
null |
| 2023-09-18 |
Domain Generalization with Fourier Transform and Soft Thresholding |
Hongyi Pan et.al. |
2309.09866v1 |
null |
| 2023-09-18 |
Unsupervised Open-Vocabulary Object Localization in Videos |
Ke Fan et.al. |
2309.09858v1 |
null |
| 2023-09-18 |
Closing the Loop on Runtime Monitors with Fallback-Safe MPC |
Rohan Sinha et.al. |
2309.08603v2 |
null |
| 2023-09-15 |
Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes |
Fabien Delattre et.al. |
2309.08588v1 |
null |
| 2023-09-15 |
Compositional Foundation Models for Hierarchical Planning |
Anurag Ajay et.al. |
2309.08587v1 |
null |
| 2023-09-15 |
HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks |
Minh-Hao Van et.al. |
2309.08549v1 |
null |
| 2023-09-15 |
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens |
Minsu Kim et.al. |
2309.08531v1 |
null |
| 2023-09-15 |
Generalised Probabilistic Diffusion Scale-Spaces |
Pascal Peter et.al. |
2309.08511v1 |
null |
| 2023-09-15 |
Deep-learning-powered data analysis in plankton ecology |
Harshith Bachimanchi et.al. |
2309.08500v1 |
link |
| 2023-09-15 |
P-ROCKET: Pruning Random Convolution Kernels for Time Series Classification |
Shaowu Chen et.al. |
2309.08499v1 |
link |
| 2023-09-15 |
YCB-Ev: Event-vision dataset for 6DoF object pose estimation |
Pavel Rojtberg et.al. |
2309.08482v1 |
link |
| 2023-09-15 |
Current and future directions in network biology |
Marinka Zitnik et.al. |
2309.08478v1 |
null |
| 2023-09-14 |
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning |
Zhiwu Qing et.al. |
2309.07911v1 |
link |
| 2023-09-14 |
Generative Image Dynamics |
Zhengqi Li et.al. |
2309.07906v1 |
null |
| 2023-09-14 |
Ambiguity-Aware In-Context Learning with Large Language Models |
Lingyu Gao et.al. |
2309.07900v1 |
null |
| 2023-09-14 |
SMARTFEAT: Efficient Feature Construction through Feature-Level Foundation Model Interactions |
Yin Lin et.al. |
2309.07856v1 |
null |
| 2023-09-14 |
Two Timin': Repairing Smart Contracts With A Two-Layered Approach |
Abhinav Jain et.al. |
2309.07841v1 |
null |
| 2023-09-14 |
Text Classification of Cancer Clinical Trial Eligibility Criteria |
Yumeng Yang et.al. |
2309.07812v1 |
null |
| 2023-09-14 |
What Matters to Enhance Traffic Rule Compliance of Imitation Learning for Automated Driving |
Hongkuan Zhou et.al. |
2309.07808v1 |
null |
| 2023-09-14 |
Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary tasks |
Danae Sánchez Villegas et.al. |
2309.07794v1 |
null |
| 2023-09-14 |
A Multi-In and Multi-Out Dendritic Neuron Model and its Optimization |
Yu Ding et.al. |
2309.07791v1 |
null |
| 2023-09-15 |
Virchow: A Million-Slide Digital Pathology Foundation Model |
Eugene Vorontsov et.al. |
2309.07778v2 |
null |
| 2023-09-13 |
Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology |
Nirhoshan Sivaroopan et.al. |
2309.07113v1 |
null |
| 2023-09-13 |
Data Augmentation via Subgroup Mixup for Improving Fairness |
Madeline Navarro et.al. |
2309.07110v1 |
null |
| 2023-09-13 |
The end sum of surfaces |
Liam K. Axon et.al. |
2309.07101v1 |
null |
| 2023-09-13 |
Revisiting the classics: On the evolutionary origin of the "Fe II" and "He/N" spectral classes of novae |
E. Aydi et.al. |
2309.07097v1 |
null |
| 2023-09-13 |
RadarLCD: Learnable Radar-based Loop Closure Detection Pipeline |
Mirko Usuelli et.al. |
2309.07094v1 |
null |
| 2023-09-13 |
Mitigating Group Bias in Federated Learning for Heterogeneous Devices |
Khotso Selialia et.al. |
2309.07085v1 |
null |
| 2023-09-13 |
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning |
Alexander Bastounis et.al. |
2309.07072v1 |
null |
| 2023-09-13 |
Aggregating Long-term Sharp Features via Hybrid Transformers for Video Deblurring |
Dongwei Ren et.al. |
2309.07054v1 |
link |
| 2023-09-13 |
Thurston's theorem and the Nielsen-Thurston classification via Teichmüller's theorem |
James Belk et.al. |
2309.06993v1 |
null |
| 2023-09-13 |
Neural network-based coronary dominance classification of RCA angiograms |
Ivan Kruzhilov et.al. |
2309.06958v1 |
null |
| 2023-09-12 |
Learning Disentangled Avatars with Hybrid 3D Representations |
Yao Feng et.al. |
2309.06441v1 |
null |
| 2023-09-12 |
LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning |
Kenneth Shaw et.al. |
2309.06440v1 |
null |
| 2023-09-12 |
AGMDT: Virtual Staining of Renal Histology Images with Adjacency-Guided Multi-Domain Transfer |
Tao Ma et.al. |
2309.06421v1 |
null |
| 2023-09-12 |
Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models with Generative AI |
Faraz Faruqi et.al. |
2309.06379v1 |
null |
| 2023-09-12 |
Padding-free Convolution based on Preservation of Differential Characteristics of Kernels |
Kuangdai Leng et.al. |
2309.06370v1 |
null |
| 2023-09-12 |
Using Reed-Muller Codes for Classification with Rejection and Recovery |
Daniel Fentham et.al. |
2309.06359v1 |
link |
| 2023-09-12 |
Eccentric graph of trees and their Cartesian products |
Anita Arora et.al. |
2309.06338v1 |
null |
| 2023-09-12 |
Exploring Flat Minima for Domain Generalization with Large Learning Rates |
Jian Zhang et.al. |
2309.06337v1 |
null |
| 2023-09-12 |
Grounded Language Acquisition From Object and Action Imagery |
James Robert Kubricht et.al. |
2309.06335v1 |
null |
| 2023-09-12 |
Visualising Game Engine Subsystem Coupling |
Gabriel C. Ullmann et.al. |
2309.06329v1 |
null |
| 2023-09-11 |
Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips |
Yufei Ye et.al. |
2309.05663v1 |
null |
| 2023-09-11 |
From Capture to Display: A Survey on Volumetric Video |
Yili Jin et.al. |
2309.05658v1 |
null |
| 2023-09-11 |
Potentials of Deterministic Radio Propagation Simulation for AI-Enabled Localization and Sensing |
Albrecht Michler et.al. |
2309.05650v1 |
null |
| 2023-09-11 |
A Novel Supervised Deep Learning Solution to Detect Distributed Denial of Service (DDoS) attacks on Edge Systems using Convolutional Neural Networks (CNN) |
Vedanth Ramanathan et.al. |
2309.05646v1 |
null |
| 2023-09-11 |
Boundary Peeling: Outlier Detection Method Using One-Class Peeling |
Sheikh Arafat et.al. |
2309.05630v1 |
null |
| 2023-09-11 |
Temporal Action Localization with Enhanced Instant Discriminability |
Dingfeng Shi et.al. |
2309.05590v1 |
link |
| 2023-09-11 |
Anisotropic Diffusion Stencils: From Simple Derivations over Stability Estimates to ResNet Implementations |
Karl Schrader et.al. |
2309.05575v1 |
null |
| 2023-09-11 |
On the Meromorphic Integrability of the Critical Systems for Optimal Sums of Eigenvalues |
Yuzhou Tian et.al. |
2309.05568v1 |
null |
| 2023-09-11 |
OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data |
Giuseppe Cartella et.al. |
2309.05551v1 |
link |
| 2023-09-11 |
Distance-Aware eXplanation Based Learning |
Misgina Tsighe Hagos et.al. |
2309.05548v1 |
link |
| 2023-09-08 |
Generalized Cross-domain Multi-label Few-shot Learning for Chest X-rays |
Aroof Aimen et.al. |
2309.04462v1 |
null |
| 2023-09-08 |
Generalized Variable Selection Algorithms for Gaussian Process Models by LASSO-like Penalty |
Zhiyong Hu et.al. |
2309.04455v1 |
null |
| 2023-09-08 |
Vis-SPLIT: Interactive Hierarchical Modeling for mRNA Expression Classification |
Braden Roper et.al. |
2309.04423v1 |
null |
| 2023-09-08 |
Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving |
Thomas E. Huang et.al. |
2309.04422v1 |
null |
| 2023-09-08 |
Seeing-Eye Quadruped Navigation with Force Responsive Locomotion Control |
David DeFazio et.al. |
2309.04370v1 |
null |
| 2023-09-08 |
Active Learning for Classifying 2D Grid-Based Level Completability |
Mahsa Bazzaz et.al. |
2309.04367v1 |
link |
| 2023-09-08 |
Sparse Codesigned Communication and Radar Systems |
Hyeon Seok Rou et.al. |
2309.04362v1 |
null |
| 2023-09-08 |
Learning from Power Signals: An Automated Approach to Electrical Disturbance Identification Within a Power Transmission System |
Jonathan D. Boyd et.al. |
2309.04361v1 |
null |
| 2023-09-08 |
Zero-Shot Robustification of Zero-Shot Models With Foundation Models |
Dyah Adila et.al. |
2309.04344v1 |
null |
| 2023-09-08 |
Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens |
Ronald Seoh et.al. |
2309.04333v1 |
link |
| 2023-09-07 |
A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation |
Ziyan Huang et.al. |
2309.03906v1 |
link |
| 2023-09-07 |
ImageBind-LLM: Multi-modality Instruction Tuning |
Jiaming Han et.al. |
2309.03905v1 |
link |
| 2023-09-07 |
Tracking Anything with Decoupled Video Segmentation |
Ho Kei Cheng et.al. |
2309.03903v1 |
link |
| 2023-09-07 |
Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction |
Su-Kai Chen et.al. |
2309.03900v1 |
null |
| 2023-09-07 |
The Making and Breaking of Camouflage |
Hala Lamdouar et.al. |
2309.03899v1 |
null |
| 2023-09-07 |
ProPainter: Improving Propagation and Transformer for Video Inpainting |
Shangchen Zhou et.al. |
2309.03897v1 |
null |
| 2023-09-07 |
Zero-Shot Audio Captioning via Audibility Guidance |
Tal Shaharabany et.al. |
2309.03884v1 |
null |
| 2023-09-07 |
Text-to-feature diffusion for audio-visual few-shot learning |
Otniel-Bogdan Mercea et.al. |
2309.03869v1 |
null |
| 2023-09-07 |
Classification of Killing Magnetic Curves In H^3 |
Özgür Kelekçi et.al. |
2309.03859v1 |
null |
| 2023-09-07 |
CenTime: Event-Conditional Modelling of Censoring in Survival Analysis |
Ahmed H. Shahin et.al. |
2309.03851v1 |
link |
| 2023-09-07 |
Terahertz-Band Direction Finding With Beam-Split and Mutual Coupling Calibration |
Ahmet M. Elbir et.al. |
2309.03195v2 |
null |
| 2023-09-06 |
Signatures of Bayesian inference emerge from energy efficient synapses |
James Malkin et.al. |
2309.03194v1 |
null |
| 2023-09-06 |
3D Transformer based on deformable patch location for differential diagnosis between Alzheimer's disease and Frontotemporal dementia |
Huy-Dung Nguyen et.al. |
2309.03183v1 |
null |
| 2023-09-06 |
PDiscoNet: Semantically consistent part discovery for fine-grained recognition |
Robert van der Klis et.al. |
2309.03173v1 |
null |
| 2023-09-06 |
ResFields: Residual Neural Fields for Spatiotemporal Signals |
Marko Mihajlovic et.al. |
2309.03160v1 |
null |
| 2023-09-06 |
Normal mode decomposition of atomic motion in solids |
Jaeyun Moon et.al. |
2309.03140v1 |
null |
| 2023-09-06 |
Serving Time: Real-Time, Safe Motion Planning and Control for Manipulation of Unsecured Objects |
Zachary Brei et.al. |
2309.03111v1 |
null |
| 2023-09-06 |
The Secrets of Non-Blind Poisson Deconvolution |
Abhiram Gnanasambandam et.al. |
2309.03105v1 |
null |
| 2023-09-06 |
On the $Σ$-invariants of Artin groups satisfying the $K(π,1)$-conjecture |
Marcos Escartín Ferrer et.al. |
2309.03091v1 |
null |
| 2023-09-06 |
Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection |
Yu Chen et.al. |
2309.03057v1 |
null |
| 2023-09-05 |
ReliTalk: Relightable Talking Portrait Generation from a Single Video |
Haonan Qiu et.al. |
2309.02434v1 |
link |
| 2023-09-05 |
A Likelihood Approach to Incorporating Self-Report Data in HIV Recency Classification |
Wenlong Yang et.al. |
2309.02430v1 |
null |
| 2023-09-05 |
Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach |
Vimal K B et.al. |
2309.02429v1 |
null |
| 2023-09-05 |
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding |
Yue Xu et.al. |
2309.02423v1 |
null |
| 2023-09-05 |
Doppelgangers: Learning to Disambiguate Images of Similar Structures |
Ruojin Cai et.al. |
2309.02420v1 |
link |
| 2023-09-05 |
Classification of La3+ and Gd3+ rare earth ions using surface-enhanced Raman scattering |
Hao Jin et.al. |
2309.02409v1 |
null |
| 2023-09-05 |
Semantic Communications Based on Adaptive Generative Models and Information Bottleneck |
S. Barbarossa et.al. |
2309.02387v1 |
null |
| 2023-09-05 |
On the classification of primitive ideals for complex classical Lie algebras, IV |
William McGovern et.al. |
2309.02363v1 |
null |
| 2023-09-05 |
Generating Infinite-Resolution Texture using GANs with Patch-by-Patch Paradigm |
Alhasan Abdellatif et.al. |
2309.02340v1 |
null |
| 2023-09-05 |
DEEPBEAS3D: Deep Learning and B-Spline Explicit Active Surfaces |
Helena Williams et.al. |
2309.02335v1 |
null |
| 2023-09-01 |
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following |
Ziyu Guo et.al. |
2309.00615v1 |
link |
| 2023-09-01 |
Amyloid-Beta Axial Plane PET Synthesis from Structural MRI: An Image Translation Approach for Screening Alzheimer's Disease |
Fernando Vega et.al. |
2309.00569v1 |
null |
| 2023-09-01 |
Powder-Bot: A Modular Autonomous Multi-Robot Workflow for Powder X-Ray Diffraction |
Amy M. Lunt et.al. |
2309.00544v1 |
null |
| 2023-09-01 |
A Machine Vision Method for Correction of Eccentric Error: Based on Adaptive Enhancement Algorithm |
Fanyi Wang et.al. |
2309.00514v1 |
null |
| 2023-09-01 |
Multi-stage Deep Learning Artifact Reduction for Computed Tomography |
Jiayang Shi et.al. |
2309.00494v1 |
null |
| 2023-09-01 |
Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction |
Peizhen Bai et.al. |
2309.00483v1 |
null |
| 2023-09-01 |
Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels |
Haotian Wu et.al. |
2309.00470v1 |
null |
| 2023-09-01 |
New metrics for analyzing continual learners |
Nicolas Michel et.al. |
2309.00462v1 |
null |
| 2023-09-01 |
The miniJPAS survey quasar selection IV: Classification and redshift estimation with SQUEzE |
Ignasi Pérez-Ràfols et.al. |
2309.00461v1 |
null |
| 2023-09-01 |
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding |
Étienne Labbé et.al. |
2309.00454v1 |
link |
| 2023-08-31 |
PointLLM: Empowering Large Language Models to Understand Point Clouds |
Runsen Xu et.al. |
2308.16911v1 |
link |
| 2023-08-31 |
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation |
Yuhan Wang et.al. |
2308.16909v1 |
link |
| 2023-08-31 |
Learning to Taste: A Multimodal Wine Dataset |
Thoranna Bender et.al. |
2308.16900v1 |
null |
| 2023-08-31 |
EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild |
Manuel Kaufmann et.al. |
2308.16894v1 |
link |
| 2023-08-31 |
On the Role of Non-Localities in Fundamental Diagram Estimation |
Jing Liu et.al. |
2308.16878v1 |
null |
| 2023-08-31 |
SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation |
Jiaben Chen et.al. |
2308.16876v1 |
null |
| 2023-08-31 |
Understanding defects in amorphous silicon with million-atom simulations and machine learning |
Joe D. Morrow et.al. |
2308.16868v1 |
null |
| 2023-08-31 |
Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images |
Chinmay Prabhakar et.al. |
2308.16863v1 |
link |
| 2023-08-31 |
Facing Unknown: Open-World Encrypted Traffic Classification Based on Contrastive Pre-Training |
Xiang Li et.al. |
2308.16861v1 |
null |
| 2023-08-31 |
Majorization-Minimization for sparse SVMs |
Alessandro Benfenati et.al. |
2308.16858v1 |
null |
| 2023-08-30 |
Fully Non-Linear Neuromorphic Computing with Linear Wave Scattering |
Clara C. Wanjura et.al. |
2308.16181v1 |
null |
| 2023-08-30 |
General Purpose Audio Effect Removal |
Matthew Rice et.al. |
2308.16177v1 |
null |
| 2023-08-30 |
Algebraic, Topological, and Mereological Foundations of Existential Granules |
Mani A et.al. |
2308.16157v1 |
null |
| 2023-08-31 |
MMVP: Motion-Matrix-based Video Prediction |
Yiqi Zhong et.al. |
2308.16154v2 |
link |
| 2023-08-30 |
Modality Cycles with Masked Conditional Diffusion for Unsupervised Anomaly Segmentation in MRI |
Ziyun Liang et.al. |
2308.16150v1 |
null |
| 2023-08-30 |
Spatial Graph Coarsening: Weather and Weekday Prediction with London's Bike-Sharing Service using GNN |
Yuta Sato et.al. |
2308.16122v1 |
null |
| 2023-08-30 |
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion |
Man Zhou et.al. |
2308.16083v1 |
null |
| 2023-08-30 |
A Classification of Observation-Driven State-Space Count Models for Panel Data |
Jae Youn Ahn et.al. |
2308.16058v1 |
null |
| 2023-08-30 |
Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs |
Jiani Liu et.al. |
2308.16056v1 |
null |
| 2023-08-30 |
Telepresence Lantern -- Designing an Immersive Video-Mediated Communication Device for Older Adults |
Thomas H. Weisswange et.al. |
2308.16052v1 |
null |
| 2023-08-29 |
An Adaptive Tangent Feature Perspective of Neural Networks |
Daniel LeJeune et.al. |
2308.15478v1 |
null |
| 2023-08-29 |
A General-Purpose Self-Supervised Model for Computational Pathology |
Richard J. Chen et.al. |
2308.15474v1 |
null |
| 2023-08-29 |
Learning Modulated Transformation in GANs |
Ceyuan Yang et.al. |
2308.15472v1 |
null |
| 2023-08-30 |
Policy composition in reinforcement learning via multi-objective policy optimization |
Shruti Mishra et.al. |
2308.15470v2 |
null |
| 2023-08-29 |
Input margins can predict generalization too |
Coenraad Mouton et.al. |
2308.15466v1 |
null |
| 2023-08-29 |
A Comparative Study of Loss Functions: Traffic Predictions in Regular and Congestion Scenarios |
Yangxinyu Xie et.al. |
2308.15464v1 |
link |
| 2023-08-29 |
Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection |
Yazhou Xing et.al. |
2308.15462v1 |
null |
| 2023-08-29 |
From SMOTE to Mixup for Deep Imbalanced Classification |
Wei-Chao Cheng et.al. |
2308.15457v1 |
link |
| 2023-08-29 |
Pseudo-Boolean Polynomials Approach To Edge Detection And Image Segmentation |
Tendai Mapungwana Chikake et.al. |
2308.15453v1 |
null |
| 2023-08-29 |
WrappingNet: Mesh Autoencoder via Deep Sphere Deformation |
Eric Lei et.al. |
2308.15413v1 |
null |
| 2023-08-28 |
MagicEdit: High-Fidelity and Temporally Coherent Video Editing |
Jun Hao Liew et.al. |
2308.14749v1 |
null |
| 2023-08-28 |
MagicAvatar: Multimodal Avatar Generation and Animation |
Jianfeng Zhang et.al. |
2308.14748v1 |
null |
| 2023-08-28 |
CoVR: Learning Composed Video Retrieval from Web Video Captions |
Lucas Ventura et.al. |
2308.14746v1 |
link |
| 2023-08-28 |
Total Selfie: Generating Full-Body Selfies |
Bowei Chen et.al. |
2308.14740v1 |
null |
| 2023-08-28 |
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding |
Zhixin Ling et.al. |
2308.14726v1 |
null |
| 2023-08-28 |
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation |
Xudong Wang et.al. |
2308.14710v1 |
link |
| 2023-08-28 |
Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual Predatory Chats and Abusive Texts |
Thanh Thi Nguyen et.al. |
2308.14683v1 |
null |
| 2023-08-28 |
Video-Based Hand Pose Estimation for Remote Assessment of Bradykinesia in Parkinson's Disease |
Gabriela T. Acevedo Trebbau et.al. |
2308.14679v1 |
null |
| 2023-08-28 |
Noncommutative tensor triangular geometry: classification via noetherian spectra |
James Rowe et.al. |
2308.14661v1 |
null |
| 2023-08-28 |
Towards Standardized Disturbance Rejection Testing of Legged Robot Locomotion with Linear Impactor: A Preliminary Study, Observations, and Implications |
Bowen Weng et.al. |
2308.14636v1 |
null |
| 2023-08-25 |
Unveiling the Role of Message Passing in Dual-Privacy Preservation on GNNs |
Tianyi Zhao et.al. |
2308.13513v1 |
null |
| 2023-08-25 |
Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation |
Jiaming Zhang et.al. |
2308.13505v1 |
null |
| 2023-08-25 |
Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning |
Pranav Balaji et.al. |
2308.13503v1 |
null |
| 2023-08-25 |
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers |
Matthew Dutson et.al. |
2308.13494v1 |
link |
| 2023-08-25 |
Temporal Uncertainty Localization to Enable Human-in-the-loop Analysis of Dynamic Contrast-enhanced Cardiac MRI Datasets |
Dilek M. Yalcinkaya et.al. |
2308.13488v1 |
null |
| 2023-08-25 |
QKSAN: A Quantum Kernel Self-Attention Network |
Ren-Xin Zhao et.al. |
2308.13422v1 |
null |
| 2023-08-25 |
An investigation into the impact of deep learning model choice on sex and race bias in cardiac MR segmentation |
Tiarna Lee et.al. |
2308.13415v1 |
null |
| 2023-08-25 |
Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features |
Zheng Gao et.al. |
2308.13392v1 |
null |
| 2023-08-25 |
Direction-aware Video Demoireing with Temporal-guided Bilateral Learning |
Shuning Xu et.al. |
2308.13388v1 |
null |
| 2023-08-25 |
On flags of holomorphic foliations associated with singular second-order ordinary differential equations |
Fernando Lourenço et.al. |
2308.13370v1 |
null |
| 2023-08-24 |
POCO: 3D Pose and Shape Estimation with Confidence |
Sai Kumar Dwivedi et.al. |
2308.12965v1 |
null |
| 2023-08-24 |
Motion-Guided Masking for Spatiotemporal Representation Learning |
David Fan et.al. |
2308.12962v1 |
null |
| 2023-08-24 |
Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment |
Sheng Zhang et.al. |
2308.12960v1 |
link |
| 2023-08-24 |
Beyond Document Page Classification: Design, Datasets, and Challenges |
Jordy Van Landeghem et.al. |
2308.12896v1 |
null |
| 2023-08-24 |
Large Language Models Vote: Prompting for Rare Disease Identification |
David Oniani et.al. |
2308.12890v1 |
link |
| 2023-08-24 |
Multi-stage feature decorrelation constraints for improving CNN classification performance |
Qiuyu Zhu et.al. |
2308.12880v1 |
null |
| 2023-08-24 |
ToonTalker: Cross-Domain Face Reenactment |
Yuan Gong et.al. |
2308.12866v1 |
null |
| 2023-08-24 |
Learned Local Attention Maps for Synthesising Vessel Segmentations |
Yash Deo et.al. |
2308.12861v1 |
null |
| 2023-08-24 |
Algebraicity of hypergeometric functions with arbitrary parameters |
Florian Fürnsinn et.al. |
2308.12855v1 |
null |
| 2023-08-24 |
$p$-brane Galilean and Carrollian Geometries and Gravities |
Eric Bergshoeff et.al. |
2308.12852v1 |
null |
| 2023-08-23 |
Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models |
Nancy Tyagi et.al. |
2308.12272v1 |
null |
| 2023-08-23 |
Bugsplainer: Leveraging Code Structures to Explain Software Bugs with Neural Machine Translation |
Parvez Mahbub et.al. |
2308.12267v1 |
null |
| 2023-08-23 |
SPPNet: A Single-Point Prompt Network for Nuclei Image Segmentation |
Qing Xu et.al. |
2308.12231v1 |
link |
| 2023-08-23 |
Towards Real-Time Analysis of Broadcast Badminton Videos |
Nitin Nilesh et.al. |
2308.12199v1 |
null |
| 2023-08-23 |
Sign Language Translation with Iterative Prototype |
Huijie Yao et.al. |
2308.12191v1 |
null |
| 2023-08-23 |
Tumor-Centered Patching for Enhanced Medical Image Segmentation |
Mutyyba Asghar et.al. |
2308.12168v1 |
null |
| 2023-08-23 |
Constant mean curvature hypersurfaces in Anti-de Sitter space |
Enrico Trebeschi et.al. |
2308.12167v1 |
null |
| 2023-08-23 |
NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos |
Ziyu Yang et.al. |
2308.12163v1 |
null |
| 2023-08-23 |
A Probabilistic Fluctuation based Membership Inference Attack for Generative Models |
Wenjie Fu et.al. |
2308.12143v1 |
null |
| 2023-08-23 |
Masking Strategies for Background Bias Removal in Computer Vision Models |
Ananthu Aniraj et.al. |
2308.12127v1 |
link |
| 2023-08-22 |
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization |
Emanuele Bugliarello et.al. |
2308.11606v1 |
link |
| 2023-08-22 |
Semantic Multi-Resolution Communications |
Matin Mortaheb et.al. |
2308.11604v1 |
null |
| 2023-08-22 |
EndoNet: model for automatic calculation of H-score on histological slides |
Egor Ushakov et.al. |
2308.11562v1 |
null |
| 2023-08-22 |
Open Set Synthetic Image Source Attribution |
Shengbang Fang et.al. |
2308.11557v1 |
null |
| 2023-08-22 |
Multi-event Video-Text Retrieval |
Gengyuan Zhang et.al. |
2308.11551v1 |
link |
| 2023-08-22 |
Furnishing Sound Event Detection with Language Model Abilities |
Hualei Wang et.al. |
2308.11530v1 |
null |
| 2023-08-22 |
LCCo: Lending CLIP to Co-Segmentation |
Xin Duan et.al. |
2308.11506v1 |
null |
| 2023-08-23 |
Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition |
Qitong Wang et.al. |
2308.11489v2 |
link |
| 2023-08-22 |
Opening the Vocabulary of Egocentric Actions |
Dibyadip Chatterjee et.al. |
2308.11488v1 |
null |
| 2023-08-22 |
Free Lunch for Gait Recognition: A Novel Relation Descriptor |
Jilong Wang et.al. |
2308.11487v1 |
null |
| 2023-08-21 |
Structured World Models from Human Videos |
Russell Mendonca et.al. |
2308.10901v1 |
null |
| 2023-08-21 |
Unlocking Accuracy and Fairness in Differentially Private Image Classification |
Leonard Berrada et.al. |
2308.10888v1 |
null |
| 2023-08-21 |
Evaluating quantum generative models via imbalanced data classification benchmarks |
Graham R. Enos et.al. |
2308.10847v1 |
null |
| 2023-08-21 |
Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction |
Miaoyu Li et.al. |
2308.10820v1 |
null |
| 2023-08-21 |
Improving Continuous Sign Language Recognition with Cross-Lingual Signs |
Fangyun Wei et.al. |
2308.10809v1 |
null |
| 2023-08-21 |
DynED: Dynamic Ensemble Diversification in Data Stream Classification |
Soheil Abadifard et.al. |
2308.10807v1 |
link |
| 2023-08-21 |
MGMAE: Motion Guided Masking for Video Masked Autoencoding |
Bingkun Huang et.al. |
2308.10794v1 |
null |
| 2023-08-21 |
Extraction of Text from Optic Nerve Optical Coherence Tomography Reports |
Iyad Majid et.al. |
2308.10790v1 |
null |
| 2023-08-21 |
Dense Error Map Estimation for MRI-Ultrasound Registration in Brain Tumor Surgery Using Swin UNETR |
Soorena Salari et.al. |
2308.10784v1 |
null |
| 2023-08-21 |
Superfluid weight in the isolated band limit within the generalized random phase approximation |
Minh Tam et.al. |
2308.10780v1 |
null |
| 2023-08-18 |
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization |
Soumik Mukhopadhyay et.al. |
2308.09716v1 |
link |
| 2023-08-18 |
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis |
Jonathon Luiten et.al. |
2308.09713v1 |
null |
| 2023-08-18 |
SimDA: Simple Diffusion Adapter for Efficient Video Generation |
Zhen Xing et.al. |
2308.09710v1 |
null |
| 2023-08-18 |
Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition |
Xuanyu Yi et.al. |
2308.09694v1 |
null |
| 2023-08-18 |
A Lightweight Transformer for Faster and Robust EBSD Data Collection |
Harry Dong et.al. |
2308.09693v1 |
link |
| 2023-08-18 |
Audiovisual Moments in Time: A Large-Scale Annotated Dataset of Audiovisual Actions |
Michael Joannou et.al. |
2308.09685v1 |
link |
| 2023-08-18 |
Quantifying Uncertainties of Contact Classifications in a Human-Robot Collaboration with Parallel Robots |
Aran Mohammad et.al. |
2308.09675v1 |
null |
| 2023-08-18 |
Classification of modular data up to rank 11 |
Siu-Hung Ng et.al. |
2308.09670v1 |
null |
| 2023-08-18 |
Collision Isolation and Identification Using Proprioceptive Sensing for Parallel Robots to Enable Human-Robot Collaboration |
Aran Mohammad et.al. |
2308.09650v1 |
null |
| 2023-08-18 |
Robust Uncertainty Quantification using Conformalised Monte Carlo Prediction |
Daniel Bethell et.al. |
2308.09647v1 |
link |
| 2023-08-16 |
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions |
Henghui Ding et.al. |
2308.08544v1 |
link |
| 2023-08-16 |
Deployment and Analysis of Instance Segmentation Algorithm for In-field Grade Estimation of Sweetpotatoes |
Hoang M. Nguyen et.al. |
2308.08534v1 |
null |
| 2023-08-16 |
Diagnosing Human-object Interaction Detectors |
Fangrui Zhu et.al. |
2308.08529v1 |
link |
| 2023-08-17 |
Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction |
Yuhao Yang et.al. |
2308.08518v2 |
null |
| 2023-08-17 |
Two-and-a-half Order Score-based Model for Solving 3D Ill-posed Inverse Problems |
Zirong Li et.al. |
2308.08511v2 |
null |
| 2023-08-16 |
ResBuilder: Automated Learning of Depth with Residual Structures |
Julian Burghoff et.al. |
2308.08504v1 |
null |
| 2023-08-16 |
Galactic Archaeology: Tracing the Milky Way's Formation and Evolution through Stellar Populations |
J. Alfredo Collazos et.al. |
2308.08492v1 |
null |
| 2023-08-16 |
Label Propagation Techniques for Artifact Detection in Imbalanced Classes using Photoplethysmogram Signals |
Clara Macabiau et.al. |
2308.08480v1 |
null |
| 2023-08-16 |
DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching |
Johan Edstedt et.al. |
2308.08479v1 |
link |
| 2023-08-16 |
Classification Committee for Active Deep Object Detection |
Lei Zhao et.al. |
2308.08476v1 |
null |
| 2023-08-15 |
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing |
Hao Ouyang et.al. |
2308.07926v1 |
link |
| 2023-08-15 |
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model |
Chuhan Zhang et.al. |
2308.07918v1 |
link |
| 2023-08-15 |
Relightable and Animatable Neural Avatar from Sparse-View Video |
Zhen Xu et.al. |
2308.07903v1 |
null |
| 2023-08-15 |
Back to Basics: A Sanity Check on Modern Time Series Classification Algorithms |
Bhaskar Dhariyal et.al. |
2308.07886v1 |
link |
| 2023-08-15 |
The Challenge of Fetal Cardiac MRI Reconstruction Using Deep Learning |
Denis Prokopenko et.al. |
2308.07885v1 |
null |
| 2023-08-15 |
Towards Temporal Edge Regression: A Case Study on Agriculture Trade Between Nations |
Lekang Jiang et.al. |
2308.07883v1 |
link |
| 2023-08-15 |
Synthesizing Political Zero-Shot Relation Classification via Codebook Knowledge, NLI, and ChatGPT |
Yibo Hu et.al. |
2308.07876v1 |
null |
| 2023-08-15 |
SEDA: Self-Ensembling ViT with Defensive Distillation and Adversarial Training for robust Chest X-rays Classification |
Raza Imam et.al. |
2308.07874v1 |
link |
| 2023-08-15 |
Sequence Processing with Quantum Tensor Networks |
Carys Harvey et.al. |
2308.07865v1 |
null |
| 2023-08-15 |
ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition |
Yixuan Zhou et.al. |
2308.07815v1 |
link |
| 2023-08-14 |
Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification |
Olesya Razuvayevskaya et.al. |
2308.07282v1 |
null |
| 2023-08-14 |
A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer |
Manjary P Gangan et.al. |
2308.07279v1 |
null |
| 2023-08-14 |
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models |
Peng Wang et.al. |
2308.07269v1 |
link |
| 2023-08-14 |
Diving with Penguins: Detecting Penguins and their Prey in Animal-borne Underwater Videos via Deep Learning |
Kejia Zhang et.al. |
2308.07267v1 |
null |
| 2023-08-14 |
Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation |
Liam Chalcroft et.al. |
2308.07251v1 |
link |
| 2023-08-14 |
LCE -- An Augmented Combination of Bagging and Boosting in Python |
Kevin Fauvel et.al. |
2308.07250v1 |
link |
| 2023-08-14 |
Large-scale environment mapping and immersive human-robot interaction for agricultural mobile robot teleoperation |
Tao Liu et.al. |
2308.07231v1 |
null |
| 2023-08-14 |
Almost fine gradings on algebras and classification of gradings up to isomorphism |
Alberto Elduque et.al. |
2308.07230v1 |
null |
| 2023-08-14 |
Distance Matters For Improving Performance Estimation Under Covariate Shift |
Mélanie Roschewitz et.al. |
2308.07223v1 |
link |
| 2023-08-15 |
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes |
Zhaohui Li et.al. |
2308.07221v2 |
link |
| 2023-08-11 |
ARGUS: Visualization of AI-Assisted Task Guidance in AR |
Sonia Castelo et.al. |
2308.06246v1 |
null |
| 2023-08-11 |
Exploring Predicate Visual Context in Detecting of Human-Object Interactions |
Frederic Z. Zhang et.al. |
2308.06202v1 |
link |
| 2023-08-11 |
Weakly Supervised Text Classification on Free Text Comments in Patient-Reported Outcome Measures |
Anna-Grace Linton et.al. |
2308.06199v1 |
null |
| 2023-08-11 |
Physical Adversarial Attacks For Camera-based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook |
Amira Guesmi et.al. |
2308.06173v1 |
null |
| 2023-08-11 |
Extrinsic geometry and linear differential equations of $\mathfrak{sl}_3$-type |
Boris Doubrov et.al. |
2308.06169v1 |
null |
| 2023-08-11 |
Rethinking the Localization in Weakly Supervised Object Localization |
Rui Xu et.al. |
2308.06161v1 |
null |
| 2023-08-11 |
Identification of the Relevance of Comments in Codes Using Bag of Words and Transformer Based Models |
Sruthi S et.al. |
2308.06144v1 |
link |
| 2023-08-11 |
Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping |
Yasser Abdelaziz Dahou Djilali et.al. |
2308.06112v1 |
null |
| 2023-08-11 |
Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation |
Philipp Vaeth et.al. |
2308.06100v1 |
link |
| 2023-08-11 |
Automated Construction of Time-Space Diagrams for Traffic Analysis Using Street-View Video Sequence |
Tanay Rastogi et.al. |
2308.06098v1 |
null |
| 2023-08-10 |
Follow Anything: Open-set detection, tracking, and following in real-time |
Alaa Maalouf et.al. |
2308.05737v1 |
link |
| 2023-08-10 |
FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models |
Guangkai Xu et.al. |
2308.05733v1 |
null |
| 2023-08-10 |
Optimizing Performance of Feedforward and Convolutional Neural Networks through Dynamic Activation Functions |
Chinmay Rane et.al. |
2308.05724v1 |
null |
| 2023-08-10 |
Towards the Automorphism Conjecture I: Combinatorial Control and Compensation for Factorials |
Bernd S. W. Schröder et.al. |
2308.05715v1 |
null |
| 2023-08-10 |
Automatic Extraction of Relevant Road Infrastructure using Connected vehicle data and Deep Learning Model |
Adu-Gyamfi Kojo et.al. |
2308.05658v1 |
null |
| 2023-08-10 |
Attention-based 3D CNN with Multi-layer Features for Alzheimer's Disease Diagnosis using Brain Images |
Yanteng Zhang et.al. |
2308.05655v1 |
null |
| 2023-08-10 |
Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization |
Zezhong Lv et.al. |
2308.05648v1 |
link |
| 2023-08-10 |
Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network |
Wencheng Han et.al. |
2308.05605v1 |
link |
| 2023-08-10 |
Object Goal Navigation with Recursive Implicit Maps |
Shizhe Chen et.al. |
2308.05602v1 |
null |
| 2023-08-10 |
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content |
Xinlei He et.al. |
2308.05596v1 |
null |
| 2023-08-09 |
Improved Multi-Shot Diffusion-Weighted MRI with Zero-Shot Self-Supervised Learning Reconstruction |
Jaejin Cho et.al. |
2308.05103v1 |
link |
| 2023-08-09 |
DOST -- Domain Obedient Self-supervised Training for Multi Label Classification with Noisy Labels |
Soumadeep Saha et.al. |
2308.05101v1 |
null |
| 2023-08-09 |
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling |
Yu Zhao et.al. |
2308.05081v1 |
null |
| 2023-08-10 |
Geometric Learning-Based Transformer Network for Estimation of Segmentation Errors |
Sneha Sree C et.al. |
2308.05068v2 |
null |
| 2023-08-09 |
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection |
Faegheh Sardari et.al. |
2308.05051v1 |
null |
| 2023-08-09 |
Collaborative Wideband Spectrum Sensing and Scheduling for Networked UAVs in UTM Systems |
Sravan Reddy Chintareddy et.al. |
2308.05036v1 |
null |
| 2023-08-09 |
Expert load matters: operating networks at high accuracy and low manual effort |
Sara Sangalli et.al. |
2308.05035v1 |
null |
| 2023-08-09 |
MetRoBERTa: Leveraging Traditional Customer Relationship Management Data to Develop a Transit-Topic-Aware Language Model |
Michael Leong et.al. |
2308.05012v1 |
null |
| 2023-08-09 |
Exploring Multilingual Text Data Distillation |
Shivam Sahni et.al. |
2308.04982v1 |
link |
| 2023-08-09 |
CasCIFF: A Cross-Domain Information Fusion Framework Tailored for Cascade Prediction in Social Networks |
Hongjun Zhu et.al. |
2308.04961v1 |
null |
| 2023-08-08 |
A Deep-Learning Method Using Auto-encoder and Generative Adversarial Network for Anomaly Detection on Ancient Stone Stele Surfaces |
Yikun Liu et.al. |
2308.04426v1 |
null |
| 2023-08-08 |
A Bi-directional Multi-hop Inference Model for Joint Dialog Sentiment Classification and Act Recognition |
Li Zheng et.al. |
2308.04424v1 |
null |
| 2023-08-08 |
DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images |
Xuechao Zou et.al. |
2308.04417v1 |
null |
| 2023-08-08 |
Probabilistic Invariant Learning with Randomized Linear Classifiers |
Leonardo Cotta et.al. |
2308.04412v1 |
null |
| 2023-08-08 |
Data Augmentation-Based Unsupervised Domain Adaptation In Medical Imaging |
Sebastian Nørgaard Llambias et.al. |
2308.04395v1 |
null |
| 2023-08-08 |
SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition |
Xiao Wang et.al. |
2308.04369v1 |
link |
| 2023-08-08 |
Vascular Ageing and Smoking Habit Prediction via a Low-Cost Single-Lead ECG Module |
S. Anas Ali et.al. |
2308.04355v1 |
null |
| 2023-08-08 |
A Lightweight and Accurate Face Detection Algorithm Based on Retinaface |
Baozhu Liu et.al. |
2308.04340v1 |
null |
| 2023-08-08 |
Pengembangan Model untuk Mendeteksi Kerusakan pada Terumbu Karang dengan Klasifikasi Citra (Developing a Model to Detect Coral Reef Damage with Image Classification) |
Fadhil Muhammad et.al. |
2308.04337v1 |
null |
| 2023-08-08 |
Embracing Safe Contacts with Contact-aware Planning and Control |
Zhaoting Li et.al. |
2308.04323v1 |
null |
| 2023-08-07 |
3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields |
Brandon Y. Feng et.al. |
2308.03757v1 |
null |
| 2023-08-07 |
What about translation? New coding system for content analysis on the perception of literary translation around the political transformation in 1989 in Hungary as a classification problem on an unbalanced dataset |
Dalma Galambos et.al. |
2308.03742v1 |
null |
| 2023-08-07 |
Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation |
Renjie Liang et.al. |
2308.03725v1 |
null |
| 2023-08-07 |
Automated Real Time Delineation of Supraclavicular Brachial Plexus in Neck Ultrasonography Videos: A Deep Learning Approach |
Abhay Tyagi et.al. |
2308.03717v1 |
null |
| 2023-08-08 |
Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission |
Bingyan Xie et.al. |
2308.03713v2 |
null |
| 2023-08-07 |
Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience |
A. Emin Orhan et.al. |
2308.03712v1 |
link |
| 2023-08-07 |
Video-based Person Re-identification with Long Short-Term Representation Learning |
Xuehu Liu et.al. |
2308.03703v1 |
null |
| 2023-08-08 |
Screen-based 3D Subjective Experiment Software |
Songlin Fan et.al. |
2308.03698v2 |
null |
| 2023-08-07 |
Learning Concise and Descriptive Attributes for Visual Recognition |
An Yan et.al. |
2308.03685v1 |
null |
| 2023-08-07 |
Detecting Spells in Fantasy Literature with a Transformer Based Artificial Intelligence |
Marcel Moravek et.al. |
2308.03660v1 |
null |
| 2023-08-04 |
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP |
Qihang Yu et.al. |
2308.02487v1 |
link |
| 2023-08-04 |
BlindSage: Label Inference Attacks against Node-level Vertical Federated Graph Neural Networks |
Marco Arazzi et.al. |
2308.02465v1 |
null |
| 2023-08-04 |
Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration |
Juan Del Aguila Ferrandis et.al. |
2308.02459v1 |
null |
| 2023-08-04 |
Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints |
Yasunori Toshimitsu et.al. |
2308.02453v1 |
null |
| 2023-08-04 |
Adaptive Preferential Attached kNN Graph With Distribution-Awareness |
Shaojie Min et.al. |
2308.02442v1 |
link |
| 2023-08-04 |
Scaling Survival Analysis in Healthcare with Federated Survival Forests: A Comparative Study on Heart Failure and Breast Cancer Genomics |
Alberto Archetti et.al. |
2308.02382v1 |
null |
| 2023-08-04 |
Brain MRI Segmentation using Template-Based Training and Visual Perception Augmentation |
Fang-Cheng Yeh et.al. |
2308.02363v1 |
null |
| 2023-08-04 |
T-UNet: Triplet UNet for Change Detection in High-Resolution Remote Sensing Images |
Huan Zhong et.al. |
2308.02356v1 |
link |
| 2023-08-04 |
Adapting to Change: Robust Counterfactual Explanations in Dynamic Data Landscapes |
Bardh Prenkaj et.al. |
2308.02353v1 |
link |
| 2023-08-04 |
Generative Image Priors for MRI Reconstruction Trained from Magnitude-Only Images |
Guanxiong Luo et.al. |
2308.02340v1 |
null |
| 2023-08-03 |
FROD: Robust Object Detection for Free |
Muhammad et.al. |
2308.01888v1 |
null |
| 2023-08-03 |
Similar image retrieval using Autoencoder. I. Automatic morphology classification of galaxies |
Eunsuk Seo et.al. |
2308.01871v1 |
null |
| 2023-08-03 |
Tag Prediction of Competitive Programming Problems using Deep Learning Techniques |
Taha Lokat et.al. |
2308.01863v1 |
null |
| 2023-08-03 |
URET: Universal Robustness Evaluation Toolkit (for Evasion) |
Kevin Eykholt et.al. |
2308.01840v1 |
link |
| 2023-08-03 |
Distribution-Free Inference for the Regression Function of Binary Classification |
Ambrus Tamás et.al. |
2308.01835v1 |
null |
| 2023-08-03 |
Deep Neural Networks Fused with Textures for Image Classification |
Asish Bera et.al. |
2308.01813v1 |
null |
| 2023-08-03 |
Deep Learning-based Prediction of Stress and Strain Maps in Arterial Walls for Improved Cardiovascular Risk Assessment |
Yasin Shokrollahi et.al. |
2308.01771v1 |
null |
| 2023-08-03 |
Focus on Content not Noise: Improving Image Generation for Nuclei Segmentation by Suppressing Steganography in CycleGAN |
Jonas Utz et.al. |
2308.01769v1 |
null |
| 2023-08-03 |
A Novel Tensor Decomposition of arbitrary order based on Block Convolution with Reflective Boundary Conditions for Multi-Dimensional Data Analysis |
Mahdi Molavi et.al. |
2308.01768v1 |
null |
| 2023-08-03 |
NuInsSeg: A Fully Annotated Dataset for Nuclei Instance Segmentation in H&E-Stained Histological Images |
Amirreza Mahbod et.al. |
2308.01760v1 |
link |
| 2023-08-02 |
ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders |
Shawn Xu et.al. |
2308.01317v1 |
null |
| 2023-08-02 |
More Context, Less Distraction: Visual Classification by Inferring and Conditioning on Contextual Attributes |
Bang An et.al. |
2308.01313v1 |
link |
| 2023-08-02 |
Revisiting DETR Pre-training for Object Detection |
Yan Ma et.al. |
2308.01300v1 |
null |
| 2023-08-02 |
A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC |
Masoumeh Javanbakhat et.al. |
2308.01271v1 |
null |
| 2023-08-02 |
Incorporating Season and Solar Specificity into Renderings made by a NeRF Architecture using Satellite Images |
Michael Gableman et.al. |
2308.01262v1 |
link |
| 2023-08-02 |
Quantum Imprint of the Anharmonic Oscillator |
Prisco Lo Chiatto et.al. |
2308.01244v1 |
null |
| 2023-08-03 |
CMUNeXt: An Efficient Medical Image Segmentation Network based on Large Kernel and Skip Fusion |
Fenghe Tang et.al. |
2308.01239v2 |
link |
| 2023-08-02 |
LSF-IDM: Lightweight Deep Learning Models for Automotive Intrusion Detection Model Based on Semantic Fusion |
Pengzhou Cheng et.al. |
2308.01237v1 |
null |
| 2023-08-02 |
JADES. The diverse population of infant Black Holes at 4<z<11: merging, tiny, poor, but mighty |
Roberto Maiolino et.al. |
2308.01230v1 |
null |
| 2023-08-02 |
TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval |
Kaibin Tian et.al. |
2308.01217v1 |
null |
| 2023-08-01 |
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models |
Cheng-Yu Hsieh et.al. |
2308.00675v1 |
null |
| 2023-08-01 |
Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes |
Bohao Fan et.al. |
2308.00628v1 |
link |
| 2023-08-01 |
NeRT: Implicit Neural Representations for General Unsupervised Turbulence Mitigation |
Weiyun Jiang et.al. |
2308.00622v1 |
null |
| 2023-08-01 |
Beyond One-Hot-Encoding: Injecting Semantics to Drive Image Classifiers |
Alan Perotti et.al. |
2308.00607v1 |
link |
| 2023-08-01 |
Relation-Aware Distribution Representation Network for Person Clustering with Multiple Modalities |
Kaijian Liu et.al. |
2308.00588v1 |
null |
| 2023-08-01 |
Gradient Scaling on Deep Spiking Neural Networks with Spike-Dependent Local Information |
Seongsik Park et.al. |
2308.00558v1 |
null |
| 2023-08-01 |
SF-IDS: An Imbalanced Semi-Supervised Learning Framework for Fine-grained Intrusion Detection |
Xinran Zheng et.al. |
2308.00542v1 |
null |
| 2023-08-01 |
Compressed Private Aggregation for Scalable and Robust Federated Learning over Massive Networks |
Natalie Lang et.al. |
2308.00540v1 |
link |
| 2023-08-01 |
Predicting Early Dropouts of an Active and Healthy Ageing App |
Vasileios Perifanis et.al. |
2308.00539v1 |
null |
| 2023-08-01 |
PressureTransferNet: Human Attribute Guided Dynamic Ground Pressure Profile Transfer using 3D simulated Pressure Maps |
Lala Shakti Swarup Ray et.al. |
2308.00538v1 |
null |
| 2023-07-31 |
A Quantized Interband Topological Index in Two-Dimensional Systems |
Tharindu Fernando et.al. |
2307.16893v1 |
null |
| 2023-07-31 |
Foundational Models for Fault Diagnosis of Electrical Motors |
Sriram Anbalagan et.al. |
2307.16891v1 |
null |
| 2023-07-31 |
Discovering Adaptable Symbolic Algorithms from Scratch |
Stephen Kelly et.al. |
2307.16890v1 |
null |
| 2023-07-31 |
Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models |
Weikang Yu et.al. |
2307.16865v1 |
null |
| 2023-07-31 |
Nonlinearity-induced topological phase transition characterized by the nonlinear Chern number |
Kazuki Sone et.al. |
2307.16827v1 |
null |
| 2023-07-31 |
Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection |
Xuanang Chen et.al. |
2307.16816v1 |
null |
| 2023-07-31 |
Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment |
Kun Yuan et.al. |
2307.16813v1 |
null |
| 2023-07-31 |
DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures |
Hannah Rose Kirk et.al. |
2307.16811v1 |
null |
| 2023-07-31 |
DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation |
Yue Zhang et.al. |
2307.16803v1 |
null |
| 2023-07-31 |
Classification with Deep Neural Networks and Logistic Loss |
Zihan Zhang et.al. |
2307.16792v1 |
null |
| 2023-07-28 |
Quantum-noise-limited optical neural networks operating at a few quanta per activation |
Shi-Yuan Ma et.al. |
2307.15712v1 |
null |
| 2023-07-31 |
MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking |
Ruopeng Gao et.al. |
2307.15700v2 |
null |
| 2023-07-28 |
PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding |
Davide Boscaini et.al. |
2307.15692v1 |
null |
| 2023-07-28 |
ODTlearn: A Package for Learning Optimal Decision Trees for Prediction and Prescription |
Patrick Vossler et.al. |
2307.15691v1 |
link |
| 2023-07-28 |
Dynamic Analysis and an Eigen Initializer for Recurrent Neural Networks |
Ran Dou et.al. |
2307.15679v1 |
null |
| 2023-07-28 |
Bayesian Time-Series Classifier for Decoding Simple Visual Stimuli from Intracranial Neural Activity |
Navid Ziaei et.al. |
2307.15672v1 |
null |
| 2023-07-28 |
Classifying core collapse supernova remnants by their morphology as shaped by the last exploding jets |
Noam Soker et.al. |
2307.15666v1 |
null |
| 2023-07-28 |
Multi-layer Aggregation as a key to feature-based OOD detection |
Benjamin Lambert et.al. |
2307.15647v1 |
null |
| 2023-07-28 |
Scale-aware Test-time Click Adaptation for Pulmonary Nodule and Mass Segmentation |
Zhihao Li et.al. |
2307.15645v1 |
link |
| 2023-07-28 |
TriadNet: Sampling-free predictive intervals for lesional volume in 3D brain MR images |
Benjamin Lambert et.al. |
2307.15638v1 |
null |
| 2023-07-27 |
PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking |
Yang Zheng et.al. |
2307.15055v1 |
null |
| 2023-07-27 |
A Transformer-based Approach for Arabic Offline Handwritten Text Recognition |
Saleh Momeni et.al. |
2307.15045v1 |
null |
| 2023-07-27 |
Drive Asymmetry, Convergence and the Origin of Turbulence in ICF Implosions |
Vincent A. Thomas et.al. |
2307.15028v1 |
null |
| 2023-07-27 |
Self-Supervised Graph Transformer for Deepfake Detection |
Aminollah Khormali et.al. |
2307.15019v1 |
null |
| 2023-07-27 |
The last patch for classifying shuffle groups |
Junyang Zhang et.al. |
2307.15012v1 |
null |
| 2023-07-27 |
Gzip versus bag-of-words for text classification with KNN |
Juri Opitz et.al. |
2307.15002v1 |
null |
| 2023-07-27 |
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs |
Or Sharir et.al. |
2307.14988v1 |
null |
| 2023-07-27 |
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models |
Ziyi Wang et.al. |
2307.14971v1 |
link |
| 2023-07-27 |
Federated Model Aggregation via Self-Supervised Priors for Highly Imbalanced Medical Image Classification |
Marawan Elbatel et.al. |
2307.14959v1 |
link |
| 2023-07-27 |
Multi-Source Domain Adaptation through Dataset Dictionary Learning in Wasserstein Space |
Eduardo Fernandes Montesuma et.al. |
2307.14953v1 |
null |
| 2023-07-26 |
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation |
Rajeev Yasarla et.al. |
2307.14336v1 |
null |
| 2023-07-26 |
Event-based Vision for Early Prediction of Manipulation Actions |
Daniel Deniz et.al. |
2307.14332v1 |
null |
| 2023-07-26 |
Waypoint-Based Imitation Learning for Robotic Manipulation |
Lucy Xiaoyang Shi et.al. |
2307.14326v1 |
null |
| 2023-07-26 |
Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis |
Diego Botache et.al. |
2307.14294v1 |
null |
| 2023-07-26 |
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory |
Hongxiang Li et.al. |
2307.14277v1 |
null |
| 2023-07-26 |
Deepfake Image Generation for Improved Brain Tumor Segmentation |
Roa'a Al-Emaryeen et.al. |
2307.14273v1 |
null |
| 2023-07-26 |
Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing |
Max Yang et.al. |
2307.14272v1 |
null |
| 2023-07-26 |
Artifact Restoration in Histology Images with Diffusion Probabilistic Models |
Zhenqi He et.al. |
2307.14262v1 |
link |
| 2023-07-26 |
Defending Adversarial Patches via Joint Region Localizing and Inpainting |
Junwen Chen et.al. |
2307.14242v1 |
null |
| 2023-07-26 |
DisguisOR: Holistic Face Anonymization for the Operating Room |
Lennart Bastian et.al. |
2307.14241v1 |
link |
| 2023-07-25 |
RED CoMETS: An ensemble classifier for symbolically represented multivariate time series |
Luca A. Bennett et.al. |
2307.13679v1 |
link |
| 2023-07-25 |
QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models |
Justin Engelmann et.al. |
2307.13646v1 |
link |
| 2023-07-25 |
Manifestly Covariant Worldline Actions from Coadjoint Orbits. Part I: Generalities and Vectorial Descriptions |
Thomas Basile et.al. |
2307.13644v1 |
null |
| 2023-07-25 |
Optical Flow boosts Unsupervised Localization and Segmentation |
Xinyu Zhang et.al. |
2307.13640v1 |
link |
| 2023-07-25 |
Insights into Cognitive Engagement: Comparing the Effectiveness of Game-Based and Video-Based Learning |
Shayla Sharmin et.al. |
2307.13637v1 |
null |
| 2023-07-25 |
Contributions to the Improvement of Question Answering Systems in the Biomedical Domain |
Mourad Sarrouti et.al. |
2307.13631v1 |
null |
| 2023-07-25 |
Chandra X-ray Observatory Observations of 13 Fermi LAT Sources |
Blagoy Rangelov et.al. |
2307.13594v1 |
null |
| 2023-07-25 |
Reinterpreting survival analysis in the universal approximator age |
Sören Dittmer et.al. |
2307.13579v1 |
link |
| 2023-07-25 |
PT$\mathrm{L}^{p}$: Partial Transport $\mathrm{L}^{p}$ Distances |
Xinran Liu et.al. |
2307.13571v1 |
null |
| 2023-07-25 |
Group Activity Recognition in Computer Vision: A Comprehensive Review, Challenges, and Future Perspectives |
Chuanchuan Wang et.al. |
2307.13541v1 |
null |
| 2023-07-24 |
Leveraging Label Variation in Large Language Models for Zero-Shot Text Classification |
Flor Miriam Plaza-del-Arco et.al. |
2307.12973v1 |
null |
| 2023-07-24 |
A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning |
Benjamin Eysenbach et.al. |
2307.12968v1 |
link |
| 2023-07-24 |
Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment |
Sarah Ibrahimi et.al. |
2307.12964v1 |
null |
| 2023-07-24 |
Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection |
Christopher Clarke et.al. |
2307.12935v1 |
link |
| 2023-07-25 |
Towards a Visual-Language Foundation Model for Computational Pathology |
Ming Y. Lu et.al. |
2307.12914v2 |
null |
| 2023-07-24 |
Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields |
Shangzhan Zhang et.al. |
2307.12909v1 |
null |
| 2023-07-24 |
Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding |
Fabian Brand et.al. |
2307.12864v1 |
null |
| 2023-07-24 |
Multiscale Video Pretraining for Long-Term Activity Forecasting |
Reuben Tan et.al. |
2307.12854v1 |
null |
| 2023-07-25 |
Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion |
C. I. Ugwu et.al. |
2307.12853v2 |
null |
| 2023-07-24 |
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization |
Hancheng Min et.al. |
2307.12851v1 |
null |
| 2023-07-21 |
Advanced Monte Carlo simulation techniques to study polymers under equilibrium conditions |
Monika Angwani et.al. |
2307.11722v1 |
null |
| 2023-07-21 |
Deep Learning Hyperspectral Pansharpening on large scale PRISMA dataset |
Simone Zini et.al. |
2307.11666v1 |
null |
| 2023-07-21 |
FEDD -- Fair, Efficient, and Diverse Diffusion-based Lesion Segmentation and Malignancy Classification |
Héctor Carrión et.al. |
2307.11654v1 |
null |
| 2023-07-21 |
Sparse Cholesky factorization by greedy conditional selection |
Stephen Huan et.al. |
2307.11648v1 |
link |
| 2023-07-24 |
Morphological Image Analysis and Feature Extraction for Reasoning with AI-based Defect Detection and Classification Models |
Jiajun Zhang et.al. |
2307.11643v2 |
null |
| 2023-07-21 |
Deep Reinforcement Learning Based System for Intraoperative Hyperspectral Video Autofocusing |
Charlie Budd et.al. |
2307.11638v1 |
null |
| 2023-07-21 |
Computational Image Formation |
Stanley H. Chan et.al. |
2307.11635v1 |
null |
| 2023-07-21 |
Finding Optimal Diverse Feature Sets with Alternative Feature Selection |
Jakob Bach et.al. |
2307.11607v1 |
null |
| 2023-07-21 |
Cascaded multitask U-Net using topological loss for vessel segmentation and centerline extraction |
Pierre Rougé et.al. |
2307.11603v1 |
null |
| 2023-07-21 |
Mixbiotic society measures: Assessment of community well-going as living system |
Takeshi Kato et.al. |
2307.11594v1 |
null |
| 2023-07-20 |
GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos |
Nisarg A. Shah et.al. |
2307.11081v1 |
link |
| 2023-07-20 |
Driving Policy Prediction based on Deep Learning Models |
Fuxiao Liu et.al. |
2307.11058v1 |
null |
| 2023-07-20 |
Cascade-DETR: Delving into High-Quality Universal Object Detection |
Mingqiao Ye et.al. |
2307.11035v1 |
link |
| 2023-07-20 |
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification |
Neel Guha et.al. |
2307.11031v1 |
null |
| 2023-07-20 |
Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering |
Yijun Dong et.al. |
2307.11030v1 |
null |
| 2023-07-20 |
Multi-objective point cloud autoencoders for explainable myocardial infarction prediction |
Marcel Beetz et.al. |
2307.11017v1 |
null |
| 2023-07-20 |
Treatment And Follow-Up Guidelines For Multiple Brain Metastases: A Systematic Review |
Ana Sofia Santos et.al. |
2307.11016v1 |
null |
| 2023-07-21 |
Dense Sample Deep Learning |
Stephen Josè Hanson et.al. |
2307.10991v2 |
null |
| 2023-07-20 |
Deep Spiking-UNet for Image Processing |
Hebei Li et.al. |
2307.10974v1 |
link |
| 2023-07-20 |
Spinal nerve segmentation method and dataset construction in endoscopic surgical scenarios |
Shaowu Peng et.al. |
2307.10955v1 |
link |
| 2023-07-19 |
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering |
Wei Cheng et.al. |
2307.10173v1 |
link |
| 2023-07-19 |
Adversarial Latent Autoencoder with Self-Attention for Structural Image Synthesis |
Jiajie Fan et.al. |
2307.10166v1 |
null |
| 2023-07-19 |
Leveraging Visemes for Better Visual Speech Representation and Lip Reading |
Javad Peymanfard et.al. |
2307.10157v1 |
null |
| 2023-07-19 |
Remarks on a theorem of Pink in presence of bad reduction |
Wojciech Gajda et.al. |
2307.10140v1 |
null |
| 2023-07-19 |
Gradient Sparsification For Masked Fine-Tuning of Transformers |
James O' Neill et.al. |
2307.10098v1 |
null |
| 2023-07-19 |
Boundary-Refined Prototype Generation: A General End-to-End Paradigm for Semi-Supervised Semantic Segmentation |
Junhao Dong et.al. |
2307.10097v1 |
null |
| 2023-07-19 |
Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis |
Lingting Zhu et.al. |
2307.10094v1 |
null |
| 2023-07-19 |
Divert More Attention to Vision-Language Object Tracking |
Mingzhe Guo et.al. |
2307.10046v1 |
link |
| 2023-07-19 |
A non-monotone extra-gradient trust-region method with noisy oracles |
Natasa Krejic et.al. |
2307.10038v1 |
null |
| 2023-07-20 |
Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition |
Jia-Xin Zhuang et.al. |
2307.10036v2 |
null |
| 2023-07-18 |
AnyDoor: Zero-shot Object-level Image Customization |
Xi Chen et.al. |
2307.09481v1 |
null |
| 2023-07-18 |
FACTS: Facial Animation Creation using the Transfer of Styles |
Jack Saunders et.al. |
2307.09480v1 |
null |
| 2023-07-18 |
GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping |
Zhuoling Li et.al. |
2307.09472v1 |
null |
| 2023-07-18 |
Smooth Attention for Deep Multiple Instance Learning: Application to CT Intracranial Hemorrhage Detection |
Yunan Wu et.al. |
2307.09457v1 |
link |
| 2023-07-19 |
A comparative analysis of SRGAN models |
Fatemeh Rezapoor Nikroo et.al. |
2307.09456v2 |
null |
| 2023-07-19 |
Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers |
Jaeyoung Kim et.al. |
2307.09455v2 |
null |
| 2023-07-18 |
Measuring Student Behavioral Engagement using Histogram of Actions |
Ahmed Abdelkawy et.al. |
2307.09420v1 |
null |
| 2023-07-18 |
Is this Snippet Written by ChatGPT? An Empirical Study with a CodeBERT-Based Classifier |
Phuong T. Nguyen et.al. |
2307.09381v1 |
null |
| 2023-07-18 |
CertPri: Certifiable Prioritization for Deep Neural Networks via Movement Cost in Feature Space |
Haibin Zheng et.al. |
2307.09375v1 |
null |
| 2023-07-18 |
Enhancing Pattern Classification in Support Vector Machines through Matrix Formulation |
Sambhav Jain Reshma Rastogi et.al. |
2307.09372v1 |
null |
| 2023-07-17 |
Diffusion Models Beat GANs on Image Classification |
Soumik Mukhopadhyay et.al. |
2307.08702v1 |
null |
| 2023-07-17 |
Neural Video Depth Stabilizer |
Yiran Wang et.al. |
2307.08695v1 |
link |
| 2023-07-17 |
SEMI-DiffusionInst: A Diffusion Model Based Approach for Semiconductor Defect Classification and Segmentation |
Vic De Ridder et.al. |
2307.08693v1 |
null |
| 2023-07-17 |
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning |
Tri Dao et.al. |
2307.08691v1 |
link |
| 2023-07-17 |
Implementation of a perception system for autonomous vehicles using a detection-segmentation network in SoC FPGA |
Maciej Baczmanski et.al. |
2307.08682v1 |
null |
| 2023-07-17 |
Neural Image Compression: Generalization, Robustness, and Spectral Biases |
Kelsey Lieberman et.al. |
2307.08657v1 |
null |
| 2023-07-17 |
PolyGNN: Polyhedron-based Graph Neural Network for 3D Building Reconstruction from Point Clouds |
Zhaiyu Chen et.al. |
2307.08636v1 |
null |
| 2023-07-17 |
Deficiency-Aware Masked Transformer for Video Inpainting |
Yongsheng Yu et.al. |
2307.08629v1 |
link |
| 2023-07-17 |
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs |
Yang Zhao et.al. |
2307.08581v1 |
null |
| 2023-07-18 |
Deep Learning with Passive Optical Nonlinear Mapping |
Fei Xia et.al. |
2307.08558v2 |
null |
| 2023-07-14 |
Expressive Monotonic Neural Networks |
Ouail Kitouni et.al. |
2307.07512v1 |
link |
| 2023-07-14 |
Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data |
Qi-Wei Wang et.al. |
2307.07509v1 |
null |
| 2023-07-14 |
Brain Tumor Detection using Convolutional Neural Networks with Skip Connections |
Aupam Hamran et.al. |
2307.07503v1 |
null |
| 2023-07-14 |
TALL: Thumbnail Layout for Deepfake Video Detection |
Yuting Xu et.al. |
2307.07494v1 |
null |
| 2023-07-14 |
DreamTeacher: Pretraining Image Backbones with Deep Generative Models |
Daiqing Li et.al. |
2307.07487v1 |
null |
| 2023-07-14 |
Multimodal Distillation for Egocentric Action Recognition |
Gorjan Radevski et.al. |
2307.07483v1 |
null |
| 2023-07-14 |
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification |
Simon Holdenried-Krafft et.al. |
2307.07482v1 |
null |
| 2023-07-14 |
Passage-times for partially-homogeneous reflected random walks on the quadrant |
Conrado da Costa et.al. |
2307.07458v1 |
null |
| 2023-07-14 |
An equivariant surgery classification of $C_p$-surfaces |
Kelly Pohland et.al. |
2307.07446v1 |
null |
| 2023-07-14 |
Can Large Language Models Empower Molecular Property Prediction? |
Chen Qian et.al. |
2307.07443v1 |
link |
| 2023-07-13 |
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition |
Syed Talal Wasim et.al. |
2307.06947v1 |
link |
| 2023-07-13 |
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation |
Yi Wang et.al. |
2307.06942v1 |
link |
| 2023-07-13 |
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation |
Yingqing He et.al. |
2307.06940v1 |
link |
| 2023-07-13 |
DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding |
Shuijing Liu et.al. |
2307.06924v1 |
null |
| 2023-07-13 |
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks |
Liam Collins et.al. |
2307.06887v1 |
null |
| 2023-07-13 |
LVLane: Deep Learning for Lane Detection and Classification in Challenging Conditions |
Zillur Rahman et.al. |
2307.06853v1 |
link |
| 2023-07-13 |
Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks |
Denis Coquenet et.al. |
2307.06795v1 |
link |
| 2023-07-13 |
Robotic surface exploration with vision and tactile sensing for cracks detection and characterisation |
Francesca Palermo et.al. |
2307.06784v1 |
null |
| 2023-07-13 |
Generalizing Supervised Deep Learning MRI Reconstruction to Multiple and Unseen Contrasts using Meta-Learning Hypernetworks |
Sriprabha Ramanarayanan et.al. |
2307.06771v1 |
link |
| 2023-07-13 |
Pairs of inner projections and two applications |
Ramlal Debnath et.al. |
2307.06744v1 |
null |
| 2023-07-12 |
Deep Learning of Crystalline Defects from TEM images: A Solution for the Problem of "Never Enough Training Data" |
Kishan Govind et.al. |
2307.06322v1 |
null |
| 2023-07-12 |
A geometric classification of rod complements in the 3-torus |
Connie On Yu Hui et.al. |
2307.06317v1 |
null |
| 2023-07-12 |
Facial Reenactment Through a Personalized Generator |
Ariel Elazary et.al. |
2307.06307v1 |
null |
| 2023-07-12 |
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution |
Mostafa Dehghani et.al. |
2307.06304v1 |
null |
| 2023-07-12 |
Feature Embeddings from Large-Scale Acoustic Bird Classifiers Enable Few-Shot Transfer Learning |
Burooj Ghani et.al. |
2307.06292v1 |
null |
| 2023-07-12 |
Stochastic Light Field Holography |
Florian Schiffers et.al. |
2307.06277v1 |
null |
| 2023-07-12 |
Machine learning and Topological data analysis identify unique features of human papillae in 3D scans |
Rayna Andreeva et.al. |
2307.06255v1 |
null |
| 2023-07-12 |
On the Importance of Denoising when Learning to Compress Images |
Benoit Brummer et.al. |
2307.06233v1 |
link |
| 2023-07-12 |
Ashaar: Automatic Analysis and Generation of Arabic Poetry Using Deep Learning Approaches |
Zaid Alyafeai et.al. |
2307.06218v1 |
link |
| 2023-07-12 |
Local Conditional Neural Fields for Versatile and Generalizable Large-Scale Reconstructions in Computational Imaging |
Hao Wang et.al. |
2307.06207v1 |
null |
| 2023-07-11 |
Fractonic Higher-Order Topological Phases in Open Quantum Systems |
Jian-Hao Zhang et.al. |
2307.05474v1 |
null |
| 2023-07-11 |
Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives |
Tom Monnier et.al. |
2307.05473v1 |
null |
| 2023-07-11 |
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone |
Shraman Pramanick et.al. |
2307.05463v1 |
null |
| 2023-07-11 |
Improving the Security of Smartwatch Payment with Deep Learning |
George Webber et.al. |
2307.05437v1 |
null |
| 2023-07-11 |
One-Versus-Others Attention: Scalable Multimodal Integration |
Michal Golovanevsky et.al. |
2307.05435v1 |
link |
| 2023-07-11 |
Identifying Acoustic Wave Sources on the Sun. II. Improved Filter Techniques for Source Wavefield Seismology |
Shah Mohammad Bahauddin et.al. |
2307.05433v1 |
null |
| 2023-07-11 |
Effective Whitney Stratification of Real Algebraic Varieties |
Martin Helmer et.al. |
2307.05427v1 |
null |
| 2023-07-11 |
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform |
Mateusz Wójcik et.al. |
2307.05399v1 |
link |
| 2023-07-11 |
ShredGP: Guitarist Style-Conditioned Tablature Generation |
Pedro Sarmento et.al. |
2307.05324v1 |
null |
| 2023-07-11 |
Class Instance Balanced Learning for Long-Tailed Classification |
Marc-Antoine Lavoie et.al. |
2307.05322v1 |
null |
| 2023-07-10 |
Semantic-SAM: Segment and Recognize Anything at Any Granularity |
Feng Li et.al. |
2307.04767v1 |
link |
| 2023-07-10 |
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos |
Sagnik Majumder et.al. |
2307.04760v1 |
null |
| 2023-07-10 |
Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement |
Anthony Simeonov et.al. |
2307.04751v1 |
null |
| 2023-07-10 |
RoCo: Dialectic Multi-Robot Collaboration with Large Language Models |
Zhao Mandi et.al. |
2307.04738v1 |
link |
| 2023-07-10 |
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning |
Yuwei Guo et.al. |
2307.04725v1 |
null |
| 2023-07-10 |
Quark/Gluon Discrimination and Top Tagging with Dual Attention Transformer |
Minxuan He et.al. |
2307.04723v1 |
null |
| 2023-07-10 |
CVPR MultiEarth 2023 Deforestation Estimation Challenge:SpaceVision4Amazon |
Sunita Arya et.al. |
2307.04715v1 |
null |
| 2023-07-10 |
Multimodal brain age estimation using interpretable adaptive population-graph learning |
Kyriaki-Margarita Bintsi et.al. |
2307.04639v1 |
null |
| 2023-07-10 |
Learning Fine Pinch-Grasp Skills using Tactile Sensing from Real Demonstration Data |
Xiaofeng Mao et.al. |
2307.04619v1 |
null |
| 2023-07-10 |
Weakly-supervised positional contrastive learning: application to cirrhosis classification |
Emma Sarfati et.al. |
2307.04617v1 |
null |
| 2023-07-07 |
On the representation theory of cyclic and dihedral quandles |
Mohamed Elhamdadi et.al. |
2307.03728v1 |
null |
| 2023-07-07 |
Polybot: Training One Policy Across Robots While Embracing Variability |
Jonathan Yang et.al. |
2307.03719v1 |
null |
| 2023-07-07 |
Motion Magnification in Robotic Sonography: Enabling Pulsation-Aware Artery Segmentation |
Dianye Huang et.al. |
2307.03698v1 |
null |
| 2023-07-07 |
Detecting the Sensing Area of A Laparoscopic Probe in Minimally Invasive Cancer Surgery |
Baoru Huang et.al. |
2307.03662v1 |
null |
| 2023-07-07 |
Physical-aware Cross-modal Adversarial Network for Wearable Sensor-based Human Action Recognition |
Jianyuan Ni et.al. |
2307.03638v1 |
null |
| 2023-07-07 |
VesselVAE: Recursive Variational Autoencoders for 3D Blood Vessel Synthesis |
Paula Feldman et.al. |
2307.03592v1 |
null |
| 2023-07-07 |
SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks |
Xingyu Lin et.al. |
2307.03567v1 |
null |
| 2023-07-07 |
VariGrad: A Novel Feature Vector Architecture for Geometric Deep Learning on Unregistered Data |
Emmanuel Hartman et.al. |
2307.03553v1 |
null |
| 2023-07-07 |
TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning |
Zelun Zhang et.al. |
2307.03465v1 |
null |
| 2023-07-07 |
A Deep Active Contour Model for Delineating Glacier Calving Fronts |
Konrad Heidler et.al. |
2307.03461v1 |
null |
| 2023-07-06 |
Synthesizing Artistic Cinemagraphs from Text |
Aniruddha Mahapatra et.al. |
2307.03190v1 |
null |
| 2023-07-06 |
Long-term follow-up observations of extreme coronal line emitting galaxies |
Peter Clark et.al. |
2307.03182v1 |
null |
| 2023-07-06 |
Push Past Green: Learning to Look Behind Plant Foliage by Moving It |
Xiaoyu Zhang et.al. |
2307.03175v1 |
null |
| 2023-07-06 |
VideoGLUE: Video General Understanding Evaluation of Foundation Models |
Liangzhe Yuan et.al. |
2307.03166v1 |
null |
| 2023-07-06 |
Can Domain Adaptation Improve Accuracy and Fairness of Skin Lesion Classification? |
Janet Wang et.al. |
2307.03157v1 |
null |
| 2023-07-06 |
MultiVENT: Multilingual Videos of Events with Aligned Natural Text |
Kate Sanders et.al. |
2307.03153v1 |
null |
| 2023-07-06 |
Topology-Aware Loss for Aorta and Great Vessel Segmentation in Computed Tomography Images |
Seher Ozcelik et.al. |
2307.03137v1 |
null |
| 2023-07-06 |
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability |
Xuanlin Li et.al. |
2307.03135v1 |
link |
| 2023-07-06 |
Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification |
Yongcan Yu et.al. |
2307.03133v1 |
link |
| 2023-07-06 |
VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering |
Zijun Yao et.al. |
2307.03130v1 |
null |
| 2023-07-05 |
Building Cooperative Embodied Agents Modularly with Large Language Models |
Hongxin Zhang et.al. |
2307.02485v1 |
null |
| 2023-07-05 |
Elastic Decision Transformer |
Yueh-Hua Wu et.al. |
2307.02484v1 |
null |
| 2023-07-05 |
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? |
Yan Zeng et.al. |
2307.02469v1 |
null |
| 2023-07-05 |
Supersymmetric asymptotically locally AdS$_5$ gravitational solitons |
Turkuler Durgut et.al. |
2307.02466v1 |
null |
| 2023-07-05 |
AxonCallosumEM Dataset: Axon Semantic Segmentation of Whole Corpus Callosum cross section from EM Images |
Ao Cheng et.al. |
2307.02464v1 |
null |
| 2023-07-05 |
Expert-Agnostic Ultrasound Image Quality Assessment using Deep Variational Clustering |
Deepak Raina et.al. |
2307.02462v1 |
null |
| 2023-07-05 |
LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion |
Long Bai et.al. |
2307.02452v1 |
link |
| 2023-07-05 |
On Deep Learning Classification of Digitally Modulated Signals Using Raw I/Q Data |
John A. Snoap et.al. |
2307.02450v1 |
null |
| 2023-07-05 |
Vulnerable Source Code Detection using SonarCloud Code Analysis |
Alifia Puspaningrum et.al. |
2307.02446v1 |
null |
| 2023-07-05 |
Base Layer Efficiency in Scalable Human-Machine Coding |
Yalda Foroutan et.al. |
2307.02430v1 |
null |
| 2023-07-03 |
Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning |
Yuxiang Zhang et.al. |
2307.01200v1 |
null |
| 2023-07-03 |
Segment Anything Meets Point Tracking |
Frano Rajič et.al. |
2307.01197v1 |
link |
| 2023-07-03 |
Online nearest neighbor classification |
Sanjoy Dasgupta et.al. |
2307.01170v1 |
null |
| 2023-07-03 |
Don't freeze: Finetune encoders for better Self-Supervised HAR |
Vitor Fortes Rey et.al. |
2307.01168v1 |
null |
| 2023-07-03 |
Characteristic signatures of accreting binary black holes produced by eccentric minidisks |
John Ryan Westernacher-Schneider et.al. |
2307.01154v1 |
null |
| 2023-07-03 |
Integral cohomology rings of weighted Grassmann orbifolds and Rigidity properties |
Koushik Brahma et.al. |
2307.01153v1 |
null |
| 2023-07-03 |
Investigating Data Memorization in 3D Latent Diffusion Models for Medical Image Synthesis |
Salman Ul Hassan Dar et.al. |
2307.01148v1 |
null |
| 2023-07-05 |
AVSegFormer: Audio-Visual Segmentation with Transformer |
Shengyi Gao et.al. |
2307.01146v2 |
link |
| 2023-07-03 |
Cross-modality Attention Adapter: A Glioma Segmentation Fine-tuning Method for SAM Using Multimodal Brain MR Images |
Xiaoyu Shi et.al. |
2307.01124v1 |
null |
| 2023-07-03 |
Supervised Manifold Learning via Random Forest Geometry-Preserving Proximities |
Jake S. Rhodes et.al. |
2307.01077v1 |
null |
| 2023-07-03 |
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs |
Lijun Yu et.al. |
2306.17842v2 |
null |
| 2023-06-30 |
Learning Evacuee Models from Robot-Guided Emergency Evacuation Experiments |
Mollik Nayyar et.al. |
2306.17824v1 |
null |
| 2023-06-30 |
Act3D: Infinite Resolution Action Detection Transformer for Robotic Manipulation |
Theophile Gervet et.al. |
2306.17817v1 |
null |
| 2023-06-30 |
Topologically Attributed Graphs for Shape Discrimination |
Justin Curry et.al. |
2306.17805v1 |
null |
| 2023-06-30 |
Vision Through the Veil: Differential Privacy in Federated Learning for Medical Image Classification |
Kishore Babu Nampalle et.al. |
2306.17794v1 |
null |
| 2023-06-30 |
Precision Anti-Cancer Drug Selection via Neural Ranking |
Vishal Dey et.al. |
2306.17771v1 |
null |
| 2023-06-30 |
Improved NL2SQL based on Multi-layer Expert Network |
Chenduo Hao et.al. |
2306.17727v1 |
null |
| 2023-06-30 |
Content-Preserving Diffusion Model for Unsupervised AS-OCT image Despeckling |
Li Sanqian et.al. |
2306.17717v1 |
null |
| 2023-06-30 |
Evaluation of the Benefits of Zero Velocity Update in Decentralized EKF-Based Cooperative Localization Algorithms for GNSS-Denied Multi-Robot Systems |
Cagri Kilic et.al. |
2306.17703v1 |
null |
| 2023-06-30 |
Generalized Time Warping Invariant Dictionary Learning for Time Series Classification and Clustering |
Ruiyu Xu et.al. |
2306.17690v1 |
null |
| 2023-06-29 |
An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training |
Zitian Chen et.al. |
2306.17165v1 |
null |
| 2023-06-29 |
Can Machines Garden? Systematically Comparing the AlphaGarden vs. Professional Horticulturalists |
Simeon Adebola et.al. |
2306.17162v1 |
null |
| 2023-06-29 |
FogROS2-SGC: A ROS2 Cloud Robotics Platform for Secure Global Connectivity |
Kaiyuan Chen et.al. |
2306.17157v1 |
null |
| 2023-06-29 |
Orbit Classification of asteroids using implementation of radial Basis Function on Support Vector Machines |
Yashvir Tiberwal et.al. |
2306.17138v1 |
null |
| 2023-06-29 |
On separably integrable symmetric convex bodies |
Vladyslav Yaskin et.al. |
2306.17127v1 |
null |
| 2023-06-29 |
PVP: Personalized Video Prior for Editable Dynamic Portraits using StyleGAN |
Kai-En Lin et.al. |
2306.17123v1 |
null |
| 2023-06-29 |
Learning Nuclei Representations with Masked Image Modelling |
Piotr Wójcik et.al. |
2306.17116v1 |
null |
| 2023-06-29 |
Deep Ensemble for Rotorcraft Attitude Prediction |
Hikmat Khan et.al. |
2306.17104v1 |
null |
| 2023-06-29 |
Twice Binnable Color Filter Arrays |
Mritunjay Singh et.al. |
2306.17078v1 |
null |
| 2023-06-29 |
Extremal behavior of reduced type of one dimensional rings |
Sarasij Maitra et.al. |
2306.17069v1 |
null |
| 2023-06-28 |
Class Numbers, Congruent Numbers and Umbral Moonshine |
Miranda C. N. Cheng et.al. |
2306.16414v1 |
null |
| 2023-06-28 |
Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise |
Ilias Diakonikolas et.al. |
2306.16352v1 |
null |
| 2023-06-28 |
Accurate, uncertainty-aware classification of molecular chemical motifs from multi-modal X-ray absorption spectroscopy |
Matthew R. Carbone et.al. |
2306.16349v1 |
null |
| 2023-06-28 |
DoseDiff: Distance-aware Diffusion Model for Dose Prediction in Radiotherapy |
Yiwen Zhang et.al. |
2306.16324v1 |
null |
| 2023-06-28 |
Universal theory of spin-momentum-orbital-site locking |
Yuntian Liu et.al. |
2306.16312v1 |
null |
| 2023-06-28 |
Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis |
An Wang et.al. |
2306.16285v1 |
link |
| 2023-06-28 |
Emotion Analysis of Tweets Banning Education in Afghanistan |
Mohammad Ali Hussiny et.al. |
2306.16268v1 |
null |
| 2023-06-28 |
Reconfigurable Robot Control Using Flexible Coupling Mechanisms |
Sha Yi et.al. |
2306.16265v1 |
null |
| 2023-06-28 |
Latent SDEs on Homogeneous Spaces |
Sebastian Zeng et.al. |
2306.16248v1 |
null |
| 2023-06-28 |
Investigating the Uncanny Valley Phenomenon Through the Temporal Dynamics of Neural Responses to Virtual Characters |
Chiara Gorlini et.al. |
2306.16233v1 |
null |
| 2023-06-27 |
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties |
Hsiao-Yu Tung et.al. |
2306.15668v1 |
null |
| 2023-06-27 |
Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs |
Navindu Leelarathna et.al. |
2306.15661v1 |
null |
| 2023-06-27 |
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos |
Chiori Hori et.al. |
2306.15644v1 |
null |
| 2023-06-27 |
Biclustering random matrix partitions with an application to classification of forensic body fluids |
Chieh-Hsi Wu et.al. |
2306.15622v1 |
null |
| 2023-06-27 |
Recurrent Neural Network-coupled SPAD TCSPC System for Real-time Fluorescence Lifetime Imaging |
Yang Lin et.al. |
2306.15599v1 |
null |
| 2023-06-27 |
Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning |
Sherly Alfonso-Sánchez et.al. |
2306.15585v1 |
null |
| 2023-06-27 |
Parity doublet model for baryon octets: diquark classifications and mass hierarchy based on the quark-line diagram |
Takuya Minamikawa et.al. |
2306.15564v1 |
null |
| 2023-06-27 |
You Can Mask More For Extremely Low-Bitrate Image Compression |
Anqi Li et.al. |
2306.15561v1 |
link |
| 2023-06-27 |
A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms |
Cristina Silvano et.al. |
2306.15552v1 |
null |
| 2023-06-27 |
Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames |
Yunfan Lu et.al. |
2306.15507v1 |
null |
| 2023-06-26 |
FunQA: Towards Surprising Video Comprehension |
Binzhu Xie et.al. |
2306.14899v1 |
link |
| 2023-06-26 |
Mapping out phase diagrams with generative classifiers |
Julian Arnold et.al. |
2306.14894v1 |
null |
| 2023-06-26 |
Fuzzy-Conditioned Diffusion and Diffusion Projection Attention Applied to Facial Image Correction |
Majed El Helou et.al. |
2306.14891v1 |
link |
| 2023-06-26 |
A Fully Unsupervised Instance Segmentation Technique for White Blood Cell Images |
Shrijeet Biswas et.al. |
2306.14875v1 |
null |
| 2023-06-26 |
ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots |
David Hoeller et.al. |
2306.14874v1 |
null |
| 2023-06-26 |
Leveraging Task Structures for Improved Identifiability in Neural Network Representations |
Wenlin Chen et.al. |
2306.14861v1 |
null |
| 2023-06-26 |
ViNT: A Foundation Model for Visual Navigation |
Dhruv Shah et.al. |
2306.14846v1 |
null |
| 2023-06-26 |
An open-source robust machine learning platform for real-time detection and classification of 2D material flakes |
Jan-Lucas Uslu et.al. |
2306.14845v1 |
null |
| 2023-06-26 |
A Flyweight CNN with Adaptive Decoder for Schistosoma mansoni Egg Detection |
Leonardo de Melo Joao et.al. |
2306.14840v1 |
null |
| 2023-06-26 |
Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification |
Chih-Yao Chen et.al. |
2306.14822v1 |
link |
| 2023-06-23 |
Adversarial Robustness Certification for Bayesian Neural Networks |
Matthew Wicker et.al. |
2306.13614v1 |
link |
| 2023-06-23 |
TACOformer:Token-channel compounded Cross Attention for Multimodal Emotion Recognition |
Xinda Li et.al. |
2306.13592v1 |
null |
| 2023-06-23 |
Estimating Residential Solar Potential Using Aerial Data |
Ross Goroshin et.al. |
2306.13564v1 |
null |
| 2023-06-23 |
Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning |
Takumi Yoshida et.al. |
2306.13561v1 |
null |
| 2023-06-26 |
FPGA Implementation of Convolutional Neural Network for Real-Time Handwriting Recognition |
Shichen Qiao et.al. |
2306.13557v2 |
link |
| 2023-06-23 |
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation |
Massimiliano Patacchiola et.al. |
2306.13554v1 |
link |
| 2023-06-23 |
Manifold Contrastive Learning with Variational Lie Group Operators |
Kion Fallah et.al. |
2306.13544v1 |
null |
| 2023-06-23 |
Torsion Graph Neural Networks |
Cong Shen et.al. |
2306.13541v1 |
link |
| 2023-06-23 |
Topological learning for the classification of disorder: an application to the design of metasurfaces |
Tristan Madeleine et.al. |
2306.13540v1 |
null |
| 2023-06-23 |
WBCAtt: A White Blood Cell Dataset Annotated with Detailed Morphological Attributes |
Satoshi Tsutsui et.al. |
2306.13531v1 |
link |
| 2023-06-22 |
A Comparison of Time-based Models for Multimodal Emotion Recognition |
Ege Kesim et.al. |
2306.13076v1 |
null |
| 2023-06-22 |
Auditing Predictive Models for Intersectional Biases |
Kate S. Boxer et.al. |
2306.13064v1 |
null |
| 2023-06-22 |
Impacts and Risk of Generative AI Technology on Cyber Defense |
Subash Neupane et.al. |
2306.13033v1 |
null |
| 2023-06-22 |
Toward Automated Detection of Microbleeds with Anatomical Scale Localization: A Complete Clinical Diagnosis Support Using Deep Learning |
Jun-Ho Kim et.al. |
2306.13020v1 |
null |
| 2023-06-22 |
Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers |
Qi Jiang et.al. |
2306.12992v1 |
link |
| 2023-06-22 |
Can a single image processing algorithm work equally well across all phases of DCE-MRI? |
Adam G. Tattersall et.al. |
2306.12988v1 |
null |
| 2023-06-22 |
Radiation Emission during the Erasure of Magnetic Monopoles |
Maximilian Bachmaier et.al. |
2306.12958v1 |
null |
| 2023-06-22 |
Robust Semantic Segmentation: Strong Adversarial Attacks and Fast Training of Robust Models |
Francesco Croce et.al. |
2306.12941v1 |
link |
| 2023-06-22 |
Deficit of Hot Dust in Low-redshift Active Galactic Nuclei |
Suyeon Son et.al. |
2306.12927v1 |
null |
| 2023-06-22 |
Machine-Learning-Assisted and Real-Time-Feedback-Controlled Growth of InAs/GaAs Quantum Dots |
Chao Shen et.al. |
2306.12898v1 |
null |
| 2023-06-21 |
Spectroscopy of the Supernova H0pe Host Galaxy at Redshift 1.78 |
M. Polletta et.al. |
2306.12385v1 |
null |
| 2023-06-21 |
Geometric Algorithms for $k$-NN Poisoning |
Diego Ihara Centurion et.al. |
2306.12377v1 |
null |
| 2023-06-21 |
M-VAAL: Multimodal Variational Adversarial Active Learning for Downstream Medical Image Analysis Tasks |
Bidur Khanal et.al. |
2306.12376v1 |
link |
| 2023-06-21 |
One Policy to Dress Them All: Learning to Dress People with Diverse Poses and Garments |
Yufei Wang et.al. |
2306.12372v1 |
null |
| 2023-06-21 |
Attention Hybrid Variational Net for Accelerated MRI Reconstruction |
Guoyao Shen et.al. |
2306.12365v1 |
null |
| 2023-06-21 |
Linear and Non-Linear Barrier Coverage in Deterministic and Uncertain environment in WSNs: A New Classification |
Adda Boualem et.al. |
2306.12355v1 |
null |
| 2023-06-21 |
An efficient, provably exact algorithm for the 0-1 loss linear classification problem |
Xi He et.al. |
2306.12344v1 |
null |
| 2023-06-21 |
Geometric Pooling: maintaining more useful information |
Hao Xu et.al. |
2306.12341v1 |
null |
| 2023-06-22 |
Do you still need a manual smart contract audit? |
Isaac David et.al. |
2306.12338v2 |
null |
| 2023-06-22 |
Beyond Deep Ensembles: A Large-Scale Evaluation of Bayesian Deep Learning under Distribution Shift |
Florian Seligmann et.al. |
2306.12306v2 |
link |
| 2023-06-20 |
Segment Anything Model (SAM) for Radiation Oncology |
Lian Zhang et.al. |
2306.11730v1 |
null |
| 2023-06-20 |
Dense Video Object Captioning from Disjoint Supervision |
Xingyi Zhou et.al. |
2306.11729v1 |
link |
| 2023-06-20 |
How can objects help action recognition? |
Xingyi Zhou et.al. |
2306.11726v1 |
link |
| 2023-06-20 |
Low-complexity Multidimensional DCT Approximations |
V. A. Coutinho et.al. |
2306.11724v1 |
null |
| 2023-06-20 |
Meta-Analysis of Transfer Learning for Segmentation of Brain Lesions |
Sovesh Mohapatra et.al. |
2306.11714v1 |
null |
| 2023-06-20 |
Hexagonal circular 3-webs with polar curves of degree three |
Sergey I. Agafonov et.al. |
2306.11707v1 |
null |
| 2023-06-20 |
SkyGPT: Probabilistic Short-term Solar Forecasting Using Synthetic Sky Videos from Physics-constrained VideoGPT |
Yuhao Nie et.al. |
2306.11682v1 |
null |
| 2023-06-20 |
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks |
Yuan Cao et.al. |
2306.11680v1 |
null |
| 2023-06-20 |
A primal-dual data-driven method for computational optical imaging with a photonic lantern |
Carlos Santos Garcia et.al. |
2306.11679v1 |
null |
| 2023-06-20 |
Deep Learning Methods for Retinal Blood Vessel Segmentation: Evaluation on Images with Retinopathy of Prematurity |
Gorana Gojić et.al. |
2306.11576v1 |
null |
| 2023-06-16 |
Variational quantum algorithms for machine learning: theory and applications |
Stefano Mangini et.al. |
2306.09984v1 |
null |
| 2023-06-16 |
HePCo: Data-Free Heterogeneous Prompt Consolidation for Continual Federated Learning |
Shaunak Halbe et.al. |
2306.09970v1 |
null |
| 2023-06-16 |
Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign? |
Erin George et.al. |
2306.09955v1 |
null |
| 2023-06-16 |
Towards Better Certified Segmentation via Diffusion Models |
Othmane Laousy et.al. |
2306.09949v1 |
null |
| 2023-06-16 |
Knowledge Distillation for Efficient Audio-Visual Video Captioning |
Özkan Çaylı et.al. |
2306.09947v1 |
null |
| 2023-06-16 |
RealImpact: A Dataset of Impact Sound Fields for Real Objects |
Samuel Clarke et.al. |
2306.09944v1 |
null |
| 2023-06-16 |
A classification of supersymmetric Kaluza-Klein black holes with a single axial symmetry |
David Katona et.al. |
2306.09933v1 |
null |
| 2023-06-16 |
A Metaheuristic-based Machine Learning Approach for Energy Prediction in Mobile App Development |
Seyed Jalaleddin Mousavirad et.al. |
2306.09931v1 |
null |
| 2023-06-16 |
Learning to Summarize and Answer Questions about a Virtual Robot's Past Actions |
Chad DeChant et.al. |
2306.09922v1 |
null |
| 2023-06-16 |
No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference |
Animesh Nighojkar et.al. |
2306.09918v1 |
null |
| 2023-06-16 |
UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video |
Zhi-Hao Lin et.al. |
2306.09349v2 |
null |
| 2023-06-15 |
Causal classification of spatiotemporal quantum correlations |
Minjeong Song et.al. |
2306.09336v1 |
null |
| 2023-06-15 |
Class-Conditional Conformal Prediction With Many Classes |
Tiffany Ding et.al. |
2306.09335v1 |
link |
| 2023-06-15 |
Personalized Image Enhancement Featuring Masked Style Modeling |
Satoshi Kosugi et.al. |
2306.09334v1 |
link |
| 2023-06-15 |
Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers |
Dominick Reilly et.al. |
2306.09331v1 |
link |
| 2023-06-15 |
WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings |
Zijie J. Wang et.al. |
2306.09328v1 |
link |
| 2023-06-15 |
Language-Guided Music Recommendation for Video via Prompt Analogies |
Daniel McKee et.al. |
2306.09327v1 |
null |
| 2023-06-15 |
Single-Stage Visual Query Localization in Egocentric Videos |
Hanwen Jiang et.al. |
2306.09324v1 |
null |
| 2023-06-15 |
Crowd-Powered Photo Enhancement Featuring an Active Learning Based Local Filter |
Satoshi Kosugi et.al. |
2306.09321v1 |
link |
| 2023-06-15 |
Learnable Weight Initialization for Volumetric Medical Image Segmentation |
Shahina Kunhimon et.al. |
2306.09320v1 |
link |
| 2023-06-13 |
Classification of branched Willmore spheres |
Dorian Martino et.al. |
2306.07965v1 |
null |
| 2023-06-13 |
Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters |
Ganesh Ramachandra Kini et.al. |
2306.07960v1 |
link |
| 2023-06-13 |
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation |
Shuai Yang et.al. |
2306.07954v1 |
null |
| 2023-06-13 |
MOFI: Learning Image Representations from Noisy Entity Annotated Images |
Wentao Wu et.al. |
2306.07952v1 |
null |
| 2023-06-13 |
Image Captioners Are Scalable Vision Learners Too |
Michael Tschannen et.al. |
2306.07915v1 |
null |
| 2023-06-13 |
Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark |
Łukasz Augustyniak et.al. |
2306.07902v1 |
link |
| 2023-06-13 |
Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks |
Veniamin Veselovsky et.al. |
2306.07899v1 |
link |
| 2023-06-13 |
CAMEO: A Causal Transfer Learning Approach for Performance Optimization of Configurable Computer Systems |
Md Shahriar Iqbal et.al. |
2306.07888v1 |
null |
| 2023-06-13 |
Deep Learning-Enabled Zero-Touch Device Identification: Mitigating the Impact of Channel Variability Through MIMO Diversity |
Bechir Hamdaoui et.al. |
2306.07878v1 |
null |
| 2023-06-13 |
On the flow unsteadiness and operational characteristics of a novel supersonic fluidic oscillator |
Spandan Maikap et.al. |
2306.07849v1 |
null |
| 2023-06-12 |
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts |
Karsten Roth et.al. |
2306.07282v1 |
link |
| 2023-06-12 |
The Cheltsov--Rubinstein problem for strongly asymptotically log del Pezzo surfaces |
Chenzi Jin et.al. |
2306.07278v1 |
null |
| 2023-06-12 |
Gaussian Membership Inference Privacy |
Tobias Leemann et.al. |
2306.07273v1 |
null |
| 2023-06-12 |
MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images |
Junchen Zhu et.al. |
2306.07257v1 |
null |
| 2023-06-12 |
On the Expected Size of Conformal Prediction Sets |
Guneet S. Dhillon et.al. |
2306.07254v1 |
null |
| 2023-06-12 |
RB-Dust -- A Reference-based Dataset for Vision-based Dust Removal |
Peter Buckel et.al. |
2306.07244v1 |
null |
| 2023-06-12 |
Strokes2Surface: Recovering Curve Networks From 4D Architectural Design Sketches |
S. Rasoulzadeh et.al. |
2306.07220v1 |
null |
| 2023-06-12 |
Cyclic objects from surfaces |
Ivan Bartulović et.al. |
2306.07216v1 |
null |
| 2023-06-12 |
Valley: Video Assistant with Large Language model Enhanced abilitY |
Ruipu Luo et.al. |
2306.07207v1 |
null |
| 2023-06-12 |
A Survey of Vision-Language Pre-training from the Lens of Multimodal Machine Translation |
Jeremy Gwinnup et.al. |
2306.07198v1 |
null |
| 2023-06-09 |
Shock Cooling and Possible Precursor Emission in the Early Light Curve of the Type II SN 2023ixf |
Griffin Hosseinzadeh et.al. |
2306.06097v1 |
null |
| 2023-06-09 |
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding |
Mu Cai et.al. |
2306.06094v1 |
null |
| 2023-06-09 |
Virtual Node Tuning for Few-shot Node Classification |
Zhen Tan et.al. |
2306.06063v1 |
null |
| 2023-06-09 |
Ion-Driven Instabilities in the Inner Heliosphere II: Classification and Multi-Dimensional Mapping |
Mihailo M. Martinovic et.al. |
2306.06060v1 |
null |
| 2023-06-09 |
Exploring the Impact of Image Resolution on Chest X-ray Classification Performance |
Alessandro Wollek et.al. |
2306.06051v1 |
null |
| 2023-06-09 |
How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models? |
Yifei Ming et.al. |
2306.06048v1 |
null |
| 2023-06-09 |
GANeRF: Leveraging Discriminators to Optimize Neural Radiance Fields |
Barbara Roessle et.al. |
2306.06044v1 |
null |
| 2023-06-09 |
WindowNet: Learnable Windows for Chest X-ray Classification |
Alessandro Wollek et.al. |
2306.06038v1 |
null |
| 2023-06-09 |
Benchmarking self-supervised video representation learning |
Akash Kumar et.al. |
2306.06010v1 |
null |
| 2023-06-09 |
Beyond Detection: Visual Realism Assessment of Deepfakes |
Luka Dragar et.al. |
2306.05985v1 |
null |
| 2023-06-08 |
MIMIC-IT: Multi-Modal In-Context Instruction Tuning |
Bo Li et.al. |
2306.05425v1 |
link |
| 2023-06-08 |
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models |
Muhammad Maaz et.al. |
2306.05424v1 |
link |
| 2023-06-08 |
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process |
Changyao Tian et.al. |
2306.05423v1 |
null |
| 2023-06-08 |
Tracking Everything Everywhere All at Once |
Qianqian Wang et.al. |
2306.05422v1 |
null |
| 2023-06-08 |
2D Supervised Monocular 3D Object Detection by Global-to-Local 3D Reconstruction |
Jiawei He et.al. |
2306.05418v1 |
null |
| 2023-06-08 |
Tracking Objects with 3D Representation from Videos |
Jiawei He et.al. |
2306.05416v1 |
null |
| 2023-06-08 |
Quantum symmetries in 2+1 dimensions: Carroll, (a)dS-Carroll, Galilei and (a)dS-Galilei |
Tomasz Trześniewski et.al. |
2306.05409v1 |
null |
| 2023-06-08 |
Deformation theory for prismatic $G$-displays |
Kazuhiro Ito et.al. |
2306.05361v1 |
null |
| 2023-06-08 |
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models |
Nan Liu et.al. |
2306.05357v1 |
null |
| 2023-06-08 |
Predictive Modeling of Equine Activity Budgets Using a 3D Skeleton Reconstructed from Surveillance Recordings |
Ernest Pokropek et.al. |
2306.05311v1 |
null |
| 2023-06-08 |
Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt |
Kai Chen et.al. |
2306.04607v2 |
null |
| 2023-06-07 |
MarineVRS: Marine Video Retrieval System with Explainability via Semantic Understanding |
Tan-Sang Ha et.al. |
2306.04593v1 |
null |
| 2023-06-07 |
A Dataset for Deep Learning-based Bone Structure Analyses in Total Hip Arthroplasty |
Kaidong Zhang et.al. |
2306.04579v1 |
link |
| 2023-06-07 |
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models |
Sophie Jentzsch et.al. |
2306.04563v1 |
link |
| 2023-06-07 |
Contrastive Bootstrapping for Label Refinement |
Shudi Hou et.al. |
2306.04544v1 |
null |
| 2023-06-07 |
Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications |
Paul Pu Liang et.al. |
2306.04539v1 |
link |
| 2023-06-07 |
Long-form analogies generated by chatGPT lack human-like psycholinguistic properties |
S. M. Seals et.al. |
2306.04537v1 |
null |
| 2023-06-07 |
ContriMix: Unsupervised disentanglement of content and attribute for domain generalization in microscopy image analysis |
Tan H. Nguyen et.al. |
2306.04527v1 |
null |
| 2023-06-07 |
Cross-attention learning enables real-time nonuniform rotational distortion correction in OCT |
Haoran Zhang et.al. |
2306.04512v1 |
null |
| 2023-06-07 |
Hardness of Deceptive Certificate Selection |
Stephan Wäldchen et.al. |
2306.04505v1 |
null |
| 2023-06-06 |
CL-UZH at SemEval-2023 Task 10: Sexism Detection through Incremental Fine-Tuning and Multi-Task Learning with Label Descriptions |
Janis Goldzycher et.al. |
2306.03907v1 |
null |
| 2023-06-06 |
Utterance Classification with Logical Neural Network: Explainable AI for Mental Disorder Diagnosis |
Yeldar Toleubay et.al. |
2306.03902v1 |
null |
| 2023-06-06 |
Towards Label-free Scene Understanding by Vision Foundation Models |
Runnan Chen et.al. |
2306.03899v1 |
null |
| 2023-06-06 |
Multi-Label ECG Classification using Temporal Convolutional Neural Network |
Eedara Prabhakararao et.al. |
2306.03844v1 |
null |
| 2023-06-06 |
Atrial Septal Defect Detection in Children Based on Ultrasound Video Using Multiple Instances Learning |
Yiman Liu et.al. |
2306.03835v1 |
null |
| 2023-06-06 |
MTS2Graph: Interpretable Multivariate Time Series Classification with Temporal Evolving Graphs |
Raneen Younis et.al. |
2306.03834v1 |
null |
| 2023-06-06 |
GEO-Bench: Toward Foundation Models for Earth Monitoring |
Alexandre Lacoste et.al. |
2306.03831v1 |
link |
| 2023-06-06 |
Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How |
Sebastian Pineda Arango et.al. |
2306.03828v1 |
null |
| 2023-06-06 |
Learning to Ground Instructional Articles in Videos through Narrations |
Effrosyni Mavroudi et.al. |
2306.03802v1 |
null |
| 2023-06-06 |
Matched Pair Calibration for Ranking Fairness |
Hannah Korevaar et.al. |
2306.03775v1 |
null |
| 2023-06-05 |
Neuralangelo: High-Fidelity Neural Surface Reconstruction |
Zhaoshuo Li et.al. |
2306.03092v1 |
null |
| 2023-06-05 |
Dismantling Hate: Understanding Hate Speech Trends Against NBA Athletes |
Edinam Kofi Klutse et.al. |
2306.03086v1 |
null |
| 2023-06-05 |
MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion |
Chiyu Max Jiang et.al. |
2306.03083v1 |
null |
| 2023-06-05 |
Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups using a Single Model across Cages |
Michael P. J. Camilleri et.al. |
2306.03066v1 |
null |
| 2023-06-05 |
LibAUC: A Deep Learning Library for X-Risk Optimization |
Zhuoning Yuan et.al. |
2306.03065v1 |
link |
| 2023-06-05 |
Classification of Edge-dependent Labels of Nodes in Hypergraphs |
Minyoung Choe et.al. |
2306.03032v1 |
link |
| 2023-06-05 |
AI Techniques for Cone Beam Computed Tomography in Dentistry: Trends and Practices |
Saba Sarwar et.al. |
2306.03025v1 |
null |
| 2023-06-05 |
Interpretable Alzheimer's Disease Classification Via a Contrastive Diffusion Autoencoder |
Ayodeji Ijishakin et.al. |
2306.03022v1 |
null |
| 2023-06-05 |
Controllable odd-frequency Cooper pairs in multi-superconductor Josephson junctions |
Jorge Cayao et.al. |
2306.03017v1 |
null |
| 2023-06-05 |
Over-the-Air Federated Learning in Satellite systems |
Edward Akito Carlos et.al. |
2306.02996v1 |
null |
| 2023-06-02 |
Multilingual Conceptual Coverage in Text-to-Image Models |
Michael Saxon et.al. |
2306.01735v1 |
link |
| 2023-06-02 |
Video Colorization with Pre-trained Text-to-Image Diffusion Models |
Hanyuan Liu et.al. |
2306.01732v1 |
null |
| 2023-06-02 |
Streaming algorithms for evaluating noisy judges on unlabeled data -- binary classification |
Andrés Corrada-Emmanuel et.al. |
2306.01726v1 |
null |
| 2023-06-02 |
A Data-Driven Measure of Relative Uncertainty for Misclassification Detection |
Eduardo Dadalto et.al. |
2306.01710v1 |
link |
| 2023-06-02 |
Is Generative Modeling-based Stylization Necessary for Domain Adaptation in Regression Tasks? |
Jinman Park et.al. |
2306.01706v1 |
null |
| 2023-06-02 |
Temporal-controlled Frame Swap for Generating High-Fidelity Stereo Driving Data for Autonomy Analysis |
Yedi Luo et.al. |
2306.01704v1 |
null |
| 2023-06-02 |
Unique Brain Network Identification Number for Parkinson's Individuals Using Structural MRI |
Tanmayee Samantaray et.al. |
2306.01689v1 |
null |
| 2023-06-02 |
Enhancing CLIP with CLIP: Exploring Pseudolabeling for Limited-Label Prompt Tuning |
Cristina Menghini et.al. |
2306.01669v1 |
link |
| 2023-06-02 |
SourceP: Smart Ponzi Schemes Detection on Ethereum Using Pre-training Model with Data Flow |
Pengcheng Lu et.al. |
2306.01665v1 |
null |
| 2023-06-02 |
An Adaptive Method for Weak Supervision with Drifting Data |
Alessio Mazzetto et.al. |
2306.01658v1 |
null |
| 2023-06-01 |
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles |
Chaitanya Ryali et.al. |
2306.00989v1 |
link |
| 2023-06-01 |
Continual Learning for Abdominal Multi-Organ and Tumor Segmentation |
Yixiao Zhang et.al. |
2306.00988v1 |
link |
| 2023-06-01 |
Using generative AI to investigate medical imagery models and datasets |
Oran Lang et.al. |
2306.00985v1 |
null |
| 2023-06-01 |
Building Rearticulable Models for Arbitrary 3D Objects from 4D Point Clouds |
Shaowei Liu et.al. |
2306.00979v1 |
null |
| 2023-06-01 |
Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models |
Chang Liu et.al. |
2306.00973v1 |
link |
| 2023-06-01 |
LIV: Language-Image Representations and Rewards for Robotic Control |
Yecheng Jason Ma et.al. |
2306.00958v1 |
link |
| 2023-06-01 |
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects |
Ruohan Gao et.al. |
2306.00956v1 |
null |
| 2023-06-01 |
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance |
Jinbo Xing et.al. |
2306.00943v1 |
null |
| 2023-06-01 |
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft |
Shalev Lifshitz et.al. |
2306.00937v1 |
null |
| 2023-06-01 |
Interpreting GNN-based IDS Detections Using Provenance Graph Structural Features |
Kunal Mukherjee et.al. |
2306.00934v1 |
null |
| 2023-05-31 |
Humans in 4D: Reconstructing and Tracking Humans with Transformers |
Shubham Goel et.al. |
2305.20091v1 |
null |
| 2023-05-31 |
MARDELS: A full-sky X-ray selected galaxy cluster catalog |
Matthias Klein et.al. |
2305.20066v1 |
null |
| 2023-05-31 |
Exploiting Mechanics-Based Priors for Lateral Displacement Estimation in Ultrasound Elastography |
Md Ashikuzzaman et.al. |
2305.20059v1 |
null |
| 2023-05-31 |
Exploring Regions of Interest: Visualizing Histological Image Classification for Breast Cancer using Deep Learning |
Imane Nedjar et.al. |
2305.20058v1 |
null |
| 2023-05-31 |
LOWA: Localize Objects in the Wild with Attributes |
Xiaoyuan Guo et.al. |
2305.20047v1 |
null |
| 2023-06-01 |
Crowdsourcing subjective annotations using pairwise comparisons reduces bias and error compared to the majority-vote method |
Hasti Narimanzadeh et.al. |
2305.20042v2 |
null |
| 2023-05-31 |
Bias Mitigation Methods for Binary Classification Decision-Making Systems: Survey and Recommendations |
Madeleine Waller et.al. |
2305.20020v1 |
null |
| 2023-05-31 |
On the faces of unigraphic $3$-polytopes |
Riccardo W. Maffucci et.al. |
2305.20012v1 |
null |
| 2023-05-31 |
Number of Equivalence Classes of Rational Functions over Finite Fields |
Xiang-dong Hou et.al. |
2305.20008v1 |
null |
| 2023-05-31 |
Physics-Informed Ensemble Representation for Light-Field Image Super-Resolution |
Manchang Jin et.al. |
2305.20006v1 |
null |
| 2023-05-30 |
Imaginary quadratic fields with $\ell$-torsion-free class groups and specified split primes |
Olivia Beckwith et.al. |
2305.19272v1 |
null |
| 2023-05-30 |
A Stutter Seldom Comes Alone -- Cross-Corpus Stuttering Detection as a Multi-label Problem |
Sebastian P. Bayerl et.al. |
2305.19255v1 |
null |
| 2023-05-30 |
Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models |
Guande He et.al. |
2305.19249v1 |
link |
| 2023-05-30 |
Optimal bounds on surfaces |
Jihao Liu et.al. |
2305.19248v1 |
null |
| 2023-05-30 |
COVID-19 Detection from Mass Spectra of Exhaled Breath |
Nicolò Bellarmino et.al. |
2305.19211v1 |
null |
| 2023-05-30 |
Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models |
Ernie Chu et.al. |
2305.19193v1 |
null |
| 2023-05-30 |
GMCs and their Type classification in M74: Toward understanding star formation and cloud evolution |
F. Demachi et.al. |
2305.19192v1 |
null |
| 2023-05-30 |
Classification of Classical Spin Liquids: Detailed Formalism and Suite of Examples |
Han Yan et.al. |
2305.19189v1 |
null |
| 2023-05-30 |
Reduced Precision Floating-Point Optimization for Deep Neural Network On-Device Learning on MicroControllers |
Davide Nadalini et.al. |
2305.19167v1 |
link |
| 2023-05-30 |
Recognizing People by Body Shape Using Deep Networks of Images and Words |
Blake A. Myers et.al. |
2305.19160v1 |
null |
| 2023-05-29 |
Direct Preference Optimization: Your Language Model is Secretly a Reward Model |
Rafael Rafailov et.al. |
2305.18290v1 |
null |
| 2023-05-29 |
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections |
M. Jehanzeb Mirza et.al. |
2305.18287v1 |
null |
| 2023-05-29 |
CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice |
Juan Zuluaga-Gomez et.al. |
2305.18283v1 |
link |
| 2023-05-29 |
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising |
Fu-Yun Wang et.al. |
2305.18264v1 |
link |
| 2023-05-30 |
TaleCrafter: Interactive Story Visualization with Multiple Characters |
Yuan Gong et.al. |
2305.18247v2 |
link |
| 2023-05-29 |
GazeGNN: A Gaze-Guided Graph Neural Network for Disease Classification |
Bin Wang et.al. |
2305.18221v1 |
null |
| 2023-05-29 |
Improved Probabilistic Image-Text Representations |
Sanghyuk Chun et.al. |
2305.18171v1 |
link |
| 2023-05-29 |
LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning |
Amirhossein Abaskohi et.al. |
2305.18169v1 |
link |
| 2023-05-29 |
Generative Adversarial Networks based Skin Lesion Segmentation |
Shubham Innani et.al. |
2305.18164v1 |
link |
| 2023-05-30 |
Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning |
Yu Wang et.al. |
2305.18158v2 |
link |
| 2023-05-26 |
Characterizing and Measuring Linguistic Dataset Drift |
Tyler A. Chang et.al. |
2305.17127v1 |
null |
| 2023-05-26 |
IndustReal: Transferring Contact-Rich Assembly Tasks from Simulation to Reality |
Bingjie Tang et.al. |
2305.17110v1 |
null |
| 2023-05-26 |
ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing |
Min Zhao et.al. |
2305.17098v1 |
null |
| 2023-05-26 |
GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation |
Tanveer Hannan et.al. |
2305.17096v1 |
null |
| 2023-05-26 |
Benchmarking state-of-the-art gradient boosting algorithms for classification |
Piotr Florek et.al. |
2305.17094v1 |
null |
| 2023-05-26 |
NeuroX Library for Neuron Analysis of Deep NLP Models |
Fahim Dalvi et.al. |
2305.17073v1 |
link |
| 2023-05-26 |
Extremely weakly-supervised blood vessel segmentation with physiologically based synthesis and domain adaptation |
Peidi Xu et.al. |
2305.17054v1 |
link |
| 2023-05-26 |
The Brain Tumor Segmentation (BraTS) Challenge 2023: Focus on Pediatrics (CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs) |
Anahita Fathi Kazerooni et.al. |
2305.17033v1 |
null |
| 2023-05-26 |
Are Deep Neural Networks Adequate Behavioural Models of Human Visual Perception? |
Felix A. Wichmann et.al. |
2305.17023v1 |
null |
| 2023-05-26 |
D-CALM: A Dynamic Clustering-based Active Learning Approach for Mitigating Bias |
Sabit Hassan et.al. |
2305.17013v1 |
null |
| 2023-05-25 |
Image is First-order Norm+Linear Autoregressive |
Yinpeng Chen et.al. |
2305.16319v1 |
null |
| 2023-05-25 |
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation |
Shilin Yan et.al. |
2305.16318v1 |
link |
| 2023-05-25 |
Making Vision Transformers Truly Shift-Equivariant |
Renan A. Rojas-Gomez et.al. |
2305.16316v1 |
null |
| 2023-05-25 |
Break-A-Scene: Extracting Multiple Concepts from a Single Image |
Omri Avrahami et.al. |
2305.16311v1 |
null |
| 2023-05-25 |
Imitating Task and Motion Planning with Visuomotor Transformers |
Murtaza Dalal et.al. |
2305.16309v1 |
null |
| 2023-05-25 |
Look Ma, No Hands! Agent-Environment Factorization of Egocentric Videos |
Matthew Chang et.al. |
2305.16301v1 |
null |
| 2023-05-25 |
Sharpness-Aware Minimization Leads to Low-Rank Features |
Maksym Andriushchenko et.al. |
2305.16292v1 |
link |
| 2023-05-25 |
Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation |
Lisa Dunlap et.al. |
2305.16289v1 |
link |
| 2023-05-25 |
UDPM: Upsampling Diffusion Probabilistic Models |
Shady Abu-Hussein et.al. |
2305.16269v1 |
null |
| 2023-05-25 |
Trans-Dimensional Generative Modeling via Jump Diffusion Models |
Andrew Campbell et.al. |
2305.16261v1 |
link |
| 2023-05-24 |
RoMa: Revisiting Robust Losses for Dense Feature Matching |
Johan Edstedt et.al. |
2305.15404v1 |
null |
| 2023-05-24 |
Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering |
Avi Caciularu et.al. |
2305.15387v1 |
null |
| 2023-05-24 |
What can generic neural networks learn from a child's visual experience? |
A. Emin Orhan et.al. |
2305.15372v1 |
null |
| 2023-05-24 |
Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution |
Yiyang Ma et.al. |
2305.15357v1 |
null |
| 2023-05-24 |
A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence |
Junyi Zhang et.al. |
2305.15347v1 |
null |
| 2023-05-24 |
Is Your Model "MADD"? A Novel Metric to Evaluate Algorithmic Fairness for Predictive Student Models |
Mélina Verger et.al. |
2305.15342v1 |
null |
| 2023-05-24 |
Statistical post-processing of visibility ensemble forecasts |
Sándor Baran et.al. |
2305.15325v1 |
null |
| 2023-05-24 |
Training on Thin Air: Improve Image Classification with Generated Data |
Yongchao Zhou et.al. |
2305.15316v1 |
null |
| 2023-05-24 |
Personalized Dictionary Learning for Heterogeneous Datasets |
Geyu Liang et.al. |
2305.15311v1 |
null |
| 2023-05-24 |
High Speed Human Action Recognition using a Photonic Reservoir Computer |
Enrico Picco et.al. |
2305.15283v1 |
null |
| 2023-05-23 |
Siamese Masked Autoencoders |
Agrim Gupta et.al. |
2305.14344v1 |
null |
| 2023-05-23 |
Video Prediction Models as Rewards for Reinforcement Learning |
Alejandro Escontrela et.al. |
2305.14343v1 |
null |
| 2023-05-23 |
Effect of speed fluctuations on the collective dynamics of active disks |
R. Kailasham et.al. |
2305.14340v1 |
null |
| 2023-05-23 |
Prototype Adaption and Projection for Few- and Zero-shot 3D Point Cloud Semantic Segmentation |
Shuting He et.al. |
2305.14335v1 |
link |
| 2023-05-23 |
Large Language Models are Frame-level Directors for Zero-shot Text-to-Video Generation |
Susung Hong et.al. |
2305.14330v1 |
link |
| 2023-05-23 |
ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings |
William Brannon et.al. |
2305.14321v1 |
link |
| 2023-05-23 |
Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science |
Yida Mu et.al. |
2305.14310v1 |
null |
| 2023-05-23 |
A Laplacian Pyramid Based Generative H&E Stain Augmentation Network |
Fangda Li et.al. |
2305.14301v1 |
link |
| 2023-05-23 |
TaDSE: Template-aware Dialogue Sentence Embeddings |
Minsik Oh et.al. |
2305.14299v1 |
null |
| 2023-05-23 |
Active Learning Principles for In-Context Learning with Large Language Models |
Katerina Margatina et.al. |
2305.14264v1 |
null |
| 2023-05-22 |
VDT: An Empirical Study on Video Diffusion with Transformers |
Haoyu Lu et.al. |
2305.13311v1 |
link |
| 2023-05-23 |
VideoLLM: Modeling Video Sequence with Large Language Models |
Guo Chen et.al. |
2305.13292v2 |
link |
| 2023-05-22 |
Materialistic: Selecting Similar Materials in Images |
Prafull Sharma et.al. |
2305.13291v1 |
null |
| 2023-05-22 |
Morphological Sampling Theorem and its Extension to Grey-value Images |
Vivek Sridhar et.al. |
2305.13279v1 |
null |
| 2023-05-22 |
U-TILISE: A Sequence-to-sequence Model for Cloud Removal in Optical Satellite Time Series |
Corinne Stucker et.al. |
2305.13277v1 |
null |
| 2023-05-22 |
The Geometric Approach to the Classification of Signals via a Maximal Set of Signals |
Leon A. Luxemburg et.al. |
2305.13255v1 |
null |
| 2023-05-22 |
Copy Recurrent Neural Network Structure Network |
Xiaofan Zhou et.al. |
2305.13250v1 |
null |
| 2023-05-22 |
Interactive Natural Language Processing |
Zekun Wang et.al. |
2305.13246v1 |
null |
| 2023-05-22 |
Sequential Transfer Learning to Decode Heard and Imagined Timbre from fMRI Data |
Sean Paulsen et.al. |
2305.13226v1 |
null |
| 2023-05-22 |
Learning to detect an animal sound from five examples |
Inês Nolasco et.al. |
2305.13210v1 |
null |
| 2023-05-19 |
North Sámi Dialect Identification with Self-supervised Speech Models |
Sofoklis Kakouros et.al. |
2305.11864v1 |
link |
| 2023-05-19 |
Recommendations for Verifying HDR Subjective Testing Workflows |
Vibhoothi et.al. |
2305.11858v1 |
null |
| 2023-05-19 |
Q-malizing flow and infinitesimal density ratio estimation |
Chen Xu et.al. |
2305.11857v1 |
null |
| 2023-05-19 |
Video Killed the HD-Map: Predicting Driving Behavior Directly From Drone Images |
Yunpeng Liu et.al. |
2305.11856v1 |
null |
| 2023-05-19 |
Any-to-Any Generation via Composable Diffusion |
Zineng Tang et.al. |
2305.11846v1 |
link |
| 2023-05-19 |
A One-Class Classifier for the Detection of GAN Manipulated Multi-Spectral Satellite Images |
Lydia Abady et.al. |
2305.11795v1 |
null |
| 2023-05-19 |
Enhancing Few-shot NER with Prompt Ordering based Data Augmentation |
Huiming Wang et.al. |
2305.11791v1 |
link |
| 2023-05-22 |
Prompting with Pseudo-Code Instructions |
Mayank Mishra et.al. |
2305.11790v2 |
null |
| 2023-05-19 |
Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes |
Aran Nayebi et.al. |
2305.11772v1 |
null |
| 2023-05-19 |
Persian Typographical Error Type Detection using Many-to-Many Deep Neural Networks on Algorithmically-Generated Misspellings |
Mohammad Dehghani et.al. |
2305.11731v1 |
null |
| 2023-05-18 |
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities |
Peng Wang et.al. |
2305.11172v1 |
link |
| 2023-05-18 |
Exploring the Carbon Footprint of Hugging Face's ML Models: A Repository Mining Study |
Joel Castaño et.al. |
2305.11164v1 |
null |
| 2023-05-18 |
Skin Lesion Diagnosis Using Convolutional Neural Networks |
Daniel Alonso Villanueva Nunez et.al. |
2305.11125v1 |
null |
| 2023-05-18 |
MiraBest: A Dataset of Morphologically Classified Radio Galaxies for Machine Learning |
Fiona A. M. Porter et.al. |
2305.11108v1 |
link |
| 2023-05-18 |
Cross-modality Data Augmentation for End-to-End Sign Language Translation |
Jinhui Ye et.al. |
2305.11096v1 |
null |
| 2023-05-18 |
Universal Domain Adaptation from Foundation Models |
Bin Deng et.al. |
2305.11092v1 |
link |
| 2023-05-18 |
The Heterotic-Ricci flow and its three-dimensional solitons |
Andrei Moroianu et.al. |
2305.11069v1 |
null |
| 2023-05-18 |
NODE-ImgNet: a PDE-informed effective and robust model for image denoising |
Xinheng Xie et.al. |
2305.11049v1 |
null |
| 2023-05-18 |
Simulation of a Variational Quantum Perceptron using Grover's Algorithm |
Nouhaila Innan et.al. |
2305.11040v1 |
null |
| 2023-05-18 |
Sizing multimodal suspensions with differential dynamic microscopy |
Joe J Bradley et.al. |
2305.11018v1 |
null |
| 2023-05-17 |
Performance of the Quasar Spectral Templates for the Dark Energy Spectroscopic Instrument |
Allyson Brodzeller et.al. |
2305.10426v1 |
null |
| 2023-05-17 |
Evolving Tsukamoto Neuro Fuzzy Model for Multiclass Covid 19 Classification with Chest X Ray Images |
Marziyeh Rezaei et.al. |
2305.10421v1 |
null |
| 2023-05-17 |
Variational Classification |
Shehzaad Dhuliawala et.al. |
2305.10406v1 |
null |
| 2023-05-17 |
What You See is What You Read? Improving Text-Image Alignment Evaluation |
Michal Yarom et.al. |
2305.10400v1 |
link |
| 2023-05-17 |
Optimality of Message-Passing Architectures for Sparse Graphs |
Aseem Baranwal et.al. |
2305.10391v1 |
null |
| 2023-05-17 |
Logit-Based Ensemble Distribution Distillation for Robust Autoregressive Sequence Uncertainties |
Yassir Fathullah et.al. |
2305.10384v1 |
null |
| 2023-05-18 |
Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents |
Sergio Pelaez et.al. |
2305.10383v2 |
null |
| 2023-05-18 |
Dislocation correlations and the continuum dynamics of the weak line bundle ensemble |
Joseph Pierre Anderson et.al. |
2305.10375v2 |
null |
| 2023-05-17 |
A catalog of collected debris disks: properties, classifications and correlations between disks and stars/planets |
Peng-cheng Cao et.al. |
2305.10364v1 |
null |
| 2023-05-17 |
Confidence-Guided Semi-supervised Learning in Land Cover Classification |
Wanli Ma et.al. |
2305.10344v1 |
null |
| 2023-05-16 |
Annotating 8,000 Abdominal CT Volumes for Multi-Organ Segmentation in Three Weeks |
Chongyu Qu et.al. |
2305.09666v1 |
null |
| 2023-05-16 |
Understanding 3D Object Interaction from a Single Image |
Shengyi Qian et.al. |
2305.09664v1 |
null |
| 2023-05-16 |
Sixfold way of traversable wormholes in the Sachdev-Ye-Kitaev model |
Antonio M. García-García et.al. |
2305.09663v1 |
null |
| 2023-05-16 |
Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation |
Samaneh Azadi et.al. |
2305.09662v1 |
null |
| 2023-05-16 |
Osteosarcoma Tumor Detection using Transfer Learning Models |
Raisa Fairooz Meem et.al. |
2305.09660v1 |
null |
| 2023-05-16 |
RAMario: Experimental Approach to Reptile Algorithm -- Reinforcement Learning for Mario |
Sanyam Jain et.al. |
2305.09655v1 |
link |
| 2023-05-16 |
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation |
Mutian He et.al. |
2305.09652v1 |
null |
| 2023-05-16 |
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation |
Yuxin Ren et.al. |
2305.09651v1 |
link |
| 2023-05-16 |
Wavelet-based Unsupervised Label-to-Image Translation |
George Eskandar et.al. |
2305.09647v1 |
link |
| 2023-05-16 |
Data Augmentation for Conflict and Duplicate Detection in Software Engineering Sentence Pairs |
Garima Malik et.al. |
2305.09608v1 |
null |
| 2023-05-15 |
Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models |
Antoni Bigata Casademunt et.al. |
2305.08854v1 |
null |
| 2023-05-15 |
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts |
Yuyang Zhao et.al. |
2305.08850v1 |
null |
| 2023-05-15 |
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks |
Minyoung Huh et.al. |
2305.08842v1 |
null |
| 2023-05-15 |
Attacking Perceptual Similarity Metrics |
Abhijay Ghildyal et.al. |
2305.08840v1 |
null |
| 2023-05-15 |
Learning Better Contrastive View from Radiologist's Gaze |
Sheng Wang et.al. |
2305.08826v1 |
link |
| 2023-05-15 |
Measuring Cross-Lingual Transferability of Multilingual Transformers on Sentence Classification |
Zewen Chi et.al. |
2305.08800v1 |
null |
| 2023-05-15 |
Predictive Models from Quantum Computer Benchmarks |
Daniel Hothem et.al. |
2305.08796v1 |
null |
| 2023-05-15 |
Comparing Variation in Tokenizer Outputs Using a Series of Problematic and Challenging Biomedical Sentences |
Christopher Meaney et.al. |
2305.08787v1 |
null |
| 2023-05-15 |
TAA-GCN: A Temporally Aware Adaptive Graph Convolutional Network for Age Estimation |
Matthew Korban et.al. |
2305.08779v1 |
null |
| 2023-05-15 |
Question-Answering System Extracts Information on Injection Drug Use from Clinical Progress Notes |
Maria Mahbub et.al. |
2305.08777v1 |
null |
| 2023-05-12 |
How do supernova remnants cool? -- I. Morphology, optical emission lines, and shocks |
Ekaterina I. Makarenko et.al. |
2305.07652v1 |
null |
| 2023-05-12 |
Beware of diffusion models for synthesizing medical images -- A comparison with GANs in terms of memorizing brain tumor images |
Muhammad Usman Akbar et.al. |
2305.07644v1 |
null |
| 2023-05-12 |
Efficient Neural Network based Classification and Outlier Detection for Image Moderation using Compressed Sensing and Group Testing |
Sabyasachi Ghosh et.al. |
2305.07639v1 |
null |
| 2023-05-12 |
Agile gesture recognition for capacitive sensing devices: adapting on-the-job |
Ying Liu et.al. |
2305.07624v1 |
null |
| 2023-05-12 |
Uncertainty Estimation for Deep Learning Image Reconstruction using a Local Lipschitz Metric |
Danyal F. Bhutto et.al. |
2305.07618v1 |
null |
| 2023-05-12 |
Fisher Information Embedding for Node and Graph Learning |
Dexiong Chen et.al. |
2305.07580v1 |
link |
| 2023-05-12 |
A Memory Model for Question Answering from Streaming Data Supported by Rehearsal and Anticipation of Coreference Information |
Vladimir Araujo et.al. |
2305.07565v1 |
null |
| 2023-05-12 |
Dish detection in food platters: A framework for automated diet logging and nutrition management |
Mansi Goel et.al. |
2305.07552v1 |
null |
| 2023-05-12 |
Content-based jewellery item retrieval using the local region-based histograms |
Amin Muhammad Shoib et.al. |
2305.07540v1 |
null |
| 2023-05-12 |
Saturated Non-Monotonic Activation Functions |
Junjia Chen et.al. |
2305.07537v1 |
null |
| 2023-05-11 |
A General-Purpose Multilingual Document Encoder |
Onur Galoğlu et.al. |
2305.07016v1 |
link |
| 2023-05-11 |
Self-Chained Image-Language Model for Video Localization and Question Answering |
Shoubin Yu et.al. |
2305.06988v1 |
link |
| 2023-05-11 |
Meta-hallucinator: Towards Few-Shot Cross-Modality Cardiac Image Segmentation |
Ziyuan Zhao et.al. |
2305.06978v1 |
null |
| 2023-05-11 |
Data quality dimensions for fair AI |
Camilla Quaresmini et.al. |
2305.06967v1 |
link |
| 2023-05-11 |
Transformers for CT Reconstruction From Monoplanar and Biplanar Radiographs |
Firas Khader et.al. |
2305.06965v1 |
null |
| 2023-05-11 |
Cascaded Cross-Attention Networks for Data-Efficient Whole-Slide Image Classification Using Transformers |
Firas Khader et.al. |
2305.06963v1 |
null |
| 2023-05-11 |
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification |
Souhail Bakkali et.al. |
2305.06923v1 |
null |
| 2023-05-11 |
Meta-Learners for Few-Shot Weakly-Supervised Medical Image Segmentation |
Hugo Oliveira et.al. |
2305.06912v1 |
null |
| 2023-05-11 |
Stochastic Variance-Reduced Majorization-Minimization Algorithms |
Duy-Nhat Phan et.al. |
2305.06848v1 |
link |
| 2023-05-11 |
Detection and Classification of Pole-like Landmarks for Domain-invariant 3D Point Cloud Map Matching |
Sun Yifei et.al. |
2305.06845v1 |
null |
| 2023-05-11 |
HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion |
Mustafa Işık et.al. |
2305.06356v2 |
link |
| 2023-05-10 |
VideoChat: Chat-Centric Video Understanding |
KunChang Li et.al. |
2305.06355v1 |
link |
| 2023-05-10 |
Reconstructing Animatable Categories from Videos |
Gengshan Yang et.al. |
2305.06351v1 |
null |
| 2023-05-10 |
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception |
Hassan Akbari et.al. |
2305.06324v1 |
null |
| 2023-05-10 |
SepMark: Deep Separable Watermarking for Unified Source Tracing and Deepfake Detection |
Xiaoshuai Wu et.al. |
2305.06321v1 |
null |
| 2023-05-10 |
Learning Video-Conditioned Policies for Unseen Manipulation Tasks |
Elliot Chane-Sane et.al. |
2305.06289v1 |
null |
| 2023-05-10 |
Convolution of periodic multiplicative functions and the divisor problem |
Marco Aymone et.al. |
2305.06260v1 |
null |
| 2023-05-11 |
Embedded Feature Similarity Optimization with Specific Parameter Initialization for 2D/3D Registration |
Minheng Chen et.al. |
2305.06252v2 |
null |
| 2023-05-10 |
Eigenmodes of magnetic skyrmion lattices |
Louise Desplat et.al. |
2305.06248v1 |
null |
| 2023-05-10 |
Explainable Knowledge Distillation for On-device Chest X-Ray Classification |
Chakkrit Termritthikun et.al. |
2305.06244v1 |
null |
| 2023-05-10 |
InternChat: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language |
Zhaoyang Liu et.al. |
2305.05662v2 |
link |
| 2023-05-09 |
An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text |
Yova Kementchedjhieva et.al. |
2305.05627v1 |
null |
| 2023-05-09 |
Can point cloud networks learn statistical shape models of anatomies? |
Jadie Adams et.al. |
2305.05610v1 |
null |
| 2023-05-09 |
Region-based Contrastive Pretraining for Medical Image Retrieval with Anatomic Query |
Ho Hin Lee et.al. |
2305.05598v1 |
null |
| 2023-05-09 |
Topological classification and black hole thermodynamics |
Mohammad Reza Alipour et.al. |
2305.05595v1 |
null |
| 2023-05-09 |
Fashion CUT: Unsupervised domain adaptation for visual pattern classification in clothes using synthetic data and pseudo-labels |
Enric Moreu et.al. |
2305.05580v1 |
null |
| 2023-05-09 |
Resource Dimensioning for Single-Cell Edge Video Analytics |
Jaume Anguera Peris et.al. |
2305.05568v1 |
null |
| 2023-05-09 |
ColonMapper: topological mapping and localization for colonoscopy |
Javier Morlana et.al. |
2305.05546v1 |
null |
| 2023-05-09 |
Integrating Holistic and Local Information to Estimate Emotional Reaction Intensity |
Yini Fang et.al. |
2305.05534v1 |
link |
| 2023-05-09 |
RMES: Real-Time Micro-Expression Spotting Using Phase From Riesz Pyramid |
Yini Fang et.al. |
2305.05523v1 |
null |
| 2023-05-08 |
What Do Patients Say About Their Disease Symptoms? Deep Multilabel Text Classification With Human-in-the-Loop Curation for Automatic Labeling of Patient Self Reports of Problems |
Lakshmi Arbatti et.al. |
2305.04905v1 |
null |
| 2023-05-08 |
Toward Connecting Speech Acts and Search Actions in Conversational Search Tasks |
Souvick Ghosh et.al. |
2305.04858v1 |
null |
| 2023-05-08 |
Isotonic subgroup selection |
Manuel M. Müller et.al. |
2305.04852v1 |
null |
| 2023-05-08 |
Compressed Video Quality Assessment for Super-Resolution: a Benchmark and a Quality Metric |
Evgeney Bogatyrev et.al. |
2305.04844v1 |
link |
| 2023-05-08 |
Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video |
Zenan Xu et.al. |
2305.04824v1 |
null |
| 2023-05-08 |
On word complexity and topological entropy of random substitution subshifts |
Andrew Mitchell et.al. |
2305.04817v1 |
null |
| 2023-05-08 |
FlaPy: Mining Flaky Python Tests at Scale |
Martin Gruber et.al. |
2305.04793v1 |
null |
| 2023-05-08 |
AvatarReX: Real-time Expressive Full-body Avatars |
Zerong Zheng et.al. |
2305.04789v1 |
null |
| 2023-05-08 |
Multi-Scale Energy (MuSE) plug and play framework for inverse problems |
Jyothi Rikhab Chand et.al. |
2305.04775v1 |
null |
| 2023-05-08 |
Understanding Noise-Augmented Training for Randomized Smoothing |
Ambar Pal et.al. |
2305.04746v1 |
link |
| 2023-05-05 |
Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos |
Ekta Prashnani et.al. |
2305.03713v1 |
null |
| 2023-05-05 |
Fine-Grained Product Classification on Leaflet Advertisements |
Daniel Ladwig et.al. |
2305.03706v1 |
link |
| 2023-05-05 |
How Segment Anything Model (SAM) Boost Medical Image Segmentation? |
Yichi Zhang et.al. |
2305.03678v1 |
link |
| 2023-05-05 |
Stochastic maximum principle for sub-diffusions and its applications |
Shuaiqi Zhang et.al. |
2305.03676v1 |
null |
| 2023-05-08 |
White-Box Multi-Objective Adversarial Attack on Dialogue Generation |
Yufei Li et.al. |
2305.03655v2 |
link |
| 2023-05-05 |
($\mathfrak{S}_p \times \mathfrak{S}_q$)-Invariant Graphical Parking Functions |
Lauren Snider et.al. |
2305.03651v1 |
null |
| 2023-05-05 |
A surface-normal photodetector as nonlinear activation function in diffractive optical neural networks |
Farshid Ashtiani et.al. |
2305.03627v1 |
null |
| 2023-05-05 |
Segmentation of fundus vascular images based on a dual-attention mechanism |
Yuanyuan Peng et.al. |
2305.03617v1 |
null |
| 2023-05-05 |
Human Attention-Guided Explainable Artificial Intelligence for Computer Vision Models |
Guoyang Liu et.al. |
2305.03601v1 |
null |
| 2023-05-05 |
HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer |
Shuzhe Wang et.al. |
2305.03595v1 |
null |
| 2023-05-04 |
Tracking through Containers and Occluders in the Wild |
Basile Van Hoorick et.al. |
2305.03052v1 |
null |
| 2023-05-04 |
NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds |
Jun-Kun Chen et.al. |
2305.03049v1 |
null |
| 2023-05-04 |
Personalize Segment Anything Model with One Shot |
Renrui Zhang et.al. |
2305.03048v1 |
link |
| 2023-05-04 |
Learning Hand-Held Object Reconstruction from In-The-Wild Videos |
Aditya Prakash et.al. |
2305.03036v1 |
null |
| 2023-05-04 |
NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads |
Tobias Kirschstein et.al. |
2305.03027v1 |
null |
| 2023-05-04 |
The Polynomial Connection between Morphological Dilation and Discrete Convolution |
Vivek Sridhar et.al. |
2305.03018v1 |
null |
| 2023-05-04 |
Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence |
Haoran Li et.al. |
2305.03010v1 |
link |
| 2023-05-04 |
NatCS: Eliciting Natural Customer Support Dialogues |
James Gung et.al. |
2305.03007v1 |
link |
| 2023-05-04 |
Hybrid quantum learning with data re-uploading on a small-scale superconducting quantum simulator |
Aleksei Tolstobrov et.al. |
2305.02956v1 |
null |
| 2023-05-04 |
A study on the composition of elementary cellular automata |
Alonso Castillo-Ramirez et.al. |
2305.02947v1 |
null |
| 2023-05-03 |
Descent in tensor triangular geometry |
Tobias Barthel et.al. |
2305.02308v1 |
null |
| 2023-05-03 |
New Equivalences Between Interpolation and SVMs: Kernels and Structured Features |
Chiraag Kaushik et.al. |
2305.02304v1 |
null |
| 2023-05-03 |
Making the Most of What You Have: Adapting Pre-trained Visual Language Models in the Low-data Regime |
Chuhan Zhang et.al. |
2305.02297v1 |
null |
| 2023-05-03 |
DynamicStereo: Consistent Dynamic Depth from Stereo Videos |
Nikita Karaev et.al. |
2305.02296v1 |
null |
| 2023-05-03 |
Iranian License Plate Recognition Using a Reliable Deep Learning Approach |
Soheila Hatami et.al. |
2305.02292v1 |
null |
| 2023-05-03 |
Evaluating BERT-based Scientific Relation Classifiers for Scholarly Knowledge Graph Construction on Digital Library Collections |
Ming Jiang et.al. |
2305.02291v1 |
null |
| 2023-05-03 |
Multi-dimensional Signal Recovery using Low-rank Deconvolution |
David Reixach et.al. |
2305.02264v1 |
null |
| 2023-05-03 |
Standardized Benchmark Dataset for Localized Exposure to a Realistic Source at 10$-$90 GHz |
Ante Kapetanovic et.al. |
2305.02260v1 |
null |
| 2023-05-03 |
Thermally-driven Multilevel Non-volatile Memory with Monolayer MoS2 for Neuro-inspired Artificial Learning |
Sameer Kumar Mallik et.al. |
2305.02259v1 |
null |
| 2023-05-03 |
The Benefits of Label-Description Training for Zero-Shot Text Classification |
Lingyu Gao et.al. |
2305.02239v1 |
link |
| 2023-05-02 |
Sequence Modeling with Multiresolution Convolutional Memory |
Jiaxin Shi et.al. |
2305.01638v1 |
null |
| 2023-05-03 |
A Technical Report on Image Classification using AWS |
Balakrishna Phani Kommanaboina et.al. |
2305.01634v2 |
null |
| 2023-05-02 |
AutoColor: Learned Light Power Control for Multi-Color Holograms |
Yicheng Zhan et.al. |
2305.01611v1 |
null |
| 2023-05-02 |
On the Impact of Data Quality on Image Classification Fairness |
Aki Barry et.al. |
2305.01595v1 |
null |
| 2023-05-02 |
On cubic-line arrangements with simple singularities |
Przemysław Talar et.al. |
2305.01530v1 |
null |
| 2023-05-02 |
ARBEx: Attentive Feature Extraction with Reliability Balancing for Robust Facial Expression Learning |
Azmine Toushik Wasi et.al. |
2305.01486v1 |
link |
| 2023-05-02 |
Scalable Mask Annotation for Video Text Spotting |
Haibin He et.al. |
2305.01443v1 |
link |
| 2023-05-02 |
Unsupervised Feature Based Algorithms for Time Series Extrinsic Regression |
David Guijo-Rubio et.al. |
2305.01429v1 |
null |
| 2023-05-02 |
Are demographically invariant models and representations in medical imaging fair? |
Eike Petersen et.al. |
2305.01397v1 |
null |
| 2023-05-02 |
Self-supervised arbitrary scale super-resolution framework for anisotropic MRI |
Haonan Zhang et.al. |
2305.01360v1 |
null |
| 2023-05-01 |
Behavioral Forensics in Social Networks: Identifying Misinformation, Disinformation and Refutation Spreaders Using Machine Learning |
Euna Mehnaz Khan et.al. |
2305.00957v1 |
null |
| 2023-05-01 |
Probabilistic 3D segmentation for aleatoric uncertainty quantification in full 3D medical data |
Christiaan G. A. Viviers et.al. |
2305.00950v1 |
null |
| 2023-05-01 |
StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video |
Lizhen Wang et.al. |
2305.00942v1 |
null |
| 2023-05-01 |
Early Detection of Alzheimer's Disease using Bottleneck Transformers |
Arunima Jaiswal et.al. |
2305.00923v1 |
null |
| 2023-05-01 |
A Novel Low-Rank Tensor Method for Undersampling Artifact Removal in Respiratory Motion-Resolved Multi-Echo 3D Cones MRI |
Seongho Jeong et.al. |
2305.00892v1 |
null |
| 2023-05-01 |
Estimating the Density Ratio between Distributions with High Discrepancy using Multinomial Logistic Regression |
Akash Srivastava et.al. |
2305.00869v1 |
null |
| 2023-05-01 |
Automated Paper Screening for Clinical Reviews Using Large Language Models |
Eddie Guo et.al. |
2305.00844v1 |
null |
| 2023-05-01 |
LCAUnet: A skin lesion segmentation network with enhanced edge and body fusion |
Qisen Ma et.al. |
2305.00837v1 |
null |
| 2023-05-01 |
Performance and Energy Consumption of Parallel Machine Learning Algorithms |
Xidong Wu et.al. |
2305.00798v1 |
null |
| 2023-05-01 |
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation |
Zhenhui Ye et.al. |
2305.00787v1 |
null |
| 2023-04-28 |
Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs |
George Pu et.al. |
2304.14999v1 |
null |
| 2023-04-28 |
Topology of critical points in boundary matrix duals |
Pavan Kumar Yerra et.al. |
2304.14988v1 |
null |
| 2023-04-28 |
Quality-Adaptive Split-Federated Learning for Segmenting Medical Images with Inaccurate Annotations |
Zahra Hafezi Kafshgari et.al. |
2304.14976v1 |
null |
| 2023-04-28 |
Embodiment perception of a smart home assistant |
Mariya Kilina et.al. |
2304.14947v1 |
null |
| 2023-04-28 |
The Emotions of the Crowd: Learning Image Sentiment from Tweets via Cross-modal Distillation |
Alessio Serra et.al. |
2304.14942v1 |
null |
| 2023-04-28 |
Contactless hand tremor amplitude measurement using smartphones: development and pilot evaluation |
James Bungay et.al. |
2304.14937v1 |
null |
| 2023-04-28 |
Unified Noise-aware Network for Low-count PET Denoising |
Huidong Xie et.al. |
2304.14900v1 |
null |
| 2023-04-28 |
Making the Invisible Visible: Toward High-Quality Terahertz Tomographic Imaging via Physics-Guided Restoration |
Weng-Tai Su et.al. |
2304.14894v1 |
null |
| 2023-04-28 |
Dense Hybrid Proposal Modulation for Lane Detection |
Yuejian Wu et.al. |
2304.14874v1 |
link |
| 2023-04-28 |
Evaluating the Stability of Semantic Concept Representations in CNNs for Robust Explainability |
Georgii Mikriukov et.al. |
2304.14864v1 |
null |
| 2023-04-27 |
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System |
Junke Wang et.al. |
2304.14407v1 |
null |
| 2023-04-27 |
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes |
Sumith Kulal et.al. |
2304.14406v1 |
link |
| 2023-04-27 |
ViMQ: A Vietnamese Medical Question Dataset for Healthcare Dialogue System Development |
Ta Duc Huy et.al. |
2304.14405v1 |
link |
| 2023-04-27 |
Motion-Conditioned Diffusion Model for Controllable Video Synthesis |
Tsai-Shien Chen et.al. |
2304.14404v1 |
null |
| 2023-04-27 |
ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs |
Jiteng Mu et.al. |
2304.14401v1 |
null |
| 2023-04-27 |
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking |
Xin Chen et.al. |
2304.14394v1 |
link |
| 2023-04-27 |
SLoMo: A General System for Legged Robot Motion Imitation from Casual Videos |
John Z. Zhang et.al. |
2304.14389v1 |
null |
| 2023-04-27 |
Learning Neural Constitutive Laws From Motion Observations for Generalizable PDE Dynamics |
Pingchuan Ma et.al. |
2304.14369v1 |
null |
| 2023-04-27 |
CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants |
Albert Yu Sun et.al. |
2304.14364v1 |
null |
| 2023-04-27 |
Double-Deck Multi-Agent Pickup and Delivery: Multi-Robot Rearrangement in Large-Scale Warehouses |
Baiyu Li et.al. |
2304.14309v1 |
null |
| 2023-04-26 |
A Control-Centric Benchmark for Video Prediction |
Stephen Tian et.al. |
2304.13723v1 |
link |
| 2023-04-26 |
Association Rules Mining with Auto-Encoders |
Théophile Berteloot et.al. |
2304.13717v1 |
null |
| 2023-04-26 |
Random Additive Polynomials |
Lior Bary-Soroker et.al. |
2304.13709v1 |
null |
| 2023-04-27 |
Rigidity, Generators and Homology of Interval Exchange Groups |
Owen Tanner et.al. |
2304.13691v2 |
null |
| 2023-04-26 |
Pseudo-periodic map and classification of theories with eight supercharges |
Dan Xie et.al. |
2304.13663v1 |
null |
| 2023-04-26 |
PVP: Pre-trained Visual Parameter-Efficient Tuning |
Zhao Song et.al. |
2304.13639v1 |
null |
| 2023-04-26 |
HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis |
Saheed Abdullahi Salahudeen et.al. |
2304.13634v1 |
link |
| 2023-04-26 |
HDR-VDP-3: A multi-metric for predicting image differences, quality and contrast distortions in high dynamic range and regular content |
Rafal K. Mantiuk et.al. |
2304.13625v1 |
null |
| 2023-04-26 |
Shades of meaning: Uncovering the geometry of ambiguous word representations through contextualised language models |
Benedetta Cevoli et.al. |
2304.13597v1 |
null |
| 2023-04-26 |
Video Frame Interpolation with Densely Queried Bilateral Correlation |
Chang Zhou et.al. |
2304.13596v1 |
link |
| 2023-04-25 |
Bake off redux: a review and experimental evaluation of recent time series classification algorithms |
Matthew Middlehurst et.al. |
2304.13029v1 |
null |
| 2023-04-25 |
Flickr-PAD: New Face High-Resolution Presentation Attack Detection Database |
Diego Pasmino et.al. |
2304.13015v1 |
link |
| 2023-04-25 |
Methods and datasets for segmentation of minimally invasive surgical instruments in endoscopic images and videos: A review of the state of the art |
Tobias Rueckert et.al. |
2304.13014v1 |
null |
| 2023-04-25 |
The Potential of Visual ChatGPT For Remote Sensing |
Lucas Prado Osco et.al. |
2304.13009v1 |
null |
| 2023-04-25 |
PoseVocab: Learning Joint-structured Pose Embeddings for Human Avatar Modeling |
Zhe Li et.al. |
2304.13006v1 |
link |
| 2023-04-25 |
Segment anything, from space? |
Simiao Ren et.al. |
2304.13000v1 |
null |
| 2023-04-25 |
Multi-Scale Feature Fusion using Parallel-Attention Block for COVID-19 Chest X-ray Diagnosis |
Xiao Qi et.al. |
2304.12988v1 |
null |
| 2023-04-25 |
Quantifying the Effect of Image Similarity on Diabetic Foot Ulcer Classification |
Imran Chowdhury Dipto et.al. |
2304.12987v1 |
null |
| 2023-04-25 |
How to account for behavioural states in step-selection analysis: a model comparison |
Jennifer Pohle et.al. |
2304.12964v1 |
null |
| 2023-04-25 |
Latent diffusion models for generative precipitation nowcasting with accurate uncertainty quantification |
Jussi Leinonen et.al. |
2304.12891v1 |
null |
| 2023-04-24 |
Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis |
Chonghyuk Song et.al. |
2304.12317v1 |
null |
| 2023-04-24 |
Segment Anything in Medical Images |
Jun Ma et.al. |
2304.12306v1 |
link |
| 2023-04-24 |
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation |
Takehiko Ohkawa et.al. |
2304.12301v1 |
null |
| 2023-04-24 |
Large-capacity and Flexible Video Steganography via Invertible Neural Network |
Chong Mou et.al. |
2304.12300v1 |
link |
| 2023-04-24 |
On local delta invariant of del Pezzo surfaces |
Erroxe Etxabarri Alberdi et.al. |
2304.12286v1 |
null |
| 2023-04-24 |
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video |
Jia-Wei Liu et.al. |
2304.12281v1 |
null |
| 2023-04-24 |
Dynamic generation and attribution of revenues in a video platform |
Francisco Lopez-Navarrete et.al. |
2304.12268v1 |
null |
| 2023-04-24 |
Pre-Training Strategies Using Contrastive Learning and Playlist Information for Music Classification and Similarity |
Pablo Alonso-Jiménez et.al. |
2304.12257v1 |
null |
| 2023-04-24 |
Ordinal time series analysis with the R package otsfeatures |
Ángel López Oriona et.al. |
2304.12251v1 |
null |
| 2023-04-24 |
Classification of regular subalgebras of injective type III factors |
Soham Chakraborty et.al. |
2304.12243v1 |
null |
| 2023-04-21 |
Implicit Neural Head Synthesis via Controllable Local Deformation Fields |
Chuhan Chen et.al. |
2304.11113v1 |
null |
| 2023-04-21 |
A Convolutional Spiking Network for Gesture Recognition in Brain-Computer Interfaces |
Yiming Ai et.al. |
2304.11106v1 |
null |
| 2023-04-21 |
Classification of solutions to Hardy-Sobolev Doubly Critical Systems |
Francesco Esposito et.al. |
2304.11066v1 |
null |
| 2023-04-24 |
CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval |
Shangda Wu et.al. |
2304.11029v2 |
link |
| 2023-04-21 |
3d mirror symmetry of braided tensor categories |
Andrew Ballin et.al. |
2304.11001v1 |
null |
| 2023-04-21 |
Information Extraction from Documents: Question Answering vs Token Classification in real-world setups |
Laurent Lam et.al. |
2304.10994v1 |
null |
| 2023-04-21 |
LEIA: Linguistic Embeddings for the Identification of Affect |
Segun Taofeek Aroyehun et.al. |
2304.10973v1 |
null |
| 2023-04-21 |
Factored Neural Representation for Scene Understanding |
Yu-Shiang Wong et.al. |
2304.10950v1 |
null |
| 2023-04-21 |
A combined approach to analyze and classify families of classical spin liquids |
Naïmo Davier et.al. |
2304.10906v1 |
null |
| 2023-04-21 |
AMP in the wild: Learning robust, agile, natural legged locomotion skills |
Yikai Wang et.al. |
2304.10888v1 |
null |
| 2023-04-20 |
Learning Sparse and Low-Rank Priors for Image Recovery via Iterative Reweighted Least Squares Minimization |
Stamatios Lefkimmiatis et.al. |
2304.10536v1 |
null |
| 2023-04-20 |
Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion |
Tomas Jakab et.al. |
2304.10535v1 |
null |
| 2023-04-20 |
Multidimensional Uncertainty Quantification for Deep Neural Networks |
Xujiang Zhao et.al. |
2304.10527v1 |
null |
| 2023-04-20 |
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget |
Johannes Lehner et.al. |
2304.10520v1 |
link |
| 2023-04-20 |
"Can We Detect Substance Use Disorder?": Knowledge and Time Aware Classification on Social Media from Darkweb |
Usha Lokala et.al. |
2304.10512v1 |
null |
| 2023-04-20 |
Reconstructing Signing Avatars From Video Using Linguistic Priors |
Maria-Paola Forte et.al. |
2304.10482v1 |
null |
| 2023-04-20 |
Implicit Temporal Modeling with Learnable Alignment for Video Recognition |
Shuyuan Tu et.al. |
2304.10465v1 |
link |
| 2023-04-20 |
Angle based dynamic learning rate for gradient descent |
Neel Mishra et.al. |
2304.10457v1 |
link |
| 2023-04-20 |
On the classification of singular cubic threefolds |
Sasha Viktorova et.al. |
2304.10452v1 |
null |
| 2023-04-20 |
Domain-specific Continued Pretraining of Language Models for Capturing Long Context in Mental Health |
Shaoxiong Ji et.al. |
2304.10447v1 |
null |
| 2023-04-19 |
Transformer-Based Visual Segmentation: A Survey |
Xiangtai Li et.al. |
2304.09854v1 |
link |
| 2023-04-19 |
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation |
Zhen Li et.al. |
2304.09790v1 |
link |
| 2023-04-19 |
Automatic Interaction and Activity Recognition from Videos of Human Manual Demonstrations with Application to Anomaly Detection |
Elena Merlo et.al. |
2304.09789v1 |
null |
| 2023-04-19 |
Advances on Concept Drift Detection in Regression Tasks using Social Networks Theory |
Jean Paul Barddal et.al. |
2304.09788v1 |
null |
| 2023-04-20 |
Application of attention-based Siamese composite neural network in medical image recognition |
Zihao Huang et.al. |
2304.09783v2 |
null |
| 2023-04-19 |
Equalised Odds is not Equal Individual Odds: Post-processing for Group and Individual Fairness |
Edward A. Small et.al. |
2304.09779v1 |
null |
| 2023-04-19 |
An End-to-End Vehicle Trajectory Prediction Framework |
Fuad Hasan et.al. |
2304.09764v1 |
null |
| 2023-04-19 |
Rehabilitation Exercise Repetition Segmentation and Counting using Skeletal Body Joints |
Ali Abedi et.al. |
2304.09735v1 |
link |
| 2023-04-19 |
Hyperspectral Image Analysis with Subspace Learning-based One-Class Classification |
Sertac Kilickaya et.al. |
2304.09730v1 |
null |
| 2023-04-20 |
Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate |
Songhua Liu et.al. |
2304.09728v2 |
link |
| 2023-04-18 |
Hyperbolic Image-Text Representations |
Karan Desai et.al. |
2304.09172v1 |
null |
| 2023-04-18 |
Optimal PAC Bounds Without Uniform Convergence |
Ishaq Aden-Ali et.al. |
2304.09167v1 |
null |
| 2023-04-18 |
Structure Preserving Cycle-GAN for Unsupervised Medical Image Domain Adaptation |
Paolo Iacono et.al. |
2304.09164v1 |
null |
| 2023-04-18 |
Detection and Classification of Glioblastoma Brain Tumor |
Utkarsh Maurya et.al. |
2304.09133v1 |
null |
| 2023-04-18 |
Variational Relational Point Completion Network for Robust 3D Classification |
Liang Pan et.al. |
2304.09131v1 |
null |
| 2023-04-18 |
Safety Guaranteed Manipulation Based on Reinforcement Learning Planner and Model Predictive Control Actor |
Zhenshan Bing et.al. |
2304.09119v1 |
null |
| 2023-04-18 |
Sliced Optimal Transport on the Sphere |
Michael Quellmalz et.al. |
2304.09092v1 |
null |
| 2023-04-18 |
A general review on the NLS equation with point-concentrated nonlinearity |
Lorenzo Tentarelli et.al. |
2304.09086v1 |
null |
| 2023-04-18 |
Performance of GAN-based augmentation for deep learning COVID-19 image classification |
Oleksandr Fedoruk et.al. |
2304.09067v1 |
null |
| 2023-04-18 |
KSB stability is automatic in codimension 3 |
János Kollár et.al. |
2304.09009v1 |
null |
| 2023-04-17 |
Conditional Generation of Audio from Video via Foley Analogies |
Yuexi Du et.al. |
2304.08490v1 |
link |
| 2023-04-17 |
Affordances from Human Videos as a Versatile Representation for Robotics |
Shikhar Bahl et.al. |
2304.08488v1 |
null |
| 2023-04-17 |
Text2Performer: Text-Driven Human Video Generation |
Yuming Jiang et.al. |
2304.08483v1 |
link |
| 2023-04-18 |
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation |
Jie An et.al. |
2304.08477v2 |
null |
| 2023-04-17 |
Synthetic Data from Diffusion Models Improves ImageNet Classification |
Shekoofeh Azizi et.al. |
2304.08466v1 |
null |
| 2023-04-17 |
Efficient Video Action Detection with Token Dropout and Context Refinement |
Lei Chen et.al. |
2304.08451v1 |
null |
| 2023-04-17 |
Morph-SSL: Self-Supervision with Longitudinal Morphing to Predict AMD Progression from OCT |
Arunava Chakravarty et.al. |
2304.08439v1 |
null |
| 2023-04-17 |
CAViaR: Context Aware Video Recommendations |
Khushhall Chandra Mahajan et.al. |
2304.08435v1 |
null |
| 2023-04-17 |
Tame symmetric algebras of period four |
Karin Erdmann et.al. |
2304.08414v1 |
null |
| 2023-04-17 |
OVTrack: Open-Vocabulary Multiple Object Tracking |
Siyuan Li et.al. |
2304.08408v1 |
null |
| 2023-04-14 |
CAD-RADS scoring of coronary CT angiography with Multi-Axis Vision Transformer: a clinically-inspired deep learning pipeline |
Alessia Gerbasi et.al. |
2304.07277v1 |
null |
| 2023-04-14 |
Genus Comparisons in the Topological Analysis of RNA Structures |
Nicolò Cangiotti et.al. |
2304.07273v1 |
null |
| 2023-04-14 |
Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks |
Mofassir ul Islam Arif et.al. |
2304.07262v1 |
null |
| 2023-04-14 |
The University of California San Francisco, Brain Metastases Stereotactic Radiosurgery (UCSF-BMSR) MRI Dataset |
Jeffrey D. Rudie et.al. |
2304.07248v1 |
null |
| 2023-04-14 |
Covidia: COVID-19 Interdisciplinary Academic Knowledge Graph |
Cheng Deng et.al. |
2304.07242v1 |
null |
| 2023-04-14 |
PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute Recognition |
Xinwen Fan et.al. |
2304.07230v1 |
link |
| 2023-04-14 |
Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models |
Yaohua Zha et.al. |
2304.07221v1 |
link |
| 2023-04-14 |
Emergency Resource Layout with Multiple Objectives under Complex Disaster Scenarios |
Changwei Yuan et.al. |
2304.07216v1 |
null |
| 2023-04-14 |
Radio Galaxy Zoo EMU: Towards a Semantic Radio Galaxy Morphology Taxonomy |
Micah Bowles et.al. |
2304.07171v1 |
null |
| 2023-04-14 |
Robust thalamic nuclei segmentation from T1-weighted MRI |
Julie P. Vidal et.al. |
2304.07167v1 |
null |
| 2023-04-13 |
Representing Volumetric Videos as Dynamic MLP Maps |
Sida Peng et.al. |
2304.06717v1 |
null |
| 2023-04-13 |
What does CLIP know about a red circle? Visual prompt engineering for VLMs |
Aleksandar Shtedritski et.al. |
2304.06712v1 |
null |
| 2023-04-13 |
DiffusionRig: Learning Personalized Priors for Facial Appearance Editing |
Zheng Ding et.al. |
2304.06711v1 |
null |
| 2023-04-13 |
Remote Sensing Change Detection With Transformers Trained from Scratch |
Mubashir Noman et.al. |
2304.06710v1 |
link |
| 2023-04-13 |
Verbs in Action: Improving verb understanding in video-language models |
Liliane Momeni et.al. |
2304.06708v1 |
null |
| 2023-04-13 |
How Will It Drape Like? Capturing Fabric Mechanics from Depth Images |
Carlos Rodriguez-Pardo et.al. |
2304.06704v1 |
null |
| 2023-04-13 |
Learning Controllable 3D Diffusion Models from Single-view Images |
Jiatao Gu et.al. |
2304.06700v1 |
null |
| 2023-04-13 |
Improving novelty detection with generative adversarial networks on hand gesture data |
Miguel Simão et.al. |
2304.06696v1 |
null |
| 2023-04-13 |
LSFSL: Leveraging Shape Information in Few-shot Learning |
Deepan Chakravarthi Padmanabhan et.al. |
2304.06672v1 |
null |
| 2023-04-13 |
Do deep neural networks have an inbuilt Occam's razor? |
Chris Mingard et.al. |
2304.06670v1 |
null |
| 2023-04-12 |
Fluctuation based interpretable analysis scheme for quantum many-body snapshots |
Henning Schlömer et.al. |
2304.06029v1 |
null |
| 2023-04-12 |
RECLIP: Resource-efficient CLIP by Training with Small Images |
Runze Li et.al. |
2304.06028v1 |
null |
| 2023-04-12 |
Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA |
James Seale Smith et.al. |
2304.06027v1 |
null |
| 2023-04-12 |
DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion |
Johanna Karras et.al. |
2304.06025v1 |
null |
| 2023-04-12 |
VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs |
Moayed Haji Ali et.al. |
2304.06020v1 |
null |
| 2023-04-12 |
Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera |
Ruicheng Feng et.al. |
2304.06019v1 |
link |
| 2023-04-12 |
Adaptive Human Matting for Dynamic Videos |
Chung-Ching Lin et.al. |
2304.06018v1 |
null |
| 2023-04-12 |
Exploiting Logic Locking for a Neural Trojan Attack on Machine Learning Accelerators |
Hongye Xu et.al. |
2304.06017v1 |
null |
| 2023-04-12 |
An Improved Heart Disease Prediction Using Stacked Ensemble Method |
Md. Maidul Islam et.al. |
2304.06015v1 |
null |
| 2023-04-12 |
Rigidly-rotating scalar fields: between real divergence and imaginary fractalization |
Victor E. Ambruş et.al. |
2304.05998v1 |
null |
| 2023-04-11 |
Discrete pre-Tannakian categories |
Nate Harman et.al. |
2304.05375v1 |
null |
| 2023-04-11 |
A comparative study between paired and unpaired Image Quality Assessment in Low-Dose CT Denoising |
Francesco Di Feola et.al. |
2304.05359v1 |
link |
| 2023-04-11 |
On Elliott's conjecture and applications |
Oleksiy Klurman et.al. |
2304.05344v1 |
null |
| 2023-04-11 |
Bayesian Optimization of Catalysts With In-context Learning |
Mayk Caldas Ramos et.al. |
2304.05341v1 |
link |
| 2023-04-11 |
Unified Multi-Modal Image Synthesis for Missing Modality Imputation |
Yue Zhang et.al. |
2304.05340v1 |
null |
| 2023-04-11 |
Deep-learning assisted detection and quantification of (oo)cysts of Giardia and Cryptosporidium on smartphone microscopy images |
Suprim Nakarmi et.al. |
2304.05339v1 |
null |
| 2023-04-11 |
Neural Delay Differential Equations: System Reconstruction and Image Classification |
Qunxi Zhu et.al. |
2304.05310v1 |
null |
| 2023-04-11 |
A Comprehensive Study on Object Detection Techniques in Unconstrained Environments |
Hrishitva Patel et.al. |
2304.05295v1 |
null |
| 2023-04-11 |
MC-ViViT: Multi-branch Classifier-ViViT to Detect Mild Cognitive Impairment in Older Adults using Facial Videos |
Jian Sun et.al. |
2304.05292v1 |
null |
| 2023-04-11 |
YouNICon: YouTube's CommuNIty of Conspiracy Videos |
Shaoyi Liaw et.al. |
2304.05274v1 |
null |
| 2023-04-10 |
Detection Transformer with Stable Matching |
Shilong Liu et.al. |
2304.04742v1 |
link |
| 2023-04-10 |
Brain Extraction comparing Segment Anything Model (SAM) and FSL Brain Extraction Tool |
Sovesh Mohapatra et.al. |
2304.04738v1 |
null |
| 2023-04-10 |
Artifact magnification on deepfake videos increases human detection and subjective confidence |
Emilie Josephs et.al. |
2304.04733v1 |
null |
| 2023-04-10 |
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition |
Shuhuai Ren et.al. |
2304.04704v1 |
link |
| 2023-04-10 |
Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation |
Inkyu Shin et.al. |
2304.04694v1 |
null |
| 2023-04-11 |
Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection |
Wei-Jhe Huang et.al. |
2304.04688v2 |
null |
| 2023-04-10 |
Learning to Detect Touches on Cluttered Tables |
Norberto Adrian Goussies et.al. |
2304.04687v1 |
null |
| 2023-04-10 |
ECG-CL: A Comprehensive Electrocardiogram Interpretation Method Based on Continual Learning |
Hongxiang Gao et.al. |
2304.04646v1 |
null |
| 2023-04-10 |
Improving ABR Performance for Short Video Streaming Using Multi-Agent Reinforcement Learning with Expert Guidance |
Yueheng Li et.al. |
2304.04637v1 |
null |
| 2023-04-10 |
VARS: Video Assistant Referee System for Automated Soccer Decision Making from Multiple Views |
Jan Held et.al. |
2304.04617v1 |
null |
| 2023-04-07 |
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens |
Ziteng Gao et.al. |
2304.03768v1 |
null |
| 2023-04-07 |
Zero-shot CT Field-of-view Completion with Unconditional Generative Diffusion Prior |
Kaiwen Xu et.al. |
2304.03760v1 |
null |
| 2023-04-07 |
An unsupervised segmentation of vocal breath sounds |
Shivani Yadav et.al. |
2304.03758v1 |
null |
| 2023-04-07 |
Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering |
Hung-Ting Su et.al. |
2304.03754v1 |
null |
| 2023-04-07 |
Three-dimensional morphology of an ultrafine Al-Si eutectic produced via laser rapid solidification |
Xinyi Zhou et.al. |
2304.03740v1 |
null |
| 2023-04-07 |
Integrating Edge-AI in Structural Health Monitoring domain |
Anoop Mishra et.al. |
2304.03718v1 |
null |
| 2023-04-07 |
Meta-causal Learning for Single Domain Generalization |
Jin Chen et.al. |
2304.03709v1 |
null |
| 2023-04-07 |
Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge |
Gongning Luo et.al. |
2304.03708v1 |
null |
| 2023-04-07 |
Idaho Blacks: Quiet Economic Triumph of Enduring Champions |
Rama K. Malladi et.al. |
2304.03676v1 |
null |
| 2023-04-07 |
An Accessible Toolkit for 360 VR Studies |
Corrie Green et.al. |
2304.03652v1 |
null |
| 2023-04-06 |
SegGPT: Segmenting Everything In Context |
Xinlong Wang et.al. |
2304.03284v1 |
link |
| 2023-04-06 |
Diffusion Models as Masked Autoencoders |
Chen Wei et.al. |
2304.03283v1 |
null |
| 2023-04-06 |
That's What I Said: Fully-Controllable Talking Face Generation |
Youngjoon Jang et.al. |
2304.03275v1 |
null |
| 2023-04-06 |
ImageEye: Batch Image Processing Using Program Synthesis |
Celeste Barnaby et.al. |
2304.03253v1 |
null |
| 2023-04-06 |
Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts |
Chenyu You et.al. |
2304.03209v1 |
null |
| 2023-04-06 |
Face Animation with an Attribute-Guided Diffusion Model |
Bohan Zeng et.al. |
2304.03199v1 |
link |
| 2023-04-06 |
RFAConv: Innovating Spatital Attention and Standard Convolutional Operation |
Xin Zhang et.al. |
2304.03198v1 |
link |
| 2023-04-06 |
Micron-BERT: BERT-based Facial Micro-Expression Recognition |
Xuan-Bac Nguyen et.al. |
2304.03195v1 |
link |
| 2023-04-06 |
Improving automatic endoscopic stone recognition using a multi-view fusion approach enhanced with two-step transfer learning |
Francisco Lopez-Tiro et.al. |
2304.03193v1 |
null |
| 2023-04-06 |
The Concept of Forward-Forward Learning Applied to a Multi Output Perceptron |
K. Fredrik Karlsson et.al. |
2304.03189v1 |
link |
| 2023-04-05 |
Self-Distillation for Gaussian Process Regression and Classification |
Kenneth Borup et.al. |
2304.02641v1 |
link |
| 2023-04-05 |
HNeRV: A Hybrid Neural Representation for Videos |
Hao Chen et.al. |
2304.02633v1 |
link |
| 2023-04-05 |
High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation |
Arvi Jonnarth et.al. |
2304.02621v1 |
null |
| 2023-04-05 |
A note on the classification of positive solutions to the critical p-Laplace equation in $\mathbb{R}^n$ |
Jérôme Vétois et.al. |
2304.02600v1 |
null |
| 2023-04-05 |
ECG Feature Importance Rankings: Cardiologists vs. Algorithms |
Temesgen Mehari et.al. |
2304.02577v1 |
null |
| 2023-04-05 |
Deep learning estimation of modified Zernike coefficients for image point spread functions |
Abu Bucker Siddik et.al. |
2304.02576v1 |
null |
| 2023-04-05 |
VicTR: Video-conditioned Text Representations for Activity Recognition |
Kumara Kahatapitiya et.al. |
2304.02560v1 |
null |
| 2023-04-05 |
Detecting and Grounding Multi-Modal Media Manipulation |
Rui Shao et.al. |
2304.02556v1 |
link |
| 2023-04-05 |
Self-Supervised Siamese Autoencoders |
Friederike Baier et.al. |
2304.02549v1 |
null |
| 2023-04-05 |
Multi-annotator Deep Learning: A Probabilistic Framework for Classification |
Marek Herde et.al. |
2304.02539v1 |
link |
| 2023-04-04 |
NPC: Neural Point Characters from Video |
Shih-Yang Su et.al. |
2304.02013v1 |
null |
| 2023-04-04 |
EGC: Image Generation and Classification via a Single Energy-Based Model |
Qiushan Guo et.al. |
2304.02012v1 |
link |
| 2023-04-04 |
FakET: Simulating Cryo-Electron Tomograms with Neural Style Transfer |
Pavol Harar et.al. |
2304.02011v1 |
link |
| 2023-04-04 |
MonoHuman: Animatable Human Neural Field from Monocular Video |
Zhengming Yu et.al. |
2304.02001v1 |
null |
| 2023-04-05 |
Waving Goodbye to Low-Res: A Diffusion-Wavelet Approach for Image Super-Resolution |
Brian Moser et.al. |
2304.01994v2 |
link |
| 2023-04-04 |
Cross-modulated Few-shot Image Generation for Colorectal Tissue Classification |
Amandeep Kumar et.al. |
2304.01992v1 |
link |
| 2023-04-04 |
Side Channel-Assisted Inference Leakage from Machine Learning-based ECG Classification |
Jialin Liu et.al. |
2304.01990v1 |
null |
| 2023-04-04 |
Kinetic relaxation and Bose-star formation in multicomponent dark matter- I |
Mudit Jain et.al. |
2304.01985v1 |
null |
| 2023-04-04 |
Reactive Multi-agent Coordination using Auction-based Task Allocation and Behavior Trees |
Niklas Dahlquist et.al. |
2304.01976v1 |
null |
| 2023-04-04 |
MEGClass: Text Classification with Extremely Weak Supervision via Mutually-Enhancing Text Granularities |
Priyanka Kargupta et.al. |
2304.01969v1 |
link |
| 2023-04-03 |
Neural Volumetric Memory for Visual Locomotion Control |
Ruihan Yang et.al. |
2304.01201v1 |
null |
| 2023-04-03 |
Video Instance Segmentation in an Open-World |
Omkar Thawakar et.al. |
2304.01200v1 |
link |
| 2023-04-03 |
Zero-Shot Semantic Segmentation with Decoupled One-Pass Network |
Cong Han et.al. |
2304.01198v1 |
null |
| 2023-04-03 |
Bringing Telepresence to Every Desk |
Shengze Wang et.al. |
2304.01197v1 |
null |
| 2023-04-03 |
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos |
Yue Ma et.al. |
2304.01186v1 |
link |
| 2023-04-03 |
WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation |
Lianghui Zhu et.al. |
2304.01184v1 |
link |
| 2023-04-03 |
Hate Speech Targets Detection in Parler using BERT |
Nadav Schneider et.al. |
2304.01179v1 |
link |
| 2023-04-03 |
DribbleBot: Dynamic Legged Manipulation in the Wild |
Yandong Ji et.al. |
2304.01159v1 |
null |
| 2023-04-03 |
Algebraic and Geometric Models for Space Networking |
William Bernardoni et.al. |
2304.01150v1 |
null |
| 2023-04-03 |
Use Your Head: Improving Long-Tail Video Recognition |
Toby Perrett et.al. |
2304.01143v1 |
null |
| 2023-03-31 |
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? |
Arjun Majumdar et.al. |
2303.18240v1 |
null |
| 2023-03-31 |
DIME-FM: DIstilling Multimodal and Efficient Foundation Models |
Ximeng Sun et.al. |
2303.18232v1 |
null |
| 2023-03-31 |
Procedure-Aware Pretraining for Instructional Video Understanding |
Honglu Zhou et.al. |
2303.18230v1 |
link |
| 2023-03-31 |
A Closer Look at Few-Shot 3D Point Cloud Classification |
Chuangguan Ye et.al. |
2303.18210v1 |
null |
| 2023-03-31 |
SimTS: Rethinking Contrastive Representation Learning for Time Series Forecasting |
Xiaochen Zheng et.al. |
2303.18205v1 |
link |
| 2023-03-31 |
Towards a Classification of Charge-3 Monopoles with Symmetry |
H. W. Braden et.al. |
2303.18189v1 |
null |
| 2023-04-03 |
Three-dimensional coherent diffraction snapshot imaging using extreme ultraviolet radiation from a free electron laser |
Danny Fainozzi et.al. |
2303.18166v2 |
null |
| 2023-03-31 |
Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction |
Delin Qu et.al. |
2303.18125v1 |
null |
| 2023-03-31 |
BERTino: an Italian DistilBERT model |
Matteo Muffo et.al. |
2303.18121v1 |
link |
| 2023-03-31 |
A two-head loss function for deep Average-K classification |
Camille Garcin et.al. |
2303.18118v1 |
null |
| 2023-03-30 |
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models |
Wen Wang et.al. |
2303.17599v1 |
link |
| 2023-03-30 |
Consistent View Synthesis with Pose-Guided Diffusion Models |
Hung-Yu Tseng et.al. |
2303.17598v1 |
null |
| 2023-03-30 |
MobileInst: Video Instance Segmentation on the Mobile |
Renhong Zhang et.al. |
2303.17594v1 |
null |
| 2023-03-30 |
Anatomically aware dual-hop learning for pulmonary embolism detection in CT pulmonary angiograms |
Florin Condrea et.al. |
2303.17593v1 |
null |
| 2023-03-30 |
SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger |
Yuting Gao et.al. |
2303.17561v1 |
null |
| 2023-03-30 |
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder |
Chenpng Du et.al. |
2303.17550v1 |
null |
| 2023-03-30 |
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment |
Kim Sung-Bin et.al. |
2303.17490v1 |
null |
| 2023-03-30 |
PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation |
Qitao Zhao et.al. |
2303.17472v1 |
link |
| 2023-03-30 |
NN-Copula-CD: A Copula-Guided Interpretable Neural Network for Change Detection in Heterogeneous Remote Sensing Images |
Weiming Li et.al. |
2303.17448v1 |
null |
| 2023-03-30 |
Steered Mixture of Experts Regression for Image Denoising with Multi-Model-Inference |
Aytac Özkan et.al. |
2303.17409v1 |
null |
| 2023-03-29 |
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos |
Kun Su et.al. |
2303.16897v1 |
null |
| 2023-03-29 |
Towards Understanding the Effect of Pretraining Label Granularity |
Guan Zhe Hong et.al. |
2303.16887v1 |
null |
| 2023-03-29 |
End-to-End $n$-ary Relation Extraction for Combination Drug Therapies |
Yuhang Jiang et.al. |
2303.16886v1 |
link |
| 2023-03-29 |
CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network |
Ruyi Lian et.al. |
2303.16874v1 |
null |
| 2023-03-29 |
A Video-based End-to-end Pipeline for Non-nutritive Sucking Action Recognition and Segmentation in Young Infants |
Shaotong Zhu et.al. |
2303.16867v1 |
link |
| 2023-03-29 |
The puzzle of the formation of T8 dwarf Ross 458c |
Josefine Gaarn et.al. |
2303.16863v1 |
null |
| 2023-03-29 |
Beyond Empirical Risk Minimization: Local Structure Preserving Regularization for Improving Adversarial Robustness |
Wei Wei et.al. |
2303.16861v1 |
null |
| 2023-03-29 |
Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data |
Bin Feng et.al. |
2303.16856v1 |
link |
| 2023-03-30 |
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks |
Weicheng Kuo et.al. |
2303.16839v2 |
null |
| 2023-03-29 |
bAIoimage analysis: elevating the rate of scientific discovery -- as a community |
Damian Edward Dalle Nogare et.al. |
2303.16743v1 |
null |
| 2023-03-29 |
Your Diffusion Model is Secretly a Zero-Shot Classifier |
Alexander C. Li et.al. |
2303.16203v2 |
link |
| 2023-03-28 |
Natural Selection Favors AIs over Humans |
Dan Hendrycks et.al. |
2303.16200v1 |
null |
| 2023-03-28 |
Forecasting localized weather impacts on vegetation as seen from space with meteo-guided video prediction |
Vitus Benson et.al. |
2303.16198v1 |
link |
| 2023-03-28 |
Automorphisms of del Pezzo surfaces in odd characteristic |
Igor Dolgachev et.al. |
2303.16170v1 |
null |
| 2023-03-28 |
Comparison of HDR quality metrics in Per-Clip Lagrangian multiplier optimisation with AV1 |
Vibhoothi et.al. |
2303.16163v1 |
link |
| 2023-03-28 |
A Comparative Study of Federated Learning Models for COVID-19 Detection |
Erfan Darzidehkalani et.al. |
2303.16141v1 |
null |
| 2023-03-28 |
XRBcats: Galactic High Mass X-ray Binary Catalogue |
Marvin Neumann et.al. |
2303.16137v1 |
null |
| 2023-03-28 |
Transformer and Snowball Graph Convolution Learning for Biomedical Graph Classification |
Jinlong Hu et.al. |
2303.16132v1 |
null |
| 2023-03-28 |
Evaluating the Effectiveness of 2D and 3D Features for Predicting Tumor Response to Chemotherapy |
Neman Abdoli et.al. |
2303.16123v1 |
null |
| 2023-03-28 |
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection |
Lei Chen et.al. |
2303.16118v1 |
null |
| 2023-03-27 |
GeoNet: Benchmarking Unsupervised Adaptation across Geographies |
Tarun Kalluri et.al. |
2303.15443v1 |
null |
| 2023-03-27 |
Zero-shot Model Diagnosis |
Jinqi Luo et.al. |
2303.15441v1 |
null |
| 2023-03-27 |
TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models |
Md Kamrul Hasan et.al. |
2303.15430v1 |
null |
| 2023-03-27 |
JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields |
Xi Wang et.al. |
2303.15427v1 |
null |
| 2023-03-27 |
ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging |
Alessandro Fontanella et.al. |
2303.15421v1 |
null |
| 2023-03-27 |
3D Video Object Detection with Learnable Object-Centric Global Optimization |
Jiawei He et.al. |
2303.15416v1 |
link |
| 2023-03-27 |
Classifier Robustness Enhancement Via Test-Time Transformation |
Tsachi Blau et.al. |
2303.15409v1 |
null |
| 2023-03-27 |
Generalizable Neural Voxels for Fast Human Radiance Fields |
Taoran Yi et.al. |
2303.15387v1 |
null |
| 2023-03-27 |
List Online Classification |
Shay Moran et.al. |
2303.15383v1 |
null |
| 2023-03-27 |
Quantum-inspired classification based on quantum state discrimination |
Emmanuel Zambrini Cruzeiro et.al. |
2303.15353v1 |
null |
| 2023-03-24 |
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization |
Pavan Kumar Anasosalu Vasu et.al. |
2303.14189v1 |
null |
| 2023-03-24 |
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects |
Bowen Wen et.al. |
2303.14158v1 |
null |
| 2023-03-24 |
Adversarial Attack and Defense for Medical Image Analysis: Methods and Applications |
Junhao Dong et.al. |
2303.14133v1 |
null |
| 2023-03-24 |
CIFAKE: Image Classification and Explainable Identification of AI-Generated Synthetic Images |
Jordan J. Bird et.al. |
2303.14126v1 |
null |
| 2023-03-24 |
Towards Scalable Neural Representation for Diverse Videos |
Bo He et.al. |
2303.14124v1 |
null |
| 2023-03-24 |
Prediction of the morphological evolution of a splashing drop using an encoder-decoder |
Jingzu Yee et.al. |
2303.14109v1 |
null |
| 2023-03-24 |
Parallel and totally umbilical hypersurfaces of the four-dimensional Thurston geometry $\text{Sol}^4_0$ |
Marie D'haene et.al. |
2303.14105v1 |
null |
| 2023-03-24 |
PanoVPR: Towards Unified Perspective-to-Equirectangular Visual Place Recognition via Sliding Windows across the Panoramic View |
Ze Shi et.al. |
2303.14095v1 |
link |
| 2023-03-24 |
CoLa-Diff: Conditional Latent Diffusion Model for Multi-Modal MRI Synthesis |
Lan Jiang et.al. |
2303.14081v1 |
null |
| 2023-03-27 |
The ImSPOC snapshot imaging spectrometer: image formation model and device characterization |
Daniele Picone et.al. |
2303.14076v2 |
link |
| 2023-03-23 |
Learning and Verification of Task Structure in Instructional Videos |
Medhini Narasimhan et.al. |
2303.13519v1 |
null |
| 2023-03-23 |
DreamBooth3D: Subject-Driven Text-to-3D Generation |
Amit Raj et.al. |
2303.13508v1 |
null |
| 2023-03-23 |
A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition |
Andong Deng et.al. |
2303.13505v1 |
link |
| 2023-03-23 |
ReBotNet: Fast Real-time Video Enhancement |
Jeya Maria Jose Valanarasu et.al. |
2303.13504v1 |
null |
| 2023-03-23 |
TriPlaneNet: An Encoder for EG3D Inversion |
Ananta R. Bhattarai et.al. |
2303.13497v1 |
null |
| 2023-03-23 |
The effectiveness of MAE pre-pretraining for billion-scale pretraining |
Mannat Singh et.al. |
2303.13496v1 |
null |
| 2023-03-23 |
The strength of a simplex is the key to a continuous isometry classification of Euclidean clouds of unlabelled points |
Vitaliy Kurlin et.al. |
2303.13486v1 |
null |
| 2023-03-23 |
TactoFind: A Tactile Only System for Object Retrieval |
Sameer Pai et.al. |
2303.13482v1 |
null |
| 2023-03-23 |
Plotting Behind the Scenes: Towards Learnable Game Engines |
Willi Menapace et.al. |
2303.13472v1 |
null |
| 2023-03-23 |
Egocentric Audio-Visual Object Localization |
Chao Huang et.al. |
2303.13471v1 |
link |
| 2023-03-22 |
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning |
Yiting Cheng et.al. |
2303.12793v1 |
link |
| 2023-03-22 |
SHERF: Generalizable Human NeRF from a Single Image |
Shoukang Hu et.al. |
2303.12791v1 |
link |
| 2023-03-22 |
Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation |
Xiangtai Li et.al. |
2303.12782v1 |
link |
| 2023-03-22 |
Active particles with delayed attractions form quaking crystallites |
Pin-Chuan Chen et.al. |
2303.12780v1 |
null |
| 2023-03-22 |
LSTM-based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls |
Gabriel Mittag et.al. |
2303.12761v1 |
link |
| 2023-03-22 |
Uncertainty Aware Active Learning for Reconfiguration of Pre-trained Deep Object-Detection Networks for New Target Domains |
Jiaming Na et.al. |
2303.12760v1 |
null |
| 2023-03-22 |
Multiscale Relevance of Natural Images |
Samy Lakhal et.al. |
2303.12717v1 |
null |
| 2023-03-22 |
Automatic Identification of Crystal Structures and Interfaces via Artificial-Intelligence-based Electron Microscopy |
Andreas Leitherer et.al. |
2303.12702v1 |
null |
| 2023-03-22 |
Pix2Video: Video Editing using Image Diffusion |
Duygu Ceylan et.al. |
2303.12688v1 |
null |
| 2023-03-22 |
The miniJPAS survey quasar selection III: Classification with artificial neural networks and hybridisation |
G. Martínez-Solaeche et.al. |
2303.12684v1 |
null |
| 2023-03-21 |
Natural Language-Assisted Sign Language Recognition |
Ronglai Zuo et.al. |
2303.12080v1 |
link |
| 2023-03-21 |
OmniTracker: Unifying Object Tracking by Tracking-with-Detection |
Junke Wang et.al. |
2303.12079v1 |
null |
| 2023-03-21 |
Two-shot Video Object Segmentation |
Kun Yan et.al. |
2303.12078v1 |
null |
| 2023-03-21 |
Dexterity from Touch: Self-Supervised Pre-Training of Tactile Representations with Robotic Play |
Irmak Guzey et.al. |
2303.12076v1 |
null |
| 2023-03-21 |
3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers |
Omkar Thawakar et.al. |
2303.12073v1 |
null |
| 2023-03-21 |
Machine Learning for Brain Disorders: Transformers and Visual Transformers |
Robin Courant et.al. |
2303.12068v1 |
null |
| 2023-03-21 |
VideoXum: Cross-modal Visual and Textural Summarization of Videos |
Jingyang Lin et.al. |
2303.12060v1 |
null |
| 2023-03-21 |
Motion Matters: Neural Motion Transfer for Better Camera Physiological Sensing |
Akshay Paruchuri et.al. |
2303.12059v1 |
null |
| 2023-03-21 |
Influencer Backdoor Attack on Semantic Segmentation |
Haoheng Lan et.al. |
2303.12054v1 |
null |
| 2023-03-21 |
Calibration of the convective parameters in stellar pulsation hydrocodes |
Gábor B. Kovács et.al. |
2303.12049v1 |
null |
| 2023-03-20 |
Legs as Manipulator: Pushing Quadrupedal Agility Beyond Locomotion |
Xuxin Cheng et.al. |
2303.11330v1 |
null |
| 2023-03-20 |
Over-the-Air Federated Edge Learning with Error-Feedback One-Bit Quantization and Power Control |
Yuding Liu et.al. |
2303.11319v1 |
null |
| 2023-03-20 |
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning |
Weixuan Sun et.al. |
2303.11302v1 |
null |
| 2023-03-20 |
Reliability in Semantic Segmentation: Are We on the Right Track? |
Pau de Jorge et.al. |
2303.11298v1 |
link |
| 2023-03-20 |
Resource Saving via Ensemble Techniques for Quantum Neural Networks |
Massimiliano Incudini et.al. |
2303.11283v1 |
null |
| 2023-03-20 |
Cascading Hierarchical Networks with Multi-task Balanced Loss for Fine-grained hashing |
Xianxian Zeng et.al. |
2303.11274v1 |
link |
| 2023-03-20 |
Rethinking the backbone architecture for tiny object detection |
Jinlai Ning et.al. |
2303.11267v1 |
null |
| 2023-03-20 |
Robust Imaging of Speed-of-Sound Using Virtual Source Transmission |
Dieter Schweizer et.al. |
2303.11262v1 |
null |
| 2023-03-20 |
Bronchoscopic video synchronization for interactive multimodal inspection of bronchial lesions |
Qi Chang et.al. |
2303.11258v1 |
null |
| 2023-03-20 |
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers |
Jaehoon Yoo et.al. |
2303.11251v1 |
link |
| 2023-03-17 |
Toward Super-Resolution for Appearance-Based Gaze Estimation |
Galen O'Shea et.al. |
2303.10151v1 |
null |
| 2023-03-17 |
Spectrum-inspired Low-light Image Translation for Saliency Detection |
Kitty Varghese et.al. |
2303.10145v1 |
null |
| 2023-03-20 |
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models |
Tyna Eloundou et.al. |
2303.10130v2 |
null |
| 2023-03-17 |
Retrieving false claims on Twitter during the Russia-Ukraine conflict |
Valerio La Gatta et.al. |
2303.10121v1 |
null |
| 2023-03-17 |
Causal Discovery from Temporal Data: An Overview and New Perspectives |
Chang Gong et.al. |
2303.10112v1 |
null |
| 2023-03-17 |
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation |
Liulei Li et.al. |
2303.10100v1 |
null |
| 2023-03-17 |
Efficient Neural Generation of 4K Masks for Homogeneous Diffusion Inpainting |
Karl Schrader et.al. |
2303.10096v1 |
null |
| 2023-03-17 |
Posterior Estimation Using Deep Learning: A Simulation Study of Compartmental Modeling in Dynamic PET |
Xiaofeng Liu et.al. |
2303.10057v1 |
null |
| 2023-03-20 |
Uncertainty-informed Mutual Learning for Joint Medical Image Classification and Segmentation |
Kai Ren et.al. |
2303.10049v2 |
null |
| 2023-03-17 |
Multi-modal Expression Recognition with Ensemble Method |
Chuanhe Liu et.al. |
2303.10033v1 |
null |
| 2023-03-16 |
Deep Metric Learning for Unsupervised Remote Sensing Change Detection |
Wele Gedara Chaminda Bandara et.al. |
2303.09536v1 |
link |
| 2023-03-17 |
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing |
Chenyang Qi et.al. |
2303.09535v2 |
link |
| 2023-03-16 |
Fast 3D Volumetric Image Reconstruction from 2D MRI Slices by Parallel Processing |
Somoballi Ghoshal et.al. |
2303.09523v1 |
null |
| 2023-03-16 |
MATIS: Masked-Attention Transformers for Surgical Instrument Segmentation |
Nicolás Ayobi et.al. |
2303.09514v1 |
null |
| 2023-03-16 |
LDMVFI: Video Frame Interpolation with Latent Diffusion Models |
Duolikun Danier et.al. |
2303.09508v1 |
null |
| 2023-03-16 |
Knowledge Distillation for Adaptive MRI Prostate Segmentation Based on Limit-Trained Multi-Teacher Models |
Eddardaa Ben Loussaief et.al. |
2303.09494v1 |
null |
| 2023-03-16 |
Classification of tight contact structures on some Seifert fibered manifolds: I |
Tanushree Shah et.al. |
2303.09490v1 |
null |
| 2023-03-16 |
Effectively Modeling Time Series with Simple Discrete State Spaces |
Michael Zhang et.al. |
2303.09489v1 |
link |
| 2023-03-16 |
Combining Distance to Class Centroids and Outlier Discounting for Improved Learning with Noisy Labels |
Farooq Ahmad Wani et.al. |
2303.09470v1 |
link |
| 2023-03-16 |
Diffusion-Shock Inpainting |
Kristina Schaefer et.al. |
2303.09450v1 |
null |
| 2023-03-16 |
DeepMIM: Deep Supervision for Masked Image Modeling |
Sucheng Ren et.al. |
2303.08817v2 |
link |
| 2023-03-15 |
BiFormer: Vision Transformer with Bi-Level Routing Attention |
Lei Zhu et.al. |
2303.08810v1 |
link |
| 2023-03-15 |
Mesh Strikes Back: Fast and Efficient Human Reconstruction from RGB videos |
Rohit Jena et.al. |
2303.08808v1 |
null |
| 2023-03-15 |
Building an Effective Email Spam Classification Model with spaCy |
Kazem Taghandiki et.al. |
2303.08792v1 |
null |
| 2023-03-15 |
PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining |
Garrett Thomas et.al. |
2303.08789v1 |
null |
| 2023-03-15 |
Exploiting 4D CT Perfusion for segmenting infarcted areas in patients with suspected acute ischemic stroke |
Luca Tomasetti et.al. |
2303.08757v1 |
null |
| 2023-03-15 |
2D and 3D CNN-Based Fusion Approach for COVID-19 Severity Prediction from 3D CT-Scans |
Fares Bougourzi et.al. |
2303.08740v1 |
link |
| 2023-03-15 |
Evaluating gesture-generation in a large-scale open challenge: The GENEA Challenge 2022 |
Taras Kucherenko et.al. |
2303.08737v1 |
null |
| 2023-03-15 |
A machine-learning approach to thunderstorm forecasting through post-processing of simulation data |
Kianusch Vahid Yousefnia et.al. |
2303.08736v1 |
null |
| 2023-03-16 |
UniCT DMI Solution for 3rd COV19D Competition on COVID-19 Detection trough attention deep learning for CT Scan |
Alessia Rondinella et.al. |
2303.08728v2 |
null |
| 2023-03-15 |
Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations |
Jianren Wang et.al. |
2303.08135v2 |
null |
| 2023-03-14 |
InstMove: Instance Motion for Object-centric Video Segmentation |
Qihao Liu et.al. |
2303.08132v1 |
link |
| 2023-03-14 |
Blind Video Deflickering by Neural Filtering with a Flawed Atlas |
Chenyang Lei et.al. |
2303.08120v1 |
link |
| 2023-03-14 |
Homeomorphic Image Registration via Conformal-Invariant Hyperelastic Regularisation |
Jing Zou et.al. |
2303.08113v1 |
null |
| 2023-03-15 |
Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations |
Hagay Michaeli et.al. |
2303.08085v2 |
link |
| 2023-03-14 |
Point Cloud Diffusion Models for Automatic Implant Generation |
Paul Friedrich et.al. |
2303.08061v1 |
null |
| 2023-03-14 |
Subjective and Objective Quality Assessment for in-the-Wild Computer Graphics Images |
Zicheng Zhang et.al. |
2303.08050v1 |
null |
| 2023-03-14 |
BODEGA: Benchmark for Adversarial Example Generation in Credibility Assessment |
Piotr Przybyła et.al. |
2303.08032v1 |
null |
| 2023-03-14 |
Optimizing Deep Learning Model Parameters with the Bees Algorithm for Improved Medical Text Classification |
Mai A. Shaaban et.al. |
2303.08021v1 |
link |
| 2023-03-14 |
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection |
Jinchao Li et.al. |
2303.08019v1 |
null |
| 2023-03-13 |
TriDet: Temporal Action Detection with Relative Boundary Modeling |
Dingfeng Shi et.al. |
2303.07347v1 |
link |
| 2023-03-13 |
Unsupervised HDR Image and Video Tone Mapping via Contrastive Learning |
Cong Cao et.al. |
2303.07327v1 |
null |
| 2023-03-13 |
Collision Cross-entropy and EM Algorithm for Self-labeled Classification |
Zhongwen Zhang et.al. |
2303.07321v1 |
null |
| 2023-03-13 |
Model-tuning Via Prompts Makes NLP Models Adversarially Robust |
Mrigank Raman et.al. |
2303.07320v1 |
null |
| 2023-03-13 |
Nearest-Neighbor Inter-Intra Contrastive Learning from Unlabeled Videos |
David Fan et.al. |
2303.07317v1 |
null |
| 2023-03-13 |
Transformer Models for Acute Brain Dysfunction Prediction |
Brandon Silva et.al. |
2303.07305v1 |
null |
| 2023-03-13 |
One-form symmetries in $\mathcal{N} = 3$ $S$-folds |
Antonio Amariti et.al. |
2303.07299v1 |
null |
| 2023-03-13 |
Transformer-based approaches to Sentiment Detection |
Olumide Ebenezer Ojo et.al. |
2303.07292v1 |
null |
| 2023-03-13 |
Align and Attend: Multimodal Summarization with Dual Contrastive Losses |
Bo He et.al. |
2303.07284v1 |
link |
| 2023-03-13 |
Vision-Language Models as Success Detectors |
Yuqing Du et.al. |
2303.07280v1 |
null |
| 2023-03-10 |
Learning to Select Camera Views: Efficient Multiview Understanding at Few Glances |
Yunzhong Hou et.al. |
2303.06145v1 |
link |
| 2023-03-10 |
Warm-Starting and Quantum Computing: A Systematic Mapping Study |
Felix Truger et.al. |
2303.06133v1 |
null |
| 2023-03-10 |
Supersolid phases of bosonic particles in a bubble trap |
Matteo Ciardi et.al. |
2303.06113v1 |
null |
| 2023-03-10 |
Statistical Study of the Correlation between Solar Energetic Particles and Properties of Active Regions |
Russell D. Marroquin et.al. |
2303.06100v1 |
null |
| 2023-03-10 |
Communication-Critical Planning via Multi-Agent Trajectory Exchange |
Nathaniel Moore Glaser et.al. |
2303.06080v1 |
null |
| 2023-03-10 |
Long-tailed Classification from a Bayesian-decision-theory Perspective |
Bolian Li et.al. |
2303.06075v1 |
null |
| 2023-03-10 |
MVImgNet: A Large-scale Dataset of Multi-view Images |
Xianggang Yu et.al. |
2303.06042v1 |
null |
| 2023-03-10 |
Importance of Aligning Training Strategy with Evaluation for Diffusion Models in 3D Multiclass Segmentation |
Yunguan Fu et.al. |
2303.06040v1 |
link |
| 2023-03-10 |
Tactile-Filter: Interactive Tactile Perception for Part Mating |
Kei Ota et.al. |
2303.06034v1 |
null |
| 2023-03-10 |
Material Identification From Radiographs Without Energy Resolution |
Michael T. McCann et.al. |
2303.06005v1 |
null |
| 2023-03-09 |
PAC-NeRF: Physics Augmented Continuum Neural Radiance Fields for Geometry-Agnostic System Identification |
Xuan Li et.al. |
2303.05512v1 |
null |
| 2023-03-09 |
Cherry-Picking with Reinforcement Learning |
Yunchu Zhang et.al. |
2303.05508v1 |
null |
| 2023-03-09 |
Learning Arm-Assisted Fall Damage Reduction and Recovery for Legged Mobile Manipulators |
Yuntao Ma et.al. |
2303.05486v1 |
null |
| 2023-03-09 |
Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases |
Aengus Lynch et.al. |
2303.05470v1 |
link |
| 2023-03-09 |
Resolving quantitative MRI model degeneracy with machine learning via training data distribution design |
Michele Guerreri et.al. |
2303.05464v1 |
null |
| 2023-03-09 |
Understanding the Challenges and Opportunities of Pose-based Anomaly Detection |
Ghazal Alinezhad Noghre et.al. |
2303.05463v1 |
null |
| 2023-03-09 |
Presentation Attack Detection with Advanced CNN Models for Noncontact-based Fingerprint Systems |
Sandip Purnapatra et.al. |
2303.05459v1 |
null |
| 2023-03-09 |
The Impact of Feature Selection and Transformation on Machine Learning Methods in Determining the Credit Scoring |
Oguz Koc et.al. |
2303.05427v1 |
null |
| 2023-03-09 |
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning |
Kazi Injamamul Haque et.al. |
2303.05416v1 |
link |
| 2023-03-09 |
Deep Equilibrium Learning of Explicit Regularizers for Imaging Inverse Problems |
Zihao Zou et.al. |
2303.05386v1 |
null |
| 2023-03-08 |
Automatic Debiased Learning from Positive, Unlabeled, and Exposure Data |
Masahiro Kato et.al. |
2303.04797v1 |
null |
| 2023-03-08 |
Video-P2P: Video Editing with Cross-attention Control |
Shaoteng Liu et.al. |
2303.04761v1 |
null |
| 2023-03-08 |
Multimodal Parameter-Efficient Few-Shot Class Incremental Learning |
Marco D'Alessandro et.al. |
2303.04751v1 |
null |
| 2023-03-08 |
A General Theory of Correct, Incorrect, and Extrinsic Equivariance |
Dian Wang et.al. |
2303.04745v1 |
null |
| 2023-03-08 |
Model Predictive Control with Gaussian-Process-Supported Dynamical Constraints for Autonomous Vehicles |
Johanna Bethge et.al. |
2303.04725v1 |
null |
| 2023-03-08 |
A lattice model for condensation in Levin-Wen systems |
Jessica Christian et.al. |
2303.04711v1 |
null |
| 2023-03-08 |
VOLTA: an Environment-Aware Contrastive Cell Representation Learning for Histopathology |
Ramin Nakhli et.al. |
2303.04696v1 |
null |
| 2023-03-08 |
STPDnet: Spatial-temporal convolutional primal dual network for dynamic PET image reconstruction |
Rui Hu et.al. |
2303.04667v1 |
null |
| 2023-03-08 |
Centroid-centered Modeling for Efficient Vision Transformer Pre-training |
Xin Yan et.al. |
2303.04664v1 |
null |
| 2023-03-08 |
**DULDA: Dual-domain Unsupervis |
|
|
|