Self-training ASR Guided by Unsupervised ASR Teacher
Speech Recognition,
Interspeech, 2024
Towards Understanding the Relationship between In-context Learning and Compositional Generalization
Natural Language Processing,
COLING, 2024
Joint Appearance and Motion Model with Temporal Transformer for Multiple Object Tracking
Computer Vision,
IEEE Access, 2023
Boosting Unknown-number Speaker Separation With Transformer Decoder-based Attractor
Speech Separation,
ICASSP, 2024
VoxtLM: Unified Decoder-only Models for Consolidating Speech Recognition/Synthesis and Speech/Text Continuation Tasks
Speech Recognition,
ICASSP, 2024
Learning Contextualized Representation On Discrete Space Via Hierarchical Product Quantization
Speech Recognition,
ICASSP, 2024
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
Speech Separation,
IEEE/ACM TASLP, 2023
That's What I Said: Fully-Controllable Talking Face Generation
Computer Vision,
ACM MM, 2023
Luminance-aware Color Transform for Multiple Exposure Correction
Computer Vision,
ICCV, 2023
SlaBins: Fisheye Depth Estimation using Slanted Bins on Road Environments
Computer Vision,
ICCV, 2023
SpeedFormer: Learning Speed Profiles with Upper and Lower Boundary Constraints Based on Transformer
Motion Planning,
IROS, 2023
FACTSpeech: Speaking a Foreign Language Pronunciation Using Only Your Native Characters
Speech Synthesis,
Interspeech, 2023
MiLO: Multi-task Learning with Localization Ambiguity Suppression for Occupancy Prediction
Computer Vision,
CVPRW, 2023
RUFI: Reducing Uncertainty in behavior prediction with Future Information
Machine Learning,
CVPRW, 2023
BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling
Computer Vision,
CVPR, 2023
Masked Token Similarity Transfer for Compressing Transformer-Based ASR Models
Speech Recognition,
ICASSP, 2023
CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis
Speech Synthesis,
ICASSP, 2023
Metric Learning for User-defined Keyword Spotting
Keyword Spotting,
ICASSP, 2023
Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling
Speech Enhancement,
ICASSP, 2023
Joint unsupervised and supervised learning for context-aware language identification
Language Identification,
ICASSP, 2023
TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation
Speaker Separation,
ICASSP, 2023
ASBERT: ASR-Specific Self-Supervised Learning with Self-Training
Speech Recognition,
SLT, 2022
An Empirical Study of Training Mixture Generation Strategies on Speech Separation: Dynamic Mixing and Augmentation
Speech Separation,
APSIPA, 2022
Self-supervised surround-view depth estimation with volumetric feature fusion
Computer Vision,
NeurIPS, 2022
Character decomposition to resolve class imbalance problem in Hangul OCR
Computer Vision,
ECCVW, 2022
Eigenlanes: Data-driven lane descriptors for structurally diverse lanes
Computer Vision,
CVPR, 2022
Harmonious semantic line detection via maximal weight clique selection
Computer Vision,
CVPR, 2021
Instance-level future motion estimation in a single image based on ordinal regression
Computer Vision,
ICCV, 2019
Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation
Computer Vision,
ICCV, 2019
Anchor Loss: Modulating Loss Scale based on Prediction Difficulty
Computer Vision,
ICCV, 2019