포티투닷 | 42dot - We Are A Mobility AI Company

Computer Vision

Cross-Lingual

Depth Estimation

Domain Adaptation

Enhancement

Face obfuscation

Keyword Spotting

Lane Detection

Language Identification

Machine Learning

Motion Planning

Motion Prediction

Multi-Object Tracking

Natural Language Processing

OCR

Pattern Recognition

Speaker Separation

Speech Enhancement

Speech Processing

Speech Recognition

Speech Separation

Speech Synthesis

Trajectory Prediction

ACM/MM

APSIPA

COLING

CVPR

CVPRW

ECCV

ECCVW

ICASSP

ICCV

ICPR

IEEE Access

IEEE Signal Processing Letters

IEEE/ACM TASLP

IROS

Interspeech

NeurIPS

SLT

SOAP: Vision-Centric 3D Semantic Scene Completion with Scene-Adaptive Decoder and Occluded Region-Aware View Projection

Computer Vision,
CVPR, 2025

GRAE-3DMOT: Geometry Relation-Aware Encoder for Online 3D Multi-Object Tracking

Multi-Object Tracking,
CVPR, 2025

Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding

Speech Processing,
ICASSP, 2024

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Computer Vision,
CVPR, 2024

Bridging the Gap between Audio and Text using Parallel-attention for User-defined Keyword Spotting

Keyword Spotting,
IEEE Signal Processing Letters, 2024

Who Should Have Been Focused: Transferring Attention-Based Knowledge from Future Observations for Trajectory Prediction

Trajectory Prediction,
ICPR, 2024

Forbes: Face Obfuscation Rendering via Backpropagation Refinement Scheme

Face obfuscation,
ECCV, 2024

Self-training ASR Guided by Unsupervised ASR Teacher

Speech Recognition,
Interspeech, 2024

Towards Understanding the Relationship between In-context Learning and Compositional Generalization

Natural Language Processing,
COLING, 2024

Joint Appearance and Motion Model with Temporal Transformer for Multiple Object Tracking

Computer Vision,
IEEE Access, 2023

Boosting Unknown-number Speaker Separation With Transformer Decoder-based Attractor

Speech Separation,
ICASSP, 2024

VoxtLM: Unified Decoder-only Models for Consolidating Speech Recognition/Synthesis and Speech/Text Continuation Tasks

Speech Recognition,
ICASSP, 2024

Learning Contextualized Representation On Discrete Space Via Hierarchical Product Quantization

Speech Recognition,
ICASSP, 2024

TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation

Speech Separation,
IEEE/ACM TASLP, 2023

That's What I Said: Fully-Controllable Talking Face Generation

Computer Vision,
ACM/MM, 2023

Luminance-aware Color Transform for Multiple Exposure Correction

Computer Vision,
ICCV, 2023

SlaBins: Fisheye Depth Estimation using Slanted Bins on Road Environments

Computer Vision,
ICCV, 2023

SpeedFormer: Learning Speed Profiles with Upper and Lower Boundary Constraints Based on Transformer

Motion Planning,
IROS, 2023

Factspeech: Speaking a Foreign Language Pronunciation Using Only Your Native Characters

Speech Synthesis,
Interspeech, 2023

MiLO: Multi-task Learning with Localization Ambiguity Suppression for Occupancy Prediction

Computer Vision,
CVPRW, 2023

RUFI: Reducing Uncertainty in behavior prediction with Future Information

Machine Learning,
CVPRW, 2023

BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling

Computer Vision,
CVPR, 2023

Speech Recognition,
ICASSP, 2023

CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis

Speech Synthesis,
ICASSP, 2023

Metric Learning for User-defined Keyword Spotting

Keyword Spotting,
ICASSP, 2023

Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling

Speech Enhancement,
ICASSP, 2023

Joint unsupervised and supervised learning for context-aware language identification

Language Identification,
ICASSP, 2023

TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation

Speaker Separation,
ICASSP, 2023

ASBERT: ASR-Specific Self-Supervised Learning with Self-Training

Speech Recognition,
SLT, 2022

An Empirical Study of Training Mixture Generation Strategies on Speech Separation: Dynamic Mixing and Augmentation

Speech Separation,
APSIPA, 2022

Self-supervised surround-view depth estimation with volumetric feature fusion

Computer Vision,
NeurIPS, 2022

Character decomposition to resolve class imbalance problem in Hangul OCR

Computer Vision,
ECCVW, 2022

Eigenlanes: Data-driven lane descriptors for structurally diverse lanes

Computer Vision,
CVPR, 2022

Harmonious semantic line detection via maximal weight clique selection

Computer Vision,
CVPR, 2021

Instance-level future motion estimation in a single image based on ordinal regression

Computer Vision,
ICCV, 2019

Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation

Computer Vision,
ICCV, 2019

Anchor Loss: Modulating Loss Scale based on Prediction Difficulty

Computer Vision,
ICCV, 2019