Review

[Paper Review] DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

A detailed review of the paper 'DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action', posted on [arXiv] by Zhuoyang Liu.

December 1, 2025

[Paper Review] DiP: Taming Diffusion Models in Pixel Space

A detailed review of the paper 'DiP: Taming Diffusion Models in Pixel Space', posted on [arXiv] by Xu Chen.

December 1, 2025

[Paper Review] DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

A detailed review of the paper 'DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning', posted on [arXiv].

December 1, 2025

[Paper Review] Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

A detailed review of the paper 'Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield', posted on [arXiv].

December 1, 2025

[Paper Review] CaptionQA: Is Your Caption as Useful as the Image Itself?

A detailed review of the paper 'CaptionQA: Is Your Caption as Useful as the Image Itself?', posted on [arXiv] by Zicheng Liu.

December 1, 2025

[Paper Review] Captain Safari: A World Engine

A detailed review of the paper 'Captain Safari: A World Engine', posted on [arXiv] by Yitong Li.

December 1, 2025

[Paper Review] Architecture Decoupling Is Not All You Need For Unified Multimodal Model

A detailed review of the paper 'Architecture Decoupling Is Not All You Need For Unified Multimodal Model', posted on [arXiv] by Hongyu Li.

December 1, 2025

[Paper Review] AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement

A detailed review of the paper 'AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement', posted on [arXiv] by Yicheng Ji.

December 1, 2025

[Paper Review] Adversarial Flow Models

A detailed review of the paper 'Adversarial Flow Models', posted on [arXiv].

December 1, 2025

[Paper Review] What does it mean to understand language?

A detailed review of the paper 'What does it mean to understand language?', posted on [arXiv].

November 28, 2025

[Paper Review] Video Generation Models Are Good Latent Reward Models

A detailed review of the paper 'Video Generation Models Are Good Latent Reward Models', posted on [arXiv].

November 28, 2025

[Paper Review] Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following

A detailed review of the paper 'Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following', posted on [arXiv].

November 28, 2025

[Paper Review] MIRA: Multimodal Iterative Reasoning Agent for Image Editing

A detailed review of the paper 'MIRA: Multimodal Iterative Reasoning Agent for Image Editing', posted on [arXiv] by Jiebo Luo.

November 28, 2025

[Paper Review] Canvas-to-Image: Compositional Image Generation with Multimodal Controls

A detailed review of the paper 'Canvas-to-Image: Compositional Image Generation with Multimodal Controls', posted on [arXiv] by Kfir Aberman.

November 28, 2025

[Paper Review] Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

A detailed review of the paper 'Agentic Learner with Grow-and-Refine Multimodal Semantic Memory', posted on [arXiv] by Qunyi Xie.

November 28, 2025

[Paper Review] Terminal Velocity Matching

A detailed review of the paper 'Terminal Velocity Matching', posted on [arXiv] by Jiaming Song.

November 27, 2025

[Paper Review] SPHINX: A Synthetic Environment for Visual Perception and Reasoning

A detailed review of the paper 'SPHINX: A Synthetic Environment for Visual Perception and Reasoning', posted on [arXiv] by Nidhi Rastogi.

November 27, 2025

[Paper Review] Revisiting Generalization Across Difficulty Levels: It's Not So Easy

A detailed review of the paper 'Revisiting Generalization Across Difficulty Levels: It's Not So Easy', posted on [arXiv].

November 27, 2025

[Paper Review] RAISECity: A Multimodal Agent Framework for Reality-Aligned 3D World Generation at City-Scale

A detailed review of the paper 'RAISECity: A Multimodal Agent Framework for Reality-Aligned 3D World Generation at City-Scale', posted on [arXiv] by Yangcheng Yu.

November 27, 2025

[Paper Review] NVIDIA Nemotron Parse 1.1

A detailed review of the paper 'NVIDIA Nemotron Parse 1.1', posted on [arXiv].

November 27, 2025