Akio Hayakawa
Research engineer at Sony Research
Links
Google Scholar / GitHub (personal account) / GitHub (company account) / Email
Education
- M.S. in Information Science and Technology, the University of Tokyo (2016/04 - 2018/03)
- B.S. in Engineering, the University of Tokyo (2012/04 - 2016/03)
Work Experience
- R&D Center, Sony Corp. (2018/04 - 2023/03)
- Sony Research Inc. (2023/04 - )
Publications
International Conference
- Akio Hayakawa, Yusuke Mukuta, Tatsuya Harada, “Linear Fusion MultiDiffusion for Fast Training-Free Spherical Panorama Generation,” ECCV 2026 [project]
- Akio Hayakawa, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji, “Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance,” ECCV 2026 [arXiv] [project]
- Christian Simon, Masato Ishii, Wei-Yao Wang, Koichi Saito, Akio Hayakawa, Dongseok Shim, Zhi Zhong, Shuyang Cui, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, “Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models,” CVPR 2026 [arXiv] [project]
- Christian Simon, Masato Ishii, Akio Hayakawa, Zhi Zhong, Shusuke Takahashi, Takashi Shibuya, Yuki Mitsufuji, “TITAN-Guide: Taming Inference-Time Alignment for Guided Text-to-Video Diffusion Models,” ICCV 2025 [arXiv] [project]
- Masato Ishii, Akio Hayakawa, Takashi Shibuya, Yuki Mitsufuji, “A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation”, IJCNN 2025 [arXiv] [code]
- Ho Kei Cheng, Masato Ishii, Akio Hayakawa, Takashi Shibuya, Alexander Schwing, Yuki Mitsufuji, “Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis,” CVPR 2025 [arXiv] [code]
- Akio Hayakawa, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji, “MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation”, ICLR 2025 [arXiv] [code]
- Akio Hayakawa, Yusuke Kurose, Kiyohito Tanaka, Kento Aida, Shin’ichi Satoh, Masaru Kitsuregawa, Tatsuya Harada, “Gastric cancer detection for gastroenterological endoscopy with local and multi-scale global information”, CARS 2019
Workshop
- Naoki Matsunaga, Masato Ishii, Akio Hayakawa, Kenji Suzuki, Takuya Narihira, “Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models”, CVPRW AI4CC 2023 [arXiv]
Domestic Conference
- Hiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira, “Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion”, MIRU 2023 [arXiv]
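- Akio Hayakawa, Jun Nishikawa, Masato Ishii, “任意の画像生成モデルに対する汎用リファイナーとしての拡散確率モデルの応用” (in Japanese; English translation: “Application of Diffusion Probabilistic Models as a General-Purpose Refiner for Arbitrary Image Generation Models”), MIRU 2022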
- Naofumi Akimoto, Akio Hayakawa, Andrew Shin, Takuya Narihira, “Reference-based video colorization with spatiotemporal correspondence”, MIRU 2021 [arXiv]
- Akio Hayakawa, Takuya Narihira, “Out-of-core training for extremely large-scale neural networks with adaptive window-based scheduling”, MIRU 2021 [arXiv]
Preprint
- Zhengyang Yu, Akio Hayakawa, Masato Ishii, Qingtao Yu, Takashi Shibuya, Jing Zhang, Yuki Mitsufuji, “AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path,” 2025 [arXiv]
- Masato Ishii, Akio Hayakawa, Takashi Shibuya, Yuki Mitsufuji, “Coherent Audio-Visual Editing via Conditional Audio Generation Following Video Edits,” 2025 [arXiv]
- Koichi Saito, Julian Tanke, Christian Simon, Masato Ishii, Kazuki Shimada, Zachary Novack, Zhi Zhong, Akio Hayakawa, Takashi Shibuya, Yuki Mitsufuji, “SoundReactor: Frame-level Online Video-to-Audio Generation,” 2025 [arXiv]
- Takuya Narihira, Javier Alonsogarcia, Fabien Cardinaux, Akio Hayakawa, Masato Ishii, Kazunori Iwaki, Thomas Kemp, Yoshiyuki Kobayashi, Lukas Mauch, Akira Nakamura, Yukio Obuchi, Andrew Shin, Kenji Suzuki, Stephen Tiedmann, Stefan Uhlich, Takuya Yashima, Kazuki Yoshiyama, “Neural Network Libraries: A Deep Learning Framework Designed from Engineers’ Perspectives”, 2021 [arXiv]
Patents
- Yoshiyuki Kobayashi, Andrew Shin, Akio Hayakawa, Takayoshi Takayanagi, Hirotaka Suzuki, “Bias adjustment device, information processing device, information processing method, and information processing program”, US Patent App. 17/771,051, 2022
Prizes
- Selected as a Mitou Creator by the Information-technology Promotion Agency (IPA), Japan, 2017 [link]
Invited presentations
- “Tutorial on Diffusion Models” at the 29th Symposium on Sensing via Image Information (SSII) 2023 [slides]
- “Tutorial on Diffusion Models and Recent Trends” at the 37th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI) 2023
Professional Activities
- Conference Reviewer: CVPR, ECCV, ICLR (Top 200 Reviewers at ICLR 2026), NeurIPS