LLMs and post-training
McGill University | Mila - Quebec AI Institute
I work on large language models for technical domains, with a focus on post-training, knowledge-grounded generation, telecommunications, networked systems, and distributed AI.
About
I am a Ph.D. candidate in Computer Science at McGill University, supervised by Prof. Xue Liu, and affiliated with Mila - Quebec AI Institute. I study how language models can be trained, grounded, and deployed in technical environments where accuracy and reliability matter.
Recent projects include contraction-aware reinforcement learning for language model fine-tuning, knowledge-graph-enhanced retrieval for telecommunications, semantic alignment in agent communication protocols, and optimization for wireless and data-center systems.
Research Areas
RLHF, PPO variants, alignment, test-time reasoning, adaptive inference, long-context modeling, and domain-specific fine-tuning.
Retrieval-augmented generation, knowledge graphs, provenance-aware question answering, information extraction, and grounded generation.
LLMs for wireless networks, telecom question answering, digital twins, multi-agent reinforcement learning, and edge intelligence.
Web3, decentralized systems, cryptographic protocols, secure data infrastructures, and privacy-aware distributed computing.
Featured Project
Project page for my survey on agent communication protocols, organized around communication, syntactic, and semantic layers.
Recent Publications
Dun Yuan, Di Wu, Xue Liu. ICLR 2026 Poster. [OpenReview]
Dun Yuan, Fuyuan Lyu, Ye Yuan, Weixu Zhang, et al. arXiv:2604.02369. [Project page]
Dun Yuan, Hao Zhou, Di Wu, Xue Liu, Hao Chen, Yan Xin, Jianzhong Charlie Zhang. ICC Workshops 2025. [DBLP]
For the complete publication list, see Google Scholar.