KAIST Takes the Lead in Developing Core Technologies for Generative AI National R&D Project
KAIST (President Kwang Hyung Lee) is leading the transition to AI Transformation (AX) by advancing research topics based on the practical technological demands of industries, fostering AI talent, and demonstrating research outcomes in industrial settings. In this context, KAIST announced on the 13th of August that it is at the forefront of strengthening the nation's AI technology competitiveness by developing core AI technologies via national R&D projects for generative AI led by the Ministry of Science and ICT.
In the 'Generative AI Leading Talent Cultivation Project,' KAIST was selected as a joint research institution for all three projects—two led by industry partners and one by a research institution—and will thus be tasked with the dual challenge of developing core generative AI technologies and cultivating practical, core talent through industry-academia collaborations.
Moreover, in the 'Development of a Proprietary AI Foundation Model' project, KAIST faculty members are participating as key researchers in four out of five consortia, establishing the university as a central hub for domestic generative AI research.
Each project in the Generative AI Leading Talent Cultivation Project will receive 6.7 billion won, while each consortium in the proprietary AI foundation model development project will receive a total of 200 billion won in government support, including GPU infrastructure.
As part of the 'Generative AI Leading Talent Cultivation Project,' which runs until the end of 2028, KAIST is collaborating with LG AI Research. Professor Noseong Park from the School of Computing will participate as the principal investigator for KAIST, conducting research in the field of physics-based generative AI (Physical AI). This project focuses on developing image and video generation technologies based on physical laws and developing a 'World Model.'
In particular, the research being conducted by Professor Noseong Park's team and Professor Sung-Eui Yoon's team proposes a model structure designed to help AI learn the rules of the physical world more precisely, which is considered a core technology for Physical AI.
Professors Noseong Park, Jae-gil Lee, Jiyoung Whang, Sung-Eui Yoon, and Hyun-Woo Kim from the School of Computing, who have been globally recognized for their achievements in the AI field, are jointly participating in this project. This year, they have presented work at top AI conferences such as ICLR, ICRA, ICCV, and ICML, including: ▲ Research on physics-based Ollivier Ricci-flow (ICLR 2025, Prof. Noseong Park) ▲ Technology to improve the navigation efficiency of quadruped robots (ICRA 2025, Prof. Sung-Eui Yoon) ▲ A multimodal large language model for text-video retrieval (ICCV 2025, Prof. Hyun-Woo Kim) ▲ Structured representation learning for knowledge generation (ICML 2025, Prof. Jiyoung Whang).
In the collaboration with NC AI, Professor Tae-Kyun Kim from the School of Computing is participating as the principal investigator to develop multimodal AI agent technology. The research will explore technologies applicable to the entire gaming industry, such as 3D modeling, animation, avatar expression generation, and character AI. It is expected to contribute to training practical AI talents by giving them hands-on experience in the industrial field and making the game production pipeline more efficient.
As the principal investigator, Professor Tae-Kyun Kim, a renowned scholar in 3D computer vision and generative AI, is developing key technologies for creating immersive avatars in the virtual and gaming industries. He will apply a first-person full-body motion diffusion model, which he developed through a joint research project with Meta, to VR and AR environments.
Professors Tae-Kyun Kim, Minhyeok Seong, and Tae-Hyun Oh from the School of Computing, and Professors Sung-Hee Lee, Woon-Tack Woo, Jun-Yong Noh, and Kyung-Tae Lim from the Graduate School of Culture Technology, are participating in the NC AI project. They have presented globally recognized work at CVPR 2025 and ICLR 2025, including: ▲ A first-person full-body motion diffusion model (CVPR 2025, Prof. Tae-Kyun Kim) ▲ Stochastic diffusion synchronization technology for image generation (ICLR 2025, Prof. Minhyeok Seong) ▲ The creation of a large-scale 3D facial mesh video dataset (ICLR 2025, Prof. Tae-Hyun Oh) ▲ Object-adaptive agent motion generation technology, InterFaceRays (Eurographics 2025, Prof. Sung-Hee Lee) ▲ 3D neural face editing technology (CVPR 2025, Prof. Jun-Yong Noh) ▲ Research on selective search augmentation for multilingual vision-language models (COLING 2025, Prof. Kyung-Tae Lim).
In the project led by the Korea Electronics Technology Institute (KETI), Professor Seungryong Kim from the Kim Jae-chul Graduate School of AI is participating in generative AI technology development. His team recently developed new technology for extracting robust point-tracking information from video data in collaboration with Adobe Research and Google DeepMind, proposing a key technology for clearly understanding and generating videos.
Each industry partner will open joint courses with KAIST and provide their generative AI foundation models for education and research. Selected outstanding students will be dispatched to these companies to conduct practical research, and KAIST faculty will also serve as adjunct professors at the in-house AI graduate school established by LG AI Research.
Meanwhile, KAIST showed an unrivaled presence by participating in four consortia for the Ministry of Science and ICT's 'Proprietary AI Foundation Model Development' project.
In the NC AI Consortium, Professors Tae-Kyun Kim, Sung-Eui Yoon, Noseong Park, Jiyoung Whang, and Minhyeok Seong from the School of Computing are participating, focusing on the development of multimodal foundation models (LMMs) and robot-based models. They are particularly concentrating on developing LMMs that learn common sense about space, physics, and time. They have formed a research team optimized for developing next-generation, multimodal AI models that can understand and interact with the physical world, equipped with an 'all-purpose AI brain' capable of simultaneously understanding and processing diverse information such as text, images, video, and sound.
In the Upstage Consortium, Professors Jae-gil Lee and Hyeon-eon Oh from the School of Computing, both renowned scholars in data AI and NLP (natural language processing), along with Professor Kyung-Tae Lim from the Graduate School of Culture Technology, an LLM expert, are responsible for developing vertical models for industries such as finance, law, and manufacturing. The KAIST researchers will concentrate on developing practical AI models that are directly applicable to industrial settings and tailored to each specific industry.
The Naver Consortium includes Professor Tae-Hyun Oh from the School of Computing, who has developed key technology for multimodal learning and compositional language-vision models, Professor Hyun-Woo Kim, who has proposed video reasoning and generation methods using language models, and faculty from the Kim Jae-chul Graduate School of AI and the Department of Electrical Engineering.
In the SKT Consortium, Professor Ki-min Lee from the Kim Jae-chul Graduate School of AI, who has achieved outstanding results in text-to-image generation, human preference modeling, and visual robotic manipulation technology development, is participating. This technology is expected to play a key role in developing personalized services and customized AI solutions for telecommunications companies.
This outcome is considered a successful culmination of KAIST's strategy for developing AI technology based on industry demand and centered on on-site demonstrations.
KAIST President Kwang Hyung Lee said, "For AI technology to go beyond academic achievements and be connected to and practical for industry, continuous government support, research, and education centered on industry-academia collaboration are essential. KAIST will continue to strive to solve problems in industrial settings and make a real contribution to enhancing the competitiveness of the AI ecosystem."
He added that while the project led by Professor Sung-Ju Hwang from the Kim Jae-chul Graduate School of AI, which had applied as a lead institution for the proprietary foundation model development project, was unfortunately not selected, it was a meaningful challenge that stood out for its original approach and bold attempts. President Lee further commented, "Regardless of whether it was selected or not, such attempts will accumulate and make the Korean AI ecosystem even richer."
KAIST & CMU Unveil Amuse, an AI Songwriting Collaborator That Helps Create Music
Wouldn't it be great if music creators had someone to brainstorm with, to help them when they're stuck, and to explore different musical directions with? Researchers from KAIST and Carnegie Mellon University (CMU) have developed an AI technology that acts like a fellow songwriter who helps create music.
KAIST (President Kwang Hyung Lee) announced that a research team led by Professor Sung-Ju Lee of the School of Electrical Engineering, in collaboration with CMU, has developed Amuse, an AI-based music creation support system. The research was presented at the ACM Conference on Human Factors in Computing Systems (CHI), one of the world's top conferences in human-computer interaction, held in Yokohama, Japan from April 26 to May 1, where it received the Best Paper Award, given to only the top 1% of all submissions.
< (From left) Professor Chris Donahue of Carnegie Mellon University, Ph.D. Student Yewon Kim and Professor Sung-Ju Lee of the School of Electrical Engineering >
Amuse, the system developed by Professor Sung-Ju Lee's research team, converts various forms of inspiration, such as text, images, and audio, into harmonic structures (chord progressions) to support composition.
For example, if a user inputs a phrase, image, or sound clip such as “memories of a warm summer beach”, Amuse automatically generates and suggests chord progressions that match the inspiration.
Unlike existing generative AI, Amuse is differentiated in that it respects the user's creative flow and naturally induces creative exploration through an interactive method that allows flexible integration and modification of AI suggestions.
The core technology of the Amuse system is a generation method that blends two approaches: a large language model creates chord progressions based on the user's prompt and inspiration, while another AI model, trained on real music data, filters out awkward or unnatural results using rejection sampling.
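The generate-then-filter idea described above can be illustrated with a toy sketch. This is not the actual Amuse implementation: the proposal function stands in for the large language model, and `naturalness_score` stands in for the model trained on real music data; the chord vocabulary, transition table, and threshold are all invented for demonstration.

```python
import random

# Invented table of chord transitions that are common in pop songs;
# a stand-in for a model trained on real music data.
COMMON_TRANSITIONS = {("C", "G"), ("G", "Am"), ("Am", "F"), ("F", "C"),
                      ("C", "Am"), ("F", "G"), ("G", "C")}

def propose_progression(length=4):
    """Stand-in for the LLM: propose a chord progression for a prompt."""
    chords = ["C", "G", "Am", "F", "Dm", "Em"]
    return [random.choice(chords) for _ in range(length)]

def naturalness_score(progression):
    """Stand-in scorer: fraction of chord transitions that are common."""
    pairs = list(zip(progression, progression[1:]))
    good = sum(1 for p in pairs if p in COMMON_TRANSITIONS)
    return good / len(pairs)

def rejection_sample(n_suggestions=3, threshold=0.5, max_tries=1000):
    """Rejection sampling: keep only proposals the scorer judges natural."""
    accepted = []
    for _ in range(max_tries):
        prog = propose_progression()
        if naturalness_score(prog) >= threshold:
            accepted.append(prog)
            if len(accepted) == n_suggestions:
                break
    return accepted

suggestions = rejection_sample()
```

The design point is that the generator and the filter can disagree: the language model supplies diverse, prompt-conditioned candidates, while the music-trained scorer vetoes the ones that sound unnatural.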
< Figure 1. Amuse system configuration. After extracting music keywords from user input, a large language model generates chord progressions, which are refined through rejection sampling (left). Chord extraction from audio input is also possible (right). The bottom is an example visualizing the structure of the generated chords. >
The research team conducted a user study with actual musicians and found that Amuse has high potential as a creative companion, or Co-Creative AI, a concept in which people and AI collaborate, rather than a generative AI that simply puts together a song on its own.
The paper, co-authored by Ph.D. student Yewon Kim and Professor Sung-Ju Lee of the KAIST School of Electrical Engineering and Professor Chris Donahue of Carnegie Mellon University, demonstrated the potential of creative AI system design in both academia and industry. ※ Paper title: Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations DOI: https://doi.org/10.1145/3706598.3713818
※ Research demo video: https://youtu.be/udilkRSnftI?si=FNXccC9EjxHOCrm1
※ Research homepage: https://nmsl.kaist.ac.kr/projects/amuse/
Professor Sung-Ju Lee said, “Recent generative AI technology has raised concerns in that it directly imitates copyrighted content, violating the rights of creators, or generates results unilaterally regardless of the creator’s intention. Aware of this trend, our team paid attention to what creators actually need and focused on designing an AI system centered on the creator.”
He continued, “Amuse is an attempt to explore the possibility of collaborating with AI while the creator retains the initiative. We expect it to be a starting point for a more creator-friendly direction in the development of music creation tools and generative AI systems.”
This research was conducted with the support of the National Research Foundation of Korea with funding from the government (Ministry of Science and ICT). (RS-2024-00337007)
Professor Sung-Ju Lee’s Team Wins the Best Paper and the Methods Recognition Awards at the ACM CSCW
A research team led by Professor Sung-Ju Lee at the School of Electrical Engineering won the Best Paper Award and the Methods Recognition Award from ACM CSCW (International Conference on Computer-Supported Cooperative Work and Social Computing) 2021 for their paper “Reflect, not Regret: Understanding Regretful Smartphone Use with App Feature-Level Analysis”.
Founded in 1986, CSCW is a premier conference on HCI (Human-Computer Interaction) and social computing. This year, 340 full papers were presented, and the Best Paper Awards are given to the top 1% of submitted papers. The Methods Recognition Award, a new award, is given “for strong examples of work that includes well developed, explained, or implemented methods, and methodological innovation.”
Hyunsung Cho (KAIST alumnus and currently a Ph.D. candidate at Carnegie Mellon University), Daeun Choi (KAIST undergraduate researcher), Donghwi Kim (KAIST Ph.D. candidate), Wan Ju Kang (KAIST Ph.D. candidate), and Professor Eun Kyoung Choe (University of Maryland and KAIST alumna) collaborated on this research.
The authors developed a tool that tracks and analyzes which features of a mobile app (e.g., Instagram’s following post, following story, recommended post, post upload, and direct messaging) are in use based on a smartphone’s User Interface (UI) layout. Utilizing this novel method, the authors revealed which feature usage patterns result in regretful smartphone use.
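The idea of inferring the active app feature from the UI layout can be sketched in miniature. This is only an illustration of the approach, not the paper's tool: the feature names, keyword lists, and UI element identifiers below are all invented for demonstration.

```python
# Hypothetical mapping from app features to keywords that might appear
# in the identifiers of on-screen UI elements (invented for illustration).
FEATURE_KEYWORDS = {
    "direct_messaging": ["inbox", "thread", "message"],
    "recommended_post": ["explore", "suggested"],
    "post_upload":      ["camera", "gallery", "share_sheet"],
}

def detect_feature(ui_element_ids):
    """Return the first feature whose keyword appears in any visible
    UI element identifier, or None if nothing matches."""
    for feature, keywords in FEATURE_KEYWORDS.items():
        if any(k in eid for eid in ui_element_ids for k in keywords):
            return feature
    return None

# A layout containing an inbox list suggests the messaging feature is in use.
print(detect_feature(["toolbar", "inbox_list", "compose_button"]))  # -> direct_messaging
```

Logging such feature labels over time, rather than only which app is open, is what enables the finer-grained usage analysis the paper describes.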
Professor Lee said, “Although many people enjoy the benefits of smartphones, issues have emerged from their overuse. With this feature-level analysis, users can reflect on their smartphone usage based on a finer-grained analysis, and this could contribute to digital wellbeing.”
Object Identification and Interaction with a Smartphone Knock
(Professor Lee (far right) demonstrates 'Knocker' with his students.)
A KAIST team has introduced a new technology, “Knocker”, which identifies objects and executes actions when a user simply knocks on them with a smartphone. Software powered by machine learning analyzes the resulting sounds, vibrations, and other responses to carry out the user’s commands.
What separates Knocker from existing technology is the sensor fusion of sound and motion. Previously, object identification used either computer vision technology with cameras or hardware such as RFID (Radio Frequency Identification) tags. These solutions all have their limitations. For computer vision technology, users need to take pictures of every item. Even worse, the technology will not work well in poor lighting situations. Using hardware leads to additional costs and labor burdens.
Knocker, on the other hand, can identify objects even in dark environments only with a smartphone, without requiring any specialized hardware or using a camera. Knocker utilizes the smartphone’s built-in sensors such as a microphone, an accelerometer, and a gyroscope to capture a unique set of responses generated when a smartphone is knocked against an object. Machine learning is used to analyze these responses and classify and identify objects.
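The classification step described above can be illustrated with a toy sketch. This is not the actual Knocker implementation: the feature names (sound peak frequency, accelerometer decay, gyroscope energy), the stored object profiles, and the nearest-centroid classifier are all invented to show the general shape of matching a knock response to a known object.

```python
import math

# Hypothetical per-object profiles: averaged feature vectors of the form
# [sound_peak_freq_khz, accel_decay_ms, gyro_energy], invented for illustration.
PROFILES = {
    "book":         [1.2, 35.0, 0.10],
    "water_bottle": [3.4, 12.0, 0.45],
    "laptop":       [2.1, 20.0, 0.25],
}

def identify(knock_features):
    """Nearest-centroid classification: return the object whose stored
    profile is closest (in Euclidean distance) to the observed response."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(PROFILES, key=lambda obj: dist(PROFILES[obj], knock_features))

# A knock response resembling the bottle profile maps to the bottle.
print(identify([3.3, 13.0, 0.4]))  # -> water_bottle
```

In practice the system would learn such profiles from many labeled knocks per object, which is where the machine learning the article mentions comes in.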
The research team under Professor Sung-Ju Lee from the School of Computing confirmed the applicability of Knocker technology using 23 everyday objects such as books, laptop computers, water bottles, and bicycles. In noisy environments such as a busy café or on the side of a road, it achieved 83% identification accuracy. In a quiet indoor environment, the accuracy rose to 98%.
The team believes Knocker will open a new paradigm of object interaction. For instance, by knocking on an empty water bottle, a smartphone can automatically order new water bottles from a merchant app. When integrated with IoT devices, knocking on a bed’s headboard before going to sleep could turn off the lights and set an alarm. The team suggested and implemented 15 application cases in the paper, presented during the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2019) held in London last month.
Professor Sung-Ju Lee said, “This new technology does not require any specialized sensor or hardware. It simply uses the built-in sensors on smartphones and takes advantage of the power of machine learning. It’s a software solution that everyday smartphone users could immediately benefit from.” He continued, “This technology enables users to conveniently interact with their favorite objects.”
The research was supported in part by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea funded by the Ministry of Science and ICT and an Institute for Information & Communications Technology Promotion (IITP) grant funded by the Ministry of Science and ICT.
Figure: An example knock on a bottle. Knocker identifies the object by analyzing a unique set of responses from the knock, and automatically launches a proper application or service.
KAIST Professor Sung-Ju Lee Appointed a Technical Program Chair of INFOCOM
Professor Sung-Ju Lee of the Department of Computer Science at KAIST has been appointed to serve as a technical program chair of IEEE INFOCOM. The computer communication conference, started in 1982, is influential in the research fields of the Internet, wireless networking, and data centers.
Professor Lee is the first Korean to serve as a program chair and has been recognized for his work in network communications. For the 34th conference, to be held next year, he will take part in selecting 650 experts in the field as technical program committee members and will supervise the evaluation of around 1,600 papers.
Professor Lee is a leading researcher in the field of wireless mobile network systems. He is a fellow of the Institute of Electrical and Electronics Engineers (IEEE) and served as the general chair of the 20th Association for Computing Machinery (ACM) SIGMOBILE Annual International Conference on Mobile Computing and Networking (MobiCom 2014). He serves on the editorial boards of IEEE Transactions on Mobile Computing (TMC) and the IEEE Internet of Things Journal.
Professor Lee said, “I hope to continue the traditions of the conference while integrating research from various areas of network communication. I will strive to create a program with a high potential for technology transfer.”
The 34th IEEE INFOCOM will take place in San Francisco in April 2016.