Developing Technology to Become the Joker in The Dark Knight
<(From left) Ph.D. candidate Taewoong Kang, Ph.D. candidate Junha Hyung, Professor Jaegul Choo, and Ph.D. candidate Minho Park; (top right inset, from left) Ph.D. candidate Kinam Kim and Seoul National University undergraduate researcher Dohyeon Kim>
What if, while watching The Dark Knight, you weren't just observing the Joker on screen, but actually seeing Gotham City through his eyes? The video technology that allows viewers to experience the world through a character's perspective, rather than as a mere observer, is becoming a reality. Researchers at our university have developed a new AI model that generates first-person viewpoint videos from standard footage.
KAIST announced on February 23rd that Professor Jaegul Choo’s research team at the Kim Jaechul Graduate School of AI has developed 'EgoX,' an AI model that utilizes observer-perspective (exocentric) video to precisely generate the scenes that a person in the video would actually be seeing.
With the rapid advancement of Augmented Reality (AR), Virtual Reality (VR), and AI robotics, the importance of "egocentric video"—which captures scenes as one directly sees them—is growing. However, obtaining high-quality first-person footage previously required users to wear expensive action cameras or smart glasses. Furthermore, there were significant technical limitations in naturally converting existing standard (third-person or exocentric) video into a first-person perspective.
A key feature of this technology is that it goes beyond simply rotating the screen; it comprehensively understands the person's position, posture, and the 3D structure of the surrounding space to reconstruct the first-person viewpoint.
< Example of converting a third-person perspective video into a first-person perspective video >
Existing technologies often only converted still images or required footage from four or more cameras. Additionally, they frequently suffered from awkward visual artifacts in videos with complex lighting or rapid movement.
In contrast, EgoX can generate high-quality first-person video from just a single third-person video source. Specifically, the research team succeeded in realistically implementing natural shifts in vision—such as when a person turns their head—by precisely modeling the correlation between head movement and the actual field of view.
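The geometric idea behind this modeling can be illustrated with a simple sketch: an estimated head position and orientation determine a first-person camera pose. The function below is a minimal illustration of that mapping, not code from the EgoX paper; all names and parameters are hypothetical.

```python
import numpy as np

def head_pose_to_camera(head_pos, yaw, pitch):
    """Map a head position and orientation (radians) to a first-person
    camera pose: a gaze (forward) direction plus a 4x4 world-to-camera
    extrinsic matrix. Illustrative sketch only, not the EgoX model."""
    # Gaze direction from yaw (left/right) and pitch (up/down).
    forward = np.array([
        np.cos(pitch) * np.sin(yaw),
        np.sin(pitch),
        np.cos(pitch) * np.cos(yaw),
    ])
    up = np.array([0.0, 1.0, 0.0])
    right = np.cross(up, forward)
    right /= np.linalg.norm(right)
    true_up = np.cross(forward, right)

    # Rotation rows are the camera axes; translation moves the head to the origin.
    R = np.stack([right, true_up, forward])
    extrinsic = np.eye(4)
    extrinsic[:3, :3] = R
    extrinsic[:3, 3] = -R @ np.asarray(head_pos, dtype=float)
    return forward, extrinsic
```

In this toy setup, turning the head (changing yaw or pitch) directly rotates the first-person viewpoint; the actual model additionally has to infer the pose and the 3D scene from the third-person footage.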
This technology demonstrated stable performance across various daily scenarios, including cooking, exercising, and working, without being limited to specific environments. It is being evaluated as a breakthrough that opens new possibilities for securing high-quality first-person data from existing video archives without the need for wearable devices.
EgoX is expected to have a significant impact across various industries. In the fields of AR, VR, and the Metaverse, it can maximize user experience by transforming standard videos into immersive content that makes users feel as if they are experiencing the scene firsthand.
Furthermore, it is projected to contribute to the fields of robotics and AI training by serving as core data for "Imitation Learning," where robots learn by watching human actions. New types of video services, such as switching sports broadcasts or vlogs to the perspective of the athlete or the protagonist, are also anticipated.
< EgoX technology that converts a third-person perspective into a first-person perspective (AI-generated image) >
Distinguished Professor Jaegul Choo stated, "This research is significant in that AI has moved beyond simple video conversion to learning and reconstructing human 'vision' and 'spatial understanding.' We expect an environment to open up where anyone can create and experience immersive content using only previously recorded videos." He added, "KAIST will continue to secure global competitiveness in the field of generative AI-based video technology."
This research was led by first authors Taewoong Kang, Kinam Kim, and Dohyeon Kim. The paper was pre-released on arXiv on December 9, 2025, garnering significant attention from academia and from AI industry leaders such as NVIDIA and Meta. It is scheduled for official presentation at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), an international academic conference to be held in Colorado, USA, on June 3, 2026.
Paper Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Paper Link: https://keh0t0.github.io/EgoX/
Meanwhile, this research was supported by the Ministry of Science and ICT through the National Research Foundation of Korea's individual basic research project, "Research on User-Centered Content Generation and Editing Technology through Generative AI," and the Supercomputer No. 5 High-Performance Computing-based R&D Innovation Support project, "Research on Video Filming Viewpoint Conversion Based on Diffusion Models."
KAIST-KakaoBank Speeds Up 'Explainable AI' by 11 Times: "Boosts Financial AI Reliability"
< (From left) Professor Jaesik Choi of the Kim Jaechul Graduate School of AI, Ph.D. candidate Chanwoo Lee, Ph.D. candidate Youngjin Park >
The research team led by Professor Jaesik Choi of KAIST's Kim Jaechul Graduate School of AI, in collaboration with KakaoBank Corp., announced that they have developed an accelerated explanation technology that can explain the basis of an Artificial Intelligence (AI) model's judgment in real time. This achievement significantly increases the practical applicability of Explainable Artificial Intelligence (hereinafter XAI) technology in fields requiring real-time decision-making, such as financial services, by achieving an average processing speed 8.5 times faster, and up to 11 times faster, than existing explanation algorithms for AI model predictions.
In the financial sector, a clear explanation for decisions made by AI systems is essential. Especially in services directly related to customer rights, such as loan screening and anomaly detection, regulatory demands to transparently present the basis for an AI model's judgment are increasingly stringent. However, conventional XAI technologies required the repeated calculation of hundreds to thousands of baselines to generate accurate explanations, resulting in massive computational costs. This was a major factor limiting the application of XAI technology in real-time service environments.
To address this issue, Professor Choi's research team developed the 'ABSQR (Amortized Baseline Selection via Rank-Revealing QR)' framework for accelerating explanation algorithms. ABSQR exploits the observation that the value-function matrix generated during the AI model explanation process has a low-rank structure, and introduces a method to select only a critical few baselines from the hundreds available. This reduces the computational complexity, which was previously proportional to the total number of baselines, to be proportional only to the number of selected critical baselines, maximizing computational efficiency while maintaining explanatory accuracy.
Specifically, ABSQR operates in two stages. The first stage systematically selects important baselines using Singular Value Decomposition (SVD) and Rank-Revealing QR decomposition. Unlike existing random sampling methods, this is a deterministic selection aimed at preserving recoverable information, which guarantees the accuracy of the explanation while significantly reducing computation. The second stage introduces an amortized inference mechanism, which reuses the pre-calculated baseline weights through cluster-based search, allowing the system to explain a model's prediction in real-time service environments without repeatedly evaluating the model.

The research team verified the superiority of ABSQR through experiments on various real-world datasets. Tests on standard datasets spanning sectors such as finance, marketing, and demographics showed that ABSQR achieved an average processing speed 8.5 times faster than existing explanation algorithms that use all baselines, with a maximum speed improvement of over 11 times. Furthermore, the loss of explanatory accuracy due to acceleration was minimal, with ABSQR maintaining up to 93.5% of the accuracy of the baseline algorithm. This level is sufficient to meet the explanation quality required in real-world applications.
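The first-stage selection idea can be sketched compactly: column-pivoted (rank-revealing) QR on the value-function matrix ranks baselines by how much new information each contributes, so only the first few pivots need to be kept. The snippet below is a simplified illustration under that assumption, not the paper's implementation; the function name and matrix layout are hypothetical.

```python
import numpy as np
from scipy.linalg import qr

def select_baselines(V, k):
    """Pick k informative baselines from a value-function matrix V
    (rows: baselines, columns: feature coalitions) via column-pivoted
    rank-revealing QR. Illustrative sketch of the selection stage only."""
    # Pivoted QR on V^T orders the columns of V^T (i.e., the rows of V,
    # the baselines) by decreasing marginal information; keep the first k.
    _, _, piv = qr(V.T, pivoting=True, mode='economic')
    return np.sort(piv[:k])
```

Because the matrix is (approximately) low-rank, the few selected baselines span nearly the same space as the full set, which is why explanations computed from them stay close to the full computation.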
< ABSQR Framework Overview. (1) The baseline selection stage utilizes the low-rank structure of the value function matrix to select only a small number of key baselines, and (2) the accelerated search stage reuses the pre-calculated baseline weight coefficients based on clusters. This dramatically reduces the computation complexity, which was proportional to the number of baselines, to be proportional only to the number of selected key baselines. >
A KakaoBank official stated, "We will continue relentless research and development to enhance the reliability and convenience of financial services and introduce innovative financial technologies that customers can experience."

Chanwoo Lee and Youngjin Park, co-first authors from KAIST, explained the significance of the research: "This methodology solves the crucial acceleration problem for real-time application in the financial sector, proving that it is possible to provide users with the reasons behind a learning model's decision in real time." They added, "This research provides new insights into what constitutes unnecessary computation and how to select important baselines in explanation algorithms, practically contributing to the efficiency of explanation technology."

This research, co-authored by Ph.D. candidates Chanwoo Lee and Youngjin Park from the KAIST Kim Jaechul Graduate School of AI and researchers Hyeongeun Lee and Yeeun Yoo from the KakaoBank Financial Technology Research Institute, was presented on November 12 at CIKM 2025 (ACM International Conference on Information and Knowledge Management), a premier international conference in the field of information and knowledge management.

※ Paper Title: Amortized Baseline Selection via Rank-Revealing QR for Efficient Model Explanation
※ DOI: https://doi.org/10.1145/3746252.3761036
※ Author Information:
Co-First Authors: Chanwoo Lee (KAIST Kim Jaechul Graduate School of AI), Youngjin Park (KAIST Kim Jaechul Graduate School of AI), Hyeongeun Lee (KakaoBank), Yeeun Yoo (KakaoBank)
Co-Authors: Daehee Han (KakaoBank), Junho Choi (KAIST Kim Jaechul Graduate School of AI), Kunhyung Kim (KAIST Kim Jaechul Graduate School of AI)
Corresponding Authors: Nari Kim (KAIST Kim Jaechul Graduate School of AI), Jaesik Choi (KAIST Kim Jaechul Graduate School of AI)
Meanwhile, this research achievement was conducted through KakaoBank's industry-academia research project 'Advanced Research on Explainable Artificial Intelligence Algorithms in the Financial Sector' and the Ministry of Science and ICT/Institute for Information & Communications Technology Planning and Evaluation (IITP) supported project 'Development of Explainable Artificial Intelligence Technology Providing Explainability in a Plug-and-Play Manner and Verification of Explanation Provision for AI Systems.'
KAIST Predicts Human Group Behavior with AI! 1st Place at the World’s Top Conference… Major Success after 23 Years
<(From left) Ph.D. candidate Geon Lee, Ph.D. candidate Minyoung Choe, M.S. candidate Jaewan Chun, Professor Kijung Shin, M.S. candidate Seokbum Yoon>
KAIST (President Kwang Hyung Lee) announced on the 9th of December that Professor Kijung Shin’s research team at the Kim Jaechul Graduate School of AI has developed a groundbreaking AI technology that predicts complex social group behavior by analyzing how individual attributes such as age and role influence group relationships.
With this technology, the research team achieved the remarkable feat of winning the Best Paper Award at the world-renowned data mining conference “IEEE ICDM,” hosted by the Institute of Electrical and Electronics Engineers (IEEE). This is the highest honor awarded to only one paper out of 785 submissions worldwide, and marks the first time in 23 years that a Korean university research team has received this award, once again demonstrating KAIST’s technological leadership on the global research stage.
Today, group interactions involving many participants at the same time—such as online communities, research collaborations, and group chats—are rapidly increasing across society. However, technology that can precisely explain both how such group behavior is structured and how individual characteristics influence it has been lacking.
To overcome this limitation, Professor Kijung Shin’s research team developed an AI model called “NoAH (Node Attribute-based Hypergraph Generator),” which realistically reproduces the interplay between individual attributes and group structure.
NoAH is an artificial intelligence that explains and imitates what kinds of group behaviors emerge when people’s characteristics come together. For example, it can analyze and faithfully reproduce how information such as a person’s interests and roles actually combine to form group behavior.
As such, NoAH is an AI that generates “realistic group behavior” by simultaneously reflecting human traits and relationships. It was shown to reproduce various real-world group behaviors—such as product purchase combinations in e-commerce, the spread of online discussions, and co-authorship networks among researchers—far more realistically than existing models.
< The process of generating group interactions using NoAH >
Professor Kijung Shin stated, “This study opens a new AI paradigm that enables a richer understanding of complex interactions by considering not only the structure of groups but also individual attributes together,” and added, “Analyses of online communities, messengers, and social networks will become far more precise.”
This research was conducted by a team consisting of Professor Kijung Shin and KAIST Kim Jaechul Graduate School of AI students: master’s students Jaewan Chun and Seokbum Yoon, and doctoral students Minyoung Choe and Geon Lee, and was presented at IEEE ICDM on November 18.
※ Paper title: “Attributed Hypergraph Generation with Realistic Interplay Between Structure and Attributes”
※ Original paper: https://arxiv.org/abs/2509.21838
< Photo from the award ceremony held on November 14 at the International Spy Museum in Washington, D.C.>
Meanwhile, including this award-winning paper, Professor Shin’s research team presented a total of four papers at IEEE ICDM this year. In addition, in 2023, the team also received the Best Student Paper Runner-up (4th place) at the same conference.
This work was supported by Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-202400457882, AI Research Hub Project) (RS-2019-II190075, Artificial Intelligence Graduate School Program (KAIST)) (No. RS-2022-II220871, Development of AI Autonomy and Knowledge Enhancement for AI Agent Collaboration).
How Does AI Think? KAIST Achieves First Visualization of the Internal Structure Behind AI Decision-Making
<(From left) Ph.D. candidate Dahee Kwon, Ph.D. candidate Sehyun Lee, Professor Jaesik Choi>
Although deep learning–based image recognition technology is rapidly advancing, it still remains difficult to clearly explain the criteria AI uses internally to observe and judge images. In particular, technologies that analyze how large-scale models combine various concepts (e.g., cat ears, car wheels) to reach a conclusion have long been recognized as a major unsolved challenge.
KAIST (President Kwang Hyung Lee) announced on the 26th of November that Professor Jaesik Choi’s research team at the Kim Jaechul Graduate School of AI has developed a new explainable AI (XAI) technology that visualizes the concept-formation process inside a model at the level of circuits, enabling humans to understand the basis on which AI makes decisions.
The study is evaluated as a significant step forward that allows researchers to structurally examine “how AI thinks.”
Inside deep learning models, there exist basic computational units called neurons, which function similarly to those in the human brain. Neurons detect small features within an image—such as the shape of an ear, a specific color, or an outline—and compute a value (signal) that is transmitted to the next layer.
In contrast, a circuit refers to a structure in which multiple neurons are connected to jointly recognize a single meaning (concept). For example, to recognize the concept of cat ear, neurons detecting outline shapes, neurons detecting triangular forms, and neurons detecting fur-color patterns must activate in sequence, forming a functional unit (circuit).
Up until now, most explanation techniques have taken a neuron-centric approach based on the idea that “a specific neuron detects a specific concept.” However, in reality, deep learning models form concepts through cooperative circuit structures involving many neurons. Based on this observation, the KAIST research team proposed a technique that expands the unit of concept representation from “neuron → circuit.”
The research team’s newly developed technology, Granular Concept Circuits (GCC), is a novel method that analyzes and visualizes how an image-classification model internally forms concepts at the circuit level.
GCC automatically traces circuits by computing Neuron Sensitivity and Semantic Flow. Neuron Sensitivity indicates how strongly a neuron responds to a particular feature, while Semantic Flow measures how strongly that feature is passed on to the next concept. Using these metrics, the system can visualize, step-by-step, how basic features such as color and texture are assembled into higher-level concepts.
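The two quantities can be illustrated on a toy two-layer ReLU network. In the sketch below, "sensitivity" is simplified to a neuron's activation for a given input, and "semantic flow" to the contribution (activation times weight) a neuron passes to each next-layer neuron; these are simplified stand-ins for the paper's metrics, and all names are illustrative.

```python
import numpy as np

def trace_circuit(x, W1, W2, top_k=2):
    """Toy circuit tracing on a two-layer ReLU network.
    Returns layer-1 activations (sensitivity proxy), the flow matrix
    (flow[j, i] = contribution of layer-1 neuron i to layer-2 neuron j),
    and, per layer-2 neuron, the top-k strongest incoming neurons."""
    h = np.maximum(0, W1 @ x)        # layer-1 activations
    flow = W2 * h[np.newaxis, :]     # per-connection contributions
    circuit = {j: np.argsort(-np.abs(flow[j]))[:top_k].tolist()
               for j in range(W2.shape[0])}
    return h, flow, circuit
```

Summing each row of the flow matrix recovers the next layer's pre-activations, so the "circuit" for a downstream neuron is simply the small set of upstream neurons carrying most of its signal; GCC's contribution is to trace such structures automatically through a real model's many layers.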
The team conducted experiments in which specific circuits were temporarily disabled (ablation). As a result, when the circuit responsible for a concept was deactivated, the AI’s predictions actually changed.
In other words, the experiment directly demonstrated that the corresponding circuit indeed performs the function of recognizing that concept.
This study is regarded as the first to reveal, at a fine-grained circuit level, the actual structural process by which concepts are formed inside complex deep learning models. Through this, the research suggests practical applicability across the entire explainable AI (XAI) domain—including strengthening transparency in AI decision-making, analyzing the causes of misclassification, detecting bias, improving model debugging and architecture, and enhancing safety and accountability.
The research team stated, “This technology shows the concept structures that AI forms internally in a way that humans can understand,” adding that “this study provides a scientific starting point for researching how AI thinks.”
Professor Jaesik Choi emphasized, “Unlike previous approaches that simplified complex models for explanation, this is the first approach to precisely interpret the model’s interior at the level of fine-grained circuits,” and added, “We demonstrated that the concepts learned by AI can be automatically traced and visualized.”
< Overview of the Conceptual Circuit Proposed by the Research Team >
This study, with Ph.D. candidates Dahee Kwon and Sehyun Lee from the KAIST Kim Jaechul Graduate School of AI as co-first authors, was presented on October 21 at the International Conference on Computer Vision (ICCV).
Paper title: Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations
Paper link: https://openaccess.thecvf.com/content/ICCV2025/papers/Kwon_Granular_Concept_Circuits_Toward_a_Fine-Grained_Circuit_Discovery_for_Concept_ICCV_2025_paper.pdf
This research was supported by the Ministry of Science and ICT and the Institute for Information & Communications Technology Planning & Evaluation (IITP) under the “Development of Artificial Intelligence Technology for Personalized Plug-and-Play Explanation and Verification of Explanation” project, the AI Research Hub Project, and the KAIST AI Graduate School Program, and was carried out with support from the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development (ADD) at the KAIST Center for Applied Research in Artificial Intelligence.
KAIST Develops AI ‘MARIOH’ to Uncover and Reconstruct Hidden Multi-Entity Relationships
<(From Left) Professor Kijung Shin, Ph.D candidate Kyuhan Lee, and Ph.D candidate Geon Lee>
Just like when multiple people gather simultaneously in a meeting room, higher-order interactions—where many entities interact at once—occur across various fields and reflect the complexity of real-world relationships. However, due to technical limitations, in many fields, only low-order pairwise interactions between entities can be observed and collected, which results in the loss of full context and restricts practical use. KAIST researchers have developed the AI model “MARIOH,” which can accurately reconstruct* higher-order interactions from such low-order information, opening up innovative analytical possibilities in fields like social network analysis, neuroscience, and life sciences.
*Reconstruction: Estimating/reconstructing the original structure that has disappeared or was not observed.
KAIST (President Kwang Hyung Lee) announced on the 5th that Professor Kijung Shin’s research team at the Kim Jaechul Graduate School of AI has developed an AI technology called “MARIOH” (Multiplicity-Aware Hypergraph Reconstruction), which can reconstruct higher-order interaction structures with high accuracy using only low-order interaction data.
Reconstructing higher-order interactions is challenging because a vast number of higher-order interactions can arise from the same low-order structure.
The key idea behind MARIOH, developed by the research team, is to utilize multiplicity information of low-order interactions to drastically reduce the number of candidate higher-order interactions that could stem from a given structure.
In addition, by employing efficient search techniques, MARIOH quickly identifies promising interaction candidates and uses multiplicity-based deep learning to accurately predict the likelihood that each candidate represents an actual higher-order interaction.
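The pruning idea can be sketched in miniature: a set of nodes is a candidate higher-order interaction only if every pair inside it was observed, and the number of hyperedges covering a pair cannot exceed that pair's observed multiplicity. The greedy enumeration below is a toy illustration of this constraint, not MARIOH itself, which scores candidates with a multiplicity-based deep learning model; the function name is hypothetical.

```python
from itertools import combinations

def candidate_hyperedges(pair_mult, max_size=4):
    """Toy sketch: enumerate candidate higher-order interactions from
    pairwise edges with multiplicities. pair_mult maps a sorted node
    pair (u, v) to how many times the pair was observed."""
    nodes = sorted({v for edge in pair_mult for v in edge})
    budget = dict(pair_mult)  # remaining multiplicity per pair

    selected = []
    # Try larger candidates first: a hyperedge of size s explains
    # s*(s-1)/2 pairwise observations at once.
    for size in range(max_size, 1, -1):
        for cand in combinations(nodes, size):
            pairs = list(combinations(cand, 2))
            if all(budget.get(p, 0) >= 1 for p in pairs):
                selected.append(cand)
                for p in pairs:
                    budget[p] -= 1
    return selected
```

For example, three pairwise edges forming a triangle, each observed once, are explained by a single three-way interaction rather than three separate pairwise ones; the multiplicity budget is what rules out the redundant alternatives.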
<Figure 1. An example of recovering higher-order co-authorship relationships (right) from pairwise (low-order) co-authorship relationships (left) with 100% accuracy, using MARIOH technology.>
Through experiments on ten diverse real-world datasets, the research team showed that MARIOH reconstructed higher-order interactions with up to 74% greater accuracy compared to existing methods.
For instance, in a dataset on co-authorship relations (source: DBLP), MARIOH achieved a reconstruction accuracy of over 98%, significantly outperforming existing methods, which reached only about 86%. Furthermore, leveraging the reconstructed higher-order structures led to improved performance in downstream tasks, including prediction and classification.
According to Professor Shin, "MARIOH moves beyond existing approaches that rely solely on simplified connection information, enabling precise analysis of the complex interconnections found in the real world." He added that it "has broad potential applications in fields such as social network analysis for group chats or collaboration networks, life sciences for studying protein complexes or gene interactions, and neuroscience for tracking simultaneous activity across multiple brain regions."
The research was conducted by Kyuhan Lee (Integrated M.S.–Ph.D. program at the Kim Jaechul Graduate School of AI at KAIST; currently a software engineer at GraphAI), Geon Lee (Integrated M.S.–Ph.D. program at KAIST), and Professor Kijung Shin. It was presented at the 41st IEEE International Conference on Data Engineering (IEEE ICDE), held in Hong Kong this past May.
※ Paper title: MARIOH: Multiplicity-Aware Hypergraph Reconstruction
※ DOI: https://doi.ieeecomputersociety.org/10.1109/ICDE65448.2025.00233
<Figure 2. An example of the process of recovering higher-order relationships using MARIOH technology>
This research was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) through the project “EntireDB2AI: Foundational technologies and software for deep representation learning and prediction using complete relational databases,” as well as by the National Research Foundation of Korea through the project “Graph Foundation Model: Graph-based machine learning applicable across various modalities and domains.”
3 KAIST PhD Candidates Selected as the 2021 Google PhD Fellows
PhD candidates Soo Ye Kim and Sanghyun Woo from the KAIST School of Electrical Engineering and Hae Beom Lee from the Kim Jaechul Graduate School of AI were selected as 2021 Google PhD Fellows. The Google PhD Fellowship is a scholarship program that supports graduate students from around the world who have produced excellent achievements in promising computer-science-related fields. The 75 selected fellows will each receive ten thousand dollars in funding, along with the opportunity to discuss their research and receive one-on-one feedback from experts in related fields at Google.
Kim and Woo were named fellows in the field of "Machine Perception, Speech Technology and Computer Vision" for their research on deep-learning-based super-resolution and computer vision, respectively. Lee was named a fellow in the field of "Machine Learning" for his research on meta-learning.
Kim's research includes the formulation of novel methods for joint super-resolution and HDR video restoration, as well as deep joint frame interpolation and super-resolution methods. Many of her works have been presented at leading conferences in computer vision and AI, such as CVPR, ICCV, and AAAI. In addition, she has collaborated as a research intern with the Vision Group at Adobe Research to study depth-map refinement techniques.
(Kim's research on deep learning based joint super-resolution and inverse tone-mapping framework for HDR videos)
Woo’s research includes effective deep learning model designs based on the attention mechanism and learning methods based on self-learning and simulators. His works have also been presented at leading conferences such as CVPR, ECCV, and NeurIPS. In particular, his work on the Convolutional Block Attention Module (CBAM), presented at ECCV 2018, has surpassed 2,700 citations on Google Scholar after being referenced in many computer vision applications. He was also a recipient of the Microsoft Research PhD Fellowship in 2020.
(Woo's research on attention mechanism based deep learning models)
Lee’s research focuses on effectively overcoming various limitations of the existing meta-learning framework. Specifically, he proposed methods to handle realistic task distributions with imbalances, improved the practicality of meta-knowledge, and made meta-learning possible even in large-scale task scenarios. These studies have been accepted to numerous top-tier machine learning conferences such as NeurIPS, ICML, and ICLR. In particular, one of his papers was selected for an oral presentation at ICLR 2020 and another for a spotlight presentation at NeurIPS 2020.
(Lee's research on learning to balance and continual trajectory shifting)
Due to the COVID-19 pandemic, the award ceremony was held virtually at the Google PhD Fellowship Summit from August 31st to September 1st. The list of fellowship recipients is displayed on the Google webpage.