본문 바로가기 대메뉴 바로가기

research

KAIST Develops AI Crowd Prediction Technology to Prevent Disasters like the Itaewon Tragedy​
View : 1476 Date : 2025-09-17 Writer : PR Office

<(From Left) Ph.D candidate Youngeun Nam from KAIST, Professor Jae-Gil Lee from KAIST, Ji-Hye Na from KAIST, (Top right, from left) Professor Soo-Sik Yoon from Korea University, Professor HwanJun Song from KAIST>

To prevent crowd crush incidents like the Itaewon tragedy, it's crucial to go beyond simply counting people and to instead have a technology that can detect the real-

inflow and movement patterns of crowds. A KAIST research team has successfully developed new AI crowd prediction technology that can be used not only for managing large-scale events and mitigating urban traffic congestion but also for responding to infectious disease outbreaks.

On the 17th, KAIST (President Kwang Hyung Lee) announced that a research team led by Professor Jae-Gil Lee from the School of Computing has developed a new AI technology that can more accurately predict crowd density.

The dynamics of crowd gathering cannot be explained by a simple increase or decrease in the number of people. Even with the same number of people, the level of risk changes depending on where they are coming from and which direction they are heading.

Professor Lee's team expressed this movement using the concept of a "time-varying graph." This means that accurate prediction is only possible by simultaneously analyzing two types of information: "node information" (how many people are in a specific area) and "edge information" (the flow of people between areas).

In contrast, most previous studies focused on only one of these factors, either concentrating on "how many people are gathered right now" or "which paths are people moving along." However, the research team emphasized that combining both is necessary to truly capture a dangerous situation.

For example, a sudden increase in density in a specific alleyway, such as Alley A, is difficult to predict with just "current population" data. But by also considering the flow of people continuously moving from a nearby area, Area B, towards Area A (edge information), it's possible to pre-emptively identify the signal that "Area A will soon become dangerous."

To achieve this, the team developed a "bi-modal learning" method. This technology simultaneously considers population counts (node information) and population flow (edge information), while also learning spatial relationships (which areas are connected) and temporal changes (when and how movement occurs).

Specifically, the team introduced a 3D contrastive learning technique. This allows the AI to learn not only 2D spatial (geographical) information but also temporal information, creating a 3D relationship. As a result, the AI can understand not just whether the population is "large or small right now," but "what pattern the crowd is developing into over time." This allows for a much more accurate prediction of the time and place where congestion will occur than previous methods.

<Figure 1. Workflow of the bi-modal learning-based crowd congestion risk prediction developed by the research team.

The research team developed a crowd congestion risk prediction model based on bi-modal learning. The vertex-based time series represents indicator changes in a specific area (e.g., increases or decreases in crowd density), while the edge-based time series captures the flow of population movement between areas over time. Although these two types of data are collected from different sources, they are mapped onto the same network structure and provided together as input to the AI model. During training, the model simultaneously leverages both vertex and edge information based on a shared network, allowing it to capture complex movement patterns that might be overlooked when relying on only a single type of data. For example, a sudden increase in crowd density in a particular area may be difficult to predict using vertex information alone, but by additionally considering the steady inflow of people from adjacent areas (edge information), the prediction becomes more accurate. In this way, the model can precisely identify future changes based on past and present information, ultimately predicting high-risk crowd congestion areas in advance.>


The research team built and publicly released six real-world datasets for their study, which were compiled from sources such as Seoul, Busan, and Daegu subway data, New York City transit data, and COVID-19 confirmed case data from South Korea and New York.

The proposed technology achieved up to a 76.1% improvement in prediction accuracy over recent state-of-the-art methods, demonstrating strong perf

Professor Jae-Gil Lee stated, "It is important to develop technologies that can have a significant social impact," adding, "I hope this technology will greatly contribute to protecting public safety in daily life, such as in crowd management for large events, easing urban traffic congestion, and curbing the spread of infectious diseases."

Youngeun Nam, a Ph.D candidate in the KAIST School of Computing, was the first author of the study, and Jihye Na, another Ph.D candidate, was a co-author. The research findings were presented at the Knowledge Discovery and Data Mining (KDD) 2025 conference, a top international conference in the field of data mining, this past August.

※ Paper Title: Bi-Modal Learning for Networked Time Series ※ DOI: https://doi.org/10.1145/3711896.3736856

This technology is the result of research projects including the "Mid-Career Researcher Project" (RS-2023-NR077002, Core Technology Research for Crowd Management Systems Based on AI and Mobility Big Data) and the "Human-Centered AI Core Technology Development Project" (RS-2022-II220157, Robust, Fair, and Scalable Data-Centric Continuous Learning).

 

Releated news