
<(From Left) Ph.D. candidate Seongryong Oh, Ph.D. candidate Yoonsung Kim, Ph.D. candidate Wonung Kim, Ph.D. candidate Yubin Lee, M.S. candidate Jiyong Jung, Professor Jongse Park, Professor Divya Mahajan, Professor Chang Hyun Park>
As recent Artificial Intelligence (AI) models grow more capable of understanding and processing long, complex sentences, the need for new semiconductor technologies that boost computation speed and memory efficiency at the same time is increasing. Against this backdrop, a joint research team of KAIST researchers and international collaborators has developed a core AI semiconductor 'brain' technology for hybrid Transformer–Mamba models, implementing it for the first time in the world in a form that performs computation directly inside memory. The result is a four-fold increase in the inference speed of Large Language Models (LLMs) and a 2.2-fold reduction in power consumption.
KAIST (President Kwang Hyung Lee) announced on October 17th that the research team led by Professor Jongse Park of the KAIST School of Computing, in collaboration with the Georgia Institute of Technology in the United States and Uppsala University in Sweden, has developed 'PIMBA,' a core technology based on AI memory semiconductors (PIM, Processing-in-Memory) that acts as the brain for next-generation AI models.
Currently, LLMs such as ChatGPT, GPT-4, Claude, Gemini, and Llama operate on the 'Transformer' brain structure, which looks at all of the words in a sentence simultaneously. As a result, as AI models grow and the sentences they process become longer, the computational load and memory requirements surge, making slower responses and high energy consumption major issues.
To overcome these problems with the Transformer, the recently proposed sequential memory-based 'Mamba' structure introduced a method that processes information over time, increasing efficiency. However, memory bottlenecks and power consumption limits still remained.
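The contrast between the two processing styles above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function names and shapes are our own, and real models add many details (projections, gating, normalization).

```python
import math

def attention_step(query, keys, values):
    # Transformer-style: the new token attends to ALL previous tokens,
    # so per-token work and memory grow with the sequence length.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]          # softmax over the whole history
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def recurrent_step(state, a, b, x):
    # Mamba-style: the history is compressed into a fixed-size state,
    # so each token costs the same work regardless of sequence length.
    return [a * s + b * xi for s, xi in zip(state, x)]
```

The fixed-size state is what makes the Mamba side memory-efficient, and it is also why the remaining bottleneck shifts from computation to reading and updating that state in memory.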
Professor Jongse Park's research team designed 'PIMBA,' a new semiconductor structure that performs computations directly inside the memory, to maximize the performance of the 'Transformer–Mamba hybrid model,' which combines the advantages of both architectures.
While existing GPU-based systems move data out of the memory to perform computations, PIMBA performs calculations directly within the storage device without moving the data. This minimizes data movement time and significantly reduces power consumption.
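The data-movement argument above can be made concrete with a toy accounting model. This is our illustration of the general PIM idea, not PIMBA's actual design: we simply count words crossing the memory bus for a conventional read–compute–write path versus an in-place update triggered by a single command.

```python
def gpu_style_update(memory, scale):
    # Conventional path: read every word out to the processor, compute,
    # then write every result back -- two bus transfers per word.
    transfers = 0
    data = []
    for word in memory:                   # read each word out of memory
        data.append(word)
        transfers += 1
    result = [w * scale for w in data]    # compute on the processor
    for i, w in enumerate(result):        # write each result back
        memory[i] = w
        transfers += 1
    return transfers

def pim_style_update(memory, scale):
    # PIM path: one command crosses the bus, and the memory updates its
    # own words in place -- no bulk data movement.
    transfers = 1                         # the command itself
    for i in range(len(memory)):
        memory[i] *= scale                # computed where the data lives
    return transfers
```

In this model a GPU-style update of N words costs 2N transfers while the in-memory update costs a constant amount, which is the intuition behind the speed and power gains reported for PIMBA.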

<Analysis of Post-Transformer Models and Proposal of a Problem-Solving Acceleration System>
As a result, PIMBA showed up to a 4.1-fold improvement in processing performance and an average 2.2-fold decrease in energy consumption compared to existing GPU systems.
The research outcome is scheduled to be presented on October 20th at the '58th International Symposium on Microarchitecture (MICRO 2025),' a globally renowned computer architecture conference that will be held in Seoul. It was previously recognized for its excellence by winning the Gold Prize at the '31st Samsung Humantech Paper Award.' ※Paper Title: Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving, DOI: 10.1145/3725843.3756121
This research was supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP), the AI Semiconductor Graduate School Support Project, and the ICT R&D Program of the Ministry of Science and ICT and the IITP, with assistance from the Electronics and Telecommunications Research Institute (ETRI). The EDA tools were supported by IDEC (the IC Design Education Center).