Making Truly Smart AI Agents a Reality with the World's Best DB Integration Technology
<(From left) Engineer Jeongho Park of GraphAI, Ph.D. candidate Geonho Lee, and Professor Min-Soo Kim of KAIST>
For a long time, companies have managed their data with relational databases (DBs). With the spread of large AI models, however, these systems increasingly need to be integrated with graph databases, and that integration has exposed limitations such as high cost, data inconsistency, and difficulty in processing complex queries.
A KAIST research team has succeeded in developing a next-generation graph-relational DB system that addresses these problems at once and is ready for immediate deployment in industry. With this technology, AI can reason over complex relationships in real time rather than performing simple lookups, enabling smarter AI services.
The research team led by Professor Min-Soo Kim announced on September 8 that it has developed 'Chimera,' a new DB system that fully integrates relational and graph DBs to execute graph-relational queries efficiently. Chimera demonstrated world-class performance on an international standard benchmark, processing queries at least 4 times and up to 280 times faster than existing systems.
Unlike relational DBs, graph DBs represent data as vertices (nodes) and edges (connections), which gives them a strong advantage in analyzing and reasoning about intricately interconnected information such as people, events, places, and time. Thanks to this property, their use is spreading rapidly in fields such as AI agents, social networks, finance, and e-commerce.
With growing demand for complex query processing that spans relational and graph DBs, a new standard language called 'SQL/PGQ,' which extends the relational query language SQL with graph query capabilities, has also been proposed.
SQL/PGQ adds graph traversal to SQL and is designed to query tabular data and connected information, such as people, events, and places, in a single statement. With it, a question like 'which company does my friend's friend work for?' can be expressed far more simply than before.
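To make this concrete, the snippet below sketches how such a "friend's friend's employer" question could look in SQL/PGQ-style syntax, wrapped in a small Python script. The graph name, labels, and property names are hypothetical, and the exact syntax can differ between systems implementing the standard.

```python
# A minimal sketch of a "friend's friend's employer" query in SQL/PGQ syntax.
# The property-graph name, labels, and column names below are hypothetical;
# exact syntax may vary between systems implementing the SQL/PGQ standard.
FRIENDS_EMPLOYER_QUERY = """
SELECT gt.company_name
FROM GRAPH_TABLE (
    social_graph
    MATCH (me IS person WHERE me.name = 'Alice')
          -[IS knows]-> (friend IS person)
          -[IS knows]-> (fof IS person)
          -[IS works_at]-> (c IS company)
    COLUMNS (c.name AS company_name)
) AS gt
"""

if __name__ == "__main__":
    # In practice this string would be handed to a database driver's execute()
    # call; here we only print it for illustration.
    print(FRIENDS_EMPLOYER_QUERY)
```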
<Diagram (a): This diagram shows the typical architecture of a graph query processing system based on a traditional RDBMS. It has separate dedicated operators for graph traversal and an in-memory graph structure, while attribute joins are handled by relational operators. However, this structure makes it difficult to optimize execution plans for hybrid queries because traversal and joins are performed in different pipelines. Additionally, for large-scale graphs, the in-memory structure creates memory constraints, and the method of extracting graph data from relational data limits data freshness. Diagram (b): This diagram shows Chimera's integrated architecture. Chimera introduces new components to the existing RDBMS architecture: a traversal-join operator that combines graph traversal and joins, a disk-based graph storage, and a dedicated graph access layer. This allows it to process both graph and relational data within a single execution flow. Furthermore, a hybrid query planner integrally optimizes both graph and relational operations. Its shared transaction management and disk-based storage structure enable it to handle large-scale graph databases without memory constraints while maintaining data freshness. This architecture removes the bottlenecks of existing systems by flexibly combining traversal, joins, and mappings in a single execution plan, thereby simultaneously improving performance and scalability.>
The problem is that existing approaches either try to mimic graph traversal with join operations or pre-build a graph view in memory. In the former case, performance drops sharply as the traversal depth increases; in the latter, execution fails from insufficient memory as soon as the data grows even moderately large. Moreover, changes to the original data are not immediately reflected in the view, so data freshness suffers, and relational and graph results still have to be combined in a separate step.
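For contrast, the illustrative sketch below (using a hypothetical knows(src, dst) edge table) shows how a multi-hop traversal must be emulated with chained self-joins in plain SQL; every additional hop appends another join, which is why the join-based approach slows down sharply with depth.

```python
# Sketch: emulating an n-hop friend traversal with chained self-joins over a
# hypothetical knows(src, dst) edge table. Each extra hop appends one more
# join, which is why join-based emulation degrades quickly as depth grows.
def n_hop_join_query(start_name: str, hops: int) -> str:
    joins = ["JOIN knows k1 ON k1.src = me.id"]
    for i in range(2, hops + 1):
        joins.append(f"JOIN knows k{i} ON k{i}.src = k{i - 1}.dst")
    return (
        f"SELECT k{hops}.dst\n"
        "FROM person me\n"
        + "\n".join(joins)
        + f"\nWHERE me.name = '{start_name}'"
    )

if __name__ == "__main__":
    # A 4-hop traversal already requires four chained self-joins.
    print(n_hop_join_query("Alice", 4))
```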
The KAIST team's 'Chimera' resolves these limitations at their root. The researchers redesigned both the storage layer and the query processing layer of the database.
First, they introduced a 'dual-store' architecture that operates a graph-specific store alongside a relational store. They then added a 'traversal-join operator' that processes graph traversal and relational operations together, so that complex operations execute efficiently within a single system. As a result, Chimera has established itself as the world's first graph-relational DB system to integrate the entire pipeline, from data storage to query processing, in one engine.
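The sketch below illustrates the general idea of such a traversal-join in plain Python; it is not Chimera's actual implementation, and the adjacency map and attribute table are toy stand-ins for the graph store and the relational store.

```python
# Illustrative sketch of the traversal-join idea (not Chimera's actual code):
# expand neighbors from a graph-style adjacency store and, in the same
# pipeline, join each visited vertex with its row in a relational-style table.
from collections import deque
from typing import Dict, Iterator, List, Tuple

# Graph store: adjacency lists keyed by vertex id.
ADJACENCY: Dict[int, List[int]] = {1: [2, 3], 2: [4], 3: [4], 4: []}

# Relational store: vertex attributes keyed by vertex id.
PERSON_TABLE: Dict[int, Dict[str, str]] = {
    1: {"name": "Alice"}, 2: {"name": "Bob"},
    3: {"name": "Carol"}, 4: {"name": "Dave"},
}

def traversal_join(start: int, depth: int) -> Iterator[Tuple[int, Dict[str, str]]]:
    """BFS up to `depth` hops, joining attributes as vertices are visited."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        vertex, hops = queue.popleft()
        if hops == depth:
            continue
        for neighbor in ADJACENCY.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                # Join step: fetch the relational row for this vertex
                # immediately, instead of materializing a graph view first.
                yield neighbor, PERSON_TABLE[neighbor]
                queue.append((neighbor, hops + 1))

if __name__ == "__main__":
    for vid, row in traversal_join(start=1, depth=2):
        print(vid, row["name"])
```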
On the international standard benchmark, the LDBC Social Network Benchmark (SNB), Chimera recorded world-class performance, running at least 4 times and up to 280 times faster than existing systems.
Queries no longer fail from insufficient memory no matter how large the graph data grows, and because Chimera does not rely on views, there is no staleness problem with data freshness.
Professor Min-Soo Kim stated, "As the connections between data become more complex, the need for integrated technology that encompasses both graph and relational DBs is increasing. Chimera is a technology that fundamentally solves this problem, and we expect it to be widely used in various industries such as AI agents, finance, and e-commerce."
The study was co-authored by Geonho Lee, a Ph.D. student in the KAIST School of Computing, as the first author, and Jeongho Park, an engineer at Professor Kim's startup GraphAI Co., Ltd., as the second author, with Professor Kim as the corresponding author.
The results were presented on September 1 at VLDB, a world-renowned international conference in the database field. Chimera is expected to have an immediate industrial impact as a core technology for high-performance AI agents based on RAG (retrieval-augmented generation), and it will be incorporated into 'AkasicDB,' a vector-graph-relational DB system scheduled for release by GraphAI Co., Ltd.
*Paper title: Chimera: A System Design of Dual Storage and Traversal-Join Unified Query Processing for SQL/PGQ
*DOI: https://dl.acm.org/doi/10.14778/3705829.3705845
This research was supported by the Ministry of Science and ICT's IITP SW Star Lab and the National Research Foundation of Korea's Mid-Career Researcher Program.
KAIST develops “FlexGNN,” a graph analysis AI 95 times faster with a single GPU server
<(From left) Donghyoung Han, CTO of GraphAI Co., Ph.D. candidate Jeongmin Bae from KAIST, and Professor Min-Soo Kim from KAIST>
In industry, alongside text-based large language models (LLMs) such as ChatGPT, graph AI models based on GNNs (graph neural networks) are widely used to analyze unstructured data, such as financial transactions, stock movements, social media activity, and patient records, in graph form. A key limitation, however, is that full-graph training, which trains on the entire graph at once, requires massive memory and many GPU servers. A KAIST research team has now succeeded in developing the world's highest-performance software technology for training large-scale GNN models at maximum speed on a single GPU server.
KAIST (President Kwang Hyung Lee) announced on the 13th that the research team led by Professor Min-Soo Kim of the School of Computing has developed "FlexGNN," a GNN system that, unlike existing approaches requiring multiple GPU servers, can rapidly train large-scale full-graph AI models and run inference on them using a single GPU server. FlexGNN improves training speed by up to 95 times compared with existing technologies.
Recently, in various fields such as climate, finance, medicine, pharmaceuticals, manufacturing, and distribution, there has been a growing number of cases where data is converted into graph form, consisting of nodes and edges, for analysis and prediction.
While the full graph approach, which uses the entire graph for training, achieves higher accuracy, it has the drawback of frequently running out of memory due to the generation of massive intermediate data during training, as well as prolonged training times caused by data communication between multiple servers.
To overcome these problems, FlexGNN performs optimal AI model training on a single GPU server by utilizing SSDs (solid-state drives) and main memory instead of multiple GPU servers.
<Figure (a): This illustrates the typical execution flow of a conventional full-graph GNN training system. All intermediate data generated during training are retained in GPU memory, and computations are performed sequentially without data movement or memory optimization. Consequently, if the GPU memory capacity is exceeded, training becomes infeasible. Additionally, inter-GPU data exchange relies solely on a fixed method (X_rigid), limiting performance and scalability. Figure (b): This depicts an example of the execution flow based on the optimized training execution plan generated by FlexGNN. For each intermediate data, strategies such as retention, offloading, or recomputation are selectively applied. Depending on resource constraints and data size, an appropriate inter-GPU exchange method—either GPU-to-GPU (G2G) or GPU-to-Host (G2H)—is adaptively chosen by the exchange operator (X_adapt). Furthermore, offloading and reloading operations are scheduled to overlap as much as possible with computation, maximizing compute-data movement parallelism. The adaptive exchange operator and various data offloading and reloading operators (R, O) within the figure demonstrate FlexGNN's ability to flexibly control intermediate data management and inter-GPU exchange strategies based on the training execution plan.>
In particular, by applying query-optimization techniques, long used in database systems, to AI training, the team developed a new training-optimization technology that places and moves model parameters, training data, and intermediate data across the GPU, main memory, and SSD tiers at the optimal times and in the optimal ways.
As a result, FlexGNN flexibly generates an optimal training execution plan according to the data size, the model scale, and available resources such as GPU memory, thereby achieving high resource efficiency and training speed.
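The sketch below illustrates, in greatly simplified form, what this kind of plan generation involves. It is not FlexGNN's actual planner; the tensor names, sizes, and costs are invented for illustration.

```python
# Rough illustration of plan generation for intermediate data (not FlexGNN's
# actual planner): each tensor is kept on the GPU, offloaded to host/SSD, or
# marked for recomputation, subject to a GPU memory budget. Numbers are made up.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Intermediate:
    name: str
    size_gb: float         # memory footprint if kept on the GPU
    recompute_cost: float  # relative cost of recomputing it later
    offload_cost: float    # relative cost of writing/reading it via host or SSD

def plan(intermediates: List[Intermediate], gpu_budget_gb: float) -> Dict[str, str]:
    """Greedy sketch: keep the tensors that are most expensive to re-obtain per
    GB on the GPU, then pick the cheaper of offloading or recomputation."""
    decisions: Dict[str, str] = {}
    used = 0.0
    ranked = sorted(
        intermediates,
        key=lambda t: -(min(t.recompute_cost, t.offload_cost) / t.size_gb),
    )
    for t in ranked:
        if used + t.size_gb <= gpu_budget_gb:
            decisions[t.name] = "keep"       # stays resident on the GPU
            used += t.size_gb
        elif t.offload_cost <= t.recompute_cost:
            decisions[t.name] = "offload"    # spill to main memory or SSD
        else:
            decisions[t.name] = "recompute"  # drop now, recompute when needed
    return decisions

if __name__ == "__main__":
    tensors = [
        Intermediate("layer1_activations", 12.0, recompute_cost=3.0, offload_cost=5.0),
        Intermediate("layer2_activations", 20.0, recompute_cost=8.0, offload_cost=6.0),
        Intermediate("attention_scores",    6.0, recompute_cost=9.0, offload_cost=4.0),
    ]
    print(plan(tensors, gpu_budget_gb=16.0))
```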
Consequently, it became possible to train GNN models on data far exceeding main memory capacity, with training up to 95 times faster even on a single GPU server. In particular, full-graph AI, capable of analysis more precise than supercomputer-based approaches in applications such as climate prediction, has become a practical reality.
Professor Min-Soo Kim of KAIST stated, “As full-graph GNN models are actively used to solve complex problems such as weather prediction and new material discovery, the importance of related technologies is increasing.” He added that “since FlexGNN has dramatically solved the longstanding problems of training scale and speed in graph AI models, we expect it to be widely used in various industries.”
In this research, Jeongmin Bae, a doctoral student in the School of Computing at KAIST, participated as the first author; Donghyoung Han, CTO of GraphAI Co. (founded by Professor Kim), participated as the second author; and Professor Kim served as the corresponding author.
The results were presented on August 5 at ACM KDD, a world-renowned data mining conference. The FlexGNN technology is also planned to be applied to GraphAI's graph database solution, GraphOn.
● Paper title: FlexGNN: A High-Performance, Large-Scale Full-Graph GNN System with Best-Effort Training Plan Optimization
● DOI: https://doi.org/10.1145/3711896.3736964
This research was supported by the IITP SW Star Lab and IITP-ITRC programs of the Ministry of Science and ICT, as well as the Mid-Career Researcher Program of the National Research Foundation of Korea.
T-GPS Processes a Graph with One Trillion Edges on a Single Computer
Trillion-scale graph processing simulation on a single computer presents a new concept of graph processing
A KAIST research team has developed a new technology that makes it possible to run large-scale graph algorithms without storing the graph in main memory or on disk. Named T-GPS (Trillion-scale Graph Processing Simulation) by its developer, Professor Min-Soo Kim of the School of Computing at KAIST, it can process a graph with one trillion edges on a single computer.
Graphs are widely used to represent and analyze real-world objects in many domains, including social networks, business intelligence, biology, and neuroscience. As the number of graph applications grows rapidly, developing and testing new graph algorithms is becoming more important than ever. Many industrial applications now require a graph algorithm to process a large-scale graph (e.g., one trillion edges), so when developing and testing graph algorithms at such a scale, a synthetic graph is usually used instead of a real one. This is because sharing and using large-scale real graphs is very limited: they are proprietary or practically impossible to collect.
Conventionally, developing and testing graph algorithms is done via the following two-step approach: generating and storing a graph and executing an algorithm on the graph using a graph processing engine.
The first step generates a synthetic graph and stores it on disks. The synthetic graph is usually generated by either parameter-based generation methods or graph upscaling methods. The former extracts a small number of parameters that can capture some properties of a given real graph and generates the synthetic graph with the parameters. The latter upscales a given real graph to a larger one so as to preserve the properties of the original real graph as much as possible.
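As an example of the parameter-based family, the sketch below shows an R-MAT-style generator, a widely used method of this kind (not necessarily the one referenced here): each edge is placed by recursively choosing one of the four quadrants of the adjacency matrix with probabilities a, b, c, and d.

```python
# Illustrative R-MAT-style generator (a common parameter-based method; not
# necessarily the one used in the paper). Each edge is placed by recursively
# picking one of the four adjacency-matrix quadrants with probabilities a,b,c,d.
import random
from typing import List, Tuple

def rmat_edge(scale: int, a: float, b: float, c: float,
              rng: random.Random) -> Tuple[int, int]:
    """Draw one edge in a graph of 2**scale vertices."""
    src = dst = 0
    for _ in range(scale):
        r = rng.random()
        src, dst = src * 2, dst * 2
        if r < a:                 # top-left quadrant
            pass
        elif r < a + b:           # top-right quadrant
            dst += 1
        elif r < a + b + c:       # bottom-left quadrant
            src += 1
        else:                     # bottom-right quadrant
            src += 1
            dst += 1
    return src, dst

def generate(scale: int, num_edges: int,
             a: float = 0.57, b: float = 0.19, c: float = 0.19,
             seed: int = 0) -> List[Tuple[int, int]]:
    rng = random.Random(seed)
    return [rmat_edge(scale, a, b, c, rng) for _ in range(num_edges)]

if __name__ == "__main__":
    # 2**4 = 16 vertices, 20 edges; skewed toward low vertex ids, as in R-MAT.
    print(generate(scale=4, num_edges=20))
```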
The second step loads the stored graph into the main memory of a graph processing engine, such as Apache GraphX, and executes the given graph algorithm on the engine. Since the graph is too large to fit in the main memory of a single computer, the engine typically runs on a cluster of tens or hundreds of computers, which makes the conventional two-step approach very costly.
T-GPS eliminates this problem. It does not generate and store a large-scale synthetic graph; instead, it loads only the initial small real graph into main memory. T-GPS then runs the graph algorithm on the small real graph as if the large-scale synthetic graph that would have been generated from it actually existed in main memory. After the algorithm finishes, T-GPS returns exactly the same result as the conventional two-step approach.
The key idea of T-GPS is to generate only the part of the synthetic graph that the algorithm needs to access, on the fly, and to modify the graph processing engine so that it treats the part generated on the fly as if it were part of a fully generated synthetic graph.
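The toy sketch below conveys the flavor of this idea, far more simply than T-GPS itself: when an algorithm asks for the neighbors of a vertex in the simulated synthetic graph, they are generated deterministically on the fly from the vertex id, so nothing is ever stored, yet repeated queries return the same answer. The constants and generation rule are invented for illustration.

```python
# Toy illustration of on-the-fly graph generation (far simpler than T-GPS):
# instead of storing the synthetic graph, the neighbors of a vertex are derived
# deterministically from a per-vertex seed whenever the algorithm asks for them.
import random
from typing import List

NUM_VERTICES = 1_000_000_000  # size of the simulated synthetic graph
AVG_DEGREE = 8

def neighbors(vertex: int, seed: int = 42) -> List[int]:
    """Return the (deterministic) neighbor list of `vertex` without storing it."""
    rng = random.Random(seed * NUM_VERTICES + vertex)  # same vertex -> same list
    degree = rng.randint(1, 2 * AVG_DEGREE)
    return [rng.randrange(NUM_VERTICES) for _ in range(degree)]

def bfs_reach(start: int, max_level: int) -> int:
    """Run a small BFS over the simulated graph; only visited parts ever exist."""
    frontier, seen = {start}, {start}
    for _ in range(max_level):
        frontier = {n for v in frontier for n in neighbors(v)} - seen
        seen |= frontier
    return len(seen)

if __name__ == "__main__":
    print("vertices reached within 2 hops of vertex 0:", bfs_reach(0, 2))
```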
The research team showed that T-GPS can process a graph of one trillion edges on a single computer, whereas the conventional two-step approach can only process a graph of one billion edges on a cluster of eleven computers of the same specification. In terms of computing resources, T-GPS thus outperforms the conventional approach by a factor of 10,000. The team also showed that T-GPS runs algorithms up to 43 times faster than the conventional approach, since T-GPS has no network communication overhead, whereas the conventional approach incurs heavy communication among the computers in the cluster.
Professor Kim believes that this work will have a large impact on the IT industry where almost every area utilizes graph data, adding, “T-GPS can significantly increase both the scale and efficiency of developing a new graph algorithm.”
This work was supported by the National Research Foundation (NRF) of Korea and Institute of Information & communications Technology Planning & Evaluation (IITP).
Publication:
Park, H., et al. (2021) “Trillion-scale Graph Processing Simulation based on Top-Down Graph Upscaling,” Presented at the IEEE ICDE 2021 (April 19-22, 2021, Chania, Greece)
Profile:
Min-Soo Kim
Associate Professor
minsoo.k@kaist.ac.kr
http://infolab.kaist.ac.kr
School of Computing
KAIST