Trillion-scale graph processing simulation on a single computer presents a new concept of graph processing
A KAIST research team has developed a new technology that enables to process a large-scale graph algorithm without storing the graph in the main memory or on disks. Named as T-GPS (Trillion-scale Graph Processing Simulation) by the developer Professor Min-Soo Kim from the School of Computing at KAIST, it can process a graph with one trillion edges using a single computer.
Graphs are widely used to represent and analyze real-world objects in many domains such as social networks, business intelligence, biology, and neuroscience. As the number of graph applications increases rapidly, developing and testing new graph algorithms is becoming more important than ever before. Nowadays, many industrial applications require a graph algorithm to process a large-scale graph (e.g., one trillion edges). So, when developing and testing graph algorithms such for a large-scale graph, a synthetic graph is usually used instead of a real graph. This is because sharing and utilizing large-scale real graphs is very limited due to their being proprietary or being practically impossible to collect.
Conventionally, developing and testing graph algorithms is done via the following two-step approach: generating and storing a graph and executing an algorithm on the graph using a graph processing engine.
-revised.png)
The first step generates a synthetic graph and stores it on disks. The synthetic graph is usually generated by either parameter-based generation methods or graph upscaling methods. The former extracts a small number of parameters that can capture some properties of a given real graph and generates the synthetic graph with the parameters. The latter upscales a given real graph to a larger one so as to preserve the properties of the original real graph as much as possible.
The second step loads the stored graph into the main memory of the graph processing engine such as Apache GraphX and executes a given graph algorithm on the engine. Since the size of the graph is too large to fit in the main memory of a single computer, the graph engine typically runs on a cluster of several tens or hundreds of computers. Therefore, the cost of the conventional two-step approach is very high.
The research team solved the problem of the conventional two-step approach. It does not generate and store a large-scale synthetic graph. Instead, it just loads the initial small real graph into main memory. Then, T-GPS processes a graph algorithm on the small real graph as if the large-scale synthetic graph that should be generated from the real graph exists in main memory. After the algorithm is done, T-GPS returns the exactly same result as the conventional two-step approach.
The key idea of T-GPS is generating only the part of the synthetic graph that the algorithm needs to access on the fly and modifying the graph processing engine to recognize the part generated on the fly as the part of the synthetic graph actually generated.
The research team showed that T-GPS can process a graph of 1 trillion edges using a single computer, while the conventional two-step approach can only process of a graph of 1 billion edges using a cluster of eleven computers of the same specification. Thus, T-GPS outperforms the conventional approach by 10,000 times in terms of computing resources. The team also showed that the speed of processing an algorithm in T-GPS is up to 43 times faster than the conventional approach. This is because T-GPS has no network communication overhead, while the conventional approach has a lot of communication overhead among computers.
Professor Kim believes that this work will have a large impact on the IT industry where almost every area utilizes graph data, adding, “T-GPS can significantly increase both the scale and efficiency of developing a new graph algorithm.”
This work was supported by the National Research Foundation (NRF) of Korea and Institute of Information & communications Technology Planning & Evaluation (IITP).
Publication:
Park, H., et al. (2021) “Trillion-scale Graph Processing Simulation based on Top-Down Graph Upscaling,” Presented at the IEEE ICDE 2021 (April 19-22, 2021, Chania, Greece)
Profile:
Min-Soo Kim
Associate Professor
School of Computing
KAIST
<(From Left) Dr. Geun-Hee Lee, Professor Kyung-Jin Lee, Professor Kyoung-Whan Kim> Research is actively underway to develop a “dream memory” that can reduce heat generation in smartphones and laptops while delivering faster performance and lower power consumption. Korean researchers have now proposed a new possibility for controlling magnetism using the exchange interaction of electron orbitals—the motion of electrons orbiting around an atomic nucleus—rather than
2026-03-16< (From left) Ph.D candidate Changhwan Kim, Ph.D candidate Seunghwan Kim , Ph.D candidate Namwook Hur, Professor Joonki Suh, Ph. D candidate Youngseok Cho> As artificial intelligence advances, computers demand faster and more efficient memory. The key to ultra-high-speed, low-power semiconductors lies in the "switching" principle—the mechanism by which memory materials turn electricity on and off. A South Korean research team has successfully captured the elusive moment of switchi
2026-02-09<Professor Kyung Cheol Choi, Dr. Byeongju Noh, Ph.D candidate Young-Hun Jung, Ph.D candidate Minwoo Park, Dr.Ja Wook Koo, Researcher Jiyun Lee, Researcher Ji-Eun Lee, Dr. Hyang Sook Hoe, Dr. Hyun-Ju Lee, Dr. Sora Kang, Researcher Seokjun Oh> A Korean research team, raising the question “Which OLED light color can actually improve memory and pathological markers in Alzheimer’s patients?”, has identified the most effective OLED color capable of enhancing cognitive fun
2025-11-24Fear memories can form in the brain following exposure to threatening situations such as natural disasters, accidents, or violence. When these memories become excessive or distorted, they can lead to severe mental health disorders, including post-traumatic stress disorder (PTSD), anxiety disorders, and depression. However, the mechanisms underlying fear memory formation triggered by affective pain rather than direct physical pain have remained largely unexplored – until now. A KAIST resea
2025-05-15A team of Korean researchers is making headlines by developing a new memory device that can be used to replace existing memory or used in implementing neuromorphic computing for next-generation artificial intelligence hardware for its low processing costs and its ultra-low power consumption. KAIST (President Kwang-Hyung Lee) announced on April 4th that Professor Shinhyun Choi's research team in the School of Electrical Engineering has developed a next-generation phase change memory* device fe
2024-04-04