Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search

Junhyeok Jang, Hanjin Choi, Hanyeoreum Bae, Seungjun Lee, Miryeong Kwon, Myoungsoo Jung

ACM Transactions on Storage

2024

Research Areas
Operating Systems
Architecture
Machine Learning
Coherent Interconnect

Abstract

We propose CXL-ANNS, a software-hardware collaborative approach to enable scalable approximate nearest neighbor search (ANNS) services. To this end, we first disaggregate DRAM from the host via Compute Express Link (CXL) and place all essential datasets into its memory pool. While this CXL memory pool allows ANNS to handle billion-point graphs without accuracy loss, we observe that search performance degrades significantly because of CXL's far-memory-like characteristics. To address this, CXL-ANNS considers node-level relationships and caches in local memory the neighbors that are expected to be visited most frequently. For uncached nodes, CXL-ANNS prefetches the set of nodes most likely to be visited soon, based on the graph traversal behavior of ANNS. CXL-ANNS is also aware of the architectural structure of the CXL interconnect network and lets different hardware components collaborate on the search. Furthermore, it relaxes the execution dependencies among neighbor search tasks, allowing ANNS to use all hardware in the CXL network in parallel. Our evaluation shows that CXL-ANNS exhibits 93.3% lower query latency than the state-of-the-art ANNS platforms that we tested. CXL-ANNS also outperforms an oracle ANNS system with unlimited local DRAM capacity by 68.0% in terms of latency.
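To make the local-caching idea in the abstract concrete, the following is a minimal sketch and not the CXL-ANNS implementation. It assumes, purely for illustration, that nodes fewer hops away from the search entry point are visited more often and should therefore be kept in local memory. The abstract does not specify the selection policy, so this proxy is an assumption. All names (`hops_from_entry`, `choose_local_cache`, `search`) are hypothetical, and "far" fetches merely stand in for accesses to the CXL memory pool.

```python
import heapq
from collections import deque


def hops_from_entry(graph, entry):
    """Breadth-first hop count from the entry node to each reachable node."""
    dist = {entry: 0}
    queue = deque([entry])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def choose_local_cache(graph, entry, capacity):
    """Pick `capacity` nodes for local memory.

    Assumption (not from the source): fewer hops from the entry node
    is used as a proxy for "visited most frequently".
    """
    dist = hops_from_entry(graph, entry)
    ranked = sorted(dist, key=lambda n: (dist[n], n))
    return set(ranked[:capacity])


def search(graph, vectors, entry, query, k, cache):
    """Best-first graph search that counts local vs. far (pool) fetches."""
    stats = {"local": 0, "far": 0}

    def fetch(n):
        # Every distance computation needs the node's vector, which lives
        # either in the local cache or in the (simulated) far memory pool.
        stats["local" if n in cache else "far"] += 1
        return sum((a - b) ** 2 for a, b in zip(vectors[n], query))

    visited = {entry}
    frontier = [(fetch(entry), entry)]  # min-heap ordered by distance
    evaluated = list(frontier)          # every (distance, node) seen so far
    while frontier:
        dist_u, u = heapq.heappop(frontier)
        best = heapq.nsmallest(k, evaluated)
        # Stop once the closest unexpanded node cannot improve the top-k.
        if len(best) == k and dist_u > best[-1][0]:
            break
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                item = (fetch(v), v)
                heapq.heappush(frontier, item)
                evaluated.append(item)
    return [n for _, n in heapq.nsmallest(k, evaluated)], stats


# Toy example: a six-node chain with one-dimensional vectors.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
vectors = {n: [float(n)] for n in graph}
cache = choose_local_cache(graph, entry=0, capacity=3)
result, stats = search(graph, vectors, entry=0, query=[4.2], k=1, cache=cache)
```

In this toy run, every query starts at the same entry node, so the first hops of the traversal are served from the local cache and only the later hops reach the far pool. Prefetching of uncached nodes and parallel use of hardware in the CXL network, both described in the abstract, are outside the scope of this sketch.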


Related Publications
Featured
CXL Topology-Aware and Expander-Driven Prefetching: Unlocking SSD Performance, IEEE Micro, 2025
Coherent Interconnect
Machine Learning
CXL-ANNS: Software-Hardware Collaborative Memory Disaggregation and Computation for Billion-Scale Approximate Nearest Neighbor Search, USENIX Annual Technical Conference (ATC), 2023
Operating Systems
Architecture
Failure Tolerant Training with Persistent Memory Disaggregation over CXL, IEEE Micro, 2023
Architecture
Operating Systems