Panmnesia had the pleasure of being invited by Professor Byung-Gon Chun of Seoul National University, who is also the CEO of FriendliAI, to present at his seminar. During our presentation, we shed light on two significant applications of the advanced Compute Express Link (CXL) technology.
Firstly, we discussed a combined software-hardware system that accelerates approximate nearest neighbor search. This solution uses CXL to disaggregate memory from host resources and optimizes search performance despite the longer access latency of CXL-attached memory. It employs innovative strategies and makes full use of the available hardware, outperforming current platforms in query latency.
Secondly, we presented a robust system for managing large recommendation datasets. It harnesses CXL to integrate persistent memory and graphics processing units seamlessly, enabling direct memory access without software intervention. With a sophisticated checkpointing technique that relaxes the update sequence of model parameters and embeddings across training batches, we’ve seen a significant boost in training performance along with substantial energy savings.
We’re looking forward to more such opportunities! Keep an eye out for our future endeavors!
Panmnesia, Inc., the industry leader in data memory and storage solutions, is thrilled to announce that their cutting-edge SSD technology, Panmnesia’s ExPAND, has been accepted for presentation at the prestigious HotStorage 2023 conference, to be held in Boston this July.
Panmnesia’s ExPAND is a groundbreaking development that integrates Compute Express Link (CXL) with SSDs, allowing for scalable access to large memory. Although CXL-SSDs currently operate at slower speeds than DRAM, ExPAND mitigates this by offloading last-level cache (LLC) prefetching from the host CPU to the CXL-SSDs, which dramatically improves performance.
The cornerstone of Panmnesia’s ExPAND is its novel use of a heterogeneous prediction algorithm for prefetching, which ensures data consistency with CXL sub-protocols. The system also allows for detailed examination of prefetch timeliness, paving the way for more accurate latency estimations.
Furthermore, ExPAND’s unique feature lies in its awareness of CXL multi-tiered switching, providing end-to-end latency for each CXL-SSD as well as precise prefetch timeliness estimations. This breakthrough reduces reliance on slow CXL-SSD accesses and enables most data requests to be served directly from the host cache.
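ExPAND’s heterogeneous prediction algorithm is detailed in the paper itself; as a rough intuition for how history-based prefetch prediction works, here is a toy stride detector (our own simplified illustration, not Panmnesia’s actual algorithm) that predicts future addresses from recent access history:

```python
from collections import deque

class StridePrefetcher:
    """Toy stride predictor: if the last few accesses follow a constant
    stride, predict the next addresses. Real device-side prefetchers are
    far more sophisticated; this only illustrates the basic idea."""

    def __init__(self, history: int = 3, degree: int = 2):
        self.addrs = deque(maxlen=history)
        self.degree = degree  # how many future addresses to prefetch

    def observe(self, addr: int) -> list:
        self.addrs.append(addr)
        if len(self.addrs) < self.addrs.maxlen:
            return []  # not enough history yet
        strides = [b - a for a, b in zip(self.addrs, list(self.addrs)[1:])]
        if len(set(strides)) == 1 and strides[0] != 0:
            s = strides[0]  # constant stride detected
            return [addr + s * i for i in range(1, self.degree + 1)]
        return []  # irregular pattern: issue no prefetch

pf = StridePrefetcher()
for a in (0x1000, 0x1040, 0x1080):
    preds = pf.observe(a)
print([hex(p) for p in preds])  # ['0x10c0', '0x1100']
```

Offloading this kind of prediction to the device side frees host CPU cycles and lets the predictor run close to where the data actually lives.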
Early tests are already showing promise: Panmnesia’s ExPAND has been shown to enhance the performance of graph applications by a staggering 2.8 times, surpassing CXL-SSD pools with a variety of prefetching strategies.
ExPAND’s acceptance into HotStorage 2023 underscores its potential to revolutionize the way we manage and access data. With ExPAND, Panmnesia, Inc. continues to solidify its reputation as a frontrunner in data storage solutions, pushing the boundaries of technology and driving the industry forward.
The full details of ExPAND and its advantages will be presented at HotStorage 2023. Stay tuned for more exciting updates from the conference.
Interested in discovering the potential applications of the CXL solution?
We invite you to explore a recent white paper detailing Panmnesia’s CXL innovation, known as Hearst, which significantly enhances AI-driven applications (such as recommendation systems and vector search) through exceptional performance and cost efficiency.
In summary, Hearst’s distinguishing features include (1) a port-based routing (PBR) CXL switch for high scalability, (2) near-data processing for high performance, and (3) DIMM pooling technology for low cost. We hope you find this information helpful and enjoyable to read.
We are thrilled to announce that Panmnesia’s groundbreaking research has been recognized by the esteemed USENIX ATC’23, boasting a competitive 18% acceptance rate.
Our study explores the acceleration of approximate nearest neighbor search (ANNS) services, handling billion-point datasets through innovative CXL memory pooling and coherence network topology. This approach satisfies the substantial memory requirements of ANNS: Microsoft’s search engines (Bing and Outlook) demand over 40TB for 100B+ vectors, each described by 100 dimensions, and Alibaba’s e-commerce platforms likewise require TB-scale memory for their 2B+ vectors (128 dimensions).
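As a quick sanity check on those figures (assuming 4-byte float32 vector components, an assumption on our part, and ignoring index overhead), the raw vector storage alone already reaches tens of terabytes:

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> int:
    """Raw storage for a dense vector dataset (index structures excluded)."""
    return num_vectors * dims * bytes_per_dim

# Bing/Outlook scale: 100B vectors x 100 dims x 4 B = 40 TB
bing = raw_vector_bytes(100_000_000_000, 100)
print(bing / 1e12)  # 40.0 (TB)

# Alibaba scale: 2B vectors x 128 dims x 4 B ~= 1 TB
alibaba = raw_vector_bytes(2_000_000_000, 128)
print(alibaba / 1e12)  # 1.024 (TB)
```

Capacities of this magnitude cannot fit in a single host’s DRAM, which is exactly why pooled CXL memory is attractive for ANNS.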
Join us in Boston this July to learn more about our exciting findings!
Get ready for a revolution in AI acceleration with Panmnesia’s CXL-augmented system - making waves in StorageNewsletter!
Our groundbreaking technology empowers GPUs to unleash a massive memory space of up to 4 Petabytes, supercharging the training of colossal machine learning models like recommendation systems and ChatGPT using just a few GPUs!
Embrace substantial cost savings and push the boundaries of AI capabilities for your business! Don’t miss out - explore our trailblazing tech via the following link.
Panmnesia’s research team has developed an efficient machine learning framework designed for large-scale data training in datacenters, utilizing CXL 3.0 technology. Our recent achievements have been highlighted in Forbes, which can be accessed via the first link below.
To gain a deeper understanding of our CXL 3.0-based training system, we recommend watching our informative video presentation via the second link below. The video demonstrates how CXL can enhance large-scale machine learning acceleration and provides insights for ML and AI system designers to consider when implementing this technology.
We are excited to announce that our team will be showcasing this innovative system at the IPDPS ’23 conference in Florida this May. We hope to see you there and look forward to discussing our advancements with you in person!
Our team has successfully expanded memory by over a dozen terabytes, making it the largest CXL (Compute Express Link) hardware and software solution in the world. It’s designed to work seamlessly with multi-CPUs, CXL switches, and CXL memory expanders, offering a versatile option that can be easily integrated into any tech setup.
We’re excited to share that we’ve successfully validated the system with the latest Linux full stack and have proven its capabilities in large-scale deep learning and machine learning-based recommendation systems. During our demonstration, we were able to assign all embedding features to the remote-side CXL memory expansion without the need for an assistant module. Our software and hardware IPs are seamlessly integrated with TensorFlow, which allows for effective management of training and servicing with ease.
We’re committed to pushing the boundaries of innovation and are eager to showcase the range of state-of-the-art, memory-intensive applications running on our system. We’re also excited to introduce an array of new hardware-architected devices in the pipeline. Stay connected for further updates on our exciting new innovations!
We’re open to exploring collaboration opportunities and would love to work with you. Please don’t hesitate to get in touch with us at [email protected] or [email protected] if you’re interested. Thanks for being a part of our journey!
Panmnesia, a KAIST start-up company that focuses on developing cache coherent interconnect (CCI) technologies using Compute Express Link (CXL), has garnered significant attention from the technology industry, particularly in the field of computing and memory. Seoul Finance recently published an article that highlights Panmnesia as the first developer of a real CXL system with both working hardware and software modules. The article includes an interview with Panmnesia’s CEO, who explains the significance of CXL and how it could impact various computing and memory industries.
CXL is a new cache coherent interconnect standard that supports multiple protocols and enables cache coherent communication between CPUs, memory devices, and accelerators. According to Panmnesia’s CEO, CXL is the most important technology in the next-generation memory semiconductor industry. The article explains that as the demand for high-performance, high-capacity memory semiconductors grows, CXL could be a viable solution for memory disaggregation, allowing memory resources to be utilized efficiently and cost-effectively.
The article also discusses the potential impact of CXL on artificial intelligence (AI) acceleration and ChatGPT (conversational AI based on the GPT model). Panmnesia’s CEO believes that CXL can play a crucial role in AI acceleration by offering large-scale memory subsystems to existing processing silicon and co-processing accelerators such as GPUs, DPUs, and TPUs. This will enable faster data processing and more efficient use of resources.
Panmnesia has already secured a significant number of intellectual property rights related to CXL, including the CXL standard IP, and is currently in talks with various domestic and international institutions for collaboration. Panmnesia aims to become the leading solution provider in the CXL industry and is focused on developing new and innovative CCI technologies to meet the demands of various industries.
We had the privilege of attending a dynamic CXL workshop, where we were invited to share our knowledge and expertise as a leading vendor in CPU and memory technology. The workshop brought together industry experts from renowned companies such as Intel, AMD, Samsung, SK-Hynix, and our own company, Panmnesia. We were thrilled to participate in this high-level discussion, which explored the latest trends and innovations in CXL technology and its impact on the future of large-scale data-centric applications.
As a participant in the workshop, we had the opportunity to give an invited talk and share our insights and perspectives on the direction that data-centric applications should take. We highlighted the importance of cutting-edge research directions such as near-data processing and AI acceleration, and how they can be leveraged to drive innovation and progress in this field.
The discussions throughout the workshop were engaging and thought-provoking, and we were impressed by the wealth of knowledge and expertise shared by the attendees. We gained valuable insights into the latest trends and innovations in CXL technology, which will undoubtedly shape our thinking and approach to data-centric applications in the future.
Overall, we were honored to have been a part of this workshop and grateful for the opportunity to share our knowledge and experience with other industry leaders. We look forward to continued engagement with CXL technology and its impact on the future of large-scale data-centric applications.
We are thrilled to share our recent participation in the prestigious Computer System Society 2023 conference, where we had the honor of delivering an invited talk entitled “Memory Pooling over CXL”. Our talk was met with great enthusiasm from attendees, as we presented our history of development and research for CXL and memory expander technology from 2015 to 2023, including some specific use-cases. We also introduced a cutting-edge memory pooling stack and showcased a CXL 3.0 memory expander device reference board that we built from the ground up.
As a leading vendor in CXL and memory technology, we are committed to driving innovation and progress in this field. We are particularly excited about our new memory pooling stack, which we believe has the potential to revolutionize the industry by providing a more efficient and cost-effective way to manage memory resources. Our CXL 3.0 memory expander device reference board is also a testament to our commitment to cutting-edge research and development, as we continue to push the boundaries of what is possible in this field.
In addition to our talk at the Computer System Society 2023 conference, we are also honored to be participating in a closed CXL working symposium later this month, alongside industry leaders such as AMD, Intel, Samsung, and SK-Hynix. We look forward to sharing more technical details about our latest developments and discussing our vision for the future of CXL technology and its impact on data-centric applications.
Overall, our participation in the Computer System Society 2023 conference was a tremendous success, and we are excited to continue to drive innovation and progress in this field. We remain committed to pushing the boundaries of what is possible in CPU and memory technology, and we look forward to sharing more updates and insights in the future.
We are thrilled to announce the publication of our latest research paper, which proposes a cutting-edge solution called TrainingCXL for efficiently processing large-scale recommendation datasets in the pool of disaggregated memory. The paper, which will soon be available in IEEE Micro Magazine, introduces several innovative techniques that make training fault-tolerant with low overhead.
TrainingCXL integrates persistent memory (PMEM) and GPUs into a cache-coherent domain as CXL Type 2 devices, which enables PMEM to be placed directly in the GPU’s memory hierarchy, allowing the GPU to access PMEM without software intervention. The solution introduces computing and checkpointing logic near the CXL controller, thereby actively managing persistency and training data. Considering PMEM’s vulnerability, TrainingCXL exploits the unique characteristics of recommendation models and takes the checkpointing overhead off the critical path of their training. Lastly, the solution employs an advanced checkpointing technique that relaxes the update sequence of model parameters and embeddings across training batches.
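To illustrate the spirit of taking checkpointing off the critical path, here is a minimal sketch in which batch N’s updates are persisted in the background while batch N+1 trains. The queue and the `flush_to_pmem` worker are hypothetical stand-ins of ours, not TrainingCXL’s actual near-CXL checkpointing logic:

```python
import queue
import threading

ckpt_q: queue.Queue = queue.Queue()
persisted = []  # stand-in for durable PMEM state

def flush_to_pmem():
    """Background flusher: drains checkpoint requests off the training path."""
    while True:
        item = ckpt_q.get()
        if item is None:  # shutdown sentinel
            break
        persisted.append(item)  # stand-in for a PMEM write
        ckpt_q.task_done()

flusher = threading.Thread(target=flush_to_pmem, daemon=True)
flusher.start()

def train_step(batch_id: int, params: int) -> int:
    params = params + 1                # stand-in for a gradient update
    ckpt_q.put((batch_id, params))     # enqueue; do NOT wait for durability
    return params                      # training proceeds immediately

params = 0
for batch in range(4):
    params = train_step(batch, params)

ckpt_q.join()   # drain outstanding checkpoints at the end
ckpt_q.put(None)
print(persisted)  # [(0, 1), (1, 2), (2, 3), (3, 4)]
```

The key property is that `train_step` never blocks on persistence; durability for batch N is allowed to lag behind the computation of batch N+1, mirroring the relaxed update sequence described above.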
The evaluation of TrainingCXL shows that it achieves a remarkable 5.2x training performance improvement and 76% energy savings, compared to modern PMEM-based recommendation systems. These results demonstrate the potential of TrainingCXL to revolutionize the field of large-scale recommendation datasets by providing a more efficient, cost-effective, and fault-tolerant solution.
We are proud to have developed this innovative solution and excited to share it with the community. This research builds on our commitment to pushing the boundaries of what is possible in CPU and memory technology, and we look forward to continued innovation and progress in this field.
We are excited to announce the publication of our latest research paper in IEEE Micro Magazine, which proposes a groundbreaking solution for memory disaggregation using Compute Express Link (CXL). CXL has recently garnered significant attention in the industry due to its exceptional hardware heterogeneity management and resource disaggregation capabilities. While there is currently no commercially available product or platform integrating CXL into memory pooling, it is widely expected to revolutionize the way memory resources are practically and efficiently disaggregated.
Our paper introduces the concept of directly accessible memory disaggregation, which establishes a direct connection between a host processor complex and remote memory resources over CXL’s memory protocol (CXL.mem). This innovative approach overcomes the limitations of traditional memory disaggregation techniques, such as high overhead and low efficiency. By utilizing CXL.mem, we achieve a highly efficient, low-latency, and high-bandwidth memory disaggregation solution that has the potential to revolutionize the industry.
The evaluation of our solution demonstrates its remarkable performance, as it achieves a significant reduction in memory access latency and a substantial increase in bandwidth compared to traditional disaggregation techniques. These results demonstrate the potential of directly accessible memory disaggregation over CXL as a game-changing solution for future memory systems.
We are proud to have developed this innovative solution and excited to share it with the community. This research builds on our commitment to pushing the boundaries of what is possible in CPU and memory technology, and we look forward to continued innovation and progress in this field.
We were honored to give a keynote address at the Process-in-Memory and AI-Semiconductor Strategic Technology Symposium 2022, hosted by two major ministries in Korea. Our talk, entitled “Why CXL?”, was a deep dive into the unique capabilities and potential of Compute Express Link (CXL) technology, and explored why it is poised to be one of the key technologies in hyper-scale computing.
As a leading vendor in CPU and memory technology, we are committed to driving innovation and progress in this field, and we were thrilled to share our knowledge and experience with the attendees at the symposium. Our keynote address offered several insights into the benefits of CXL, as well as the differences between CXL 1.1, 2.0, and 3.0, providing a comprehensive understanding of this groundbreaking technology.
In contrast to our distinguished lecture at SC’22, which focused on heterogeneous computing over CXL, this keynote honed in on CXL memory expanders and the corresponding infrastructure that most memory/controller vendors are currently missing. We discussed the challenges that vendors face in adopting this technology, and shared our expertise on how to overcome them. Our talk was well-received by attendees, and we were honored to have had the opportunity to share our insights and perspectives on the future of CXL technologies.
As a company, we remain committed to pushing the boundaries of what is possible in CPU and memory technology, and we believe that CXL has the potential to revolutionize the industry. We welcome any inquiries or questions about CXL technology, and we look forward to continued engagement with industry leaders and experts. We are excited to see how this technology will shape the future of hyper-scale computing and data-centric applications, and we are dedicated to driving progress and innovation in this field.
We are thrilled to share our recent experience at SC’22, where we had the opportunity to engage with industry leaders and experts in the field of high-performance computing (HPC) and Compute Express Link (CXL) technology. Our participation in the CXL consortium co-chair meeting, which was led by AMD and Intel, was a highlight of the event. The panel meeting, which included representatives from Microsoft, Intel, and LBNL, was also an excellent opportunity to discuss the challenges and opportunities that HPC needs to address in order to adopt CXL technology.
Our distinguished lecture at SC’22 was a great success, where we demonstrated the entire design of the CXL switch and a CXL 3.0 system integrating a truly composable architecture. We showcased a new opportunity to connect all heterogeneous computing systems and HPC, including multiple AI vendors and data processing accelerators, and integrate them into a single pool. Our presentation also highlighted the capabilities of CXL 3.0 as a rack-scale interconnect technology, including back-invalidation, a cache coherence engine, and CXL fabric architecture.
We are excited about the potential of CXL technology to revolutionize the industry and provide more efficient and effective solutions for HPC. We hope to have the opportunity to share more about our vision and CXL 3.0 prototypes at future events and venues in the near future. As a company, we remain committed to driving innovation and progress in CPU and memory technology, and we look forward to continued engagement with industry leaders and experts in this field.
Compute Express Link (CXL) technology has been generating significant attention in recent years, thanks to its exceptional hardware heterogeneity management and resource disaggregation capabilities. While there is currently no commercially available product or platform integrating CXL 2.0/3.0 into memory pooling, it is expected to revolutionize the way memory resources are practically and efficiently disaggregated.
In our upcoming distinguished lecture, we will delve into why existing computing and memory resources require a new interface for cache coherence and explain how CXL can put different types of resources into a disaggregated pool. We will showcase two real system examples, including a CXL 2.0-based end-to-end system that directly connects a host processor complex and remote memory resources over CXL’s memory protocol, and a CXL-integrated storage expansion system prototype.
In addition to these real-world examples, we will also introduce a set of hardware prototypes designed to support the future CXL system (CXL 3.0) as part of our ongoing project. We are excited to share our expertise and insights with attendees, and we hope to inspire others to push the boundaries of what is possible in CPU and memory technology.
Our distinguished lecture will offer a comprehensive understanding of the capabilities and potential of CXL technology, and we are excited to share our vision with the community. We are committed to driving innovation and progress in this field and believe that CXL has the potential to revolutionize the industry. We encourage attendees to join us at the event in Dallas, Texas, this November, and we look forward to engaging with industry leaders and experts in the field. The program for the event will be available soon, so stay tuned for more details!
We are excited to announce that our team has been invited to demonstrate two hot topics related to CXL memory disaggregation and storage pooling at the CXL Forum 2022, one of the hottest sessions at Flash Memory Summit. This session, led by the CXL Consortium and MemVerge, brings together industry leaders and experts to discuss the latest updates and advancements in CXL technology. We will be joining other leading companies and institutions, including ARM, Lenovo, Kioxia, and the University of Michigan TPP team, among others.
Our sessions will take place on August 2nd at 4:40 PM PT, and we are eager to share our insights and expertise with attendees. In the first session, entitled “CXL-SSD: Expanding PCIe Storage as Working Memory over CXL,” we will argue that CXL is a cost-effective and practical interconnect technology that can bridge PCIe storage’s block semantics to memory-compatible byte semantics. We will discuss the mechanisms that make PCIe storage impractical for use as a memory expander and explore all the CXL device types and their protocol interfaces to determine the best configuration for expanding the host’s CPU memory.
In the second session, we will demonstrate our CXL 2.0 end-to-end system, which includes the CXL network and memory expanders. This system showcases our expertise in CXL technology and provides attendees with a firsthand look at the potential of this groundbreaking technology.
We are honored to have the opportunity to participate in the CXL Forum 2022 and to share our insights and expertise with industry leaders and experts. We encourage attendees to check out the detailed information on the session and how to register via Zoom through the link provided. We look forward to engaging with the community and pushing the boundaries of what is possible in CPU and memory technology.
The Compute Express Link (CXL) protocol has been making waves in the computing industry, with various organizations exploring its potential to create tiered, disaggregated, and composable main memory for systems. While hyperscalers and cloud builders have been the early adopters of this technology, high-performance computing (HPC) centers are also starting to show interest.
The Korea Advanced Institute of Science and Technology (KAIST) has been working on a promising solution called DirectCXL, which enables memory disaggregation and composition using the CXL 2.0 protocol atop the PCI-Express bus and a PCI-Express switching complex. Recently, DirectCXL was showcased in a paper presented at the USENIX Annual Technical Conference and discussed in a brochure and a short video.
This solution promises to revolutionize the field of HPC by offering practical and cost-effective memory expansion. Like Meta Platforms’ Transparent Page Placement protocol and Chameleon memory tracking, and Microsoft’s zNUMA memory project, DirectCXL creates tiered, disaggregated, and composable main memory for systems.
We are pleased to announce that Panmnesia, Inc has acquired the intellectual property rights for DirectCXL from KAIST. As a company, we are excited to see the progress being made in the field of memory disaggregation and composition, particularly in the HPC space. We believe that CXL has the potential to drive innovation and progress in CPU and memory technology, and we are committed to leveraging this technology to create groundbreaking solutions for our customers.
Stay tuned for more updates and developments as we continue to explore the potential of CXL and DirectCXL.
We are proud to announce that our cutting-edge Compute Express Link (CXL) based solid-state drives (CXL-SSD) have been successfully demonstrated at the 2022 ACM Workshop on Hot Topics in Storage and File Systems (HotStorage’22). The workshop, which focuses on research and developments in storage systems, was a perfect opportunity to showcase the innovative capabilities of our CXL-SSD.
CXL is an open, multi-protocol interconnect standard that supports cache coherence across various processors, accelerators, and memory device types. While it primarily manages data coherency between CPU memory spaces and memory on attached devices, we believe that it can also be useful in transforming existing block storage into cost-efficient, large-scale working memory.
Our presentation at HotStorage’22 examined three different sub-protocols of CXL from a memory expander viewpoint. We suggested the device type that can be the best option for PCIe storage to bridge its block semantics to memory-compatible, byte semantics. Additionally, we discussed how to integrate a storage-integrated memory expander into an existing system and speculated on its impact on system performance. Lastly, we explored various CXL network topologies and the potential of a new opportunity to efficiently manage the storage-integrated, CXL-based memory expansion.
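As a toy illustration of bridging block semantics to byte semantics (a simplified model of ours, not the actual CXL-SSD design), a device can serve byte-granular loads out of an internal cache of whole blocks, fetching the containing block from the flash media only on a miss:

```python
class BlockToByteBridge:
    """Toy model: byte loads are served from a cache of whole blocks,
    hiding the underlying storage's block-granular access behind a
    byte-addressable interface."""

    BLOCK = 4096  # typical storage block size in bytes

    def __init__(self, storage: bytes):
        self.storage = storage        # stand-in for the flash media
        self.cache = {}               # block number -> cached block data
        self.misses = 0

    def load_byte(self, addr: int) -> int:
        blk = addr // self.BLOCK
        if blk not in self.cache:
            self.misses += 1          # block-granular fetch from media
            start = blk * self.BLOCK
            self.cache[blk] = self.storage[start:start + self.BLOCK]
        return self.cache[blk][addr % self.BLOCK]

media = bytes(range(256)) * 64        # 16 KiB of backing "flash"
dev = BlockToByteBridge(media)
dev.load_byte(10)     # miss: fetches block 0
dev.load_byte(11)     # hit: same block, no media access
dev.load_byte(5000)   # miss: block 1
print(dev.misses)  # 2
```

Spatial locality makes this work: once a block is resident, subsequent byte loads to the same block cost nothing extra, which is the same intuition behind exposing PCIe storage as CXL-attached working memory.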
Our demonstration at HotStorage’22 highlighted the significant advancements in CXL technology and how it can transform storage and memory systems. CXL-SSD is a revolutionary solution that enables large-scale working memory to be created cost-effectively by leveraging existing block storage. This breakthrough will revolutionize storage and memory systems, particularly in the context of high-performance computing.
Panmnesia, Inc is committed to developing innovative solutions that leverage the potential of CXL technology. Our CXL-SSD is just one example of the groundbreaking solutions that we are creating to meet the needs of our customers. We believe that CXL is the future of CPU and memory technology, and we are excited to be at the forefront of this exciting new frontier.
Stay tuned for more updates as we continue to explore the potential of CXL technology and its impact on storage and memory systems.
We are thrilled to share that our team has successfully proposed and demonstrated a new cache coherent interconnect called DirectCXL, which directly connects a host processor complex and remote memory resources over CXL’s memory protocol (CXL.mem). In a paper published in USENIX ATC, we explore several practical designs for CXL-based memory disaggregation and make them real, offering a new approach to utilizing memory resources that is expected to be more efficient than ever before.
DirectCXL eliminates the need for data copies between the host memory and remote memory, which allows for the true performance of remote-side disaggregated memory resources to be exposed to the users. Additionally, we have created a CXL software runtime that enables users to utilize the underlying disaggregated memory resources through sheer load/store instructions, as there is no operating system currently supporting CXL.
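To give a feel for what accessing disaggregated memory through sheer load/store instructions means for an application, here is a minimal Python sketch. The anonymous `mmap` below is a stand-in of ours: a real deployment would map the region exposed by the CXL memory driver rather than anonymous memory.

```python
import mmap

# Once remote CXL memory is mapped into the process address space,
# applications use it with ordinary loads and stores (plain slice
# reads/writes here) -- no page faults to a remote pager, no explicit
# copies between host memory and remote memory.
CXL_REGION_SIZE = 1 << 20  # pretend 1 MiB window of remote CXL memory

region = mmap.mmap(-1, CXL_REGION_SIZE)  # anonymous mapping as stand-in

region[0:5] = b"hello"        # "store": write directly into the region
data = bytes(region[0:5])     # "load": read it back, no bounce buffer
print(data)  # b'hello'

region.close()
```

The point of the CXL software runtime is that everything between the `mmap` and the slice accesses is ordinary virtual memory machinery; the application code is unchanged whether the backing pages are local DRAM or a remote CXL memory node.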
Our paper was accepted to USENIX ATC, which had an acceptance rate of only 16%, a testament to the significance of our proposal and its potential impact on the industry. We are excited to continue our research and development in this area and look forward to sharing more updates in the near future.
News (Korean):
As the big data era arrives and data-intensive workloads continue to proliferate, resource disaggregation has become an increasingly important area of research and development. By separating processors and storage devices from one another, disaggregated resource architectures can offer greater scale-out capabilities, increased cost efficiency, and transparent elasticity, breaking down the physical boundaries that have traditionally constrained data center and high-performance computing environments.
However, achieving memory disaggregation that supports high performance and scalability with low cost is a non-trivial task. Many industry prototypes and academic studies have explored a wide range of approaches to realize memory disaggregation, but to date, the concept has yet to be fully realized due to several fundamental challenges.
That’s where our team at Panmnesia (a KAIST startup) comes in. We have developed a world-first proof-of-concept CXL solution that directly connects a host processor complex and remote memory resources over the Compute Express Link (CXL) protocol. Our solution framework includes a set of CXL hardware and software IPs, such as a CXL switch, processor complex IP, and CXL memory controller, which completely decouple memory resources from computing resources and enable high-performance, fully scale-out memory disaggregation architectures.
Our CXL solution framework provides several key benefits over traditional memory disaggregation approaches. For example, it enables greater resource utilization and efficiency by making it possible to more easily share and allocate memory resources across multiple nodes. Additionally, our framework can support greater scalability, allowing organizations to expand their memory resources as their needs grow without incurring significant costs or disruptions.
Overall, we believe that our CXL solution represents a major breakthrough in the field of resource disaggregation and memory disaggregation in particular. By providing a more efficient, cost-effective, and scalable approach to memory disaggregation, we are helping organizations to fully leverage the power of their data-intensive workloads and realize the full potential of their computing environments.