Keywords:

Retrieval-Augmented Generation, Compute-in-Memory, Edge LLM, noise-aware training, contrastive learning, external memory, non-volatile crossbars.

Architectural Features of Extended Retrieval Generation with External Memory

Authors

Gartman Ievgen
CEO and founder, Bridge.Digital, Austin, TX

Abstract

This article examines the RoCR framework, a Retrieval-Augmented Generation (RAG) system optimized for edge deployment in latency-sensitive environments such as real-time search, product recommendation, and dynamic content generation on eCommerce platforms. RoCR uses Compute-in-Memory (CiM) architectures to enable fast, energy-efficient inference at scale. At the core of the solution is the CiM-Retriever, a module optimized for maximum inner product search (MIPS). Two architectural variants of the generator are analyzed: decoder-only (RA-T) and encoder–decoder with kNN cross-attention. Both demonstrate improved accuracy across various tasks while remaining scalable to millions of documents. The aim of this study is to analyze the architectural characteristics of RAG systems enhanced with external memory modules, focusing on their applicability to eCommerce-scale tasks that require sub-second response times and contextual relevance. The methodology is a review of recent scientific publications, which allows an in-depth exploration of the system-level design of memory-augmented RAG solutions. The findings are most relevant to AI practitioners and system architects building scalable, high-performance retrieval systems for domains such as personalized retail, product search, and dynamic user engagement optimization. They are also of interest to hardware-software co-design specialists and architects of scalable distributed platforms who integrate external memory modules into cognitive and neural network applications.

Article Details

Published

2025-06-13

Section

Articles

License

Copyright (c) 2025 International Journal of Engineering and Computer Science

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

How to Cite

Architectural Features of Extended Retrieval Generation with External Memory. (2025). International Journal of Engineering and Computer Science, 14(06), 27355-27361. https://doi.org/10.18535/ijecs.v14i06.5163