Publications

Redefining PIM Architecture with Compact and Power-Efficient Microscaling

2025 International Conference on Electronics, Information, and Communication (ICEIC)

  • Yoonho Jang

  • Hyeongjun Cho

  • Seokin Hong

Abstract

With advances in neural network technology, Processing-In-Memory (PIM) has emerged as a solution to performance bottlenecks between processors and memory. Among various PIM design techniques, integrating processing units within memory banks has demonstrated high performance in accelerating neural network models. However, prior architectures based on this approach sacrifice a large portion of the cell array to accommodate complex floating-point processing units. In this paper, we propose a new architecture that incorporates a quantization technique called microscaling, which efficiently converts high-precision data to integer types with minimal accuracy loss during model inference. By adopting microscaling, we reduce processing-unit overhead by replacing floating-point units with integer-based ones. As a result, our approach achieves a 50% reduction in area while maintaining equivalent performance, and reduces energy consumption to approximately 70% of that in prior architectures.
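The core idea behind microscaling is that a small block of high-precision values shares a single power-of-two scale factor, while each element is stored as a narrow integer. The sketch below illustrates this principle in plain Python; the function names, block contents, and 8-bit element width are illustrative assumptions, not details taken from the paper.

```python
import math

def mx_quantize(block, elem_bits=8):
    """Microscaling sketch: one shared power-of-two scale per block,
    each element stored as a small signed integer (illustrative only)."""
    qmax = 2 ** (elem_bits - 1) - 1          # e.g. 127 for 8-bit elements
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0] * len(block), 0
    # Smallest power-of-two scale such that amax / scale fits within qmax
    exp = math.ceil(math.log2(amax / qmax))
    scale = 2.0 ** exp
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return q, exp

def mx_dequantize(q, exp):
    """Recover approximate floats from the shared exponent and integers."""
    scale = 2.0 ** exp
    return [v * scale for v in q]

block = [0.12, -1.5, 0.03, 0.9]
q, exp = mx_quantize(block)
approx = mx_dequantize(q, exp)
```

Because the shared scale is a power of two, dequantization is a cheap shift, and the per-element arithmetic can run on integer units instead of floating-point ones, which is the source of the area and energy savings described above.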

Keywords

  • Energy Consumption
  • Quantization
  • Computer Architecture
  • Processing-In-Memory (PIM)