In the information age, security plays a crucial role. Security rests on three requirements: confidentiality, integrity, and availability. All three depend on efficient cryptographic algorithms. Cryptosystems are usually implemented through repeated modular multiplications on large integers. To speed up encryption and decryption, many high-speed Montgomery modular multiplication algorithms and hardware architectures employ carry-save addition to avoid carry propagation at each addition in the add-shift loop. In this paper, we propose an energy-efficient algorithm and its corresponding architecture that not only reduce the energy consumption but also further enhance the throughput of Montgomery modular multipliers. The proposed architecture bypasses superfluous carry-save addition and register write operations, leading to lower energy consumption and higher throughput. In addition, we modify the barrel register full adder (BRFA) so that the gated clock design technique can be applied to significantly reduce the energy consumption of the storage elements in the BRFA. Experimental results show that the proposed approaches achieve up to 60% energy saving and 24.6% throughput improvement for a 1024-bit Montgomery multiplier.
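For orientation, the add-shift loop mentioned above can be sketched in software as a textbook radix-2 Montgomery multiplication. This is a minimal illustration, not the proposed architecture: it uses ordinary Python integers rather than carry-save form, and the function name `montgomery_multiply` and the `skipped` counter are hypothetical. The counter assumes that an iteration is superfluous when both the current multiplier bit and the quotient bit are zero, so that only a shift is needed. That is one plausible reading of the bypass idea, not a description of the hardware.

```python
def montgomery_multiply(a, b, n, k):
    """Radix-2 Montgomery multiplication (textbook add-shift loop).

    Returns (a * b * 2**-k mod n, skipped) for odd n with 0 <= a, b < n < 2**k.
    `skipped` counts iterations that need no addition (illustrative assumption).
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    skipped = 0
    for i in range(k):
        a_i = (a >> i) & 1            # current multiplier bit
        q = (s + a_i * b) & 1         # quotient bit: makes the sum even
        if a_i == 0 and q == 0:
            skipped += 1              # nothing to add, only a right shift
            s >>= 1
        else:
            s = (s + a_i * b + q * n) >> 1
    if s >= n:                        # s < 2n holds, so one subtraction suffices
        s -= n
    return s, skipped


# Small check against the definition, with R = 2**k (values chosen arbitrarily).
n, k = 239, 8
a, b = 17, 200
result, skipped = montgomery_multiply(a, b, n, k)
assert result == (a * b * pow(2 ** k, -1, n)) % n
```

In a hardware realization the accumulator `s` would typically be held as separate sum and carry words, so each skipped iteration would avoid one carry-save addition and the corresponding register writes.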