Page 1 of 1

## MASM : 1024-bit Arithmetic Functions : Part II

### #1 Martyn.Rae Reputation: 555
• Posts: 1,436
• Joined: 22-August 09

Posted 28 March 2011 - 09:46 PM

MASM : 1024-bit Arithmetic Functions : Part II

Introduction

The ability to provide arithmetic on integers larger than the CPU register width has always provided me with endless hours of fun in assembly. Being a kind and generous person, I have decided to share that enjoyment with you all, by providing a set of three tutorials on addition, subtraction, multiplication and division on binary numbers that are very stupidly large!

This second tutorial concerns itself with 512-bit unsigned multiplication providing a 1024-bit result.

Theory

Using Booths Algorithm, for the equation A x B = C where A and B are binary numbers, we take B (the multiplier) and for each iteration of the 512 bits (as this is the size of the multiplier), if the least significant bit is 1, then add the multiplicand to a partial result, then right-shift the multiplier 1 bit and left-shift the multiplicand 1 bit otherwise right-shift the multiplier 1 bit and left-shift the multiplicand 1 bit.

1024-bit Shift Left

This is the 1024-bit shift left function, which performs an in-situ left shift on the 32 words at the address passed as a parameter. We use the shl machine code instruction which shifts the least significant word left by 1 bit moving binary 0 into the 'hole' left in the least significant bit position and the bit shifted out of the most significant bit is moved into the 'carry' bit in the condition codes register. The remaining 31 iterations of the function use the rcl machine code instruction which shifts the current word left by 1 bit moving the carry flag into the 'hole' left in the least significant bit position and the bit shifted out of the most significant bit is moved into the 'carry' bit in the condition codes register.

```Shl1024				proc		num1:dword

push		ecx
mov			esi, num1
mov			ecx, 32
shl			dword ptr [esi+ecx*4-4], 1
pushf
dec			ecx
Shl1024_loop:		popf
rcl			dword ptr [esi+ecx*4-4], 1
pushf
dec			ecx
jnz			Shl1024_loop
popf
pop			ecx
ret

Shl1024				endp

```

1024-bit Shift Right

This is the 1024-bit shift right function, which performs an in-situ right shift on the 32 words at the address passed as a parameter. We use the shr machine code instruction which shifts the most significant word right by 1 bit moving binary 0 into the 'hole' left in the most significant bit position and the bit shifted out of the least significant bit is moved into the 'carry' bit in the condition codes register. The remaining 31 iterations of the function use the rcr machine code instruction which shifts the current word right by 1 bit moving the carry flag into the 'hole' left in the most significant bit position and the bit shifted out of the least significant bit is moved into the 'carry' bit in the condition codes register.

```Shr1024				proc		num1:dword

push		ecx
mov			esi, num1
mov			ecx, 32
shr			dword ptr [esi], 1
pushf
dec			ecx
popf
rcr			dword ptr [esi], 1
pushf
dec			ecx
jnz			Shr1024_loop
popf
pop			ecx
ret

Shr1024				endp

```

512-bit Multiplication Producing 1024-bit Result

The following function performs the multiplication itself. The multiplicand (num1) which is 512-bits long is moved into the least significant 512 bits of a 1024-bit area (num1_shift), and the upper 512 bits set to zero. Similarly, the multiplier (num2) which is 512-bits long is moved into the least significant 512 bits of a 1024-bit area (num2_shift), and the upper 512 bits set to zero. The partial result (partial) is set to zero.

The actual multiplication uses the shift right function provided above to determine the value of the least significant bit of the multiplier, then based on that either adds the multiplicand to the partial result or not. The multiplicand is then shifted left. When the loop has been iterated 512 times, the 1024-bit partial result (partial) is copied to the address provided by the caller through the result parameter.

```Mul1024				proc		num1:dword, num2:dword, result:dword
local		num1_shift:DWORD, partial:DWORD
local		num2_shift:DWORD
xor			eax, eax
lea			edi, partial
mov			ecx, 32
rep			stosd
lea			edi, num1_shift
mov			ecx, 16
rep			stosd
mov			esi, num1
mov			ecx, 16
rep			movsd
lea			edi, num2_shift
mov			ecx, 16
rep			stosd
mov			esi, num2
mov			ecx, 16
rep			movsd
mov			ecx, 512
clc
pushf
Mul1024_loop:		popf
jnc			Mul1024_continue
pushf
dec			ecx
jnz			Mul1024_loop
popf
lea			esi, partial
mov			edi, result
mov			ecx, 32
rep			movsd
ret
Mul1024				endp

```

Part III of this tutorial may be found here

This post has been edited by Martyn.Rae: 31 March 2011 - 01:30 PM

Is This A Good Question/Topic? 0

Page 1 of 1

 .related ul { list-style-type: circle; font-size: 12px; font-weight: bold; } .related li { margin-bottom: 5px; background-position: left 7px !important; margin-left: -35px; } .related h2 { font-size: 18px; font-weight: bold; } .related a { color: blue; }