Introduction
The ability to provide arithmetic on integers larger than the CPU register width has always provided me with endless hours of fun in assembly. Being a kind and generous person, I have decided to share that enjoyment with you all, by providing a set of three tutorials on addition, subtraction, multiplication and division on binary numbers that are very stupidly large!
This third tutorial concerns itself with 1024-bit unsigned division providing a 1024-bit quotient and 1024-bit remainder.
Theory
Given a 1024-bit dividend and a 1024 bit divisor, compute the 1024-bit quotient and 1024 bit remainder. We shall use the long division method by:
1. Clear the remainder
2. Left shift the remainder by 1
3. Left shift the dividend by 1 (the most significant bit is shifted out and into the carry flag)
4. If the carry flag is set, then set the least significant bit of the remainder
5. Shift the quotient left by 1
6. Compare the remainder with the divisor
7. If the remainder is greater than or equal to the divisor then subtract the divisor from the remainder and set the least significant bit of the quotient
8. Repeat steps 2 to 7 1023 times
1024-bit Division
Here then, is the 1024-bit division code.
Div1024 proc num1:dword, num2:dword, result:dword, remainder:dword local num1_shift[32]:DWORD, partial[32]:DWORD local num2_shift[32]:DWORD xor eax, eax mov edi, remainder mov ecx, 32 rep stosd mov edi, result mov ecx, 32 rep stosd mov esi, num1 lea edi, num1_shift mov ecx, 32 rep movsd mov esi, remainder mov edi, result mov ecx, 1024 Div1024_loop: invoke Shl1024, remainder invoke Shl1024, ADDR num1_shift jnc Div1024_1 bts dword ptr [esi+124], 0 Div1024_1: invoke Shl1024, result invoke Sub1024, remainder, num2, ADDR partial jb Div1024_2 invoke Sub1024, remainder, num2, remainder bts dword ptr [edi+124], 0 Div1024_2: dec ecx jnz Div1024_loop ret Div1024 endp
Conclusion
The routines that I have provided over these three tutorials are, of course little or no use in the real world. Their purpose is to demonstrate the versatility and extensibility of the x86 machine code instruction set and perhaps inspire the reader to investigate it for themselves.




MultiQuote


|