Crypto++  8.2
Free C++ class library of cryptographic schemes
integer.cpp
1 // integer.cpp - originally written and placed in the public domain by Wei Dai
2 // contains public domain code contributed by Alister Lee and Leonard Janke
3 
4 // Notes by JW: The Integer class needs to do two things. First, it needs
5 // to set function pointers on some platforms, like X86 and X64. The
6 // function pointers select a fast multiply and addition based on the cpu.
7 // Second, it wants to create Integer::Zero(), Integer::One() and
8 // Integer::Two().
9 // The function pointers are initialized in the InitializeInteger class by
10 // calling SetFunctionPointers(). The call to SetFunctionPointers() is
11 // guarded to run once using a double-checked pattern. We don't use C++
12 // std::call_once due to bad interactions between libstdc++, glibc and
13 // pthreads. The bad interactions were causing crashes for us on platforms
14 // like Sparc and PowerPC. Since we are only setting function pointers we
15 // don't have to worry about leaking memory. The worst case seems to be the
16 // pointers gets written twice until the init flag is set and visible to
17 // all threads.
18 // For Integer::Zero(), Integer::One() and Integer::Two(), we use one of three
19 // strategies. First, if initialization priorities are available then we use
20 // them. Initialization priorities are init_priority() on Linux and init_seg()
21 // on Windows. AIX, OS X and several other platforms lack them. Initialization
22 // priorities are platform specific but they are also the most trouble free
23 // with determisitic destruction.
24 // Second, if C++11 dynamic initialization is available, then we use it. After
25 // the std::call_once fiasco we dropped the priority dynamic initialization
26 // to avoid unknown troubles platforms that are tested less frequently. In
27 // addition Microsoft platforms mostly do not provide dynamic initialization.
28 // The MSDN docs claim they do but they don't in practice because we need
29 // Visual Studio 2017 and Windows 10 or above.
30 // Third, we fall back to Wei's original code of a Singleton. Wei's original
31 // code was much simpler. It simply used the Singleton pattern, but it always
32 // produced memory findings on some platforms. The Singleton generates memory
33 // findings because it uses a Create On First Use pattern (a dumb Nifty
34 // Counter) and the compiler had to be smart enough to fold them to return
35 // the same object. Unix and Linux compilers do a good job of folding objects,
36 // but Microsoft compilers do a rather poor job for some versions of the
37 // compiler.
38 // Another problem with the Singleton is resource destruction requires running
39 // resource acquisition in reverse. For resources provided through the
40 // Singletons, there is no way to express the dependency order to safely
41 // destroy resources. (That's one of the problems C++11 dynamic
42 // intitialization with concurrent execution is supposed to solve).
43 // The final problem with Singletons is resource/memory exhaustion in languages
44 // like Java and .Net. Java and .Net load and unload a native DLL hundreds or
45 // thousands of times during the life of a program. Each load produces a
46 // memory leak and they can add up quickly. If they library is being used in
47 // Java or .Net then Singleton must be avoided at all costs.
48 
49 #include "pch.h"
50 #include "config.h"
51 
52 #ifndef CRYPTOPP_IMPORTS
53 
54 #include "integer.h"
55 #include "secblock.h"
56 #include "modarith.h"
57 #include "nbtheory.h"
58 #include "smartptr.h"
59 #include "algparam.h"
60 #include "filters.h"
61 #include "stdcpp.h"
62 #include "asn.h"
63 #include "oids.h"
64 #include "words.h"
65 #include "pubkey.h" // for P1363_KDF2
66 #include "sha.h"
67 #include "cpu.h"
68 #include "misc.h"
69 
70 #include <iostream>
71 
72 #if (_MSC_VER >= 1400) && !defined(_M_ARM)
73  #include <intrin.h>
74 #endif
75 
76 #ifdef __DECCXX
77  #include <c_asm.h>
78 #endif
79 
80 // "Error: The operand ___LKDB cannot be assigned to", http://github.com/weidai11/cryptopp/issues/188
81 #if (__SUNPRO_CC >= 0x5130)
82 # define MAYBE_CONST
83 # define MAYBE_UNCONST_CAST(x) const_cast<word*>(x)
84 #else
85 # define MAYBE_CONST const
86 # define MAYBE_UNCONST_CAST(x) x
87 #endif
88 
89 // "Inline assembly operands don't work with .intel_syntax",
90 // http://llvm.org/bugs/show_bug.cgi?id=24232
91 #if CRYPTOPP_BOOL_X32 || defined(CRYPTOPP_DISABLE_MIXED_ASM)
92 # undef CRYPTOPP_X86_ASM_AVAILABLE
93 # undef CRYPTOPP_X32_ASM_AVAILABLE
94 # undef CRYPTOPP_X64_ASM_AVAILABLE
95 # undef CRYPTOPP_SSE2_ASM_AVAILABLE
96 # undef CRYPTOPP_SSSE3_ASM_AVAILABLE
97 #else
98 # define CRYPTOPP_INTEGER_SSE2 (CRYPTOPP_SSE2_ASM_AVAILABLE && (CRYPTOPP_BOOL_X86))
99 #endif
100 
101 // ***************** C++ Static Initialization ********************
102 
103 NAMESPACE_BEGIN(CryptoPP)
104 
105 // Function body near the middle of the file
106 static void SetFunctionPointers();
107 
108 // Use a double-checked pattern. We are not leaking anything so it
109 // does not matter if a pointer is written twice during a race.
110 // Avoid std::call_once due to too many problems on platforms like
111 // Solaris and Sparc. Also see
112 // http://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146 and
113 // http://github.com/weidai11/cryptopp/issues/707.
114 InitializeInteger::InitializeInteger()
115 {
116  static bool s_flag;
117  MEMORY_BARRIER();
118  if (s_flag == false)
119  {
120  SetFunctionPointers();
121  s_flag = true;
122  MEMORY_BARRIER();
123  }
124 }
125 
126 template <long i>
128 {
129  Integer * operator()() const
130  {
131  return new Integer(i);
132  }
133 };
134 
135 // ***************** Library code ********************
136 
137 inline static int Compare(const word *A, const word *B, size_t N)
138 {
139  while (N--)
140  if (A[N] > B[N])
141  return 1;
142  else if (A[N] < B[N])
143  return -1;
144 
145  return 0;
146 }
147 
148 inline static int Increment(word *A, size_t N, word B=1)
149 {
150  CRYPTOPP_ASSERT(N);
151  word t = A[0];
152  A[0] = t+B;
153  if (A[0] >= t)
154  return 0;
155  for (unsigned i=1; i<N; i++)
156  if (++A[i])
157  return 0;
158  return 1;
159 }
160 
161 inline static int Decrement(word *A, size_t N, word B=1)
162 {
163  CRYPTOPP_ASSERT(N);
164  word t = A[0];
165  A[0] = t-B;
166  if (A[0] <= t)
167  return 0;
168  for (unsigned i=1; i<N; i++)
169  if (A[i]--)
170  return 0;
171  return 1;
172 }
173 
174 static void TwosComplement(word *A, size_t N)
175 {
176  Decrement(A, N);
177  for (unsigned i=0; i<N; i++)
178  A[i] = ~A[i];
179 }
180 
181 static word AtomicInverseModPower2(word A)
182 {
183  CRYPTOPP_ASSERT(A%2==1);
184 
185  word R=A%8;
186 
187  for (unsigned i=3; i<WORD_BITS; i*=2)
188  R = R*(2-R*A);
189 
190  CRYPTOPP_ASSERT(R*A==1);
191  return R;
192 }
193 
194 // ********************************************************
195 
196 #if !defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE) || (defined(__x86_64__) && defined(CRYPTOPP_WORD128_AVAILABLE))
197  #define TWO_64_BIT_WORDS 1
198  #define Declare2Words(x) word x##0, x##1;
199  #define AssignWord(a, b) a##0 = b; a##1 = 0;
200  #define Add2WordsBy1(a, b, c) a##0 = b##0 + c; a##1 = b##1 + (a##0 < c);
201  #define LowWord(a) a##0
202  #define HighWord(a) a##1
203  #ifdef _MSC_VER
204  #define MultiplyWordsLoHi(p0, p1, a, b) p0 = _umul128(a, b, &p1);
205  #ifndef __INTEL_COMPILER
206  #define Double3Words(c, d) d##1 = __shiftleft128(d##0, d##1, 1); d##0 = __shiftleft128(c, d##0, 1); c *= 2;
207  #endif
208  #elif defined(__DECCXX)
209  #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; p1 = asm("umulh %a0, %a1, %v0", a, b);
210  #elif defined(__x86_64__)
211  #if defined(__SUNPRO_CC) && __SUNPRO_CC < 0x5100
212  // Sun Studio's gcc-style inline assembly is heavily bugged as of version 5.9 Patch 124864-09 2008/12/16, but this one works
213  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "r"(b) : "cc");
214  #else
215  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
216  #define MulAcc(c, d, a, b) asm ("mulq %6; addq %3, %0; adcq %4, %1; adcq $0, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1), "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
217  #define Double3Words(c, d) asm ("addq %0, %0; adcq %1, %1; adcq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
218  #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcq $0, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b) : "cc");
219  #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
220  #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcq %6, %1; adcq $0, %2;" : "+r"(c), "=r"(e##0), "=r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1) : "cc");
221  #endif
222  #endif
223  #define MultiplyWords(p, a, b) MultiplyWordsLoHi(p##0, p##1, a, b)
224  #ifndef Double3Words
225  #define Double3Words(c, d) d##1 = 2*d##1 + (d##0>>(WORD_BITS-1)); d##0 = 2*d##0 + (c>>(WORD_BITS-1)); c *= 2;
226  #endif
227  #ifndef Acc2WordsBy2
228  #define Acc2WordsBy2(a, b) a##0 += b##0; a##1 += a##0 < b##0; a##1 += b##1;
229  #endif
230  #define AddWithCarry(u, a, b) {word t = a+b; u##0 = t + u##1; u##1 = (t<a) + (u##0<t);}
231  #define SubtractWithBorrow(u, a, b) {word t = a-b; u##0 = t - u##1; u##1 = (t>a) + (u##0>t);}
232  #define GetCarry(u) u##1
233  #define GetBorrow(u) u##1
234 #else
235  #define Declare2Words(x) dword x;
236  #if _MSC_VER >= 1400 && !defined(__INTEL_COMPILER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64))
237  #define MultiplyWords(p, a, b) p = __emulu(a, b);
238  #else
239  #define MultiplyWords(p, a, b) p = (dword)a*b;
240  #endif
241  #define AssignWord(a, b) a = b;
242  #define Add2WordsBy1(a, b, c) a = b + c;
243  #define Acc2WordsBy2(a, b) a += b;
244  #define LowWord(a) word(a)
245  #define HighWord(a) word(a>>WORD_BITS)
246  #define Double3Words(c, d) d = 2*d + (c>>(WORD_BITS-1)); c *= 2;
247  #define AddWithCarry(u, a, b) u = dword(a) + b + GetCarry(u);
248  #define SubtractWithBorrow(u, a, b) u = dword(a) - b - GetBorrow(u);
249  #define GetCarry(u) HighWord(u)
250  #define GetBorrow(u) word(u>>(WORD_BITS*2-1))
251 #endif
252 #ifndef MulAcc
253  #define MulAcc(c, d, a, b) MultiplyWords(p, a, b); Acc2WordsBy1(p, c); c = LowWord(p); Acc2WordsBy1(d, HighWord(p));
254 #endif
255 #ifndef Acc2WordsBy1
256  #define Acc2WordsBy1(a, b) Add2WordsBy1(a, a, b)
257 #endif
258 #ifndef Acc3WordsBy2
259  #define Acc3WordsBy2(c, d, e) Acc2WordsBy1(e, c); c = LowWord(e); Add2WordsBy1(e, d, HighWord(e));
260 #endif
261 
262 class DWord
263 {
264 public:
265 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
266  DWord() {std::memset(&m_whole, 0x00, sizeof(m_whole));}
267 #else
268  DWord() {std::memset(&m_halfs, 0x00, sizeof(m_halfs));}
269 #endif
270 
271 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
272  explicit DWord(word low) : m_whole(low) { }
273 #else
274  explicit DWord(word low)
275  {
276  m_halfs.high = 0;
277  m_halfs.low = low;
278  }
279 #endif
280 
281 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
282  DWord(word low, word high) : m_whole()
283 #else
284  DWord(word low, word high) : m_halfs()
285 #endif
286  {
287 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
288 # if (CRYPTOPP_LITTLE_ENDIAN)
289  const word t[2] = {low,high};
290  memcpy(&m_whole, t, sizeof(m_whole));
291 # else
292  const word t[2] = {high,low};
293  memcpy(&m_whole, t, sizeof(m_whole));
294 # endif
295 #else
296  m_halfs.low = low;
297  m_halfs.high = high;
298 #endif
299  }
300 
301  static DWord Multiply(word a, word b)
302  {
303  DWord r;
304  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
305  r.m_whole = (dword)a * b;
306  #elif defined(MultiplyWordsLoHi)
307  MultiplyWordsLoHi(r.m_halfs.low, r.m_halfs.high, a, b);
308  #else
309  CRYPTOPP_ASSERT(0);
310  #endif
311  return r;
312  }
313 
314  static DWord MultiplyAndAdd(word a, word b, word c)
315  {
316  DWord r = Multiply(a, b);
317  return r += c;
318  }
319 
320  DWord & operator+=(word a)
321  {
322  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
323  m_whole = m_whole + a;
324  #else
325  m_halfs.low += a;
326  m_halfs.high += (m_halfs.low < a);
327  #endif
328  return *this;
329  }
330 
331  DWord operator+(word a)
332  {
333  DWord r;
334  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
335  r.m_whole = m_whole + a;
336  #else
337  r.m_halfs.low = m_halfs.low + a;
338  r.m_halfs.high = m_halfs.high + (r.m_halfs.low < a);
339  #endif
340  return r;
341  }
342 
343  DWord operator-(DWord a)
344  {
345  DWord r;
346  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
347  r.m_whole = m_whole - a.m_whole;
348  #else
349  r.m_halfs.low = m_halfs.low - a.m_halfs.low;
350  r.m_halfs.high = m_halfs.high - a.m_halfs.high - (r.m_halfs.low > m_halfs.low);
351  #endif
352  return r;
353  }
354 
355  DWord operator-(word a)
356  {
357  DWord r;
358  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
359  r.m_whole = m_whole - a;
360  #else
361  r.m_halfs.low = m_halfs.low - a;
362  r.m_halfs.high = m_halfs.high - (r.m_halfs.low > m_halfs.low);
363  #endif
364  return r;
365  }
366 
367  // returns quotient, which must fit in a word
368  word operator/(word divisor);
369 
370  word operator%(word a);
371 
372  bool operator!() const
373  {
374  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
375  return !m_whole;
376  #else
377  return !m_halfs.high && !m_halfs.low;
378  #endif
379  }
380 
381  // TODO: When NATIVE_DWORD is in effect, we access high and low, which are inactive
382  // union members, and that's UB. Also see http://stackoverflow.com/q/11373203.
383  word GetLowHalf() const {return m_halfs.low;}
384  word GetHighHalf() const {return m_halfs.high;}
385  word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
386 
387 private:
388  // Issue 274, "Types cannot be declared in anonymous union",
389  // http://github.com/weidai11/cryptopp/issues/274
390  // Thanks to Martin Bonner at http://stackoverflow.com/a/39507183
391  struct half_words
392  {
393  #if (CRYPTOPP_LITTLE_ENDIAN)
394  word low;
395  word high;
396  #else
397  word high;
398  word low;
399  #endif
400  };
401  union
402  {
403  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
404  dword m_whole;
405  #endif
406  half_words m_halfs;
407  };
408 };
409 
410 class Word
411 {
412 public:
413  Word() : m_whole(0) {}
414  Word(word value) : m_whole(value) {}
415  Word(hword low, hword high) : m_whole(low | (word(high) << (WORD_BITS/2))) {}
416 
417  static Word Multiply(hword a, hword b)
418  {
419  Word r;
420  r.m_whole = (word)a * b;
421  return r;
422  }
423 
424  Word operator-(Word a)
425  {
426  Word r;
427  r.m_whole = m_whole - a.m_whole;
428  return r;
429  }
430 
431  Word operator-(hword a)
432  {
433  Word r;
434  r.m_whole = m_whole - a;
435  return r;
436  }
437 
438  // returns quotient, which must fit in a word
439  hword operator/(hword divisor)
440  {
441  return hword(m_whole / divisor);
442  }
443 
444  bool operator!() const
445  {
446  return !m_whole;
447  }
448 
449  word GetWhole() const {return m_whole;}
450  hword GetLowHalf() const {return hword(m_whole);}
451  hword GetHighHalf() const {return hword(m_whole>>(WORD_BITS/2));}
452  hword GetHighHalfAsBorrow() const {return 0-hword(m_whole>>(WORD_BITS/2));}
453 
454 private:
455  word m_whole;
456 };
457 
458 // do a 3 word by 2 word divide, returns quotient and leaves remainder in A
459 template <class S, class D>
460 S DivideThreeWordsByTwo(S *A, S B0, S B1, D *dummy=NULLPTR)
461 {
462  CRYPTOPP_UNUSED(dummy);
463 
464  // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a S
465  CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
466 
467  // estimate the quotient: do a 2 S by 1 S divide.
468  // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
469  // The code change occurred at Commit dc99266599a0e72d.
470 
471  S Q; bool pre = (S(B1+1) == 0);
472  if (B1 > 0 && !pre)
473  Q = D(A[1], A[2]) / S(B1+1);
474  else if (pre)
475  Q = A[2];
476  else
477  Q = D(A[0], A[1]) / B0;
478 
479  // now subtract Q*B from A
480  D p = D::Multiply(B0, Q);
481  D u = (D) A[0] - p.GetLowHalf();
482  A[0] = u.GetLowHalf();
483  u = (D) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - D::Multiply(B1, Q);
484  A[1] = u.GetLowHalf();
485  A[2] += u.GetHighHalf();
486 
487  // Q <= actual quotient, so fix it
488  while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
489  {
490  u = (D) A[0] - B0;
491  A[0] = u.GetLowHalf();
492  u = (D) A[1] - B1 - u.GetHighHalfAsBorrow();
493  A[1] = u.GetLowHalf();
494  A[2] += u.GetHighHalf();
495  Q++;
496  CRYPTOPP_ASSERT(Q); // shouldn't overflow
497  }
498 
499  return Q;
500 }
501 
502 // do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
503 template <class S, class D>
504 inline D DivideFourWordsByTwo(S *T, const D &Al, const D &Ah, const D &B)
505 {
506  // Profiling tells us the original second case was dominant, so it was
507  // promoted to the first If statement. The code change occurred at
508  // Commit dc99266599a0e72d.
509 
510  if (!!B)
511  {
512  S Q[2];
513  T[0] = Al.GetLowHalf();
514  T[1] = Al.GetHighHalf();
515  T[2] = Ah.GetLowHalf();
516  T[3] = Ah.GetHighHalf();
517  Q[1] = DivideThreeWordsByTwo<S, D>(T+1, B.GetLowHalf(), B.GetHighHalf());
518  Q[0] = DivideThreeWordsByTwo<S, D>(T, B.GetLowHalf(), B.GetHighHalf());
519  return D(Q[0], Q[1]);
520  }
521  else // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
522  {
523  return D(Ah.GetLowHalf(), Ah.GetHighHalf());
524  }
525 }
526 
527 // returns quotient, which must fit in a word
528 inline word DWord::operator/(word a)
529 {
530  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
531  return word(m_whole / a);
532  #else
533  hword r[4];
534  return DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a).GetWhole();
535  #endif
536 }
537 
538 inline word DWord::operator%(word a)
539 {
540  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
541  return word(m_whole % a);
542  #else
543  if (a < (word(1) << (WORD_BITS/2)))
544  {
545  hword h = hword(a);
546  word r = m_halfs.high % h;
547  r = ((m_halfs.low >> (WORD_BITS/2)) + (r << (WORD_BITS/2))) % h;
548  return hword((hword(m_halfs.low) + (r << (WORD_BITS/2))) % h);
549  }
550  else
551  {
552  hword r[4];
553  DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a);
554  return Word(r[0], r[1]).GetWhole();
555  }
556  #endif
557 }
558 
559 // ********************************************************
560 
561 // Use some tricks to share assembly code between MSVC, GCC, Clang and Sun CC.
562 #if defined(__GNUC__)
563  #define AddPrologue \
564  int result; \
565  __asm__ __volatile__ \
566  ( \
567  INTEL_NOPREFIX
568  #define AddEpilogue \
569  ATT_PREFIX \
570  : "=a" (result)\
571  : "d" (C), "a" (A), "D" (B), "c" (N) \
572  : "%esi", "memory", "cc" \
573  );\
574  return result;
575  #define MulPrologue \
576  __asm__ __volatile__ \
577  ( \
578  INTEL_NOPREFIX \
579  AS1( push ebx) \
580  AS2( mov ebx, edx)
581  #define MulEpilogue \
582  AS1( pop ebx) \
583  ATT_PREFIX \
584  : \
585  : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B) \
586  : "%esi", "memory", "cc" \
587  );
588  #define SquPrologue MulPrologue
589  #define SquEpilogue \
590  AS1( pop ebx) \
591  ATT_PREFIX \
592  : \
593  : "d" (s_maskLow16), "c" (C), "a" (A) \
594  : "%esi", "%edi", "memory", "cc" \
595  );
596  #define TopPrologue MulPrologue
597  #define TopEpilogue \
598  AS1( pop ebx) \
599  ATT_PREFIX \
600  : \
601  : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B), "S" (L) \
602  : "memory", "cc" \
603  );
604 #else
605  #define AddPrologue \
606  __asm push edi \
607  __asm push esi \
608  __asm mov eax, [esp+12] \
609  __asm mov edi, [esp+16]
610  #define AddEpilogue \
611  __asm pop esi \
612  __asm pop edi \
613  __asm ret 8
614  #define SaveEBX
615  #define RestoreEBX
616  #define SquPrologue \
617  AS2( mov eax, A) \
618  AS2( mov ecx, C) \
619  SaveEBX \
620  AS2( lea ebx, s_maskLow16)
621  #define MulPrologue \
622  AS2( mov eax, A) \
623  AS2( mov edi, B) \
624  AS2( mov ecx, C) \
625  SaveEBX \
626  AS2( lea ebx, s_maskLow16)
627  #define TopPrologue \
628  AS2( mov eax, A) \
629  AS2( mov edi, B) \
630  AS2( mov ecx, C) \
631  AS2( mov esi, L) \
632  SaveEBX \
633  AS2( lea ebx, s_maskLow16)
634  #define SquEpilogue RestoreEBX
635  #define MulEpilogue RestoreEBX
636  #define TopEpilogue RestoreEBX
637 #endif
638 
639 #ifdef CRYPTOPP_X64_MASM_AVAILABLE
640 extern "C" {
641 int Baseline_Add(size_t N, word *C, const word *A, const word *B);
642 int Baseline_Sub(size_t N, word *C, const word *A, const word *B);
643 }
644 #elif defined(CRYPTOPP_X64_ASM_AVAILABLE) && defined(__GNUC__) && defined(CRYPTOPP_WORD128_AVAILABLE)
645 int Baseline_Add(size_t N, word *C, const word *A, const word *B)
646 {
647  word result;
648  __asm__ __volatile__
649  (
650  INTEL_NOPREFIX
651  AS1( neg %1)
652  ASJ( jz, 1, f)
653  AS2( mov %0,[%3+8*%1])
654  AS2( add %0,[%4+8*%1])
655  AS2( mov [%2+8*%1],%0)
656  ASL(0)
657  AS2( mov %0,[%3+8*%1+8])
658  AS2( adc %0,[%4+8*%1+8])
659  AS2( mov [%2+8*%1+8],%0)
660  AS2( lea %1,[%1+2])
661  ASJ( jrcxz, 1, f)
662  AS2( mov %0,[%3+8*%1])
663  AS2( adc %0,[%4+8*%1])
664  AS2( mov [%2+8*%1],%0)
665  ASJ( jmp, 0, b)
666  ASL(1)
667  AS2( mov %0, 0)
668  AS2( adc %0, %0)
669  ATT_NOPREFIX
670  : "=&r" (result), "+c" (N)
671  : "r" (C+N), "r" (A+N), "r" (B+N)
672  : "memory", "cc"
673  );
674  return (int)result;
675 }
676 
677 int Baseline_Sub(size_t N, word *C, const word *A, const word *B)
678 {
679  word result;
680  __asm__ __volatile__
681  (
682  INTEL_NOPREFIX
683  AS1( neg %1)
684  ASJ( jz, 1, f)
685  AS2( mov %0,[%3+8*%1])
686  AS2( sub %0,[%4+8*%1])
687  AS2( mov [%2+8*%1],%0)
688  ASL(0)
689  AS2( mov %0,[%3+8*%1+8])
690  AS2( sbb %0,[%4+8*%1+8])
691  AS2( mov [%2+8*%1+8],%0)
692  AS2( lea %1,[%1+2])
693  ASJ( jrcxz, 1, f)
694  AS2( mov %0,[%3+8*%1])
695  AS2( sbb %0,[%4+8*%1])
696  AS2( mov [%2+8*%1],%0)
697  ASJ( jmp, 0, b)
698  ASL(1)
699  AS2( mov %0, 0)
700  AS2( adc %0, %0)
701  ATT_NOPREFIX
702  : "=&r" (result), "+c" (N)
703  : "r" (C+N), "r" (A+N), "r" (B+N)
704  : "memory", "cc"
705  );
706  return (int)result;
707 }
708 #elif defined(CRYPTOPP_X86_ASM_AVAILABLE) && CRYPTOPP_BOOL_X86
709 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
710 {
711  AddPrologue
712 
713  // now: eax = A, edi = B, edx = C, ecx = N
714  AS2( lea eax, [eax+4*ecx])
715  AS2( lea edi, [edi+4*ecx])
716  AS2( lea edx, [edx+4*ecx])
717 
718  AS1( neg ecx) // ecx is negative index
719  AS2( test ecx, 2) // this clears carry flag
720  ASJ( jz, 0, f)
721  AS2( sub ecx, 2)
722  ASJ( jmp, 1, f)
723 
724  ASL(0)
725  ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
726  AS2( mov esi,[eax+4*ecx])
727  AS2( adc esi,[edi+4*ecx])
728  AS2( mov [edx+4*ecx],esi)
729  AS2( mov esi,[eax+4*ecx+4])
730  AS2( adc esi,[edi+4*ecx+4])
731  AS2( mov [edx+4*ecx+4],esi)
732  ASL(1)
733  AS2( mov esi,[eax+4*ecx+8])
734  AS2( adc esi,[edi+4*ecx+8])
735  AS2( mov [edx+4*ecx+8],esi)
736  AS2( mov esi,[eax+4*ecx+12])
737  AS2( adc esi,[edi+4*ecx+12])
738  AS2( mov [edx+4*ecx+12],esi)
739 
740  AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
741  ASJ( jmp, 0, b)
742 
743  ASL(2)
744  AS2( mov eax, 0)
745  AS1( setc al) // store carry into eax (return result register)
746 
747  AddEpilogue
748 
749  // http://github.com/weidai11/cryptopp/issues/340
750  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
751  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
752 }
753 
754 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
755 {
756  AddPrologue
757 
758  // now: eax = A, edi = B, edx = C, ecx = N
759  AS2( lea eax, [eax+4*ecx])
760  AS2( lea edi, [edi+4*ecx])
761  AS2( lea edx, [edx+4*ecx])
762 
763  AS1( neg ecx) // ecx is negative index
764  AS2( test ecx, 2) // this clears carry flag
765  ASJ( jz, 0, f)
766  AS2( sub ecx, 2)
767  ASJ( jmp, 1, f)
768 
769  ASL(0)
770  ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
771  AS2( mov esi,[eax+4*ecx])
772  AS2( sbb esi,[edi+4*ecx])
773  AS2( mov [edx+4*ecx],esi)
774  AS2( mov esi,[eax+4*ecx+4])
775  AS2( sbb esi,[edi+4*ecx+4])
776  AS2( mov [edx+4*ecx+4],esi)
777  ASL(1)
778  AS2( mov esi,[eax+4*ecx+8])
779  AS2( sbb esi,[edi+4*ecx+8])
780  AS2( mov [edx+4*ecx+8],esi)
781  AS2( mov esi,[eax+4*ecx+12])
782  AS2( sbb esi,[edi+4*ecx+12])
783  AS2( mov [edx+4*ecx+12],esi)
784 
785  AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
786  ASJ( jmp, 0, b)
787 
788  ASL(2)
789  AS2( mov eax, 0)
790  AS1( setc al) // store carry into eax (return result register)
791 
792  AddEpilogue
793 
794  // http://github.com/weidai11/cryptopp/issues/340
795  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
796  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
797 }
798 
799 #if CRYPTOPP_INTEGER_SSE2
800 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Add(size_t N, word *C, const word *A, const word *B)
801 {
802  AddPrologue
803 
804  // now: eax = A, edi = B, edx = C, ecx = N
805  AS2( lea eax, [eax+4*ecx])
806  AS2( lea edi, [edi+4*ecx])
807  AS2( lea edx, [edx+4*ecx])
808 
809  AS1( neg ecx) // ecx is negative index
810  AS2( pxor mm2, mm2)
811  ASJ( jz, 2, f)
812  AS2( test ecx, 2) // this clears carry flag
813  ASJ( jz, 0, f)
814  AS2( sub ecx, 2)
815  ASJ( jmp, 1, f)
816 
817  ASL(0)
818  AS2( movd mm0, DWORD PTR [eax+4*ecx])
819  AS2( movd mm1, DWORD PTR [edi+4*ecx])
820  AS2( paddq mm0, mm1)
821  AS2( paddq mm2, mm0)
822  AS2( movd DWORD PTR [edx+4*ecx], mm2)
823  AS2( psrlq mm2, 32)
824 
825  AS2( movd mm0, DWORD PTR [eax+4*ecx+4])
826  AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
827  AS2( paddq mm0, mm1)
828  AS2( paddq mm2, mm0)
829  AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
830  AS2( psrlq mm2, 32)
831 
832  ASL(1)
833  AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
834  AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
835  AS2( paddq mm0, mm1)
836  AS2( paddq mm2, mm0)
837  AS2( movd DWORD PTR [edx+4*ecx+8], mm2)
838  AS2( psrlq mm2, 32)
839 
840  AS2( movd mm0, DWORD PTR [eax+4*ecx+12])
841  AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
842  AS2( paddq mm0, mm1)
843  AS2( paddq mm2, mm0)
844  AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
845  AS2( psrlq mm2, 32)
846 
847  AS2( add ecx, 4)
848  ASJ( jnz, 0, b)
849 
850  ASL(2)
851  AS2( movd eax, mm2)
852  AS1( emms)
853 
854  AddEpilogue
855 
856  // http://github.com/weidai11/cryptopp/issues/340
857  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
858  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
859 }
860 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Sub(size_t N, word *C, const word *A, const word *B)
861 {
862  AddPrologue
863 
864  // now: eax = A, edi = B, edx = C, ecx = N
865  AS2( lea eax, [eax+4*ecx])
866  AS2( lea edi, [edi+4*ecx])
867  AS2( lea edx, [edx+4*ecx])
868 
869  AS1( neg ecx) // ecx is negative index
870  AS2( pxor mm2, mm2)
871  ASJ( jz, 2, f)
872  AS2( test ecx, 2) // this clears carry flag
873  ASJ( jz, 0, f)
874  AS2( sub ecx, 2)
875  ASJ( jmp, 1, f)
876 
877  ASL(0)
878  AS2( movd mm0, DWORD PTR [eax+4*ecx])
879  AS2( movd mm1, DWORD PTR [edi+4*ecx])
880  AS2( psubq mm0, mm1)
881  AS2( psubq mm0, mm2)
882  AS2( movd DWORD PTR [edx+4*ecx], mm0)
883  AS2( psrlq mm0, 63)
884 
885  AS2( movd mm2, DWORD PTR [eax+4*ecx+4])
886  AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
887  AS2( psubq mm2, mm1)
888  AS2( psubq mm2, mm0)
889  AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
890  AS2( psrlq mm2, 63)
891 
892  ASL(1)
893  AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
894  AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
895  AS2( psubq mm0, mm1)
896  AS2( psubq mm0, mm2)
897  AS2( movd DWORD PTR [edx+4*ecx+8], mm0)
898  AS2( psrlq mm0, 63)
899 
900  AS2( movd mm2, DWORD PTR [eax+4*ecx+12])
901  AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
902  AS2( psubq mm2, mm1)
903  AS2( psubq mm2, mm0)
904  AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
905  AS2( psrlq mm2, 63)
906 
907  AS2( add ecx, 4)
908  ASJ( jnz, 0, b)
909 
910  ASL(2)
911  AS2( movd eax, mm2)
912  AS1( emms)
913 
914  AddEpilogue
915 
916  // http://github.com/weidai11/cryptopp/issues/340
917  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
918  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
919 }
920 #endif // CRYPTOPP_INTEGER_SSE2
921 #else // CRYPTOPP_SSE2_ASM_AVAILABLE
922 int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
923 {
924  CRYPTOPP_ASSERT (N%2 == 0);
925 
926  Declare2Words(u);
927  AssignWord(u, 0);
928  for (size_t i=0; i<N; i+=2)
929  {
930  AddWithCarry(u, A[i], B[i]);
931  C[i] = LowWord(u);
932  AddWithCarry(u, A[i+1], B[i+1]);
933  C[i+1] = LowWord(u);
934  }
935  return int(GetCarry(u));
936 }
937 
938 int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
939 {
940  CRYPTOPP_ASSERT (N%2 == 0);
941 
942  Declare2Words(u);
943  AssignWord(u, 0);
944  for (size_t i=0; i<N; i+=2)
945  {
946  SubtractWithBorrow(u, A[i], B[i]);
947  C[i] = LowWord(u);
948  SubtractWithBorrow(u, A[i+1], B[i+1]);
949  C[i+1] = LowWord(u);
950  }
951  return int(GetBorrow(u));
952 }
953 #endif
954 
955 static word LinearMultiply(word *C, const word *AA, word B, size_t N)
956 {
957  // http://github.com/weidai11/cryptopp/issues/188
958  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
959 
960  word carry=0;
961  for(unsigned i=0; i<N; i++)
962  {
963  Declare2Words(p);
964  MultiplyWords(p, A[i], B);
965  Acc2WordsBy1(p, carry);
966  C[i] = LowWord(p);
967  carry = HighWord(p);
968  }
969  return carry;
970 }
971 
972 #ifndef CRYPTOPP_DOXYGEN_PROCESSING
973 
974 #define Mul_2 \
975  Mul_Begin(2) \
976  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
977  Mul_End(1, 1)
978 
979 #define Mul_4 \
980  Mul_Begin(4) \
981  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
982  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
983  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
984  Mul_SaveAcc(3, 1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
985  Mul_SaveAcc(4, 2, 3) Mul_Acc(3, 2) \
986  Mul_End(5, 3)
987 
988 #define Mul_8 \
989  Mul_Begin(8) \
990  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
991  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
992  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
993  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
994  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
995  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
996  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
997  Mul_SaveAcc(7, 1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
998  Mul_SaveAcc(8, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
999  Mul_SaveAcc(9, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1000  Mul_SaveAcc(10, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1001  Mul_SaveAcc(11, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1002  Mul_SaveAcc(12, 6, 7) Mul_Acc(7, 6) \
1003  Mul_End(13, 7)
1004 
1005 #define Mul_16 \
1006  Mul_Begin(16) \
1007  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1008  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1009  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1010  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1011  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1012  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1013  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1014  Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1015  Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1016  Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1017  Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1018  Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1019  Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1020  Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1021  Mul_SaveAcc(14, 0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1022  Mul_SaveAcc(15, 1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1023  Mul_SaveAcc(16, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1024  Mul_SaveAcc(17, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1025  Mul_SaveAcc(18, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1026  Mul_SaveAcc(19, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1027  Mul_SaveAcc(20, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1028  Mul_SaveAcc(21, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1029  Mul_SaveAcc(22, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1030  Mul_SaveAcc(23, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1031  Mul_SaveAcc(24, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1032  Mul_SaveAcc(25, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1033  Mul_SaveAcc(26, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1034  Mul_SaveAcc(27, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1035  Mul_SaveAcc(28, 14, 15) Mul_Acc(15, 14) \
1036  Mul_End(29, 15)
1037 
1038 #define Squ_2 \
1039  Squ_Begin(2) \
1040  Squ_End(2)
1041 
1042 #define Squ_4 \
1043  Squ_Begin(4) \
1044  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1045  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1046  Squ_SaveAcc(3, 1, 3) Squ_Diag(2) \
1047  Squ_SaveAcc(4, 2, 3) Squ_NonDiag \
1048  Squ_End(4)
1049 
1050 #define Squ_8 \
1051  Squ_Begin(8) \
1052  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1053  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1054  Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1055  Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1056  Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1057  Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1058  Squ_SaveAcc(7, 1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1059  Squ_SaveAcc(8, 2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1060  Squ_SaveAcc(9, 3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1061  Squ_SaveAcc(10, 4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1062  Squ_SaveAcc(11, 5, 7) Squ_Diag(6) \
1063  Squ_SaveAcc(12, 6, 7) Squ_NonDiag \
1064  Squ_End(8)
1065 
1066 #define Squ_16 \
1067  Squ_Begin(16) \
1068  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1069  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1070  Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1071  Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1072  Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1073  Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1074  Squ_SaveAcc(7, 0, 8) Squ_Acc(1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1075  Squ_SaveAcc(8, 0, 9) Squ_Acc(1, 8) Squ_Acc(2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1076  Squ_SaveAcc(9, 0, 10) Squ_Acc(1, 9) Squ_Acc(2, 8) Squ_Acc(3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1077  Squ_SaveAcc(10, 0, 11) Squ_Acc(1, 10) Squ_Acc(2, 9) Squ_Acc(3, 8) Squ_Acc(4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1078  Squ_SaveAcc(11, 0, 12) Squ_Acc(1, 11) Squ_Acc(2, 10) Squ_Acc(3, 9) Squ_Acc(4, 8) Squ_Acc(5, 7) Squ_Diag(6) \
1079  Squ_SaveAcc(12, 0, 13) Squ_Acc(1, 12) Squ_Acc(2, 11) Squ_Acc(3, 10) Squ_Acc(4, 9) Squ_Acc(5, 8) Squ_Acc(6, 7) Squ_NonDiag \
1080  Squ_SaveAcc(13, 0, 14) Squ_Acc(1, 13) Squ_Acc(2, 12) Squ_Acc(3, 11) Squ_Acc(4, 10) Squ_Acc(5, 9) Squ_Acc(6, 8) Squ_Diag(7) \
1081  Squ_SaveAcc(14, 0, 15) Squ_Acc(1, 14) Squ_Acc(2, 13) Squ_Acc(3, 12) Squ_Acc(4, 11) Squ_Acc(5, 10) Squ_Acc(6, 9) Squ_Acc(7, 8) Squ_NonDiag \
1082  Squ_SaveAcc(15, 1, 15) Squ_Acc(2, 14) Squ_Acc(3, 13) Squ_Acc(4, 12) Squ_Acc(5, 11) Squ_Acc(6, 10) Squ_Acc(7, 9) Squ_Diag(8) \
1083  Squ_SaveAcc(16, 2, 15) Squ_Acc(3, 14) Squ_Acc(4, 13) Squ_Acc(5, 12) Squ_Acc(6, 11) Squ_Acc(7, 10) Squ_Acc(8, 9) Squ_NonDiag \
1084  Squ_SaveAcc(17, 3, 15) Squ_Acc(4, 14) Squ_Acc(5, 13) Squ_Acc(6, 12) Squ_Acc(7, 11) Squ_Acc(8, 10) Squ_Diag(9) \
1085  Squ_SaveAcc(18, 4, 15) Squ_Acc(5, 14) Squ_Acc(6, 13) Squ_Acc(7, 12) Squ_Acc(8, 11) Squ_Acc(9, 10) Squ_NonDiag \
1086  Squ_SaveAcc(19, 5, 15) Squ_Acc(6, 14) Squ_Acc(7, 13) Squ_Acc(8, 12) Squ_Acc(9, 11) Squ_Diag(10) \
1087  Squ_SaveAcc(20, 6, 15) Squ_Acc(7, 14) Squ_Acc(8, 13) Squ_Acc(9, 12) Squ_Acc(10, 11) Squ_NonDiag \
1088  Squ_SaveAcc(21, 7, 15) Squ_Acc(8, 14) Squ_Acc(9, 13) Squ_Acc(10, 12) Squ_Diag(11) \
1089  Squ_SaveAcc(22, 8, 15) Squ_Acc(9, 14) Squ_Acc(10, 13) Squ_Acc(11, 12) Squ_NonDiag \
1090  Squ_SaveAcc(23, 9, 15) Squ_Acc(10, 14) Squ_Acc(11, 13) Squ_Diag(12) \
1091  Squ_SaveAcc(24, 10, 15) Squ_Acc(11, 14) Squ_Acc(12, 13) Squ_NonDiag \
1092  Squ_SaveAcc(25, 11, 15) Squ_Acc(12, 14) Squ_Diag(13) \
1093  Squ_SaveAcc(26, 12, 15) Squ_Acc(13, 14) Squ_NonDiag \
1094  Squ_SaveAcc(27, 13, 15) Squ_Diag(14) \
1095  Squ_SaveAcc(28, 14, 15) Squ_NonDiag \
1096  Squ_End(16)
1097 
1098 #define Bot_2 \
1099  Mul_Begin(2) \
1100  Bot_SaveAcc(0, 0, 1) Bot_Acc(1, 0) \
1101  Bot_End(2)
1102 
1103 #define Bot_4 \
1104  Mul_Begin(4) \
1105  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1106  Mul_SaveAcc(1, 2, 0) Mul_Acc(1, 1) Mul_Acc(0, 2) \
1107  Bot_SaveAcc(2, 0, 3) Bot_Acc(1, 2) Bot_Acc(2, 1) Bot_Acc(3, 0) \
1108  Bot_End(4)
1109 
1110 #define Bot_8 \
1111  Mul_Begin(8) \
1112  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1113  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1114  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1115  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1116  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1117  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1118  Bot_SaveAcc(6, 0, 7) Bot_Acc(1, 6) Bot_Acc(2, 5) Bot_Acc(3, 4) Bot_Acc(4, 3) Bot_Acc(5, 2) Bot_Acc(6, 1) Bot_Acc(7, 0) \
1119  Bot_End(8)
1120 
1121 #define Bot_16 \
1122  Mul_Begin(16) \
1123  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1124  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1125  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1126  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1127  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1128  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1129  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1130  Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1131  Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1132  Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1133  Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1134  Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1135  Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1136  Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1137  Bot_SaveAcc(14, 0, 15) Bot_Acc(1, 14) Bot_Acc(2, 13) Bot_Acc(3, 12) Bot_Acc(4, 11) Bot_Acc(5, 10) Bot_Acc(6, 9) Bot_Acc(7, 8) Bot_Acc(8, 7) Bot_Acc(9, 6) Bot_Acc(10, 5) Bot_Acc(11, 4) Bot_Acc(12, 3) Bot_Acc(13, 2) Bot_Acc(14, 1) Bot_Acc(15, 0) \
1138  Bot_End(16)
1139 
1140 #endif
1141 
1142 #if 0
1143 #define Mul_Begin(n) \
1144  Declare2Words(p) \
1145  Declare2Words(c) \
1146  Declare2Words(d) \
1147  MultiplyWords(p, A[0], B[0]) \
1148  AssignWord(c, LowWord(p)) \
1149  AssignWord(d, HighWord(p))
1150 
1151 #define Mul_Acc(i, j) \
1152  MultiplyWords(p, A[i], B[j]) \
1153  Acc2WordsBy1(c, LowWord(p)) \
1154  Acc2WordsBy1(d, HighWord(p))
1155 
1156 #define Mul_SaveAcc(k, i, j) \
1157  R[k] = LowWord(c); \
1158  Add2WordsBy1(c, d, HighWord(c)) \
1159  MultiplyWords(p, A[i], B[j]) \
1160  AssignWord(d, HighWord(p)) \
1161  Acc2WordsBy1(c, LowWord(p))
1162 
1163 #define Mul_End(n) \
1164  R[2*n-3] = LowWord(c); \
1165  Acc2WordsBy1(d, HighWord(c)) \
1166  MultiplyWords(p, A[n-1], B[n-1])\
1167  Acc2WordsBy2(d, p) \
1168  R[2*n-2] = LowWord(d); \
1169  R[2*n-1] = HighWord(d);
1170 
1171 #define Bot_SaveAcc(k, i, j) \
1172  R[k] = LowWord(c); \
1173  word e = LowWord(d) + HighWord(c); \
1174  e += A[i] * B[j];
1175 
1176 #define Bot_Acc(i, j) \
1177  e += A[i] * B[j];
1178 
1179 #define Bot_End(n) \
1180  R[n-1] = e;
1181 #else
1182 #define Mul_Begin(n) \
1183  Declare2Words(p) \
1184  word c; \
1185  Declare2Words(d) \
1186  MultiplyWords(p, A[0], B[0]) \
1187  c = LowWord(p); \
1188  AssignWord(d, HighWord(p))
1189 
1190 #define Mul_Acc(i, j) \
1191  MulAcc(c, d, A[i], B[j])
1192 
1193 #define Mul_SaveAcc(k, i, j) \
1194  R[k] = c; \
1195  c = LowWord(d); \
1196  AssignWord(d, HighWord(d)) \
1197  MulAcc(c, d, A[i], B[j])
1198 
1199 #define Mul_End(k, i) \
1200  R[k] = c; \
1201  MultiplyWords(p, A[i], B[i]) \
1202  Acc2WordsBy2(p, d) \
1203  R[k+1] = LowWord(p); \
1204  R[k+2] = HighWord(p);
1205 
1206 #define Bot_SaveAcc(k, i, j) \
1207  R[k] = c; \
1208  c = LowWord(d); \
1209  c += A[i] * B[j];
1210 
1211 #define Bot_Acc(i, j) \
1212  c += A[i] * B[j];
1213 
1214 #define Bot_End(n) \
1215  R[n-1] = c;
1216 #endif
1217 
1218 #define Squ_Begin(n) \
1219  Declare2Words(p) \
1220  word c; \
1221  Declare2Words(d) \
1222  Declare2Words(e) \
1223  MultiplyWords(p, A[0], A[0]) \
1224  R[0] = LowWord(p); \
1225  AssignWord(e, HighWord(p)) \
1226  MultiplyWords(p, A[0], A[1]) \
1227  c = LowWord(p); \
1228  AssignWord(d, HighWord(p)) \
1229  Squ_NonDiag \
1230 
1231 #define Squ_NonDiag \
1232  Double3Words(c, d)
1233 
1234 #define Squ_SaveAcc(k, i, j) \
1235  Acc3WordsBy2(c, d, e) \
1236  R[k] = c; \
1237  MultiplyWords(p, A[i], A[j]) \
1238  c = LowWord(p); \
1239  AssignWord(d, HighWord(p)) \
1240 
1241 #define Squ_Acc(i, j) \
1242  MulAcc(c, d, A[i], A[j])
1243 
1244 #define Squ_Diag(i) \
1245  Squ_NonDiag \
1246  MulAcc(c, d, A[i], A[i])
1247 
1248 #define Squ_End(n) \
1249  Acc3WordsBy2(c, d, e) \
1250  R[2*n-3] = c; \
1251  MultiplyWords(p, A[n-1], A[n-1])\
1252  Acc2WordsBy2(p, e) \
1253  R[2*n-2] = LowWord(p); \
1254  R[2*n-1] = HighWord(p);
1255 
1256 
1257 void Baseline_Multiply2(word *R, const word *AA, const word *BB)
1258 {
1259  // http://github.com/weidai11/cryptopp/issues/188
1260  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1261  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1262 
1263  Mul_2
1264 }
1265 
1266 void Baseline_Multiply4(word *R, const word *AA, const word *BB)
1267 {
1268  // http://github.com/weidai11/cryptopp/issues/188
1269  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1270  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1271 
1272  Mul_4
1273 }
1274 
1275 void Baseline_Multiply8(word *R, const word *AA, const word *BB)
1276 {
1277  // http://github.com/weidai11/cryptopp/issues/188
1278  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1279  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1280 
1281  Mul_8
1282 }
1283 
1284 void Baseline_Square2(word *R, const word *AA)
1285 {
1286  // http://github.com/weidai11/cryptopp/issues/188
1287  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1288 
1289  Squ_2
1290 }
1291 
1292 void Baseline_Square4(word *R, const word *AA)
1293 {
1294  // http://github.com/weidai11/cryptopp/issues/188
1295  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1296 
1297  Squ_4
1298 }
1299 
1300 void Baseline_Square8(word *R, const word *AA)
1301 {
1302  // http://github.com/weidai11/cryptopp/issues/188
1303  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1304 
1305  Squ_8
1306 }
1307 
1308 void Baseline_MultiplyBottom2(word *R, const word *AA, const word *BB)
1309 {
1310  // http://github.com/weidai11/cryptopp/issues/188
1311  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1312  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1313 
1314  Bot_2
1315 
1316 // http://github.com/weidai11/cryptopp/issues/340
1317 #if defined(TWO_64_BIT_WORDS)
1318  CRYPTOPP_UNUSED(d0); CRYPTOPP_UNUSED(d1);
1319 #endif
1320 }
1321 
1322 void Baseline_MultiplyBottom4(word *R, const word *AA, const word *BB)
1323 {
1324  // http://github.com/weidai11/cryptopp/issues/188
1325  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1326  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1327 
1328  Bot_4
1329 }
1330 
1331 void Baseline_MultiplyBottom8(word *R, const word *AA, const word *BB)
1332 {
1333  // http://github.com/weidai11/cryptopp/issues/188
1334  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1335  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1336 
1337  Bot_8
1338 }
1339 
1340 #define Top_Begin(n) \
1341  Declare2Words(p) \
1342  word c; \
1343  Declare2Words(d) \
1344  MultiplyWords(p, A[0], B[n-2]);\
1345  AssignWord(d, HighWord(p));
1346 
1347 #define Top_Acc(i, j) \
1348  MultiplyWords(p, A[i], B[j]);\
1349  Acc2WordsBy1(d, HighWord(p));
1350 
1351 #define Top_SaveAcc0(i, j) \
1352  c = LowWord(d); \
1353  AssignWord(d, HighWord(d)) \
1354  MulAcc(c, d, A[i], B[j])
1355 
1356 #define Top_SaveAcc1(i, j) \
1357  c = L<c; \
1358  Acc2WordsBy1(d, c); \
1359  c = LowWord(d); \
1360  AssignWord(d, HighWord(d)) \
1361  MulAcc(c, d, A[i], B[j])
1362 
1363 void Baseline_MultiplyTop2(word *R, const word *A, const word *B, word L)
1364 {
1365  CRYPTOPP_UNUSED(L);
1366  word T[4];
1367  Baseline_Multiply2(T, A, B);
1368  R[0] = T[2];
1369  R[1] = T[3];
1370 }
1371 
1372 void Baseline_MultiplyTop4(word *R, const word *AA, const word *BB, word L)
1373 {
1374  // http://github.com/weidai11/cryptopp/issues/188
1375  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1376  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1377 
1378  Top_Begin(4)
1379  Top_Acc(1, 1) Top_Acc(2, 0) \
1380  Top_SaveAcc0(0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1381  Top_SaveAcc1(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
1382  Mul_SaveAcc(0, 2, 3) Mul_Acc(3, 2) \
1383  Mul_End(1, 3)
1384 }
1385 
1386 void Baseline_MultiplyTop8(word *R, const word *AA, const word *BB, word L)
1387 {
1388  // http://github.com/weidai11/cryptopp/issues/188
1389  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1390  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1391 
1392  Top_Begin(8)
1393  Top_Acc(1, 5) Top_Acc(2, 4) Top_Acc(3, 3) Top_Acc(4, 2) Top_Acc(5, 1) Top_Acc(6, 0) \
1394  Top_SaveAcc0(0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1395  Top_SaveAcc1(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1396  Mul_SaveAcc(0, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1397  Mul_SaveAcc(1, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1398  Mul_SaveAcc(2, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1399  Mul_SaveAcc(3, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1400  Mul_SaveAcc(4, 6, 7) Mul_Acc(7, 6) \
1401  Mul_End(5, 7)
1402 }
1403 
1404 #if !CRYPTOPP_INTEGER_SSE2 // save memory by not compiling these functions when SSE2 is available
1405 void Baseline_Multiply16(word *R, const word *AA, const word *BB)
1406 {
1407  // http://github.com/weidai11/cryptopp/issues/188
1408  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1409  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1410 
1411  Mul_16
1412 }
1413 
1414 void Baseline_Square16(word *R, const word *AA)
1415 {
1416  // http://github.com/weidai11/cryptopp/issues/188
1417  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1418 
1419  Squ_16
1420 }
1421 
1422 void Baseline_MultiplyBottom16(word *R, const word *AA, const word *BB)
1423 {
1424  // http://github.com/weidai11/cryptopp/issues/188
1425  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1426  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1427 
1428  Bot_16
1429 }
1430 
1431 void Baseline_MultiplyTop16(word *R, const word *AA, const word *BB, word L)
1432 {
1433  // http://github.com/weidai11/cryptopp/issues/188
1434  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1435  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1436 
1437  Top_Begin(16)
1438  Top_Acc(1, 13) Top_Acc(2, 12) Top_Acc(3, 11) Top_Acc(4, 10) Top_Acc(5, 9) Top_Acc(6, 8) Top_Acc(7, 7) Top_Acc(8, 6) Top_Acc(9, 5) Top_Acc(10, 4) Top_Acc(11, 3) Top_Acc(12, 2) Top_Acc(13, 1) Top_Acc(14, 0) \
1439  Top_SaveAcc0(0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1440  Top_SaveAcc1(1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1441  Mul_SaveAcc(0, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1442  Mul_SaveAcc(1, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1443  Mul_SaveAcc(2, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1444  Mul_SaveAcc(3, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1445  Mul_SaveAcc(4, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1446  Mul_SaveAcc(5, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1447  Mul_SaveAcc(6, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1448  Mul_SaveAcc(7, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1449  Mul_SaveAcc(8, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1450  Mul_SaveAcc(9, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1451  Mul_SaveAcc(10, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1452  Mul_SaveAcc(11, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1453  Mul_SaveAcc(12, 14, 15) Mul_Acc(15, 14) \
1454  Mul_End(13, 15)
1455 }
1456 #endif
1457 
1458 // ********************************************************
1459 
1460 #if CRYPTOPP_INTEGER_SSE2
1461 
1462 CRYPTOPP_ALIGN_DATA(16)
1463 CRYPTOPP_TABLE
1464 const word32 s_maskLow16[4] = {
1465  0xffff,0xffff,0xffff,0xffff
1466 };
1467 
1468 #undef Mul_Begin
1469 #undef Mul_Acc
1470 #undef Top_Begin
1471 #undef Top_Acc
1472 #undef Squ_Acc
1473 #undef Squ_NonDiag
1474 #undef Squ_Diag
1475 #undef Squ_SaveAcc
1476 #undef Squ_Begin
1477 #undef Mul_SaveAcc
1478 #undef Bot_Acc
1479 #undef Bot_SaveAcc
1480 #undef Bot_End
1481 #undef Squ_End
1482 #undef Mul_End
1483 
1484 #define SSE2_FinalSave(k) \
1485  AS2( psllq xmm5, 16) \
1486  AS2( paddq xmm4, xmm5) \
1487  AS2( movq QWORD PTR [ecx+8*(k)], xmm4)
1488 
1489 #define SSE2_SaveShift(k) \
1490  AS2( movq xmm0, xmm6) \
1491  AS2( punpckhqdq xmm6, xmm0) \
1492  AS2( movq xmm1, xmm7) \
1493  AS2( punpckhqdq xmm7, xmm1) \
1494  AS2( paddd xmm6, xmm0) \
1495  AS2( pslldq xmm6, 4) \
1496  AS2( paddd xmm7, xmm1) \
1497  AS2( paddd xmm4, xmm6) \
1498  AS2( pslldq xmm7, 4) \
1499  AS2( movq xmm6, xmm4) \
1500  AS2( paddd xmm5, xmm7) \
1501  AS2( movq xmm7, xmm5) \
1502  AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1503  AS2( psrlq xmm6, 16) \
1504  AS2( paddq xmm6, xmm7) \
1505  AS2( punpckhqdq xmm4, xmm0) \
1506  AS2( punpckhqdq xmm5, xmm0) \
1507  AS2( movq QWORD PTR [ecx+8*(k)+2], xmm6) \
1508  AS2( psrlq xmm6, 3*16) \
1509  AS2( paddd xmm4, xmm6) \
1510 
1511 #define Squ_SSE2_SaveShift(k) \
1512  AS2( movq xmm0, xmm6) \
1513  AS2( punpckhqdq xmm6, xmm0) \
1514  AS2( movq xmm1, xmm7) \
1515  AS2( punpckhqdq xmm7, xmm1) \
1516  AS2( paddd xmm6, xmm0) \
1517  AS2( pslldq xmm6, 4) \
1518  AS2( paddd xmm7, xmm1) \
1519  AS2( paddd xmm4, xmm6) \
1520  AS2( pslldq xmm7, 4) \
1521  AS2( movhlps xmm6, xmm4) \
1522  AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1523  AS2( paddd xmm5, xmm7) \
1524  AS2( movhps QWORD PTR [esp+12], xmm5)\
1525  AS2( psrlq xmm4, 16) \
1526  AS2( paddq xmm4, xmm5) \
1527  AS2( movq QWORD PTR [ecx+8*(k)+2], xmm4) \
1528  AS2( psrlq xmm4, 3*16) \
1529  AS2( paddd xmm4, xmm6) \
1530  AS2( movq QWORD PTR [esp+4], xmm4)\
1531 
1532 #define SSE2_FirstMultiply(i) \
1533  AS2( movdqa xmm7, [esi+(i)*16])\
1534  AS2( movdqa xmm5, [edi-(i)*16])\
1535  AS2( pmuludq xmm5, xmm7) \
1536  AS2( movdqa xmm4, [ebx])\
1537  AS2( movdqa xmm6, xmm4) \
1538  AS2( pand xmm4, xmm5) \
1539  AS2( psrld xmm5, 16) \
1540  AS2( pmuludq xmm7, [edx-(i)*16])\
1541  AS2( pand xmm6, xmm7) \
1542  AS2( psrld xmm7, 16)
1543 
1544 #define Squ_Begin(n) \
1545  SquPrologue \
1546  AS2( mov esi, esp)\
1547  AS2( and esp, 0xfffffff0)\
1548  AS2( lea edi, [esp-32*n])\
1549  AS2( sub esp, 32*n+16)\
1550  AS1( push esi)\
1551  AS2( mov esi, edi) \
1552  AS2( xor edx, edx) \
1553  ASL(1) \
1554  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1555  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1556  AS2( movdqa [edi+2*edx], xmm0) \
1557  AS2( psrlq xmm0, 32) \
1558  AS2( movdqa [edi+2*edx+16], xmm0) \
1559  AS2( movdqa [edi+16*n+2*edx], xmm1) \
1560  AS2( psrlq xmm1, 32) \
1561  AS2( movdqa [edi+16*n+2*edx+16], xmm1) \
1562  AS2( add edx, 16) \
1563  AS2( cmp edx, 8*(n)) \
1564  ASJ( jne, 1, b) \
1565  AS2( lea edx, [edi+16*n])\
1566  SSE2_FirstMultiply(0) \
1567 
1568 #define Squ_Acc(i) \
1569  ASL(LSqu##i) \
1570  AS2( movdqa xmm1, [esi+(i)*16]) \
1571  AS2( movdqa xmm0, [edi-(i)*16]) \
1572  AS2( movdqa xmm2, [ebx]) \
1573  AS2( pmuludq xmm0, xmm1) \
1574  AS2( pmuludq xmm1, [edx-(i)*16]) \
1575  AS2( movdqa xmm3, xmm2) \
1576  AS2( pand xmm2, xmm0) \
1577  AS2( psrld xmm0, 16) \
1578  AS2( paddd xmm4, xmm2) \
1579  AS2( paddd xmm5, xmm0) \
1580  AS2( pand xmm3, xmm1) \
1581  AS2( psrld xmm1, 16) \
1582  AS2( paddd xmm6, xmm3) \
1583  AS2( paddd xmm7, xmm1) \
1584 
1585 #define Squ_Acc1(i)
1586 #define Squ_Acc2(i) ASC(call, LSqu##i)
1587 #define Squ_Acc3(i) Squ_Acc2(i)
1588 #define Squ_Acc4(i) Squ_Acc2(i)
1589 #define Squ_Acc5(i) Squ_Acc2(i)
1590 #define Squ_Acc6(i) Squ_Acc2(i)
1591 #define Squ_Acc7(i) Squ_Acc2(i)
1592 #define Squ_Acc8(i) Squ_Acc2(i)
1593 
1594 #define SSE2_End(E, n) \
1595  SSE2_SaveShift(2*(n)-3) \
1596  AS2( movdqa xmm7, [esi+16]) \
1597  AS2( movdqa xmm0, [edi]) \
1598  AS2( pmuludq xmm0, xmm7) \
1599  AS2( movdqa xmm2, [ebx]) \
1600  AS2( pmuludq xmm7, [edx]) \
1601  AS2( movdqa xmm6, xmm2) \
1602  AS2( pand xmm2, xmm0) \
1603  AS2( psrld xmm0, 16) \
1604  AS2( paddd xmm4, xmm2) \
1605  AS2( paddd xmm5, xmm0) \
1606  AS2( pand xmm6, xmm7) \
1607  AS2( psrld xmm7, 16) \
1608  SSE2_SaveShift(2*(n)-2) \
1609  SSE2_FinalSave(2*(n)-1) \
1610  AS1( pop esp)\
1611  E
1612 
1613 #define Squ_End(n) SSE2_End(SquEpilogue, n)
1614 #define Mul_End(n) SSE2_End(MulEpilogue, n)
1615 #define Top_End(n) SSE2_End(TopEpilogue, n)
1616 
1617 #define Squ_Column1(k, i) \
1618  Squ_SSE2_SaveShift(k) \
1619  AS2( add esi, 16) \
1620  SSE2_FirstMultiply(1)\
1621  Squ_Acc##i(i) \
1622  AS2( paddd xmm4, xmm4) \
1623  AS2( paddd xmm5, xmm5) \
1624  AS2( movdqa xmm3, [esi]) \
1625  AS2( movq xmm1, QWORD PTR [esi+8]) \
1626  AS2( pmuludq xmm1, xmm3) \
1627  AS2( pmuludq xmm3, xmm3) \
1628  AS2( movdqa xmm0, [ebx])\
1629  AS2( movdqa xmm2, xmm0) \
1630  AS2( pand xmm0, xmm1) \
1631  AS2( psrld xmm1, 16) \
1632  AS2( paddd xmm6, xmm0) \
1633  AS2( paddd xmm7, xmm1) \
1634  AS2( pand xmm2, xmm3) \
1635  AS2( psrld xmm3, 16) \
1636  AS2( paddd xmm6, xmm6) \
1637  AS2( paddd xmm7, xmm7) \
1638  AS2( paddd xmm4, xmm2) \
1639  AS2( paddd xmm5, xmm3) \
1640  AS2( movq xmm0, QWORD PTR [esp+4])\
1641  AS2( movq xmm1, QWORD PTR [esp+12])\
1642  AS2( paddd xmm4, xmm0)\
1643  AS2( paddd xmm5, xmm1)\
1644 
1645 #define Squ_Column0(k, i) \
1646  Squ_SSE2_SaveShift(k) \
1647  AS2( add edi, 16) \
1648  AS2( add edx, 16) \
1649  SSE2_FirstMultiply(1)\
1650  Squ_Acc##i(i) \
1651  AS2( paddd xmm6, xmm6) \
1652  AS2( paddd xmm7, xmm7) \
1653  AS2( paddd xmm4, xmm4) \
1654  AS2( paddd xmm5, xmm5) \
1655  AS2( movq xmm0, QWORD PTR [esp+4])\
1656  AS2( movq xmm1, QWORD PTR [esp+12])\
1657  AS2( paddd xmm4, xmm0)\
1658  AS2( paddd xmm5, xmm1)\
1659 
1660 #define SSE2_MulAdd45 \
1661  AS2( movdqa xmm7, [esi]) \
1662  AS2( movdqa xmm0, [edi]) \
1663  AS2( pmuludq xmm0, xmm7) \
1664  AS2( movdqa xmm2, [ebx]) \
1665  AS2( pmuludq xmm7, [edx]) \
1666  AS2( movdqa xmm6, xmm2) \
1667  AS2( pand xmm2, xmm0) \
1668  AS2( psrld xmm0, 16) \
1669  AS2( paddd xmm4, xmm2) \
1670  AS2( paddd xmm5, xmm0) \
1671  AS2( pand xmm6, xmm7) \
1672  AS2( psrld xmm7, 16)
1673 
1674 #define Mul_Begin(n) \
1675  MulPrologue \
1676  AS2( mov esi, esp)\
1677  AS2( and esp, 0xfffffff0)\
1678  AS2( sub esp, 48*n+16)\
1679  AS1( push esi)\
1680  AS2( xor edx, edx) \
1681  ASL(1) \
1682  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1683  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1684  ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1685  AS2( movdqa [esp+20+2*edx], xmm0) \
1686  AS2( psrlq xmm0, 32) \
1687  AS2( movdqa [esp+20+2*edx+16], xmm0) \
1688  AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1689  AS2( psrlq xmm1, 32) \
1690  AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1691  AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1692  AS2( psrlq xmm2, 32) \
1693  AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1694  AS2( add edx, 16) \
1695  AS2( cmp edx, 8*(n)) \
1696  ASJ( jne, 1, b) \
1697  AS2( lea edi, [esp+20])\
1698  AS2( lea edx, [esp+20+16*n])\
1699  AS2( lea esi, [esp+20+32*n])\
1700  SSE2_FirstMultiply(0) \
1701 
1702 #define Mul_Acc(i) \
1703  ASL(LMul##i) \
1704  AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1705  AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1706  AS2( movdqa xmm2, [ebx]) \
1707  AS2( pmuludq xmm0, xmm1) \
1708  AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1709  AS2( movdqa xmm3, xmm2) \
1710  AS2( pand xmm2, xmm0) \
1711  AS2( psrld xmm0, 16) \
1712  AS2( paddd xmm4, xmm2) \
1713  AS2( paddd xmm5, xmm0) \
1714  AS2( pand xmm3, xmm1) \
1715  AS2( psrld xmm1, 16) \
1716  AS2( paddd xmm6, xmm3) \
1717  AS2( paddd xmm7, xmm1) \
1718 
1719 #define Mul_Acc1(i)
1720 #define Mul_Acc2(i) ASC(call, LMul##i)
1721 #define Mul_Acc3(i) Mul_Acc2(i)
1722 #define Mul_Acc4(i) Mul_Acc2(i)
1723 #define Mul_Acc5(i) Mul_Acc2(i)
1724 #define Mul_Acc6(i) Mul_Acc2(i)
1725 #define Mul_Acc7(i) Mul_Acc2(i)
1726 #define Mul_Acc8(i) Mul_Acc2(i)
1727 #define Mul_Acc9(i) Mul_Acc2(i)
1728 #define Mul_Acc10(i) Mul_Acc2(i)
1729 #define Mul_Acc11(i) Mul_Acc2(i)
1730 #define Mul_Acc12(i) Mul_Acc2(i)
1731 #define Mul_Acc13(i) Mul_Acc2(i)
1732 #define Mul_Acc14(i) Mul_Acc2(i)
1733 #define Mul_Acc15(i) Mul_Acc2(i)
1734 #define Mul_Acc16(i) Mul_Acc2(i)
1735 
1736 #define Mul_Column1(k, i) \
1737  SSE2_SaveShift(k) \
1738  AS2( add esi, 16) \
1739  SSE2_MulAdd45\
1740  Mul_Acc##i(i) \
1741 
1742 #define Mul_Column0(k, i) \
1743  SSE2_SaveShift(k) \
1744  AS2( add edi, 16) \
1745  AS2( add edx, 16) \
1746  SSE2_MulAdd45\
1747  Mul_Acc##i(i) \
1748 
1749 #define Bot_Acc(i) \
1750  AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1751  AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1752  AS2( pmuludq xmm0, xmm1) \
1753  AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1754  AS2( paddq xmm4, xmm0) \
1755  AS2( paddd xmm6, xmm1)
1756 
1757 #define Bot_SaveAcc(k) \
1758  SSE2_SaveShift(k) \
1759  AS2( add edi, 16) \
1760  AS2( add edx, 16) \
1761  AS2( movdqa xmm6, [esi]) \
1762  AS2( movdqa xmm0, [edi]) \
1763  AS2( pmuludq xmm0, xmm6) \
1764  AS2( paddq xmm4, xmm0) \
1765  AS2( psllq xmm5, 16) \
1766  AS2( paddq xmm4, xmm5) \
1767  AS2( pmuludq xmm6, [edx])
1768 
1769 #define Bot_End(n) \
1770  AS2( movhlps xmm7, xmm6) \
1771  AS2( paddd xmm6, xmm7) \
1772  AS2( psllq xmm6, 32) \
1773  AS2( paddd xmm4, xmm6) \
1774  AS2( movq QWORD PTR [ecx+8*((n)-1)], xmm4) \
1775  AS1( pop esp)\
1776  MulEpilogue
1777 
1778 #define Top_Begin(n) \
1779  TopPrologue \
1780  AS2( mov edx, esp)\
1781  AS2( and esp, 0xfffffff0)\
1782  AS2( sub esp, 48*n+16)\
1783  AS1( push edx)\
1784  AS2( xor edx, edx) \
1785  ASL(1) \
1786  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1787  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1788  ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1789  AS2( movdqa [esp+20+2*edx], xmm0) \
1790  AS2( psrlq xmm0, 32) \
1791  AS2( movdqa [esp+20+2*edx+16], xmm0) \
1792  AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1793  AS2( psrlq xmm1, 32) \
1794  AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1795  AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1796  AS2( psrlq xmm2, 32) \
1797  AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1798  AS2( add edx, 16) \
1799  AS2( cmp edx, 8*(n)) \
1800  ASJ( jne, 1, b) \
1801  AS2( mov eax, esi) \
1802  AS2( lea edi, [esp+20+00*n+16*(n/2-1)])\
1803  AS2( lea edx, [esp+20+16*n+16*(n/2-1)])\
1804  AS2( lea esi, [esp+20+32*n+16*(n/2-1)])\
1805  AS2( pxor xmm4, xmm4)\
1806  AS2( pxor xmm5, xmm5)
1807 
1808 #define Top_Acc(i) \
1809  AS2( movq xmm0, QWORD PTR [esi+i/2*(1-(i-2*(i/2))*2)*16+8]) \
1810  AS2( pmuludq xmm0, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1811  AS2( psrlq xmm0, 48) \
1812  AS2( paddd xmm5, xmm0)\
1813 
1814 #define Top_Column0(i) \
1815  AS2( psllq xmm5, 32) \
1816  AS2( add edi, 16) \
1817  AS2( add edx, 16) \
1818  SSE2_MulAdd45\
1819  Mul_Acc##i(i) \
1820 
1821 #define Top_Column1(i) \
1822  SSE2_SaveShift(0) \
1823  AS2( add esi, 16) \
1824  SSE2_MulAdd45\
1825  Mul_Acc##i(i) \
1826  AS2( shr eax, 16) \
1827  AS2( movd xmm0, eax)\
1828  AS2( movd xmm1, [ecx+4])\
1829  AS2( psrld xmm1, 16)\
1830  AS2( pcmpgtd xmm1, xmm0)\
1831  AS2( psrld xmm1, 31)\
1832  AS2( paddd xmm4, xmm1)\
1833 
1834 void SSE2_Square4(word *C, const word *A)
1835 {
1836  Squ_Begin(2)
1837  Squ_Column0(0, 1)
1838  Squ_End(2)
1839 }
1840 
1841 void SSE2_Square8(word *C, const word *A)
1842 {
1843  Squ_Begin(4)
1844 #ifndef __GNUC__
1845  ASJ( jmp, 0, f)
1846  Squ_Acc(2)
1847  AS1( ret) ASL(0)
1848 #endif
1849  Squ_Column0(0, 1)
1850  Squ_Column1(1, 1)
1851  Squ_Column0(2, 2)
1852  Squ_Column1(3, 1)
1853  Squ_Column0(4, 1)
1854  Squ_End(4)
1855 }
1856 
1857 void SSE2_Square16(word *C, const word *A)
1858 {
1859  Squ_Begin(8)
1860 #ifndef __GNUC__
1861  ASJ( jmp, 0, f)
1862  Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1863  AS1( ret) ASL(0)
1864 #endif
1865  Squ_Column0(0, 1)
1866  Squ_Column1(1, 1)
1867  Squ_Column0(2, 2)
1868  Squ_Column1(3, 2)
1869  Squ_Column0(4, 3)
1870  Squ_Column1(5, 3)
1871  Squ_Column0(6, 4)
1872  Squ_Column1(7, 3)
1873  Squ_Column0(8, 3)
1874  Squ_Column1(9, 2)
1875  Squ_Column0(10, 2)
1876  Squ_Column1(11, 1)
1877  Squ_Column0(12, 1)
1878  Squ_End(8)
1879 }
1880 
1881 void SSE2_Square32(word *C, const word *A)
1882 {
1883  Squ_Begin(16)
1884  ASJ( jmp, 0, f)
1885  Squ_Acc(8) Squ_Acc(7) Squ_Acc(6) Squ_Acc(5) Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1886  AS1( ret) ASL(0)
1887  Squ_Column0(0, 1)
1888  Squ_Column1(1, 1)
1889  Squ_Column0(2, 2)
1890  Squ_Column1(3, 2)
1891  Squ_Column0(4, 3)
1892  Squ_Column1(5, 3)
1893  Squ_Column0(6, 4)
1894  Squ_Column1(7, 4)
1895  Squ_Column0(8, 5)
1896  Squ_Column1(9, 5)
1897  Squ_Column0(10, 6)
1898  Squ_Column1(11, 6)
1899  Squ_Column0(12, 7)
1900  Squ_Column1(13, 7)
1901  Squ_Column0(14, 8)
1902  Squ_Column1(15, 7)
1903  Squ_Column0(16, 7)
1904  Squ_Column1(17, 6)
1905  Squ_Column0(18, 6)
1906  Squ_Column1(19, 5)
1907  Squ_Column0(20, 5)
1908  Squ_Column1(21, 4)
1909  Squ_Column0(22, 4)
1910  Squ_Column1(23, 3)
1911  Squ_Column0(24, 3)
1912  Squ_Column1(25, 2)
1913  Squ_Column0(26, 2)
1914  Squ_Column1(27, 1)
1915  Squ_Column0(28, 1)
1916  Squ_End(16)
1917 }
1918 
1919 void SSE2_Multiply4(word *C, const word *A, const word *B)
1920 {
1921  Mul_Begin(2)
1922 #ifndef __GNUC__
1923  ASJ( jmp, 0, f)
1924  Mul_Acc(2)
1925  AS1( ret) ASL(0)
1926 #endif
1927  Mul_Column0(0, 2)
1928  Mul_End(2)
1929 }
1930 
1931 void SSE2_Multiply8(word *C, const word *A, const word *B)
1932 {
1933  Mul_Begin(4)
1934 #ifndef __GNUC__
1935  ASJ( jmp, 0, f)
1936  Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1937  AS1( ret) ASL(0)
1938 #endif
1939  Mul_Column0(0, 2)
1940  Mul_Column1(1, 3)
1941  Mul_Column0(2, 4)
1942  Mul_Column1(3, 3)
1943  Mul_Column0(4, 2)
1944  Mul_End(4)
1945 }
1946 
1947 void SSE2_Multiply16(word *C, const word *A, const word *B)
1948 {
1949  Mul_Begin(8)
1950 #ifndef __GNUC__
1951  ASJ( jmp, 0, f)
1952  Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1953  AS1( ret) ASL(0)
1954 #endif
1955  Mul_Column0(0, 2)
1956  Mul_Column1(1, 3)
1957  Mul_Column0(2, 4)
1958  Mul_Column1(3, 5)
1959  Mul_Column0(4, 6)
1960  Mul_Column1(5, 7)
1961  Mul_Column0(6, 8)
1962  Mul_Column1(7, 7)
1963  Mul_Column0(8, 6)
1964  Mul_Column1(9, 5)
1965  Mul_Column0(10, 4)
1966  Mul_Column1(11, 3)
1967  Mul_Column0(12, 2)
1968  Mul_End(8)
1969 }
1970 
1971 void SSE2_Multiply32(word *C, const word *A, const word *B)
1972 {
1973  Mul_Begin(16)
1974  ASJ( jmp, 0, f)
1975  Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1976  AS1( ret) ASL(0)
1977  Mul_Column0(0, 2)
1978  Mul_Column1(1, 3)
1979  Mul_Column0(2, 4)
1980  Mul_Column1(3, 5)
1981  Mul_Column0(4, 6)
1982  Mul_Column1(5, 7)
1983  Mul_Column0(6, 8)
1984  Mul_Column1(7, 9)
1985  Mul_Column0(8, 10)
1986  Mul_Column1(9, 11)
1987  Mul_Column0(10, 12)
1988  Mul_Column1(11, 13)
1989  Mul_Column0(12, 14)
1990  Mul_Column1(13, 15)
1991  Mul_Column0(14, 16)
1992  Mul_Column1(15, 15)
1993  Mul_Column0(16, 14)
1994  Mul_Column1(17, 13)
1995  Mul_Column0(18, 12)
1996  Mul_Column1(19, 11)
1997  Mul_Column0(20, 10)
1998  Mul_Column1(21, 9)
1999  Mul_Column0(22, 8)
2000  Mul_Column1(23, 7)
2001  Mul_Column0(24, 6)
2002  Mul_Column1(25, 5)
2003  Mul_Column0(26, 4)
2004  Mul_Column1(27, 3)
2005  Mul_Column0(28, 2)
2006  Mul_End(16)
2007 }
2008 
2009 void SSE2_MultiplyBottom4(word *C, const word *A, const word *B)
2010 {
2011  Mul_Begin(2)
2012  Bot_SaveAcc(0) Bot_Acc(2)
2013  Bot_End(2)
2014 }
2015 
2016 void SSE2_MultiplyBottom8(word *C, const word *A, const word *B)
2017 {
2018  Mul_Begin(4)
2019 #ifndef __GNUC__
2020  ASJ( jmp, 0, f)
2021  Mul_Acc(3) Mul_Acc(2)
2022  AS1( ret) ASL(0)
2023 #endif
2024  Mul_Column0(0, 2)
2025  Mul_Column1(1, 3)
2026  Bot_SaveAcc(2) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2027  Bot_End(4)
2028 }
2029 
2030 void SSE2_MultiplyBottom16(word *C, const word *A, const word *B)
2031 {
2032  Mul_Begin(8)
2033 #ifndef __GNUC__
2034  ASJ( jmp, 0, f)
2035  Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2036  AS1( ret) ASL(0)
2037 #endif
2038  Mul_Column0(0, 2)
2039  Mul_Column1(1, 3)
2040  Mul_Column0(2, 4)
2041  Mul_Column1(3, 5)
2042  Mul_Column0(4, 6)
2043  Mul_Column1(5, 7)
2044  Bot_SaveAcc(6) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2045  Bot_End(8)
2046 }
2047 
2048 void SSE2_MultiplyBottom32(word *C, const word *A, const word *B)
2049 {
2050  Mul_Begin(16)
2051 #ifndef __GNUC__
2052  ASJ( jmp, 0, f)
2053  Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2054  AS1( ret) ASL(0)
2055 #endif
2056  Mul_Column0(0, 2)
2057  Mul_Column1(1, 3)
2058  Mul_Column0(2, 4)
2059  Mul_Column1(3, 5)
2060  Mul_Column0(4, 6)
2061  Mul_Column1(5, 7)
2062  Mul_Column0(6, 8)
2063  Mul_Column1(7, 9)
2064  Mul_Column0(8, 10)
2065  Mul_Column1(9, 11)
2066  Mul_Column0(10, 12)
2067  Mul_Column1(11, 13)
2068  Mul_Column0(12, 14)
2069  Mul_Column1(13, 15)
2070  Bot_SaveAcc(14) Bot_Acc(16) Bot_Acc(15) Bot_Acc(14) Bot_Acc(13) Bot_Acc(12) Bot_Acc(11) Bot_Acc(10) Bot_Acc(9) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2071  Bot_End(16)
2072 }
2073 
2074 void SSE2_MultiplyTop8(word *C, const word *A, const word *B, word L)
2075 {
2076  Top_Begin(4)
2077  Top_Acc(3) Top_Acc(2) Top_Acc(1)
2078 #ifndef __GNUC__
2079  ASJ( jmp, 0, f)
2080  Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2081  AS1( ret) ASL(0)
2082 #endif
2083  Top_Column0(4)
2084  Top_Column1(3)
2085  Mul_Column0(0, 2)
2086  Top_End(2)
2087 }
2088 
2089 void SSE2_MultiplyTop16(word *C, const word *A, const word *B, word L)
2090 {
2091  Top_Begin(8)
2092  Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2093 #ifndef __GNUC__
2094  ASJ( jmp, 0, f)
2095  Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2096  AS1( ret) ASL(0)
2097 #endif
2098  Top_Column0(8)
2099  Top_Column1(7)
2100  Mul_Column0(0, 6)
2101  Mul_Column1(1, 5)
2102  Mul_Column0(2, 4)
2103  Mul_Column1(3, 3)
2104  Mul_Column0(4, 2)
2105  Top_End(4)
2106 }
2107 
2108 void SSE2_MultiplyTop32(word *C, const word *A, const word *B, word L)
2109 {
2110  Top_Begin(16)
2111  Top_Acc(15) Top_Acc(14) Top_Acc(13) Top_Acc(12) Top_Acc(11) Top_Acc(10) Top_Acc(9) Top_Acc(8) Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2112 #ifndef __GNUC__
2113  ASJ( jmp, 0, f)
2114  Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2115  AS1( ret) ASL(0)
2116 #endif
2117  Top_Column0(16)
2118  Top_Column1(15)
2119  Mul_Column0(0, 14)
2120  Mul_Column1(1, 13)
2121  Mul_Column0(2, 12)
2122  Mul_Column1(3, 11)
2123  Mul_Column0(4, 10)
2124  Mul_Column1(5, 9)
2125  Mul_Column0(6, 8)
2126  Mul_Column1(7, 7)
2127  Mul_Column0(8, 6)
2128  Mul_Column1(9, 5)
2129  Mul_Column0(10, 4)
2130  Mul_Column1(11, 3)
2131  Mul_Column0(12, 2)
2132  Top_End(8)
2133 }
2134 
2135 #endif // #if CRYPTOPP_INTEGER_SSE2
2136 
2137 // ********************************************************
2138 
2139 typedef int (CRYPTOPP_FASTCALL * PAdd)(size_t N, word *C, const word *A, const word *B);
2140 typedef void (* PMul)(word *C, const word *A, const word *B);
2141 typedef void (* PSqu)(word *C, const word *A);
2142 typedef void (* PMulTop)(word *C, const word *A, const word *B, word L);
2143 
2144 #if CRYPTOPP_INTEGER_SSE2
2145 static PAdd s_pAdd = &Baseline_Add, s_pSub = &Baseline_Sub;
2146 static size_t s_recursionLimit = 8;
2147 #else
2148 static const size_t s_recursionLimit = 16;
2149 #endif // CRYPTOPP_INTEGER_SSE2
2150 
2151 static PMul s_pMul[9], s_pBot[9];
2152 static PSqu s_pSqu[9];
2153 static PMulTop s_pTop[9];
2154 
2155 void SetFunctionPointers()
2156 {
2157  s_pMul[0] = &Baseline_Multiply2;
2158  s_pBot[0] = &Baseline_MultiplyBottom2;
2159  s_pSqu[0] = &Baseline_Square2;
2160  s_pTop[0] = &Baseline_MultiplyTop2;
2161  s_pTop[1] = &Baseline_MultiplyTop4;
2162 
2163 #if CRYPTOPP_INTEGER_SSE2
2164  if (HasSSE2())
2165  {
2166  if (IsP4())
2167  {
2168  s_pAdd = &SSE2_Add;
2169  s_pSub = &SSE2_Sub;
2170  }
2171 
2172  s_recursionLimit = 32;
2173 
2174  s_pMul[1] = &SSE2_Multiply4;
2175  s_pMul[2] = &SSE2_Multiply8;
2176  s_pMul[4] = &SSE2_Multiply16;
2177  s_pMul[8] = &SSE2_Multiply32;
2178 
2179  s_pBot[1] = &SSE2_MultiplyBottom4;
2180  s_pBot[2] = &SSE2_MultiplyBottom8;
2181  s_pBot[4] = &SSE2_MultiplyBottom16;
2182  s_pBot[8] = &SSE2_MultiplyBottom32;
2183 
2184  s_pSqu[1] = &SSE2_Square4;
2185  s_pSqu[2] = &SSE2_Square8;
2186  s_pSqu[4] = &SSE2_Square16;
2187  s_pSqu[8] = &SSE2_Square32;
2188 
2189  s_pTop[2] = &SSE2_MultiplyTop8;
2190  s_pTop[4] = &SSE2_MultiplyTop16;
2191  s_pTop[8] = &SSE2_MultiplyTop32;
2192  }
2193  else
2194 #endif // CRYPTOPP_INTEGER_SSE2
2195  {
2196  s_pMul[1] = &Baseline_Multiply4;
2197  s_pMul[2] = &Baseline_Multiply8;
2198 
2199  s_pBot[1] = &Baseline_MultiplyBottom4;
2200  s_pBot[2] = &Baseline_MultiplyBottom8;
2201 
2202  s_pSqu[1] = &Baseline_Square4;
2203  s_pSqu[2] = &Baseline_Square8;
2204 
2205  s_pTop[2] = &Baseline_MultiplyTop8;
2206 
2207 #if !CRYPTOPP_INTEGER_SSE2
2208  s_pMul[4] = &Baseline_Multiply16;
2209  s_pBot[4] = &Baseline_MultiplyBottom16;
2210  s_pSqu[4] = &Baseline_Square16;
2211  s_pTop[4] = &Baseline_MultiplyTop16;
2212 #endif // !CRYPTOPP_INTEGER_SSE2
2213  }
2214 }
2215 
2216 inline int Add(word *C, const word *A, const word *B, size_t N)
2217 {
2218 #if CRYPTOPP_INTEGER_SSE2
2219  return s_pAdd(N, C, A, B);
2220 #else
2221  return Baseline_Add(N, C, A, B);
2222 #endif // CRYPTOPP_INTEGER_SSE2
2223 }
2224 
2225 inline int Subtract(word *C, const word *A, const word *B, size_t N)
2226 {
2227 #if CRYPTOPP_INTEGER_SSE2
2228  return s_pSub(N, C, A, B);
2229 #else
2230  return Baseline_Sub(N, C, A, B);
2231 #endif // CRYPTOPP_INTEGER_SSE2
2232 }
2233 
2234 // ********************************************************
2235 
2236 
2237 #define A0 A
2238 #define A1 (A+N2)
2239 #define B0 B
2240 #define B1 (B+N2)
2241 
2242 #define T0 T
2243 #define T1 (T+N2)
2244 #define T2 (T+N)
2245 #define T3 (T+N+N2)
2246 
2247 #define R0 R
2248 #define R1 (R+N2)
2249 #define R2 (R+N)
2250 #define R3 (R+N+N2)
2251 
2252 // R[2*N] - result = A*B
2253 // T[2*N] - temporary work space
2254 // A[N] --- multiplier
2255 // B[N] --- multiplicant
2256 
2257 void RecursiveMultiply(word *R, word *T, const word *A, const word *B, size_t N)
2258 {
2259  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2260 
2261  if (N <= s_recursionLimit)
2262  s_pMul[N/4](R, A, B);
2263  else
2264  {
2265  const size_t N2 = N/2;
2266 
2267  size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2268  Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2269 
2270  size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2271  Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2272 
2273  RecursiveMultiply(R2, T2, A1, B1, N2);
2274  RecursiveMultiply(T0, T2, R0, R1, N2);
2275  RecursiveMultiply(R0, T2, A0, B0, N2);
2276 
2277  // now T[01] holds (A1-A0)*(B0-B1), R[01] holds A0*B0, R[23] holds A1*B1
2278 
2279  int c2 = Add(R2, R2, R1, N2);
2280  int c3 = c2;
2281  c2 += Add(R1, R2, R0, N2);
2282  c3 += Add(R2, R2, R3, N2);
2283 
2284  if (AN2 == BN2)
2285  c3 -= Subtract(R1, R1, T0, N);
2286  else
2287  c3 += Add(R1, R1, T0, N);
2288 
2289  c3 += Increment(R2, N2, c2);
2290  CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2291  Increment(R3, N2, c3);
2292  }
2293 }
2294 
2295 // R[2*N] - result = A*A
2296 // T[2*N] - temporary work space
2297 // A[N] --- number to be squared
2298 
2299 void RecursiveSquare(word *R, word *T, const word *A, size_t N)
2300 {
2301  CRYPTOPP_ASSERT(N && N%2==0);
2302 
2303  if (N <= s_recursionLimit)
2304  s_pSqu[N/4](R, A);
2305  else
2306  {
2307  const size_t N2 = N/2;
2308 
2309  RecursiveSquare(R0, T2, A0, N2);
2310  RecursiveSquare(R2, T2, A1, N2);
2311  RecursiveMultiply(T0, T2, A0, A1, N2);
2312 
2313  int carry = Add(R1, R1, T0, N);
2314  carry += Add(R1, R1, T0, N);
2315  Increment(R3, N2, carry);
2316  }
2317 }
2318 
2319 // R[N] - bottom half of A*B
2320 // T[3*N/2] - temporary work space
2321 // A[N] - multiplier
2322 // B[N] - multiplicant
2323 
2324 void RecursiveMultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2325 {
2326  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2327 
2328  if (N <= s_recursionLimit)
2329  s_pBot[N/4](R, A, B);
2330  else
2331  {
2332  const size_t N2 = N/2;
2333 
2334  RecursiveMultiply(R, T, A0, B0, N2);
2335  RecursiveMultiplyBottom(T0, T1, A1, B0, N2);
2336  Add(R1, R1, T0, N2);
2337  RecursiveMultiplyBottom(T0, T1, A0, B1, N2);
2338  Add(R1, R1, T0, N2);
2339  }
2340 }
2341 
2342 // R[N] --- upper half of A*B
2343 // T[2*N] - temporary work space
2344 // L[N] --- lower half of A*B
2345 // A[N] --- multiplier
2346 // B[N] --- multiplicant
2347 
2348 void MultiplyTop(word *R, word *T, const word *L, const word *A, const word *B, size_t N)
2349 {
2350  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2351 
2352  if (N <= s_recursionLimit)
2353  s_pTop[N/4](R, A, B, L[N-1]);
2354  else
2355  {
2356  const size_t N2 = N/2;
2357 
2358  size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2359  Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2360 
2361  size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2362  Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2363 
2364  RecursiveMultiply(T0, T2, R0, R1, N2);
2365  RecursiveMultiply(R0, T2, A1, B1, N2);
2366 
2367  // now T[01] holds (A1-A0)*(B0-B1) = A1*B0+A0*B1-A1*B1-A0*B0, R[01] holds A1*B1
2368 
2369  int t, c3;
2370  int c2 = Subtract(T2, L+N2, L, N2);
2371 
2372  if (AN2 == BN2)
2373  {
2374  c2 -= Add(T2, T2, T0, N2);
2375  t = (Compare(T2, R0, N2) == -1);
2376  c3 = t - Subtract(T2, T2, T1, N2);
2377  }
2378  else
2379  {
2380  c2 += Subtract(T2, T2, T0, N2);
2381  t = (Compare(T2, R0, N2) == -1);
2382  c3 = t + Add(T2, T2, T1, N2);
2383  }
2384 
2385  c2 += t;
2386  if (c2 >= 0)
2387  c3 += Increment(T2, N2, c2);
2388  else
2389  c3 -= Decrement(T2, N2, -c2);
2390  c3 += Add(R0, T2, R1, N2);
2391 
2392  CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2393  Increment(R1, N2, c3);
2394  }
2395 }
2396 
2397 inline void Multiply(word *R, word *T, const word *A, const word *B, size_t N)
2398 {
2399  RecursiveMultiply(R, T, A, B, N);
2400 }
2401 
2402 inline void Square(word *R, word *T, const word *A, size_t N)
2403 {
2404  RecursiveSquare(R, T, A, N);
2405 }
2406 
2407 inline void MultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2408 {
2409  RecursiveMultiplyBottom(R, T, A, B, N);
2410 }
2411 
2412 // R[NA+NB] - result = A*B
2413 // T[NA+NB] - temporary work space
2414 // A[NA] ---- multiplier
2415 // B[NB] ---- multiplicant
2416 
2417 void AsymmetricMultiply(word *R, word *T, const word *A, size_t NA, const word *B, size_t NB)
2418 {
2419  if (NA == NB)
2420  {
2421  // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
2422  // The code change occurred at Commit dc99266599a0e72d.
2423  if (A != B)
2424  Multiply(R, T, A, B, NA);
2425  else
2426  Square(R, T, A, NA);
2427 
2428  return;
2429  }
2430 
2431  if (NA > NB)
2432  {
2433  std::swap(A, B);
2434  std::swap(NA, NB);
2435  }
2436 
2437  CRYPTOPP_ASSERT(NB % NA == 0);
2438 
2439  if (NA==2 && !A[1])
2440  {
2441  // Profiling tells us the original Default case was dominant, so it was
2442  // promoted to the first Case statement. The code change occurred at
2443  // Commit dc99266599a0e72d.
2444  switch (A[0])
2445  {
2446  default:
2447  R[NB] = LinearMultiply(R, B, A[0], NB);
2448  R[NB+1] = 0;
2449  return;
2450  case 0:
2451  SetWords(R, 0, NB+2);
2452  return;
2453  case 1:
2454  CopyWords(R, B, NB);
2455  R[NB] = R[NB+1] = 0;
2456  return;
2457  }
2458  }
2459 
2460  size_t i;
2461  if ((NB/NA)%2 == 0)
2462  {
2463  Multiply(R, T, A, B, NA);
2464  CopyWords(T+2*NA, R+NA, NA);
2465 
2466  for (i=2*NA; i<NB; i+=2*NA)
2467  Multiply(T+NA+i, T, A, B+i, NA);
2468  for (i=NA; i<NB; i+=2*NA)
2469  Multiply(R+i, T, A, B+i, NA);
2470  }
2471  else
2472  {
2473  for (i=0; i<NB; i+=2*NA)
2474  Multiply(R+i, T, A, B+i, NA);
2475  for (i=NA; i<NB; i+=2*NA)
2476  Multiply(T+NA+i, T, A, B+i, NA);
2477  }
2478 
2479  if (Add(R+NA, R+NA, T+2*NA, NB-NA))
2480  Increment(R+NB, NA);
2481 }
2482 
2483 // R[N] ----- result = A inverse mod 2**(WORD_BITS*N)
2484 // T[3*N/2] - temporary work space
2485 // A[N] ----- an odd number as input
2486 
2487 void RecursiveInverseModPower2(word *R, word *T, const word *A, size_t N)
2488 {
2489  // Profiling tells us the original Else was dominant, so it was
2490  // promoted to the first If statement. The code change occurred
2491  // at Commit dc99266599a0e72d.
2492  if (N!=2)
2493  {
2494  const size_t N2 = N/2;
2495  RecursiveInverseModPower2(R0, T0, A0, N2);
2496  T0[0] = 1;
2497  SetWords(T0+1, 0, N2-1);
2498  MultiplyTop(R1, T1, T0, R0, A0, N2);
2499  MultiplyBottom(T0, T1, R0, A1, N2);
2500  Add(T0, R1, T0, N2);
2501  TwosComplement(T0, N2);
2502  MultiplyBottom(R1, T1, R0, T0, N2);
2503  }
2504  else
2505  {
2506  T[0] = AtomicInverseModPower2(A[0]);
2507  T[1] = 0;
2508  s_pBot[0](T+2, T, A);
2509  TwosComplement(T+2, 2);
2510  Increment(T+2, 2, 2);
2511  s_pBot[0](R, T, T+2);
2512  }
2513 }
2514 
2515 // R[N] --- result = X/(2**(WORD_BITS*N)) mod M
2516 // T[3*N] - temporary work space
2517 // X[2*N] - number to be reduced
2518 // M[N] --- modulus
2519 // U[N] --- multiplicative inverse of M mod 2**(WORD_BITS*N)
2520 
2521 void MontgomeryReduce(word *R, word *T, word *X, const word *M, const word *U, size_t N)
2522 {
2523 #if 1
2524  MultiplyBottom(R, T, X, U, N);
2525  MultiplyTop(T, T+N, X, R, M, N);
2526  word borrow = Subtract(T, X+N, T, N);
2527  // defend against timing attack by doing this Add even when not needed
2528  word carry = Add(T+N, T, M, N);
2529  CRYPTOPP_ASSERT(carry | !borrow);
2530  CRYPTOPP_UNUSED(carry), CRYPTOPP_UNUSED(borrow);
2531  CopyWords(R, T + ((0-borrow) & N), N);
2532 #elif 0
2533  const word u = 0-U[0];
2534  Declare2Words(p)
2535  for (size_t i=0; i<N; i++)
2536  {
2537  const word t = u * X[i];
2538  word c = 0;
2539  for (size_t j=0; j<N; j+=2)
2540  {
2541  MultiplyWords(p, t, M[j]);
2542  Acc2WordsBy1(p, X[i+j]);
2543  Acc2WordsBy1(p, c);
2544  X[i+j] = LowWord(p);
2545  c = HighWord(p);
2546  MultiplyWords(p, t, M[j+1]);
2547  Acc2WordsBy1(p, X[i+j+1]);
2548  Acc2WordsBy1(p, c);
2549  X[i+j+1] = LowWord(p);
2550  c = HighWord(p);
2551  }
2552 
2553  if (Increment(X+N+i, N-i, c))
2554  while (!Subtract(X+N, X+N, M, N)) {}
2555  }
2556 
2557  memcpy(R, X+N, N*WORD_SIZE);
2558 #else
2559  __m64 u = _mm_cvtsi32_si64(0-U[0]), p;
2560  for (size_t i=0; i<N; i++)
2561  {
2562  __m64 t = _mm_cvtsi32_si64(X[i]);
2563  t = _mm_mul_su32(t, u);
2564  __m64 c = _mm_setzero_si64();
2565  for (size_t j=0; j<N; j+=2)
2566  {
2567  p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j]));
2568  p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j]));
2569  c = _mm_add_si64(c, p);
2570  X[i+j] = _mm_cvtsi64_si32(c);
2571  c = _mm_srli_si64(c, 32);
2572  p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j+1]));
2573  p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j+1]));
2574  c = _mm_add_si64(c, p);
2575  X[i+j+1] = _mm_cvtsi64_si32(c);
2576  c = _mm_srli_si64(c, 32);
2577  }
2578 
2579  if (Increment(X+N+i, N-i, _mm_cvtsi64_si32(c)))
2580  while (!Subtract(X+N, X+N, M, N)) {}
2581  }
2582 
2583  memcpy(R, X+N, N*WORD_SIZE);
2584  _mm_empty();
2585 #endif
2586 }
2587 
2588 // R[N] --- result = X/(2**(WORD_BITS*N/2)) mod M
2589 // T[2*N] - temporary work space
2590 // X[2*N] - number to be reduced
2591 // M[N] --- modulus
2592 // U[N/2] - multiplicative inverse of M mod 2**(WORD_BITS*N/2)
2593 // V[N] --- 2**(WORD_BITS*3*N/2) mod M
2594 
2595 void HalfMontgomeryReduce(word *R, word *T, const word *X, const word *M, const word *U, const word *V, size_t N)
2596 {
2597  CRYPTOPP_ASSERT(N%2==0 && N>=4);
2598 
2599 #define M0 M
2600 #define M1 (M+N2)
2601 #define V0 V
2602 #define V1 (V+N2)
2603 
2604 #define X0 X
2605 #define X1 (X+N2)
2606 #define X2 (X+N)
2607 #define X3 (X+N+N2)
2608 
2609  const size_t N2 = N/2;
2610  Multiply(T0, T2, V0, X3, N2);
2611  int c2 = Add(T0, T0, X0, N);
2612  MultiplyBottom(T3, T2, T0, U, N2);
2613  MultiplyTop(T2, R, T0, T3, M0, N2);
2614  c2 -= Subtract(T2, T1, T2, N2);
2615  Multiply(T0, R, T3, M1, N2);
2616  c2 -= Subtract(T0, T2, T0, N2);
2617  int c3 = -(int)Subtract(T1, X2, T1, N2);
2618  Multiply(R0, T2, V1, X3, N2);
2619  c3 += Add(R, R, T, N);
2620 
2621  if (c2>0)
2622  c3 += Increment(R1, N2);
2623  else if (c2<0)
2624  c3 -= Decrement(R1, N2, -c2);
2625 
2626  CRYPTOPP_ASSERT(c3>=-1 && c3<=1);
2627  if (c3>0)
2628  Subtract(R, R, M, N);
2629  else if (c3<0)
2630  Add(R, R, M, N);
2631 
2632 #undef M0
2633 #undef M1
2634 #undef V0
2635 #undef V1
2636 
2637 #undef X0
2638 #undef X1
2639 #undef X2
2640 #undef X3
2641 }
2642 
2643 #undef A0
2644 #undef A1
2645 #undef B0
2646 #undef B1
2647 
2648 #undef T0
2649 #undef T1
2650 #undef T2
2651 #undef T3
2652 
2653 #undef R0
2654 #undef R1
2655 #undef R2
2656 #undef R3
2657 
2658 /*
2659 // do a 3 word by 2 word divide, returns quotient and leaves remainder in A
2660 static word SubatomicDivide(word *A, word B0, word B1)
2661 {
2662  // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a word
2663  CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
2664 
2665  // estimate the quotient: do a 2 word by 1 word divide
2666  word Q;
2667  if (B1+1 == 0)
2668  Q = A[2];
2669  else
2670  Q = DWord(A[1], A[2]).DividedBy(B1+1);
2671 
2672  // now subtract Q*B from A
2673  DWord p = DWord::Multiply(B0, Q);
2674  DWord u = (DWord) A[0] - p.GetLowHalf();
2675  A[0] = u.GetLowHalf();
2676  u = (DWord) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - DWord::Multiply(B1, Q);
2677  A[1] = u.GetLowHalf();
2678  A[2] += u.GetHighHalf();
2679 
2680  // Q <= actual quotient, so fix it
2681  while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
2682  {
2683  u = (DWord) A[0] - B0;
2684  A[0] = u.GetLowHalf();
2685  u = (DWord) A[1] - B1 - u.GetHighHalfAsBorrow();
2686  A[1] = u.GetLowHalf();
2687  A[2] += u.GetHighHalf();
2688  Q++;
2689  CRYPTOPP_ASSERT(Q); // shouldn't overflow
2690  }
2691 
2692  return Q;
2693 }
2694 
2695 // do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
2696 static inline void AtomicDivide(word *Q, const word *A, const word *B)
2697 {
2698  if (!B[0] && !B[1]) // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
2699  {
2700  Q[0] = A[2];
2701  Q[1] = A[3];
2702  }
2703  else
2704  {
2705  word T[4];
2706  T[0] = A[0]; T[1] = A[1]; T[2] = A[2]; T[3] = A[3];
2707  Q[1] = SubatomicDivide(T+1, B[0], B[1]);
2708  Q[0] = SubatomicDivide(T, B[0], B[1]);
2709 
2710 #if defined(CRYPTOPP_DEBUG)
2711  // multiply quotient and divisor and add remainder, make sure it equals dividend
2712  CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2713  word P[4];
2714  LowLevel::Multiply2(P, Q, B);
2715  Add(P, P, T, 4);
2716  CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2717 #endif
2718  }
2719 }
2720 */
2721 
2722 static inline void AtomicDivide(word *Q, const word *A, const word *B)
2723 {
2724  word T[4];
2725  DWord q = DivideFourWordsByTwo<word, DWord>(T, DWord(A[0], A[1]), DWord(A[2], A[3]), DWord(B[0], B[1]));
2726  Q[0] = q.GetLowHalf();
2727  Q[1] = q.GetHighHalf();
2728 
2729 #if defined(CRYPTOPP_DEBUG)
2730  if (B[0] || B[1])
2731  {
2732  // multiply quotient and divisor and add remainder, make sure it equals dividend
2733  CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2734  word P[4];
2735  s_pMul[0](P, Q, B);
2736  Add(P, P, T, 4);
2737  CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2738  }
2739 #endif
2740 }
2741 
2742 // for use by Divide(), corrects the underestimated quotient {Q1,Q0}
2743 static void CorrectQuotientEstimate(word *R, word *T, word *Q, const word *B, size_t N)
2744 {
2745  CRYPTOPP_ASSERT(N && N%2==0);
2746 
2747  AsymmetricMultiply(T, T+N+2, Q, 2, B, N);
2748 
2749  word borrow = Subtract(R, R, T, N+2);
2750  CRYPTOPP_ASSERT(!borrow && !R[N+1]);
2751  CRYPTOPP_UNUSED(borrow);
2752 
2753  while (R[N] || Compare(R, B, N) >= 0)
2754  {
2755  R[N] -= Subtract(R, R, B, N);
2756  Q[1] += (++Q[0]==0);
2757  CRYPTOPP_ASSERT(Q[0] || Q[1]); // no overflow
2758  }
2759 }
2760 
2761 // R[NB] -------- remainder = A%B
2762 // Q[NA-NB+2] --- quotient = A/B
2763 // T[NA+3*(NB+2)] - temp work space
2764 // A[NA] -------- dividend
2765 // B[NB] -------- divisor
2766 
2767 void Divide(word *R, word *Q, word *T, const word *A, size_t NA, const word *B, size_t NB)
2768 {
2769  CRYPTOPP_ASSERT(NA && NB && NA%2==0 && NB%2==0);
2770  CRYPTOPP_ASSERT(B[NB-1] || B[NB-2]);
2771  CRYPTOPP_ASSERT(NB <= NA);
2772 
2773  // set up temporary work space
2774  word *const TA=T;
2775  word *const TB=T+NA+2;
2776  word *const TP=T+NA+2+NB;
2777 
2778  // copy B into TB and normalize it so that TB has highest bit set to 1
2779  unsigned shiftWords = (B[NB-1]==0);
2780  TB[0] = TB[NB-1] = 0;
2781  CopyWords(TB+shiftWords, B, NB-shiftWords);
2782  unsigned shiftBits = WORD_BITS - BitPrecision(TB[NB-1]);
2783  CRYPTOPP_ASSERT(shiftBits < WORD_BITS);
2784  ShiftWordsLeftByBits(TB, NB, shiftBits);
2785 
2786  // copy A into TA and normalize it
2787  TA[0] = TA[NA] = TA[NA+1] = 0;
2788  CopyWords(TA+shiftWords, A, NA);
2789  ShiftWordsLeftByBits(TA, NA+2, shiftBits);
2790 
2791  if (TA[NA+1]==0 && TA[NA] <= 1)
2792  {
2793  Q[NA-NB+1] = Q[NA-NB] = 0;
2794  while (TA[NA] || Compare(TA+NA-NB, TB, NB) >= 0)
2795  {
2796  TA[NA] -= Subtract(TA+NA-NB, TA+NA-NB, TB, NB);
2797  ++Q[NA-NB];
2798  }
2799  }
2800  else
2801  {
2802  NA+=2;
2803  CRYPTOPP_ASSERT(Compare(TA+NA-NB, TB, NB) < 0);
2804  }
2805 
2806  word BT[2];
2807  BT[0] = TB[NB-2] + 1;
2808  BT[1] = TB[NB-1] + (BT[0]==0);
2809 
2810  // start reducing TA mod TB, 2 words at a time
2811  for (size_t i=NA-2; i>=NB; i-=2)
2812  {
2813  AtomicDivide(Q+i-NB, TA+i-2, BT);
2814  CorrectQuotientEstimate(TA+i-NB, TP, Q+i-NB, TB, NB);
2815  }
2816 
2817  // copy TA into R, and denormalize it
2818  CopyWords(R, TA+shiftWords, NB);
2819  ShiftWordsRightByBits(R, NB, shiftBits);
2820 }
2821 
2822 static inline size_t EvenWordCount(const word *X, size_t N)
2823 {
2824  while (N && X[N-2]==0 && X[N-1]==0)
2825  N-=2;
2826  return N;
2827 }
2828 
2829 // return k
2830 // R[N] --- result = A^(-1) * 2^k mod M
2831 // T[4*N] - temporary work space
2832 // A[NA] -- number to take inverse of
2833 // M[N] --- modulus
2834 
2835 unsigned int AlmostInverse(word *R, word *T, const word *A, size_t NA, const word *M, size_t N)
2836 {
2837  CRYPTOPP_ASSERT(NA<=N && N && N%2==0);
2838 
2839  word *b = T;
2840  word *c = T+N;
2841  word *f = T+2*N;
2842  word *g = T+3*N;
2843  size_t bcLen=2, fgLen=EvenWordCount(M, N);
2844  unsigned int k=0;
2845  bool s=false;
2846 
2847  SetWords(T, 0, 3*N);
2848  b[0]=1;
2849  CopyWords(f, A, NA);
2850  CopyWords(g, M, N);
2851 
2852  while (1)
2853  {
2854  word t=f[0];
2855  while (!t)
2856  {
2857  if (EvenWordCount(f, fgLen)==0)
2858  {
2859  SetWords(R, 0, N);
2860  return 0;
2861  }
2862 
2863  ShiftWordsRightByWords(f, fgLen, 1);
2864  bcLen += 2 * (c[bcLen-1] != 0);
2865  CRYPTOPP_ASSERT(bcLen <= N);
2866  ShiftWordsLeftByWords(c, bcLen, 1);
2867  k+=WORD_BITS;
2868  t=f[0];
2869  }
2870 
2871  // t must be non-0; otherwise undefined.
2872  unsigned int i = TrailingZeros(t);
2873  t >>= i;
2874  k += i;
2875 
2876  if (t==1 && f[1]==0 && EvenWordCount(f+2, fgLen-2)==0)
2877  {
2878  if (s)
2879  Subtract(R, M, b, N);
2880  else
2881  CopyWords(R, b, N);
2882  return k;
2883  }
2884 
2885  ShiftWordsRightByBits(f, fgLen, i);
2886  t = ShiftWordsLeftByBits(c, bcLen, i);
2887  c[bcLen] += t;
2888  bcLen += 2 * (t!=0);
2889  CRYPTOPP_ASSERT(bcLen <= N);
2890 
2891  bool swap = Compare(f, g, fgLen)==-1;
2892  ConditionalSwapPointers(swap, f, g);
2893  ConditionalSwapPointers(swap, b, c);
2894  s ^= swap;
2895 
2896  fgLen -= 2 * !(f[fgLen-2] | f[fgLen-1]);
2897 
2898  Subtract(f, f, g, fgLen);
2899  t = Add(b, b, c, bcLen);
2900  b[bcLen] += t;
2901  bcLen += 2*t;
2902  CRYPTOPP_ASSERT(bcLen <= N);
2903  }
2904 }
2905 
2906 // R[N] - result = A/(2^k) mod M
2907 // A[N] - input
2908 // M[N] - modulus
2909 
2910 void DivideByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2911 {
2912  CopyWords(R, A, N);
2913 
2914  while (k--)
2915  {
2916  if (R[0]%2==0)
2917  ShiftWordsRightByBits(R, N, 1);
2918  else
2919  {
2920  word carry = Add(R, R, M, N);
2921  ShiftWordsRightByBits(R, N, 1);
2922  R[N-1] += carry<<(WORD_BITS-1);
2923  }
2924  }
2925 }
2926 
2927 // R[N] - result = A*(2^k) mod M
2928 // A[N] - input
2929 // M[N] - modulus
2930 
2931 void MultiplyByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2932 {
2933  CopyWords(R, A, N);
2934 
2935  while (k--)
2936  if (ShiftWordsLeftByBits(R, N, 1) || Compare(R, M, N)>=0)
2937  Subtract(R, R, M, N);
2938 }
2939 
2940 // ******************************************************************
2941 
2942 static const unsigned int RoundupSizeTable[] = {2, 2, 2, 4, 4, 8, 8, 8, 8};
2943 
2944 static inline size_t RoundupSize(size_t n)
2945 {
2946  if (n<=8)
2947  return RoundupSizeTable[n];
2948  else if (n<=16)
2949  return 16;
2950  else if (n<=32)
2951  return 32;
2952  else if (n<=64)
2953  return 64;
2954  else
2955  return size_t(1) << BitPrecision(n-1);
2956 }
2957 
2959  : reg(2), sign(POSITIVE)
2960 {
2961  reg[0] = reg[1] = 0;
2962 }
2963 
2965  : reg(RoundupSize(t.WordCount())), sign(t.sign)
2966 {
2967  CopyWords(reg, t.reg, reg.size());
2968 }
2969 
2970 Integer::Integer(Sign s, lword value)
2971  : reg(2), sign(s)
2972 {
2973  reg[0] = word(value);
2974  reg[1] = word(SafeRightShift<WORD_BITS>(value));
2975 }
2976 
2977 Integer::Integer(signed long value)
2978  : reg(2)
2979 {
2980  if (value >= 0)
2981  sign = POSITIVE;
2982  else
2983  {
2984  sign = NEGATIVE;
2985  value = -value;
2986  }
2987  reg[0] = word(value);
2988  reg[1] = word(SafeRightShift<WORD_BITS>((unsigned long)value));
2989 }
2990 
2991 Integer::Integer(Sign s, word high, word low)
2992  : reg(2), sign(s)
2993 {
2994  reg[0] = low;
2995  reg[1] = high;
2996 }
2997 
2999 {
3000  if (ByteCount() > sizeof(long))
3001  return false;
3002 
3003  unsigned long value = (unsigned long)reg[0];
3004  value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3005 
3006  if (sign==POSITIVE)
3007  return (signed long)value >= 0;
3008  else
3009  return -(signed long)value < 0;
3010 }
3011 
3012 signed long Integer::ConvertToLong() const
3013 {
3015 
3016  unsigned long value = (unsigned long)reg[0];
3017  value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3018  return sign==POSITIVE ? value : -(signed long)value;
3019 }
3020 
3021 Integer::Integer(BufferedTransformation &encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3022 {
3024 
3025  if (o != LITTLE_ENDIAN_ORDER)
3026  {
3027  Decode(encodedInteger, byteCount, s);
3028  }
3029  else
3030  {
3031  SecByteBlock block(byteCount);
3032  encodedInteger.Get(block, block.size());
3033  std::reverse(block.begin(), block.begin()+block.size());
3034 
3035  Decode(block.begin(), block.size(), s);
3036  }
3037 }
3038 
3039 Integer::Integer(const byte *encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3040 {
3041  CRYPTOPP_ASSERT(encodedInteger && byteCount); // NULL buffer
3043 
3044  if (o != LITTLE_ENDIAN_ORDER)
3045  {
3046  Decode(encodedInteger, byteCount, s);
3047  }
3048  else
3049  {
3050  SecByteBlock block(byteCount);
3051 #if (_MSC_VER >= 1500)
3052  std::reverse_copy(encodedInteger, encodedInteger+byteCount,
3053  stdext::make_checked_array_iterator(block.begin(), block.size()));
3054 #else
3055  std::reverse_copy(encodedInteger, encodedInteger+byteCount, block.begin());
3056 #endif
3057  Decode(block.begin(), block.size(), s);
3058  return;
3059  }
3060 }
3061 
3063 {
3064  // Make explicit call to avoid virtual-dispatch findings in ctor
3065  Integer::BERDecode(bt);
3066 }
3067 
3069 {
3070  Randomize(rng, bitcount);
3071 }
3072 
3073 Integer::Integer(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3074 {
3075  if (!Randomize(rng, min, max, rnType, equiv, mod))
3077 }
3078 
3080 {
3081  Integer r((word)0, BitsToWords(e+1));
3082  r.SetBit(e);
3083  return r;
3084 }
3085 
3087 {
3088  return IsNegative() ? false : (reg[0]==0 && WordCount()==0);
3089 }
3090 
3092 {
3093  if (this != &t)
3094  {
3095  if (reg.size() != t.reg.size() || t.reg[t.reg.size()/2] == 0)
3096  reg.New(RoundupSize(t.WordCount()));
3097  CopyWords(reg, t.reg, reg.size());
3098  sign = t.sign;
3099  }
3100  return *this;
3101 }
3102 
3103 bool Integer::GetBit(size_t n) const
3104 {
3105  // Profiling tells us the original Else was dominant, so it was
3106  // promoted to the first If statement. The code change occurred
3107  // at Commit dc99266599a0e72d.
3108  if (n/WORD_BITS < reg.size())
3109  return bool((reg[n/WORD_BITS] >> (n % WORD_BITS)) & 1);
3110  else
3111  return 0;
3112 }
3113 
3114 void Integer::SetBit(size_t n, bool value)
3115 {
3116  if (value)
3117  {
3118  reg.CleanGrow(RoundupSize(BitsToWords(n+1)));
3119  reg[n/WORD_BITS] |= (word(1) << (n%WORD_BITS));
3120  }
3121  else
3122  {
3123  if (n/WORD_BITS < reg.size())
3124  reg[n/WORD_BITS] &= ~(word(1) << (n%WORD_BITS));
3125  }
3126 }
3127 
3128 byte Integer::GetByte(size_t n) const
3129 {
3130  // Profiling tells us the original Else was dominant, so it was
3131  // promoted to the first If statement. The code change occurred
3132  // at Commit dc99266599a0e72d.
3133  if (n/WORD_SIZE < reg.size())
3134  return byte(reg[n/WORD_SIZE] >> ((n%WORD_SIZE)*8));
3135  else
3136  return 0;
3137 }
3138 
3139 void Integer::SetByte(size_t n, byte value)
3140 {
3141  reg.CleanGrow(RoundupSize(BytesToWords(n+1)));
3142  reg[n/WORD_SIZE] &= ~(word(0xff) << 8*(n%WORD_SIZE));
3143  reg[n/WORD_SIZE] |= (word(value) << 8*(n%WORD_SIZE));
3144 }
3145 
3146 lword Integer::GetBits(size_t i, size_t n) const
3147 {
3148  lword v = 0;
3149  CRYPTOPP_ASSERT(n <= sizeof(v)*8);
3150  for (unsigned int j=0; j<n; j++)
3151  v |= lword(GetBit(i+j)) << j;
3152  return v;
3153 }
3154 
3156 {
3157  Integer result(*this);
3158  result.Negate();
3159  return result;
3160 }
3161 
3163 {
3164  Integer result(*this);
3165  result.sign = POSITIVE;
3166  return result;
3167 }
3168 
3170 {
3171  reg.swap(a.reg);
3172  std::swap(sign, a.sign);
3173 }
3174 
3175 Integer::Integer(word value, size_t length)
3176  : reg(RoundupSize(length)), sign(POSITIVE)
3177 {
3178  reg[0] = value;
3179  SetWords(reg+1, 0, reg.size()-1);
3180 }
3181 
3182 template <class T>
3183 static Integer StringToInteger(const T *str, ByteOrder order)
3184 {
3185  CRYPTOPP_ASSERT( order == BIG_ENDIAN_ORDER || order == LITTLE_ENDIAN_ORDER );
3186 
3187  int radix, sign = 1;
3188  // GCC workaround
3189  // std::char_traits<wchar_t>::length() not defined in GCC 3.2 and STLport 4.5.3
3190  unsigned int length;
3191  for (length = 0; str[length] != 0; length++) {}
3192 
3193  Integer v;
3194 
3195  if (length == 0)
3196  return Integer::Zero();
3197 
3198  switch (str[length-1])
3199  {
3200  case 'h':
3201  case 'H':
3202  radix=16;
3203  break;
3204  case 'o':
3205  case 'O':
3206  radix=8;
3207  break;
3208  case 'b':
3209  case 'B':
3210  radix=2;
3211  break;
3212  default:
3213  radix=10;
3214  }
3215 
3216  // 'str' is of length 1 or more
3217  if (str[0] == '-')
3218  {
3219  sign = -1;
3220  str += 1, length -= 1;
3221  }
3222 
3223  if (length > 2 && str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
3224  {
3225  radix = 16;
3226  str += 2, length -= 2;
3227  }
3228 
3229  if (order == BIG_ENDIAN_ORDER)
3230  {
3231  for (unsigned int i=0; i<length; i++)
3232  {
3233  int digit, ch = static_cast<int>(str[i]);
3234 
3235  // Profiling showd the second and third Else needed to be swapped
3236  // The code change occurred at Commit dc99266599a0e72d.
3237  if (ch >= '0' && ch <= '9')
3238  digit = ch - '0';
3239  else if (ch >= 'a' && ch <= 'f')
3240  digit = ch - 'a' + 10;
3241  else if (ch >= 'A' && ch <= 'F')
3242  digit = ch - 'A' + 10;
3243  else
3244  digit = radix;
3245 
3246  if (digit < radix)
3247  {
3248  v *= radix;
3249  v += digit;
3250  }
3251  }
3252  }
3253  else if (radix == 16 && order == LITTLE_ENDIAN_ORDER)
3254  {
3255  // Nibble high, low and count
3256  unsigned int nh = 0, nl = 0, nc = 0;
3257  Integer position(Integer::One());
3258 
3259  for (unsigned int i=0; i<length; i++)
3260  {
3261  int digit, ch = static_cast<int>(str[i]);
3262 
3263  if (ch >= '0' && ch <= '9')
3264  digit = ch - '0';
3265  else if (ch >= 'a' && ch <= 'f')
3266  digit = ch - 'a' + 10;
3267  else if (ch >= 'A' && ch <= 'F')
3268  digit = ch - 'A' + 10;
3269  else
3270  digit = radix;
3271 
3272  if (digit < radix)
3273  {
3274  if (nc++ == 0)
3275  nh = digit;
3276  else
3277  nl = digit;
3278 
3279  if (nc == 2)
3280  {
3281  v += position * (nh << 4 | nl);
3282  nc = 0, position <<= 8;
3283  }
3284  }
3285  }
3286 
3287  if (nc == 1)
3288  v += nh * position;
3289  }
3290  else // LITTLE_ENDIAN_ORDER && radix != 16
3291  {
3292  for (int i=static_cast<int>(length)-1; i>=0; i--)
3293  {
3294  int digit, ch = static_cast<int>(str[i]);
3295 
3296  if (ch >= '0' && ch <= '9')
3297  digit = ch - '0';
3298  else if (ch >= 'a' && ch <= 'f')
3299  digit = ch - 'a' + 10;
3300  else if (ch >= 'A' && ch <= 'F')
3301  digit = ch - 'A' + 10;
3302  else
3303  digit = radix;
3304 
3305  if (digit < radix)
3306  {
3307  v *= radix;
3308  v += digit;
3309  }
3310  }
3311  }
3312 
3313  if (sign == -1)
3314  v.Negate();
3315 
3316  return v;
3317 }
3318 
3319 Integer::Integer(const char *str, ByteOrder order)
3320  : reg(2), sign(POSITIVE)
3321 {
3322  *this = StringToInteger(str,order);
3323 }
3324 
3325 Integer::Integer(const wchar_t *str, ByteOrder order)
3326  : reg(2), sign(POSITIVE)
3327 {
3328  *this = StringToInteger(str,order);
3329 }
3330 
3331 unsigned int Integer::WordCount() const
3332 {
3333  return (unsigned int)CountWords(reg, reg.size());
3334 }
3335 
3336 unsigned int Integer::ByteCount() const
3337 {
3338  unsigned wordCount = WordCount();
3339  if (wordCount)
3340  return (wordCount-1)*WORD_SIZE + BytePrecision(reg[wordCount-1]);
3341  else
3342  return 0;
3343 }
3344 
3345 unsigned int Integer::BitCount() const
3346 {
3347  unsigned wordCount = WordCount();
3348  if (wordCount)
3349  return (wordCount-1)*WORD_BITS + BitPrecision(reg[wordCount-1]);
3350  else
3351  return 0;
3352 }
3353 
3354 void Integer::Decode(const byte *input, size_t inputLen, Signedness s)
3355 {
3356  CRYPTOPP_ASSERT(input && inputLen); // NULL buffer
3357  StringStore store(input, inputLen);
3358  Decode(store, inputLen, s);
3359 }
3360 
3362 {
3363  CRYPTOPP_ASSERT(bt.MaxRetrievable() >= inputLen);
3364  if (bt.MaxRetrievable() < inputLen)
3365  throw InvalidArgument("Integer: input length is too small");
3366 
3367  byte b;
3368  bt.Peek(b);
3369  sign = ((s==SIGNED) && (b & 0x80)) ? NEGATIVE : POSITIVE;
3370 
3371  while (inputLen>0 && (sign==POSITIVE ? b==0 : b==0xff))
3372  {
3373  bt.Skip(1);
3374  inputLen--;
3375  bt.Peek(b);
3376  }
3377 
3378  reg.CleanNew(RoundupSize(BytesToWords(inputLen)));
3379  for (size_t i=inputLen; i > 0; i--)
3380  {
3381  (void)bt.Get(b);
3382  reg[(i-1)/WORD_SIZE] |= word(b) << ((i-1)%WORD_SIZE)*8;
3383  }
3384 
3385  if (sign == NEGATIVE)
3386  {
3387  for (size_t i=inputLen; i<reg.size()*WORD_SIZE; i++)
3388  reg[i/WORD_SIZE] |= word(0xff) << (i%WORD_SIZE)*8;
3389  TwosComplement(reg, reg.size());
3390  }
3391 }
3392 
3393 size_t Integer::MinEncodedSize(Signedness signedness) const
3394 {
3395  // Profiling tells us the original second If was dominant, so it
3396  // was promoted to the first If statement. The code change occurred
3397  // at Commit dc99266599a0e72d.
3398  unsigned int outputLen = STDMAX(1U, ByteCount());
3399  const bool pre = (signedness == UNSIGNED);
3400  if (!pre && NotNegative() && (GetByte(outputLen-1) & 0x80))
3401  outputLen++;
3402  if (pre)
3403  return outputLen;
3404  if (IsNegative() && *this < -Power2(outputLen*8-1))
3405  outputLen++;
3406  return outputLen;
3407 }
3408 
3409 // PKCS12_PBKDF and other classes use undersized buffers
3410 void Integer::Encode(byte *output, size_t outputLen, Signedness signedness) const
3411 {
3412  CRYPTOPP_ASSERT(output && outputLen); // NULL buffer
3413  ArraySink sink(output, outputLen);
3414  Encode(sink, outputLen, signedness);
3415 }
3416 
3417 void Integer::Encode(BufferedTransformation &bt, size_t outputLen, Signedness signedness) const
3418 {
3419  if (signedness == UNSIGNED || NotNegative())
3420  {
3421  for (size_t i=outputLen; i > 0; i--)
3422  bt.Put(GetByte(i-1));
3423  }
3424  else
3425  {
3426  // take two's complement of *this
3427  Integer temp = Integer::Power2(8*STDMAX((size_t)ByteCount(), outputLen)) + *this;
3428  temp.Encode(bt, outputLen, UNSIGNED);
3429  }
3430 }
3431 
3433 {
3434  DERGeneralEncoder enc(bt, INTEGER);
3436  enc.MessageEnd();
3437 }
3438 
3439 void Integer::BERDecode(const byte *input, size_t len)
3440 {
3441  CRYPTOPP_ASSERT(input && len); // NULL buffer
3442  StringStore store(input, len);
3443  BERDecode(store);
3444 }
3445 
3447 {
3448  BERGeneralDecoder dec(bt, INTEGER);
3449  if (!dec.IsDefiniteLength() || dec.MaxRetrievable() < dec.RemainingLength())
3450  BERDecodeError();
3451  Decode(dec, (size_t)dec.RemainingLength(), SIGNED);
3452  dec.MessageEnd();
3453 }
3454 
3456 {
3457  DERGeneralEncoder enc(bt, OCTET_STRING);
3458  Encode(enc, length);
3459  enc.MessageEnd();
3460 }
3461 
3463 {
3464  BERGeneralDecoder dec(bt, OCTET_STRING);
3465  if (!dec.IsDefiniteLength() || dec.RemainingLength() != length)
3466  BERDecodeError();
3467  Decode(dec, length);
3468  dec.MessageEnd();
3469 }
3470 
3471 size_t Integer::OpenPGPEncode(byte *output, size_t bufferSize) const
3472 {
3473  CRYPTOPP_ASSERT(output && bufferSize); // NULL buffer
3474  CRYPTOPP_ASSERT(bufferSize >= MinEncodedSize()); // Undersized buffer
3475  ArraySink sink(output, bufferSize);
3476  return OpenPGPEncode(sink);
3477 }
3478 
3480 {
3481  word16 bitCount = word16(BitCount());
3482  bt.PutWord16(bitCount);
3483  size_t byteCount = BitsToBytes(bitCount);
3484  Encode(bt, byteCount);
3485  return 2 + byteCount;
3486 }
3487 
3488 void Integer::OpenPGPDecode(const byte *input, size_t len)
3489 {
3490  CRYPTOPP_ASSERT(input && len); // NULL buffer
3491  StringStore store(input, len);
3492  OpenPGPDecode(store);
3493 }
3494 
3496 {
3497  word16 bitCount;
3498  if (bt.GetWord16(bitCount) != 2 || bt.MaxRetrievable() < BitsToBytes(bitCount))
3499  throw OpenPGPDecodeErr();
3500  Decode(bt, BitsToBytes(bitCount));
3501 }
3502 
3504 {
3505  const size_t nbytes = nbits/8 + 1;
3506  SecByteBlock buf(nbytes);
3507  rng.GenerateBlock(buf, nbytes);
3508  if (nbytes)
3509  buf[0] = (byte)Crop(buf[0], nbits % 8);
3510  Decode(buf, nbytes, UNSIGNED);
3511 }
3512 
3513 void Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max)
3514 {
3515  if (min > max)
3516  throw InvalidArgument("Integer: Min must be no greater than Max");
3517 
3518  Integer range = max - min;
3519  const unsigned int nbits = range.BitCount();
3520 
3521  do
3522  {
3523  Randomize(rng, nbits);
3524  }
3525  while (*this > range);
3526 
3527  *this += min;
3528 }
3529 
3530 bool Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3531 {
3532  return GenerateRandomNoThrow(rng, MakeParameters("Min", min)("Max", max)
3533  ("RandomNumberType", rnType)("EquivalentTo", equiv)("Mod", mod));
3534 }
3535 
3537 {
3538 public:
3539  KDF2_RNG(const byte *seed, size_t seedSize)
3540  : m_counter(0), m_counterAndSeed(ClampSize(seedSize) + 4)
3541  {
3542  memcpy(m_counterAndSeed + 4, seed, ClampSize(seedSize));
3543  }
3544 
3545  void GenerateBlock(byte *output, size_t size)
3546  {
3547  CRYPTOPP_ASSERT(output && size); // NULL buffer
3548  PutWord(false, BIG_ENDIAN_ORDER, m_counterAndSeed, m_counter);
3549  ++m_counter;
3550  P1363_KDF2<SHA1>::DeriveKey(output, size, m_counterAndSeed, m_counterAndSeed.size(), NULLPTR, 0);
3551  }
3552 
3553  // UBsan finding, -Wstringop-overflow
3554  inline size_t ClampSize(size_t req) const
3555  {
3556  // Clamp at 16 MB
3557  if (req > 16U*1024*1024)
3558  return 16U*1024*1024;
3559  return req;
3560  }
3561 
3562 private:
3563  word32 m_counter;
3564  SecByteBlock m_counterAndSeed;
3565 };
3566 
3568 {
3569  Integer min = params.GetValueWithDefault("Min", Integer::Zero());
3570  Integer max;
3571  if (!params.GetValue("Max", max))
3572  {
3573  int bitLength;
3574  if (params.GetIntValue("BitLength", bitLength))
3575  max = Integer::Power2(bitLength);
3576  else
3577  throw InvalidArgument("Integer: missing Max argument");
3578  }
3579  if (min > max)
3580  throw InvalidArgument("Integer: Min must be no greater than Max");
3581 
3582  Integer equiv = params.GetValueWithDefault("EquivalentTo", Integer::Zero());
3583  Integer mod = params.GetValueWithDefault("Mod", Integer::One());
3584 
3585  if (equiv.IsNegative() || equiv >= mod)
3586  throw InvalidArgument("Integer: invalid EquivalentTo and/or Mod argument");
3587 
3588  Integer::RandomNumberType rnType = params.GetValueWithDefault("RandomNumberType", Integer::ANY);
3589 
3590  member_ptr<KDF2_RNG> kdf2Rng;
3592  if (params.GetValue(Name::Seed(), seed))
3593  {
3594  ByteQueue bq;
3595  DERSequenceEncoder seq(bq);
3596  min.DEREncode(seq);
3597  max.DEREncode(seq);
3598  equiv.DEREncode(seq);
3599  mod.DEREncode(seq);
3600  DEREncodeUnsigned(seq, rnType);
3601  DEREncodeOctetString(seq, seed.begin(), seed.size());
3602  seq.MessageEnd();
3603 
3604  SecByteBlock finalSeed((size_t)bq.MaxRetrievable());
3605  bq.Get(finalSeed, finalSeed.size());
3606  kdf2Rng.reset(new KDF2_RNG(finalSeed.begin(), finalSeed.size()));
3607  }
3608  RandomNumberGenerator &rng = kdf2Rng.get() ? (RandomNumberGenerator &)*kdf2Rng : i_rng;
3609 
3610  switch (rnType)
3611  {
3612  case ANY:
3613  if (mod == One())
3614  Randomize(rng, min, max);
3615  else
3616  {
3617  Integer min1 = min + (equiv-min)%mod;
3618  if (max < min1)
3619  return false;
3620  Randomize(rng, Zero(), (max - min1) / mod);
3621  *this *= mod;
3622  *this += min1;
3623  }
3624  return true;
3625 
3626  case PRIME:
3627  {
3628  const PrimeSelector *pSelector = params.GetValueWithDefault(Name::PointerToPrimeSelector(), (const PrimeSelector *)NULLPTR);
3629 
3630  int i;
3631  i = 0;
3632  while (1)
3633  {
3634  if (++i==16)
3635  {
3636  // check if there are any suitable primes in [min, max]
3637  Integer first = min;
3638  if (FirstPrime(first, max, equiv, mod, pSelector))
3639  {
3640  // if there is only one suitable prime, we're done
3641  *this = first;
3642  if (!FirstPrime(first, max, equiv, mod, pSelector))
3643  return true;
3644  }
3645  else
3646  return false;
3647  }
3648 
3649  Randomize(rng, min, max);
3650  if (FirstPrime(*this, STDMIN(*this+mod*PrimeSearchInterval(max), max), equiv, mod, pSelector))
3651  return true;
3652  }
3653  }
3654 
3655  default:
3656  throw InvalidArgument("Integer: invalid RandomNumberType argument");
3657  }
3658 }
3659 
3660 std::istream& operator>>(std::istream& in, Integer &a)
3661 {
3662  char c;
3663  unsigned int length = 0;
3664  SecBlock<char> str(length + 16);
3665 
3666  std::ws(in);
3667 
3668  do
3669  {
3670  in.read(&c, 1);
3671  str[length++] = c;
3672  if (length >= str.size())
3673  str.Grow(length + 16);
3674  }
3675  while (in && (c=='-' || c=='x' || (c>='0' && c<='9') || (c>='a' && c<='f') || (c>='A' && c<='F') || c=='h' || c=='H' || c=='o' || c=='O' || c==',' || c=='.'));
3676 
3677  if (in.gcount())
3678  in.putback(c);
3679  str[length-1] = '\0';
3680  a = Integer(str);
3681 
3682  return in;
3683 }
3684 
3685 // Ensure base 10 is default
3686 inline int FlagToBase(long f) {
3687  return f == std::ios::hex ? 16 : (f == std::ios::oct ? 8 : 10);
3688 }
3689 
3690 inline char FlagToSuffix(long f) {
3691  return f == std::ios::hex ? 'h' : (f == std::ios::oct ? 'o' : '.');
3692 }
3693 
3694 // Ensure base 10 is default
3695 std::ostream& operator<<(std::ostream& out, const Integer &a)
3696 {
3697  // Get relevant conversion specifications from ostream.
3698  const long f = out.flags() & std::ios::basefield;
3699  const int base = FlagToBase(f);
3700  const char suffix = FlagToSuffix(f);
3701 
3702  Integer temp1=a, temp2;
3703  if (a.IsNegative())
3704  {
3705  out << '-';
3706  temp1.Negate();
3707  }
3708 
3709  if (!a)
3710  out << '0';
3711 
3712  static const char upper[]="0123456789ABCDEF";
3713  static const char lower[]="0123456789abcdef";
3714 
3715  const char* vec = (out.flags() & std::ios::uppercase) ? upper : lower;
3716  unsigned int i=0;
3717  SecBlock<char> s(a.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
3718 
3719  while (!!temp1)
3720  {
3721  word digit;
3722  Integer::Divide(digit, temp2, temp1, base);
3723  s[i++]=vec[digit];
3724  temp1.swap(temp2);
3725  }
3726 
3727  while (i--)
3728  {
3729  out << s[i];
3730  }
3731 
3732 #ifdef CRYPTOPP_USE_STD_SHOWBASE
3733  if (out.flags() & std::ios_base::showbase)
3734  out << suffix;
3735 
3736  return out;
3737 #else
3738  return out << suffix;
3739 #endif
3740 }
3741 
3743 {
3744  if (NotNegative())
3745  {
3746  if (Increment(reg, reg.size()))
3747  {
3748  reg.CleanGrow(2*reg.size());
3749  reg[reg.size()/2]=1;
3750  }
3751  }
3752  else
3753  {
3754  word borrow = Decrement(reg, reg.size());
3755  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3756 
3757  if (WordCount()==0)
3758  *this = Zero();
3759  }
3760  return *this;
3761 }
3762 
3764 {
3765  if (IsNegative())
3766  {
3767  if (Increment(reg, reg.size()))
3768  {
3769  reg.CleanGrow(2*reg.size());
3770  reg[reg.size()/2]=1;
3771  }
3772  }
3773  else
3774  {
3775  if (Decrement(reg, reg.size()))
3776  *this = -One();
3777  }
3778  return *this;
3779 }
3780 
3781 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3782 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3784 {
3785  if (this == &t)
3786  {
3787  return AbsoluteValue();
3788  }
3789  else if (reg.size() >= t.reg.size())
3790  {
3791  Integer result(t);
3792  AndWords(result.reg, reg, t.reg.size());
3793 
3794  result.sign = POSITIVE;
3795  return result;
3796  }
3797  else // reg.size() < t.reg.size()
3798  {
3799  Integer result(*this);
3800  AndWords(result.reg, t.reg, reg.size());
3801 
3802  result.sign = POSITIVE;
3803  return result;
3804  }
3805 }
3806 
3807 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3808 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3809 Integer Integer::Or(const Integer& t) const
3810 {
3811  if (this == &t)
3812  {
3813  return AbsoluteValue();
3814  }
3815  else if (reg.size() >= t.reg.size())
3816  {
3817  Integer result(*this);
3818  OrWords(result.reg, t.reg, t.reg.size());
3819 
3820  result.sign = POSITIVE;
3821  return result;
3822  }
3823  else // reg.size() < t.reg.size()
3824  {
3825  Integer result(t);
3826  OrWords(result.reg, reg, reg.size());
3827 
3828  result.sign = POSITIVE;
3829  return result;
3830  }
3831 }
3832 
3833 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3834 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3836 {
3837  if (this == &t)
3838  {
3839  return Integer::Zero();
3840  }
3841  else if (reg.size() >= t.reg.size())
3842  {
3843  Integer result(*this);
3844  XorWords(result.reg, t.reg, t.reg.size());
3845 
3846  result.sign = POSITIVE;
3847  return result;
3848  }
3849  else // reg.size() < t.reg.size()
3850  {
3851  Integer result(t);
3852  XorWords(result.reg, reg, reg.size());
3853 
3854  result.sign = POSITIVE;
3855  return result;
3856  }
3857 }
3858 
3859 void PositiveAdd(Integer &sum, const Integer &a, const Integer& b)
3860 {
3861  // Profiling tells us the original second Else If was dominant, so it
3862  // was promoted to the first If statement. The code change occurred at
3863  // Commit dc99266599a0e72d.
3864  int carry; const bool pre = (a.reg.size() == b.reg.size());
3865  if (!pre && a.reg.size() > b.reg.size())
3866  {
3867  carry = Add(sum.reg, a.reg, b.reg, b.reg.size());
3868  CopyWords(sum.reg+b.reg.size(), a.reg+b.reg.size(), a.reg.size()-b.reg.size());
3869  carry = Increment(sum.reg+b.reg.size(), a.reg.size()-b.reg.size(), carry);
3870  }
3871  else if (pre)
3872  {
3873  carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3874  }
3875  else
3876  {
3877  carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3878  CopyWords(sum.reg+a.reg.size(), b.reg+a.reg.size(), b.reg.size()-a.reg.size());
3879  carry = Increment(sum.reg+a.reg.size(), b.reg.size()-a.reg.size(), carry);
3880  }
3881 
3882  if (carry)
3883  {
3884  sum.reg.CleanGrow(2*sum.reg.size());
3885  sum.reg[sum.reg.size()/2] = 1;
3886  }
3887  sum.sign = Integer::POSITIVE;
3888 }
3889 
3890 void PositiveSubtract(Integer &diff, const Integer &a, const Integer& b)
3891 {
3892  unsigned aSize = a.WordCount();
3893  aSize += aSize%2;
3894  unsigned bSize = b.WordCount();
3895  bSize += bSize%2;
3896 
3897  // Profiling tells us the original second Else If was dominant, so it
3898  // was promoted to the first If statement. The code change occurred at
3899  // Commit dc99266599a0e72d.
3900  if (aSize > bSize)
3901  {
3902  word borrow = Subtract(diff.reg, a.reg, b.reg, bSize);
3903  CopyWords(diff.reg+bSize, a.reg+bSize, aSize-bSize);
3904  borrow = Decrement(diff.reg+bSize, aSize-bSize, borrow);
3905  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3906  diff.sign = Integer::POSITIVE;
3907  }
3908  else if (aSize == bSize)
3909  {
3910  if (Compare(a.reg, b.reg, aSize) >= 0)
3911  {
3912  Subtract(diff.reg, a.reg, b.reg, aSize);
3913  diff.sign = Integer::POSITIVE;
3914  }
3915  else
3916  {
3917  Subtract(diff.reg, b.reg, a.reg, aSize);
3918  diff.sign = Integer::NEGATIVE;
3919  }
3920  }
3921  else
3922  {
3923  word borrow = Subtract(diff.reg, b.reg, a.reg, aSize);
3924  CopyWords(diff.reg+aSize, b.reg+aSize, bSize-aSize);
3925  borrow = Decrement(diff.reg+aSize, bSize-aSize, borrow);
3926  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3927  diff.sign = Integer::NEGATIVE;
3928  }
3929 }
3930 
3931 // MSVC .NET 2003 workaround
3932 template <class T> inline const T& STDMAX2(const T& a, const T& b)
3933 {
3934  return a < b ? b : a;
3935 }
3936 
3938 {
3939  Integer sum((word)0, STDMAX2(reg.size(), b.reg.size()));
3940  if (NotNegative())
3941  {
3942  if (b.NotNegative())
3943  PositiveAdd(sum, *this, b);
3944  else
3945  PositiveSubtract(sum, *this, b);
3946  }
3947  else
3948  {
3949  if (b.NotNegative())
3950  PositiveSubtract(sum, b, *this);
3951  else
3952  {
3953  PositiveAdd(sum, *this, b);
3954  sum.sign = Integer::NEGATIVE;
3955  }
3956  }
3957  return sum;
3958 }
3959 
3961 {
3962  reg.CleanGrow(t.reg.size());
3963  if (NotNegative())
3964  {
3965  if (t.NotNegative())
3966  PositiveAdd(*this, *this, t);
3967  else
3968  PositiveSubtract(*this, *this, t);
3969  }
3970  else
3971  {
3972  if (t.NotNegative())
3973  PositiveSubtract(*this, t, *this);
3974  else
3975  {
3976  PositiveAdd(*this, *this, t);
3977  sign = Integer::NEGATIVE;
3978  }
3979  }
3980  return *this;
3981 }
3982 
3984 {
3985  Integer diff((word)0, STDMAX2(reg.size(), b.reg.size()));
3986  if (NotNegative())
3987  {
3988  if (b.NotNegative())
3989  PositiveSubtract(diff, *this, b);
3990  else
3991  PositiveAdd(diff, *this, b);
3992  }
3993  else
3994  {
3995  if (b.NotNegative())
3996  {
3997  PositiveAdd(diff, *this, b);
3998  diff.sign = Integer::NEGATIVE;
3999  }
4000  else
4001  PositiveSubtract(diff, b, *this);
4002  }
4003  return diff;
4004 }
4005 
4007 {
4008  reg.CleanGrow(t.reg.size());
4009  if (NotNegative())
4010  {
4011  if (t.NotNegative())
4012  PositiveSubtract(*this, *this, t);
4013  else
4014  PositiveAdd(*this, *this, t);
4015  }
4016  else
4017  {
4018  if (t.NotNegative())
4019  {
4020  PositiveAdd(*this, *this, t);
4021  sign = Integer::NEGATIVE;
4022  }
4023  else
4024  PositiveSubtract(*this, t, *this);
4025  }
4026  return *this;
4027 }
4028 
4030 {
4031  const size_t wordCount = WordCount();
4032  const size_t shiftWords = n / WORD_BITS;
4033  const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4034 
4035  reg.CleanGrow(RoundupSize(wordCount+BitsToWords(n)));
4036  ShiftWordsLeftByWords(reg, wordCount + shiftWords, shiftWords);
4037  ShiftWordsLeftByBits(reg+shiftWords, wordCount+BitsToWords(shiftBits), shiftBits);
4038  return *this;
4039 }
4040 
4042 {
4043  const size_t wordCount = WordCount();
4044  const size_t shiftWords = n / WORD_BITS;
4045  const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4046 
4047  ShiftWordsRightByWords(reg, wordCount, shiftWords);
4048  if (wordCount > shiftWords)
4049  ShiftWordsRightByBits(reg, wordCount-shiftWords, shiftBits);
4050  if (IsNegative() && WordCount()==0) // avoid -0
4051  *this = Zero();
4052  return *this;
4053 }
4054 
4056 {
4057  if (this != &t)
4058  {
4059  const size_t size = STDMIN(reg.size(), t.reg.size());
4060  reg.resize(size);
4061  AndWords(reg, t.reg, size);
4062  }
4063  sign = POSITIVE;
4064  return *this;
4065 }
4066 
4068 {
4069  if (this != &t)
4070  {
4071  if (reg.size() >= t.reg.size())
4072  {
4073  OrWords(reg, t.reg, t.reg.size());
4074  }
4075  else // reg.size() < t.reg.size()
4076  {
4077  const size_t head = reg.size();
4078  const size_t tail = t.reg.size() - reg.size();
4079  reg.resize(head+tail);
4080  OrWords(reg, t.reg, head);
4081  CopyWords(reg+head,t.reg+head,tail);
4082  }
4083  }
4084  sign = POSITIVE;
4085  return *this;
4086 }
4087 
4089 {
4090  if (this == &t)
4091  {
4092  *this = Zero();
4093  }
4094  else
4095  {
4096  if (reg.size() >= t.reg.size())
4097  {
4098  XorWords(reg, t.reg, t.reg.size());
4099  }
4100  else // reg.size() < t.reg.size()
4101  {
4102  const size_t head = reg.size();
4103  const size_t tail = t.reg.size() - reg.size();
4104  reg.resize(head+tail);
4105  XorWords(reg, t.reg, head);
4106  CopyWords(reg+head,t.reg+head,tail);
4107  }
4108  }
4109  sign = POSITIVE;
4110  return *this;
4111 }
4112 
4113 void PositiveMultiply(Integer &product, const Integer &a, const Integer &b)
4114 {
4115  size_t aSize = RoundupSize(a.WordCount());
4116  size_t bSize = RoundupSize(b.WordCount());
4117 
4118  product.reg.CleanNew(RoundupSize(aSize+bSize));
4119  product.sign = Integer::POSITIVE;
4120 
4121  IntegerSecBlock workspace(aSize + bSize);
4122  AsymmetricMultiply(product.reg, workspace, a.reg, aSize, b.reg, bSize);
4123 }
4124 
4125 void Multiply(Integer &product, const Integer &a, const Integer &b)
4126 {
4127  PositiveMultiply(product, a, b);
4128 
4129  if (a.NotNegative() != b.NotNegative())
4130  product.Negate();
4131 }
4132 
4134 {
4135  Integer product;
4136  Multiply(product, *this, b);
4137  return product;
4138 }
4139 
4140 /*
4141 void PositiveDivide(Integer &remainder, Integer &quotient,
4142  const Integer &dividend, const Integer &divisor)
4143 {
4144  remainder.reg.CleanNew(divisor.reg.size());
4145  remainder.sign = Integer::POSITIVE;
4146  quotient.reg.New(0);
4147  quotient.sign = Integer::POSITIVE;
4148  unsigned i=dividend.BitCount();
4149  while (i--)
4150  {
4151  word overflow = ShiftWordsLeftByBits(remainder.reg, remainder.reg.size(), 1);
4152  remainder.reg[0] |= dividend[i];
4153  if (overflow || remainder >= divisor)
4154  {
4155  Subtract(remainder.reg, remainder.reg, divisor.reg, remainder.reg.size());
4156  quotient.SetBit(i);
4157  }
4158  }
4159 }
4160 */
4161 
4162 void PositiveDivide(Integer &remainder, Integer &quotient,
4163  const Integer &a, const Integer &b)
4164 {
4165  unsigned aSize = a.WordCount();
4166  unsigned bSize = b.WordCount();
4167 
4168  if (!bSize)
4169  throw Integer::DivideByZero();
4170 
4171  if (aSize < bSize)
4172  {
4173  remainder = a;
4174  remainder.sign = Integer::POSITIVE;
4175  quotient = Integer::Zero();
4176  return;
4177  }
4178 
4179  aSize += aSize%2; // round up to next even number
4180  bSize += bSize%2;
4181 
4182  remainder.reg.CleanNew(RoundupSize(bSize));
4183  remainder.sign = Integer::POSITIVE;
4184  quotient.reg.CleanNew(RoundupSize(aSize-bSize+2));
4185  quotient.sign = Integer::POSITIVE;
4186 
4187  IntegerSecBlock T(aSize+3*(bSize+2));
4188  Divide(remainder.reg, quotient.reg, T, a.reg, aSize, b.reg, bSize);
4189 }
4190 
4191 void Integer::Divide(Integer &remainder, Integer &quotient, const Integer &dividend, const Integer &divisor)
4192 {
4193  PositiveDivide(remainder, quotient, dividend, divisor);
4194 
4195  if (dividend.IsNegative())
4196  {
4197  quotient.Negate();
4198  if (remainder.NotZero())
4199  {
4200  --quotient;
4201  remainder = divisor.AbsoluteValue() - remainder;
4202  }
4203  }
4204 
4205  if (divisor.IsNegative())
4206  quotient.Negate();
4207 }
4208 
4209 void Integer::DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
4210 {
4211  q = a;
4212  q >>= n;
4213 
4214  const size_t wordCount = BitsToWords(n);
4215  if (wordCount <= a.WordCount())
4216  {
4217  r.reg.resize(RoundupSize(wordCount));
4218  CopyWords(r.reg, a.reg, wordCount);
4219  SetWords(r.reg+wordCount, 0, r.reg.size()-wordCount);
4220  if (n % WORD_BITS != 0)
4221  r.reg[wordCount-1] %= (word(1) << (n % WORD_BITS));
4222  }
4223  else
4224  {
4225  r.reg.resize(RoundupSize(a.WordCount()));
4226  CopyWords(r.reg, a.reg, r.reg.size());
4227  }
4228  r.sign = POSITIVE;
4229 
4230  if (a.IsNegative() && r.NotZero())
4231  {
4232  --q;
4233  r = Power2(n) - r;
4234  }
4235 }
4236 
4238 {
4239  Integer remainder, quotient;
4240  Integer::Divide(remainder, quotient, *this, b);
4241  return quotient;
4242 }
4243 
4245 {
4246  Integer remainder, quotient;
4247  Integer::Divide(remainder, quotient, *this, b);
4248  return remainder;
4249 }
4250 
4251 void Integer::Divide(word &remainder, Integer &quotient, const Integer &dividend, word divisor)
4252 {
4253  if (!divisor)
4254  throw Integer::DivideByZero();
4255 
4256  // IsPowerOf2 uses BMI on x86 if available. There is a small
4257  // but measurable improvement during decryption and signing.
4258  if (IsPowerOf2(divisor))
4259  {
4260  quotient = dividend >> (BitPrecision(divisor)-1);
4261  remainder = dividend.reg[0] & (divisor-1);
4262  return;
4263  }
4264 
4265  unsigned int i = dividend.WordCount();
4266  quotient.reg.CleanNew(RoundupSize(i));
4267  remainder = 0;
4268  while (i--)
4269  {
4270  quotient.reg[i] = DWord(dividend.reg[i], remainder) / divisor;
4271  remainder = DWord(dividend.reg[i], remainder) % divisor;
4272  }
4273 
4274  if (dividend.NotNegative())
4275  quotient.sign = POSITIVE;
4276  else
4277  {
4278  quotient.sign = NEGATIVE;
4279  if (remainder)
4280  {
4281  --quotient;
4282  remainder = divisor - remainder;
4283  }
4284  }
4285 }
4286 
4288 {
4289  word remainder;
4290  Integer quotient;
4291  Integer::Divide(remainder, quotient, *this, b);
4292  return quotient;
4293 }
4294 
4295 word Integer::Modulo(word divisor) const
4296 {
4297  if (!divisor)
4298  throw Integer::DivideByZero();
4299 
4300  word remainder;
4301 
4302  // Profiling tells us the original Else was dominant, so it was
4303  // promoted to the first If statement. The code change occurred
4304  // at Commit dc99266599a0e72d.
4305  if ((divisor & (divisor-1)) != 0) // divisor is not a power of 2
4306  {
4307  // Profiling tells us the original Else was dominant, so it
4308  // was promoted to the first If statement. The code change
4309  // occurred at Commit dc99266599a0e72d.
4310  unsigned int i = WordCount();
4311  if (divisor > 5)
4312  {
4313  remainder = 0;
4314  while (i--)
4315  remainder = DWord(reg[i], remainder) % divisor;
4316  }
4317  else
4318  {
4319  DWord sum(0, 0);
4320  while (i--)
4321  sum += reg[i];
4322  remainder = sum % divisor;
4323  }
4324  }
4325  else // divisor is a power of 2
4326  {
4327  remainder = reg[0] & (divisor-1);
4328  }
4329 
4330  if (IsNegative() && remainder)
4331  remainder = divisor - remainder;
4332 
4333  return remainder;
4334 }
4335 
4337 {
4338  if (!!(*this)) // don't flip sign if *this==0
4339  sign = Sign(1-sign);
4340 }
4341 
4342 int Integer::PositiveCompare(const Integer& t) const
4343 {
4344  // Profiling tells us the original Else was dominant, so it
4345  // was promoted to the first If statement. The code change
4346  // occurred at Commit dc99266599a0e72d.
4347  const unsigned size = WordCount(), tSize = t.WordCount();
4348  if (size != tSize)
4349  return size > tSize ? 1 : -1;
4350  else
4351  return CryptoPP::Compare(reg, t.reg, size);
4352 }
4353 
4354 int Integer::Compare(const Integer& t) const
4355 {
4356  if (NotNegative())
4357  {
4358  if (t.NotNegative())
4359  return PositiveCompare(t);
4360  else
4361  return 1;
4362  }
4363  else
4364  {
4365  if (t.NotNegative())
4366  return -1;
4367  else
4368  return -PositiveCompare(t);
4369  }
4370 }
4371 
4373 {
4374  if (!IsPositive())
4375  return Zero();
4376 
4377  // overestimate square root
4378  Integer x, y = Power2((BitCount()+1)/2);
4379  CRYPTOPP_ASSERT(y*y >= *this);
4380 
4381  do
4382  {
4383  x = y;
4384  y = (x + *this/x) >> 1;
4385  } while (y<x);
4386 
4387  return x;
4388 }
4389 
4390 bool Integer::IsSquare() const
4391 {
4392  Integer r = SquareRoot();
4393  return *this == r.Squared();
4394 }
4395 
4396 bool Integer::IsUnit() const
4397 {
4398  return (WordCount() == 1) && (reg[0] == 1);
4399 }
4400 
4402 {
4403  return IsUnit() ? *this : Zero();
4404 }
4405 
4406 Integer a_times_b_mod_c(const Integer &x, const Integer& y, const Integer& m)
4407 {
4408  CRYPTOPP_ASSERT(m.NotZero());
4409  if (m.IsZero())
4410  throw Integer::DivideByZero();
4411 
4412  return x*y%m;
4413 }
4414 
4415 Integer a_exp_b_mod_c(const Integer &x, const Integer& e, const Integer& m)
4416 {
4417  CRYPTOPP_ASSERT(m.NotZero());
4418  if (m.IsZero())
4419  throw Integer::DivideByZero();
4420 
4421  ModularArithmetic mr(m);
4422  return mr.Exponentiate(x, e);
4423 }
4424 
4426 {
4427  return EuclideanDomainOf<Integer>().Gcd(a, b);
4428 }
4429 
4431 {
4433  CRYPTOPP_ASSERT(m.NotZero());
4434 
4435  if (IsNegative())
4436  return Modulo(m).InverseModNext(m);
4437 
4438  // http://github.com/weidai11/cryptopp/issues/602
4439  if (*this >= m)
4440  return Modulo(m).InverseModNext(m);
4441 
4442  return InverseModNext(m);
4443 }
4444 
4445 Integer Integer::InverseModNext(const Integer &m) const
4446 {
4448  CRYPTOPP_ASSERT(m.NotZero());
4449 
4450  if (m.IsEven())
4451  {
4452  if (!m || IsEven())
4453  return Zero(); // no inverse
4454  if (*this == One())
4455  return One();
4456 
4457  Integer u = m.Modulo(*this).InverseModNext(*this);
4458  return !u ? Zero() : (m*(*this-u)+1)/(*this);
4459  }
4460 
4461  // AlmostInverse requires a 4x workspace
4462  IntegerSecBlock T(m.reg.size() * 4);
4463  Integer r((word)0, m.reg.size());
4464  unsigned k = AlmostInverse(r.reg, T, reg, reg.size(), m.reg, m.reg.size());
4465  DivideByPower2Mod(r.reg, r.reg, k, m.reg, m.reg.size());
4466  return r;
4467 }
4468 
4469 word Integer::InverseMod(word mod) const
4470 {
4471  CRYPTOPP_ASSERT(mod != 0);
4472 
4473  word g0 = mod, g1 = *this % mod;
4474  word v0 = 0, v1 = 1;
4475  word y;
4476 
4477  while (g1)
4478  {
4479  if (g1 == 1)
4480  return v1;
4481  y = g0 / g1;
4482  g0 = g0 % g1;
4483  v0 += y * v1;
4484 
4485  if (!g0)
4486  break;
4487  if (g0 == 1)
4488  return mod-v0;
4489  y = g1 / g0;
4490  g1 = g1 % g0;
4491  v1 += y * v0;
4492  }
4493  return 0;
4494 }
4495 
4496 // ********************************************************
4497 
4499 {
4500  BERSequenceDecoder seq(bt);
4501  OID oid(seq);
4502  if (oid != ASN1::prime_field())
4503  BERDecodeError();
4504  m_modulus.BERDecode(seq);
4505  seq.MessageEnd();
4506  m_result.reg.resize(m_modulus.reg.size());
4507 }
4508 
4510 {
4511  DERSequenceEncoder seq(bt);
4512  ASN1::prime_field().DEREncode(seq);
4513  m_modulus.DEREncode(seq);
4514  seq.MessageEnd();
4515 }
4516 
4518 {
4519  a.DEREncodeAsOctetString(out, MaxElementByteLength());
4520 }
4521 
4523 {
4524  a.BERDecodeAsOctetString(in, MaxElementByteLength());
4525 }
4526 
4528 {
4529  if (a.reg.size()==m_modulus.reg.size())
4530  {
4531  CryptoPP::DivideByPower2Mod(m_result.reg.begin(), a.reg, 1, m_modulus.reg, a.reg.size());
4532  return m_result;
4533  }
4534  else
4535  return m_result1 = (a.IsEven() ? (a >> 1) : ((a+m_modulus) >> 1));
4536 }
4537 
4538 const Integer& ModularArithmetic::Add(const Integer &a, const Integer &b) const
4539 {
4540  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4541  {
4542  if (CryptoPP::Add(m_result.reg.begin(), a.reg, b.reg, a.reg.size())
4543  || Compare(m_result.reg, m_modulus.reg, a.reg.size()) >= 0)
4544  {
4545  CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4546  }
4547  return m_result;
4548  }
4549  else
4550  {
4551  m_result1 = a+b;
4552  if (m_result1 >= m_modulus)
4553  m_result1 -= m_modulus;
4554  return m_result1;
4555  }
4556 }
4557 
4559 {
4560  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4561  {
4562  if (CryptoPP::Add(a.reg, a.reg, b.reg, a.reg.size())
4563  || Compare(a.reg, m_modulus.reg, a.reg.size()) >= 0)
4564  {
4565  CryptoPP::Subtract(a.reg, a.reg, m_modulus.reg, a.reg.size());
4566  }
4567  }
4568  else
4569  {
4570  a+=b;
4571  if (a>=m_modulus)
4572  a-=m_modulus;
4573  }
4574 
4575  return a;
4576 }
4577 
4578 const Integer& ModularArithmetic::Subtract(const Integer &a, const Integer &b) const
4579 {
4580  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4581  {
4582  if (CryptoPP::Subtract(m_result.reg.begin(), a.reg, b.reg, a.reg.size()))
4583  CryptoPP::Add(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4584  return m_result;
4585  }
4586  else
4587  {
4588  m_result1 = a-b;
4589  if (m_result1.IsNegative())
4590  m_result1 += m_modulus;
4591  return m_result1;
4592  }
4593 }
4594 
4596 {
4597  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4598  {
4599  if (CryptoPP::Subtract(a.reg, a.reg, b.reg, a.reg.size()))
4600  CryptoPP::Add(a.reg, a.reg, m_modulus.reg, a.reg.size());
4601  }
4602  else
4603  {
4604  a-=b;
4605  if (a.IsNegative())
4606  a+=m_modulus;
4607  }
4608 
4609  return a;
4610 }
4611 
4613 {
4614  if (!a)
4615  return a;
4616 
4617  CopyWords(m_result.reg.begin(), m_modulus.reg, m_modulus.reg.size());
4618  if (CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, a.reg, a.reg.size()))
4619  Decrement(m_result.reg.begin()+a.reg.size(), m_modulus.reg.size()-a.reg.size());
4620 
4621  return m_result;
4622 }
4623 
4624 Integer ModularArithmetic::CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
4625 {
4626  if (m_modulus.IsOdd())
4627  {
4628  MontgomeryRepresentation dr(m_modulus);
4629  return dr.ConvertOut(dr.CascadeExponentiate(dr.ConvertIn(x), e1, dr.ConvertIn(y), e2));
4630  }
4631  else
4632  return AbstractRing<Integer>::CascadeExponentiate(x, e1, y, e2);
4633 }
4634 
4635 void ModularArithmetic::SimultaneousExponentiate(Integer *results, const Integer &base, const Integer *exponents, unsigned int exponentsCount) const
4636 {
4637  if (m_modulus.IsOdd())
4638  {
4639  MontgomeryRepresentation dr(m_modulus);
4640  dr.SimultaneousExponentiate(results, dr.ConvertIn(base), exponents, exponentsCount);
4641  for (unsigned int i=0; i<exponentsCount; i++)
4642  results[i] = dr.ConvertOut(results[i]);
4643  }
4644  else
4645  AbstractRing<Integer>::SimultaneousExponentiate(results, base, exponents, exponentsCount);
4646 }
4647 
4649  : ModularArithmetic(m),
4650  m_u((word)0, m_modulus.reg.size()),
4651  m_workspace(5*m_modulus.reg.size())
4652 {
4653  if (!m_modulus.IsOdd())
4654  throw InvalidArgument("MontgomeryRepresentation: Montgomery representation requires an odd modulus");
4655 
4656  RecursiveInverseModPower2(m_u.reg, m_workspace, m_modulus.reg, m_modulus.reg.size());
4657 }
4658 
4660 {
4661  word *const T = m_workspace.begin();
4662  word *const R = m_result.reg.begin();
4663  const size_t N = m_modulus.reg.size();
4664  CRYPTOPP_ASSERT(a.reg.size()<=N && b.reg.size()<=N);
4665 
4666  AsymmetricMultiply(T, T+2*N, a.reg, a.reg.size(), b.reg, b.reg.size());
4667  SetWords(T+a.reg.size()+b.reg.size(), 0, 2*N-a.reg.size()-b.reg.size());
4668  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4669  return m_result;
4670 }
4671 
4673 {
4674  word *const T = m_workspace.begin();
4675  word *const R = m_result.reg.begin();
4676  const size_t N = m_modulus.reg.size();
4677  CRYPTOPP_ASSERT(a.reg.size()<=N);
4678 
4679  CryptoPP::Square(T, T+2*N, a.reg, a.reg.size());
4680  SetWords(T+2*a.reg.size(), 0, 2*N-2*a.reg.size());
4681  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4682  return m_result;
4683 }
4684 
4686 {
4687  word *const T = m_workspace.begin();
4688  word *const R = m_result.reg.begin();
4689  const size_t N = m_modulus.reg.size();
4690  CRYPTOPP_ASSERT(a.reg.size()<=N);
4691 
4692  CopyWords(T, a.reg, a.reg.size());
4693  SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4694  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4695  return m_result;
4696 }
4697 
4699 {
4700 // return (EuclideanMultiplicativeInverse(a, modulus)<<(2*WORD_BITS*modulus.reg.size()))%modulus;
4701  word *const T = m_workspace.begin();
4702  word *const R = m_result.reg.begin();
4703  const size_t N = m_modulus.reg.size();
4704  CRYPTOPP_ASSERT(a.reg.size()<=N);
4705 
4706  CopyWords(T, a.reg, a.reg.size());
4707  SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4708  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4709  unsigned k = AlmostInverse(R, T, R, N, m_modulus.reg, N);
4710 
4711 // cout << "k=" << k << " N*32=" << 32*N << endl;
4712 
4713  if (k>N*WORD_BITS)
4714  DivideByPower2Mod(R, R, k-N*WORD_BITS, m_modulus.reg, N);
4715  else
4716  MultiplyByPower2Mod(R, R, N*WORD_BITS-k, m_modulus.reg, N);
4717 
4718  return m_result;
4719 }
4720 
4721 // Specialization declared in misc.h to allow us to print integers
4722 // with additional control options, like arbirary bases and uppercase.
4723 template <> CRYPTOPP_DLL
4724 std::string IntToString<Integer>(Integer value, unsigned int base)
4725 {
4726  // Hack... set the high bit for uppercase. Set the next bit fo a suffix.
4727  static const unsigned int BIT_32 = (1U << 31);
4728  const bool UPPER = !!(base & BIT_32);
4729  static const unsigned int BIT_31 = (1U << 30);
4730  const bool BASE = !!(base & BIT_31);
4731 
4732  const char CH = UPPER ? 'A' : 'a';
4733  base &= ~(BIT_32|BIT_31);
4734  CRYPTOPP_ASSERT(base >= 2 && base <= 32);
4735 
4736  if (value == 0)
4737  return "0";
4738 
4739  bool negative = false, zero = false;
4740  if (value.IsNegative())
4741  {
4742  negative = true;
4743  value.Negate();
4744  }
4745 
4746  if (!value)
4747  zero = true;
4748 
4749  SecBlock<char> s(value.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
4750  Integer temp;
4751 
4752  unsigned int i=0;
4753  while (!!value)
4754  {
4755  word digit;
4756  Integer::Divide(digit, temp, value, word(base));
4757  s[i++]=char((digit < 10 ? '0' : (CH - 10)) + digit);
4758  value.swap(temp);
4759  }
4760 
4761  std::string result;
4762  result.reserve(i+2);
4763 
4764  if (negative)
4765  result += '-';
4766 
4767  if (zero)
4768  result += '0';
4769 
4770  while (i--)
4771  result += s[i];
4772 
4773  if (BASE)
4774  {
4775  if (base == 10)
4776  result += '.';
4777  else if (base == 16)
4778  result += 'h';
4779  else if (base == 8)
4780  result += 'o';
4781  else if (base == 2)
4782  result += 'b';
4783  }
4784 
4785  return result;
4786 }
4787 
4788 // Specialization declared in misc.h to avoid Coverity findings.
4789 template <> CRYPTOPP_DLL
4790 std::string IntToString<word64>(word64 value, unsigned int base)
4791 {
4792  // Hack... set the high bit for uppercase.
4793  static const unsigned int HIGH_BIT = (1U << 31);
4794  const char CH = !!(base & HIGH_BIT) ? 'A' : 'a';
4795  base &= ~HIGH_BIT;
4796 
4797  CRYPTOPP_ASSERT(base >= 2);
4798  if (value == 0)
4799  return "0";
4800 
4801  std::string result;
4802  while (value > 0)
4803  {
4804  word64 digit = value % base;
4805  result = char((digit < 10 ? '0' : (CH - 10)) + digit) + result;
4806  value /= base;
4807  }
4808  return result;
4809 }
4810 
4811 #ifndef CRYPTOPP_NO_ASSIGN_TO_INTEGER
4812 // Allow the linker to discard Integer code if not needed.
4813 // Also see http://github.com/weidai11/cryptopp/issues/389.
4814 bool AssignIntToInteger(const std::type_info &valueType, void *pInteger, const void *pInt)
4815 {
4816  if (valueType != typeid(Integer))
4817  return false;
4818  *reinterpret_cast<Integer *>(pInteger) = *reinterpret_cast<const int *>(pInt);
4819  return true;
4820 }
4821 #endif // CRYPTOPP_NO_ASSIGN_TO_INTEGER
4822 
4823 // *************************** C++ Static Initialization ***************************
4824 
4826 {
4827 public:
4828  InitInteger()
4829  {
4830  SetFunctionPointers();
4831  }
4832 };
4833 
4834 // This is not really needed because each Integer can dynamically initialize
4835 // itself, but we take a peephole optimization and initialize the class once
4836 // if init priorities are available. Dynamic initialization will be used if
4837 // init priorities are not available.
4838 
4839 #if defined(HAVE_GCC_INIT_PRIORITY)
4840  const InitInteger s_init __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 10))) = InitInteger();
4841  const Integer g_zero __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 11))) = Integer(0L);
4842  const Integer g_one __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 12))) = Integer(1L);
4843  const Integer g_two __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 13))) = Integer(2L);
4844 #elif defined(HAVE_MSC_INIT_PRIORITY)
4845  #pragma warning(disable: 4075)
4846  #pragma init_seg(".CRT$XCU")
4847  const InitInteger s_init;
4848  const Integer g_zero(0L);
4849  const Integer g_one(1L);
4850  const Integer g_two(2L);
4851  #pragma warning(default: 4075)
4852 #elif HAVE_XLC_INIT_PRIORITY
4853  // XLC needs constant, not a define
4854  #pragma priority(280)
4855  const InitInteger s_init;
4856  const Integer g_zero(0L);
4857  const Integer g_one(1L);
4858  const Integer g_two(2L);
4859 #else
4860  const InitInteger s_init;
4861 #endif
4862 
4863 // ***************** Library code ********************
4864 
4866 {
4867 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4868  return g_zero;
4869 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4870  static const Integer s_zero(0L);
4871  return s_zero;
4872 #else // Potential memory leak. Avoid if possible.
4873  return Singleton<Integer, NewInteger<0L> >().Ref();
4874 #endif
4875 }
4876 
4878 {
4879 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4880  return g_one;
4881 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4882  static const Integer s_one(1L);
4883  return s_one;
4884 #else // Potential memory leak. Avoid if possible.
4885  return Singleton<Integer, NewInteger<1L> >().Ref();
4886 #endif
4887 }
4888 
4890 {
4891 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4892  return g_two;
4893 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4894  static const Integer s_two(2L);
4895  return s_two;
4896 #else // Potential memory leak. Avoid if possible.
4897  return Singleton<Integer, NewInteger<2L> >().Ref();
4898 #endif
4899 }
4900 
4901 NAMESPACE_END
4902 
4903 #endif // CRYPTOPP_IMPORTS
Used to pass byte array input as part of a NameValuePairs object.
Definition: algparam.h:20
An invalid argument was detected.
Definition: cryptlib.h:202
unsigned int WordCount() const
Determines the number of words required to represent the Integer.
Definition: integer.cpp:3331
Integer MultiplicativeInverse() const
Calculate multiplicative inverse.
Definition: integer.cpp:4401
Integer & Reduce(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4595
bool NotZero() const
Determines if the Integer is non-0.
Definition: integer.h:333
Classes for working with NameValuePairs.
Integer Or(const Integer &t) const
Bitwise OR.
Definition: integer.cpp:3809
Integer & operator|=(const Integer &t)
Bitwise OR Assignment.
Definition: integer.cpp:4067
void swap(SecBlock< T, A > &b)
Swap contents with another SecBlock.
Definition: secblock.h:1041
Integer And(const Integer &t) const
Bitwise AND.
Definition: integer.cpp:3783
a number which is probabilistically prime
Definition: integer.h:95
Utility functions for the Crypto++ library.
Integer Plus(const Integer &b) const
Addition.
Definition: integer.cpp:3937
virtual size_t Peek(byte &outByte) const
Peek a 8-bit byte.
Definition: cryptlib.cpp:550
ByteOrder
Provides the byte ordering.
Definition: cryptlib.h:143
Restricts the instantiation of a class to one static object without locks.
Definition: misc.h:263
void CleanNew(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:980
an unsigned value
Definition: integer.h:85
T GetValueWithDefault(const char *name, T defaultValue) const
Get a named value.
Definition: cryptlib.h:363
bool IsSquare() const
Determine whether this integer is a perfect square.
Definition: integer.cpp:4390
void DEREncode(BufferedTransformation &bt) const
Encode in DER format.
Definition: integer.cpp:3432
size_t DEREncodeUnsigned(BufferedTransformation &out, T w, byte asnTag=INTEGER)
DER Encode unsigned value.
Definition: asn.h:461
virtual void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: cryptlib.cpp:311
size_t size() const
Length of the memory block.
Definition: algparam.h:84
size_t BitsToBytes(size_t bitCount)
Returns the number of 8-bit bytes or octets required for the specified number of bits.
Definition: misc.h:850
This file contains helper classes/functions for implementing public key algorithms.
void DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
Encode absolute value as big-endian octet string.
Definition: integer.cpp:3455
Integer & operator=(const Integer &t)
Assignment.
Definition: integer.cpp:3091
bool FirstPrime(Integer &p, const Integer &max, const Integer &equiv, const Integer &mod, const PrimeSelector *pSelector)
Finds a random prime of special form.
Definition: nbtheory.cpp:379
static Integer Gcd(const Integer &a, const Integer &n)
Calculate greatest common divisor.
Definition: integer.cpp:4425
void resize(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1031
Integer & operator+=(const Integer &t)
Addition Assignment.
Definition: integer.cpp:3960
size_t BitsToWords(size_t bitCount)
Returns the number of words required for the specified number of bits.
Definition: misc.h:870
void PutWord(bool assumeAligned, ByteOrder order, byte *block, T value, const byte *xorBlock=NULL)
Access a block of memory.
Definition: misc.h:2428
unsigned int BytePrecision(const T &value)
Returns the number of 8-bit bytes or octets required for a value.
Definition: misc.h:731
void CleanGrow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1013
Integer & operator--()
Pre-decrement.
Definition: integer.cpp:3763
virtual Element CascadeExponentiate(const Element &x, const Integer &e1, const Element &y, const Integer &e2) const
TODO.
Definition: algebra.cpp:323
Secure memory block with allocator and cleanup.
Definition: secblock.h:688
const Integer & Inverse(const Integer &a) const
Inverts the element in the ring.
Definition: integer.cpp:4612
signed long ConvertToLong() const
Convert the Integer to Long.
Definition: integer.cpp:3012
const Integer & Subtract(const Integer &a, const Integer &b) const
Subtracts elements in the ring.
Definition: integer.cpp:4578
void OpenPGPDecode(const byte *input, size_t inputLen)
Decode from OpenPGP format.
Definition: integer.cpp:3488
Signedness
Used when importing and exporting integers.
Definition: integer.h:83
unsigned int MaxElementByteLength() const
Provides the maximum byte size of an element in the ring.
Definition: modarith.h:232
ASN.1 object identifiers for algorthms and schemes.
Square block cipher.
Definition: square.h:24
Classes for automatic resource management.
bool IsP4()
Determines if the CPU is an Intel P4.
Definition: cpu.h:236
bool IsNegative() const
Determines if the Integer is negative.
Definition: integer.h:336
lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: queue.h:33
Library configuration file.
MontgomeryRepresentation(const Integer &modulus)
Construct a MontgomeryRepresentation.
Definition: integer.cpp:4648
static void DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
Extended Division.
Definition: integer.cpp:4209
Ring of congruence classes modulo n.
Definition: modarith.h:38
Interface for random number generators.
Definition: cryptlib.h:1383
size_t BytesToWords(size_t byteCount)
Returns the number of words required for the specified number of bytes.
Definition: misc.h:860
Common C++ header files.
void Randomize(RandomNumberGenerator &rng, size_t bitCount)
Set this Integer to random integer.
Definition: integer.cpp:3503
std::string IntToString< word64 >(word64 value, unsigned int base)
Converts an unsigned value to a string.
Definition: integer.cpp:4790
void New(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:965
void SetByte(size_t n, byte value)
Set the n-th byte to value.
Definition: integer.cpp:3139
Integer InverseMod(const Integer &n) const
Calculate multiplicative inverse.
Definition: integer.cpp:4430
the value is negative
Definition: integer.h:77
SecBlock<byte> typedef.
Definition: secblock.h:1058
BER Sequence Decoder.
Definition: asn.h:309
Interface for buffered transformations.
Definition: cryptlib.h:1598
bool IsPositive() const
Determines if the Integer is positive.
Definition: integer.h:342
bool NotNegative() const
Determines if the Integer is non-negative.
Definition: integer.h:339
const Integer & Add(const Integer &a, const Integer &b) const
Adds elements in the ring.
Definition: integer.cpp:4538
static const Integer & One()
Integer representing 1.
Definition: integer.cpp:4877
Integer & Accumulate(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4558
P1363 key derivation function.
Definition: pubkey.h:729
byte order is little-endian
Definition: cryptlib.h:145
Sign
Used internally to represent the integer.
Definition: integer.h:73
Integer ConvertIn(const Integer &a) const
Reduces an element in the congruence class.
Definition: modarith.h:292
bool operator!() const
Negation.
Definition: integer.cpp:3086
Pointer that overloads operator ->
Definition: smartptr.h:36
bool IsUnit() const
Determine if 1 or -1.
Definition: integer.cpp:4396
unsigned int ByteCount() const
Determines the number of bytes required to represent the Integer.
Definition: integer.cpp:3336
Classes and functions for secure memory allocations.
void DEREncodeElement(BufferedTransformation &out, const Element &a) const
Encodes element in DER format.
Definition: integer.cpp:4517
Integer Modulo(const Integer &b) const
Remainder.
Definition: integer.cpp:4244
Copy input to a memory buffer.
Definition: filters.h:1136
void BERDecodeElement(BufferedTransformation &in, Element &a) const
Decodes element in DER format.
Definition: integer.cpp:4522
size_t MinEncodedSize(Signedness sign=UNSIGNED) const
Minimum number of bytes to encode this integer.
Definition: integer.cpp:3393
Integer DividedBy(const Integer &b) const
Division.
Definition: integer.cpp:4237
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: integer.cpp:4624
Integer Times(const Integer &b) const
Multiplication.
Definition: integer.cpp:4133
a number with no special properties
Definition: integer.h:93
const byte * begin() const
Pointer to the first byte in the memory block.
Definition: algparam.h:80
size_t Put(byte inByte, bool blocking=true)
Input a byte for processing.
Definition: cryptlib.h:1620
Integer & operator++()
Pre-increment.
Definition: integer.cpp:3742
AlgorithmParameters MakeParameters(const char *name, const T &value, bool throwIfNotUsed=true)
Create an object that implements NameValuePairs.
Definition: algparam.h:502
void swap(Integer &a)
Swaps this Integer with another Integer.
Definition: integer.cpp:3169
Integer()
Creates the zero integer.
Definition: integer.cpp:2958
unsigned int TrailingZeros(word32 v)
Determines the number of trailing 0-bits in a value.
Definition: misc.h:779
Integer & operator<<=(size_t n)
Left-shift Assignment.
Definition: integer.cpp:4029
bool IsZero() const
Determines if the Integer is 0.
Definition: integer.h:330
Exception thrown when an error is encountered decoding an OpenPGP integer.
Definition: integer.h:282
void Negate()
Reverse the Sign of the Integer.
Definition: integer.cpp:4336
int Compare(const Integer &a) const
Perform signed comparison.
Definition: integer.cpp:4354
T Crop(T value, size_t bits)
Truncates the value to the specified number of bits.
Definition: misc.h:838
void BERDecodeAsOctetString(BufferedTransformation &bt, size_t length)
Decode nonnegative value from big-endian octet string.
Definition: integer.cpp:3462
Application callback to signal suitability of a cabdidate prime.
Definition: nbtheory.h:113
void ConditionalSwapPointers(bool c, T &a, T &b)
Performs a branchless swap of pointers a and b if condition c is true.
Definition: misc.h:1269
static Integer Power2(size_t e)
Exponentiates to a power of 2.
Definition: integer.cpp:3079
Multiple precision integer with arithmetic operations.
Definition: integer.h:49
Precompiled header file.
virtual Element Exponentiate(const Element &a, const Integer &e) const
Raises a base to an exponent in the group.
Definition: algebra.cpp:316
const Integer & Multiply(const Integer &a, const Integer &b) const
Multiplies elements in the ring.
Definition: integer.cpp:4659
a signed value
Definition: integer.h:87
size_t OpenPGPEncode(byte *output, size_t bufferSize) const
Encode absolute value in OpenPGP format.
Definition: integer.cpp:3471
const Integer & Half(const Integer &a) const
Divides an element by 2.
Definition: integer.cpp:4527
static const Integer & Two()
Integer representing 2.
Definition: integer.cpp:4889
virtual lword Skip(lword skipMax=LWORD_MAX)
Discard skipMax bytes from the output buffer.
Definition: cryptlib.cpp:573
bool IsPowerOf2(const T &value)
Tests whether a value is a power of 2.
Definition: misc.h:921
const Integer & Square(const Integer &a) const
Square an element in the ring.
Definition: integer.cpp:4672
Integer & operator^=(const Integer &t)
Bitwise XOR Assignment.
Definition: integer.cpp:4088
#define MEMORY_BARRIER
A memory barrier.
Definition: misc.h:227
RandomNumberType
Properties of a random integer.
Definition: integer.h:91
const char * Seed()
ConstByteArrayParameter.
Definition: argnames.h:19
bool IsEven() const
Determines if the Integer is even parity.
Definition: integer.h:348
byte order is big-endian
Definition: cryptlib.h:147
const T & STDMIN(const T &a, const T &b)
Replacement function for std::min.
Definition: misc.h:567
String-based implementation of Store interface.
Definition: filters.h:1195
#define CRYPTOPP_ASSERT(exp)
Debugging and diagnostic assertion.
Definition: trap.h:69
static void Divide(Integer &r, Integer &q, const Integer &a, const Integer &d)
Extended Division.
Definition: integer.cpp:4191
ModularArithmetic(const Integer &modulus=Integer::One())
Construct a ModularArithmetic.
Definition: modarith.h:49
void BERDecodeError()
Raises a BERDecodeErr.
Definition: asn.h:69
Data structure used to store byte strings.
Definition: queue.h:18
Functions for CPU features and intrinsics.
Classes and functions for working with ANS.1 objects.
Classes for SHA-1 and SHA-2 family of message digests.
void SetBit(size_t n, bool value=1)
Set the n-th bit to value.
Definition: integer.cpp:3114
size_t GetWord16(word16 &value, ByteOrder order=BIG_ENDIAN_ORDER)
Retrieve a 16-bit word.
Definition: cryptlib.cpp:793
iterator begin()
Provides an iterator pointing to the first element in the memory block.
Definition: secblock.h:772
unsigned int BitCount() const
Determines the number of bits required to represent the Integer.
Definition: integer.cpp:3345
const char * PointerToPrimeSelector()
const PrimeSelector *
Definition: argnames.h:42
Implementation of BufferedTransformation&#39;s attachment interface.
Classes and functions for number theoretic operations.
byte GetByte(size_t i) const
Provides the i-th byte of the Integer.
Definition: integer.cpp:3128
Exception thrown when division by 0 is encountered.
Definition: integer.h:55
DER Sequence Encoder.
Definition: asn.h:319
void Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const
Encode in big-endian format.
Definition: integer.cpp:3410
Integer Squared() const
Multiply this integer by itself.
Definition: integer.h:609
T1 SaturatingSubtract1(const T1 &a, const T2 &b)
Performs a saturating subtract clamped at 1.
Definition: misc.h:1021
virtual lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: cryptlib.cpp:504
Exception thrown when a random number cannot be found that satisfies the condition.
Definition: integer.h:63
bool GenerateRandomNoThrow(RandomNumberGenerator &rng, const NameValuePairs &params=g_nullNameValuePairs)
Generate a random number.
Definition: integer.cpp:3567
Performs modular arithmetic in Montgomery representation for increased speed.
Definition: modarith.h:274
DER General Encoder.
Definition: asn.h:291
bool HasSSE2()
Determines SSE2 availability.
Definition: cpu.h:116
Integer Minus(const Integer &b) const
Subtraction.
Definition: integer.cpp:3983
void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: integer.cpp:3545
void Decode(const byte *input, size_t inputLen, Signedness sign=UNSIGNED)
Decode from big-endian byte array.
Definition: integer.cpp:3354
size_t PutWord16(word16 value, ByteOrder order=BIG_ENDIAN_ORDER, bool blocking=true)
Input a 16-bit word for processing.
Definition: cryptlib.cpp:745
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: modarith.h:306
size_t DEREncodeOctetString(BufferedTransformation &bt, const byte *str, size_t strLen)
DER encode octet string.
Definition: asn.cpp:104
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: integer.cpp:4635
Multiple precision integer with arithmetic operations.
Integer Xor(const Integer &t) const
Bitwise XOR.
Definition: integer.cpp:3835
static const Integer & Zero()
Integer representing 0.
Definition: integer.cpp:4865
const T & STDMAX(const T &a, const T &b)
Replacement function for std::max.
Definition: misc.h:578
Euclidean domain.
Definition: algebra.h:315
void Grow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:995
void BERDecode(const byte *input, size_t inputLen)
Decode from BER format.
Definition: integer.cpp:3439
std::string IntToString< Integer >(Integer value, unsigned int base)
Converts an Integer to a string.
Definition: integer.cpp:4724
virtual size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: cryptlib.cpp:527
Class file for performing modular arithmetic.
Crypto++ library namespace.
bool GetValue(const char *name, T &value) const
Get a named value.
Definition: cryptlib.h:350
Integer & operator>>=(size_t n)
Right-shift Assignment.
Definition: integer.cpp:4041
bool GetBit(size_t i) const
Provides the i-th bit of the Integer.
Definition: integer.cpp:3103
Integer SquareRoot() const
Extract square root.
Definition: integer.cpp:4372
BER General Decoder.
Definition: asn.h:258
bool GetIntValue(const char *name, int &value) const
Get a named value with type int.
Definition: cryptlib.h:386
bool IsConvertableToLong() const
Determines if the Integer is convertable to Long.
Definition: integer.cpp:2998
Object Identifier.
Definition: asn.h:166
size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: queue.cpp:301
Integer AbsoluteValue() const
Retrieve the absolute value of this integer.
Definition: integer.cpp:3162
Integer & operator-=(const Integer &t)
Subtraction Assignment.
Definition: integer.cpp:4006
unsigned int BitPrecision(const T &value)
Returns the number of bits required for a value.
Definition: misc.h:754
size_type size() const
Provides the count of elements in the SecBlock.
Definition: secblock.h:797
Integer & operator &=(const Integer &t)
Bitwise AND Assignment.
void DEREncode(BufferedTransformation &bt) const
Encodes in DER format.
Definition: integer.cpp:4509
lword GetBits(size_t i, size_t n) const
Provides the low order bits of the Integer.
Definition: integer.cpp:3146
Integer operator-() const
Subtraction.
Definition: integer.cpp:3155
the value is positive or 0
Definition: integer.h:75
const Integer & MultiplicativeInverse(const Integer &a) const
Calculate the multiplicative inverse of an element in the ring.
Definition: integer.cpp:4698
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: modarith.h:309
bool IsOdd() const
Determines if the Integer is odd parity.
Definition: integer.h:351
virtual void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition: algebra.cpp:334
Interface for retrieving values given their names.
Definition: cryptlib.h:293
Integer ConvertOut(const Integer &a) const
Reduces an element in the congruence class.
Definition: integer.cpp:4685