Crypto++  8.2
Free C++ class library of cryptographic schemes
integer.cpp
1 // integer.cpp - originally written and placed in the public domain by Wei Dai
2 // contains public domain code contributed by Alister Lee and Leonard Janke
3 
4 // Notes by JW: The Integer class needs to do two things. First, it needs
5 // to set function pointers on some platforms, like X86 and X64. The
6 // function pointers select a fast multiply and addition based on the cpu.
7 // Second, it wants to create Integer::Zero(), Integer::One() and
8 // Integer::Two().
9 // The function pointers are initialized in the InitializeInteger class by
10 // calling SetFunctionPointers(). The call to SetFunctionPointers() is
11 // guarded to run once using a double-checked pattern. We don't use C++
12 // std::call_once due to bad interactions between libstdc++, glibc and
13 // pthreads. The bad interactions were causing crashes for us on platforms
14 // like Sparc and PowerPC. Since we are only setting function pointers we
15 // don't have to worry about leaking memory. The worst case seems to be the
16 // pointers gets written twice until the init flag is set and visible to
17 // all threads.
18 // For Integer::Zero(), Integer::One() and Integer::Two(), we use one of three
19 // strategies. First, if initialization priorities are available then we use
20 // them. Initialization priorities are init_priority() on Linux and init_seg()
21 // on Windows. OS X and several other platforms lack them. Initialization
22 // priorities are platform specific but they are also the most trouble free
23 // with determisitic destruction.
24 // Second, if C++11 dynamic initialization is available, then we use it. After
25 // the std::call_once fiasco we moved to dynamic initialization to avoid
26 // unknown troubles platforms that are tested less frequently. In addition
27 // Microsoft platforms mostly do not provide dynamic initialization.
28 // The MSDN docs claim they do but they don't in practice because we need
29 // Visual Studio 2017 and Windows 10 or above.
30 // Third, we fall back to Wei's original code of a Singleton. Wei's original
31 // code was much simpler. It simply used the Singleton pattern, but it always
32 // produced memory findings on some platforms. The Singleton generates memory
33 // findings because it uses a Create On First Use pattern (a dumb Nifty
34 // Counter) and the compiler had to be smart enough to fold them to return
35 // the same object. Unix and Linux compilers do a good job of folding objects,
36 // but Microsoft compilers do a rather poor job for some versions of the
37 // compiler.
38 // Another problem with the Singleton is resource destruction requires running
39 // resource acquisition in reverse. For resources provided through the
40 // Singletons, there is no way to express the dependency order to safely
41 // destroy resources. (That's one of the problems C++11 dynamic
42 // intitialization with concurrent execution is supposed to solve).
43 // The final problem with Singletons is resource/memory exhaustion in languages
44 // like Java and .Net. Java and .Net load and unload a native DLL hundreds or
45 // thousands of times during the life of a program. Each load produces a
46 // memory leak and they can add up quickly. If they library is being used in
47 // Java or .Net then Singleton must be avoided at all costs.
48 //
49 // The code below has a path cut-in for BMI2 using mulx and adcx instructions.
50 // There was a modest speedup of approximately 0.03 ms in public key Integer
51 // operations. We had to disable BMI2 for the moment because some OS X machines
52 // were advertising BMI/BMI2 support but caused SIGILL's at runtime. Also see
53 // https://github.com/weidai11/cryptopp/issues/850.
54 
55 #include "pch.h"
56 #include "config.h"
57 
58 #ifndef CRYPTOPP_IMPORTS
59 
60 #include "integer.h"
61 #include "secblock.h"
62 #include "modarith.h"
63 #include "nbtheory.h"
64 #include "smartptr.h"
65 #include "algparam.h"
66 #include "filters.h"
67 #include "stdcpp.h"
68 #include "asn.h"
69 #include "oids.h"
70 #include "words.h"
71 #include "pubkey.h" // for P1363_KDF2
72 #include "sha.h"
73 #include "cpu.h"
74 #include "misc.h"
75 
76 #include <iostream>
77 
78 #if (_MSC_VER >= 1400) && !defined(_M_ARM)
79  #include <intrin.h>
80 #endif
81 
82 #ifdef __DECCXX
83  #include <c_asm.h>
84 #endif
85 
86 // "Error: The operand ___LKDB cannot be assigned to",
87 // http://github.com/weidai11/cryptopp/issues/188
88 #if (__SUNPRO_CC >= 0x5130)
89 # define MAYBE_CONST
90 # define MAYBE_UNCONST_CAST(x) const_cast<word*>(x)
91 #else
92 # define MAYBE_CONST const
93 # define MAYBE_UNCONST_CAST(x) x
94 #endif
95 
96 // "Inline assembly operands don't work with .intel_syntax",
97 // http://llvm.org/bugs/show_bug.cgi?id=24232
98 #if CRYPTOPP_BOOL_X32 || defined(CRYPTOPP_DISABLE_MIXED_ASM)
99 # undef CRYPTOPP_X86_ASM_AVAILABLE
100 # undef CRYPTOPP_X32_ASM_AVAILABLE
101 # undef CRYPTOPP_X64_ASM_AVAILABLE
102 # undef CRYPTOPP_SSE2_ASM_AVAILABLE
103 # undef CRYPTOPP_SSSE3_ASM_AVAILABLE
104 #else
105 # define CRYPTOPP_INTEGER_SSE2 (CRYPTOPP_SSE2_ASM_AVAILABLE && (CRYPTOPP_BOOL_X86))
106 #endif
107 
108 // ***************** C++ Static Initialization ********************
109 
110 NAMESPACE_BEGIN(CryptoPP)
111 
112 // Function body near the middle of the file
113 static void SetFunctionPointers();
114 
115 // Use a double-checked pattern. We are not leaking anything so it
116 // does not matter if a pointer is written twice during a race.
117 // Avoid std::call_once due to too many problems on platforms like
118 // Solaris and Sparc. Also see
119 // http://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146 and
120 // http://github.com/weidai11/cryptopp/issues/707.
121 InitializeInteger::InitializeInteger()
122 {
123  static bool s_flag;
124  MEMORY_BARRIER();
125  if (s_flag == false)
126  {
127  SetFunctionPointers();
128  s_flag = true;
129  MEMORY_BARRIER();
130  }
131 }
132 
133 template <long i>
135 {
136  Integer * operator()() const
137  {
138  return new Integer(i);
139  }
140 };
141 
142 // ***************** Library code ********************
143 
144 inline static int Compare(const word *A, const word *B, size_t N)
145 {
146  while (N--)
147  if (A[N] > B[N])
148  return 1;
149  else if (A[N] < B[N])
150  return -1;
151 
152  return 0;
153 }
154 
155 inline static int Increment(word *A, size_t N, word B=1)
156 {
157  CRYPTOPP_ASSERT(N);
158  word t = A[0];
159  A[0] = t+B;
160  if (A[0] >= t)
161  return 0;
162  for (unsigned i=1; i<N; i++)
163  if (++A[i])
164  return 0;
165  return 1;
166 }
167 
168 inline static int Decrement(word *A, size_t N, word B=1)
169 {
170  CRYPTOPP_ASSERT(N);
171  word t = A[0];
172  A[0] = t-B;
173  if (A[0] <= t)
174  return 0;
175  for (unsigned i=1; i<N; i++)
176  if (A[i]--)
177  return 0;
178  return 1;
179 }
180 
181 static void TwosComplement(word *A, size_t N)
182 {
183  Decrement(A, N);
184  for (unsigned i=0; i<N; i++)
185  A[i] = ~A[i];
186 }
187 
188 static word AtomicInverseModPower2(word A)
189 {
190  CRYPTOPP_ASSERT(A%2==1);
191 
192  word R=A%8;
193 
194  for (unsigned i=3; i<WORD_BITS; i*=2)
195  R = R*(2-R*A);
196 
197  CRYPTOPP_ASSERT(R*A==1);
198  return R;
199 }
200 
201 // ********************************************************
202 
203 #if !defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE) || (defined(__x86_64__) && defined(CRYPTOPP_WORD128_AVAILABLE))
204  #define TWO_64_BIT_WORDS 1
205  #define Declare2Words(x) word x##0, x##1;
206  #define AssignWord(a, b) a##0 = b; a##1 = 0;
207  #define Add2WordsBy1(a, b, c) a##0 = b##0 + c; a##1 = b##1 + (a##0 < c);
208  #define LowWord(a) a##0
209  #define HighWord(a) a##1
210  #ifdef _MSC_VER
211  #define MultiplyWordsLoHi(p0, p1, a, b) p0 = _umul128(a, b, &p1);
212  #ifndef __INTEL_COMPILER
213  #define Double3Words(c, d) d##1 = __shiftleft128(d##0, d##1, 1); d##0 = __shiftleft128(c, d##0, 1); c *= 2;
214  #endif
215  #elif defined(__DECCXX)
216  #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; p1 = asm("umulh %a0, %a1, %v0", a, b);
217  #elif defined(__x86_64__)
218  #if defined(__SUNPRO_CC) && __SUNPRO_CC < 0x5100
219  // Sun Studio's gcc-style inline assembly is heavily bugged as of version 5.9 Patch 124864-09 2008/12/16, but this one works
220  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "r"(b) : "cc");
221  #elif defined(__BMI2__) && 0
222  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulxq %3, %0, %1" : "=r"(p0), "=r"(p1) : "d"(a), "r"(b));
223  #define MulAcc(c, d, a, b) asm ("mulxq %6, %3, %4; addq %3, %0; adcxq %4, %1; adcxq %7, %2;" : "+&r"(c), "+&r"(d##0), "+&r"(d##1), "=&r"(p0), "=&r"(p1) : "d"(a), "r"(b), "r"(W64LIT(0)) : "cc");
224  #define Double3Words(c, d) asm ("addq %0, %0; adcxq %1, %1; adcxq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
225  #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcxq %3, %1;" : "+&r"(a##0), "+r"(a##1) : "r"(b), "r"(W64LIT(0)) : "cc");
226  #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcxq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
227  #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcxq %6, %1; adcxq %7, %2;" : "+r"(c), "=&r"(e##0), "=&r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1), "r"(W64LIT(0)) : "cc");
228  #else
229  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
230  #define MulAcc(c, d, a, b) asm ("mulq %6; addq %3, %0; adcq %4, %1; adcq $0, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1), "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
231  #define Double3Words(c, d) asm ("addq %0, %0; adcq %1, %1; adcq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
232  #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcq $0, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b) : "cc");
233  #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
234  #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcq %6, %1; adcq $0, %2;" : "+r"(c), "=r"(e##0), "=r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1) : "cc");
235  #endif
236  #endif
237  #define MultiplyWords(p, a, b) MultiplyWordsLoHi(p##0, p##1, a, b)
238  #ifndef Double3Words
239  #define Double3Words(c, d) d##1 = 2*d##1 + (d##0>>(WORD_BITS-1)); d##0 = 2*d##0 + (c>>(WORD_BITS-1)); c *= 2;
240  #endif
241  #ifndef Acc2WordsBy2
242  #define Acc2WordsBy2(a, b) a##0 += b##0; a##1 += a##0 < b##0; a##1 += b##1;
243  #endif
244  #define AddWithCarry(u, a, b) {word t = a+b; u##0 = t + u##1; u##1 = (t<a) + (u##0<t);}
245  #define SubtractWithBorrow(u, a, b) {word t = a-b; u##0 = t - u##1; u##1 = (t>a) + (u##0>t);}
246  #define GetCarry(u) u##1
247  #define GetBorrow(u) u##1
248 #else
249  #define Declare2Words(x) dword x;
250  #if _MSC_VER >= 1400 && !defined(__INTEL_COMPILER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64))
251  #define MultiplyWords(p, a, b) p = __emulu(a, b);
252  #else
253  #define MultiplyWords(p, a, b) p = (dword)a*b;
254  #endif
255  #define AssignWord(a, b) a = b;
256  #define Add2WordsBy1(a, b, c) a = b + c;
257  #define Acc2WordsBy2(a, b) a += b;
258  #define LowWord(a) word(a)
259  #define HighWord(a) word(a>>WORD_BITS)
260  #define Double3Words(c, d) d = 2*d + (c>>(WORD_BITS-1)); c *= 2;
261  #define AddWithCarry(u, a, b) u = dword(a) + b + GetCarry(u);
262  #define SubtractWithBorrow(u, a, b) u = dword(a) - b - GetBorrow(u);
263  #define GetCarry(u) HighWord(u)
264  #define GetBorrow(u) word(u>>(WORD_BITS*2-1))
265 #endif
266 #ifndef MulAcc
267  #define MulAcc(c, d, a, b) MultiplyWords(p, a, b); Acc2WordsBy1(p, c); c = LowWord(p); Acc2WordsBy1(d, HighWord(p));
268 #endif
269 #ifndef Acc2WordsBy1
270  #define Acc2WordsBy1(a, b) Add2WordsBy1(a, a, b)
271 #endif
272 #ifndef Acc3WordsBy2
273  #define Acc3WordsBy2(c, d, e) Acc2WordsBy1(e, c); c = LowWord(e); Add2WordsBy1(e, d, HighWord(e));
274 #endif
275 
276 class DWord
277 {
278 public:
279 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
280  DWord() {std::memset(&m_whole, 0x00, sizeof(m_whole));}
281 #else
282  DWord() {std::memset(&m_halfs, 0x00, sizeof(m_halfs));}
283 #endif
284 
285 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
286  explicit DWord(word low) : m_whole(low) { }
287 #else
288  explicit DWord(word low)
289  {
290  m_halfs.high = 0;
291  m_halfs.low = low;
292  }
293 #endif
294 
295 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
296  DWord(word low, word high) : m_whole()
297 #else
298  DWord(word low, word high) : m_halfs()
299 #endif
300  {
301 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
302 # if (CRYPTOPP_LITTLE_ENDIAN)
303  const word t[2] = {low,high};
304  memcpy(&m_whole, t, sizeof(m_whole));
305 # else
306  const word t[2] = {high,low};
307  memcpy(&m_whole, t, sizeof(m_whole));
308 # endif
309 #else
310  m_halfs.low = low;
311  m_halfs.high = high;
312 #endif
313  }
314 
315  static DWord Multiply(word a, word b)
316  {
317  DWord r;
318  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
319  r.m_whole = (dword)a * b;
320  #elif defined(MultiplyWordsLoHi)
321  MultiplyWordsLoHi(r.m_halfs.low, r.m_halfs.high, a, b);
322  #else
323  CRYPTOPP_ASSERT(0);
324  #endif
325  return r;
326  }
327 
328  static DWord MultiplyAndAdd(word a, word b, word c)
329  {
330  DWord r = Multiply(a, b);
331  return r += c;
332  }
333 
334  DWord & operator+=(word a)
335  {
336  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
337  m_whole = m_whole + a;
338  #else
339  m_halfs.low += a;
340  m_halfs.high += (m_halfs.low < a);
341  #endif
342  return *this;
343  }
344 
345  DWord operator+(word a)
346  {
347  DWord r;
348  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
349  r.m_whole = m_whole + a;
350  #else
351  r.m_halfs.low = m_halfs.low + a;
352  r.m_halfs.high = m_halfs.high + (r.m_halfs.low < a);
353  #endif
354  return r;
355  }
356 
357  DWord operator-(DWord a)
358  {
359  DWord r;
360  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
361  r.m_whole = m_whole - a.m_whole;
362  #else
363  r.m_halfs.low = m_halfs.low - a.m_halfs.low;
364  r.m_halfs.high = m_halfs.high - a.m_halfs.high - (r.m_halfs.low > m_halfs.low);
365  #endif
366  return r;
367  }
368 
369  DWord operator-(word a)
370  {
371  DWord r;
372  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
373  r.m_whole = m_whole - a;
374  #else
375  r.m_halfs.low = m_halfs.low - a;
376  r.m_halfs.high = m_halfs.high - (r.m_halfs.low > m_halfs.low);
377  #endif
378  return r;
379  }
380 
381  // returns quotient, which must fit in a word
382  word operator/(word divisor);
383 
384  word operator%(word a);
385 
386  bool operator!() const
387  {
388  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
389  return !m_whole;
390  #else
391  return !m_halfs.high && !m_halfs.low;
392  #endif
393  }
394 
395  // TODO: When NATIVE_DWORD is in effect, we access high and low, which are inactive
396  // union members, and that's UB. Also see http://stackoverflow.com/q/11373203.
397  word GetLowHalf() const {return m_halfs.low;}
398  word GetHighHalf() const {return m_halfs.high;}
399  word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
400 
401 private:
402  // Issue 274, "Types cannot be declared in anonymous union",
403  // http://github.com/weidai11/cryptopp/issues/274
404  // Thanks to Martin Bonner at http://stackoverflow.com/a/39507183
405  struct half_words
406  {
407  #if (CRYPTOPP_LITTLE_ENDIAN)
408  word low;
409  word high;
410  #else
411  word high;
412  word low;
413  #endif
414  };
415  union
416  {
417  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
418  dword m_whole;
419  #endif
420  half_words m_halfs;
421  };
422 };
423 
424 class Word
425 {
426 public:
427  Word() : m_whole(0) {}
428  Word(word value) : m_whole(value) {}
429  Word(hword low, hword high) : m_whole(low | (word(high) << (WORD_BITS/2))) {}
430 
431  static Word Multiply(hword a, hword b)
432  {
433  Word r;
434  r.m_whole = (word)a * b;
435  return r;
436  }
437 
438  Word operator-(Word a)
439  {
440  Word r;
441  r.m_whole = m_whole - a.m_whole;
442  return r;
443  }
444 
445  Word operator-(hword a)
446  {
447  Word r;
448  r.m_whole = m_whole - a;
449  return r;
450  }
451 
452  // returns quotient, which must fit in a word
453  hword operator/(hword divisor)
454  {
455  return hword(m_whole / divisor);
456  }
457 
458  bool operator!() const
459  {
460  return !m_whole;
461  }
462 
463  word GetWhole() const {return m_whole;}
464  hword GetLowHalf() const {return hword(m_whole);}
465  hword GetHighHalf() const {return hword(m_whole>>(WORD_BITS/2));}
466  hword GetHighHalfAsBorrow() const {return 0-hword(m_whole>>(WORD_BITS/2));}
467 
468 private:
469  word m_whole;
470 };
471 
472 // do a 3 word by 2 word divide, returns quotient and leaves remainder in A
473 template <class S, class D>
474 S DivideThreeWordsByTwo(S *A, S B0, S B1, D *dummy=NULLPTR)
475 {
476  CRYPTOPP_UNUSED(dummy);
477 
478  // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a S
479  CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
480 
481  // estimate the quotient: do a 2 S by 1 S divide.
482  // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
483  // The code change occurred at Commit dc99266599a0e72d.
484 
485  S Q; bool pre = (S(B1+1) == 0);
486  if (B1 > 0 && !pre)
487  Q = D(A[1], A[2]) / S(B1+1);
488  else if (pre)
489  Q = A[2];
490  else
491  Q = D(A[0], A[1]) / B0;
492 
493  // now subtract Q*B from A
494  D p = D::Multiply(B0, Q);
495  D u = (D) A[0] - p.GetLowHalf();
496  A[0] = u.GetLowHalf();
497  u = (D) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - D::Multiply(B1, Q);
498  A[1] = u.GetLowHalf();
499  A[2] += u.GetHighHalf();
500 
501  // Q <= actual quotient, so fix it
502  while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
503  {
504  u = (D) A[0] - B0;
505  A[0] = u.GetLowHalf();
506  u = (D) A[1] - B1 - u.GetHighHalfAsBorrow();
507  A[1] = u.GetLowHalf();
508  A[2] += u.GetHighHalf();
509  Q++;
510  CRYPTOPP_ASSERT(Q); // shouldn't overflow
511  }
512 
513  return Q;
514 }
515 
516 // do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
517 template <class S, class D>
518 inline D DivideFourWordsByTwo(S *T, const D &Al, const D &Ah, const D &B)
519 {
520  // Profiling tells us the original second case was dominant, so it was
521  // promoted to the first If statement. The code change occurred at
522  // Commit dc99266599a0e72d.
523 
524  if (!!B)
525  {
526  S Q[2];
527  T[0] = Al.GetLowHalf();
528  T[1] = Al.GetHighHalf();
529  T[2] = Ah.GetLowHalf();
530  T[3] = Ah.GetHighHalf();
531  Q[1] = DivideThreeWordsByTwo<S, D>(T+1, B.GetLowHalf(), B.GetHighHalf());
532  Q[0] = DivideThreeWordsByTwo<S, D>(T, B.GetLowHalf(), B.GetHighHalf());
533  return D(Q[0], Q[1]);
534  }
535  else // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
536  {
537  return D(Ah.GetLowHalf(), Ah.GetHighHalf());
538  }
539 }
540 
541 // returns quotient, which must fit in a word
542 inline word DWord::operator/(word a)
543 {
544  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
545  return word(m_whole / a);
546  #else
547  hword r[4];
548  return DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a).GetWhole();
549  #endif
550 }
551 
552 inline word DWord::operator%(word a)
553 {
554  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
555  return word(m_whole % a);
556  #else
557  if (a < (word(1) << (WORD_BITS/2)))
558  {
559  hword h = hword(a);
560  word r = m_halfs.high % h;
561  r = ((m_halfs.low >> (WORD_BITS/2)) + (r << (WORD_BITS/2))) % h;
562  return hword((hword(m_halfs.low) + (r << (WORD_BITS/2))) % h);
563  }
564  else
565  {
566  hword r[4];
567  DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a);
568  return Word(r[0], r[1]).GetWhole();
569  }
570  #endif
571 }
572 
573 // ********************************************************
574 
575 // Use some tricks to share assembly code between MSVC, GCC, Clang and Sun CC.
576 #if defined(__GNUC__)
577  #define AddPrologue \
578  int result; \
579  __asm__ __volatile__ \
580  ( \
581  INTEL_NOPREFIX
582  #define AddEpilogue \
583  ATT_PREFIX \
584  : "=a" (result)\
585  : "d" (C), "a" (A), "D" (B), "c" (N) \
586  : "%esi", "memory", "cc" \
587  );\
588  return result;
589  #define MulPrologue \
590  __asm__ __volatile__ \
591  ( \
592  INTEL_NOPREFIX \
593  AS1( push ebx) \
594  AS2( mov ebx, edx)
595  #define MulEpilogue \
596  AS1( pop ebx) \
597  ATT_PREFIX \
598  : \
599  : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B) \
600  : "%esi", "memory", "cc" \
601  );
602  #define SquPrologue MulPrologue
603  #define SquEpilogue \
604  AS1( pop ebx) \
605  ATT_PREFIX \
606  : \
607  : "d" (s_maskLow16), "c" (C), "a" (A) \
608  : "%esi", "%edi", "memory", "cc" \
609  );
610  #define TopPrologue MulPrologue
611  #define TopEpilogue \
612  AS1( pop ebx) \
613  ATT_PREFIX \
614  : \
615  : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B), "S" (L) \
616  : "memory", "cc" \
617  );
618 #else
619  #define AddPrologue \
620  __asm push edi \
621  __asm push esi \
622  __asm mov eax, [esp+12] \
623  __asm mov edi, [esp+16]
624  #define AddEpilogue \
625  __asm pop esi \
626  __asm pop edi \
627  __asm ret 8
628  #define SaveEBX
629  #define RestoreEBX
630  #define SquPrologue \
631  AS2( mov eax, A) \
632  AS2( mov ecx, C) \
633  SaveEBX \
634  AS2( lea ebx, s_maskLow16)
635  #define MulPrologue \
636  AS2( mov eax, A) \
637  AS2( mov edi, B) \
638  AS2( mov ecx, C) \
639  SaveEBX \
640  AS2( lea ebx, s_maskLow16)
641  #define TopPrologue \
642  AS2( mov eax, A) \
643  AS2( mov edi, B) \
644  AS2( mov ecx, C) \
645  AS2( mov esi, L) \
646  SaveEBX \
647  AS2( lea ebx, s_maskLow16)
648  #define SquEpilogue RestoreEBX
649  #define MulEpilogue RestoreEBX
650  #define TopEpilogue RestoreEBX
651 #endif
652 
653 #ifdef CRYPTOPP_X64_MASM_AVAILABLE
654 extern "C" {
655 int Baseline_Add(size_t N, word *C, const word *A, const word *B);
656 int Baseline_Sub(size_t N, word *C, const word *A, const word *B);
657 }
658 #elif defined(CRYPTOPP_X64_ASM_AVAILABLE) && defined(__GNUC__) && defined(CRYPTOPP_WORD128_AVAILABLE)
659 int Baseline_Add(size_t N, word *C, const word *A, const word *B)
660 {
661  word result;
662  __asm__ __volatile__
663  (
664  INTEL_NOPREFIX
665  AS1( neg %1)
666  ASJ( jz, 1, f)
667  AS2( mov %0,[%3+8*%1])
668  AS2( add %0,[%4+8*%1])
669  AS2( mov [%2+8*%1],%0)
670  ASL(0)
671  AS2( mov %0,[%3+8*%1+8])
672  AS2( adc %0,[%4+8*%1+8])
673  AS2( mov [%2+8*%1+8],%0)
674  AS2( lea %1,[%1+2])
675  ASJ( jrcxz, 1, f)
676  AS2( mov %0,[%3+8*%1])
677  AS2( adc %0,[%4+8*%1])
678  AS2( mov [%2+8*%1],%0)
679  ASJ( jmp, 0, b)
680  ASL(1)
681  AS2( mov %0, 0)
682  AS2( adc %0, %0)
683  ATT_NOPREFIX
684  : "=&r" (result), "+c" (N)
685  : "r" (C+N), "r" (A+N), "r" (B+N)
686  : "memory", "cc"
687  );
688  return (int)result;
689 }
690 
691 int Baseline_Sub(size_t N, word *C, const word *A, const word *B)
692 {
693  word result;
694  __asm__ __volatile__
695  (
696  INTEL_NOPREFIX
697  AS1( neg %1)
698  ASJ( jz, 1, f)
699  AS2( mov %0,[%3+8*%1])
700  AS2( sub %0,[%4+8*%1])
701  AS2( mov [%2+8*%1],%0)
702  ASL(0)
703  AS2( mov %0,[%3+8*%1+8])
704  AS2( sbb %0,[%4+8*%1+8])
705  AS2( mov [%2+8*%1+8],%0)
706  AS2( lea %1,[%1+2])
707  ASJ( jrcxz, 1, f)
708  AS2( mov %0,[%3+8*%1])
709  AS2( sbb %0,[%4+8*%1])
710  AS2( mov [%2+8*%1],%0)
711  ASJ( jmp, 0, b)
712  ASL(1)
713  AS2( mov %0, 0)
714  AS2( adc %0, %0)
715  ATT_NOPREFIX
716  : "=&r" (result), "+c" (N)
717  : "r" (C+N), "r" (A+N), "r" (B+N)
718  : "memory", "cc"
719  );
720  return (int)result;
721 }
722 #elif defined(CRYPTOPP_X86_ASM_AVAILABLE) && CRYPTOPP_BOOL_X86
723 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
724 {
725  AddPrologue
726 
727  // now: eax = A, edi = B, edx = C, ecx = N
728  AS2( lea eax, [eax+4*ecx])
729  AS2( lea edi, [edi+4*ecx])
730  AS2( lea edx, [edx+4*ecx])
731 
732  AS1( neg ecx) // ecx is negative index
733  AS2( test ecx, 2) // this clears carry flag
734  ASJ( jz, 0, f)
735  AS2( sub ecx, 2)
736  ASJ( jmp, 1, f)
737 
738  ASL(0)
739  ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
740  AS2( mov esi,[eax+4*ecx])
741  AS2( adc esi,[edi+4*ecx])
742  AS2( mov [edx+4*ecx],esi)
743  AS2( mov esi,[eax+4*ecx+4])
744  AS2( adc esi,[edi+4*ecx+4])
745  AS2( mov [edx+4*ecx+4],esi)
746  ASL(1)
747  AS2( mov esi,[eax+4*ecx+8])
748  AS2( adc esi,[edi+4*ecx+8])
749  AS2( mov [edx+4*ecx+8],esi)
750  AS2( mov esi,[eax+4*ecx+12])
751  AS2( adc esi,[edi+4*ecx+12])
752  AS2( mov [edx+4*ecx+12],esi)
753 
754  AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
755  ASJ( jmp, 0, b)
756 
757  ASL(2)
758  AS2( mov eax, 0)
759  AS1( setc al) // store carry into eax (return result register)
760 
761  AddEpilogue
762 
763  // http://github.com/weidai11/cryptopp/issues/340
764  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
765  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
766 }
767 
768 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
769 {
770  AddPrologue
771 
772  // now: eax = A, edi = B, edx = C, ecx = N
773  AS2( lea eax, [eax+4*ecx])
774  AS2( lea edi, [edi+4*ecx])
775  AS2( lea edx, [edx+4*ecx])
776 
777  AS1( neg ecx) // ecx is negative index
778  AS2( test ecx, 2) // this clears carry flag
779  ASJ( jz, 0, f)
780  AS2( sub ecx, 2)
781  ASJ( jmp, 1, f)
782 
783  ASL(0)
784  ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
785  AS2( mov esi,[eax+4*ecx])
786  AS2( sbb esi,[edi+4*ecx])
787  AS2( mov [edx+4*ecx],esi)
788  AS2( mov esi,[eax+4*ecx+4])
789  AS2( sbb esi,[edi+4*ecx+4])
790  AS2( mov [edx+4*ecx+4],esi)
791  ASL(1)
792  AS2( mov esi,[eax+4*ecx+8])
793  AS2( sbb esi,[edi+4*ecx+8])
794  AS2( mov [edx+4*ecx+8],esi)
795  AS2( mov esi,[eax+4*ecx+12])
796  AS2( sbb esi,[edi+4*ecx+12])
797  AS2( mov [edx+4*ecx+12],esi)
798 
799  AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
800  ASJ( jmp, 0, b)
801 
802  ASL(2)
803  AS2( mov eax, 0)
804  AS1( setc al) // store carry into eax (return result register)
805 
806  AddEpilogue
807 
808  // http://github.com/weidai11/cryptopp/issues/340
809  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
810  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
811 }
812 
813 #if CRYPTOPP_INTEGER_SSE2
814 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Add(size_t N, word *C, const word *A, const word *B)
815 {
816  AddPrologue
817 
818  // now: eax = A, edi = B, edx = C, ecx = N
819  AS2( lea eax, [eax+4*ecx])
820  AS2( lea edi, [edi+4*ecx])
821  AS2( lea edx, [edx+4*ecx])
822 
823  AS1( neg ecx) // ecx is negative index
824  AS2( pxor mm2, mm2)
825  ASJ( jz, 2, f)
826  AS2( test ecx, 2) // this clears carry flag
827  ASJ( jz, 0, f)
828  AS2( sub ecx, 2)
829  ASJ( jmp, 1, f)
830 
831  ASL(0)
832  AS2( movd mm0, DWORD PTR [eax+4*ecx])
833  AS2( movd mm1, DWORD PTR [edi+4*ecx])
834  AS2( paddq mm0, mm1)
835  AS2( paddq mm2, mm0)
836  AS2( movd DWORD PTR [edx+4*ecx], mm2)
837  AS2( psrlq mm2, 32)
838 
839  AS2( movd mm0, DWORD PTR [eax+4*ecx+4])
840  AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
841  AS2( paddq mm0, mm1)
842  AS2( paddq mm2, mm0)
843  AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
844  AS2( psrlq mm2, 32)
845 
846  ASL(1)
847  AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
848  AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
849  AS2( paddq mm0, mm1)
850  AS2( paddq mm2, mm0)
851  AS2( movd DWORD PTR [edx+4*ecx+8], mm2)
852  AS2( psrlq mm2, 32)
853 
854  AS2( movd mm0, DWORD PTR [eax+4*ecx+12])
855  AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
856  AS2( paddq mm0, mm1)
857  AS2( paddq mm2, mm0)
858  AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
859  AS2( psrlq mm2, 32)
860 
861  AS2( add ecx, 4)
862  ASJ( jnz, 0, b)
863 
864  ASL(2)
865  AS2( movd eax, mm2)
866  AS1( emms)
867 
868  AddEpilogue
869 
870  // http://github.com/weidai11/cryptopp/issues/340
871  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
872  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
873 }
874 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Sub(size_t N, word *C, const word *A, const word *B)
875 {
876  AddPrologue
877 
878  // now: eax = A, edi = B, edx = C, ecx = N
879  AS2( lea eax, [eax+4*ecx])
880  AS2( lea edi, [edi+4*ecx])
881  AS2( lea edx, [edx+4*ecx])
882 
883  AS1( neg ecx) // ecx is negative index
884  AS2( pxor mm2, mm2)
885  ASJ( jz, 2, f)
886  AS2( test ecx, 2) // this clears carry flag
887  ASJ( jz, 0, f)
888  AS2( sub ecx, 2)
889  ASJ( jmp, 1, f)
890 
891  ASL(0)
892  AS2( movd mm0, DWORD PTR [eax+4*ecx])
893  AS2( movd mm1, DWORD PTR [edi+4*ecx])
894  AS2( psubq mm0, mm1)
895  AS2( psubq mm0, mm2)
896  AS2( movd DWORD PTR [edx+4*ecx], mm0)
897  AS2( psrlq mm0, 63)
898 
899  AS2( movd mm2, DWORD PTR [eax+4*ecx+4])
900  AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
901  AS2( psubq mm2, mm1)
902  AS2( psubq mm2, mm0)
903  AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
904  AS2( psrlq mm2, 63)
905 
906  ASL(1)
907  AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
908  AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
909  AS2( psubq mm0, mm1)
910  AS2( psubq mm0, mm2)
911  AS2( movd DWORD PTR [edx+4*ecx+8], mm0)
912  AS2( psrlq mm0, 63)
913 
914  AS2( movd mm2, DWORD PTR [eax+4*ecx+12])
915  AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
916  AS2( psubq mm2, mm1)
917  AS2( psubq mm2, mm0)
918  AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
919  AS2( psrlq mm2, 63)
920 
921  AS2( add ecx, 4)
922  ASJ( jnz, 0, b)
923 
924  ASL(2)
925  AS2( movd eax, mm2)
926  AS1( emms)
927 
928  AddEpilogue
929 
930  // http://github.com/weidai11/cryptopp/issues/340
931  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
932  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
933 }
934 #endif // CRYPTOPP_INTEGER_SSE2
935 #else // CRYPTOPP_SSE2_ASM_AVAILABLE
936 int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
937 {
938  CRYPTOPP_ASSERT (N%2 == 0);
939 
940  Declare2Words(u);
941  AssignWord(u, 0);
942  for (size_t i=0; i<N; i+=2)
943  {
944  AddWithCarry(u, A[i], B[i]);
945  C[i] = LowWord(u);
946  AddWithCarry(u, A[i+1], B[i+1]);
947  C[i+1] = LowWord(u);
948  }
949  return int(GetCarry(u));
950 }
951 
952 int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
953 {
954  CRYPTOPP_ASSERT (N%2 == 0);
955 
956  Declare2Words(u);
957  AssignWord(u, 0);
958  for (size_t i=0; i<N; i+=2)
959  {
960  SubtractWithBorrow(u, A[i], B[i]);
961  C[i] = LowWord(u);
962  SubtractWithBorrow(u, A[i+1], B[i+1]);
963  C[i+1] = LowWord(u);
964  }
965  return int(GetBorrow(u));
966 }
967 #endif
968 
969 static word LinearMultiply(word *C, const word *AA, word B, size_t N)
970 {
971  // http://github.com/weidai11/cryptopp/issues/188
972  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
973 
974  word carry=0;
975  for(unsigned i=0; i<N; i++)
976  {
977  Declare2Words(p);
978  MultiplyWords(p, A[i], B);
979  Acc2WordsBy1(p, carry);
980  C[i] = LowWord(p);
981  carry = HighWord(p);
982  }
983  return carry;
984 }
985 
986 #ifndef CRYPTOPP_DOXYGEN_PROCESSING
987 
988 #define Mul_2 \
989  Mul_Begin(2) \
990  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
991  Mul_End(1, 1)
992 
993 #define Mul_4 \
994  Mul_Begin(4) \
995  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
996  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
997  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
998  Mul_SaveAcc(3, 1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
999  Mul_SaveAcc(4, 2, 3) Mul_Acc(3, 2) \
1000  Mul_End(5, 3)
1001 
1002 #define Mul_8 \
1003  Mul_Begin(8) \
1004  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1005  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1006  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1007  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1008  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1009  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1010  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1011  Mul_SaveAcc(7, 1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1012  Mul_SaveAcc(8, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1013  Mul_SaveAcc(9, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1014  Mul_SaveAcc(10, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1015  Mul_SaveAcc(11, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1016  Mul_SaveAcc(12, 6, 7) Mul_Acc(7, 6) \
1017  Mul_End(13, 7)
1018 
1019 #define Mul_16 \
1020  Mul_Begin(16) \
1021  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1022  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1023  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1024  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1025  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1026  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1027  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1028  Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1029  Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1030  Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1031  Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1032  Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1033  Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1034  Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1035  Mul_SaveAcc(14, 0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1036  Mul_SaveAcc(15, 1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1037  Mul_SaveAcc(16, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1038  Mul_SaveAcc(17, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1039  Mul_SaveAcc(18, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1040  Mul_SaveAcc(19, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1041  Mul_SaveAcc(20, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1042  Mul_SaveAcc(21, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1043  Mul_SaveAcc(22, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1044  Mul_SaveAcc(23, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1045  Mul_SaveAcc(24, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1046  Mul_SaveAcc(25, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1047  Mul_SaveAcc(26, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1048  Mul_SaveAcc(27, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1049  Mul_SaveAcc(28, 14, 15) Mul_Acc(15, 14) \
1050  Mul_End(29, 15)
1051 
1052 #define Squ_2 \
1053  Squ_Begin(2) \
1054  Squ_End(2)
1055 
1056 #define Squ_4 \
1057  Squ_Begin(4) \
1058  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1059  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1060  Squ_SaveAcc(3, 1, 3) Squ_Diag(2) \
1061  Squ_SaveAcc(4, 2, 3) Squ_NonDiag \
1062  Squ_End(4)
1063 
1064 #define Squ_8 \
1065  Squ_Begin(8) \
1066  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1067  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1068  Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1069  Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1070  Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1071  Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1072  Squ_SaveAcc(7, 1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1073  Squ_SaveAcc(8, 2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1074  Squ_SaveAcc(9, 3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1075  Squ_SaveAcc(10, 4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1076  Squ_SaveAcc(11, 5, 7) Squ_Diag(6) \
1077  Squ_SaveAcc(12, 6, 7) Squ_NonDiag \
1078  Squ_End(8)
1079 
1080 #define Squ_16 \
1081  Squ_Begin(16) \
1082  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1083  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1084  Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1085  Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1086  Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1087  Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1088  Squ_SaveAcc(7, 0, 8) Squ_Acc(1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1089  Squ_SaveAcc(8, 0, 9) Squ_Acc(1, 8) Squ_Acc(2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1090  Squ_SaveAcc(9, 0, 10) Squ_Acc(1, 9) Squ_Acc(2, 8) Squ_Acc(3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1091  Squ_SaveAcc(10, 0, 11) Squ_Acc(1, 10) Squ_Acc(2, 9) Squ_Acc(3, 8) Squ_Acc(4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1092  Squ_SaveAcc(11, 0, 12) Squ_Acc(1, 11) Squ_Acc(2, 10) Squ_Acc(3, 9) Squ_Acc(4, 8) Squ_Acc(5, 7) Squ_Diag(6) \
1093  Squ_SaveAcc(12, 0, 13) Squ_Acc(1, 12) Squ_Acc(2, 11) Squ_Acc(3, 10) Squ_Acc(4, 9) Squ_Acc(5, 8) Squ_Acc(6, 7) Squ_NonDiag \
1094  Squ_SaveAcc(13, 0, 14) Squ_Acc(1, 13) Squ_Acc(2, 12) Squ_Acc(3, 11) Squ_Acc(4, 10) Squ_Acc(5, 9) Squ_Acc(6, 8) Squ_Diag(7) \
1095  Squ_SaveAcc(14, 0, 15) Squ_Acc(1, 14) Squ_Acc(2, 13) Squ_Acc(3, 12) Squ_Acc(4, 11) Squ_Acc(5, 10) Squ_Acc(6, 9) Squ_Acc(7, 8) Squ_NonDiag \
1096  Squ_SaveAcc(15, 1, 15) Squ_Acc(2, 14) Squ_Acc(3, 13) Squ_Acc(4, 12) Squ_Acc(5, 11) Squ_Acc(6, 10) Squ_Acc(7, 9) Squ_Diag(8) \
1097  Squ_SaveAcc(16, 2, 15) Squ_Acc(3, 14) Squ_Acc(4, 13) Squ_Acc(5, 12) Squ_Acc(6, 11) Squ_Acc(7, 10) Squ_Acc(8, 9) Squ_NonDiag \
1098  Squ_SaveAcc(17, 3, 15) Squ_Acc(4, 14) Squ_Acc(5, 13) Squ_Acc(6, 12) Squ_Acc(7, 11) Squ_Acc(8, 10) Squ_Diag(9) \
1099  Squ_SaveAcc(18, 4, 15) Squ_Acc(5, 14) Squ_Acc(6, 13) Squ_Acc(7, 12) Squ_Acc(8, 11) Squ_Acc(9, 10) Squ_NonDiag \
1100  Squ_SaveAcc(19, 5, 15) Squ_Acc(6, 14) Squ_Acc(7, 13) Squ_Acc(8, 12) Squ_Acc(9, 11) Squ_Diag(10) \
1101  Squ_SaveAcc(20, 6, 15) Squ_Acc(7, 14) Squ_Acc(8, 13) Squ_Acc(9, 12) Squ_Acc(10, 11) Squ_NonDiag \
1102  Squ_SaveAcc(21, 7, 15) Squ_Acc(8, 14) Squ_Acc(9, 13) Squ_Acc(10, 12) Squ_Diag(11) \
1103  Squ_SaveAcc(22, 8, 15) Squ_Acc(9, 14) Squ_Acc(10, 13) Squ_Acc(11, 12) Squ_NonDiag \
1104  Squ_SaveAcc(23, 9, 15) Squ_Acc(10, 14) Squ_Acc(11, 13) Squ_Diag(12) \
1105  Squ_SaveAcc(24, 10, 15) Squ_Acc(11, 14) Squ_Acc(12, 13) Squ_NonDiag \
1106  Squ_SaveAcc(25, 11, 15) Squ_Acc(12, 14) Squ_Diag(13) \
1107  Squ_SaveAcc(26, 12, 15) Squ_Acc(13, 14) Squ_NonDiag \
1108  Squ_SaveAcc(27, 13, 15) Squ_Diag(14) \
1109  Squ_SaveAcc(28, 14, 15) Squ_NonDiag \
1110  Squ_End(16)
1111 
1112 #define Bot_2 \
1113  Mul_Begin(2) \
1114  Bot_SaveAcc(0, 0, 1) Bot_Acc(1, 0) \
1115  Bot_End(2)
1116 
1117 #define Bot_4 \
1118  Mul_Begin(4) \
1119  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1120  Mul_SaveAcc(1, 2, 0) Mul_Acc(1, 1) Mul_Acc(0, 2) \
1121  Bot_SaveAcc(2, 0, 3) Bot_Acc(1, 2) Bot_Acc(2, 1) Bot_Acc(3, 0) \
1122  Bot_End(4)
1123 
1124 #define Bot_8 \
1125  Mul_Begin(8) \
1126  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1127  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1128  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1129  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1130  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1131  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1132  Bot_SaveAcc(6, 0, 7) Bot_Acc(1, 6) Bot_Acc(2, 5) Bot_Acc(3, 4) Bot_Acc(4, 3) Bot_Acc(5, 2) Bot_Acc(6, 1) Bot_Acc(7, 0) \
1133  Bot_End(8)
1134 
1135 #define Bot_16 \
1136  Mul_Begin(16) \
1137  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1138  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1139  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1140  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1141  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1142  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1143  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1144  Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1145  Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1146  Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1147  Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1148  Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1149  Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1150  Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1151  Bot_SaveAcc(14, 0, 15) Bot_Acc(1, 14) Bot_Acc(2, 13) Bot_Acc(3, 12) Bot_Acc(4, 11) Bot_Acc(5, 10) Bot_Acc(6, 9) Bot_Acc(7, 8) Bot_Acc(8, 7) Bot_Acc(9, 6) Bot_Acc(10, 5) Bot_Acc(11, 4) Bot_Acc(12, 3) Bot_Acc(13, 2) Bot_Acc(14, 1) Bot_Acc(15, 0) \
1152  Bot_End(16)
1153 
1154 #endif
1155 
1156 #if 0
1157 #define Mul_Begin(n) \
1158  Declare2Words(p) \
1159  Declare2Words(c) \
1160  Declare2Words(d) \
1161  MultiplyWords(p, A[0], B[0]) \
1162  AssignWord(c, LowWord(p)) \
1163  AssignWord(d, HighWord(p))
1164 
1165 #define Mul_Acc(i, j) \
1166  MultiplyWords(p, A[i], B[j]) \
1167  Acc2WordsBy1(c, LowWord(p)) \
1168  Acc2WordsBy1(d, HighWord(p))
1169 
1170 #define Mul_SaveAcc(k, i, j) \
1171  R[k] = LowWord(c); \
1172  Add2WordsBy1(c, d, HighWord(c)) \
1173  MultiplyWords(p, A[i], B[j]) \
1174  AssignWord(d, HighWord(p)) \
1175  Acc2WordsBy1(c, LowWord(p))
1176 
1177 #define Mul_End(n) \
1178  R[2*n-3] = LowWord(c); \
1179  Acc2WordsBy1(d, HighWord(c)) \
1180  MultiplyWords(p, A[n-1], B[n-1])\
1181  Acc2WordsBy2(d, p) \
1182  R[2*n-2] = LowWord(d); \
1183  R[2*n-1] = HighWord(d);
1184 
1185 #define Bot_SaveAcc(k, i, j) \
1186  R[k] = LowWord(c); \
1187  word e = LowWord(d) + HighWord(c); \
1188  e += A[i] * B[j];
1189 
1190 #define Bot_Acc(i, j) \
1191  e += A[i] * B[j];
1192 
1193 #define Bot_End(n) \
1194  R[n-1] = e;
1195 #else
1196 #define Mul_Begin(n) \
1197  Declare2Words(p) \
1198  word c; \
1199  Declare2Words(d) \
1200  MultiplyWords(p, A[0], B[0]) \
1201  c = LowWord(p); \
1202  AssignWord(d, HighWord(p))
1203 
1204 #define Mul_Acc(i, j) \
1205  MulAcc(c, d, A[i], B[j])
1206 
1207 #define Mul_SaveAcc(k, i, j) \
1208  R[k] = c; \
1209  c = LowWord(d); \
1210  AssignWord(d, HighWord(d)) \
1211  MulAcc(c, d, A[i], B[j])
1212 
1213 #define Mul_End(k, i) \
1214  R[k] = c; \
1215  MultiplyWords(p, A[i], B[i]) \
1216  Acc2WordsBy2(p, d) \
1217  R[k+1] = LowWord(p); \
1218  R[k+2] = HighWord(p);
1219 
1220 #define Bot_SaveAcc(k, i, j) \
1221  R[k] = c; \
1222  c = LowWord(d); \
1223  c += A[i] * B[j];
1224 
1225 #define Bot_Acc(i, j) \
1226  c += A[i] * B[j];
1227 
1228 #define Bot_End(n) \
1229  R[n-1] = c;
1230 #endif
1231 
1232 #define Squ_Begin(n) \
1233  Declare2Words(p) \
1234  word c; \
1235  Declare2Words(d) \
1236  Declare2Words(e) \
1237  MultiplyWords(p, A[0], A[0]) \
1238  R[0] = LowWord(p); \
1239  AssignWord(e, HighWord(p)) \
1240  MultiplyWords(p, A[0], A[1]) \
1241  c = LowWord(p); \
1242  AssignWord(d, HighWord(p)) \
1243  Squ_NonDiag \
1244 
1245 #define Squ_NonDiag \
1246  Double3Words(c, d)
1247 
1248 #define Squ_SaveAcc(k, i, j) \
1249  Acc3WordsBy2(c, d, e) \
1250  R[k] = c; \
1251  MultiplyWords(p, A[i], A[j]) \
1252  c = LowWord(p); \
1253  AssignWord(d, HighWord(p)) \
1254 
1255 #define Squ_Acc(i, j) \
1256  MulAcc(c, d, A[i], A[j])
1257 
1258 #define Squ_Diag(i) \
1259  Squ_NonDiag \
1260  MulAcc(c, d, A[i], A[i])
1261 
1262 #define Squ_End(n) \
1263  Acc3WordsBy2(c, d, e) \
1264  R[2*n-3] = c; \
1265  MultiplyWords(p, A[n-1], A[n-1])\
1266  Acc2WordsBy2(p, e) \
1267  R[2*n-2] = LowWord(p); \
1268  R[2*n-1] = HighWord(p);
1269 
1270 
1271 void Baseline_Multiply2(word *R, const word *AA, const word *BB)
1272 {
1273  // http://github.com/weidai11/cryptopp/issues/188
1274  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1275  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1276 
1277  Mul_2
1278 }
1279 
1280 void Baseline_Multiply4(word *R, const word *AA, const word *BB)
1281 {
1282  // http://github.com/weidai11/cryptopp/issues/188
1283  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1284  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1285 
1286  Mul_4
1287 }
1288 
1289 void Baseline_Multiply8(word *R, const word *AA, const word *BB)
1290 {
1291  // http://github.com/weidai11/cryptopp/issues/188
1292  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1293  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1294 
1295  Mul_8
1296 }
1297 
1298 void Baseline_Square2(word *R, const word *AA)
1299 {
1300  // http://github.com/weidai11/cryptopp/issues/188
1301  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1302 
1303  Squ_2
1304 }
1305 
1306 void Baseline_Square4(word *R, const word *AA)
1307 {
1308  // http://github.com/weidai11/cryptopp/issues/188
1309  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1310 
1311  Squ_4
1312 }
1313 
1314 void Baseline_Square8(word *R, const word *AA)
1315 {
1316  // http://github.com/weidai11/cryptopp/issues/188
1317  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1318 
1319  Squ_8
1320 }
1321 
1322 void Baseline_MultiplyBottom2(word *R, const word *AA, const word *BB)
1323 {
1324  // http://github.com/weidai11/cryptopp/issues/188
1325  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1326  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1327 
1328  Bot_2
1329 
1330 // http://github.com/weidai11/cryptopp/issues/340
1331 #if defined(TWO_64_BIT_WORDS)
1332  CRYPTOPP_UNUSED(d0); CRYPTOPP_UNUSED(d1);
1333 #endif
1334 }
1335 
1336 void Baseline_MultiplyBottom4(word *R, const word *AA, const word *BB)
1337 {
1338  // http://github.com/weidai11/cryptopp/issues/188
1339  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1340  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1341 
1342  Bot_4
1343 }
1344 
1345 void Baseline_MultiplyBottom8(word *R, const word *AA, const word *BB)
1346 {
1347  // http://github.com/weidai11/cryptopp/issues/188
1348  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1349  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1350 
1351  Bot_8
1352 }
1353 
1354 #define Top_Begin(n) \
1355  Declare2Words(p) \
1356  word c; \
1357  Declare2Words(d) \
1358  MultiplyWords(p, A[0], B[n-2]);\
1359  AssignWord(d, HighWord(p));
1360 
1361 #define Top_Acc(i, j) \
1362  MultiplyWords(p, A[i], B[j]);\
1363  Acc2WordsBy1(d, HighWord(p));
1364 
1365 #define Top_SaveAcc0(i, j) \
1366  c = LowWord(d); \
1367  AssignWord(d, HighWord(d)) \
1368  MulAcc(c, d, A[i], B[j])
1369 
1370 #define Top_SaveAcc1(i, j) \
1371  c = L<c; \
1372  Acc2WordsBy1(d, c); \
1373  c = LowWord(d); \
1374  AssignWord(d, HighWord(d)) \
1375  MulAcc(c, d, A[i], B[j])
1376 
1377 void Baseline_MultiplyTop2(word *R, const word *A, const word *B, word L)
1378 {
1379  CRYPTOPP_UNUSED(L);
1380  word T[4];
1381  Baseline_Multiply2(T, A, B);
1382  R[0] = T[2];
1383  R[1] = T[3];
1384 }
1385 
1386 void Baseline_MultiplyTop4(word *R, const word *AA, const word *BB, word L)
1387 {
1388  // http://github.com/weidai11/cryptopp/issues/188
1389  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1390  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1391 
1392  Top_Begin(4)
1393  Top_Acc(1, 1) Top_Acc(2, 0) \
1394  Top_SaveAcc0(0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1395  Top_SaveAcc1(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
1396  Mul_SaveAcc(0, 2, 3) Mul_Acc(3, 2) \
1397  Mul_End(1, 3)
1398 }
1399 
1400 void Baseline_MultiplyTop8(word *R, const word *AA, const word *BB, word L)
1401 {
1402  // http://github.com/weidai11/cryptopp/issues/188
1403  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1404  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1405 
1406  Top_Begin(8)
1407  Top_Acc(1, 5) Top_Acc(2, 4) Top_Acc(3, 3) Top_Acc(4, 2) Top_Acc(5, 1) Top_Acc(6, 0) \
1408  Top_SaveAcc0(0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1409  Top_SaveAcc1(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1410  Mul_SaveAcc(0, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1411  Mul_SaveAcc(1, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1412  Mul_SaveAcc(2, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1413  Mul_SaveAcc(3, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1414  Mul_SaveAcc(4, 6, 7) Mul_Acc(7, 6) \
1415  Mul_End(5, 7)
1416 }
1417 
1418 #if !CRYPTOPP_INTEGER_SSE2 // save memory by not compiling these functions when SSE2 is available
1419 void Baseline_Multiply16(word *R, const word *AA, const word *BB)
1420 {
1421  // http://github.com/weidai11/cryptopp/issues/188
1422  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1423  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1424 
1425  Mul_16
1426 }
1427 
1428 void Baseline_Square16(word *R, const word *AA)
1429 {
1430  // http://github.com/weidai11/cryptopp/issues/188
1431  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1432 
1433  Squ_16
1434 }
1435 
1436 void Baseline_MultiplyBottom16(word *R, const word *AA, const word *BB)
1437 {
1438  // http://github.com/weidai11/cryptopp/issues/188
1439  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1440  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1441 
1442  Bot_16
1443 }
1444 
1445 void Baseline_MultiplyTop16(word *R, const word *AA, const word *BB, word L)
1446 {
1447  // http://github.com/weidai11/cryptopp/issues/188
1448  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1449  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1450 
1451  Top_Begin(16)
1452  Top_Acc(1, 13) Top_Acc(2, 12) Top_Acc(3, 11) Top_Acc(4, 10) Top_Acc(5, 9) Top_Acc(6, 8) Top_Acc(7, 7) Top_Acc(8, 6) Top_Acc(9, 5) Top_Acc(10, 4) Top_Acc(11, 3) Top_Acc(12, 2) Top_Acc(13, 1) Top_Acc(14, 0) \
1453  Top_SaveAcc0(0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1454  Top_SaveAcc1(1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1455  Mul_SaveAcc(0, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1456  Mul_SaveAcc(1, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1457  Mul_SaveAcc(2, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1458  Mul_SaveAcc(3, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1459  Mul_SaveAcc(4, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1460  Mul_SaveAcc(5, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1461  Mul_SaveAcc(6, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1462  Mul_SaveAcc(7, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1463  Mul_SaveAcc(8, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1464  Mul_SaveAcc(9, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1465  Mul_SaveAcc(10, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1466  Mul_SaveAcc(11, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1467  Mul_SaveAcc(12, 14, 15) Mul_Acc(15, 14) \
1468  Mul_End(13, 15)
1469 }
1470 #endif
1471 
1472 // ********************************************************
1473 
1474 #if CRYPTOPP_INTEGER_SSE2
1475 
1476 CRYPTOPP_ALIGN_DATA(16)
1477 CRYPTOPP_TABLE
1478 const word32 s_maskLow16[4] = {
1479  0xffff,0xffff,0xffff,0xffff
1480 };
1481 
1482 #undef Mul_Begin
1483 #undef Mul_Acc
1484 #undef Top_Begin
1485 #undef Top_Acc
1486 #undef Squ_Acc
1487 #undef Squ_NonDiag
1488 #undef Squ_Diag
1489 #undef Squ_SaveAcc
1490 #undef Squ_Begin
1491 #undef Mul_SaveAcc
1492 #undef Bot_Acc
1493 #undef Bot_SaveAcc
1494 #undef Bot_End
1495 #undef Squ_End
1496 #undef Mul_End
1497 
1498 #define SSE2_FinalSave(k) \
1499  AS2( psllq xmm5, 16) \
1500  AS2( paddq xmm4, xmm5) \
1501  AS2( movq QWORD PTR [ecx+8*(k)], xmm4)
1502 
1503 #define SSE2_SaveShift(k) \
1504  AS2( movq xmm0, xmm6) \
1505  AS2( punpckhqdq xmm6, xmm0) \
1506  AS2( movq xmm1, xmm7) \
1507  AS2( punpckhqdq xmm7, xmm1) \
1508  AS2( paddd xmm6, xmm0) \
1509  AS2( pslldq xmm6, 4) \
1510  AS2( paddd xmm7, xmm1) \
1511  AS2( paddd xmm4, xmm6) \
1512  AS2( pslldq xmm7, 4) \
1513  AS2( movq xmm6, xmm4) \
1514  AS2( paddd xmm5, xmm7) \
1515  AS2( movq xmm7, xmm5) \
1516  AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1517  AS2( psrlq xmm6, 16) \
1518  AS2( paddq xmm6, xmm7) \
1519  AS2( punpckhqdq xmm4, xmm0) \
1520  AS2( punpckhqdq xmm5, xmm0) \
1521  AS2( movq QWORD PTR [ecx+8*(k)+2], xmm6) \
1522  AS2( psrlq xmm6, 3*16) \
1523  AS2( paddd xmm4, xmm6) \
1524 
1525 #define Squ_SSE2_SaveShift(k) \
1526  AS2( movq xmm0, xmm6) \
1527  AS2( punpckhqdq xmm6, xmm0) \
1528  AS2( movq xmm1, xmm7) \
1529  AS2( punpckhqdq xmm7, xmm1) \
1530  AS2( paddd xmm6, xmm0) \
1531  AS2( pslldq xmm6, 4) \
1532  AS2( paddd xmm7, xmm1) \
1533  AS2( paddd xmm4, xmm6) \
1534  AS2( pslldq xmm7, 4) \
1535  AS2( movhlps xmm6, xmm4) \
1536  AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1537  AS2( paddd xmm5, xmm7) \
1538  AS2( movhps QWORD PTR [esp+12], xmm5)\
1539  AS2( psrlq xmm4, 16) \
1540  AS2( paddq xmm4, xmm5) \
1541  AS2( movq QWORD PTR [ecx+8*(k)+2], xmm4) \
1542  AS2( psrlq xmm4, 3*16) \
1543  AS2( paddd xmm4, xmm6) \
1544  AS2( movq QWORD PTR [esp+4], xmm4)\
1545 
1546 #define SSE2_FirstMultiply(i) \
1547  AS2( movdqa xmm7, [esi+(i)*16])\
1548  AS2( movdqa xmm5, [edi-(i)*16])\
1549  AS2( pmuludq xmm5, xmm7) \
1550  AS2( movdqa xmm4, [ebx])\
1551  AS2( movdqa xmm6, xmm4) \
1552  AS2( pand xmm4, xmm5) \
1553  AS2( psrld xmm5, 16) \
1554  AS2( pmuludq xmm7, [edx-(i)*16])\
1555  AS2( pand xmm6, xmm7) \
1556  AS2( psrld xmm7, 16)
1557 
1558 #define Squ_Begin(n) \
1559  SquPrologue \
1560  AS2( mov esi, esp)\
1561  AS2( and esp, 0xfffffff0)\
1562  AS2( lea edi, [esp-32*n])\
1563  AS2( sub esp, 32*n+16)\
1564  AS1( push esi)\
1565  AS2( mov esi, edi) \
1566  AS2( xor edx, edx) \
1567  ASL(1) \
1568  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1569  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1570  AS2( movdqa [edi+2*edx], xmm0) \
1571  AS2( psrlq xmm0, 32) \
1572  AS2( movdqa [edi+2*edx+16], xmm0) \
1573  AS2( movdqa [edi+16*n+2*edx], xmm1) \
1574  AS2( psrlq xmm1, 32) \
1575  AS2( movdqa [edi+16*n+2*edx+16], xmm1) \
1576  AS2( add edx, 16) \
1577  AS2( cmp edx, 8*(n)) \
1578  ASJ( jne, 1, b) \
1579  AS2( lea edx, [edi+16*n])\
1580  SSE2_FirstMultiply(0) \
1581 
1582 #define Squ_Acc(i) \
1583  ASL(LSqu##i) \
1584  AS2( movdqa xmm1, [esi+(i)*16]) \
1585  AS2( movdqa xmm0, [edi-(i)*16]) \
1586  AS2( movdqa xmm2, [ebx]) \
1587  AS2( pmuludq xmm0, xmm1) \
1588  AS2( pmuludq xmm1, [edx-(i)*16]) \
1589  AS2( movdqa xmm3, xmm2) \
1590  AS2( pand xmm2, xmm0) \
1591  AS2( psrld xmm0, 16) \
1592  AS2( paddd xmm4, xmm2) \
1593  AS2( paddd xmm5, xmm0) \
1594  AS2( pand xmm3, xmm1) \
1595  AS2( psrld xmm1, 16) \
1596  AS2( paddd xmm6, xmm3) \
1597  AS2( paddd xmm7, xmm1) \
1598 
1599 #define Squ_Acc1(i)
1600 #define Squ_Acc2(i) ASC(call, LSqu##i)
1601 #define Squ_Acc3(i) Squ_Acc2(i)
1602 #define Squ_Acc4(i) Squ_Acc2(i)
1603 #define Squ_Acc5(i) Squ_Acc2(i)
1604 #define Squ_Acc6(i) Squ_Acc2(i)
1605 #define Squ_Acc7(i) Squ_Acc2(i)
1606 #define Squ_Acc8(i) Squ_Acc2(i)
1607 
1608 #define SSE2_End(E, n) \
1609  SSE2_SaveShift(2*(n)-3) \
1610  AS2( movdqa xmm7, [esi+16]) \
1611  AS2( movdqa xmm0, [edi]) \
1612  AS2( pmuludq xmm0, xmm7) \
1613  AS2( movdqa xmm2, [ebx]) \
1614  AS2( pmuludq xmm7, [edx]) \
1615  AS2( movdqa xmm6, xmm2) \
1616  AS2( pand xmm2, xmm0) \
1617  AS2( psrld xmm0, 16) \
1618  AS2( paddd xmm4, xmm2) \
1619  AS2( paddd xmm5, xmm0) \
1620  AS2( pand xmm6, xmm7) \
1621  AS2( psrld xmm7, 16) \
1622  SSE2_SaveShift(2*(n)-2) \
1623  SSE2_FinalSave(2*(n)-1) \
1624  AS1( pop esp)\
1625  E
1626 
1627 #define Squ_End(n) SSE2_End(SquEpilogue, n)
1628 #define Mul_End(n) SSE2_End(MulEpilogue, n)
1629 #define Top_End(n) SSE2_End(TopEpilogue, n)
1630 
1631 #define Squ_Column1(k, i) \
1632  Squ_SSE2_SaveShift(k) \
1633  AS2( add esi, 16) \
1634  SSE2_FirstMultiply(1)\
1635  Squ_Acc##i(i) \
1636  AS2( paddd xmm4, xmm4) \
1637  AS2( paddd xmm5, xmm5) \
1638  AS2( movdqa xmm3, [esi]) \
1639  AS2( movq xmm1, QWORD PTR [esi+8]) \
1640  AS2( pmuludq xmm1, xmm3) \
1641  AS2( pmuludq xmm3, xmm3) \
1642  AS2( movdqa xmm0, [ebx])\
1643  AS2( movdqa xmm2, xmm0) \
1644  AS2( pand xmm0, xmm1) \
1645  AS2( psrld xmm1, 16) \
1646  AS2( paddd xmm6, xmm0) \
1647  AS2( paddd xmm7, xmm1) \
1648  AS2( pand xmm2, xmm3) \
1649  AS2( psrld xmm3, 16) \
1650  AS2( paddd xmm6, xmm6) \
1651  AS2( paddd xmm7, xmm7) \
1652  AS2( paddd xmm4, xmm2) \
1653  AS2( paddd xmm5, xmm3) \
1654  AS2( movq xmm0, QWORD PTR [esp+4])\
1655  AS2( movq xmm1, QWORD PTR [esp+12])\
1656  AS2( paddd xmm4, xmm0)\
1657  AS2( paddd xmm5, xmm1)\
1658 
1659 #define Squ_Column0(k, i) \
1660  Squ_SSE2_SaveShift(k) \
1661  AS2( add edi, 16) \
1662  AS2( add edx, 16) \
1663  SSE2_FirstMultiply(1)\
1664  Squ_Acc##i(i) \
1665  AS2( paddd xmm6, xmm6) \
1666  AS2( paddd xmm7, xmm7) \
1667  AS2( paddd xmm4, xmm4) \
1668  AS2( paddd xmm5, xmm5) \
1669  AS2( movq xmm0, QWORD PTR [esp+4])\
1670  AS2( movq xmm1, QWORD PTR [esp+12])\
1671  AS2( paddd xmm4, xmm0)\
1672  AS2( paddd xmm5, xmm1)\
1673 
1674 #define SSE2_MulAdd45 \
1675  AS2( movdqa xmm7, [esi]) \
1676  AS2( movdqa xmm0, [edi]) \
1677  AS2( pmuludq xmm0, xmm7) \
1678  AS2( movdqa xmm2, [ebx]) \
1679  AS2( pmuludq xmm7, [edx]) \
1680  AS2( movdqa xmm6, xmm2) \
1681  AS2( pand xmm2, xmm0) \
1682  AS2( psrld xmm0, 16) \
1683  AS2( paddd xmm4, xmm2) \
1684  AS2( paddd xmm5, xmm0) \
1685  AS2( pand xmm6, xmm7) \
1686  AS2( psrld xmm7, 16)
1687 
1688 #define Mul_Begin(n) \
1689  MulPrologue \
1690  AS2( mov esi, esp)\
1691  AS2( and esp, 0xfffffff0)\
1692  AS2( sub esp, 48*n+16)\
1693  AS1( push esi)\
1694  AS2( xor edx, edx) \
1695  ASL(1) \
1696  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1697  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1698  ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1699  AS2( movdqa [esp+20+2*edx], xmm0) \
1700  AS2( psrlq xmm0, 32) \
1701  AS2( movdqa [esp+20+2*edx+16], xmm0) \
1702  AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1703  AS2( psrlq xmm1, 32) \
1704  AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1705  AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1706  AS2( psrlq xmm2, 32) \
1707  AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1708  AS2( add edx, 16) \
1709  AS2( cmp edx, 8*(n)) \
1710  ASJ( jne, 1, b) \
1711  AS2( lea edi, [esp+20])\
1712  AS2( lea edx, [esp+20+16*n])\
1713  AS2( lea esi, [esp+20+32*n])\
1714  SSE2_FirstMultiply(0) \
1715 
1716 #define Mul_Acc(i) \
1717  ASL(LMul##i) \
1718  AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1719  AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1720  AS2( movdqa xmm2, [ebx]) \
1721  AS2( pmuludq xmm0, xmm1) \
1722  AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1723  AS2( movdqa xmm3, xmm2) \
1724  AS2( pand xmm2, xmm0) \
1725  AS2( psrld xmm0, 16) \
1726  AS2( paddd xmm4, xmm2) \
1727  AS2( paddd xmm5, xmm0) \
1728  AS2( pand xmm3, xmm1) \
1729  AS2( psrld xmm1, 16) \
1730  AS2( paddd xmm6, xmm3) \
1731  AS2( paddd xmm7, xmm1) \
1732 
1733 #define Mul_Acc1(i)
1734 #define Mul_Acc2(i) ASC(call, LMul##i)
1735 #define Mul_Acc3(i) Mul_Acc2(i)
1736 #define Mul_Acc4(i) Mul_Acc2(i)
1737 #define Mul_Acc5(i) Mul_Acc2(i)
1738 #define Mul_Acc6(i) Mul_Acc2(i)
1739 #define Mul_Acc7(i) Mul_Acc2(i)
1740 #define Mul_Acc8(i) Mul_Acc2(i)
1741 #define Mul_Acc9(i) Mul_Acc2(i)
1742 #define Mul_Acc10(i) Mul_Acc2(i)
1743 #define Mul_Acc11(i) Mul_Acc2(i)
1744 #define Mul_Acc12(i) Mul_Acc2(i)
1745 #define Mul_Acc13(i) Mul_Acc2(i)
1746 #define Mul_Acc14(i) Mul_Acc2(i)
1747 #define Mul_Acc15(i) Mul_Acc2(i)
1748 #define Mul_Acc16(i) Mul_Acc2(i)
1749 
1750 #define Mul_Column1(k, i) \
1751  SSE2_SaveShift(k) \
1752  AS2( add esi, 16) \
1753  SSE2_MulAdd45\
1754  Mul_Acc##i(i) \
1755 
1756 #define Mul_Column0(k, i) \
1757  SSE2_SaveShift(k) \
1758  AS2( add edi, 16) \
1759  AS2( add edx, 16) \
1760  SSE2_MulAdd45\
1761  Mul_Acc##i(i) \
1762 
1763 #define Bot_Acc(i) \
1764  AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1765  AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1766  AS2( pmuludq xmm0, xmm1) \
1767  AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1768  AS2( paddq xmm4, xmm0) \
1769  AS2( paddd xmm6, xmm1)
1770 
1771 #define Bot_SaveAcc(k) \
1772  SSE2_SaveShift(k) \
1773  AS2( add edi, 16) \
1774  AS2( add edx, 16) \
1775  AS2( movdqa xmm6, [esi]) \
1776  AS2( movdqa xmm0, [edi]) \
1777  AS2( pmuludq xmm0, xmm6) \
1778  AS2( paddq xmm4, xmm0) \
1779  AS2( psllq xmm5, 16) \
1780  AS2( paddq xmm4, xmm5) \
1781  AS2( pmuludq xmm6, [edx])
1782 
1783 #define Bot_End(n) \
1784  AS2( movhlps xmm7, xmm6) \
1785  AS2( paddd xmm6, xmm7) \
1786  AS2( psllq xmm6, 32) \
1787  AS2( paddd xmm4, xmm6) \
1788  AS2( movq QWORD PTR [ecx+8*((n)-1)], xmm4) \
1789  AS1( pop esp)\
1790  MulEpilogue
1791 
1792 #define Top_Begin(n) \
1793  TopPrologue \
1794  AS2( mov edx, esp)\
1795  AS2( and esp, 0xfffffff0)\
1796  AS2( sub esp, 48*n+16)\
1797  AS1( push edx)\
1798  AS2( xor edx, edx) \
1799  ASL(1) \
1800  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1801  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1802  ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1803  AS2( movdqa [esp+20+2*edx], xmm0) \
1804  AS2( psrlq xmm0, 32) \
1805  AS2( movdqa [esp+20+2*edx+16], xmm0) \
1806  AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1807  AS2( psrlq xmm1, 32) \
1808  AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1809  AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1810  AS2( psrlq xmm2, 32) \
1811  AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1812  AS2( add edx, 16) \
1813  AS2( cmp edx, 8*(n)) \
1814  ASJ( jne, 1, b) \
1815  AS2( mov eax, esi) \
1816  AS2( lea edi, [esp+20+00*n+16*(n/2-1)])\
1817  AS2( lea edx, [esp+20+16*n+16*(n/2-1)])\
1818  AS2( lea esi, [esp+20+32*n+16*(n/2-1)])\
1819  AS2( pxor xmm4, xmm4)\
1820  AS2( pxor xmm5, xmm5)
1821 
1822 #define Top_Acc(i) \
1823  AS2( movq xmm0, QWORD PTR [esi+i/2*(1-(i-2*(i/2))*2)*16+8]) \
1824  AS2( pmuludq xmm0, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1825  AS2( psrlq xmm0, 48) \
1826  AS2( paddd xmm5, xmm0)\
1827 
1828 #define Top_Column0(i) \
1829  AS2( psllq xmm5, 32) \
1830  AS2( add edi, 16) \
1831  AS2( add edx, 16) \
1832  SSE2_MulAdd45\
1833  Mul_Acc##i(i) \
1834 
1835 #define Top_Column1(i) \
1836  SSE2_SaveShift(0) \
1837  AS2( add esi, 16) \
1838  SSE2_MulAdd45\
1839  Mul_Acc##i(i) \
1840  AS2( shr eax, 16) \
1841  AS2( movd xmm0, eax)\
1842  AS2( movd xmm1, [ecx+4])\
1843  AS2( psrld xmm1, 16)\
1844  AS2( pcmpgtd xmm1, xmm0)\
1845  AS2( psrld xmm1, 31)\
1846  AS2( paddd xmm4, xmm1)\
1847 
1848 void SSE2_Square4(word *C, const word *A)
1849 {
1850  Squ_Begin(2)
1851  Squ_Column0(0, 1)
1852  Squ_End(2)
1853 }
1854 
1855 void SSE2_Square8(word *C, const word *A)
1856 {
1857  Squ_Begin(4)
1858 #ifndef __GNUC__
1859  ASJ( jmp, 0, f)
1860  Squ_Acc(2)
1861  AS1( ret) ASL(0)
1862 #endif
1863  Squ_Column0(0, 1)
1864  Squ_Column1(1, 1)
1865  Squ_Column0(2, 2)
1866  Squ_Column1(3, 1)
1867  Squ_Column0(4, 1)
1868  Squ_End(4)
1869 }
1870 
1871 void SSE2_Square16(word *C, const word *A)
1872 {
1873  Squ_Begin(8)
1874 #ifndef __GNUC__
1875  ASJ( jmp, 0, f)
1876  Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1877  AS1( ret) ASL(0)
1878 #endif
1879  Squ_Column0(0, 1)
1880  Squ_Column1(1, 1)
1881  Squ_Column0(2, 2)
1882  Squ_Column1(3, 2)
1883  Squ_Column0(4, 3)
1884  Squ_Column1(5, 3)
1885  Squ_Column0(6, 4)
1886  Squ_Column1(7, 3)
1887  Squ_Column0(8, 3)
1888  Squ_Column1(9, 2)
1889  Squ_Column0(10, 2)
1890  Squ_Column1(11, 1)
1891  Squ_Column0(12, 1)
1892  Squ_End(8)
1893 }
1894 
1895 void SSE2_Square32(word *C, const word *A)
1896 {
1897  Squ_Begin(16)
1898  ASJ( jmp, 0, f)
1899  Squ_Acc(8) Squ_Acc(7) Squ_Acc(6) Squ_Acc(5) Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1900  AS1( ret) ASL(0)
1901  Squ_Column0(0, 1)
1902  Squ_Column1(1, 1)
1903  Squ_Column0(2, 2)
1904  Squ_Column1(3, 2)
1905  Squ_Column0(4, 3)
1906  Squ_Column1(5, 3)
1907  Squ_Column0(6, 4)
1908  Squ_Column1(7, 4)
1909  Squ_Column0(8, 5)
1910  Squ_Column1(9, 5)
1911  Squ_Column0(10, 6)
1912  Squ_Column1(11, 6)
1913  Squ_Column0(12, 7)
1914  Squ_Column1(13, 7)
1915  Squ_Column0(14, 8)
1916  Squ_Column1(15, 7)
1917  Squ_Column0(16, 7)
1918  Squ_Column1(17, 6)
1919  Squ_Column0(18, 6)
1920  Squ_Column1(19, 5)
1921  Squ_Column0(20, 5)
1922  Squ_Column1(21, 4)
1923  Squ_Column0(22, 4)
1924  Squ_Column1(23, 3)
1925  Squ_Column0(24, 3)
1926  Squ_Column1(25, 2)
1927  Squ_Column0(26, 2)
1928  Squ_Column1(27, 1)
1929  Squ_Column0(28, 1)
1930  Squ_End(16)
1931 }
1932 
1933 void SSE2_Multiply4(word *C, const word *A, const word *B)
1934 {
1935  Mul_Begin(2)
1936 #ifndef __GNUC__
1937  ASJ( jmp, 0, f)
1938  Mul_Acc(2)
1939  AS1( ret) ASL(0)
1940 #endif
1941  Mul_Column0(0, 2)
1942  Mul_End(2)
1943 }
1944 
1945 void SSE2_Multiply8(word *C, const word *A, const word *B)
1946 {
1947  Mul_Begin(4)
1948 #ifndef __GNUC__
1949  ASJ( jmp, 0, f)
1950  Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1951  AS1( ret) ASL(0)
1952 #endif
1953  Mul_Column0(0, 2)
1954  Mul_Column1(1, 3)
1955  Mul_Column0(2, 4)
1956  Mul_Column1(3, 3)
1957  Mul_Column0(4, 2)
1958  Mul_End(4)
1959 }
1960 
1961 void SSE2_Multiply16(word *C, const word *A, const word *B)
1962 {
1963  Mul_Begin(8)
1964 #ifndef __GNUC__
1965  ASJ( jmp, 0, f)
1966  Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1967  AS1( ret) ASL(0)
1968 #endif
1969  Mul_Column0(0, 2)
1970  Mul_Column1(1, 3)
1971  Mul_Column0(2, 4)
1972  Mul_Column1(3, 5)
1973  Mul_Column0(4, 6)
1974  Mul_Column1(5, 7)
1975  Mul_Column0(6, 8)
1976  Mul_Column1(7, 7)
1977  Mul_Column0(8, 6)
1978  Mul_Column1(9, 5)
1979  Mul_Column0(10, 4)
1980  Mul_Column1(11, 3)
1981  Mul_Column0(12, 2)
1982  Mul_End(8)
1983 }
1984 
1985 void SSE2_Multiply32(word *C, const word *A, const word *B)
1986 {
1987  Mul_Begin(16)
1988  ASJ( jmp, 0, f)
1989  Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1990  AS1( ret) ASL(0)
1991  Mul_Column0(0, 2)
1992  Mul_Column1(1, 3)
1993  Mul_Column0(2, 4)
1994  Mul_Column1(3, 5)
1995  Mul_Column0(4, 6)
1996  Mul_Column1(5, 7)
1997  Mul_Column0(6, 8)
1998  Mul_Column1(7, 9)
1999  Mul_Column0(8, 10)
2000  Mul_Column1(9, 11)
2001  Mul_Column0(10, 12)
2002  Mul_Column1(11, 13)
2003  Mul_Column0(12, 14)
2004  Mul_Column1(13, 15)
2005  Mul_Column0(14, 16)
2006  Mul_Column1(15, 15)
2007  Mul_Column0(16, 14)
2008  Mul_Column1(17, 13)
2009  Mul_Column0(18, 12)
2010  Mul_Column1(19, 11)
2011  Mul_Column0(20, 10)
2012  Mul_Column1(21, 9)
2013  Mul_Column0(22, 8)
2014  Mul_Column1(23, 7)
2015  Mul_Column0(24, 6)
2016  Mul_Column1(25, 5)
2017  Mul_Column0(26, 4)
2018  Mul_Column1(27, 3)
2019  Mul_Column0(28, 2)
2020  Mul_End(16)
2021 }
2022 
2023 void SSE2_MultiplyBottom4(word *C, const word *A, const word *B)
2024 {
2025  Mul_Begin(2)
2026  Bot_SaveAcc(0) Bot_Acc(2)
2027  Bot_End(2)
2028 }
2029 
2030 void SSE2_MultiplyBottom8(word *C, const word *A, const word *B)
2031 {
2032  Mul_Begin(4)
2033 #ifndef __GNUC__
2034  ASJ( jmp, 0, f)
2035  Mul_Acc(3) Mul_Acc(2)
2036  AS1( ret) ASL(0)
2037 #endif
2038  Mul_Column0(0, 2)
2039  Mul_Column1(1, 3)
2040  Bot_SaveAcc(2) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2041  Bot_End(4)
2042 }
2043 
2044 void SSE2_MultiplyBottom16(word *C, const word *A, const word *B)
2045 {
2046  Mul_Begin(8)
2047 #ifndef __GNUC__
2048  ASJ( jmp, 0, f)
2049  Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2050  AS1( ret) ASL(0)
2051 #endif
2052  Mul_Column0(0, 2)
2053  Mul_Column1(1, 3)
2054  Mul_Column0(2, 4)
2055  Mul_Column1(3, 5)
2056  Mul_Column0(4, 6)
2057  Mul_Column1(5, 7)
2058  Bot_SaveAcc(6) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2059  Bot_End(8)
2060 }
2061 
2062 void SSE2_MultiplyBottom32(word *C, const word *A, const word *B)
2063 {
2064  Mul_Begin(16)
2065 #ifndef __GNUC__
2066  ASJ( jmp, 0, f)
2067  Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2068  AS1( ret) ASL(0)
2069 #endif
2070  Mul_Column0(0, 2)
2071  Mul_Column1(1, 3)
2072  Mul_Column0(2, 4)
2073  Mul_Column1(3, 5)
2074  Mul_Column0(4, 6)
2075  Mul_Column1(5, 7)
2076  Mul_Column0(6, 8)
2077  Mul_Column1(7, 9)
2078  Mul_Column0(8, 10)
2079  Mul_Column1(9, 11)
2080  Mul_Column0(10, 12)
2081  Mul_Column1(11, 13)
2082  Mul_Column0(12, 14)
2083  Mul_Column1(13, 15)
2084  Bot_SaveAcc(14) Bot_Acc(16) Bot_Acc(15) Bot_Acc(14) Bot_Acc(13) Bot_Acc(12) Bot_Acc(11) Bot_Acc(10) Bot_Acc(9) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2085  Bot_End(16)
2086 }
2087 
2088 void SSE2_MultiplyTop8(word *C, const word *A, const word *B, word L)
2089 {
2090  Top_Begin(4)
2091  Top_Acc(3) Top_Acc(2) Top_Acc(1)
2092 #ifndef __GNUC__
2093  ASJ( jmp, 0, f)
2094  Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2095  AS1( ret) ASL(0)
2096 #endif
2097  Top_Column0(4)
2098  Top_Column1(3)
2099  Mul_Column0(0, 2)
2100  Top_End(2)
2101 }
2102 
2103 void SSE2_MultiplyTop16(word *C, const word *A, const word *B, word L)
2104 {
2105  Top_Begin(8)
2106  Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2107 #ifndef __GNUC__
2108  ASJ( jmp, 0, f)
2109  Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2110  AS1( ret) ASL(0)
2111 #endif
2112  Top_Column0(8)
2113  Top_Column1(7)
2114  Mul_Column0(0, 6)
2115  Mul_Column1(1, 5)
2116  Mul_Column0(2, 4)
2117  Mul_Column1(3, 3)
2118  Mul_Column0(4, 2)
2119  Top_End(4)
2120 }
2121 
2122 void SSE2_MultiplyTop32(word *C, const word *A, const word *B, word L)
2123 {
2124  Top_Begin(16)
2125  Top_Acc(15) Top_Acc(14) Top_Acc(13) Top_Acc(12) Top_Acc(11) Top_Acc(10) Top_Acc(9) Top_Acc(8) Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2126 #ifndef __GNUC__
2127  ASJ( jmp, 0, f)
2128  Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2129  AS1( ret) ASL(0)
2130 #endif
2131  Top_Column0(16)
2132  Top_Column1(15)
2133  Mul_Column0(0, 14)
2134  Mul_Column1(1, 13)
2135  Mul_Column0(2, 12)
2136  Mul_Column1(3, 11)
2137  Mul_Column0(4, 10)
2138  Mul_Column1(5, 9)
2139  Mul_Column0(6, 8)
2140  Mul_Column1(7, 7)
2141  Mul_Column0(8, 6)
2142  Mul_Column1(9, 5)
2143  Mul_Column0(10, 4)
2144  Mul_Column1(11, 3)
2145  Mul_Column0(12, 2)
2146  Top_End(8)
2147 }
2148 
2149 #endif // #if CRYPTOPP_INTEGER_SSE2
2150 
2151 // ********************************************************
2152 
2153 typedef int (CRYPTOPP_FASTCALL * PAdd)(size_t N, word *C, const word *A, const word *B);
2154 typedef void (* PMul)(word *C, const word *A, const word *B);
2155 typedef void (* PSqu)(word *C, const word *A);
2156 typedef void (* PMulTop)(word *C, const word *A, const word *B, word L);
2157 
2158 #if CRYPTOPP_INTEGER_SSE2
2159 static PAdd s_pAdd = &Baseline_Add, s_pSub = &Baseline_Sub;
2160 static size_t s_recursionLimit = 8;
2161 #else
2162 static const size_t s_recursionLimit = 16;
2163 #endif // CRYPTOPP_INTEGER_SSE2
2164 
2165 static PMul s_pMul[9], s_pBot[9];
2166 static PSqu s_pSqu[9];
2167 static PMulTop s_pTop[9];
2168 
2169 void SetFunctionPointers()
2170 {
2171  s_pMul[0] = &Baseline_Multiply2;
2172  s_pBot[0] = &Baseline_MultiplyBottom2;
2173  s_pSqu[0] = &Baseline_Square2;
2174  s_pTop[0] = &Baseline_MultiplyTop2;
2175  s_pTop[1] = &Baseline_MultiplyTop4;
2176 
2177 #if CRYPTOPP_INTEGER_SSE2
2178  if (HasSSE2())
2179  {
2180  if (IsP4())
2181  {
2182  s_pAdd = &SSE2_Add;
2183  s_pSub = &SSE2_Sub;
2184  }
2185 
2186  s_recursionLimit = 32;
2187 
2188  s_pMul[1] = &SSE2_Multiply4;
2189  s_pMul[2] = &SSE2_Multiply8;
2190  s_pMul[4] = &SSE2_Multiply16;
2191  s_pMul[8] = &SSE2_Multiply32;
2192 
2193  s_pBot[1] = &SSE2_MultiplyBottom4;
2194  s_pBot[2] = &SSE2_MultiplyBottom8;
2195  s_pBot[4] = &SSE2_MultiplyBottom16;
2196  s_pBot[8] = &SSE2_MultiplyBottom32;
2197 
2198  s_pSqu[1] = &SSE2_Square4;
2199  s_pSqu[2] = &SSE2_Square8;
2200  s_pSqu[4] = &SSE2_Square16;
2201  s_pSqu[8] = &SSE2_Square32;
2202 
2203  s_pTop[2] = &SSE2_MultiplyTop8;
2204  s_pTop[4] = &SSE2_MultiplyTop16;
2205  s_pTop[8] = &SSE2_MultiplyTop32;
2206  }
2207  else
2208 #endif // CRYPTOPP_INTEGER_SSE2
2209  {
2210  s_pMul[1] = &Baseline_Multiply4;
2211  s_pMul[2] = &Baseline_Multiply8;
2212 
2213  s_pBot[1] = &Baseline_MultiplyBottom4;
2214  s_pBot[2] = &Baseline_MultiplyBottom8;
2215 
2216  s_pSqu[1] = &Baseline_Square4;
2217  s_pSqu[2] = &Baseline_Square8;
2218 
2219  s_pTop[2] = &Baseline_MultiplyTop8;
2220 
2221 #if !CRYPTOPP_INTEGER_SSE2
2222  s_pMul[4] = &Baseline_Multiply16;
2223  s_pBot[4] = &Baseline_MultiplyBottom16;
2224  s_pSqu[4] = &Baseline_Square16;
2225  s_pTop[4] = &Baseline_MultiplyTop16;
2226 #endif // !CRYPTOPP_INTEGER_SSE2
2227  }
2228 }
2229 
2230 inline int Add(word *C, const word *A, const word *B, size_t N)
2231 {
2232 #if CRYPTOPP_INTEGER_SSE2
2233  return s_pAdd(N, C, A, B);
2234 #else
2235  return Baseline_Add(N, C, A, B);
2236 #endif // CRYPTOPP_INTEGER_SSE2
2237 }
2238 
2239 inline int Subtract(word *C, const word *A, const word *B, size_t N)
2240 {
2241 #if CRYPTOPP_INTEGER_SSE2
2242  return s_pSub(N, C, A, B);
2243 #else
2244  return Baseline_Sub(N, C, A, B);
2245 #endif // CRYPTOPP_INTEGER_SSE2
2246 }
2247 
2248 // ********************************************************
2249 
2250 
2251 #define A0 A
2252 #define A1 (A+N2)
2253 #define B0 B
2254 #define B1 (B+N2)
2255 
2256 #define T0 T
2257 #define T1 (T+N2)
2258 #define T2 (T+N)
2259 #define T3 (T+N+N2)
2260 
2261 #define R0 R
2262 #define R1 (R+N2)
2263 #define R2 (R+N)
2264 #define R3 (R+N+N2)
2265 
2266 // R[2*N] - result = A*B
2267 // T[2*N] - temporary work space
2268 // A[N] --- multiplier
2269 // B[N] --- multiplicant
2270 
2271 void RecursiveMultiply(word *R, word *T, const word *A, const word *B, size_t N)
2272 {
2273  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2274 
2275  if (N <= s_recursionLimit)
2276  s_pMul[N/4](R, A, B);
2277  else
2278  {
2279  const size_t N2 = N/2;
2280 
2281  size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2282  Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2283 
2284  size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2285  Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2286 
2287  RecursiveMultiply(R2, T2, A1, B1, N2);
2288  RecursiveMultiply(T0, T2, R0, R1, N2);
2289  RecursiveMultiply(R0, T2, A0, B0, N2);
2290 
2291  // now T[01] holds (A1-A0)*(B0-B1), R[01] holds A0*B0, R[23] holds A1*B1
2292 
2293  int c2 = Add(R2, R2, R1, N2);
2294  int c3 = c2;
2295  c2 += Add(R1, R2, R0, N2);
2296  c3 += Add(R2, R2, R3, N2);
2297 
2298  if (AN2 == BN2)
2299  c3 -= Subtract(R1, R1, T0, N);
2300  else
2301  c3 += Add(R1, R1, T0, N);
2302 
2303  c3 += Increment(R2, N2, c2);
2304  CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2305  Increment(R3, N2, c3);
2306  }
2307 }
2308 
2309 // R[2*N] - result = A*A
2310 // T[2*N] - temporary work space
2311 // A[N] --- number to be squared
2312 
2313 void RecursiveSquare(word *R, word *T, const word *A, size_t N)
2314 {
2315  CRYPTOPP_ASSERT(N && N%2==0);
2316 
2317  if (N <= s_recursionLimit)
2318  s_pSqu[N/4](R, A);
2319  else
2320  {
2321  const size_t N2 = N/2;
2322 
2323  RecursiveSquare(R0, T2, A0, N2);
2324  RecursiveSquare(R2, T2, A1, N2);
2325  RecursiveMultiply(T0, T2, A0, A1, N2);
2326 
2327  int carry = Add(R1, R1, T0, N);
2328  carry += Add(R1, R1, T0, N);
2329  Increment(R3, N2, carry);
2330  }
2331 }
2332 
2333 // R[N] - bottom half of A*B
2334 // T[3*N/2] - temporary work space
2335 // A[N] - multiplier
2336 // B[N] - multiplicant
2337 
2338 void RecursiveMultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2339 {
2340  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2341 
2342  if (N <= s_recursionLimit)
2343  s_pBot[N/4](R, A, B);
2344  else
2345  {
2346  const size_t N2 = N/2;
2347 
2348  RecursiveMultiply(R, T, A0, B0, N2);
2349  RecursiveMultiplyBottom(T0, T1, A1, B0, N2);
2350  Add(R1, R1, T0, N2);
2351  RecursiveMultiplyBottom(T0, T1, A0, B1, N2);
2352  Add(R1, R1, T0, N2);
2353  }
2354 }
2355 
2356 // R[N] --- upper half of A*B
2357 // T[2*N] - temporary work space
2358 // L[N] --- lower half of A*B
2359 // A[N] --- multiplier
2360 // B[N] --- multiplicant
2361 
2362 void MultiplyTop(word *R, word *T, const word *L, const word *A, const word *B, size_t N)
2363 {
2364  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2365 
2366  if (N <= s_recursionLimit)
2367  s_pTop[N/4](R, A, B, L[N-1]);
2368  else
2369  {
2370  const size_t N2 = N/2;
2371 
2372  size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2373  Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2374 
2375  size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2376  Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2377 
2378  RecursiveMultiply(T0, T2, R0, R1, N2);
2379  RecursiveMultiply(R0, T2, A1, B1, N2);
2380 
2381  // now T[01] holds (A1-A0)*(B0-B1) = A1*B0+A0*B1-A1*B1-A0*B0, R[01] holds A1*B1
2382 
2383  int t, c3;
2384  int c2 = Subtract(T2, L+N2, L, N2);
2385 
2386  if (AN2 == BN2)
2387  {
2388  c2 -= Add(T2, T2, T0, N2);
2389  t = (Compare(T2, R0, N2) == -1);
2390  c3 = t - Subtract(T2, T2, T1, N2);
2391  }
2392  else
2393  {
2394  c2 += Subtract(T2, T2, T0, N2);
2395  t = (Compare(T2, R0, N2) == -1);
2396  c3 = t + Add(T2, T2, T1, N2);
2397  }
2398 
2399  c2 += t;
2400  if (c2 >= 0)
2401  c3 += Increment(T2, N2, c2);
2402  else
2403  c3 -= Decrement(T2, N2, -c2);
2404  c3 += Add(R0, T2, R1, N2);
2405 
2406  CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2407  Increment(R1, N2, c3);
2408  }
2409 }
2410 
2411 inline void Multiply(word *R, word *T, const word *A, const word *B, size_t N)
2412 {
2413  RecursiveMultiply(R, T, A, B, N);
2414 }
2415 
2416 inline void Square(word *R, word *T, const word *A, size_t N)
2417 {
2418  RecursiveSquare(R, T, A, N);
2419 }
2420 
2421 inline void MultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2422 {
2423  RecursiveMultiplyBottom(R, T, A, B, N);
2424 }
2425 
2426 // R[NA+NB] - result = A*B
2427 // T[NA+NB] - temporary work space
2428 // A[NA] ---- multiplier
2429 // B[NB] ---- multiplicant
2430 
2431 void AsymmetricMultiply(word *R, word *T, const word *A, size_t NA, const word *B, size_t NB)
2432 {
2433  if (NA == NB)
2434  {
2435  // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
2436  // The code change occurred at Commit dc99266599a0e72d.
2437  if (A != B)
2438  Multiply(R, T, A, B, NA);
2439  else
2440  Square(R, T, A, NA);
2441 
2442  return;
2443  }
2444 
2445  if (NA > NB)
2446  {
2447  std::swap(A, B);
2448  std::swap(NA, NB);
2449  }
2450 
2451  CRYPTOPP_ASSERT(NB % NA == 0);
2452 
2453  if (NA==2 && !A[1])
2454  {
2455  // Profiling tells us the original Default case was dominant, so it was
2456  // promoted to the first Case statement. The code change occurred at
2457  // Commit dc99266599a0e72d.
2458  switch (A[0])
2459  {
2460  default:
2461  R[NB] = LinearMultiply(R, B, A[0], NB);
2462  R[NB+1] = 0;
2463  return;
2464  case 0:
2465  SetWords(R, 0, NB+2);
2466  return;
2467  case 1:
2468  CopyWords(R, B, NB);
2469  R[NB] = R[NB+1] = 0;
2470  return;
2471  }
2472  }
2473 
2474  size_t i;
2475  if ((NB/NA)%2 == 0)
2476  {
2477  Multiply(R, T, A, B, NA);
2478  CopyWords(T+2*NA, R+NA, NA);
2479 
2480  for (i=2*NA; i<NB; i+=2*NA)
2481  Multiply(T+NA+i, T, A, B+i, NA);
2482  for (i=NA; i<NB; i+=2*NA)
2483  Multiply(R+i, T, A, B+i, NA);
2484  }
2485  else
2486  {
2487  for (i=0; i<NB; i+=2*NA)
2488  Multiply(R+i, T, A, B+i, NA);
2489  for (i=NA; i<NB; i+=2*NA)
2490  Multiply(T+NA+i, T, A, B+i, NA);
2491  }
2492 
2493  if (Add(R+NA, R+NA, T+2*NA, NB-NA))
2494  Increment(R+NB, NA);
2495 }
2496 
2497 // R[N] ----- result = A inverse mod 2**(WORD_BITS*N)
2498 // T[3*N/2] - temporary work space
2499 // A[N] ----- an odd number as input
2500 
2501 void RecursiveInverseModPower2(word *R, word *T, const word *A, size_t N)
2502 {
2503  // Profiling tells us the original Else was dominant, so it was
2504  // promoted to the first If statement. The code change occurred
2505  // at Commit dc99266599a0e72d.
2506  if (N!=2)
2507  {
2508  const size_t N2 = N/2;
2509  RecursiveInverseModPower2(R0, T0, A0, N2);
2510  T0[0] = 1;
2511  SetWords(T0+1, 0, N2-1);
2512  MultiplyTop(R1, T1, T0, R0, A0, N2);
2513  MultiplyBottom(T0, T1, R0, A1, N2);
2514  Add(T0, R1, T0, N2);
2515  TwosComplement(T0, N2);
2516  MultiplyBottom(R1, T1, R0, T0, N2);
2517  }
2518  else
2519  {
2520  T[0] = AtomicInverseModPower2(A[0]);
2521  T[1] = 0;
2522  s_pBot[0](T+2, T, A);
2523  TwosComplement(T+2, 2);
2524  Increment(T+2, 2, 2);
2525  s_pBot[0](R, T, T+2);
2526  }
2527 }
2528 
2529 // R[N] --- result = X/(2**(WORD_BITS*N)) mod M
2530 // T[3*N] - temporary work space
2531 // X[2*N] - number to be reduced
2532 // M[N] --- modulus
2533 // U[N] --- multiplicative inverse of M mod 2**(WORD_BITS*N)
2534 
2535 void MontgomeryReduce(word *R, word *T, word *X, const word *M, const word *U, size_t N)
2536 {
2537 #if 1
2538  MultiplyBottom(R, T, X, U, N);
2539  MultiplyTop(T, T+N, X, R, M, N);
2540  word borrow = Subtract(T, X+N, T, N);
2541  // defend against timing attack by doing this Add even when not needed
2542  word carry = Add(T+N, T, M, N);
2543  CRYPTOPP_ASSERT(carry | !borrow);
2544  CRYPTOPP_UNUSED(carry), CRYPTOPP_UNUSED(borrow);
2545  CopyWords(R, T + ((0-borrow) & N), N);
2546 #elif 0
2547  const word u = 0-U[0];
2548  Declare2Words(p)
2549  for (size_t i=0; i<N; i++)
2550  {
2551  const word t = u * X[i];
2552  word c = 0;
2553  for (size_t j=0; j<N; j+=2)
2554  {
2555  MultiplyWords(p, t, M[j]);
2556  Acc2WordsBy1(p, X[i+j]);
2557  Acc2WordsBy1(p, c);
2558  X[i+j] = LowWord(p);
2559  c = HighWord(p);
2560  MultiplyWords(p, t, M[j+1]);
2561  Acc2WordsBy1(p, X[i+j+1]);
2562  Acc2WordsBy1(p, c);
2563  X[i+j+1] = LowWord(p);
2564  c = HighWord(p);
2565  }
2566 
2567  if (Increment(X+N+i, N-i, c))
2568  while (!Subtract(X+N, X+N, M, N)) {}
2569  }
2570 
2571  memcpy(R, X+N, N*WORD_SIZE);
2572 #else
2573  __m64 u = _mm_cvtsi32_si64(0-U[0]), p;
2574  for (size_t i=0; i<N; i++)
2575  {
2576  __m64 t = _mm_cvtsi32_si64(X[i]);
2577  t = _mm_mul_su32(t, u);
2578  __m64 c = _mm_setzero_si64();
2579  for (size_t j=0; j<N; j+=2)
2580  {
2581  p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j]));
2582  p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j]));
2583  c = _mm_add_si64(c, p);
2584  X[i+j] = _mm_cvtsi64_si32(c);
2585  c = _mm_srli_si64(c, 32);
2586  p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j+1]));
2587  p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j+1]));
2588  c = _mm_add_si64(c, p);
2589  X[i+j+1] = _mm_cvtsi64_si32(c);
2590  c = _mm_srli_si64(c, 32);
2591  }
2592 
2593  if (Increment(X+N+i, N-i, _mm_cvtsi64_si32(c)))
2594  while (!Subtract(X+N, X+N, M, N)) {}
2595  }
2596 
2597  memcpy(R, X+N, N*WORD_SIZE);
2598  _mm_empty();
2599 #endif
2600 }
2601 
2602 // R[N] --- result = X/(2**(WORD_BITS*N/2)) mod M
2603 // T[2*N] - temporary work space
2604 // X[2*N] - number to be reduced
2605 // M[N] --- modulus
2606 // U[N/2] - multiplicative inverse of M mod 2**(WORD_BITS*N/2)
2607 // V[N] --- 2**(WORD_BITS*3*N/2) mod M
2608 
2609 void HalfMontgomeryReduce(word *R, word *T, const word *X, const word *M, const word *U, const word *V, size_t N)
2610 {
2611  CRYPTOPP_ASSERT(N%2==0 && N>=4);
2612 
2613 #define M0 M
2614 #define M1 (M+N2)
2615 #define V0 V
2616 #define V1 (V+N2)
2617 
2618 #define X0 X
2619 #define X1 (X+N2)
2620 #define X2 (X+N)
2621 #define X3 (X+N+N2)
2622 
2623  const size_t N2 = N/2;
2624  Multiply(T0, T2, V0, X3, N2);
2625  int c2 = Add(T0, T0, X0, N);
2626  MultiplyBottom(T3, T2, T0, U, N2);
2627  MultiplyTop(T2, R, T0, T3, M0, N2);
2628  c2 -= Subtract(T2, T1, T2, N2);
2629  Multiply(T0, R, T3, M1, N2);
2630  c2 -= Subtract(T0, T2, T0, N2);
2631  int c3 = -(int)Subtract(T1, X2, T1, N2);
2632  Multiply(R0, T2, V1, X3, N2);
2633  c3 += Add(R, R, T, N);
2634 
2635  if (c2>0)
2636  c3 += Increment(R1, N2);
2637  else if (c2<0)
2638  c3 -= Decrement(R1, N2, -c2);
2639 
2640  CRYPTOPP_ASSERT(c3>=-1 && c3<=1);
2641  if (c3>0)
2642  Subtract(R, R, M, N);
2643  else if (c3<0)
2644  Add(R, R, M, N);
2645 
2646 #undef M0
2647 #undef M1
2648 #undef V0
2649 #undef V1
2650 
2651 #undef X0
2652 #undef X1
2653 #undef X2
2654 #undef X3
2655 }
2656 
2657 #undef A0
2658 #undef A1
2659 #undef B0
2660 #undef B1
2661 
2662 #undef T0
2663 #undef T1
2664 #undef T2
2665 #undef T3
2666 
2667 #undef R0
2668 #undef R1
2669 #undef R2
2670 #undef R3
2671 
2672 /*
2673 // do a 3 word by 2 word divide, returns quotient and leaves remainder in A
2674 static word SubatomicDivide(word *A, word B0, word B1)
2675 {
2676  // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a word
2677  CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
2678 
2679  // estimate the quotient: do a 2 word by 1 word divide
2680  word Q;
2681  if (B1+1 == 0)
2682  Q = A[2];
2683  else
2684  Q = DWord(A[1], A[2]).DividedBy(B1+1);
2685 
2686  // now subtract Q*B from A
2687  DWord p = DWord::Multiply(B0, Q);
2688  DWord u = (DWord) A[0] - p.GetLowHalf();
2689  A[0] = u.GetLowHalf();
2690  u = (DWord) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - DWord::Multiply(B1, Q);
2691  A[1] = u.GetLowHalf();
2692  A[2] += u.GetHighHalf();
2693 
2694  // Q <= actual quotient, so fix it
2695  while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
2696  {
2697  u = (DWord) A[0] - B0;
2698  A[0] = u.GetLowHalf();
2699  u = (DWord) A[1] - B1 - u.GetHighHalfAsBorrow();
2700  A[1] = u.GetLowHalf();
2701  A[2] += u.GetHighHalf();
2702  Q++;
2703  CRYPTOPP_ASSERT(Q); // shouldn't overflow
2704  }
2705 
2706  return Q;
2707 }
2708 
2709 // do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
2710 static inline void AtomicDivide(word *Q, const word *A, const word *B)
2711 {
2712  if (!B[0] && !B[1]) // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
2713  {
2714  Q[0] = A[2];
2715  Q[1] = A[3];
2716  }
2717  else
2718  {
2719  word T[4];
2720  T[0] = A[0]; T[1] = A[1]; T[2] = A[2]; T[3] = A[3];
2721  Q[1] = SubatomicDivide(T+1, B[0], B[1]);
2722  Q[0] = SubatomicDivide(T, B[0], B[1]);
2723 
2724 #if defined(CRYPTOPP_DEBUG)
2725  // multiply quotient and divisor and add remainder, make sure it equals dividend
2726  CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2727  word P[4];
2728  LowLevel::Multiply2(P, Q, B);
2729  Add(P, P, T, 4);
2730  CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2731 #endif
2732  }
2733 }
2734 */
2735 
2736 static inline void AtomicDivide(word *Q, const word *A, const word *B)
2737 {
2738  word T[4];
2739  DWord q = DivideFourWordsByTwo<word, DWord>(T, DWord(A[0], A[1]), DWord(A[2], A[3]), DWord(B[0], B[1]));
2740  Q[0] = q.GetLowHalf();
2741  Q[1] = q.GetHighHalf();
2742 
2743 #if defined(CRYPTOPP_DEBUG)
2744  if (B[0] || B[1])
2745  {
2746  // multiply quotient and divisor and add remainder, make sure it equals dividend
2747  CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2748  word P[4];
2749  s_pMul[0](P, Q, B);
2750  Add(P, P, T, 4);
2751  CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2752  }
2753 #endif
2754 }
2755 
2756 // for use by Divide(), corrects the underestimated quotient {Q1,Q0}
2757 static void CorrectQuotientEstimate(word *R, word *T, word *Q, const word *B, size_t N)
2758 {
2759  CRYPTOPP_ASSERT(N && N%2==0);
2760 
2761  AsymmetricMultiply(T, T+N+2, Q, 2, B, N);
2762 
2763  word borrow = Subtract(R, R, T, N+2);
2764  CRYPTOPP_ASSERT(!borrow && !R[N+1]);
2765  CRYPTOPP_UNUSED(borrow);
2766 
2767  while (R[N] || Compare(R, B, N) >= 0)
2768  {
2769  R[N] -= Subtract(R, R, B, N);
2770  Q[1] += (++Q[0]==0);
2771  CRYPTOPP_ASSERT(Q[0] || Q[1]); // no overflow
2772  }
2773 }
2774 
2775 // R[NB] -------- remainder = A%B
2776 // Q[NA-NB+2] --- quotient = A/B
2777 // T[NA+3*(NB+2)] - temp work space
2778 // A[NA] -------- dividend
2779 // B[NB] -------- divisor
2780 
2781 void Divide(word *R, word *Q, word *T, const word *A, size_t NA, const word *B, size_t NB)
2782 {
2783  CRYPTOPP_ASSERT(NA && NB && NA%2==0 && NB%2==0);
2784  CRYPTOPP_ASSERT(B[NB-1] || B[NB-2]);
2785  CRYPTOPP_ASSERT(NB <= NA);
2786 
2787  // set up temporary work space
2788  word *const TA=T;
2789  word *const TB=T+NA+2;
2790  word *const TP=T+NA+2+NB;
2791 
2792  // copy B into TB and normalize it so that TB has highest bit set to 1
2793  unsigned shiftWords = (B[NB-1]==0);
2794  TB[0] = TB[NB-1] = 0;
2795  CopyWords(TB+shiftWords, B, NB-shiftWords);
2796  unsigned shiftBits = WORD_BITS - BitPrecision(TB[NB-1]);
2797  CRYPTOPP_ASSERT(shiftBits < WORD_BITS);
2798  ShiftWordsLeftByBits(TB, NB, shiftBits);
2799 
2800  // copy A into TA and normalize it
2801  TA[0] = TA[NA] = TA[NA+1] = 0;
2802  CopyWords(TA+shiftWords, A, NA);
2803  ShiftWordsLeftByBits(TA, NA+2, shiftBits);
2804 
2805  if (TA[NA+1]==0 && TA[NA] <= 1)
2806  {
2807  Q[NA-NB+1] = Q[NA-NB] = 0;
2808  while (TA[NA] || Compare(TA+NA-NB, TB, NB) >= 0)
2809  {
2810  TA[NA] -= Subtract(TA+NA-NB, TA+NA-NB, TB, NB);
2811  ++Q[NA-NB];
2812  }
2813  }
2814  else
2815  {
2816  NA+=2;
2817  CRYPTOPP_ASSERT(Compare(TA+NA-NB, TB, NB) < 0);
2818  }
2819 
2820  word BT[2];
2821  BT[0] = TB[NB-2] + 1;
2822  BT[1] = TB[NB-1] + (BT[0]==0);
2823 
2824  // start reducing TA mod TB, 2 words at a time
2825  for (size_t i=NA-2; i>=NB; i-=2)
2826  {
2827  AtomicDivide(Q+i-NB, TA+i-2, BT);
2828  CorrectQuotientEstimate(TA+i-NB, TP, Q+i-NB, TB, NB);
2829  }
2830 
2831  // copy TA into R, and denormalize it
2832  CopyWords(R, TA+shiftWords, NB);
2833  ShiftWordsRightByBits(R, NB, shiftBits);
2834 }
2835 
2836 static inline size_t EvenWordCount(const word *X, size_t N)
2837 {
2838  while (N && X[N-2]==0 && X[N-1]==0)
2839  N-=2;
2840  return N;
2841 }
2842 
2843 // return k
2844 // R[N] --- result = A^(-1) * 2^k mod M
2845 // T[4*N] - temporary work space
2846 // A[NA] -- number to take inverse of
2847 // M[N] --- modulus
2848 
2849 unsigned int AlmostInverse(word *R, word *T, const word *A, size_t NA, const word *M, size_t N)
2850 {
2851  CRYPTOPP_ASSERT(NA<=N && N && N%2==0);
2852 
2853  word *b = T;
2854  word *c = T+N;
2855  word *f = T+2*N;
2856  word *g = T+3*N;
2857  size_t bcLen=2, fgLen=EvenWordCount(M, N);
2858  unsigned int k=0;
2859  bool s=false;
2860 
2861  SetWords(T, 0, 3*N);
2862  b[0]=1;
2863  CopyWords(f, A, NA);
2864  CopyWords(g, M, N);
2865 
2866  while (1)
2867  {
2868  word t=f[0];
2869  while (!t)
2870  {
2871  if (EvenWordCount(f, fgLen)==0)
2872  {
2873  SetWords(R, 0, N);
2874  return 0;
2875  }
2876 
2877  ShiftWordsRightByWords(f, fgLen, 1);
2878  bcLen += 2 * (c[bcLen-1] != 0);
2879  CRYPTOPP_ASSERT(bcLen <= N);
2880  ShiftWordsLeftByWords(c, bcLen, 1);
2881  k+=WORD_BITS;
2882  t=f[0];
2883  }
2884 
2885  // t must be non-0; otherwise undefined.
2886  unsigned int i = TrailingZeros(t);
2887  t >>= i;
2888  k += i;
2889 
2890  if (t==1 && f[1]==0 && EvenWordCount(f+2, fgLen-2)==0)
2891  {
2892  if (s)
2893  Subtract(R, M, b, N);
2894  else
2895  CopyWords(R, b, N);
2896  return k;
2897  }
2898 
2899  ShiftWordsRightByBits(f, fgLen, i);
2900  t = ShiftWordsLeftByBits(c, bcLen, i);
2901  c[bcLen] += t;
2902  bcLen += 2 * (t!=0);
2903  CRYPTOPP_ASSERT(bcLen <= N);
2904 
2905  bool swap = Compare(f, g, fgLen)==-1;
2906  ConditionalSwapPointers(swap, f, g);
2907  ConditionalSwapPointers(swap, b, c);
2908  s ^= swap;
2909 
2910  fgLen -= 2 * !(f[fgLen-2] | f[fgLen-1]);
2911 
2912  Subtract(f, f, g, fgLen);
2913  t = Add(b, b, c, bcLen);
2914  b[bcLen] += t;
2915  bcLen += 2*t;
2916  CRYPTOPP_ASSERT(bcLen <= N);
2917  }
2918 }
2919 
2920 // R[N] - result = A/(2^k) mod M
2921 // A[N] - input
2922 // M[N] - modulus
2923 
2924 void DivideByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2925 {
2926  CopyWords(R, A, N);
2927 
2928  while (k--)
2929  {
2930  if (R[0]%2==0)
2931  ShiftWordsRightByBits(R, N, 1);
2932  else
2933  {
2934  word carry = Add(R, R, M, N);
2935  ShiftWordsRightByBits(R, N, 1);
2936  R[N-1] += carry<<(WORD_BITS-1);
2937  }
2938  }
2939 }
2940 
2941 // R[N] - result = A*(2^k) mod M
2942 // A[N] - input
2943 // M[N] - modulus
2944 
2945 void MultiplyByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2946 {
2947  CopyWords(R, A, N);
2948 
2949  while (k--)
2950  if (ShiftWordsLeftByBits(R, N, 1) || Compare(R, M, N)>=0)
2951  Subtract(R, R, M, N);
2952 }
2953 
2954 // ******************************************************************
2955 
2956 static const unsigned int RoundupSizeTable[] = {2, 2, 2, 4, 4, 8, 8, 8, 8};
2957 
2958 static inline size_t RoundupSize(size_t n)
2959 {
2960  if (n<=8)
2961  return RoundupSizeTable[n];
2962  else if (n<=16)
2963  return 16;
2964  else if (n<=32)
2965  return 32;
2966  else if (n<=64)
2967  return 64;
2968  else
2969  return size_t(1) << BitPrecision(n-1);
2970 }
2971 
2973  : reg(2), sign(POSITIVE)
2974 {
2975  reg[0] = reg[1] = 0;
2976 }
2977 
2979  : reg(RoundupSize(t.WordCount())), sign(t.sign)
2980 {
2981  CopyWords(reg, t.reg, reg.size());
2982 }
2983 
2984 Integer::Integer(Sign s, lword value)
2985  : reg(2), sign(s)
2986 {
2987  reg[0] = word(value);
2988  reg[1] = word(SafeRightShift<WORD_BITS>(value));
2989 }
2990 
2991 Integer::Integer(signed long value)
2992  : reg(2)
2993 {
2994  if (value >= 0)
2995  sign = POSITIVE;
2996  else
2997  {
2998  sign = NEGATIVE;
2999  value = -value;
3000  }
3001  reg[0] = word(value);
3002  reg[1] = word(SafeRightShift<WORD_BITS>((unsigned long)value));
3003 }
3004 
3005 Integer::Integer(Sign s, word high, word low)
3006  : reg(2), sign(s)
3007 {
3008  reg[0] = low;
3009  reg[1] = high;
3010 }
3011 
3013 {
3014  if (ByteCount() > sizeof(long))
3015  return false;
3016 
3017  unsigned long value = (unsigned long)reg[0];
3018  value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3019 
3020  if (sign==POSITIVE)
3021  return (signed long)value >= 0;
3022  else
3023  return -(signed long)value < 0;
3024 }
3025 
3026 signed long Integer::ConvertToLong() const
3027 {
3029 
3030  unsigned long value = (unsigned long)reg[0];
3031  value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3032  return sign==POSITIVE ? value : -(signed long)value;
3033 }
3034 
3035 Integer::Integer(BufferedTransformation &encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3036 {
3038 
3039  if (o != LITTLE_ENDIAN_ORDER)
3040  {
3041  Decode(encodedInteger, byteCount, s);
3042  }
3043  else
3044  {
3045  SecByteBlock block(byteCount);
3046  encodedInteger.Get(block, block.size());
3047  std::reverse(block.begin(), block.begin()+block.size());
3048 
3049  Decode(block.begin(), block.size(), s);
3050  }
3051 }
3052 
3053 Integer::Integer(const byte *encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3054 {
3055  CRYPTOPP_ASSERT(encodedInteger && byteCount); // NULL buffer
3057 
3058  if (o != LITTLE_ENDIAN_ORDER)
3059  {
3060  Decode(encodedInteger, byteCount, s);
3061  }
3062  else
3063  {
3064  SecByteBlock block(byteCount);
3065 #if (_MSC_VER >= 1500)
3066  std::reverse_copy(encodedInteger, encodedInteger+byteCount,
3067  stdext::make_checked_array_iterator(block.begin(), block.size()));
3068 #else
3069  std::reverse_copy(encodedInteger, encodedInteger+byteCount, block.begin());
3070 #endif
3071  Decode(block.begin(), block.size(), s);
3072  return;
3073  }
3074 }
3075 
3077 {
3078  // Make explicit call to avoid virtual-dispatch findings in ctor
3079  Integer::BERDecode(bt);
3080 }
3081 
3083 {
3084  Randomize(rng, bitcount);
3085 }
3086 
3087 Integer::Integer(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3088 {
3089  if (!Randomize(rng, min, max, rnType, equiv, mod))
3091 }
3092 
3094 {
3095  Integer r((word)0, BitsToWords(e+1));
3096  r.SetBit(e);
3097  return r;
3098 }
3099 
3101 {
3102  return IsNegative() ? false : (reg[0]==0 && WordCount()==0);
3103 }
3104 
3106 {
3107  if (this != &t)
3108  {
3109  if (reg.size() != t.reg.size() || t.reg[t.reg.size()/2] == 0)
3110  reg.New(RoundupSize(t.WordCount()));
3111  CopyWords(reg, t.reg, reg.size());
3112  sign = t.sign;
3113  }
3114  return *this;
3115 }
3116 
3117 bool Integer::GetBit(size_t n) const
3118 {
3119  // Profiling tells us the original Else was dominant, so it was
3120  // promoted to the first If statement. The code change occurred
3121  // at Commit dc99266599a0e72d.
3122  if (n/WORD_BITS < reg.size())
3123  return bool((reg[n/WORD_BITS] >> (n % WORD_BITS)) & 1);
3124  else
3125  return 0;
3126 }
3127 
3128 void Integer::SetBit(size_t n, bool value)
3129 {
3130  if (value)
3131  {
3132  reg.CleanGrow(RoundupSize(BitsToWords(n+1)));
3133  reg[n/WORD_BITS] |= (word(1) << (n%WORD_BITS));
3134  }
3135  else
3136  {
3137  if (n/WORD_BITS < reg.size())
3138  reg[n/WORD_BITS] &= ~(word(1) << (n%WORD_BITS));
3139  }
3140 }
3141 
3142 byte Integer::GetByte(size_t n) const
3143 {
3144  // Profiling tells us the original Else was dominant, so it was
3145  // promoted to the first If statement. The code change occurred
3146  // at Commit dc99266599a0e72d.
3147  if (n/WORD_SIZE < reg.size())
3148  return byte(reg[n/WORD_SIZE] >> ((n%WORD_SIZE)*8));
3149  else
3150  return 0;
3151 }
3152 
3153 void Integer::SetByte(size_t n, byte value)
3154 {
3155  reg.CleanGrow(RoundupSize(BytesToWords(n+1)));
3156  reg[n/WORD_SIZE] &= ~(word(0xff) << 8*(n%WORD_SIZE));
3157  reg[n/WORD_SIZE] |= (word(value) << 8*(n%WORD_SIZE));
3158 }
3159 
3160 lword Integer::GetBits(size_t i, size_t n) const
3161 {
3162  lword v = 0;
3163  CRYPTOPP_ASSERT(n <= sizeof(v)*8);
3164  for (unsigned int j=0; j<n; j++)
3165  v |= lword(GetBit(i+j)) << j;
3166  return v;
3167 }
3168 
3170 {
3171  Integer result(*this);
3172  result.Negate();
3173  return result;
3174 }
3175 
3177 {
3178  Integer result(*this);
3179  result.sign = POSITIVE;
3180  return result;
3181 }
3182 
3184 {
3185  reg.swap(a.reg);
3186  std::swap(sign, a.sign);
3187 }
3188 
3189 Integer::Integer(word value, size_t length)
3190  : reg(RoundupSize(length)), sign(POSITIVE)
3191 {
3192  reg[0] = value;
3193  SetWords(reg+1, 0, reg.size()-1);
3194 }
3195 
3196 template <class T>
3197 static Integer StringToInteger(const T *str, ByteOrder order)
3198 {
3199  CRYPTOPP_ASSERT( order == BIG_ENDIAN_ORDER || order == LITTLE_ENDIAN_ORDER );
3200 
3201  int radix, sign = 1;
3202  // GCC workaround
3203  // std::char_traits<wchar_t>::length() not defined in GCC 3.2 and STLport 4.5.3
3204  unsigned int length;
3205  for (length = 0; str[length] != 0; length++) {}
3206 
3207  Integer v;
3208 
3209  if (length == 0)
3210  return Integer::Zero();
3211 
3212  switch (str[length-1])
3213  {
3214  case 'h':
3215  case 'H':
3216  radix=16;
3217  break;
3218  case 'o':
3219  case 'O':
3220  radix=8;
3221  break;
3222  case 'b':
3223  case 'B':
3224  radix=2;
3225  break;
3226  default:
3227  radix=10;
3228  }
3229 
3230  // 'str' is of length 1 or more
3231  if (str[0] == '-')
3232  {
3233  sign = -1;
3234  str += 1, length -= 1;
3235  }
3236 
3237  if (length > 2 && str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
3238  {
3239  radix = 16;
3240  str += 2, length -= 2;
3241  }
3242 
3243  if (order == BIG_ENDIAN_ORDER)
3244  {
3245  for (unsigned int i=0; i<length; i++)
3246  {
3247  int digit, ch = static_cast<int>(str[i]);
3248 
3249  // Profiling showd the second and third Else needed to be swapped
3250  // The code change occurred at Commit dc99266599a0e72d.
3251  if (ch >= '0' && ch <= '9')
3252  digit = ch - '0';
3253  else if (ch >= 'a' && ch <= 'f')
3254  digit = ch - 'a' + 10;
3255  else if (ch >= 'A' && ch <= 'F')
3256  digit = ch - 'A' + 10;
3257  else
3258  digit = radix;
3259 
3260  if (digit < radix)
3261  {
3262  v *= radix;
3263  v += digit;
3264  }
3265  }
3266  }
3267  else if (radix == 16 && order == LITTLE_ENDIAN_ORDER)
3268  {
3269  // Nibble high, low and count
3270  unsigned int nh = 0, nl = 0, nc = 0;
3271  Integer position(Integer::One());
3272 
3273  for (unsigned int i=0; i<length; i++)
3274  {
3275  int digit, ch = static_cast<int>(str[i]);
3276 
3277  if (ch >= '0' && ch <= '9')
3278  digit = ch - '0';
3279  else if (ch >= 'a' && ch <= 'f')
3280  digit = ch - 'a' + 10;
3281  else if (ch >= 'A' && ch <= 'F')
3282  digit = ch - 'A' + 10;
3283  else
3284  digit = radix;
3285 
3286  if (digit < radix)
3287  {
3288  if (nc++ == 0)
3289  nh = digit;
3290  else
3291  nl = digit;
3292 
3293  if (nc == 2)
3294  {
3295  v += position * (nh << 4 | nl);
3296  nc = 0, position <<= 8;
3297  }
3298  }
3299  }
3300 
3301  if (nc == 1)
3302  v += nh * position;
3303  }
3304  else // LITTLE_ENDIAN_ORDER && radix != 16
3305  {
3306  for (int i=static_cast<int>(length)-1; i>=0; i--)
3307  {
3308  int digit, ch = static_cast<int>(str[i]);
3309 
3310  if (ch >= '0' && ch <= '9')
3311  digit = ch - '0';
3312  else if (ch >= 'a' && ch <= 'f')
3313  digit = ch - 'a' + 10;
3314  else if (ch >= 'A' && ch <= 'F')
3315  digit = ch - 'A' + 10;
3316  else
3317  digit = radix;
3318 
3319  if (digit < radix)
3320  {
3321  v *= radix;
3322  v += digit;
3323  }
3324  }
3325  }
3326 
3327  if (sign == -1)
3328  v.Negate();
3329 
3330  return v;
3331 }
3332 
3333 Integer::Integer(const char *str, ByteOrder order)
3334  : reg(2), sign(POSITIVE)
3335 {
3336  *this = StringToInteger(str,order);
3337 }
3338 
3339 Integer::Integer(const wchar_t *str, ByteOrder order)
3340  : reg(2), sign(POSITIVE)
3341 {
3342  *this = StringToInteger(str,order);
3343 }
3344 
3345 unsigned int Integer::WordCount() const
3346 {
3347  return (unsigned int)CountWords(reg, reg.size());
3348 }
3349 
3350 unsigned int Integer::ByteCount() const
3351 {
3352  unsigned wordCount = WordCount();
3353  if (wordCount)
3354  return (wordCount-1)*WORD_SIZE + BytePrecision(reg[wordCount-1]);
3355  else
3356  return 0;
3357 }
3358 
3359 unsigned int Integer::BitCount() const
3360 {
3361  unsigned wordCount = WordCount();
3362  if (wordCount)
3363  return (wordCount-1)*WORD_BITS + BitPrecision(reg[wordCount-1]);
3364  else
3365  return 0;
3366 }
3367 
3368 void Integer::Decode(const byte *input, size_t inputLen, Signedness s)
3369 {
3370  CRYPTOPP_ASSERT(input && inputLen); // NULL buffer
3371  StringStore store(input, inputLen);
3372  Decode(store, inputLen, s);
3373 }
3374 
3376 {
3377  CRYPTOPP_ASSERT(bt.MaxRetrievable() >= inputLen);
3378  if (bt.MaxRetrievable() < inputLen)
3379  throw InvalidArgument("Integer: input length is too small");
3380 
3381  byte b;
3382  bt.Peek(b);
3383  sign = ((s==SIGNED) && (b & 0x80)) ? NEGATIVE : POSITIVE;
3384 
3385  while (inputLen>0 && (sign==POSITIVE ? b==0 : b==0xff))
3386  {
3387  bt.Skip(1);
3388  inputLen--;
3389  bt.Peek(b);
3390  }
3391 
3392  reg.CleanNew(RoundupSize(BytesToWords(inputLen)));
3393  for (size_t i=inputLen; i > 0; i--)
3394  {
3395  (void)bt.Get(b);
3396  reg[(i-1)/WORD_SIZE] |= word(b) << ((i-1)%WORD_SIZE)*8;
3397  }
3398 
3399  if (sign == NEGATIVE)
3400  {
3401  for (size_t i=inputLen; i<reg.size()*WORD_SIZE; i++)
3402  reg[i/WORD_SIZE] |= word(0xff) << (i%WORD_SIZE)*8;
3403  TwosComplement(reg, reg.size());
3404  }
3405 }
3406 
3407 size_t Integer::MinEncodedSize(Signedness signedness) const
3408 {
3409  // Profiling tells us the original second If was dominant, so it
3410  // was promoted to the first If statement. The code change occurred
3411  // at Commit dc99266599a0e72d.
3412  unsigned int outputLen = STDMAX(1U, ByteCount());
3413  const bool pre = (signedness == UNSIGNED);
3414  if (!pre && NotNegative() && (GetByte(outputLen-1) & 0x80))
3415  outputLen++;
3416  if (pre)
3417  return outputLen;
3418  if (IsNegative() && *this < -Power2(outputLen*8-1))
3419  outputLen++;
3420  return outputLen;
3421 }
3422 
3423 // PKCS12_PBKDF and other classes use undersized buffers
3424 void Integer::Encode(byte *output, size_t outputLen, Signedness signedness) const
3425 {
3426  CRYPTOPP_ASSERT(output && outputLen); // NULL buffer
3427  ArraySink sink(output, outputLen);
3428  Encode(sink, outputLen, signedness);
3429 }
3430 
3431 void Integer::Encode(BufferedTransformation &bt, size_t outputLen, Signedness signedness) const
3432 {
3433  if (signedness == UNSIGNED || NotNegative())
3434  {
3435  for (size_t i=outputLen; i > 0; i--)
3436  bt.Put(GetByte(i-1));
3437  }
3438  else
3439  {
3440  // take two's complement of *this
3441  Integer temp = Integer::Power2(8*STDMAX((size_t)ByteCount(), outputLen)) + *this;
3442  temp.Encode(bt, outputLen, UNSIGNED);
3443  }
3444 }
3445 
3447 {
3448  DERGeneralEncoder enc(bt, INTEGER);
3450  enc.MessageEnd();
3451 }
3452 
3453 void Integer::BERDecode(const byte *input, size_t len)
3454 {
3455  CRYPTOPP_ASSERT(input && len); // NULL buffer
3456  StringStore store(input, len);
3457  BERDecode(store);
3458 }
3459 
3461 {
3462  BERGeneralDecoder dec(bt, INTEGER);
3463  if (!dec.IsDefiniteLength() || dec.MaxRetrievable() < dec.RemainingLength())
3464  BERDecodeError();
3465  Decode(dec, (size_t)dec.RemainingLength(), SIGNED);
3466  dec.MessageEnd();
3467 }
3468 
3470 {
3471  DERGeneralEncoder enc(bt, OCTET_STRING);
3472  Encode(enc, length);
3473  enc.MessageEnd();
3474 }
3475 
3477 {
3478  BERGeneralDecoder dec(bt, OCTET_STRING);
3479  if (!dec.IsDefiniteLength() || dec.RemainingLength() != length)
3480  BERDecodeError();
3481  Decode(dec, length);
3482  dec.MessageEnd();
3483 }
3484 
3485 size_t Integer::OpenPGPEncode(byte *output, size_t bufferSize) const
3486 {
3487  CRYPTOPP_ASSERT(output && bufferSize); // NULL buffer
3488  CRYPTOPP_ASSERT(bufferSize >= MinEncodedSize()); // Undersized buffer
3489  ArraySink sink(output, bufferSize);
3490  return OpenPGPEncode(sink);
3491 }
3492 
3494 {
3495  word16 bitCount = word16(BitCount());
3496  bt.PutWord16(bitCount);
3497  size_t byteCount = BitsToBytes(bitCount);
3498  Encode(bt, byteCount);
3499  return 2 + byteCount;
3500 }
3501 
3502 void Integer::OpenPGPDecode(const byte *input, size_t len)
3503 {
3504  CRYPTOPP_ASSERT(input && len); // NULL buffer
3505  StringStore store(input, len);
3506  OpenPGPDecode(store);
3507 }
3508 
3510 {
3511  word16 bitCount;
3512  if (bt.GetWord16(bitCount) != 2 || bt.MaxRetrievable() < BitsToBytes(bitCount))
3513  throw OpenPGPDecodeErr();
3514  Decode(bt, BitsToBytes(bitCount));
3515 }
3516 
3518 {
3519  const size_t nbytes = nbits/8 + 1;
3520  SecByteBlock buf(nbytes);
3521  rng.GenerateBlock(buf, nbytes);
3522  if (nbytes)
3523  buf[0] = (byte)Crop(buf[0], nbits % 8);
3524  Decode(buf, nbytes, UNSIGNED);
3525 }
3526 
3527 void Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max)
3528 {
3529  if (min > max)
3530  throw InvalidArgument("Integer: Min must be no greater than Max");
3531 
3532  Integer range = max - min;
3533  const unsigned int nbits = range.BitCount();
3534 
3535  do
3536  {
3537  Randomize(rng, nbits);
3538  }
3539  while (*this > range);
3540 
3541  *this += min;
3542 }
3543 
3544 bool Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3545 {
3546  return GenerateRandomNoThrow(rng, MakeParameters("Min", min)("Max", max)
3547  ("RandomNumberType", rnType)("EquivalentTo", equiv)("Mod", mod));
3548 }
3549 
3551 {
3552 public:
3553  KDF2_RNG(const byte *seed, size_t seedSize)
3554  : m_counter(0), m_counterAndSeed(ClampSize(seedSize) + 4)
3555  {
3556  memcpy(m_counterAndSeed + 4, seed, ClampSize(seedSize));
3557  }
3558 
3559  void GenerateBlock(byte *output, size_t size)
3560  {
3561  CRYPTOPP_ASSERT(output && size); // NULL buffer
3562  PutWord(false, BIG_ENDIAN_ORDER, m_counterAndSeed, m_counter);
3563  ++m_counter;
3564  P1363_KDF2<SHA1>::DeriveKey(output, size, m_counterAndSeed, m_counterAndSeed.size(), NULLPTR, 0);
3565  }
3566 
3567  // UBsan finding, -Wstringop-overflow
3568  inline size_t ClampSize(size_t req) const
3569  {
3570  // Clamp at 16 MB
3571  if (req > 16U*1024*1024)
3572  return 16U*1024*1024;
3573  return req;
3574  }
3575 
3576 private:
3577  word32 m_counter;
3578  SecByteBlock m_counterAndSeed;
3579 };
3580 
3582 {
3583  Integer min = params.GetValueWithDefault("Min", Integer::Zero());
3584  Integer max;
3585  if (!params.GetValue("Max", max))
3586  {
3587  int bitLength;
3588  if (params.GetIntValue("BitLength", bitLength))
3589  max = Integer::Power2(bitLength);
3590  else
3591  throw InvalidArgument("Integer: missing Max argument");
3592  }
3593  if (min > max)
3594  throw InvalidArgument("Integer: Min must be no greater than Max");
3595 
3596  Integer equiv = params.GetValueWithDefault("EquivalentTo", Integer::Zero());
3597  Integer mod = params.GetValueWithDefault("Mod", Integer::One());
3598 
3599  if (equiv.IsNegative() || equiv >= mod)
3600  throw InvalidArgument("Integer: invalid EquivalentTo and/or Mod argument");
3601 
3602  Integer::RandomNumberType rnType = params.GetValueWithDefault("RandomNumberType", Integer::ANY);
3603 
3604  member_ptr<KDF2_RNG> kdf2Rng;
3606  if (params.GetValue(Name::Seed(), seed))
3607  {
3608  ByteQueue bq;
3609  DERSequenceEncoder seq(bq);
3610  min.DEREncode(seq);
3611  max.DEREncode(seq);
3612  equiv.DEREncode(seq);
3613  mod.DEREncode(seq);
3614  DEREncodeUnsigned(seq, rnType);
3615  DEREncodeOctetString(seq, seed.begin(), seed.size());
3616  seq.MessageEnd();
3617 
3618  SecByteBlock finalSeed((size_t)bq.MaxRetrievable());
3619  bq.Get(finalSeed, finalSeed.size());
3620  kdf2Rng.reset(new KDF2_RNG(finalSeed.begin(), finalSeed.size()));
3621  }
3622  RandomNumberGenerator &rng = kdf2Rng.get() ? (RandomNumberGenerator &)*kdf2Rng : i_rng;
3623 
3624  switch (rnType)
3625  {
3626  case ANY:
3627  if (mod == One())
3628  Randomize(rng, min, max);
3629  else
3630  {
3631  Integer min1 = min + (equiv-min)%mod;
3632  if (max < min1)
3633  return false;
3634  Randomize(rng, Zero(), (max - min1) / mod);
3635  *this *= mod;
3636  *this += min1;
3637  }
3638  return true;
3639 
3640  case PRIME:
3641  {
3642  const PrimeSelector *pSelector = params.GetValueWithDefault(Name::PointerToPrimeSelector(), (const PrimeSelector *)NULLPTR);
3643 
3644  int i;
3645  i = 0;
3646  while (1)
3647  {
3648  if (++i==16)
3649  {
3650  // check if there are any suitable primes in [min, max]
3651  Integer first = min;
3652  if (FirstPrime(first, max, equiv, mod, pSelector))
3653  {
3654  // if there is only one suitable prime, we're done
3655  *this = first;
3656  if (!FirstPrime(first, max, equiv, mod, pSelector))
3657  return true;
3658  }
3659  else
3660  return false;
3661  }
3662 
3663  Randomize(rng, min, max);
3664  if (FirstPrime(*this, STDMIN(*this+mod*PrimeSearchInterval(max), max), equiv, mod, pSelector))
3665  return true;
3666  }
3667  }
3668 
3669  default:
3670  throw InvalidArgument("Integer: invalid RandomNumberType argument");
3671  }
3672 }
3673 
3674 std::istream& operator>>(std::istream& in, Integer &a)
3675 {
3676  char c;
3677  unsigned int length = 0;
3678  SecBlock<char> str(length + 16);
3679 
3680  std::ws(in);
3681 
3682  do
3683  {
3684  in.read(&c, 1);
3685  str[length++] = c;
3686  if (length >= str.size())
3687  str.Grow(length + 16);
3688  }
3689  while (in && (c=='-' || c=='x' || (c>='0' && c<='9') || (c>='a' && c<='f') || (c>='A' && c<='F') || c=='h' || c=='H' || c=='o' || c=='O' || c==',' || c=='.'));
3690 
3691  if (in.gcount())
3692  in.putback(c);
3693  str[length-1] = '\0';
3694  a = Integer(str);
3695 
3696  return in;
3697 }
3698 
3699 // Ensure base 10 is default
3700 inline int FlagToBase(long f) {
3701  return f == std::ios::hex ? 16 : (f == std::ios::oct ? 8 : 10);
3702 }
3703 
3704 inline char FlagToSuffix(long f) {
3705  return f == std::ios::hex ? 'h' : (f == std::ios::oct ? 'o' : '.');
3706 }
3707 
3708 // Ensure base 10 is default
3709 std::ostream& operator<<(std::ostream& out, const Integer &a)
3710 {
3711  // Get relevant conversion specifications from ostream.
3712  const long f = out.flags() & std::ios::basefield;
3713  const int base = FlagToBase(f);
3714  const char suffix = FlagToSuffix(f);
3715 
3716  Integer temp1=a, temp2;
3717  if (a.IsNegative())
3718  {
3719  out << '-';
3720  temp1.Negate();
3721  }
3722 
3723  if (!a)
3724  out << '0';
3725 
3726  static const char upper[]="0123456789ABCDEF";
3727  static const char lower[]="0123456789abcdef";
3728 
3729  const char* vec = (out.flags() & std::ios::uppercase) ? upper : lower;
3730  unsigned int i=0;
3731  SecBlock<char> s(a.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
3732 
3733  while (!!temp1)
3734  {
3735  word digit;
3736  Integer::Divide(digit, temp2, temp1, base);
3737  s[i++]=vec[digit];
3738  temp1.swap(temp2);
3739  }
3740 
3741  while (i--)
3742  {
3743  out << s[i];
3744  }
3745 
3746 #ifdef CRYPTOPP_USE_STD_SHOWBASE
3747  if (out.flags() & std::ios_base::showbase)
3748  out << suffix;
3749 
3750  return out;
3751 #else
3752  return out << suffix;
3753 #endif
3754 }
3755 
3757 {
3758  if (NotNegative())
3759  {
3760  if (Increment(reg, reg.size()))
3761  {
3762  reg.CleanGrow(2*reg.size());
3763  reg[reg.size()/2]=1;
3764  }
3765  }
3766  else
3767  {
3768  word borrow = Decrement(reg, reg.size());
3769  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3770 
3771  if (WordCount()==0)
3772  *this = Zero();
3773  }
3774  return *this;
3775 }
3776 
3778 {
3779  if (IsNegative())
3780  {
3781  if (Increment(reg, reg.size()))
3782  {
3783  reg.CleanGrow(2*reg.size());
3784  reg[reg.size()/2]=1;
3785  }
3786  }
3787  else
3788  {
3789  if (Decrement(reg, reg.size()))
3790  *this = -One();
3791  }
3792  return *this;
3793 }
3794 
3795 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3796 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3798 {
3799  if (this == &t)
3800  {
3801  return AbsoluteValue();
3802  }
3803  else if (reg.size() >= t.reg.size())
3804  {
3805  Integer result(t);
3806  AndWords(result.reg, reg, t.reg.size());
3807 
3808  result.sign = POSITIVE;
3809  return result;
3810  }
3811  else // reg.size() < t.reg.size()
3812  {
3813  Integer result(*this);
3814  AndWords(result.reg, t.reg, reg.size());
3815 
3816  result.sign = POSITIVE;
3817  return result;
3818  }
3819 }
3820 
3821 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3822 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3823 Integer Integer::Or(const Integer& t) const
3824 {
3825  if (this == &t)
3826  {
3827  return AbsoluteValue();
3828  }
3829  else if (reg.size() >= t.reg.size())
3830  {
3831  Integer result(*this);
3832  OrWords(result.reg, t.reg, t.reg.size());
3833 
3834  result.sign = POSITIVE;
3835  return result;
3836  }
3837  else // reg.size() < t.reg.size()
3838  {
3839  Integer result(t);
3840  OrWords(result.reg, reg, reg.size());
3841 
3842  result.sign = POSITIVE;
3843  return result;
3844  }
3845 }
3846 
3847 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3848 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3850 {
3851  if (this == &t)
3852  {
3853  return Integer::Zero();
3854  }
3855  else if (reg.size() >= t.reg.size())
3856  {
3857  Integer result(*this);
3858  XorWords(result.reg, t.reg, t.reg.size());
3859 
3860  result.sign = POSITIVE;
3861  return result;
3862  }
3863  else // reg.size() < t.reg.size()
3864  {
3865  Integer result(t);
3866  XorWords(result.reg, reg, reg.size());
3867 
3868  result.sign = POSITIVE;
3869  return result;
3870  }
3871 }
3872 
3873 void PositiveAdd(Integer &sum, const Integer &a, const Integer& b)
3874 {
3875  // Profiling tells us the original second Else If was dominant, so it
3876  // was promoted to the first If statement. The code change occurred at
3877  // Commit dc99266599a0e72d.
3878  int carry; const bool pre = (a.reg.size() == b.reg.size());
3879  if (!pre && a.reg.size() > b.reg.size())
3880  {
3881  carry = Add(sum.reg, a.reg, b.reg, b.reg.size());
3882  CopyWords(sum.reg+b.reg.size(), a.reg+b.reg.size(), a.reg.size()-b.reg.size());
3883  carry = Increment(sum.reg+b.reg.size(), a.reg.size()-b.reg.size(), carry);
3884  }
3885  else if (pre)
3886  {
3887  carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3888  }
3889  else
3890  {
3891  carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3892  CopyWords(sum.reg+a.reg.size(), b.reg+a.reg.size(), b.reg.size()-a.reg.size());
3893  carry = Increment(sum.reg+a.reg.size(), b.reg.size()-a.reg.size(), carry);
3894  }
3895 
3896  if (carry)
3897  {
3898  sum.reg.CleanGrow(2*sum.reg.size());
3899  sum.reg[sum.reg.size()/2] = 1;
3900  }
3901  sum.sign = Integer::POSITIVE;
3902 }
3903 
3904 void PositiveSubtract(Integer &diff, const Integer &a, const Integer& b)
3905 {
3906  unsigned aSize = a.WordCount();
3907  aSize += aSize%2;
3908  unsigned bSize = b.WordCount();
3909  bSize += bSize%2;
3910 
3911  // Profiling tells us the original second Else If was dominant, so it
3912  // was promoted to the first If statement. The code change occurred at
3913  // Commit dc99266599a0e72d.
3914  if (aSize > bSize)
3915  {
3916  word borrow = Subtract(diff.reg, a.reg, b.reg, bSize);
3917  CopyWords(diff.reg+bSize, a.reg+bSize, aSize-bSize);
3918  borrow = Decrement(diff.reg+bSize, aSize-bSize, borrow);
3919  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3920  diff.sign = Integer::POSITIVE;
3921  }
3922  else if (aSize == bSize)
3923  {
3924  if (Compare(a.reg, b.reg, aSize) >= 0)
3925  {
3926  Subtract(diff.reg, a.reg, b.reg, aSize);
3927  diff.sign = Integer::POSITIVE;
3928  }
3929  else
3930  {
3931  Subtract(diff.reg, b.reg, a.reg, aSize);
3932  diff.sign = Integer::NEGATIVE;
3933  }
3934  }
3935  else
3936  {
3937  word borrow = Subtract(diff.reg, b.reg, a.reg, aSize);
3938  CopyWords(diff.reg+aSize, b.reg+aSize, bSize-aSize);
3939  borrow = Decrement(diff.reg+aSize, bSize-aSize, borrow);
3940  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3941  diff.sign = Integer::NEGATIVE;
3942  }
3943 }
3944 
3945 // MSVC .NET 2003 workaround
3946 template <class T> inline const T& STDMAX2(const T& a, const T& b)
3947 {
3948  return a < b ? b : a;
3949 }
3950 
3952 {
3953  Integer sum((word)0, STDMAX2(reg.size(), b.reg.size()));
3954  if (NotNegative())
3955  {
3956  if (b.NotNegative())
3957  PositiveAdd(sum, *this, b);
3958  else
3959  PositiveSubtract(sum, *this, b);
3960  }
3961  else
3962  {
3963  if (b.NotNegative())
3964  PositiveSubtract(sum, b, *this);
3965  else
3966  {
3967  PositiveAdd(sum, *this, b);
3968  sum.sign = Integer::NEGATIVE;
3969  }
3970  }
3971  return sum;
3972 }
3973 
3975 {
3976  reg.CleanGrow(t.reg.size());
3977  if (NotNegative())
3978  {
3979  if (t.NotNegative())
3980  PositiveAdd(*this, *this, t);
3981  else
3982  PositiveSubtract(*this, *this, t);
3983  }
3984  else
3985  {
3986  if (t.NotNegative())
3987  PositiveSubtract(*this, t, *this);
3988  else
3989  {
3990  PositiveAdd(*this, *this, t);
3991  sign = Integer::NEGATIVE;
3992  }
3993  }
3994  return *this;
3995 }
3996 
3998 {
3999  Integer diff((word)0, STDMAX2(reg.size(), b.reg.size()));
4000  if (NotNegative())
4001  {
4002  if (b.NotNegative())
4003  PositiveSubtract(diff, *this, b);
4004  else
4005  PositiveAdd(diff, *this, b);
4006  }
4007  else
4008  {
4009  if (b.NotNegative())
4010  {
4011  PositiveAdd(diff, *this, b);
4012  diff.sign = Integer::NEGATIVE;
4013  }
4014  else
4015  PositiveSubtract(diff, b, *this);
4016  }
4017  return diff;
4018 }
4019 
4021 {
4022  reg.CleanGrow(t.reg.size());
4023  if (NotNegative())
4024  {
4025  if (t.NotNegative())
4026  PositiveSubtract(*this, *this, t);
4027  else
4028  PositiveAdd(*this, *this, t);
4029  }
4030  else
4031  {
4032  if (t.NotNegative())
4033  {
4034  PositiveAdd(*this, *this, t);
4035  sign = Integer::NEGATIVE;
4036  }
4037  else
4038  PositiveSubtract(*this, t, *this);
4039  }
4040  return *this;
4041 }
4042 
4044 {
4045  const size_t wordCount = WordCount();
4046  const size_t shiftWords = n / WORD_BITS;
4047  const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4048 
4049  reg.CleanGrow(RoundupSize(wordCount+BitsToWords(n)));
4050  ShiftWordsLeftByWords(reg, wordCount + shiftWords, shiftWords);
4051  ShiftWordsLeftByBits(reg+shiftWords, wordCount+BitsToWords(shiftBits), shiftBits);
4052  return *this;
4053 }
4054 
4056 {
4057  const size_t wordCount = WordCount();
4058  const size_t shiftWords = n / WORD_BITS;
4059  const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4060 
4061  ShiftWordsRightByWords(reg, wordCount, shiftWords);
4062  if (wordCount > shiftWords)
4063  ShiftWordsRightByBits(reg, wordCount-shiftWords, shiftBits);
4064  if (IsNegative() && WordCount()==0) // avoid -0
4065  *this = Zero();
4066  return *this;
4067 }
4068 
4070 {
4071  if (this != &t)
4072  {
4073  const size_t size = STDMIN(reg.size(), t.reg.size());
4074  reg.resize(size);
4075  AndWords(reg, t.reg, size);
4076  }
4077  sign = POSITIVE;
4078  return *this;
4079 }
4080 
4082 {
4083  if (this != &t)
4084  {
4085  if (reg.size() >= t.reg.size())
4086  {
4087  OrWords(reg, t.reg, t.reg.size());
4088  }
4089  else // reg.size() < t.reg.size()
4090  {
4091  const size_t head = reg.size();
4092  const size_t tail = t.reg.size() - reg.size();
4093  reg.resize(head+tail);
4094  OrWords(reg, t.reg, head);
4095  CopyWords(reg+head,t.reg+head,tail);
4096  }
4097  }
4098  sign = POSITIVE;
4099  return *this;
4100 }
4101 
4103 {
4104  if (this == &t)
4105  {
4106  *this = Zero();
4107  }
4108  else
4109  {
4110  if (reg.size() >= t.reg.size())
4111  {
4112  XorWords(reg, t.reg, t.reg.size());
4113  }
4114  else // reg.size() < t.reg.size()
4115  {
4116  const size_t head = reg.size();
4117  const size_t tail = t.reg.size() - reg.size();
4118  reg.resize(head+tail);
4119  XorWords(reg, t.reg, head);
4120  CopyWords(reg+head,t.reg+head,tail);
4121  }
4122  }
4123  sign = POSITIVE;
4124  return *this;
4125 }
4126 
4127 void PositiveMultiply(Integer &product, const Integer &a, const Integer &b)
4128 {
4129  size_t aSize = RoundupSize(a.WordCount());
4130  size_t bSize = RoundupSize(b.WordCount());
4131 
4132  product.reg.CleanNew(RoundupSize(aSize+bSize));
4133  product.sign = Integer::POSITIVE;
4134 
4135  IntegerSecBlock workspace(aSize + bSize);
4136  AsymmetricMultiply(product.reg, workspace, a.reg, aSize, b.reg, bSize);
4137 }
4138 
4139 void Multiply(Integer &product, const Integer &a, const Integer &b)
4140 {
4141  PositiveMultiply(product, a, b);
4142 
4143  if (a.NotNegative() != b.NotNegative())
4144  product.Negate();
4145 }
4146 
4148 {
4149  Integer product;
4150  Multiply(product, *this, b);
4151  return product;
4152 }
4153 
4154 /*
4155 void PositiveDivide(Integer &remainder, Integer &quotient,
4156  const Integer &dividend, const Integer &divisor)
4157 {
4158  remainder.reg.CleanNew(divisor.reg.size());
4159  remainder.sign = Integer::POSITIVE;
4160  quotient.reg.New(0);
4161  quotient.sign = Integer::POSITIVE;
4162  unsigned i=dividend.BitCount();
4163  while (i--)
4164  {
4165  word overflow = ShiftWordsLeftByBits(remainder.reg, remainder.reg.size(), 1);
4166  remainder.reg[0] |= dividend[i];
4167  if (overflow || remainder >= divisor)
4168  {
4169  Subtract(remainder.reg, remainder.reg, divisor.reg, remainder.reg.size());
4170  quotient.SetBit(i);
4171  }
4172  }
4173 }
4174 */
4175 
4176 void PositiveDivide(Integer &remainder, Integer &quotient,
4177  const Integer &a, const Integer &b)
4178 {
4179  unsigned aSize = a.WordCount();
4180  unsigned bSize = b.WordCount();
4181 
4182  if (!bSize)
4183  throw Integer::DivideByZero();
4184 
4185  if (aSize < bSize)
4186  {
4187  remainder = a;
4188  remainder.sign = Integer::POSITIVE;
4189  quotient = Integer::Zero();
4190  return;
4191  }
4192 
4193  aSize += aSize%2; // round up to next even number
4194  bSize += bSize%2;
4195 
4196  remainder.reg.CleanNew(RoundupSize(bSize));
4197  remainder.sign = Integer::POSITIVE;
4198  quotient.reg.CleanNew(RoundupSize(aSize-bSize+2));
4199  quotient.sign = Integer::POSITIVE;
4200 
4201  IntegerSecBlock T(aSize+3*(bSize+2));
4202  Divide(remainder.reg, quotient.reg, T, a.reg, aSize, b.reg, bSize);
4203 }
4204 
4205 void Integer::Divide(Integer &remainder, Integer &quotient, const Integer &dividend, const Integer &divisor)
4206 {
4207  PositiveDivide(remainder, quotient, dividend, divisor);
4208 
4209  if (dividend.IsNegative())
4210  {
4211  quotient.Negate();
4212  if (remainder.NotZero())
4213  {
4214  --quotient;
4215  remainder = divisor.AbsoluteValue() - remainder;
4216  }
4217  }
4218 
4219  if (divisor.IsNegative())
4220  quotient.Negate();
4221 }
4222 
4223 void Integer::DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
4224 {
4225  q = a;
4226  q >>= n;
4227 
4228  const size_t wordCount = BitsToWords(n);
4229  if (wordCount <= a.WordCount())
4230  {
4231  r.reg.resize(RoundupSize(wordCount));
4232  CopyWords(r.reg, a.reg, wordCount);
4233  SetWords(r.reg+wordCount, 0, r.reg.size()-wordCount);
4234  if (n % WORD_BITS != 0)
4235  r.reg[wordCount-1] %= (word(1) << (n % WORD_BITS));
4236  }
4237  else
4238  {
4239  r.reg.resize(RoundupSize(a.WordCount()));
4240  CopyWords(r.reg, a.reg, r.reg.size());
4241  }
4242  r.sign = POSITIVE;
4243 
4244  if (a.IsNegative() && r.NotZero())
4245  {
4246  --q;
4247  r = Power2(n) - r;
4248  }
4249 }
4250 
4252 {
4253  Integer remainder, quotient;
4254  Integer::Divide(remainder, quotient, *this, b);
4255  return quotient;
4256 }
4257 
4259 {
4260  Integer remainder, quotient;
4261  Integer::Divide(remainder, quotient, *this, b);
4262  return remainder;
4263 }
4264 
4265 void Integer::Divide(word &remainder, Integer &quotient, const Integer &dividend, word divisor)
4266 {
4267  if (!divisor)
4268  throw Integer::DivideByZero();
4269 
4270  // IsPowerOf2 uses BMI on x86 if available. There is a small
4271  // but measurable improvement during decryption and signing.
4272  if (IsPowerOf2(divisor))
4273  {
4274  quotient = dividend >> (BitPrecision(divisor)-1);
4275  remainder = dividend.reg[0] & (divisor-1);
4276  return;
4277  }
4278 
4279  unsigned int i = dividend.WordCount();
4280  quotient.reg.CleanNew(RoundupSize(i));
4281  remainder = 0;
4282  while (i--)
4283  {
4284  quotient.reg[i] = DWord(dividend.reg[i], remainder) / divisor;
4285  remainder = DWord(dividend.reg[i], remainder) % divisor;
4286  }
4287 
4288  if (dividend.NotNegative())
4289  quotient.sign = POSITIVE;
4290  else
4291  {
4292  quotient.sign = NEGATIVE;
4293  if (remainder)
4294  {
4295  --quotient;
4296  remainder = divisor - remainder;
4297  }
4298  }
4299 }
4300 
4302 {
4303  word remainder;
4304  Integer quotient;
4305  Integer::Divide(remainder, quotient, *this, b);
4306  return quotient;
4307 }
4308 
4309 word Integer::Modulo(word divisor) const
4310 {
4311  if (!divisor)
4312  throw Integer::DivideByZero();
4313 
4314  word remainder;
4315 
4316  // Profiling tells us the original Else was dominant, so it was
4317  // promoted to the first If statement. The code change occurred
4318  // at Commit dc99266599a0e72d.
4319  if ((divisor & (divisor-1)) != 0) // divisor is not a power of 2
4320  {
4321  // Profiling tells us the original Else was dominant, so it
4322  // was promoted to the first If statement. The code change
4323  // occurred at Commit dc99266599a0e72d.
4324  unsigned int i = WordCount();
4325  if (divisor > 5)
4326  {
4327  remainder = 0;
4328  while (i--)
4329  remainder = DWord(reg[i], remainder) % divisor;
4330  }
4331  else
4332  {
4333  DWord sum(0, 0);
4334  while (i--)
4335  sum += reg[i];
4336  remainder = sum % divisor;
4337  }
4338  }
4339  else // divisor is a power of 2
4340  {
4341  remainder = reg[0] & (divisor-1);
4342  }
4343 
4344  if (IsNegative() && remainder)
4345  remainder = divisor - remainder;
4346 
4347  return remainder;
4348 }
4349 
4351 {
4352  if (!!(*this)) // don't flip sign if *this==0
4353  sign = Sign(1-sign);
4354 }
4355 
4356 int Integer::PositiveCompare(const Integer& t) const
4357 {
4358  // Profiling tells us the original Else was dominant, so it
4359  // was promoted to the first If statement. The code change
4360  // occurred at Commit dc99266599a0e72d.
4361  const unsigned size = WordCount(), tSize = t.WordCount();
4362  if (size != tSize)
4363  return size > tSize ? 1 : -1;
4364  else
4365  return CryptoPP::Compare(reg, t.reg, size);
4366 }
4367 
4368 int Integer::Compare(const Integer& t) const
4369 {
4370  if (NotNegative())
4371  {
4372  if (t.NotNegative())
4373  return PositiveCompare(t);
4374  else
4375  return 1;
4376  }
4377  else
4378  {
4379  if (t.NotNegative())
4380  return -1;
4381  else
4382  return -PositiveCompare(t);
4383  }
4384 }
4385 
4387 {
4388  if (!IsPositive())
4389  return Zero();
4390 
4391  // overestimate square root
4392  Integer x, y = Power2((BitCount()+1)/2);
4393  CRYPTOPP_ASSERT(y*y >= *this);
4394 
4395  do
4396  {
4397  x = y;
4398  y = (x + *this/x) >> 1;
4399  } while (y<x);
4400 
4401  return x;
4402 }
4403 
4404 bool Integer::IsSquare() const
4405 {
4406  Integer r = SquareRoot();
4407  return *this == r.Squared();
4408 }
4409 
4410 bool Integer::IsUnit() const
4411 {
4412  return (WordCount() == 1) && (reg[0] == 1);
4413 }
4414 
4416 {
4417  return IsUnit() ? *this : Zero();
4418 }
4419 
4420 Integer a_times_b_mod_c(const Integer &x, const Integer& y, const Integer& m)
4421 {
4422  CRYPTOPP_ASSERT(m.NotZero());
4423  if (m.IsZero())
4424  throw Integer::DivideByZero();
4425 
4426  return x*y%m;
4427 }
4428 
4429 Integer a_exp_b_mod_c(const Integer &x, const Integer& e, const Integer& m)
4430 {
4431  CRYPTOPP_ASSERT(m.NotZero());
4432  if (m.IsZero())
4433  throw Integer::DivideByZero();
4434 
4435  ModularArithmetic mr(m);
4436  return mr.Exponentiate(x, e);
4437 }
4438 
4440 {
4441  return EuclideanDomainOf<Integer>().Gcd(a, b);
4442 }
4443 
4445 {
4447  CRYPTOPP_ASSERT(m.NotZero());
4448 
4449  if (IsNegative())
4450  return Modulo(m).InverseModNext(m);
4451 
4452  // http://github.com/weidai11/cryptopp/issues/602
4453  if (*this >= m)
4454  return Modulo(m).InverseModNext(m);
4455 
4456  return InverseModNext(m);
4457 }
4458 
4459 Integer Integer::InverseModNext(const Integer &m) const
4460 {
4462  CRYPTOPP_ASSERT(m.NotZero());
4463 
4464  if (m.IsEven())
4465  {
4466  if (!m || IsEven())
4467  return Zero(); // no inverse
4468  if (*this == One())
4469  return One();
4470 
4471  Integer u = m.Modulo(*this).InverseModNext(*this);
4472  return !u ? Zero() : (m*(*this-u)+1)/(*this);
4473  }
4474 
4475  // AlmostInverse requires a 4x workspace
4476  IntegerSecBlock T(m.reg.size() * 4);
4477  Integer r((word)0, m.reg.size());
4478  unsigned k = AlmostInverse(r.reg, T, reg, reg.size(), m.reg, m.reg.size());
4479  DivideByPower2Mod(r.reg, r.reg, k, m.reg, m.reg.size());
4480  return r;
4481 }
4482 
4483 word Integer::InverseMod(word mod) const
4484 {
4485  CRYPTOPP_ASSERT(mod != 0);
4486 
4487  word g0 = mod, g1 = *this % mod;
4488  word v0 = 0, v1 = 1;
4489  word y;
4490 
4491  while (g1)
4492  {
4493  if (g1 == 1)
4494  return v1;
4495  y = g0 / g1;
4496  g0 = g0 % g1;
4497  v0 += y * v1;
4498 
4499  if (!g0)
4500  break;
4501  if (g0 == 1)
4502  return mod-v0;
4503  y = g1 / g0;
4504  g1 = g1 % g0;
4505  v1 += y * v0;
4506  }
4507  return 0;
4508 }
4509 
4510 // ********************************************************
4511 
4513 {
4514  BERSequenceDecoder seq(bt);
4515  OID oid(seq);
4516  if (oid != ASN1::prime_field())
4517  BERDecodeError();
4518  m_modulus.BERDecode(seq);
4519  seq.MessageEnd();
4520  m_result.reg.resize(m_modulus.reg.size());
4521 }
4522 
4524 {
4525  DERSequenceEncoder seq(bt);
4526  ASN1::prime_field().DEREncode(seq);
4527  m_modulus.DEREncode(seq);
4528  seq.MessageEnd();
4529 }
4530 
4532 {
4533  a.DEREncodeAsOctetString(out, MaxElementByteLength());
4534 }
4535 
4537 {
4538  a.BERDecodeAsOctetString(in, MaxElementByteLength());
4539 }
4540 
4542 {
4543  if (a.reg.size()==m_modulus.reg.size())
4544  {
4545  CryptoPP::DivideByPower2Mod(m_result.reg.begin(), a.reg, 1, m_modulus.reg, a.reg.size());
4546  return m_result;
4547  }
4548  else
4549  return m_result1 = (a.IsEven() ? (a >> 1) : ((a+m_modulus) >> 1));
4550 }
4551 
4552 const Integer& ModularArithmetic::Add(const Integer &a, const Integer &b) const
4553 {
4554  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4555  {
4556  if (CryptoPP::Add(m_result.reg.begin(), a.reg, b.reg, a.reg.size())
4557  || Compare(m_result.reg, m_modulus.reg, a.reg.size()) >= 0)
4558  {
4559  CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4560  }
4561  return m_result;
4562  }
4563  else
4564  {
4565  m_result1 = a+b;
4566  if (m_result1 >= m_modulus)
4567  m_result1 -= m_modulus;
4568  return m_result1;
4569  }
4570 }
4571 
4573 {
4574  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4575  {
4576  if (CryptoPP::Add(a.reg, a.reg, b.reg, a.reg.size())
4577  || Compare(a.reg, m_modulus.reg, a.reg.size()) >= 0)
4578  {
4579  CryptoPP::Subtract(a.reg, a.reg, m_modulus.reg, a.reg.size());
4580  }
4581  }
4582  else
4583  {
4584  a+=b;
4585  if (a>=m_modulus)
4586  a-=m_modulus;
4587  }
4588 
4589  return a;
4590 }
4591 
4592 const Integer& ModularArithmetic::Subtract(const Integer &a, const Integer &b) const
4593 {
4594  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4595  {
4596  if (CryptoPP::Subtract(m_result.reg.begin(), a.reg, b.reg, a.reg.size()))
4597  CryptoPP::Add(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4598  return m_result;
4599  }
4600  else
4601  {
4602  m_result1 = a-b;
4603  if (m_result1.IsNegative())
4604  m_result1 += m_modulus;
4605  return m_result1;
4606  }
4607 }
4608 
4610 {
4611  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4612  {
4613  if (CryptoPP::Subtract(a.reg, a.reg, b.reg, a.reg.size()))
4614  CryptoPP::Add(a.reg, a.reg, m_modulus.reg, a.reg.size());
4615  }
4616  else
4617  {
4618  a-=b;
4619  if (a.IsNegative())
4620  a+=m_modulus;
4621  }
4622 
4623  return a;
4624 }
4625 
4627 {
4628  if (!a)
4629  return a;
4630 
4631  CopyWords(m_result.reg.begin(), m_modulus.reg, m_modulus.reg.size());
4632  if (CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, a.reg, a.reg.size()))
4633  Decrement(m_result.reg.begin()+a.reg.size(), m_modulus.reg.size()-a.reg.size());
4634 
4635  return m_result;
4636 }
4637 
4638 Integer ModularArithmetic::CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
4639 {
4640  if (m_modulus.IsOdd())
4641  {
4642  MontgomeryRepresentation dr(m_modulus);
4643  return dr.ConvertOut(dr.CascadeExponentiate(dr.ConvertIn(x), e1, dr.ConvertIn(y), e2));
4644  }
4645  else
4646  return AbstractRing<Integer>::CascadeExponentiate(x, e1, y, e2);
4647 }
4648 
4649 void ModularArithmetic::SimultaneousExponentiate(Integer *results, const Integer &base, const Integer *exponents, unsigned int exponentsCount) const
4650 {
4651  if (m_modulus.IsOdd())
4652  {
4653  MontgomeryRepresentation dr(m_modulus);
4654  dr.SimultaneousExponentiate(results, dr.ConvertIn(base), exponents, exponentsCount);
4655  for (unsigned int i=0; i<exponentsCount; i++)
4656  results[i] = dr.ConvertOut(results[i]);
4657  }
4658  else
4659  AbstractRing<Integer>::SimultaneousExponentiate(results, base, exponents, exponentsCount);
4660 }
4661 
4663  : ModularArithmetic(m),
4664  m_u((word)0, m_modulus.reg.size()),
4665  m_workspace(5*m_modulus.reg.size())
4666 {
4667  if (!m_modulus.IsOdd())
4668  throw InvalidArgument("MontgomeryRepresentation: Montgomery representation requires an odd modulus");
4669 
4670  RecursiveInverseModPower2(m_u.reg, m_workspace, m_modulus.reg, m_modulus.reg.size());
4671 }
4672 
4674 {
4675  word *const T = m_workspace.begin();
4676  word *const R = m_result.reg.begin();
4677  const size_t N = m_modulus.reg.size();
4678  CRYPTOPP_ASSERT(a.reg.size()<=N && b.reg.size()<=N);
4679 
4680  AsymmetricMultiply(T, T+2*N, a.reg, a.reg.size(), b.reg, b.reg.size());
4681  SetWords(T+a.reg.size()+b.reg.size(), 0, 2*N-a.reg.size()-b.reg.size());
4682  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4683  return m_result;
4684 }
4685 
4687 {
4688  word *const T = m_workspace.begin();
4689  word *const R = m_result.reg.begin();
4690  const size_t N = m_modulus.reg.size();
4691  CRYPTOPP_ASSERT(a.reg.size()<=N);
4692 
4693  CryptoPP::Square(T, T+2*N, a.reg, a.reg.size());
4694  SetWords(T+2*a.reg.size(), 0, 2*N-2*a.reg.size());
4695  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4696  return m_result;
4697 }
4698 
4700 {
4701  word *const T = m_workspace.begin();
4702  word *const R = m_result.reg.begin();
4703  const size_t N = m_modulus.reg.size();
4704  CRYPTOPP_ASSERT(a.reg.size()<=N);
4705 
4706  CopyWords(T, a.reg, a.reg.size());
4707  SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4708  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4709  return m_result;
4710 }
4711 
4713 {
4714 // return (EuclideanMultiplicativeInverse(a, modulus)<<(2*WORD_BITS*modulus.reg.size()))%modulus;
4715  word *const T = m_workspace.begin();
4716  word *const R = m_result.reg.begin();
4717  const size_t N = m_modulus.reg.size();
4718  CRYPTOPP_ASSERT(a.reg.size()<=N);
4719 
4720  CopyWords(T, a.reg, a.reg.size());
4721  SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4722  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4723  unsigned k = AlmostInverse(R, T, R, N, m_modulus.reg, N);
4724 
4725 // cout << "k=" << k << " N*32=" << 32*N << endl;
4726 
4727  if (k>N*WORD_BITS)
4728  DivideByPower2Mod(R, R, k-N*WORD_BITS, m_modulus.reg, N);
4729  else
4730  MultiplyByPower2Mod(R, R, N*WORD_BITS-k, m_modulus.reg, N);
4731 
4732  return m_result;
4733 }
4734 
4735 // Specialization declared in misc.h to allow us to print integers
4736 // with additional control options, like arbirary bases and uppercase.
4737 template <> CRYPTOPP_DLL
4738 std::string IntToString<Integer>(Integer value, unsigned int base)
4739 {
4740  // Hack... set the high bit for uppercase. Set the next bit fo a suffix.
4741  static const unsigned int BIT_32 = (1U << 31);
4742  const bool UPPER = !!(base & BIT_32);
4743  static const unsigned int BIT_31 = (1U << 30);
4744  const bool BASE = !!(base & BIT_31);
4745 
4746  const char CH = UPPER ? 'A' : 'a';
4747  base &= ~(BIT_32|BIT_31);
4748  CRYPTOPP_ASSERT(base >= 2 && base <= 32);
4749 
4750  if (value == 0)
4751  return "0";
4752 
4753  bool negative = false, zero = false;
4754  if (value.IsNegative())
4755  {
4756  negative = true;
4757  value.Negate();
4758  }
4759 
4760  if (!value)
4761  zero = true;
4762 
4763  SecBlock<char> s(value.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
4764  Integer temp;
4765 
4766  unsigned int i=0;
4767  while (!!value)
4768  {
4769  word digit;
4770  Integer::Divide(digit, temp, value, word(base));
4771  s[i++]=char((digit < 10 ? '0' : (CH - 10)) + digit);
4772  value.swap(temp);
4773  }
4774 
4775  std::string result;
4776  result.reserve(i+2);
4777 
4778  if (negative)
4779  result += '-';
4780 
4781  if (zero)
4782  result += '0';
4783 
4784  while (i--)
4785  result += s[i];
4786 
4787  if (BASE)
4788  {
4789  if (base == 10)
4790  result += '.';
4791  else if (base == 16)
4792  result += 'h';
4793  else if (base == 8)
4794  result += 'o';
4795  else if (base == 2)
4796  result += 'b';
4797  }
4798 
4799  return result;
4800 }
4801 
4802 // Specialization declared in misc.h to avoid Coverity findings.
4803 template <> CRYPTOPP_DLL
4804 std::string IntToString<word64>(word64 value, unsigned int base)
4805 {
4806  // Hack... set the high bit for uppercase.
4807  static const unsigned int HIGH_BIT = (1U << 31);
4808  const char CH = !!(base & HIGH_BIT) ? 'A' : 'a';
4809  base &= ~HIGH_BIT;
4810 
4811  CRYPTOPP_ASSERT(base >= 2);
4812  if (value == 0)
4813  return "0";
4814 
4815  std::string result;
4816  while (value > 0)
4817  {
4818  word64 digit = value % base;
4819  result = char((digit < 10 ? '0' : (CH - 10)) + digit) + result;
4820  value /= base;
4821  }
4822  return result;
4823 }
4824 
4825 #ifndef CRYPTOPP_NO_ASSIGN_TO_INTEGER
4826 // Allow the linker to discard Integer code if not needed.
4827 // Also see http://github.com/weidai11/cryptopp/issues/389.
4828 bool AssignIntToInteger(const std::type_info &valueType, void *pInteger, const void *pInt)
4829 {
4830  if (valueType != typeid(Integer))
4831  return false;
4832  *reinterpret_cast<Integer *>(pInteger) = *reinterpret_cast<const int *>(pInt);
4833  return true;
4834 }
4835 #endif // CRYPTOPP_NO_ASSIGN_TO_INTEGER
4836 
4837 // *************************** C++ Static Initialization ***************************
4838 
4840 {
4841 public:
4842  InitInteger()
4843  {
4844  SetFunctionPointers();
4845  }
4846 };
4847 
4848 // This is not really needed because each Integer can dynamically initialize
4849 // itself, but we take a peephole optimization and initialize the class once
4850 // if init priorities are available. Dynamic initialization will be used if
4851 // init priorities are not available.
4852 
4853 #if defined(HAVE_GCC_INIT_PRIORITY)
4854  const InitInteger s_init __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 10))) = InitInteger();
4855  const Integer g_zero __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 11))) = Integer(0L);
4856  const Integer g_one __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 12))) = Integer(1L);
4857  const Integer g_two __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 13))) = Integer(2L);
4858 #elif defined(HAVE_MSC_INIT_PRIORITY)
4859  #pragma warning(disable: 4075)
4860  #pragma init_seg(".CRT$XCU")
4861  const InitInteger s_init;
4862  const Integer g_zero(0L);
4863  const Integer g_one(1L);
4864  const Integer g_two(2L);
4865  #pragma warning(default: 4075)
4866 #elif HAVE_XLC_INIT_PRIORITY
4867  // XLC needs constant, not a define
4868  #pragma priority(280)
4869  const InitInteger s_init;
4870  const Integer g_zero(0L);
4871  const Integer g_one(1L);
4872  const Integer g_two(2L);
4873 #else
4874  const InitInteger s_init;
4875 #endif
4876 
4877 // ***************** Library code ********************
4878 
4880 {
4881 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4882  return g_zero;
4883 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4884  static const Integer s_zero(0L);
4885  return s_zero;
4886 #else // Potential memory leak. Avoid if possible.
4887  return Singleton<Integer, NewInteger<0L> >().Ref();
4888 #endif
4889 }
4890 
4892 {
4893 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4894  return g_one;
4895 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4896  static const Integer s_one(1L);
4897  return s_one;
4898 #else // Potential memory leak. Avoid if possible.
4899  return Singleton<Integer, NewInteger<1L> >().Ref();
4900 #endif
4901 }
4902 
4904 {
4905 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4906  return g_two;
4907 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4908  static const Integer s_two(2L);
4909  return s_two;
4910 #else // Potential memory leak. Avoid if possible.
4911  return Singleton<Integer, NewInteger<2L> >().Ref();
4912 #endif
4913 }
4914 
4915 NAMESPACE_END
4916 
4917 #endif // CRYPTOPP_IMPORTS
Used to pass byte array input as part of a NameValuePairs object.
Definition: algparam.h:20
An invalid argument was detected.
Definition: cryptlib.h:202
CRYPTOPP_DLL std::string IntToString< Integer >(Integer value, unsigned int base)
Converts an Integer to a string.
Definition: integer.cpp:4738
unsigned int WordCount() const
Determines the number of words required to represent the Integer.
Definition: integer.cpp:3345
Integer MultiplicativeInverse() const
Calculate multiplicative inverse.
Definition: integer.cpp:4415
Integer & Reduce(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4609
bool NotZero() const
Determines if the Integer is non-0.
Definition: integer.h:336
Classes for working with NameValuePairs.
Integer Or(const Integer &t) const
Bitwise OR.
Definition: integer.cpp:3823
Integer & operator|=(const Integer &t)
Bitwise OR Assignment.
Definition: integer.cpp:4081
void CopyWords(word *r, const word *a, size_t n)
Copy word array.
Definition: words.h:48
void swap(SecBlock< T, A > &b)
Swap contents with another SecBlock.
Definition: secblock.h:1041
Integer And(const Integer &t) const
Bitwise AND.
Definition: integer.cpp:3797
a number which is probabilistically prime
Definition: integer.h:95
CRYPTOPP_DLL bool CRYPTOPP_API FirstPrime(Integer &p, const Integer &max, const Integer &equiv, const Integer &mod, const PrimeSelector *pSelector)
Finds a random prime of special form.
Definition: nbtheory.cpp:379
Utility functions for the Crypto++ library.
Integer Plus(const Integer &b) const
Addition.
Definition: integer.cpp:3951
virtual size_t Peek(byte &outByte) const
Peek a 8-bit byte.
Definition: cryptlib.cpp:552
ByteOrder
Provides the byte ordering.
Definition: cryptlib.h:143
Restricts the instantiation of a class to one static object without locks.
Definition: misc.h:298
void CleanNew(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:980
an unsigned value
Definition: integer.h:85
T GetValueWithDefault(const char *name, T defaultValue) const
Get a named value.
Definition: cryptlib.h:363
bool IsSquare() const
Determine whether this integer is a perfect square.
Definition: integer.cpp:4404
void DEREncode(BufferedTransformation &bt) const
Encode in DER format.
Definition: integer.cpp:3446
size_t DEREncodeUnsigned(BufferedTransformation &out, T w, byte asnTag=INTEGER)
DER Encode unsigned value.
Definition: asn.h:643
virtual void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: cryptlib.cpp:312
size_t size() const
Length of the memory block.
Definition: algparam.h:84
size_t BitsToBytes(size_t bitCount)
Returns the number of 8-bit bytes or octets required for the specified number of bits.
Definition: misc.h:885
This file contains helper classes/functions for implementing public key algorithms.
void DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
Encode absolute value as big-endian octet string.
Definition: integer.cpp:3469
Integer & operator=(const Integer &t)
Assignment.
Definition: integer.cpp:3105
static Integer CRYPTOPP_API Gcd(const Integer &a, const Integer &n)
Calculate greatest common divisor.
Definition: integer.cpp:4439
void resize(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1031
Integer & operator+=(const Integer &t)
Addition Assignment.
Definition: integer.cpp:3974
size_t BitsToWords(size_t bitCount)
Returns the number of words required for the specified number of bits.
Definition: misc.h:905
void MessageEnd()
Signals the end of messages to the object.
Definition: asn.cpp:538
void PutWord(bool assumeAligned, ByteOrder order, byte *block, T value, const byte *xorBlock=NULL)
Access a block of memory.
Definition: misc.h:2506
unsigned int BytePrecision(const T &value)
Returns the number of 8-bit bytes or octets required for a value.
Definition: misc.h:766
void CleanGrow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1013
Integer & operator--()
Pre-decrement.
Definition: integer.cpp:3777
virtual Element CascadeExponentiate(const Element &x, const Integer &e1, const Element &y, const Integer &e2) const
TODO.
Definition: algebra.cpp:323
Secure memory block with allocator and cleanup.
Definition: secblock.h:688
const Integer & Inverse(const Integer &a) const
Inverts the element in the ring.
Definition: integer.cpp:4626
signed long ConvertToLong() const
Convert the Integer to Long.
Definition: integer.cpp:3026
const Integer & Subtract(const Integer &a, const Integer &b) const
Subtracts elements in the ring.
Definition: integer.cpp:4592
void OpenPGPDecode(const byte *input, size_t inputLen)
Decode from OpenPGP format.
Definition: integer.cpp:3502
Signedness
Used when importing and exporting integers.
Definition: integer.h:83
unsigned int MaxElementByteLength() const
Provides the maximum byte size of an element in the ring.
Definition: modarith.h:248
ASN.1 object identifiers for algorthms and schemes.
Square block cipher.
Definition: square.h:24
Classes for automatic resource management.
bool IsP4()
Determines if the CPU is an Intel P4.
Definition: cpu.h:236
bool IsNegative() const
Determines if the Integer is negative.
Definition: integer.h:339
lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: queue.h:33
Library configuration file.
MontgomeryRepresentation(const Integer &modulus)
Construct a MontgomeryRepresentation.
Definition: integer.cpp:4662
static void CRYPTOPP_API DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
Extended Division.
Definition: integer.cpp:4223
CRYPTOPP_DLL size_t CRYPTOPP_API DEREncodeOctetString(BufferedTransformation &bt, const byte *str, size_t strLen)
DER encode octet string.
Definition: asn.cpp:104
Ring of congruence classes modulo n.
Definition: modarith.h:43
Interface for random number generators.
Definition: cryptlib.h:1383
size_t BytesToWords(size_t byteCount)
Returns the number of words required for the specified number of bytes.
Definition: misc.h:895
Common C++ header files.
void Randomize(RandomNumberGenerator &rng, size_t bitCount)
Set this Integer to random integer.
Definition: integer.cpp:3517
void New(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:965
void SetByte(size_t n, byte value)
Set the n-th byte to value.
Definition: integer.cpp:3153
Integer InverseMod(const Integer &n) const
Calculate multiplicative inverse.
Definition: integer.cpp:4444
the value is negative
Definition: integer.h:77
SecBlock<byte> typedef.
Definition: secblock.h:1058
BER Sequence Decoder.
Definition: asn.h:403
Interface for buffered transformations.
Definition: cryptlib.h:1598
Support functions for word operations.
bool IsPositive() const
Determines if the Integer is positive.
Definition: integer.h:345
bool NotNegative() const
Determines if the Integer is non-negative.
Definition: integer.h:342
const Integer & Add(const Integer &a, const Integer &b) const
Adds elements in the ring.
Definition: integer.cpp:4552
static const Integer &CRYPTOPP_API One()
Integer representing 1.
Definition: integer.cpp:4891
Integer & Accumulate(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4572
P1363 key derivation function.
Definition: pubkey.h:729
byte order is little-endian
Definition: cryptlib.h:145
Sign
Used internally to represent the integer.
Definition: integer.h:73
Integer ConvertIn(const Integer &a) const
Reduces an element in the congruence class.
Definition: modarith.h:313
bool operator!() const
Negation.
Definition: integer.cpp:3100
Pointer that overloads operator ->
Definition: smartptr.h:36
bool IsUnit() const
Determine if 1 or -1.
Definition: integer.cpp:4410
unsigned int ByteCount() const
Determines the number of bytes required to represent the Integer.
Definition: integer.cpp:3350
Classes and functions for secure memory allocations.
void DEREncodeElement(BufferedTransformation &out, const Element &a) const
Encodes element in DER format.
Definition: integer.cpp:4531
Integer Modulo(const Integer &b) const
Remainder.
Definition: integer.cpp:4258
Copy input to a memory buffer.
Definition: filters.h:1137
void BERDecodeElement(BufferedTransformation &in, Element &a) const
Decodes element in DER format.
Definition: integer.cpp:4536
size_t MinEncodedSize(Signedness sign=UNSIGNED) const
Minimum number of bytes to encode this integer.
Definition: integer.cpp:3407
void ShiftWordsLeftByWords(word *r, size_t n, size_t shiftWords)
Left shift word array.
Definition: words.h:194
Integer DividedBy(const Integer &b) const
Division.
Definition: integer.cpp:4251
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: integer.cpp:4638
Integer Times(const Integer &b) const
Multiplication.
Definition: integer.cpp:4147
a number with no special properties
Definition: integer.h:93
const byte * begin() const
Pointer to the first byte in the memory block.
Definition: algparam.h:80
size_t Put(byte inByte, bool blocking=true)
Input a byte for processing.
Definition: cryptlib.h:1620
Integer & operator++()
Pre-increment.
Definition: integer.cpp:3756
AlgorithmParameters MakeParameters(const char *name, const T &value, bool throwIfNotUsed=true)
Create an object that implements NameValuePairs.
Definition: algparam.h:504
void swap(Integer &a)
Swaps this Integer with another Integer.
Definition: integer.cpp:3183
Integer()
Creates the zero integer.
Definition: integer.cpp:2972
unsigned int TrailingZeros(word32 v)
Determines the number of trailing 0-bits in a value.
Definition: misc.h:814
Integer & operator<<=(size_t n)
Left-shift Assignment.
Definition: integer.cpp:4043
bool IsZero() const
Determines if the Integer is 0.
Definition: integer.h:333
Exception thrown when an error is encountered decoding an OpenPGP integer.
Definition: integer.h:282
void Negate()
Reverse the Sign of the Integer.
Definition: integer.cpp:4350
int Compare(const Integer &a) const
Perform signed comparison.
Definition: integer.cpp:4368
bool IsDefiniteLength() const
Determine length encoding.
Definition: asn.h:283
T Crop(T value, size_t bits)
Truncates the value to the specified number of bits.
Definition: misc.h:873
void BERDecodeAsOctetString(BufferedTransformation &bt, size_t length)
Decode nonnegative value from big-endian octet string.
Definition: integer.cpp:3476
CRYPTOPP_DLL std::string IntToString< word64 >(word64 value, unsigned int base)
Converts an unsigned value to a string.
Definition: integer.cpp:4804
Application callback to signal suitability of a cabdidate prime.
Definition: nbtheory.h:113
void ConditionalSwapPointers(bool c, T &a, T &b)
Performs a branchless swap of pointers a and b if condition c is true.
Definition: misc.h:1321
static Integer CRYPTOPP_API Power2(size_t e)
Exponentiates to a power of 2.
Definition: integer.cpp:3093
Multiple precision integer with arithmetic operations.
Definition: integer.h:49
Precompiled header file.
void SetWords(word *r, word a, size_t n)
Set the value of words.
Definition: words.h:35
virtual Element Exponentiate(const Element &a, const Integer &e) const
Raises a base to an exponent in the group.
Definition: algebra.cpp:316
const Integer & Multiply(const Integer &a, const Integer &b) const
Multiplies elements in the ring.
Definition: integer.cpp:4673
a signed value
Definition: integer.h:87
size_t OpenPGPEncode(byte *output, size_t bufferSize) const
Encode absolute value in OpenPGP format.
Definition: integer.cpp:3485
const Integer & Half(const Integer &a) const
Divides an element by 2.
Definition: integer.cpp:4541
static const Integer &CRYPTOPP_API Two()
Integer representing 2.
Definition: integer.cpp:4903
virtual lword Skip(lword skipMax=LWORD_MAX)
Discard skipMax bytes from the output buffer.
Definition: cryptlib.cpp:575
bool IsPowerOf2(const T &value)
Tests whether a value is a power of 2.
Definition: misc.h:957
const Integer & Square(const Integer &a) const
Square an element in the ring.
Definition: integer.cpp:4686
Integer & operator^=(const Integer &t)
Bitwise XOR Assignment.
Definition: integer.cpp:4102
#define MEMORY_BARRIER
A memory barrier.
Definition: misc.h:262
RandomNumberType
Properties of a random integer.
Definition: integer.h:91
const char * Seed()
ConstByteArrayParameter.
Definition: argnames.h:19
void ShiftWordsRightByWords(word *r, size_t n, size_t shiftWords)
Right shift word array.
Definition: words.h:212
bool IsEven() const
Determines if the Integer is even parity.
Definition: integer.h:351
byte order is big-endian
Definition: cryptlib.h:147
const T & STDMIN(const T &a, const T &b)
Replacement function for std::min.
Definition: misc.h:602
String-based implementation of Store interface.
Definition: filters.h:1196
#define CRYPTOPP_ASSERT(exp)
Debugging and diagnostic assertion.
Definition: trap.h:69
static void CRYPTOPP_API Divide(Integer &r, Integer &q, const Integer &a, const Integer &d)
Extended Division.
Definition: integer.cpp:4205
ModularArithmetic(const Integer &modulus=Integer::One())
Construct a ModularArithmetic.
Definition: modarith.h:54
void BERDecodeError()
Raises a BERDecodeErr.
Definition: asn.h:69
Data structure used to store byte strings.
Definition: queue.h:18
Functions for CPU features and intrinsics.
virtual const Element & Gcd(const Element &a, const Element &b) const
Calculates the greatest common denominator in the ring.
Definition: algebra.cpp:56
Classes and functions for working with ANS.1 objects.
Classes for SHA-1 and SHA-2 family of message digests.
void SetBit(size_t n, bool value=1)
Set the n-th bit to value.
Definition: integer.cpp:3128
CRYPTOPP_DLL bool GetIntValue(const char *name, int &value) const
Get a named value with type int.
Definition: cryptlib.h:386
size_t GetWord16(word16 &value, ByteOrder order=BIG_ENDIAN_ORDER)
Retrieve a 16-bit word.
Definition: cryptlib.cpp:813
iterator begin()
Provides an iterator pointing to the first element in the memory block.
Definition: secblock.h:772
unsigned int BitCount() const
Determines the number of bits required to represent the Integer.
Definition: integer.cpp:3359
const char * PointerToPrimeSelector()
const PrimeSelector *
Definition: argnames.h:42
Implementation of BufferedTransformation's attachment interface.
void MessageEnd()
Signals the end of messages to the object.
Definition: asn.cpp:467
Classes and functions for number theoretic operations.
byte GetByte(size_t i) const
Provides the i-th byte of the Integer.
Definition: integer.cpp:3142
Exception thrown when division by 0 is encountered.
Definition: integer.h:55
void XorWords(word *r, const word *a, const word *b, size_t n)
XOR word arrays.
Definition: words.h:66
DER Sequence Encoder.
Definition: asn.h:435
void Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const
Encode in big-endian format.
Definition: integer.cpp:3424
Integer Squared() const
Multiply this integer by itself.
Definition: integer.h:631
T1 SaturatingSubtract1(const T1 &a, const T2 &b)
Performs a saturating subtract clamped at 1.
Definition: misc.h:1057
virtual lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: cryptlib.cpp:506
Exception thrown when a random number cannot be found that satisfies the condition.
Definition: integer.h:63
bool GenerateRandomNoThrow(RandomNumberGenerator &rng, const NameValuePairs &params=g_nullNameValuePairs)
Generate a random number.
Definition: integer.cpp:3581
Performs modular arithmetic in Montgomery representation for increased speed.
Definition: modarith.h:295
DER General Encoder.
Definition: asn.h:369
bool HasSSE2()
Determines SSE2 availability.
Definition: cpu.h:116
Integer Minus(const Integer &b) const
Subtraction.
Definition: integer.cpp:3997
void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: integer.cpp:3559
void Decode(const byte *input, size_t inputLen, Signedness sign=UNSIGNED)
Decode from big-endian byte array.
Definition: integer.cpp:3368
size_t PutWord16(word16 value, ByteOrder order=BIG_ENDIAN_ORDER, bool blocking=true)
Input a 16-bit word for processing.
Definition: cryptlib.cpp:753
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: modarith.h:327
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: integer.cpp:4649
word ShiftWordsRightByBits(word *r, size_t n, unsigned int shiftBits)
Right shift word array.
Definition: words.h:172
Multiple precision integer with arithmetic operations.
Integer Xor(const Integer &t) const
Bitwise XOR.
Definition: integer.cpp:3849
static const Integer &CRYPTOPP_API Zero()
Integer representing 0.
Definition: integer.cpp:4879
const T & STDMAX(const T &a, const T &b)
Replacement function for std::max.
Definition: misc.h:613
Euclidean domain.
Definition: algebra.h:315
void Grow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:995
void BERDecode(const byte *input, size_t inputLen)
Decode from BER format.
Definition: integer.cpp:3453
virtual size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: cryptlib.cpp:529
Class file for performing modular arithmetic.
Crypto++ library namespace.
bool GetValue(const char *name, T &value) const
Get a named value.
Definition: cryptlib.h:350
Integer & operator>>=(size_t n)
Right-shift Assignment.
Definition: integer.cpp:4055
bool GetBit(size_t i) const
Provides the i-th bit of the Integer.
Definition: integer.cpp:3117
Integer SquareRoot() const
Extract square root.
Definition: integer.cpp:4386
BER General Decoder.
Definition: asn.h:258
size_t CountWords(const word *x, size_t n)
Count the number of words.
Definition: words.h:21
word ShiftWordsLeftByBits(word *r, size_t n, unsigned int shiftBits)
Left shift word array.
Definition: words.h:149
bool IsConvertableToLong() const
Determines if the Integer is convertable to Long.
Definition: integer.cpp:3012
Object Identifier.
Definition: asn.h:166
size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: queue.cpp:301
void OrWords(word *r, const word *a, const word *b, size_t n)
OR word arrays.
Definition: words.h:120
Integer AbsoluteValue() const
Retrieve the absolute value of this integer.
Definition: integer.cpp:3176
Integer & operator&=(const Integer &t)
Bitwise AND Assignment.
Definition: integer.cpp:4069
Integer & operator-=(const Integer &t)
Subtraction Assignment.
Definition: integer.cpp:4020
unsigned int BitPrecision(const T &value)
Returns the number of bits required for a value.
Definition: misc.h:789
size_type size() const
Provides the count of elements in the SecBlock.
Definition: secblock.h:797
lword RemainingLength() const
Determine remaining length.
Definition: asn.h:291
void DEREncode(BufferedTransformation &bt) const
Encodes in DER format.
Definition: integer.cpp:4523
lword GetBits(size_t i, size_t n) const
Provides the low order bits of the Integer.
Definition: integer.cpp:3160
Integer operator-() const
Subtraction.
Definition: integer.cpp:3169
the value is positive or 0
Definition: integer.h:75
const Integer & MultiplicativeInverse(const Integer &a) const
Calculate the multiplicative inverse of an element in the ring.
Definition: integer.cpp:4712
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: modarith.h:330
bool IsOdd() const
Determines if the Integer is odd parity.
Definition: integer.h:354
virtual void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition: algebra.cpp:334
Interface for retrieving values given their names.
Definition: cryptlib.h:293
Integer ConvertOut(const Integer &a) const
Reduces an element in the congruence class.
Definition: integer.cpp:4699
void AndWords(word *r, const word *a, const word *b, size_t n)
AND word arrays.
Definition: words.h:93