# [Zhengrui 2019 Summer Training Camp] Zhengrui 889 Small D and Computing

It is equivalent to negating the first one . You can first invert the register $$1$$ and put it in the register $$2$$. Then, for the register $$2$$, first shift left by $$63$$, and then shift right by $$63$$. Using natural overflow is equivalent to retaining only the lowest bit.

Requires $$3$$ operations.

Consider three registers $$x$$, $$y$$, $$z$$. When we say that "calculation" is performed on $$x$$, $$y$$, $$z$$, what we mean is actually operating on the corresponding values ​​in these three registers.

We ask for $$x+y$$ (that is, add the corresponding value in the register numbered $$x$$ and the register numbered $$y$$).

Consider first the addition without carry, that is, the exclusive OR. Shilling$$z:=x\operatorname{xor}y$$. Then consider the carry part, which is equal to $$(x\operatorname{and}y)\ll 1$$. Let $$y:=(x\operatorname{and}y)\ll 1$$. We were pleasantly surprised to find that the problem was transformed into the addition of $$y$$ and $$z$$! And, every time $$y$$ will be shifted to the left by $$1$$, then after at most $$64$$ times, $$y$$ will become $$0$$, and the addition will be completed. !

For each round of operation, consider whether to use $$x+y$$ or $$z+y$$ according to the parity of the current number of rounds. Equivalent to every other round, the meaning of the two numbers $$x$$, $$z$$ will be exchanged. The advantage of this is to avoid me having to force set x z at the end of each round, which is one more operation. Because the total number of rounds $$64$$ is an even number, you will find that the final answer must be stored in the position of $$z$$.

Because each round needs to use xor, and, shl a total of three operations, so a total of $$3\times64=192$$ operations are used. May actually be more or less (eg the last left shift may not be done).

Consider each bit in turn (shift left $$1$$ bit each time, then $$\operatorname{and} 1$$. This $$1$$ can be preprocessed in advance, the method is not t t, shr t 63) . For the current bit, the result is a $$0$$ or $$1$$ number. Call the addition in task2 and add this number to the answer. Each addition does not need to add $$64$$ bits, because the answer number is very small, and only needs to be added to the current possible highest bit.

It takes about $$1100$$ operations.

Consider the highest bit of $$x\operatorname{xor}y$$, which is the first distinct bit of $$x$$, $$y$$. If we can only keep the number on this digit, it may be recorded as $$z$$. Then use \ (z\operatorname{and}x\), if the result is not \ (0\), then \ (x>y\), otherwise \ (x\leq y\)

So the problem turns into how to keep only the highest bit of a number.

Consider the following operations:

x|=(x>>1);
x|=(x>>2);
x|=(x>>4);
x|=(x>>8);
x|=(x>>16);
x|=(x>>32);


After such $$6$$ operations, the effect achieved is: $$x$$ below the highest bit, all are $$1$$. The principle is actually the doubling method: first turn the next bit of the highest bit into $$1$$, then use these two bits together to turn the next two bits into $$1$$, and so on. In addition, although it seems to be a $$6$$ step operation, in actual implementation, each step operation requires $$3$$ instructions, so this process requires a total of $$18$$ instructions.

Using this method, we can first turn the highest bit of $$z$$ into $$1$$. Then, we use three operations, let $$z:=z\operatorname{xor}(z\gg1)$$. This is equivalent to retaining only the highest bit of $$z$$. Take the new $$z$$ and $$x$$ $$\operatorname{and}$$.

Finally, to determine whether the result is $$0$$, you can use the multiplication method again to push the highest bit$$1$$ (if there is any) to the lowest bit, then shift left by $$63$$ and then right $$63$$ bits, you can only retain the value of the lowest bit.

A total of $$43$$ operations are required.

We still consider each one in turn. Maintain a current answer. If the current bit is $$1$$, let: $$\text{ans}:=\text{ans}\times2+1$$.

This is equivalent to implementing a ternary operator: if the condition is true, make $$\text{ans}$$ equal to $$x$$, otherwise make $$\text{ans}$$ equal to $$y\ ) (here \(x$$, $$y$$, are general expressions. Here it is equal to $$\text{ans}\times2+1$$ and $$\text{ans}$$). However, we do not have the $$\texttt{if}$$ statement, how to implement the ternary operator?

Consider that if the condition is true, construct a number with all $$1$$ (that is, $$2^{64}-1$$), which can be achieved by the multiplication method in task4; when the condition is not true, This number is naturally all $$0$$. Then use the constructed number (all $$1$$ or all $$0$$) to $$\operatorname{and} x$$. To negate it again, go to $$\operatorname{and}y$$. It is found that one of these two results must be $$0$$ and the other is the value we want, so they are $$\operatorname{or}$$ and assigned to $$\text{ans}$$ That's it.

In this task, this "condition is true" is equivalent to whether the number on the current bit is $$1$$. So specifically, the number on the current bit can be filled with all the bits by the multiplication method.

This is just a general idea. If it is implemented in a simple way, the number of operations is relatively large (about $$1600\sim 1800$$), and some optimizations need to be done, for example:

• Considering all binary bits from small to large, for the $$i$$th bit, the number of digits of $$\text{ans}$$ must be less than or equal to $$i$$. So you don't need to fill all $$64$$ binary bits, just fill the first $$i$$ bits.
• It is found that the two numbers $$x$$, $$y$$ to be selected by the ternary operation differ only by one binary digit. So there is no need to negate the condition and then $$\operatorname{and}$$. Just $$\operatorname{and}$$ the larger number ($$\text{ans}\times2+1$$) first, and then put the two $$\operatorname{or}$$ together.

You can choose either "bubble sort" or "selection sort". The core is to implement: if (x>y) swap (x, y)

The comparison can be implemented with task4. Then use the multiplication method to cover all the bits of the comparison result, and record this result as $$t$$ ($$t\in\{0,2^{64-1}\}$$).

After having the comparison result, the rest is equivalent to a ternary operator. However, the naive implementation still has too many operations ($$7$$ times). Consider the properties of the "swap" operation. This can be done with $$\operatorname{xor}$$. First make a $$z=x\operatorname{xor}y$$. Then if you need to exchange, it is equivalent to XOR both numbers on $$z$$. So you can make $$z:=z\operatorname{and}t$$. Then put $$x$$, $$y$$ respectively $$\operatorname{xor}z$$. A total of $$4$$ operations are required.

I realized it, a total of $$2268$$ operations. A larger constant should not exceed $$2400$$.

//problem:ZR889
#include <bits/stdc++.h>
using namespace std;

#define pb push_back
#define mk make_pair
#define lob lower_bound
#define upb upper_bound
#define fi first
#define se second
#define SZ(x) ((int)(x).size())

typedef unsigned int uint;
typedef long long ll;
typedef unsigned long long ull;
typedef pair<int,int> pii;

template<typename T>inline void ckmax(T& x,T y){x=(y>x?y:x);}
template<typename T>inline void ckmin(T& x,T y){x=(y<x?y:x);}

int get_highbit(ull x){
int ans=0;
while(x)ans++,x>>=1;
return ans;
}

int add(int xpos=1,int ypos=2,int anspos=3,int highbit=64) {
//no need for a[anspos]=0
//Do not restore a[xpos],a[ypos]
int x=xpos;
int y=ypos;
int z=anspos;

int cnt=0;
for(int i=1;i<=highbit;++i){
cout<<"xor "<<z<<" "<<x<<" "<<y<<endl;++cnt;
cout<<"and "<<y<<" "<<x<<" "<<y<<endl;++cnt;
swap(x,z);
if(i<highbit){
cout<<"shl "<<y<<" "<<1<<endl;++cnt;
}
}
if(x!=anspos){
cout<<"set "<<anspos<<" "<<x<<endl;++cnt;
}
//	if(highbit!=64){
//		cout<<"shl "<<anspos<<" "<<64-highbit<<endl;++cnt;
//		cout<<"shr "<<anspos<<" "<<64-highbit<<endl;++cnt;
//	}
return cnt;
}
int popcnt(int pos=1,int anspos=2){
//Default a[anspos]=0
//do not restore a[pos]
int cnt=0;
int t1=40;
cout<<"not "<<t1<<" "<<t1<<endl;++cnt;
cout<<"shr "<<t1<<" "<<63<<endl;++cnt;
int v=39;
int s1=38;
int s2=anspos;
for(int i=1;i<=64;++i){
cout<<"and "<<v<<" "<<pos<<" "<<t1<<endl;++cnt;
//cout<<"set "<<s1<<" "<<anspos<<endl;++cnt;
swap(s1,s2);
if(i<64){
cout<<"shr "<<pos<<" "<<1<<endl;++cnt;
}
}
if(s1!=anspos){
cout<<"set "<<anspos<<" "<<s1<<endl;++cnt;
}
return cnt;
}
int push_highbit(int pos,int highbit=64){
//anspos=pos
//Make everything below the highest bit to 1
int cnt=0;
int t=20;
for(int i=1;i<highbit;i<<=1){
cout<<"set "<<t<<" "<<pos<<endl;++cnt;
cout<<"shr "<<t<<" "<<i<<endl;++cnt;
cout<<"or "<<pos<<" "<<pos<<" "<<t<<endl;++cnt;
}
return cnt;
}
int compare(int xpos=1,int ypos=2,int anspos=3){
//no need for a[anspos]=0
//does not change a[xpos],a[ypos]
int cnt=0;
int t1=40;
int t2=39;
cout<<"xor "<<t1<<" "<<xpos<<" "<<ypos<<endl;++cnt;
cnt+=push_highbit(t1);

cout<<"set "<<t2<<" "<<t1<<endl;++cnt;
cout<<"shr "<<t2<<" "<<1<<endl;++cnt;
cout<<"xor "<<t1<<" "<<t1<<" "<<t2<<endl;++cnt;//Now only the highest bit of t1 is 1

cout<<"and "<<anspos<<" "<<t1<<" "<<xpos<<endl;++cnt;
cnt+=push_highbit(anspos);
//You can comment out in task6:
cout<<"shl "<<anspos<<" "<<63<<endl;++cnt;
cout<<"shr "<<anspos<<" "<<63<<endl;++cnt;
return cnt;
}
int pow_of_popcount(int pos=1,int anspos=2){
//Default a[anspos]=0
int cnt=0;
int flagpos=40;
//cout<<"shl "<<flagpos<<" "<<64<<endl;++cnt;
cout<<"not "<<flagpos<<" "<<flagpos<<endl;++cnt;
cout<<"shr "<<flagpos<<" "<<63<<endl;++cnt;
int t1=39;
int t2=38;
int t3=37;
cout<<"set "<<t1<<" "<<flagpos<<endl;++cnt;
for(int i=1;i<=64;++i){
if(i!=1){
cout<<"set "<<t2<<" "<<anspos<<endl;++cnt;
cout<<"shl "<<t2<<" "<<1<<endl;++cnt;
}
cout<<"or "<<t2<<" "<<t2<<" "<<t1<<endl;++cnt;//t2=anspos<<1|1

cout<<"and "<<t3<<" "<<pos<<" "<<flagpos<<endl;++cnt;
cnt+=push_highbit(t3,i);
cout<<"and "<<t2<<" "<<t2<<" "<<t3<<endl;++cnt;

cout<<"or "<<anspos<<" "<<anspos<<" "<<t2<<endl;++cnt;

if(i<64){
cout<<"shl "<<flagpos<<" "<<1<<endl;++cnt;
}
}
return cnt;
}
int push_lowbit(int pos){
int cnt=0;
int t=20;
for(int i=1;i<=32;i<<=1){
cout<<"set "<<t<<" "<<pos<<endl;++cnt;
cout<<"shl "<<t<<" "<<i<<endl;++cnt;
cout<<"or "<<pos<<" "<<pos<<" "<<t<<endl;++cnt;
}
return cnt;
}
int bubble_sort(){
int cnt=0;
int t1=10;
int t2=11;
for(int i=1;i<9;++i){
for(int j=i+1;j<=9;++j){
cnt+=compare(i,j,t1);
cnt+=push_lowbit(t1);
cout<<"xor "<<t2<<" "<<j<<" "<<i<<endl;++cnt;
cout<<"and "<<t2<<" "<<t2<<" "<<t1<<endl;++cnt;
cout<<"xor "<<i<<" "<<i<<" "<<t2<<endl;++cnt;
cout<<"xor "<<j<<" "<<j<<" "<<t2<<endl;++cnt;

//			cout<<"and "<<t2<<" "<<t1<<" "<<j<<endl;++cnt;
//			cout<<"and "<<t3<<" "<<t1<<" "<<i<<endl;++cnt;
//			cout<<"not "<<t1<<" "<<t1<<endl;++cnt;
//			cout<<"and "<<t4<<" "<<t1<<" "<<j<<endl;++cnt;
//			cout<<"and "<<t5<<" "<<t1<<" "<<i<<endl;++cnt;
//
//			cout<<"or "<<j<<" "<<t3<<" "<<t4<<endl;++cnt;
//			cout<<"or "<<i<<" "<<t2<<" "<<t5<<endl;++cnt;
}
}
return cnt;
}

freopen("calculate1.ans","w",stdout);
cout<<"not "<<2<<" "<<1<<endl;
cout<<"shl "<<2<<" "<<63<<endl;
cout<<"shr "<<2<<" "<<63<<endl;
cerr<<"cnt "<<3<<endl;
}
freopen("calculate2.ans","w",stdout);
cerr<<"cnt "<<cnt<<endl;
}
freopen("calculate3.ans","w",stdout);
int cnt=popcnt();
cerr<<"cnt "<<cnt<<endl;
}
freopen("calculate4.ans","w",stdout);
int cnt=compare();
cerr<<"cnt "<<cnt<<endl;
}
freopen("calculate5.ans","w",stdout);
int cnt=pow_of_popcount();
cerr<<"cnt "<<cnt<<endl;
}