[Zhengrui 2019 Summer Training Camp] Zhengrui 889 Small D and Computing


It is equivalent to negating the first one . You can first invert the register \(1\) and put it in the register \(2\). Then, for the register \(2\), first shift left by \(63\), and then shift right by \(63\). Using natural overflow is equivalent to retaining only the lowest bit.

Requires \(3\) operations.


Consider three registers \(x\), \(y\), \(z\). When we say that "calculation" is performed on \(x\), \(y\), \(z\), what we mean is actually operating on the corresponding values ​​in these three registers.

We ask for \(x+y\) (that is, add the corresponding value in the register numbered \(x\) and the register numbered \(y\)).

Consider first the addition without carry, that is, the exclusive OR. Shilling\(z:=x\operatorname{xor}y\). Then consider the carry part, which is equal to \((x\operatorname{and}y)\ll 1\). Let \(y:=(x\operatorname{and}y)\ll 1\). We were pleasantly surprised to find that the problem was transformed into the addition of \(y\) and \(z\)! And, every time \(y\) will be shifted to the left by \(1\), then after at most \(64\) times, \(y\) will become \(0\), and the addition will be completed. !

For each round of operation, consider whether to use \(x+y\) or \(z+y\) according to the parity of the current number of rounds. Equivalent to every other round, the meaning of the two numbers \(x\), \(z\) will be exchanged. The advantage of this is to avoid me having to force set x z at the end of each round, which is one more operation. Because the total number of rounds \(64\) is an even number, you will find that the final answer must be stored in the position of \(z\).

Because each round needs to use xor, and, shl a total of three operations, so a total of \(3\times64=192\) operations are used. May actually be more or less (eg the last left shift may not be done).


Consider each bit in turn (shift left \(1\) bit each time, then \(\operatorname{and} 1\). This \(1\) can be preprocessed in advance, the method is not t t, shr t 63) . For the current bit, the result is a \(0\) or \(1\) number. Call the addition in task2 and add this number to the answer. Each addition does not need to add \(64\) bits, because the answer number is very small, and only needs to be added to the current possible highest bit.

It takes about \(1100\) operations.


Consider the highest bit of \(x\operatorname{xor}y\), which is the first distinct bit of \(x\), \(y\). If we can only keep the number on this digit, it may be recorded as \(z\). Then use \ (z\operatorname{and}x\), if the result is not \ (0\), then \ (x>y\), otherwise \ (x\leq y\)

So the problem turns into how to keep only the highest bit of a number.

Consider the following operations:


After such \(6\) operations, the effect achieved is: \(x\) below the highest bit, all are \(1\). The principle is actually the doubling method: first turn the next bit of the highest bit into \(1\), then use these two bits together to turn the next two bits into \(1\), and so on. In addition, although it seems to be a \(6\) step operation, in actual implementation, each step operation requires \(3\) instructions, so this process requires a total of \(18\) instructions.

Using this method, we can first turn the highest bit of \(z\) into \(1\). Then, we use three operations, let \(z:=z\operatorname{xor}(z\gg1)\). This is equivalent to retaining only the highest bit of \(z\). Take the new \(z\) and \(x\) \(\operatorname{and}\).

Finally, to determine whether the result is \(0\), you can use the multiplication method again to push the highest bit\(1\) (if there is any) to the lowest bit, then shift left by \(63\) and then right \(63\) bits, you can only retain the value of the lowest bit.

A total of \(43\) operations are required.


We still consider each one in turn. Maintain a current answer. If the current bit is \(1\), let: \(\text{ans}:=\text{ans}\times2+1\).

This is equivalent to implementing a ternary operator: if the condition is true, make \(\text{ans}\) equal to \(x\), otherwise make \(\text{ans}\) equal to \(y\ ) (here \(x\), \(y\), are general expressions. Here it is equal to \(\text{ans}\times2+1\) and \(\text{ans}\)). However, we do not have the \(\texttt{if}\) statement, how to implement the ternary operator?

Consider that if the condition is true, construct a number with all \(1\) (that is, \(2^{64}-1\)), which can be achieved by the multiplication method in task4; when the condition is not true, This number is naturally all \(0\). Then use the constructed number (all \(1\) or all \(0\)) to \(\operatorname{and} x\). To negate it again, go to \(\operatorname{and}y\). It is found that one of these two results must be \(0\) and the other is the value we want, so they are \(\operatorname{or}\) and assigned to \(\text{ans}\) That's it.

In this task, this "condition is true" is equivalent to whether the number on the current bit is \(1\). So specifically, the number on the current bit can be filled with all the bits by the multiplication method.

This is just a general idea. If it is implemented in a simple way, the number of operations is relatively large (about \(1600\sim 1800\)), and some optimizations need to be done, for example:

  • Considering all binary bits from small to large, for the \(i\)th bit, the number of digits of \(\text{ans}\) must be less than or equal to \(i\). So you don't need to fill all \(64\) binary bits, just fill the first \(i\) bits.
  • It is found that the two numbers \(x\), \(y\) to be selected by the ternary operation differ only by one binary digit. So there is no need to negate the condition and then \(\operatorname{and}\). Just \(\operatorname{and}\) the larger number (\(\text{ans}\times2+1\)) first, and then put the two \(\operatorname{or}\) together.


You can choose either "bubble sort" or "selection sort". The core is to implement: if (x>y) swap (x, y)

The comparison can be implemented with task4. Then use the multiplication method to cover all the bits of the comparison result, and record this result as \(t\) (\(t\in\{0,2^{64-1}\}\)).

After having the comparison result, the rest is equivalent to a ternary operator. However, the naive implementation still has too many operations (\(7\) times). Consider the properties of the "swap" operation. This can be done with \(\operatorname{xor}\). First make a \(z=x\operatorname{xor}y\). Then if you need to exchange, it is equivalent to XOR both numbers on \(z\). So you can make \(z:=z\operatorname{and}t\). Then put \(x\), \(y\) respectively \(\operatorname{xor}z\). A total of \(4\) operations are required.

I realized it, a total of \(2268\) operations. A larger constant should not exceed \(2400\).

code to make answer

#include <bits/stdc++.h>
using namespace std;

#define pb push_back
#define mk make_pair
#define lob lower_bound
#define upb upper_bound
#define fi first
#define se second
#define SZ(x) ((int)(x).size())

typedef unsigned int uint;
typedef long long ll;
typedef unsigned long long ull;
typedef pair<int,int> pii;

template<typename T>inline void ckmax(T& x,T y){x=(y>x?y:x);}
template<typename T>inline void ckmin(T& x,T y){x=(y<x?y:x);}

int get_highbit(ull x){
	int ans=0;
	return ans;

int add(int xpos=1,int ypos=2,int anspos=3,int highbit=64) {
	//no need for a[anspos]=0
	//Do not restore a[xpos],a[ypos]
	int x=xpos;
	int y=ypos;
	int z=anspos;
	int cnt=0;
	for(int i=1;i<=highbit;++i){
		cout<<"xor "<<z<<" "<<x<<" "<<y<<endl;++cnt;
		cout<<"and "<<y<<" "<<x<<" "<<y<<endl;++cnt;
			cout<<"shl "<<y<<" "<<1<<endl;++cnt;
		cout<<"set "<<anspos<<" "<<x<<endl;++cnt;
//	if(highbit!=64){
//		cout<<"shl "<<anspos<<" "<<64-highbit<<endl;++cnt;
//		cout<<"shr "<<anspos<<" "<<64-highbit<<endl;++cnt;
//	}
	return cnt;
int popcnt(int pos=1,int anspos=2){
	//Default a[anspos]=0
	//do not restore a[pos]
	int cnt=0;
	int t1=40;
	cout<<"not "<<t1<<" "<<t1<<endl;++cnt;
	cout<<"shr "<<t1<<" "<<63<<endl;++cnt;
	int v=39;
	int s1=38;
	int s2=anspos;
	for(int i=1;i<=64;++i){
		cout<<"and "<<v<<" "<<pos<<" "<<t1<<endl;++cnt;
		//cout<<"set "<<s1<<" "<<anspos<<endl;++cnt;
			cout<<"shr "<<pos<<" "<<1<<endl;++cnt;
		cout<<"set "<<anspos<<" "<<s1<<endl;++cnt;
	return cnt;
int push_highbit(int pos,int highbit=64){
	//Make everything below the highest bit to 1
	int cnt=0;
	int t=20;
	for(int i=1;i<highbit;i<<=1){
		cout<<"set "<<t<<" "<<pos<<endl;++cnt;
		cout<<"shr "<<t<<" "<<i<<endl;++cnt;
		cout<<"or "<<pos<<" "<<pos<<" "<<t<<endl;++cnt;
	return cnt;
int compare(int xpos=1,int ypos=2,int anspos=3){
	//no need for a[anspos]=0
	//does not change a[xpos],a[ypos]
	int cnt=0;
	int t1=40;
	int t2=39;
	cout<<"xor "<<t1<<" "<<xpos<<" "<<ypos<<endl;++cnt;
	cout<<"set "<<t2<<" "<<t1<<endl;++cnt;
	cout<<"shr "<<t2<<" "<<1<<endl;++cnt;
	cout<<"xor "<<t1<<" "<<t1<<" "<<t2<<endl;++cnt;//Now only the highest bit of t1 is 1
	cout<<"and "<<anspos<<" "<<t1<<" "<<xpos<<endl;++cnt;
	//You can comment out in task6:
	cout<<"shl "<<anspos<<" "<<63<<endl;++cnt;
	cout<<"shr "<<anspos<<" "<<63<<endl;++cnt;
	return cnt;
int pow_of_popcount(int pos=1,int anspos=2){
	//Default a[anspos]=0
	int cnt=0;
	int flagpos=40;
	//cout<<"shl "<<flagpos<<" "<<64<<endl;++cnt;
	cout<<"not "<<flagpos<<" "<<flagpos<<endl;++cnt;
	cout<<"shr "<<flagpos<<" "<<63<<endl;++cnt;
	int t1=39;
	int t2=38;
	int t3=37;
	cout<<"set "<<t1<<" "<<flagpos<<endl;++cnt;
	for(int i=1;i<=64;++i){
			cout<<"set "<<t2<<" "<<anspos<<endl;++cnt;
			cout<<"shl "<<t2<<" "<<1<<endl;++cnt;
		cout<<"or "<<t2<<" "<<t2<<" "<<t1<<endl;++cnt;//t2=anspos<<1|1
		cout<<"and "<<t3<<" "<<pos<<" "<<flagpos<<endl;++cnt;
		cout<<"and "<<t2<<" "<<t2<<" "<<t3<<endl;++cnt;
		cout<<"or "<<anspos<<" "<<anspos<<" "<<t2<<endl;++cnt;
			cout<<"shl "<<flagpos<<" "<<1<<endl;++cnt;
	return cnt;
int push_lowbit(int pos){
	int cnt=0;
	int t=20;
	for(int i=1;i<=32;i<<=1){
		cout<<"set "<<t<<" "<<pos<<endl;++cnt;
		cout<<"shl "<<t<<" "<<i<<endl;++cnt;
		cout<<"or "<<pos<<" "<<pos<<" "<<t<<endl;++cnt;
	return cnt;
int bubble_sort(){
	int cnt=0;
	int t1=10;
	int t2=11;
	for(int i=1;i<9;++i){
		for(int j=i+1;j<=9;++j){
			cout<<"xor "<<t2<<" "<<j<<" "<<i<<endl;++cnt;
			cout<<"and "<<t2<<" "<<t2<<" "<<t1<<endl;++cnt;
			cout<<"xor "<<i<<" "<<i<<" "<<t2<<endl;++cnt;
			cout<<"xor "<<j<<" "<<j<<" "<<t2<<endl;++cnt;
//			cout<<"and "<<t2<<" "<<t1<<" "<<j<<endl;++cnt;
//			cout<<"and "<<t3<<" "<<t1<<" "<<i<<endl;++cnt;
//			cout<<"not "<<t1<<" "<<t1<<endl;++cnt;
//			cout<<"and "<<t4<<" "<<t1<<" "<<j<<endl;++cnt;
//			cout<<"and "<<t5<<" "<<t1<<" "<<i<<endl;++cnt;
//			cout<<"or "<<j<<" "<<t3<<" "<<t4<<endl;++cnt;
//			cout<<"or "<<i<<" "<<t2<<" "<<t5<<endl;++cnt;
	return cnt;

void task1(){
	cout<<"not "<<2<<" "<<1<<endl;
	cout<<"shl "<<2<<" "<<63<<endl;
	cout<<"shr "<<2<<" "<<63<<endl;
	cerr<<"cnt "<<3<<endl;
void task2(){
	int cnt=add();
	cerr<<"cnt "<<cnt<<endl;
void task3(){
	int cnt=popcnt();
	cerr<<"cnt "<<cnt<<endl;
void task4(){
	int cnt=compare();
	cerr<<"cnt "<<cnt<<endl;
void task5(){
	int cnt=pow_of_popcount();
	cerr<<"cnt "<<cnt<<endl;
void task6(){
	int cnt=bubble_sort();
	cerr<<"cnt "<<cnt<<endl;
int main() {
	return 0;

Posted by e39m5 on Thu, 30 Jun 2022 21:16:19 +0930