# Algorithm learning notes - discretization

## Discretization of initial knowledge

Discretization

Discretization, mapping limited individuals in infinite space to limited space, so as to improve the space-time efficiency of the algorithm.

Generally speaking, discretization is to reduce the data without changing the relative size of the data.

For example:

Original data: 1999100000,15;

After treatment: 1,3,4,2;

Original data: {100200}, {2050000}, {1400};

After treatment: {3,4}, {2,6}, {1,5};

Discretization of data

Some data itself is very large, so it cannot save the corresponding attributes as the subscript of the array. If you only need the relative attributes of this pile of data, you can discretize it. When the data is only related to the relative size between them, but has nothing to do with the specific number, it can be discretized.

For the topic, when the data range is too large but the amount of data is small, you can try discretization

Operation steps

1. Store all values to be discretized
2. sort
3. Delete duplicate elements
4. The corresponding value of the index element after discretization

Examples

Suppose there is an infinite number axis, and the number on each coordinate on the number axis is 0.

Now, let's start with n operations. Each operation adds c to the number at a certain position x.

Next, make m queries. Each query contains two integers L and r. you need to find the sum of all numbers between the interval [l, r].

Input format

The first line contains two integers n and m.

Next n lines, each containing two integers x and c.

Then follow the line m, each line containing two integers l and r.

Output format

There are m lines in total, and each line outputs the sum of numbers in the interval obtained in the query.

Data range

− 1 0 9 ≤ x ≤ 1 0 9 , −10^9≤x≤10^9, −109≤x≤109, 1 ≤ n , m ≤ 1 0 5 1≤n,m≤10^5 1≤n,m≤105 − 1 0 9 ≤ l ≤ r ≤ 1 0 9 −10^9≤l≤r≤10^9 −109≤l≤r≤109 − 10000 ≤ c ≤ 10000 −10000≤c≤10000 −10000≤c≤10000
Input sample:

3 3
1 2
3 6
7 5
1 3
4 6
7 8

Output example:

8
0
5

thinking

x. l, r range is [ − 1 0 9 ， 1 0 9 ] [−10^9，10^9] [−109，109]
Obviously, you can't use arrays to simulate the number axis

And the range of n and m is
[ 1 ， 1 0 5 ] [1，10^5] [1，105]

Discretize x, l, r

The maximum space required at this time is only n + 2 ∗ m n+2*m n+2∗m
This allows you to use arrays for processing

The steps are as follows

```typedef pair<int, int> PII;

const int N = 300010;// n+2*m == 300000;

int n, m;
int a[N], s[N];//s[N] is the prefix and array of a[N]

vector<int> alls; // Store all values to be discretized

```

Next, the array to be discretized is sorted for binary search

```sort(alls.begin(), alls.end());
```

Delete duplicate elements (STL)

```alls.erase(unique(alls.begin(), alls.end()), alls.end());
//unique returns the next sequence at the end of the de duplicated sequence, which is compressed by erase
```

Implementation of unique function (double pointer)

``` vector<int>::iterator unique(vector<int> &a)
{
int j = 0;
for (int i = 0; i < a.size(); i ++ )
if (!i || a[i] != a[i - 1])
a[j ++ ] = a[i];
return a.begin() + j;
}
alls.erase(unique(alls), alls.end());//duplicate removal
```

Discretization

Writing method 1

```int find(int x) // Find the first position greater than or equal to x
{
int l = 0, r = alls.size() - 1;
while (l < r)
{
int mid = l + r >> 1;
if (alls[mid] >= x) r = mid;
else l = mid + 1;
}
return r + 1; // Map to 1, 2 n
}//Start mapping from 1 to facilitate the processing of prefixes and
```

Writing method 2

lower_bound function (this is written in the comments section of the complete code)

```int lower_bound(int *array, int size, int key)
```

Complete code

```#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

typedef pair<int, int> PII;

const int N = 300010;

int n, m;
int a[N], s[N];

vector<int> alls; // Store all values to be discretized

// Find the discretized value corresponding to x by bisection
int find(int x) // Find the first position greater than or equal to x
{
int l = 0, r = alls.size() - 1;
while (l < r)
{
int mid = l + r >> 1;
if (alls[mid] >= x) r = mid;
else l = mid + 1;
}
return r + 1; // Map to 1, 2 n
}//Start mapping from 1 to facilitate the processing of prefixes and

vector<int>::iterator unique_by(vector<int> &a) //Implementation of unique
{
int j = 0;
for (int i = 0; i < a.size(); i ++ )
if (!i || a[i] != a[i - 1])
a[j ++ ] = a[i];
// a ~ a[j - 1] non repeated numbers in all a

return a.begin() + j;
}

int main()
{
cin >> n >> m;
for (int i = 0; i < n; i ++ )
{
int x, c;
cin >> x >> c;

alls.push_back(x);//Insert the number to be discretized into the array to be discretized
}

for (int i = 0; i < m; i ++ )
{
int l, r;
cin >> l >> r;

alls.push_back(l);//Insert the number to be discretized into the array to be discretized
alls.push_back(r);
}

sort(alls.begin(), alls.end());
alls.erase(unique(alls.begin(), alls.end()), alls.end());

// Process insert
{
int x = find(item.first);//Discretized value
//int x = lower_bound(alls.begin(),alls.end(),item.first)-alls.begin()+1;
a[x] += item.second;
}

// Preprocessing prefix and
for (int i = 1; i <= alls.size(); i ++ ) s[i] = s[i - 1] + a[i];

// Handling inquiries
{
int l = find(item.first), r = find(item.second);
//int l = lower_bound(alls.begin(),alls.end(),item.first)-alls.begin()+1;
//int r = lower_bound(alls.begin(),alls.end(),item.second)-alls.begin()+1;
cout << s[r] - s[l - 1] << endl;
}
return 0;
}

```

Tags: C++ Algorithm

Posted by rcarr on Mon, 18 Apr 2022 13:58:28 +0930