[C++] Simulation implementation of string

Table of contents

1. The difference between std::swap and std::string::swap

Second, the default constructor of string

1. Constructor

2. Copy construction

3. Assignment operator overloading

4. Destructor

3. Small interface in string

Fourth, the realization of the traversal interface

1. Overload operator[]

2. Iterator

Five, reserve and resize

6. Insert, delete, find related interfaces

1,push_back,append,+=

2. insert and ear

3,find

7. Stream Insertion and Stream Extraction

8. The overall code of the string implemented by the simulation

1. The difference between std::swap and std::string::swap

If you use std::swap to exchange two string objects, there will be 1 construction and 2 assignments, that is, three deep copies; while using std::string::swap only exchange members, the cost is relatively small.

Second, the default constructor of string

1. Constructor

string(const char* s = "")
{
    _size = strlen(s);//Neither _size nor _capacity contains '\0'
    _capacity = _size;
    _arr = new char[_size + 1];
    memcpy(_arr, s, _size + 1);
}

The constructor uses the default value, which can satisfy the construction of the empty string.

Here, both _size and _capacity do not contain '\0'. The space of _arr is one more than new, which is used to store '\0'.

Then copy the memory of the formal parameter to _arr to complete the construction.

2. Copy construction

Writing method 1: Honestly perform copy construction based on the private variable of the string object.

string(const string& s)
{
    _size = s._size;//Neither _size nor _capacity contains '\0'
    _capacity = s._capacity;
    _arr = new char[_capacity + 1];
    memcpy(_arr, s._arr, _capacity + 1);
}

Writing method 2: By constructing a temporary object, exchange all the private variables of this temporary object with the private variables of *this.

Note that copy construction needs to initialize _arr to nullptr first to prevent subsequent tmp from getting random addresses. (tmp destruction will call the destructor, and the destructor of a random address space will crash)

void swap(string& s)
{
    std::swap(_arr, s._arr);
    std::swap(_size, s._size);
    std::swap(_capacity, s._capacity);
}
string(const string& s)
    :_arr(nullptr)//Prevent tmp._arr from being a random value after the exchange, and the destructor is wrong
{
    string tmp(s.c_str());//structure
    swap(tmp);
}

3. Assignment operator overloading

Writing method 1: The same honest person writing method. This way of writing should prevent yourself from assigning a value to yourself!

string& operator=(const string& s)
{
    if (this != &s)//prevent self-assignment
    {
        _size = s._size;
        _capacity = s._capacity;
        char* tmp = new char[_capacity + 1];
        delete[] _arr;
        _arr = tmp;
        memcpy(_arr, s._arr, _capacity + 1);
    }
    return *this;
}

Writing method 2: Complete the assignment by constructing a temporary variable tmp. This way of writing does not need to worry about assigning a value to itself, and _arr does not need to be initialized to nullptr. the

void swap(string& s)
{
    std::swap(_arr, s._arr);
    std::swap(_size, s._size);
    std::swap(_capacity, s._capacity);
}
string& operator=(const string& s)
{
    string tmp(s.c_str());//structure
    swap(tmp);
    return *this;
}

4. Destructor

~string()
{
    _size = _capacity = 0;
    delete[] _arr;
    _arr = nullptr;
}

3. Small interface in string

//string's size() interface
size_t size()const//The right const modifies *this, so both const and non-const objects can be called
{
    return _size;
}
//c_str() interface of string
const char* c_str()const
{
    return _arr;
}
//The capacity() interface of string
size_t capacity()const
{
    return _capacity;
}
//string's clear() interface
void clear()
{
    _arr[0] = '\0';
    _size = 0;
}
//string null
bool empty()const
{
    return _size == 0 ? false : true;
}

If the function parameter does not change, add const modification without thinking.

Only pointers and references have const permission issues.

Fourth, the realization of the traversal interface

1. Overload operator[]

char& operator[](size_t pos)//Ordinary object, readable and writable
{
    assert(pos < _size);
    return _arr[pos];
}
const char& operator[](size_t pos)const//const object, read-only
{
    assert(pos < _size);
    return _arr[pos];
}

To allow subscripted access to strings, two operator[] functions need to be overloaded. Normal objects can be read and written, and const objects can be read-only.

2. Iterator

typedef char* iterator;
iterator begin()
{
    return _arr;
}
iterator end()//end points to '\0' of the string
{
    return _arr + _size;
}

The iterator of the string is a character pointer, and the iterator can be used to access and modify it after writing the iterator.

The bottom layer of the range for is also an iterator, but the bottom layer of the range for only recognizes begin() and end(). If the name of the iterator interface that you implement does not match, the range for will not work.

Five, reserve and resize

//The reserve interface of sring, if the pre-opened space is smaller than the existing space, the capacity will not be changed.
void reserve(size_t n = 0)
{
    if (n + 1 > _capacity)
    {
        char* tmp = new char[n + 1];
        memset(tmp, '\0', n + 1);
        memcpy(tmp, _arr, _size);
        delete[] _arr;
        _arr = tmp;
        _capacity = n;
    }
}
//The resize interface of sring
void resize(size_t n, char c)
{
    //Determine the size of n
    if (n > _capacity)
    {
        reserve(n);
        memset(_arr + _size, c, n - _size);
        _size = n;
    }
    else
    {
        _arr[n] = '\0';
        _size = n;
    }
}

reserve is capacity expansion, which can be used to pre-open space to prevent frequent space applications. Apply for a space of n+1 size, initialize all the space with '\0', then copy the data in _arr to tmp, release _arr, and _arr points to tmp.

In resize, the problem of _size expansion and shrinkage needs to be considered.

6. Insert, delete, find related interfaces

1,push_back,append,+=

string& push_back(const char c)
{
    //Judgment capacity
    if (_size == _capacity)
    {
        size_t newCapacity = _capacity == 0 ? 4 : _capacity * 2;//Prevent empty strings from appearing
        reserve(newCapacity);
    }
    _arr[_size++] = c;
    return *this;
}
string& append(const char* s)
{
    //Judgment capacity
    size_t len = strlen(s);
    if (_size + len > _capacity)
    {
        reserve(_size + len);
    }
    strcpy(_arr + _size, s);
    _size += len;
    return *this;
}
string& operator+=(const char c)
{
    push_back(c);
    return *this;
}
string& operator+=(const char* s)
{
    append(s);
    return *this;
}

Writing push_back should take into account the fact that the original object is an empty string (that is, _capacity is 0).

+= can reuse push_back and append.

2. insert and ear

string& insert(size_t pos, char c)
{
    assert(pos < _size);
    //Judgment capacity
    if (_size == _capacity)
    {
        reserve(_capacity + 1);
    }
    //move data
    for (size_t i = _size; i > pos; --i)
    {
        _arr[i] = _arr[i - 1];
    }
    _arr[pos] = c;
    ++_size;
    return *this;
}
string& insert(size_t pos, const char* s)
{
    size_t len = strlen(s);
    //Judgment capacity
    if (len + _size > _capacity)
    {
        reserve(len + _size);
    }
    //move data
    for (size_t i = _size + len; i > pos + len - 1; --i)
    {
        _arr[i] = _arr[i - len];
    }
    memcpy(_arr + pos, s, len);
    _size += len;
    return *this;
}
string& earse(size_t pos, size_t len = npos)
{
    assert(pos < _size);
    //First judge the situation of deletion
    if (len == npos || pos + len >= _size)
    {
        _arr[pos] = '\0';
        _size = pos;
    }
    else
    {
        memcpy(_arr + pos, _arr + pos + len, _size - pos - len);
        _size -= len;
    }
    return *this;
}

When the insert interface moves data, it starts to overwrite from the last (the last len) position of the last element, which can ensure that the size_t type does not go out of bounds.

For the earse interface, it is necessary to classify and discuss whether the string is deleted to the end.

Note that this pos is a const static member. In C++ syntax, only pointers and integer const static members can be initialized in a class.

3,find

size_t find(const char c, size_t pos = 0)const
{
    assert(pos < _size);
    for (size_t i = pos; i < _size; ++i)
    {
        if (_arr[i] == c)
        {
            return i;
        }
    }
    return npos;
}
size_t find(const char* s, size_t pos = 0)const
{
    assert(pos < _size);
    const char* p = strstr(_arr, s);
    if (p != nullptr)
    {
        return _arr - p;
    }
    return npos;
}

Find a character or string from the specified position, if found, return the subscript of the first matching character/substring.

7. Stream Insertion and Stream Extraction

//Overloads for stream insertion and stream extraction for custom types of input and output
inline ostream& operator<<(ostream& out, const string& s)//The access here is private, so you don’t need to write it as a friend function
{
    for (size_t i = 0; i < s.size(); ++i)//Stream insertion is printed according to _size, and c_str finds '\0' to end printing
    {                                    //For example, if I insert a '\0' in the middle of the string, the printing result is different
        out << s[i];
    }
    return out;
}
inline istream& operator>>(istream& in, string& s)
{
    s.clear();//Clear s before using
    //in >> c;//stream extraction will not recognize spaces and newlines
    char c = in.get();
    char buff[128] = { '\0' };//Prevent frequent expansion
    size_t i = 0;
    while (c != ' ' && c != '\n')
    {
        if (i == 127)
        {
            s += buff;
            i = 0;
        }
        buff[i++] = c;
        c = in.get();
    }
    if (i > 0)
    {
        buff[i] = '\0';
        s += buff;
    }
    return in;
}

Because string provides access to private interfaces, stream insertion and stream extraction do not need to be overloaded as friend functions of the string class.

For stream extraction, frequent tail insertion will cause frequent capacity expansion. Moreover, the expansion of C++ is different from the expansion of C language. Using new in C++ cannot expand the capacity in situ, but can only be expanded in different places. The expansion in different places will lead to the opening of new space, copying of data, and release of old space. In order to prevent frequent expansion, we can create an array that can store 128 bytes, operate in this array, and insert it into the object s when the array is full.

Why can't getline be used, but one character at a time should be inserted at the end? Because stream extraction encounters a space and '\n', it will end the extraction, and the remaining data is temporarily stored in the buffer. If it is getline, it will not stop reading when it encounters a space.

8. The overall code of the string implemented by the simulation

#pragma once
#define _CRT_SECURE_NO_WARNINGS 1
#include <iostream>
#include <assert.h>
using std::cout;
using std::cin;
using std::endl;
using std::ostream;
using std::istream;
namespace jly
{
	class string
	{
	public:
		void swap(string& s)
		{
			std::swap(_arr, s._arr);
			std::swap(_size, s._size);
			std::swap(_capacity, s._capacity);
		}
		//Constructor
		string(const char* s = "")
		{
			_size = strlen(s);//Neither _size nor _capacity contains '\0'
			_capacity = _size;
			_arr = new char[_size + 1];
			memcpy(_arr, s, _size + 1);
		}
		//copy construction
		//Writing 1
		//string(const string& s)
		//{
		//	_size = s._size;//_size and _capacity do not contain '\0'
		//	_capacity = s._capacity;
		//	_arr = new char[_capacity + 1];
		//	memcpy(_arr, s._arr, _capacity + 1);
		//}
		//Writing 2
		string(const string& s)
			:_arr(nullptr)//Prevent tmp._arr from being a random value after the exchange, and the destructor is wrong
		{
			string tmp(s.c_str());//structure
			swap(tmp);
		}
		//assignment operator overloading
		//Writing 1
		//string& operator=(const string& s)
		//{
		//	if (this != &s)//Prevent yourself from assigning a value to yourself
		//	{ 
		//		_size = s._size;
		//		_capacity = s._capacity;
		//		char* tmp = new char[_capacity + 1];
		//		delete[] _arr;
		//		_arr = tmp;
		//		memcpy(_arr, s._arr, _capacity + 1);
		//	}
		//	return *this;
		//}
		//Writing 2
		string& operator=(const string& s)
		{
			string tmp(s.c_str());//structure
			swap(tmp);
			return *this;
		}
		//destructor
		~string()
		{
			_size = _capacity = 0;
			delete[] _arr;
			_arr = nullptr;
		}
		//string's size() interface
		size_t size()const//The right const modifies *this, so both const and non-const objects can be called
		{
			return _size;
		}
		//c_str() interface of string
		const char* c_str()const
		{
			return _arr;
		}
		//The capacity() interface of string
		size_t capacity()const
		{
			return _capacity;
		}
		//string's clear() interface
		void clear()
		{
			_arr[0] = '\0';
			_size = 0;
		}
		//string null
		bool empty()const
		{
			return _size == 0 ? false : true;
		}
		//Overload operator[]
		char& operator[](size_t pos)//Ordinary object, readable and writable
		{
			assert(pos < _size);
			return _arr[pos];
		}
		const char& operator[](size_t pos)const//const object, read-only
		{
			assert(pos < _size);
			return _arr[pos];
		}
		//iterator
		typedef char* iterator;
		iterator begin()const
		{
			return _arr;
		}
		iterator end()const//end points to '\0' of the string
		{
			return _arr + _size ;
		}
		//The reserve interface of string, if the pre-opened space is smaller than the existing space, the capacity will not be changed.
		void reserve(size_t n=0)
		{
			if (n + 1 > _capacity)
			{
				char* tmp = new char[n + 1];
				memset(tmp, '\0', n + 1);
				memcpy(tmp, _arr, _size);
				delete[] _arr;
				_arr = tmp;
				_capacity = n;
			}
		}
		//string's resize interface
		void resize(size_t n, char c='\0')
		{
			//Determine the size of n
			if (n > _capacity)
			{
				reserve(n);
				memset(_arr + _size,c,n-_size);
				_size = n;
			}
			else
			{
				_arr[n] = '\0';
				_size = n;
			}
		}
		//Insert, delete, find related interfaces
		string& push_back(const char c)
		{
			//Judgment capacity
			if (_size == _capacity)
			{
				size_t newCapacity = _capacity == 0 ? 4 : _capacity * 2;//Prevent empty strings from appearing
				reserve(newCapacity);
			}
			_arr[_size++] = c;
			return *this;
		}
		string& append(const char* s)
		{
			//Judgment capacity
			size_t len = strlen(s);
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}
			strcpy(_arr+_size,s);
			_size += len;
			return *this;
		}
		string& operator+=(const char c)
		{
			push_back(c);
			return *this;
		}
		string& operator+=(const char* s)
		{
			append(s);
			return *this;
		}
		string& insert(size_t pos, char c)
		{
			assert(pos < _size);
			//Judgment capacity
			if (_size == _capacity)
			{
				reserve(_capacity + 1);
			}
			//move data
			for (size_t i = _size; i > pos; --i)
			{
				_arr[i] = _arr[i - 1];
			}
			_arr[pos] = c;
			++_size;
			return *this;
		}
		string& insert(size_t pos, const char* s)
		{
			size_t len = strlen(s);
			//Judgment capacity
			if (len + _size > _capacity)
			{
				reserve(len + _size);
			}
			//move data
			for (size_t i = _size + len; i > pos + len - 1; --i)
			{
				_arr[i] = _arr[i - len];
			}
			memcpy(_arr + pos, s, len);
			_size += len;
			return *this;
		}
		string& earse(size_t pos, size_t len = npos)
		{
			assert(pos<_size);
			//First judge the situation of deletion
			if (len == npos || pos + len >= _size)
			{
				_arr[pos] = '\0';
				_size = pos;
			}
			else
			{
				memcpy(_arr + pos, _arr + pos + len,_size-pos-len);
				_size -= len;
			}
			return *this;
		}
		size_t find(const char c, size_t pos = 0)const
		{
			assert(pos < _size);
			for (size_t i = pos; i < _size; ++i)
			{
				if (_arr[i] == c)
				{
					return i;
				}
			}
			return npos;
		}
		size_t find(const char* s, size_t pos = 0)const
		{
			assert(pos < _size);
			const char* p = strstr(_arr, s);
			if (p != nullptr)
			{
				return _arr - p;
			}
			return npos;
		}
	private:
		char* _arr;
		size_t _size;
		size_t _capacity;
		const static size_t npos = -1;//Only const static integer and pointer member variables can be defined in the class, other types cannot
	};
	//Overloads for stream insertion and stream extraction for custom types of input and output
	inline ostream& operator<<(ostream& out, const string& s)//The access here is private, so you don’t need to write it as a friend function
	{
		for (size_t i = 0; i < s.size(); ++i)//Stream insertion is printed according to _size, and c_str finds '\0' to end printing
		{                                    //For example, if I insert a '\0' in the middle of the string, the printing result is different
			out << s[i];
		}
		return out;
	}
	inline istream& operator>>(istream& in, string& s)
	{
		s.clear();//Clear s before using
		//in >> c;//stream extraction will not recognize spaces and newlines
		char c=in.get();
		char buff[128] = { '\0' };//Prevent frequent expansion
		size_t i = 0;
		while (c != ' ' && c != '\n')
		{
			if (i == 127)
			{
				s += buff;
				i = 0;
			}
			buff[i++] = c;
			c = in.get();
		}
		if (i > 0)
		{
			buff[i] = '\0';
			s += buff;
		}
		return in;
	}
	//test function
	void test1()
	{
	
	}
}


"Super-detailed linux practical quick start that conforms to the learning rules"

Tags: C++ Algorithm string STL programming language

Posted by JennyG on Thu, 17 Nov 2022 07:11:07 +1030