2.5. Analysis of String Operators

Prior to C++11 the string class was not required to store its character elements contiguously. Now string acts much like the vector class, except for some string optimizations and other minor differences.

C++11 strings use contiguous storage locations in an underlying (typically larger) array just like vector. Due to this, the character elements in strings can be accessed and traversed with the help of iterators, and they can also be accessed randomly using indexes.

Like vectors, strings have a dynamic size meaning that whenever a new character is inserted or deleted, their size changes automatically. Just like vectors, new elements can be inserted into or deleted from any part of a string, and automatic reallocation for other existing items in the string is applied.

Indexing and assigning a new character to an index position that already exists both take \(O(1)\), in other words, the same amount of time no matter how large the string is.

Now that we have seen how performance can be measured concretely you can look at the string operations table to see the Big-O efficiency of all the basic string operations and you will see a striking resemblance to vectors because the implementations are so similar.

Big-O Efficiency of C++ String Operations

Operation

Big-O Efficiency

index []

O(1)

index assignment =

O(1)

push_back()

amortized O(1)

pop_back()

O(1)

erase(i)

O(n)

insert(i, item)

O(n)

find(b, e, item)

O(log n) or O(n)

reserve()

O(n)

begin()

O(1)

end()

O(1)

size()

O(1)

push_back() is \(O(1)\) unless there is inadequate capacity. Then the entire string is moved to a larger contiguous underlying array. Copying all the old string data to a new location is \(O(n)\).

The previous table says that find could be \(O(n)\) or \(O(\log(n))\). One might ask why not just write a little for loop instead? Searching for a value seems like such a simple thing. Why go through the effort to figure out how to use all these string functions? Let’s find out.

In the program below, the time to perform operations is measured using the std::chrono library. It provides a flexible collection of types that track time with varying degrees of precision. steady_clock::now returns the current time. The elapsed time between two time points is stored in a duration object. Duration objects make it obvious what the time units are and also makes it easy to convert. The is one of the major advantages over the C time functions.

To use the steady_clock to time an algorithm or function, create two time points. To get the total runtime, subtract the begin time from the end time.

This example creates a series of increasingly long strings.

Every character except for one is the same.

The program prints how long it takes to find it when placed at the midpoint of the string.

A graph of the loops in the preceding code should look something like this:

Comparison of string::find and for loop times

Try This!

What is the Big-O of find? \(O(n)\) or \(O(\log(n))\)?

What happens if the needle value is in a location other than the midpoint? Try the beginning and end to see what happens.

Challenge: Try putting the needle in a random location to see what happens.


More to Explore

You have attempted of activities on this page