
Complexity analysis: asymptotic notation

Asymptotic notation describes how the (time) complexity of an algorithm grows as the size of the problem becomes asymptotically large.

Say we need to find the number of characters in a string. One way to implement a solution is to count the characters until we reach the end of the string. An implementation in C could be:

src/c/misc/string_length.c
#include <stdio.h>

int main()
{
    char message[] = "hello";
    int i = 0;
    for (;; i++) // notice the null statements `;;`: there is no loop
                 // initialization or condition check in this for loop;
                 // the loop is exited with the `break;` statement
    {
        if (!message[i]) // the string is stored as an array with the end
                         // denoted by `\0`; we keep checking each character
                         // until we reach `\0`
        {
            printf("Length = %d\n", i);
            break;
        }
    }

    // we can implement this using a `while` loop as well
    i = 0;
    while (message[i])
    {
        i++;
    }

    printf("Length (while loop implementation) = %d\n", i);

    return 0;
}

The above algorithm is linear in time: its running time grows linearly with the size of the string. We denote this by $\mathcal{O}(n)$.

We could have another implementation, where we store the string and its length together in an object (a struct). Here we are not interested in how we determined the length of the string in the first place.

src/c/misc/string_length_struct.c
#include <stdio.h>
#include <string.h> // strlen

struct message_struct
{
    char text[10]; // can hold a maximum of 9 characters, plus the null `\0`
                   // character to signify the end of the string
    int length;
};

int main()
{
    struct message_struct message = {"hello", 5};
    printf("Message: %s\nLength = %d\n", message.text, message.length);
    printf("Length using built-in method = %zu\n", strlen(message.text));
    return 0;
}

In the above implementation, no matter how big the string is, we can always get its length in constant time, i.e., $\mathcal{O}(1)$.

Let's consider another example: looking up a contact in a phone book. The contacts are sorted alphabetically, and we have to find the phone number of a person. One way is to start from the beginning and scan until we find the person, i.e., a linear search algorithm. That is an $\mathcal{O}(n)$ algorithm.
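A minimal sketch of such a linear scan in C, assuming a hypothetical phone book held in parallel `names` and `numbers` arrays (illustrative names, not from the original example):

#include <stdio.h>
#include <string.h>

int main()
{
    // a hypothetical phone book: parallel arrays of names and numbers
    const char *names[] = {"alice", "bob", "carol"};
    const char *numbers[] = {"555-0100", "555-0101", "555-0102"};
    int n = sizeof(names) / sizeof(names[0]);

    // scan from the beginning until the name matches: O(n) comparisons
    for (int i = 0; i < n; i++)
    {
        if (strcmp(names[i], "carol") == 0)
        {
            printf("carol: %s\n", numbers[i]);
            break;
        }
    }
    return 0;
}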

We could use another algorithm: first look at the middle of the (alphabetically sorted) list; if the name matches, we stop there. Otherwise, if the name we are looking for comes alphabetically before the middle entry, we take the first half of the list, look at its middle, and continue this process until we have found the person's name. This is binary search, and it has $\mathcal{O}(\log(n))$ complexity; a sketch in C follows the tip below.

tip

To be precise, here the logarithm has base 2. However, all logarithmic functions with base greater than 1 are asymptotically equivalent:

$\log_a(N) = \frac{\log_b(N)}{\log_b(a)}$

since $\log_b(a)$ is a constant.
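A minimal sketch of this divide-and-conquer lookup in C, assuming a hypothetical sorted `contacts` array (the array and the `find_contact` function are illustrative, not part of the original example):

#include <stdio.h>
#include <string.h>

// a hypothetical phone book: names sorted alphabetically
const char *contacts[] = {"alice", "bob", "carol", "dave", "eve"};

// returns the index of `name` in contacts[0..n-1], or -1 if not found
int find_contact(const char *name, int n)
{
    int low = 0, high = n - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2; // look in the middle
        int cmp = strcmp(name, contacts[mid]);
        if (cmp == 0)
            return mid;      // name matches, stop here
        else if (cmp < 0)
            high = mid - 1;  // name comes before: search the first half
        else
            low = mid + 1;   // name comes after: search the second half
    }
    return -1;
}

int main()
{
    int n = sizeof(contacts) / sizeof(contacts[0]);
    printf("dave found at index %d\n", find_contact("dave", n));
    return 0;
}

Each iteration halves the remaining list, so at most $\log_2(n)$ steps are needed.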

In the best-case scenario, we can find the person in the first step, if their name is exactly in the middle of the list. We denote the best-case scenario by $\Omega(1)$.

In cases where the best case and the asymptotic (worst) case have the same complexity, we denote it by $\Theta(n)$, a tight asymptotic bound. For example, in our second implementation of the string length example above, where we store the length as a data member, the best-case and worst-case scenarios both have the same complexity, $\Theta(1)$.

Nested loops:

// assuming `int arr[N][M]` has been declared
for (int i = 0; i < N; i++)
{
    for (int j = 0; j < M; j++)
    {
        arr[i][j] = i * M + j;
    }
}

In such cases, the complexity grows as $\mathcal{O}(MN)$. If $M$ and $N$ are of the same order, we can say the complexity is $\mathcal{O}(N^2)$.

Mathematical definition

$f(n)$ has complexity $\mathcal{O}(g(n))$ if

$f(n) \le c \, g(n)$

for all $n \gt n_0$; $c$ and $n_0$ are constants.

$f(n)$ has complexity lower bound $\Omega(g(n))$ if

$f(n) \ge c \, g(n), \ \forall \ n \gt n_0$

$f(n)$ has complexity tight bound $\Theta(g(n))$ if and only if

(1) $f(n)$ is $\mathcal{O}(g(n))$

(2) $f(n)$ is $\Omega(g(n))$

$c \, g(n) \le f(n) \le c' \, g(n)$

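For example (a worked case of the definition above), $f(n) = 3n + 5$ is $\mathcal{O}(n)$: taking $c = 4$ and $n_0 = 5$,

$3n + 5 \le 4n \quad \forall \ n \gt 5$
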
note

Mathematically, if $f(n)$ is $\mathcal{O}(n)$, it is also $\mathcal{O}(n^2)$ or $\mathcal{O}(n^3)$. However, we are only interested in the tightest possible upper bound.

Complexity of recursive functions

Recursion can be a very useful and elegant way to solve certain types of problems. Using recursion, the problem is reduced or divided into smaller subproblems, until we reach the base case, which has constant time complexity.

In the case of the Fibonacci series calculation (C++ example code):

$$
\begin{aligned}
T(n) &= T(n-1) + T(n-2) + c \\
&= \big(T(n-2) + T(n-3) + c\big) + \big(T(n-3) + T(n-4) + c\big) + c
\end{aligned}
$$

where $T(n)$ is constant for $n \le 1$.

In this program, the number of function calls grows exponentially, with complexity $\mathcal{O}(2^n)$.
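The linked example code is in C++; a minimal standalone sketch of the same naive recursive approach in C (assuming that example follows the textbook recursion):

#include <stdio.h>

// naive recursion: each call branches into two further calls,
// so the total number of calls grows as O(2^n)
unsigned long fibonacci(int n)
{
    if (n <= 1)                 // base case: constant time
        return (unsigned long)n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main()
{
    printf("fibonacci(10) = %lu\n", fibonacci(10));
    return 0;
}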

Master theorem

Let $a \ge 1$ and $b \gt 1$ be constants, let $f(n)$ be a function, and let $T(n)$ be a function over the positive numbers defined by the recurrence:

$T(n) = aT(n/b) + f(n)$

If $f(n) = \Theta(n^d)$, where $d \ge 0$, then

$T(n) = \Theta(n^d)$ if $a \lt b^d$,

$T(n) = \Theta(n^d \log n)$ if $a = b^d$,

$T(n) = \Theta(n^{\log_b a})$ if $a \gt b^d$.

Example: the complexity of the binary search problem is $\mathcal{O}(\log n)$, and the complexity of merge sort is $\mathcal{O}(n \log n)$; both follow from the theorem, as worked out below.
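For binary search the recurrence is $T(n) = T(n/2) + \Theta(1)$, so $a = 1$, $b = 2$, $d = 0$; since $a = b^d$, we get $T(n) = \Theta(n^0 \log n) = \Theta(\log n)$. For merge sort it is $T(n) = 2T(n/2) + \Theta(n)$, so $a = 2$, $b = 2$, $d = 1$; again $a = b^d$, hence $T(n) = \Theta(n \log n)$.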

Resources