Big-Θ (Big-Theta) notation

Let's look at a simple implementation of linear search, which uses a for-loop to compare each array element in turn with the value being searched for. Let's denote the size of the array by $$n$$. The maximum number of times that the for-loop can run is $$n$$, and this worst case occurs when the value being searched for is not present in the array.
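For concreteness, here is a minimal sketch of such a linear search in Python. The names `linear_search`, `array`, `target`, and `i`, and the choice of -1 as the "not found" value, are assumptions made for this illustration; it is one possible implementation, not necessarily the one the text has in mind.

```python
def linear_search(array, target):
    """Return the index of target in array, or -1 if it is absent."""
    # range() handles the loop bookkeeping: it starts the index at 0,
    # checks it against len(array), and advances it on each iteration.
    for i in range(len(array)):
        if array[i] == target:  # compare the current element with the target
            return i            # found it: return the index
    return -1                   # assumed "not found" value


primes = [2, 3, 5, 7, 11]
print(linear_search(primes, 7))  # prints 3
print(linear_search(primes, 4))  # prints -1 (worst case: all 5 elements checked)
```

When the target is absent, the loop body runs once for every element, which is the worst case of $$n$$ iterations described above.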

Each time the for-loop iterates, it has to do several things:
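 * compare the loop index with the array length
 * compare the current array element with the value being searched for
 * possibly return the value of the loop index
 * increment the loop index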

Each of these little computations takes a constant amount of time each time it executes. If the for-loop iterates $$n$$ times, then the time for all $$n$$ iterations is $$c_1*n$$, where $$c_1$$ is the sum of the times for the computations in one loop iteration. Now, we cannot say here what the value of $$c_1$$ is, because it depends on the speed of the computer, the programming language used, the compiler or interpreter that translates the source program into runnable code, and other factors.

This code has a little bit of extra overhead, for setting up the for-loop (including initializing the loop index to 0) and possibly returning a "not found" value at the end. Let's call the time for this overhead $$c_2$$, which is also a constant. Therefore, the total time for linear search in the worst case is $$c_1*n+c_2$$.

As we've argued, the constant factor $$c_1$$ and the low-order term $$c_2$$ don't tell us about the rate of growth of the running time. What's significant is that the worst-case running time of linear search grows like the array size $$n$$. The notation we use for this running time is $$\Theta(n)$$. That's the Greek letter "theta," and we say "big-Theta of $$n$$" or just "Theta of $$n$$."

When we say that a particular running time is $$\Theta(n)$$, we're saying that once $$n$$ gets large enough, the running time is at least $$k_1*n$$ and at most $$k_2*n$$ for some constants $$k_1$$ and $$k_2$$. Here's how to think of $$\Theta(n)$$:



For small values of $$n$$, we don't care how the running time compares with $$k_1*n$$ or $$k_2*n$$. But once $$n$$ gets large enough, on or to the right of the dashed line that marks this threshold, the running time must be sandwiched between $$k_1*n$$ and $$k_2*n$$. As long as these constants $$k_1$$ and $$k_2$$ exist, we say the running time is $$\Theta(n)$$.
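The sandwich condition can be checked numerically. The sketch below assumes a made-up worst-case running time of $$3n + 10$$ (that is, $$c_1 = 3$$ and $$c_2 = 10$$, values chosen purely for illustration) and tests one possible pair of constants, $$k_1 = 3$$ and $$k_2 = 4$$. The function names are hypothetical.

```python
def running_time(n, c1=3, c2=10):
    """Hypothetical worst-case time model c1*n + c2 (the constants are made up)."""
    return c1 * n + c2


def is_sandwiched(n, k1, k2):
    """True if k1*n <= running_time(n) <= k2*n."""
    return k1 * n <= running_time(n) <= k2 * n


k1, k2 = 3, 4  # one possible choice of constants, not the only one

# Collect every n in the tested range where the sandwich fails.
failures = [n for n in range(1, 10_001) if not is_sandwiched(n, k1, k2)]

# Within the tested range, the sandwich holds from just past the last failure.
n0 = max(failures) + 1 if failures else 1
print(n0)  # prints 10
```

For $$n$$ below 10 the upper bound $$4*n$$ is too small, which is fine: the definition only asks for the sandwich to hold once $$n$$ is large enough. A finite scan like this is a sanity check rather than a proof, although here the algebra agrees, since $$3n + 10 \le 4n$$ exactly when $$n \ge 10$$.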

We are not restricted to just $$n$$ in big-Θ notation. We can use any function, such as $$n^2$$, $$n\log_2 n$$, or any other function of $$n$$. Here's how to think of a running time that is $$\Theta(f(n))$$ for some function $$f(n)$$:



Once $$n$$ gets large enough, the running time is between $$k_1*f(n)$$ and $$k_2*f(n)$$.

In practice, we just drop constant factors and low-order terms. Another advantage of using big-Θ notation is that we don't have to worry about which time units we're using. For example, suppose that you calculate that a running time is $$6n^2 + 100n + 300$$ microseconds. Or maybe it's milliseconds. When you use big-Θ notation, you don't say. You also drop the factor 6 and the low-order terms $$100n + 300$$, and you just say that the running time is $$\Theta(n^2)$$. When we use big-Θ notation, we're saying that we have an asymptotically tight bound on the running time. "Asymptotically" because it matters for only large values of $$n$$. "Tight bound" because we've nailed the running time to within a constant factor above and below.
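As a worked check of the example above, here is one possible choice of constants showing that $$6n^2 + 100n + 300$$ is $$\Theta(n^2)$$. The values $$k_1 = 6$$, $$k_2 = 7$$, and the threshold 103 are illustrative choices, not the only ones that work.

$$6n^2 \le 6n^2 + 100n + 300 \quad \text{for all } n \ge 1$$

$$6n^2 + 100n + 300 \le 7n^2 \iff 100n + 300 \le n^2, \quad \text{which holds for all } n \ge 103$$

At $$n = 103$$, $$n^2 = 10609$$ and $$100n + 300 = 10600$$. From there on, $$n^2$$ grows faster than $$100n + 300$$, so the running time stays between $$6*n^2$$ and $$7*n^2$$.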

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.