Python is considered one of the easiest programming languages to use, and for good reason. There are libraries and built-in functionalities that allow you to do nearly anything you want to do.
Counting repeated objects is a programming problem that developers have long had to write their own solutions for. But in Python, the Counter class, which is part of the collections module, supplies a simple and efficient solution to the problem.
Counter is a subclass of the dict class, and learning to use it lets you count objects quickly in your programs. In this guide, we’ll walk you through how you can count objects in Python 3 with the Counter class.
The Basics of Using Counter in Python
There are many reasons why you’d want to count the number of times an object appears in a given data source. Maybe you want to determine how many times a specific item appears in a list.
If the list is short, you might be able to count the objects by hand. But what can you do if the list is large?
The typical solution to this problem is to use a counter variable. The variable has a starting value of zero and is incremented by one every time the object appears in the data source.
Using a counter variable is perfect when you want to count the number of appearances of a single object. All you have to do is make a single counter.
However, if you need to count multiple different objects, you will need to write as many counters as there are objects.
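A minimal sketch of the single-counter approach described above. The list fruits and the object being counted are made up for illustration:

```python
# Hypothetical data, used only to illustrate the counter-variable approach.
fruits = ["apple", "pear", "apple", "kiwi", "apple", "pear"]

apple_count = 0  # the counter variable starts at zero
for fruit in fruits:
    if fruit == "apple":
        apple_count += 1  # increment each time the object appears

print(apple_count)  # 3
```

Counting the pears as well would take a second variable and a second condition, and so on for every additional object.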
Using a Python dictionary is a great way to circumvent the need to write multiple counters. The dictionary’s keys will store the objects you want to count, and the values will hold the frequency of every object’s appearance.
To count the objects in a sequence with a Python dictionary, you can loop over the sequence and check whether each object is already in the dictionary. If it is not, you add it with a count of zero. Either way, you then increment its count by one.
Let’s look at a simple example. Below, we’re trying to find how many times letters occur in the word “bookkeeper.”
word = "bookkeeper"
counter = {}

for letter in word:
    if letter not in counter:
        counter[letter] = 0
    counter[letter] += 1

print(counter)
Running the code gives us the output:
{'b': 1, 'o': 2, 'k': 2, 'e': 3, 'p': 1, 'r': 1}
In the example, the for loop iterates over the letters in the variable word. The conditional statement inside the loop checks whether the letter being checked is in the dictionary, which, in this case, is called counter.
If not, it creates a new key holding the letter and sets its value to zero. Finally, the count stored under that letter is incremented by one.
The final statement prints the counter dictionary. So, in the output, you can see that the letters function as keys, and the values are the counts.
It’s important to note that when you’re counting several objects with a dictionary, they need to be hashable since they’ll work as dictionary keys. In other words, the objects must have a constant hash value across their lifetime.
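As a small illustration of the hashability requirement, the sketch below uses a tuple and a list as dictionary keys. The key values are arbitrary examples:

```python
counter = {}

# A tuple of hashable items is itself hashable, so it works as a key.
counter[("a", 1)] = 1

# A list is mutable and unhashable, so using one as a key fails.
try:
    counter[["a", 1]] = 1
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
```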
There’s a second way to count objects with a dictionary – using dict.get() with zero set as the default value:
word = "bookkeeper"
counter = {}

for letter in word:
    counter[letter] = counter.get(letter, 0) + 1

print(counter)
The output of the code will be:
{'b': 1, 'o': 2, 'k': 2, 'e': 3, 'p': 1, 'r': 1}
dict.get() returns the letter’s current count if the key exists, or the default value of zero if the letter is missing. The code then adds one to that value and stores the result under the corresponding letter in the dictionary.
Python also enables you to use defaultdict from collections to count objects inside a loop:
from collections import defaultdict

word = "bookkeeper"
counter = defaultdict(int)

for letter in word:
    counter[letter] += 1

print(counter)
Running the code, we get the output:
defaultdict(<class 'int'>, {'b': 1, 'o': 2, 'k': 2, 'e': 3, 'p': 1, 'r': 1})
Approaching the solution this way is more readable and concise. You begin by initializing the counter as a defaultdict with int as the factory function. Doing this allows you to access a key that doesn’t yet exist in the defaultdict without raising a KeyError.
Calling int() without arguments returns zero, so zero is the default value for any missing key.
The dictionary will automatically create a key and initialize it with the value that the factory function returns.
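The behavior described above can be checked directly. This is a minimal sketch, with the key "z" chosen arbitrarily:

```python
from collections import defaultdict

print(int())  # 0, the value int returns when called without arguments

counter = defaultdict(int)
print("z" in counter)  # False, nothing has been stored yet

value = counter["z"]   # reading a missing key calls int() and stores the result
print(value)           # 0
print("z" in counter)  # True, the lookup created the key
```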
The solution above is efficient, but like anything else in Python, there’s a better way to approach the problem. The collections module has a class created to help count different objects simultaneously. This is the Counter class.
Getting Started with Python’s Counter Class
The Counter class is a subclass of dict created to count hashable objects. As you’d expect, it’s a dictionary that holds the objects you’re counting as keys and the counts as values.
To use this class for counting, you typically supply an iterable or sequence of hashable objects to the class’s constructor as an argument.
The class internally iterates through the sequence, counting the frequency of an object’s occurrence. Let’s take a look at the different ways to construct counters.
Constructing Counters
To count many objects simultaneously, pass an iterable or sequence to the constructor when you initialize the counter.
Let’s see how you could write a program to count the letters in “bookkeeper” with this approach:
from collections import Counter

print(Counter("bookkeeper"))        # Passing a string argument
print(Counter(list("bookkeeper")))  # Passing a list as an argument
The output of this code would be:
Counter({'e': 3, 'o': 2, 'k': 2, 'b': 1, 'p': 1, 'r': 1})
Counter({'e': 3, 'o': 2, 'k': 2, 'b': 1, 'p': 1, 'r': 1})
The Counter class iterates over “bookkeeper,” producing a dictionary that stores the letters as keys and the counts as values. In the example, we first import the Counter class and then pass a string as an argument. Then, we pass a list and get the same output.
Besides lists, you can also pass tuples or any other iterables holding repeated objects.
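For instance, a tuple and a generator expression both work as input. The sample data here is hypothetical:

```python
from collections import Counter

# Hypothetical sample data for illustration.
colors = ("red", "blue", "red", "green", "blue", "red")

from_tuple = Counter(colors)
# A generator expression is also an iterable; this one yields each word's length.
from_generator = Counter(len(color) for color in colors)

print(from_tuple)      # Counter({'red': 3, 'blue': 2, 'green': 1})
print(from_generator)  # Counter({3: 3, 4: 2, 5: 1})
```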
There are also ways to create instances of the Counter class that don’t strictly involve counting. One thing you can do is use a dictionary:
from collections import Counter

print(Counter({"e": 3, "o": 2, "k": 2, "b": 1, "p": 1, "r": 1}))
Running this gives the output:
Counter({'e': 3, 'o': 2, 'k': 2, 'b': 1, 'p': 1, 'r': 1})
Using a dictionary this way gives the counter initial values of key-count pairs. You can do the same thing by passing keyword arguments to the class’s constructor, like so:
from collections import Counter

print(Counter(e=3, o=2, k=2, b=1, p=1, r=1))
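A quick way to confirm that the three construction styles produce the same counter is to compare the results. The variable names below are illustrative:

```python
from collections import Counter

from_iterable = Counter("bookkeeper")
from_mapping = Counter({"e": 3, "o": 2, "k": 2, "b": 1, "p": 1, "r": 1})
from_keywords = Counter(e=3, o=2, k=2, b=1, p=1, r=1)

# Counters compare equal when they hold the same counts.
print(from_iterable == from_mapping == from_keywords)  # True
```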
You can also pass a set, created by calling the set() function, to the Counter class, like so:
from collections import Counter

print(Counter(set("bookkeeper")))
The output of which is:
Counter({'e': 1, 'o': 1, 'p': 1, 'r': 1, 'b': 1, 'k': 1})
As you might know, sets in Python store unique objects. So, calling set() in this way discards the repeated letters, and you end up with one instance of every letter in the original iterable. Because sets are unordered, the letters may appear in a different order when you run the code.
Since Counter is a subclass of dict, it inherits its interface from regular dictionaries. However, Counter does not implement .fromkeys(), which prevents ambiguities.
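To see what the ambiguity is about: a call such as Counter.fromkeys("abc", 2) could be read as "give every letter a count of 2" or in other ways. The sketch below shows the call failing, plus one unambiguous alternative using a dictionary comprehension:

```python
from collections import Counter

try:
    Counter.fromkeys("abc", 2)
    error = None
except NotImplementedError as exc:
    error = exc

print(type(error).__name__)  # NotImplementedError

# Spelling out the intent avoids the ambiguity.
explicit = Counter({letter: 2 for letter in "abc"})
print(explicit)  # Counter({'a': 2, 'b': 2, 'c': 2})
```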
Remember that you can store any type of hashable object in the keys, and the values can store any type of object. But to work as a counter, the values have to be integers.
Let’s look at an example of a Counter instance holding zero and negative counts:
from collections import Counter

collection = Counter(
    stamps=10,
    coins=-15,
    buttons=0,
    seeds=15
)
In this example, you might ask why there are -15 coins. It might be used to indicate that you’ve lent them to a friend. The bottom line is the Counter class allows storing negative numbers this way, and you might be able to find some use cases for it.
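Continuing the lending scenario as a hypothetical: when the friend returns the coins, ordinary dictionary-style assignment brings the count back to zero.

```python
from collections import Counter

collection = Counter(stamps=10, coins=-15, buttons=0, seeds=15)

print(collection["coins"])    # -15
print(collection["buttons"])  # 0

# Hypothetical follow-up: the friend returns the 15 coins.
collection["coins"] += 15
print(collection["coins"])    # 0
```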
Updating Object Counts
You now understand how to get a Counter instance in place. Updating it with new counts and introducing new objects is as simple as using .update().
It is a Counter method that adds to existing counts rather than replacing them. It also creates new key-count pairs for objects the counter has not seen before.
.update() works with both iterables of elements and mappings of counts. When you use an iterable as an argument, the method counts the items and updates the counter as needed:
from collections import Counter

letters = Counter({"e": 3, "o": 2, "k": 2, "b": 1, "p": 1, "r": 1})
letters.update("toad")
print(letters)
Running the code gives the output:
Counter({'e': 3, 'o': 3, 'k': 2, 'b': 1, 'p': 1, 'r': 1, 't': 1, 'a': 1, 'd': 1})
This example starts with the letters and respective counts from the “bookkeeper” example discussed earlier. Then, .update() adds the letters from the word “toad” into the mix. This raises the count of “o” by one and introduces some new key-count pairs, as shown in the output.
Bear in mind that the iterable needs to be a sequence of elements rather than key-count pairs for this to work. But here’s what’s interesting:
Using values other than integers as the counts breaks the counter. Let’s take a look:
from collections import Counter

letters = Counter({"e": "3", "o": "2", "k": "2", "b": "1", "p": "1", "r": "1"})
letters.update("toad")
print(letters)
Running the code gives the output:
Traceback (most recent call last):
  File "<string>", line 5, in <module>
  File "/usr/lib/python3.8/collections/__init__.py", line 637, in update
    _count_elements(self, iterable)
TypeError: can only concatenate str (not "int") to str
Since the letter counts defined are strings in the example, .update() doesn’t work, raising a TypeError.
.update() can also be used in another way by providing a second counter or mapping of counts as an argument. Here’s how:
from collections import Counter

sales = Counter(cheese=19, cake=20, foil=5)

# Using a counter
monday_sales = Counter(cheese=5, cake=4, foil=3)
sales.update(monday_sales)
print(sales)

# Using a dictionary of counts
tuesday_sales = {"cheese": 4, "cake": 3, "foil": 1}
sales.update(tuesday_sales)
print(sales)
At the beginning of the program, an existing counter is updated with another counter named monday_sales. In the program’s second half, a dictionary with items and counts is used to update the counter. As you can see, .update() works with both counters and dictionaries of counts.
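The sketch below reruns the same sales figures to show the resulting totals, and contrasts them with a plain dictionary, whose own .update() replaces values instead of adding them. The name plain is illustrative:

```python
from collections import Counter

sales = Counter(cheese=19, cake=20, foil=5)
sales.update(Counter(cheese=5, cake=4, foil=3))    # Monday
sales.update({"cheese": 4, "cake": 3, "foil": 1})  # Tuesday
print(sales)  # Counter({'cheese': 28, 'cake': 27, 'foil': 9})

# A plain dict replaces values on update instead of adding to them.
plain = {"cheese": 19, "cake": 20, "foil": 5}
plain.update({"cheese": 4})
print(plain)  # {'cheese': 4, 'cake': 20, 'foil': 5}
```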
Accessing a Counter’s Content
Since the Counter class has the same interface as dict, it can perform the same actions as standard dictionaries. By extension, you can access the values in counters using key access just like you would in a dictionary.
It’s also easy to iterate over the keys, items, and values with the typical approaches. Let’s have a look:
from collections import Counter

letters = Counter("bookkeeper")

print(letters["e"])  # Output: 3
print(letters["k"])  # Output: 2

# Each loop below prints one item per line; the comments list those lines in order.
for letter in letters:
    print(letter, letters[letter])  # Output: b 1, o 2, k 2, e 3, p 1, r 1

for letter in letters.keys():
    print(letter, letters[letter])  # Output: b 1, o 2, k 2, e 3, p 1, r 1

for letter, count in letters.items():
    print(letter, count)  # Output: b 1, o 2, k 2, e 3, p 1, r 1

for count in letters.values():
    print(count)  # Output: 1, 2, 2, 3, 1, 1
An interesting thing to note about Counter is that accessing a missing key results in a zero rather than a KeyError. Take a look:
from collections import Counter

letters = Counter("bookkeeper")
print(letters["z"])  # Output: 0
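One detail worth checking is that looking up a missing key does not add it to the counter, so the counter keeps its original length:

```python
from collections import Counter

letters = Counter("bookkeeper")

missing = letters["z"]
print(missing)         # 0
print("z" in letters)  # False, the lookup did not add the key
print(len(letters))    # 6 distinct letters
```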
Finding Most Frequently Appearing Objects
Another convenience of using the Counter class is that you can use the .most_common() method to list objects based on their frequency. If two objects have equal counts, they are displayed in the order they first appear.
If a number “n” is passed to the method as an argument, it will output the “n” most common objects. Here’s an example exploring how the method works:
from collections import Counter

sales = Counter(cheese=19, cake=20, foil=5)

print(sales.most_common(1))     # Outputs the most common object.
print(sales.most_common(2))     # Outputs the two most common objects.
print(sales.most_common())      # Outputs all objects in order of frequency.
print(sales.most_common(None))  # Outputs all objects in order of frequency.

# Outputs all objects in order of frequency since the argument passed
# is greater than the total number of distinct objects.
print(sales.most_common(20))
When no argument or “None” is passed as an argument, .most_common() returns all the objects. The method also returns all objects when the argument exceeds the counter’s current length.
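For reference, here is what those calls return for the sales counter, followed by a tie-breaking example that reuses the “bookkeeper” letters:

```python
from collections import Counter

sales = Counter(cheese=19, cake=20, foil=5)
print(sales.most_common(1))  # [('cake', 20)]
print(sales.most_common(2))  # [('cake', 20), ('cheese', 19)]
print(sales.most_common())   # [('cake', 20), ('cheese', 19), ('foil', 5)]

# Ties keep the order in which the objects were first counted.
letters = Counter("bookkeeper")
print(letters.most_common(3))  # [('e', 3), ('o', 2), ('k', 2)]
```

The method returns a list of (object, count) tuples, so the results can be indexed or unpacked like any other list.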
Interestingly, you can also have .most_common() return objects by order of low to high frequency. Pulling this off is simple with slicing. Here’s how you do it:
from collections import Counter

sales = Counter(cheese=19, cake=20, foil=5)

print(sales.most_common()[::-1])    # Returns objects in reverse order of commonality.
print(sales.most_common()[:-3:-1])  # Returns the two least-common objects.
As you can see, the first slice returns the objects ordered from least common to most common. The second slice returns the last two objects from the method’s output, starting with the least common.
Of course, you can change the number of least-common objects in the output by changing the stop value of the slice. To get the three least-common objects, the -3 would become a -4, and so on.
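The slicing pattern can be wrapped in a small helper. Note that least_common is a hypothetical function written for this sketch, not a Counter method:

```python
from collections import Counter

def least_common(counter, n):
    """Return the n least-common (object, count) pairs, rarest first.

    Hypothetical helper; Counter has no built-in method by this name.
    """
    # The slice walks backward from the end and stops before index -(n + 1).
    return counter.most_common()[:-n - 1:-1]

sales = Counter(cheese=19, cake=20, foil=5)
print(least_common(sales, 1))  # [('foil', 5)]
print(least_common(sales, 2))  # [('foil', 5), ('cheese', 19)]
print(least_common(sales, 3))  # [('foil', 5), ('cheese', 19), ('cake', 20)]
```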
The important thing to remember is that the values stored in the counter must be comparable with one another for the method to sort them. Since any type of data can be stored in a counter, mixing types such as integers and strings makes sorting fail.
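A minimal sketch of that failure, using a counter that deliberately mixes an integer count with a string count:

```python
from collections import Counter

# Deliberately mixed value types, to show the failure mode.
mixed = Counter({"a": 1, "b": "2"})

try:
    mixed.most_common()
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
```

Building the counter succeeds; the error only appears once .most_common() tries to compare the integer with the string.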
Conclusion
In this guide, you’ve learned how to use the Counter class from the collections module to count objects. It removes the need to write counting loops or nested data structures by hand and simplifies the task. Integrating what you've learned in this guide into your code can make it cleaner and faster.