The filter() function is one of Python's built-in functions. It extracts the elements of an iterable that satisfy a supplied condition. Extracting elements this way is known as filtering.
After carrying out the filtering operation, the filter() function generates a new iterable holding the elements that satisfy the provided condition. As you might be able to guess, the filter() function is one of the Python tools available for functional programming.
In this brief post, we'll walk you through how to use filter(). Some of the examples rely on lambda functions, so it helps to know how those work before reading on.
A Look at the Filtering Problem
Let's assume you have a list of numbers and want to make a new list out of this list that holds only the positive numbers. The primitive approach involves writing a for loop, like so:
>>> numberList1 = [-2, -1, 0, 1, 2]
>>> def gather_positive(numberList1):
...     numberList2 = []
...     for number in numberList1:
...         if number > 0:  # Filtering condition
...             numberList2.append(number)
...     return numberList2
...
>>> gather_positive(numberList1)
[1, 2]
The loop defined in gather_positive() iterates through numberList1, storing the positive values in numberList2. The conditional statement filters out zero and the negative values, which is why the process is called "filtering."
A filtering operation entails testing all the values in an iterable using a predicate function. The values that meet the criteria are retained.
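To make the idea of a predicate concrete, here is a minimal sketch that applies one to each value by hand and records the outcome. The function name is_positive and the sample data are chosen only for illustration.

```python
# Illustrative predicate: returns True for values that should be kept.
def is_positive(number):
    return number > 0

numbers = [-2, -1, 0, 1, 2]

# Pair every value with the result of testing it against the predicate.
results = [(number, is_positive(number)) for number in numbers]
print(results)
# [(-2, False), (-1, False), (0, False), (1, True), (2, True)]
```

Only the values paired with True would survive a filtering operation.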
Filtering problems are common in all programming languages. So, virtually all programming languages supply tools to approach these problems. Python's filter() function is one such tool.
The Basics of the Python filter() Function
The filter() function abstracts the logic involved in the filtering process. Its signature looks like this:
filter(function, iterable)
You must pass a single-argument function as the first argument. Typically, this is a predicate function, which returns either True or False according to whether a specific condition is met.
So, the "function" argument essentially acts as a decision function since it supplies the criteria according to which the undesired values are removed from the input iterable. Of course, it also selects the desired values and puts them in the resulting iterable.
Bear in mind that the undesired values are the ones that filter() evaluates to be false using the "function" argument. Also, note that the "function" argument is actually a function object, so you must pass functions without calling them using parentheses.
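The difference between passing a function object and calling it can be seen in a short sketch. The names is_positive, numbers, and call_failed are illustrative assumptions, not part of filter() itself.

```python
def is_positive(number):
    return number > 0

numbers = [-2, -1, 0, 1, 2]

# Correct: pass the function object itself, without parentheses.
kept = list(filter(is_positive, numbers))
print(kept)  # [1, 2]

# Incorrect: is_positive() is called right away with no argument,
# so Python raises TypeError before filter() ever runs.
call_failed = False
try:
    filter(is_positive(), numbers)
except TypeError as error:
    call_failed = True
    print("TypeError:", error)
```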
The "iterable" argument can hold Python iterables, generators, and iterator objects. You can only pass one iterable to the filter() function.
To perform filtering, filter() applies the "function" argument to every item in "iterable." It produces an iterator holding the values in "iterable" for which "function" returned a true value. The original input iterable remains unmodified.
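The following sketch illustrates these points under a few assumptions: is_positive and the sample data are made up for the example. It passes both a list and a generator to filter(), compares the result with the equivalent generator expression, and checks that the input list is left untouched.

```python
def is_positive(number):
    return number > 0

numbers = [-2, -1, 0, 1, 2]

# filter() accepts a list...
from_list = list(filter(is_positive, numbers))
print(from_list)  # [1, 2]

# ...and also a generator, which yields -4, -2, 0, 2, 4 here.
from_generator = list(filter(is_positive, (n * 2 for n in numbers)))
print(from_generator)  # [2, 4]

# When a function is supplied, filter(function, iterable) behaves like
# this generator expression.
equivalent = list(item for item in numbers if is_positive(item))
print(equivalent)  # [1, 2]

# The input list has not been modified.
print(numbers)  # [-2, -1, 0, 1, 2]
```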
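It's interesting to note that in CPython, the reference implementation, filter() is written in C. Its internal loop can therefore run faster than an equivalent for loop written in Python, especially when the predicate is itself a built-in function. The gain is not guaranteed, though, because the cost of calling the predicate for every item often dominates. Either way, filter() is a concise alternative to writing a for loop for the same purpose.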
Besides being more efficient, the filter() function also returns a filter object – an iterator that supplies values on demand. This way, the filter() function promotes a lazy evaluation strategy.
In other words, filter() can also be more memory efficient than building a full list in a for loop, because it produces items one at a time instead of storing them all at once. Note that filter() returns list objects in Python 2; filter objects were introduced in Python 3.
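The lazy behavior described above can be observed with a small sketch. The calls list is a hypothetical helper added here only to record when the predicate actually runs.

```python
calls = []  # records every value the predicate is asked to test

def is_positive(number):
    calls.append(number)
    return number > 0

numbers = [-2, -1, 0, 1, 2]
positives = filter(is_positive, numbers)

# Creating the filter object does not test anything yet.
calls_before = list(calls)
print(calls_before)  # []

# Asking for one value runs the predicate only until the first match.
first = next(positives)
calls_after_first = list(calls)
print(first)              # 1
print(calls_after_first)  # [-2, -1, 0, 1]

# The remaining values are produced on demand.
rest = list(positives)
print(rest)  # [2]

# A filter object is a one-shot iterator: once consumed, it is empty.
leftover = list(positives)
print(leftover)  # []
```

Because the iterator can be consumed only once, convert it to a list if you need to traverse the results more than once.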
Regardless of which Python version you use, you won't need to import any module to use filter() since it's a built-in function.
Using filter()
We can rewrite the previously discussed example to use the filter() function. This entails writing a predicate function to extract the desired numbers.
The predicate function can be a user-defined function or a lambda function:
>>> numberList1 = [-2, -1, 0, 1, 2]
>>> # Using a user-defined function
>>> def is_positive(n):
...     return n > 0
...
>>> list(filter(is_positive, numberList1))
[1, 2]
>>> # Using a lambda function
>>> numberList2 = filter(lambda n: n > 0, numberList1)
>>> numberList2
<filter object at 0x7f3632683610>
>>> list(numberList2)
[1, 2]
The is_positive() function is user-defined and expresses the filtering condition. The filter() call applies is_positive() to the values in numberList1, winnowing out zero and the negative numbers. This is a simple, readable solution to filtering.
On the other hand, the lambda function defines the filtering condition inline. When filter() is called, it applies the lambda function to all the values in numberList1, winnowing out the negative numbers and zero, just like the user-defined function.
However, the solution is less readable compared to the user-defined function. Also, since filter() returns an iterator, the final list isn't made straightaway. As you can see, the list() function is used to process the iterator and return a list.
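When the results only need to be traversed once, the list() call can be skipped and the filter object consumed directly. The sketch below assumes a running total is wanted; the variable names are illustrative.

```python
numbers = [-2, -1, 0, 1, 2]

# Loop over the filter object directly, without building a list first.
total = 0
for number in filter(lambda n: n > 0, numbers):
    total += number

print(total)  # 3
```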
We've explained how filter() works with a basic example, but filter() isn't limited to use with Boolean functions. It works just as well with functions that return other types, keeping or discarding each value based on the truthiness of the result.
Here's another example to consider:
>>> def identity(x):
...     return x
...
>>> identity(42)
42
>>> objects = [0, 1, [], 4, 5, "", None, 8]
>>> list(filter(identity, objects))
[1, 4, 5, 8]
The identity() function is the filtering function in this example. It does not explicitly return True or False. Instead, it returns its argument unchanged.
The 0, [], None, and "" values are falsy, so when identity() returns them, filter() discards them. You will notice that the resultant list only has the values that Python considers truthy.
What's more, if you pass None to the "function" argument, then the filter() function returns all the elements in the "iterable" argument that evaluate to true. Let's have a look:
>>> objects = [0, 1, [], 4, 5, "", None, 8]
>>> list(filter(None, objects))
[1, 4, 5, 8]
Now, let's consider a third example. Let's say you want to extract even numbers from a list of numbers. Taking the simplistic approach of writing a for loop would give us the following program:
>>> numberList = [1, 3, 10, 45, 6, 50]
>>> def extract_even(numberList):
...     evenNumberList = []
...     for number in numberList:
...         if number % 2 == 0:  # Filtering condition
...             evenNumberList.append(number)
...     return evenNumberList
...
>>> extract_even(numberList)
[10, 6, 50]
As you can see, the extract_even() function accepts an iterable with integers and returns a list of even numbers. The conditional statement tests every value and filters out the odd numbers.
When dealing with code like this, you can employ the filter() function by writing the filtering logic in a short predicate function:
>>> numberList = [1, 3, 10, 45, 6, 50]
>>> def is_even(number):
...     return number % 2 == 0  # Filtering condition
...
>>> list(filter(is_even, numberList))
[10, 6, 50]
Now, the is_even() function accepts an integer and returns True if it is even. The filter() call winnows out the undesired odd numbers, and as a result, we get a list of even numbers. The code is shorter, and because filter() runs its loop internally, it can also be faster than the previous solution involving a for loop, although the difference depends on the predicate.
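For a condition this short, the predicate could also be written inline as a lambda function. This is a minimal sketch of that variant, with variable names chosen for illustration.

```python
numbers = [1, 3, 10, 45, 6, 50]

# Same filtering condition as is_even(), written inline as a lambda.
even_numbers = list(filter(lambda number: number % 2 == 0, numbers))

print(even_numbers)  # [10, 6, 50]
```

Whether the lambda or the named function reads better is largely a matter of taste; a named function is easier to reuse and test.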