3 Ways To Run A Python Function Asynchronously

100x The Speed of Your Code

Many developers hate the nuisance of waiting for an HTTP request or an API call to finish before the code after it can run.

By default, Python code runs synchronously. This not only slows down the user’s experience but also makes tasks take longer to complete and limits scalability.

If you web scrape a lot, you’ve probably needed to scrape data and store it, only to find that your entire program pauses while the data is being stored and further scrapes are held up. The same problem arises in applications that handle frequent API calls from many users.

This post shows you how to fix that using asynchronous Python programming. You will also learn the difference between asynchronous and synchronous operations, and how you can use these design principles as a day-to-day heuristic.

What are Synchronous Operations?

One-way traffic jam

Have you ever been stuck in one-way traffic?
Unless every car in front of you gets past the jam, you’re going to be stuck. If any of the cars in front breaks down, rest assured you are not going home with your car. This is typically how synchronous programs work.

Every initial request/function (car in front) must be resolved (leave the traffic) before any further functions or requests (cars behind) are executed and resolved.

In Python, synchronous operation is simply the execution of code sequentially.

What are Asynchronous Operations?

Let’s go back to the traffic analogy. You are still stuck in traffic, but you spot a McDonald’s by the side of the road. Your tummy grumbles, and you decide to quickly leave your car and grab brunch.

This is an asynchronous operation: we switch to a different task when we hit a delay or a wait in our current task.

In Python, asynchronous operation is simply the execution of code concurrently. Note that this is not the same as running in parallel.

For a more in-depth guide about asynchronous, synchronous, parallelism, threads, multi-threaded, and single-threaded operations, please visit this post. (Explained with real-life analogies)

Using Asynchronous programming with Python

Python wasn’t designed around asynchronous operations the way JavaScript was. In Python, it can get a bit more complex than just adding `async` to a function. But there are now libraries that make it almost as simple as that.

The 3 most common libraries used for running Python code asynchronously are threading, concurrent.futures, and asyncio.

We are going to download PDFs from a website using the requests library and save them on our devices. I am not going to explain it all step by step, because I assume you have some knowledge of using requests, saving files, functions, for loops, etc. I’ll leave a brief description of each code block to give some context.

We’ll start by downloading the PDFs synchronously and then make the process asynchronous.

Synchronous Implementation —
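As a rough sketch of the synchronous baseline: the URLs below are hypothetical, and download_pdf is a stand-in that sleeps instead of calling requests.get and writing a file, so the example runs offline. The timings it prints are therefore only illustrative.

```python
import time

# Hypothetical URLs; replace them with the PDFs you actually want.
URLS = [f"https://example.com/files/report_{i}.pdf" for i in range(5)]


def download_pdf(url: str) -> str:
    """Stand-in for requests.get(url) plus writing the file to disk."""
    time.sleep(0.1)  # simulated network wait (assumption, not a real download)
    return url.rsplit("/", 1)[-1]  # file name taken from the URL


def download_all_sync(urls):
    """Download one file at a time; each call blocks until it returns."""
    start = time.perf_counter()
    names = [download_pdf(url) for url in urls]
    elapsed = time.perf_counter() - start
    return names, elapsed


if __name__ == "__main__":
    names, elapsed = download_all_sync(URLS)
    print(f"Downloaded {len(names)} files in {elapsed:.2f}s")
```

Because every call waits for the previous one, the total time is roughly the sum of the individual waits.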

It takes us a total of 42 seconds to download and save all 10 PDFs. Let’s see how much time we can cut just by using asynchronous programming.


Using the threading library

This approach takes the most code of the three, but in my opinion it is far more intuitive than the others.

First, we import the threading library.

We then create a new thread. (Make sure to remove the for loop.)

The Thread constructor takes a target (the function to run) and, optionally, args (a tuple of arguments to pass to that function).

If we run this, nothing really happens because all we’ve done is create a thread object. We haven’t really started it. To start the threads, we use the start method.

We can use the is_alive() method to check whether a thread is still running or has already finished.
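The steps so far (create, start, check) might look like the sketch below. The URL is hypothetical and download_pdf only sleeps to imitate a slow download, so treat it as an illustration rather than the original code.

```python
import threading
import time


def download_pdf(url: str) -> None:
    """Stand-in for the real download: sleeps instead of using the network."""
    time.sleep(0.2)


# args must be a tuple, hence the trailing comma. The URL is hypothetical.
t = threading.Thread(target=download_pdf, args=("https://example.com/a.pdf",))

before_start = t.is_alive()   # the thread object exists but has not started
t.start()                     # now the target function begins running
while_running = t.is_alive()  # checked while the fake download is sleeping
time.sleep(0.6)               # crude wait; join() is the proper tool
after_finish = t.is_alive()   # checked after the target has returned

print(before_start, while_running, after_finish)
```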

If we run this, it works but something funny happens…

The counter is executed and printed before the threads are completed. To deal with this, we need to use the join method.

The join method makes the calling thread (usually the main thread, i.e. the program itself) wait for a specific thread to finish before it continues.
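A small sketch of the timer problem and the fix, assuming a download_pdf stand-in that sleeps for 0.2 seconds and a hypothetical timed_run helper: without join, the timer stops almost immediately, and with join it stops only after the thread has finished.

```python
import threading
import time


def download_pdf(url: str) -> None:
    """Stand-in for the real download (assumption: 0.2 s of waiting)."""
    time.sleep(0.2)


def timed_run(use_join: bool) -> float:
    """Return the elapsed time measured by the main thread."""
    start = time.perf_counter()
    t = threading.Thread(target=download_pdf, args=("https://example.com/a.pdf",))
    t.start()
    if use_join:
        t.join()  # block the main thread until t has finished
    elapsed = time.perf_counter() - start
    t.join()  # make sure the thread is done before returning either way
    return elapsed


if __name__ == "__main__":
    print(f"without join: {timed_run(False):.3f}s")
    print(f"with join:    {timed_run(True):.3f}s")
```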

This might make you think there’s no difference between running synchronously and running asynchronously with threads. Don’t let the join method fool you.

Every thread keeps running, and may even finish, while the main thread is blocked on a join for a different thread.

For instance, t.join() doesn’t stop other threads from running or from finishing. It only blocks the caller, here the main thread (the program), until t completes.

Once t has completed, the main thread moves on to the next line. In our code that is t1.join(), so it waits again until t1 completes, and then does the same for t2.

Note that t2 might have already finished a while ago. In that case, t2.join() simply returns immediately.
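To see this behavior of join in isolation, here is a sketch with three threads named after the ones in the text. The sleep durations are made up so that t2 finishes first and t finishes last.

```python
import threading
import time

finished = []  # names are appended in the order the threads actually finish


def download_pdf(name: str, seconds: float) -> None:
    """Stand-in download that takes `seconds` and then records its name."""
    time.sleep(seconds)
    finished.append(name)


t = threading.Thread(target=download_pdf, args=("t", 0.5))
t1 = threading.Thread(target=download_pdf, args=("t1", 0.3))
t2 = threading.Thread(target=download_pdf, args=("t2", 0.1))

for thread in (t, t1, t2):
    thread.start()

t.join()                           # the main thread waits here for t
t2_done_early = not t2.is_alive()  # t2 finished while we were waiting on t
t1.join()                          # t1 is already done, returns at once
t2.join()                          # same for t2

print(finished, t2_done_early)
```

The join calls only decide where the main thread waits; they have no effect on the order in which the threads finish.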

Complete code to download all pdfs while using join method to verify thread completion

If we run this, we can see that everything now runs perfectly fine and the time taken has reduced to 12 seconds.

Results for running the code with threads and duration

If you noticed, the print statement `print('T just finished')` hadn’t yet been executed, but other threads had already completed their respective instructions. This proves what I was saying earlier about the join method and should deepen your understanding of how it works.

But this is quite verbose and redundant. Let’s make it cleaner and re-usable.

Cleaner and reusable code

The join method isn’t called inside the for loop that starts the threads, because that would make each thread finish before the next one even starts and nullify the benefits of asynchronous programming.
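One way to write the loop-based version is sketched below, again with hypothetical URLs and a sleeping stand-in for the real download. All threads are started in a first loop and joined in a second one.

```python
import threading
import time

# Hypothetical URLs, used only for illustration.
URLS = [f"https://example.com/files/report_{i}.pdf" for i in range(5)]


def download_pdf(url: str) -> None:
    """Stand-in for the real download: 0.2 s of simulated waiting."""
    time.sleep(0.2)


def download_all_threaded(urls) -> float:
    """Start one thread per URL, then wait for all of them."""
    start = time.perf_counter()
    threads = [threading.Thread(target=download_pdf, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()  # first loop: start everything
    for thread in threads:
        thread.join()   # second loop: wait for everything
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"Threaded run took {download_all_threaded(URLS):.2f}s")
```

Calling join inside the first loop would turn this back into a sequential program, since each thread would have to finish before the next one starts.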

Both ways take roughly the same time. One is just more appealing to the eye.

FULL CODE ON GITHUB GIST

Using the concurrent.futures library

This is a higher-level interface for starting async tasks, and an abstraction layer on top of the threading and multiprocessing modules.

When do we use concurrent.futures library?
It’s usually the preferred tool when you just want to run a piece of code concurrently and don’t need the extra functionalities provided by the threading or multiprocessing module’s API.

We import the library.

We then create a ThreadPoolExecutor instance, which creates and manages the worker threads for us.

with concurrent.futures.ThreadPoolExecutor() as executor: Creates a ThreadPoolExecutor instance as a context manager, which manages the life cycle of a pool of worker threads that will be used to execute tasks concurrently.

executor.submit schedules a function to be executed and returns a Future object, which we can use to check whether the task is still running and to get back the result once it’s done executing.

Our function doesn’t return any value, so we have nothing to get back. If it did, we would simply store the Future object in a variable:

f1 = executor.submit(download_pdf, link)

And call the result method:

f1.result()
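Put together, submit and result might be used as in the sketch below. The URLs are hypothetical, and download_pdf is a stand-in that sleeps and returns a file name so there is a value to retrieve.

```python
import concurrent.futures
import time


def download_pdf(url: str) -> str:
    """Stand-in for the real download; returns the file name from the URL."""
    time.sleep(0.1)  # simulated network wait
    return url.rsplit("/", 1)[-1]


with concurrent.futures.ThreadPoolExecutor() as executor:
    # submit() returns immediately with a Future for each call.
    f1 = executor.submit(download_pdf, "https://example.com/files/a.pdf")
    f2 = executor.submit(download_pdf, "https://example.com/files/b.pdf")

    # result() blocks until that particular call has finished.
    name1 = f1.result()
    name2 = f2.result()

print(name1, name2)
```

If the function raises an exception inside the worker thread, result() re-raises it in the caller, which makes errors easier to notice than with bare threads.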

A simpler and faster way to do this is to use the map method. If you’re familiar with Python’s built-in map function, the ThreadPoolExecutor map method is very similar.

ThreadPoolExecutor map method.

Instead of using submit, we simply use the map method and pass in the function and a list of items we want to pass through that function one by one. The executor schedules each call on its pool of worker threads for us.
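A sketch of the map version, with hypothetical URLs and the same sleeping stand-in. The max_workers value of 5 is an arbitrary choice for this example.

```python
import concurrent.futures
import time

# Hypothetical URLs, used only for illustration.
URLS = [f"https://example.com/files/report_{i}.pdf" for i in range(5)]


def download_pdf(url: str) -> str:
    """Stand-in for the real download; returns the file name from the URL."""
    time.sleep(0.2)  # simulated network wait
    return url.rsplit("/", 1)[-1]


def download_all(urls):
    """Run download_pdf for every URL on a pool of worker threads."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # map() yields results in the same order as the input URLs.
        return list(executor.map(download_pdf, urls))


if __name__ == "__main__":
    print(download_all(URLS))
```

Unlike the built-in map, the calls are submitted to the pool right away, and the results come back in input order regardless of which call finishes first.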

Using asyncio

asyncio is a newer addition to Python’s standard library than threading. It mirrors the JavaScript way of making a function asynchronous, with async and await.

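You don’t need to install anything: asyncio has been part of Python’s standard library since version 3.4. (Avoid running pip install asyncio, since the package on PyPI is an outdated copy.)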

Import it and make each function a coroutine by writing async before def.

Since requests is a blocking library, we have to explicitly run the function in a different thread. To do this, we use asyncio’s to_thread function (available from Python 3.9) and await it.

[IMG — To thread asyncio]
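A minimal sketch of the to_thread step, assuming Python 3.9 or newer. The download_pdf function is a blocking stand-in (it sleeps instead of calling requests), and download_pdf_async is a hypothetical wrapper name.

```python
import asyncio
import time


def download_pdf(url: str) -> str:
    """Blocking stand-in for a requests-based download."""
    time.sleep(0.1)  # blocks the thread it runs on
    return url.rsplit("/", 1)[-1]


async def download_pdf_async(url: str) -> str:
    """Run the blocking function in a worker thread and await its result."""
    return await asyncio.to_thread(download_pdf, url)


if __name__ == "__main__":
    # The URL is hypothetical.
    print(asyncio.run(download_pdf_async("https://example.com/files/a.pdf")))
```

Without to_thread, the time.sleep call (or a real requests call) would block the event loop and nothing else could run in the meantime.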

Even if we run this, the downloads still run one after another. To make them run concurrently, we use the `gather` function to schedule all the coroutines at once, and we run the main coroutine with the run function provided by asyncio.

[IMG — asyncio finalize]

The * unpacks the list. asyncio’s gather function takes awaitables as separate positional arguments, not as a single list.
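The final step could be sketched as follows, assuming Python 3.9 or newer, hypothetical URLs, and a blocking stand-in for the download.

```python
import asyncio
import time

# Hypothetical URLs, used only for illustration.
URLS = [f"https://example.com/files/report_{i}.pdf" for i in range(5)]


def download_pdf(url: str) -> str:
    """Blocking stand-in for a requests-based download."""
    time.sleep(0.2)
    return url.rsplit("/", 1)[-1]


async def main(urls):
    """Schedule every download at once and wait for all of them."""
    tasks = [asyncio.to_thread(download_pdf, url) for url in urls]
    # gather() takes awaitables as positional arguments, hence the *.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    start = time.perf_counter()
    names = asyncio.run(main(URLS))
    print(names, f"{time.perf_counter() - start:.2f}s")
```

gather returns the results in the same order as the awaitables it was given, so names lines up with URLS.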

We’ve successfully converted it to an asynchronous program using asyncio.

FAQ

Which asynchronous library is the best in Python?

It all depends on your use case. The threading library is a good fit when you need more control over each thread and your function doesn’t return a value.

concurrent.futures should be used when you don’t need all that underlying functionality or when your function returns a value. If you also need multiprocessing, this library provides APIs for that too.

asyncio can return values and has many of the high-level and low-level APIs you might need, but it can be more abstract and less intuitive to learn.

If you’re a newbie and don’t understand threading, start with the threading module, just to build intuition. If you have pre-existing code you would like to convert to asynchronous, use asyncio. If you’re neither of these, asyncio is still the way!

Asyncio vs Threading in terms of speed?

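asyncio is often quoted as being about 3.5x faster, but the real difference depends heavily on the workload. It runs coroutines on a single-threaded event loop instead of one OS thread per task, which avoids thread overhead and scales a lot better to large numbers of concurrent I/O-bound tasks. Note that when blocking calls are handed to to_thread, as in this post, the work still runs on threads.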

When threads successfully carry out the instructions, what happens?

The thread either terminates or, in a pool such as ThreadPoolExecutor, is reused for the next task.

Can a thread fail to carry out instructions?

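Yes. If a thread raises an uncaught exception, that thread ends, but your other active threads keep running to completion. In the Python threading library, such exceptions are passed to threading.excepthook() (Python 3.8+), which you can override to handle them.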
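As an illustration, a custom hook could record failures as in the sketch below (Python 3.8+). The function names and the error are made up for this example.

```python
import threading

caught = []   # (thread name, exception type) for every failed thread
results = []  # filled in by the thread that succeeds


def record_exception(args) -> None:
    """Custom hook: args carries exc_type, exc_value, exc_traceback, thread."""
    caught.append((args.thread.name, args.exc_type.__name__))


# Replace the default hook, which prints the traceback to stderr.
threading.excepthook = record_exception


def failing_download() -> None:
    raise ValueError("bad URL")  # hypothetical failure


def ok_download() -> None:
    results.append("done")


bad = threading.Thread(target=failing_download, name="bad-download")
good = threading.Thread(target=ok_download, name="good-download")
bad.start()
good.start()
bad.join()
good.join()

print(caught, results)
```

The failing thread ends on its own, the hook records what went wrong, and the other thread finishes normally.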
