Coroutines are fundamental to Python's asyncio library, which provides infrastructure for writing concurrent code using the async/await syntax. This is particularly beneficial for I/O-bound operations (for example, network requests and file I/O), where waiting for external resources would otherwise block the main thread.
Adopting coroutines gave a huge boost to the number of API calls and tasks we could handle on a day-to-day basis. It came as a need of the moment to replace the Celery code in our communications service's code base. The reasons for replacing the Celery architecture in the service with coroutines are covered below.
Below is an example of a simple coroutine-based implementation in Python.
import asyncio

async def greet(name):
    print(f"Hello {name} 👋")
    await asyncio.sleep(1)  # Simulates I/O delay
    print(f"Goodbye {name} 👋")

async def main():
    await asyncio.gather(
        greet("Alice"),
        greet("Bob"),
        greet("Charlie"),
    )

# Entry point
asyncio.run(main())
Output:
Hello Alice 👋
Hello Bob 👋
Hello Charlie 👋
Goodbye Alice 👋
Goodbye Bob 👋
Goodbye Charlie 👋
What’s Happening?
asyncio.gather schedules all three greet coroutines on a single event loop. Each coroutine prints its greeting, then yields control at await asyncio.sleep(1), letting the next one run; after the simulated delay, each resumes and prints its goodbye. That is why all three hellos appear before any goodbye, and the whole run takes about one second instead of three.
We had a Celery architecture for making API calls to various vendors for our WhatsApp, SMS, and call requests. With increasing loads and a growing quantum of API calls, we ended up creating a lot of Celery tasks and incurring the cost of running the Celery machines at full load. As mentioned earlier, to make 1,000 API calls concurrently, Celery has to spawn 1,000 workers for all the calls to go through together. This issue persisted across all our communication services: traffic through them has grown 2-3X over the last year, and with WhatsApp coming in as a major channel of communication, the traffic spikes have grown as well.
This demanded a solution, from both a business and a tech point of view, to make our systems more reliable, efficient, and effective at processing huge traffic loads. We then thought of using async Kafka consumers. Yes, you read that right! We built consumers with an infinite event loop running inside them: we publish tasks to a Kafka broker, consume them in bulk, and let the event loop process the tasks concurrently, making the API calls and then performing bulk DB writes and updates accordingly.
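Here is a minimal sketch of what such a consumer can look like, using the aiokafka library. The topic name, group id, batch size, and the handle_message helper are illustrative assumptions, not our exact production code:

import asyncio
from aiokafka import AIOKafkaConsumer

async def handle_message(payload: bytes) -> None:
    # Stand-in for the real work: make the vendor API call for this
    # message and stage the result for a bulk DB write.
    await asyncio.sleep(0.1)  # simulated I/O

async def consume_forever() -> None:
    consumer = AIOKafkaConsumer(
        "communication-tasks",              # illustrative topic name
        bootstrap_servers="localhost:9092",
        group_id="comms-async-consumer",
        enable_auto_commit=False,           # commit only after the batch succeeds
    )
    await consumer.start()
    try:
        while True:  # the "infinite event loop" inside the consumer
            # Consume in bulk: up to 50 records across all assigned partitions.
            batch = await consumer.getmany(timeout_ms=1000, max_records=50)
            records = [r for part in batch.values() for r in part]
            if not records:
                continue
            # Process the whole batch concurrently on one event loop.
            await asyncio.gather(*(handle_message(r.value) for r in records))
            await consumer.commit()
    finally:
        await consumer.stop()

if __name__ == "__main__":
    asyncio.run(consume_forever())

Disabling auto-commit and committing only after the batch succeeds keeps at-least-once semantics: if the consumer dies mid-batch, the messages are redelivered.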
We saw a huge delta in our peak traffic processing capability. Our entire 20-pod Celery traffic was now being consumed by a handful of Kafka consumers running in an async fashion, with the capacity to handle 4-5X the original load with no extra consumers spun up and no extra infra consumed.
Using the asyncio library has its ups and downs. One must not assume that adding async def is a straightforward fix for the entire problem; implementing coroutines in our service was a task of its own.
Converting Celery code into coroutines was not as easy as it seemed. Below is a small example.
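The snippet below is a representative sketch rather than our actual service code; the vendor URL, payload shape, and function names are illustrative assumptions:

import asyncio
import httpx
from celery import Celery

app = Celery("comms")

# Before: one Celery task per API call, so 1,000 concurrent calls
# need 1,000 workers available at the same time.
@app.task
def send_via_vendor(payload: dict) -> None:
    httpx.post("https://vendor.example.com/send", json=payload)  # blocking call

# After: one coroutine takes a whole batch of payloads for a vendor
# and fires all the API calls concurrently on a single event loop.
async def send_batch(payloads: list[dict]) -> None:
    async with httpx.AsyncClient() as client:
        await asyncio.gather(
            *(client.post("https://vendor.example.com/send", json=p)
              for p in payloads)
        )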
Earlier, we had a Celery task that took data from our API calls and triggered the vendor APIs one by one, which made us dependent on spawning multiple workers at once to handle traffic. The above code takes dozens of data points at once and, per vendor, makes all the API calls concurrently. This raises our computation and processing throughput several-fold.
Earlier, we saw latencies of 100-150 ms for one API call plus a database insert. Now we see a total of 350-450 ms for 12-15 API calls plus a DB insert, which means a single consumer can process 2-3 such batches, i.e., handle 35-45 requests, in one second.
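As a back-of-envelope check on those numbers: at roughly 400 ms per batch, one consumer completes about 1 / 0.4 ≈ 2.5 batches per second, and at ~14 API calls per batch that works out to 2.5 × 14 ≈ 35 requests per second from a single consumer, consistent with the 35-45 range above.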
Coroutines are a powerful architecture, but they come with their own caveats. Since we take in bulk data and fire huge numbers of concurrent API calls and processes, we see a spike in CPU requests. Our business use case justifies this, and the trade-off works in our favour: one consumer with 1-1.5 CPU cores is much more economical than 10-15 Celery consumers with 0.5 CPU core each running simultaneously.
Since implementing coroutines, our 30-pod Celery traffic has gone down by 90-95%, and all of it is handled today by 6-8 consumers.
The main points to watch out for are CPU and memory consumption, throttling, and potential choke points. With these covered, coroutines make far more efficient use of resources at much higher traffic than a traditional Celery architecture.
For comparison: our Celery machines handle this traffic with 20-30 consumers running, while our Kafka consumers with purely async code handle the same traffic with just 2 consumers running.
FAQs
1. What are Coroutines used for?
Coroutines are used in Python for writing asynchronous, non-blocking code—ideal for I/O-bound tasks like network requests or file operations.
2. What is Celery in Python used for?
Celery is used to run background tasks and schedule jobs asynchronously, often in distributed systems.
3. What is Kafka used for in Python?
Kafka is used to build real-time data pipelines and streaming applications by publishing, consuming, and processing large volumes of messages efficiently.
Hashtags:
#Shadowfax #Coroutines #Celery #Python #TechInnovation #SystemPerformance #TechTransformation #TechAtShadowfax #Kafka #Concurrency