Sunday, August 2, 2020

Published 9:24 PM

Simple, Runnable Demo Comparing Sequential and Parallel Python (using concurrent.futures)

I wanted a really simple example explaining this for my wife so I'm going ahead and posting it here.
The concurrent.futures module in Python makes it really easy to set up parallel tasks. It works a lot like promises in JavaScript, so it's great if you're familiar with those.
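
To make the promise comparison a bit more concrete, here is a minimal sketch of a single future. The function name 'double' and the list 'collected' are made up for illustration. 'add_done_callback' is roughly like '.then' on a promise, and 'result()' is a blocking way to get the value.

```python
from concurrent.futures import ThreadPoolExecutor

# hypothetical helper, only here for illustration
def double(x):
    return x * 2

collected = []

with ThreadPoolExecutor(max_workers=1) as executor:
    # submit returns a Future right away, before the work is done
    future = executor.submit(double, 21)
    # register a callback that runs once the future is finished
    future.add_done_callback(lambda f: collected.append(f.result()))
    # result() blocks until the value is available
    value = future.result()

# leaving the with block waits for the worker, so the callback has run by now
print(value, collected)
```

The callback style is optional. The demo below just collects the futures in a list and loops over them.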

I threw together a quick example that shows the basics:
  • define a function that waits a random amount of time and returns the number passed to it
  • call it 10 times sequentially in a for loop, passing in the loop index, and display the order in which the calls returned
  • call it 10 times in parallel (submitted from a for loop), passing in the loop index, and display the order in which the calls returned
  • display the runtime for each set of calls
The syntax is really simple, and the code looks basically the same as the sequential version. The only real changes are:

  • define a ThreadPoolExecutor named 'executor'
  • instead of calling the test function directly, pass it to 'executor.submit' and append what that returns to a list; this adds a 'future' to the list
  • call 'as_completed' on the list from the previous bullet and handle the output of the test functions; it hands back each future as soon as it finishes, so the loop body runs once per completed future
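
Those three changes can be sketched in a smaller form with fixed delays instead of random ones, so the finishing order is predictable. The function name 'wait_then_return' and the delay values are assumptions picked for illustration, not part of the demo itself.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep

# hypothetical stand-in for the test function, with a fixed delay per call
def wait_then_return(index, delay):
    sleep(delay)
    return index

delays = [0.6, 0.2, 0.4]  # call 0 is slowest, call 1 is fastest
finished = []

# change 1: create the executor (3 workers so all calls start at once)
with ThreadPoolExecutor(max_workers=3) as executor:
    # change 2: submit each call and keep the returned futures in a list
    futures = [executor.submit(wait_then_return, i, d) for i, d in enumerate(delays)]
    # change 3: as_completed yields each future as soon as it finishes
    for future in as_completed(futures):
        finished.append(future.result())

print(finished)
```

With these delays the shortest wait should come back first, so the list ends up in finishing order rather than submission order.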
You can run the code directly here to play with it, and I went ahead and pasted it below also:

import concurrent.futures
from random import randint
from time import sleep
import datetime

#test function that waits 0.5 seconds on average
def test(index):
  sleep(randint(1,1000)/1000)
  return index

#sequential: run 10 ~0.5 second functions sequentially in order; should take ~5 seconds
temp = datetime.datetime.now()
order = []
for i in range(0,10):
  order.append(test(i))

print('sequential took ' + str((datetime.datetime.now() - temp).total_seconds()) + ' seconds and finished in this order: ', order)

#parallel: run 10 ~0.5 second functions in order but in parallel (so might not finish in order); time taken will depend on the computer but should be well under 5 seconds
temp = datetime.datetime.now()
with concurrent.futures.ThreadPoolExecutor() as executor:
  results = []
  order = []
  for i in range(0,10):

    #add the future returned by submit to the list of results
    results.append(executor.submit(test, i))

  #process each result as soon as its future finishes
  for result in concurrent.futures.as_completed(results):
    order.append(result.result())

print('parallel took ' + str((datetime.datetime.now() - temp).total_seconds()) + ' seconds and finished in this order: ', order)
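
If you want the results back in the order you submitted them instead of the order they finished, 'executor.map' is one option. This is a minimal sketch under the same kind of setup; the name 'slow_echo' and the fixed delays are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep

# hypothetical function: waits for the given delay, then returns the index
def slow_echo(pair):
    index, delay = pair
    sleep(delay)
    return index

delays = [0.3, 0.1, 0.2]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map still runs the calls in parallel, but yields results in input order
    in_order = list(executor.map(slow_echo, enumerate(delays)))

print(in_order)  # [0, 1, 2]
```

The tradeoff is that the first result can't be handled until the first call finishes, even if later calls are already done.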


