
I have a quick question about multiprocessing. I want to create a child process that waits for a task from the main process, does whatever processing it needs to (such as getting API data), and then sends the results back to the main process; after that, I want the child process to immediately wait for the next task. I feel like I'm overlooking something, but I have no idea how to achieve this. I'm wondering if I could use a queue or something like that, but I need some other input here. Thanks!

Link to comment
https://linustechtips.com/topic/945132-python-multiprocessing-help/

So with Python, the most important thing to know about multithreading is that threads won't give you parallel execution of Python code, since the standard interpreter (CPython) uses what's called the GIL, or "global interpreter lock". Threads in a single Python process are time-sliced: you might have multiple threads, but only one of them is executing Python bytecode at any given moment. The lock is released while a thread waits on I/O, though, so threads still help when the work is mostly waiting (network requests, for example). Threading is also easier to work with, and if you don't need raw CPU power (which is usually the case if you're working in Python to begin with), it might be enough for your use case.

 

This is super handy for stuff like lightweight GUI threading, since you don't need all that much cpu time processing to render a basic GUI, and no human would be able to notice the difference since they would appear to be parallel to the human eye, but if you are just looking for raw cpu time then this won't work for you.
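
To make the threading point concrete, here is a minimal sketch, assuming Python 3. The names fake_fetch and fetch_all are made up for illustration, and the sleep only stands in for a real API call. Because the threads spend their time waiting rather than computing, the waits overlap even with the GIL in place.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(name, delay=0.2):
    """Hypothetical stand-in for an API call: it only sleeps, then returns."""
    time.sleep(delay)  # the GIL is released while a thread sleeps or waits on I/O
    return f"{name}: done"


def fetch_all(names):
    """Run fake_fetch for every name on a small pool of threads."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(fake_fetch, names))


if __name__ == "__main__":
    start = time.perf_counter()
    print(fetch_all(["users", "posts", "comments"]))
    # The three waits overlap, so this should be well under 3 x 0.2 seconds
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

If the work inside fake_fetch were pure computation instead of waiting, the threads would take turns and the total time would not improve.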

 

For that you have to use multiprocessing instead, for example its Pool class: https://docs.python.org/2/library/multiprocessing.html. It starts entirely new processes instead of threads, so work can actually run in parallel in the default Python interpreter, getting around the GIL. The major downside is that separate OS processes don't share memory by default, so the workers in the pool need to be largely independent, or you have to pass information back and forth between the processes explicitly.
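
A rough sketch of what that looks like, assuming Python 3 (the linked page is the Python 2 version of the docs). count_primes is a made-up CPU-bound job, not anything from the docs; the only library pieces used are multiprocessing.Pool and its map method.

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound toy job: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]
    # Each worker is a separate process with its own interpreter and GIL
    with multiprocessing.Pool(processes=4) as pool:
        # Arguments and return values are pickled and sent between processes
        results = pool.map(count_primes, limits)
    print(dict(zip(limits, results)))
```

Note that the job function lives at module level and takes plain values in and out; that is the "independent workers" constraint in practice.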

 

EDIT: after typing all that out, I think I may have over-explained it a bit. If you are looking for square one python threading help, something like this would help more: https://www.tutorialspoint.com/python/python_multithreading.htm. Once you start having specific concurrency needs, then my nonsense might make sense :D 


Instead of having child processes wait for input, you're probably better off with @reniat's suggestion of using the multiprocessing module. Have the main script spawn worker processes when data is available, or keep a dedicated scheduler script on standby that launches jobs on demand. You'll then have to figure out how to communicate the results back to the main script.
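
One possible way to do the "launch on demand and get the result back" part is the standard library's concurrent.futures module, sketched below under the assumption of Python 3. process_job and the incoming list are placeholders invented for the example; submit() and result() are the real API.

```python
from concurrent.futures import ProcessPoolExecutor


def process_job(payload):
    """Hypothetical job: stands in for whatever the worker should compute."""
    return {"input": payload, "length": len(payload)}


if __name__ == "__main__":
    incoming = ["alpha", "beta", "gamma"]  # pretend these arrive over time
    with ProcessPoolExecutor(max_workers=2) as executor:
        # submit() returns immediately with a Future; the job runs in a worker process
        futures = [executor.submit(process_job, item) for item in incoming]
        for future in futures:
            # result() blocks until that job finishes and returns its value
            print(future.result())
```

The Future object is what carries the result back to the main script, so there is no separate communication channel to set up.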


Task queues like this usually don't have to be built in Python alone. There are message brokers (like RabbitMQ) and queue implementations that expose an API and a Python library for scheduling and executing tasks. You can check out Celery too. In short: don't implement a queue system yourself, just use the existing one best suited to the job.


5 hours ago, riklaunim said:

Task queues like this usually don't have to be built in Python alone. There are message brokers (like RabbitMQ) and queue implementations that expose an API and a Python library for scheduling and executing tasks. You can check out Celery too. In short: don't implement a queue system yourself, just use the existing one best suited to the job.

RabbitMQ is what we use at work to process things like letters, emails, and SMS messages sent from our system.

 

You could also use socketio.


In CPython, multiprocessing is the only way to get what other languages give you with multithreading, at least for CPU-bound work. TL;DR: the interpreter's global lock restricts threads to running Python code one at a time.

 

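You can do what you're asking for with multiprocessing; you just have to make sure that the child processes don't re-run the parent's startup code. On platforms where each child starts by importing the main script (Windows, for example), unguarded code that spawns processes would otherwise run again in every child and try to spawn more children.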

 

Here's how you can separate the Parent from the Child processes:

#Code for everything else

if __name__ == '__main__':
  #Code for the Parent process goes here
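
For the original question (a child that waits for a task, sends back a result, and immediately waits again), a minimal sketch with two multiprocessing.Queue objects could look like this, assuming Python 3. The names STOP, handle_task, and worker are invented for the example, and squaring a number is only a placeholder for real work such as an API call.

```python
import multiprocessing

STOP = None  # sentinel: tells the worker to exit its loop


def handle_task(task):
    """Hypothetical task handler; a real one might call an API here."""
    return task * task


def worker(task_queue, result_queue):
    """Block on the task queue, process each task, and go back to waiting."""
    while True:
        task = task_queue.get()  # blocks until the parent sends something
        if task is STOP:
            break
        result_queue.put((task, handle_task(task)))


if __name__ == "__main__":
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    child = multiprocessing.Process(target=worker, args=(tasks, results))
    child.start()

    for number in (2, 3, 4):
        tasks.put(number)     # hand a task to the child
        print(results.get())  # wait for its answer

    tasks.put(STOP)  # ask the child to shut down cleanly
    child.join()
```

The parent here waits for each answer before sending the next task, which keeps the example simple; it could just as well put several tasks on the queue and collect the results later.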

The issue I ran into when doing something similar (where I wanted the Parent to create Child processes, and the Child processes to return values to the Parent) was that I needed to create a shared dictionary to store each return value.

 

Here's most of the code I used for a personal program (this one was to test the speed differences of calculating Prime numbers using different methods as well as testing if compiling Python with the Cython module would be worthwhile):

import Find_Nth_Prime
import Find_Nth_Prime1
import Find_Nth_Prime2  # needed for Find_Nth_Prime2.main in the funcs list
import time
import os
import uuid
import multiprocessing
#import tqdm

def timeFunction(func, argoos, name):
    start = time.time()
    func(int(argoos[0]), int(argoos[1]))
    end = time.time()
    total = end - start
    argoos[2][name] = total

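def runInParallel(fns, args):
    # Label each function by its position in the list
    names = ["Default", "Compiled"]
    proc = []
    for i, fn in enumerate(fns):
        namer = names[i] if i < len(names) else "Optimized"
        # One child process per function; args carries the shared dict
        p = multiprocessing.Process(target=timeFunction, args=(fn, args, namer))
        proc.append(p)
        p.start()
    print("-"*80)
    # Wait for every child to finish before returning
    for p in proc:
        p.join()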

if __name__ == '__main__':
    # A Manager dict is shared across processes; a plain dict would not be
    manager = multiprocessing.Manager()
    returnDict = manager.dict()

    userMax = input("Enter the maximum number to calculate primes up to: ")
    numLoops = input("How many loops do you want to go through for each test? ")

    testing_Start = time.time()

    funcs = [Find_Nth_Prime.main, Find_Nth_Prime1.main, Find_Nth_Prime2.main]
    arguments = [userMax, numLoops, returnDict]

    runInParallel(funcs, arguments)

    testing_End = time.time()
    testing_Total = testing_End - testing_Start

I put the functionality of my Child processes into separate modules (called Find_Nth_Prime, Find_Nth_Prime1, and Find_Nth_Prime2) rather than have it inside the Parent process. The main part of the Parent creates a dictionary that can store the return value of each Child process (note how it is a multiprocessing.Manager().dict() and not a normal dictionary). A list containing each module's main function, as well as a list containing my arguments, is passed into the runInParallel function, which in turn handles running each function in a Child process.
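
Since the Find_Nth_Prime modules aren't included in the post, here is a cut-down, self-contained sketch of the same Manager().dict() idea, assuming Python 3. record_duration and sum_squares are placeholder names and workloads, not the code from the program above.

```python
import multiprocessing
import time


def record_duration(func, arg, name, results):
    """Time func(arg) and store the elapsed seconds under results[name]."""
    start = time.perf_counter()
    func(arg)
    results[name] = time.perf_counter() - start


def sum_squares(n):
    """Placeholder workload, not the prime finder from the post."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        shared = manager.dict()  # proxy object: writes in children reach the parent
        procs = [
            multiprocessing.Process(
                target=record_duration,
                args=(sum_squares, size, f"n={size}", shared),
            )
            for size in (10_000, 100_000)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(dict(shared))  # copy out before the manager shuts down
```

With a plain dict in place of manager.dict(), each child would write to its own copy and the parent would see an empty dictionary.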

