
Python 3: why does this simple 3-liner use so much RAM?

Poet129

If you let this script run, it will slowly eat all your RAM. What can I do to fix it?

from torch import manual_seed
while True:
    manual_seed(1)

 


8 minutes ago, Poet129 said:

If you let this script run, it will slowly eat all your RAM. What can I do to fix it?


from torch import manual_seed
while True:
    manual_seed(1)

 

I don't know much about coding (only what I've overheard from my dad about memory leaks in this case), but aren't you supposed to have an additional line that stops the process once manual_seed has been imported?



10 minutes ago, Poet129 said:

If you let this script run, it will slowly eat all your RAM. What can I do to fix it?


from torch import manual_seed
while True:
    manual_seed(1)

 

Not sure if you're serious, and not sure what "manual_seed(1)" does, but I am sure that "while True" basically means "run this forever and ever until I stop it".

So if "manual_seed(1)" does anything with memory, it will use a bunch of it if you let it run long enough.

"We're all in this together, might as well be friends" Tom, Toonami.

 

mini eLiXiVy: my open source 65% mechanical PCB, a build log, PCB anatomy and discussing open source licenses: https://linustechtips.com/topic/1366493-elixivy-a-65-mechanical-keyboard-build-log-pcb-anatomy-and-how-i-open-sourced-this-project/

 

mini_cardboard: a 4% keyboard build log and how keyboards workhttps://linustechtips.com/topic/1328547-mini_cardboard-a-4-keyboard-build-log-and-how-keyboards-work/


1 minute ago, ki8aras said:

I don't know much about coding (only what I've overheard from my dad about memory leaks in this case), but aren't you supposed to have an additional line that stops the process once manual_seed has been imported?

I'm aware, but my point is that just setting the seed to the same value over and over uses many, many gigabytes of RAM; this is just an example. I use that function a bunch of times in a real script, and it seems to use more memory each time, just like this does.

 

1 minute ago, minibois said:

Not sure if you're serious, and not sure what "manual_seed(1)" does, but I am sure that "while True" basically means "run this forever and ever until I stop it".

So if "manual_seed(1)" does anything with memory, it will use a bunch of it if you let it run long enough.

Yes, but because of the GIL it can only run one call at a time, so it should use a fixed amount of memory.

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, Poet129 said:

I'm aware, but my point is that just setting the seed to the same value over and over uses many, many gigabytes of RAM; this is just an example. I use that function a bunch of times in a real script, and it seems to use more memory each time, just like this does.

 

Yes, but because of the GIL it can only run one call at a time, so it should use a fixed amount of memory.

You haven't defined manual_seed(1), so you shouldn't expect me to know what the code does.

For all I know, it creates a new object every time, which would indeed use more and more memory the longer the program runs, GIL or not.

"We're all in this together, might as well be friends" Tom, Toonami.

 

mini eLiXiVy: my open source 65% mechanical PCB, a build log, PCB anatomy and discussing open source licenses: https://linustechtips.com/topic/1366493-elixivy-a-65-mechanical-keyboard-build-log-pcb-anatomy-and-how-i-open-sourced-this-project/

 

mini_cardboard: a 4% keyboard build log and how keyboards workhttps://linustechtips.com/topic/1328547-mini_cardboard-a-4-keyboard-build-log-and-how-keyboards-work/


2 minutes ago, minibois said:

You haven't defined manual_seed(1), so you shouldn't expect me to know what the code does.

For all I know, it creates a new object every time, which would indeed use more and more memory the longer the program runs, GIL or not.

OK, but it is defined when I import it from torch. Anyway, I fixed the problem with some inefficient code...

from torch import manual_seed
from multiprocessing import Process
def seed(x):
    manual_seed(x)
while True:
    p = Process(target=seed, args=(1,))
    p.start()
    p.join()

 


1 hour ago, Poet129 said:

OK, but it is defined when I import it from torch. Anyway, I fixed the problem with some inefficient code...


from torch import manual_seed
from multiprocessing import Process
def seed(x):
    manual_seed(x)
while True:
    p = Process(target=seed, args=(1,))
    p.start()
    p.join()

 

You're just moving the problem elsewhere, into another process.


1 hour ago, Poet129 said:

OK, but it is defined when I import it from torch. Anyway, I fixed the problem with some inefficient code...



from torch import manual_seed
from multiprocessing import Process
def seed(x):
    manual_seed(x)
while True:
    p = Process(target=seed, args=(1,))
    p.start()
    p.join()

 

From what I checked, manual_seed() sets the seed for random number generation, I assume for machine learning? Either way, what you're doing there is endlessly creating processes that just set a seed.

Your first snippet is doing the same thing: every cycle you're storing a seed, and after a second you can have up to thousands of seeds stored in RAM. So if I understand right, that while loop needs to break once some condition is met, for example:
 

a = 0
b = 10
while True:
    if a >= b:
        break  # exit the loop once the condition is met instead of looping forever
    a += 1
    print(a)

 



Why are you seeding it every time? This only ever needs to be done once if at all...

 

As for why this happens: you're calling a function that returns a C++ object which is never garbage collected because it never goes out of scope (info here on why), causing what is known as a memory leak. Every time you run that code the system allocates more memory for a new torch.Generator object which is never freed. As for why this may be avoided with multiprocessing, the reason is probably that multiprocessing spawns a new instance of Python and kills it when it's done, meaning any memory allocated by it is returned to the system. This also means that using that function inside a multiprocessing child process is useless, as the change most likely won't be retained in the parent.
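If you want to check that last point yourself, here's a minimal sketch (my own, not from the docs, assuming torch.initial_seed() reports the seed last set on the default generator in the current process):

from multiprocessing import Process
from torch import manual_seed, initial_seed

def seed_in_child():
    manual_seed(42)  # only changes the seed in the child's own copy of the RNG state

if __name__ == "__main__":
    manual_seed(1)                     # seed the parent process
    p = Process(target=seed_in_child)
    p.start()
    p.join()
    print(initial_seed())              # expected: still 1, the child's change is gone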

 

With Python you can often forget about memory management because it's taken care of by the garbage collector, but you shouldn't assume this is always the case, particularly with modules that have native components and rely on their own internal garbage collector, like PyTorch.
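If you want to see the growth for yourself rather than eyeballing Task Manager, a quick-and-dirty sketch (assuming a Unix-like system, where the resource module reports peak RSS in kilobytes) would be:

import resource
from torch import manual_seed

for i in range(1_000_000):
    manual_seed(1)
    if i % 100_000 == 0:
        # peak resident memory so far; if it keeps climbing, something is leaking
        peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(f"iteration {i}: peak RSS ~{peak_kb} kB")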

 

The problem here isn't anything to do with the PyTorch module though; it's your incorrect usage of the manual_seed function. As I mentioned, there's no reason to ever run it more than once with the same value.

 

So your code should look like this:

from torch import manual_seed

manual_seed(1)
while True:
    """do whatever you actually wanted to do here"""

 



32 minutes ago, Sauron said:

As for why this happens: you're calling a function that returns a C++ object

So, to be clear, I could write something like:

foo = manual_seed(blah)

and foo will be an object?

So manual_seed() isn't setting a global seed value?



14 minutes ago, straight_stewie said:

So manual_seed() isn't setting a global seed value?

https://pytorch.org/docs/stable/generated/torch.manual_seed.html

Quote

Sets the seed for generating random numbers. Returns a torch.Generator object.

 

I assume the returned object is a random number generator initialized with the specified seed. So the code is essentially creating an infinite number of random number generators instead of just creating a single one and using it.
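So if that's right, the fix on the OP's side would be to create one generator (or seed once) and reuse it. A rough sketch, assuming the standard torch.Generator API:

import torch

gen = torch.Generator()
gen.manual_seed(1)                       # one generator, seeded once

for _ in range(5):
    print(torch.rand(3, generator=gen))  # reuse the same generator every time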



31 minutes ago, straight_stewie said:

So, to be clear, I could write something like:


foo = manual_seed(blah)

and foo will be an object?

So manual_seed() isn't setting a global seed value?

The documentation states it returns a torch.Generator object, which comes from a C++ class. I only skimmed it and I don't use PyTorch much, but I'm assuming you're supposed to use the newly generated object rather than the default global one.



1 hour ago, Sauron said:

Why are you seeding it every time? This only ever needs to be done once if at all...

 

As for why this happens: you're calling a function that returns a C++ object which is never garbage collected because it never goes out of scope (info here on why), causing what is known as a memory leak. Every time you run that code the system allocates more memory for a new torch.Generator object which is never freed. As for why this may be avoided with multiprocessing, the reason is probably that multiprocessing spawns a new instance of Python and kills it when it's done, meaning any memory allocated by it is returned to the system. This also means that using that function inside a multiprocessing child process is useless, as the change most likely won't be retained in the parent.

 

With Python you can often forget about memory management because it's taken care of by the garbage collector, but you shouldn't assume this is always the case, particularly with modules that have native components and rely on their own internal garbage collector, like PyTorch.

 

The problem here isn't anything to do with the PyTorch module though; it's your incorrect usage of the manual_seed function. As I mentioned, there's no reason to ever run it more than once with the same value.

 

So your code should look like this:


from torch import manual_seed

manual_seed(1)
while True:
    """do whatever you actually wanted to do here"""

 

heyyyy so my limited understanding carried me to the right conclusion hehe


