
Python 3: why does this simple 3-liner use so much RAM?

Poet129

If you let this script run, it will slowly eat all your RAM. What can I do to fix it?

from torch import manual_seed
while True:
    manual_seed(1)

 


8 minutes ago, Poet129 said:

If you let this script run, it will slowly eat all your RAM. What can I do to fix it?


from torch import manual_seed
while True:
    manual_seed(1)

 

I don't know much about coding (only what I've overheard from my dad about memory leaks in this case), but aren't you supposed to have an additional line that stops the process once manual_seed has been imported?



10 minutes ago, Poet129 said:

If you let this script run, it will slowly eat all your RAM. What can I do to fix it?


from torch import manual_seed
while True:
    manual_seed(1)

 

Not sure if you're serious, and not sure what "manual_seed(1)" does, but I am sure that "while True" basically means "run this forever and ever until I stop it".

So if "manual_seed(1)" does anything with memory, it will use a bunch of it if you let it run long enough.

"We're all in this together, might as well be friends" Tom, Toonami.

 

mini eLiXiVy: my open source 65% mechanical PCB, a build log, PCB anatomy and discussing open source licenses: https://linustechtips.com/topic/1366493-elixivy-a-65-mechanical-keyboard-build-log-pcb-anatomy-and-how-i-open-sourced-this-project/

 

mini_cardboard: a 4% keyboard build log and how keyboards workhttps://linustechtips.com/topic/1328547-mini_cardboard-a-4-keyboard-build-log-and-how-keyboards-work/


1 minute ago, ki8aras said:

I don't know much about coding (only what I've overheard from my dad about memory leaks in this case), but aren't you supposed to have an additional line that stops the process once manual_seed has been imported?

I'm aware, but my point is that just setting the seed to the same value over and over uses many, many gigabytes of RAM; this is just an example. I use that function a bunch of times in a real script, and it seems to use more memory each time, just like this does.

 

1 minute ago, minibois said:

Not sure if you're serious, and not sure what "manual_seed(1)" does, but I am sure that "while True" basically means "run this forever and ever until I stop it".

So if "manual_seed(1)" does anything with memory, it will use a bunch of it if you let it run long enough.

Yes, but because of the GIL it can only run one call at a time, so it should use a fixed amount of memory.

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, Poet129 said:

I'm aware, but my point is that just setting the seed to the same value over and over uses many, many gigabytes of RAM; this is just an example. I use that function a bunch of times in a real script, and it seems to use more memory each time, just like this does.

 

Yes, but because of the GIL it can only run one call at a time, so it should use a fixed amount of memory.

You haven't defined manual_seed(1), so you shouldn't expect me to know what the code does.

For all I know, it creates a new object every time, which would indeed use more and more memory the longer the program runs, GIL or not.

"We're all in this together, might as well be friends" Tom, Toonami.

 

mini eLiXiVy: my open source 65% mechanical PCB, a build log, PCB anatomy and discussing open source licenses: https://linustechtips.com/topic/1366493-elixivy-a-65-mechanical-keyboard-build-log-pcb-anatomy-and-how-i-open-sourced-this-project/

 

mini_cardboard: a 4% keyboard build log and how keyboards workhttps://linustechtips.com/topic/1328547-mini_cardboard-a-4-keyboard-build-log-and-how-keyboards-work/


2 minutes ago, minibois said:

You haven't defined manual_seed(1), so you shouldn't expect me to know what the code does.

For all I know, it creates a new object every time, which would indeed use more and more memory the longer the program runs, GIL or not.

OK, but it is defined when I import it from torch. Anyway, I fixed the problem with some inefficient code...

from torch import manual_seed
from multiprocessing import Process
def seed(x):
    manual_seed(x)
while True:
    p = Process(target=seed, args=(1,))
    p.start()
    p.join()

 


1 hour ago, Poet129 said:

OK, but it is defined when I import it from torch. Anyway, I fixed the problem with some inefficient code...


from torch import manual_seed
from multiprocessing import Process
def seed(x):
    manual_seed(x)
while True:
    p = Process(target=seed, args=(1,))
    p.start()
    p.join()

 

You're just moving the problem elsewhere, into another process.


1 hour ago, Poet129 said:

OK, but it is defined when I import it from torch. Anyway, I fixed the problem with some inefficient code...



from torch import manual_seed
from multiprocessing import Process
def seed(x):
    manual_seed(x)
while True:
    p = Process(target=seed, args=(1,))
    p.start()
    p.join()

 

From what I checked, manual_seed() sets the seed for random number generation, I assume for machine learning? Either way, what you're doing there is endlessly creating processes that just set a seed.

Your first snippet is doing the same thing: every cycle you're storing a seed, and after a second you can have up to thousands of seeds stored in RAM. So if I understand right, that while loop needs to break once some condition is met, for example:
 

a = 0
b = 10
while True:
    if a >= b:
        break  # exit the loop once the condition is met instead of looping forever
    a += 1
    print(a)

 



Why are you seeding it every time? This only ever needs to be done once if at all...

 

As for why this happens: you're calling a function that returns a C++ object which is never garbage collected because it never goes out of scope (info here on why), causing what is known as a memory leak. Every time you run that code the system allocates more memory for a new torch.Generator object which is never freed. As for why this may be avoided with multiprocessing, the reason is probably that multiprocessing spawns a new instance of Python and kills it when it's done, meaning any memory allocated by it is returned to the system. This also means that using that function inside a multiprocessing child process is useless, as the change most likely won't be retained in the parent.
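If you want to check that last point yourself, here's a minimal sketch (my own, not from the docs, assuming torch.initial_seed() reports the seed last set on the default generator in the current process):

from multiprocessing import Process
from torch import manual_seed, initial_seed

def seed_in_child():
    manual_seed(42)  # only changes the seed in the child's own copy of the RNG state

if __name__ == "__main__":
    manual_seed(1)                     # seed the parent process
    p = Process(target=seed_in_child)
    p.start()
    p.join()
    print(initial_seed())              # expected: still 1, the child's change is gone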

 

With Python you can often forget about memory management because it's taken care of by the garbage collector, but you shouldn't assume this is always the case, particularly with modules that have native components and rely on their own internal garbage collector, like PyTorch.
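If you want to see the growth for yourself rather than eyeballing Task Manager, a quick-and-dirty sketch (assuming a Unix-like system, where the resource module reports peak RSS in kilobytes) would be:

import resource
from torch import manual_seed

for i in range(1_000_000):
    manual_seed(1)
    if i % 100_000 == 0:
        # peak resident memory so far; if it keeps climbing, something is leaking
        peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(f"iteration {i}: peak RSS ~{peak_kb} kB")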

 

The problem here isn't anything to do with the PyTorch module though; it's your incorrect usage of the manual_seed function. As I mentioned, there's no reason to ever run it more than once with the same value.

 

So your code should look like this:

from torch import manual_seed

manual_seed(1)
while True:
    """do whatever you actually wanted to do here"""

 



32 minutes ago, Sauron said:

As for why this happens: you're calling a function that returns a C++ object

So, to be clear, I could write something like:

foo = manual_seed(blah)

and foo will be an object?

So manual_seed() isn't setting a global seed value?



14 minutes ago, straight_stewie said:

So manual_seed() isn't setting a global seed value?

https://pytorch.org/docs/stable/generated/torch.manual_seed.html

Quote

Sets the seed for generating random numbers. Returns a torch.Generator object.

 

I assume the returned object is a random number generator initialized with the specified seed. So the code is essentially creating an infinite number of random number generators instead of just creating a single one and using it.
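So if that's right, the fix on the OP's side would be to create one generator (or seed once) and reuse it. A rough sketch, assuming the standard torch.Generator API:

import torch

gen = torch.Generator()
gen.manual_seed(1)                       # one generator, seeded once

for _ in range(5):
    print(torch.rand(3, generator=gen))  # reuse the same generator every time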



31 minutes ago, straight_stewie said:

So, to be clear, I could write something like:


foo = manual_seed(blah)

and foo will be an object?

So manual_seed() isn't setting a global seed value?

The documentation states it returns a torch.Generator object, which comes from a C++ class. I only skimmed it and I don't use PyTorch much, but I'm assuming you're supposed to use the newly generated object rather than the default global one.



1 hour ago, Sauron said:

Why are you seeding it every time? This only ever needs to be done once if at all...

 

As for why this happens: you're calling a function that returns a C++ object which is never garbage collected because it never goes out of scope (info here on why), causing what is known as a memory leak. Every time you run that code the system allocates more memory for a new torch.Generator object which is never freed. As for why this may be avoided with multiprocessing, the reason is probably that multiprocessing spawns a new instance of Python and kills it when it's done, meaning any memory allocated by it is returned to the system. This also means that using that function inside a multiprocessing child process is useless, as the change most likely won't be retained in the parent.

 

With Python you can often forget about memory management because it's taken care of by the garbage collector, but you shouldn't assume this is always the case, particularly with modules that have native components and rely on their own internal garbage collector, like PyTorch.

 

The problem here isn't anything to do with the PyTorch module though; it's your incorrect usage of the manual_seed function. As I mentioned, there's no reason to ever run it more than once with the same value.

 

So your code should look like this:


from torch import manual_seed

manual_seed(1)
while True:
    """do whatever you actually wanted to do here"""

 

heyyyy so my limited understanding carried me to the right conclusion hehe


