Why can a 352GB NumPy ndarray be used on an 8GB memory macOS computer?












import numpy as np

array = np.zeros((210000, 210000)) # default numpy.float64
array.nbytes


When I run the above code on my MacBook (macOS, 8GB of memory), no error occurs. But when I run the same code on a Windows 10 PC with 16GB of memory, an Ubuntu laptop with 12GB, or even a Linux supercomputer with 128GB, the Python interpreter raises a MemoryError. All the test environments have 64-bit Python 3.6 or 3.7 installed.
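For reference, the size in the title follows from the shape and the element size alone, so it can be checked without allocating anything. A minimal sketch, where expected_nbytes is a hypothetical helper and 8 bytes per element assumes the default numpy.float64:

```python
def expected_nbytes(shape, itemsize=8):
    """Bytes needed for a dense array of this shape (nothing is allocated).

    itemsize=8 assumes numpy.float64, the default dtype of np.zeros().
    """
    total = itemsize
    for dim in shape:
        total *= dim
    return total


nbytes = expected_nbytes((210000, 210000))
print(nbytes)                      # 352800000000
print(round(nbytes / 10 ** 9, 1))  # 352.8 (decimal GB)
print(round(nbytes / 2 ** 30, 1))  # 328.6 (GiB)
```

The same number is what array.nbytes reports once the array exists.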










  • macOS extends memory with virtual memory on your disk. Check your process details with Activity Monitor and you'll find a Virtual Memory: 332.71 GB entry. But it's all zeros, so it compresses really, really well.

    – Martijn Pieters
    6 hours ago











  • @MartijnPieters but Windows 10 and Linux also have similar mechanisms. Windows 10 has virtual memory and Linux has swap. Activity Monitor doesn't show a VM entry of 332.71 GB. I used sysctl vm.swapusage to see the real VM usage and got 1200 M

    – Blaise Wang
    6 hours ago











  • But they don't compress.

    – Martijn Pieters
    6 hours ago











  • @MartijnPieters The problem is that Windows 10 added support for RAM compression since build 10525, but it still cannot run the above code.

    – Blaise Wang
    3 hours ago
















python macos numpy memory














asked 6 hours ago by Blaise Wang
edited 1 hour ago by Boann




















2 Answers
































You are most likely using Mac OS X Mavericks or newer, so 10.9 or up. From that version onwards, macOS uses virtual memory compression: memory requirements that exceed your physical memory are not only redirected to memory pages on disk, but those pages are also compressed to save space.



For your ndarray, you may have requested ~332GB of memory, but it's all a contiguous sequence of NUL bytes at the moment, and that compresses really, really well:



[Screenshot: memory stats from the Activity Monitor, showing a Virtual Memory Size of 332.71 GB but a Real Memory Size of 9.3 MB]



That's a screenshot from the Activity Monitor tool, showing the process details of the Python process where I replicated your test (use the (I) icon on the toolbar to open it). It is from the Memory tab, where you can see that the Real Memory Size is only 9.3 MB, against a Virtual Memory Size of 332.71GB.



Once you start setting other values for those indices, you'll quickly see the memory stats increase to gigabytes instead of megabytes:



while True:
    # pick a random (row, column) position and store a random float there
    index = tuple(np.random.randint(array.shape[0], size=2))
    array[index] = np.random.uniform(-10 ** -307, 10 ** 307)


or you can push the limit further by assigning to every index (in batches, so you can watch the memory grow):



# flatten to 1-D, then overwrite the zeros in batches of 10**5 elements
array = array.reshape((-1,))
for i in range(0, array.shape[0], 10**5):
    array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)


The process is eventually terminated; my MacBook Pro doesn't have enough swap space to store gigabytes of hard-to-compress random data:



>>> array = array.reshape((-1,))
>>> for i in range(0, array.shape[0], 10**5):
...     array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)
...
Killed: 9
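To get a feel for the gap between all-zero and random data, here is a rough sketch that compresses 1 MiB of each. It uses zlib purely as a stand-in; it is an assumption for illustration and not the compressor macOS uses, and compression_ratio is a hypothetical helper:

```python
import os
import zlib

SIZE = 1024 * 1024  # 1 MiB sample; the real array is vastly larger

zero_data = bytes(SIZE)         # all NUL bytes, like a fresh np.zeros() buffer
random_data = os.urandom(SIZE)  # stand-in for hard-to-compress random values


def compression_ratio(data):
    """Compressed size divided by original size (smaller means better)."""
    return len(zlib.compress(data)) / len(data)


zero_ratio = compression_ratio(zero_data)
random_ratio = compression_ratio(random_data)
print(f"zeros:  {zero_ratio:.4%} of original size")
print(f"random: {random_ratio:.4%} of original size")
```

The zeros shrink to a tiny fraction of their size, while the random bytes do not shrink at all.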


You could argue that macOS is being too trusting, letting programs request that much memory without bounds, but with memory compression, memory limits are much more fluid. Your np.zeros() array does fit your system, after all. Even though you probably don't actually have the swap space to store the uncompressed data, compressed it all fits fine, so macOS allows it and terminates processes that then take advantage of the generosity.



If you don't want this to happen, use resource.setrlimit() to set limits on RLIMIT_STACK to, say, 2 ** 14, at which point the OS will segfault Python when it exceeds the limits.
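As a rough sketch of the resource.setrlimit() approach, the snippet below reads a limit and computes a lowered soft limit. Note the substitution: it uses RLIMIT_AS (total address space) instead of the RLIMIT_STACK named above. That choice is an assumption on my part, since array buffers live outside the stack, and RLIMIT_AS may be missing or not enforced on some platforms. Where it is enforced, an oversized allocation typically fails and Python reports it as a MemoryError. The helper names capped_limits and apply_address_space_cap are hypothetical.

```python
import resource


def capped_limits(current, max_bytes):
    """Return a (soft, hard) pair whose soft limit is at most max_bytes.

    `current` is a (soft, hard) pair as returned by resource.getrlimit().
    The hard limit is left untouched.
    """
    soft, hard = current
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot set soft above hard.
        max_bytes = min(max_bytes, hard)
    return (max_bytes, hard)


def apply_address_space_cap(max_bytes):
    """Set the soft RLIMIT_AS of the current process (hypothetical helper)."""
    current = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, capped_limits(current, max_bytes))


if __name__ == "__main__":
    print("current RLIMIT_AS:", resource.getrlimit(resource.RLIMIT_AS))
    # Left commented out because it affects the whole process:
    # apply_address_space_cap(4 * 1024 ** 3)  # cap at 4 GiB
```

Only the soft limit is changed, so the same process can raise it again up to the hard limit.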






answered 6 hours ago by Martijn Pieters (edited 2 hours ago)


























  • Memory compression should only matter after allocation has already succeeded. The problem here is probably either memory limits (ulimits on Linux, for example) or, more likely, that the allocator can't find a 300GB chunk. If you split it up into 100 3GB pieces, it would probably work on Windows or Linux (with big enough swap) as well.

    – inf
    5 hours ago











  • @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

    – Martijn Pieters
    5 hours ago











  • Define "run out of memory": do you get a MemoryError, or do you just start filling RAM, swapping, and get OOMed?

    – inf
    5 hours ago











  • @inf: I'm a little reluctant to actually let it run. As the memory has been allocated by the OS (tracemalloc confirms Python has been given the memory allocation), there won't be a MemoryError, so it'll start swapping and eventually get OOMed. But before that point this laptop will be hard to use for a while, as everything else is swapped out first.

    – Martijn Pieters
    5 hours ago











  • I understand :) But that's what I mean. The allocation doesn't even succeed on Ubuntu and Linux, hence the MemoryError.

    – inf
    4 hours ago

































@Martijn Pieters' answer is on the right track, but not quite right: this has nothing to do with memory compression, but instead it has to do with virtual memory.



For example, try running the following code on your machine:



arrays = [np.zeros((21000, 21000)) for _ in range(0, 10000)]


This code allocates 32TiB of memory, but you won't get an error (at least I didn't, on Linux). If I check htop, I see the following:



  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
31362 user       20   0 32.1T 69216 12712 S  0.0  0.4  0:00.22 python


This is because the OS is perfectly willing to overcommit on virtual memory. It won't actually assign pages to physical memory until it needs to. The way it works is:





  • calloc asks the OS for some memory to use.

  • the OS looks in the process's page tables and finds a chunk of memory that it's willing to assign. This is a fast operation; the OS just stores the memory address range in an internal data structure.

  • the program writes to one of the addresses.

  • the OS receives a page fault, at which point it actually assigns the page to physical memory. A page is usually a few KiB in size.

  • the OS passes control back to the program, which proceeds without noticing the interruption.
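The steps above can be sketched with an anonymous memory mapping. This is only an illustration under assumptions: it uses Python's mmap module as a stand-in for what calloc does for a large request, and the 256 MiB size and 16 touched pages are arbitrary demo values.

```python
import mmap

PAGE = mmap.PAGESIZE
SIZE = 256 * 1024 * 1024  # 256 MiB of address space; an arbitrary demo size

# Ask the OS for an anonymous mapping. This reserves address space but
# typically does not back it with physical pages yet.
region = mmap.mmap(-1, SIZE)

# Untouched anonymous memory reads back as zeros.
first_byte_before = region[0]

# Writing one byte per page touches that page, which is the point where
# the OS has to supply real memory for it. Only 16 pages are touched.
touched_pages = 16
for page_index in range(touched_pages):
    region[page_index * PAGE] = 0xFF

bytes_touched = touched_pages * PAGE
print(f"mapped {SIZE} bytes, touched about {bytes_touched} bytes")
region.close()
```

Watching the process in htop or Activity Monitor while this runs should show the virtual size growing by the full mapping while resident memory grows only by roughly the touched pages.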


I have no idea why creating a single huge array doesn't work on Linux or Windows, but I'd expect it to have more to do with the platform's implementation of libc and the limits imposed there than the operating system.



For fun, try running arrays = [np.ones((21000, 21000)) for _ in range(0, 10000)]. You'll definitely get an out-of-memory error, even on macOS or Linux with swap compression. Yes, certain OSes can compress RAM, but they can't compress it to the level that you wouldn't run out of memory.






























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    10














    You are most likely using Mac OS X Mavericks or newer, so 10.9 or up. From that version onwards, MacOS uses virtual memory compression, where memory requirements that exceed your physical memory are not only redirected to memory pages on disk, but those pages are compressed to save space.



    For your ndarray, you may have requested ~332GB of memory, but it's all a contiguous sequence of NUL bytes at the moment, and that compresses really, really well:



    Memory stats from the Activity Monitor, showing a virtual memory size of 332.71 GB but Real Memory Size stat of 9.3 MB



    That's a screenshot from the Activity Monitor tool, with the process details of my Python process where I replicated your test (use the (I) icon on the toolbar to open it); this is from the Memory tab, where you can see that the Real Memory Size column is only 9.3 MB used, against a Virtual Memory Size of 332.71GB.



    Once you start setting other values for those indices, you'll quickly see the memory stats increase to gigabytes instead of megabytes:



    while True:
    index = tuple(np.random.randint(array.shape[0], size=2))
    array[index] = np.random.uniform(-10 ** -307, 10 ** 307)


    or you can push the limit further by assigning to every index (in batches, so you can watch the memory grow):



    array = array.reshape((-1,))
    for i in range(0, array.shape[0], 10**5):
    array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)


    The process is eventually terminated; my Macbook Pro doesn't have enough swap space to store hard-to-compress gigabytes of random data:



    >>> array = array.reshape((-1,))
    >>> for i in range(0, array.shape[0], 10**5):
    ... array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)
    ...
    Killed: 9


    You could argue that MacOS is being too trusting, letting programs request that much memory without bounds, but with memory compression, memory limits are much more fluid. Your np.zeros() array does fit your system, after all. Even though you probably don't actually have the swap space to store the uncompressed data, compressed it all fits fine so MacOS allows it and terminates processes that then take advantage of the generosity.



    If you don't want this to happen, use resource.setrlimit() to set limits on RLIMIT_STACK to, say 2 ** 14, at which point the OS will segfault Python when it exceeds the limits.






    share|improve this answer


























    • Memory compression should only matter after allocation has already succeeded. The problem here is probably rather either memory limits (ulimits on linux for example) or more likely that the allocator doesn't find a 300GB sized chunk. If you split those up into 100 3GB pieces it would probably work on windows or linux (with big enough swap) as well.

      – inf
      5 hours ago











    • @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

      – Martijn Pieters
      5 hours ago











    • Define "run out of memory", do you get a MemoryError or just start filling RAM, swapping and get OOMed?

      – inf
      5 hours ago











    • @inf: I'm a little reluctant to actually let it run.. As the memory has been allocated by the OS (tracemalloc confirms Python has been given the memory allocation), there won't be a MemoryError, so it'll start swapping and eventually OOMed. But before that point this laptop will be hard to use for a while as everything else is swapped out first.

      – Martijn Pieters
      5 hours ago











    • I understand :) But that's what I mean. The allocation doesn't even succeed on ubuntu and linux and hence the MemoryError.

      – inf
      4 hours ago
















    10














    You are most likely using Mac OS X Mavericks or newer, so 10.9 or up. From that version onwards, MacOS uses virtual memory compression, where memory requirements that exceed your physical memory are not only redirected to memory pages on disk, but those pages are compressed to save space.



    For your ndarray, you may have requested ~332GB of memory, but it's all a contiguous sequence of NUL bytes at the moment, and that compresses really, really well:



    Memory stats from the Activity Monitor, showing a virtual memory size of 332.71 GB but Real Memory Size stat of 9.3 MB



    That's a screenshot from the Activity Monitor tool, with the process details of my Python process where I replicated your test (use the (I) icon on the toolbar to open it); this is from the Memory tab, where you can see that the Real Memory Size column is only 9.3 MB used, against a Virtual Memory Size of 332.71GB.



    Once you start setting other values for those indices, you'll quickly see the memory stats increase to gigabytes instead of megabytes:



    while True:
    index = tuple(np.random.randint(array.shape[0], size=2))
    array[index] = np.random.uniform(-10 ** -307, 10 ** 307)


    or you can push the limit further by assigning to every index (in batches, so you can watch the memory grow):



    array = array.reshape((-1,))
    for i in range(0, array.shape[0], 10**5):
    array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)


    The process is eventually terminated; my Macbook Pro doesn't have enough swap space to store hard-to-compress gigabytes of random data:



    >>> array = array.reshape((-1,))
    >>> for i in range(0, array.shape[0], 10**5):
    ... array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)
    ...
    Killed: 9


    You could argue that MacOS is being too trusting, letting programs request that much memory without bounds, but with memory compression, memory limits are much more fluid. Your np.zeros() array does fit your system, after all. Even though you probably don't actually have the swap space to store the uncompressed data, compressed it all fits fine so MacOS allows it and terminates processes that then take advantage of the generosity.



    If you don't want this to happen, use resource.setrlimit() to set limits on RLIMIT_STACK to, say 2 ** 14, at which point the OS will segfault Python when it exceeds the limits.






    share|improve this answer


























    • Memory compression should only matter after allocation has already succeeded. The problem here is probably rather either memory limits (ulimits on linux for example) or more likely that the allocator doesn't find a 300GB sized chunk. If you split those up into 100 3GB pieces it would probably work on windows or linux (with big enough swap) as well.

      – inf
      5 hours ago











    • @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

      – Martijn Pieters
      5 hours ago











    • Define "run out of memory", do you get a MemoryError or just start filling RAM, swapping and get OOMed?

      – inf
      5 hours ago











    • @inf: I'm a little reluctant to actually let it run.. As the memory has been allocated by the OS (tracemalloc confirms Python has been given the memory allocation), there won't be a MemoryError, so it'll start swapping and eventually OOMed. But before that point this laptop will be hard to use for a while as everything else is swapped out first.

      – Martijn Pieters
      5 hours ago











    • I understand :) But that's what I mean. The allocation doesn't even succeed on ubuntu and linux and hence the MemoryError.

      – inf
      4 hours ago














    10












    10








    10







    You are most likely using Mac OS X Mavericks or newer, so 10.9 or up. From that version onwards, MacOS uses virtual memory compression, where memory requirements that exceed your physical memory are not only redirected to memory pages on disk, but those pages are compressed to save space.



    For your ndarray, you may have requested ~332GB of memory, but it's all a contiguous sequence of NUL bytes at the moment, and that compresses really, really well:



    Memory stats from the Activity Monitor, showing a virtual memory size of 332.71 GB but Real Memory Size stat of 9.3 MB



    That's a screenshot from the Activity Monitor tool, with the process details of my Python process where I replicated your test (use the (I) icon on the toolbar to open it); this is from the Memory tab, where you can see that the Real Memory Size column is only 9.3 MB used, against a Virtual Memory Size of 332.71GB.



    Once you start setting other values for those indices, you'll quickly see the memory stats increase to gigabytes instead of megabytes:



    while True:
    index = tuple(np.random.randint(array.shape[0], size=2))
    array[index] = np.random.uniform(-10 ** -307, 10 ** 307)


    or you can push the limit further by assigning to every index (in batches, so you can watch the memory grow):



    array = array.reshape((-1,))
    for i in range(0, array.shape[0], 10**5):
    array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)


    The process is eventually terminated; my Macbook Pro doesn't have enough swap space to store hard-to-compress gigabytes of random data:



    >>> array = array.reshape((-1,))
    >>> for i in range(0, array.shape[0], 10**5):
    ... array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)
    ...
    Killed: 9


    You could argue that MacOS is being too trusting, letting programs request that much memory without bounds, but with memory compression, memory limits are much more fluid. Your np.zeros() array does fit your system, after all. Even though you probably don't actually have the swap space to store the uncompressed data, compressed it all fits fine so MacOS allows it and terminates processes that then take advantage of the generosity.



    If you don't want this to happen, use resource.setrlimit() to set limits on RLIMIT_STACK to, say 2 ** 14, at which point the OS will segfault Python when it exceeds the limits.






    share|improve this answer















    You are most likely using Mac OS X Mavericks or newer, so 10.9 or up. From that version onwards, MacOS uses virtual memory compression, where memory requirements that exceed your physical memory are not only redirected to memory pages on disk, but those pages are compressed to save space.



    For your ndarray, you may have requested ~332GB of memory, but it's all a contiguous sequence of NUL bytes at the moment, and that compresses really, really well:



    Memory stats from the Activity Monitor, showing a virtual memory size of 332.71 GB but Real Memory Size stat of 9.3 MB



    That's a screenshot from the Activity Monitor tool, with the process details of my Python process where I replicated your test (use the (I) icon on the toolbar to open it); this is from the Memory tab, where you can see that the Real Memory Size column is only 9.3 MB used, against a Virtual Memory Size of 332.71GB.



    Once you start setting other values for those indices, you'll quickly see the memory stats increase to gigabytes instead of megabytes:



    while True:
    index = tuple(np.random.randint(array.shape[0], size=2))
    array[index] = np.random.uniform(-10 ** -307, 10 ** 307)


    or you can push the limit further by assigning to every index (in batches, so you can watch the memory grow):



    array = array.reshape((-1,))
    for i in range(0, array.shape[0], 10**5):
    array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)


    The process is eventually terminated; my Macbook Pro doesn't have enough swap space to store hard-to-compress gigabytes of random data:



    >>> array = array.reshape((-1,))
    >>> for i in range(0, array.shape[0], 10**5):
    ... array[i:i + 10**5] = np.random.uniform(-10 ** -307, 10 ** 307, 10**5)
    ...
    Killed: 9


    You could argue that MacOS is being too trusting, letting programs request that much memory without bounds, but with memory compression, memory limits are much more fluid. Your np.zeros() array does fit your system, after all. Even though you probably don't actually have the swap space to store the uncompressed data, compressed it all fits fine so MacOS allows it and terminates processes that then take advantage of the generosity.



    If you don't want this to happen, use resource.setrlimit() to set limits on RLIMIT_STACK to, say 2 ** 14, at which point the OS will segfault Python when it exceeds the limits.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 2 hours ago

























    answered 6 hours ago









    Martijn PietersMartijn Pieters

    716k13825002313




    716k13825002313













    • Memory compression should only matter after allocation has already succeeded. The problem here is probably rather either memory limits (ulimits on linux for example) or more likely that the allocator doesn't find a 300GB sized chunk. If you split those up into 100 3GB pieces it would probably work on windows or linux (with big enough swap) as well.

      – inf
      5 hours ago











    • @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

      – Martijn Pieters
      5 hours ago











    • Define "run out of memory", do you get a MemoryError or just start filling RAM, swapping and get OOMed?

      – inf
      5 hours ago











    • @inf: I'm a little reluctant to actually let it run.. As the memory has been allocated by the OS (tracemalloc confirms Python has been given the memory allocation), there won't be a MemoryError, so it'll start swapping and eventually OOMed. But before that point this laptop will be hard to use for a while as everything else is swapped out first.

      – Martijn Pieters
      5 hours ago











    • I understand :) But that's what I mean. The allocation doesn't even succeed on ubuntu and linux and hence the MemoryError.

      – inf
      4 hours ago



















    • Memory compression should only matter after allocation has already succeeded. The problem here is probably rather either memory limits (ulimits on linux for example) or more likely that the allocator doesn't find a 300GB sized chunk. If you split those up into 100 3GB pieces it would probably work on windows or linux (with big enough swap) as well.

      – inf
      5 hours ago











    • @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

      – Martijn Pieters
      5 hours ago











    • Define "run out of memory", do you get a MemoryError or just start filling RAM, swapping and get OOMed?

      – inf
      5 hours ago











    • @inf: I'm a little reluctant to actually let it run.. As the memory has been allocated by the OS (tracemalloc confirms Python has been given the memory allocation), there won't be a MemoryError, so it'll start swapping and eventually OOMed. But before that point this laptop will be hard to use for a while as everything else is swapped out first.

      – Martijn Pieters
      5 hours ago











    • I understand :) But that's what I mean. The allocation doesn't even succeed on ubuntu and linux and hence the MemoryError.

      – inf
      4 hours ago

















    Memory compression should only matter after allocation has already succeeded. The problem here is probably rather either memory limits (ulimits on linux for example) or more likely that the allocator doesn't find a 300GB sized chunk. If you split those up into 100 3GB pieces it would probably work on windows or linux (with big enough swap) as well.

    – inf
    5 hours ago





    Memory compression should only matter after allocation has already succeeded. The problem here is probably rather either memory limits (ulimits on linux for example) or more likely that the allocator doesn't find a 300GB sized chunk. If you split those up into 100 3GB pieces it would probably work on windows or linux (with big enough swap) as well.

    – inf
    5 hours ago













    @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

    – Martijn Pieters
    5 hours ago





    @inf: I don't have 300GB free on my SSD. I do run out of memory when I start filling the array, randomly.

    – Martijn Pieters
    5 hours ago













    Define "run out of memory", do you get a MemoryError or just start filling RAM, swapping and get OOMed?

    – inf
    5 hours ago


















    0














    @Martijn Pieters' answer is on the right track, but not quite right: this has to do with virtual memory, not with memory compression.



    For example, try running the following code on your machine:



    import numpy as np
    arrays = [np.zeros((21000, 21000)) for _ in range(0, 10000)]


    This code allocates 32TiB of memory, but you won't get an error (at least I didn't, on Linux). If I check htop, I see the following:



      PID USER  PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
    31362 user   20   0 32.1T 69216 12712 S  0.0  0.4 0:00.22 python
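
The 32 TiB figure (and the 352 GB in the question title) can be checked with plain arithmetic. This is a minimal sketch that assumes the default numpy.float64 dtype at 8 bytes per element; the helper name array_bytes is made up for illustration.

```python
ITEMSIZE = 8  # bytes per numpy.float64 element (assumed default dtype)


def array_bytes(rows, cols, itemsize=ITEMSIZE):
    """Bytes needed for a dense rows x cols array (hypothetical helper)."""
    return rows * cols * itemsize


per_array = array_bytes(21000, 21000)        # one array from the list
total = per_array * 10000                    # all 10000 arrays
question_array = array_bytes(210000, 210000) # the single array in the question

if __name__ == "__main__":
    print(f"per array: {per_array / 2**30:.2f} GiB")
    print(f"total:     {total / 2**40:.1f} TiB")
    print(f"question:  {question_array / 10**9:.1f} GB")
```

The total lines up with the VIRT column in the htop output, while RES stays tiny because almost none of those pages have been touched.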


    This is because the OS is perfectly willing to overcommit on virtual memory. It won't actually assign pages to physical memory until it needs to. The way it works is:





    • calloc asks the OS for some memory to use

    • the OS looks in the process's page tables and finds a range of virtual addresses that it's willing to assign. This is a fast operation: the OS just stores the address range in an internal data structure.

    • the program writes to one of the addresses.

    • the OS receives a page fault, at which point it actually assigns the page to physical memory. A page is usually a few KiB in size.

    • the OS passes control back to the program, which proceeds without noticing the interruption.
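
The steps above can be watched on a small scale. The sketch below is not what NumPy does internally: it swaps in an anonymous mmap for calloc and uses the Unix-only resource module, so it assumes Linux or macOS. The function names and the 64 MiB size are chosen purely for illustration. Note that ru_maxrss is a peak value, reported in KiB on Linux and in bytes on macOS.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE
SIZE = 64 * 1024 * 1024  # 64 MiB of virtual address space (arbitrary demo size)


def peak_rss_raw():
    # Peak resident set size of this process, in platform-dependent units.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def demo(size=SIZE):
    """Map memory, then touch every page; return the peak RSS at each stage."""
    before = peak_rss_raw()
    # Anonymous private mapping: the OS hands out address space only.
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after_map = peak_rss_raw()
    # Fresh anonymous pages read back as zero.
    untouched_is_zero = buf[size - 1] == 0
    # Writing one byte per page forces a page fault for each page.
    for offset in range(0, size, PAGE):
        buf[offset] = 1
    after_touch = peak_rss_raw()
    buf.close()
    return before, after_map, after_touch, untouched_is_zero


if __name__ == "__main__":
    before, after_map, after_touch, _ = demo()
    print("growth after mapping: ", after_map - before)
    print("growth after touching:", after_touch - after_map)
```

On a typical system the growth after mapping should be close to zero, and the growth after touching should be roughly the size of the mapping. Exact numbers depend on the platform and on what else the interpreter has allocated.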


    I have no idea why creating a single huge array doesn't work on Linux or Windows, but I'd expect it to have more to do with the platform's implementation of libc and the limits imposed there than the operating system.
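
Two settings that might be worth inspecting on a machine where the single large allocation fails are the Linux overcommit mode and the per-process address-space limit (the ulimits mentioned in the comments). This is only a sketch of how to read them, not a diagnosis: nothing here establishes that either one is the cause. The function names are made up; /proc/sys/vm/overcommit_memory is Linux-only, and the resource module is Unix-only.

```python
import resource


def linux_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the Linux overcommit mode, or None if it cannot be read.

    0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def address_space_limit():
    """Return the (soft, hard) RLIMIT_AS pair for this process.

    A value equal to resource.RLIM_INFINITY means no limit is set.
    """
    return resource.getrlimit(resource.RLIMIT_AS)


if __name__ == "__main__":
    print("overcommit mode:", linux_overcommit_mode())
    print("RLIMIT_AS (soft, hard):", address_space_limit())
```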



    For fun, try running arrays = [np.ones((21000, 21000)) for _ in range(0, 10000)]. Unlike np.zeros, np.ones has to write a value into every element, so every page is touched and must be backed by physical memory or swap. You'll definitely get an out-of-memory error, even on macOS or Linux with swap compression. Yes, certain OSes can compress RAM, but not to the point where you wouldn't run out of memory.















        answered 7 mins ago









        user60561





























