
Is malloc deterministic?

Posted by: admin, November 30, 2017

Questions:

Is malloc deterministic? Say I have a forked process, that is, a replica of another process, and at some point both of them call malloc. Would the address returned be the same in both processes, assuming that the rest of the execution is also deterministic?

Note: I’m only talking about virtual addresses here, not physical ones.

Answers:

There is no reason at all for it to be deterministic. In fact, there can be some benefit to it not being deterministic, for example making bugs harder to exploit (see also this paper).

This randomness can help make exploits harder to write. To successfully exploit a buffer overflow you typically need to do two things:

  1. Deliver a payload into a predictable/known memory location
  2. Cause execution to jump to that location

If the memory location is unpredictable, making that jump can become a lot harder.

The relevant quote from the C99 standard, §7.20.3.3/2:

The malloc function allocates space for an object whose size is
specified by size and whose value is indeterminate

If the intention were to make it deterministic, the standard would say so explicitly.

Even if it looks deterministic today, I wouldn’t bet on it remaining so with a newer kernel or a newer libc/GCC version.

Answers:

The C99 spec (at least, in its final public draft) states in ‘J.1 Unspecified behavior’:

The following are unspecified:

The order and contiguity of storage allocated by successive calls to
the calloc, malloc, and realloc functions (7.20.3).

So it would seem that malloc doesn’t have to be deterministic. It therefore isn’t safe to assume that it is.

Answers:

That depends entirely on the malloc implementation. There’s no inherent reason why a particular malloc implementation would introduce non-determinism (except possibly as an application fuzzing aid, and even then it ought to be disabled by default). For example, Doug Lea’s malloc does not call rand(3) or anything similar.

However, malloc calls into the kernel, using sbrk(2) or mmap(2) on Linux or VirtualAlloc on Windows, and those system calls may not always be deterministic, even in otherwise identical processes. The kernel may intentionally hand out different mmap'ed addresses in different processes for whatever reason.

So for small allocations, which are usually serviced in user space without a system call, the resulting pointers will quite likely be the same after a fork(). Large allocations that are serviced by a system call may be the same, but there is less reason to expect it.

In general, though, do not depend on it. If you really need identical pointers in separate processes, either create them before forking, or use shared memory and share them appropriately.

Answers:

It depends on the details of the malloc implementation. A typical malloc implementation (e.g., dlmalloc) used to be deterministic, simply because the algorithm itself is deterministic.

However, in response to security attacks such as heap overflow attacks, some malloc implementations (that is, heap managers) have introduced randomness. (The entropy is relatively small, because a heap manager must also care about speed and space.) So you should not assume strict determinism from a heap manager.

Also, when you fork a process, there are various sources of randomness including ASLR.

Answers:

Yes, it’s deterministic to some degree, but that doesn’t necessarily mean it’ll give identical results in two forks of a process.

Just for example, the Single Unix Specification says: “[…] to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.”

For better or worse, malloc is not in the list of “async-signal-safe” functions.

This limitation appears in a section that discusses multithreaded programs, but the text doesn’t say whether it applies only to multithreaded programs or to single-threaded programs as well.

Conclusion: you can’t count on malloc producing identical results in the parent and the child. If the program is multithreaded, you can’t count on malloc working at all in the child until it has called exec, and it is reasonable to question whether it is guaranteed to work even in a single-threaded child before the child calls exec.

References:

  1. fork specification
  2. async-signal safe functions
Answers:

You won’t get the same physical address. If you have processes A and B, each call to malloc returns the address of a free block. The order in which A and B call malloc is not predictable, but the calls never happen “at the same moment”.

Answers:

Technically, if the forked processes both request the same size of block, they should get the same address allocated, but each of those addresses will point to a different physical/real memory location.

Linux uses copy-on-write for fork, so a forked child shares its parent’s memory pages until either process writes to one of them. At that point the kernel copies the affected page, so the writing process gets its own private copy.