Discussion:
[PATCH] Permit zero-sized qemu_malloc() & friends
Markus Armbruster
2009-11-30 13:55:34 UTC
Permalink
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.

Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.

If a change breaks two uses, it probably breaks more. As a quick check,
I reviewed the first six of more than 200 uses of qemu_mallocz(),
qemu_malloc() and qemu_realloc() that don't have an argument of the form
sizeof(...) or similar:

* load_elf32(), load_elf64() in hw/elf_ops.h:

size = ehdr.e_phnum * sizeof(phdr[0]);
lseek(fd, ehdr.e_phoff, SEEK_SET);
phdr = qemu_mallocz(size);

This aborts when the ELF file has no program header table, because
then e_phnum is zero (TIS ELF 1.2 page 1-6). Even if we require the
ELF file to have a program header table, aborting is not an acceptable
way to reject an unsuitable or malformed ELF file.

* load_elf32(), load_elf64() in hw/elf_ops.h:

mem_size = ph->p_memsz;
/* XXX: avoid allocating */
data = qemu_mallocz(mem_size);

This aborts when the segment occupies zero bytes in the image file
(TIS ELF 1.2 page 2-2).

* bdrv_open2() in block.c:

bs->opaque = qemu_mallocz(drv->instance_size);

However, vvfat_write_target.instance_size is zero. Not sure whether
this actually bites or is "only" a time bomb.

* qemu_aio_get() in block.c:

acb = qemu_mallocz(pool->aiocb_size);

No existing instance of BlockDriverAIOCB has a zero aiocb_size.
Probably okay.

* defaultallocator_create_displaysurface() in console.c has

surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);

With Xen, surface->linesize and surface->height come out of
xenfb_configure_fb(), which set xenfb->width and xenfb->height to
values obtained from the guest. If a buggy guest sets one to zero, we
abort. Not an good way to handle that.

Non-Xen uses aren't obviously correct either, but I didn't dig deeper.

* load_device_tree() in device_tree.c:

*sizep = 0;
dt_size = get_image_size(filename_path);
if (dt_size < 0) {
printf("Unable to get size of device tree file '%s'\n",
filename_path);
goto fail;
}

/* Expand to 2x size to give enough room for manipulation. */
dt_size *= 2;
/* First allocate space in qemu for device tree */
fdt = qemu_mallocz(dt_size);

We abort if the image is empty. Not an acceptable way to handle that.

Based on this sample, I'm confident there are many more uses broken by
the change.

Signed-off-by: Markus Armbruster <***@redhat.com>
---
block/qcow2-snapshot.c | 5 -----
qemu-malloc.c | 14 ++------------
2 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 94cb838..e3b208c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
QCowSnapshot *sn;
int i;

- if (!s->nb_snapshots) {
- *psn_tab = NULL;
- return s->nb_snapshots;
- }
-
sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@ void qemu_free(void *ptr)

void *qemu_malloc(size_t size)
{
- if (!size) {
- abort();
- }
- return oom_check(malloc(size));
+ return oom_check(malloc(size ? size : 1));
}

void *qemu_realloc(void *ptr, size_t size)
{
- if (size) {
- return oom_check(realloc(ptr, size));
- } else {
- if (ptr) {
- return realloc(ptr, size);
- }
- }
- abort();
+ return oom_check(realloc(ptr, size ? size : 1));
}

void *qemu_mallocz(size_t size)
--
1.6.2.5
Avi Kivity
2009-11-30 14:01:19 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc()& friends. Revert that, but take care never to
return a null pointer, like malloc()& friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
I strongly agree with this patch. A consistent API improves quality,
especially when it is also consistent with people's expectations.
--
error compiling committee.c: too many arguments to function
Kevin Wolf
2009-11-30 14:23:10 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
[...]
Based on this sample, I'm confident there are many more uses broken by
the change.
Acked-by: Kevin Wolf <***@redhat.com>

We really should fix the root cause once and for all instead of working
around unexpected qemu_malloc behaviour each time when one of the time
bombs has blown up another (possibly production) VM.
Gerd Hoffmann
2009-12-01 12:40:13 UTC
Permalink
Post by Markus Armbruster
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@ void qemu_free(void *ptr)
void *qemu_malloc(size_t size)
{
- if (!size) {
- abort();
- }
- return oom_check(malloc(size));
+ return oom_check(malloc(size ? size : 1));
}
You might want to have a 'static uint8_t zero_length_malloc[0]' and
return that instead of the magic cookie '1'. Makes the code more
readable IMHO and you'll also have symbol in gdb when debugging qemu.

Even more advanced: Make zero_length_malloc page-sized and
page-aligned, then munmap int, so dereferencing it actually traps.
Post by Markus Armbruster
void *qemu_realloc(void *ptr, size_t size)
{
+ return oom_check(realloc(ptr, size ? size : 1));
qemu_realloc(qemu_malloc(0), 42);

should better work correctly ...

Likewise qemu_free(qemu_malloc(0));

cheers,
Gerd
Gerd Hoffmann
2009-12-01 12:57:28 UTC
Permalink
Post by Gerd Hoffmann
+ return oom_check(malloc(size ? size : 1));
void *qemu_realloc(void *ptr, size_t size)
{
+ return oom_check(realloc(ptr, size ? size : 1));
qemu_realloc(qemu_malloc(0), 42);
should better work correctly ...
Oops, scratch that.

"return malloc(size ? size : 1)" is very different from "return size ?
malloc(size) : 1" which my mind somehow made out of the code line above ...

sorry,
Gerd
Paul Brook
2009-12-01 12:57:27 UTC
Permalink
Post by Gerd Hoffmann
You might want to have a 'static uint8_t zero_length_malloc[0]' and
return that instead of the magic cookie '1'. Makes the code more
readable IMHO and you'll also have symbol in gdb when debugging qemu.
Having multiple malloc return the same pointer sounds like a really bad idea.

Paul
Glauber Costa
2009-12-01 13:47:49 UTC
Permalink
Post by Paul Brook
Post by Gerd Hoffmann
You might want to have a 'static uint8_t zero_length_malloc[0]' and
return that instead of the magic cookie '1'. Makes the code more
readable IMHO and you'll also have symbol in gdb when debugging qemu.
Having multiple malloc return the same pointer sounds like a really bad idea.
And why's that?

Keep in mind that *any* dereference over that address is a bug.

Actually, I very much like Gerd's idea to unmap that address, so the bug
won't hide from us in any circumnstances.
Markus Armbruster
2009-12-01 14:08:47 UTC
Permalink
Post by Glauber Costa
Post by Paul Brook
Post by Gerd Hoffmann
You might want to have a 'static uint8_t zero_length_malloc[0]' and
return that instead of the magic cookie '1'. Makes the code more
readable IMHO and you'll also have symbol in gdb when debugging qemu.
Having multiple malloc return the same pointer sounds like a really bad idea.
And why's that?
Keep in mind that *any* dereference over that address is a bug.
Actually, I very much like Gerd's idea to unmap that address, so the bug
won't hide from us in any circumnstances.
For what it's worth, it violates the spec for malloc(). For zero-sized
allocations, we may either return a null pointer (but we already decided
we don't want to), or an object different from any other object alive.
Thus, we can't return the same non-null pointer for all zero-sized
allocations.

Chapter and verse: ISO/IEC 9899:1999 7.20.3 Memory management functions

The order and contiguity of storage allocated by successive calls to
the calloc, malloc, and realloc functions is unspecified. The
pointer returned if the allocation succeeds is suitably aligned so
that it may be assigned to a pointer to any type of object and then
used to access such an object or an array of such objects in the
space allocated (until the space is explicitly deallocated). The
lifetime of an allocated object extends from the allocation until
the deallocation. Each such allocation shall yield a pointer to an
object disjoint from any other object. The pointer returned points
to the start (lowest byte address) of the allocated space. If the
space cannot be allocated, a null pointer is returned. If the size
of the space requested is zero, the behavior is implementation-
defined: either a null pointer is returned, or the behavior is as if
the size were some nonzero value, except that the returned pointer
shall not be used to access an object.
Gerd Hoffmann
2009-12-01 14:47:56 UTC
Permalink
Post by Markus Armbruster
For what it's worth, it violates the spec for malloc(). For zero-sized
allocations, we may either return a null pointer (but we already decided
we don't want to), or an object different from any other object alive.
Which clearly rules out the fixed memory location for malloc(0).

"malloc(size ? size : 1)" is indeed the easiest way to make sure we
conform to the spec and also don't return NULL.

cheers,
Gerd
Paul Brook
2009-12-01 14:21:45 UTC
Permalink
Post by Glauber Costa
Post by Paul Brook
Post by Gerd Hoffmann
You might want to have a 'static uint8_t zero_length_malloc[0]' and
return that instead of the magic cookie '1'. Makes the code more
readable IMHO and you'll also have symbol in gdb when debugging qemu.
Having multiple malloc return the same pointer sounds like a really bad idea.
And why's that?
Keep in mind that *any* dereference over that address is a bug.
Dereferencing the address is a bug. However testing the addresses themselves
for equality is valid. This is much the same reason I think returning NULL
would be a bad idea.

Paul
Markus Armbruster
2009-12-01 13:11:49 UTC
Permalink
Post by Gerd Hoffmann
Post by Markus Armbruster
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@ void qemu_free(void *ptr)
void *qemu_malloc(size_t size)
{
- if (!size) {
- abort();
- }
- return oom_check(malloc(size));
+ return oom_check(malloc(size ? size : 1));
}
You might want to have a 'static uint8_t zero_length_malloc[0]' and
return that instead of the magic cookie '1'. Makes the code more
readable IMHO and you'll also have symbol in gdb when debugging qemu.
Complicates qemu_realloc() and qemu_free() somewhat, and that makes me
think we better do it as a separate commit. Agree?
Post by Gerd Hoffmann
Even more advanced: Make zero_length_malloc page-sized and
page-aligned, then munmap int, so dereferencing it actually traps.
Overrunning a malloc'ed buffer very rarely traps, not sure catching this
special case is worth the portability headaches. If you really want to
catch overruns, you need special tools like valgrind or electric fence
anyway.
Post by Gerd Hoffmann
Post by Markus Armbruster
void *qemu_realloc(void *ptr, size_t size)
{
+ return oom_check(realloc(ptr, size ? size : 1));
qemu_realloc(qemu_malloc(0), 42);
should better work correctly ...
Likewise qemu_free(qemu_malloc(0));
Avi Kivity
2009-12-01 14:34:51 UTC
Permalink
Post by Gerd Hoffmann
Even more advanced: Make zero_length_malloc page-sized and
page-aligned, then munmap int, so dereferencing it actually traps.
That will cause vma fragmentation. Since we don't trap malloc(n)[n] for
n > 0, we may as well not trap it for n = 0.
--
error compiling committee.c: too many arguments to function
Gerd Hoffmann
2009-12-01 14:53:13 UTC
Permalink
Post by Avi Kivity
Even more advanced: Make zero_length_malloc page-sized and
page-aligned, then munmap int, so dereferencing it actually traps.
That will cause vma fragmentation.
I was thinking about a single unmapped page, which wouldn't cause
fragmentation. But that is a bad idea too for other reasons ...

cheers,
Gerd
Eduardo Habkost
2009-12-01 15:32:00 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
If a change breaks two uses, it probably breaks more. As a quick check,
I reviewed the first six of more than 200 uses of qemu_mallocz(),
qemu_malloc() and qemu_realloc() that don't have an argument of the form
Acked-by: Eduardo Habkost <***@redhat.com>

This also makes qemu_realloc(NULL, size) completely equivalent to
qemu_malloc(size), and that's a good thing.
--
Eduardo
Anthony Liguori
2009-12-04 16:49:41 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
I still believe that it is poor practice to pass size==0 to *malloc().
I think actively discouraging this in qemu is a good thing because it's
a broken idiom.

That said, we've had this change in for a while and it's painfully
obviously we haven't eliminated all of these instances in qemu. Knowing
that we still have occurrences of this and actively exit()'ing when they
happen is pretty much suicidal in a production environment. It's not a
bug, it's just a poor practice.

Here's what I propose. I think we should introduce a
CONFIG_DEBUG_ZERO_MALLOC. When this is set, we should assert(size!=0).
Otherwise, we should return malloc(1) as Markus' patch does below.

Using the same rules as we follow for -Werror, we should enable
CONFIG_DEBUG_ZERO_MALLOC for the development tree and disable it for any
releases.

This helps us make qemu better during development while not
unnecessarily causing us harm in a production environment. What happens
here long term I think remains to be seen. But for right now, I think
this is an important change to make for 0.12.
Post by Markus Armbruster
---
block/qcow2-snapshot.c | 5 -----
qemu-malloc.c | 14 ++------------
2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 94cb838..e3b208c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
QCowSnapshot *sn;
int i;
- if (!s->nb_snapshots) {
- *psn_tab = NULL;
- return s->nb_snapshots;
- }
-
This does not belong here.

Regards,

Anthony Liguori
Markus Armbruster
2009-12-05 13:55:22 UTC
Permalink
Post by Anthony Liguori
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
I still believe that it is poor practice to pass size==0 to *malloc().
I think actively discouraging this in qemu is a good thing because
it's a broken idiom.
What's the impact of such usage? What would improve for users if it
were eradicated? For developers?

I believe the answer the first two questions is "nothing in particular",
and the answer to the last one is "hassle". But I'd be happy to see
*specific* examples (code, not hand-waving) for where I'm wrong.

I'm opposed to "fixing" something without a real gain, not just because
it's work, but also because like any change it's going to introduce new
bugs.

I'm particularly critical of outlawing zero size for uses like
malloc(N*sizeof(T). These are fairly common in our code. Having to add
special treatment for N==0 is a trap for the unwary. Which means we'll
never be done chasing this stuff. Moreover, the special treatment
clutters the code, and requiring it without a clear benefit is silly.
Show me the benefit, and I'll happily reconsider.

For a specific example, let's look at the first example from my commit
message again: load_elf32(), load_elf64() in hw/elf_ops.h. "Fix" for
its "broken" usage of qemu_mallocz(), shot from the hip, entirely
untested:

diff --git a/hw/elf_ops.h b/hw/elf_ops.h
index 6093dea..003d2ef 100644
--- a/hw/elf_ops.h
+++ b/hw/elf_ops.h
@@ -219,51 +219,53 @@ static int glue(load_elf, SZ)(const char *name, int fd, int64_t address_offset,

glue(load_symbols, SZ)(&ehdr, fd, must_swab, clear_lsb);

- size = ehdr.e_phnum * sizeof(phdr[0]);
- lseek(fd, ehdr.e_phoff, SEEK_SET);
- phdr = qemu_mallocz(size);
- if (!phdr)
- goto fail;
- if (read(fd, phdr, size) != size)
- goto fail;
- if (must_swab) {
- for(i = 0; i < ehdr.e_phnum; i++) {
- ph = &phdr[i];
- glue(bswap_phdr, SZ)(ph);
- }
- }
-
- total_size = 0;
- for(i = 0; i < ehdr.e_phnum; i++) {
- ph = &phdr[i];
- if (ph->p_type == PT_LOAD) {
- mem_size = ph->p_memsz;
- /* XXX: avoid allocating */
- data = qemu_mallocz(mem_size);
- if (ph->p_filesz > 0) {
- if (lseek(fd, ph->p_offset, SEEK_SET) < 0)
- goto fail;
- if (read(fd, data, ph->p_filesz) != ph->p_filesz)
- goto fail;
+ if (ehdr.e_phnum) {
+ size = ehdr.e_phnum * sizeof(phdr[0]);
+ lseek(fd, ehdr.e_phoff, SEEK_SET);
+ phdr = qemu_mallocz(size);
+ if (!phdr)
+ goto fail;
+ if (read(fd, phdr, size) != size)
+ goto fail;
+ if (must_swab) {
+ for(i = 0; i < ehdr.e_phnum; i++) {
+ ph = &phdr[i];
+ glue(bswap_phdr, SZ)(ph);
}
- /* address_offset is hack for kernel images that are
- linked at the wrong physical address. */
- addr = ph->p_paddr + address_offset;
+ }

- snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
- rom_add_blob_fixed(label, data, mem_size, addr);
+ total_size = 0;
+ for(i = 0; i < ehdr.e_phnum; i++) {
+ ph = &phdr[i];
+ if (ph->p_type == PT_LOAD) {
+ mem_size = ph->p_memsz;
+ if (memsize) {
+ data = qemu_mallocz(mem_size);
+ if (ph->p_filesz > 0) {
+ if (lseek(fd, ph->p_offset, SEEK_SET) < 0)
+ goto fail;
+ if (read(fd, data, ph->p_filesz) != ph->p_filesz)
+ goto fail;
+ }
+ /* address_offset is hack for kernel images that are
+ linked at the wrong physical address. */
+ addr = ph->p_paddr + address_offset;

- total_size += mem_size;
- if (addr < low)
- low = addr;
- if ((addr + mem_size) > high)
- high = addr + mem_size;
+ snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
+ rom_add_blob_fixed(label, data, mem_size, addr);

- qemu_free(data);
- data = NULL;
+ total_size += mem_size;
+ qemu_free(data);
+ data = NULL;
+ }
+ if (addr < low)
+ low = addr;
+ if ((addr + mem_size) > high)
+ high = addr + mem_size;
+ }
}
+ qemu_free(phdr);
}
- qemu_free(phdr);
if (lowaddr)
*lowaddr = (uint64_t)(elf_sword)low;
if (highaddr)

Anyone fancy to review a hundred or two of these?

And what would it buy us? Let's see what the old code does:

* It mallocs a buffer for the PHT.

* It seeks to the PHT, and reads it.

* It maybe byte-swaps the PHT.

* It iterates over the PHT to load segments.

Works just fine for empty PHTs and empty segments.

The new code does slightly less work for empty PHTs and segments (rare
case), and very slightly more work for non-empty ones (common case).
Neither of it matters. What matters is that it is nested two levels
deeper.

Benefit?
Post by Anthony Liguori
That said, we've had this change in for a while and it's painfully
obviously we haven't eliminated all of these instances in qemu.
Knowing that we still have occurrences of this and actively exit()'ing
when they happen is pretty much suicidal in a production environment.
Agreed.
Post by Anthony Liguori
It's not a bug, it's just a poor practice.
Here's what I propose. I think we should introduce a
CONFIG_DEBUG_ZERO_MALLOC. When this is set, we should
assert(size!=0). Otherwise, we should return malloc(1) as Markus'
patch does below.
Using the same rules as we follow for -Werror, we should enable
CONFIG_DEBUG_ZERO_MALLOC for the development tree and disable it for
any releases.
This helps us make qemu better during development while not
unnecessarily causing us harm in a production environment. What
happens here long term I think remains to be seen. But for right now,
I think this is an important change to make for 0.12.
We'll clutter the code, and we'll forever chase the uses of malloc()
that haven't received their clutter, yet. CONFIG_DEBUG_ZERO_MALLOC will
exist forever, and be universally off in production builds.

I'd like to see some real benefit for that, please.
Laurent Desnogues
2009-12-05 14:14:45 UTC
Permalink
Hello,

probably a stupid remark, but I'll do it anyway :-)

Let's assume there are some contexts where doing a memory
allocation with a size of 0 is invalid while in some other contexts
it is valid. Wouldn't it make sense in that case to have two
functions that do memory allocation?

Of course such a split would require full review of existing code
and might introduce complexities for instance for realloc (was
the original memory block allocated with a possible size of 0
or not?).


Laurent
malc
2009-12-05 17:08:27 UTC
Permalink
Post by Markus Armbruster
Post by Anthony Liguori
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
I still believe that it is poor practice to pass size==0 to *malloc().
I think actively discouraging this in qemu is a good thing because
it's a broken idiom.
What's the impact of such usage? What would improve for users if it
were eradicated? For developers?
I believe the answer the first two questions is "nothing in particular",
and the answer to the last one is "hassle". But I'd be happy to see
*specific* examples (code, not hand-waving) for where I'm wrong.
I'm opposed to "fixing" something without a real gain, not just because
it's work, but also because like any change it's going to introduce new
bugs.
I'm particularly critical of outlawing zero size for uses like
malloc(N*sizeof(T). These are fairly common in our code. Having to add
special treatment for N==0 is a trap for the unwary. Which means we'll
never be done chasing this stuff. Moreover, the special treatment
clutters the code, and requiring it without a clear benefit is silly.
Show me the benefit, and I'll happily reconsider.
For a specific example, let's look at the first example from my commit
message again: load_elf32(), load_elf64() in hw/elf_ops.h. "Fix" for
its "broken" usage of qemu_mallocz(), shot from the hip, entirely
Excellent example, now consider this:

read$ cat << eof | gcc -x c -
#define _GNU_SOURCE
#include <err.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>

int main (void)
{
int fd = open ("/dev/zero", 0);
int ret;
void *p = (void *) -1;

if (fd == -1) err (1, "open");
ret = read (fd, p, 0);
if (ret != 0) err (1, "read");
return 0;
}
eof
read$ ./a.out
a.out: read: Bad address

Even though that linux's read(2) man page claims [1]:

DESCRIPTION
read() attempts to read up to count bytes from file descriptor fd into
the buffer starting at buf.

If count is zero, read() returns zero and has no other results. If
count is greater than SSIZE_MAX, the result is unspecified.

[..snip..]

P.S. It would be interesting to know how this code behaves under OpenBSD, with
p = malloc (0);

[1] As does, in essence, http://www.opengroup.org/onlinepubs/7990989775/xsh/read.html
--
mailto:***@comtv.ru
Avi Kivity
2009-12-05 17:23:59 UTC
Permalink
Post by Markus Armbruster
What's the impact of such usage? What would improve for users if it
were eradicated? For developers?
I believe the answer the first two questions is "nothing in particular",
and the answer to the last one is "hassle". But I'd be happy to see
*specific* examples (code, not hand-waving) for where I'm wrong.
I'm opposed to "fixing" something without a real gain, not just because
it's work, but also because like any change it's going to introduce new
bugs.
I'm particularly critical of outlawing zero size for uses like
malloc(N*sizeof(T). These are fairly common in our code. Having to add
special treatment for N==0 is a trap for the unwary. Which means we'll
never be done chasing this stuff. Moreover, the special treatment
clutters the code, and requiring it without a clear benefit is silly.
Show me the benefit, and I'll happily reconsider.
For a specific example, let's look at the first example from my commit
message again: load_elf32(), load_elf64() in hw/elf_ops.h. "Fix" for
its "broken" usage of qemu_mallocz(), shot from the hip, entirely
read$ cat<< eof | gcc -x c -
#define _GNU_SOURCE
#include<err.h>
#include<unistd.h>
#include<stdlib.h>
#include<fcntl.h>
int main (void)
{
int fd = open ("/dev/zero", 0);
int ret;
void *p = (void *) -1;
if (fd == -1) err (1, "open");
ret = read (fd, p, 0);
if (ret != 0) err (1, "read");
return 0;
}
eof
read$ ./a.out
a.out: read: Bad address
I don't see how this is relevant. Linux' read(2) may or may not be
broken, but what does that have to do with breaking callers of
qemu_malloc() for certain use cases?
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Reimar Döffinger
2009-12-05 18:30:15 UTC
Permalink
Post by malc
ret = read (fd, p, 0);
if (ret != 0) err (1, "read");
return 0;
}
eof
read$ ./a.out
a.out: read: Bad address
DESCRIPTION
read() attempts to read up to count bytes from file descriptor fd into
the buffer starting at buf.
If count is zero, read() returns zero and has no other results. If
count is greater than SSIZE_MAX, the result is unspecified.
[..snip..]
P.S. It would be interesting to know how this code behaves under OpenBSD, with
p = malloc (0);
[1] As does, in essence, http://www.opengroup.org/onlinepubs/7990989775/xsh/read.html
Which simply means that the linux man page was based on SUSv2 which obviously was quite badly written.
SUSv3 gets it right (proving along the way it is better to use the POSIX man pages, i.e. "man 3 read").
It might be good to suggest them to update the linux man page though.
Markus Armbruster
2009-12-06 07:57:55 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by Anthony Liguori
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
I still believe that it is poor practice to pass size==0 to *malloc().
I think actively discouraging this in qemu is a good thing because
it's a broken idiom.
What's the impact of such usage? What would improve for users if it
were eradicated? For developers?
I believe the answer the first two questions is "nothing in particular",
and the answer to the last one is "hassle". But I'd be happy to see
*specific* examples (code, not hand-waving) for where I'm wrong.
I'm opposed to "fixing" something without a real gain, not just because
it's work, but also because like any change it's going to introduce new
bugs.
I'm particularly critical of outlawing zero size for uses like
malloc(N*sizeof(T). These are fairly common in our code. Having to add
special treatment for N==0 is a trap for the unwary. Which means we'll
never be done chasing this stuff. Moreover, the special treatment
clutters the code, and requiring it without a clear benefit is silly.
Show me the benefit, and I'll happily reconsider.
For a specific example, let's look at the first example from my commit
message again: load_elf32(), load_elf64() in hw/elf_ops.h. "Fix" for
its "broken" usage of qemu_mallocz(), shot from the hip, entirely
read$ cat << eof | gcc -x c -
#define _GNU_SOURCE
#include <err.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
int main (void)
{
int fd = open ("/dev/zero", 0);
int ret;
void *p = (void *) -1;
if (fd == -1) err (1, "open");
ret = read (fd, p, 0);
if (ret != 0) err (1, "read");
return 0;
}
eof
read$ ./a.out
a.out: read: Bad address
DESCRIPTION
read() attempts to read up to count bytes from file descriptor fd into
the buffer starting at buf.
If count is zero, read() returns zero and has no other results. If
count is greater than SSIZE_MAX, the result is unspecified.
[..snip..]
P.S. It would be interesting to know how this code behaves under OpenBSD, with
p = malloc (0);
[1] As does, in essence, http://www.opengroup.org/onlinepubs/7990989775/xsh/read.html
Replace "p = (void *)-1" by "p = NULL" and it works just fine.

malloc() either returns a valid pointer or NULL, so what read() does for
other pointers doesn't matter.

qemu_malloc() always returns a valid pointer, but that's not even needed
in this case.
malc
2009-12-06 08:39:39 UTC
Permalink
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
[..snip..]
Post by Markus Armbruster
Post by malc
P.S. It would be interesting to know how this code behaves under OpenBSD, with
p = malloc (0);
[1] As does, in essence, http://www.opengroup.org/onlinepubs/7990989775/xsh/read.html
Replace "p = (void *)-1" by "p = NULL" and it works just fine.
That's why i asked for somone to run it on OpenBSD:

Quoting http://www.openbsd.org/cgi-bin/man.cgi?query=malloc&apropos=0&sektion=3&manpath=OpenBSD+Current&arch=i386&format=html

Allocation of a zero size object returns a pointer to a zero size object.
This zero size object is access protected, so any access to it will gen-
erate an exception (SIGSEGV). Many zero-sized objects can be placed con-
secutively in shared protected pages. The minimum size of the protection
on each object is suitably aligned and sized as previously stated, but
the protection may extend further depending on where in a protected zone
the object lands.
Post by Markus Armbruster
malloc() either returns a valid pointer or NULL, so what read() does for
other pointers doesn't matter.
Replace "returns" with "should" and this still won't match the wording
of the standard, besides as "read" behaviour on Linux shows following
those are not high on the agenda.

7.20.3.3 The malloc function

Synopsis

[#1]

#include <stdlib.h>
void *malloc(size_t size);

Description

[#2] The malloc function allocates space for an object whose
size is specified by size and whose value is indeterminate.

Returns

[#3] The malloc function returns either a null pointer or a
pointer to the allocated space.

I don't know what a "valid pointer" in this context represents.
Post by Markus Armbruster
qemu_malloc() always returns a valid pointer, but that's not even needed
in this case.
--
mailto:***@comtv.ru
Markus Armbruster
2009-12-06 08:59:48 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
[..snip..]
Post by Markus Armbruster
Post by malc
P.S. It would be interesting to know how this code behaves under OpenBSD, with
p = malloc (0);
[...]
Post by malc
Post by Markus Armbruster
Replace "p = (void *)-1" by "p = NULL" and it works just fine.
Quoting http://www.openbsd.org/cgi-bin/man.cgi?query=malloc&apropos=0&sektion=3&manpath=OpenBSD+Current&arch=i386&format=html
Allocation of a zero size object returns a pointer to a zero size object.
This zero size object is access protected, so any access to it will gen-
erate an exception (SIGSEGV). Many zero-sized objects can be placed con-
secutively in shared protected pages. The minimum size of the protection
on each object is suitably aligned and sized as previously stated, but
the protection may extend further depending on where in a protected zone
the object lands.
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
Post by malc
Post by Markus Armbruster
malloc() either returns a valid pointer or NULL, so what read() does for
other pointers doesn't matter.
Replace "returns" with "should" and this still won't match the wording
of the standard, besides as "read" behaviour on Linux shows following
those are not high on the agenda.
7.20.3.3 The malloc function
Synopsis
[#1]
#include <stdlib.h>
void *malloc(size_t size);
Description
[#2] The malloc function allocates space for an object whose
size is specified by size and whose value is indeterminate.
Returns
[#3] The malloc function returns either a null pointer or a
pointer to the allocated space.
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)

malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.

OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.

Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
malc
2009-12-06 10:22:14 UTC
Permalink
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]

Yet under linux the address is checked even for zero case.
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 10:40:08 UTC
Permalink
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
The implementation needs to track which addresses it handed out, since
it is required that malloc(0) != malloc(0) (unless both are NULL).
--
error compiling committee.c: too many arguments to function
malc
2009-12-06 11:53:46 UTC
Permalink
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
The implementation needs to track which addresses it handed out, since it is
required that malloc(0) != malloc(0) (unless both are NULL).
You haven't read carefully, i said range.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 12:07:06 UTC
Permalink
Post by malc
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
The implementation needs to track which addresses it handed out, since it is
required that malloc(0) != malloc(0) (unless both are NULL).
You haven't read carefully, i said range.
I did in fact. Consider a loop

malloc(0);
p = malloc(0);
while (1) {
n = malloc(0);
free(p);
p = n;
}

without some form of tracking, you won't be able to return unique
addresses eventually.
--
error compiling committee.c: too many arguments to function
malc
2009-12-06 12:11:40 UTC
Permalink
Post by Avi Kivity
Post by malc
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
The implementation needs to track which addresses it handed out, since it is
required that malloc(0) != malloc(0) (unless both are NULL).
You haven't read carefully, i said range.
I did in fact. Consider a loop
malloc(0);
p = malloc(0);
while (1) {
n = malloc(0);
free(p);
p = n;
}
without some form of tracking, you won't be able to return unique addresses
eventually.
So? There will be tracking, it's the same as with OpenBSD where
allocations of zero and greater than zero are handled differently.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 12:23:58 UTC
Permalink
Post by malc
Post by malc
The implementation needs to track which addresses it handed out, since it is
required that malloc(0) != malloc(0) (unless both are NULL).
You haven't read carefully, i said range.
So? There will be tracking, it's the same as with OpenBSD where
allocations of zero and greater than zero are handled differently.
That's exactly what I said. Some tracking is needed.
--
error compiling committee.c: too many arguments to function
Markus Armbruster
2009-12-06 11:10:12 UTC
Permalink
Post by malc
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]
Yet under linux the address is checked even for zero case.
Any value you can obtain from malloc() passes that check.

Why does the fact that you can construct pointers that don't pass this
check matter for our discussion of malloc()?
Post by malc
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.

I'm not sure what your point is. If it is that malloc(0) can return a
value that cannot be passed to a zero-sized read(), then I fear you have
not made your point.
malc
2009-12-06 12:00:51 UTC
Permalink
Post by Markus Armbruster
Post by malc
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]
Yet under linux the address is checked even for zero case.
Any value you can obtain from malloc() passes that check.
Why does the fact that you can construct pointers that don't pass this
check matter for our discussion of malloc()?
Post by malc
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.
I'm not sure what your point is. If it is that malloc(0) can return a
value that cannot be passed to a zero-sized read(), then I fear you have
not made your point.
One more attempt to make it clearer. If you agree that this behaviour
is permissible then the game is lost as things stand now under Linux,
since replacing [1]:

void *p = (void *) -1
with:
void *p = (void *) 0x80000000

or anything else with said bit set will yield EFAULT. Consequently the
code you cited as a well behaving malloc(0) call site will bomb.

[1] Under 32bit Linux that is, with the usual split.
--
mailto:***@comtv.ru
Paolo Bonzini
2009-12-06 16:23:59 UTC
Permalink
Post by malc
or anything else with said bit set will yield EFAULT. Consequently the
code you cited as a well behaving malloc(0) call site will bomb.
It is not well-behaving unless the definition of EFAULT is changed
consistently.

Paolo
Kevin Wolf
2009-12-07 08:35:31 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by malc
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]
Yet under linux the address is checked even for zero case.
Any value you can obtain from malloc() passes that check.
Why does the fact that you can construct pointers that don't pass this
check matter for our discussion of malloc()?
Post by malc
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.
I'm not sure what your point is. If it is that malloc(0) can return a
value that cannot be passed to a zero-sized read(), then I fear you have
not made your point.
One more attempt to make it clearer. If you agree that this behaviour
is permissible then the game is lost as things stand now under Linux,
void *p = (void *) -1
void *p = (void *) 0x80000000
Then this implementation of malloc(0) isn't suitable for a libc running
on Linux - despite its general permissibility. You have a point if, and
only if, you can reproduce such misbehaviour with a real malloc of a
real libc that isn't constructed to work against the kernel but to fit
it. So far I can't see that you have found any.

By the way, this whole discussion about theoretical malloc(0) return
values doesn't even matter here. The patch replaces it by malloc(1),
with which we both are very confident that you can pass its return value
to read for reading 0 bytes, right?

Kevin
Markus Armbruster
2009-12-07 09:42:55 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by malc
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]
Yet under linux the address is checked even for zero case.
Any value you can obtain from malloc() passes that check.
Why does the fact that you can construct pointers that don't pass this
check matter for our discussion of malloc()?
Post by malc
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.
I'm not sure what your point is. If it is that malloc(0) can return a
value that cannot be passed to a zero-sized read(), then I fear you have
not made your point.
One more attempt to make it clearer. If you agree that this behaviour
is permissible then the game is lost as things stand now under Linux,
void *p = (void *) -1
void *p = (void *) 0x80000000
or anything else with said bit set will yield EFAULT. Consequently the
code you cited as a well behaving malloc(0) call site will bomb.
[1] Under 32bit Linux that is, with the usual split.
You can't just pull pointers out of your ear and expect stuff to work.

malloc() is free to return a pointer to allocated space that is set up
in a way that catches access beyond the allocated size. OpenBSD does
that for size zero; it allocates one byte then, from pages that are used
only for zero-sized allocations, and takes care to disable access to
these pages with mprotect(..., PROT_NONE)[*]. Since read(..., 0) does
not access beyond the allocated size, it still works just fine.

If you replace glibc's malloc() to get OpenBSD-like behavior, you can't
just make up some pointer to a memory area you believe to be unused, you
have to do it right, like OpenBSD does.


[*] Check out omalloc_make_chunks() at
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/stdlib/malloc.c?rev=1.121;content-type=text%2Fplain
malc
2009-12-07 10:00:18 UTC
Permalink
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
Post by malc
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]
Yet under linux the address is checked even for zero case.
Any value you can obtain from malloc() passes that check.
Why does the fact that you can construct pointers that don't pass this
check matter for our discussion of malloc()?
Post by malc
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.
Here you agree that it's permissible.
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
I'm not sure what your point is. If it is that malloc(0) can return a
value that cannot be passed to a zero-sized read(), then I fear you have
not made your point.
One more attempt to make it clearer. If you agree that this behaviour
is permissible then the game is lost as things stand now under Linux,
void *p = (void *) -1
void *p = (void *) 0x80000000
or anything else with said bit set will yield EFAULT. Consequently the
code you cited as a well behaving malloc(0) call site will bomb.
[1] Under 32bit Linux that is, with the usual split.
You can't just pull pointers out of your ear and expect stuff to work.
And here you don't. Which renders whole discussion rather pointless.
Post by Markus Armbruster
malloc() is free to return a pointer to allocated space that is set up
in a way that catches access beyond the allocated size. OpenBSD does
that for size zero; it allocates one byte then, from pages that are used
only for zero-sized allocations, and takes care to disable access to
these pages with mprotect(..., PROT_NONE)[*]. Since read(..., 0) does
not access beyond the allocated size, it still works just fine.
If you replace glibc's malloc() to get OpenBSD-like behavior, you can't
just make up some pointer to a memory area you believe to be unused, you
have to do it right, like OpenBSD does.
[*] Check out omalloc_make_chunks() at
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/stdlib/malloc.c?rev=1.121;content-type=text%2Fplain
--
mailto:***@comtv.ru
Kevin Wolf
2009-12-07 10:17:08 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by Markus Armbruster
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.
Here you agree that it's permissible.
He's just differentiating between values that are generally permissible
in terms of the C standard...
Post by malc
Post by Markus Armbruster
You can't just pull pointers out of your ear and expect stuff to work.
And here you don't. Which renders whole discussion rather pointless.
...and values that can be used by a working implementation on Linux.
That's quite a difference. It is obvious that a libc implementation must
match the kernel, or nothing will work.

Kevin
Markus Armbruster
2009-12-07 10:35:22 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
Post by malc
[..snip..]
Post by Markus Armbruster
read(fd, malloc(0), 0) is just fine, because read() doesn't touch the
buffer when the size is zero.
[..snip..]
Yet under linux the address is checked even for zero case.
Any value you can obtain from malloc() passes that check.
Why does the fact that you can construct pointers that don't pass this
check matter for our discussion of malloc()?
Post by malc
Post by Markus Armbruster
Post by malc
I don't know what a "valid pointer" in this context represents.
I can talk standardese, if you prefer :)
malloc() either returns either a null pointer or a pointer to the
allocated space. In either case, you must not dereference the pointer.
OpenBSD chooses to return a pointer to the allocated space. It chooses
to catch common ways to dereference the pointer.
Your "p = (void *)-1" is neither a null pointer nor can it point to
allocated space on your particular system. Hence, it cannot be a value
of malloc() for any argument, and therefore what read() does with it on
that particular system doesn't matter.
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
Misunderstanding? Such behavior is indeed permissible, and I can't see
where I restricted it away. An implementation that behaves as you
describe returns "pointer to allocated space". That the pointer has
some funny bit set doesn't matter. That it can't be dereferenced is
just fine.
Here you agree that it's permissible.
We were talking about ISO C, so by "implementation" I meant an
implementation of ISO C, not an application program using it. Sorry if
I didn't make that sufficiently clear.
Post by malc
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
I'm not sure what your point is. If it is that malloc(0) can return a
value that cannot be passed to a zero-sized read(), then I fear you have
not made your point.
One more attempt to make it clearer. If you agree that this behaviour
is permissible then the game is lost as things stand now under Linux,
void *p = (void *) -1
void *p = (void *) 0x80000000
or anything else with said bit set will yield EFAULT. Consequently the
code you cited as a well behaving malloc(0) call site will bomb.
[1] Under 32bit Linux that is, with the usual split.
You can't just pull pointers out of your ear and expect stuff to work.
And here you don't. Which renders whole discussion rather pointless.
And here we're talking about making up pointers in a portable
application program. Which QEMU is.
Post by malc
Which renders whole discussion rather pointless.
It's only tangentially related to the question whether qemu_malloc()
should accept zero arguments anyway.
Post by malc
Post by Markus Armbruster
malloc() is free to return a pointer to allocated space that is set up
in a way that catches access beyond the allocated size. OpenBSD does
that for size zero; it allocates one byte then, from pages that are used
only for zero-sized allocations, and takes care to disable access to
these pages with mprotect(..., PROT_NONE)[*]. Since read(..., 0) does
not access beyond the allocated size, it still works just fine.
If you replace glibc's malloc() to get OpenBSD-like behavior, you can't
just make up some pointer to a memory area you believe to be unused, you
have to do it right, like OpenBSD does.
[*] Check out omalloc_make_chunks() at
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/stdlib/malloc.c?rev=1.121;content-type=text%2Fplain
Paolo Bonzini
2009-12-06 11:35:04 UTC
Permalink
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
But it has to make it a valid address anyway. If a zero-sized read
treats it as invalid (SIGSEGV, EFAULT, whatever), malloc has failed to
return a valid address and is not obeying its specification.

Paolo
malc
2009-12-06 12:02:20 UTC
Permalink
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
But it has to make it a valid address anyway. If a zero-sized read treats it
as invalid (SIGSEGV, EFAULT, whatever), malloc has failed to return a valid
address and is not obeying its specification.
Once again - standard doesn't speak about "valid addresses".
--
mailto:***@comtv.ru
Paolo Bonzini
2009-12-06 16:23:03 UTC
Permalink
Post by malc
Post by malc
Here, i believe, you are inventing artificial restrictions on how
malloc behaves, i don't see anything that prevents the implementor
from setting aside a range of addresses with 31st bit set as an
indicator of "zero" allocations, and then happily giving it to the
user of malloc and consumming it in free.
But it has to make it a valid address anyway. If a zero-sized read treats it
as invalid (SIGSEGV, EFAULT, whatever), malloc has failed to return a valid
address and is not obeying its specification.
Once again - standard doesn't speak about "valid addresses".
For that matter, POSIX doesn't mention EFAULT at all, and doesn't
include detecting "valid addresses" among the things that read can do
before returning 0. So if an OS extends POSIX with EFAULT, it had
better provide a malloc that is consistent with whatever definition of
"valid address" EFAULT uses. While if it doesn't provide EFAULT, read
should return 0 for the OS to be conforming to POSIX, and the whole
discussion is moot.

Paolo
Blue Swirl
2009-12-06 09:02:28 UTC
Permalink
Post by malc
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends.  Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate.  The
change broke such legitimate uses.  We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
[..snip..]
Post by Markus Armbruster
Post by malc
P.S. It would be interesting to know how this code behaves under OpenBSD, with
     p = malloc (0);
[1] As does, in essence, http://www.opengroup.org/onlinepubs/7990989775/xsh/read.html
Replace "p = (void *)-1" by "p = NULL" and it works just fine.
$ cat mall.c
#define _GNU_SOURCE
#include <err.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <stdio.h>

int main (void)
{
int fd = open ("/dev/zero", 0);
int ret;
#if 0
void *p = (void *) -1;
#else
void *p = malloc(0);
#endif

fprintf(stderr, "ptr %p\n", p);
if (fd == -1) err (1, "open");
ret = read (fd, p, 0);
if (ret != 0) err (1, "read");
return 0;
}
$ gcc mall.c
$ ./a.out
ptr 0x46974060
$

Changing read count to 1:
$ ./a.out
ptr 0x41ce0070
a.out: read: Bad address
malc
2009-12-06 10:02:50 UTC
Permalink
[..snip..]
Post by Blue Swirl
$ gcc mall.c
$ ./a.out
ptr 0x46974060
$
$ ./a.out
ptr 0x41ce0070
a.out: read: Bad address
Thanks.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-05 17:07:37 UTC
Permalink
Post by Anthony Liguori
I still believe that it is poor practice to pass size==0 to
*malloc(). I think actively discouraging this in qemu is a good thing
because it's a broken idiom.
Why? Unless we have a separate array allocator (like C++'s new and
new[]), we need to support zero-element arrays without pushing the
burden to callers (in the same way that for () supports zero iteration
loops without a separate if ()).
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Blue Swirl
2009-12-05 17:28:50 UTC
Permalink
I still believe that it is poor practice to pass size==0 to *malloc().  I
think actively discouraging this in qemu is a good thing because it's a
broken idiom.
Why?  Unless we have a separate array allocator (like C++'s new and new[]),
we need to support zero-element arrays without pushing the burden to callers
(in the same way that for () supports zero iteration loops without a
separate if ()).
Running a loop zero or nonzero number of times always has a very clear
and precise meaning. A pointer returned from allocating zero or
nonzero number of items may be completely unusable or usable,
respectively.

I think Laurent's proposal would work. We even could go so far as
rename the current function as qemu_malloc_possibly_broken (and adjust
callers mechanically) and introduce two new versions, which handle the
zero case in clearly advertised ways. Patches would fix the callers to
use the correct one.
Avi Kivity
2009-12-05 17:44:20 UTC
Permalink
Post by Blue Swirl
I still believe that it is poor practice to pass size==0 to *malloc(). I
think actively discouraging this in qemu is a good thing because it's a
broken idiom.
Why? Unless we have a separate array allocator (like C++'s new and new[]),
we need to support zero-element arrays without pushing the burden to callers
(in the same way that for () supports zero iteration loops without a
separate if ()).
Running a loop zero or nonzero number of times always has a very clear
and precise meaning. A pointer returned from allocating zero or
nonzero number of items may be completely unusable or usable,
respectively.
Only if you allocate using POSIX malloc(). If you allocate using a
function that is defined to return a valid pointer for zero length
allocations, you're happy.
Post by Blue Swirl
I think Laurent's proposal would work. We even could go so far as
rename the current function as qemu_malloc_possibly_broken (and adjust
callers mechanically) and introduce two new versions, which handle the
zero case in clearly advertised ways. Patches would fix the callers to
use the correct one
Good idea. Let's name the function that returns a valid pointer
qemu_malloc() (since that's what many callers expect anyway, and it's
fully backwards compatible), and see who calls
qemu_malloc_dont_call_me_with_zero().

Realistically, do we need two variants of malloc/realloc/free, or can we
stick with one that works for callers with a minimum of fuss?
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Laurent Desnogues
2009-12-05 18:16:02 UTC
Permalink
On Sat, Dec 5, 2009 at 6:44 PM, Avi Kivity <***@redhat.com> wrote:
[...]
Post by Blue Swirl
I think Laurent's proposal would work. We even could go so far as
rename the current function as qemu_malloc_possibly_broken (and adjust
callers mechanically) and introduce two new versions, which handle the
zero case in clearly advertised ways. Patches would fix the callers to
use the correct one
Good idea.  Let's name the function that returns a valid pointer
qemu_malloc() (since that's what many callers expect anyway, and it's fully
backwards compatible), and see who calls
qemu_malloc_dont_call_me_with_zero().
Yes, you're always free to follow poor coding rules by breaking strict
API.


Laurent
Avi Kivity
2009-12-05 23:11:54 UTC
Permalink
Post by Avi Kivity
Only if you allocate using POSIX malloc(). If you allocate using a
function that is defined to return a valid pointer for zero length
allocations, you're happy.
Wouldnt it be better to, rather than use a qemu_malloc() that is utterly
counterintuitive in that it has no way to report failure, and behaves in
ways people dont expect, to use normal malloc() and never pass it 0 ?
It's not that it doesn't have a way to report failure, it's that it
doesn't fail. Do you prefer functions that fail and report it to
functions that don't fail?
Seriously, who does that anyway? why call malloc when you dont want the
space? so you can use realloc? 99.99% of the time realloc() is the Wrong
Solution(tm).
Read the beginning of the thread. Basically it's for arrays, malloc(n *
sizeof(x)).
stick to what people know, and LART them for misuse of it if necessary.
The LART is a crash, great.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Ian Molton
2009-12-05 23:25:20 UTC
Permalink
Post by Avi Kivity
It's not that it doesn't have a way to report failure, it's that it
doesn't fail. Do you prefer functions that fail and report it to
functions that don't fail?
You have a way of allocating memory that will _never_ fail?
Post by Avi Kivity
Seriously, who does that anyway? why call malloc when you dont want the
space? so you can use realloc? 99.99% of the time realloc() is the Wrong
Solution(tm).
Read the beginning of the thread. Basically it's for arrays, malloc(n *
sizeof(x)).
well, make sure n is not 0. Its not that hard. I dont think I've *ever*
had a situation where I wanted to pass 0 to malloc.
Post by Avi Kivity
stick to what people know, and LART them for misuse of it if necessary.
The LART is a crash, great.
No, the LART would be a 'your patch does this wrong, try this:'

-Ian
Avi Kivity
2009-12-06 13:07:59 UTC
Permalink
Post by Ian Molton
Post by Avi Kivity
It's not that it doesn't have a way to report failure, it's that it
doesn't fail. Do you prefer functions that fail and report it to
functions that don't fail?
You have a way of allocating memory that will _never_ fail?
Sort of. Did you look at the code?
Post by Ian Molton
Post by Avi Kivity
Seriously, who does that anyway? why call malloc when you dont want the
space? so you can use realloc? 99.99% of the time realloc() is the Wrong
Solution(tm).
Read the beginning of the thread. Basically it's for arrays, malloc(n *
sizeof(x)).
well, make sure n is not 0. Its not that hard. I dont think I've *ever*
had a situation where I wanted to pass 0 to malloc.
There are multiple such cases in the code.
Post by Ian Molton
Post by Avi Kivity
stick to what people know, and LART them for misuse of it if necessary.
The LART is a crash, great.
No, the LART would be a 'your patch does this wrong, try this:'
What about existing usage? Will you audit all the existing calls?
--
error compiling committee.c: too many arguments to function
Ian Molton
2009-12-06 16:58:37 UTC
Permalink
Post by Avi Kivity
Post by Ian Molton
Post by Avi Kivity
It's not that it doesn't have a way to report failure, it's that it
doesn't fail. Do you prefer functions that fail and report it to
functions that don't fail?
You have a way of allocating memory that will _never_ fail?
Sort of.
'sort of' never ?
Post by Avi Kivity
Did you look at the code?
Yes. Its hardly infallible.
Post by Avi Kivity
Post by Ian Molton
well, make sure n is not 0. Its not that hard. I dont think I've *ever*
had a situation where I wanted to pass 0 to malloc.
There are multiple such cases in the code.
Post by Ian Molton
Post by Avi Kivity
stick to what people know, and LART them for misuse of it if necessary.
The LART is a crash, great.
No, the LART would be a 'your patch does this wrong, try this:'
What about existing usage? Will you audit all the existing calls?
mark qemu_malloc as deprecated. don't include new patches that use it.
Plenty of time to fix the broken uses...

-Ian
Avi Kivity
2009-12-06 17:07:38 UTC
Permalink
Post by Ian Molton
Post by Avi Kivity
Post by Ian Molton
Post by Avi Kivity
It's not that it doesn't have a way to report failure, it's that it
doesn't fail. Do you prefer functions that fail and report it to
functions that don't fail?
You have a way of allocating memory that will _never_ fail?
Sort of.
'sort of' never ?
Post by Avi Kivity
Did you look at the code?
Yes. Its hardly infallible.
It will never fail on Linux. On other hosts it prevents a broken oom
handler from taking the guest down a death spiral.
Post by Ian Molton
Post by Avi Kivity
What about existing usage? Will you audit all the existing calls?
mark qemu_malloc as deprecated. don't include new patches that use it.
Plenty of time to fix the broken uses...
Send patches. I don't think it's realistic to handle OOM in qemu
(handling n=0 is easy, but a lot of work for no real gain).
--
error compiling committee.c: too many arguments to function
malc
2009-12-06 17:47:33 UTC
Permalink
Post by Ian Molton
Post by Avi Kivity
Post by Ian Molton
Post by Avi Kivity
It's not that it doesn't have a way to report failure, it's that it
doesn't fail. Do you prefer functions that fail and report it to
functions that don't fail?
You have a way of allocating memory that will _never_ fail?
Sort of.
'sort of' never ?
Post by Avi Kivity
Did you look at the code?
Yes. Its hardly infallible.
It will never fail on Linux. On other hosts it prevents a broken oom handler
from taking the guest down a death spiral.
It fails here all the time i'm sorry to say, i have overcommit disabled
(mostly because kpdf when doing a text search tends to overwhelm the VM
subsystem and Linux happily picks X11 as it's OOM kill target)
Post by Ian Molton
Post by Avi Kivity
What about existing usage? Will you audit all the existing calls?
mark qemu_malloc as deprecated. don't include new patches that use it.
Plenty of time to fix the broken uses...
Send patches. I don't think it's realistic to handle OOM in qemu (handling
n=0 is easy, but a lot of work for no real gain).
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 17:59:39 UTC
Permalink
Post by malc
It will never fail on Linux. On other hosts it prevents a broken oom handler
from taking the guest down a death spiral.
It fails here all the time i'm sorry to say, i have overcommit disabled
(mostly because kpdf when doing a text search tends to overwhelm the VM
subsystem and Linux happily picks X11 as it's OOM kill target)
Right, I meant under the default configuration. Unfortunately there is
no good configuration for Linux - without strict overcommit you'll
invoke the oom killer, with strict overcommit you'll need ridiculous
amounts of swap because fork() and MAP_PRIVATE require tons of backing
store.

On my home machine I have

$ grep Commit /proc/meminfo
CommitLimit: 7235160 kB
Committed_AS: 4386172 kB

So, 4GB of virtual memory needed just to run a normal desktop with
thunderbird+firefox. Actual anonymous memory is less than 2GB, so you
could easily run this workload on a 2GB machine without swap, but with
strict overcommit 2GB RAM + 2GB swap will see failures even though you
have free RAM.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
malc
2009-12-06 18:09:21 UTC
Permalink
Post by malc
It will never fail on Linux. On other hosts it prevents a broken oom handler
from taking the guest down a death spiral.
It fails here all the time i'm sorry to say, i have overcommit disabled
(mostly because kpdf when doing a text search tends to overwhelm the VM
subsystem and Linux happily picks X11 as it's OOM kill target)
Right, I meant under the default configuration. Unfortunately there is no
good configuration for Linux - without strict overcommit you'll invoke the oom
killer, with strict overcommit you'll need ridiculous amounts of swap because
fork() and MAP_PRIVATE require tons of backing store.
On my home machine I have
$ grep Commit /proc/meminfo
CommitLimit: 7235160 kB
Committed_AS: 4386172 kB
So, 4GB of virtual memory needed just to run a normal desktop with
thunderbird+firefox. Actual anonymous memory is less than 2GB, so you could
easily run this workload on a 2GB machine without swap, but with strict
overcommit 2GB RAM + 2GB swap will see failures even though you have free RAM.
Well, i don't have a swap...

~$ cat /proc/meminfo
MemTotal: 515708 kB
MemFree: 3932 kB
Buffers: 10120 kB
Cached: 365320 kB
SwapCached: 0 kB
Active: 238120 kB
Inactive: 222396 kB
SwapTotal: 0 kB
SwapFree: 0 kB
...
CommitLimit: 464136 kB
Committed_AS: 178920 kB
...
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 18:16:02 UTC
Permalink
Post by malc
Well, i don't have a swap...
~$ cat /proc/meminfo
MemTotal: 515708 kB
MemFree: 3932 kB
Buffers: 10120 kB
Cached: 365320 kB
SwapCached: 0 kB
Active: 238120 kB
Inactive: 222396 kB
SwapTotal: 0 kB
SwapFree: 0 kB
...
CommitLimit: 464136 kB
Committed_AS: 178920 kB
...
Out of curiosity, what desktop and apps are you running? Here firefox
takes 1.3GB virt size and 377MB RSS, that alone could overwhelm your system.

[***@firebolt qemu-kvm (master)]$ cat /proc/2698/status
Name: firefox
State: S (sleeping)
Tgid: 2698
Pid: 2698
PPid: 2686
TracerPid: 0
Uid: 500 500 500 500
Gid: 500 500 500 500
Utrace: 0
FDSize: 256
Groups: 500 502 512
VmPeak: 1478804 kB
VmSize: 1288100 kB
VmLck: 0 kB
VmHWM: 434788 kB
VmRSS: 386296 kB
VmData: 676384 kB
VmStk: 248 kB
VmExe: 88 kB
VmLib: 62504 kB
VmPTE: 2312 kB
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
malc
2009-12-06 18:21:59 UTC
Permalink
Post by malc
Well, i don't have a swap...
~$ cat /proc/meminfo
MemTotal: 515708 kB
MemFree: 3932 kB
Buffers: 10120 kB
Cached: 365320 kB
SwapCached: 0 kB
Active: 238120 kB
Inactive: 222396 kB
SwapTotal: 0 kB
SwapFree: 0 kB
...
CommitLimit: 464136 kB
Committed_AS: 178920 kB
...
Out of curiosity, what desktop and apps are you running? Here firefox takes
1.3GB virt size and 377MB RSS, that alone could overwhelm your system.
Most of the time my setup is:

IceWM - 4 virtual desktops
XEmacs
Seamonkey 1.1.16
IceDock
A bunch of rxvts
Name: firefox
State: S (sleeping)
Tgid: 2698
Pid: 2698
PPid: 2686
TracerPid: 0
Uid: 500 500 500 500
Gid: 500 500 500 500
Utrace: 0
FDSize: 256
Groups: 500 502 512
VmPeak: 1478804 kB
VmSize: 1288100 kB
VmLck: 0 kB
VmHWM: 434788 kB
VmRSS: 386296 kB
VmData: 676384 kB
VmStk: 248 kB
VmExe: 88 kB
VmLib: 62504 kB
VmPTE: 2312 kB
~$ cat /proc/$(pidof seamonkey-bin)/status
Name: seamonkey-bin
State: S (sleeping)
Tgid: 2332
Pid: 2332
PPid: 1
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 100 100 100 100
FDSize: 256
Groups: 10 11 17 18 19 100
VmPeak: 151848 kB
VmSize: 150976 kB
VmLck: 0 kB
VmHWM: 57032 kB
VmRSS: 56260 kB
VmData: 105740 kB
VmStk: 84 kB
VmExe: 296 kB
VmLib: 34472 kB
VmPTE: 136 kB
Threads: 5
[...]
--
mailto:***@comtv.ru
Ian Molton
2009-12-06 22:40:47 UTC
Permalink
Post by malc
Well, i don't have a swap...
Indeed, nor do I (machine has an NFS root. swap on NFS is... not good.)
Jamie Lokier
2009-12-06 18:31:57 UTC
Permalink
Post by Ian Molton
Post by Avi Kivity
Read the beginning of the thread. Basically it's for arrays, malloc(n *
sizeof(x)).
well, make sure n is not 0. Its not that hard. I dont think I've *ever*
had a situation where I wanted to pass 0 to malloc.
I would like to remind everyone that sizeof(x) can be 0 too. For
example, on Linux sizeof(spinlock_t) == 0 on UP. Anything where you
have a bunch of structure fields which depend on compile time
configuration, or where a type might be replaced by a stub empty
structure, is a possible sizeof(x) == 0.
Post by Ian Molton
Its not that hard.
The fact is there are a number of bugs in qemu where n == 0 is not
checked prior to calling qemu_malloc() at the moment. None of them
are "hard" to fix - they are rare cases that nobody noticed when
writing them.

Until we have code analysis tools checking for that, bugs of that kind
will probably keep arising.

-- Jamie
Markus Armbruster
2009-12-07 09:56:54 UTC
Permalink
Post by Jamie Lokier
Post by Ian Molton
Post by Avi Kivity
Read the beginning of the thread. Basically it's for arrays, malloc(n *
sizeof(x)).
well, make sure n is not 0. Its not that hard. I dont think I've *ever*
had a situation where I wanted to pass 0 to malloc.
I would like to remind everyone that sizeof(x) can be 0 too. For
example, on Linux sizeof(spinlock_t) == 0 on UP. Anything where you
have a bunch of structure fields which depend on compile time
configuration, or where a type might be replaced by a stub empty
structure, is a possible sizeof(x) == 0.
Therefore, outlawing qemu_malloc(0) impacts abstract data types: given
some abstract type T, about which you're not supposed to assume
anything, you can't do qemu_malloc(sizeof(T)).

If anybody here is into outlawing scary zeroes, types with size zero
looks like another one to outlaw.
Post by Jamie Lokier
Post by Ian Molton
Its not that hard.
The fact is there are a number of bugs in qemu where n == 0 is not
checked prior to calling qemu_malloc() at the moment. None of them
are "hard" to fix - they are rare cases that nobody noticed when
writing them.
Hard or not so hard, what would "fixing" them buy us? I keep asking
this question, no takers so far.
Post by Jamie Lokier
Until we have code analysis tools checking for that, bugs of that kind
will probably keep arising.
Correct.
Ian Molton
2009-12-05 23:08:19 UTC
Permalink
Post by Avi Kivity
Only if you allocate using POSIX malloc(). If you allocate using a
function that is defined to return a valid pointer for zero length
allocations, you're happy.
Wouldnt it be better to, rather than use a qemu_malloc() that is utterly
counterintuitive in that it has no way to report failure, and behaves in
ways people dont expect, to use normal malloc() and never pass it 0 ?

Seriously, who does that anyway? why call malloc when you dont want the
space? so you can use realloc? 99.99% of the time realloc() is the Wrong
Solution(tm).

stick to what people know, and LART them for misuse of it if necessary.

(Just my 2p)

-Ian
Anthony Liguori
2009-12-05 17:27:45 UTC
Permalink
Post by Avi Kivity
Post by Anthony Liguori
I still believe that it is poor practice to pass size==0 to
*malloc(). I think actively discouraging this in qemu is a good
thing because it's a broken idiom.
Why? Unless we have a separate array allocator (like C++'s new and
new[]), we need to support zero-element arrays without pushing the
burden to callers (in the same way that for () supports zero iteration
loops without a separate if ()).
If you call malloc(size=0), then it's impossible to differentiate
between a failed malloc return and a valid result on certain platform
(like AIX).

So the only correct way would be to write:

array = malloc(size);
if (array == NULL && size != 0) {
return -ENOMEM;
}

If you were writing portable code. But that's not what people write.
You can argue that qemu_malloc() can have any semantics we want and
while that's true, it doesn't change my above statement. I think the
main argument for this behavior in qemu is that people are used to using
this idiom with malloc() but it's a non-portable practice.

If qemu_malloc() didn't carry the name "malloc()" then semantics with
size=0 would be a different discussion. But so far, all qemu_*
functions tend to behave almost exactly like their C counterparts.
Relying on the result of size=0 with malloc() is broken.

Regards,

Anthony Liguori
Avi Kivity
2009-12-05 17:40:09 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
Post by Anthony Liguori
I still believe that it is poor practice to pass size==0 to
*malloc(). I think actively discouraging this in qemu is a good
thing because it's a broken idiom.
Why? Unless we have a separate array allocator (like C++'s new and
new[]), we need to support zero-element arrays without pushing the
burden to callers (in the same way that for () supports zero
iteration loops without a separate if ()).
If you call malloc(size=0), then it's impossible to differentiate
between a failed malloc return and a valid result on certain platform
(like AIX).
array = malloc(size);
if (array == NULL && size != 0) {
return -ENOMEM;
}
If you were writing portable code. But that's not what people write.
You can argue that qemu_malloc() can have any semantics we want and
while that's true, it doesn't change my above statement. I think the
main argument for this behavior in qemu is that people are used to
using this idiom with malloc() but it's a non-portable practice.
If qemu_malloc() didn't carry the name "malloc()" then semantics with
size=0 would be a different discussion. But so far, all qemu_*
functions tend to behave almost exactly like their C counterparts.
Relying on the result of size=0 with malloc() is broken.
A zero-supporting qemu_malloc() is fully compatible with malloc(), we're
only restricting the possible returns. So we're not misleading any
caller. In fact, taking your argument to the extreme, a malloc
implementation would need to

if (size == 0) {
if (rand() % 2) {
return NULL;
} else {
size = 1;
}
}

Note that qemu_malloc() already restricts return values by not returning
NULL on oom, according to your logic we shouldn't do this (and have two
tests accompany any variable-size qemu_malloc()).
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Anthony Liguori
2009-12-05 17:54:01 UTC
Permalink
Post by Avi Kivity
A zero-supporting qemu_malloc() is fully compatible with malloc(),
we're only restricting the possible returns. So we're not misleading
any caller. In fact, taking your argument to the extreme, a malloc
implementation would need to
This is really the crux of the whole argument. You're arguing that
while most people rely on incorrect idioms with malloc(), the problem is
not the idioms themselves but the definition of malloc(). The opposing
argument is that instead of providing a "fixed" version of malloc(), we
should encourage people to use a proper idiom.

I dislike the entire notion of qemu_malloc(). I've always disliked the
fact that it abort()s on OOM. I'd rather see us use a normal malloc()
and code to that malloc currently which means avoiding size=0 and
checking NULL results.

However, this is all personal preference and I'd rather focus my energy
on things that have true functional impact. Markus raised a valid
functional problem with the current implementation and I proposed a
solution that would address that functional problem. I'd rather see the
discussion focus on the merits of that solution than revisiting whether
ANSI got the semantics of malloc() correct in the standards definition.

Regards,

Anthony Liguori
Avi Kivity
2009-12-05 18:06:08 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
A zero-supporting qemu_malloc() is fully compatible with malloc(),
we're only restricting the possible returns. So we're not misleading
any caller. In fact, taking your argument to the extreme, a malloc
implementation would need to
This is really the crux of the whole argument. You're arguing that
while most people rely on incorrect idioms with malloc(), the problem
is not the idioms themselves but the definition of malloc(). The
opposing argument is that instead of providing a "fixed" version of
malloc(), we should encourage people to use a proper idiom.
When we see a lengthy and error prone idiom we usually provide a
wrapper. That wrapper is qemu_malloc(). If you like, don't see it as a
fixed malloc(), but as qemu's way of allocating memory which is totally
independent from malloc().
Post by Anthony Liguori
I dislike the entire notion of qemu_malloc(). I've always disliked
the fact that it abort()s on OOM. I'd rather see us use a normal
malloc() and code to that malloc currently which means avoiding size=0
and checking NULL results.
I believe that's impossible. Many times, when the guest writes to a
register, you have no way to indicate failure. Sometimes you can
indicate failure (say, time out on IDE write requests) but will likely
cause much damage to the guest. Preallocating memory is likely to be
very difficult and also very wasteful (since you'll have to account for
the worst case).

Furthermore, Linux under its default configuration will never fail a
small allocation, instead oom-killing a semi-random process. So all
that work will not help on that platform.
Post by Anthony Liguori
However, this is all personal preference and I'd rather focus my
energy on things that have true functional impact. Markus raised a
valid functional problem with the current implementation and I
proposed a solution that would address that functional problem. I'd
rather see the discussion focus on the merits of that solution than
revisiting whether ANSI got the semantics of malloc() correct in the
standards definition.
Unless ANSI has a say on qemu_malloc(), I think it's worthwhile to get
that right rather than wrapping every array caller with useless tests.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Anthony Liguori
2009-12-05 20:58:19 UTC
Permalink
Post by Avi Kivity
When we see a lengthy and error prone idiom we usually provide a
wrapper. That wrapper is qemu_malloc(). If you like, don't see it as
a fixed malloc(), but as qemu's way of allocating memory which is
totally independent from malloc().
We constantly get patches with qemu_malloc() with a NULL check. Then we
tell people to remove the NULL check. It feels very weird to ask people
to remove error handling.

I can understand the argument that getting OOM right is very difficult
but it's not impossible.
Post by Avi Kivity
Post by Anthony Liguori
However, this is all personal preference and I'd rather focus my
energy on things that have true functional impact. Markus raised a
valid functional problem with the current implementation and I
proposed a solution that would address that functional problem. I'd
rather see the discussion focus on the merits of that solution than
revisiting whether ANSI got the semantics of malloc() correct in the
standards definition.
Unless ANSI has a say on qemu_malloc(), I think it's worthwhile to get
that right rather than wrapping every array caller with useless tests.
If you're concerned about array allocation, introduce an array
allocation function. Honestly, there's very little reason to open code
array allocation/manipulation at all. We should either be using a list
type or if we really need to, we should introduce a vector type.

Regards,

Anthony Liguori
Avi Kivity
2009-12-05 22:26:28 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
When we see a lengthy and error prone idiom we usually provide a
wrapper. That wrapper is qemu_malloc(). If you like, don't see it
as a fixed malloc(), but as qemu's way of allocating memory which is
totally independent from malloc().
We constantly get patches with qemu_malloc() with a NULL check. Then
we tell people to remove the NULL check. It feels very weird to ask
people to remove error handling.
You prefer to explain to them how to do error handling correctly?
Post by Anthony Liguori
I can understand the argument that getting OOM right is very difficult
but it's not impossible.
There are 755 calls to malloc in the code. And practically every
syscall can return ENOMEM, including the innocuous KVM_RUN ioctl().
It's going to be pretty close to impossible to recover from malloc()
failure, and impossible to recover from KVM_RUN failure (except by
retrying, which you can assume the kernel already has). All for
something which never happens. I propose that fixing OOM handling is
going to introduce some errors into the non-error paths, and many errors
into the error return paths, for zero benefit.
Post by Anthony Liguori
Post by Avi Kivity
Post by Anthony Liguori
However, this is all personal preference and I'd rather focus my
energy on things that have true functional impact. Markus raised a
valid functional problem with the current implementation and I
proposed a solution that would address that functional problem. I'd
rather see the discussion focus on the merits of that solution than
revisiting whether ANSI got the semantics of malloc() correct in the
standards definition.
Unless ANSI has a say on qemu_malloc(), I think it's worthwhile to
get that right rather than wrapping every array caller with useless
tests.
If you're concerned about array allocation, introduce an array
allocation function. Honestly, there's very little reason to open
code array allocation/manipulation at all. We should either be using
a list type or if we really need to, we should introduce a vector type.
A NEW(type) and ARRAY_NEW(type, count) marcros would improve type safety
and plug a dormant buffer overflow due to multiplication overflow, yes.
Even qemu_calloc() would be an improvement. But having qemu_malloc()
not fix the zero length array case which we know we have is
irresponsible, IMO.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Markus Armbruster
2009-12-06 08:24:07 UTC
Permalink
Post by Avi Kivity
A NEW(type) and ARRAY_NEW(type, count) marcros would improve type
safety and plug a dormant buffer overflow due to multiplication
overflow, yes. Even qemu_calloc() would be an improvement. But
having qemu_malloc() not fix the zero length array case which we know
we have is irresponsible, IMO.
Agree on all counts.
Jamie Lokier
2009-12-06 18:36:29 UTC
Permalink
Post by Avi Kivity
A NEW(type) and ARRAY_NEW(type, count) marcros would improve type safety
and plug a dormant buffer overflow due to multiplication overflow, yes.
Even qemu_calloc() would be an improvement.
In my code I regularly use type_alloc(type) and type_free(type, ptr),
giving type safety at both ends (and possibility to optimise
allocations, but that's separate).

If you have ARRAY_NEW(type, count) which permits count to be zero and
returns a non-NULL result, I wonder, why is it ok to convert zero
count to a guaranteed non-NULL unique result, but not do that for
sizeof(type) (or just size)?

-- Jamie
Markus Armbruster
2009-12-06 08:12:57 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
Post by Anthony Liguori
I still believe that it is poor practice to pass size==0 to
*malloc(). I think actively discouraging this in qemu is a good
thing because it's a broken idiom.
Why? Unless we have a separate array allocator (like C++'s new and
new[]), we need to support zero-element arrays without pushing the
burden to callers (in the same way that for () supports zero
iteration loops without a separate if ()).
If you call malloc(size=0), then it's impossible to differentiate
between a failed malloc return and a valid result on certain platform
(like AIX).
array = malloc(size);
if (array == NULL && size != 0) {
return -ENOMEM;
}
If you were writing portable code. But that's not what people write.
Yup. But

if (n == 0)
p = malloc(n * sizeof(struct foo))
else
p = NULL;

isn't what people write either.
Post by Anthony Liguori
You can argue that qemu_malloc() can have any semantics we want and
while that's true, it doesn't change my above statement. I think the
main argument for this behavior in qemu is that people are used to
using this idiom with malloc() but it's a non-portable practice.
We've decided long ago to disable the trap you quoted by not returning
NULL for empty allocations. That way it doesn't kill us when people
fail to call qemu_malloc() perfectly every time.
Post by Anthony Liguori
If qemu_malloc() didn't carry the name "malloc()" then semantics with
size=0 would be a different discussion. But so far, all qemu_*
functions tend to behave almost exactly like their C counterparts.
That's a reason for making qemu_malloc() work for zero arguments,
because its C counterpart does.
Post by Anthony Liguori
Relying on the result of size=0 with malloc() is broken.
No. Relying on non-null-ness is broken. You can write perfectly
portable, working code without doing that.

p = malloc(n * sizeof(struct foo);
if (n && !p)
exit_no_mem();
for (i = 0; i < n; i++)
compute_one(p, i);

With qemu_malloc(), the error handling moves into qemu_malloc():

p = qemu_malloc(n * sizeof(struct foo);
for (i = 0; i < n; i++)
compute_one(p, i);
Ian Molton
2009-12-06 16:52:35 UTC
Permalink
Post by Markus Armbruster
p = malloc(n * sizeof(struct foo);
if (n && !p)
exit_no_mem();
for (i = 0; i < n; i++)
compute_one(p, i);
p = qemu_malloc(n * sizeof(struct foo);
for (i = 0; i < n; i++)
compute_one(p, i);
And you lose the ability to fail gracefully...
Avi Kivity
2009-12-06 17:14:57 UTC
Permalink
Post by Ian Molton
Post by Markus Armbruster
p = malloc(n * sizeof(struct foo);
if (n&& !p)
exit_no_mem();
for (i = 0; i< n; i++)
compute_one(p, i);
p = qemu_malloc(n * sizeof(struct foo);
for (i = 0; i< n; i++)
compute_one(p, i);
And you lose the ability to fail gracefully...
We never had it. Suppose p is allocated in response to an SCSI register
write, and we allocate a scatter-gather list. What do you do if you OOM?

1) Introduce an error path that works synchronously off the stack and so
does not need to allocate memory
2) Don't allocate in the first place, always read guest memory and
transform it for sglists, preallocate everything else
3) Have a per-thread emergency pool, stall the vcpu until all memory is
returned to the emergency pool

all of these options are pretty horrible, either to the code or to the
guest, for something that never happens.
--
error compiling committee.c: too many arguments to function
malc
2009-12-06 17:45:44 UTC
Permalink
Post by Avi Kivity
Post by Ian Molton
Post by Markus Armbruster
p = malloc(n * sizeof(struct foo);
if (n&& !p)
exit_no_mem();
for (i = 0; i< n; i++)
compute_one(p, i);
p = qemu_malloc(n * sizeof(struct foo);
for (i = 0; i< n; i++)
compute_one(p, i);
And you lose the ability to fail gracefully...
We never had it. Suppose p is allocated in response to an SCSI register
write, and we allocate a scatter-gather list. What do you do if you OOM?
Uh, please do not generalize.

See for instance 29ddf27b72960d6e6b115cd69812c9c57b2a7b13 the incident was
logged and debugged, no OOM, no abort. Not all scenarios admit this, but
claiming that there are none that do is incorrect.
Post by Avi Kivity
1) Introduce an error path that works synchronously off the stack and so does
not need to allocate memory
2) Don't allocate in the first place, always read guest memory and transform
it for sglists, preallocate everything else
3) Have a per-thread emergency pool, stall the vcpu until all memory is
returned to the emergency pool
all of these options are pretty horrible, either to the code or to the guest,
for something that never happens.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 18:02:08 UTC
Permalink
Post by malc
Post by Avi Kivity
Post by Ian Molton
And you lose the ability to fail gracefully...
We never had it. Suppose p is allocated in response to an SCSI register
write, and we allocate a scatter-gather list. What do you do if you OOM?
Uh, please do not generalize.
Sorry.
Post by malc
See for instance 29ddf27b72960d6e6b115cd69812c9c57b2a7b13 the incident was
logged and debugged, no OOM, no abort. Not all scenarios admit this, but
claiming that there are none that do is incorrect.
Init is pretty easy to handle. I'm worried about runtime where you
can't report an error to the guest. Real hardware doesn't oom.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
malc
2009-12-06 18:12:16 UTC
Permalink
Post by Avi Kivity
Post by malc
Post by Avi Kivity
Post by Ian Molton
And you lose the ability to fail gracefully...
We never had it. Suppose p is allocated in response to an SCSI register
write, and we allocate a scatter-gather list. What do you do if you OOM?
Uh, please do not generalize.
Sorry.
Post by malc
See for instance 29ddf27b72960d6e6b115cd69812c9c57b2a7b13 the incident was
logged and debugged, no OOM, no abort. Not all scenarios admit this, but
claiming that there are none that do is incorrect.
Init is pretty easy to handle. I'm worried about runtime where you can't
report an error to the guest. Real hardware doesn't oom.
Here, i do agree, but mostly because most of the users of allocation
functions just themselves returned NULL or -1 or whatever and never
really bothered to report anything, so the addition of OOM check that
you've added made the code less cluttered.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-06 18:19:37 UTC
Permalink
Post by malc
Post by Avi Kivity
Init is pretty easy to handle. I'm worried about runtime where you can't
report an error to the guest. Real hardware doesn't oom.
Here, i do agree, but mostly because most of the users of allocation
functions just themselves returned NULL or -1 or whatever and never
really bothered to report anything, so the addition of OOM check that
you've added made the code less cluttered.
My point is that it would take a major rework, and would probably
involve removing the allocations instead of handling any errors. For
example, a scsi device would tell the block device the upper bound of
aiocbs it could possibly issue, and the maximum number of sg elements in
a request, and qcow2 (or any other backing format driver) would
preallocate enough resources to satisfy the worst case. And we still
can't handle a syscall returning ENOMEM.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
malc
2009-12-06 18:41:20 UTC
Permalink
Post by malc
Post by Avi Kivity
Init is pretty easy to handle. I'm worried about runtime where you can't
report an error to the guest. Real hardware doesn't oom.
Here, i do agree, but mostly because most of the users of allocation
functions just themselves returned NULL or -1 or whatever and never
really bothered to report anything, so the addition of OOM check that
you've added made the code less cluttered.
My point is that it would take a major rework, and would probably involve
removing the allocations instead of handling any errors. For example, a scsi
device would tell the block device the upper bound of aiocbs it could possibly
issue, and the maximum number of sg elements in a request, and qcow2 (or any
other backing format driver) would preallocate enough resources to satisfy the
worst case. And we still can't handle a syscall returning ENOMEM.
Sure. My point is that sometimes failure to allocate is due to bugs,
invalid input etc, and conversion to OOM aborts en masse, might have
not been the best possible route to take, but most likely it was better
than doing nothing.
--
mailto:***@comtv.ru
Avi Kivity
2009-12-07 09:47:30 UTC
Permalink
Post by malc
Sure. My point is that sometimes failure to allocate is due to bugs,
invalid input etc, and conversion to OOM aborts en masse, might have
not been the best possible route to take, but most likely it was better
than doing nothing.
I agree. Early oom handling does limit opportunities for recovery in
the cases where it is possible/easy. We can have an alternative API
that doesn't do oom handling for those cases where it is desirable.
--
error compiling committee.c: too many arguments to function
Kevin Wolf
2009-12-07 10:20:13 UTC
Permalink
Post by Avi Kivity
Post by malc
Sure. My point is that sometimes failure to allocate is due to bugs,
invalid input etc, and conversion to OOM aborts en masse, might have
not been the best possible route to take, but most likely it was better
than doing nothing.
I agree. Early oom handling does limit opportunities for recovery in
the cases where it is possible/easy. We can have an alternative API
that doesn't do oom handling for those cases where it is desirable.
You could simply use normal malloc as an alternative API there. After
all, the only thing that qemu_malloc adds is the OOM check (and
currently also the zero check).

Kevin
Ian Molton
2009-12-06 22:38:58 UTC
Permalink
Post by Avi Kivity
Init is pretty easy to handle. I'm worried about runtime where you
can't report an error to the guest. Real hardware doesn't oom.
In the case of the socket reconnect code I posted recently, if the
allocation failed, it would give up trying to reconnect and inform the
user of that chardev that it had closed. Ok, this doesnt help the guest,
but it allows other code to clean up nicely, and we can report the
failure to the host. IMHO thats better than leaving a sysadmin
scratching their head wondering why it suddenly just stopped feeding the
guest entropy and isnt trying to reconnect anymore...

(IMO)

-Ian
Jamie Lokier
2009-12-07 02:51:56 UTC
Permalink
Post by Ian Molton
Post by Avi Kivity
Init is pretty easy to handle. I'm worried about runtime where you
can't report an error to the guest. Real hardware doesn't oom.
In the case of the socket reconnect code I posted recently, if the
allocation failed, it would give up trying to reconnect and inform the
user of that chardev that it had closed. Ok, this doesnt help the guest,
but it allows other code to clean up nicely, and we can report the
failure to the host. IMHO thats better than leaving a sysadmin
scratching their head wondering why it suddenly just stopped feeding the
guest entropy and isnt trying to reconnect anymore...
If the system as a whole runs out of memory so that no-overcommit
malloc() fails on a small alloc, there's a good chance that you won't
be able to send a message to the host (how do you format the QMP
message without malloc?), and if you do manage that, there's a good
chance the host won't be able to receive it (it can't malloc either),
and if it does manage to receive the message, you can be almost
certain that it won't be able to run any GUI operations, send mail,
etc. to inform the admin.

The chances of the path "qemu small alloc -> chardev error -> send QMP
message -> receive QMP message -> parse QMG message -> do something
useful (log/email/UI)" having fully preallocated buffers for every
step, including a preallocated emergency pool for the buffers used by
QMG formatting and parsing, so that it gets all the way past the last
step are very slim indeed.

There's no point writing the code for the first steps, if it's
intractable to make the later steps do something useful.

Btw, as an admin I would really rather the socket reconnection code
keeps trying in that circumstance, if qemu does not simply fall over
due to alloc failing for something else soon after. The most likely
scenario, imho in a server like that, is to notice it is running out
of memory and kill the real cause (e.g. another runaway process), then
restart all daemons which have died. I'm not going to notice a
non-fatal message (in the unlikely event it is propagated all the way
up) because there are plenty of other non-fatal messages in normal
use, multiplied by hundreds of guests (across a cluster). Or, if you
mean the chardev closing causes qemu to terminate - what's the
difference from the current qemu_malloc() behaviour?

I'd rather it behaves like a broken HWRNG if it can't get host
entropy: Don't provide data, and let the guest decide what to do, just
like it does for a broken HWRNG. Except virtio-rng can report
unavailability rather than simply being broken :-)

-- Jamie
Ian Molton
2009-12-07 09:39:27 UTC
Permalink
Post by Jamie Lokier
If the system as a whole runs out of memory so that no-overcommit
malloc() fails on a small alloc, there's a good chance that you won't
be able to send a message to the host
Send what message to the host? If the malloc in the socet reconnect code
fails, its the code on the host thats going to flag up that malloc failed.
Post by Jamie Lokier
and if it does manage to receive the message, you can be almost
certain that it won't be able to run any GUI operations, send mail,
etc. to inform the admin.
OTOH, If all it does it log it to a file, theres a fair chance it might
succeed.
Post by Jamie Lokier
There's no point writing the code for the first steps, if it's
intractable to make the later steps do something useful.
OTOH, a simple printed warning, and closing the socket are fairly likely
to work.
Post by Jamie Lokier
Btw, as an admin I would really rather the socket reconnection code
keeps trying in that circumstance, if qemu does not simply fall over
due to alloc failing for something else soon after.
Surely better to keep running by dropping nonessential services so that
the guest might get a chance to shut down or the host might recover.
Post by Jamie Lokier
I'd rather it behaves like a broken HWRNG if it can't get host
entropy: Don't provide data, and let the guest decide what to do, just
like it does for a broken HWRNG.
It does.
Post by Jamie Lokier
Except virtio-rng can report unavailability rather than simply being broken :-)
It could, in theory.

-Ian
Paolo Bonzini
2009-12-07 09:55:43 UTC
Permalink
Post by Ian Molton
Send what message to the host? If the malloc in the socet reconnect code
fails, its the code on the host thats going to flag up that malloc failed.
He means to management code.
Post by Ian Molton
Post by Jamie Lokier
and if it does manage to receive the message, you can be almost
certain that it won't be able to run any GUI operations, send mail,
etc. to inform the admin.
OTOH, If all it does it log it to a file, theres a fair chance it might
succeed.
Not necessarily, for example fprintf may fail. There _may_ be a fair
chance to succeed if you use write(2) directly, but that's it basically,
and ENOMEM is always there waiting for you...

Paolo
Avi Kivity
2009-12-07 13:28:46 UTC
Permalink
Post by Paolo Bonzini
Post by Ian Molton
OTOH, If all it does it log it to a file, theres a fair chance it might
succeed.
Not necessarily, for example fprintf may fail. There _may_ be a fair
chance to succeed if you use write(2) directly, but that's it
basically, and ENOMEM is always there waiting for you...
Right, if malloc() failed write(2) is likely to fail as well. More
likely, in fact, since malloc() may have unused process memory to draw
upon whereas write(2) can only allocate kernel memory.
--
error compiling committee.c: too many arguments to function
Markus Armbruster
2009-12-07 09:45:39 UTC
Permalink
Post by Ian Molton
Post by Markus Armbruster
p = malloc(n * sizeof(struct foo);
if (n && !p)
exit_no_mem();
for (i = 0; i < n; i++)
compute_one(p, i);
p = qemu_malloc(n * sizeof(struct foo);
for (i = 0; i < n; i++)
compute_one(p, i);
And you lose the ability to fail gracefully...
That's a deliberate choice. It has its drawbacks, it has its
advantages. And it's not related to the question at hand: permitting
zero arguments.
Kevin Wolf
2009-12-07 08:48:44 UTC
Permalink
Post by Anthony Liguori
If qemu_malloc() didn't carry the name "malloc()" then semantics with
size=0 would be a different discussion. But so far, all qemu_*
functions tend to behave almost exactly like their C counterparts.
Relying on the result of size=0 with malloc() is broken.
Then let's rename it as qemu_getmem, do a sed run over the whole code
and have sane semantics. :-)

Kevin
malc
2009-12-07 11:30:41 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
If a change breaks two uses, it probably breaks more. As a quick check,
I reviewed the first six of more than 200 uses of qemu_mallocz(),
qemu_malloc() and qemu_realloc() that don't have an argument of the form
Bottom line: your point is that there are benign uses of zero allocation.
There are, at the expense of memory consumption/fragmentation and useless
code which, as your investigation shows, involve syscalls and what not.

Outlook from my side of the fence: no one audited the code, no one
knows that all of the uses are benign, abort gives the best automatic
way to know for sure one way or the other.

Rationale for keeping it as is: zero-sized allocations - overwhelming
majority of the times, are not anticipated and not well understood,
aborting gives us the ability to eliminate them in an automatic
fashion.
Post by Markus Armbruster
size = ehdr.e_phnum * sizeof(phdr[0]);
lseek(fd, ehdr.e_phoff, SEEK_SET);
phdr = qemu_mallocz(size);
This aborts when the ELF file has no program header table, because
then e_phnum is zero (TIS ELF 1.2 page 1-6). Even if we require the
ELF file to have a program header table, aborting is not an acceptable
way to reject an unsuitable or malformed ELF file.
If there's a malformed ELF files that must be supported (and by supported
- user notification is meant) then things should be checked, because having
gargantuan size will force qemu_mallocz to abort via OOM check (even
with Linux and overcommit, since malloc will fallback to mmap which
will fail) and this argument falls apart.
Post by Markus Armbruster
mem_size = ph->p_memsz;
/* XXX: avoid allocating */
data = qemu_mallocz(mem_size);
This aborts when the segment occupies zero bytes in the image file
(TIS ELF 1.2 page 2-2).
bs->opaque = qemu_mallocz(drv->instance_size);
However, vvfat_write_target.instance_size is zero. Not sure whether
this actually bites or is "only" a time bomb.
acb = qemu_mallocz(pool->aiocb_size);
No existing instance of BlockDriverAIOCB has a zero aiocb_size.
Probably okay.
* defaultallocator_create_displaysurface() in console.c has
surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
With Xen, surface->linesize and surface->height come out of
xenfb_configure_fb(), which set xenfb->width and xenfb->height to
values obtained from the guest. If a buggy guest sets one to zero, we
abort. Not an good way to handle that.
What is? Let's suppose height is zero, without explicit checks, user
will never know why his screen doesn't update, with abort he will at
least know that something is very wrong.
Post by Markus Armbruster
Non-Xen uses aren't obviously correct either, but I didn't dig deeper.
*sizep = 0;
dt_size = get_image_size(filename_path);
if (dt_size < 0) {
printf("Unable to get size of device tree file '%s'\n",
filename_path);
goto fail;
}
/* Expand to 2x size to give enough room for manipulation. */
dt_size *= 2;
/* First allocate space in qemu for device tree */
fdt = qemu_mallocz(dt_size);
We abort if the image is empty. Not an acceptable way to handle that.
Based on this sample, I'm confident there are many more uses broken by
the change.
Based on this sample i'm confident, we can eventually fix them should those
become the problem.
Post by Markus Armbruster
---
block/qcow2-snapshot.c | 5 -----
qemu-malloc.c | 14 ++------------
2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 94cb838..e3b208c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
QCowSnapshot *sn;
int i;
- if (!s->nb_snapshots) {
- *psn_tab = NULL;
- return s->nb_snapshots;
- }
-
sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@ void qemu_free(void *ptr)
void *qemu_malloc(size_t size)
{
- if (!size) {
- abort();
- }
- return oom_check(malloc(size));
+ return oom_check(malloc(size ? size : 1));
}
void *qemu_realloc(void *ptr, size_t size)
{
- if (size) {
- return oom_check(realloc(ptr, size));
- } else {
- if (ptr) {
- return realloc(ptr, size);
- }
- }
- abort();
+ return oom_check(realloc(ptr, size ? size : 1));
}
http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b
--
mailto:***@comtv.ru
Markus Armbruster
2009-12-07 14:45:32 UTC
Permalink
Post by malc
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
If a change breaks two uses, it probably breaks more. As a quick check,
I reviewed the first six of more than 200 uses of qemu_mallocz(),
qemu_malloc() and qemu_realloc() that don't have an argument of the form
Bottom line: your point is that there are benign uses of zero allocation.
There are, at the expense of memory consumption/fragmentation and useless
code which, as your investigation shows, involve syscalls and what not.
I doubt the performance impact is relevant, but since you insist on
discussing it...

First and foremost, any performance argument not backed by measurements
is speculative.

Potential zero allocation is common in code. Actual zero allocation is
rare at runtime. The amount of memory consumed is therefore utterly
trivial compared to total dynamic memory use. Likewise, time spent in
allocating is dwarved by time spent in other invocations of malloc()
several times over.

Adding a special case for avoiding zero allocation is not free. Unless
zero allocations are sufficiently frequent, that cost exceeds the cost
of the rare zero allocation.
Post by malc
Outlook from my side of the fence: no one audited the code, no one
knows that all of the uses are benign, abort gives the best automatic
way to know for sure one way or the other.
Rationale for keeping it as is: zero-sized allocations - overwhelming
majority of the times, are not anticipated and not well understood,
aborting gives us the ability to eliminate them in an automatic
fashion.
Keeping it as is releasing with known crash bugs, and a known pattern
for unknown crash bugs. Is that what you want us to do? I doubt it.

I grant you that there may be allocations we expect never to be empty,
and where things break when they are. Would you concede that there are
allocations that work just fine when empty?

If you do, we end up with three kinds of allocations: known empty bad,
known empty fine, unknown. Aborting on known empty bad is okay with me.
Aborting on unknown in developement builds is okay with me, too. I
don't expect it to be a successful way to find bugs, because empty
allocations are rare.

Initially, all allocations should be treated as "unknown". I want a way
to mark an allocation as "known empty fine". I figure you want a way to
mark "known empty bad".
Post by malc
Post by Markus Armbruster
size = ehdr.e_phnum * sizeof(phdr[0]);
lseek(fd, ehdr.e_phoff, SEEK_SET);
phdr = qemu_mallocz(size);
This aborts when the ELF file has no program header table, because
then e_phnum is zero (TIS ELF 1.2 page 1-6). Even if we require the
ELF file to have a program header table, aborting is not an acceptable
way to reject an unsuitable or malformed ELF file.
If there's a malformed ELF files that must be supported (and by supported
- user notification is meant) then things should be checked, because having
gargantuan size will force qemu_mallocz to abort via OOM check (even
with Linux and overcommit, since malloc will fallback to mmap which
will fail) and this argument falls apart.
ELF files with empty PHT are *not* malformed. Empty PHT is explicitly
permitted by ELF TIS. Likewise for empty segments. I already gave you
chapter & verse.
Post by malc
Post by Markus Armbruster
mem_size = ph->p_memsz;
/* XXX: avoid allocating */
data = qemu_mallocz(mem_size);
This aborts when the segment occupies zero bytes in the image file
(TIS ELF 1.2 page 2-2).
bs->opaque = qemu_mallocz(drv->instance_size);
However, vvfat_write_target.instance_size is zero. Not sure whether
this actually bites or is "only" a time bomb.
acb = qemu_mallocz(pool->aiocb_size);
No existing instance of BlockDriverAIOCB has a zero aiocb_size.
Probably okay.
* defaultallocator_create_displaysurface() in console.c has
surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
With Xen, surface->linesize and surface->height come out of
xenfb_configure_fb(), which set xenfb->width and xenfb->height to
values obtained from the guest. If a buggy guest sets one to zero, we
abort. Not an good way to handle that.
What is? Let's suppose height is zero, without explicit checks, user
will never know why his screen doesn't update, with abort he will at
least know that something is very wrong.
My point isn't that permitting malloc(0) is the best way to handle it.
It's a better way than aborting, though.

I'm not arguing against treating case 0 specially ever.
Post by malc
Post by Markus Armbruster
Non-Xen uses aren't obviously correct either, but I didn't dig deeper.
*sizep = 0;
dt_size = get_image_size(filename_path);
if (dt_size < 0) {
printf("Unable to get size of device tree file '%s'\n",
filename_path);
goto fail;
}
/* Expand to 2x size to give enough room for manipulation. */
dt_size *= 2;
/* First allocate space in qemu for device tree */
fdt = qemu_mallocz(dt_size);
We abort if the image is empty. Not an acceptable way to handle that.
Based on this sample, I'm confident there are many more uses broken by
the change.
Based on this sample i'm confident, we can eventually fix them should those
become the problem.
You broke working code. I'm glad you're confident we can fix it
"eventually". What about 0.12? Ship it broken?
Post by malc
Post by Markus Armbruster
---
block/qcow2-snapshot.c | 5 -----
qemu-malloc.c | 14 ++------------
2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 94cb838..e3b208c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
QCowSnapshot *sn;
int i;
- if (!s->nb_snapshots) {
- *psn_tab = NULL;
- return s->nb_snapshots;
- }
-
sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@ void qemu_free(void *ptr)
void *qemu_malloc(size_t size)
{
- if (!size) {
- abort();
- }
- return oom_check(malloc(size));
+ return oom_check(malloc(size ? size : 1));
}
void *qemu_realloc(void *ptr, size_t size)
{
- if (size) {
- return oom_check(realloc(ptr, size));
- } else {
- if (ptr) {
- return realloc(ptr, size);
- }
- }
- abort();
+ return oom_check(realloc(ptr, size ? size : 1));
}
http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b
--verbose ?
malc
2009-12-07 16:55:04 UTC
Permalink
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate. The
change broke such legitimate uses. We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.
If a change breaks two uses, it probably breaks more. As a quick check,
I reviewed the first six of more than 200 uses of qemu_mallocz(),
qemu_malloc() and qemu_realloc() that don't have an argument of the form
Bottom line: your point is that there are benign uses of zero allocation.
There are, at the expense of memory consumption/fragmentation and useless
code which, as your investigation shows, involve syscalls and what not.
I doubt the performance impact is relevant, but since you insist on
discussing it...
First and foremost, any performance argument not backed by measurements
is speculative.
Potential zero allocation is common in code. Actual zero allocation is
rare at runtime. The amount of memory consumed is therefore utterly
trivial compared to total dynamic memory use. Likewise, time spent in
allocating is dwarved by time spent in other invocations of malloc()
several times over.
Adding a special case for avoiding zero allocation is not free. Unless
zero allocations are sufficiently frequent, that cost exceeds the cost
of the rare zero allocation.
I bet you dollars to donuts that testing for zero before calling malloc
is cheaper than the eventual test that is done inside it.
Post by Markus Armbruster
Post by malc
Outlook from my side of the fence: no one audited the code, no one
knows that all of the uses are benign, abort gives the best automatic
way to know for sure one way or the other.
Rationale for keeping it as is: zero-sized allocations - overwhelming
majority of the times, are not anticipated and not well understood,
aborting gives us the ability to eliminate them in an automatic
fashion.
Keeping it as is releasing with known crash bugs, and a known pattern
for unknown crash bugs. Is that what you want us to do? I doubt it.
Yes, it's better than having _silent_ bugs, which, furthermore, have a
good potential of mainfesting themselves a lot further from the "bug"
site.
Post by Markus Armbruster
I grant you that there may be allocations we expect never to be empty,
and where things break when they are. Would you concede that there are
allocations that work just fine when empty?
I wont dispute that.
Post by Markus Armbruster
If you do, we end up with three kinds of allocations: known empty bad,
known empty fine, unknown. Aborting on known empty bad is okay with me.
Aborting on unknown in developement builds is okay with me, too. I
don't expect it to be a successful way to find bugs, because empty
allocations are rare.
Initially, all allocations should be treated as "unknown". I want a way
to mark an allocation as "known empty fine". I figure you want a way to
mark "known empty bad".
Post by malc
Post by Markus Armbruster
size = ehdr.e_phnum * sizeof(phdr[0]);
lseek(fd, ehdr.e_phoff, SEEK_SET);
phdr = qemu_mallocz(size);
This aborts when the ELF file has no program header table, because
then e_phnum is zero (TIS ELF 1.2 page 1-6). Even if we require the
ELF file to have a program header table, aborting is not an acceptable
way to reject an unsuitable or malformed ELF file.
If there's a malformed ELF files that must be supported (and by supported
- user notification is meant) then things should be checked, because having
gargantuan size will force qemu_mallocz to abort via OOM check (even
with Linux and overcommit, since malloc will fallback to mmap which
will fail) and this argument falls apart.
ELF files with empty PHT are *not* malformed. Empty PHT is explicitly
permitted by ELF TIS. Likewise for empty segments. I already gave you
chapter & verse.
Uh, it's you who brought the whole malformed issue.
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
mem_size = ph->p_memsz;
/* XXX: avoid allocating */
data = qemu_mallocz(mem_size);
This aborts when the segment occupies zero bytes in the image file
(TIS ELF 1.2 page 2-2).
bs->opaque = qemu_mallocz(drv->instance_size);
However, vvfat_write_target.instance_size is zero. Not sure whether
this actually bites or is "only" a time bomb.
acb = qemu_mallocz(pool->aiocb_size);
No existing instance of BlockDriverAIOCB has a zero aiocb_size.
Probably okay.
* defaultallocator_create_displaysurface() in console.c has
surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
With Xen, surface->linesize and surface->height come out of
xenfb_configure_fb(), which set xenfb->width and xenfb->height to
values obtained from the guest. If a buggy guest sets one to zero, we
abort. Not an good way to handle that.
What is? Let's suppose height is zero, without explicit checks, user
will never know why his screen doesn't update, with abort he will at
least know that something is very wrong.
My point isn't that permitting malloc(0) is the best way to handle it.
It's a better way than aborting, though.
And this is precisely the gist of our disagreement.
Post by Markus Armbruster
I'm not arguing against treating case 0 specially ever.
Post by malc
Post by Markus Armbruster
Non-Xen uses aren't obviously correct either, but I didn't dig deeper.
*sizep = 0;
dt_size = get_image_size(filename_path);
if (dt_size < 0) {
printf("Unable to get size of device tree file '%s'\n",
filename_path);
goto fail;
}
/* Expand to 2x size to give enough room for manipulation. */
dt_size *= 2;
/* First allocate space in qemu for device tree */
fdt = qemu_mallocz(dt_size);
We abort if the image is empty. Not an acceptable way to handle that.
Based on this sample, I'm confident there are many more uses broken by
the change.
Based on this sample i'm confident, we can eventually fix them should those
become the problem.
You broke working code. I'm glad you're confident we can fix it
"eventually". What about 0.12? Ship it broken?
I'm fixing those as they arrive.
Post by Markus Armbruster
Post by malc
Post by Markus Armbruster
---
block/qcow2-snapshot.c | 5 -----
qemu-malloc.c | 14 ++------------
2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 94cb838..e3b208c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
QCowSnapshot *sn;
int i;
- if (!s->nb_snapshots) {
- *psn_tab = NULL;
- return s->nb_snapshots;
- }
-
sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@ void qemu_free(void *ptr)
void *qemu_malloc(size_t size)
{
- if (!size) {
- abort();
- }
- return oom_check(malloc(size));
+ return oom_check(malloc(size ? size : 1));
}
void *qemu_realloc(void *ptr, size_t size)
{
- if (size) {
- return oom_check(realloc(ptr, size));
- } else {
- if (ptr) {
- return realloc(ptr, size);
- }
- }
- abort();
+ return oom_check(realloc(ptr, size ? size : 1));
}
http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b
--verbose ?
Sigh...

C90 - realloc(non-null, 0) =
free(non-null); return NULL;

C99 - realloc(non-null, 0) =
free(non-null); return realloc(NULL, 0);

GLIBC 2.7 = C90

How anyone can use this interface and it's implementations portably
or "sanely" is beyond me.

Do read the discussion the linked above.
--
mailto:***@comtv.ru
Anthony Liguori
2009-12-07 15:50:58 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
While it's always fun to argue about standards interpretation, I wanted
to capture some action items from the discussion that I think there is
agreement about. Since I want to make changes for 0.12, I think it
would be best to try and settle these now so we can do this before -rc2.

For 0.12.0-rc2:

I will send out a patch tonight or tomorrow changing qemu_malloc() to
return malloc(1) when size=0 only for production builds (via
--enable-zero-mallocs). Development trees will maintain their current
behavior.

For 0.13:

Someone (Marcus?) will introduce four new allocation functions.

type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);

type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);

NB: The names are borrowed from glib. I'm not tied to them.

Will do our best to convert old code to use these functions where ever
possible. New code will be required to use these functions unless not
possible. n_types==0 is valid. sizeof(type)==0 is valid. Both
circumstances return a unique non-NULL pointer. If memory allocation
fails, execution will abort.

The existing qemu_malloc() will maintain it's current behavior (with the
exception of the relaxed size==0 assertion for production releases).

Does anyone object to this moving forward?

Regards,

Anthony Liguori
Avi Kivity
2009-12-07 16:00:22 UTC
Permalink
Post by Anthony Liguori
While it's always fun to argue about standards interpretation, I
wanted to capture some action items from the discussion that I think
there is agreement about. Since I want to make changes for 0.12, I
think it would be best to try and settle these now so we can do this
before -rc2.
I will send out a patch tonight or tomorrow changing qemu_malloc() to
return malloc(1) when size=0 only for production builds (via
--enable-zero-mallocs). Development trees will maintain their current
behavior.
Since active development is ceasing on 0.12, I'd suggest not having
separate behaviour for devel and production. Do we want patches for
n==0 array allocations at this time?

I'd really like to see Markus' patch applied.
Post by Anthony Liguori
Someone (Marcus?) will introduce four new allocation functions.
type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);
type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);
I'd like to see separate functions for arrays and single objects, to
avoid ", 1)" everywhere.

qemu_new()
qemu_new0()

qemu_new_array()
qemu_new_array0()
qemu_renew_array()
qemu_renew_array0()

In addition, Markus' patch should be applied to master to avoid
regressions while the code is converted.
--
error compiling committee.c: too many arguments to function
Anthony Liguori
2009-12-07 16:06:06 UTC
Permalink
Post by Avi Kivity
Post by Anthony Liguori
While it's always fun to argue about standards interpretation, I
wanted to capture some action items from the discussion that I think
there is agreement about. Since I want to make changes for 0.12, I
think it would be best to try and settle these now so we can do this
before -rc2.
I will send out a patch tonight or tomorrow changing qemu_malloc() to
return malloc(1) when size=0 only for production builds (via
--enable-zero-mallocs). Development trees will maintain their
current behavior.
Since active development is ceasing on 0.12, I'd suggest not having
separate behaviour for devel and production. Do we want patches for
n==0 array allocations at this time?
Covering every qemu_malloc instance this close to the GA is too risky.
I agree that having separate behavior is less than ideal but I think
it's the only sane way forward.
Post by Avi Kivity
I'd really like to see Markus' patch applied.
For 0.12, that doesn't seem like a possibility.
Post by Avi Kivity
Post by Anthony Liguori
Someone (Marcus?) will introduce four new allocation functions.
type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);
type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);
I'd like to see separate functions for arrays and single objects, to
avoid ", 1)" everywhere.
qemu_new()
qemu_new0()
qemu_new_array()
qemu_new_array0()
qemu_renew_array()
qemu_renew_array0()
Like I said, I'm not tied to naming. I'll defer this to whoever
contributes the patch and signs up for the conversion work.
Post by Avi Kivity
In addition, Markus' patch should be applied to master to avoid
regressions while the code is converted.
Let's separate that discussion as it's an independent consideration.

Regards,

Anthony Liguori
Avi Kivity
2009-12-07 16:11:25 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
Post by Anthony Liguori
While it's always fun to argue about standards interpretation, I
wanted to capture some action items from the discussion that I think
there is agreement about. Since I want to make changes for 0.12, I
think it would be best to try and settle these now so we can do this
before -rc2.
I will send out a patch tonight or tomorrow changing qemu_malloc()
to return malloc(1) when size=0 only for production builds (via
--enable-zero-mallocs). Development trees will maintain their
current behavior.
Since active development is ceasing on 0.12, I'd suggest not having
separate behaviour for devel and production. Do we want patches for
n==0 array allocations at this time?
Covering every qemu_malloc instance this close to the GA is too
risky. I agree that having separate behavior is less than ideal but I
think it's the only sane way forward.
I don't understand why. What's so insane about Markus' patch? Allowing
size=0 for both developer and production builds?

It seems like the least risky, least change approach to me. Exactly
what we want for 0.12.
Post by Anthony Liguori
Post by Avi Kivity
I'd really like to see Markus' patch applied.
For 0.12, that doesn't seem like a possibility.
Please explain why.
Post by Anthony Liguori
Post by Avi Kivity
In addition, Markus' patch should be applied to master to avoid
regressions while the code is converted.
Let's separate that discussion as it's an independent consideration.
I've asked for qemu-malloc-***@vger.kernel.org to be created for
this purpose.
--
error compiling committee.c: too many arguments to function
Anthony Liguori
2009-12-07 16:20:36 UTC
Permalink
Post by Avi Kivity
Post by Anthony Liguori
Covering every qemu_malloc instance this close to the GA is too
risky. I agree that having separate behavior is less than ideal but
I think it's the only sane way forward.
I don't understand why. What's so insane about Markus' patch?
Allowing size=0 for both developer and production builds?
There is a bug here. Callers are calling qemu_malloc incorrectly.

There is an open discussion about how to address it--fix all callers of
qemu_malloc() or allow size=0. Since there isn't agreement, a
compromise of sticking to the current behavior for the development tree,
and using the later for production since we can't guarantee the former
seems reasonable.
Post by Avi Kivity
It seems like the least risky, least change approach to me. Exactly
what we want for 0.12.
The risk is that everyone will agree to this approach in the next two
weeks. I'm fairly certain no amount of discussion on qemu-devel is
going to lead to that.
Post by Avi Kivity
Post by Anthony Liguori
Post by Avi Kivity
In addition, Markus' patch should be applied to master to avoid
regressions while the code is converted.
Let's separate that discussion as it's an independent consideration.
this purpose.
:-)

Regards,

Anthony Liguori
Avi Kivity
2009-12-07 16:26:42 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
Post by Anthony Liguori
Covering every qemu_malloc instance this close to the GA is too
risky. I agree that having separate behavior is less than ideal but
I think it's the only sane way forward.
I don't understand why. What's so insane about Markus' patch?
Allowing size=0 for both developer and production builds?
There is a bug here. Callers are calling qemu_malloc incorrectly.
There is an open discussion about how to address it--fix all callers
of qemu_malloc() or allow size=0. Since there isn't agreement, a
compromise of sticking to the current behavior for the development
tree, and using the later for production since we can't guarantee the
former seems reasonable.
If we apply the patch, the callers are no longer incorrect. Since we're
winding down development on that tree, I see no reason for the
production build to be correct and the development tree to be incorrect.
Post by Anthony Liguori
Post by Avi Kivity
It seems like the least risky, least change approach to me. Exactly
what we want for 0.12.
The risk is that everyone will agree to this approach in the next two
weeks. I'm fairly certain no amount of discussion on qemu-devel is
going to lead to that.
You've already agreed to it for the production build. There's no gain
in having separate behaviour for development and production, when a tree
is headed towards production.
--
error compiling committee.c: too many arguments to function
Anthony Liguori
2009-12-07 16:32:46 UTC
Permalink
Post by Avi Kivity
Post by Anthony Liguori
Post by Avi Kivity
Post by Anthony Liguori
Covering every qemu_malloc instance this close to the GA is too
risky. I agree that having separate behavior is less than ideal
but I think it's the only sane way forward.
I don't understand why. What's so insane about Markus' patch?
Allowing size=0 for both developer and production builds?
There is a bug here. Callers are calling qemu_malloc incorrectly.
There is an open discussion about how to address it--fix all callers
of qemu_malloc() or allow size=0. Since there isn't agreement, a
compromise of sticking to the current behavior for the development
tree, and using the later for production since we can't guarantee the
former seems reasonable.
If we apply the patch, the callers are no longer incorrect. Since
we're winding down development on that tree, I see no reason for the
production build to be correct and the development tree to be incorrect.
The problem with this whole discussion is taking the position that one
form is "correct" while the other form is "incorrect". It's not so
black and white.

I don't think a reasonable person can claim that either form of
qemu_malloc() is absolutely correct while the other is incorrect. You
may think one is better than the other but clearly, that sentiment is
not universally shared.

If we change the production build, we eliminate the immediate problem.
For 0.13, we can reduce the usage of qemu_malloc() such that it stops
becoming an issue. Then no one ever has to agree about which form is
better and we can move on with our lives.

Regards,

Anthony Liguori
Avi Kivity
2009-12-07 16:37:12 UTC
Permalink
Post by Anthony Liguori
Post by Avi Kivity
If we apply the patch, the callers are no longer incorrect. Since
we're winding down development on that tree, I see no reason for the
production build to be correct and the development tree to be incorrect.
The problem with this whole discussion is taking the position that one
form is "correct" while the other form is "incorrect". It's not so
black and white.
I don't think a reasonable person can claim that either form of
qemu_malloc() is absolutely correct while the other is incorrect. You
may think one is better than the other but clearly, that sentiment is
not universally shared.
I'm less concerned about qemu_malloc() and more about qemu itself
(though qemu_malloc() as patched fulfils all the standard's requirements).
Post by Anthony Liguori
If we change the production build, we eliminate the immediate problem.
What about developers that hit the assert? Do they send patches that
fix code that works in production just so they can run in developer mode?
Post by Anthony Liguori
For 0.13, we can reduce the usage of qemu_malloc() such that it stops
becoming an issue. Then no one ever has to agree about which form is
better and we can move on with our lives.
I agree, and the proposed changes are an improvement in their own right.
--
error compiling committee.c: too many arguments to function
Paul Brook
2009-12-07 16:24:00 UTC
Permalink
Post by Anthony Liguori
type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);
type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);
It always annoys me having to specify element count for things that aren't
arrays.

I suggestion a single object allocation function, and an array allocation
function that allows zero length arrays. Possibly also a must-not-be-zero
array form if we think it's important.

Note that conversion to object/type based allocation is not always
straightforward because inheritance means we don't have the final object type
when doing the allocation.

Paul
Anthony Liguori
2009-12-07 16:27:01 UTC
Permalink
Post by Paul Brook
Post by Anthony Liguori
type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);
type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);
It always annoys me having to specify element count for things that aren't
arrays.
I suggestion a single object allocation function, and an array allocation
function that allows zero length arrays. Possibly also a must-not-be-zero
array form if we think it's important.
Sure.
Post by Paul Brook
Note that conversion to object/type based allocation is not always
straightforward because inheritance means we don't have the final object type
when doing the allocation.
That's if you have the base object do the allocation, yup.
Post by Paul Brook
Paul
Avi Kivity
2009-12-07 16:28:28 UTC
Permalink
Post by Paul Brook
Note that conversion to object/type based allocation is not always
straightforward because inheritance means we don't have the final object type
when doing the allocation.
Instead of specifying the size, we can specify a constructor function
(we usually do have an .init). It's still dissatisfying in that it has
to duplicate the call to whatever it ends up being called, though.
--
error compiling committee.c: too many arguments to function
malc
2009-12-07 16:57:56 UTC
Permalink
Post by Markus Armbruster
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends. Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.
While it's always fun to argue about standards interpretation, I wanted to
capture some action items from the discussion that I think there is agreement
about. Since I want to make changes for 0.12, I think it would be best to try
and settle these now so we can do this before -rc2.
I will send out a patch tonight or tomorrow changing qemu_malloc() to return
malloc(1) when size=0 only for production builds (via --enable-zero-mallocs).
Development trees will maintain their current behavior.
Someone (Marcus?) will introduce four new allocation functions.
type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);
type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);
NB: The names are borrowed from glib. I'm not tied to them.
Will do our best to convert old code to use these functions where ever
possible. New code will be required to use these functions unless not
possible. n_types==0 is valid. sizeof(type)==0 is valid. Both circumstances
return a unique non-NULL pointer. If memory allocation fails, execution will
abort.
The existing qemu_malloc() will maintain it's current behavior (with the
exception of the relaxed size==0 assertion for production releases).
Does anyone object to this moving forward?
Yeah, i object to the split production/development qemu_malloc[z].
--
mailto:***@comtv.ru
Loading...