[PATCH] e1000: Pad short frames to minimum size (60 bytes)

Discussion:

[PATCH] e1000: Pad short frames to minimum size (60 bytes)

Stefan Hajnoczi

2010-09-18 20:43:45 UTC

The OpenIndiana (Solaris) e1000g driver drops frames that are too long
or too short. It expects to receive frames of at least the Ethernet
minimum size. ARP requests in particular are small and will be dropped
if they are not padded appropriately, preventing a Solaris VM from
becoming visible on the network.

Signed-off-by: Stefan Hajnoczi <***@linux.vnet.ibm.com>
---
hw/e1000.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index 7d7d140..bc983f9 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -55,6 +55,7 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL);

#define IOPORT_SIZE 0x40
#define PNPMMIO_SIZE 0x20000
+#define MIN_BUF_SIZE 60

/*
* HW models:
@@ -635,10 +636,19 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
uint32_t rdh_start;
uint16_t vlan_special = 0;
uint8_t vlan_status = 0, vlan_offset = 0;
+ uint8_t min_buf[MIN_BUF_SIZE];

if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
return -1;

+ /* Pad to minimum Ethernet frame length */
+ if (size < sizeof(min_buf)) {
+ memcpy(min_buf, buf, size);
+ memset(&min_buf[size], 0, sizeof(min_buf) - size);
+ buf = min_buf;
+ size = sizeof(min_buf);
+ }
+
if (size > s->rxbuf_size) {
DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
(unsigned long)size, s->rxbuf_size);

--
1.7.1

Hervé Poussineau

2010-09-18 20:57:53 UTC

Hi,

Post by Stefan Hajnoczi
The OpenIndiana (Solaris) e1000g driver drops frames that are too long
or too short. It expects to receive frames of at least the Ethernet
minimum size. ARP requests in particular are small and will be dropped
if they are not padded appropriately, preventing a Solaris VM from
becoming visible on the network.
---
hw/e1000.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)

[...]

Another patch creating ARP replies at least 64 bytes long has been
committed:
http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813

Does it fix your issue?

Hervé

Stefan Hajnoczi

2010-09-18 21:12:13 UTC

Post by HervÃ© Poussineau
Another patch creating ARP replies at least 64 bytes long has been
http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813
Does it fix your issue?

No I don't think so. This is an e1000 issue, it will happen if you
use tap networking too. The commit you linked to only affects slirp
and pads its ARP code.

I think there are two places where the minimum frame length can be enforced:
1. The NIC emulation code. This is currently how rtl8139, pcnet, and
ne2000 do it. My patch adds the same for e1000.
2. The net layer. If we're emulating Ethernet then it would be
possible to pad to minimum frame length in common networking code
(net.c).

Stefan

Kevin Wolf

2010-09-20 08:42:31 UTC

Post by Stefan Hajnoczi

Post by HervÃ© Poussineau
Another patch creating ARP replies at least 64 bytes long has been
http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813
Does it fix your issue?

No I don't think so. This is an e1000 issue, it will happen if you
use tap networking too. The commit you linked to only affects slirp
and pads its ARP code.
1. The NIC emulation code. This is currently how rtl8139, pcnet, and
ne2000 do it. My patch adds the same for e1000.
2. The net layer. If we're emulating Ethernet then it would be
possible to pad to minimum frame length in common networking code
(net.c).

3. The sender. I think it should be the sender's decision which packet
he sends and there's no reason to manipulate it on its way to the guest.
If the sender sends too short packets, this is where the bug is.

Actually, instead of padding the packet we should already drop it in the
device model if RCTL.SBP = 0. Does a real Solaris work when it receives
the same packet?

On the other hand, it seems that we're missing the padding where it
actually belongs: when sending packets with TCTL.PSP = 1. Did you send
the ARP packet from another qemu instance? If so, this might be the real
reason.

Kevin

Edgar E. Iglesias

2010-09-20 08:51:34 UTC

Post by Kevin Wolf

Post by Stefan Hajnoczi

Post by HervÃ© Poussineau
Another patch creating ARP replies at least 64 bytes long has been
http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813
Does it fix your issue?

No I don't think so. This is an e1000 issue, it will happen if you
use tap networking too. The commit you linked to only affects slirp
and pads its ARP code.
1. The NIC emulation code. This is currently how rtl8139, pcnet, and
ne2000 do it. My patch adds the same for e1000.
2. The net layer. If we're emulating Ethernet then it would be
possible to pad to minimum frame length in common networking code
(net.c).

3. The sender. I think it should be the sender's decision which packet
he sends and there's no reason to manipulate it on its way to the guest.
If the sender sends too short packets, this is where the bug is.

Yes, but when using tap, the ethernet sender is QEMU itself. Tap doesn't
have the same requirements as ethernet so the original sender has no
reason to pad.

Internally in QEMU, there is code that picks up tap packets and
forwards them to the emulated ethernet links, this is were padding
should be done IMO. Not in the device models receive path.

The bridge that forwards frames from tap into emulated links must
also handle different kinds of link types, as all emulated network
devices are not necessarily ethernet.

Cheers

Stefan Hajnoczi

2010-09-20 09:13:50 UTC

Post by Kevin Wolf

Post by Stefan Hajnoczi

Post by HervÃ© Poussineau
Another patch creating ARP replies at least 64 bytes long has been
http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813
Does it fix your issue?

No I don't think so. This is an e1000 issue, it will happen if you
use tap networking too. The commit you linked to only affects slirp
and pads its ARP code.
1. The NIC emulation code. This is currently how rtl8139, pcnet, and
ne2000 do it. My patch adds the same for e1000.
2. The net layer. If we're emulating Ethernet then it would be
possible to pad to minimum frame length in common networking code
(net.c).

3. The sender. I think it should be the sender's decision which packet
he sends and there's no reason to manipulate it on its way to the guest.
If the sender sends too short packets, this is where the bug is.
Actually, instead of padding the packet we should already drop it in the
device model if RCTL.SBP = 0. Does a real Solaris work when it receives
the same packet?
On the other hand, it seems that we're missing the padding where it
actually belongs: when sending packets with TCTL.PSP = 1. Did you send
the ARP packet from another qemu instance? If so, this might be the real
reason.

No, I brought up the tap interface on the host and tried to ping the
guest directly. This caused the host to send an ARP request over the
tap interface. It was not padded.

Perhaps Linux expects the NIC to deal with padding transmit frames but
tap does not do this. That's just my theory though from the experience
that real NICs often do pad to minimum size. Does anyone know the
definitive answer?

Stefan

Stefan Hajnoczi

2010-09-19 06:36:51 UTC

On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.

Stefan

Michael S. Tsirkin

2010-09-19 11:18:01 UTC

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

Not all nic devices have to be emulate ethernet, so not all devices want
the padding, e.g. virtio does not. It's also easy to imagine an
ethernet device that strips the padding: would be silly to add it
just to have it stripped.

If we really want to do this generically, we could implement a function dealing
with the padding, and call it from relevant devices.

But I think the patch looks fine as is.

--
MST

Edgar E. Iglesias

2010-09-19 12:04:55 UTC

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

Not all nic devices have to be emulate ethernet, so not all devices want
the padding, e.g. virtio does not.

Right, ethernet behaviour should obviously not be applied unconditionally for
all net devices.

Post by Michael S. Tsirkin
It's also easy to imagine an
ethernet device that strips the padding: would be silly to add it
just to have it stripped.

I dont beleive that is possible. The FCS comes last, so an ethernet MAC
would have to do really silly things to differentiate between padding and
real payload.

Post by Michael S. Tsirkin
If we really want to do this generically, we could implement a function dealing
with the padding, and call it from relevant devices.

Another way is to have network devices register their link types so that the
generic bridge can apply whatever link specific fixups that may be needed.

I would prefer to have the padding of bridged frames decoupled from the
device models, but I cant say I feel very strongly about this.

Cheers

Michael S. Tsirkin

2010-09-19 12:03:48 UTC

Post by Edgar E. Iglesias

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

Not all nic devices have to be emulate ethernet, so not all devices want
the padding, e.g. virtio does not.

Right, ethernet behaviour should obviously not be applied unconditionally for
all net devices.

Post by Michael S. Tsirkin
It's also easy to imagine an
ethernet device that strips the padding: would be silly to add it
just to have it stripped.

I dont beleive that is possible. The FCS comes last, so an ethernet MAC
would have to do really silly things to differentiate between padding and
real payload.

Post by Michael S. Tsirkin
If we really want to do this generically, we could implement a function dealing
with the padding, and call it from relevant devices.

Another way is to have network devices register their link types so that the
generic bridge can apply whatever link specific fixups that may be needed.

Well, we want to move away from using the generic bridge and to
-netdev pairing of front/backend, anyway.
Adding code there will just complicate that.

Post by Edgar E. Iglesias
I would prefer to have the padding of bridged frames decoupled from the
device models, but I cant say I feel very strongly about this.
Cheers

Kevin Wolf

2010-09-20 08:50:40 UTC

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

What's the reason that it isn't true in QEMU today? Shouldn't we fix
these problems rather than making device emulations incorrect to
compensate for it?

Kevin

Edgar E. Iglesias

2010-09-20 09:03:37 UTC

Post by Kevin Wolf

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

What's the reason that it isn't true in QEMU today? Shouldn't we fix
these problems rather than making device emulations incorrect to
compensate for it?

Yes we should, I agree.

Cheers

Stefan Hajnoczi

2010-09-20 09:34:57 UTC

Post by Edgar E. Iglesias

Post by Kevin Wolf

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

What's the reason that it isn't true in QEMU today? Shouldn't we fix
these problems rather than making device emulations incorrect to
compensate for it?

Yes we should, I agree.

Does someone with more knowledge for QEMU networking than me want to
take a stab at it?

Stefan

Michael S. Tsirkin

2010-09-20 10:42:59 UTC

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
Our APIs do not really pass complete ethernet packets:
we forward packets without checksum and padding.

I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

Anthony Liguori

2010-09-20 20:31:32 UTC

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root cause
is the tap code, not the devices..

Regards,

Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

Edgar E. Iglesias

2010-09-20 20:40:35 UTC

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root cause
is the tap code, not the devices..
Regards,
Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

From f77c3143f3fbefdfa2f0cc873c2665b5aa78e8c9 Mon Sep 17 00:00:00 2001
Date: Mon, 20 Sep 2010 15:29:31 -0500
Subject: [PATCH] tap: make sure packets are at least 40 bytes long
This is required by ethernet drivers but not enforced in the Linux tap code so
we need to fix it up ourselves.

This enforces ethernet semantics on the internal links (which is probably
not good), but it's IMO much better than changing the devices. It also
moves the workaround closer to the root of the problem. IMO, it's a step
in the right direction.

Post by Anthony Liguori
diff --git a/net/tap.c b/net/tap.c
index 4afb314..822241a 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -179,7 +179,13 @@ static int tap_can_send(void *opaque)
#ifndef __sun__
ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen)
{
- return read(tapfd, buf, maxlen);
+ ssize_t len;
+
+ len = read(tapfd, buf, maxlen);
+ if (len > 0) {
+ len = MAX(MIN(maxlen, 40), len);
+ }
+ return len;
}
#endif
--
1.7.0.4

Anthony Liguori

2010-09-20 20:57:19 UTC

Post by Edgar E. Iglesias

Post by Anthony Liguori
From f77c3143f3fbefdfa2f0cc873c2665b5aa78e8c9 Mon Sep 17 00:00:00 2001
Date: Mon, 20 Sep 2010 15:29:31 -0500
Subject: [PATCH] tap: make sure packets are at least 40 bytes long
This is required by ethernet drivers but not enforced in the Linux tap code so
we need to fix it up ourselves.

This enforces ethernet semantics on the internal links (which is probably
not good),

Plus plus ungood.
When we do add e.g. ipoib support, we'll have to go and hunt these bugs down again.
Also will make it impossible to implement any devices that pass in guest buffers
without FCS and padding.

That's actually a good point which strongly is in favor of making the
devices do the padding themselves.

Regards,

Anthony Liguori

Michael S. Tsirkin

2010-09-20 20:44:34 UTC

Post by Edgar E. Iglesias

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root cause
is the tap code, not the devices..
Regards,
Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

From f77c3143f3fbefdfa2f0cc873c2665b5aa78e8c9 Mon Sep 17 00:00:00 2001
Date: Mon, 20 Sep 2010 15:29:31 -0500
Subject: [PATCH] tap: make sure packets are at least 40 bytes long
This is required by ethernet drivers but not enforced in the Linux tap code so
we need to fix it up ourselves.

This enforces ethernet semantics on the internal links (which is probably
not good),

Plus plus ungood.
When we do add e.g. ipoib support, we'll have to go and hunt these bugs down again.
Also will make it impossible to implement any devices that pass in guest buffers
without FCS and padding.

Post by Edgar E. Iglesias
but it's IMO much better than changing the devices.

How much better?

Post by Edgar E. Iglesias
It also
moves the workaround closer to the root of the problem.
IMO, it's a step in the right direction.

Post by Anthony Liguori
diff --git a/net/tap.c b/net/tap.c
index 4afb314..822241a 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -179,7 +179,13 @@ static int tap_can_send(void *opaque)
#ifndef __sun__
ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen)
{
- return read(tapfd, buf, maxlen);
+ ssize_t len;
+
+ len = read(tapfd, buf, maxlen);
+ if (len > 0) {
+ len = MAX(MIN(maxlen, 40), len);
+ }

Let's at least add a comment explaining what does this do?
Also - does tcp backend need this as well? Other backends?

Post by Edgar E. Iglesias

Post by Anthony Liguori
+ return len;
}
#endif
--
1.7.0.4

Michael S. Tsirkin

2010-09-20 20:35:01 UTC

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root
cause is the tap code, not the devices..
Regards,
Anthony Liguori

There are any number of ways to fix it.
virtio-net does not need the padding and tap does not
give it to us. So current API where we pass in the eth frame
without FCS or padding and devices add it if they like
seems more generic.

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

From f77c3143f3fbefdfa2f0cc873c2665b5aa78e8c9 Mon Sep 17 00:00:00 2001

Date: Mon, 20 Sep 2010 15:29:31 -0500
Subject: [PATCH] tap: make sure packets are at least 40 bytes long
This is required by ethernet drivers but not enforced in the Linux tap code so
we need to fix it up ourselves.
diff --git a/net/tap.c b/net/tap.c
index 4afb314..822241a 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -179,7 +179,13 @@ static int tap_can_send(void *opaque)
#ifndef __sun__
ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen)
{
- return read(tapfd, buf, maxlen);
+ ssize_t len;
+
+ len = read(tapfd, buf, maxlen);
+ if (len > 0) {
+ len = MAX(MIN(maxlen, 40), len);
+ }
+ return len;
}
#endif
--
1.7.0.4

Edgar E. Iglesias

2010-09-21 09:21:04 UTC

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root cause
is the tap code, not the devices..
Regards,
Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

From f77c3143f3fbefdfa2f0cc873c2665b5aa78e8c9 Mon Sep 17 00:00:00 2001
Date: Mon, 20 Sep 2010 15:29:31 -0500
Subject: [PATCH] tap: make sure packets are at least 40 bytes long
This is required by ethernet drivers but not enforced in the Linux tap code so
we need to fix it up ourselves.
diff --git a/net/tap.c b/net/tap.c
index 4afb314..822241a 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -179,7 +179,13 @@ static int tap_can_send(void *opaque)
#ifndef __sun__
ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen)
{
- return read(tapfd, buf, maxlen);
+ ssize_t len;
+
+ len = read(tapfd, buf, maxlen);
+ if (len > 0) {
+ len = MAX(MIN(maxlen, 40), len);

A small detail :)
40 -> 64 (including a dummy FCS).

I don't think so: e1000 at least has code to tack the FCS on,
so we'll end up with a 68 bytes.

And at the moment e1000 also has padding, both padding
and FCS appending should go away from ethernet models before
this goes in.

Anyway, if you guys maintaining the networking parts are in
agreement that padding and FCS appending should be done in
the device models (at least for the moment), I'll accept
that and back-off. In that case, I think your suggestion
of hiding things behind some kind of generic macro or
function would be good. At least it will clarify things.

Cheers

Michael S. Tsirkin

2010-09-21 09:17:07 UTC

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root cause
is the tap code, not the devices..
Regards,
Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
In QEMU that isn't true today and that's why rtl8139, pcnet, and
ne2000 already do this same padding. This patch is the smallest
change to cover e1000.

IMO this kind of padding should somehow be done by the bridge that forwards
packets into the qemu vlan (e.g slirp or the generic tap bridge).

That should work and we can then drop the padding code from existing
NICs. I'll take a look.
Stefan

From f77c3143f3fbefdfa2f0cc873c2665b5aa78e8c9 Mon Sep 17 00:00:00 2001
Date: Mon, 20 Sep 2010 15:29:31 -0500
Subject: [PATCH] tap: make sure packets are at least 40 bytes long
This is required by ethernet drivers but not enforced in the Linux tap code so
we need to fix it up ourselves.
diff --git a/net/tap.c b/net/tap.c
index 4afb314..822241a 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -179,7 +179,13 @@ static int tap_can_send(void *opaque)
#ifndef __sun__
ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen)
{
- return read(tapfd, buf, maxlen);
+ ssize_t len;
+
+ len = read(tapfd, buf, maxlen);
+ if (len > 0) {
+ len = MAX(MIN(maxlen, 40), len);

A small detail :)
40 -> 64 (including a dummy FCS).

I don't think so: e1000 at least has code to tack the FCS on,
so we'll end up with a 68 bytes.

Post by Anthony Liguori
+ }
+ return len;
}
#endif
--
1.7.0.4

Stefan Weil

2011-02-21 21:13:48 UTC

On Mon, Sep 20, 2010 at 9:31 PM, Anthony Liguori

Post by Anthony Liguori

Post by Michael S. Tsirkin

Post by Stefan Hajnoczi
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias

This doesn't look right. AFAIK, MAC's dont pad on receive.

I agree. NICs that do padding will do it on transmit, not receive.
Anything coming in on the wire should already have the minimum length.

QEMU never gets access to the wire.
we forward packets without checksum and padding.
I think it makes complete sense to keep this and
handle padding in devices because we
have devices that pass the frame to guest without padding and checksum.
It should be easy to replace padding code in devices that
need it with some kind of macro.

Would this not also address the problem? It sounds like the root
cause is
the tap code, not the devices..

This won't work when s->has_vnet_hdr is 1 because the virtio-net
header consumes buffer space and reduces the amount we pad. The
padding size should be 60 + (s->has_vnet_hdr ? sizeof(struct
virtio_net_hdr) : 0).
Adjusting the length without clearing the untouched buffer space is
probably fine. I'm trying to think of a scenario where this becomes
an information leak (security issue). Perhaps if the guest has vlans
enabled and allows different users to sniff traffic only on their
vlans? Then you may be able to read part of another vlan's traffic by
sending short packets to your vlan and gathering the padding data.
This is pretty contrived but doing a <60 byte memset would prevent the
issue for sure.
Stefan

The latest patch which was sent was for eepro100.c (Bruce Rogers),
but any ethernet NIC has the same kind of problem.

Does the majority still think that patching the MAC
emulation is the right way (although real MACs don't
pad on receive, as Edgar already explained)?

Then all ethernet NIC emulations should
handle the padding in the same way.
The code should be marked as a workaround
which has nothing to do with the MAC emulation.
MAC emulation code for short frames should not be
removed.

If there is consensus on this, I'll send a modified
patch for eepro100.c (or Bruce modifies his patch
so it does add the workaround comment without
removing the short frame code).

The better way would be padding in qemu's network
code for those devices which need it (that means
adding a new flag "min_frame_length" for ethernet
devices).

Stefan W.

Michael S. Tsirkin

2010-09-20 17:52:12 UTC

Post by Stefan Hajnoczi
The OpenIndiana (Solaris) e1000g driver drops frames that are too long
or too short. It expects to receive frames of at least the Ethernet
minimum size. ARP requests in particular are small and will be dropped
if they are not padded appropriately, preventing a Solaris VM from
becoming visible on the network.

I'll put this patch on my tree: let's be consistent and fix e1000,
it is also a good approach for 0.13 I think.
Anthony - could you pick this up for 0.13 please?
If someone wants to then strip it off from all devices and tweak, it can
be done.

Post by Stefan Hajnoczi
---
hw/e1000.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/hw/e1000.c b/hw/e1000.c
index 7d7d140..bc983f9 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -55,6 +55,7 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL);
#define IOPORT_SIZE 0x40
#define PNPMMIO_SIZE 0x20000
+#define MIN_BUF_SIZE 60
/*
@@ -635,10 +636,19 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
uint32_t rdh_start;
uint16_t vlan_special = 0;
uint8_t vlan_status = 0, vlan_offset = 0;
+ uint8_t min_buf[MIN_BUF_SIZE];
if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
return -1;
+ /* Pad to minimum Ethernet frame length */
+ if (size < sizeof(min_buf)) {
+ memcpy(min_buf, buf, size);
+ memset(&min_buf[size], 0, sizeof(min_buf) - size);
+ buf = min_buf;
+ size = sizeof(min_buf);
+ }
+
if (size > s->rxbuf_size) {
DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
(unsigned long)size, s->rxbuf_size);
--
1.7.1

22 Replies
3 Views
Permalink to this page
Disable enhanced parsing

Thread Navigation

Stefan Hajnoczi 2010-09-18 20:43:45 UTC

Hervé Poussineau 2010-09-18 20:57:53 UTC

Stefan Hajnoczi 2010-09-18 21:12:13 UTC

Kevin Wolf 2010-09-20 08:42:31 UTC

Edgar E. Iglesias 2010-09-20 08:51:34 UTC

Stefan Hajnoczi 2010-09-20 09:13:50 UTC

Stefan Hajnoczi 2010-09-19 06:36:51 UTC

Michael S. Tsirkin 2010-09-19 11:18:01 UTC

Edgar E. Iglesias 2010-09-19 12:04:55 UTC

Michael S. Tsirkin 2010-09-19 12:03:48 UTC

Kevin Wolf 2010-09-20 08:50:40 UTC

Edgar E. Iglesias 2010-09-20 09:03:37 UTC

Stefan Hajnoczi 2010-09-20 09:34:57 UTC

Michael S. Tsirkin 2010-09-20 10:42:59 UTC

Anthony Liguori 2010-09-20 20:31:32 UTC

Edgar E. Iglesias 2010-09-20 20:40:35 UTC

Anthony Liguori 2010-09-20 20:57:19 UTC

Michael S. Tsirkin 2010-09-20 20:44:34 UTC

Michael S. Tsirkin 2010-09-20 20:35:01 UTC

Edgar E. Iglesias 2010-09-21 09:21:04 UTC

Michael S. Tsirkin 2010-09-21 09:17:07 UTC

Stefan Weil 2011-02-21 21:13:48 UTC

Michael S. Tsirkin 2010-09-20 17:52:12 UTC

about - legalese

Loading...