Win2K8 32-bit Mix of IDE and SCSI assertion
Peter Lieven
2010-08-05 20:58:24 UTC
Hi,

Today we saw a Win2k8 VM with an IDE first device and a SCSI second device crash reproducibly during boot.
Win2k8 32-bit Server was installed with IDE only. The second (SCSI) device was added later.

Here is the output and command line:

Aug 5 20:42:55 172.21.59.142 exec: /usr/bin/qemu-kvm-0.12.4 -net tap,vlan=726,script=no,downscript=no,ifname=tap0 -net nic,vlan=726,model=rtl8139,macaddr=52:54:00:fe:00:bf -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-e4ce1ab04-3ce7a1250354c545-p01-w2k8,if=ide,boot=on,cache=none,aio=native -drive format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-319e1ab04-39a7a1250204c503-test2,if=scsi,boot=off,cache=none,aio=native -m 4096 -smp 4 -monitor tcp:0:4001,server,nowait -vnc :1 -name 'W2K8-01' -boot order=dc,menu=off -k de -pidfile /var/run/qemu/vm-200.pid -mem-path /hugepages -mem-prealloc -cpu qemu64,model_id='Intel(R) Xeon(R) CPU L5640 @ 2.27GHz',-nx -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0xff = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x100 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x101 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x102 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: qemu-kvm-0.12.4: /usr/src/qemu-kvm-0.12.4/hw/lsi53c895a.c:512: lsi_do_dma: Assertion `s->current' failed.
Aug 5 20:43:06 172.21.59.142 kvm: Aborted
Aug 5 20:43:06 172.21.59.142 kvm errno=134

Any ideas?

Best Regards,
Peter

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Gerd Hoffmann
2010-08-06 08:45:10 UTC
Post by Peter Lieven
Hi,
we today saw a Win2k8 VM with first device IDE and second device SCSI crash reproducibly during boot.
Win2k8 32-bit Server was installed with IDE only. The second SCSI device was added later.
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0xff = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x100 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x101 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x102 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: qemu-kvm-0.12.4: /usr/src/qemu-kvm-0.12.4/hw/lsi53c895a.c:512: lsi_do_dma: Assertion `s->current' failed.
Aug 5 20:43:06 172.21.59.142 kvm: Aborted
Aug 5 20:43:06 172.21.59.142 kvm errno=134
Any ideas?
Fixed in master.

http://git.qemu.org/qemu.git/commit/?id=d1d74664ea99cdc571afee12e31c8625595765b0

cheers,
Gerd

Michal Novotny
2010-08-06 11:08:25 UTC
Post by Peter Lieven
Hi,
we today saw a Win2k8 VM with first device IDE and second device SCSI crash reproducibly during boot.
Win2k8 32-bit Server was installed with IDE only. The second SCSI device was added later.
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0xff = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x100 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x101 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: lsi_scsi: error: Unhandled writeb 0x102 = 0x0
Aug 5 20:43:06 172.21.59.142 kvm: qemu-kvm-0.12.4: /usr/src/qemu-kvm-0.12.4/hw/lsi53c895a.c:512: lsi_do_dma: Assertion `s->current' failed.
Aug 5 20:43:06 172.21.59.142 kvm: Aborted
Aug 5 20:43:06 172.21.59.142 kvm errno=134
Any ideas?
Best Regards,
Peter
Peter, you say you're using the 32-bit (x86) version of Win2K8 (i.e. you're
not using Win2K8R2, which doesn't have LSI drivers for lsi53c895a AFAIK)? I'm
not sure, but isn't Win2K8 x64 only? Does a 32-bit version of this system
exist? In any case, the registers in the log you posted were problematic
with the x64 LSI drivers because of a bad phase jump implementation.

Paolo, I remember that when we were studying the LSI SCSI controller code,
those unhandled writes to registers 0xff, 0x100, 0x101 and 0x102 showed up
with Windows x64 guests, and they were caused by the phase jump registers
not meeting the specs. When I tried to search for information on errno=134
(on the assumption that it's a standard OS error) I used perror, but it
returned some kind of MySQL error code:

$ perror 134
MySQL error code 134: Record was already deleted (or record file crashed)
$

I don't know whether installing MySQL can override some of perror's codes,
but even googling for OS error 134 turned up nothing useful. Do you have
any ideas? Has your patch for the LSI SCSI controller been applied
upstream?

Thanks,
Michal
--
Michal Novotny<***@redhat.com>, RHCE
Virtualization Team (xen userspace), Red Hat

Paolo Bonzini
2010-08-06 11:17:51 UTC
Aug 5 20:43:06 172.21.59.142 kvm: Aborted
Aug 5 20:43:06 172.21.59.142 kvm errno=134
when I tried to search some information on errno=134 (based on
assumption it's a standard OS error)
I don't know where exactly the output is coming from, but in this case
134 is not really an errno, but a value returned from waitpid. It
indicates that kvm exited with SIGABRT (SIGABRT = 6, plus bit 7 is set).
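
The arithmetic can be checked with the standard wait-status macros. The following is a minimal sketch, not taken from the thread: the helper names term_signal, dumped_core and describe_status are made up for illustration, and it assumes a Linux-style status encoding in which the low seven bits hold the signal number and bit 7 (0x80) is the core-dump flag.

```c
/* Ask glibc to expose WCOREDUMP even under strict -std= modes. */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Signal that terminated the process, or 0 if it was not killed by a signal. */
int term_signal(int status)
{
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* 1 if the status records a core dump, else 0.
 * WCOREDUMP is a common extension (Linux, BSD), not part of POSIX,
 * so this falls back to 0 where the macro is unavailable. */
int dumped_core(int status)
{
#ifdef WCOREDUMP
    return (WIFSIGNALED(status) && WCOREDUMP(status)) ? 1 : 0;
#else
    return 0;
#endif
}

/* Write a one-line description of a raw wait status into buf. */
void describe_status(int status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (WIFSIGNALED(status)) {
        snprintf(buf, len, "killed by signal %d%s", term_signal(status),
                 dumped_core(status) ? " (core dumped)" : "");
    } else if (WIFEXITED(status)) {
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    } else {
        snprintf(buf, len, "other status 0x%x", (unsigned)status);
    }
}
```

Under that encoding 134 is 0x86, i.e. signal 6 (SIGABRT) with the core-dump flag set, which matches the "Aborted" line in the log. A shell that reports 128 plus the signal number would show the same 134, so the number alone does not say which layer printed it.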
I used perror but it returned some kind of MySQL error code: $ perror
134 MySQL error code 134: Record was already deleted (or record file
crashed) $
You're confusing the C standard function perror with some random
executable you have on your system:

$ yum whatprovides '*/perror'
mysql-server-5.1.45-2.fc13.x86_64 : The MySQL server and related files
Repo : fedora
Matched from:
Filename : /usr/bin/perror

:)
Is your patch for LSI SCSI controller applied in the upstream ?
Yes, Gerd already pointed to it.

Paolo
Michal Novotny
2010-08-06 11:30:47 UTC
Post by Paolo Bonzini
Aug 5 20:43:06 172.21.59.142 kvm: Aborted
Aug 5 20:43:06 172.21.59.142 kvm errno=134
when I tried to search some information on errno=134 (based on
assumption it's a standard OS error)
I don't know where exactly the output is coming from, but in this case
134 is not really an errno, but a value returned from waitpid. It
indicates that kvm exited with SIGABRT (SIGABRT = 6, plus bit 7 is set).
Well then, this could be the thing.
Post by Paolo Bonzini
I used perror but it returned some kind of MySQL error code: $ perror
134 MySQL error code 134: Record was already deleted (or record file
crashed) $
You're confusing the C standard function perror with some random
executable you have on your system:
$ yum whatprovides '*/perror'
mysql-server-5.1.45-2.fc13.x86_64 : The MySQL server and related files
Repo : fedora
Filename : /usr/bin/perror
:)
Yeah, you're right, it is running that file. The reason I got confused is
that when I pass it a known OS error code it returns the OS error, but
when the code is not known it returns the MySQL error.

I wrote a small program to confirm it. It has this line: printf("err
134: %s\n", strerror(134)); and when I run it, it prints:

$ ./ax
err 134: Unknown error 134
$

so that's why I got confused, sorry.
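
The check described above can be written out as a complete fragment. This is a sketch rather than the actual source of the test program, which the thread does not show beyond the single printf line; the function name format_errno is hypothetical. The text strerror returns for an unassigned number is implementation-defined, so "Unknown error 134" should be read as what this particular system printed, not as a guaranteed result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "err <n>: <text>" in the same shape as the printf line quoted
 * above. Returns the value of snprintf, i.e. the length the full string
 * needs (excluding the terminating NUL). */
int format_errno(int errnum, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    /* strerror() maps an errno value to a message; for numbers the C
     * library does not know, it returns a generic placeholder string. */
    return snprintf(buf, len, "err %d: %s", errnum, strerror(errnum));
}
```

Calling format_errno(134, buf, sizeof buf) and printing buf repeats the experiment. Unlike /usr/bin/perror from the MySQL package, strerror only consults the C library's own errno table.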
Post by Paolo Bonzini
Is your patch for LSI SCSI controller applied in the upstream ?
Yes, Gerd already pointed to it.
Paolo
Well, if the patch is applied then I don't know what else could have
caused it, since those registers were closely connected to the invalid
phase jumps.

Michal
--
Michal Novotny<***@redhat.com>, RHCE
Virtualization Team (xen userspace), Red Hat

Peter Lieven
2010-08-06 14:37:05 UTC
Post by Michal Novotny
Aug 5 20:43:06 172.21.59.142 kvm: Aborted
Aug 5 20:43:06 172.21.59.142 kvm errno=134
when I tried to search some information on errno=134 (based on
assumption it's a standard OS error)
I don't know where exactly the output is coming from, but in this case 134 is not really an errno, but a value returned from waitpid. It indicates that kvm exited with SIGABRT (SIGABRT = 6, plus bit 7 is set).
Well then, this could be the thing.
I used perror but it returned some kind of MySQL error code: $ perror
134 MySQL error code 134: Record was already deleted (or record file
crashed) $
You're confusing the C standard function perror with some random
executable you have on your system:
$ yum whatprovides '*/perror'
mysql-server-5.1.45-2.fc13.x86_64 : The MySQL server and related files
Repo : fedora
Filename : /usr/bin/perror
:)
Yeah, you're right, it is running that file. The reason I got confused is that when I pass it a known OS error code it returns the OS error, but when the code is not known it returns the MySQL error.
$ ./ax
err 134: Unknown error 134
$
so that's why I got confused, sorry.
Is your patch for LSI SCSI controller applied in the upstream ?
Yes, Gerd already pointed to it.
Paolo
Well, if the patch is applied then I don't know what else could have caused it, since those registers were closely connected to the invalid phase jumps.
Hi all, the error was with qemu 0.12.4. I looked at the 0.12.5 changelog, but did not find any SCSI-related changes after 0.12.4.
The guest that caused the problem was Win2k8 Server 64-bit. I applied the patch Gerd pointed to and this fixed the issue.

Thanks to all,
Peter
Post by Michal Novotny
Michal
--
Virtualization Team (xen userspace), Red Hat
Michal Novotny
2010-08-06 14:36:47 UTC
Post by Peter Lieven
...
Hi all, the error was with qemu 0.12.4. I looked at the 0.12.5 changelog, but did not find any SCSI-related changes after 0.12.4.
The guest that caused the problem was Win2k8 Server 64-bit. I applied the patch Gerd pointed to and this fixed the issue.
Thanks to all,
Peter
Hi Peter,
I got a little confused because you wrote about Win2K8 32-bit, so I
didn't know whether you were referring to a 32-bit guest or a 64-bit one.
The patch is necessary for the x64 version of the LSI SCSI drivers since
it fixes the phase mismatch implementation. Linux doesn't have those
issues since there the jump registers 1 and 2 are the same. This was also
true for 32-bit versions of Windows, but Microsoft seems to have changed
this to match the specs in the x64 version of those LSI SCSI drivers.
Anyway, it's great to hear it's working fine and that this is not some
other issue ;)

Regards,
Michal
--
Michal Novotny<***@redhat.com>, RHCE
Virtualization Team (xen userspace), Red Hat
