ttt: (FreeBSD) ZFSの実戦導入は…

あいかわらず、ZFSの動作テスト中です。なかなか安定しなくって　[E:coldsweats01]

☆

まず、vfs.numvnodesの値ですが、こんな感じ。

20091221

増える一方ではなく、ときどき減ったりしてます。ファイルを削除すると、減りました。

たぶん、メモリに余裕がある間は、あまり減少しない、ってことだと思います。

☆

ストレス・テスト中に、たまに発生していた、ときどき通信できなくなる、という問題。

ifconfig re0 down ; sleep 5 ; ifconfig re0 up

すると一瞬だけ直ることもあったりしたんですが。

これは、どうも、if_reドライバまわりの問題ではないか？という感じがしてきました。

emなネットワークインターフェイスカードを増設して使ってみたところ、通信できなくなる問題は、発生しなくなりました。

re0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
      ether 90:e6:XX:XX:XX:XX
      media: Ethernet autoselect (10baseT/UTP <half-duplex>)
      status: no carrier
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
      options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
      ether 00:02:XX:XX:XX:XX
      inet ***.***.***.*** netmask 0xffffff00 broadcast ***.**.***.255
      media: Ethernet autoselect (1000baseT <full-duplex>)
      status: active

ただ、PCIなので、パフォーマンスが落ちるんじゃないか、という懸念はあります。

割り込みはこんな感じ。

% vmstat -i
interrupt                         total    rate
irq1: atkbd0                      292       0
irq16: hdac1 ohci+                   14       0
irq18: ohci2 ohci+                   4       0
irq20: em0                   359874479    14735
irq22: atapci0                17501782       716
irq23: fwohci0                      2       0
cpu0: timer                   48840942    1999
irq256: hdac0                         1       0
cpu3: timer                   48837822    1999
cpu2: timer                   48837822    1999
cpu1: timer                   48837821    1999
Total                         572730981    23451

NFSでファイルを大量にコピーしたり、make buildworld buildkernel、 make installworld installkernel DESTDIR=～　などしても大丈夫でした。
portsのopenoffice.org-3-develもビルドできました。

☆

さあ、これで実戦導入できるか、と思ったんですが、こんなトラブル発生。

Dec 21 12:18:37 HOST kernel: ad12: TIMEOUT - READ_DMA retrying (1 retry left) LBA=142673783
Dec 21 12:21:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad12 offset=262144 size=8192 error=6
Dec 21 12:21:25 HOST kernel: ata6: port is not ready (timeout 10000ms) tfd = 00000080
Dec 21 12:21:25 HOST kernel: ata6: hardware reset timeout
Dec 21 12:21:25 HOST kernel: ad12: FAILURE - device detached
Dec 21 12:21:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad12 offset=1500301230080 size=8192 error=6
Dec 21 12:21:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad12 offset=1500301492224 size=8192 error=6

突然、/dev/ad12なディスクが１台、エラーを出したということで切り離されてしまいました。

# atacontrol list

ATA channel 0:
Master:    no device present
Slave:    no device present
ATA channel 1:
Master:    no device present
Slave:    no device present
ATA channel 2:
Master: ad4 <WDC WD15EADS-00P8B0/01.00A01> SATA revision 2.x
Slave:    no device present
ATA channel 3:
Master: ad6 <WDC WD15EADS-00P8B0/01.00A01> SATA revision 2.x
Slave:    no device present
ATA channel 4:
Master: ad8 <WDC WD15EADS-00P8B0/01.00A01> SATA revision 2.x
Slave:    no device present
ATA channel 5:
Master:    no device present
Slave:    no device present
ATA channel 6:
Master:    no device present　■さっきまで、ここにad12がいた
Slave:    no device present
ATA channel 7:
Master: ad14 <WDC WD15EADS-00P8B0/01.00A01> SATA revision 2.x
Slave:    no device present

RAIDZ（RAID5相当）なので、ディスクが１台ダメになっても動いてしまうわけですが・・・

こんなことになっていました。

# zpool list
NAME    SIZE USED AVAIL CAP HEALTH ALTROOT
storage 5.44T 837G 4.62T 15% DEGRADED -

# zpool status
pool: storage
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
      attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
      using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: none requested
config:

      NAME       STATE    READ WRITE CKSUM
      storage    DEGRADED    0    0    0
      raidz1 DEGRADED    0    0    0
         ad6    ONLINE    0    0    0
         ad8    ONLINE    0    0    0
         ad12 REMOVED    0 126K    0
         ad14 ONLINE    0    0    0

こんなことをやっているうちに、ad12が復活しちゃいました。

# atacontrol reinit ata6
Master: no device present
Slave: no device present

# atacontrol detach ata6

# atacontrol reinit ata6
atacontrol: ioctl(IOCATAREINIT): Device not configured
Master: no device present
Slave: no device present

# atacontrol attach ata6
Master: ad12 <WDC WD15EADS-00P8B0/01.00A01> SATA revision 2.x
Slave: no device present

smartctl -a /dev/ad12で、SMARTの情報を見ても、とくにエラーは記録されていませんでした。

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status                Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline    Completed without error    00%    302       -
# 2 Short offline    Completed without error    00%    278       -
# 3 Extended offline Completed without error    00%    264       -
# 4 Short offline    Completed without error    00%    254       -
# 5 Short offline    Completed without error    00%    231       -
# 6 Short offline    Completed without error    00%    207       -
# 7 Short offline    Completed without error    00%    184       -
# 8 Short offline    Completed without error    00%    160       -
# 9 Extended offline Completed without error    00%    150       -
#10 Short offline    Completed without error    00%    137       -
#11 Short offline    Completed without error    00%    113       -
#12 Extended offline Completed without error    00%       99       -
#13 Short offline    Completed without error    00%       89       -
#14 Conveyance offline Completed without error    00%       0       -

RAIDZのほうは、こんな感じで、修復できちゃうみたい・・・？！

# zpool offline storage ad12

# zpool online storage ad12

気になったので、scrubというのを実行。

# zpool scrub storage
# zpool status storage
pool: storage
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
      attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
      using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub in progress for 0h0m, 0.00% done, 1096h30m to go
config:

      NAME       STATE    READ WRITE CKSUM
      storage    ONLINE    0    0    0
      raidz1 ONLINE    0    0    0
         ad6    ONLINE    0    0    0
         ad8    ONLINE    0    0    0
         ad12 ONLINE    0 157K    0
         ad14 ONLINE    0    0    0

errors: No known data errors

１時間ちょっとで終了したらしく、とくにエラーもなし。

# zpool status -v storage
pool: storage
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
      attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
      using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed after 1h12m with 0 errors on Mon Dec 21 17:38:06 2009
config:

      NAME       STATE    READ WRITE CKSUM
      storage    ONLINE    0    0    0
      raidz1 ONLINE    0    0    0
         ad6    ONLINE    0    0    0
         ad8    ONLINE    0    0    0
         ad12 ONLINE    0 157K    0
         ad14 ONLINE    0    0    0

errors: No known data errors

まったくの野生の勘というか、経験的に感じることなんですが、今回発生したエラーは、ハードウェアの障害ではなく、ソフトウェア的な問題、FreeBSDのドライバ周りの問題のような気がしてなりません。

SMARTでエラーが記録されていない、という理由もありますが（SMARTも、あまりあてにならない気がしてるけど）、それ以外にも、ログを抜粋しますが・・・

Dec 21 12:21:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad12 offset=1500301230080 size=8192 error=6
Dec 21 12:21:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad12 offset=1500301492224 size=8192 error=6

Dec 11 18:22:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad8 offset=1500301230080 size=8192 error=6
Dec 11 18:22:25 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad8 offset=1500301492224 size=8192 error=6

Dec 11 18:22:40 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad6 offset=1500301230080 size=8192 error=6
Dec 11 18:22:40 HOST root: ZFS: vdev I/O failure, zpool=storage path=/dev/ad6 offset=1500301492224 size=8192 error=6

異なる３台のドライブで（ただしモデルは同じ）、まったく同じ場所（offset=1500301492224とか）をアクセスして、同じエラーが出るなんて偶然、あるわけないと思うわけで。

ところで、このoffsetの数値は、ドライブの容量に近い値なんですが

=== START OF INFORMATION SECTION ===
Device Model: WDC WD15EADS-00P8B0
Serial Number: WD-WCAVU*******
Firmware Version: 01.00A01
User Capacity: 1,500,301,910,016 bytes

なんか意味ありげです。

1500301492224/512 = 2930276352
8192 / 512 = 16

なので、

# dd if=/dev/ad12 bs=512 iseek=2930276352 count=16 of=/tmp/ad12.dat
16+0 records in
16+0 records out
8192 bytes transferred in 0.485605 secs (16870 bytes/sec)

# hd /tmp/ad12.dat
00000000 00 78 a8 ae 00 00 00 00 00 00 00 00 00 00 00 00 |.x┏............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 78 a8 ae |.............x┏|
00000200 01 78 a8 ae 00 00 00 00 00 00 00 00 00 00 00 00 |.x┏............|
00000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
略

# dd if=/dev/ad8 bs=512 iseek=2930276352 count=16 of=/tmp/ad8.dat
16+0 records in
16+0 records out
8192 bytes transferred in 0.038545 secs (212531 bytes/sec)
# hd /tmp/ad8.dat
00000000 00 78 a8 ae 00 00 00 00 00 00 00 00 00 00 00 00 |.x┏............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 78 a8 ae |.............x┏|
00000200 01 78 a8 ae 00 00 00 00 00 00 00 00 00 00 00 00 |.x┏............|
00000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
略

# dd if=/dev/ad6 bs=512 iseek=2930276352 count=16 of=/tmp/ad6.dat
16+0 records in
16+0 records out
8192 bytes transferred in 0.030044 secs (272668 bytes/sec)
# hd /tmp/ad6.dat
00000000 00 78 a8 ae 00 00 00 00 00 00 00 00 00 00 00 00 |.x┏............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 78 a8 ae |.............x┏|
00000200 01 78 a8 ae 00 00 00 00 00 00 00 00 00 00 00 00 |.x┏............|
00000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*

さらには、0xaea87600 = 2930275840 です。

というわけで、意味ありげな数値が出てくるんですが、ここって、ZFSの管理用の情報が格納されている領域なんでしょうかね。

☆

ディスク周りのハードウェアの情報。

# grep ata /var/run/dmesg.boot
atapci0: <ATI IXP700/800 SATA300 controller> port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfe7ffc00-0xfe7fffff irq 22 at device 17.0 on pci0
atapci0: [ITHREAD]
atapci0: AHCI v1.10 controller with 6 3Gbps ports, PM supported
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci0
ata5: [ITHREAD]
ata6: <ATA channel 4> on atapci0
ata6: [ITHREAD]
ata7: <ATA channel 5> on atapci0
ata7: [ITHREAD]
atapci1: <ATI IXP700/800 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 20.1 on pci0
ata0: <ATA channel 0> on atapci1
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci1
ata1: [ITHREAD]
ad4: 1430799MB <WDC WD15EADS-00P8B0 01.00A01> at ata2-master SATA300
ad6: 1430799MB <WDC WD15EADS-00P8B0 01.00A01> at ata3-master SATA300
ad8: 1430799MB <WDC WD15EADS-00P8B0 01.00A01> at ata4-master SATA300
ad12: 1430799MB <WDC WD15EADS-00P8B0 01.00A01> at ata6-master SATA300
ad14: 1430799MB <WDC WD15EADS-00P8B0 01.00A01> at ata7-master SATA300

☆

FreeBSDのメーリングリストで、気になる情報も発見。

http://docs.freebsd.org/cgi/mid.cgi?FC3C9F7DE00D4B8FA102488806B848D4

Date:      Wed, 16 Dec 2009 13:42:54 +0300
 Cc:        freebsd-stable@freebsd.org
Subject:   RE: 8.0-RELEASE: disk IO temporarily hangs up (ZFS or ATA related problem)

FreeBSD 8.0-RELEASEで、ZFSを使ってて、ディスクアクセスがときどき固まってしまう、という報告なんですが、これ、まったく同じ「WDC WD15EADS-00P8B0」で起きたっていってるんです。
たとえば、このハードディスクが持っている省電力のための機能が、なにか問題を引き起こすとかあるのでしょうか？

☆

(2009/12/22 追記)
SATAをAHCIモードからIDEモードへ切り替えてみたけど、同じようなエラーが出て、ディスクが切り離されてしまい、ダメダメ。

2 件のコメント:

とおりすがり2010年3月15日 19:37
それって例のHDDのファーム問題じゃないですか？
返信削除
返信
本人2010年3月16日 6:13
ファームっていうと、WDのGreenシリーズが8秒でheadをparkingしちゃう、っていう、なんだか融通きかなすぎな仕様のことでしょうか。[E:coldsweats01]
それも十分あるかもしれません（
ですが、8.0-RELASEから、8-STABLEにして以来、HDDが切り離されちゃう現象は収まりましたので、見なかったことにしています。[E:coldsweats01]
最近は、NICのif_reで、突然通信できなくなる現象が、ごくまれに発生していて、そっちで悩まされています。
返信削除
返信

コメントを追加

2009年12月22日火曜日

(FreeBSD) ZFSの実戦導入は…

2 件のコメント: