Hello, all,
We are experiencing serious problems on a virtualized Oracle RAC 10.2.0.4.0, Standard Edition, built on two Windows 2003 virtual machines hosted on a three-server physical cluster running vSphere.
Each Win2003 machine is configured this way:
OS: WinX64 Workstation Release 2
4 CPU, 8 GB Ram
Two ESXI3 network adapters
a 70-gig system disk.
Every virtual host is on its own server.
Cluster disks are configured as follows:
Data disk: 400gig under SCSI 1.1, physical, permanent independent
Arc Log: 420gig under SCSI 1.2, physical, permanent independent
Redo Log: 20gig under SCSI 1.3, physical, permanent independent
1 OCR Disk (Scsi 2.1), physical, permanent, independent
5 Voting Disks, (Scsi 2.2 to Scsi 2.8), physical, permanent, independent
All of these virtual disks were created with vmkfstools, using the eagerzeroedthick option.
Disk locking is set to false and, following best practices found online, we added these extra parameters to the .vmx files:
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
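Since both RAC nodes need the same disk settings, a small check can confirm that neither .vmx file has drifted. The following is only a minimal sketch: the function names `parse_vmx` and `check_vmx` are hypothetical, it assumes each setting sits on its own line as `key = "value"`, and it compares keys and values as exact strings.

```python
import re
import sys

# The four settings quoted in this post; values are compared as strings.
EXPECTED = {
    "diskLib.dataCacheMaxReadAheadSize": "0",
    "diskLib.dataCacheMinReadAheadSize": "0",
    "diskLib.dataCachePageSize": "4096",
    "diskLib.maxUnsyncedWrites": "0",
}

# Assumed line layout: key = "value" (one setting per line).
LINE_RE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Return a dict of key -> value for every key = "value" line."""
    settings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def check_vmx(text, expected=EXPECTED):
    """Return {key: (expected, actual_or_None)} for settings that differ."""
    actual = parse_vmx(text)
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }


if __name__ == "__main__":
    # Pass the .vmx paths of both nodes on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            problems = check_vmx(handle.read())
        print(path, "OK" if not problems else problems)
```

Running it against the .vmx file of each node should print OK for both; any other output lists the expected and actual value for each setting that differs or is missing.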
Data, Arc and Redo are used by Oracle ASM: all diskgroups have been configured using external redundancy. All virtual disks are placed on RAID-1 LUNS.
I enjoyed a flawless Oracle Stack installation; the same can be said for ASM and DB Instance creation, and we all know how tricky RAC installation can be.
When I imported data from the old RAC (exported with plain old Exp), I suddenly hit these errors, which led to a failed import and an abrupt instance shutdown.
-
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lgwr_3020.trc:
ORA-00345: redo log write error block 247036 count 2048
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.256.720556015'
ORA-27070: async read/write failed
ORA-00345: redo log write error block 247036 count 2048
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.257.720556021'
ORA-27070: async read/write failed
ORA-00345: redo log write error block 249084 count 1
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.256.720556015'
ORA-27070: async read/write failed
ORA-00345: redo log write error block 249084 count 1
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.257.720556021'
ORA-27070: async read/write failed
Tue Jun 01 18:59:08 2010
Doing block recovery for file 7 block 329099
Tue Jun 01 18:59:08 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lgwr_3020.trc:
ORA-00340: IO error processing online log 5 of thread 1
ORA-00345: redo log write error block 247036 count 2048
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.256.720556015'
ORA-27070: async read/write failed
ORA-00345: redo log write error block 247036 count 2048
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.257.720556021'
ORA-27070: async read/write failed
ORA-00345: redo log write error block 249084 count 1
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.256.720556015'
ORA-27070: async read/write failed
ORA-00345: redo log write error block 249084 count 1
ORA-00312: online log 5 thread 1: '+REDO/rucla/onlinelog/group_5.257.720556021'
ORA-27070: async read/write failed
Tue Jun 01 18:59:08 2010
LGWR: terminating instance due to error 340
Tue Jun 01 18:59:08 2010
Doing block recovery for file 7 block 329099
Tue Jun 01 18:59:08 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lms1_2100.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:08 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lmd0_1468.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:08 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lms0_2540.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:08 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lmon_824.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:08 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_pmon_3332.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:09 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_ckpt_2400.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:09 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_j000_4060.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:09 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_psp0_1596.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:09 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_q000_3656.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:09 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_lck0_3088.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:10 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_rbal_2092.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:10 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_mman_844.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:10 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_dbw0_644.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:11 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_o000_3292.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:17 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_reco_916.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:17 2010
Errors in file c:\oracle\product\10.2.0\admin\rucla\bdump\rucla1_smon_672.trc:
ORA-00340: IO error processing online log of thread
Tue Jun 01 18:59:18 2010
Instance terminated by LGWR, pid = 3020
-
Moreover, I've spotted the same ORA-27070 error when importing a large table and creating the index on it.
-
IMP-00017: following statement failed with ORACLE error 1115:
"CREATE INDEX "CRI10" ON "CRI_ORDINI_RIGHE" ("CLICTE_CRI" , "KEY_CRI" , "ARC"
"ART_VEN_CRI" , "ARCAAC_VEN_CRI" , "ARCARD_VEN_CRI" , "ARCARS_VEN_CRI" , "AR"
"CART_PRO_CRI" , "ARCAAC_PRO_CRI" , "ARCARD_PRO_CRI" , "ARCARS_PRO_CRI" , "D"
"T_CON_CHIESTA_CRI" , "NUMERO_RIGA_CRI" , "QT_ORDINATA_CRI" ) PCTFREE 10 IN"
"ITRANS 2 MAXTRANS 255 STORAGE(INITIAL 234881024 FREELISTS 1 FREELIST GROUPS"
" 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" LOGGING"
IMP-00003: ORACLE error 1115 encountered
ORA-01115: IO error reading block from file 7 (block # 402013)
ORA-01110: data file 7: '+DATA/rucla/datafile/users.270.720554561'
ORA-27070: async read/write failed
IMP-00017: following statement failed with ORACLE error 20000:
"BEGIN DBMS_STATS.SET_INDEX_STATS(NULL,'"CRI10"',NULL,NULL,NULL,1831008,264"
"52,1831008,1,1,240554,2,6); END;"
IMP-00003: ORACLE error 20000 encountered
ORA-20000: INDEX "RUBRA"."CRI10" does not exist or insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 2124
ORA-06512: at "SYS.DBMS_STATS", line 5473
ORA-06512: at line 1
IMP-00017: following statement failed with ORACLE error 1115:
"CREATE INDEX "CRI00" ON "CRI_ORDINI_RIGHE" ("KEY_CRI" , "ST_MODIFICA_CRI" )"
" PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE(INITIAL 75497472 FREELISTS 1 F"
"REELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" LOGGING"
IMP-00003: ORACLE error 1115 encountered
ORA-01115: IO error reading block from file 7 (block # 402845)
ORA-01110: data file 7: '+DATA/rucla/datafile/users.270.720554561'
ORA-27070: async read/write failed
IMP-00017: following statement failed with ORACLE error 20000:
"BEGIN DBMS_STATS.SET_INDEX_STATS(NULL,'"CRI00"',NULL,NULL,NULL,2010220,880"
"8,2010220,1,1,227401,2,6); END;"
IMP-00003: ORACLE error 20000 encountered
ORA-20000: INDEX "RUBRA"."CRI00" does not exist or insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 2124
ORA-06512: at "SYS.DBMS_STATS", line 5473
ORA-06512: at line 1
-
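To see at a glance which ASM disk groups the ORA-27070 failures touch, the two logs above can be tallied with a short script. This is only a rough sketch: `count_failures_by_diskgroup` is a hypothetical helper, and it assumes every ORA-27070 line is preceded by an ORA-00312 or ORA-01110 line naming the affected ASM file, as in the excerpts in this post. It matches on error codes only, so the language of the message text does not matter.

```python
import re
from collections import Counter

# Matches a quoted ASM path such as '+REDO/rucla/onlinelog/group_5.256.720556015'
# and captures the disk group name (REDO in that example).
ASM_PATH_RE = re.compile(r"'\+([A-Za-z0-9_$#]+)/[^']*'")


def count_failures_by_diskgroup(log_text):
    """Count ORA-27070 lines per disk group named on the preceding file line."""
    counts = Counter()
    pending = None  # disk group from the most recent ORA-00312 / ORA-01110 line
    for line in log_text.splitlines():
        match = ASM_PATH_RE.search(line)
        if match and ("ORA-00312" in line or "ORA-01110" in line):
            pending = match.group(1)
        elif "ORA-27070" in line and pending is not None:
            counts[pending] += 1
            pending = None
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "ORA-00345: redo log write error block 247036 count 2048",
        "ORA-00312: online log 5 thread 1: "
        "'+REDO/rucla/onlinelog/group_5.256.720556015'",
        "ORA-27070: async read/write failed",
        "ORA-01110: file 7: '+DATA/rucla/datafile/users.270.720554561'",
        "ORA-27070: async read/write failed",
    ])
    for group, hits in count_failures_by_diskgroup(sample).items():
        print(group, hits)
```

In the excerpts above the failures involve both +REDO and +DATA, which sit on different virtual disks (SCSI 1.3 and SCSI 1.1).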
On physical machines, these errors are tell-tale signs of impending disk problems: either bad sectors or a controller about to fail. But what about virtual machines? I googled a lot for help on this.
The folks who set up the ESX infrastructure have been of little or no help so far. We've opened a call with the hardware manufacturer and are still waiting for an answer.
In short, we're stuck, since we followed all of the best practices we could lay our hands on.
Any help?
Thanks in advance,
Max Lambertini