We recently finished virtualizing a few Windows servers running Oracle 10.x. This was a big deal, because a few people did not want the project to go ahead.
Here is where things went wrong. We are (were?) backing up our VMs using scripted snapshots and disk exports (we have no SAN at this point). We didn't realize that snapshots do not play nicely with Oracle databases, and we did not notice anything unusual during the testing phase. A few days after we deployed the backup scripts to production, our main Oracle DB crashed while being snapshotted. Now there is a huge snapshot 'backlash': snapshots are forbidden for Oracle DBs.
At this point, I need to give a bit of background. One of our objectives with virtualization was to reduce the manual/attended portion of the backup job. The previous non-virtualized backups went something like this: back up the Oracle DB to tape, run the DB batch job, back up the Oracle DB to tape again. This entire process would be completed by midnight every night. With virtualization we were hoping to eliminate the first Oracle tape backup and use a snapshot instead (if something goes wrong with the DB batch job, we simply roll back to the snapshot). However, thanks to the 'backlash', we are now doing a complete DB export before the DB batch job (the tape backup system was eliminated as part of the refresh), which takes much longer: the process now runs until 1 AM.
Now for the question. Is there a way to safely use snapshots with an Oracle database that guarantees data integrity is always maintained? Is there a best-practices document I can read? Also, if I were a risk-averse manager, how would you explain to me, in simple and rational language, why snapshotting would be a more desirable alternative to database exports? (The present mentality: snapshots = fast, yet unfamiliar and risky; DB exports = slow, yet familiar and safe.)