In this mini-blog-series I will
be going over the different victims of data corruption: Mainly nonclustered indexes (this post),
clustered indexes, and system tables.
The first part of this blog post is how to corrupt a nonclustered index
in a TEST environment in order to
learn and practice the resolution to such a problem. The base database I’ll be using is
AdventureWorks2012 (it can be found here). I then restored a copy of AdventureWorks2012
to corrupt, and called it AdventureWorks2012_CorruptNCI:
restore database AdventureWorks2012_CorruptNCI
from disk = 'C:\DatabaseFiles\AdventureWorks\AdventureWorks2012_base.bak'
with
move 'AdventureWorks2012_Data'
to 'c:\MyDataDir\AdventureWorks2012_CorruptCI_Data.mdf',
move 'AdventureWorks2012_Log'
to 'c:\MyLogDir\AdventureWorks2012_CorruptCI_log.ldf',
stats = 5
go
Corrupting the Data
First off, I want to emphasize that we are corrupting data! In other words, THIS NEEDS TO BE DONE IN AN
ISOLATED TEST ENVIRONMENT WITH TEST DATA AND DATABASES. What we want to do now is corrupt a
nonclustered index in our new AdventureWorks2012_CorruptNCI database. I randomly chose the Person.EmailAddress
table, so I wanted to get all of the indexes on that table:
select
t.name as table_name,
i.name as index_name,
i.index_id,
i.type_desc
from sys.indexes i
inner join sys.tables t
on i.object_id = t.object_id
where t.name =
'EmailAddress'
order by i.index_id
After viewing the result set, I
see there is nonclustered index on this table called IX_EmailAddress_EmailAddress (note the index_id
of 2): That will be our victim. Now we want to see all of the page
information of this index. In order to
do this, we’ll utilize the DBCC IND command to output a record per
index page showing us certain page metadata:
dbcc ind('adventureworks2012_corruptnci', 'person.emailaddress', 2)
go
This shows us a list of all
pages in the nonclustered index, IX_EmailAddress_EmailAddress. Choose a page, any page. This page will be the one that we target for
data corruption. I have randomly
selected a data page with an ID of 18250 (your PagePID may vary). Now we want to just get a glimpse of the data
contained on that page, and to do that we’ll make use of the DBCC PAGE
command:
dbcc page('adventureworks2012_corruptnci', 1, 18250, 3)
go
Now that we know we want to
corrupt our specific page (PagePID 18250 in my case), I’ll do a
quick calculation to get the page offset in the file. We know that a page is 8 KB = 8192 B:
select 18250 * 8192
-- page offset: 149504000
I get a resulting page offset of
149504000. In order to actually corrupt
the containing data file, we need to take the database offline to release the
NTFS lock on the file:
use master
go
alter database AdventureWorks2012_CorruptNCI
set offline
with rollback immediate
go
Now for the fun part: Corrupting the data. You can use a hex editor to alter the
contents of the data file, and the one I typically use for direct file
manipulation is XVI32 Hex Editor (it’s free and exactly what you need for
this). It can be found here. The following explicit instructions are
directions for this particular application, so if you are not using the XVI32
Hex Editor, then you may need to do different steps.
Open up the AdventureWorks2012_CorruptNCI data file that contains the
nonclustered index. Click on Address and then GoTo and input your page offset (ensure that the decimal format is selected). In my case, I entered in 149504000. Click OK
and then XVI32 will take you right to the beginning of that particular
page. On the right you will see remnants
of the actual data of the table (if you’ve opted to use the EmailAddress table
as well, you’ll see AdventureWorks email addresses). On the left will be the hex representation of
that data. Click anywhere on that data
page (I recommend using the right side to navigate to a character), and then
zero out (00) the hex value for that byte. That is enough! Our data is now corrupted. Save and close the data file.
Back in SSMS, set the database online:
use master
go
alter database AdventureWorks2012_CorruptNCI
set online
go
Now I want to just make sure
that we definitely corrupted the index:
use AdventureWorks2012_CorruptNCI
go
select EmailAddress
from Person.EmailAddress
This should give you an error
stating that “SQL Server detected a logical consistency-based I/O error:
incorrect checksum…….” so on and so forth.
Fixing a Corrupted Nonclustered Index
So the above was just a really
slick way to corrupt a nonclustered index in our test database. We broke it, and now it’s time to fix
it. But in reality, we initially won’t
know exactly what is corrupt. So you run
a routine DBCC CHECKDB, or a user has an error of sorts, and now it’s
time to get to the bottom of it. Running
a DBCC CHECKDB will give you the information you’ll need as to
what’s corrupt:
dbcc checkdb with no_infomsgs
The biggest takeaway here is the
partition ID of the corrupt table/index.
In my case, my DBCC CHECKDB tells me the
issue is with partition number 72057594050445312. So now I want to find out exactly what table
and index that is for (if you did the above process of data corruption, you
already know. But just for explanation’s
sake, view the query results):
select
p.partition_id,
o.name as table_name,
i.name as index_name,
i.type_desc
from sys.partitions p
inner join sys.objects o
on o.object_id = p.object_id
inner join sys.indexes i
on o.object_id = i.object_id
and p.index_id =
i.index_id
where partition_id = 72057594050445312
Bingo! This tells me that the corrupt partition is
indeed a nonclustered index on Person.EmailAddress, and the faulty
index is IX_EmailAddress_EmailAddress.
When a nonclustered index is
corrupt, the first step is to try a rebuild (and, it goes without saying but a
little investigation as to the cause of the data corruption is warranted and
necessary, otherwise for things like a failing drive, it’ll happen again):
alter index IX_EmailAddress_EmailAddress
on Person.EmailAddress
rebuild
go
If the rebuild isn’t possible,
you may need to drop and recreate the index.
The easiest way to do this is through SSMS: Right-Click on the index and select Script Index as DROP and CREATE to New Query
Window.
And, just for good measure I’ll
run my DBCC CHECKDB to ensure that all is well. And in my case, it is:
-- run DBCC CHECKDB to ensure the
database is back to normal
dbcc checkdb with no_infomsgs
Summary
When we run into corrupt data,
our first step is to find out what exactly is that is corrupt. Far too often we want to run DBCC CHECKDB
with REPAIR_ALLOW_DATA_LOSS, but that isn’t always the right
answer. What was explained above is when
the victim of data corruption is a nonclustered index. The first step is to try to rebuild the index if possible. No harm, no
foul. In the next couple of blog posts
I’ll be going over different victims of data corruption and their fixes. If there are any questions, comments, or
issues please feel free to leave a comment or email me at sqlsalt@gmail.com.
Code
As always, the corresponding
code can be found on my SQLSalt GitHub repository:
Comment
Thanks to Paul Randal for his
insightful comment and correction on my post, in regards to rebuild the corrupt
index in lieu of jumping to a drop and create first (as well as scripting out
the DROP and CREATE through SSMS). Paul Randal is a leading expert in all things SQL Server.
Thanks, Paul!! Much appreciated. Quoted comment from Paul Randal for
reference:
“Beware of dropping and recreating the index - if it's
enforcing a constraint then the constraint could be violated while the index is
dropped, meaning it can't be created again. Better to try rebuilding it first
always. I'd also advise using SSMS to script-create the index before dropping
it just in case the system tables change in a future release - you don't have
to rely on your own code.”
Beware of dropping and recreating the index - if it's enforcing a constraint then the constraint could be violated while the index is dropped, meaning it can't be created again. Better to try rebuilding it first always. I'd also advise using SSMS to script-create the index before dropping it just in case the system tables change in a future release - you don't have to rely on your own code.
ReplyDeleteCheers