Saturday, 19 January 2013

Deleting Duplicate records in SQL Server

As I have mentioned in one of my previous tips that using ROW_NUMBER() we can get the serial number which is here.
Using the same ROW_NUMBER() function you can also easily delete the Duplicate Records present inside a table.
Lets say I have a Users Table which is having duplicate records. I need to delete the duplicate records which are having both the FirstName and LastName same.

Table: UsersList

 
FirstName       LastName       PhoneNumber       Address
------------    ------------     ----------------    -------------
Rashmita        Devi                987554437           NULL
Rashmita        Devi                446576578           NULL
Adwin            Ratzz              222222222           NULL
Charle            Hardwick         343534545           NULL
Adwin            Ratzz              576767688           NULL
Charle            Hardwick         877778777           NULL

 

 
So, with the help of ROW_NUMBER() all the duplicate records present in the above table can be removed easily.
The query will be composed this way:-

 
WITH TempUsers (FirstName,LastName, duplicateRecordCount)
AS
(
SELECT FirstName,LastName,
ROW_NUMBER()OVER(PARTITION BY FirstNameLastName ORDER BYFirstNameAS duplicateRecordCount
FROM dbo.Users
)
DELETE
FROM TempUsers
WHERE duplicateRecordCount > 1
GO

 
Instead of TempUsers you can give any name. Because this is used only for Temporary purpose.
After the execution of the above query, Users table will have the following records.

 
FirstName          LastName        PhoneNumber         Address
------------       ------------      ----------------      -------------
Rashmita           Devi                987554437              NULL
Adwin               Ratzz              222222222             NULL
Charle               Hardwick         877778777             NULL
Note: Row_Number() function is present in SQL Server 2005 and in later version(s).

No comments:

Post a Comment