Today I found a pretty nice way to delete duplicate rows from a table in SQL Server. I had a table with 25,000 rows, of which 7,500 had one or more duplicates. I was not very eager to delete these duplicates manually, so I started googling for answers. I found many different approaches, and eventually found my answer at stackoverflow.com. That thread helped me write a simple SQL statement, after I had added an [Id] column to uniquely identify each row. The trick is to use a Common Table Expression (CTE) in SQL Server together with the ROW_NUMBER() function and its OVER() clause. This numbers the rows within each group that shares the same key expression (in my case the [Name] column), ordered by the [Id] column, so the first row in each group gets 1 and its duplicates get 2, 3 and so on. I only want to keep the first row for each duplicated item, so I delete every row whose [rowno] is greater than 1.
WITH cte_duplicates AS (
    SELECT [Name],
           ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Id]) AS [rowno]
    FROM [dbo].[fm2011]
)
DELETE FROM cte_duplicates
WHERE [rowno] > 1;
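Before running a delete like this for real, it can be worth checking which rows it would hit. The following is only a sketch of how I would do that, reusing the same table and columns as above: first a SELECT against the same CTE to list the rows with [rowno] greater than 1, then a trial run of the delete inside a transaction that is rolled back. The remaining_duplicate_names alias is a name I made up for illustration.

```sql
-- Preview: list the rows the DELETE would remove, without changing anything.
WITH cte_duplicates AS (
    SELECT [Id], [Name],
           ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Id]) AS [rowno]
    FROM [dbo].[fm2011]
)
SELECT [Id], [Name], [rowno]
FROM cte_duplicates
WHERE [rowno] > 1
ORDER BY [Name], [Id];

-- Trial run: do the delete inside a transaction and inspect the result.
BEGIN TRANSACTION;

WITH cte_duplicates AS (
    SELECT [Name],
           ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Id]) AS [rowno]
    FROM [dbo].[fm2011]
)
DELETE FROM cte_duplicates
WHERE [rowno] > 1;

-- Count the [Name] values that still occur more than once.
-- remaining_duplicate_names is a hypothetical alias for this sketch.
SELECT COUNT(*) AS remaining_duplicate_names
FROM (
    SELECT [Name]
    FROM [dbo].[fm2011]
    GROUP BY [Name]
    HAVING COUNT(*) > 1
) AS d;

-- Undo the trial run; replace with COMMIT TRANSACTION once the result looks right.
ROLLBACK TRANSACTION;
```

If the count comes back as 0 and the preview listed the rows you expected, swapping ROLLBACK for COMMIT makes the delete permanent.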