6 Replies - 1047 Views - Last Post: 23 April 2013 - 03:39 AM Rate Topic: -----

#1 itzmePoulami  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 33
  • Joined: 12-March 13

delete duplicate rows of a table (sql server 2008)

Posted 21 April 2013 - 11:48 PM

i want to delete only a particular duplicate row from a table. can u help me with the query...
rght nw m using:

WITH CTE (Serial_No,ISBN_No,Title,Author,
	Publisher ,Purchase_From,Purchase_Date,Quantity,
	Cost_Price,Currency1,Selling_Price,Currency2,Discount,Sales_Return,Remarks,DuplicateCount)
AS
(
SELECT Serial_No,ISBN_No,Title,Author,
	Publisher ,Purchase_From,Purchase_Date,Quantity,
	Cost_Price,Currency1,Selling_Price,Currency2,Discount,Sales_Return,Remarks  ,
ROW_NUMBER() OVER(PARTITION BY Serial_No,ISBN_No,Title,Author,
	Publisher ,Purchase_From,Purchase_Date,Quantity,
	Cost_Price,Currency1,Selling_Price,Currency2,Discount,Sales_Return,Remarks ORDER BY ISBN_No) AS DuplicateCount
FROM M_BookDetails 
)
DELETE
FROM CTE
WHERE DuplicateCount > 1


but this query deletes all duplicate rows in the table. I want to delete only for a particular row. :helpsmilie:/>

This post has been edited by modi123_1: 22 April 2013 - 07:58 AM
Reason for edit:: highlight the text and just click the 'code' button in the text editor


Is This A Good Question/Topic? 0
  • +

Replies To: delete duplicate rows of a table (sql server 2008)

#2 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 9487
  • View blog
  • Posts: 35,774
  • Joined: 12-June 08

Re: delete duplicate rows of a table (sql server 2008)

Posted 22 April 2013 - 08:00 AM

The SQL statement for deleting a row is pretty straight forward.

DELETE from <TABLE NAME>
WHERE <col1> = <some value>


Typically <col1> would be your key column..
Was This Post Helpful? 0
  • +
  • -

#3 CharlieMay  Icon User is offline

  • This space intentionally left blank
  • member icon

Reputation: 1606
  • View blog
  • Posts: 5,163
  • Joined: 25-September 09

Re: delete duplicate rows of a table (sql server 2008)

Posted 22 April 2013 - 09:28 AM

When you say duplicate, are they exact in each field and column? If so, you might want to start manually going through your table and manually deleting the row you don't want. (keeping in mind any related tables that may have data associated with it)

If you don't have anything that differentiates the row, the delete is going to delete ALL occurrences that match the statement.
Was This Post Helpful? 0
  • +
  • -

#4 andrewsw  Icon User is offline

  • It's just been revoked!
  • member icon

Reputation: 3729
  • View blog
  • Posts: 13,019
  • Joined: 12-December 12

Re: delete duplicate rows of a table (sql server 2008)

Posted 22 April 2013 - 09:37 AM

An alternative approach to removing duplicates (except the first one) is to create a copy of the table, create a unique (multi-column) index on the combination of fields that prove the row to be a duplicate, then use INSERT INTO to add rows from the first table. Only the first of the duplicated rows will be inserted.

It is possible to do this in a query without involving a new table. As I recall, it requires a calculated column which numbers the rows, and a subquery that counts the duplicates, then keeping the one (where this duplicate value is > 1) with the lowest row number. Unfortunately, I don't have an example to hand :dontgetit: - sorry!

This post has been edited by andrewsw: 22 April 2013 - 09:39 AM

Was This Post Helpful? 0
  • +
  • -

#5 baavgai  Icon User is offline

  • Dreaming Coder
  • member icon

Reputation: 5901
  • View blog
  • Posts: 12,805
  • Joined: 16-October 07

Re: delete duplicate rows of a table (sql server 2008)

Posted 22 April 2013 - 11:25 AM

What a wonky kind of syntax. I'm guessing you're following something like this: http://www.kodyaz.co...y-order-by.aspx

I'm going to riff off the above link and give a more condensed example:
drop TABLE #Suppliers
go

CREATE TABLE #Suppliers (
  Id int identity (1,1),
  CompanyTitle nvarchar(25),
  ContactName nvarchar(15),
  LastContactDate datetime
)
go

INSERT #Suppliers VALUES (N'Melody Music Instruments',N'James Manning', '20090623 10:15')
INSERT #Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090720 15:40')
INSERT #Suppliers VALUES (N'Top Music',N'Katy Swan', '20090827 18:00')
INSERT #Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090806 10:00')
INSERT #Suppliers VALUES (N'Melody Music Instruments',N'James Brown', '20080121 11:20')
INSERT #Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00')
INSERT #Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00')
go

select * from #Suppliers
go

SELECT
  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC),
  Id, CompanyTitle, ContactName, LastContactDate
FROM #Suppliers
go


WITH [CTE DUPLICATE] AS
(
SELECT
  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC),
  Id, CompanyTitle, ContactName, LastContactDate
FROM #Suppliers
)
DELETE FROM [CTE DUPLICATE] WHERE RN > 1
go

select * from #Suppliers
go



Results:
Id          CompanyTitle              ContactName     LastContactDate
----------- ------------------------- --------------- -----------------------
1           Melody Music Instruments  James Manning   2009-06-23 10:15:00.000
2           Blue Jazz                 Mike Clark      2009-07-20 15:40:00.000
3           Top Music                 Katy Swan       2009-08-27 18:00:00.000
4           Blue Jazz                 Mike Clark      2009-08-06 10:00:00.000
5           Melody Music Instruments  James Brown     2008-01-21 11:20:00.000
6           Top Music                 Katy Perry      2009-08-25 14:00:00.000
7           Top Music                 Katy Perry      2009-08-25 14:00:00.000

(7 row(s) affected)

RN                   Id          CompanyTitle              ContactName     LastContactDate
-------------------- ----------- ------------------------- --------------- -----------------------
1                    4           Blue Jazz                 Mike Clark      2009-08-06 10:00:00.000
2                    2           Blue Jazz                 Mike Clark      2009-07-20 15:40:00.000
1                    1           Melody Music Instruments  James Manning   2009-06-23 10:15:00.000
2                    5           Melody Music Instruments  James Brown     2008-01-21 11:20:00.000
1                    3           Top Music                 Katy Swan       2009-08-27 18:00:00.000
2                    6           Top Music                 Katy Perry      2009-08-25 14:00:00.000
3                    7           Top Music                 Katy Perry      2009-08-25 14:00:00.000

(7 row(s) affected)


(4 row(s) affected)
Id          CompanyTitle              ContactName     LastContactDate
----------- ------------------------- --------------- -----------------------
1           Melody Music Instruments  James Manning   2009-06-23 10:15:00.000
3           Top Music                 Katy Swan       2009-08-27 18:00:00.000
4           Blue Jazz                 Mike Clark      2009-08-06 10:00:00.000

(3 row(s) affected)





I've used SQL Server a long long time. I've never, ever, used that syntax. It's messy and proprietary.

To achieve the same results, I'd first identify the rows I want to keep. In the above example, we have a title and date to key off of, so it's simple.

select CompanyTitle, Max(LastContactDate) as LastContactDate
	from #Suppliers
	group by CompanyTitle
	having COUNT(*)>1
go



Results:
CompanyTitle              LastContactDate
------------------------- -----------------------
Blue Jazz                 2009-08-06 10:00:00.000
Melody Music Instruments  2009-06-23 10:15:00.000
Top Music                 2009-08-27 18:00:00.000

(3 row(s) affected)



So, for those three CompanyTitle values, I only want to keep the rows with the matching date. SQL server does provide a handy syntax for delete:
delete #Suppliers
	from #Suppliers a
		inner join (
		select CompanyTitle, Max(LastContactDate) as LastContactDate
			from #Suppliers
			group by CompanyTitle
			having COUNT(*)>1
			) b on a.CompanyTitle = b.CompanyTitle
	where a.LastContactDate!=b.LastContactDate



Here, we join on title and then kill anything with a non matching key field.

The important thing here is a key field. You need to have some kind of criteria for what you want to keep.

For duplicate rows with a unique identity column, you can just choose to keep the max identity column.
Was This Post Helpful? 1
  • +
  • -

#6 itzmePoulami  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 33
  • Joined: 12-March 13

Re: delete duplicate rows of a table (sql server 2008)

Posted 23 April 2013 - 01:26 AM

View PostCharlieMay, on 22 April 2013 - 09:28 AM, said:

When you say duplicate, are they exact in each field and column? If so, you might want to start manually going through your table and manually deleting the row you don't want. (keeping in mind any related tables that may have data associated with it)

If you don't have anything that differentiates the row, the delete is going to delete ALL occurrences that match the statement.


yeah they are same and exact...
isn't it possible to delete jst one of the two same rows ???
Was This Post Helpful? 0
  • +
  • -

#7 baavgai  Icon User is offline

  • Dreaming Coder
  • member icon

Reputation: 5901
  • View blog
  • Posts: 12,805
  • Joined: 16-October 07

Re: delete duplicate rows of a table (sql server 2008)

Posted 23 April 2013 - 03:39 AM

View PostitzmePoulami, on 23 April 2013 - 04:26 AM, said:

yeah they are same and exact...


Without a primary key, you really don't have a database...

Dead simple way would be to just copy, trunc, and copy back. Normally there are too many considerations to do this, but I suspect normalization is not one of your concerns, so it will probably work for you.

e.g.
SELECT DISTINCT Serial_No,ISBN_No,Title,Author,
	Publisher ,Purchase_From,Purchase_Date,Quantity,
	Cost_Price,Currency1,Selling_Price,Currency2,Discount,Sales_Return,Remarks
into #unique
FROM M_BookDetails 

truncate table M_BookDetails 

insert into M_BookDetails 
	SELECT Serial_No,ISBN_No,Title,Author,
		Publisher ,Purchase_From,Purchase_Date,Quantity,
		Cost_Price,Currency1,Selling_Price,Currency2,Discount,Sales_Return,Remarks
	from #unique



Looking at your fields, I suspect the real issue is one of structure. I would model your data as:
Table: Book
	Serial_No (PK)
	ISBN_No
	Title
	Author
	Publisher

Table: Sales
	Serial_No (PK) (FK)
	Purchase_From (PK)
	Purchase_Date (PK)
	...


Was This Post Helpful? 0
  • +
  • -

Page 1 of 1