Delete Duplicates From a Table in SQL Server

To delete the duplicate rows from the table in SQL Server, you follow these steps:

Let’s set up a sample table for the demonstration.

Setting up a sample table

First, create a new table named sales.contacts as follows:

DROP TABLE IF EXISTS sales.contacts;

CREATE TABLE sales.contacts(
    contact_id INT IDENTITY(1,1) PRIMARY KEY,
    first_name NVARCHAR(100) NOT NULL,
    last_name NVARCHAR(100) NOT NULL,
    email NVARCHAR(255) NOT NULL,
);
Code language: SQL (Structured Query Language) (sql)

Second, insert some rows into the sales.contacts table:

INSERT INTO sales.contacts
    (first_name,last_name,email) 
VALUES
    ('Syed','Abbas','syed.abbas@example.com'),
    ('Catherine','Abel','catherine.abel@example.com'),
    ('Kim','Abercrombie','kim.abercrombie@example.com'),
    ('Kim','Abercrombie','kim.abercrombie@example.com'),
    ('Kim','Abercrombie','kim.abercrombie@example.com'),
    ('Hazem','Abolrous','hazem.abolrous@example.com'),
    ('Hazem','Abolrous','hazem.abolrous@example.com'),
    ('Humberto','Acevedo','humberto.acevedo@example.com'),
    ('Humberto','Acevedo','humberto.acevedo@example.com'),
    ('Pilar','Ackerman','pilar.ackerman@example.com');
Code language: SQL (Structured Query Language) (sql)

Third, query data from the sales.contacts table:

SELECT 
   contact_id, 
   first_name, 
   last_name, 
   email
FROM 
   sales.contacts;Code language: SQL (Structured Query Language) (sql)

The following picture shows the output of the query:

SQL Server Delete Duplicates

There are many duplicate rows (3,4,5), (6,7), and (8,9) for the contacts that have the same first name, last name, and email.

Delete duplicate rows from a table example

The following statement uses a common table expression (CTE) to delete duplicate rows:

WITH cte AS (
    SELECT 
        contact_id, 
        first_name, 
        last_name, 
        email, 
        ROW_NUMBER() OVER (
            PARTITION BY 
                first_name, 
                last_name, 
                email
            ORDER BY 
                first_name, 
                last_name, 
                email
        ) row_num
     FROM 
        sales.contacts
)
DELETE FROM cte
WHERE row_num > 1;
Code language: SQL (Structured Query Language) (sql)

In this statement:

  • First, the CTE uses the ROW_NUMBER() function to find the duplicate rows specified by values in the first_namelast_name, and email columns.
  • Then, the DELETE statement deletes all the duplicate rows but keeps only one occurrence of each duplicate group.

SQL Server issued the following message indicating that the duplicate rows have been removed.

(4 rows affected)

If you query data from the sales.contacts table again, you will find that all duplicate rows are deleted.

SELECT contact_id, 
       first_name, 
       last_name, 
       email
FROM sales.contacts
ORDER BY first_name, 
         last_name, 
         email;Code language: SQL (Structured Query Language) (sql)
SQL Server Delete Duplicate Rows Result

https://www.sqlservertutorial.net/sql-server-basics/delete-duplicates-sql-server/

Setup A Excel file as A linked server in ms-sql

USE [master]
GO
sp_configure ‘show advanced options’, 1
GO
RECONFIGURE WITH OverRide
GO
sp_configure ‘Ad Hoc Distributed Queries’, 1
GO
RECONFIGURE WITH OverRide
GO

EXEC master . dbo. sp_MSset_oledb_prop N’Microsoft.ACE.OLEDB.12.0′ , N’AllowInProcess’ , 1
GO

EXEC master . dbo. sp_MSset_oledb_prop N’Microsoft.ACE.OLEDB.12.0′ , N’DynamicParameters’ , 1

EXEC sp_addLinkedServer
@server= ‘XLSX_MATRIX’,
@srvproduct = ‘ACE 12.0’,
@provider = ‘Microsoft.ACE.OLEDB.12.0’,
@datasrc =’D:\SQL\vn_item_no.xlsx’,
@provstr = ‘Excel 12.0; HDR=Yes’

SELECT * FROM OPENQUERY (XLSX_MATRIX, ‘Select * from [Application$]’)
SELECT * FROM OPENQUERY (XLSX_MATRIX, ‘Select * from [Device$]’)
SELECT * FROM OPENQUERY (XLSX_MATRIX, ‘Select * from [User$]’)

SELECT * FROM XLSX_MATRIX…[Application$]
SELECT * FROM XLSX_MATRIX…[Device$]
SELECT * FROM XLSX_MATRIX…[User$]

select b.*,a.item_no,a.ref_no from [XLSX_MATRIX]…[工作表1$] a right outter join [XLSX_MATRIX]…[‘原料(B倉)$’] b on a.f_2=b.f_2 where (b.ref_no like ‘%’+a.ref_no+’%’ or (a.ref_no) is null)
order by 1

select b.*,a.item_no,a.ref_no from [XLSX_MATRIX]…[‘原料(B倉)$’] b left outer join [XLSX_MATRIX]…[工作表1$] a on a.f_2=b.f_2
where (b.ref_no like ‘%’+a.ref_no+’%’ or (a.ref_no) is null) and sort_id is not null
order by sort_id

final**
select distinct b.*,a.item_no,a.ref_no from [XLSX_MATRIX]…[‘原料(B倉)$’] b left outer join [XLSX_MATRIX]…[工作表1$] a on a.f_2=b.f_2
and (b.ref_no like ‘%’+a.ref_no+’%’ or (a.ref_no) is null) where sort_id is not null
order by sort_id

select distinct b.*,a.item_no,a.ref_no,isnull(a.v_no,a.t_no) from [XLSX_MATRIX]…[‘原料(B倉)$’] b left outer join [XLSX_MATRIX]…[工作表1$] a on isnull(a.v_no,a.t_no)=b.f_2
and (b.ref_no like ‘%’+a.ref_no+’%’ or (a.ref_no) is null) where sort_id is not null
order by sort_id

AccessDatabaseEngine_X64 maybe needed for Microsoft.ACE.OLEDB.12.0 driver Microsoft Access Database Engine 2010 可轉散發套件

p.s.使用openrowset 可直接讀excel檔案,不用設置linked server
select a.* from
(SELECT * FROM OpenRowSet(‘Microsoft.ACE.OLEDB.12.0’,
‘Excel 12.0;HDR=Yes;IMEX=2;Database=c:\temp\Employee.xls’,[工作表1$])) a,
(SELECT * FROM OpenRowSet(‘Microsoft.ACE.OLEDB.12.0’,
‘Excel 12.0;HDR=Yes;IMEX=2;Database=c:\temp\Employee.xls’,[工作表1$])) b
where a.col1 = b.col1

select distinct b.*,a.item_no,a.ref_no,isnull(a.v_no,a.t_no) from
(SELECT * FROM OpenRowSet(‘Microsoft.ACE.OLEDB.12.0’,
‘Excel 12.0;HDR=Yes;IMEX=1;Database=D:\SQL\vn_item_no.xlsx’,[‘原料(B倉)$’])) b left outer join
(SELECT * FROM OpenRowSet(‘Microsoft.ACE.OLEDB.12.0’,
‘Excel 12.0;HDR=Yes;IMEX=1;Database=D:\SQL\vn_item_no.xlsx’,[工作表1$])) a on isnull(a.v_no,a.t_no)=b.f_2
and (b.ref_no like ‘%’+a.ref_no+’%’ or (a.ref_no) is null) where sort_id is not null
order by sort_id

(註1)SQL Server預設不允許使用 OPENROWSET 和 OPENDATASOURCE 進行特定分散式查詢。使用 OpenRowSet前,需使用sp_configure 來啟用 ‘Ad Hoc Distributed Queries’

EXEC sp_configure 'show advanced options','1'
RECONFIGURE 
EXEC sp_configure 'Ad Hoc Distributed Queries',1
RECONFIGURE 

參照

Import data from Excel to SQL Server or Azure SQL Database

使用 OpenRowset、OpenDataSource 查詢EXCE

設定 IMEX=1 透過 OleDb 讀取 Excel 仍然出現 null 值的問題

如何跳過 Win11 強制連網?

1.在連到網路的頁面,不要連上網路,也把實體網路線拔除,這時候按下鍵盤上的「Shift + F10」就可以開啟 CMD 視窗,如果按下去沒有反應,請試試看「fn + Shift + F10」組合,應該就可以順利呼叫開啟 CMD。

2.開啟之後輸入「oobe\bypassnro」,並按下 Enter 鍵。之後系統會自動重新啟動,重開機後會再進入 OOBE 畫面一次,設定都會重新來過,請依照自己需求設定。

4.到連接網路畫面時,就可以看到「我沒有網際網路」的選項,點選後就可以跳過連網!