Strategy to identify duplicates in an ETL context

Submitted by: Jhon Brain

Date added: 18 June, 2012

Category: PL SQL

This SQL statement detects duplicate entries over a key and adds a seq column that numbers each row within its group of duplicates. You can then keep only the rows where seq = 1 and still be able to identify all the duplicates (the rows where seq > 1).

Tags: identify duplicates, etl context

Code Snippet:

    SELECT field_1, field_2,
           ROW_NUMBER() OVER (PARTITION BY key_column(s) ORDER BY ...) AS seq
    FROM table_name
    WHERE condition
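
As an illustration of the template, here is a minimal sketch against a hypothetical staging table. The table stg_customers and its columns customer_id, name and loaded_at are assumptions made for this example and do not come from the original snippet. Because an analytic function cannot be referenced in the WHERE clause of the same query block, the numbering is done in an inline view and the filter on seq is applied outside it.

```sql
-- Hypothetical staging table: stg_customers(customer_id, name, loaded_at)
-- Assumption: customer_id is the key over which duplicates are detected,
-- and the most recently loaded row is the one to keep.
SELECT customer_id, name, loaded_at, seq
FROM (
    SELECT customer_id,
           name,
           loaded_at,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY loaded_at DESC) AS seq
    FROM stg_customers
) ranked
WHERE seq = 1;   -- use seq > 1 instead to list only the surplus duplicates
```

The ORDER BY inside the OVER clause decides which row of each group gets seq = 1, so it should reflect the rule for choosing the surviving record. If several rows tie on the ordering column, the row that receives seq = 1 is not deterministic unless a tie-breaking column is added.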
 
 
