There is a table.
How do I remove duplicate rows from it with an SQL query?
1) Through another table
    CREATE TEMPORARY TABLE tmp_tab AS SELECT DISTINCT * FROM your_table;
    DELETE FROM your_table;
    INSERT INTO your_table SELECT * FROM tmp_tab;
    DROP TABLE tmp_tab;

2) By adding an index. I have not tried this myself, but reportedly it works: a unique index is added and the duplicates are dropped along the way. This applies to MySQL only. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so the statement works only on older versions.

    ALTER IGNORE TABLE your_table ADD UNIQUE INDEX(id_category, id_product, position);

PostgreSQL
You can solve the problem with a single query using CTEs (where T is the source table):

    with td as (
        delete from T returning *
    ),
    tt as (
        select row_number() over (partition by id_category, id_product, position
                                  order by id_category, id_product, position) num, *
        from td
    )
    insert into T
    select id_category, id_product, position from tt where num = 1;

MySQL:
You can add a column with unique values, use it to remove the duplicates, and then drop the column (or, better, keep it and put a unique index on it). This is fairly resource-intensive, but it is an option.
To create the column and fill it with consecutive natural numbers:

    ALTER TABLE Test ADD Id INT;
    UPDATE Test SET Id = @I := @I + 1
    /* you can set any sort order you need here; I sorted by A, B */
    ORDER BY A, B, (SELECT @I := 0);

As a result, the Test table gets an Id column filled with a numeric sequence ordered by columns A, B.
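The removal step itself is not shown above. A minimal sketch of it, assuming the Test table has the columns A and B plus the Id column holding unique numbers, and that the row with the smallest Id in each group is the one to keep. The index name ux_test_id is made up for illustration.

```sql
-- Assumption: Test(A, B, Id), where Id is unique per row.
-- Delete every row that has a "twin" with the same (A, B) and a smaller Id,
-- so only the row with the smallest Id in each group survives.
DELETE t1
FROM Test AS t1
JOIN Test AS t2
  ON t1.A = t2.A
 AND t1.B = t2.B
 AND t1.Id > t2.Id;

-- Then either keep the column and index it (hypothetical index name) ...
ALTER TABLE Test ADD UNIQUE INDEX ux_test_id (Id);

-- ... or drop it again:
-- ALTER TABLE Test DROP COLUMN Id;
```

Note that `=` never matches NULL, so rows with NULL in A or B would not be treated as duplicates here. If that matters, MySQL's NULL-safe comparison `<=>` could be used in the join conditions instead.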
UPD: There is also a somewhat extravagant way :) First mark the rows to be deleted, then delete them, again by accumulating values in a variable.
For clarity, I will show all the scripts in working form.
Create a table and fill it:
    CREATE TABLE TEST_DUPLICATE( A VARCHAR(20), B VARCHAR(20) );
    INSERT TEST_DUPLICATE SELECT 'AAA', 'BBB';
    INSERT TEST_DUPLICATE SELECT 'AAA', 'BBB';
    INSERT TEST_DUPLICATE SELECT 'BBB', 'BBB';
    INSERT TEST_DUPLICATE SELECT 'AAA', 'AAA';
    INSERT TEST_DUPLICATE SELECT 'BBB', 'BBB';
    INSERT TEST_DUPLICATE SELECT 'AAA', 'AAA';
    INSERT TEST_DUPLICATE SELECT 'AAA', 'BBB';
    SELECT * FROM TEST_DUPLICATE;

Here are its contents:

    AAA  BBB
    AAA  BBB
    BBB  BBB
    AAA  AAA
    BBB  BBB
    AAA  AAA
    AAA  BBB

Now mark the duplicates in field B with the string DUPLICATED:
    UPDATE TEST_DUPLICATE
    SET B = CONCAT(
        CASE WHEN A = @A AND B = @B THEN 'DUPLICATED' ELSE B END,
        /* dummy term whose only purpose is to update @A and @B */
        CASE WHEN CONCAT((@A := A), (@B := B)) >= '' THEN '' END)
    ORDER BY A, B, (SELECT @A := ''), (SELECT @B := '');
    SELECT * FROM TEST_DUPLICATE;

The table now contains:

    AAA  BBB
    AAA  DUPLICATED
    BBB  BBB
    AAA  AAA
    BBB  DUPLICATED
    AAA  DUPLICATED
    AAA  DUPLICATED

Remove the marked rows:
    DELETE FROM TEST_DUPLICATE WHERE B = 'DUPLICATED';
    SELECT * FROM TEST_DUPLICATE;

We get the desired result:

    AAA  BBB
    BBB  BBB
    AAA  AAA

This solution is open to some criticism, but I have described it this way for the sake of simplicity. If desired, the idea can be developed further and put to use.
Addition: the same can be done in other DBMSs by replacing the accumulation in a variable with the analytic functions ROW_NUMBER and LEAD. In other DBMSs this will look "nicer".
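As a rough sketch of the ROW_NUMBER variant, here written in MySQL 8.0+ syntax (the first MySQL series with window functions) against the TEST_DUPLICATE table from the example. The Id column is a hypothetical helper added only because the sample table has no key of its own. Other DBMSs would need their own DELETE syntax.

```sql
-- Assumption: MySQL 8.0+ and TEST_DUPLICATE(A, B) as created in the example.
-- Temporary helper key so that individual rows can be addressed.
ALTER TABLE TEST_DUPLICATE
    ADD Id INT NOT NULL AUTO_INCREMENT PRIMARY KEY;

-- Number the rows inside each (A, B) group and delete all but the first.
DELETE t
FROM TEST_DUPLICATE AS t
JOIN (
    SELECT Id,
           ROW_NUMBER() OVER (PARTITION BY A, B ORDER BY Id) AS rn
    FROM TEST_DUPLICATE
) AS d ON d.Id = t.Id
WHERE d.rn > 1;

-- Remove the helper column again.
ALTER TABLE TEST_DUPLICATE DROP COLUMN Id;

SELECT * FROM TEST_DUPLICATE;
```

Unlike the variable-based version, this does not overwrite any data in column B, so a real value that happens to equal 'DUPLICATED' cannot be deleted by mistake.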
Source: https://ru.stackoverflow.com/questions/542072/