There is a table:

CREATE TABLE "DATA" ( date timestamp without time zone, array_values int[] ); 

It contains data:

 INSERT INTO "DATA" (date, array_values) VALUES ('2017-01-19 00:00:00', ARRAY[1,2,3]);
 INSERT INTO "DATA" (date, array_values) VALUES ('2017-01-19 00:00:00', ARRAY[4,5,6]);
 INSERT INTO "DATA" (date, array_values) VALUES ('2017-01-19 00:00:00', ARRAY[7,8]);
 INSERT INTO "DATA" (date, array_values) VALUES ('2017-01-19 01:01:01', ARRAY[100,102]);
 INSERT INTO "DATA" (date, array_values) VALUES ('2017-01-19 01:01:01', ARRAY[103,104,105,106]);

How can I update the table so that the arrays with the same date are combined into one?

"2017-01-19 00:00:00" | {1,2,3,4,5,6,7,8}

"2017-01-19 01:01:01" | {100,102,103,104,105,106}

  • I.e., do you need to delete some of the rows so that each remaining row holds the full combined array for its date? - Mike
  • Yes, remove some of the rows - Damir Gafarov

2 answers

Step one is to reshape the data. To do this, you can unpack the arrays with unnest and reassemble them with array_agg while grouping:

 select date, array_agg(av) from "DATA" cross join unnest(array_values) av group by 1; 
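Note that array_agg without an ORDER BY makes no promise about the order of the elements in the result. A minimal sketch of a deterministic variant, assuming (this is not stated in the question) that sorting the combined values in ascending order is acceptable:

```sql
-- Assumption: ascending order of the merged values is what is wanted.
-- ORDER BY inside the aggregate fixes the element order explicitly.
SELECT date,
       array_agg(av ORDER BY av) AS array_values
FROM "DATA"
CROSS JOIN unnest(array_values) AS av
GROUP BY date;
```

If the original order within each source array matters instead, sorting by value is not the right choice and another ordering key would be needed.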

Step two is to apply the changes to the table. The easiest way is to copy everything into a separate table, truncate the original, and copy the data back:

 create temporary table data_tmp_agg as select date, array_agg(av) as array_values from "DATA" cross join unnest(array_values) av group by 1;
 truncate table "DATA";
 insert into "DATA" (date, array_values) select date, array_values from data_tmp_agg;

Or, which is probably the better way:

 ALTER TABLE "DATA" RENAME TO old_data;
 create table "DATA" ...
 insert into "DATA" (date, array_values) select date, array_agg(av) as array_values from old_data cross join unnest(array_values) av group by 1;

I.e., rename the table and fill a freshly created one.
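As a rough sketch, the rename approach could be wrapped in a single transaction like this. The use of LIKE ... INCLUDING ALL to create the new table and the final DROP TABLE are assumptions made for illustration; the answer itself leaves the CREATE TABLE definition open.

```sql
BEGIN;

ALTER TABLE "DATA" RENAME TO old_data;

-- Assumption: the new table should copy the column definitions,
-- defaults, constraints and indexes of the old one.
CREATE TABLE "DATA" (LIKE old_data INCLUDING ALL);

-- Fill the new table with one merged array per date.
INSERT INTO "DATA" (date, array_values)
SELECT date, array_agg(av) AS array_values
FROM old_data
CROSS JOIN unnest(array_values) AS av
GROUP BY date;

-- Assumption: the old data is no longer needed once the copy succeeded.
DROP TABLE old_data;

COMMIT;
```

Objects that depend on the original table, such as views or foreign keys pointing at it, follow the renamed table rather than the new one, so they would have to be handled separately.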

You can also put together a single query with CTEs that selects the required data, updates one row per date, and deletes the remaining duplicates. The option of re-creating the table, on the other hand, does not require a vacuum, and the disk space is returned to the system immediately instead of being left occupied by dead rows.
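One way such a CTE query might look is sketched below. It is only an illustration under some assumptions: the table has no primary key, so rows are identified by the system column ctid; date is never NULL; and the order of elements in the merged array does not matter. The CTE names ranked, merged and removed are made up for this example.

```sql
WITH ranked AS (
    -- Number the rows within each date; rn = 1 marks the row to keep.
    SELECT ctid AS row_id,
           date,
           row_number() OVER (PARTITION BY date ORDER BY ctid) AS rn
    FROM "DATA"
),
merged AS (
    -- Combine all arrays that share a date into one array.
    SELECT d.date, array_agg(av) AS array_values
    FROM "DATA" AS d
    CROSS JOIN unnest(d.array_values) AS av
    GROUP BY d.date
),
removed AS (
    -- Delete every row except the first one for each date.
    DELETE FROM "DATA" AS d
    USING ranked AS r
    WHERE d.ctid = r.row_id
      AND r.rn > 1
)
-- Store the merged array in the row that is kept.
UPDATE "DATA" AS u
SET array_values = m.array_values
FROM ranked AS r
JOIN merged AS m ON m.date = r.date
WHERE u.ctid = r.row_id
  AND r.rn = 1;
```

All parts of the statement work from the same snapshot of the table, and the DELETE and the UPDATE touch different rows, so they do not interfere with each other.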

PS: do not forget to run ANALYZE after the bulk change.
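For the table from the question, that would be something along these lines:

```sql
-- Refresh the planner statistics after rewriting most of the table.
ANALYZE "DATA";
```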

    In a single query:

     WITH DEL as(
       delete from "DATA"
       where ctid in(
         select ctid from (
           select ctid, row_number() over(partition by date) rn from "DATA"
         ) A
         where rn>1
       )
       returning date, array_values
     )
     update "DATA" U
     set array_values=U.array_values || D.arr::integer[]
     from (
       select date, string_to_array(string_agg(array_to_string(array_values,','),','),',') arr
       from DEL
       group by date
     ) D
     where U.date=D.date

    First it deletes the extra rows, returning the deleted data. Then the array values from the deleted rows are appended to the rows that remain.