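I need to group rows into a single row if they share the same id or an attribute. I think I need inner join and group_concat, but I don't know how to use them. The catch is that if two users have no attribute in common but each shares an attribute with the same third user, all three users must be merged into one group. The table also has no group_id column.
The desired output: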
group_id, user_id, group_attributes
1, 1, "red, green, yellow, grey, purple, coffeemaker"
1, 2, "red, green, yellow, grey, purple, coffeemaker"
1, 3, "red, green, yellow, grey, purple, coffeemaker"
1, 4, "red, green, yellow, grey, purple, coffeemaker"
1, 5, "red, green, yellow, grey, purple, coffeemaker"
2, 6, "coffee, milk, croissant"
2, 7, "coffee, milk, croissant"
2, 8, "coffee, milk, croissant"
Raw data, to save you time when answering:
CREATE TABLE task (
user_id INT(10) NOT NULL,
attribute VARCHAR(50) NULL DEFAULT NULL);
INSERT INTO task (user_id, attribute)
VALUES
(1, 'red'),
(1, 'green'),
(2, 'green'),
(2, 'yellow'),
(3, 'grey'),
(3, 'coffeemaker'),
(4, 'grey'),
(4, 'purple'),
(5, 'purple'),
(5, 'red'),
(6, 'black'),
(7, 'black'),
(7, 'milk'),
(8, 'milk'),
(8, 'croissant');
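This is a graph traversal problem, so a simple join is not enough. One approach is to find, for each attribute, every attribute that is connected to it, directly or through a chain of users. The following recursive CTE does this: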
with recursive aa as (
select distinct t1.attribute as at1, t2.attribute as at2
from task t1 join
task t2
on t1.user_id = t2.user_id
),
cte as (
select at1, at2, at1 as found, 1 as lev
from aa
union all
select cte.at1, aa.at2, concat_ws(',', found, aa.at2), lev + 1
from cte join
aa
on cte.at2 = aa.at1
where find_in_set(aa.at2, found) = 0
)
select distinct at1, at2
from cte;
You can then use the same recursive CTE to combine these values into a string per group:
with recursive aa as (
select distinct t1.attribute as at1, t2.attribute as at2
from task t1 join
task t2
on t1.user_id = t2.user_id
),
cte as (
select at1, at2, at1 as found, 1 as lev
from aa
union all
select cte.at1, aa.at2, concat_ws(',', found, aa.at2), lev + 1
from cte join
aa
on cte.at2 = aa.at1
where find_in_set(aa.at2, found) = 0
)
select dense_rank() over (order by pairs.all_attributes) as group_id, t.user_id, pairs.all_attributes
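from (select at1, group_concat(distinct at2 order by at2) as all_attributes
-- distinct + order by: every attribute of a group must yield the identical string for dense_rank()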
from cte
group by at1
) pairs join
(select user_id, min(attribute) as min_attribute
from task
group by user_id
) t
on t.min_attribute = pairs.at1;
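I don't see anything wrong with this code, but DB<>Fiddle insists on returning a hexadecimal string for pairs. Still, I think it will work on your database. Here is the fiddle.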
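If you want to sanity-check what the query should return, the same grouping can be computed outside the database. The following is only a sketch in Python, not part of the SQL solution: it treats users and attributes as nodes of one graph and merges them with a union-find structure. The names ROWS and group_users are made up for this illustration, and the groups are numbered by smallest user_id, which may differ from the numbering dense_rank() produces.

```python
from collections import defaultdict

# Same rows as the INSERT statement in the question.
ROWS = [
    (1, "red"), (1, "green"), (2, "green"), (2, "yellow"),
    (3, "grey"), (3, "coffeemaker"), (4, "grey"), (4, "purple"),
    (5, "purple"), (5, "red"), (6, "black"), (7, "black"),
    (7, "milk"), (8, "milk"), (8, "croissant"),
]


def group_users(rows):
    """Return (group_id, user_id, attributes) tuples, one per user."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Each row links a user node to an attribute node.
    for user_id, attribute in rows:
        union(("user", user_id), ("attr", attribute))

    # Collect the users and attributes of every connected component.
    users = defaultdict(set)
    attrs = defaultdict(set)
    for user_id, attribute in rows:
        root = find(("user", user_id))
        users[root].add(user_id)
        attrs[root].add(attribute)

    # Assumption: number the groups by their smallest user_id.
    roots = sorted(users, key=lambda r: min(users[r]))
    result = []
    for group_id, root in enumerate(roots, start=1):
        joined = ",".join(sorted(attrs[root]))
        for user_id in sorted(users[root]):
            result.append((group_id, user_id, joined))
    return result


if __name__ == "__main__":
    for row in group_users(ROWS):
        print(row)
```

With the sample data, users 1 to 5 end up in one group and users 6 to 8 in another, which matches the grouping asked for in the question.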