假设我们有三排水果.每个数组都有不同的项,但有些项在多个数组中重复.我想删除重复项,以便数组只包含唯一项.

之前:

arrayA = {"lychee", "orange", "grape", "watermelon"};
arrayB = {"banana", "grape", "pear", "apple"};
arrayC = {"pear", "orange", "strawberry", "apple"};

之后:

arrayA = {"lychee", "watermelon"}
arrayB = {"banana"}
arrayC = {"strawberry"}

我已经编写了以下Java代码,它可以工作,但我想知道是否有更快的算法.


编辑:我简化了Java代码并添加了文档.我想知道如何使算法更高效,并降低时间复杂度.


这是伪代码版本:

    declare method [@name=removeDuplicates] [@params={@name=NItems, @type=String-Matrix/Set-List}]:
        declare a variable [@name=MapList, @type=List[item:HashMap[key:String, value:Number]]] assign List.@new
        iterate NItems with @[index, element]:
            declare a variable [@name=nmap, @type=HashMap[key:String, value:Number]] assign HashMap.@new
            for each @[item] in @element, put [key:@item, value:0] to @[name=nmap]
            reference @[name=MapList][@index] to @[name=nmap]
        declare a variable [@name=result] with the same type of @NItems
        iterate NItems with @[index, element]:
            iterate MapList with @[mapIndex, mapElement] where @index != @mapIndex:
                iterate @element with @[item]:
                    if @mapElement[@mapIndex] contains the [key:@item]:
                        set the corresponding map @MapList[@index] update [key:@item, value:@old+1]
            set @result[@index] values: items from @MapList[@index] where it's value == 0
        return @result

这是Java代码版本:

    /**
     * Remove duplicate items among N container
     * @param arrays list of item container, each container contains no duplicates,
     *               and the container is unordered internally, which can be considered as a Set
     * @return N containers after remove duplicates
     */
    @SuppressWarnings("unchecked")
    public static String[][] removeDuplicates(String[]... arrays) {
        Map<String, Integer>[] maps = new Map[arrays.length];
        for (int i = 0; i < arrays.length; i++) {
            maps[i] = new HashMap<>();
            for (String itm : arrays[i]) {
                maps[i].put(itm, 0);
            }
        }
        String[][] result = new String[arrays.length][];
        for (int i = 0; i < arrays.length; i++) {
            for (int j = 0; j < maps.length; j++) {
                if (j == i)
                    continue;
                for (String s : arrays[i]) {
                    if (maps[j].containsKey(s))
                        maps[i].compute(s, (_, v) -> v + 1);
                }
            }
            int finalI = i;
            result[i] = Arrays.stream(arrays[i])
                    .filter(itm -> maps[finalI].get(itm) < 1)
                    .toArray(String[]::new);
        }
        return result;
    }

推荐答案

实现相当简单:

final var map = Stream.of(arrayA, arrayB, arrayC)
    .flatMap(Set::stream)
    .collect(Collectors.groupingBy(
            Function.identity(), 
            Collectors.counting()));

map.entrySet().removeIf(entry -> entry.getValue() == 1);

arrayA.removeIf(map::containsKey);
arrayB.removeIf(map::containsKey);
arrayC.removeIf(map::containsKey);

以下是细目:

假设Set个水果是可变的:

final var arrayA = new HashSet<>(Set.of("lychee", "orange", "grape", "watermelon"));
final var arrayB = new HashSet<>(Set.of("banana", "grape", "pear", "apple"));
final var arrayC = new HashSet<>(Set.of("pear", "orange", "strawberry", "apple"));

现在,您希望收集一个Map,其中包含每个水果(键)在上面定义的所有Set中出现的次数.

final var map = Stream.of(arrayA, arrayB, arrayC)
                .flatMap(Set::stream)
                .collect(Collectors.groupingBy(
                        Function.identity(), 
                        Collectors.counting()));
{banana=1, orange=2, apple=2, lychee=1, pear=2, strawberry=1, watermelon=1, grape=2}

现在,您只需要保留那些多次出现的内容.这里有一句俏皮话:

map.entrySet().removeIf(entry -> entry.getValue() == 1);
{orange=2, apple=2, pear=2, grape=2}

最后,从每个Set中删除所有存在于这个Map中的水果:

arrayA.removeIf(map::containsKey);
arrayB.removeIf(map::containsKey);
arrayC.removeIf(map::containsKey);
[lychee, watermelon]
[banana]
[strawberry]

Java相关问答推荐

PostgreSQL货币兑换率查询

如何让HikariCP指标在NewRelic中正确显示?

收听RDX中用户数据的变化

基本时态运算的ISO-8601周数据表示法

Java Swing:初始化身份验证类后未检测到ATM_Interface键事件

R.id.main给我一个红色错误,无法解析MainActivity.java中的符号main

通过Spring Security公开Spring Boot执行器端点

声明带有泛型的函数以用作查找映射中的值

内存和硬盘中的Zip不同,这会导致下载后的Zip损坏

使用While循环打印素数,无法正常工作

如何在Jooq中获取临时表列引用?

Dijkstra搜索算法的实现

在JDK Flight Recorder中只记录单个线程

在Java中将.GRF转换为图像文件

将BlockingQueue+守护程序线程替换为执行器

在具有Quarkus Panache的PostgreSQL中将JSON数据存储为JSONB时,会将其存储为转义字符串

我们可以在方法中声明接口吗?

按长度排序字符串数组

Maven创建带有特定类的Spring Boot jar和普通jar

如何组合Mono和Flux