提取R中没有重复顶点属性的所有子图

发布于12月17日

我有一个如下所示的图形对象:

# Create an empty graph
gss <- make_empty_graph(n = 12, directed = FALSE)

# Define vertex attributes
vertex_attr(gss) <- list(
  name = c("1", "2", "3", "4", "6", "7", "8", "10", "11", "17", "21", "23"),
  label = c("st_con_rt=main-room", "st_con_rt=sub-room", "st_con_tr=direct", "st_con_tr=terrace", "st_th=tsuma", "st_adsb=add", "st_adsb=sub", "tr_adsb=sub", "st_sub_main_th=hira", "roo_com=1a+7", "roo_com=2a+7", "roo_com=4a"),
  index = c(1, 2, 3, 4, 6, 7, 8, 10, 11, 17, 21, 23),
  element = c("st_con_rt", "st_con_rt", "st_con_tr", "st_con_tr", "st_th", "st_adsb", "st_adsb", "tr_adsb", "st_sub_main_th", "roo_com", "roo_com", "roo_com")
)

# Define edges
edges <- c("1", "4", "1", "6", "1", "10", "1", "23", "2", "3", "2", "7", "2", "11", "3", "8", "4", "6", "4", "7", "4", "10", "4", "23", "6", "10", "6", "23", "7", "10", "10", "23", "11", "17", "11", "21")

# Add edges to the graph
gss <- add_edges(gss, edges = edges)

我绘制了两种类型的图；一种是每个 node 有index个属性，另一个是element个属性，以便于理解我需要达到的内容.

plot(gss, vertex.label=V(gss)$index)
plot(gss, vertex.label=V(gss)$element)

Image

在element中存储了7个唯一值，即"st_con_rt"、"st_con_tr"、"st_th"、"st_adsb"、"tr_adsb"、"st_Sub_main_th"和"roo_com".此属性将用作条件图提取.

现在，我需要提取不包含相同element的 node 的所有子图.我手动判断了图表，所需结果的一部分可能包含这样的子图.(请确保st_con_tr和st_con_rt不同于element)

为了实现它，我在我的脚本中实现了以下过程.但不幸的是，它没有得到很好的实现，仍然存在算法问题.

从开始 node 开始循环到存储在图形对象中的 node 的结尾
判断起始 node 的相邻 node 并将其存储在向量中
开始迭代以判断每个相邻 node 是否具有相同的element
如果它们具有相同的元素，则忽略它
如果不是，则将其保存为起始 node 中的已连接 node
然后，转到该连接的 node
判断来自该连接 node 的相邻 node (与#2相同)
然后，继续进行与#3-7相同的过程
一旦到达终点，它将返回到存储在#3中的下一个相邻 node
如果选中了所有分支，则转到下一个开始 node
对所有 node 完成它，然后删除重复的子图

在当前的脚本中，它是这样的.让我从 node 1开始.

node 1作为相邻 node 连接到 node 4、6、10和23.
并不是所有的相邻 node 都有相同的element，因此它可以go 4个方向.让我们以升序从 node 4开始.现在， node 1和4已连接.此子图已经包含两个elements:roo_com和st_con_tr.
node 4连接到 node 6、7、10和23，它们没有相同的"元素".我们go 6号 node .现在 node 1、4和6连接.现在这个子图包含roo_com，st_con_tr和st_th.
node 6连接到 node 1、4、10和23.但在这一批中，我们已经访问了 node 1和4，因此目标 node 只有10个和23个，其中element个没有在当前子图中列出.让我们转到 node 10并将其连接到它.现在这个子图包含roo_com、st_con_tr、st_th和tr_adsb.
node 10连接1、4、6、7和23.由于它已经访问了 node 1、4、6，所以目标 node 只有7个和23个，其中element个没有在当前子图中列出.让我们转到 node 7并将其连接到它.现在这个子图包含roo_com、st_con_tr、st_th、tr_adsb和st_adsb.
等等，这一批没有访问 node 23，尽管它的element没有被使用.<= THIS IS AN ALGORITHMIC ISSUE
现在 node 7的邻居只有 node 2，但它的element是st_con_rt.它已经在此批次中列出，因为 node 4具有相同的element.现在第一批已经完成了.
下一个批处理从 node 6开始，它已经连接到 node 1.然后 node 6的邻居是 node 1、4、10和23.这个过程将继续下go ……

我意识到，目前的流程无法获得我想要的东西.这是当前的脚本，但它也带来了奇怪的输出，即从后续 node 开始的分支迭代没有按预期工作，这可能是由于嵌套循环期间替换visited_node和stack造成的.我插入了visited_recur和stack_recur以避免它，但它仍然存在.无论如何，这个脚本包含了上面提到的基本问题.

## Extract all subgraphs without duplicated "element" values
  # It shall be pre-defined as this function contain itlsef recursively
  
  ## Set recursive function for exploring neighbors
  search_neighbor <- function(graph, current_node, visited_node, stack) {
    
    # Get neighbors of the current node
    neighbors <- neighbors(graph, current_node)
    
    # Explore each neighboring node recursively
    for (neighbor_node in neighbors[!visited_node[neighbors]]) {
      if (vertex_attr(graph)$element[current_node] != vertex_attr(graph)$element[neighbor_node]) {
      
      print(paste0("# Recursive search: node ", current_node, " connected to node ", neighbor_node))

      # Update visited node not to revisit
        visited_node[neighbor_node] <- TRUE
      
      # Add the node to the connected subgraph
        stack <- c(stack, neighbor_node)
        
      # Recursive process to go to next neighbor node
        stack <- search_neighbor(gss, neighbor_node, visited_node, stack)
      
      }
    }
  }
  
## Initialize subgraph to store connected nodes and edges
  list_subgraph <- list()
  
## Initialize i
  i <- 1
  
  ## Iteration for all nodes as starting point
  for (start_node in V(gss)) {
    
    # Initialize visited nodes as all FALSE from node list in the graph
    visited_node <- logical(length(V(gss)))
    visited_node[start_node] <- TRUE
    
    # Stack start node as starting point
    stack <- start_node
    
    # Search neighboring nodes from start node
    neighbors <- neighbors(gss, start_node)
    
    # Start node as current node as starting point
    current_node <- start_node
    
    print(paste0("// Start from node ", current_node))
    
      # Iteration for neighboring nodes
      for (neighbor_node in neighbors[!visited_node[neighbors]]) {
        
        # Initialize an empty graph for the connected subgraph
        connected_subgraph <- make_empty_graph(n = 0, directed = FALSE)

        # Check if neighboring node contains a different element
        if (vertex_attr(gss)$element[current_node] != vertex_attr(gss)$element[neighbor_node]) {
          
          print(paste0("Node ", current_node, " connected to node ", neighbor_node))
          
          # Add neighboring node to subgraph
          stack <- c(stack, neighbor_node) ; stack_recur <- stack
          # Update visited node not to revisit
          visited_node[neighbor_node] <- TRUE ; visited_recur <- visited_node

          # Recursive exploration of neighbors from current neighbor node till the end
          search_neighbor(gss, neighbor_node, visited_recur, stack_recur)
          
          # Store connected subgraph
          connected_subgraph <- subgraph(gss, stack)
          
        }
        
        print(paste0("\\ Finished #", i, " search neighbors:", paste(stack, collapse = ", ")))
        
        # Store extracted subgraph from search result
        list_subgraph[[i]] <- connected_subgraph
        
        # Update i
        i <- i + 1
        
      }
    }

  # Remove duplicated subtracted graphs
  unique_subgraphs <- unique(list_subgraph)

我不确定这个复杂的问题是否可以在这个论坛上提出，因为它没有具体说明脚本问题.如果有人能提供任何见解来改善它，希望.我希望有什么函数可以在不实现复杂的搜索方法的情况下解决这个问题.

提取R中没有重复顶点属性的所有子图

Image

推荐答案

Visualization

R相关问答推荐

在ComplexHeatmap中，如何更改anno_barplot()标题的Angular ？

如何创建构成多个独立列条目列表的收件箱框列？

基于shiny 应用程序中的日期范围子集xts索引

如何对数据集进行逆向工程？

在垂直轴中包含多个ggplot2图中的平均值

如何在R中合并和合并多个rabrame？

二维样条，严格以一个参数递增

如果某些列全部为NA，则更改列

用关联字符串替换列名的元素

计算两列中满足特定条件连续行之间的平均值

从多个线性回归模型中提取系数

如何通过匹配R中所有可能的组合来从宽到长旋转多个列？

使用Scale_*_MANUAL时在图例中保留未使用的系数级别

从R中的对数正态分布生成随机数的正确方法

在另一个包中设置断点&S R函数

如何使用For-R循环在向量中找到一系列数字

无法将条件case_when()应用于使用！！创建的新变量Mutations

如何移动点以使它们的打印不重叠

从多行中 Select 最小值

R data.设置函数&；连接中的列值而不使用for循环的表方法？