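Html: Extracting URLs from a web page

Posted on November 22

I want to extract URLs from a web page that contains many URLs and save the extracted URLs to a txt file.

Each URL on the page is prefixed with '127.0.0.1', but I want to strip the '127.0.0.1' prefix and extract only the URLs. When I run the PowerShell script below, it saves only '127.0.0.1'. Please help me fix this.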

    $threatFeedUrl = "https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate versions Anti-Malware List/AntiMalwareHosts.txt"
    
    # Download the threat feed data
    $threatFeedData = Invoke-WebRequest -Uri $threatFeedUrl
    
    # Define a regular expression pattern to match URLs starting with '127.0.0.1'
    $pattern = '127\.0\.0\.1(?:[^\s]*)'
    
    # Use the regular expression to find matches in the threat feed data
    $matches = [regex]::Matches($threatFeedData.Content, $pattern)
    
    # Create a list to store the matched URLs
    $urlList = @()
    
    # Populate the list with matched URLs
    foreach ($match in $matches) {
        $urlList += $match.Value
    }
    
    # Specify the output file path
    $outputFilePath = "output.txt"
    
    # Save the URLs to the output file
    $urlList | Out-File -FilePath $outputFilePath
    
    Write-Host "URLs starting with '127.0.0.1' extracted from threat feed have been saved to $outputFilePath."

Recommended answer

    $threatFeedUrl = 'https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate versions Anti-Malware List/AntiMalwareHosts.txt'
    
    # Download the threat feed data
    $threatFeedData = Invoke-RestMethod -Uri $threatFeedUrl
    
    # Define a regular expression pattern to match URLs starting with '127.0.0.1'
    $pattern = '127\.0\.0\.1 (\S+)'
    
    # Use the regular expression to find matches in the threat feed data
    $matchList = [regex]::Matches($threatFeedData, $pattern)
    
    # Create and populate the list with matched URLs
    $urlList = foreach ($match in $matchList) {
        $match.Groups[1].Value
    }
    
    # Specify the output file path
    $outputFilePath = 'output.txt'
    
    # Save the URLs to the output file
    $urlList | Out-File -FilePath $outputFilePath
    
    Write-Host "URLs starting with '127.0.0.1' extracted from threat feed have been saved to $outputFilePath."
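
To see why the script in the question saves only '127.0.0.1', it may help to run both patterns against a small inline sample instead of the live feed. The sketch below is illustrative only: the `$sample` text and its `.example` host names are made up, and the revised pattern uses `\s+` and a `(?m)^` anchor as an assumed, slightly more tolerant variation on the single-space pattern.

```powershell
# Hypothetical sample standing in for the downloaded hosts file.
$sample = @'
# Title: sample hosts list
127.0.0.1 ads.example
127.0.0.1 tracker.example
'@

# Pattern from the question: [^\s]* cannot cross the space that follows the IP,
# so each match is just the literal '127.0.0.1'.
$original = [regex]::Matches($sample, '127\.0\.0\.1(?:[^\s]*)') |
    ForEach-Object { $_.Value }

# Revised pattern: consume the whitespace after the IP, then capture the
# host name in group 1 and read that group instead of the whole match.
$revised = [regex]::Matches($sample, '(?m)^127\.0\.0\.1\s+(\S+)') |
    ForEach-Object { $_.Groups[1].Value }

$original   # 127.0.0.1, 127.0.0.1
$revised    # ads.example, tracker.example
```

The key change is reading `Groups[1].Value` rather than `Value`, so the prefix is matched but not kept. Using a name other than `$matches` for the result also avoids confusion with PowerShell's automatic `$Matches` variable, which the `-match` operator populates.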
