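Converting text from a txt file to JSON using jq

Posted on May 9

I have a txt file whose contents were produced by running the following command recursively: gsutil ls -r gs://bucket-test/** | while IFS= read -r key; do gsutil stat $key; done. It looks like this: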

gs://bucket-test/4e123978-8eed-43ae-f521-8fba54c704ea.zip:
    Creation time:          Wed, 21 Dec 2022 10:39:27 GMT
    Update time:            Wed, 21 Dec 2022 10:39:27 GMT
    Storage class:          STANDARD
    Content-Length:         0
    Content-Type:           application/zip
    Hash (crc32c):          AAAAAA==
    Hash (md5):             1B2M2Y8AsgTpgAmY7PhCfg==
    ETag:                   CM30q9XCivwCEAE=
    Generation:             1671619167320653
    Metageneration:         1
gs://bucket-test/GKiSQMZ5rAqrSWwur/uploads/GENERAL/SNrQD97nzQN9eDLeA/AAZYefiL5CT8pxe4L:
    Creation time:          Mon, 10 Apr 2023 19:09:41 GMT
    Update time:            Mon, 10 Apr 2023 19:09:41 GMT
    Storage class:          STANDARD
    Content-Disposition:    inline; filename=James_INGREDIENTS_A3.pdf
    Content-Length:         4381797
    Content-Type:           application/pdf
    Hash (crc32c):          GOzitA==
    Hash (md5):             eUSLC/z70gjDB2WQKIPOuQ==
    ETag:                   CLGPvu+BoP4CEAE=
    Generation:             1681153781106609
    Metageneration:         1
gs://bucket-test/prova.pdf:
    Creation time:          Mon, 08 May 2023 15:37:26 GMT
    Update time:            Mon, 08 May 2023 15:40:12 GMT
    Storage class:          STANDARD
    Content-Disposition:    inline; filename=James_KEY_VISUAL_A3.pdf
    Content-Language:       ace
    Content-Length:         15407
    Content-Type:           application/pdf
    Metadata:               
        meta-1:             prova 1
        meta-2:             prova 2
    Hash (crc32c):          ZIrHPA==
    Hash (md5):             oZbD+S8y35spkNozW3hUDA==
    ETag:                   CNDj09OG5v4CEAM=
    Generation:             1683560246604240
    Metageneration:         3

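I need to convert this output to JSON, splitting on the leading whitespace and assigning the value on the first line of each group to a "Key" field. There may also be subfields, for example under the "Metadata" value: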

{
  "Key": "gs://bucket-test/4e123978-8eed-43ae-f521-8fba54c704ea.zip",
  "Creation time": "Wed, 21 Dec 2022 10:39:27 GMT",
  "Update time": "Wed, 21 Dec 2022 10:39:27 GMT",
  "Storage class": "STANDARD",
  "Content-Length": "0",
  "Content-Type": "application/zip",
  "Hash (crc32c)": "AAAAAA==",
  "Hash (md5)": "1B2M2Y8AsgTpgAmY7PhCfg==",
  "ETag": "CM30q9XCivwCEAE=",
  "Generation": "1671619167320653",
  "Metageneration": "1"
},
{
  "Key": "gs://bucket-test/GKiSQMZ5rAqrSWwur/uploads/GENERAL/SNrQD97nzQN9eDLeA/AAZYefiL5CT8pxe4L",
  "Creation time": "Mon, 10 Apr 2023 19:09:41 GMT",
  "Update time": "Mon, 10 Apr 2023 19:09:41 GMT",
  "Storage class": "STANDARD",
  "Content-Disposition": "inline; filename=James_INGREDIENTS_A3.pdf",
  "Content-Length": "4381797",
  "Content-Type": "application/pdf",
  "Hash (crc32c)": "GOzitA==",
  "Hash (md5)": "eUSLC/z70gjDB2WQKIPOuQ==",
  "ETag": "CLGPvu+BoP4CEAE=",
  "Generation": "1681153781106609",
  "Metageneration": "1"
},
{
  "Key": "gs://bucket-test/prova.pdf",
  "Creation time": "Mon, 08 May 2023 15:37:26 GMT",
  "Update time": "Mon, 08 May 2023 15:40:12 GMT",
  "Storage class": "STANDARD",
  "Content-Disposition": "inline; filename=James_KEY_VISUAL_A3.pdf",
  "Content-Language": "ace",
  "Content-Length": "15407",
  "Content-Type": "application/pdf",
  "Metadata": {
    "meta-1": "prova 1",
    "meta-2": "prova 2"
  },
  "Hash (crc32c)": "ZIrHPA==",
  "Hash (md5)": "oZbD+S8y35spkNozW3hUDA==",
  "ETag": "CNDj09OG5v4CEAM=",
  "Generation": "1683560246604240",
  "Metageneration": "3"
}

I tried this command on a single group, without success: gsutil stat gs://bucket-test/prova.pdf | printf %s "$(cat)" | jq -R -s 'split("\n") | map({key: split(": ")[0], value: split(": ")[1]})'

It outputs this JSON array:

[
  {
    "key": "gs://spin8-test/prova.pdf:",
    "value": null
  },
  {
    "key": "    Creation time",
    "value": "         Mon, 08 May 2023 15:37:26 GMT"
  },
  {
    "key": "    Update time",
    "value": "           Mon, 08 May 2023 15:40:12 GMT"
  },
  {
    "key": "    Storage class",
    "value": "         STANDARD"
  },
  {
    "key": "    Content-Disposition",
    "value": "   inline; filename=James_KEY_VISUAL_A3.pdf"
  },
  {
    "key": "    Content-Language",
    "value": "      ace"
  },
  {
    "key": "    Content-Length",
    "value": "        15407"
  },
  {
    "key": "    Content-Type",
    "value": "          application/pdf"
  },
  {
    "key": "    Metadata",
    "value": "              "
  },
  {
    "key": "        meta-1",
    "value": "            prova 1"
  },
  {
    "key": "        meta-2",
    "value": "            prova 2"
  },
  {
    "key": "    Hash (crc32c)",
    "value": "         ZIrHPA=="
  },
  {
    "key": "    Hash (md5)",
    "value": "            oZbD+S8y35spkNozW3hUDA=="
  },
  {
    "key": "    ETag",
    "value": "                  CNDj09OG5v4CEAM="
  },
  {
    "key": "    Generation",
    "value": "            1683560246604240"
  },
  {
    "key": "    Metageneration",
    "value": "        3"
  }
]

Any suggestions? Thanks.
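The attempt above splits each line on ": " but never trims anything, so the indentation stays in the keys and the alignment padding stays in the values. The sketch below, in Python rather than jq, shows one way to separate a line into indent width, key and value. The function name split_line is hypothetical, and it assumes field names never contain a colon.

```python
from typing import Tuple


def split_line(line: str) -> Tuple[int, str, str]:
    """Return (indent width, key, value) for one line of `gsutil stat` output."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    if indent == 0:
        # Object header such as "gs://bucket/name:"; drop only the final colon,
        # because the object name itself contains a colon ("gs://").
        name = stripped.rstrip()
        if name.endswith(":"):
            name = name[:-1]
        return indent, name, ""
    # Field line: the key ends at the first colon (assumption: keys have no
    # colon), and the rest is the value surrounded by alignment padding.
    key, _, value = stripped.partition(":")
    return indent, key.strip(), value.strip()
```

The indent width (0, 4 or 8 in the sample) is what tells a header apart from a field or a nested Metadata entry.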

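Recommended answer

jq -Rn '
  # For each raw input line, record its indent width and capture key/value.
  reduce (inputs | {
    ind: match("^\\s*").length,
    cap: capture("\\s*(?<key>.*):(\\s+(?<value>.*))?$")
  }) as {$ind, $cap} ([];
    # Indent 0: an object name starts a new record.
    if $ind == 0 then . + [$cap | {key}]
    # Indent 4: a top-level field; "Metadata" becomes an empty object to fill.
    elif $ind == 4 then last += ([$cap | select(.key == "Metadata").value = {}] | from_entries)
    # Indent 8: a nested entry under Metadata.
    elif $ind == 8 then last.Metadata += ([$cap] | from_entries)
    else . end
  )
'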

[
  {
    "key": "gs://bucket-test/4e123978-8eed-43ae-f521-8fba54c704ea.zip",
    "Creation time": "Wed, 21 Dec 2022 10:39:27 GMT",
    "Update time": "Wed, 21 Dec 2022 10:39:27 GMT",
    "Storage class": "STANDARD",
    "Content-Length": "0",
    "Content-Type": "application/zip",
    "Hash (crc32c)": "AAAAAA==",
    "Hash (md5)": "1B2M2Y8AsgTpgAmY7PhCfg==",
    "ETag": "CM30q9XCivwCEAE=",
    "Generation": "1671619167320653",
    "Metageneration": "1"
  },
  {
    "key": "gs://bucket-test/GKiSQMZ5rAqrSWwur/uploads/GENERAL/SNrQD97nzQN9eDLeA/AAZYefiL5CT8pxe4L",
    "Creation time": "Mon, 10 Apr 2023 19:09:41 GMT",
    "Update time": "Mon, 10 Apr 2023 19:09:41 GMT",
    "Storage class": "STANDARD",
    "Content-Disposition": "inline; filename=James_INGREDIENTS_A3.pdf",
    "Content-Length": "4381797",
    "Content-Type": "application/pdf",
    "Hash (crc32c)": "GOzitA==",
    "Hash (md5)": "eUSLC/z70gjDB2WQKIPOuQ==",
    "ETag": "CLGPvu+BoP4CEAE=",
    "Generation": "1681153781106609",
    "Metageneration": "1"
  },
  {
    "key": "gs://bucket-test/prova.pdf",
    "Creation time": "Mon, 08 May 2023 15:37:26 GMT",
    "Update time": "Mon, 08 May 2023 15:40:12 GMT",
    "Storage class": "STANDARD",
    "Content-Disposition": "inline; filename=James_KEY_VISUAL_A3.pdf",
    "Content-Language": "ace",
    "Content-Length": "15407",
    "Content-Type": "application/pdf",
    "Metadata": {
      "meta-1": "prova 1",
      "meta-2": "prova 2"
    },
    "Hash (crc32c)": "ZIrHPA==",
    "Hash (md5)": "oZbD+S8y35spkNozW3hUDA==",
    "ETag": "CNDj09OG5v4CEAM=",
    "Generation": "1683560246604240",
    "Metageneration": "3"
  }
]
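For readers who would rather follow the logic outside jq, here is a rough Python sketch of the same indentation-driven approach. It is not the answer's code: the function name parse_gsutil_stat is hypothetical, and it assumes the same fixed indents seen in the sample (0 for an object name, 4 for a field, 8 for a Metadata entry) and that field names contain no colon.

```python
from typing import Dict, List


def parse_gsutil_stat(text: str) -> List[Dict[str, object]]:
    """Group `gsutil stat` output into one dict per object, keyed by indent."""
    records: List[Dict[str, object]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        if indent == 0:
            # Object name: start a new record, dropping the trailing colon.
            name = stripped.rstrip()
            if name.endswith(":"):
                name = name[:-1]
            records.append({"key": name})
            continue
        if not records:
            continue  # field line before any object name; ignore it
        # Assumption: the key ends at the first colon on the line.
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if indent == 4:
            # "Metadata" has no value of its own; it holds nested entries.
            records[-1][key] = {} if key == "Metadata" else value
        elif indent == 8:
            records[-1].setdefault("Metadata", {})[key] = value
    return records
```

Passing the result to json.dumps(..., indent=2) from the standard library should print an array shaped like the one above. As in the jq version, every value stays a string.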
