有许多专门为从命令行操作JSON而设计的工具,它们比使用Awk操作要容易得多,也更可靠,例如jq
:
curl -s 'https://api.github.com/users/lambda' | jq -r '.name'
You can also do this with tools that are likely already installed on your system, like Python using the json
module, and so avoid any extra dependencies, while still having the benefit of a proper JSON parser. The following assume you want to use UTF-8, which the original JSON should be encoded in and is what most modern terminals use as well:
Python 3:
curl -s 'https://api.github.com/users/lambda' | \
python3 -c "import sys, json; print(json.load(sys.stdin)['name'])"
Python 2:
export PYTHONIOENCODING=utf8
curl -s 'https://api.github.com/users/lambda' | \
python2 -c "import sys, json; print json.load(sys.stdin)['name']"
常见问题解答
为什么不是纯粹的shell 解决方案呢?
The standard POSIX/Single Unix Specification shell is a very limited language which doesn't contain facilities for representing sequences (list or arrays) or associative arrays (also known as hash tables, maps, dicts, or objects in some other languages). This makes representing the result of parsing JSON somewhat tricky in portable shell scripts. There are somewhat hacky ways to do it, but many of them can break if keys or values contain certain special characters.
Bash 4及更高版本、zsh和ksh都支持数组和关联数组,但这些shell并不是通用的(macOS在Bash 3上停止更新Bash,原因是从GPLv2更改为GPLv3,而许多Linux系统没有现成安装zsh).你可以编写一个可以在Bash 4或zsh中工作的脚本,其中一个现在在大多数macOS、Linux和BSD系统上都可以使用,但是要编写一个适用于这样一个多语言脚本的shebang行是很困难的.
最后,在shell中编写完整的JSON解析器将是一个足够重要的依赖项,因此您可能只需要使用JQ或Python之类的现有依赖项即可.要实现一个好的实现,它不会是一行代码,甚至不是很小的五行代码片段.
Why not use awk, sed, or grep?
可以使用这些工具以已知的形状和格式(例如每行一个键)快速提取JSON.在其他答案中,有几个例子可以说明这一点.
However, these tools are designed for line based or record based formats; they are not designed for recursive parsing of matched delimiters with possible escape characters.
So these quick and dirty solutions using awk/sed/grep are likely to be fragile, and break if some aspect of the input format changes, such as collapsing whitespace, or adding additional levels of nesting to the JSON objects, or an escaped quote within a string. A solution that is robust enough to handle all JSON input without breaking will also be fairly large and complex, and so not too much different than adding another dependency on jq
or Python.
我以前不得不处理由于shell脚本中的输入解析不佳而导致大量客户数据被删除的情况,所以我从不推荐可能以这种方式脆弱的又快又脏的方法.如果您正在进行一些一次性处理,请参阅其他答案以获得建议,但我仍然强烈建议您只使用现有的经过测试的JSON解析器.
Historical notes
This answer originally recommended jsawk, which should still work, but is a little more cumbersome to use than jq
, and depends on a standalone JavaScript interpreter being installed which is less common than a Python interpreter, so the above answers are probably preferable:
curl -s 'https://api.github.com/users/lambda' | jsawk -a 'return this.name'
这个答案最初也使用了问题中的Twitter API,但该API不再工作,因此很难复制示例进行测试,而新的Twitter API需要API密钥,因此我转而使用GitHub API,它可以在没有API密钥的情况下轻松使用.最初问题的第一个答案是:
curl 'http://twitter.com/users/username.json' | jq -r '.text'