Using the Log Aggregation Tool Loki: LogQL

Introduction to LogQL

LogQL: Log Query Language
Loki comes with its very own language for querying logs called LogQL. LogQL can be considered a distributed grep with labels for filtering.

A basic LogQL query consists of two parts: the log stream selector and a filter expression. Due to Loki’s design, all LogQL queries are required to contain a log stream selector.

The log stream selector reduces the number of log streams to a manageable volume. How many labels you use to narrow down the log streams affects the relative performance of the query's execution. The filter expression then performs a distributed grep over the retrieved log streams.
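The two-part structure described above can be sketched as a small query builder. This is only an illustration: `quote` and `build_query` are hypothetical helper names, not part of Loki or any client library, and the escaping rule is a simplifying assumption.

```python
def quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_query(labels, filters=()):
    """Compose a stream selector plus optional line filters into one query string."""
    if not labels:
        # Every LogQL query must contain a log stream selector.
        raise ValueError("a LogQL query needs a log stream selector")
    selector = "{" + ",".join(f"{k}={quote(v)}" for k, v in labels.items()) + "}"
    parts = [selector] + [f"{op} {quote(p)}" for op, p in filters]
    return " ".join(parts)


query = build_query({"job": "mysql"}, [("|=", "error"), ("!=", "timeout")])
print(query)
```

Calling `build_query` with no labels raises an error, which mirrors the rule that the selector part is mandatory while the filter part is optional.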

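Loki Selectors

The label part of a query expression is wrapped in curly braces {} and uses key-value syntax to select labels. Multiple label expressions are separated by commas, for example: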

{app="mysql",name="mysql-backup"}

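The following label matching operators are currently supported:

=: exactly equal
!=: not equal
=~: regex matches
!~: regex does not match
For example: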

{name=~"mysql.+"}
{name!~"mysql.+"}
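The behavior of the four matching operators can be approximated in a few lines of Python. This is a rough model, not Loki's implementation: `matches` is a hypothetical helper, Python's `re` stands in for RE2, and two assumptions are made that follow Prometheus-style matching: the regex must match the whole label value, and a missing label is treated as an empty string.

```python
import re


def matches(labels, matchers):
    """Return True if a stream's label set satisfies every (name, op, value) matcher."""
    for name, op, value in matchers:
        actual = labels.get(name, "")  # assumption: missing label == empty string
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "=~":
            ok = re.fullmatch(value, actual) is not None  # anchored match
        elif op == "!~":
            ok = re.fullmatch(value, actual) is None
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


streams = [
    {"app": "mysql", "name": "mysql-backup"},
    {"app": "mysql", "name": "mysql"},
    {"app": "kafka", "name": "kafka"},
]

selected = [s["name"] for s in streams if matches(s, [("name", "=~", "mysql.+")])]
rejected = [s["name"] for s in streams if matches(s, [("name", "!~", "mysql.+")])]
print(selected, rejected)
```

Under these assumptions the stream named exactly "mysql" is not selected by "mysql.+", because ".+" needs at least one more character after the prefix.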

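Log Filters

After writing the log stream selector, you can filter the results further by writing a search expression. The search expression can be plain text or a regular expression.
Example queries: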

{job="mysql"} |= "error"
{name="kafka"} |~ "tsdb-ops.*io:2003"
{instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"

Filter operators can be chained and are applied to the expression in order; the resulting log lines satisfy every filter. For example:

{job="mysql"} |= "error" != "timeout"

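The following filter types are implemented:

|= The line contains the string.
!= The line does not contain the string.
|~ The line matches the regular expression.
!~ The line does not match the regular expression.
Regular expressions accept RE2 syntax. Matching is case-sensitive by default; prefix the regex with (?i) to make it case-insensitive.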
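How chained filters narrow a set of lines can be sketched in Python. This is an approximation under stated assumptions: `apply_filters` is a hypothetical helper, the sample log lines are invented for illustration, and Python's `re` module is used in place of RE2 (the two agree for simple patterns like these).

```python
import re


def apply_filters(lines, filters):
    """Keep only the lines that satisfy every (op, pattern) filter, applied in order."""
    checks = {
        "|=": lambda line, p: p in line,
        "!=": lambda line, p: p not in line,
        "|~": lambda line, p: re.search(p, line) is not None,
        "!~": lambda line, p: re.search(p, line) is None,
    }
    for op, pattern in filters:
        check = checks[op]
        lines = [line for line in lines if check(line, pattern)]
    return lines


# Invented sample lines, not real Loki output.
logs = [
    "error: connection refused",
    "error: query timeout",
    "info: backup finished",
    "ERROR: disk full",
]

kept = apply_filters(logs, [("|=", "error"), ("!=", "timeout")])
any_case = apply_filters(logs, [("|~", "(?i)error")])
print(kept)
print(any_case)
```

The first call keeps only the lowercase "error" line without "timeout"; the second shows the (?i) prefix also picking up the uppercase "ERROR" line.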

Log Statistics

rate: calculate the number of entries per second

rate(({job="mysql"} |= "error" != "timeout")[10s])

count_over_time: counts the entries for each log stream within the given range.

count_over_time({job="mysql"}[5m])
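The relationship between the two functions can be shown with a small Python sketch: the rate is the entry count in the window divided by the window length in seconds. The function bodies below are a simplified model rather than Loki's evaluator, the timestamps are invented, and treating the window as open at the start and closed at the end is an assumption.

```python
def count_over_time(timestamps, now, range_seconds):
    """Count entries whose timestamp falls in the window (now - range_seconds, now]."""
    start = now - range_seconds
    return sum(1 for t in timestamps if start < t <= now)


def rate(timestamps, now, range_seconds):
    """Entries per second over the same window."""
    return count_over_time(timestamps, now, range_seconds) / range_seconds


# Invented entry timestamps (seconds) for a single log stream.
entry_times = [100.0, 291.0, 293.5, 296.0, 299.0, 300.0]

count_5m = count_over_time(entry_times, now=300.0, range_seconds=300)
rate_10s = rate(entry_times, now=300.0, range_seconds=10)
print(count_5m, rate_10s)
```

With this data, all six entries fall in the five-minute window, while five of them fall in the last ten seconds, giving a rate of 0.5 entries per second.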

Aggregation Operators

sum: Calculate sum over labels
min: Select minimum over labels
max: Select maximum over labels
avg: Calculate the average over labels
stddev: Calculate the population standard deviation over labels
stdvar: Calculate the population standard variance over labels
count: Count number of elements in the vector
bottomk: Select smallest k elements by sample value
topk: Select largest k elements by sample value

Examples:
Get the top 10 applications by the highest log throughput:

topk(10,sum(rate({region="us-east1"}[5m])) by (name))

Get the count of logs during the last five minutes, grouping by level:

sum(count_over_time({job="mysql"}[5m])) by (level)

Get the average rate of HTTP GET requests from NGINX logs, grouped by region:

avg(rate(({job="nginx"} |= "GET")[10s])) by (region)
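The grouping and ranking in queries such as the topk example can be modeled in Python. This is a minimal sketch, assuming each stream has already been reduced to one numeric sample (for instance a rate); `sum_by` and `topk` are hypothetical helpers and the sample values are invented.

```python
from collections import defaultdict


def sum_by(samples, label):
    """Sum sample values per distinct value of `label`."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


def topk(k, totals):
    """Return the k (key, value) pairs with the largest values."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Invented per-stream rates, all from the same region.
samples = [
    ({"name": "api", "region": "us-east1"}, 4.0),
    ({"name": "api", "region": "us-east1"}, 2.5),
    ({"name": "db", "region": "us-east1"}, 3.0),
    ({"name": "cache", "region": "us-east1"}, 0.5),
]

per_name = sum_by(samples, "name")
top2 = topk(2, per_name)
print(per_name)
print(top2)
```

Here the two "api" streams are first collapsed into one total by the grouping step, and only then is the ranking applied, which is the same order of evaluation as topk wrapped around sum ... by (name).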