
Elasticsearch Index Optimization Reference

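Delayed allocation when a node leaves

index.unassigned.node_left.delayed_timeout defaults to 1m. When a node leaves the cluster for any reason, intentional or otherwise, the master node reacts by:

  • Promoting replica shards to primaries to replace any primaries that were on the node.
  • Allocating replica shards to replace the missing replicas (assuming there are enough nodes).
  • Rebalancing shards evenly across the remaining nodes.

The timeout delays the allocation of replacement replicas, which gives the node a chance to come back. If a node has been removed for good and you want Elasticsearch to allocate the missing shards immediately, set the timeout to 0: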

PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "0"
  }
}
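
The opposite adjustment is also common: raising the timeout before planned maintenance so that a restarting node does not trigger shard copying. The sketch below only builds the HTTP request with the Python standard library and does not send it. The URL http://localhost:9200, the 5m value, and the helper name delayed_timeout_request are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: a local, unsecured node; adjust the URL for a real cluster.
ES_URL = "http://localhost:9200"


def delayed_timeout_request(timeout, index="_all"):
    """Build (but do not send) the settings update for delayed allocation."""
    body = {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = delayed_timeout_request("5m")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Passing "0" instead of "5m" reproduces the request shown above.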

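Index recovery priority

Unassigned shards are recovered in order of priority whenever possible. Indices are sorted into priority order as follows:

  • The optional index.priority setting (higher before lower)
  • The index creation date (higher before lower)
  • The index name (higher before lower)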
PUT index_4/_settings
{
  "index.priority": 1
}
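
The ordering rule can be modeled as a single descending sort on the tuple (priority, creation date, name). This is a minimal sketch of the rule, not Elasticsearch's own code; the IndexInfo class, the index names, and the integer creation dates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IndexInfo:
    name: str
    priority: int = 1        # index.priority defaults to 1
    creation_date: int = 0   # hypothetical timestamps; larger means newer


def recovery_order(indices):
    """Sort by priority, then creation date, then name, all higher-first."""
    return sorted(
        indices,
        key=lambda i: (i.priority, i.creation_date, i.name),
        reverse=True,
    )


indices = [
    IndexInfo("index_1", creation_date=1),
    IndexInfo("index_2", creation_date=2),
    IndexInfo("index_3", priority=10, creation_date=3),
    IndexInfo("index_4", priority=5, creation_date=4),
]
print([i.name for i in recovery_order(indices)])
# ['index_3', 'index_4', 'index_2', 'index_1']
```

index_2 comes before index_1 only because it was created later; both keep the default priority.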

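Total shards per node limit

  • Per index: index.routing.allocation.total_shards_per_node caps how many shards of that index may be placed on a single node. Defaults to unbounded.
  • Cluster-wide: cluster.routing.allocation.total_shards_per_node caps the total number of shards on each node. Defaults to -1 (unlimited).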
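
How the two limits combine can be sketched as a simple check. This is a simplified model for illustration, not the real allocation decider; the function name may_allocate and its parameters are hypothetical, and -1 is used for "unbounded" on both limits.

```python
UNBOUNDED = -1


def may_allocate(index_shards_on_node, all_shards_on_node,
                 index_limit=UNBOUNDED, cluster_limit=UNBOUNDED):
    """Return True if one more shard of the index may go on the node."""
    # Per-index limit: counts only this index's shards on the node.
    if index_limit != UNBOUNDED and index_shards_on_node >= index_limit:
        return False
    # Cluster-wide limit: counts every shard on the node.
    if cluster_limit != UNBOUNDED and all_shards_on_node >= cluster_limit:
        return False
    return True


print(may_allocate(2, 10))                                   # True
print(may_allocate(2, 10, index_limit=2))                    # False
print(may_allocate(1, 10, index_limit=2, cluster_limit=10))  # False
```

Both are hard limits, so setting them too low can leave shards unassigned.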

Data tier node roles (hot/cold)

  • data_content
  • data_hot
  • data_warm
  • data_cold
  • data_frozen
# Syntax used before ES 7.13; deprecated from 7.13 onward
index.routing.allocation.include._tier: data_warm
index.routing.allocation.require._tier: data_warm
index.routing.allocation.exclude._tier: data_warm
# Use this from ES 7.13 onward
index.routing.allocation.include._tier_preference: data_warm,data_hot
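
With _tier_preference, the index is assigned to the first tier in the list that has an available node. A minimal sketch of that rule follows; pick_tier and the set of tiers with nodes are hypothetical names used only for illustration.

```python
def pick_tier(tier_preference, tiers_with_nodes):
    """Return the first preferred tier that has a node, or None."""
    for tier in tier_preference.split(","):
        tier = tier.strip()
        if tier in tiers_with_nodes:
            return tier
    return None


# The cluster has no warm nodes, so the index falls back to the hot tier.
print(pick_tier("data_warm,data_hot", {"data_hot", "data_content"}))  # data_hot
```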

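Index blocks

# Make the index and its metadata read-only
index.blocks.read_only: true
# Read-only: documents in the index cannot be deleted, but the index itself can be
index.blocks.read_only_allow_delete: true
# Block read operations, write operations, or metadata reads and writes individually
index.blocks.read: true
index.blocks.write: true
index.blocks.metadata: true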

Usage

# <block> can be metadata, read, read_only, or write
PUT /my-index-000001/_block/<block>

Slow logs

  • Search slow log
// Can be set dynamically; thresholds default to -1 (disabled)
PUT /my-index-000001/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.query.debug": "2s",
  "index.search.slowlog.threshold.query.trace": "500ms",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.threshold.fetch.info": "800ms",
  "index.search.slowlog.threshold.fetch.debug": "500ms",
  "index.search.slowlog.threshold.fetch.trace": "200ms"
}
  • Indexing slow log; the log file name ends with _index_indexing_slowlog.log
PUT /my-index-000001/_settings
{
  "index.indexing.slowlog.threshold.index.warn": "10s",
  "index.indexing.slowlog.threshold.index.info": "5s",
  "index.indexing.slowlog.threshold.index.debug": "2s",
  "index.indexing.slowlog.threshold.index.trace": "500ms",
  "index.indexing.slowlog.source": "1000"
}
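
How the four thresholds interact can be sketched as follows: an operation is logged at the most severe level whose threshold it exceeds, and a threshold of -1 disables that level. This is a simplified model rather than Elasticsearch's implementation; slowlog_level is a hypothetical name, and the millisecond values mirror the query thresholds shown above.

```python
LEVELS = ("warn", "info", "debug", "trace")  # most to least severe


def slowlog_level(took_ms, thresholds_ms):
    """Return the level an operation would be logged at, or None."""
    for level in LEVELS:
        threshold = thresholds_ms.get(level, -1)  # -1 means disabled
        if threshold >= 0 and took_ms > threshold:
            return level
    return None


query_thresholds = {"warn": 10_000, "info": 5_000, "debug": 2_000, "trace": 500}
print(slowlog_level(6_000, query_thresholds))  # info
print(slowlog_level(300, query_thresholds))    # None
```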

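Store

The store module controls how index data is stored and accessed on disk. Keeping the default is recommended.

  • Node-level setting, in elasticsearch.yml
index.store.type: hybridfs
  • Per-index setting, at index creation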
PUT /my-index-000001
{
  "settings": {
    "index.store.type": "hybridfs"
  }
}

Possible values: fs, simplefs, niofs, mmapfs, hybridfs.

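Translog

Index and delete operations are written to the translog after Lucene has processed them and before they are acknowledged to the client.

# request (default): an index, delete, update, or bulk request reports success only after the translog has been fsynced on the primary and on every allocated replica
index.translog.durability: request
# async: fsync the translog to disk in the background instead
index.translog.durability: async
# Interval between background fsyncs in async mode; default 5s, minimum 100ms
index.translog.sync_interval: 5s
# Once the translog reaches this size, a flush is triggered
index.translog.flush_threshold_size: 512mb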
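
Switching to async trades durability for throughput: writes acknowledged since the last fsync can be lost if the node crashes, so the sync interval bounds that window. The sketch below builds the two settings and enforces the 100ms minimum on the client side. The helper names are hypothetical, and the parser handles only the ms, s, and m units, which is a subset of what Elasticsearch accepts.

```python
import re

UNITS_MS = {"ms": 1, "s": 1_000, "m": 60_000}  # subset of ES time units


def to_millis(value):
    """Convert a time value such as '5s' or '200ms' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * UNITS_MS[match.group(2)]


def async_translog_settings(sync_interval="5s"):
    """Return index settings for async durability with a checked interval."""
    if to_millis(sync_interval) < 100:
        raise ValueError("index.translog.sync_interval must be at least 100ms")
    return {
        "index.translog.durability": "async",
        "index.translog.sync_interval": sync_interval,
    }


print(async_translog_settings("200ms"))
```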

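History retention: soft deletes

# Configurable on indices created with Elasticsearch 6.5.0 or later; defaults to true
index.soft_deletes.enabled: true
# Retention lease period; defaults to 12h
index.soft_deletes.retention_lease.period: 12h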

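Index sorting

By default, segments are not sorted.

# Supported field types: boolean, numeric, date, and keyword (with doc_values)
index.sort.field: ["username"]
# Supported values: asc, desc
index.sort.order: ["asc"]
# Supported values: min, max
index.sort.mode: min
# Supported values: _last, _first
index.sort.missing: _last

Example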

PUT my-index-000001
{
  "settings": {
    "index": {
      "sort.field": [ "username", "date" ], 
      "sort.order": [ "asc", "desc" ]       
    }
  },
  "mappings": {
    "properties": {
      "username": {
        "type": "keyword",
        "doc_values": true
      },
      "date": {
        "type": "date"
      }
    }
  }
}
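
The effect of that sort specification on document order can be modeled in plain Python: documents are ordered by username ascending, with ties broken by date descending. The documents below are hypothetical, and this only illustrates the resulting order, not how Lucene sorts segments internally.

```python
docs = [
    {"username": "bob", "date": "2023-01-05"},
    {"username": "alice", "date": "2023-01-01"},
    {"username": "alice", "date": "2023-03-01"},
]

# Python's sort is stable, so apply the secondary key (date desc) first,
# then the primary key (username asc).
ordered = sorted(docs, key=lambda d: d["date"], reverse=True)
ordered = sorted(ordered, key=lambda d: d["username"])

for doc in ordered:
    print(doc["username"], doc["date"])
# alice 2023-03-01
# alice 2023-01-01
# bob 2023-01-05
```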

Indexing pressure

# Defaults to 10% of the heap.
indexing_pressure.memory.limit: 10%
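
The limit bounds the bytes held by outstanding indexing requests; once it is reached, the node rejects new coordinating and primary operations. To get a feel for the numbers, the sketch below converts the percentage into bytes. The function name and the 4 GiB heap are assumptions for illustration.

```python
def indexing_pressure_limit_bytes(heap_bytes, percent=10):
    """Bytes of in-flight indexing data allowed for a given heap size."""
    return heap_bytes * percent // 100


heap = 4 * 1024**3  # hypothetical 4 GiB heap
print(indexing_pressure_limit_bytes(heap))  # 429496729
```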