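Filebeat: Collecting Specific Files
In the past, when configuring the data collection side of ELK, we used Logstash. Back then, the files to collect and Logstash's own settings were written in the same configuration file [logstash.yml].
With Filebeat, the service's own configuration and the configuration of the collection targets can be split apart: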
- filebeat.yml
- modules.d/*
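Below are the details of this configuration in my environment.
First, let's look at Filebeat's service configuration.
By default, Filebeat's configuration lives under [/etc/filebeat].
Service configuration file: [/etc/filebeat/filebeat.yml]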
[root@cloudera1 filebeat]# pwd
/etc/filebeat
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr
total 332
-rw-r--r-- 1 root root  78871 Aug 20 03:30 filebeat.reference.yml
-rw-r--r-- 1 root root 242580 Aug 20 03:30 fields.yml
drwxr-xr-x 2 root root   4096 Oct 16 13:35 modules.d
-rw------- 1 root root   8184 Oct 16 13:45 filebeat.yml
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# cat filebeat.yml | wc -l
224
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# cat filebeat.yml
###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after

#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"
  host: "192.168.72.112:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------

setup.ilm.enabled: false
setup.template.enabled: false
setup.template.name: "filebeat-cloudera1"
setup.template.pattern: "filebeat-cloudera1-"

output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]
  hosts: ["192.168.72.112:9200"]

  # adamhuan define
  # index
  index: "filebeat-192.168.72.131-cloudera1-%{+yyyy.MM.dd}"

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== Xpack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
[root@cloudera1 filebeat]#
In the configuration above, note this part:
[root@cloudera1 filebeat]# cat filebeat.yml | grep modules
# For more available modules and options, please see the filebeat.reference.yml sample
#============================= Filebeat modules ===============================
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
[root@cloudera1 filebeat]#
As you can see, apart from the service's own configuration, the configuration for each specific target is defined under [/etc/filebeat/modules.d/*.yml].
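To make the effect of that glob concrete, here is a minimal shell sketch of which file names a `modules.d/*.yml` pattern picks up. It runs in a throwaway directory rather than the real /etc/filebeat, the helper name `list_loaded_configs` is hypothetical, and it assumes Filebeat's glob matches whole file names the way the shell does.

```shell
#!/bin/sh
# Build a disposable stand-in for /etc/filebeat (assumption: names only, no content).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/modules.d"
touch "$demo_dir/modules.d/system.yml" \
      "$demo_dir/modules.d/nginx.yml.disabled" \
      "$demo_dir/modules.d/mysql.yml.disabled"

# list_loaded_configs DIR: print the file names matching DIR/modules.d/*.yml
# (hypothetical helper, not part of Filebeat).
list_loaded_configs() {
    for f in "$1"/modules.d/*.yml; do
        [ -e "$f" ] || continue   # the glob matched nothing
        basename "$f"
    done
}

loaded=$(list_loaded_configs "$demo_dir")
echo "$loaded"
rm -rf "$demo_dir"
```

Only the file whose name ends in `.yml` is listed; the `.yml.disabled` files fall outside the pattern.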
Let's look at the configurations Filebeat ships for different targets:
[root@cloudera1 filebeat]# pwd
/etc/filebeat
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr
total 332
-rw-r--r-- 1 root root  78871 Aug 20 03:30 filebeat.reference.yml
-rw-r--r-- 1 root root 242580 Aug 20 03:30 fields.yml
drwxr-xr-x 2 root root   4096 Oct 16 13:35 modules.d
-rw------- 1 root root   8184 Oct 16 13:45 filebeat.yml
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr modules.d/
total 124
-rw-r--r-- 1 root root  426 Aug 20 03:30 zeek.yml.disabled
-rw-r--r-- 1 root root  302 Aug 20 03:30 traefik.yml.disabled
-rw-r--r-- 1 root root  299 Aug 20 03:30 suricata.yml.disabled
-rw-r--r-- 1 root root  266 Aug 20 03:30 santa.yml.disabled
-rw-r--r-- 1 root root  566 Aug 20 03:30 redis.yml.disabled
-rw-r--r-- 1 root root  343 Aug 20 03:30 rabbitmq.yml.disabled
-rw-r--r-- 1 root root  305 Aug 20 03:30 postgresql.yml.disabled
-rw-r--r-- 1 root root  356 Aug 20 03:30 panw.yml.disabled
-rw-r--r-- 1 root root  495 Aug 20 03:30 osquery.yml.disabled
-rw-r--r-- 1 root root  472 Aug 20 03:30 nginx.yml.disabled
-rw-r--r-- 1 root root  214 Aug 20 03:30 netflow.yml.disabled
-rw-r--r-- 1 root root  287 Aug 20 03:30 nats.yml.disabled
-rw-r--r-- 1 root root  471 Aug 20 03:30 mysql.yml.disabled
-rw-r--r-- 1 root root  311 Aug 20 03:30 mssql.yml.disabled
-rw-r--r-- 1 root root  296 Aug 20 03:30 mongodb.yml.disabled
-rw-r--r-- 1 root root  470 Aug 20 03:30 logstash.yml.disabled
-rw-r--r-- 1 root root  293 Aug 20 03:30 kibana.yml.disabled
-rw-r--r-- 1 root root  398 Aug 20 03:30 kafka.yml.disabled
-rw-r--r-- 1 root root  366 Aug 20 03:30 iptables.yml.disabled
-rw-r--r-- 1 root root  470 Aug 20 03:30 iis.yml.disabled
-rw-r--r-- 1 root root  651 Aug 20 03:30 icinga.yml.disabled
-rw-r--r-- 1 root root  376 Aug 20 03:30 haproxy.yml.disabled
-rw-r--r-- 1 root root  770 Aug 20 03:30 googlecloud.yml.disabled
-rw-r--r-- 1 root root  327 Aug 20 03:30 envoyproxy.yml.disabled
-rw-r--r-- 1 root root  964 Aug 20 03:30 elasticsearch.yml.disabled
-rw-r--r-- 1 root root  318 Aug 20 03:30 coredns.yml.disabled
-rw-r--r-- 1 root root 1307 Aug 20 03:30 cisco.yml.disabled
-rw-r--r-- 1 root root  280 Aug 20 03:30 auditd.yml.disabled
-rw-r--r-- 1 root root  475 Aug 20 03:30 apache.yml.disabled
-rw-r--r-- 1 root root 2248 Oct 16 09:45 cloudera.yml.disabled
-rw-r--r-- 1 root root  550 Oct 16 11:00 system.yml
[root@cloudera1 filebeat]#
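As you can see, many different targets [modules] are defined there.
By state, they fall into two groups: enabled (file name ends in .yml) and disabled (file name ends in .yml.disabled).
The [filebeat] command can change the state of a given module, as follows: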
[root@cloudera1 filebeat]# pwd
/etc/filebeat
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr modules.d/*.yml
-rw-r--r-- 1 root root 550 Oct 16 11:00 modules.d/system.yml
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# filebeat modules disable system
Disabled system
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr modules.d/*.yml
ls: cannot access modules.d/*.yml: No such file or directory
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr modules.d/ | grep system
-rw-r--r-- 1 root root 550 Oct 16 11:00 system.yml.disabled
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# filebeat modules enable system
Enabled system
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr modules.d/ | grep system
-rw-r--r-- 1 root root 550 Oct 16 11:00 system.yml
[root@cloudera1 filebeat]#
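The only visible effect in the session above is that the file is renamed between `system.yml` and `system.yml.disabled`. The sketch below mimics just that observable behavior in a temporary directory. The helpers `disable_module` and `enable_module` are hypothetical stand-ins, not the Filebeat implementation, and the real command may do more than rename.

```shell
#!/bin/sh
# disable_module MODULES_DIR NAME: rename NAME.yml to NAME.yml.disabled (hypothetical helper).
disable_module() {
    if [ -f "$1/$2.yml" ]; then
        mv "$1/$2.yml" "$1/$2.yml.disabled"
    fi
}

# enable_module MODULES_DIR NAME: rename NAME.yml.disabled back to NAME.yml (hypothetical helper).
enable_module() {
    if [ -f "$1/$2.yml.disabled" ]; then
        mv "$1/$2.yml.disabled" "$1/$2.yml"
    fi
}

# Demo on a disposable directory standing in for modules.d.
mods=$(mktemp -d)
touch "$mods/system.yml"

disable_module "$mods" system
after_disable=$(ls "$mods")

enable_module "$mods" system
after_enable=$(ls "$mods")

echo "after disable: $after_disable"
echo "after enable:  $after_enable"
rm -rf "$mods"
```

On a real host, prefer the `filebeat modules enable` and `filebeat modules disable` commands shown in the session.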
Now that module management is clear, let's look at the configuration of a specific module:
[root@cloudera1 filebeat]# cat modules.d/system.yml
# Module: system
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.3/filebeat-module-system.html

- module: system
  # Syslog
  syslog:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/*","/var/log/*/*","/var/log/cloudera*/*","/var/log/hadoop*/*"]

  # Authorization logs
  auth:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
[root@cloudera1 filebeat]#
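As you can see, the configuration file defines the module name [system], which contains two categories [syslog / auth] (Filebeat calls these filesets).
This configuration cannot be written arbitrarily: the module and its categories must actually exist in the corresponding module folder under [/usr/share/filebeat/module].
Specifically: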
[root@cloudera1 filebeat]# ls -ltr /usr/share/filebeat
total 244
-rw-r--r--  1 root root  13675 Aug 20 03:08 LICENSE.txt
-rw-r--r--  1 root root 216284 Aug 20 03:08 NOTICE.txt
-rw-r--r--  1 root root    802 Aug 20 03:32 README.md
drwxr-xr-x  3 root root   4096 Oct 15 23:06 kibana
drwxr-xr-x 33 root root   4096 Oct 15 23:06 module
drwxr-xr-x  2 root root   4096 Oct 15 23:06 bin
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr /usr/share/filebeat/module/system/
total 12
-rw-r--r-- 1 root root  330 Aug 20 03:30 module.yml
drwxr-xr-x 4 root root 4096 Oct 15 23:06 auth
drwxr-xr-x 4 root root 4096 Oct 15 23:06 syslog
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# tree -L 3 /usr/share/filebeat/module/system/
/usr/share/filebeat/module/system/
├── auth
│   ├── config
│   │   └── auth.yml
│   ├── ingest
│   │   └── pipeline.json
│   └── manifest.yml
├── module.yml
└── syslog
    ├── config
    │   └── syslog.yml
    ├── ingest
    │   └── pipeline.json
    └── manifest.yml

6 directories, 7 files
[root@cloudera1 filebeat]#
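Going by the tree above, each category is a subdirectory that carries its own `manifest.yml`. The following sketch lists the categories of a module on that basis. Treating `manifest.yml` as the marker is an assumption drawn only from this listing, the helper `list_filesets` is hypothetical, and the demo runs against a temporary copy of the layout rather than /usr/share/filebeat.

```shell
#!/bin/sh
# list_filesets MODULE_DIR: print each subdirectory that contains a manifest.yml
# (hypothetical helper; the manifest.yml marker is an assumption from the tree output).
list_filesets() {
    for manifest in "$1"/*/manifest.yml; do
        [ -e "$manifest" ] || continue   # no subdirectory has a manifest
        basename "$(dirname "$manifest")"
    done
}

# Recreate the relevant part of the system module layout in a temp directory.
demo=$(mktemp -d)
mkdir -p "$demo/system/auth/config" "$demo/system/syslog/config"
touch "$demo/system/module.yml" \
      "$demo/system/auth/manifest.yml" \
      "$demo/system/syslog/manifest.yml"

filesets=$(list_filesets "$demo/system")
echo "$filesets"
rm -rf "$demo"
```

Pointed at /usr/share/filebeat/module/system on the host above, the same function would be expected to print the two category names used in system.yml.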
If the configuration named in the file above does not actually exist under [/usr/share/filebeat/module], the subsequent [filebeat setup] fails with an error, as follows:
[root@cloudera1 filebeat]# filebeat modules enable cloudera
Enabled cloudera
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# cat modules.d/cloudera.yml
# Module: system
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.3/filebeat-module-system.html

- module: cloudera
  # log - cloudera-scm-agent
  log-cloudera-scm-agent:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/cloudera-scm-agent

  # log - cloudera-scm-alertpublisher
  log-cloudera-scm-alertpublisher:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/cloudera-scm-alertpublisher

  # log - cloudera-scm-eventserver
  log-cloudera-scm-eventserver:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/cloudera-scm-eventserver

  # log - cloudera-scm-firehose
  log-cloudera-scm-firehose:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/cloudera-scm-firehose

  # log - cloudera-scm-headlamp
  log-cloudera-scm-headlamp:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/cloudera-scm-headlamp

  # log - cloudera-scm-server
  log-cloudera-scm-server:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/cloudera-scm-server

  # log - hadoop-hdfs
  log-hadoop-hdfs:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/hadoop-hdfs

  # log - hadoop-mapreduce
  log-hadoop-mapreduce:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/hadoop-mapreduce

  # log - hadoop-yarn
  log-hadoop-yarn:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/hadoop-yarn
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr /usr/share/filebeat/module/ | grep cloudera
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# filebeat setup
ILM policy and write alias loading not enabled.
Template loading not enabled.
Index setup finished.
Loading dashboards (Kibana must be running and reachable)
Loaded dashboards
Exiting: 1 error: Error getting filesets for module cloudera: open /usr/share/filebeat/module/cloudera: no such file or directory
[root@cloudera1 filebeat]#
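The error above could be caught before running `filebeat setup` with a small pre-check along these lines. This is a sketch under assumptions: `find_missing_modules` is a hypothetical helper, it reads module names from lines starting with `- module:` in the enabled files, and it only tests that a directory of the same name exists. The demo uses temporary directories standing in for modules.d and /usr/share/filebeat/module.

```shell
#!/bin/sh
# find_missing_modules CONF_DIR MODULE_HOME
# Print every module named in CONF_DIR/*.yml that has no directory under MODULE_HOME.
# (Hypothetical helper; it does not validate filesets or YAML syntax.)
find_missing_modules() {
    conf_dir=$1
    module_home=$2
    for f in "$conf_dir"/*.yml; do
        [ -e "$f" ] || continue   # no enabled module files
        sed -n 's/^- module:[[:space:]]*//p' "$f" | while read -r name; do
            [ -d "$module_home/$name" ] || echo "$name"
        done
    done
}

# Demo: system has a module directory, cloudera does not.
tmp=$(mktemp -d)
mkdir -p "$tmp/modules.d" "$tmp/module/system"
printf '%s\n' '- module: system' '  syslog:' '    enabled: true' > "$tmp/modules.d/system.yml"
printf '%s\n' '- module: cloudera' '  log-hadoop-hdfs:' '    enabled: true' > "$tmp/modules.d/cloudera.yml"

missing=$(find_missing_modules "$tmp/modules.d" "$tmp/module")
echo "modules without a directory: $missing"
rm -rf "$tmp"
```

On the host in this article, the call would presumably be `find_missing_modules /etc/filebeat/modules.d /usr/share/filebeat/module`.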
In the normal case, it looks like this:
[root@cloudera1 filebeat]# filebeat modules disable cloudera
Disabled cloudera
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# ls -ltr modules.d/*.yml
-rw-r--r-- 1 root root 550 Oct 16 11:00 modules.d/system.yml
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# filebeat setup
ILM policy and write alias loading not enabled.
Template loading not enabled.
Index setup finished.
Loading dashboards (Kibana must be running and reachable)
Loaded dashboards
Loaded machine learning job configurations
Loaded Ingest pipelines
[root@cloudera1 filebeat]#
From the module configuration files above, you should now understand how to set things up when you need to collect an additional file.
As follows:
[root@cloudera1 filebeat]# pwd
/etc/filebeat
[root@cloudera1 filebeat]#
[root@cloudera1 filebeat]# cat modules.d/system.yml | grep -v "#" | strings
- module: system
  syslog:
    enabled: true
    var.paths: ["/var/log/*","/var/log/*/*","/var/log/cloudera*/*","/var/log/hadoop*/*"]
  auth:
    enabled: true
[root@cloudera1 filebeat]#
Set it in [var.paths] above.
The example above already shows how to set multiple paths.
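Before adding a pattern to [var.paths], it can help to preview which files it would match. The sketch below expands a list of glob patterns with the shell and prints the unique regular files. It assumes that shell globbing is close enough to Filebeat's glob handling for simple `*` patterns and that the paths contain no spaces. `preview_paths` is a hypothetical helper, and the demo uses a temporary tree instead of the real /var/log.

```shell
#!/bin/sh
# preview_paths PATTERN...: expand each glob and print the unique regular files it matches
# (hypothetical helper; relies on unquoted expansion, so paths with spaces are not supported).
preview_paths() {
    for pattern in "$@"; do
        for f in $pattern; do
            if [ -f "$f" ]; then
                echo "$f"
            fi
        done
    done | sort -u
}

# Demo tree that imitates a few entries under /var/log.
root=$(mktemp -d)
mkdir -p "$root/var/log/cloudera-scm-agent" "$root/var/log/hadoop-hdfs"
touch "$root/var/log/messages" \
      "$root/var/log/cloudera-scm-agent/agent.log" \
      "$root/var/log/hadoop-hdfs/namenode.log"

# The same four patterns as in system.yml, rooted in the demo tree.
matches=$(preview_paths "$root/var/log/*" "$root/var/log/*/*" \
                        "$root/var/log/cloudera*/*" "$root/var/log/hadoop*/*")
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')

echo "$matches"
echo "unique files: $match_count"
rm -rf "$root"
```

In this demo the last two patterns add nothing new, because every file they match is already matched by the `*/*` pattern.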
Finished.