Building a server monitoring system with Grafana + collectd/Telegraf + InfluxDB

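This post was written 2832 days ago and last edited 2832 days ago. Some of its content may be outdated, so please verify it before relying on it.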

The overall flow looks like this:
collect data (collectd/Telegraf) -> store data (InfluxDB) -> display data (Grafana).
The system is Debian 8 64-bit; my configuration notes follow.

Update the system

apt-get update && apt-get upgrade -y

Install InfluxDB (downloads: https://portal.influxdata.com/downloads#influxdb)

wget https://dl.influxdata.com/influxdb/releases/influxdb_1.3.5_amd64.deb
sudo dpkg -i influxdb_1.3.5_amd64.deb

Start InfluxDB and enable it at boot

systemctl start influxdb.service
systemctl enable influxdb

Check the status

systemctl status influxdb.service

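By default, InfluxDB uses the following network ports:

TCP port 8086 for client-server communication over InfluxDB's HTTP API
TCP port 8088 for the RPC service used for backup and restore

Besides these ports, InfluxDB offers several plugins that may need their own custom ports. All port mappings can be changed in the configuration file, which for a default installation is located at /etc/influxdb/influxdb.conf.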
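
To confirm that the listeners described above are actually up, the output of `ss -ltn` can be filtered for the port numbers. This is a minimal sketch, assuming the iproute2 `ss` tool is installed; the helper name `port_in_listing` is made up for illustration, and it reads the listing from standard input so it can also be tried on saved output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given TCP port shows up as a local
# listening address in `ss -ltn` style output read from stdin.
port_in_listing() {
    port="$1"
    # With `ss -ltn`, column 1 is the state and column 4 is "Local Address:Port".
    # The port is whatever follows the last ":" in column 4.
    awk -v p="$port" '
        $1 == "LISTEN" { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Usage on a live system (left commented out):
#   ss -ltn | port_in_listing 8086 && echo "HTTP API is listening"
#   ss -ltn | port_in_listing 8088 && echo "RPC service is listening"
```

Reading from stdin keeps the check independent of how the listing was obtained, so the same function works on a captured file.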

nano /etc/influxdb/influxdb.conf

Enable the collectd plugin

[[collectd]]
  enabled = true
  bind-address = ":25826"
  database = "collectd"
  retention-policy = ""
  
  # The collectd service supports either scanning a directory for multiple types
  # db files, or specifying a single db file.
  typesdb = "/usr/share/collectd/types.db"
  security-level = "none"
  # auth-file = "/etc/collectd/auth_file"

  # These next lines control how batching works. You should have this enabled
  # otherwise you could get dropped metrics or poor performance. Batching
  # will buffer points in memory if you have many coming in.

  # Flush if this many points get buffered
  batch-size = 5000

  # Number of batches that may be pending in memory
  batch-pending = 10

  # Flush at least this often even if we haven't hit buffer limit
  batch-timeout = "10s"

  # UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
  read-buffer = 0

Restart InfluxDB

systemctl restart influxdb.service

Create the database, using either the CLI or the HTTP API

influx -precision rfc3339

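By default, the InfluxDB HTTP API runs on port 8086, so influx connects to localhost on port 8086 unless told otherwise. If you need to change these defaults, run influx --help.
The -precision argument specifies the format/precision of any returned timestamps.
rfc3339 tells InfluxDB to return timestamps in RFC3339 format (YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ).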
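
To make the timestamp shape concrete, the same RFC3339 layout (at one-second resolution, without the nanosecond fraction) can be produced from a Unix epoch value. This is only an illustration of the format, not something InfluxDB needs; it assumes GNU `date` as shipped on Debian, and `epoch_to_rfc3339` is a name chosen for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: print a Unix epoch value as an RFC3339 UTC timestamp.
# Relies on GNU date's "-d @SECONDS" syntax.
epoch_to_rfc3339() {
    date -u -d "@$1" '+%Y-%m-%dT%H:%M:%SZ'
}

epoch_to_rfc3339 0    # prints 1970-01-01T00:00:00Z
```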

CREATE DATABASE collectd
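
The statement above is typed at the influx prompt. The HTTP API route mentioned earlier can be sketched with curl: the InfluxDB 1.x `/query` endpoint accepts the statement as a URL-encoded `q` parameter in a POST request. The wrapper name `influx_post_query` and the `CURL` / `INFLUX_URL` override variables are assumptions made for this sketch.

```shell
#!/bin/sh
# Hypothetical wrapper: send one InfluxQL statement to the /query endpoint.
# CURL and INFLUX_URL may be overridden; the defaults match this setup.
influx_post_query() {
    "${CURL:-curl}" -sS -XPOST "${INFLUX_URL:-http://localhost:8086}/query" \
        --data-urlencode "q=$1"
}

# Usage (needs a running InfluxDB, so it is left commented out):
#   influx_post_query "CREATE DATABASE collectd"
```

Once authentication is turned on, the same request also needs credentials, for example curl's `-u user:password` option.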

Use SHOW DATABASES to list all databases:

> SHOW DATABASES
name: databases
name
----
_internal
collectd
>

Add authentication to InfluxDB. First check whether it is already enabled:

curl -G http://localhost:8086/query --data-urlencode "q=SHOW DATABASES"

Response:

{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["_internal"]]}]}]}

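This means authentication is not enabled. Edit the [http] section of the configuration file as follows, setting auth-enabled = true: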

[http]
  # Determines whether HTTP endpoint is enabled.
  enabled = true

  # The bind address used by the HTTP service.
  bind-address = ":8086"

  # Determines whether user authentication is enabled over HTTP/HTTPS.
  auth-enabled = true

  # The default realm sent back when issuing a basic auth challenge.
  realm = "InfluxDB"

  # Determines whether HTTP request logging is enabled.
  log-enabled = true

  # Determines whether detailed write logging is enabled.
  write-tracing = false

  # Determines whether the pprof endpoint is enabled.  This endpoint is used for
  # troubleshooting and monitoring.
  pprof-enabled = true

  # Determines whether HTTPS is enabled.
  https-enabled = false

  # The SSL certificate to use when HTTPS is enabled.
  #https-certificate = "/etc/ssl/influxdb.pem"

  # Use a separate private key location.
  #https-private-key = ""

  # The JWT auth shared secret to validate requests using JSON web tokens.
  # shared-secret = ""

  # The default chunk size for result sets that should be chunked.
  # max-row-limit = 0

  # The maximum number of HTTP connections that may be open at once.  New connections that
  # would exceed this limit are dropped.  Setting this value to 0 disables the limit.
  # max-connection-limit = 0

  # Enable http service over unix domain socket
  # unix-socket-enabled = false

  # The path of the unix domain socket.
  # bind-socket = "/var/run/influxdb.sock"

Then restart the instance and run the check again. The response is now:

{"error":"error authorizing query: create admin user first or disable authentication"}

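This means authentication is enabled.

Once authentication is enabled, InfluxDB requires at least one admin user; until one exists you cannot interact with it.

Run influx to enter the CLI, then run the following command to create a superuser named admin with the password password: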

➜ influx
Connected to http://localhost:8086 version 1.3.2
InfluxDB shell version: 1.3.2
> CREATE USER admin with PASSWORD 'password' WITH ALL PRIVILEGES
>

After creating the user, verify:

curl -G http://localhost:8086/query -u admin:password  --data-urlencode "q=SHOW DATABASES"

or

curl -G "http://localhost:8086/query?u=admin&p=password" --data-urlencode "q=SHOW DATABASES"

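Note the quotes: the URL contains &, which the shell would otherwise interpret.
If no credentials are supplied, the response is: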

{"error":"unable to parse authentication credentials"}

If the username or password is wrong, the response is:

{"error":"authorization failed"}
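
The responses quoted in this section can be told apart mechanically, which is handy in a provisioning script. The sketch below simply pattern-matches the response body against the messages shown in this post; `classify_influx_response` and the status labels it prints are hypothetical names, and a real script would more likely parse the JSON properly.

```shell
#!/bin/sh
# Hypothetical helper: map a /query response body to a short status label.
# The error strings are the ones quoted in this post.
classify_influx_response() {
    case "$1" in
        *'"error":"error authorizing query: create admin user first'*)
            echo "auth-enabled-no-admin" ;;
        *'"error":"unable to parse authentication credentials"'*)
            echo "missing-credentials" ;;
        *'"error":"authorization failed"'*)
            echo "bad-credentials" ;;
        *'"results"'*)
            echo "ok" ;;
        *)
            echo "unknown" ;;
    esac
}

# Usage (needs a running InfluxDB, so it is left commented out):
#   body="$(curl -sS -G http://localhost:8086/query -u admin:password \
#             --data-urlencode "q=SHOW DATABASES")"
#   classify_influx_response "$body"
```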

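InfluxDB has quite a few pitfalls. Releases come out quickly, and much of what tutorials on the web describe no longer works.

1. Starting with 1.1.0 the web admin interface was disabled by default but could still be turned on in the configuration file; in 1.3.0 it was removed entirely and can no longer be enabled. The official recommendation is to use Chronograf or Grafana instead. The latest version at the time of writing is 1.3.5.

2. Default admin account
Versions 0.8 and earlier ship with an admin account named root with the password root, which serves as a default account with admin privileges.
Versions 0.9 and later have no default account; you have to create one yourself.

3. Differences in the HTTP API
From 0.9 onward the API is completely different from earlier versions, so if you upgrade from an older version to 0.9 or later, be aware that any code calling the API has to be rewritten from scratch.
In 0.8 and earlier, requests looked like this:
http://localhost:8086/db/mydb/
In newer versions they look like this:
http://localhost:8086/query?db=test&pretty=true
Newer versions run almost everything through /query, which older versions do not have. Much of the code found online targets the old API and returns no data when pointed at a newer version.

Install collectd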

apt-get install collectd -y
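
Installing collectd alone does not send anything to InfluxDB; collectd's network plugin has to be pointed at the UDP listener enabled in the [[collectd]] section (port 25826). The sketch below writes such a snippet. It assumes that Debian's default collectd.conf includes files from /etc/collectd/collectd.conf.d, which should be checked on your system, and `write_collectd_network_conf` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: write a collectd config snippet that loads the network
# plugin and sends metrics to HOST:PORT (the InfluxDB collectd listener).
write_collectd_network_conf() {
    host="$1"
    port="$2"
    out="$3"
    cat > "$out" <<EOF
LoadPlugin network
<Plugin network>
    Server "$host" "$port"
</Plugin>
EOF
}

# Usage as root (the target path is an assumption about Debian's layout):
#   write_collectd_network_conf 127.0.0.1 25826 /etc/collectd/collectd.conf.d/influxdb.conf
#   systemctl restart collectd
```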

Install Grafana

wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana_4.5.1_amd64.deb 
sudo dpkg -i grafana_4.5.1_amd64.deb 

Installs binary to /usr/sbin/grafana-server
Installs Init.d script to /etc/init.d/grafana-server
Creates default file (environment vars) to /etc/default/grafana-server
Installs configuration file to /etc/grafana/grafana.ini
Installs systemd service (if systemd is available) name grafana-server.service
The default configuration sets the log file at /var/log/grafana/grafana.log
The default configuration specifies an sqlite3 db at /var/lib/grafana/grafana.db
Installs HTML/JS/CSS and other Grafana files at /usr/share/grafana

Start Grafana

systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server

Enable at boot

systemctl enable grafana-server.service

Default log path

/var/log/grafana

Default sqlite3 database path; back it up before upgrading

/var/lib/grafana/grafana.db

The configuration file is here

/etc/grafana/grafana.ini 
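
To finish the pipeline, Grafana needs InfluxDB registered as a data source. That can be done in the web UI, or, as sketched here, through Grafana's HTTP API (`POST /api/datasources`). The sketch assumes a default install listening on port 3000 with the initial admin/admin login, and reuses the collectd database and the admin user created earlier. `grafana_influx_datasource_json` is a hypothetical helper, and it does no JSON escaping, so it is only suitable for simple values.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON payload describing an InfluxDB data source.
# Arguments: NAME URL DATABASE USER PASSWORD (no JSON escaping is performed).
grafana_influx_datasource_json() {
    printf '{"name":"%s","type":"influxdb","access":"proxy","url":"%s","database":"%s","user":"%s","password":"%s"}\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Usage (needs a running Grafana, so it is left commented out):
#   grafana_influx_datasource_json collectd http://localhost:8086 collectd admin password |
#     curl -sS -u admin:admin -H 'Content-Type: application/json' \
#          -X POST http://localhost:3000/api/datasources -d @-
```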
