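This plugin monitors a directory for files, parses them, and moves each finished file to a specified directory. I use it to parse CSV files; other formats are left for you to explore.

Here is my configuration; the options are explained below.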

[[inputs.directory_monitor]]
  ## Override the measurement (table) name
  name_override = "four_base_test"
  ## The directory to monitor and read files from (including sub-directories if "recursive" is true).
  directory = "/mydata/Telegraf/input"
  #
  ## The directory to move finished files to (maintaining directory hierarchy from source).
  finished_directory = "/mydata/Telegraf/temp"
  #
  ## Setting recursive to true will make the plugin recursively walk the directory and process all sub-directories.
  recursive = false
  #
  ## The directory to move files to upon file error.
  ## If not provided, erroring files will stay in the monitored directory.
  error_directory = "/mydata/Telegraf/errorfile"
  #
  ## The amount of time a file is allowed to sit in the directory before it is picked up.
  ## This time can generally be low but if you choose to have a very large file written to the directory and it's potentially slow,
  ## set this higher so that the plugin will wait until the file is fully copied to the directory.
  directory_duration_threshold = "100ms"
  #
  ## A list of the only file names to monitor, if necessary. Supports regex. If left blank, all files are ingested.
  files_to_monitor = ["^.*\\.csv"]
  #
  ## A list of files to ignore, if necessary. Supports regex.
  files_to_ignore = [".DS_Store"]
  #
  ## Maximum lines of the file to process that have not yet been written by the
  ## output. For best throughput set to the size of the output's metric_buffer_limit.
  ## Warning: setting this number higher than the output's metric_buffer_limit can cause dropped metrics.
  # max_buffered_metrics = 10000
  #
  ## The maximum amount of file paths to queue up for processing at once, before waiting until files are processed to find more files.
  ## Lowering this value will result in *slightly* less memory use, with a potential sacrifice in speed efficiency, if absolutely necessary.
  # file_queue_size = 100000
  #
  ## Name a tag containing the name of the file the data was parsed from.  Leave empty
  ## to disable. Be cautious when file name variation is high, as this can increase the cardinality
  ## significantly. Read more about cardinality here:
  ## https://docs.influxdata.com/influxdb/cloud/reference/glossary/#series-cardinality
  # file_tag = ""
  #
  ## Specify if the file can be read completely at once or if it needs to be read line by line (default).
  ## Possible values: "line-by-line", "at-once"
  # parse_method = "line-by-line"
  csv_header_row_count = 1
  #
  ## The dataformat to be read from the files.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "csv"
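
Two of the patterns above are looser than they look. Assuming the plugin tests them as unanchored regular expressions against the bare file name (my reading, not something the comments state), "^.*\\.csv" also accepts a name like data.csv.bak, and the unescaped dot in ".DS_Store" matches any character. A stricter sketch of the same two settings, with everything else unchanged:

```toml
[[inputs.directory_monitor]]
  directory = "/mydata/Telegraf/input"
  ## Anchor the end so only names ending in .csv match.
  files_to_monitor = ["^.*\\.csv$"]
  ## Escape the dot and anchor both ends to match .DS_Store exactly.
  ## Assumption: the pattern is applied to the file name, not the full path.
  files_to_ignore = ["^\\.DS_Store$"]
  data_format = "csv"
  csv_header_row_count = 1
```

If your setup behaves differently, check it against a test directory before relying on the anchors.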

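Here, name_override = "four_base_test" sets the measurement (table) name. It is a generic Telegraf option, so other input plugins should accept it as well.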
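
As a minimal sketch of the same option on a different plugin, here it is applied to the cpu input. The name "cpu_custom" is made up for illustration:

```toml
[[inputs.cpu]]
  ## Metrics from this plugin are written as "cpu_custom" instead of "cpu".
  name_override = "cpu_custom"
```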

  • directory:

    • Description: Path of the source directory to monitor and read files from.

    • Example: "/path/to/source/directory"

  • finished_directory:

    • Description: Directory that files are moved to once they have been processed. The directory hierarchy of the source is preserved.

    • Example: "/path/to/finished/directory"

  • recursive:

    • Description: When set to true, the plugin walks the directory recursively and processes files in all sub-directories.

    • Default: false

    • Example: true

  • error_directory:

    • Description: Directory that files are moved to when processing them fails. If not set, failed files stay in the monitored directory.

    • Example: "/path/to/error/directory"

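  • directory_duration_threshold:

    • Description: How long a file must sit in the directory before it is picked up, which gives the writer time to finish. A low value is usually fine; raise it if large files are written to the directory slowly.

    • Default: "50ms"

    • Example: "100ms"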

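  • files_to_monitor:

    • Description: Only file names matching one of these regular expressions are monitored. If left empty, all files are ingested.

    • Example: ["^.*\\.csv"] (intended to match .csv files; the pattern has no trailing $, so append one if names such as data.csv.bak must be excluded)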

  • files_to_ignore:

    • Description: List of file names to ignore. Supports regular expressions.

    • Example: [".DS_Store"] (ignores .DS_Store files)

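  • max_buffered_metrics:

    • Description: Maximum number of lines read from files that have not yet been written by the output. For best throughput, set it to the output's metric_buffer_limit; a higher value can cause dropped metrics.

    • Default: 10000

    • Example: 5000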

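  • file_queue_size:

    • Description: Maximum number of file paths queued for processing at once. The plugin waits until queued files are processed before looking for more. A lower value uses slightly less memory but may cost some speed.

    • Default: 100000

    • Example: 50000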

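  • file_tag:

    • Description: Name of a tag added to the parsed data; its value is the name of the file the data came from. Leave empty to disable. When file names vary a lot, this can significantly increase series cardinality.

    • Example: "filename"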

  • parse_method:

    • Description: How the file is read: line by line, or all at once. Possible values are "line-by-line" and "at-once".

    • Default: "line-by-line"

    • Example: "at-once"

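  • data_format:

    • Description: Data format of the files to read. Several formats are supported, and each has its own set of configuration options.

    • Example: "influx" (the file content is in InfluxDB line protocol)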
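
Since each data format brings its own options, here is a rough sketch of a few CSV parser settings that go beyond csv_header_row_count. It assumes a hypothetical file whose header row is time,host,value; the column names are invented for illustration, so adjust them to your own data:

```toml
[[inputs.directory_monitor]]
  directory = "/mydata/Telegraf/input"
  data_format = "csv"
  ## The first row holds the column names.
  csv_header_row_count = 1
  csv_delimiter = ","
  ## Hypothetical columns: "host" becomes a tag, "time" supplies the timestamp.
  csv_tag_columns = ["host"]
  csv_timestamp_column = "time"
  ## Go reference-time layout, here RFC 3339.
  csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
```

Columns not listed as tags or as the timestamp (here "value") should end up as fields.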
