I configured Logstash to process CSV files from the file system and put them into Elastic for further analysis. Our ELK stack is strongly separated from the original source of the CSV files, so I thought about sending the CSV files to Logstash via HTTP instead of using the file system.
The issue: if I use the "http" input, the whole file is taken in and processed as one big bunch, and the csv filter only recognizes the first line. As mentioned, the same file works fine via the "file" input.
My Logstash config looks like this:
input {
  # http {
  #   host => "localhost"
  #   port => 8080
  # }
  file {
    path => "/media/sample_files/debit_201606.csv"
    type => "items"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => ["created", "direction", "member", "point value", "type", "sub type"]
    separator => " "
    convert => { "point value" => "integer" }
  }
  date {
    match => [ "created", "yyyy-MM-dd HH:mm:ss" ]
    timezone => "UTC"
  }
}
output {
  # elasticsearch {
  #   action => "index"
  #   hosts => ["localhost"]
  #   index => "logstash-%{+YYYY.MM.dd}"
  #   workers => 1
  # }
  stdout {
    codec => rubydebug
  }
}
My goal is to pass the CSV via curl. So I switch to the commented-out part of the input section above and use curl to pass the files:

curl http://localhost:8080/ -T /media/samples/debit_201606.csv

What do I need to do to get Logstash to process the CSV line by line?
I tried this out, and I think what you need is to split your input. Here's how to do that:
My configuration:
input {
  http {
    port => 8787
  }
}
filter {
  split {}
  csv {}
}
output {
  stdout {
    codec => rubydebug
  }
}
And for testing, I created a CSV file that looks like this:
artur@pandaadb:~/tmp/logstash$ cat test.csv
a,b,c
d,e,f
g,h,i
And to test it:
artur@pandaadb:~/dev/logstash/conf3$ curl localhost:8787 -T ~/tmp/logstash/test.csv
This outputs:
{ "message" => "a,b,c", "@version" => "1", "@timestamp" => "2016-08-01t15:27:17.477z", "host" => "127.0.0.1", "headers" => { "request_method" => "put", "request_path" => "/test.csv", "request_uri" => "/test.csv", "http_version" => "http/1.1", "http_host" => "localhost:8787", "http_user_agent" => "curl/7.47.0", "http_accept" => "*/*", "content_length" => "18", "http_expect" => "100-continue" }, "column1" => "a", "column2" => "b", "column3" => "c" } { "message" => "d,e,f", "@version" => "1", "@timestamp" => "2016-08-01t15:27:17.477z", "host" => "127.0.0.1", "headers" => { "request_method" => "put", "request_path" => "/test.csv", "request_uri" => "/test.csv", "http_version" => "http/1.1", "http_host" => "localhost:8787", "http_user_agent" => "curl/7.47.0", "http_accept" => "*/*", "content_length" => "18", "http_expect" => "100-continue" }, "column1" => "d", "column2" => "e", "column3" => "f" } { "message" => "g,h,i", "@version" => "1", "@timestamp" => "2016-08-01t15:27:17.477z", "host" => "127.0.0.1", "headers" => { "request_method" => "put", "request_path" => "/test.csv", "request_uri" => "/test.csv", "http_version" => "http/1.1", "http_host" => "localhost:8787", "http_user_agent" => "curl/7.47.0", "http_accept" => "*/*", "content_length" => "18", "http_expect" => "100-continue" }, "column1" => "g", "column2" => "h", "column3" => "i" }
What the split filter does:
It takes the input message (which is one string, including newlines) and splits it on the configured value, which by default is a newline. It then cancels the original event and re-submits the split events to Logstash. It is important that the split is executed before the csv filter.
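For reference, the delimiter that the split filter uses can be set explicitly through its terminator option. "\n" is the default, so the following sketch only makes that default visible; you would change it if your payload used a different line ending:

filter {
  split {
    # "\n" is already the default terminator; it is spelled out
    # here only to show where a different delimiter would go
    terminator => "\n"
  }
}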
I hope that answers your question!
Artur