Custom csv log format for custom nginx log format

Hi,
I want to monitor an nginx log file, and I have to deal with the following custom log format:

log_format custom '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" "$host" sn="$server_name" rt=$request_time ua="$upstream_addr" us="$upstream_status" ut="$upstream_response_time" ul="$upstream_response_length" cs=$upstream_cache_status'

Note the ‘sn=’, ‘rt=’ and other prefixes defined there.
So I need a custom csv format for the go web log plugin.
The trouble is that I don’t really understand a single word of this.

I tried escaping the equals sign with double backslashes, escaping other characters, escaping the double quotes, and putting a star into the log format string, but I always end up with this go plugin error:
bare " in non-quoted-field

[ DEBUG ] build[manager] build.go:295 building web_log[customlog] job, config: map[__provider__:file reader __source__:/etc/netdata/go.d/web_log.conf autodetection_retry:0 csv_config:map[format:$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - - "$host" \\s\\n\\=\\"-\\" \\r\\t\\\\=$request_time \\u\\a\\\\="-" \\u\\s\\\\="-" \\u\\t\\\\="$upstream_response_time" \\u\\l\\\\="-" \\c\\s\\\\=-] log_type:csv module:web_log name:customlog path:/someapp/logs/nginx-access.log priority:70000 update_every:1]
[ DEBUG ] web_log[customlog] init.go:31 skipping URL patterns creating, no patterns provided
[ DEBUG ] web_log[customlog] init.go:49 skipping custom fields creating, no custom fields provided
[ DEBUG ] web_log[customlog] init.go:74 skipping custom time fields creating, no custom time fields provided
[ DEBUG ] web_log[customlog] init.go:103 starting log reader creating
[ DEBUG ] web_log[customlog] reader.go:71 open log file: /someapp/logs/nginx-access.log
[ DEBUG ] web_log[customlog] init.go:108 created log reader, current file '/someapp/logs/nginx-access.log'
[ DEBUG ] web_log[customlog] init.go:114 starting parser creating
[ DEBUG ] web_log[customlog] init.go:120 last line: 'xx.yy.ww.zz - - [08/Jan/2021:15:39:53 +0200] "GET /static/cloud-zoom-img/css/cloud-zoom.css HTTP/2.0" 200 599 "https://www.company.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 12_4_9 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.2 Mobile/15E148 Safari/604.1" "-" "www.company.com" sn="www.company.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-'
[ DEBUG ] web_log[customlog] parser.go:83 log_type is csv, skipping auto-detection
[ DEBUG ] web_log[customlog] parser.go:86 config: {FieldsPerRecord:-1 Delimiter:32 TrimLeadingSpace:false Format:$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - - "$host" sn="-" rt=$request_time ua="-" us="-" ut="$upstream_response_time" ul="-" cs=- CheckField:0xb8fd00}
[ WARN  ] web_log[customlog] weblog.go:112 check failed: create parser: bad csv format '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - - "$host" sn="-" rt=$request_time ua="-" us="-" ut="$upstream_response_time" ul="-" cs=-': parse error on line 1, column 84: bare " in non-quoted-field
[ ERROR ] web_log[customlog] job.go:153 check failed
[ DEBUG ] web_log[customlog] reader.go:127 close log file: /someapp/logs/nginx-access.log

Please give me a hint on how to create a proper custom log format for the customized nginx log above.
Thanks!

Hey @shpokas and welcome to our community!

Have you tried following our documentation here regarding Custom Log Formats for NginX/Apache?

You only need to copy and paste the custom format into the configuration of the collector, and it will auto-detect everything on its own.

Please let me know if this works for you.

Best,
odysseas

Yes, I did, and the link you provided contains the link I provided in my first post.
In the custom log I have fields like sn="www.company.com", and this confuses the parser.


Reading that log you sent (super helpful), it seems that it doesn’t detect a custom log format.

Can you please share the configuration for the web log collector?

I think the go plugin does detect the custom log format. See this line:
[ DEBUG ] web_log[customlog] parser.go:86 config: {FieldsPerRecord:-1 Delimiter:32 TrimLeadingSpace:false Format:$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - - "$host" sn="-" rt=$request_time ua="-" us="-" ut="$upstream_response_time" ul="-" cs=- CheckField:0xb8fd00}

My last try was:

  - name: customlog
    path: /someapp/logs/nginx-access.log
    log_type: csv
    csv_config:
      format: '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - - "$host" \\s\\n\\=\\"-\\" \\r\\t\\\\=$request_time \\u\\a\\\\="-" \\u\\s\\\\="-" \\u\\t\\\\="$upstream_response_time" \\u\\l\\\\="-" \\c\\s\\\\=-'

Okay, yes you are correct.

I think that we don’t need to escape anything. Also, can you please tell me more about this syntax: ut="$upstream_response_time"? I don’t have personal experience with the web log collector.

We are getting to the bottom of it, thanks for the patience!

Cheers :slight_smile:

I inherited this server and I’d prefer not to change the nginx log format but adjust netdata instead.
I guess however that ‘ut=’ ‘ua=’ etc. are just markers for a human being to be able to differentiate between columns.

Well, in the end I decided this really is not worth the effort. I changed the nginx log format, threw out all the xx= field prefixes, and voilà - all is good!
Thanks for your help and effort, @OdysLam !

I am glad that you made it work in the end. I am curious about the syntax, though, and why our web log collector didn’t like it. I will do some research and post any results here for posterity.

Cheers :slight_smile:

@shpokas hi!

Your format looks like a mix of csv and ltsv.

weblog supports the ltsv format, so if you convert all fields of your log format (not just several of them) to

fieldName=value<space|tab>fieldName=value<space|tab>…

it should work, see Labels for Web server's Log section on the ltsv format page.
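
To illustrate what a uniform key=value line looks like once every field carries a label, here is a rough sketch of splitting such a line into a map. This is not the plugin's parser: the function name parseKeyValue, the choice to strip surrounding double quotes, and the decision to skip unlabeled tokens are all assumptions made for the example. It is also deliberately naive and does not handle values that contain spaces (a user agent string, for instance).

```go
package main

import (
	"fmt"
	"strings"
)

// parseKeyValue is a hypothetical helper: it splits a line on whitespace
// and each token on its first '='. Values containing spaces are not supported.
func parseKeyValue(line string) map[string]string {
	out := make(map[string]string)
	for _, tok := range strings.Fields(line) {
		i := strings.IndexByte(tok, '=')
		if i < 0 {
			continue // token without a label: skipped in this sketch
		}
		// Assumption: surrounding double quotes are not part of the value.
		out[tok[:i]] = strings.Trim(tok[i+1:], `"`)
	}
	return out
}

func main() {
	rec := parseKeyValue(`sn="www.company.com" rt=0.000 us="-" cs=-`)
	fmt.Println(rec["sn"], rec["rt"], rec["cs"])
}
```

The point of the sketch is that the labels only help when every field has one; a line that mixes labeled and positional fields fits neither the csv nor the key=value model.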


Changed nginx log format, threw out all xx= field prefixes and voilà - all is good!

:sweat_smile:

If you still want to switch to the key=value format, let me know!

You’re absolutely right. Of course this was a mix of formats, and I have to stick to one or the other.
Thank you for the update and clarification!