Parsing Log Files
Apache Log File
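Let's first list the important information that we may need from the Apache logs: the client IP address, the timestamp, the request line (HTTP method and path), the response status code, and the user agent.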
To read a log file, I prefer to read it line by line.
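As a rough sketch of reading a log line by line, the snippet below writes two made-up sample entries to a temporary file (the sample data and variable names are assumptions for illustration only) and then reads them back with `File.readlines` and `File.foreach`.

```ruby
require 'tempfile'

# Hypothetical sample data so the sketch runs on its own; in practice
# the path would point at a real log such as access.log.
sample = Tempfile.new('access.log')
sample.write(<<~'LOG')
  127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/5.0"
  127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /missing HTTP/1.0" 404 209 "-" "curl/7.0"
LOG
sample.close

# File.readlines loads the whole file into an array of lines.
lines = File.readlines(sample.path, chomp: true)

# File.foreach yields one line at a time, which suits large logs.
line_count = 0
File.foreach(sample.path) { |_line| line_count += 1 }
```

`File.readlines` is convenient for small files, while `File.foreach` avoids holding a large log in memory all at once.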
I was looking for a simple regular expression for Apache logs. I found one here and made a small tweak to it.
So I came up with this small method, which parses an Apache "access.log" file and converts it to an array of hashes containing the information we need.
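The original expression is not reproduced here, so the following is only one possible pattern for the combined format, written in extended mode with named captures. The constant name, the capture names, and the sample line are my own assumptions, and the pattern does not handle escaped quotes inside quoted fields.

```ruby
# A sketch of a regular expression for the Apache "combined" format.
APACHE_COMBINED = /
  \A
  (?<ip>\S+)\s                # %h   remote host
  (?<identity>\S+)\s          # %l   identd identity
  (?<user>\S+)\s              # %u   authenticated user
  \[(?<time>[^\]]+)\]\s       # %t   timestamp
  "(?<request>[^"]*)"\s       # %r   request line
  (?<status>\d{3})\s          # %>s  status code
  (?<bytes>\d+|-)\s           # %b   response size
  "(?<referer>[^"]*)"\s       # Referer header
  "(?<user_agent>[^"]*)"      # User-agent header
/x

# Hypothetical sample entry used only to exercise the pattern.
sample_line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] ' \
              '"GET /apache_pb.gif HTTP/1.0" 200 2326 ' \
              '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'

fields = APACHE_COMBINED.match(sample_line)
puts fields[:ip]
puts fields[:request]
puts fields[:status]
```

Named captures keep the later code readable, since each field is looked up by name rather than by position.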
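A minimal sketch of such a method might look like the following. The method name `parse_access_log`, the hash keys, and the sample entries are assumptions made for illustration; lines that do not fit the combined format are simply skipped.

```ruby
require 'tempfile'

# Compact pattern for the combined format; only the fields we keep are named.
APACHE_COMBINED = %r{\A(?<ip>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) (?:\d+|-) "[^"]*" "(?<user_agent>[^"]*)"}

# Parses the log at +path+ and returns an array of hashes, one per matching line.
def parse_access_log(path)
  entries = File.foreach(path).map do |line|
    match = APACHE_COMBINED.match(line)
    next unless match # skip lines that do not fit the combined format

    # The request line looks like "GET /index.html HTTP/1.1".
    http_method, uri_path, _protocol = match[:request].split(' ', 3)
    {
      ip: match[:ip],
      time: match[:time],
      method: http_method,
      path: uri_path,
      status: match[:status].to_i,
      user_agent: match[:user_agent]
    }
  end
  entries.compact
end

# Hypothetical sample log so the sketch runs without a real access.log.
log = Tempfile.new('access.log')
log.write(<<~'LOG')
  192.168.0.5 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
  not a log line
  192.168.0.9 - - [10/Oct/2000:13:56:01 -0700] "POST /login HTTP/1.1" 302 - "-" "curl/7.68.0"
LOG
log.close

entries = parse_access_log(log.path)
entries.each { |entry| puts entry }
```

With a real server, the call would be something like `parse_access_log('/var/log/apache2/access.log')`, where the path depends on the installation.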
The method returns an array of hashes, one hash per parsed log line.
Note: This assumes the Apache LogFormat is configured as LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined, which is the default configuration.
%h is the remote host (i.e. the client IP address)
%l is the identity of the user determined by identd (rarely used because it is unreliable)
%u is the user name determined by HTTP authentication
%t is the time the request was received
%r is the request line from the client (e.g. "GET / HTTP/1.0")
%>s is the status code sent from the server to the client (200, 404, etc.)
%b is the size of the response to the client (in bytes)
Referer is the page that linked to this URL
User-agent is the browser identification string
IIS Log File
Here is a basic IIS log regular expression
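The original expression is not shown here, so this is only a sketch. It assumes the log uses the W3C extended format with the field order `date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken`; check the `#Fields:` header of your own log, since the selected fields vary by server configuration. The constant name, capture names, and sample entry are assumptions.

```ruby
# A sketch of a regular expression for one IIS W3C log entry,
# assuming the 14-field order described above.
IIS_W3C = /
  \A
  (?<date>\d{4}-\d{2}-\d{2})\s
  (?<time>\d{2}:\d{2}:\d{2})\s
  (?<server_ip>\S+)\s
  (?<method>\S+)\s
  (?<uri_stem>\S+)\s
  (?<uri_query>\S+)\s
  (?<port>\d+)\s
  (?<username>\S+)\s
  (?<client_ip>\S+)\s
  (?<user_agent>\S+)\s
  (?<status>\d{3})\s
  (?<substatus>\d+)\s
  (?<win32_status>\d+)\s
  (?<time_taken>\d+)
/x

# Hypothetical sample lines; header lines in W3C logs start with '#'.
iis_lines = [
  '#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken',
  '2019-04-10 10:15:32 192.168.1.10 GET /default.aspx - 80 - 192.168.1.50 Mozilla/5.0+(Windows+NT+10.0) 200 0 0 15'
]

iis_entries = iis_lines
              .reject { |line| line.start_with?('#') } # drop header lines
              .map { |line| IIS_W3C.match(line) }
              .compact

iis_entries.each do |entry|
  puts "#{entry[:client_ip]} #{entry[:method]} #{entry[:uri_stem]} #{entry[:status]}"
end
```

IIS replaces spaces inside field values with `+`, which is why each field can be matched as a run of non-whitespace characters.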