FlyData supports the two commonly used Apache log formats: Common Log Format (CLF) and Combined Log Format.
With either format, users can upload their Apache access logs to Amazon Redshift and start analyzing their data right away. You don’t even need to create a table in Redshift, as FlyData does the job for you.
Standard log attributes
For example, the following log data in Combined Log Format will be uploaded to the Redshift table as follows:
- Log Data
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
- Data in Redshift
// apache_access_log table
------------------------------------------------------
| COLUMN         | ROW                               |
+----------------+-----------------------------------+
| ip             | 127.0.0.1                         |
| remote_logname | -                                 |
| remote_user    | frank                             |
| timestamp      | 2000-10-10 13:55:36               |
| http_method    | GET                               |
| resource       | /apache_pb.gif                    |
| protocol       | HTTP/1.0                          |
| status         | 200                               |
| size           | 2326                              |
| referrer       | http://www.example.com/start.html |
| user_agent     | Mozilla/4.08 [en] (Win98; I ;Nav) |
------------------------------------------------------
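The mapping above can be approximated in a few lines of Python. This is only an illustrative sketch, not FlyData's actual parser: the regular expression, the function name parse_access_log_line, and the choice to keep the timestamp as local wall-clock time without its UTC offset (as the table shows) are all assumptions.

```python
import re
from datetime import datetime

# Hypothetical pattern for Common/Combined Log Format lines.
# The referrer and user agent are optional, so plain CLF lines match too.
COMBINED_LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<remote_logname>\S+) (?P<remote_user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<http_method>\S+) (?P<resource>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)")?'
)


def parse_access_log_line(line):
    """Return a dict keyed by the column names in the table, or None."""
    match = COMBINED_LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    # Assumption: the stored timestamp drops the UTC offset, as in the table.
    parsed = datetime.strptime(row["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    row["timestamp"] = parsed.strftime("%Y-%m-%d %H:%M:%S")
    row["status"] = int(row["status"])
    # Apache writes "-" when no response body was sent.
    row["size"] = None if row["size"] == "-" else int(row["size"])
    return row
```

Calling parse_access_log_line on the sample line yields one entry per column in the table. For a line in plain Common Log Format, referrer and user_agent come back as None.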
Request query parameters
In addition to the standard log attributes described above, FlyData also lets users store the values of request query parameters (for instance, ?user_id=123) in their own corresponding columns in the Redshift table.
If the column for a query parameter does not exist yet, FlyData creates it automatically, so the user doesn’t have to worry about the table definition.
- Log Data
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /purchase?user_id=293&item_id=201 HTTP/1.0" 200 2326 "http://www.example.com/store.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
- Data in Redshift
// apache_access_log table
------------------------------------------------------
| COLUMN         | ROW                               |
+----------------+-----------------------------------+
| ip             | 127.0.0.1                         |
| remote_logname | -                                 |
| remote_user    | frank                             |
| timestamp      | 2000-10-10 13:55:36               |
| http_method    | GET                               |
| resource       | /purchase?user_id=293&item_id=201 |
| protocol       | HTTP/1.0                          |
| status         | 200                               |
| size           | 2326                              |
| referrer       | http://www.example.com/store.html |
| user_agent     | Mozilla/4.08 [en] (Win98; I ;Nav) |
| user_id        | 293                               |
| item_id        | 201                               |
------------------------------------------------------
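As a rough illustration of how query parameters could become columns, the sketch below uses Python's standard urllib.parse module. The helper names query_params_as_columns and missing_columns are hypothetical and do not reflect FlyData's internals. The second helper only reports which columns a table would still need.

```python
from urllib.parse import parse_qsl, urlsplit


def query_params_as_columns(resource):
    """Map each query parameter of a request path to a column value."""
    query = urlsplit(resource).query
    # keep_blank_values keeps parameters such as "?flag=" instead of dropping them.
    return dict(parse_qsl(query, keep_blank_values=True))


def missing_columns(existing_columns, row):
    """List the keys of `row` that are not yet columns of the table."""
    return [name for name in row if name not in existing_columns]
```

For the resource in the log line above, query_params_as_columns returns the user_id and item_id values as strings. Comparing the result against the standard columns with missing_columns shows which columns would have to be added.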
FlyData extended format
In the event that the user has custom data that fits neither into the standard parameters nor into the request query parameters, they can use the FlyData Extended Log Format.
Extended Log Format appends one double-quoted string to the end of a Common Log Format or Combined Log Format line. The string must consist of key=value pairs joined with &, which is the same format as the request query parameters.
- Log Data
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /purchase?user_id=293&item_id=201 HTTP/1.0" 200 2326 "http://www.example.com/store.html" "Mozilla/4.08 [en] (Win98; I ;Nav)" "session_id=rfnq17675gtrfejbtc46n0vi97&response_time=7"
Here, the last double-quoted string, which carries session_id and response_time, is in the FlyData Extended Log Format. During upload, FlyData will create columns for them in the Amazon Redshift table.
- Data in Redshift
// apache_access_log table
------------------------------------------------------
| COLUMN         | ROW                               |
+----------------+-----------------------------------+
| ip             | 127.0.0.1                         |
| remote_logname | -                                 |
| remote_user    | frank                             |
| timestamp      | 2000-10-10 13:55:36               |
| http_method    | GET                               |
| resource       | /purchase?user_id=293&item_id=201 |
| protocol       | HTTP/1.0                          |
| status         | 200                               |
| size           | 2326                              |
| referrer       | http://www.example.com/store.html |
| user_agent     | Mozilla/4.08 [en] (Win98; I ;Nav) |
| user_id        | 293                               |
| item_id        | 201                               |
| session_id     | rfnq17675gtrfejbtc46n0vi97        |
| response_time  | 7                                 |
------------------------------------------------------
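One way to picture the extended format is as an extra quoted field after the standard ones. The sketch below is an assumption-laden illustration, not FlyData's implementation: the function name extended_fields is hypothetical, and it decides whether an extension is present purely by counting the quoted fields after the request (one for CLF plus extension, three for Combined plus extension).

```python
import re
from urllib.parse import parse_qsl

# Matches double-quoted fields; does not handle escaped quotes (\") inside a field.
QUOTED_FIELD = re.compile(r'"([^"]*)"')


def extended_fields(line):
    """Return the key=value pairs of a trailing extended field, if any."""
    quoted = QUOTED_FIELD.findall(line)
    trailing = quoted[1:]  # skip the first quoted field, the request line
    # Assumption: CLF has 0 trailing quoted fields and Combined has 2,
    # so 1 or 3 means an extended field was appended.
    if len(trailing) not in (1, 3):
        return {}
    return dict(parse_qsl(trailing[-1], keep_blank_values=True))
```

Applied to the extended log line above, this returns the session_id and response_time values as strings. A plain Combined Log Format line yields an empty dict.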