Python log parsing
setdefault(key[, default]): If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
>>> d = {}
>>> d['key'] = 'a'
>>> d
{'key': 'a'}
>>> d.setdefault('key', 'b')  # 'key' exists, so its current value 'a' is returned.
'a'
>>> d
{'key': 'a'}
>>> d.setdefault('key0', 'b')  # 'key0' is missing, so it is inserted with the value 'b'.
'b'
>>> d
{'key': 'a', 'key0': 'b'}
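The same call is also handy for grouping values under a key: passing an empty list as the default means the first lookup creates the list and later lookups reuse it. A minimal sketch of that pattern follows; the `pairs` data and the `grouped` name are made up for illustration.

```python
# Hypothetical (host, bytes) pairs, invented for illustration only.
pairs = [("10.0.0.1", 100), ("10.0.0.2", 50), ("10.0.0.1", 25)]

grouped = {}
for host, size in pairs:
    # setdefault returns the list already stored for this host, or
    # inserts a new empty list and returns that one.
    grouped.setdefault(host, []).append(size)

print(grouped)
```

Note that the default argument is evaluated on every call, even when the key already exists, so a new empty list is built and discarded each time the host has been seen before.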
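Log analysis
A script that uses a dictionary to analyze an Apache access log, extracting the IP address, byte count, and status of each request.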
#!/usr/bin/env python
"""
USAGE:
apache_log.py some_log_file

This script takes one command line argument: the name of a log file to parse.
It then parses the log file and generates a report which associates remote
hosts with the number of bytes transferred to them.
"""

import sys


def dictify_logline(line):
    # Field positions assume the Apache common/combined log format:
    # 0 = remote host, 8 = status code, 9 = bytes sent.
    split_line = line.split()
    return {'remote_host': split_line[0],
            'status': split_line[8],
            'bytes_sent': split_line[9]}


def generate_log_report(logfile):
    report_dict = {}
    for line in logfile:
        line_dict = dictify_logline(line)
        print(line_dict)
        try:
            bytes_sent = int(line_dict['bytes_sent'])
        except ValueError:
            # The byte count is "-" when no body was sent; skip such lines.
            continue
        # Collect every byte count seen for this host.
        report_dict.setdefault(line_dict['remote_host'], []).append(bytes_sent)
    return report_dict


if __name__ == "__main__":
    if not len(sys.argv) > 1:
        print(__doc__)
        sys.exit(1)
    infile_name = sys.argv[1]
    try:
        infile = open(infile_name, 'r')
    except IOError:
        print("You must specify a valid file to parse")
        sys.exit(1)
    log_report = generate_log_report(infile)
    print(log_report)
    infile.close()
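The report maps each remote host to a list of byte counts, so a natural next step is to total them per host. The following is a minimal sketch of that step and is not part of the original script: the sample log lines, the `total_bytes_by_host` helper name, and the field positions (0 for the host, 9 for the byte count, as in the Apache common log format) are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample lines in Apache common log format (invented data).
SAMPLE_LINES = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /a.gif HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:13:55:40 -0700] "GET /b.gif HTTP/1.0" 200 100',
    '10.0.0.1 - - [10/Oct/2000:13:56:01 -0700] "GET /c.gif HTTP/1.0" 304 -',
    '10.0.0.1 - - [10/Oct/2000:13:56:09 -0700] "GET /d.gif HTTP/1.0" 200 74',
]


def total_bytes_by_host(lines):
    """Return {remote_host: total bytes sent}, skipping unparsable lines."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        try:
            # Assumed layout: field 0 is the host, field 9 the byte count.
            host, size = fields[0], int(fields[9])
        except (IndexError, ValueError):
            # Skip short lines and lines whose byte column is "-".
            continue
        totals[host] += size
    return dict(totals)


print(total_bytes_by_host(SAMPLE_LINES))
```

Here `defaultdict(int)` plays the role that `setdefault` plays in the script: a missing host starts at zero instead of raising `KeyError`. Either approach works; `setdefault` keeps the individual values, while this version keeps only the running sum.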