
[Experience Sharing] Cloud Computing in Practice (Massive Log Management): hadoop + scribe -- scribe Configuration Explained


Posted on 2016-12-13 10:09:42
  


Scribe can be configured with:


  • the file specified in the -c command line option
  • the file at DEFAULT_CONF_FILE_LOCATION in env_default.h

Global Configuration Variables

port: assigned to variable “port”


  • which port the scribe server will listen on
  • default 0, passed at command line with -p, can also be set in conf file

max_msg_per_second:


  • used in scribeHandler::throttleDeny
  • default 0
  • this parameter is ignored if the value is 0. With recent changes it has become less relevant; max_queue_size should be used for throttling instead

max_queue_size: in bytes


  • used in scribeHandler::Log
  • default 5,000,000 bytes

check_interval: in seconds


  • used to control how often to check each store
  • default 5

new_thread_per_category: yes/no


  • If yes, will create a new thread for every category seen. Otherwise, will only create a single thread for every store defined in the configuration.
  • For prefix stores or the default store, setting this parameter to “no” will cause all messages that match this category to get processed by a single store. Otherwise, a new store will be created for each unique category name.
  • default yes

num_thrift_server_threads:


  • Number of threads listening for incoming messages
  • default 3

Example:


port=1463
max_msg_per_second=2000000
max_queue_size=10000000
check_interval=3
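
To make the interaction between the configuration file and the documented defaults concrete, here is a minimal Python sketch. It is not Scribe's own parser: parse_global_config is a hypothetical helper, and skipping lines that start with "#" is an assumption about comment handling. The default values are the ones listed above.

```python
# Defaults as listed in the "Global Configuration Variables" section.
DEFAULTS = {
    "port": 0,
    "max_msg_per_second": 0,
    "max_queue_size": 5_000_000,
    "check_interval": 5,
    "new_thread_per_category": "yes",
    "num_thrift_server_threads": 3,
}


def parse_global_config(text):
    """Hypothetical helper: overlay key=value lines on the documented defaults."""
    config = dict(DEFAULTS)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and "#" comment lines carry no settings.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        # Keep numeric settings as int, leave everything else as a string.
        config[key] = int(value) if value.isdigit() else value
    return config


example = """
port=1463
max_msg_per_second=2000000
max_queue_size=10000000
check_interval=3
"""
settings = parse_global_config(example)
print(settings["port"], settings["num_thrift_server_threads"])
```

Settings that are absent from the file, such as num_thrift_server_threads here, keep their default values.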

Store Configuration

Scribe Server determines how to log messages based on the Stores defined in the configuration. Every store must specify what message category it handles with three exceptions:
default store: The ‘default’ category handles any category that is not handled by any other store. There can only be one default store.


  • category=default

prefix stores: If the specified category ends in a *, the store will handle all categories that begin with the specified prefix.


  • category=web*

multiple categories: Can use ‘categories=’ to create multiple stores with a single store definition.


  • categories=rock paper* scissors

In the above three cases, Scribe will create a subdirectory for each unique category in File Stores (unless new_thread_per_category is set to “no”).
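
The three special cases can be illustrated with a small Python sketch of how a category name might be mapped to a store pattern. The function names are hypothetical, and the precedence used here (exact match first, then the longest matching prefix, then the default store) is an assumption made for illustration, not something this page specifies.

```python
def expand_categories(value):
    """Split a 'categories=' value into one pattern per store."""
    return value.split()


def match_pattern(category, patterns):
    """Return the pattern that would handle `category`, or None."""
    # Exact category names win (assumed precedence).
    if category in patterns:
        return category
    # Prefix stores: a trailing '*' matches any category with that prefix.
    prefixes = [p for p in patterns
                if p.endswith("*") and category.startswith(p[:-1])]
    if prefixes:
        # Assumption: prefer the most specific (longest) prefix.
        return max(prefixes, key=len)
    # The default store handles everything that is left.
    return "default" if "default" in patterns else None


patterns = expand_categories("rock paper* scissors") + ["web*", "default"]
for name in ["rock", "paperclip", "web_search", "unknown"]:
    print(name, "->", match_pattern(name, patterns))
```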

Store Configuration Variables

category: Determines which messages are handled by this store
type:


  • file
  • buffer
  • network
  • bucket
  • thriftfile
  • null
  • multi

target_write_size: 16,384 bytes by default


  • determines how large to let the message queue grow for a given category before processing the messages

max_batch_size: 1,024,000 bytes by default (may not be in open-source yet)


  • determines the amount of data from the in-memory store queue to be handled at a time. In practice, this (together with buffer file rotation size) controls how big a thrift call can be.

max_write_interval: 10 seconds by default


  • determines how long to let the messages queue for a given category before processing the messages

must_succeed: yes/no


  • Whether to requeue messages and retry if a store failed to process messages.
  • If set to ‘no’, messages will be dropped if the store cannot process them.
  • Note: We recommend using Buffer Stores to specify a secondary store to handle logging failures.
  • default yes

Example:

<store>
category=statistics
type=file
target_write_size=20480
max_write_interval=2
</store>
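
Store definitions are blocks of key=value lines wrapped in tags, and some store types nest further tagged blocks inside them. As a rough illustration of that layout (not Scribe's actual parser), the following Python sketch reads such text into nested dictionaries. parse_blocks is a hypothetical helper, and the nested <primary> block in the sample is only there to show nesting.

```python
def parse_blocks(text):
    """Parse <name> ... </name> blocks of key=value lines into nested dicts."""
    root = {}
    stack = [("", root)]  # (tag name, block) pairs; the root has no tag
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("</") and line.endswith(">"):
            name, _block = stack.pop()
            if not name or name != line[2:-1]:
                raise ValueError("mismatched closing tag: " + line)
        elif line.startswith("<") and line.endswith(">"):
            block = {}
            # Several blocks may share a tag name (e.g. <store>), so keep a list.
            stack[-1][1].setdefault(line[1:-1], []).append(block)
            stack.append((line[1:-1], block))
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][1][key.strip()] = value.strip()
    if len(stack) != 1:
        raise ValueError("unclosed block: " + stack[-1][0])
    return root


sample = """
<store>
category=statistics
type=file
target_write_size=20480
max_write_interval=2
</store>
<store>
category=default
type=buffer
<primary>
type=network
remote_host=wopr
</primary>
</store>
"""
parsed = parse_blocks(sample)
print(parsed["store"][0]["type"], parsed["store"][1]["primary"][0]["remote_host"])
```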

File Store Configuration

File Stores write messages to a file.
file_path: defaults to “/tmp”
base_filename: defaults to category name
use_hostname_sub_directory: yes/no, default no


  • Create a subdirectory using the server’s hostname

sub_directory: string


  • Create a subdirectory with the specified name

rotate_period: “hourly”, “daily”, “never”, or number[suffix]; “never” by default


  • determines how often to create new files
  • suffix may be “s”, “m”, “h”, “d”, “w” for seconds (the default), minutes, hours, days and weeks, respectively

rotate_hour: 0-23, 1 by default


  • if rotate_period is daily, determines what hour of day to rotate

rotate_minute: 0-59, 15 by default


  • if rotate_period is daily or hourly, determines how many minutes after the hour to rotate

max_size: 1,000,000,000 bytes by default


  • determines approximately how large to let a file grow before rotating to a new file

write_meta: “yes” or anything else; false by default


  • if the file was rotated, the last line will contain "scribe_meta: " followed by the next filename

fs_type: supports two types “std” and “hdfs”. “std” by default
chunk_size: 0 by default. If a chunk size is specified, no messages within the file will cross chunk boundaries unless there are messages larger than the chunk size
add_newlines: 0 or 1, 0 by default


  • if set to 1, will write a newline after every message

create_symlink: “yes” or anything else; “yes” by default


  • if true, will maintain a symlink that points to the most recently written file

write_stats: yes/no, yes by default


  • whether to create a scribe_stats file for each store to keep track of files written

max_write_size: 1,000,000 bytes by default. The file store tries to flush data to the file system in chunks of max_write_size bytes. max_write_size cannot be more than max_size. For example, suppose a number of messages were buffered because of target_write_size and the file store is then called to save them: it writes them out in chunks of at least max_write_size bytes at a time, and only the last write may be smaller than max_write_size.
Example:

<store>
category=sprockets
type=file
file_path=/tmp/sprockets
base_filename=sprockets_log
max_size=1000000
add_newlines=1
rotate_period=daily
rotate_hour=0
rotate_minute=10
max_write_size=4096
</store>
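
The number[suffix] form of rotate_period can be read as a duration in seconds. The Python sketch below shows one way to interpret the accepted values. parse_rotate_period is a hypothetical helper, and returning a (mode, seconds) pair is a design choice made for this example only: the keywords are kept as named modes because their timing depends on rotate_hour and rotate_minute rather than on a fixed interval.

```python
import re

# Suffix multipliers from the rotate_period description; "s" is the default.
SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_rotate_period(value):
    """Return (mode, seconds); seconds is None for the keyword modes."""
    value = value.strip()
    if value in ("never", "hourly", "daily"):
        return value, None
    match = re.fullmatch(r"(\d+)([smhdw]?)", value)
    if match is None:
        raise ValueError("unrecognised rotate_period: " + value)
    number, suffix = match.groups()
    return "interval", int(number) * SUFFIX_SECONDS[suffix or "s"]


for text in ["never", "daily", "90", "30m", "2w"]:
    print(text, "->", parse_rotate_period(text))
```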

Network Store Configuration

Network Stores forward messages to other Scribe Servers. Scribe keeps persistent connections open as long as it is able to send messages. (It will only re-open a connection on error or if the downstream machine is overloaded.) Scribe will send messages in batches during normal operation based on how many messages are currently sitting in the queue waiting to be sent. (If Scribe is backed up and buffering messages to local disk, Scribe will send messages in chunks based on the buffer file sizes.)
remote_host: name or ip of remote host to forward messages
remote_port: port number on remote host
timeout: socket timeout, in milliseconds; defaults to DEFAULT_SOCKET_TIMEOUT_MS, which is set to 5000 in store.h
use_conn_pool: “yes” or anything else; defaults to false


  • whether to use connection pooling instead of opening up multiple connections to each remote host

Example:

<store>
category=default
type=network
remote_host=hal
remote_port=1465
</store>

Buffer Store Configuration

Buffer Stores must have two sub-stores named “primary” and “secondary”. Buffer Stores will first attempt to log messages to the primary store and only log to the secondary if the primary is not available. Once the primary store comes back online, a Buffer Store will read messages out of the secondary store and send them to the primary store (unless replay_buffer=no). Only stores that are readable (stores that implement the readOldest() method) may be used as the secondary store. Currently, the only readable stores are File Stores and Null Stores.
max_queue_length: 2,000,000 messages by default


  • if the number of messages in the queue exceeds this value, the buffer store will switch to writing to the secondary store

buffer_send_rate: 1 by default


  • determines, for each check_interval, how many times to read a group of messages from the secondary store and send them to the primary store

retry_interval: 300 seconds by default


  • how long to wait to retry sending to the primary store after failing to write to the primary store

retry_interval_range: 60 seconds by default


  • will randomly pick a retry interval that is within this range of the specified retry_interval

replay_buffer: yes/no, default yes


  • If set to ‘no’, Buffer Store will not remove messages from the secondary store and send them to the primary store

Example:

<store>
category=default
type=buffer
buffer_send_rate=1
retry_interval=30
retry_interval_range=10
<primary>
type=network
remote_host=wopr
remote_port=1456
</primary>
<secondary>
type=file
file_path=/tmp
base_filename=thisisoverwritten
max_size=10000000
</secondary>
</store>
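
As an illustration of how retry_interval and retry_interval_range work together, the Python sketch below picks a randomized wait. It assumes the range is centred on retry_interval (half below, half above). The text above only says the value falls "within this range" of retry_interval, so the centring and the pick_retry_interval helper are assumptions.

```python
import random


def pick_retry_interval(retry_interval=300, retry_interval_range=60, rng=random):
    """Pick a wait in seconds inside a window centred on retry_interval."""
    half = retry_interval_range / 2
    # Assumption: the window is [retry_interval - half, retry_interval + half].
    return rng.uniform(retry_interval - half, retry_interval + half)


rng = random.Random(0)  # seeded only so the demonstration is repeatable
samples = [pick_retry_interval(30, 10, rng) for _ in range(5)]
print([round(s, 1) for s in samples])
```

Randomizing the wait keeps many upstream servers from retrying a recovered primary at the same moment.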

Bucket Store Configuration

Bucket Stores will hash messages to multiple files using a prefix of each message as the key.
You can define each bucket implicitly (using a single ‘bucket’ definition) or explicitly (using a bucket definition for every bucket). Bucket Stores that are defined implicitly must have a substore named “bucket” that is either a File Store, Network Store or ThriftFile Store (see examples).
num_buckets: defaults to 1


  • number of buckets to hash into
  • messages that cannot be hashed into any bucket will be put into a special bucket number 0

bucket_type: “key_hash”, “key_modulo”, or “random”
delimiter: must be an ASCII code between 1 and 255; otherwise the default delimiter is ‘:’


  • The message prefix up to (but not including) the first occurrence of the delimiter will be used as the key to do the hash/modulo. ‘random’ hashing does not use a delimiter.

remove_key: yes/no, defaults to no


  • whether to remove the key prefix from each message.

bucket_subdir: the name of each subdirectory will be this name followed by the bucket number if a single ‘bucket’ definition is used
Example:

<store>
category=bucket_me
type=bucket
num_buckets=5
bucket_subdir=bucket
bucket_type=key_hash
delimiter=58
<bucket>
type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=bucket_me
</bucket>
</store>
Instead of using a single ‘bucket’ definition for all buckets, you can specify each bucket explicitly:

<store>
category=bucket_me
type=bucket
num_buckets=2
bucket_type=key_hash
<bucket0>
type=file
fs_type=std
file_path=/tmp/scribetest/bucket0
base_filename=bucket0
</bucket0>
<bucket1>
...
</bucket1>
<bucket2>
...
</bucket2>
</store>

You can also bucket into network stores:

<store>
category=bucket_me
type=bucket
num_buckets=2
bucket_type=random
<bucket0>
type=file
fs_type=std
file_path=/tmp/scribetest/bucket0
base_filename=bucket0
</bucket0>
<bucket1>
type=network
remote_host=wopr
remote_port=1463
</bucket1>
<bucket2>
type=network
remote_host=hal
remote_port=1463
</bucket2>
</store>
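
The key extraction and bucket numbering described for Bucket Stores can be sketched in Python as follows. This is an illustration under several assumptions, not Scribe's implementation: zlib.crc32 stands in for whatever hash function Scribe uses, regular buckets are assumed to be numbered 1 through num_buckets (with 0 reserved for messages that cannot be hashed, as described above), and a message without the delimiter is treated as unhashable. pick_bucket and strip_key are hypothetical names.

```python
import random
import zlib


def pick_bucket(message, num_buckets, bucket_type="key_hash",
                delimiter=":", rng=random):
    """Return (bucket_number, key); bucket 0 means 'could not be hashed'."""
    if bucket_type == "random":
        # 'random' does not use a delimiter or a key.
        return rng.randint(1, num_buckets), None
    key, found, _rest = message.partition(delimiter)
    if not found:
        return 0, None  # assumption: no delimiter means no usable key
    if bucket_type == "key_modulo":
        try:
            return int(key) % num_buckets + 1, key
        except ValueError:
            return 0, None  # non-numeric key cannot be used for modulo
    if bucket_type == "key_hash":
        # crc32 is a stand-in hash, chosen only because it is deterministic.
        return zlib.crc32(key.encode("utf-8")) % num_buckets + 1, key
    raise ValueError("unknown bucket_type: " + bucket_type)


def strip_key(message, delimiter=":"):
    """What remove_key=yes would leave: the message without its key prefix."""
    _key, found, rest = message.partition(delimiter)
    return rest if found else message


delimiter = chr(58)  # delimiter=58 in the config is the ASCII code for ':'
print(pick_bucket("7:page_view", 5, "key_modulo", delimiter))
print(pick_bucket("no delimiter here", 5, "key_hash", delimiter))
print(strip_key("user42:clicked", delimiter))
```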

Null Store Configuration

Null Stores can be used to tell Scribe to ignore all messages of a given category.
(no configuration parameters)
Example:

<store>
category=tps_report*
type=null
</store>

Multi Store Configuration

A Multi Store is a store that will forward all messages to multiple sub-stores.
A Multi Store may have any number of substores named “store0”, “store1”, “store2”, etc.
report_success: “all” or “any”, defaults to “all”


  • whether all substores or any substores must succeed in logging a message in order for the Multi Store to report the message logging as successful

Example:

<store>
category=default
type=multi
target_write_size=20480
max_write_interval=1
<store0>
type=file
file_path=/tmp/store0
</store0>
<store1>
type=file
file_path=/tmp/store1
</store1>
</store>
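
The difference between the two report_success settings comes down to how the per-substore results are combined. A minimal Python sketch, with multi_store_succeeded as a hypothetical helper name:

```python
def multi_store_succeeded(substore_results, report_success="all"):
    """Combine per-substore success flags the way report_success describes."""
    if report_success == "all":
        return all(substore_results)
    if report_success == "any":
        return any(substore_results)
    raise ValueError("report_success must be 'all' or 'any'")


results = [True, False]  # e.g. store0 succeeded, store1 failed
print(multi_store_succeeded(results, "all"))
print(multi_store_succeeded(results, "any"))
```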

Thriftfile Store Configuration

A Thriftfile store is similar to a File store except that it stores messages in a Thrift TFileTransport file.
file_path: defaults to “/tmp”
base_filename: defaults to category name
rotate_period: “hourly”, “daily”, “never”, or number[suffix]; “never” by default


  • determines how often to create new files
  • suffix may be “s”, “m”, “h”, “d”, “w” for seconds (the default), minutes, hours, days and weeks, respectively

rotate_hour: 0-23, 1 by default


  • if rotate_period is daily, determines what hour of day to rotate

rotate_minute: 0-59, 15 by default


  • if rotate_period is daily or hourly, determines how many minutes after the hour to rotate

max_size: 1,000,000,000 bytes by default


  • determines approximately how large to let a file grow before rotating to a new file

fs_type: currently only “std” is supported; “std” by default
chunk_size: 0 by default


  • if a chunk size is specified, no messages within the file will cross chunk boundaries unless there are messages larger than the chunk size

create_symlink: “yes” or anything else; “yes” by default


  • if true, will maintain a symlink that points to the most recently written file

flush_frequency_ms: milliseconds, will use TFileTransport default of 3000ms if not specified


  • determines how frequently to sync the Thrift file to disk

msg_buffer_size: in bytes, will use TFileTransport default of 0 if not specified


  • if non-zero, store will reject any writes larger than this size

Example:

<store>
category=sprockets
type=thriftfile
file_path=/tmp/sprockets
base_filename=sprockets_log
max_size=1000000
flush_frequency_ms=2000
</store>





Last edited by zshao, September 13, 2010
