Hadoop源码之JobTracker
转自:http://blog.csdn.net/wwtang9527/article/details/8330472JobTracker是Map/Reducer中任务调度的服务器。
1、有如下线程为其服务:
1)提供两组RPC服务(InterTrackerProtocol、JobSubmissionProtocol)的1个Listener线程与默认10个Handler线程;
2)提供任务执行情况查询的一组web服务线程,包括Socker Listener等;
3)ExpireTrackers:用来停止已经无效的TaskTracker服务;
view plaincopyprint?
[*]
synchronized(taskTrackers){
[*]
synchronized(trackerExpiryQueue){
[*]
longnow=System.currentTimeMillis();
[*]TaskTrackerStatusleastRecent=null;
[*]
while((trackerExpiryQueue.size()>0)&&
[*]((leastRecent=(TaskTrackerStatus)trackerExpiryQueue.first())!=null)&&
[*](now-leastRecent.getLastSeen()>TASKTRACKER_EXPIRY_INTERVAL)){
[*]
[*]//Removeprofilefromheadofqueue
[*]trackerExpiryQueue.remove(leastRecent);
[*]StringtrackerName=leastRecent.getTrackerName();
[*]
[*]//Figureoutiflast-seentimeshouldbeupdated,oriftrackerisdead
[*]TaskTrackerStatusnewProfile=(TaskTrackerStatus)taskTrackers.get(leastRecent.getTrackerName());
[*]//ItemsmightleavethetaskTrackersetthroughothermeans;the
[*]//statusstoredin'taskTrackers'mightbenull,whichmeansthe
[*]//trackerhasalreadybeendestroyed.
[*]
if(newProfile!=null){
[*]
if(now-newProfile.getLastSeen()>TASKTRACKER_EXPIRY_INTERVAL){
[*]//Removecompletely
[*]updateTaskTrackerStatus(trackerName,null);
[*]lostTaskTracker(leastRecent.getTrackerName());
[*]}else{
[*]
//Updatetimebyinsertinglatestprofile
[*]trackerExpiryQueue.add(newProfile);
[*]}
[*]}
[*]}
[*]}
[*]}
synchronized (taskTrackers) {
synchronized (trackerExpiryQueue) {
long now = System.currentTimeMillis();
TaskTrackerStatus leastRecent = null;
while ((trackerExpiryQueue.size() > 0) &&
((leastRecent = (TaskTrackerStatus) trackerExpiryQueue.first()) != null) &&
(now - leastRecent.getLastSeen() > TASKTRACKER_EXPIRY_INTERVAL)) {
// Remove profile from head of queue
trackerExpiryQueue.remove(leastRecent);
String trackerName = leastRecent.getTrackerName();
// Figure out if last-seen time should be updated, or if tracker is dead
TaskTrackerStatus newProfile = (TaskTrackerStatus) taskTrackers.get(leastRecent.getTrackerName());
// Items might leave the taskTracker set through other means; the
// status stored in 'taskTrackers' might be null, which means the
// tracker has already been destroyed.
if (newProfile != null) {
if (now - newProfile.getLastSeen() > TASKTRACKER_EXPIRY_INTERVAL) {
// Remove completely
updateTaskTrackerStatus(trackerName, null);
lostTaskTracker(leastRecent.getTrackerName());
} else {
// Update time by inserting latest profile
trackerExpiryQueue.add(newProfile);
}
}
}
}
}
4)RetireJobs:用来除去已经完成任务的TaskTracker;
view plaincopyprint?
[*]
synchronized(jobs){
[*]
synchronized(jobInitQueue){
[*]
synchronized(jobsByArrival){
[*]
for(Iteratorit=jobs.keySet().iterator();it.hasNext();){
[*]Stringjobid=(String)it.next();
[*]JobInProgressjob=(JobInProgress)jobs.get(jobid);
[*]
if(job.getStatus().getRunState()!=JobStatus.RUNNING&&
[*]job.getStatus().getRunState()!=JobStatus.PREP&&
[*](job.getFinishTime()+RETIRE_JOB_INTERVAL<System.currentTimeMillis())){
[*]it.remove();
[*]
[*]jobInitQueue.remove(job);
[*]jobsByArrival.remove(job);
[*]}
[*]}
[*]}
[*]}
[*]}
synchronized (jobs) {
synchronized (jobInitQueue) {
synchronized (jobsByArrival) {
for (Iterator it = jobs.keySet().iterator(); it.hasNext(); ) {
String jobid = (String) it.next();
JobInProgress job = (JobInProgress) jobs.get(jobid);
if (job.getStatus().getRunState() != JobStatus.RUNNING &&
job.getStatus().getRunState() != JobStatus.PREP &&
(job.getFinishTime() + RETIRE_JOB_INTERVAL < System.currentTimeMillis())) {
it.remove();
jobInitQueue.remove(job);
jobsByArrival.remove(job);
}
}
}
}
}
5)JobInitThread:用来对job做一些初始化的工作;
view plaincopyprint?
[*]
synchronized(jobInitQueue){
[*]
if(jobInitQueue.size()>0){
[*]job=(JobInProgress)jobInitQueue.elementAt(0);
[*]jobInitQueue.remove(job);
[*]}else{
[*]
try{
[*]jobInitQueue.wait(JOBINIT_SLEEP_INTERVAL);
[*]}catch(InterruptedExceptioniex){
[*]}
[*]}
[*]}
[*]
try{
[*]
if(job!=null){
[*]job.initTasks();
[*]}
[*]}catch(Exceptione){
[*]LOG.log(Level.WARNING,"jobinitfailed",e);
[*]job.kill();
[*]}
synchronized (jobInitQueue) {
if (jobInitQueue.size() > 0) {
job = (JobInProgress) jobInitQueue.elementAt(0);
jobInitQueue.remove(job);
} else {
try {
jobInitQueue.wait(JOBINIT_SLEEP_INTERVAL);
} catch (InterruptedException iex) {
}
}
}
try {
if (job != null) {
job.initTasks();
}
} catch (Exception e) {
LOG.log(Level.WARNING, "job init failed", e);
job.kill();
}
2、实现了两组rpc服务(协议),其中InterTrackerProtocol如下:
1)TaskTracker间隔几秒钟发送的心跳服务;
view plaincopyprint?
[*]
intemitHeartbeat(TaskTrackerStatusstatus,booleaninitialContact);
int emitHeartbeat(TaskTrackerStatus status, boolean initialContact);
2)向JobTracker获取新的任务;
view plaincopyprint?
[*]TaskpollForNewTask(StringtrackerName);
Task pollForNewTask(String trackerName);
3)询问JobTracker,任务是否可以终结;
view plaincopyprint?
[*]StringpollForTaskWithClosedJob(StringtrackerName);
String pollForTaskWithClosedJob(String trackerName);
4)Reduce Task询问JobTracker,哪些Map Task已经结束;
view plaincopyprint?
[*]MapOutputLocation[]locateMapOutputs(StringtaskId,String[][]mapTasksNeeded);
MapOutputLocation[] locateMapOutputs(String taskId, String[][] mapTasksNeeded);
5)获取文件系统名;
view plaincopyprint?
[*]
publicStringgetFilesystemName()throwsIOException;
public String getFilesystemName() throws IOException;
JobSubmissionProtocol如下:
1)提交一个待执行的job;
view plaincopyprint?
[*]
publicJobStatussubmitJob(StringjobFile)throwsIOException;
public JobStatus submitJob(String jobFile) throws IOException;
2)杀死一个job;
view plaincopyprint?
[*]
publicvoidkillJob(Stringjobid);
public void killJob(String jobid);
3)获取job的名字、id等信息;
view plaincopyprint?
[*]
publicJobProfilegetJobProfile(Stringjobid);
public JobProfile getJobProfile(String jobid);
4)获取job的状态;
view plaincopyprint?
[*]
publicJobStatusgetJobStatus(Stringjobid);
public JobStatus getJobStatus(String jobid);
5)获取Map任务的报告;
view plaincopyprint?
[*]
publicTaskReport[]getMapTaskReports(Stringjobid);
public TaskReport[] getMapTaskReports(String jobid);
6)获取Reduce任务的报告;
view plaincopyprint?
[*]
publicTaskReport[]getReduceTaskReports(Stringjobid);
页:
[1]