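Apache Hadoop 2.7.1 YARN: Overview
- The architecture introduced in hadoop-0.23 splits the JobTracker's two major functions, resource management and job life-cycle management, into separate components.
- The ResourceManager handles the global assignment of compute resources to applications; each application's ApplicationMaster handles that application's scheduling and coordination.
- An application is either a single job in the classic MapReduce sense or a DAG of such jobs.
- The ResourceManager and the per-machine NodeManager daemons, each managing the user processes on its own machine, form the computation fabric.
- The per-application ApplicationMaster is, in effect, a framework-specific library: it negotiates resources from the ResourceManager and works with the NodeManager(s) to execute and monitor tasks.
- More details can be found in the Architecture document.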
MapReduce NextGen aka YARN aka MRv2
The new architecture introduced in hadoop-0.23 divides the two major functions of the JobTracker, resource management and job life-cycle management, into separate components.
The new ResourceManager manages the global assignment of compute resources to applications, and the per-application ApplicationMaster manages the application’s scheduling and coordination.
An application is either a single job in the sense of classic MapReduce jobs or a DAG of such jobs.
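To make the "single job or DAG of jobs" definition concrete, here is a minimal Python sketch that models an application as a dependency graph of jobs and derives one valid run order. It is a toy model only and does not use any Hadoop API; the job names and the `job_run_order` helper are hypothetical, and it assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def job_run_order(dependencies):
    """Return one valid execution order for a DAG of jobs.

    `dependencies` maps each job name to the set of jobs it must wait for.
    """
    return list(TopologicalSorter(dependencies).static_order())


# A single classic MapReduce job: a graph with one node and no edges.
single_job = {"wordcount": set()}

# A hypothetical pipeline: two aggregations wait for a cleaning job,
# and a final job waits for both aggregations.
pipeline = {
    "clean_logs": set(),
    "count_by_user": {"clean_logs"},
    "count_by_page": {"clean_logs"},
    "join_counts": {"count_by_user", "count_by_page"},
}

print(job_run_order(single_job))
print(job_run_order(pipeline))
```

In both cases the whole graph counts as one application; the single job is simply the degenerate DAG with one node.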
The ResourceManager and the per-machine NodeManager daemon, which manages the user processes on that machine, together form the computation fabric.
The per-application ApplicationMaster is, in effect, a framework-specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
More details are available in the Architecture document.
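The division of labor described above can be sketched as a small simulation. This is an illustrative toy in Python, not YARN's actual Java API: the class names only mirror the component names, and the integer "slots" are an assumed simplification standing in for whatever resources a real cluster hands out.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine daemon: tracks free capacity and the tasks running locally."""
    host: str
    free_slots: int  # assumed simplification of a machine's resources
    running: list = field(default_factory=list)

    def launch(self, task):
        self.running.append(task)


@dataclass
class ResourceManager:
    """Global arbiter: assigns slots on NodeManagers but never runs tasks itself."""
    nodes: list

    def allocate(self, wanted):
        granted = []
        for node in self.nodes:
            while node.free_slots > 0 and len(granted) < wanted:
                node.free_slots -= 1
                granted.append(node)
        return granted


class ApplicationMaster:
    """Per-application: negotiates slots, then works with NodeManagers to run tasks."""

    def __init__(self, app_id, tasks):
        self.app_id = app_id
        self.pending = list(tasks)

    def run(self, rm):
        placements = {}
        # Ask for one slot per pending task; the grant may be smaller.
        for node in rm.allocate(len(self.pending)):
            task = self.pending.pop(0)
            node.launch((self.app_id, task))
            placements[task] = node.host
        return placements


nodes = [NodeManager("node1", 2), NodeManager("node2", 1)]
rm = ResourceManager(nodes)
am = ApplicationMaster("app_1", ["map_0", "map_1", "map_2", "reduce_0"])
print(am.run(rm))   # {'map_0': 'node1', 'map_1': 'node1', 'map_2': 'node2'}
print(am.pending)   # ['reduce_0']
```

The cluster here has three slots for four tasks, so the last task stays pending. The point of the sketch is the separation: the ResourceManager only decides where capacity is granted, while the ApplicationMaster decides which of its own tasks uses each grant.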
Documentation Index
YARN
YARN Architecture
Capacity Scheduler
Fair Scheduler
ResourceManager Restart
ResourceManager HA
Web Application Proxy
YARN Timeline Server
Writing YARN Applications
YARN Commands
Scheduler Load Simulator
NodeManager Restart
DockerContainerExecutor
Using CGroups
Secure Containers
Registry
YARN REST APIs
Introduction
Resource Manager
Node Manager