Term
|
Definition
Smallest accessible unit in MapR-FS. Volumes are logical partitions of the filesystem. Each volume comprises one or more data containers, plus a metadata container that describes the data in all the other containers |
|
|
Term
MetaData Container (Logic of location) |
|
Definition
Container held in each volume that provides information about the data in the other containers in the volume. Keeping this container in the volume ensures the metadata is replicated across all nodes in the volume |
|
|
Term
|
Definition
A service that runs on all nodes to manage, monitor, and report on the services on each node
|
|
Term
|
Definition
A service used to coordinate services running across multiple nodes. ZooKeeper prevents service conflicts by enforcing rules and conditions that determine which instance of a service is the master
|
|
Term
|
Definition
Warden will not start a process on ZooKeeper until a quorum of ZooKeepers is active
|
|
Term
Does MapR require HBase Master or RegionServers? |
|
Definition
No, if MapR-DB is using only relational files
|
|
Term
|
Definition
"Container Location Database" service that runs across multiple cluster nodes and provides a directory of container locations
|
|
Term
Can MapR-DB be used for structured and unstructured data? |
|
Definition
Yes, and it allows both in a single cluster. MapR-DB is available in both the Community and Enterprise editions
|
|
Term
|
Definition
MapReduce service. The Hadoop TaskTracker starts and tracks MapReduce tasks on a node. The TaskTracker service receives task assignments from the JobTracker service and manages task execution
|
|
Term
|
Definition
YARN Hadoop MapReduce service. Manages node resources and monitors the health of the node. Works with the ResourceManager to manage the YARN containers that run on the node
|
|
Term
|
Definition
MapR service. Manages disk storage for MapR-FS and MapR-DB on each node.
|
|
Term
|
Definition
Coordinates data storage services among MapR-FS fileserver nodes, MapR NFS Gateways, and MapR clients |
|
|
Term
|
Definition
MapR service. Provides read-write MapR Direct Access NFS access to the cluster
|
|
Term
|
Definition
Provides access to MapR-DB tables via HBase APIs. Required on all nodes that will access table data in MapR-FS, typically all TaskTracker nodes and edge nodes that access table data.
|
|
Term
|
Definition
Hadoop MapReduce management service. Coordinates execution of MapReduce jobs by assigning tasks to TaskTracker nodes and monitoring their execution
|
|
Term
|
Definition
Hadoop YARN management service. Manages cluster resources and tracks resource usage and node health
|
|
Term
|
Definition
Enables HA and Fault Tolerance for MapR clusters by providing coordination |
|
|
Term
|
Definition
Manages region servers that make up HBase table storage. Only required for native Apache HBase applications, not required for MapR-DB |
|
|
Term
|
Definition
Runs MapR Control System (MCS) |
|
|
Term
Metrics (service required on which nodes) |
|
Definition
Optional. Provides real-time analytics data on cluster and job performance through the Analyzing Job Metrics interface. If used, the Metrics service is required on all JobTracker and WebServer nodes.
|
|
Term
Services Typically Running on a MapR Data Node |
|
Definition
FileServer, TaskTracker, NodeManager |
|
|
Term
When running multiple NICs on node, is it necessary to bond or trunk together? |
|
Definition
No, MapR is able to handle multiple NICs transparently
|
|
Term
|
Definition
Min 2 Instances, Master/slave for failover |
|
|
Term
|
Definition
A majority of nodes (a quorum) must be up. Minimum 3 instances (2 of 3 must be up to function). Run an odd number of instances; setting up more than 5 is not recommended.
|
|
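The quorum arithmetic on the card above can be checked with a short sketch. It assumes only the strict-majority rule stated on the card; the function names `quorum_size` and `tolerated_failures` are illustrative and are not part of any MapR or ZooKeeper API.

```python
def quorum_size(instances: int) -> int:
    """Smallest number of instances that forms a strict majority."""
    if instances < 1:
        raise ValueError("instances must be positive")
    return instances // 2 + 1


def tolerated_failures(instances: int) -> int:
    """How many instances can be down while a majority is still up."""
    return instances - quorum_size(instances)


# Compare ensemble sizes: instances, quorum needed, failures tolerated.
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

With 3 instances, 2 must be up and 1 failure is tolerated. A 4th instance raises the quorum to 3 but still tolerates only 1 failure, which is why an odd number of instances is the usual choice.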
Term
HA for JobTracker, ResourceManager, and HBaseMaster |
|
Definition
All Active/Standby. If active instance fails, standby takes over. |
|
|
Term
|
Definition
VIPs (virtual IP addresses) can be used for load balancing and HA in NFS, as well as for providing access to the NFS gateway through a firewall via a load balancer
|
|