Hadoop源碼分析三啟動及腳本剖析
1、 啟動
hadoop的啟動是通過其sbin目錄下的腳本來啟動的。與啟動相關(guān)的叫腳本有以下幾個:
start-all.sh、start-dfs.sh、start-yarn.sh、hadoop-daemon.sh、yarn-daemon.sh。
hadoop-daemon.sh是用來啟動與hdfs相關(guān)的服務(wù)的
yarn-daemon.sh是用來啟動和yarn相關(guān)的服務(wù)
start-dfs.sh是用來啟動hdfs集群的
start-yarn.sh是用來啟動yarn集群
start-all.sh是用來啟動yarn和hdfs集群的
這幾個start開頭的腳本都是通過調(diào)用那兩個daemon腳本來啟動的。
2、 腳本分析
這里先從start-all.sh開始分析,然后逐步分析其腳本的調(diào)用。
start-all.sh腳本內(nèi)容如下:
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Start all hadoop daemons. Run this on master node.
echo "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh"
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
# start hdfs daemons if hdfs is present
if [ -f "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh ]; then
"${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR
fi
# start yarn daemons if yarn is present
if [ -f "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh ]; then
"${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR
fi
這個腳本的重點(diǎn)在第31行到末尾,這里是兩個if語句,第一個if語句里(第34行)調(diào)用的是start-dfs.sh腳本,第二個if語句里(第37行)調(diào)用的是start-yarn.sh。
然后我們以start-dfs.sh為例,繼續(xù)向下分析。
start-dfs.sh的內(nèi)容如下:
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Start hadoop dfs daemons.
# Optinally upgrade or rollback dfs state.
# Run this on master node.
usage="Usage: start-dfs.sh [-upgrade|-rollback] [other options such as -clusterId]"
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh
# get arguments
if [[ $# -ge 1 ]]; then
startOpt="$1"
shift
case "$startOpt" in
-upgrade)
nameStartOpt="$startOpt"
;;
-rollback)
dataStartOpt="$startOpt"
;;
*)
echo $usage
exit 1
;;
esac
fi
#Add other possible options
nameStartOpt="$nameStartOpt $@"
#---------------------------------------------------------
# namenodes
NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)
echo "Starting namenodes on [$NAMENODES]"
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
--config "$HADOOP_CONF_DIR" \
--hostnames "$NAMENODES" \
--script "$bin/hdfs" start namenode $nameStartOpt
#---------------------------------------------------------
# datanodes (using default slaves file)
if [ -n "$HADOOP_SECURE_DN_USER" ]; then
echo \
"Attempting to start secure cluster, skipping datanodes. " \
"Run start-secure-dns.sh as root to complete startup."
else
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
--config "$HADOOP_CONF_DIR" \
--script "$bin/hdfs" start datanode $dataStartOpt
fi
#---------------------------------------------------------
# secondary namenodes (if any)
SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)
if [ -n "$SECONDARY_NAMENODES" ]; then
echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
--config "$HADOOP_CONF_DIR" \
--hostnames "$SECONDARY_NAMENODES" \
--script "$bin/hdfs" start secondarynamenode
fi
#---------------------------------------------------------
# quorumjournal nodes (if any)
SHARED_EDITS_DIR=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir 2>&-)
case "$SHARED_EDITS_DIR" in
qjournal://*)
JOURNAL_NODES=$(echo "$SHARED_EDITS_DIR" | sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g')
echo "Starting journal nodes [$JOURNAL_NODES]"
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
--config "$HADOOP_CONF_DIR" \
--hostnames "$JOURNAL_NODES" \
--script "$bin/hdfs" start journalnode ;;
esac
#---------------------------------------------------------
# ZK Failover controllers, if auto-HA is enabled
AUTOHA_ENABLED=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled)
if [ "$(echo "$AUTOHA_ENABLED" | tr A-Z a-z)" = "true" ]; then
echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
--config "$HADOOP_CONF_DIR" \
--hostnames "$NAMENODES" \
--script "$bin/hdfs" start zkfc
fi
# eof
首先是第23行到30行,這里是在處理hadoop的路徑。
然后是第32行到51行是在處理腳本傳入的參數(shù)。最后就啟動hdfs的各個角色了。
首先啟動的是namenode,在第60行到63行,這段代碼是調(diào)用了hadoop-daemons.sh腳本來啟動,這個腳本和之前提到的hadoop-daemon.sh腳本的區(qū)別在于這個腳本可以在集群的其他機(jī)器上啟動該啟動的角色,而hadoop-daemon.sh只能啟動當(dāng)前機(jī)器上的角色。其實(shí)hadoop-daemons.sh也是通過調(diào)用hadoop-daemon.sh來啟動的,這個稍后再分析。這個腳本還有幾個參數(shù),其中最重要的是:start namenode。它表示啟動namenode。
然后是第65行到第76行,這里是在啟動datanode,啟動的方式和namenode相同。啟動DataNode的代碼在第73行。
然后是第78行到第90行,這里在啟動secondarynamenode。如果是配置了namenode的高可用,secondarynamenode便不會啟動。
然后是第92行到第105行,這里在啟動journalnode。如果配置了namenode的高可用,journalnode才會啟動。
最后是第107行到末尾,這里啟動的是zkfc。同樣這個也是要配置高可用才會啟動。
如果按照文檔(2)中配置的高可用來看,這里啟動的角色應(yīng)該為:namenode、datanode、journalnode、zkfc。
啟動上述角色調(diào)用的hadoop-daemons.sh腳本內(nèi)容如下:
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Run a Hadoop command on all slave hosts.
usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."
# if no args specified, show usage
if [ $# -le 1 ]; then
echo $usage
exit 1
fi
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
這個腳本的重點(diǎn)在最后一行。這里有兩個腳本slaves.sh和hadoop-daemon.sh。slaves.sh腳本會使用ssh登錄到指定的服務(wù)器中然后執(zhí)行其中的hadoop-daemon.sh腳本。這里就不分析slaves.sh腳本了。
我們繼續(xù)看hadoop-daemon.sh腳本。
其內(nèi)容如下:
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Runs a Hadoop command as a daemon.
#
# Environment Variables
#
# HADOOP_CONF_DIR Alternate conf dir. Default is ${HADOOP_PREFIX}/conf.
# HADOOP_LOG_DIR Where log files are stored. PWD by default.
# HADOOP_MASTER host:path where hadoop code should be rsync'd from
# HADOOP_PID_DIR The pid files are stored. /tmp by default.
# HADOOP_IDENT_STRING A string representing this instance of hadoop. $USER by default
# HADOOP_NICENESS The scheduling priority for daemons. Defaults to 0.
##
usage="Usage: hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] [--script script] (start|stop) <hadoop-command> <args...>"
# if no args specified, show usage
if [ $# -le 1 ]; then
echo $usage
exit 1
fi
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
# get arguments
#default value
hadoopScript="$HADOOP_PREFIX"/bin/hadoop
if [ "--script" = "$1" ]
then
shift
hadoopScript=$1
shift
fi
startStop=$1
shift
command=$1
shift
hadoop_rotate_log ()
{
log=$1;
num=5;
if [ -n "$2" ]; then
num=$2
fi
if [ -f "$log" ]; then # rotate logs
while [ $num -gt 1 ]; do
prev=`expr $num - 1`
[ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
num=$prev
done
mv "$log" "$log.$num";
fi
}
if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
. "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi
# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$command" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
export HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
export HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
export HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
starting_secure_dn="true"
fi
#Determine if we're starting a privileged NFS, if so, redefine the appropriate variables
if [ "$command" == "nfs3" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_PRIVILEGED_NFS_USER" ]; then
export HADOOP_PID_DIR=$HADOOP_PRIVILEGED_NFS_PID_DIR
export HADOOP_LOG_DIR=$HADOOP_PRIVILEGED_NFS_LOG_DIR
export HADOOP_IDENT_STRING=$HADOOP_PRIVILEGED_NFS_USER
starting_privileged_nfs="true"
fi
if [ "$HADOOP_IDENT_STRING" = "" ]; then
export HADOOP_IDENT_STRING="$USER"
fi
# get log directory
if [ "$HADOOP_LOG_DIR" = "" ]; then
export HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
fi
if [ ! -w "$HADOOP_LOG_DIR" ] ; then
mkdir -p "$HADOOP_LOG_DIR"
chown $HADOOP_IDENT_STRING $HADOOP_LOG_DIR
fi
if [ "$HADOOP_PID_DIR" = "" ]; then
HADOOP_PID_DIR=/tmp
fi
# some variables
export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"INFO,RFA"}
export HADOOP_SECURITY_LOGGER=${HADOOP_SECURITY_LOGGER:-"INFO,RFAS"}
export HDFS_AUDIT_LOGGER=${HDFS_AUDIT_LOGGER:-"INFO,NullAppender"}
log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid
HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5}
# Set default scheduling priority
if [ "$HADOOP_NICENESS" = "" ]; then
export HADOOP_NICENESS=0
fi
case $startStop in
(start)
[ -w "$HADOOP_PID_DIR" ] || mkdir -p "$HADOOP_PID_DIR"
if [ -f $pid ]; then
if kill -0 `cat $pid` > /dev/null 2>&1; then
echo $command running as process `cat $pid`. Stop it first.
exit 1
fi
fi
if [ "$HADOOP_MASTER" != "" ]; then
echo rsync from $HADOOP_MASTER
rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_PREFIX"
fi
hadoop_rotate_log $log
echo starting $command, logging to $log
cd "$HADOOP_PREFIX"
case $command in
namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
if [ -z "$HADOOP_HDFS_HOME" ]; then
hdfsScript="$HADOOP_PREFIX"/bin/hdfs
else
hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
fi
nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
;;
(*)
nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
;;
esac
echo $! > $pid
sleep 1
head "$log"
# capture the ulimit output
if [ "true" = "$starting_secure_dn" ]; then
echo "ulimit -a for secure datanode user $HADOOP_SECURE_DN_USER" >> $log
# capture the ulimit info for the appropriate user
su --shell=/bin/bash $HADOOP_SECURE_DN_USER -c 'ulimit -a' >> $log 2>&1
elif [ "true" = "$starting_privileged_nfs" ]; then
echo "ulimit -a for privileged nfs user $HADOOP_PRIVILEGED_NFS_USER" >> $log
su --shell=/bin/bash $HADOOP_PRIVILEGED_NFS_USER -c 'ulimit -a' >> $log 2>&1
else
echo "ulimit -a for user $USER" >> $log
ulimit -a >> $log 2>&1
fi
sleep 3;
if ! ps -p $! > /dev/null ; then
exit 1
fi
;;
(stop)
if [ -f $pid ]; then
TARGET_PID=`cat $pid`
if kill -0 $TARGET_PID > /dev/null 2>&1; then
echo stopping $command
kill $TARGET_PID
sleep $HADOOP_STOP_TIMEOUT
if kill -0 $TARGET_PID > /dev/null 2>&1; then
echo "$command did not stop gracefully after $HADOOP_STOP_TIMEOUT seconds: killing with kill -9"
kill -9 $TARGET_PID
fi
else
echo no $command to stop
fi
rm -f $pid
else
echo no $command to stop
fi
;;
(*)
echo $usage
exit 1
;;
esac
這段代碼的重點(diǎn)在第131行到結(jié)束。這里是真正在啟動服務(wù)的代碼,這個文件在調(diào)用的時(shí)候,會傳入兩個重要的參數(shù)start/stop xxx。用于啟動或停止某些服務(wù)。以啟動服務(wù)為例,其重點(diǎn)在第153行,這里會執(zhí)行一個hdfsScript腳本。這個參數(shù)的定義在第155行,
這里可以看見它實(shí)際是hadoop的bin目錄下的hdfs文件
文件的內(nèi)容如下:
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Environment Variables
#
# JSVC_HOME home directory of jsvc binary. Required for starting secure
# datanode.
#
# JSVC_OUTFILE path to jsvc output file. Defaults to
# $HADOOP_LOG_DIR/jsvc.out.
#
# JSVC_ERRFILE path to jsvc error file. Defaults to $HADOOP_LOG_DIR/jsvc.err.
bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin" > /dev/null; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh
function print_usage(){
echo "Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND"
echo " where COMMAND is one of:"
echo " dfs run a filesystem command on the file systems supported in Hadoop."
echo " classpath prints the classpath"
echo " namenode -format format the DFS filesystem"
echo " secondarynamenode run the DFS secondary namenode"
echo " namenode run the DFS namenode"
echo " journalnode run the DFS journalnode"
echo " zkfc run the ZK Failover Controller daemon"
echo " datanode run a DFS datanode"
echo " dfsadmin run a DFS admin client"
echo " haadmin run a DFS HA admin client"
echo " fsck run a DFS filesystem checking utility"
echo " balancer run a cluster balancing utility"
echo " jmxget get JMX exported values from NameNode or DataNode."
echo " mover run a utility to move block replicas across"
echo " storage types"
echo " oiv apply the offline fsimage viewer to an fsimage"
echo " oiv_legacy apply the offline fsimage viewer to an legacy fsimage"
echo " oev apply the offline edits viewer to an edits file"
echo " fetchdt fetch a delegation token from the NameNode"
echo " getconf get config values from configuration"
echo " groups get the groups which users belong to"
echo " snapshotDiff diff two snapshots of a directory or diff the"
echo " current directory contents with a snapshot"
echo " lsSnapshottableDir list all snapshottable dirs owned by the current user"
echo " Use -help to see options"
echo " portmap run a portmap service"
echo " nfs3 run an NFS version 3 gateway"
echo " cacheadmin configure the HDFS cache"
echo " crypto configure HDFS encryption zones"
echo " storagepolicies list/get/set block storage policies"
echo " version print the version"
echo ""
echo "Most commands print help when invoked w/o parameters."
# There are also debug commands, but they don't show up in this listing.
}
if [ $# = 0 ]; then
print_usage
exit
fi
COMMAND=$1
shift
case $COMMAND in
# usage flags
--help|-help|-h)
print_usage
exit
;;
esac
# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$COMMAND" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
if [ -n "$JSVC_HOME" ]; then
if [ -n "$HADOOP_SECURE_DN_PID_DIR" ]; then
HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
fi
if [ -n "$HADOOP_SECURE_DN_LOG_DIR" ]; then
HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
fi
HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
starting_secure_dn="true"
else
echo "It looks like you're trying to start a secure DN, but \$JSVC_HOME"\
"isn't set. Falling back to starting insecure DN."
fi
fi
# Determine if we're starting a privileged NFS daemon, and if so, redefine appropriate variables
if [ "$COMMAND" == "nfs3" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_PRIVILEGED_NFS_USER" ]; then
if [ -n "$JSVC_HOME" ]; then
if [ -n "$HADOOP_PRIVILEGED_NFS_PID_DIR" ]; then
HADOOP_PID_DIR=$HADOOP_PRIVILEGED_NFS_PID_DIR
fi
if [ -n "$HADOOP_PRIVILEGED_NFS_LOG_DIR" ]; then
HADOOP_LOG_DIR=$HADOOP_PRIVILEGED_NFS_LOG_DIR
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
fi
HADOOP_IDENT_STRING=$HADOOP_PRIVILEGED_NFS_USER
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
starting_privileged_nfs="true"
else
echo "It looks like you're trying to start a privileged NFS server, but"\
"\$JSVC_HOME isn't set. Falling back to starting unprivileged NFS server."
fi
fi
if [ "$COMMAND" = "namenode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
elif [ "$COMMAND" = "zkfc" ] ; then
CLASS='org.apache.hadoop.hdfs.tools.DFSZKFailoverController'
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_ZKFC_OPTS"
elif [ "$COMMAND" = "secondarynamenode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode'
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_SECONDARYNAMENODE_OPTS"
elif [ "$COMMAND" = "datanode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
if [ "$starting_secure_dn" = "true" ]; then
HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi
elif [ "$COMMAND" = "journalnode" ] ; then
CLASS='org.apache.hadoop.hdfs.qjournal.server.JournalNode'
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOURNALNODE_OPTS"
elif [ "$COMMAND" = "dfs" ] ; then
CLASS=org.apache.hadoop.fs.FsShell
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfsadmin" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "haadmin" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DFSHAAdmin
CLASSPATH=${CLASSPATH}:${TOOL_PATH}
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "fsck" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DFSck
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "balancer" ] ; then
CLASS=org.apache.hadoop.hdfs.server.balancer.Balancer
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_BALANCER_OPTS"
elif [ "$COMMAND" = "mover" ] ; then
CLASS=org.apache.hadoop.hdfs.server.mover.Mover
HADOOP_OPTS="${HADOOP_OPTS} ${HADOOP_MOVER_OPTS}"
elif [ "$COMMAND" = "storagepolicies" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.StoragePolicyAdmin
elif [ "$COMMAND" = "jmxget" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.JMXGet
elif [ "$COMMAND" = "oiv" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB
elif [ "$COMMAND" = "oiv_legacy" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer
elif [ "$COMMAND" = "oev" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.offlineEditsViewer.OfflineEditsViewer
elif [ "$COMMAND" = "fetchdt" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DelegationTokenFetcher
elif [ "$COMMAND" = "getconf" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.GetConf
elif [ "$COMMAND" = "groups" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.GetGroups
elif [ "$COMMAND" = "snapshotDiff" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.snapshot.SnapshotDiff
elif [ "$COMMAND" = "lsSnapshottableDir" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.snapshot.LsSnapshottableDir
elif [ "$COMMAND" = "portmap" ] ; then
CLASS=org.apache.hadoop.portmap.Portmap
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_PORTMAP_OPTS"
elif [ "$COMMAND" = "nfs3" ] ; then
CLASS=org.apache.hadoop.hdfs.nfs.nfs3.Nfs3
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NFS3_OPTS"
elif [ "$COMMAND" = "cacheadmin" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.CacheAdmin
elif [ "$COMMAND" = "crypto" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.CryptoAdmin
elif [ "$COMMAND" = "version" ] ; then
CLASS=org.apache.hadoop.util.VersionInfo
elif [ "$COMMAND" = "debug" ]; then
CLASS=org.apache.hadoop.hdfs.tools.DebugAdmin
elif [ "$COMMAND" = "classpath" ]; then
if [ "$#" -gt 0 ]; then
CLASS=org.apache.hadoop.util.Classpath
else
# No need to bother starting up a JVM for this simple case.
if $cygwin; then
CLASSPATH=$(cygpath -p -w "$CLASSPATH" 2>/dev/null)
fi
echo $CLASSPATH
exit 0
fi
else
CLASS="$COMMAND"
fi
# cygwin path translation
if $cygwin; then
CLASSPATH=$(cygpath -p -w "$CLASSPATH" 2>/dev/null)
HADOOP_LOG_DIR=$(cygpath -w "$HADOOP_LOG_DIR" 2>/dev/null)
HADOOP_PREFIX=$(cygpath -w "$HADOOP_PREFIX" 2>/dev/null)
HADOOP_CONF_DIR=$(cygpath -w "$HADOOP_CONF_DIR" 2>/dev/null)
HADOOP_COMMON_HOME=$(cygpath -w "$HADOOP_COMMON_HOME" 2>/dev/null)
HADOOP_HDFS_HOME=$(cygpath -w "$HADOOP_HDFS_HOME" 2>/dev/null)
HADOOP_YARN_HOME=$(cygpath -w "$HADOOP_YARN_HOME" 2>/dev/null)
HADOOP_MAPRED_HOME=$(cygpath -w "$HADOOP_MAPRED_HOME" 2>/dev/null)
fi
export CLASSPATH=$CLASSPATH
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"
# Check to see if we should start a secure datanode
if [ "$starting_secure_dn" = "true" ]; then
if [ "$HADOOP_PID_DIR" = "" ]; then
HADOOP_SECURE_DN_PID="/tmp/hadoop_secure_dn.pid"
else
HADOOP_SECURE_DN_PID="$HADOOP_PID_DIR/hadoop_secure_dn.pid"
fi
JSVC=$JSVC_HOME/jsvc
if [ ! -f $JSVC ]; then
echo "JSVC_HOME is not set correctly so jsvc cannot be found. jsvc is required to run secure datanodes. "
echo "Please download and install jsvc from http://archive.apache.org/dist/commons/daemon/binaries/ "\
"and set JSVC_HOME to the directory containing the jsvc binary."
exit
fi
if [[ ! $JSVC_OUTFILE ]]; then
JSVC_OUTFILE="$HADOOP_LOG_DIR/jsvc.out"
fi
if [[ ! $JSVC_ERRFILE ]]; then
JSVC_ERRFILE="$HADOOP_LOG_DIR/jsvc.err"
fi
exec "$JSVC" \
-Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
-errfile "$JSVC_ERRFILE" \
-pidfile "$HADOOP_SECURE_DN_PID" \
-nodetach \
-user "$HADOOP_SECURE_DN_USER" \
-cp "$CLASSPATH" \
$JAVA_HEAP_MAX $HADOOP_OPTS \
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
elif [ "$starting_privileged_nfs" = "true" ] ; then
if [ "$HADOOP_PID_DIR" = "" ]; then
HADOOP_PRIVILEGED_NFS_PID="/tmp/hadoop_privileged_nfs3.pid"
else
HADOOP_PRIVILEGED_NFS_PID="$HADOOP_PID_DIR/hadoop_privileged_nfs3.pid"
fi
JSVC=$JSVC_HOME/jsvc
if [ ! -f $JSVC ]; then
echo "JSVC_HOME is not set correctly so jsvc cannot be found. jsvc is required to run privileged NFS gateways. "
echo "Please download and install jsvc from http://archive.apache.org/dist/commons/daemon/binaries/ "\
"and set JSVC_HOME to the directory containing the jsvc binary."
exit
fi
if [[ ! $JSVC_OUTFILE ]]; then
JSVC_OUTFILE="$HADOOP_LOG_DIR/nfs3_jsvc.out"
fi
if [[ ! $JSVC_ERRFILE ]]; then
JSVC_ERRFILE="$HADOOP_LOG_DIR/nfs3_jsvc.err"
fi
exec "$JSVC" \
-Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
-errfile "$JSVC_ERRFILE" \
-pidfile "$HADOOP_PRIVILEGED_NFS_PID" \
-nodetach \
-user "$HADOOP_PRIVILEGED_NFS_USER" \
-cp "$CLASSPATH" \
$JAVA_HEAP_MAX $HADOOP_OPTS \
org.apache.hadoop.hdfs.nfs.nfs3.PrivilegedNfsGatewayStarter "$@"
else
# run it
exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
fi
這段代碼的重點(diǎn)在第134行到219行和第304行。首先看第134行到219行,這段代碼雖然很長但全是一堆if else語句。其邏輯也很簡單,就是根據(jù)傳入的COMMAND的值來為CLASS和HADOOP_OPTS賦值。然后是第304行,執(zhí)行CLASS類。以第134行的namenode類為例,這里CLASS的值為org.apache.hadoop.hdfs.server.namenode.NameNode,這是一個java類隨后便會執(zhí)行這個類啟動namenode。
如果想要debug hdfs的源碼,最好在這里設(shè)置遠(yuǎn)程調(diào)試。因?yàn)檫@里有單個的服務(wù)類與啟動參數(shù),可以準(zhǔn)確的定位需要的服務(wù)。
以上就是Hadoop源碼分析之啟動及腳本剖析的詳細(xì)內(nèi)容,本系列下一篇文章傳送門Hadoop源碼分析四遠(yuǎn)程debug調(diào)試更多關(guān)于Hadoop源碼分析分析的資料請繼續(xù)關(guān)注腳本之家其它相關(guān)文章!
相關(guān)文章
springboot默認(rèn)文件緩存(easy-captcha?驗(yàn)證碼)
這篇文章主要介紹了springboot的文件緩存(easy-captcha?驗(yàn)證碼),本文通過實(shí)例代碼給大家介紹的非常詳細(xì),對大家的學(xué)習(xí)或工作具有一定的參考借鑒價(jià)值,需要的朋友可以參考下2023-06-06
SpringBoot如何獲取application.properties中自定義的值
這篇文章主要介紹了SpringBoot獲取application.properties中的自定義的值,目錄結(jié)構(gòu)文件代碼給大家列舉的非常詳細(xì),需要的朋友可以參考下2021-09-09
SpringBoot2 task scheduler 定時(shí)任務(wù)調(diào)度器四種方式
這篇文章主要介紹了SpringBoot2 task scheduler 定時(shí)任務(wù)調(diào)度器四種方式,文中通過示例代碼介紹的非常詳細(xì),對大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價(jià)值,需要的朋友們下面隨著小編來一起學(xué)習(xí)學(xué)習(xí)吧2019-03-03
Java線程池FutureTask實(shí)現(xiàn)原理詳解
這篇文章主要介紹了Java線程池FutureTask實(shí)現(xiàn)原理詳解,小編覺得還是挺不錯的,具有一定借鑒價(jià)值,需要的朋友可以參考下2018-02-02
SpringBoot整合Mybatis-Plus實(shí)現(xiàn)關(guān)聯(lián)查詢
Mybatis-Plus(簡稱MP)是一個Mybatis的增強(qiáng)工具,只是在Mybatis的基礎(chǔ)上做了增強(qiáng)卻不做改變,MyBatis-Plus支持所有Mybatis原生的特性,本文給大家介紹了SpringBoot整合Mybatis-Plus實(shí)現(xiàn)關(guān)聯(lián)查詢,需要的朋友可以參考下2024-08-08
Mybatis-Plus @TableField自動填充時(shí)間為null的問題解決
本文主要介紹了Mybatis-Plus @TableField自動填充時(shí)間為null的問題解決,文中通過示例代碼介紹的非常詳細(xì),對大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價(jià)值,需要的朋友們下面隨著小編來一起學(xué)習(xí)學(xué)習(xí)吧2023-01-01

