Hadoop Source Code Analysis (Part 3): Startup and Script Analysis
1. Startup
Hadoop is started through the scripts under its sbin directory. The startup-related scripts are: start-all.sh, start-dfs.sh, start-yarn.sh, hadoop-daemon.sh, and yarn-daemon.sh.

- hadoop-daemon.sh starts the HDFS-related daemons
- yarn-daemon.sh starts the YARN-related daemons
- start-dfs.sh starts the HDFS cluster
- start-yarn.sh starts the YARN cluster
- start-all.sh starts both the HDFS and the YARN cluster

All of the start-* scripts ultimately work by calling the two daemon scripts, as the usage sketch below illustrates.
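For orientation, here is how these entry points are typically invoked on the master node. The install prefix /opt/hadoop is a placeholder, not something this series' cluster necessarily uses:

# Deprecated one-shot entry point; it simply delegates to the two scripts below
/opt/hadoop/sbin/start-all.sh

# The recommended, equivalent form:
/opt/hadoop/sbin/start-dfs.sh    # namenode(s), datanodes, plus secondarynamenode/journalnodes/zkfc as configured
/opt/hadoop/sbin/start-yarn.sh   # resourcemanager and nodemanagers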
2. Script Analysis
We begin with start-all.sh and then work our way down through the scripts it calls.
The content of start-all.sh is as follows:
#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Start all hadoop daemons.  Run this on master node.

echo "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh"

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

# start hdfs daemons if hdfs is present
if [ -f "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh ]; then
  "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR
fi

# start yarn daemons if yarn is present
if [ -f "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh ]; then
  "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR
fi
The heart of this script is the two if statements at the end: the first calls start-dfs.sh and the second calls start-yarn.sh, each only if the corresponding script exists. Everything before that merely resolves the script's own location and sources hadoop-config.sh.
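One idiom worth pausing on, since every script below repeats it, is the ${VAR:-default} parameter expansion used to locate the libexec directory. A minimal illustration (the paths are made up):

# ${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}: use HADOOP_LIBEXEC_DIR if the
# caller exported it, otherwise fall back to the default next to the script.
unset HADOOP_LIBEXEC_DIR
DEFAULT_LIBEXEC_DIR=/opt/hadoop/libexec
echo "${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}"   # prints /opt/hadoop/libexec

HADOOP_LIBEXEC_DIR=/custom/libexec
echo "${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}"   # prints /custom/libexec

This is how an operator can override where the config helpers are loaded from without editing the scripts themselves.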
Next we take start-dfs.sh as an example and continue downward. Its content is as follows:
#!/usr/bin/env bash

# (Apache License, Version 2.0 header omitted; identical to start-all.sh above)

# Start hadoop dfs daemons.
# Optinally upgrade or rollback dfs state.
# Run this on master node.

usage="Usage: start-dfs.sh [-upgrade|-rollback] [other options such as -clusterId]"

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh

# get arguments
if [[ $# -ge 1 ]]; then
  startOpt="$1"
  shift
  case "$startOpt" in
    -upgrade)
      nameStartOpt="$startOpt"
    ;;
    -rollback)
      dataStartOpt="$startOpt"
    ;;
    *)
      echo $usage
      exit 1
    ;;
  esac
fi

#Add other possible options
nameStartOpt="$nameStartOpt $@"

#---------------------------------------------------------
# namenodes

NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)

echo "Starting namenodes on [$NAMENODES]"

"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
  --config "$HADOOP_CONF_DIR" \
  --hostnames "$NAMENODES" \
  --script "$bin/hdfs" start namenode $nameStartOpt

#---------------------------------------------------------
# datanodes (using default slaves file)

if [ -n "$HADOOP_SECURE_DN_USER" ]; then
  echo \
    "Attempting to start secure cluster, skipping datanodes. " \
    "Run start-secure-dns.sh as root to complete startup."
else
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --script "$bin/hdfs" start datanode $dataStartOpt
fi

#---------------------------------------------------------
# secondary namenodes (if any)

SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)

if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"

  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$SECONDARY_NAMENODES" \
      --script "$bin/hdfs" start secondarynamenode
fi

#---------------------------------------------------------
# quorumjournal nodes (if any)

SHARED_EDITS_DIR=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir 2>&-)

case "$SHARED_EDITS_DIR" in
qjournal://*)
  JOURNAL_NODES=$(echo "$SHARED_EDITS_DIR" | sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g')
  echo "Starting journal nodes [$JOURNAL_NODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$JOURNAL_NODES" \
      --script "$bin/hdfs" start journalnode ;;
esac

#---------------------------------------------------------
# ZK Failover controllers, if auto-HA is enabled
AUTOHA_ENABLED=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled)
if [ "$(echo "$AUTOHA_ENABLED" | tr A-Z a-z)" = "true" ]; then
  echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start zkfc
fi

# eof
The first part of the script resolves Hadoop's paths (the bin/libexec block at the top) and sources hdfs-config.sh. The case statement that follows parses the arguments passed to the script (-upgrade or -rollback). After that, the script starts each HDFS role in turn.
The namenodes are started first: the script asks hdfs getconf -namenodes for the list of NameNode hosts and hands it to hadoop-daemons.sh. The difference between hadoop-daemons.sh and the hadoop-daemon.sh script mentioned earlier is that the former can start the appropriate roles on other machines of the cluster, while hadoop-daemon.sh only starts roles on the current machine. In fact, hadoop-daemons.sh itself works by calling hadoop-daemon.sh, which we will analyze shortly. Of the arguments it receives, the most important pair is start namenode, which tells it to start a namenode.
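The distinction between the two scripts shows up directly on the command line. The sketch below is illustrative only: the host names nn1 and nn2 are placeholders, and the --script value mirrors what start-dfs.sh passes.

# hadoop-daemon.sh: start a namenode on this machine only
"$HADOOP_PREFIX/sbin/hadoop-daemon.sh" --config "$HADOOP_CONF_DIR" \
    --script "$HADOOP_PREFIX/bin/hdfs" start namenode

# hadoop-daemons.sh: ssh to each listed host and run hadoop-daemon.sh there
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" --config "$HADOOP_CONF_DIR" \
    --hostnames "nn1 nn2" --script "$HADOOP_PREFIX/bin/hdfs" start namenode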
Next come the datanodes, started the same way in the else branch of the following if statement. Note that when HADOOP_SECURE_DN_USER is set, the script skips the datanodes here and asks you to run start-secure-dns.sh as root instead.
Then the secondarynamenode. If NameNode HA is configured there is no secondary namenode (hdfs getconf -secondarynamenodes returns nothing), so this block starts nothing.
Then the journalnodes. These are started only when NameNode HA is configured, i.e. when dfs.namenode.shared.edits.dir is a qjournal:// URI; the sed pipeline extracts the journal node host names from that URI.
Finally, the zkfc (ZooKeeper Failover Controller) daemons, which likewise are started only when automatic failover is enabled (dfs.ha.automatic-failover.enabled=true).
With the HA setup configured in Part 2 of this series, the roles started here are: namenode, datanode, journalnode, and zkfc.
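You can replay the decisions the script makes by issuing the same hdfs getconf queries it uses. The output below is for a hypothetical HA cluster (hosts nn1/nn2 and jn1-jn3 are made up):

$ hdfs getconf -namenodes
nn1 nn2
$ hdfs getconf -confKey dfs.namenode.shared.edits.dir
qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster
$ hdfs getconf -confKey dfs.ha.automatic-failover.enabled
true

# The sed pipeline in start-dfs.sh reduces the qjournal URI to bare host names:
$ echo "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster" | \
    sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g'
jn1 jn2 jn3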
The hadoop-daemons.sh script used to start all of the above roles reads as follows:
#!/usr/bin/env bash

# (Apache License, Version 2.0 header omitted; identical to start-all.sh above)

# Run a Hadoop command on all slave hosts.

usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."

# if no args specified, show usage
if [ $# -le 1 ]; then
  echo $usage
  exit 1
fi

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
The heart of this script is the last line, which chains two scripts together: slaves.sh and hadoop-daemon.sh. slaves.sh logs in to each of the specified servers over ssh and executes hadoop-daemon.sh there. We will not analyze slaves.sh in detail here; a simplified sketch follows.
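Conceptually, slaves.sh reduces to an ssh loop like the one below. This is a simplified model, not the real implementation: the actual script also honors the --hosts override, HADOOP_SSH_OPTS, and an optional sleep between connections.

# Simplified model of slaves.sh: run the remaining arguments on every worker
for slave in $(cat "$HADOOP_CONF_DIR/slaves"); do
  ssh "$slave" "$@" 2>&1 | sed "s/^/$slave: /" &   # prefix each output line with the host name
done
wait   # block until every remote command has finished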
Let's continue with the hadoop-daemon.sh script. Its content is as follows:
#!/usr/bin/env bash

# (Apache License, Version 2.0 header omitted; identical to start-all.sh above)

# Runs a Hadoop command as a daemon.
#
# Environment Variables
#
#   HADOOP_CONF_DIR  Alternate conf dir. Default is ${HADOOP_PREFIX}/conf.
#   HADOOP_LOG_DIR   Where log files are stored.  PWD by default.
#   HADOOP_MASTER    host:path where hadoop code should be rsync'd from
#   HADOOP_PID_DIR   The pid files are stored. /tmp by default.
#   HADOOP_IDENT_STRING   A string representing this instance of hadoop. $USER by default
#   HADOOP_NICENESS The scheduling priority for daemons. Defaults to 0.
##

usage="Usage: hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] [--script script] (start|stop) <hadoop-command> <args...>"

# if no args specified, show usage
if [ $# -le 1 ]; then
  echo $usage
  exit 1
fi

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

# get arguments

#default value
hadoopScript="$HADOOP_PREFIX"/bin/hadoop
if [ "--script" = "$1" ]
  then
    shift
    hadoopScript=$1
    shift
fi
startStop=$1
shift
command=$1
shift

hadoop_rotate_log ()
{
    log=$1;
    num=5;
    if [ -n "$2" ]; then
      num=$2
    fi
    if [ -f "$log" ]; then # rotate logs
      while [ $num -gt 1 ]; do
        prev=`expr $num - 1`
        [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
        num=$prev
      done
      mv "$log" "$log.$num";
    fi
}

if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi

# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$command" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
  export HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
  export HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
  export HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
  starting_secure_dn="true"
fi

#Determine if we're starting a privileged NFS, if so, redefine the appropriate variables
if [ "$command" == "nfs3" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_PRIVILEGED_NFS_USER" ]; then
  export HADOOP_PID_DIR=$HADOOP_PRIVILEGED_NFS_PID_DIR
  export HADOOP_LOG_DIR=$HADOOP_PRIVILEGED_NFS_LOG_DIR
  export HADOOP_IDENT_STRING=$HADOOP_PRIVILEGED_NFS_USER
  starting_privileged_nfs="true"
fi

if [ "$HADOOP_IDENT_STRING" = "" ]; then
  export HADOOP_IDENT_STRING="$USER"
fi

# get log directory
if [ "$HADOOP_LOG_DIR" = "" ]; then
  export HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
fi

if [ ! -w "$HADOOP_LOG_DIR" ] ; then
  mkdir -p "$HADOOP_LOG_DIR"
  chown $HADOOP_IDENT_STRING $HADOOP_LOG_DIR
fi

if [ "$HADOOP_PID_DIR" = "" ]; then
  HADOOP_PID_DIR=/tmp
fi

# some variables
export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"INFO,RFA"}
export HADOOP_SECURITY_LOGGER=${HADOOP_SECURITY_LOGGER:-"INFO,RFAS"}
export HDFS_AUDIT_LOGGER=${HDFS_AUDIT_LOGGER:-"INFO,NullAppender"}
log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid
HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5}

# Set default scheduling priority
if [ "$HADOOP_NICENESS" = "" ]; then
    export HADOOP_NICENESS=0
fi

case $startStop in

  (start)

    [ -w "$HADOOP_PID_DIR" ] ||  mkdir -p "$HADOOP_PID_DIR"

    if [ -f $pid ]; then
      if kill -0 `cat $pid` > /dev/null 2>&1; then
        echo $command running as process `cat $pid`.  Stop it first.
        exit 1
      fi
    fi

    if [ "$HADOOP_MASTER" != "" ]; then
      echo rsync from $HADOOP_MASTER
      rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_PREFIX"
    fi

    hadoop_rotate_log $log
    echo starting $command, logging to $log
    cd "$HADOOP_PREFIX"
    case $command in
      namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
        if [ -z "$HADOOP_HDFS_HOME" ]; then
          hdfsScript="$HADOOP_PREFIX"/bin/hdfs
        else
          hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
        fi
        nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
      ;;
      (*)
        nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
      ;;
    esac
    echo $! > $pid
    sleep 1
    head "$log"
    # capture the ulimit output
    if [ "true" = "$starting_secure_dn" ]; then
      echo "ulimit -a for secure datanode user $HADOOP_SECURE_DN_USER" >> $log
      # capture the ulimit info for the appropriate user
      su --shell=/bin/bash $HADOOP_SECURE_DN_USER -c 'ulimit -a' >> $log 2>&1
    elif [ "true" = "$starting_privileged_nfs" ]; then
      echo "ulimit -a for privileged nfs user $HADOOP_PRIVILEGED_NFS_USER" >> $log
      su --shell=/bin/bash $HADOOP_PRIVILEGED_NFS_USER -c 'ulimit -a' >> $log 2>&1
    else
      echo "ulimit -a for user $USER" >> $log
      ulimit -a >> $log 2>&1
    fi
    sleep 3;
    if ! ps -p $! > /dev/null ; then
      exit 1
    fi
    ;;

  (stop)

    if [ -f $pid ]; then
      TARGET_PID=`cat $pid`
      if kill -0 $TARGET_PID > /dev/null 2>&1; then
        echo stopping $command
        kill $TARGET_PID
        sleep $HADOOP_STOP_TIMEOUT
        if kill -0 $TARGET_PID > /dev/null 2>&1; then
          echo "$command did not stop gracefully after $HADOOP_STOP_TIMEOUT seconds: killing with kill -9"
          kill -9 $TARGET_PID
        fi
      else
        echo no $command to stop
      fi
      rm -f $pid
    else
      echo no $command to stop
    fi
    ;;

  (*)
    echo $usage
    exit 1
    ;;

esac
The heart of this script is the case $startStop in block at the end; that is where services are actually started and stopped. When the script is invoked, it is passed two key arguments, start (or stop) followed by a command name, which select the service to start or stop. Taking start as the example, the key line is the nohup invocation of $hdfsScript: that variable is assigned just above it, and for HDFS roles it points to the hdfs file under Hadoop's bin directory (or under $HADOOP_HDFS_HOME/bin when that variable is set).
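Once all the variables are expanded, the start branch for a namenode boils down to something like the lines below. Every concrete value here (install path, user name, host name) is a placeholder:

# Effective command for "hadoop-daemon.sh start namenode" on a host named master,
# run as user hadoop with HADOOP_PREFIX=/opt/hadoop:
nohup nice -n 0 /opt/hadoop/bin/hdfs --config /opt/hadoop/etc/hadoop namenode \
    > /opt/hadoop/logs/hadoop-hadoop-namenode-master.out 2>&1 < /dev/null &
echo $! > /tmp/hadoop-hadoop-namenode.pid   # pid file consulted later by "stop"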
The content of that file is as follows:
#!/usr/bin/env bash

# (Apache License, Version 2.0 header omitted; identical to start-all.sh above)

# Environment Variables
#
#   JSVC_HOME  home directory of jsvc binary.  Required for starting secure
#              datanode.
#
#   JSVC_OUTFILE  path to jsvc output file.  Defaults to
#                 $HADOOP_LOG_DIR/jsvc.out.
#
#   JSVC_ERRFILE  path to jsvc error file.  Defaults to $HADOOP_LOG_DIR/jsvc.err.

bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin" > /dev/null; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh

function print_usage(){
  echo "Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND"
  echo "       where COMMAND is one of:"
  echo "  dfs                  run a filesystem command on the file systems supported in Hadoop."
  echo "  classpath            prints the classpath"
  echo "  namenode -format     format the DFS filesystem"
  echo "  secondarynamenode    run the DFS secondary namenode"
  echo "  namenode             run the DFS namenode"
  echo "  journalnode          run the DFS journalnode"
  echo "  zkfc                 run the ZK Failover Controller daemon"
  echo "  datanode             run a DFS datanode"
  echo "  dfsadmin             run a DFS admin client"
  echo "  haadmin              run a DFS HA admin client"
  echo "  fsck                 run a DFS filesystem checking utility"
  echo "  balancer             run a cluster balancing utility"
  echo "  jmxget               get JMX exported values from NameNode or DataNode."
  echo "  mover                run a utility to move block replicas across"
  echo "                       storage types"
  echo "  oiv                  apply the offline fsimage viewer to an fsimage"
  echo "  oiv_legacy           apply the offline fsimage viewer to an legacy fsimage"
  echo "  oev                  apply the offline edits viewer to an edits file"
  echo "  fetchdt              fetch a delegation token from the NameNode"
  echo "  getconf              get config values from configuration"
  echo "  groups               get the groups which users belong to"
  echo "  snapshotDiff         diff two snapshots of a directory or diff the"
  echo "                       current directory contents with a snapshot"
  echo "  lsSnapshottableDir   list all snapshottable dirs owned by the current user"
  echo "                                                Use -help to see options"
  echo "  portmap              run a portmap service"
  echo "  nfs3                 run an NFS version 3 gateway"
  echo "  cacheadmin           configure the HDFS cache"
  echo "  crypto               configure HDFS encryption zones"
  echo "  storagepolicies      list/get/set block storage policies"
  echo "  version              print the version"
  echo ""
  echo "Most commands print help when invoked w/o parameters."
  # There are also debug commands, but they don't show up in this listing.
}

if [ $# = 0 ]; then
  print_usage
  exit
fi

COMMAND=$1
shift

case $COMMAND in
  # usage flags
  --help|-help|-h)
    print_usage
    exit
    ;;
esac

# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$COMMAND" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
  if [ -n "$JSVC_HOME" ]; then
    if [ -n "$HADOOP_SECURE_DN_PID_DIR" ]; then
      HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
    fi

    if [ -n "$HADOOP_SECURE_DN_LOG_DIR" ]; then
      HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
      HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
    fi

    HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
    HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
    starting_secure_dn="true"
  else
    echo "It looks like you're trying to start a secure DN, but \$JSVC_HOME"\
      "isn't set. Falling back to starting insecure DN."
  fi
fi

# Determine if we're starting a privileged NFS daemon, and if so, redefine appropriate variables
if [ "$COMMAND" == "nfs3" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_PRIVILEGED_NFS_USER" ]; then
  if [ -n "$JSVC_HOME" ]; then
    if [ -n "$HADOOP_PRIVILEGED_NFS_PID_DIR" ]; then
      HADOOP_PID_DIR=$HADOOP_PRIVILEGED_NFS_PID_DIR
    fi

    if [ -n "$HADOOP_PRIVILEGED_NFS_LOG_DIR" ]; then
      HADOOP_LOG_DIR=$HADOOP_PRIVILEGED_NFS_LOG_DIR
      HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
    fi

    HADOOP_IDENT_STRING=$HADOOP_PRIVILEGED_NFS_USER
    HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
    starting_privileged_nfs="true"
  else
    echo "It looks like you're trying to start a privileged NFS server, but"\
      "\$JSVC_HOME isn't set. Falling back to starting unprivileged NFS server."
  fi
fi

if [ "$COMMAND" = "namenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
elif [ "$COMMAND" = "zkfc" ] ; then
  CLASS='org.apache.hadoop.hdfs.tools.DFSZKFailoverController'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_ZKFC_OPTS"
elif [ "$COMMAND" = "secondarynamenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_SECONDARYNAMENODE_OPTS"
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
  if [ "$starting_secure_dn" = "true" ]; then
    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
  else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
  fi
elif [ "$COMMAND" = "journalnode" ] ; then
  CLASS='org.apache.hadoop.hdfs.qjournal.server.JournalNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOURNALNODE_OPTS"
elif [ "$COMMAND" = "dfs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfsadmin" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "haadmin" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSHAAdmin
  CLASSPATH=${CLASSPATH}:${TOOL_PATH}
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "fsck" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSck
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "balancer" ] ; then
  CLASS=org.apache.hadoop.hdfs.server.balancer.Balancer
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_BALANCER_OPTS"
elif [ "$COMMAND" = "mover" ] ; then
  CLASS=org.apache.hadoop.hdfs.server.mover.Mover
  HADOOP_OPTS="${HADOOP_OPTS} ${HADOOP_MOVER_OPTS}"
elif [ "$COMMAND" = "storagepolicies" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.StoragePolicyAdmin
elif [ "$COMMAND" = "jmxget" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.JMXGet
elif [ "$COMMAND" = "oiv" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB
elif [ "$COMMAND" = "oiv_legacy" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer
elif [ "$COMMAND" = "oev" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineEditsViewer.OfflineEditsViewer
elif [ "$COMMAND" = "fetchdt" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DelegationTokenFetcher
elif [ "$COMMAND" = "getconf" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.GetConf
elif [ "$COMMAND" = "groups" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.GetGroups
elif [ "$COMMAND" = "snapshotDiff" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.snapshot.SnapshotDiff
elif [ "$COMMAND" = "lsSnapshottableDir" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.snapshot.LsSnapshottableDir
elif [ "$COMMAND" = "portmap" ] ; then
  CLASS=org.apache.hadoop.portmap.Portmap
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_PORTMAP_OPTS"
elif [ "$COMMAND" = "nfs3" ] ; then
  CLASS=org.apache.hadoop.hdfs.nfs.nfs3.Nfs3
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NFS3_OPTS"
elif [ "$COMMAND" = "cacheadmin" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.CacheAdmin
elif [ "$COMMAND" = "crypto" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.CryptoAdmin
elif [ "$COMMAND" = "version" ] ; then
  CLASS=org.apache.hadoop.util.VersionInfo
elif [ "$COMMAND" = "debug" ]; then
  CLASS=org.apache.hadoop.hdfs.tools.DebugAdmin
elif [ "$COMMAND" = "classpath" ]; then
  if [ "$#" -gt 0 ]; then
    CLASS=org.apache.hadoop.util.Classpath
  else
    # No need to bother starting up a JVM for this simple case.
    if $cygwin; then
      CLASSPATH=$(cygpath -p -w "$CLASSPATH" 2>/dev/null)
    fi
    echo $CLASSPATH
    exit 0
  fi
else
  CLASS="$COMMAND"
fi

# cygwin path translation
if $cygwin; then
  CLASSPATH=$(cygpath -p -w "$CLASSPATH" 2>/dev/null)
  HADOOP_LOG_DIR=$(cygpath -w "$HADOOP_LOG_DIR" 2>/dev/null)
  HADOOP_PREFIX=$(cygpath -w "$HADOOP_PREFIX" 2>/dev/null)
  HADOOP_CONF_DIR=$(cygpath -w "$HADOOP_CONF_DIR" 2>/dev/null)
  HADOOP_COMMON_HOME=$(cygpath -w "$HADOOP_COMMON_HOME" 2>/dev/null)
  HADOOP_HDFS_HOME=$(cygpath -w "$HADOOP_HDFS_HOME" 2>/dev/null)
  HADOOP_YARN_HOME=$(cygpath -w "$HADOOP_YARN_HOME" 2>/dev/null)
  HADOOP_MAPRED_HOME=$(cygpath -w "$HADOOP_MAPRED_HOME" 2>/dev/null)
fi

export CLASSPATH=$CLASSPATH

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"

# Check to see if we should start a secure datanode
if [ "$starting_secure_dn" = "true" ]; then
  if [ "$HADOOP_PID_DIR" = "" ]; then
    HADOOP_SECURE_DN_PID="/tmp/hadoop_secure_dn.pid"
  else
    HADOOP_SECURE_DN_PID="$HADOOP_PID_DIR/hadoop_secure_dn.pid"
  fi

  JSVC=$JSVC_HOME/jsvc
  if [ ! -f $JSVC ]; then
    echo "JSVC_HOME is not set correctly so jsvc cannot be found. jsvc is required to run secure datanodes. "
    echo "Please download and install jsvc from http://archive.apache.org/dist/commons/daemon/binaries/ "\
      "and set JSVC_HOME to the directory containing the jsvc binary."
    exit
  fi

  if [[ ! $JSVC_OUTFILE ]]; then
    JSVC_OUTFILE="$HADOOP_LOG_DIR/jsvc.out"
  fi

  if [[ ! $JSVC_ERRFILE ]]; then
    JSVC_ERRFILE="$HADOOP_LOG_DIR/jsvc.err"
  fi

  exec "$JSVC" \
           -Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
           -errfile "$JSVC_ERRFILE" \
           -pidfile "$HADOOP_SECURE_DN_PID" \
           -nodetach \
           -user "$HADOOP_SECURE_DN_USER" \
           -cp "$CLASSPATH" \
           $JAVA_HEAP_MAX $HADOOP_OPTS \
           org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
elif [ "$starting_privileged_nfs" = "true" ] ; then
  if [ "$HADOOP_PID_DIR" = "" ]; then
    HADOOP_PRIVILEGED_NFS_PID="/tmp/hadoop_privileged_nfs3.pid"
  else
    HADOOP_PRIVILEGED_NFS_PID="$HADOOP_PID_DIR/hadoop_privileged_nfs3.pid"
  fi

  JSVC=$JSVC_HOME/jsvc
  if [ ! -f $JSVC ]; then
    echo "JSVC_HOME is not set correctly so jsvc cannot be found. jsvc is required to run privileged NFS gateways. "
    echo "Please download and install jsvc from http://archive.apache.org/dist/commons/daemon/binaries/ "\
      "and set JSVC_HOME to the directory containing the jsvc binary."
    exit
  fi

  if [[ ! $JSVC_OUTFILE ]]; then
    JSVC_OUTFILE="$HADOOP_LOG_DIR/nfs3_jsvc.out"
  fi

  if [[ ! $JSVC_ERRFILE ]]; then
    JSVC_ERRFILE="$HADOOP_LOG_DIR/nfs3_jsvc.err"
  fi

  exec "$JSVC" \
           -Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
           -errfile "$JSVC_ERRFILE" \
           -pidfile "$HADOOP_PRIVILEGED_NFS_PID" \
           -nodetach \
           -user "$HADOOP_PRIVILEGED_NFS_USER" \
           -cp "$CLASSPATH" \
           $JAVA_HEAP_MAX $HADOOP_OPTS \
           org.apache.hadoop.hdfs.nfs.nfs3.PrivilegedNfsGatewayStarter "$@"
else
  # run it
  exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
fi
The heart of this script is the long if/elif chain in the middle and the exec at the very end. The chain looks imposing but its logic is simple: based on the COMMAND passed in, it assigns CLASS and HADOOP_OPTS; the final exec then runs that class. Taking namenode from the first branch as an example, CLASS is set to org.apache.hadoop.hdfs.server.namenode.NameNode. This is a Java class, and executing it is what starts the namenode.
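So for start namenode, the whole chain bottoms out in a single JVM exec, roughly like the line below (JAVA and JAVA_HEAP_MAX are set by the sourced config scripts, and HADOOP_NAMENODE_OPTS has already been folded into HADOOP_OPTS by the branch above):

# Approximate final command for COMMAND=namenode:
exec "$JAVA" -Dproc_namenode $JAVA_HEAP_MAX $HADOOP_OPTS \
    org.apache.hadoop.hdfs.server.namenode.NameNode "$@"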
If you want to debug the HDFS source code, this is the best place to wire in remote debugging: each service has its own dedicated class and startup options here, so you can pinpoint exactly the service you need.
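One way to do that, using the standard JDWP agent (the port 8888 is an arbitrary choice), is to add the debug flags to the service's own opts variable in hadoop-env.sh before starting it:

# In hadoop-env.sh: make the NameNode JVM wait for a debugger on port 8888.
# suspend=y blocks startup until the IDE attaches; use suspend=n to start normally.
export HADOOP_NAMENODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8888 $HADOOP_NAMENODE_OPTS"

Because the elif chain appends HADOOP_NAMENODE_OPTS only for the namenode command, the flags affect just that one service.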
That concludes this detailed look at Hadoop startup and its scripts. The next article in this series is Hadoop Source Code Analysis (Part 4): Remote Debugging.