源碼解析springbatch的job運行機制
源碼解析springbatch的job是如何運行的?
注,本文中的demo代碼節(jié)選于圖書《Spring Batch批處理框架》的配套源代碼,并做并適配springboot升級版本,完全開源。
SpringBatch的背景和用法,就不再贅述了,默認本文受眾都使用過batch框架。
本文僅討論普通的ChunkStep,分片/異步處理等功能暫不討論。
1. 表結構
Spring系列的框架代碼,大多又臭又長,讓人頭暈。先列出整體流程,再去看源碼。順帶也可以了解存儲表結構。
- 每一個jobname,加運行參數(shù)的MD5值,被定義為一個job_instance,存儲在batch_job_instance表中;
- job_instance每次運行時,會創(chuàng)建一個新的job_execution,存儲在batch_job_execution / batch_job_execution_context 表中;
擴展:任務重啟時,如何續(xù)作? 答,判定為任務續(xù)作,創(chuàng)建新的job_execution時,會使用舊job_execution的運行態(tài)ExecutionContext(通俗講,火車出故障只換了車頭,車廂貨物不變。)
- job_execution會根據(jù)job排程中的step順序,逐個執(zhí)行,逐個轉(zhuǎn)化為step_execution,并存儲在batch_step_execution / batch_step_execution_context表中
- 每個step在執(zhí)行時,會維護step運行狀態(tài),當出現(xiàn)異?;蛘哒麄€step清單執(zhí)行完成,會更新job_execution的狀態(tài)
- 在每個step執(zhí)行前后、job_execution前后,都會通知Listener做回調(diào)。
框架使用的表
batch_job_instance batch_job_execution batch_job_execution_context batch_job_execution_params batch_step_execution batch_step_execution_context batch_job_seq batch_step_execution_seq batch_job_execution_seq
2. API入口
先看看怎么調(diào)用啟動Job的API,看起來非常簡單,傳入job信息和參數(shù)即可
@Autowired @Qualifier("billJob") private Job job; @Test public void billJob() throws Exception { JobParameters jobParameters = new JobParametersBuilder() .addLong("currentTimeMillis", System.currentTimeMillis()) .addString("batchNo","2022080402") .toJobParameters(); JobExecution result = jobLauncher.run(job, jobParameters); System.out.println(result.toString()); Thread.sleep(6000); }
<!-- 賬單作業(yè) --> <batch:job id="billJob"> <batch:step id="billStep"> <batch:tasklet transaction-manager="transactionManager"> <batch:chunk reader="csvItemReader" writer="csvItemWriter" processor="creditBillProcessor" commit-interval="3"> </batch:chunk> </batch:tasklet> </batch:step> </batch:job>
org.springframework.batch.core.launch.support.SimpleJobLauncher#run
// 簡化部分代碼(參數(shù)檢查、log日志) @Override public JobExecution run(final Job job, final JobParameters jobParameters){ final JobExecution jobExecution; JobExecution lastExecution = jobRepository.getLastJobExecution(job.getName(), jobParameters); // 上次執(zhí)行存在,說明本次請求是重啟job,先做檢查 if (lastExecution != null) { if (!job.isRestartable()) { throw new JobRestartException("JobInstance already exists and is not restartable"); } /* 檢查stepExecutions的狀態(tài) * validate here if it has stepExecutions that are UNKNOWN, STARTING, STARTED and STOPPING * retrieve the previous execution and check */ for (StepExecution execution : lastExecution.getStepExecutions()) { BatchStatus status = execution.getStatus(); if (status.isRunning() || status == BatchStatus.STOPPING) { throw new JobExecutionAlreadyRunningException("A job execution for this job is already running: " + lastExecution); } else if (status == BatchStatus.UNKNOWN) { throw new JobRestartException( "Cannot restart step [" + execution.getStepName() + "] from UNKNOWN status. "); } } } // Check jobParameters job.getJobParametersValidator().validate(jobParameters); // 創(chuàng)建JobExecution 同一個job+參數(shù),只能有一個Execution執(zhí)行器 jobExecution = jobRepository.createJobExecution(job.getName(), jobParameters); try { // SyncTaskExecutor 看似是異步,實際是同步執(zhí)行(可擴展) taskExecutor.execute(new Runnable() { @Override public void run() { try { // 關鍵入口,請看[org.springframework.batch.core.job.AbstractJob#execute] job.execute(jobExecution); if (logger.isInfoEnabled()) { Duration jobExecutionDuration = BatchMetrics.calculateDuration(jobExecution.getStartTime(), jobExecution.getEndTime()); } } catch (Throwable t) { rethrow(t); } } private void rethrow(Throwable t) { // 省略各類拋異常 throw new IllegalStateException(t); } }); } catch (TaskRejectedException e) { // 更新job_execution的運行狀態(tài) jobExecution.upgradeStatus(BatchStatus.FAILED); if (jobExecution.getExitStatus().equals(ExitStatus.UNKNOWN)) { jobExecution.setExitStatus(ExitStatus.FAILED.addExitDescription(e)); } jobRepository.update(jobExecution); } return jobExecution; }
3. 深入代碼流程
簡單看看API入口,子類劃分較多,繼續(xù)往后看
總體代碼流程
- org.springframework.batch.core.launch.support.SimpleJobLauncher#run 入口api,構建jobExecution
- org.springframework.batch.core.job.AbstractJob#execute 對jobExecution進行執(zhí)行、listener的前置處理
- FlowJob#doExecute -> SimpleFlow#start 按順序逐個處理Step、構建stepExecution
- JobFlowExecutor#executeStep -> SimpleStepHandler#handleStep -> AbstractStep#execute 執(zhí)行stepExecution
- TaskletStep#doExecute 通過RepeatTemplate,調(diào)用TransactionTemplate方法,在事務中執(zhí)行
- 內(nèi)部類TaskletStep.ChunkTransactionCallback#doInTransaction
- 反復調(diào)起ChunkOrientedTasklet#execute 去執(zhí)行read-process-writer方法,
- 通過自定義的Reader得到inputs,例如本文實現(xiàn)的是flatReader讀取csv文件
- 遍歷inputs,將item逐個傳入,調(diào)用processor處理
- 調(diào)用writer,將outputs一次性寫入
- 不同reader的實現(xiàn)內(nèi)容不同,通過緩存讀取的行數(shù)等信息,可做到分片、按數(shù)量處理chunk
JobExecution的處理過程
org.springframework.batch.core.job.AbstractJob#execute
/** 運行給定的job,處理全部listener和DB存儲的調(diào)用 * Run the specified job, handling all listener and repository calls, and * delegating the actual processing to {@link #doExecute(JobExecution)}. * * @see Job#execute(JobExecution) * @throws StartLimitExceededException * if start limit of one of the steps was exceeded */ @Ovrride public final void execute(JobExecution execution) { // 同步控制器,防并發(fā)執(zhí)行 JobSynchronizationManager.register(execution); // 計時器,記錄耗時 LongTaskTimer longTaskTimer = BatchMetrics.createLongTaskTimer("job.active", "Active jobs", Tag.of("name", execution.getJobInstance().getJobName())); LongTaskTimer.Sample longTaskTimerSample = longTaskTimer.start(); Timer.Sample timerSample = BatchMetrics.createTimerSample(); try { // 參數(shù)再次進行校驗 jobParametersValidator.validate(execution.getJobParameters()); if (execution.getStatus() != BatchStatus.STOPPING) { // 更新db中任務狀態(tài) execution.setStartTime(new Date()); updateStatus(execution, BatchStatus.STARTED); // 回調(diào)所有l(wèi)istener的beforeJob方法 listener.beforeJob(execution); try { doExecute(execution); } catch (RepeatException e) { throw e.getCause(); // 搞不懂這里包一個RepeatException 有啥用 } } else { // 任務狀態(tài)時BatchStatus.STOPPING,說明任務已經(jīng)停止,直接改成STOPPED // The job was already stopped before we even got this far. Deal // with it in the same way as any other interruption. execution.setStatus(BatchStatus.STOPPED); execution.setExitStatus(ExitStatus.COMPLETED); } } catch (JobInterruptedException e) { // 任務被打斷 STOPPED execution.setExitStatus(getDefaultExitStatusForFailure(e, execution)); execution.setStatus(BatchStatus.max(BatchStatus.STOPPED, e.getStatus())); execution.addFailureException(e); } catch (Throwable t) { // 其他原因失敗 FAILED logger.error("Encountered fatal error executing job", t); execution.setExitStatus(getDefaultExitStatusForFailure(t, execution)); execution.setStatus(BatchStatus.FAILED); execution.addFailureException(t); } finally { try { if (execution.getStatus().isLessThanOrEqualTo(BatchStatus.STOPPED) && execution.getStepExecutions().isEmpty()) { ExitStatus exitStatus = execution.getExitStatus(); ExitStatus newExitStatus = ExitStatus.NOOP.addExitDescription("All steps already completed or no steps configured for this job."); execution.setExitStatus(exitStatus.and(newExitStatus)); } // 計時器 計算總耗時 timerSample.stop(BatchMetrics.createTimer("job", "Job duration", Tag.of("name", execution.getJobInstance().getJobName()), Tag.of("status", execution.getExitStatus().getExitCode()) )); longTaskTimerSample.stop(); execution.setEndTime(new Date()); try { // 回調(diào)所有l(wèi)istener的afterJob方法 調(diào)用失敗也不影響任務完成 listener.afterJob(execution); } catch (Exception e) { logger.error("Exception encountered in afterJob callback", e); } // 寫入db jobRepository.update(execution); } finally { // 釋放控制 JobSynchronizationManager.release(); } } }
3.1何時調(diào)用Reader?
在SimpleChunkProvider#provide中會分次調(diào)用reader,并將結果包裝為Chunk返回。
其中有幾個細節(jié),此處不再贅述。
- 如何控制一次讀取幾個item?
- 如何控制最后一行讀完就不讀了?
- 如果需要跳過文件頭的前N行,怎么處理?
- 在StepContribution中記錄讀取數(shù)量
org.springframework.batch.core.step.item.SimpleChunkProcessor#process @Nullable @Override public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception { @SuppressWarnings("unchecked") Chunk<I> inputs = (Chunk<I>) chunkContext.getAttribute(INPUTS_KEY); if (inputs == null) { inputs = chunkProvider.provide(contribution); if (buffering) { chunkContext.setAttribute(INPUTS_KEY, inputs); } } chunkProcessor.process(contribution, inputs); chunkProvider.postProcess(contribution, inputs); // Allow a message coming back from the processor to say that we // are not done yet if (inputs.isBusy()) { logger.debug("Inputs still busy"); return RepeatStatus.CONTINUABLE; } chunkContext.removeAttribute(INPUTS_KEY); chunkContext.setComplete(); if (logger.isDebugEnabled()) { logger.debug("Inputs not busy, ended: " + inputs.isEnd()); } return RepeatStatus.continueIf(!inputs.isEnd()); }
3.2何時調(diào)用Processor/Writer?
在RepeatTemplate和外圍事務模板的包裝下,通過SimpleChunkProcessor進行處理:
- 查出若干條數(shù)的items,打包為Chunk
- 遍歷items,逐個item調(diào)用processor
- 通知StepListener,環(huán)繞處理調(diào)用before/after方法
// 忽略無關代碼... @Override public final void process(StepContribution contribution, Chunk<I> inputs) throws Exception { // 輸入為空,直接返回If there is no input we don't have to do anything more if (isComplete(inputs)) { return; } // Make the transformation, calling remove() on the inputs iterator if // any items are filtered. Might throw exception and cause rollback. Chunk<O> outputs = transform(contribution, inputs); // Adjust the filter count based on available data contribution.incrementFilterCount(getFilterCount(inputs, outputs)); // Adjust the outputs if necessary for housekeeping purposes, and then // write them out... write(contribution, inputs, getAdjustedOutputs(inputs, outputs)); } // 遍歷items,逐個item調(diào)用processor protected Chunk<O> transform(StepContribution contribution, Chunk<I> inputs) throws Exception { Chunk<O> outputs = new Chunk<>(); for (Chunk<I>.ChunkIterator iterator = inputs.iterator(); iterator.hasNext();) { final I item = iterator.next(); O output; String status = BatchMetrics.STATUS_SUCCESS; try { output = doProcess(item); } catch (Exception e) { /* * For a simple chunk processor (no fault tolerance) we are done here, so prevent any more processing of these inputs. */ inputs.clear(); status = BatchMetrics.STATUS_FAILURE; throw e; } if (output != null) { outputs.add(output); } else { iterator.remove(); } } return outputs; }
4. 每個step是如何與事務處理掛鉤?
在TaskletStep#doExecute中會使用TransactionTemplate,包裝事務操作
標準的事務操作,通過函數(shù)式編程風格,從action的CallBack調(diào)用實際處理方法
- 通過transactionManager獲取事務
- 執(zhí)行操作
- 無異常,則提交事務
- 若異常,則回滾
// org.springframework.batch.core.step.tasklet.TaskletStep#doExecute result = new TransactionTemplate(transactionManager, transactionAttribute) .execute(new ChunkTransactionCallback(chunkContext, semaphore)); // 事務啟用過程 // org.springframework.transaction.support.TransactionTemplate#execute @Override @Nullable public <T> T execute(TransactionCallback<T> action) throws TransactionException { Assert.state(this.transactionManager != null, "No PlatformTransactionManager set"); if (this.transactionManager instanceof CallbackPreferringPlatformTransactionManager) { return ((CallbackPreferringPlatformTransactionManager) this.transactionManager).execute(this, action); } else { TransactionStatus status = this.transactionManager.getTransaction(this); T result; try { result = action.doInTransaction(status); } catch (RuntimeException | Error ex) { // Transactional code threw application exception -> rollback rollbackOnException(status, ex); throw ex; } catch (Throwable ex) { // Transactional code threw unexpected exception -> rollback rollbackOnException(status, ex); throw new UndeclaredThrowableException(ex, "TransactionCallback threw undeclared checked exception"); } this.transactionManager.commit(status); return result; } }
5. 怎么控制每個chunk幾條記錄提交一次事務? 控制每個事務窗口處理的item數(shù)量
在配置任務時,有個step級別的參數(shù),[commit-interval],用于每個事務窗口提交的控制被處理的item數(shù)量。
RepeatTemplate#executeInternal 在處理單條item后,會查看已處理完的item數(shù)量,與配置的chunk數(shù)量做比較,如果滿足chunk數(shù),則不再繼續(xù),準備提交事務。
StepBean在初始化時,會新建SimpleCompletionPolicy(chunkSize會優(yōu)先使用配置值,默認是5)
在每個chunk處理開始時,都會調(diào)用SimpleCompletionPolicy#start新建RepeatContextSupport#count用于計數(shù)。
源碼(簡化) org.springframework.batch.repeat.support.RepeatTemplate#executeInternal
/** * Internal convenience method to loop over interceptors and batch * callbacks. * @param callback the callback to process each element of the loop. */ private RepeatStatus executeInternal(final RepeatCallback callback) { // Reset the termination policy if there is one... // 此處會調(diào)用completionPolicy.start方法,更新chunk的計數(shù)器 RepeatContext context = start(); // Make sure if we are already marked complete before we start then no processing takes place. // 通過running字段來判斷是否繼續(xù)處理next boolean running = !isMarkedComplete(context); // 省略listeners處理.... // Return value, default is to allow continued processing. RepeatStatus result = RepeatStatus.CONTINUABLE; RepeatInternalState state = createInternalState(context); try { while (running) { /* * Run the before interceptors here, not in the task executor so * that they all happen in the same thread - it's easier for * tracking batch status, amongst other things. */ // 省略listeners處理.... if (running) { try { // callback是實際處理方法,類似函數(shù)式編程 result = getNextResult(context, callback, state); executeAfterInterceptors(context, result); } catch (Throwable throwable) { doHandle(throwable, context, deferred); } // 檢查當前chunk是否處理完,決策出是否繼續(xù)處理下一條item // N.B. the order may be important here: if (isComplete(context, result) || isMarkedComplete(context) || !deferred.isEmpty() { running = false; } } } result = result.and(waitForResults(state)); // 省略throwables處理.... // Explicitly drop any references to internal state... state = null; } finally { // 省略代碼... } return result; }
總結
JSR-352標準定義了Java批處理的基本模型,包含批處理的元數(shù)據(jù)像 JobExecutions,JobInstances,StepExecutions 等等。通過此類模型,提供了許多基礎組件與擴展點:
- 完善的基礎組件
Spring Batch 有很多的這類組件 例如 ItemReaders,ItemWriters,PartitionHandlers 等等對應各類數(shù)據(jù)和環(huán)境。
- 豐富的配置
JSR-352 定義了基于XML的任務設置模型。Spring Batch 提供了基于Java (類型安全的)的配置方式
- 可伸縮性
伸縮性選項-Local Partitioning 已經(jīng)包含在JSR -352 里面了。但是還應該有更多的選擇 ,例如Spring Batch 提供的 Multi-threaded Step,Remote Partitioning ,Parallel Step,Remote Chunking 等等選項
- 擴展點
良好的listener模式,提供step/job運行前后的錨點,以供開發(fā)人員個性化處理批處理流程。
2013年, JSR-352標準包含在 JavaEE7中發(fā)布,到2022年已近10年,Spring也在探索新的批處理模式, 如Spring Attic /Spring Cloud Data Flow。 https://docs.spring.io/spring-batch/docs/current/reference/html/jsr-352.html
擴展
1. Job/Step運行時的上下文,是如何保存?如何控制?
整個Job在運行時,會將運行信息保存在JobContext中。 類似的,Step運行時也有StepContext??梢栽贑ontext中保存一些參數(shù),在任務或者步驟中傳遞使用。
查看JobContext/StepContext源碼,發(fā)現(xiàn)僅用了普通變量保存Execution,這個類肯定有線程安全問題。 生產(chǎn)環(huán)境中常常出現(xiàn)多個任務并處處理的情況。
SpringBatch用了幾種方式來包裝并發(fā)安全:
- 每個job初始化時,通過JobExecution新建了JobContext,每個任務線程都用自己的對象。
- 使用JobSynchronizationManager,內(nèi)含一個ConcurrentHashMap,KEY是JobExecution,VALUE是JobContext
- 在任務解釋時,會移除當前JobExecution對應的k-v
此處能看到,如果在JobExecution存儲大量的業(yè)務數(shù)據(jù),會導致無法GC回收,導致OOM。所以在上下文中,只應保存精簡的數(shù)據(jù)。
2. step執(zhí)行時,如果出現(xiàn)異常,如何保護運行狀態(tài)?
在源碼中,使用了各類同步控制和加鎖、oldVersion版本拷貝,整體比較復雜(org.springframework.batch.core.step.tasklet.TaskletStep.ChunkTransactionCallback#doInTransaction)
1.oldVersion版本拷貝:上一次運行出現(xiàn)異常時,本次執(zhí)行時沿用上次的斷點內(nèi)容
// 節(jié)選部分代碼 oldVersion = new StepExecution(stepExecution.getStepName(), stepExecution.getJobExecution()); copy(stepExecution, oldVersion); private void copy(final StepExecution source, final StepExecution target) { target.setVersion(source.getVersion()); target.setWriteCount(source.getWriteCount()); target.setFilterCount(source.getFilterCount()); target.setCommitCount(source.getCommitCount()); target.setExecutionContext(new ExecutionContext(source.getExecutionContext())); }
2.信號量控制,在每個chunk運行完成后,需先獲取鎖,再更新stepExecution前Shared semaphore per step execution, so other step executions can run in parallel without needing the lockSemaphore (org.springframework.batch.core.step.tasklet.TaskletStep#doExecute)
// 省略無關代碼 try { try { // 執(zhí)行w-p-r模型方法 result = tasklet.execute(contribution, chunkContext); if (result == null) { result = RepeatStatus.FINISHED; } } catch (Exception e) { // 省略... } } finally { // If the step operations are asynchronous then we need to synchronize changes to the step execution (at a // minimum). Take the lock *before* changing the step execution. try { // 獲取鎖 semaphore.acquire(); locked = true; } catch (InterruptedException e) { logger.error("Thread interrupted while locking for repository update"); stepExecution.setStatus(BatchStatus.STOPPED); stepExecution.setTerminateOnly(); Thread.currentThread().interrupt(); } stepExecution.apply(contribution); } stepExecutionUpdated = true; stream.update(stepExecution.getExecutionContext()); try { // 更新上下文、DB中的狀態(tài) // Going to attempt a commit. If it fails this flag will stay false and we can use that later. getJobRepository().updateExecutionContext(stepExecution); stepExecution.incrementCommitCount(); getJobRepository().update(stepExecution); } catch (Exception e) { // If we get to here there was a problem saving the step execution and we have to fail. String msg = "JobRepository failure forcing rollback"; logger.error(msg, e); throw new FatalStepExecutionException(msg, e); }
到此這篇關于springbatch的job是如何運行的?的文章就介紹到這了,更多相關springbatch job運行內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關文章希望大家以后多多支持腳本之家!
相關文章
request如何獲取完整url(包括域名、端口、參數(shù))
這篇文章主要介紹了request如何獲取完整url(包括域名、端口、參數(shù))問題,具有很好的參考價值,希望對大家有所幫助,如有錯誤或未考慮完全的地方,望不吝賜教2023-12-12關于Object中equals方法和hashCode方法判斷的分析
今天小編就為大家分享一篇關于關于Object中equals方法和hashCode方法判斷的分析,小編覺得內(nèi)容挺不錯的,現(xiàn)在分享給大家,具有很好的參考價值,需要的朋友一起跟隨小編來看看吧2019-01-01Spring @Cacheable redis異常不影響正常業(yè)務方案
這篇文章主要介紹了Spring @Cacheable redis異常不影響正常業(yè)務方案,文中通過示例代碼介紹的非常詳細,對大家的學習或者工作具有一定的參考學習價值,需要的朋友們下面隨著小編來一起學習學習吧2021-02-02