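FilenameUtils.getName Function Source Code Analysis
1. Background
Recently I used org.apache.commons.io.FilenameUtils#getName, a method that takes a file path and returns the file name. I took a quick look at the source. It is not complicated, but it differs slightly from what I had assumed and is worth learning from, so this post briefly analyzes it.
2. Source Code Analysis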
org.apache.commons.io.FilenameUtils#getName
/**
 * Gets the name minus the path from a full fileName.
 * <p>
 * This method will handle a file in either Unix or Windows format.
 * The text after the last forward or backslash is returned.
 * <pre>
 * a/b/c.txt --> c.txt
 * a.txt     --> a.txt
 * a/b/c     --> c
 * a/b/c/    --> ""
 * </pre>
 * <p>
 * The output will be the same irrespective of the machine that the code is running on.
 *
 * @param fileName  the fileName to query, null returns null
 * @return the name of the file without the path, or an empty string if none exists.
 * Null bytes inside string will be removed
 */
public static String getName(final String fileName) {
    // A null input returns null directly
    if (fileName == null) {
        return null;
    }
    // Non-NUL check
    requireNonNullChars(fileName);
    // Find the last separator
    final int index = indexOfLastSeparator(fileName);
    // Take everything after the last separator
    return fileName.substring(index + 1);
}
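To make the Javadoc table concrete, here is a minimal sketch that mirrors the three steps above using only the JDK. The class name GetNameSketch is hypothetical, and this is a re-implementation for illustration only, not the library's code.

```java
class GetNameSketch {

    // Mirrors the steps shown above: null check, NUL check, cut after the last separator.
    static String getName(String fileName) {
        if (fileName == null) {
            return null;
        }
        if (fileName.indexOf(0) >= 0) {
            throw new IllegalArgumentException("Null byte present in file/path name");
        }
        // Whichever of '/' or '\' appears last marks the end of the directory part.
        int index = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        // If no separator exists, index is -1 and the whole string is returned.
        return fileName.substring(index + 1);
    }

    public static void main(String[] args) {
        System.out.println(getName("a/b/c.txt"));          // c.txt
        System.out.println(getName("a.txt"));              // a.txt
        System.out.println(getName("a/b/c"));              // c
        System.out.println("[" + getName("a/b/c/") + "]"); // []
    }
}
```

The trailing-separator case returns an empty string rather than "c", which matches the last row of the Javadoc table.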
2.1 Question 1: Why is the non-NUL check needed?
2.1.1 How is the check performed?
org.apache.commons.io.FilenameUtils#requireNonNullChars
/**
 * Checks the input for null bytes, a sign of unsanitized data being passed to to file level functions.
 *
 * This may be used for poison byte attacks.
 *
 * @param path the path to check
 */
private static void requireNonNullChars(final String path) {
    if (path.indexOf(0) >= 0) {
        throw new IllegalArgumentException("Null byte present in file/path name. There are no "
            + "known legitimate use cases for such data, but several injection attacks may use it");
    }
}
java.lang.String#indexOf(int)
Source:
/**
 * Returns the index within this string of the first occurrence of
 * the specified character. If a character with value
 * {@code ch} occurs in the character sequence represented by
 * this {@code String} object, then the index (in Unicode
 * code units) of the first such occurrence is returned. For
 * values of {@code ch} in the range from 0 to 0xFFFF
 * (inclusive), this is the smallest value <i>k</i> such that:
 * <blockquote><pre>
 * this.charAt(<i>k</i>) == ch
 * </pre></blockquote>
 * is true. For other values of {@code ch}, it is the
 * smallest value <i>k</i> such that:
 * <blockquote><pre>
 * this.codePointAt(<i>k</i>) == ch
 * </pre></blockquote>
 * is true. In either case, if no such character occurs in this
 * string, then {@code -1} is returned.
 *
 * @param   ch   a character (Unicode code point).
 * @return  the index of the first occurrence of the character in the
 *          character sequence represented by this object, or
 *          {@code -1} if the character does not occur.
 */
public int indexOf(int ch) {
    return indexOf(ch, 0);
}
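From this we can see that indexOf(0) looks for the position of the character whose code is 0. If one is found, requireNonNullChars throws an IllegalArgumentException. According to the ASCII table, the value 0 is the control character NUL, which is not something an ordinary file name should contain.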
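As a small illustration of what indexOf(0) reports, the following sketch checks one string that contains a NUL character and one that does not. The class and method names (NulIndexDemo, firstNul) are made up for this example.

```java
class NulIndexDemo {

    // Returns the position of the first NUL character (code 0), or -1 if there is none.
    static int firstNul(String s) {
        return s.indexOf(0);
    }

    public static void main(String[] args) {
        // "hack.jsp" is 8 characters long, so the NUL sits at index 8.
        System.out.println(firstNul("hack.jsp\0.jpg")); // 8
        System.out.println(firstNul("photo.jpg"));      // -1
    }
}
```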
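2.1.2 Why is this check done?
A null byte is a byte whose value is 0, written 0x00 in hexadecimal. There are security vulnerabilities related to null bytes, because C uses the null byte as the string terminator while other languages (Java, PHP, and so on) do not. For example, suppose a Java web project only allows users to upload .jpg images; this discrepancy can be exploited to upload a .jsp file. If a user uploads a file named hack.jsp<NUL>.jpg, Java considers the name to match the .jpg format, but when a C system function is eventually called to write the file to disk, <NUL> is treated as the string terminator and the file is saved as hack.jsp. Some programming languages do not allow <NUL> in file names; if the language you use does not handle this, you have to handle it yourself. So this check is well worth having.
Code example: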
package org.example;

import org.apache.commons.io.FilenameUtils;

public class FilenameDemo {
    public static void main(String[] args) {
        String filename = "hack.jsp\0.jpg";
        System.out.println(FilenameUtils.getName(filename));
    }
}
Error message:
Exception in thread "main" java.lang.IllegalArgumentException: Null byte present in file/path name. There are no known legitimate use cases for such data, but several injection attacks may use it
at org.apache.commons.io.FilenameUtils.requireNonNullChars(FilenameUtils.java:998)
at org.apache.commons.io.FilenameUtils.getName(FilenameUtils.java:984)
at org.example.FilenameDemo.main(FilenameDemo.java:8)
If the check is removed:
package org.example;

import org.apache.commons.io.FilenameUtils;

public class FilenameDemo {
    public static void main(String[] args) {
        String filename = "hack.jsp\0.jpg";
        // No NUL check here
        String name = getName(filename);
        // Get the extension
        String extension = FilenameUtils.getExtension(name);
        System.out.println(extension);
    }

    public static String getName(final String fileName) {
        if (fileName == null) {
            return null;
        }
        final int index = FilenameUtils.indexOfLastSeparator(fileName);
        return fileName.substring(index + 1);
    }
}
Java does indeed identify the extension as jpg:
jpg
On JDK 8 and later, an attempt to create a file named hack.jsp\0.jpg runs into a similar check at a lower level, so the file cannot be created.
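One way to observe a JDK-level rejection is through java.nio.file, which refuses to build a Path from a string containing NUL. This is a minimal sketch assuming the default file system provider. It shows the NIO route rather than necessarily the exact check the paragraph above refers to, and the names NulPathDemo and isRejected are hypothetical.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

class NulPathDemo {

    // Returns true if the JDK refuses to turn the string into a Path.
    static boolean isRejected(String name) {
        try {
            Paths.get(name);
            return false;
        } catch (InvalidPathException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected("hack.jsp\0.jpg")); // true
        System.out.println(isRejected("hack.jpg"));       // false
    }
}
```

No file is touched here: the rejection happens while parsing the path string, before any I/O.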
If you are interested, try using C to write a file named hack.jsp\0.jpg; the resulting file name will most likely be hack.jsp.
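For readers without a C toolchain at hand, the mismatch can be approximated in Java. The sketch below only simulates the "stop at the first NUL" rule of C strings; it does not call any native API, and the names CStringViewDemo and asCString are invented for illustration.

```java
class CStringViewDemo {

    // Simulates how an API that treats NUL as the string terminator would read the name.
    static String asCString(String s) {
        int nul = s.indexOf(0);
        return nul < 0 ? s : s.substring(0, nul);
    }

    public static void main(String[] args) {
        String uploaded = "hack.jsp\0.jpg";
        // Java sees the full string, so an extension whitelist check passes.
        System.out.println(uploaded.endsWith(".jpg")); // true
        // A NUL-terminated view of the same data stops before ".jpg".
        System.out.println(asCString(uploaded));       // hack.jsp
    }
}
```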
2.2 Question 2: Why not pick the separator based on the current operating system?
Finding the last separator: org.apache.commons.io.FilenameUtils#indexOfLastSeparator
/**
 * Returns the index of the last directory separator character.
 * <p>
 * This method will handle a file in either Unix or Windows format.
 * The position of the last forward or backslash is returned.
 * <p>
 * The output will be the same irrespective of the machine that the code is running on.
 *
 * @param fileName  the fileName to find the last path separator in, null returns -1
 * @return the index of the last separator character, or -1 if there
 * is no such character
 */
public static int indexOfLastSeparator(final String fileName) {
    if (fileName == null) {
        return NOT_FOUND;
    }
    final int lastUnixPos = fileName.lastIndexOf(UNIX_SEPARATOR);
    final int lastWindowsPos = fileName.lastIndexOf(WINDOWS_SEPARATOR);
    return Math.max(lastUnixPos, lastWindowsPos);
}
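The purpose of the method is to get the file name, so at the semantic level it has to return the correct file name regardless of which system's separator the path uses. Imagine calling it on Windows with a Unix-style path: would it be reasonable for it to return the wrong file name? Compatibility should be part of the function's design, so it cannot cut the name using only the current system's separator. The source looks up both the Windows and the Unix separator and uses whichever appears last, which is clearly the more sensible approach.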
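The difference can be seen in a small sketch that compares a single-separator approach with the "either separator" approach. Both methods below are hypothetical re-implementations written for this comparison (the separator is passed in explicitly so the result does not depend on the machine running the code); neither is the library's code.

```java
class SeparatorDemo {

    // Naive variant: only knows one separator, for example the current platform's.
    static String nameWithSingleSeparator(String path, char separator) {
        return path.substring(path.lastIndexOf(separator) + 1);
    }

    // Variant in the spirit of indexOfLastSeparator: whichever separator appears last wins.
    static String nameWithEitherSeparator(String path) {
        int index = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(index + 1);
    }

    public static void main(String[] args) {
        String windowsPath = "C:\\docs\\report.txt";
        // Cutting a Windows path with the Unix separator finds nothing to cut.
        System.out.println(nameWithSingleSeparator(windowsPath, '/'));  // C:\docs\report.txt
        System.out.println(nameWithEitherSeparator(windowsPath));       // report.txt
        System.out.println(nameWithEitherSeparator("docs/report.txt")); // report.txt
    }
}
```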
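3. Zoom Out
3.1 Code robustness
In day-to-day coding we should program defensively and guard against wrong or illegal input.
3.2 Code rigor
Never take things for granted when writing code. First think through exactly what the function is supposed to do, rather than being a "copy-and-paste engineer" who copies code without thinking. We should also write proper unit tests that consider the various abnormal cases, so that both normal and abnormal cases are covered.
3.3 How to write comments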
org.apache.commons.io.FilenameUtils#requireNonNullChars
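The Javadoc of this method states the reason for the design: This may be used for poison byte attacks.
Comments should not mumble things that are already obvious. For a design that is likely to confuse readers, the comment must explain why it was designed that way.
Beyond that, here are a few more commenting techniques I recommend from work experience: (1) For a somewhat complex or important design, use a comment to lay out the core design idea. For example: java.util.concurrent.ThreadPoolExecutor#execute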
/**
 * Executes the given task sometime in the future.  The task
 * may execute in a new thread or in an existing pooled thread.
 *
 * If the task cannot be submitted for execution, either because this
 * executor has been shutdown or because its capacity has been reached,
 * the task is handled by the current {@link RejectedExecutionHandler}.
 *
 * @param command the task to execute
 * @throws RejectedExecutionException at discretion of
 *         {@code RejectedExecutionHandler}, if the task
 *         cannot be accepted for execution
 * @throws NullPointerException if {@code command} is null
 */
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    else if (!addWorker(command, false))
        reject(command);
}
(2) For related code, use @see or {@link } so that readers can jump straight to the related code.
/**
 * Sets the core number of threads.  This overrides any value set
 * in the constructor.  If the new value is smaller than the
 * current value, excess existing threads will be terminated when
 * they next become idle.  If larger, new threads will, if needed,
 * be started to execute any queued tasks.
 *
 * @param corePoolSize the new core size
 * @throws IllegalArgumentException if {@code corePoolSize < 0}
 *         or {@code corePoolSize} is greater than the {@linkplain
 *         #getMaximumPoolSize() maximum pool size}
 * @see #getCorePoolSize
 */
public void setCorePoolSize(int corePoolSize) {
    if (corePoolSize < 0 || maximumPoolSize < corePoolSize)
        throw new IllegalArgumentException();
    int delta = corePoolSize - this.corePoolSize;
    this.corePoolSize = corePoolSize;
    if (workerCountOf(ctl.get()) > corePoolSize)
        interruptIdleWorkers();
    else if (delta > 0) {
        // We don't really know how many new threads are "needed".
        // As a heuristic, prestart enough new workers (up to new
        // core size) to handle the current number of tasks in
        // queue, but stop if queue becomes empty while doing so.
        int k = Math.min(delta, workQueue.size());
        while (k-- > 0 && addWorker(null, true)) {
            if (workQueue.isEmpty())
                break;
        }
    }
}
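(3) In day-to-day business development, I strongly recommend also putting links to the related documents and configuration pages into comments; this makes later maintenance much easier. For example: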
/**
 * Some feature
 *
 * Related documents:
 * <a href="...">Design document</a>
 * <a href="...">Third-party API address</a>
 */
public void demo() {
    // omitted
}
(4) For utility classes, consider listing common inputs together with their outputs. For example: org.apache.commons.lang3.StringUtils#center(java.lang.String, int, char)
/**
 * <p>Centers a String in a larger String of size {@code size}.
 * Uses a supplied character as the value to pad the String with.</p>
 *
 * <p>If the size is less than the String length, the String is returned.
 * A {@code null} String returns {@code null}.
 * A negative size is treated as zero.</p>
 *
 * <pre>
 * StringUtils.center(null, *, *)     = null
 * StringUtils.center("", 4, ' ')     = "    "
 * StringUtils.center("ab", -1, ' ')  = "ab"
 * StringUtils.center("ab", 4, ' ')   = " ab "
 * StringUtils.center("abcd", 2, ' ') = "abcd"
 * StringUtils.center("a", 4, ' ')    = " a  "
 * StringUtils.center("a", 4, 'y')    = "yayy"
 * </pre>
 *
 * @param str  the String to center, may be null
 * @param size  the int size of new String, negative treated as zero
 * @param padChar  the character to pad the new String with
 * @return centered String, {@code null} if null String input
 * @since 2.0
 */
public static String center(String str, final int size, final char padChar) {
    if (str == null || size <= 0) {
        return str;
    }
    final int strLen = str.length();
    final int pads = size - strLen;
    if (pads <= 0) {
        return str;
    }
    str = leftPad(str, strLen + pads / 2, padChar);
    str = rightPad(str, size, padChar);
    return str;
}
(5) For deprecated methods, always state why they are deprecated and give the replacement. For example: java.security.Signature#setParameter(java.lang.String, java.lang.Object)
/**
 * (partially omitted)
 *
 * @see #getParameter
 *
 * @deprecated Use
 * {@link #setParameter(java.security.spec.AlgorithmParameterSpec)
 * setParameter}.
 */
@Deprecated
public final void setParameter(String param, Object value)
        throws InvalidParameterException {
    engineSetParameter(param, value);
}
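4. Summary
The code in many excellent open-source projects is designed very rigorously, and even simple-looking code often reflects careful thought. When time allows, read some of these projects, starting with the simple parts. Think first about how you would implement the feature yourself, then compare that with the author's approach; you will get more out of it that way. When reading source code, knowing what the code looks like is not enough; understand why it was designed that way.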