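A Detailed Look at String Splitting with StringTokenizer in Java
Preface
The java.util package provides StringTokenizer, a utility class for splitting strings. It is often used inside the string utility classes of common frameworks, such as Spring's StringUtils.
For example, the following method in Spring's StringUtils: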
public static String[] tokenizeToStringArray(
        @Nullable String str, String delimiters, boolean trimTokens, boolean ignoreEmptyTokens) {
    if (str == null) {
        return EMPTY_STRING_ARRAY;
    }
    StringTokenizer st = new StringTokenizer(str, delimiters);
    List<String> tokens = new ArrayList<>();
    while (st.hasMoreTokens()) {
        String token = st.nextToken();
        if (trimTokens) {
            token = token.trim();
        }
        if (!ignoreEmptyTokens || token.length() > 0) {
            tokens.add(token);
        }
    }
    return toStringArray(tokens);
}
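The method above relies on Spring internals (EMPTY_STRING_ARRAY, toStringArray). The following is a minimal stand-alone sketch of the same logic using only the JDK. The class name TokenizeSketch and the sample input are made up for illustration and are not part of Spring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class: a JDK-only imitation of the Spring method shown above.
class TokenizeSketch {

    static String[] tokenize(String str, String delimiters,
                             boolean trimTokens, boolean ignoreEmptyTokens) {
        if (str == null) {
            return new String[0];
        }
        StringTokenizer st = new StringTokenizer(str, delimiters);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (trimTokens) {
                token = token.trim();
            }
            // StringTokenizer never returns an empty token by itself;
            // a token can only become empty here as a result of trimming.
            if (!ignoreEmptyTokens || !token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Split on ',' and ';', trimming tokens and dropping empty ones.
        System.out.println(Arrays.toString(tokenize("a, b ; ;c", ",;", true, true)));
        // prints [a, b, c]
    }
}
```

With trimTokens and ignoreEmptyTokens both set to false, the same input yields four tokens, because the whitespace-only piece between the two semicolons is kept.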
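Another example is the CronExpression class in the Quartz scheduling framework. Its buildExpression method parses a cron expression, which consists of up to 7 whitespace-separated sub-expressions (as the code below shows, the first 6 are required and the 7th, the year, defaults to "*"). The cron string is also split with StringTokenizer: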
protected void buildExpression(String expression) throws ParseException {
    this.expressionParsed = true;
    try {
        if (this.seconds == null) {
            this.seconds = new TreeSet();
        }
        if (this.minutes == null) {
            this.minutes = new TreeSet();
        }
        if (this.hours == null) {
            this.hours = new TreeSet();
        }
        if (this.daysOfMonth == null) {
            this.daysOfMonth = new TreeSet();
        }
        if (this.months == null) {
            this.months = new TreeSet();
        }
        if (this.daysOfWeek == null) {
            this.daysOfWeek = new TreeSet();
        }
        if (this.years == null) {
            this.years = new TreeSet();
        }
        int exprOn = 0;
        // Outer split: sub-expressions are separated by spaces or tabs.
        // In this decompiled code, 76 is the char 'L' and 35 is the char '#'.
        for(StringTokenizer exprsTok = new StringTokenizer(expression, " \t", false); exprsTok.hasMoreTokens() && exprOn <= 6; ++exprOn) {
            String expr = exprsTok.nextToken().trim();
            if (exprOn == 3 && expr.indexOf(76) != -1 && expr.length() > 1 && expr.contains(",")) {
                throw new ParseException("Support for specifying 'L' and 'LW' with other days of the month is not implemented", -1);
            }
            if (exprOn == 5 && expr.indexOf(76) != -1 && expr.length() > 1 && expr.contains(",")) {
                throw new ParseException("Support for specifying 'L' with other days of the week is not implemented", -1);
            }
            if (exprOn == 5 && expr.indexOf(35) != -1 && expr.indexOf(35, expr.indexOf(35) + 1) != -1) {
                throw new ParseException("Support for specifying multiple \"nth\" days is not implemented.", -1);
            }
            // Inner split: each sub-expression may hold several comma-separated values.
            StringTokenizer vTok = new StringTokenizer(expr, ",");
            while(vTok.hasMoreTokens()) {
                String v = vTok.nextToken();
                this.storeExpressionVals(0, v, exprOn);
            }
        }
        if (exprOn <= 5) {
            throw new ParseException("Unexpected end of expression.", expression.length());
        } else {
            if (exprOn <= 6) {
                this.storeExpressionVals(0, "*", 6);
            }
            TreeSet<Integer> dow = this.getSet(5);
            TreeSet<Integer> dom = this.getSet(3);
            boolean dayOfMSpec = !dom.contains(NO_SPEC);
            boolean dayOfWSpec = !dow.contains(NO_SPEC);
            if ((!dayOfMSpec || dayOfWSpec) && (!dayOfWSpec || dayOfMSpec)) {
                throw new ParseException("Support for specifying both a day-of-week AND a day-of-month parameter is not implemented.", 0);
            }
        }
    } catch (ParseException var8) {
        throw var8;
    } catch (Exception var9) {
        throw new ParseException("Illegal cron expression format (" + var9.toString() + ")", 0);
    }
}
Usage
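As a first, smaller illustration, the two-level splitting that buildExpression performs (first on spaces and tabs, then on commas) can be sketched in isolation. This is only a rough sketch: the class name CronSplitSketch and the sample expression are hypothetical, and none of Quartz's validation is reproduced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class: shows only the splitting step, not Quartz's parsing rules.
class CronSplitSketch {

    // Splits an expression into fields, then each field into comma-separated values.
    static List<List<String>> split(String expression) {
        List<List<String>> fields = new ArrayList<>();
        StringTokenizer fieldTok = new StringTokenizer(expression, " \t", false);
        while (fieldTok.hasMoreTokens()) {
            List<String> values = new ArrayList<>();
            StringTokenizer valueTok = new StringTokenizer(fieldTok.nextToken(), ",");
            while (valueTok.hasMoreTokens()) {
                values.add(valueTok.nextToken());
            }
            fields.add(values);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("0 0/5 14,18 * * ?"));
        // prints [[0], [0/5], [14, 18], [*], [*], [?]]
    }
}
```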
import com.google.common.collect.Lists;
import java.util.List;
import java.util.StringTokenizer;
/**
* @author xiaoxu
* @date 2023-10-18
* spring_boot:com.xiaoxu.boot.tokenizer.TestStringTokenizer
*/
public class TestStringTokenizer {
    public static void main(String[] args) {
        print("你 好 嗎\t我是 \t你的\t 朋友 \t", " \t", false);
    }

    public static void print(String str, String delimiter, boolean isReturnDelims) {
        System.out.println("String to split: [" + str + "]; delimiters: [" + delimiter + "].");
        List<String> strs = Lists.newArrayList();
        // Pass the flag through instead of hard-coding false
        StringTokenizer strToken = new StringTokenizer(str, delimiter, isReturnDelims);
        while (strToken.hasMoreTokens()) {
            String s = strToken.nextToken();
            System.out.println("Token: [" + s + "]");
            // Every token is printed, but "嗎" is deliberately left out of the list
            if (!s.equals("嗎")) {
                strs.add(s);
            }
        }
        System.out.println("Token list: " + strs);
    }
}
Output:
String to split: [你 好 嗎 我是 你的 朋友 ]; delimiters: [ ].
Token: [你]
Token: [好]
Token: [嗎]
Token: [我是]
Token: [你的]
Token: [朋友]
Token list: [你, 好, 我是, 你的, 朋友]
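The third constructor argument, returnDelims, is false throughout the demo above. As a small sketch of what changes when it is true, the example below tokenizes the same input both ways. The class name ReturnDelimsSketch and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class: compares returnDelims = false and returnDelims = true.
class ReturnDelimsSketch {

    static List<String> tokens(String str, String delims, boolean returnDelims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(str, delims, returnDelims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Delimiters are dropped.
        System.out.println(String.join(" | ", tokens("a,b;c", ",;", false)));
        // prints a | b | c

        // Each delimiter character comes back as its own one-character token.
        System.out.println(String.join(" | ", tokens("a,b;c", ",;", true)));
        // prints a | , | b | ; | c
    }
}
```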
Source Code Analysis
public StringTokenizer(String str, String delim, boolean returnDelims) {
    currentPosition = 0;
    newPosition = -1;
    delimsChanged = false;
    this.str = str;
    maxPosition = str.length();
    delimiters = delim;
    retDelims = returnDelims;
    setMaxDelimCodePoint();
}

private void setMaxDelimCodePoint() {
    if (delimiters == null) {
        maxDelimCodePoint = 0;
        return;
    }
    int m = 0;
    int c;
    int count = 0;
    for (int i = 0; i < delimiters.length(); i += Character.charCount(c)) {
        c = delimiters.charAt(i);
        if (c >= Character.MIN_HIGH_SURROGATE && c <= Character.MAX_LOW_SURROGATE) {
            c = delimiters.codePointAt(i);
            hasSurrogates = true;
        }
        if (m < c)
            m = c;
        count++;
    }
    maxDelimCodePoint = m;
    if (hasSurrogates) {
        delimiterCodePoints = new int[count];
        for (int i = 0, j = 0; i < count; i++, j += Character.charCount(c)) {
            c = delimiters.codePointAt(j);
            delimiterCodePoints[i] = c;
        }
    }
}
The constructor calls setMaxDelimCodePoint(). As the source shows, the int field maxDelimCodePoint exists to speed up delimiter detection: it stores the largest code point among the characters of the delimiter string (a Unicode code point, not just an ASCII value). In int scanToken(int startPos), the check is (c <= maxDelimCodePoint) && (delimiters.indexOf(c) >= 0). If the character's value is less than or equal to maxDelimCodePoint, the character might be a delimiter, so delimiters.indexOf(c) is evaluated to confirm it. If the value is greater than maxDelimCodePoint, the character cannot be in the delimiter string, so && short-circuits and the indexOf lookup is skipped entirely.
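The short-circuit check described above can be tried out in isolation. The following is a simplified sketch that ignores surrogate pairs (it works on char values only), unlike the JDK code. The class and method names are hypothetical.

```java
// Hypothetical class: a char-only imitation of the maxDelimCodePoint optimization.
class MaxDelimSketch {

    // Largest char value found in the delimiter string (0 if it is empty).
    static int maxDelim(String delimiters) {
        int max = 0;
        for (int i = 0; i < delimiters.length(); i++) {
            max = Math.max(max, delimiters.charAt(i));
        }
        return max;
    }

    static boolean isDelimiter(char c, String delimiters, int maxDelim) {
        // The cheap comparison runs first; indexOf is only reached
        // when c is small enough to possibly be a delimiter.
        return c <= maxDelim && delimiters.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        String delims = " \t";
        int max = maxDelim(delims);
        System.out.println(max);                            // prints 32 (the space character)
        System.out.println(isDelimiter('你', delims, max));  // prints false, indexOf is never called
        System.out.println(isDelimiter('\t', delims, max)); // prints true
        System.out.println(isDelimiter('\n', delims, max)); // prints false, 10 <= 32 but indexOf finds nothing
    }
}
```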
private int scanToken(int startPos) {
    int position = startPos;
    while (position < maxPosition) {
        if (!hasSurrogates) {
            char c = str.charAt(position);
            if ((c <= maxDelimCodePoint) && (delimiters.indexOf(c) >= 0))
                break;
            position++;
        } else {
            int c = str.codePointAt(position);
            if ((c <= maxDelimCodePoint) && isDelimiter(c))
                break;
            position += Character.charCount(c);
        }
    }
    if (retDelims && (startPos == position)) {
        if (!hasSurrogates) {
            char c = str.charAt(position);
            if ((c <= maxDelimCodePoint) && (delimiters.indexOf(c) >= 0))
                position++;
        } else {
            int c = str.codePointAt(position);
            if ((c <= maxDelimCodePoint) && isDelimiter(c))
                position += Character.charCount(c);
        }
    }
    return position;
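}
scanToken advances over the characters of a token until it reaches a delimiter. As soon as the character at position is contained in the delimiter string, the loop breaks and position stops increasing. That position becomes the end index of the token. Because substring is half-open (start inclusive, end exclusive), this value is the index of the token's last character plus 1, so the extracted piece is the complete token without the delimiter.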
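skipDelimiters works the other way around: it skips a run of consecutive characters that are all contained in the delimiter string and returns the start index of the next token.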
private int skipDelimiters(int startPos) {
    if (delimiters == null)
        throw new NullPointerException();
    int position = startPos;
    while (!retDelims && position < maxPosition) {
        if (!hasSurrogates) {
            char c = str.charAt(position);
            if ((c > maxDelimCodePoint) || (delimiters.indexOf(c) < 0))
                break;
            position++;
        } else {
            int c = str.codePointAt(position);
            if ((c > maxDelimCodePoint) || !isDelimiter(c)) {
                break;
            }
            position += Character.charCount(c);
        }
    }
    return position;
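}
From the analysis above, any character of the input that appears in the delimiter string triggers a split. In other words, the delimiter string is treated as a set of characters: their order does not matter, and a run of consecutive delimiter characters counts as a single split point.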
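Putting skipDelimiters, scanToken and substring together, the overall loop can be approximated as follows. This is a simplified sketch, assuming returnDelims is false and ignoring surrogate pairs and the maxDelimCodePoint shortcut. It is not the JDK implementation, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: a stripped-down imitation of the skip / scan / substring cycle.
class SimpleTokenizerSketch {

    static List<String> tokenize(String str, String delimiters) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        int max = str.length();
        while (pos < max) {
            // Like skipDelimiters: move past a run of delimiter characters.
            while (pos < max && delimiters.indexOf(str.charAt(pos)) >= 0) {
                pos++;
            }
            if (pos >= max) {
                break;
            }
            int start = pos;
            // Like scanToken: advance until the next delimiter or the end of the string.
            while (pos < max && delimiters.indexOf(str.charAt(pos)) < 0) {
                pos++;
            }
            // substring is half-open, so pos (one past the last token char) is the end index.
            tokens.add(str.substring(start, pos));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // 'x', 'y' and 'z' each act as a delimiter, in any order and any run length.
        System.out.println(tokenize("xx1yx2zzy3", "xyz"));
        // prints [1, 2, 3]
    }
}
```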
A demonstration follows (note: do not call countTokens() in the loop condition while nextToken() is called in the loop body):
// Requires: import java.util.Arrays; import java.util.StringTokenizer;
public static void print2(String str, String delimiter, boolean isReturnDelims) {
    StringTokenizer strTokenizer = new StringTokenizer(str, delimiter);
    System.out.println("Total tokens: " + strTokenizer.countTokens());
    // Capture the total once, before any call to nextToken()
    int count = strTokenizer.countTokens();
    String[] strs = new String[count];
    // Note: do not write the loop as  int i = 0; i < strTokenizer.countTokens(); ...
    // countTokens() counts from currentPosition, and every nextToken() call moves
    // currentPosition forward, so the loop bound would shrink on each iteration.
    // The bound must be the constant total captured above.
    for (int i = 0; i < count; i++) {
        String s = strTokenizer.nextToken();
        strs[i] = s;
    }
    System.out.println(Arrays.toString(strs));
}
The source of countTokens is as follows:
public int countTokens() {
    int count = 0;
    int currpos = currentPosition;
    while (currpos < maxPosition) {
        currpos = skipDelimiters(currpos);
        if (currpos >= maxPosition)
            break;
        currpos = scanToken(currpos);
        count++;
    }
    return count;
}
Calling it:
print2("1a2b3c4ca5bc6ba7abc8acbbaba9", "abc", false);
The result is:
Total tokens: 9
[1, 2, 3, 4, 5, 6, 7, 8, 9]
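To make the countTokens() pitfall concrete, the sketch below runs the loop form that the comment warns against next to a hasMoreTokens() loop. The class name, method names and sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class: contrasts a shrinking loop bound with a correct loop.
class CountTokensPitfall {

    // Buggy: countTokens() returns the number of REMAINING tokens,
    // so the bound shrinks while i grows and the loop stops early.
    static List<String> buggy(String str, String delims) {
        StringTokenizer st = new StringTokenizer(str, delims);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < st.countTokens(); i++) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Correct: ask the tokenizer itself whether anything is left.
    static List<String> correct(String str, String delims) {
        StringTokenizer st = new StringTokenizer(str, delims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(buggy("a b c d", " "));   // prints [a, b]
        System.out.println(correct("a b c d", " ")); // prints [a, b, c, d]
    }
}
```

In the buggy version the bound goes 4, 3, 2 while i goes 0, 1, 2, so the loop ends after only two tokens.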