SCJP Study Guide:
API Contents


XyzWs Study Guide: SCJP: String

String

The Java platform provides three classes, String, StringBuffer and StringBuilder, which store and manipulate strings: character data made up of a sequence of zero or more characters. Use the following guidelines for deciding which class to use:

  • If your text is not going to change, use a String: a String object is immutable.
  • If your text will change, and will only be accessed from a single thread, use a string builder.
  • If your text will change, and will be accessed from multiple threads, use a string buffer.
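
The three guidelines above can be sketched roughly as follows. This is only an illustration: the class name TextClassChoiceDemo and its methods are hypothetical, and the thread and append counts are arbitrary.

```java
class TextClassChoiceDemo {
    // Text that never changes: a plain String is enough.
    static final String GREETING = "Hello";

    // Text built by one thread: StringBuilder, which is not synchronized.
    static String countTo(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            if (i > 1) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    // Text appended to by several threads: StringBuffer, whose methods are synchronized.
    static int appendFromThreads(int threads, final int appendsPerThread) {
        final StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < appendsPerThread; i++) {
                        shared.append('x');
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(GREETING);                   // Hello
        System.out.println(countTo(5));                 // 1,2,3,4,5
        System.out.println(appendFromThreads(4, 1000)); // 4000
    }
}
```

Because every append on the shared StringBuffer is synchronized, no append is lost and the final length equals threads × appendsPerThread.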

String

The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class. It is a public, final class and implements Serializable, Comparable<String> and CharSequence.

Strings are constant; their values cannot be changed after they are created. Each time you manipulate an existing string, a new String object is created that contains the modifications. Because String objects are immutable, they can be shared.

String Creation

There are a number of ways to create a new String object:

String() Initializes a newly created String object so that it represents an empty character sequence. Note that use of this constructor is unnecessary since Strings are immutable.
String(byte[] bytes) Constructs a new String by decoding the specified array of bytes using the platform's default charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the byte array.

The behavior of this constructor when the given bytes are not valid in the default charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required.

String(byte[] bytes, int offset, int length) Constructs a new String by decoding the specified subarray of bytes using the platform's default charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the subarray.

The behavior of this constructor when the given bytes are not valid in the default charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required.

If the offset and length arguments index characters outside the bounds of the bytes array, it throws an IndexOutOfBoundsException.

String(byte[] bytes, int offset, int length, String charsetName) Constructs a new String by decoding the specified subarray of bytes using the specified charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the subarray.

The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required.

It throws an UnsupportedEncodingException if the named charset is not supported. If the offset and length arguments index characters outside the bounds of the bytes array, it throws an IndexOutOfBoundsException.

String(byte[] bytes, String charsetName) Constructs a new String by decoding the specified array of bytes using the specified charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the byte array.

The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required. 

It throws an UnsupportedEncodingException exception if the named charset is not supported.

String(char[] value) Allocates a new String so that it represents the sequence of characters currently contained in the character array argument. The contents of the character array are copied; subsequent modification of the character array does not affect the newly created string.
String(char[] value, int offset, int count) Allocates a new String that contains characters from a subarray of the character array argument. The offset argument is the index of the first character of the subarray and the count argument specifies the length of the subarray. The contents of the subarray are copied; subsequent modification of the character array does not affect the newly created string. If the offset and count arguments index characters outside the bounds of the value array, it throws an IndexOutOfBoundsException exception.
String(int[] codePoints, int offset, int count) Allocates a new String that contains characters from a subarray of the Unicode code point array argument. The offset argument is the index of the first code point of the subarray and the count argument specifies the length of the subarray. The contents of the subarray are converted to chars; subsequent modification of the int array does not affect the newly created string. If any invalid Unicode code point is found in codePoints, it throws an IllegalArgumentException. If the offset and count arguments index characters outside the bounds of the codePoints array, it throws an IndexOutOfBoundsException.
String(String original) Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.
String(StringBuffer buffer) Allocates a new string that contains the sequence of characters currently contained in the string buffer argument. The contents of the string buffer are copied; subsequent modification of the string buffer does not affect the newly created string.
String(StringBuilder builder) Allocates a new string that contains the sequence of characters currently contained in the string builder argument. The contents of the string builder are copied; subsequent modification of the string builder does not affect the newly created string.

This constructor is provided to ease migration to StringBuilder. Obtaining a string from a string builder via the toString method is likely to run faster and is generally preferred.

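Note: If you pass a bare null to a constructor, as in new String(null), the compiler reports an error because the call is ambiguous: null matches several one-argument constructors (for example, the String and StringBuffer versions).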
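
A few of the constructors listed above can be tried out with a sketch like the one below. The class name StringConstructorDemo and the sample data are illustrative assumptions; the example uses the charset name "UTF-8" rather than the platform default so that the result does not depend on the machine.

```java
import java.io.UnsupportedEncodingException;

class StringConstructorDemo {
    // The char[] constructor copies the array, so later changes do not show through.
    static String fromChars() {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = new String(chars);
        chars[0] = 'J';
        return s;
    }

    // offset 2, count 3 selects 'r', 'i', 'n'.
    static String fromSubarray() {
        char[] chars = {'s', 't', 'r', 'i', 'n', 'g'};
        return new String(chars, 2, 3);
    }

    // Five UTF-8 bytes decode to four characters: the last two bytes encode U+00E9.
    static String fromUtf8Bytes() {
        byte[] bytes = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};
        try {
            return new String(bytes, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Two code points become three chars: U+1F600 needs a surrogate pair.
    static String fromCodePoints() {
        int[] codePoints = {0x4A, 0x1F600};
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        System.out.println(fromChars());               // java
        System.out.println(fromSubarray());            // rin
        System.out.println(fromUtf8Bytes().length());  // 4
        System.out.println(fromCodePoints().length()); // 3
    }
}
```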

A string is often created from a string literal: a series of characters enclosed in double quotes. The shortcut syntax for instantiating new String objects from string literals is:

String s = "abc";

This special shortcut syntax was designed to improve String performance. The JVM sets aside a special area of memory called the "String constant pool", and each JVM keeps only one copy of each string literal. When a string literal is encountered and an equal string already exists in the String constant pool, the reference for the new literal is directed to the existing String instead of creating another object.

String s = "abc";
String s1 = "abc";

The String reference variables s and s1 are pointing to the same String Object in the String constant pool.
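
The sharing of literals can be observed by comparing references with ==. The following is a small sketch; the class name StringPoolDemo is hypothetical, and the intern() method, which is not covered in this section, is used only to show how a copy can be mapped back to the pooled object.

```java
class StringPoolDemo {
    // Two identical literals refer to one pooled object.
    static boolean literalsShareOneObject() {
        String s = "abc";
        String s1 = "abc";
        return s == s1;
    }

    // new String(...) always creates a separate object with the same contents.
    static boolean explicitCopyIsDistinct() {
        String s = "abc";
        String copy = new String("abc");
        return s != copy && s.equals(copy);
    }

    // intern() returns the pooled string that equals the receiver.
    static boolean internReturnsPooledObject() {
        String copy = new String("abc");
        return copy.intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());    // true
        System.out.println(explicitCopyIsDistinct());    // true
        System.out.println(internReturnsPooledObject()); // true
    }
}
```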

String Concatenation

The arithmetic operator "+" and the assignment operator "+=" can be used to join (or concatenate) two Strings. Because String objects are immutable, the operation does not change either original String object; it produces a String object that contains the data from both strings. For example:

String  s = "abc" + "def";

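Because both operands are string literals, "abc" + "def" is a compile-time constant expression: the compiler folds it into the single literal "abcdef", which is stored in the String constant pool and assigned to the String reference variable s. No separate concatenation happens at run time. When at least one operand is not a compile-time constant (for example, a non-final variable), the concatenation is performed at run time and creates a new String object that is not in the pool.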
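
A quick way to check how many objects a concatenation really involves is to compare references. The sketch below (class and method names are illustrative) relies on the language rule that a + of two literals is a compile-time constant, so its result is the pooled literal, while a concatenation involving a non-final variable is evaluated at run time.

```java
class ConcatConstantDemo {
    // Both operands are literals, so the compiler folds the expression.
    static boolean foldedAtCompileTime() {
        String s = "abc" + "def";
        return s == "abcdef";
    }

    // prefix is not final, so the concatenation happens at run time
    // and yields a new object that merely equals the pooled literal.
    static boolean builtAtRunTime() {
        String prefix = "abc";
        String s = prefix + "def";
        return s != "abcdef" && s.equals("abcdef");
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime()); // true
        System.out.println(builtAtRunTime());      // true
    }
}
```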

The '+' and '+=' operators are also overloaded to join Strings with values of all the primitive data types. This is a common way to convert numbers to Strings. For the '+' operator, if either operand is a String, the other is converted to a String and concatenated with it. For the '+=' operator, the left-hand operand must be a String; the right-hand primitive value is converted to a String and then joined to the left-hand String.

String s = "abc";
s = s + 5;  // creates a new String object "abc5"
s = 6 + s;  // creates a new String object "6abc5"
s += 7;     // creates a new String object "6abc57"

If a non-String object is concatenated with a String, its toString() method is called. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java. When you write a class that might have a sensible representation as a string, it is useful for debugging purposes to override Object's default toString() method, which returns only the class name followed by a hexadecimal hash code.
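
As an illustration of string conversion during concatenation, here is a minimal sketch. The Point class and the describe method are hypothetical names invented for this example.

```java
class ToStringDemo {
    // A small value class with a readable string form.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "(" + x + ", " + y + ")";
        }
    }

    // The + operator converts the object operand to a String.
    static String describe(Object o) {
        return "value: " + o;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2))); // value: (1, 2)
        System.out.println(describe(null));            // value: null
    }
}
```

Note that a null reference is converted to the text "null" rather than causing a NullPointerException.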

The public String concat(String s) method concatenates the String argument to the end of this string. For non-null strings this gives the same character sequence as the + and += operators. If the length of the argument string is 0, then the original String object is returned. Otherwise, a new String object is created, representing a character sequence that is the concatenation of the character sequence represented by this String object and the character sequence represented by the argument string.

String s = "Hello, ";
System.out.println(s.concat("Mike")); // output is "Hello, Mike"
System.out.println(s); //output is "Hello, "

Why does the second statement output "Hello, " instead of "Hello, Mike"? The concat() method appears to modify a string, but a String object is immutable: what the method really does is create and return a second string that contains the result. The newly created string needs a reference variable to hold its reference. In our example, no reference variable is assigned the "Hello, Mike" String, so it cannot be accessed after the println call, and s still refers to "Hello, ".

String Comparison

Strings cannot be compared with the usual <, <=, >, or >= operators. The == and != operators do not compare the values of the two String objects; they compare the two String references (a String reference stores the location of the String object allocated in memory).

The public boolean equals(Object anObject) method compares whether the values of the two String objects are the same, regardless of where the two String objects are allocated in memory. For example:

String a = new String("abc"); 
String aVar = new String("abc"); 
// Java creates two distinct String objects, each with the string value "abc"
System.out.println("a == aVar:" + (a == aVar)); 
System.out.println("a.equals(aVar):" + a.equals(aVar)); 

String b = new String("abc"); 
String bVar = b; 
// One "abc" String object is created with new; b and bVar refer to the same object
System.out.println("b == bVar:" + (b == bVar)); 
System.out.println("b.equals(bVar):" + b.equals(bVar)); 

String c = "abc";
String cVar = "abc"; 
//One "abc" String object created in the String constant pool
System.out.println("c == cVar:" + (c == cVar));
System.out.println("c.equals(cVar):" + c.equals(cVar));

This will produce:

a == aVar:false
a.equals(aVar):true
b == bVar:true
b.equals(bVar):true
c == cVar:true
c.equals(cVar):true

public boolean equalsIgnoreCase(String anotherString) compares this String to another String, ignoring case considerations. Two strings are considered equal ignoring case if they are of the same length, and corresponding characters in the two strings are equal ignoring case.

public boolean contentEquals(CharSequence cs) Returns true if and only if this String represents the same sequence of char values as the specified sequence.
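
The difference between equals, equalsIgnoreCase and contentEquals can be seen in a short sketch such as the following (the class name EqualityVariantsDemo is an assumption made for this example).

```java
class EqualityVariantsDemo {
    // Same letters, different case.
    static boolean sameIgnoringCase() {
        return "Java".equalsIgnoreCase("JAVA") && !"Java".equals("JAVA");
    }

    // contentEquals compares characters with any CharSequence.
    static boolean sameContentAsBuilder() {
        return "abc".contentEquals(new StringBuilder("abc"));
    }

    // equals is false for a non-String argument, even with the same characters.
    static boolean equalsRejectsBuilder() {
        return !"abc".equals(new StringBuilder("abc"));
    }

    public static void main(String[] args) {
        System.out.println(sameIgnoringCase());     // true
        System.out.println(sameContentAsBuilder()); // true
        System.out.println(equalsRejectsBuilder()); // true
    }
}
```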

The public int compareTo(String s) method provides the ordering that the relational operators > and < cannot: it compares two strings lexicographically, based on the Unicode value of each character in the strings. The result is a negative integer if this String object lexicographically precedes the argument string, a positive integer if it lexicographically follows the argument string, and zero if the strings are equal; compareTo returns 0 exactly when the equals(Object) method would return true.

		
String a = new String("abc");
String b = new String("bcd");
String c = new String("abc");
	  
System.out.println("a.compareTo(b):" + (a.compareTo(b)));
System.out.println("b.compareTo(a):" + (b.compareTo(a)));
System.out.println("a.compareTo(c):" + (a.compareTo(c)));
System.out.println("c.compareTo(a):" + (c.compareTo(a)));

This will produce

a.compareTo(b):-1
b.compareTo(a):1
a.compareTo(c):0
c.compareTo(a):0

public int compareToIgnoreCase(String str) compares two strings lexicographically, ignoring case differences, and returns an integer indicating whether this string is greater than (result is > 0), equal to (result is 0), or less than (result is < 0) the argument.
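
Only the sign of the result should be relied on, since the magnitude depends on the characters involved. The sketch below, with the hypothetical class name CompareDemo, shows this and also sorts an array both ways, using the String.CASE_INSENSITIVE_ORDER comparator for the case-insensitive order.

```java
import java.util.Arrays;

class CompareDemo {
    // The result is the difference of the first differing chars: 'a' - 'd'.
    static int rawDifference() {
        return "a".compareTo("d");
    }

    // Upper-case letters sort before lower-case ones in Unicode order.
    static int caseSensitiveSign() {
        return Integer.signum("apple".compareTo("Banana"));
    }

    static int caseInsensitiveSign() {
        return Integer.signum("apple".compareToIgnoreCase("Banana"));
    }

    static String sortedNaturally() {
        String[] words = {"banana", "Cherry", "apple"};
        Arrays.sort(words);
        return Arrays.toString(words);
    }

    static String sortedIgnoringCase() {
        String[] words = {"banana", "Cherry", "apple"};
        Arrays.sort(words, String.CASE_INSENSITIVE_ORDER);
        return Arrays.toString(words);
    }

    public static void main(String[] args) {
        System.out.println(rawDifference());       // -3
        System.out.println(caseSensitiveSign());   // 1
        System.out.println(caseInsensitiveSign()); // -1
        System.out.println(sortedNaturally());     // [Cherry, apple, banana]
        System.out.println(sortedIgnoringCase());  // [apple, banana, Cherry]
    }
}
```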

Method
    Description

boolean endsWith(String)
boolean startsWith(String)
boolean startsWith(String, int)
    Returns true if this string ends with or begins with the substring specified as an argument to the method. The integer argument, when present, indicates the offset within the original string at which to begin looking.

boolean regionMatches(int, String, int, int)
boolean regionMatches(boolean, int, String, int, int)
    Tests whether the specified region of this string matches the specified region of the String argument. The boolean argument indicates whether case should be ignored; if true, the case is ignored when comparing characters.

boolean matches(String)
    Tests whether this string matches the specified regular expression.
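
The methods in the table above might be exercised as in this sketch. The class name PartialMatchDemo and the sample text are illustrative; note that matches succeeds only when the regular expression matches the entire string.

```java
class PartialMatchDemo {
    static final String TEXT = "Hello, World";

    static boolean prefixAndSuffix() {
        return TEXT.startsWith("Hello") && TEXT.endsWith("World");
    }

    // "World" begins at index 7 of TEXT.
    static boolean prefixAtOffset() {
        return TEXT.startsWith("World", 7);
    }

    // Compare 5 chars of TEXT from index 7 with 5 chars of "WORLDWIDE" from index 0.
    static boolean regionCaseSensitive() {
        return TEXT.regionMatches(7, "WORLDWIDE", 0, 5);
    }

    static boolean regionIgnoringCase() {
        return TEXT.regionMatches(true, 7, "WORLDWIDE", 0, 5);
    }

    static boolean wholeStringMatchesRegex() {
        return TEXT.matches("[A-Z][a-z]+, [A-Z][a-z]+");
    }

    // A regex that covers only part of the string does not match.
    static boolean partialRegexMatches() {
        return TEXT.matches("Hello");
    }

    public static void main(String[] args) {
        System.out.println(prefixAndSuffix());         // true
        System.out.println(prefixAtOffset());          // true
        System.out.println(regionCaseSensitive());     // false
        System.out.println(regionIgnoringCase());      // true
        System.out.println(wholeStringMatchesRegex()); // true
        System.out.println(partialRegexMatches());     // false
    }
}
```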

String Methods

Methods used to obtain information about an object are known as accessor methods.

  • Most String methods that take an index, such as charAt and substring, throw an IndexOutOfBoundsException if the index argument is negative or beyond the end of this string.

Length of the String

The public int length() accessor method returns the length of this string. The length is equal to the number of 16-bit Unicode code units (char values) in the string.

String s = "abcdef";
System.out.println("string s length is " + s.length());

The output is "string s length is 6".
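
Because length() counts 16-bit char values, a character outside the Basic Multilingual Plane counts as two. The following sketch illustrates this with the code point U+1F600, written as a surrogate pair; the class name LengthDemo is hypothetical, and codePointCount is used only for comparison.

```java
class LengthDemo {
    // 'a' followed by U+1F600, which is stored as the surrogate pair D83D DE00.
    static final String TEXT = "a\uD83D\uDE00";

    // Number of 16-bit char values.
    static int charCount() {
        return TEXT.length();
    }

    // Number of Unicode code points.
    static int codePointCount() {
        return TEXT.codePointCount(0, TEXT.length());
    }

    public static void main(String[] args) {
        System.out.println(charCount());      // 3
        System.out.println(codePointCount()); // 2
    }
}
```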

Get a character

public char charAt(int index) accessor method returns the char value at the specified index. An index ranges from 0 to length() - 1. The first char value of the sequence is at index 0, the next at index 1, and so on, as for array indexing.

String s = "abcdef";
System.out.println("char at index=1 in string s is " + s.charAt(1));

The output is "char at index=1 in string s is b". Indices begin at 0, so the character at index 1 is 'b'.

Get a substring

The public String substring(int beginIndex) and public String substring(int beginIndex, int endIndex) accessor methods return a new string that is a substring of this string. The first integer argument specifies the index of the first character. The second integer argument is the index just past the last character, so the substring ends at index endIndex - 1. The length of the substring is therefore the second int minus the first int. If the second integer is not present, the substring extends to the end of the original string.

An IndexOutOfBoundsException will be thrown:

  • if beginIndex is negative or larger than the length of this String object, for substring(int beginIndex).
  • if beginIndex is negative, endIndex is larger than the length of this String object, or beginIndex is larger than endIndex, for substring(int beginIndex, int endIndex).

For example:

"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
"emptiness".substring(10) throws an IndexOutOfBoundsException exception
"hamburger".substring(4, 8) returns "urge" 
"smiles".substring(1, 5) returns "mile"
"smiles".substring(1, 7) throws an IndexOutOfBoundsException exception
"smiles".substring(5, 1) throws an IndexOutOfBoundsException exception

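If an accessor method does not produce a different resulting string, the original String object reference may be returned by the method. For substring, the standard JDK implementation behaves this way, although the API documentation does not promise it.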

if("Hello".substring(0,5) == "Hello") 
    System.out.println("Equal"); 
else 
    System.out.println("Not Equal");

The output is "Equal", because "Hello".substring(0,5) does not produce a new string; it just returns the original "Hello" String object reference. "Hello" is a string literal and is stored in the String constant pool only once. The following code, however, will output "Not Equal", because substring(1,6) creates a new String object that is not the pooled "Hello" literal.

if(" Hello ".substring(1,6) == "Hello") 
    System.out.println("Equal"); 
else 
    System.out.println("Not Equal");

Find a character or a substring

The String class provides various string search methods:

Method
    Description

int indexOf(int ch)
int lastIndexOf(int ch)
    Returns the index of the first (last) occurrence of the specified character.

int indexOf(int ch, int fromIndex)
int lastIndexOf(int ch, int fromIndex)
    Returns the index of the first (last) occurrence of the specified character, searching forward (backward) from the specified index.

int indexOf(String str)
int lastIndexOf(String str)
    Returns the index of the first (last) occurrence of the specified string.

int indexOf(String str, int fromIndex)
int lastIndexOf(String str, int fromIndex)
    Returns the index of the first (last) occurrence of the specified string, searching forward (backward) from the specified index.

boolean contains(CharSequence s)
    Returns true if the string contains the specified character sequence.

The String class provides two accessor methods that return the position within the string of a specific character or substring: indexOf and lastIndexOf. The indexOf method searches forward from the beginning of the string, and lastIndexOf searches backward from the end of the string. All of these methods return -1 if the character or string is not found. All indices begin at 0.

The String class also provides a search method, contains, that returns true if the string contains a particular character sequence. Use this method when you only need to know that the string contains a character sequence, but the precise location isn't important.
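
The search methods are often combined with substring. The sketch below shows two such uses; the helper names extension and countOccurrences, and the class name SearchDemo, are hypothetical and chosen only for this illustration.

```java
class SearchDemo {
    // Text after the last '.', or "" when there is no dot.
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // Counts non-overlapping occurrences by restarting the search after each hit.
    static int countOccurrences(String text, String word) {
        if (word.length() == 0) {
            return 0; // avoid looping forever on an empty search string
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(word, from)) != -1) {
            count++;
            from += word.length();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("banana".indexOf('a'));                // 1
        System.out.println("banana".lastIndexOf('a'));            // 5
        System.out.println("banana".indexOf('a', 2));             // 3
        System.out.println("banana".lastIndexOf('a', 4));         // 3
        System.out.println("banana".indexOf("xyz"));              // -1
        System.out.println("banana".contains("nan"));             // true
        System.out.println(extension("archive.tar.gz"));          // gz
        System.out.println(countOccurrences("abcabcab", "abc"));  // 2
    }
}
```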

Replace Methods

The public String replace(char oldChar, char newChar) method returns a new string resulting from replacing all occurrences of oldChar in this string with newChar. If the character oldChar does not occur in the character sequence represented by this String object, then a reference to the original String object is returned. Otherwise, a new String object is created that represents a character sequence identical to the character sequence represented by this String object, except that every occurrence of oldChar is replaced by an occurrence of newChar. Again, you have to assign the returned string to a reference variable if you want to use it later.

"mesquite in your cellar".replace('e', 'o'); //returns a new string "mosquito in your collar"
"the war of baronets".replace('r', 'y'); // returns a new string "the way of bayonets"
"sparring with a purple porpoise".replace('p', 't');//returns a new string "starring with a turtle tortoise"
"JonL".replace('q', 'x'); //returns original string "JonL" (no change)

String s = "mesquite in your cellar";
System.out.println(s.replace('e', 'o')); //output is "mosquito in your collar"
System.out.println(s); //output is "mesquite in your cellar"

if ("JONL".replace('q', 'x') == "JONL") 
    System.out.println("Equal.");       //Output is "Equal."
else
    System.out.println("Not Equal.");

The public String replaceAll(String regex, String replacement) method replaces each substring of this string that matches the given regular expression with the given replacement. An invocation of this method of the form str.replaceAll(regex, repl) yields exactly the same result as the expression Pattern.compile(regex).matcher(str).replaceAll(repl).

String str = "line one\nline two\n";
System.out.println(str.replaceAll("\n", ""));   // removes every \n
System.out.println(str.replaceAll("\n", "\r")); // replaces every \n with \r
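
Since the first argument of replaceAll is a regular expression, characters such as '.' have a special meaning. The sketch below contrasts replaceAll with the literal replace(CharSequence, CharSequence) overload and with the explicit java.util.regex.Pattern form; the class name ReplaceAllDemo and the sample strings are assumptions made for the example.

```java
import java.util.regex.Pattern;

class ReplaceAllDemo {
    // "\\s+" matches one or more whitespace characters.
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ");
    }

    // The same operation written with Pattern and Matcher.
    static String collapseWithPattern(String text) {
        return Pattern.compile("\\s+").matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        String spaced = "a   b \t c";
        System.out.println(collapseWhitespace(spaced));        // a b c
        System.out.println(collapseWithPattern(spaced));       // a b c

        // An unescaped '.' matches any character.
        System.out.println("a.b.c".replaceAll(".", "-"));      // -----
        // Escaping the dot, or using the literal replace, targets only the dots.
        System.out.println("a.b.c".replaceAll("\\.", "-"));    // a-b-c
        System.out.println("a.b.c".replace(".", "-"));         // a-b-c
    }
}
```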

Manipulating Strings

A String object is immutable. Any method that appears to modify a string really creates and returns a second string that contains the result.

public String toLowerCase() converts all of the characters in this String to lower case using the rules of the default locale. This is equivalent to calling toLowerCase(Locale.getDefault()). If no conversions are necessary, this method returns the original string. Otherwise a new string is created.

String s = "JAVA";
System.out.println(s.toLowerCase()); //output is "java"
System.out.println(s); //output is "JAVA"
String s1 = "java";
System.out.println("Equal is " + (s1.toLowerCase() == s1)); //output "Equal is true"

public String toUpperCase() converts all of the characters in this String to upper case using the rules of the default locale. This method is equivalent to toUpperCase(Locale.getDefault()). If no conversions are necessary, this method returns the original string. Otherwise a new string is created.

String s = "java";
System.out.println(s.toUpperCase()); //output is "JAVA"
System.out.println(s); //output is "java"
String s1 = "JAVA";
System.out.println("Equal is " + (s1.toUpperCase() == s1)); //output "Equal is true"

public String trim() returns a copy of the string used to invoke the method, with leading and trailing whitespace omitted. If the string has no leading or trailing whitespace, the original string is returned.

String s = "         Hello          ";
System.out.println(s.trim()); //output is "Hello"
System.out.println(s); //output is "         Hello          "

Method
    Description

String[] split(String, int)
String[] split(String)
    Searches for a match as specified by the string argument (which contains a regular expression) and splits this string into an array of strings accordingly. The optional integer argument specifies the maximum size of the returned array.

CharSequence subSequence(int, int)
    Returns a new character sequence constructed from the beginning index (inclusive) up until the ending index (exclusive).
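
The effect of the limit argument of split is easiest to see side by side. The following sketch uses the hypothetical class name SplitDemo; note that the one-argument form drops trailing empty strings, while a negative limit keeps them.

```java
import java.util.Arrays;

class SplitDemo {
    static String show(String[] parts) {
        return parts.length + " " + Arrays.toString(parts);
    }

    public static void main(String[] args) {
        // No limit: split at every comma.
        System.out.println(show("a,b,c,d".split(",")));      // 4 [a, b, c, d]
        // Limit 2: at most two elements, the rest stays in the last one.
        System.out.println(show("a,b,c,d".split(",", 2)));   // 2 [a, b,c,d]
        // Trailing empty strings are removed by default...
        System.out.println(show("a,b,,".split(",")));        // 2 [a, b]
        // ...and kept when the limit is negative.
        System.out.println(show("a,b,,".split(",", -1)));    // 4 [a, b, , ]
        // subSequence uses the same index rules as substring.
        System.out.println("Hello, World".subSequence(0, 5)); // Hello
    }
}
```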
