Strings in Java

Introduction:

  • Strings are sequences of characters.
  • Strings are immutable (once a String is created, its value cannot be changed).
  • The String class is final in Java, so it cannot be subclassed.
  • A String literal always refers to a String in the String Constant Pool.
  • String objects are Strings created with the new operator; they do not refer to a String in the String Constant Pool.
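To make the immutability point concrete, here is a minimal sketch. The class and method names are hypothetical and chosen only for illustration; concat is a standard String method that returns a new String rather than modifying the one it is called on.

```java
class ImmutabilityDemo {
    // Returns the original String after concat has been called on it.
    static String originalAfterConcat() {
        String original = "abc";
        String longer = original.concat("def"); // builds a new String, "abcdef"
        return original;                         // the original is unchanged
    }

    // Returns the new String produced by concat.
    static String concatResult() {
        return "abc".concat("def");
    }

    public static void main(String[] args) {
        System.out.println(originalAfterConcat()); // abc
        System.out.println(concatResult());        // abcdef
    }
}
```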

String Pool:

Java uses a String Pool, also known as the String Constant Pool, to manage the memory used by Strings.

The need for a String Pool:

As an application grows, String literals commonly come to occupy a large share of the program's memory, and many of those literals are duplicates of one another.

The concept of String Pool:

  • To make Java more memory-efficient, the JVM sets aside a special area of memory called the String Constant Pool.
  • When the JVM encounters a String literal, it checks the String Constant Pool to see whether an identical String already exists.
  • If a match is found: the new literal refers to the existing String, and no new String is created.
  • If no match is found: a new String is created in the pool, and the literal refers to it.

How does String work? :

  • String Literals:

When we write the code below in Java:

Snippet-1:

The following happens in the background when we create a String literal as shown in Snippet-1.

The JVM checks whether a String with the value “abc” already exists in the pool. Since no match is found, it creates the String in the String Constant Pool and makes foo refer to it, as shown with the solid line in Figure-1.

Figure-1

Now suppose we create another String literal with the same value as the String literal foo, as shown in Snippet-2.

Snippet-2:

This time, when the JVM checks the String Constant Pool for a String with the value “abc”, it finds a match. To avoid redundancy, the JVM does not create a new String; it returns the reference to the existing one, as shown with the dashed line in Figure-2.

Figure-2
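The literal behaviour described above can be sketched as follows. This is only an illustration, not the original snippets: it assumes a first literal named foo with the value "abc", and the second variable name, bar, is hypothetical. The == operator compares references, so it shows whether both variables point to the same pooled String.

```java
class LiteralPoolDemo {
    // Two literals with the same value resolve to the same pooled String.
    static boolean literalsShareReference() {
        String foo = "abc";
        String bar = "abc"; // hypothetical name for the second literal
        return foo == bar;  // reference comparison, not content comparison
    }

    public static void main(String[] args) {
        System.out.println(literalsShareReference()); // true
    }
}
```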

  • String Objects:

When we write the code below in Java:

Snippet-3:

When we create a String object as shown in Snippet-3, the object is created outside the String Constant Pool, in regular heap memory, even though a String with the same value already exists in the pool. This is shown with the red line in Figure-3.

Figure-3
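A minimal sketch of this behaviour, assuming the literal foo from earlier and a hypothetical variable name fooObject for the String created with new. The two references differ even though the contents are equal, which is why equals, not ==, should be used to compare String values.

```java
class StringObjectDemo {
    // A String created with new is a separate object from the pooled literal.
    static boolean sameReference() {
        String foo = "abc";
        String fooObject = new String("abc"); // hypothetical variable name
        return foo == fooObject;
    }

    // The contents are still equal.
    static boolean sameContent() {
        String foo = "abc";
        String fooObject = new String("abc");
        return foo.equals(fooObject);
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // false
        System.out.println(sameContent());   // true
    }
}
```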

String intern method :

  • String objects created with the new operator can be made to refer to Strings in the String Constant Pool with the help of the intern() method of the String class.
  • The intern() method returns an interned String: if the pool already contains a String with the same value, the reference to that String is returned; otherwise the String is added to the pool and its reference is returned.
  • When we write the code below in Java:

Snippet-4 :

For Snippet-4, the String is created in two steps:

  1. A String object with the value “abc” is created outside the String Constant Pool, and fooBarOne points to it.
  2. When the intern() method is called on the String, it checks whether the String Constant Pool already holds a String with the same value.
     • If a match is found: the reference to the existing pooled String is returned. Once that reference is assigned to fooBarOne, fooBarOne refers to the pooled String instead of the String object created in Step-1, which becomes eligible for garbage collection if nothing else refers to it.
     • If no match is found: the String is added to the String Constant Pool and the reference to the pooled String is returned, so fooBarOne again ends up referring to a String in the pool. In this case the JVM may pool the Step-1 object itself, so there may be no separate object left to collect.

Figure-4 shows Step-1 and Step-2, enclosed in the red box.

Figure-4
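The intern() steps above can be sketched as follows. This is an assumption about what such code might look like, not the original snippet: the variable fooBarOne comes from the text, while the literal foo and the class and method names are used only for comparison. Note that intern() returns a reference; the variable only points at the pooled String once that return value is assigned.

```java
class InternDemo {
    static boolean internedMatchesLiteral() {
        String foo = "abc";                      // pooled literal
        String fooBarOne = new String("abc");    // Step 1: object outside the pool
        boolean sameBefore = (fooBarOne == foo); // false: different objects

        fooBarOne = fooBarOne.intern();          // Step 2: assign the pooled reference
        boolean sameAfter = (fooBarOne == foo);  // true: both refer to the pooled String

        return !sameBefore && sameAfter;
    }

    public static void main(String[] args) {
        System.out.println(internedMatchesLiteral()); // true
    }
}
```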

String hashCode method :

In an attempt to provide a fast implementation, early versions of the Java String class provided a hashCode() implementation that considered at most 16 characters picked from the string. For some common data this worked very poorly, delivering unacceptably clustered results and consequently slow hash table performance.

From Java 1.2 onward, the java.lang.String class implements hashCode() using a product sum algorithm over the entire text of the string. An instance s of the java.lang.String class, for example, has a hash code h(s) defined by

$$h(s) = \sum_{i=0}^{n-1} s[i] \cdot 31^{\,n-1-i}$$

where terms are summed using Java 32-bit int addition, s[i] denotes the i-th character of the string, and n is the length of s.
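As a worked instance of the product sum, take the string “abc” used earlier, with length 3 and character codes ‘a’ = 97, ‘b’ = 98 and ‘c’ = 99. Each character is multiplied by a power of 31 that decreases from left to right:

$$h(\text{“abc”}) = 97 \cdot 31^{2} + 98 \cdot 31^{1} + 99 \cdot 31^{0} = 93217 + 3038 + 99 = 96354$$

The result is small enough that no 32-bit overflow occurs here; for longer strings the sum simply wraps around as ordinary int arithmetic does.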

As with any general hashing function, collisions are possible. For example, the strings “FB” and “Ea” have the same hash value. The hashCode() implementation of String uses the prime number 31, and the difference between ‘a’ and ‘B’ is exactly 31, so the calculation is 70 × 31 + 66 = 69 × 31 + 97 = 2236.
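The collision can be checked directly with the standard hashCode() method. This is a small sketch; the class and method names are hypothetical.

```java
class HashCollisionDemo {
    // "FB" and "Ea" are different strings with the same hash code.
    static boolean collide() {
        return "FB".hashCode() == "Ea".hashCode() && !"FB".equals("Ea");
    }

    public static void main(String[] args) {
        System.out.println("FB".hashCode()); // 2236
        System.out.println("Ea".hashCode()); // 2236
        System.out.println(collide());       // true
    }
}
```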

  • Why is 31 used as the multiplier, and not 29, 37, or even 97?

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, because multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.
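The shift-and-subtract identity can be verified with a short sketch (class and method names are hypothetical). Because Java int arithmetic wraps around on overflow, the identity holds for every int value, including very large ones.

```java
class ShiftMultiplyDemo {
    // i << 5 is i * 32, so subtracting i once leaves i * 31.
    static boolean shiftEqualsMultiply(int i) {
        return 31 * i == (i << 5) - i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -7, 97, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            System.out.println(i + " -> " + shiftEqualsMultiply(i));
        }
    }
}
```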

  • Sample implementations of the java.lang.String algorithm
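One way to sketch the algorithm is shown below. This is an illustrative re-implementation, not the JDK source: the class and method names are hypothetical, and it evaluates the product sum with Horner's rule, multiplying the running total by 31 before adding each character.

```java
class StringHashSketch {
    // Computes the product sum over all characters of s using 32-bit int arithmetic.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // overflow wraps around, as in int addition
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hash("abc"));                      // 96354
        System.out.println(hash("abc") == "abc".hashCode()); // true
    }
}
```

Horner's rule avoids computing powers of 31 explicitly, so the loop needs only one multiplication and one addition per character.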
