Thread deadlock in Java

Deadlock occurs when a group of processes blocks forever because each process is waiting for resources which are held by another process in the group. What happens is that a task is stuck waiting for another task to release a resource which itself is stuck waiting for another task to release a resource and so on such that a circular wait loop ensures. The result is that no task can proceed. Thus a deadlock results.

The classis case of deadlock is the Dining Philosopher problem.
In this problem we have, say 5 philosophers sitting down for dinner at a round table.
To the left and right of each philosopher is a chopstick and there are 5 of these chopsticks F1 – F5, illustrated below:

Dining philosopher problem

Reference: https://samplecodes.files.wordpress.com/2008/11/23.jpg

In order for each philospher to eat, they must pick up the left and right chopsticks.
Each philosopher decides to pick up the chopstick on his right 1st before picking up the one on his left.
After picking up the chopstick on the right, each philosopher attempts to pick up the chopstick on his left and if it is not yet available, has to wait.

Thus we can have the following scenario:

P1 picks up F1, waits to pick up F2
P2 picks up F2, waits to pick up F3
P3 picks up F3, waits to pick up F4
P4 picks up F4, waits to pick up F1
P5 picks up F5, waits to pick up F1

Thus we have a circular wait scenario, where each philosopher is waiting on the next philosopher to his left to drop his right chopstick and so on such that no philosopher can eat.

Here is Java code for a simpler example of deadlock involving 2 tasks:

public class DeadlockDemo {   public Integer obj1 = 1;   public Integer obj2 = 2;    private class Runnable1 implements Runnable   {    public void run()    {     synchronized(obj1)     {      System.out.println("R1: Obtained lock on obj1:" + obj1);      System.out.println("R1: Waiting to obtain lock on obj2..."):      synchronized(obj2)      {       System.out.println("R1: Obtained lock on obj2:" + obj2);      }     }    }   }    private class Runnable2 implements Runnable   {    public void run()    {     synchronized(obj2)     {      System.out.println("R2: Obtained lock on obj2:" + obj2);      System.out.println("R2: Waiting to obtain lock on obj1..."):      synchronized(obj1)      {       System.out.println("R2: Obtained lock on obj1:" + obj1);      }     }    }   }    public static void main(String[] args)   {    DeadlockDemo dDemo=new DeadlockDemo();    Runnable r1=dDemo.new Runnable1();    Runnable r2=new Runnable2();    new Thread(r1).start();    new Thread(r2).start();   }  }

 

I ran the above code and it produced the following result:

R2: Obtained lock on obj2:2 R2: Waiting to obtain lock on obj1... R1: Obtained lock on obj1:1 R1: Waiting to obtain lock on obj2...

and the program hung with the 2 threads stuck in a circular wait.

Difference between equals() and == operator

The difference between the equals() and == operator is as follows:

The == operator checks for equality of object references in the case of objects and equality of value in the case of primitives.

However equals() checks for “meaningful” equality in the case of objects of the same class where “meaningful equality” has been defined by overriding the equals() operator of the base

Object class in Java.

To illustrate what I mean, let us take the example of a simple Integer wrapper class and consider the following piece of code:

public class IllustrateEquality
{
public static void main(String[] args)

{

Integer i1=new Integer(2);
Integer i2=new Integer(2);

System.out.println(“i1==i2 produces: ” + (i1==i2));
System.out.println(“i1.equals(i2) produces: ” + (i1.equals(i2)));

}

This results in the following output:

i1==i2 produces: false
i1.equals(i2) produces: true

Thus we see that i1 and i2 are 2 different Integer objects created in memory, hence i1==i2 produces false, but in the definition of the Integer wrapper
they are meaningfully equivalent (after all they both have value 2) hence i1.equals(i2) produces true.

Creating and executing a thread in Java

There are 3 ways to do this:

1. Subclass Thread class

i. Subclass the Thread class and override the run() method
ii. Instantiate the Thread subclass.
iii. Call Thread.start() method.

public class MyThread extends Thread {    public void run() {        System.out.println("Thread via Extending Thread!");     }    public static void main(String []args) {        (new MyThread()).start();     }  }

 

2. Provide a Runnable object by implementing Runnable

Runnable interface defines a single method run(), meant to contain the code executed in the thread.

i. Implement the Runnable interface by implementing run() method.
ii. Pass instance of Runnable object to the Thread(..) constructor.
iii. Call Thread.start() method.

public class MyRunnable implements Runnable {     @Override    public void run() {       System.out.println("Thread via Implementing Runnable!");    }     public static void main(String[] args) {       (new Thread(new MyRunnable())).start();     }  }

Notice that both cases above invoke Thread.start() in order to start the new thread. In either case above, the result is a Thread object, where the run() method is the body of the thread. When the start() method of the Thread object is called, the interpreter creates a new thread to execute the run() method. The new thread continues to run until the run() method exits. Meanwhile the original thread continues running itself, starting with the statement following the start() method.

 

3. Using Executor interface

The Executor interface can be used to invoke threads as well. It isn’t really a new idiom, since either a Thread object or object that implements Runnable needs to be created first. The Executor interface provides a layer of indirection between a client and the execution of a task; instead of a client executing a task directly, an intermediate object executes the task. Executors allow you to manage the execution of asynchronous tasks without having to explicitly manage the lifecycle of threads.

 

  1. Implement and create a new instance of Runnable.
  2. Create a concrete instance of ExecutorService by calling one of the Executors factory methods.
  3. Call Executor.execute(..) method, passing the Runnable object as argument.
import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors;  public class MyExecutor {  	public static void main(String[] args) { 		ExecutorService exec = Executors.newCachedThreadPool();  	    for(int i = 0; i < 5; i++)  	      exec.execute(new MyRunnable()); //MyRunnable implements Runnable 	}  }

Thread vs. Runnable
Which of these idioms should you use? The first idiom, which employs a Runnable object, is more general, because the Runnable object can subclass a class other than Thread. Also using Runnable enhances separation of concerns vi a composition, by separating the method of execution from the interface used to construct the Runnable
The second idiom is easier to use in simple applications, but is limited by the fact that your task class must be a descendant of Thread.
References: http://www.coderanch.com/t/233370/threads/java/Thread-vs-Runnable
http://stackoverflow.com/questions/541487/java-implements-runnable-vs-extends-thread

Method 1 – Subclassing the Thread class.

This method is the simplest and most straightforward. It has the disadvantage, however that the 
subclass cannot override any other class than the Thread class. There is also tight coupling between the thread 
implementation and invocation.

Method 2 – Implementing the Runnable interface
This method is slightly more involved, it has the advantage that the Runnable subinterface can override another class, and 
there is a decoupling between the thread implementation and invocation.

Method 3 – Utilizing the Executor interface
In this method we pass an instance of the Thread class or a Runnable to the Executor interface.
The advantage of this approach is that it provides a layer of indirection between a client and the execution of a task;
instead of a client executing a task directly, an intermediate object executes the task. 
Executors allow you to manage the execution of asynchronous tasks without having to explicitly manage the lifecycle of threads.

Executor Factory Methods

FACTORYMETHOD DETAILS
newCachedThreadPool() Creates thread pool that creates new threads as needed, but will reuse previously constructred threads when they’re available
newFixedThreadPool(..) Creates thread pool that reuses a fixed number of threads operating off a shared unbounded queue
newScheduledThreadPool(..) Creates a thread pool that can schedule commands to run after a given delay, or to execute periodically
newSingleThreadExecutor() Creates an Executor that uses a single worker thread operating off an unbounded queue
newSingleThreadScheduledExecutor() Creates a single-threaded executor that can schedule commands to run after a given delay, or to execute periodically

Abstract Data Type vs Data Structure

What is the difference between an abstract data type (ADT) and a data structure ?

The question above can be reframed in a more concrete manner by asking this question:
What is the difference between an array and a stack ?
An array is a data structure while a stack is an abstract data type.
An abstract data type (ADT) is a specification for how an ‘data’ interface should behave without any reference to its actual implementation.

The wikipedia definition of an ADT is as follows:
An abstract data type is defined as a mathematical model of the data objects that make up a data type as well as the functions that operate on these objects.
From the NIST Dictionary of Algorithms and Data structures we have the following definition of an ADT:
A set of data values and associated operations that are precisely specified independent of any particular implementation.
Thus in the case of a stack, it exhibits Last In First Out (LIFO) behavior when elements are
added to and removed from it. The concrete implementation of the ADT is where the data structure comes in. Thus a Stack can be implemented as an array or as a linked list.
This might lead us to conclude that an ADT is more a “theoretical/abstract” concept, while the data structure has more to do with a concrete implementation.
Under the definitions above,
the following constructs are ADTs:

  • Stack
  • Queue
  • Bag
  • List
  • Priority Queue
  • Trie
  • Heap
  • Binary Tree
  • Set
  • Map

while these are data structures:

  • array
  • linked list
  • hash map/dictionary

We can gain a better understanding of this when we consider these constructs in Java.
The ADT corresponds to the interface type, while the data structure would correspond to the concrete class.Thus a Java array, ArrayListLinkedListHashMap are actually data structures and the corresponding interfaces they implement would be equivalent to ADTs.
See this article for more about abstract data types in Java

Variable storage in Java

In order to figure out where a variable is stored in Java the most important factor is where the variable is declared.
A general rule of thumb is this:

 

  • local variables are stored on the stack
  • instance variables are stored on the heap
  • static variables are stored on the PermGen area of the heap

 

There are caveats to this however, which are explained below:

Variable Storage Details

Local variables

Primitives and object references declared in a method will be stored on the stack. However, the actual object, if created using new() will be stored on the heap, regardless of where the declaration took place. Hence in the following piece of code:
void aMethod()
{
int playerNum=5;
Player pl=new Player();
}

The primitive variable playerNum and object reference variable pl will be stored on the stack, while the actual Player object itself will live on the heap. When the code exits aMethod and goes out of scope, playerNum and pl will be popped from the stack and cease to exist but the Player object will persist on the heap until it is eventually garbage collected.

Instance variables

Instance variables, even primitives live on the stack.

Consider the code below:

public class Car {
int vinNumber;
String make;
String model;
int year;
String class;

Car(int vin, String make, String model, int year, String class)
{
this.vinNumber=vin;
this.make=make;
this.model=model;
this.year=year;
this.class=class;
}

public static void main(String[] args)
{
Car c=new Car(19281,”Audi”, “A6″,2012,”sedan”);

}

 

Since an instance of Car can only be instantiated via a call to new(), we see that:

  • The Car object c lives on the heap
  • All instance primitives and objects that are part of the Car object are also stored on the heap.

Static variables

The rule for static variables is this: Static methods, primitive variables and object references are stored in the PermGen section of the heap since they are part of the reflection i.e. class, not instance related data. However, in the case of objects, the actual object itself is stored in the regular areas of the heap (young/old generation or survivor space).

 

References:

  1. http://www.tutorialspoint.com/java/java_variable_types.htm
  2. http://www.coderanch.com/t/202217/Performance/java/JVM-heap-stores-local-objects
  3. http://stackoverflow.com/questions/8387989/where-is-a-static-method-and-a-static-variable-stored-in-java-in-heap-or-in-sta
  4. http://stackoverflow.com/questions/3698078/where-does-the-jvm-store-primitive-variables

Connecting to database via JDBC

Outline the steps involved in conecting to a database via JDBC and executing a query

 

  1. //i. Load the database JDBC Driver
    try { Class.forName("com.ibm.db2.jcc.DB2Driver");
  2. //Obtain a connection using the database JDBC Driver
    Connection conn = DriverManager.getConnection (url);
  3. // Create a PreparedStatement from the connection:
    PreparedStatement stmt=conn.createStatement();
  4. //Execute a query and generate a ResultSet instance
    String qry="select empId from Employee";
    ResultSet rs = stmt.executeQuery(qry); //stmt.executeUpdate for updates and inserts
  5. // Obtain the results of the query from the ResultSet
    while (rs.next()) {
    String empId = rs.getString(1);
    }
  6. // Close the result set and statement
    rs.close();
    stmt.close();
  7. // Close the connection
    conn.close();
    catch(SQLException ex)
    {
    System.out.println(sqlEx.getMessage());
    }

Row With Ceiling Value

Consider the following database table which contains the list of options expiring for a particular account.

Option Description         Strike Price  Expiration Date TWTR May 09 2014 $30 Call	30	  05/09/2014 TWTR May 17 2014 $30 Call	30	  05/17/2014 DDD May 09 2014 $48 Call	48	  05/09/2014 DDD May 17 2014 $50 Call	50	  05/17/2014 FB May 09 2014 $58.50 Put	58.5	  05/09/2014 FB May 17 2014 $60 Put	        60	  05/17/2014 AMZN May 23 2014 $300 Call	300	  05/23/2014 AMZN May 23 2014 $305 Put	305	  05/23/2014 CVS May 17 2014 $72.50 Put	72.5	  05/17/2014 CVS Jun 21 2014 $75 Put         75	  06/21/2014 

The owner of the account would like to exercise all options that expire as close as possible before the date of May 18 as he will be out of town on that date. Come up with an SQL query that will show which options need to be exercised.
The table name is optionsExpirationSchedule

Solution

SELECT * FROM optionsExpirationSchedule oes WHERE oes.expiryDate= (   SELECT MAX(expiryDate) FROM optionsExpirationSchedule   WHERE expiryDate <= TO_DATE('20140518','YYYYMMDD') ); 

Clustered Indexes

In this article I explain what is meant by a clustered index and compare it with a non-clustered index.

Definition

A clustered index on a table determines the order in which the rows of the table are stored on disk. If the table has a clustered index, the rows of the table are stored in the same order as that of the clustered index. As an illustration, suppose we have a Customer table that contains the following columns:

  • customerID
  • firstName
  • lastName

where customerID is the primary key on the table. If we define the clustered index on customerID, then the rows of the table will be stored in sorted order according to customerID. This means that rows with customerID=1000, customerID=1001, etc will be adjacent to each other on disk.

Advantages and Disadvantages

The advantages of having a clustered index include the following:

  1. Range queries involving the clustered index will be faster since once the row with 1st key value is located the remaining rows physically stored next to each other and no more searching is needed.
  2. The leaf nodes of the B-tree that make up the clustered index contain the actual data pages as opposed to a non-clustered index where the leaf nodes contain pointers to rows on data pages. Hence there is 1 less level of indirection for a clustered index and this improves performance.

The disadvantages of a clustered index include:

  1. Updates involving columns used in the clustered index result in a performance hit since the rows may have to be re-arranged to keep the table in sorted order in line with the clustered index. In light of this, it is recommended that a clustered index is created on a primary or foreign key, since this would be less prone to updates.

Comparison of clustered vs non-clustered indexes

  • Returning to our Customer table example, if we have a clustered index on Customer id, then the leaf node of the clustered index will contain the actual row data (data pages) for a particular customerID while for a non-clustered index the value of customerID and a pointer to the actual row is what is stored at the leaf node.
  • There can only be 1 clustered index per table, but multiple non-clustered indexes (up to 249 in the case of Sybase). This is because the rows in the table are physically ordered according to the clustered index and there is only 1 way of doing so.
  • A clustered index determines the order in which the rows of the table can be stored on disk, but this is not the case for a non-clustered index
  • A clustered index can be a performance hit in the case of updates to an indexed column since the rows may have to be re-ordered in order to maintain order. Range queries and queries involving indexed foreign keys tend to show better performance for a clustered vs non-clustered index.

References

For a more in depth look at clustered indexes, see the following very good article by Michelle Ufford : Effective Clustered Indexes