Java World: October 2012

Tuesday, October 16, 2012

Differences between Interface and Abstract class

Abstract classes

An abstract class is a class that is declared abstract and may contain one or more abstract methods. An abstract method is a method that is declared but contains no implementation. An abstract class can't be instantiated.

The abstract methods of an abstract class have to be implemented by its concrete subclasses.

An abstract class is declared by prefixing the class definition with the abstract keyword.

Abstract classes are useful when some general methods should be implemented once and the specialized behavior should be implemented by child classes. Interfaces are useful when every method should be left to the implementing classes.

If even a single method is abstract, the whole class must be declared abstract.
Abstract classes may not be instantiated, and require subclasses to provide implementations for the abstract methods.
You can’t mark a class as both abstract and final.

Example of Abstract class:

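abstract class testAbstractClass {

    protected String myString;

    // Concrete method shared by all subclasses
    public String getMyString() {
        return myString;
    }

    // Abstract method: subclasses must provide the implementation
    public abstract String anyAbstractFunction();

}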
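A minimal sketch of how such a class might be used. The `Greeter` hierarchy below is hypothetical and not part of the example above; it only assumes the rules just described: the abstract class supplies the shared code and each concrete subclass supplies the abstract method.

```java
// Hypothetical names, for illustration only.
abstract class Greeter {
    protected String name;

    // Abstract classes may declare constructors for their subclasses to call
    Greeter(String name) {
        this.name = name;
    }

    // Shared behaviour implemented once
    public String getName() {
        return name;
    }

    // Must be implemented by every concrete subclass
    public abstract String greet();
}

class EnglishGreeter extends Greeter {
    EnglishGreeter(String name) {
        super(name);
    }

    @Override
    public String greet() {
        return "Hello, " + getName();
    }
}

class AbstractDemo {
    public static void main(String[] args) {
        // Greeter g = new Greeter("Ann");  // would not compile: Greeter is abstract
        Greeter g = new EnglishGreeter("Ann");
        System.out.println(g.greet());
    }
}
```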


Interface

An interface is a description of a set of methods that implementing classes must have.

An interface is a Java type that contains method declarations but no implementations.

Classes that implement an interface must provide definitions for all of its methods.

An interface contains only abstract methods and final declarations.

In Java an interface defines methods but does not implement them. An interface can include constants. A class that implements an interface is bound to implement all the methods defined in it, unless the class is itself abstract.

You can’t mark an interface as final.
Interface variables are implicitly public, static and final.
An interface cannot extend anything but other interfaces.

Example of Interface:

public interface sampleInterface {

public void functionOne();

public long CONSTANT_ONE = 1000;

}
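The following sketch shows a class implementing two interfaces and using an interface constant. All names (`Named`, `Priced`, `Product`) are made up for illustration.

```java
// Hypothetical names, for illustration only.
interface Named {
    String PREFIX = "id-";   // implicitly public static final
    String name();           // implicitly public abstract
}

interface Priced {
    long price();
}

// A class may implement several interfaces at once
class Product implements Named, Priced {
    private final String code;
    private final long price;

    Product(String code, long price) {
        this.code = code;
        this.price = price;
    }

    @Override
    public String name() {
        return PREFIX + code;
    }

    @Override
    public long price() {
        return price;
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        Product p = new Product("A1", 250);
        Named n = p;    // the same object seen through either interface
        Priced q = p;
        System.out.println(n.name() + " costs " + q.price());
    }
}
```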

Similarities:

Neither abstract classes nor interfaces can be instantiated.

Differences:

Abstract Class	Interfaces
An abstract class can provide complete, default code and/or just the details that have to be overridden.	An interface cannot provide any code at all, just the signatures (this changed with default methods in Java 8).
In case of abstract class, a class may extend only one abstract class.	A Class may implement several interfaces.
An abstract class can have non-abstract methods.	All methods of an interface are abstract (before Java 8).
An abstract class can have instance variables.	An Interface cannot have instance variables.
An abstract class can have any visibility: public, private, protected.	An Interface visibility must be public (or) none.
If we add a new method to an abstract class then we have the option of providing default implementation and therefore all the existing code might work properly.	If we add a new method to an Interface then we have to track down all the implementations of the interface and define implementation for the new method.
An abstract class can contain constructors.	An interface cannot contain constructors.
Calls to abstract class methods can be marginally faster.	Interface calls may require an extra indirection to find the corresponding method in the actual class, although in practice the difference is usually negligible.

Use Interfaces when…

You see that something in your design will change frequently.
If various implementations only share method signatures then it is better to use Interfaces.
You need some classes to use methods that you don't want to include in the class itself. An interface then makes it easy for those classes to implement and use the methods it defines.

Use Abstract Class when…

If various implementations are of the same kind and share common behavior or state, then an abstract class is the better choice.
When you want to provide a generalized form of abstraction and leave the implementation task with the inheriting subclass.
Abstract classes are an excellent way to create planned inheritance hierarchies. They're also a good choice for nonleaf classes in class hierarchies.
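As a sketch of the "common behavior" case, the hypothetical `Report` class below implements the shared rendering once and leaves only the varying parts to subclasses. The names are illustrative and not taken from the text.

```java
// Hypothetical names, for illustration only.
abstract class Report {
    // Common behaviour shared by every report; subclasses cannot change it
    public final String render() {
        return "== " + title() + " ==\n" + body();
    }

    // The specialised parts are left to the inheriting subclass
    protected abstract String title();

    protected abstract String body();
}

class SalesReport extends Report {
    @Override
    protected String title() {
        return "Sales";
    }

    @Override
    protected String body() {
        return "42 orders";
    }
}

class ReportDemo {
    public static void main(String[] args) {
        System.out.println(new SalesReport().render());
    }
}
```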

Connection Pooling in Java

Connection Pooling - what is it and why do we need it?

It's a technique that allows multiple clients to make use of a cached set of shared and reusable connection objects providing access to a database. The connection pooling feature is supported only on J2SDK 1.4 and later releases.

Opening and closing database connections is an expensive process, so connection pools improve the performance of executing commands on a database by maintaining connection objects in a pool. The pool facilitates reuse of the same connection object to serve a number of client requests. Every time a client request is received, the pool is searched for an available connection object, and it is highly likely that a free one is found. Otherwise, either the incoming request is queued or a new connection object is created and added to the pool (depending on how many connections are already in the pool and how many the particular implementation and configuration can support). As soon as a request finishes using a connection object, the object is given back to the pool, from where it is assigned to one of the queued requests (based on the scheduling algorithm the particular connection pool implementation follows for serving queued requests). Since most requests are served using existing connection objects, connection pooling brings down the average time users wait for a connection to the database to be established.

How is it used?

It's normally used in a web-based enterprise application where the application server handles the responsibilities of creating connection objects, adding them to the pool, assigning them to incoming requests, taking the used connection objects back, returning them to the pool, etc. Suppose a dynamic web page of the web-based application explicitly creates a connection to the database (using the JDBC 2.0 pooling manager interfaces and calling the getConnection() method on a PooledConnection object ... I'll discuss both the JDBC 1.0 and JDBC 2.0 approaches in a separate article) and closes it after use. When the statement that creates the connection executes, the application server internally hands out a connection object from the pool (in this case it's called a logical connection), and when the statement that closes the connection executes, the application server simply returns the connection to the pool. Remember that you can still use the JDBC 1.0 / JDBC 2.0 APIs to obtain physical connections. This is of course done very rarely, probably only in cases where a connection to a particular database is needed once in a while and maintaining a pool of connections is not really needed.
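The idea of a logical connection can be sketched without a database. In the hypothetical classes below (not a real JDBC or application server API), closing the logical connection hands the underlying physical connection back to the pool instead of closing it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Stand-in for an expensive physical database connection (hypothetical)
class PhysicalConnection {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }
}

// What the application sees: close() returns the physical connection to the pool
class LogicalConnection implements AutoCloseable {
    private final PhysicalConnection physical;
    private final Deque<PhysicalConnection> pool;

    LogicalConnection(PhysicalConnection physical, Deque<PhysicalConnection> pool) {
        this.physical = physical;
        this.pool = pool;
    }

    PhysicalConnection physical() {
        return physical;
    }

    @Override
    public void close() {
        pool.push(physical);   // handed back, not actually closed
    }
}

class LogicalConnectionDemo {
    public static void main(String[] args) {
        Deque<PhysicalConnection> pool = new ArrayDeque<>();
        pool.push(new PhysicalConnection());

        PhysicalConnection used;
        try (LogicalConnection c = new LogicalConnection(pool.pop(), pool)) {
            used = c.physical();
        }
        // The same physical connection is back in the pool and still open
        System.out.println(pool.peek() == used && used.isOpen());
    }
}
```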

How many connections can the pool handle? Who creates/releases them?

These days it's pretty configurable: the maximum connections, the minimum connections, the maximum number of idle connections, etc. can all be configured by the server administrator. On startup the server creates a fixed number (the configured minimum) of connection objects and adds them to the pool. Once all of these connection objects are in use serving that many client requests, any extra request causes a new connection object to be created, added to the pool, and assigned to serve that extra request. This continues until the number of connection objects reaches the configured maximum for the pool. The server also keeps checking the number of idle connection objects, and if it finds more idle connection objects than the configured value of that parameter, it simply closes the extra idle connections, which are subsequently garbage collected.
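The minimum/maximum behavior described above can be sketched with a small generic pool. This is an illustrative sketch only: `BoundedPool` and its methods are hypothetical names, idle trimming is left out, and a real pool would typically queue or block a caller instead of returning null when the maximum is reached.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal pool: pre-creates `min` objects, grows on demand up to `max`
class BoundedPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int max;
    private int created;

    BoundedPool(Supplier<T> factory, int min, int max) {
        this.factory = factory;
        this.max = max;
        for (int i = 0; i < min; i++) {   // the configured minimum is created up front
            idle.push(factory.get());
            created++;
        }
    }

    // Returns an idle object, creates a new one while below max, otherwise null
    synchronized T borrow() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (created < max) {
            created++;
            return factory.get();
        }
        return null;
    }

    // Makes the object available to later callers again
    synchronized void giveBack(T object) {
        idle.push(object);
    }

    synchronized int created() {
        return created;
    }
}

class BoundedPoolDemo {
    public static void main(String[] args) {
        BoundedPool<StringBuilder> pool = new BoundedPool<>(StringBuilder::new, 1, 2);
        StringBuilder a = pool.borrow();             // pre-created object
        StringBuilder b = pool.borrow();             // created on demand
        System.out.println(pool.borrow() == null);   // maximum of 2 reached
        pool.giveBack(a);
        System.out.println(pool.borrow() == a);      // the returned object is reused
    }
}
```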

Traditional Connection Pooling vs Managed Connection Pooling

Connection pooling is an open concept and it's certainly not limited to the connection pooling we normally see in enterprise applications, i.e., the one managed by application servers. Any application can use this concept and can manage it the way it wants. Connection pooling simply means creating, managing, and maintaining connection objects in advance. A traditional application can do it manually, but as the scale and reach of the application grow, it becomes more and more difficult to manage connections without a defined and robust connection pooling mechanism. Otherwise it'll be extremely difficult to ensure the maintainability and availability of the connections and, in turn, of the application.

JDBC connection pooling

The addition of JDBC connection pooling to your application usually involves little or no code modification but can often provide significant benefits in terms of application performance, concurrency and scalability. Improvements such as these can become especially important when your application is tasked with servicing many concurrent users within the requirements of sub second response time. By adhering to a small number of relatively simple connection pooling best practices your application can quickly and easily take effective advantage of connection pooling.

Software Object Pooling

There are many scenarios in software architecture where some type of object pooling is employed as a technique to improve application performance. Object pooling is effective for two simple reasons. First, the run time creation of new software objects is often more expensive in terms of performance and memory than the reuse of previously created objects. Second, garbage collection is an expensive process so when we reduce the number of objects to clean up we generally reduce the garbage collection load.
As the saying goes, there is no such thing as a free lunch and this maxim is also true with object pooling. Object pooling does require additional overhead for such tasks as managing the state of the object pool, issuing objects to the application and recycling used objects. Therefore objects that don’t have short lifetimes in your application may not be good choices for object pooling since their low rate of reuse may not warrant the overhead of pooling.
However, objects that do have short lifetimes are often excellent candidates for pooling. In a pooling scenario your application first creates an object pool that can both cache pooled objects and issue objects that are not in use back to the application. For example, pooled objects could be database connections, process threads, server sockets or any other kind of object that may be expensive to create from scratch. As your application first starts asking the pool for objects they will be newly created but when the application has finished with the object it is returned to the pool rather than destroyed. At this point the benefits of object pooling will be realized since, now as the application needs more objects, the pool will be able to issue recycled objects that have previously been returned by the application.

JDBC Connection Pooling

JDBC connection pooling is conceptually similar to any other form of object pooling. Database connections are often expensive to create because of the overhead of establishing a network connection and initializing a database connection session in the back end database. In turn, connection session initialization often requires time consuming processing to perform user authentication, establish transactional contexts and establish other aspects of the session that are required for subsequent database usage.
Additionally, the database's ongoing management of all of its connection sessions can impose a major limiting factor on the scalability of your application. Valuable database resources such as locks, memory, cursors, transaction logs, statement handles and temporary tables all tend to increase based on the number of concurrent connection sessions.
All in all, JDBC database connections are both expensive to initially create and then maintain over time. Therefore, as we shall see, they are an ideal resource to pool.
If your application runs within a J2EE environment and acquires JDBC connections from an appserver defined datasource then your application is probably already using connection pooling. This fact also illustrates an important characteristic of a best practices pooling implementation -- your application is not even aware it's using it! Your J2EE application simply acquires JDBC connections from the datasource, does some work on the connection then closes the connection. Your application's use of connection pooling is transparent. The characteristics of the connection pool can be tweaked and tuned by your appserver's administrator without the application ever needing to know.
If your application is not J2EE based then you may need to investigate using a standalone connection pool manager. Connection pool implementations are available from JDBC driver vendors and a number of other sources.

JDBC Connection Scope

How should your application manage the life cycle of JDBC connections? Asked another way, this question really asks - what is the scope of the JDBC connection object within your application? Let's consider a servlet that performs JDBC access. One possibility is to define the connection with servlet scope as follows.

import java.sql.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class JDBCServlet extends HttpServlet {

    // One connection kept for the whole lifetime of the servlet
    private Connection connection;

    public void init(ServletConfig c) throws ServletException {
        super.init(c);
        // Open the connection here
    }

    public void destroy() {
        // Close the connection here
    }

    public void doGet(HttpServletRequest req, HttpServletResponse res) throws ServletException {
        // Use the connection here
        try {
            Statement stmt = connection.createStatement();
            // ..<do JDBC work>..
        } catch (SQLException e) {
            throw new ServletException(e);
        }
    }
}
Using this approach the servlet creates a JDBC connection when it is loaded and destroys it when it is unloaded. The doGet() method has immediate access to the connection since it has servlet scope. However, the database connection is kept open for the entire lifetime of the servlet, and the database has to retain an open connection for every user that is connected to your application. If your application supports a large number of concurrent users its scalability will be severely limited!

Method Scope Connections

To avoid the long life time of the JDBC connection in the above example we can change the connection to have method scope as follows.

public class JDBCServlet extends HttpServlet {

private Connection getConnection() throws SQLException {
    ..<create a JDBC connection>..
}

public void doGet (HttpServletRequest req, HttpServletResponse res) throws ServletException {
    try {
      Connection connection = getConnection();
      ..<do JDBC work>..
      connection.close();
    }
    catch (SQLException sqlException) {
      sqlException.printStackTrace();
    }
}
}

This approach represents a significant improvement over our first example because now the connection's life time is reduced to the time it takes to execute doGet(). The number of connections to the back end database at any instant is reduced to the number of users who are concurrently executing doGet(). However this example will create and destroy a lot more connections than the first example and this could easily become a performance problem.
In order to retain the advantages of a method scoped connection but reduce the performance hit of creating and destroying a large number of connections we now utilize connection pooling to arrive at our finished example that illustrates the best practices of connection pool usage.

import java.sql.*;
import javax.naming.*;
import javax.servlet.*;
import javax.servlet.http.*;
import javax.sql.*;

public class JDBCServlet extends HttpServlet {

    private DataSource datasource;

    public void init(ServletConfig config) throws ServletException {
        super.init(config);
        try {
            // Look up the JNDI data source only once at init time
            Context envCtx = (Context) new InitialContext().lookup("java:comp/env");
            datasource = (DataSource) envCtx.lookup("jdbc/MyDataSource");
        }
        catch (NamingException e) {
            e.printStackTrace();
        }
    }

    private Connection getConnection() throws SQLException {
        return datasource.getConnection();
    }

    public void doGet(HttpServletRequest req, HttpServletResponse res) throws ServletException {
        Connection connection = null;
        try {
            connection = getConnection();
            // ..<do JDBC work>..
        }
        catch (SQLException sqlException) {
            sqlException.printStackTrace();
        }
        finally {
            // Closing a pooled connection returns it to the pool
            if (connection != null) {
                try { connection.close(); } catch (SQLException e) {}
            }
        }
    }
}

This approach uses the connection only for the minimum time the servlet requires it and also avoids creating and destroying a large number of physical database connections. The connection best practices that we have used are:

A JNDI datasource is used as a factory for connections. The JNDI datasource is looked up only once in init() since JNDI lookups can be slow. JNDI should be configured so that the bound datasource implements connection pooling. Connections issued from the pooling datasource will be returned to the pool when closed.
We have moved the connection.close() into a finally block to ensure that the connection is closed even if an exception occurs during the doGet() JDBC processing. This practice is essential when using a connection pool. If a connection is not closed it will never be returned to the connection pool and become available for reuse. A finally block can also guarantee the closure of resources attached to JDBC statements and result sets when unexpected exceptions occur. Just call close() on these objects also.
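Since Java 7 the same guarantee can also be written with try-with-resources, because java.sql.Connection, Statement and ResultSet all implement AutoCloseable. The sketch below uses a hypothetical `TrackedResource` stand-in instead of real JDBC objects so that it runs without a database; it only records the order in which resources are opened and closed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JDBC Connection or Statement
class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> log;

    TrackedResource(String name, List<String> log) {
        this.name = name;
        this.log = log;
        log.add("open " + name);
    }

    @Override
    public void close() {
        log.add("close " + name);
    }
}

class TryWithResourcesDemo {
    // Runs some simulated work and returns the recorded events
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (TrackedResource connection = new TrackedResource("connection", log);
             TrackedResource statement = new TrackedResource("statement", log)) {
            log.add("work");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException e) {
            // Both resources are already closed when this handler runs
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
    }
}
```

Resources are closed in the reverse order of their declaration, so the statement is closed before the connection is handed back.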

Connection Pool Tuning

One of the major advantages of using a connection pool is that characteristics of the pool can be changed without affecting the application. If your application confines itself to using generic JDBC you could even point it at a different vendor's database without changing any code! Different pool implementations will provide different settable properties to tune the connection pool. Typical properties include the number of initial connections, the minimum and maximum number of connections that can be present at any time and a mechanism to purge connections that have been idle for a specific period of time.

In general, optimal performance is attained when the pool in its steady state contains just enough connections to service all concurrent connection requests without having to create new physical database connections. If the pooling implementation supports purging idle connections it can optimize its size over time to accommodate varying application loads over the course of a day. For example, scaling up the number of connections cached in the pool during business hours then dynamically reducing the pool size after business hours.

Connection Pooling Metrics

In order to compare the difference between using non pooled connections and connection pooling I built a simple servlet that displays orders in Oracle's sample OE (Order Entry) database schema. The testing configuration consists of Jakarta's JMeter load testing tool, the Tomcat 5.5 servlet container and an Oracle 10g database instance. Tomcat and Oracle were running on separate 512MB machines connected by 100Mbps Ethernet.
The servlet is written to use either pooled or non pooled database connections depending on the query string passed in its URL. So the servlet can be dynamically instructed by the load tester to use (or not use) connection pooling in order to compare throughput in both modes. The servlet creates pooled connections using a Tomcat DBCP connection pool and non pooled connections directly from Oracle's thin JDBC driver. Having acquired a connection, the servlet executes a simple join between the order header and line tables then formats and outputs the results as HTML.

Java Regex Tutorial

1. Regular Expressions

1.1. Overview

A regular expression defines a search pattern for strings. This pattern may match one or several times or not at all for a given string. The abbreviation for regular expression is regex.
A simple example of a regular expression is a (literal) string. For example the Hello World regex will match the "Hello World" string.
. (dot) is another example of a regular expression. A dot matches any single character; it would match for example "a" or "z" or "1".

1.2. Usage

Regular expressions can be used to search, edit and manipulate text.
Regular expressions are supported by most programming languages, e.g. Java, Perl, Groovy, etc.
Unfortunately each language supports regular expressions slightly differently.
If a regular expression is used to analyse or modify a text, this process is called "The regular expression is applied to the text".
The pattern defined by the regular expression is applied on the string from left to right. Once a source character has been used in a match, it cannot be reused. For example the regex "aba" will match "ababababa" only two times (aba_aba__).
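The non-overlapping behavior can be checked with a short sketch. The class and method names are made up for illustration; `Pattern` and `Matcher` are the standard java.util.regex classes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingDemo {
    // Counts how many times regex is found in input, scanning left to right
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {   // each search resumes after the previous match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("aba", "ababababa"));   // prints 2
    }
}
```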

2. Prerequisites

Some of the following examples use JUnit to validate the result. You should be able to adjust them if you do not want to use JUnit. To learn about JUnit please see the JUnit Tutorial.

3. Regular Expressions

The following is an overview of regular expressions. This chapter is intended as a reference for the different regex elements.

3.1. Common matching symbols

Table 1.

Regular Expression	Description
`.`	Matches any character
`^regex`	regex must match at the beginning of the line
`regex$`	regex must match at the end of the line
`[abc]`	Set definition, can match the letter a or b or c
`[abc][vz]`	Set definition, can match a or b or c followed by either v or z
`[^abc]`	When a "^" appears as the first character inside [], it negates the pattern. This can match any character except a or b or c
`[a-d1-7]`	Ranges, matches one letter between a and d or one digit from 1 to 7, will not match d1
`X|Z`	Finds X or Z
`XZ`	Finds X directly followed by Z
`$`	Checks if a line end follows
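A few of the symbols from Table 1 in action. The helper `found` and the class name are illustrative; note that String.matches() must match the whole string, while Matcher.find() looks for the pattern anywhere in the input.

```java
import java.util.regex.Pattern;

class CommonSymbolsDemo {
    // True if the regex is found anywhere in the input
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println("cat".matches("[abc]at"));        // true
        System.out.println("x".matches("[^abc]"));           // true
        System.out.println("d".matches("[a-d1-7]"));         // true
        System.out.println("d1".matches("[a-d1-7]"));        // false: the set matches one character
        System.out.println("av".matches("[abc][vz]"));       // true
        System.out.println("Z".matches("X|Z"));              // true
        System.out.println(found("^Hello", "Hello World"));  // true
        System.out.println(found("World$", "Hello World"));  // true
        System.out.println(found("^World", "Hello World"));  // false
    }
}
```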

3.2. Metacharacters

The following metacharacters have a pre-defined meaning and make certain common patterns easier to use, e.g. \d instead of [0-9].

Table 2.

Regular Expression	Description
`\d`	Any digit, short for [0-9]
`\D`	A non-digit, short for [^0-9]
`\s`	A whitespace character, short for [ \t\n\x0b\r\f]
`\S`	A non-whitespace character, short for [^\s]
`\w`	A word character, short for [a-zA-Z_0-9]
`\W`	A non-word character, short for [^\w]
`\S+`	Several non-whitespace characters
`\b`	Matches a word boundary. A word character is [a-zA-Z0-9_] and \b matches its boundaries.
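The metacharacters from Table 2 can be tried out as follows. The helper `replaceWholeWord` is a hypothetical name, and it assumes the word contains no regex special characters.

```java
class MetacharacterDemo {
    // Replaces word only where it stands alone, using \b on both sides.
    // Assumes word contains no regex special characters.
    static String replaceWholeWord(String text, String word, String replacement) {
        return text.replaceAll("\\b" + word + "\\b", replacement);
    }

    public static void main(String[] args) {
        System.out.println("42".matches("\\d\\d"));       // true
        System.out.println("4a".matches("\\d\\D"));       // true
        System.out.println("a b".matches("\\w\\s\\w"));   // true
        System.out.println("a_1".matches("\\w+"));        // true
        System.out.println("a-1".matches("\\w+"));        // false: - is not a word character
        // The "cat" inside "concat" is not at a word boundary and is left alone
        System.out.println(replaceWholeWord("cat concat", "cat", "dog"));   // dog concat
    }
}
```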

3.3. Quantifier

A quantifier defines how often an element can occur. The symbols ?, *, + and {} define the quantity of the regular expressions.

Table 3.

Regular Expression	Description	Examples
`*`	Occurs zero or more times, is short for {0,}	X* - Finds zero or more letters X, .* - any character sequence
`+`	Occurs one or more times, is short for {1,}	X+ - Finds one or more letters X
`?`	Occurs zero or one time, is short for {0,1}	X? - Finds zero or exactly one letter X
`{X}`	Occurs X number of times, {} describes the number of repetitions of the preceding element	\d{3} - Three digits, .{10} - any character sequence of length 10
`{X,Y}`	Occurs between X and Y times	\d{1,4} - \d must occur at least once and at most four times
`*?`	? after a quantifier makes it a "reluctant quantifier", it tries to find the smallest match.
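The difference between a greedy and a reluctant quantifier is easiest to see side by side. The sketch below uses a made-up helper `firstMatch` and a made-up input string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first match of regex in input, or null if there is none
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b><b>two</b>";
        System.out.println(firstMatch("<b>.*</b>", html));    // greedy: <b>one</b><b>two</b>
        System.out.println(firstMatch("<b>.*?</b>", html));   // reluctant: <b>one</b>
        System.out.println("1234".matches("\\d{1,4}"));       // true
        System.out.println("12345".matches("\\d{1,4}"));      // false: five digits
        System.out.println("".matches("X*"));                 // true: zero occurrences allowed
        System.out.println("".matches("X+"));                 // false: at least one X needed
    }
}
```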

3.4. Grouping and Backreference

You can group parts of your regular expression. In your pattern you group elements with round brackets, e.g. "()". This allows you to assign a repetition operator to a complete group.

In addition these groups also create a backreference to the part of the regular expression. This captures the group. A backreference stores the part of the String which matched the group. This allows you to use this part in the replacement.

Via the $ you can refer to a group. $1 is the first group, $2 the second, etc.

Let's for example assume you want to remove all whitespace between a letter and a following point or comma. This requires the point or the comma to be part of the pattern. Still, it should be included in the result.

// Removes whitespace between a word character and . or ,
String pattern = "(\\w)(\\s+)([\\.,])";
System.out.println(EXAMPLE_TEST.replaceAll(pattern, "$1$3"));
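A self-contained sketch of group references in a replacement string. The method names and sample inputs are made up for illustration.

```java
class GroupReplaceDemo {
    // Keeps group 1 (the word character) and group 3 (the punctuation),
    // dropping group 2 (the whitespace between them)
    static String tidyPunctuation(String text) {
        return text.replaceAll("(\\w)(\\s+)([\\.,])", "$1$3");
    }

    // Swaps two words separated by a single space
    static String swapWords(String text) {
        return text.replaceAll("(\\w+) (\\w+)", "$2 $1");
    }

    public static void main(String[] args) {
        System.out.println(tidyPunctuation("Hello , world ."));   // Hello, world.
        System.out.println(swapWords("John Smith"));              // Smith John
    }
}
```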

3.5. Negative Lookahead

Negative lookahead provides the possibility to exclude a pattern. With this you can say that a string should not be followed by another string.

Negative Lookaheads are defined via (?!pattern). For example the following will match a if a is not followed by b.

a(?!b)
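A short sketch that lists where this pattern matches in a sample string. The helper name and the input are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegativeLookaheadDemo {
    // Returns the start index of every match of regex in input
    static List<Integer> matchPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            positions.add(matcher.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // The a at index 0 is followed by b and is skipped;
        // the a at index 3 (before c) and the final a at index 6 match
        System.out.println(matchPositions("a(?!b)", "ab ac a"));   // [3, 6]
    }
}
```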

3.6. Backslashes in Java

The backslash is an escape character in Java Strings, i.e. the backslash has a predefined meaning in Java. You have to use "\\" to define a single backslash. If you want to define "\w" then you must use "\\w" in your regex. If you want to use a backslash as a literal you have to type \\\\ as \ is also an escape character in regular expressions.
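The doubling of backslashes can be verified with a small sketch. The class and method names are hypothetical; `Pattern.quote` is the standard way to turn arbitrary text into a literal pattern without escaping it by hand.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Splits at every literal backslash: four backslashes in Java source
    // become two in the regex, which stand for one literal backslash
    static String[] splitOnBackslash(String s) {
        return s.split("\\\\");
    }

    public static void main(String[] args) {
        String path = "C:\\temp";                          // the 7 characters C:\temp
        System.out.println(path.length());                 // 7
        System.out.println(path.matches("C:\\\\temp"));    // true
        System.out.println("a".matches("\\w"));            // true
        System.out.println(splitOnBackslash(path).length); // 2
        // Pattern.quote builds a literal pattern without manual escaping
        System.out.println(path.split(Pattern.quote("\\")).length);   // 2
    }
}
```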

4. Using Regular Expressions with String.matches()

4.1. Overview

Strings in Java have built-in support for regular expressions. Strings have built-in methods for regular expressions, e.g. matches(), split() and replaceAll().
These methods are not optimized for performance. We will later use classes which are optimized for performance.

Table 4.

Method	Description
s.matches("regex")	Evaluates if "regex" matches s. Returns true only if the WHOLE string can be matched
s.split("regex")	Creates an array with substrings of s divided at occurrences of "regex". "regex" is not included in the result.
s.replaceAll("regex", "replacement")	Replaces all occurrences of "regex" with "replacement"

For the following example, create the Java project de.vogella.regex.test.

public class RegexTestStrings {
  public static final String EXAMPLE_TEST = "This is my small example "
      + "string which I'm going to " + "use for pattern matching.";

  public static void main(String[] args) {
    System.out.println(EXAMPLE_TEST.matches("\\w.*"));
    String[] splitString = (EXAMPLE_TEST.split("\\s+"));
    System.out.println(splitString.length);// Should be 14
    for (String string : splitString) {
      System.out.println(string);
    }
    // Replace all whitespace with tabs
    System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t"));
  }
}

4.2. Examples

The following class gives several examples of the usage of regular expressions with strings. See the comments for the purpose of each method.
If you want to test these examples, create the Java project de.vogella.regex.string.

public class StringMatcher {
    // Returns true if the string matches exactly "true"
    public boolean isTrue(String s) {
        return s.matches("true");
    }

    // Returns true if the string matches exactly "true" or "True"
    public boolean isTrueVersion2(String s) {
        return s.matches("[tT]rue");
    }

    // Returns true if the string matches exactly "true" or "True"
    // or "yes" or "Yes"
    public boolean isTrueOrYes(String s) {
        return s.matches("[tT]rue|[yY]es");
    }

    // Returns true if the string contains "true"
    public boolean containsTrue(String s) {
        return s.matches(".*true.*");
    }

    // Returns true if the string consists of exactly three letters
    public boolean isThreeLetters(String s) {
        return s.matches("[a-zA-Z]{3}");
        // Equivalent long form:
        // return s.matches("[a-zA-Z][a-zA-Z][a-zA-Z]");
    }

    // Returns true if the string does not have a digit at the beginning
    public boolean isNoNumberAtBeginning(String s) {
        return s.matches("^[^\\d].*");
    }

    // Returns true if the string consists of an arbitrary number of
    // word characters, none of which is b
    public boolean isIntersection(String s) {
        return s.matches("([\\w&&[^b]])*");
    }

    // Returns true if the string contains a number less than 300
    public boolean isLessThenThreeHundret(String s) {
        return s.matches("[^0-9]*[12]?[0-9]{1,2}[^0-9]*");
    }
}

5. Pattern and Matcher

For advanced regular expressions the java.util.regex.Pattern and java.util.regex.Matcher classes are used.
You first create a Pattern object which defines the regular expression. This Pattern object allows you to create a Matcher object for a given string. This Matcher object then allows you to do regex operations on a String.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestPatternMatcher {
public static final String EXAMPLE_TEST = "This is my small example string which I'm going to use for pattern matching.";

public static void main(String[] args) {
    Pattern pattern = Pattern.compile("\\w+");
    // In case you would like to ignore case sensitivity you could use this
    // statement
    // Pattern pattern = Pattern.compile("\\s+", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(EXAMPLE_TEST);
    // Check all occurrences
    while (matcher.find()) {
      System.out.print("Start index: " + matcher.start());
      System.out.print(" End index: " + matcher.end() + " ");
      System.out.println(matcher.group());
    }
    // Now create a new pattern and matcher to replace whitespace with tabs
    Pattern replace = Pattern.compile("\\s+");
    Matcher matcher2 = replace.matcher(EXAMPLE_TEST);
    System.out.println(matcher2.replaceAll("\t"));
}
}

6. Java Regex Examples

The following lists typical examples for the usage of regular expressions. I hope you find similarities to your examples.

6.1. Or

Task: Write a regular expression which matches a text line if this text line contains either the word "Joe" or the word "Jim" or both.
Create a project de.vogella.regex.eitheror and the following class.

import org.junit.Test;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class EitherOrCheck {
@Test
public void testSimpleTrue() {
    String s = "humbapumpa jim";
    assertTrue(s.matches(".*(jim|joe).*"));
    s = "humbapumpa jom";
    assertFalse(s.matches(".*(jim|joe).*"));
    s = "humbaPumpa joe";
    assertTrue(s.matches(".*(jim|joe).*"));
    s = "humbapumpa joe jim";
    assertTrue(s.matches(".*(jim|joe).*"));
}
}

6.2. Phone number

Task: Write a regular expression which matches any phone number.
A phone number in this example consists either of 7 digits in a row or of 3 digits, a (white)space or a dash, and then 4 digits.

import org.junit.Test;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class CheckPhone {

    @Test
    public void testSimpleTrue() {
        // Three digits, an optional whitespace or dash, then four digits
        String pattern = "\\d\\d\\d([-\\s])?\\d\\d\\d\\d";
        String s = "1233323322";
        assertFalse(s.matches(pattern));
        s = "1233323";
        assertTrue(s.matches(pattern));
        s = "123 3323";
        assertTrue(s.matches(pattern));
    }
}

6.3. Check for a certain number range

The following example checks whether a text contains at least three consecutive digits.
Create the Java project "de.vogella.regex.numbermatch" and the following class.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class CheckNumber {

@Test
public void testSimpleTrue() {
    String s= "1233";
    assertTrue(test(s));
    s= "0";
    assertFalse(test(s));
    s = "29 Kasdkf 2300 Kdsdf";
    assertTrue(test(s));
    s = "99900234";
    assertTrue(test(s));
}

public static boolean test (String s){
    Pattern pattern = Pattern.compile("\\d{3}");
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()){
      return true;
    }
    return false;
}

}

6.4. Building a link checker

The following example allows you to extract all valid links from a webpage. It does not consider links which start with "javascript:" or "mailto:".

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkGetter {
private Pattern htmltag;
private Pattern link;
private final String root;

public LinkGetter(String root) {
    this.root = root;
    htmltag = Pattern.compile("<a\\b[^>]*href=\"[^>]*>(.*?)</a>");
    link = Pattern.compile("href=\"[^>]*\">");
}

public List<String> getLinks(String url) {
    List<String> links = new ArrayList<String>();
    try {
      BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new URL(url).openStream()));
      String s;
      StringBuilder builder = new StringBuilder();
      while ((s = bufferedReader.readLine()) != null) {
        builder.append(s);
      }

      Matcher tagmatch = htmltag.matcher(builder.toString());
      while (tagmatch.find()) {
        Matcher matcher = link.matcher(tagmatch.group());
        // Skip anchor tags whose href does not match the link pattern
        if (!matcher.find()) {
          continue;
        }
        String href = matcher.group().replaceFirst("href=\"", "")
            .replaceFirst("\">", "");
        if (valid(href)) {
          links.add(makeAbsolute(url, href));
        }
      }
    } catch (MalformedURLException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }
    return links;
}

private boolean valid(String s) {
    if (s.matches("javascript:.*|mailto:.*")) {
      return false;
    }
    return true;
}

private String makeAbsolute(String url, String link) {
    // Links which are already absolute are returned unchanged
    if (link.matches("https?://.*")) {
      return link;
    }
    boolean urlEndsWithSlash = url.endsWith("/");
    boolean linkStartsWithSlash = link.startsWith("/");
    // Avoid a double slash when both parts bring their own
    if (urlEndsWithSlash && linkStartsWithSlash) {
      return url + link.substring(1);
    }
    // Add the missing slash when neither part has one
    if (!urlEndsWithSlash && !linkStartsWithSlash) {
      return url + "/" + link;
    }
    // Exactly one slash is present between the two parts
    return url + link;
}
}

6.5. Finding duplicated words

The regular expression \b(\w+) \1\b matches duplicated words. The (?!-in)\b(\w+) \1\b finds duplicate words if they do not start with "-in".
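The first expression can be tried out with a few lines of Java. The following is a minimal sketch; the class name DuplicateWords and the method findDuplicates are made up for this illustration. Note that the backslashes have to be doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original tutorial
class DuplicateWords {
  // A word, exactly one space, then the same word again (back reference \1)
  private static final Pattern DUPLICATE = Pattern.compile("\\b(\\w+) \\1\\b");

  // Returns every word which is immediately repeated in the text
  static List<String> findDuplicates(String text) {
    List<String> result = new ArrayList<String>();
    Matcher matcher = DUPLICATE.matcher(text);
    while (matcher.find()) {
      // Group 1 is the captured word
      result.add(matcher.group(1));
    }
    return result;
  }

  public static void main(String[] args) {
    // prints [is, test]
    System.out.println(findDuplicates("This is is a test test sentence"));
  }
}
```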

Java Preferences API

1. Introduction

The Preferences API provides a systematic way to handle program preference configurations, e.g. to save user settings, remember the last value of a field etc.
Preferences are key / value pairs where the key is an arbitrary name for the preference. The value can be a boolean, string, int or another primitive type. Preferences are read and saved by get and put methods, and the get methods also take a default value which is returned in case the preference is not yet set.
The Java Preferences API is not intended to save application data.
The Java Preferences API removes the burden from the individual programmer to write code to save configuration values on the different platforms the program may be running on.
The actual storage of the data is dependent on the platform.

2. Using the API

java.util.prefs.Preferences is easy to use. You have to define a node in which the data is stored. Then you can call the getter and setter methods. The second argument of a getter method is the default value, i.e. if the preference value is not set yet, then this value will be returned.
Create the following program.

import java.util.prefs.Preferences;

public class PreferenceTest {
private Preferences prefs;

public void setPreference() {
    // This will define a node in which the preferences can be stored
    prefs = Preferences.userRoot().node(this.getClass().getName());
    String ID1 = "Test1";
    String ID2 = "Test2";
    String ID3 = "Test3";

    // First we will get the values
    // Read a boolean value with default true
    System.out.println(prefs.getBoolean(ID1, true));
    // Read a string with default "Hello World"
    System.out.println(prefs.get(ID2, "Hello World"));
    // Read an integer with default 50
    System.out.println(prefs.getInt(ID3, 50));

    // Now set the values
    prefs.putBoolean(ID1, false);
    prefs.put(ID2, "Hello Europa");
    prefs.putInt(ID3, 45);

    // Delete the preference settings for the first value
    prefs.remove(ID1);

}

public static void main(String[] args) {
    PreferenceTest test = new PreferenceTest();
    test.setPreference();
}
}

Run the program twice. The value for ID1 should still be true because we remove it at the end of each run. The values for ID2 and ID3 should have changed after the first run.

Java Serialization

1. Java Serialization

Via Java Serialization you can stream your Java object to a sequence of bytes and restore the object from this stream of bytes. To make a Java object serializable you implement the java.io.Serializable interface. This is only a marker interface which tells the platform that the object is serializable.
Certain system-level classes such as Thread, OutputStream and its subclasses, and Socket are not serializable. If your serializable class contains such objects, it must mark them as "transient".

2. Example

import java.io.Serializable;

public class Person implements Serializable {
private String firstName;
private String lastName;
// stupid example for transient
transient private Thread myThread;

public Person(String firstName, String lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
    this.myThread = new Thread();
}

public String getFirstName() {
    return firstName;
}

public void setFirstName(String firstName) {
    this.firstName = firstName;
}

public String getLastName() {
    return lastName;
}

public void setLastName(String lastName) {
    this.lastName = lastName;
}

@Override
public String toString() {
    return "Person [firstName=" + firstName + ", lastName=" + lastName
        + "]";
}

}

The following code example shows how you can serialize and deserialize this object.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class Main {
public static void main(String[] args) {
    String filename = "time.ser";
    Person p = new Person("Lars", "Vogel");

    // Save the object to file
    FileOutputStream fos = null;
    ObjectOutputStream out = null;
    try {
      fos = new FileOutputStream(filename);
      out = new ObjectOutputStream(fos);
      out.writeObject(p);

      out.close();
    } catch (Exception ex) {
      ex.printStackTrace();
    }
    // Read the object from file
    FileInputStream fis = null;
    ObjectInputStream in = null;
    try {
      fis = new FileInputStream(filename);
      in = new ObjectInputStream(fis);
      p = (Person) in.readObject();
      in.close();
    } catch (Exception ex) {
      ex.printStackTrace();
    }
    System.out.println(p);
}
}
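The effect of "transient" becomes visible when an object is serialized and read back. The following is a minimal sketch which uses an in-memory byte array instead of a file; the classes RoundTrip and Account and the field names are hypothetical and only serve this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example class, not part of the original tutorial
class RoundTrip {
  static class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner;
    // transient fields are skipped when the object is written
    transient String sessionToken;

    Account(String owner, String sessionToken) {
      this.owner = owner;
      this.sessionToken = sessionToken;
    }
  }

  // Writes the object to a byte array and reads a copy back from it
  static Account copyViaSerialization(Account original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      ObjectOutputStream out = new ObjectOutputStream(bytes);
      out.writeObject(original);
      out.close();

      ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(bytes.toByteArray()));
      Account copy = (Account) in.readObject();
      in.close();
      return copy;
    } catch (Exception ex) {
      throw new IllegalStateException("Serialization round trip failed", ex);
    }
  }

  public static void main(String[] args) {
    Account copy = copyViaSerialization(new Account("Lars", "secret"));
    System.out.println(copy.owner);        // prints Lars
    System.out.println(copy.sessionToken); // prints null
  }
}
```

The transient field comes back as null because it was never written to the stream, so code that relies on such a field has to re-create it after deserialization.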

Dependency Injection

1. What is Dependency Injection?

The general concept behind dependency injection is called Inversion of Control. A class should not configure its dependencies but should be configured from outside.
Dependency injection is a concept which is not limited to Java. But we will look at dependency injection from a Java point of view.
A Java class has a dependency on another class if it uses an instance of this class, e.g. via calling the constructor or via a static method call. For example a class which accesses a logger service has a dependency on this service class.
Ideally Java classes should be as independent as possible from other Java classes. This makes it easier to reuse these classes and to test them independently from other classes, for example in unit tests.
If a Java class directly creates an instance of another class via the new operator, it cannot be used and tested independently from this class.
To decouple Java classes their dependencies should be fulfilled from the outside. A Java class would simply define its requirements like in the following example:

public class MyPart {
  
  @Inject private Logger logger;
  // DatabaseAccessClass would talk to the DB
  @Inject private DatabaseAccessClass dao;
  
  @Inject
  public void init(Composite parent) {
    logger.info("UI will start to build");
    Label label = new Label(parent, SWT.NONE);
    label.setText("Eclipse 4");
    Text text = new Text(parent, SWT.NONE);
    text.setText(dao.getNumber());
  }

}

Another class could read these dependencies and create an instance of the class, injecting objects into the defined dependency. This can be done via the Java reflection functionality. This class is usually called the dependency container and is a framework class.
This way the Java class has no hard dependencies, i.e. it does not rely on an instance of a certain class. For example if you want to test a class which uses another object which directly uses a database, you could inject a mock object.
Mock objects are objects which act as if they are the real object but only simulate their behavior. Mock is an old English word meaning to mimic or imitate.
If dependency injection is used, a Java class can be tested in isolation because the objects it depends on can be replaced by mock objects.
Dependency injection can happen on:

the constructor of the class (construction injection)
a method (method injection)
a field (field injection)

Dependency injection can happen on static as well as on non-static fields and methods.

Java and dependency injection frameworks

You can use dependency injection without any additional framework by providing classes with sufficient constructors or getter and setter methods.
A dependency injection framework simplifies the initialization of the classes with the correct objects.
Two popular dependency injection frameworks are Spring and Google Guice.
Eclipse 4 also uses dependency injection.
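Dependency injection without a framework can be sketched with plain constructor injection. In the following minimal example all names (ManualInjection, MessageLog, RecordingLog, Greeter) are hypothetical; the point is only that Greeter never creates its own logger, so a test can hand in a simple recording implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example, not tied to any dependency injection framework
class ManualInjection {
  // The dependency is described by an interface
  interface MessageLog {
    void info(String message);
  }

  // Simple stand-in for tests: records messages instead of writing them
  static class RecordingLog implements MessageLog {
    final List<String> messages = new ArrayList<String>();

    public void info(String message) {
      messages.add(message);
    }
  }

  // The class receives its dependency via the constructor
  static class Greeter {
    private final MessageLog log;

    Greeter(MessageLog log) {
      this.log = log;
    }

    String greet(String name) {
      log.info("greeting " + name);
      return "Hello " + name;
    }
  }

  public static void main(String[] args) {
    RecordingLog log = new RecordingLog();
    Greeter greeter = new Greeter(log);
    System.out.println(greeter.greet("Eclipse")); // prints Hello Eclipse
    System.out.println(log.messages);             // prints [greeting Eclipse]
  }
}
```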

Singleton Design Pattern

1. Singletons in Java

1.1. Overview

A singleton in Java is a class for which only one instance can be created and which provides a global point of access to this instance. The singleton pattern describes how this can be achieved.
Singletons are useful to provide a unique source of data or functionality to other Java objects. For example you may use a singleton to access your data model from within your application or to define a logger which the rest of the application can use.

1.2. Code Example

The possible implementation of a singleton depends on the version of Java you are using.
Since enum types were introduced in Java 5, you can implement singletons with a single-element enum type. According to the book "Effective Java" by Joshua Bloch, this is the best way to implement a singleton on these Java versions.

package mypackage;

public enum MyEnumSingleton {
  INSTANCE;
  
  // other useful methods here
}

Before enum types were available, a class which should be a singleton could be defined as follows.

public class Singleton {
  private static Singleton uniqInstance;

  private Singleton() {
  }

  public static synchronized Singleton getInstance() {
    if (uniqInstance == null) {
      uniqInstance = new Singleton();
    }
    return uniqInstance;
  }
  // other useful methods here
}

1.3. Evaluation

A class with only static methods would provide the same functionality as a singleton. As singletons are defined using an object-oriented approach, it is in general advised to work with singletons.
A singleton violates the "one class, one responsibility" principle, as the class manages both its single instance and its actual functionality.
A singleton cannot be subclassed as the constructor is declared private.
If you are using multiple classloaders then several instances of the singleton can get created.
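To show how callers use an enum singleton, here is a minimal sketch with a little state. The names EnumSingletonDemo and Counter are hypothetical and only serve this illustration.

```java
// Hypothetical example, not part of the original tutorial
class EnumSingletonDemo {
  enum Counter {
    INSTANCE;

    private int value;

    // synchronized so that concurrent callers never get the same number
    synchronized int next() {
      value++;
      return value;
    }
  }

  public static void main(String[] args) {
    // Every access goes through the one and only instance
    int first = Counter.INSTANCE.next();
    int second = Counter.INSTANCE.next();
    System.out.println(second - first); // prints 1

    // Looking the constant up by name yields the very same object
    System.out.println(Counter.valueOf("INSTANCE") == Counter.INSTANCE); // prints true
  }
}
```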

Design Patterns Introduction

1. Introduction

The term "Design Pattern" in software development is based on the GOF (Gang of Four) book "Design Patterns - Elements of Reusable Object-Oriented Software" by Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides. Design patterns are proven solution approaches to specific problems. A design pattern is not a framework and is not directly deployed via code.
Design patterns have two main usages:

Common language for developers: They provide developers with a common language for certain problems. For example if a developer tells another developer that he is using a Singleton, the other developer (should) know exactly what this means.
Capture best practices: Design patterns capture solutions which have been applied to certain problems. By learning these patterns and the problems they are trying to solve, an inexperienced developer can learn a lot about software design.

Design patterns are based on the basic principles of object-oriented design.

Program to an interface not an implementation
Favor object composition over inheritance

Design Patterns can be divided into:

Creational Patterns
Structural Patterns
Behavioral Patterns

2. Object-Oriented Programming

OO programming suggests that you use the following principles during the design of software. The following are not "Design Principles" but a recap of good OO design.

2.1. Encapsulation

In general, direct manipulation of an object's variables by other objects or classes is discouraged to ensure data encapsulation. A class should provide methods through which other objects can access its variables. Java deletes objects which are no longer used (garbage collection).

2.2. Abstraction

Java supports the separation of a data definition from the concrete usage of this definition.
The concept is separated from the concrete instance, which means you first define a class containing the variables and the behavior (methods) and afterwards you create the real objects which then all behave as the class defines.
A class is the definition of the behavior and data. A class cannot be used directly.
An object is an instance of this class and is the real object which can be worked with.

2.3. Polymorphism

Polymorphism is the ability of object variables to contain objects of different classes. If class X1 is a subclass of class X then a method which is defined with a parameter of type X can also be called with an object of type X1.
If you define a supertype for a group of classes, any subclass of that supertype can be substituted where the supertype is expected.
If you use an interface as a polymorphic type, any object which implements this interface can be used as an argument.
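The substitution rules described above can be sketched as follows. All names are hypothetical: Rectangle plays the role of class X, Square the role of its subclass X1, and Shape is the interface used as the polymorphic type.

```java
// Hypothetical example, not part of the original tutorial
class PolymorphismDemo {
  // Interface used as the polymorphic type
  interface Shape {
    double area();
  }

  static class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
      this.width = width;
      this.height = height;
    }

    public double area() {
      return width * height;
    }
  }

  // Square is a subclass of Rectangle
  static class Square extends Rectangle {
    Square(double side) {
      super(side, side);
    }
  }

  // Declared with a Rectangle parameter, but also accepts a Square
  static String describe(Rectangle rectangle) {
    return "area " + rectangle.area();
  }

  // Accepts any object which implements Shape
  static double totalArea(Shape[] shapes) {
    double sum = 0;
    for (Shape shape : shapes) {
      sum += shape.area();
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(describe(new Square(2))); // prints area 4.0
    Shape[] shapes = { new Rectangle(2, 3), new Square(2) };
    System.out.println(totalArea(shapes));       // prints 10.0
  }
}
```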

2.4. Inheritance

Inheritance allows classes to be based on each other. If a class A inherits from another class B this is called "class A extends class B".
For example you can define a base class which provides certain logging functionality and this class is extended by another class which adds email notification to the functionality.

2.5. Delegation

Delegation is when you hand over the responsibility for a particular task to another class or method.
If you need to use functionality in another class but you do not want to change that functionality then use delegation instead of inheritance.
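As a minimal sketch of delegation, assume an existing class Printer whose behavior should stay untouched. The names Printer, Report and DelegationDemo are hypothetical. Report does not extend Printer; it holds a Printer and forwards the formatting work to it.

```java
// Hypothetical example, not part of the original tutorial
class DelegationDemo {
  // Existing functionality which we want to use but not change
  static class Printer {
    String format(String text) {
      return "[" + text + "]";
    }
  }

  // Report delegates formatting instead of inheriting from Printer
  static class Report {
    private final Printer printer = new Printer();
    private final String title;

    Report(String title) {
      this.title = title;
    }

    String render() {
      // Hand the task over to the delegate
      return printer.format("Report: " + title);
    }
  }

  public static void main(String[] args) {
    System.out.println(new Report("Status").render()); // prints [Report: Status]
  }
}
```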

2.6. Composition

When you refer to a whole family of behaviors then you use composition. Here you program against an interface, and any class which implements this interface can be used. In composition the composed class is still defined in the calling class.
When you use composition, the composing object owns the behaviors it uses and they stop existing as soon as the composing object does.

2.7. Aggregation

Aggregation allows you to use behavior from another class without limiting the lifetime of those behaviors.
Aggregation is when one class is used as part of another class but still exists outside of that class.

2.8. Design by contract

Programming by contract assumes both sides in a transaction understand what actions generate what behavior and will abide by that contract.
Methods usually return null or throw unchecked exceptions when errors occur in a programming by contract environment.
If you believe that a method should not get called in a certain way just throw an unchecked runtime exception. This can be really powerful. Instead of checking in your calling code for exceptions you just throw an exception in the called code. Therefore you can more easily identify the place in the code where an error occurs. This follows the "crash-early" principle, which says that if an error occurs in your software you should crash immediately and not later in the program, because then it is hard to find the error.
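A minimal sketch of the "crash-early" idea: the hypothetical method average (in the hypothetical class CrashEarlyDemo) refuses invalid input immediately with an unchecked exception instead of returning a misleading value.

```java
// Hypothetical example, not part of the original tutorial
class CrashEarlyDemo {
  // Contract: values must contain at least one element
  static double average(int[] values) {
    if (values == null || values.length == 0) {
      // Fail right where the contract is violated
      throw new IllegalArgumentException("values must contain at least one element");
    }
    long sum = 0;
    for (int value : values) {
      sum += value;
    }
    return (double) sum / values.length;
  }

  public static void main(String[] args) {
    System.out.println(average(new int[] { 2, 4 })); // prints 3.0
    // average(new int[0]) would throw an IllegalArgumentException here
  }
}
```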

2.9. Cohesion

A system should have high cohesion.
Cohesion is a measure of how strongly related and focused the responsibilities of a single class are. In object-oriented programming, it is beneficial to assign responsibilities to classes in a way that keeps cohesion high. In a highly cohesive system, code readability and the likelihood of reuse are increased, while complexity is kept manageable.

2.10. The Principle of Least Knowledge

Talk only to your immediate friends.
Also known as Law of Demeter.

2.11. The Open Closed Principle

Software entities like classes, modules and functions should be open for extension but closed for modification.
This principle encourages developers to write code that can be easily extended with only minimal or no changes to existing code.
An example of a good application of this principle is a class which internally calls an abstract class to perform a certain behavior. At runtime this class is provided with a concrete implementation of this abstract class. This allows the developer to later implement another concrete class of this abstract class without changing the code of the class which uses it.
Another excellent example is the Eclipse extension point mechanism. Eclipse plug-ins or Eclipse based applications can define extension points where other plug-ins can later add functionality.
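The abstract class example can be sketched as follows. All names (OpenClosedDemo, DiscountPolicy, NoDiscount, TenPercentOff, Checkout) are hypothetical. Checkout is closed for modification: a new discount rule is added by writing another subclass of DiscountPolicy, not by editing Checkout.

```java
// Hypothetical example, not part of the original tutorial
class OpenClosedDemo {
  // The extension point: behavior is defined by subclasses
  static abstract class DiscountPolicy {
    abstract int apply(int priceInCents);
  }

  static class NoDiscount extends DiscountPolicy {
    int apply(int priceInCents) {
      return priceInCents;
    }
  }

  static class TenPercentOff extends DiscountPolicy {
    int apply(int priceInCents) {
      return priceInCents - priceInCents / 10;
    }
  }

  // Only knows the abstract class; it gets the concrete one at runtime
  static class Checkout {
    private final DiscountPolicy policy;

    Checkout(DiscountPolicy policy) {
      this.policy = policy;
    }

    int total(int[] pricesInCents) {
      int sum = 0;
      for (int price : pricesInCents) {
        sum += price;
      }
      return policy.apply(sum);
    }
  }

  public static void main(String[] args) {
    int[] prices = { 1000, 500 };
    System.out.println(new Checkout(new NoDiscount()).total(prices));    // prints 1500
    System.out.println(new Checkout(new TenPercentOff()).total(prices)); // prints 1350
  }
}
```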

Mergesort in Java

Mergesort

Mergesort is a divide and conquer algorithm. The elements to sort are stored in a collection. This collection is divided into two collections and these are again sorted via mergesort. Once the two collections are sorted, the results are combined.
Mergesort takes the middle of the collection and uses the two resulting collections for the next iteration of mergesort. In the merging part mergesort runs through both collections and selects the lower of the two current elements to insert it into a new collection.
In comparison to quicksort the divide part is simple in mergesort while the merging part is complex. In addition quicksort can work "in-place", i.e. it does not have to create a copy of the collection, while mergesort requires a copy.

Program

Mergesort.java:-
public class Mergesort {
private int[] numbers;
private int[] helper;

private int number;

public void sort(int[] values) {
    this.numbers = values;
    number = values.length;
    this.helper = new int[number];
    mergesort(0, number - 1);
}

private void mergesort(int low, int high) {
    // Check if low is smaller than high, if not then the array is sorted
    if (low < high) {
      // Get the index of the element which is in the middle
      int middle = (low + high) / 2;
      // Sort the left side of the array
      mergesort(low, middle);
      // Sort the right side of the array
      mergesort(middle + 1, high);
      // Combine them both
      merge(low, middle, high);
    }
}

private void merge(int low, int middle, int high) {

    // Copy both parts into the helper array
    for (int i = low; i <= high; i++) {
      helper[i] = numbers[i];
    }

    int i = low;
    int j = middle + 1;
    int k = low;
    // Copy the smallest values from either the left or the right side back
    // to the original array
    while (i <= middle && j <= high) {
      if (helper[i] <= helper[j]) {
        numbers[k] = helper[i];
        i++;
      } else {
        numbers[k] = helper[j];
        j++;
      }
      k++;
    }
    // Copy the rest of the left side of the array into the target array
    while (i <= middle) {
      numbers[k] = helper[i];
      k++;
      i++;
    }
}
}

Test.java:-
import java.util.Random;

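class Test
{
// Example values; any positive size and upper bound will work
private static final int SIZE = 7;
private static final int MAX = 20;

public static void main(String args[])
{
    int[] numbers = new int[SIZE];
    Random generator = new Random();
    for (int i = 0; i < numbers.length; i++) {
      numbers[i] = generator.nextInt(MAX);
    }
    Mergesort sorter = new Mergesort();
    sorter.sort(numbers);
}
}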

Complexity Analysis

The following describes the runtime complexity of mergesort.
Mergesort sorts in the worst case in O(n log n) time. Due to the required copying of the collection, mergesort is in the average case slower than quicksort.

Quicksort in Java

1.1. Overview

Sort algorithms order the elements of an array according to a predefined order. Quicksort is a divide and conquer algorithm. In a divide and conquer sorting algorithm the original data is separated into two parts (divide) which are individually sorted (conquered) and then combined.

1.2. Description of the algorithm

If the array contains only one element or zero elements then the array is sorted.
If the array contains more than one element then:

Select an element from the array. This element is called the "pivot element". For example select the element in the middle of the array.
All elements which are smaller than the pivot element are placed in one array and all elements which are larger are placed in another array.
Sort both arrays by recursively applying Quicksort to them.
Combine the arrays

Quicksort can be implemented to sort "in-place". This means that the sorting takes place in the array and that no additional array needs to be created.
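Before turning to an in-place version, the steps described above can be followed literally with separate lists. This is a minimal sketch and not an efficient implementation; the class name SimpleQuicksort is hypothetical, and collecting the elements equal to the pivot in their own list is an assumption added here so that duplicates are handled and the recursion always terminates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example, not part of the original tutorial
class SimpleQuicksort {
  static List<Integer> sort(List<Integer> values) {
    // Zero or one element: already sorted
    if (values.size() <= 1) {
      return new ArrayList<Integer>(values);
    }
    // Select the element in the middle as pivot element
    int pivot = values.get(values.size() / 2);

    List<Integer> smaller = new ArrayList<Integer>();
    List<Integer> equal = new ArrayList<Integer>();
    List<Integer> larger = new ArrayList<Integer>();
    for (int value : values) {
      if (value < pivot) {
        smaller.add(value);
      } else if (value > pivot) {
        larger.add(value);
      } else {
        equal.add(value);
      }
    }

    // Sort both parts recursively and combine them
    List<Integer> result = new ArrayList<Integer>(sort(smaller));
    result.addAll(equal);
    result.addAll(sort(larger));
    return result;
  }

  public static void main(String[] args) {
    // prints [1, 2, 3, 5, 5, 8]
    System.out.println(sort(Arrays.asList(5, 3, 8, 1, 5, 2)));
  }
}
```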

Program

Quicksort.java:-

public class Quicksort {
private int[] numbers;
private int number;

public void sort(int[] values) {
    // Check for empty or null array
    if (values ==null || values.length==0){
      return;
    }
    this.numbers = values;
    number = values.length;
    quicksort(0, number - 1);
}

private void quicksort(int low, int high) {
    int i = low, j = high;
    // Get the pivot element from the middle of the list
    int pivot = numbers[low + (high-low)/2];

    // Divide into two lists
    while (i <= j) {
      // If the current value from the left list is smaller than the pivot
      // element then get the next element from the left list
      while (numbers[i] < pivot) {
        i++;
      }
      // If the current value from the right list is larger than the pivot
      // element then get the next element from the right list
      while (numbers[j] > pivot) {
        j--;
      }

      // If we have found a value in the left list which is larger than
      // the pivot element and if we have found a value in the right list
      // which is smaller than the pivot element then we exchange the
      // values.
      // As we are done we can increase i and decrease j
      if (i <= j) {
        exchange(i, j);
        i++;
        j--;
      }
    }
    // Recursion
    if (low < j)
      quicksort(low, j);
    if (i < high)
      quicksort(i, high);
}

private void exchange(int i, int j) {
    int temp = numbers[i];
    numbers[i] = numbers[j];
    numbers[j] = temp;
}
}

Test.java:-

public class Test
{
public static void main(String args[]){
    Quicksort sorter = new Quicksort();
    int[] test = { 5, 5, 6, 6, 4, 4, 5, 5, 4, 4, 6, 6, 5, 5 };
    sorter.sort(test);
    printResult(test);
}
private static void printResult(int[] numbers) {
    for (int i = 0; i < numbers.length; i++) {
      System.out.print(numbers[i]);
    }
    System.out.println();
}
}

Complexity Analysis

The following describes the runtime complexity of quicksort.
Quicksort is a fast, recursive, non-stable sort algorithm which works by the divide and conquer principle. Quicksort will in the best case divide the array into two almost equal parts. If the array contains n elements then the first run will need O(n). Sorting the remaining two sub-arrays takes 2 * O(n/2). This ends up in a performance of O(n log n).
In the worst case quicksort splits off only one element in each iteration. So it is O(n) + O(n-1) + O(n-2) + ... + O(1), which is equal to O(n^2).
The average case of quicksort is O(n log n).
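The best and worst case can also be written as recurrences. The following is a sketch under simplifying assumptions which are not part of the text above: partitioning n elements costs c * n for some constant c, T(1) = c, and in the best case n is a power of two.

```latex
% Best case: every partition step splits the array into two equal halves
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + c\,n
     = c\,n\log_2 n + c\,n
     = O(n \log n)

% Worst case: every partition step splits off only one element
T(n) = T(n-1) + c\,n
     = c\sum_{k=1}^{n} k
     = c\,\frac{n(n+1)}{2}
     = O(n^2)
```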