de.fesenmeyer.dbnormalizer.core.algorithms
Class FDAlgorithms

java.lang.Object
  extended by de.fesenmeyer.dbnormalizer.core.algorithms.FDAlgorithms

public class FDAlgorithms
extends Object

Class containing several algorithms performing operations on FDs (Functional Dependencies).

Author:
DF

Field Summary
private static DBNormalizerLogger logger
           
 
Constructor Summary
FDAlgorithms()
           
 
Method Summary
static AttributeStringSet calculateAttributeClosure(Collection<FD> fds, AttributeStringSet attributes)
           Calculates the Attribute-Closure for the set of attributes given by argument attributes regarding the set of FDs given by argument fds.
static SortedSet<FD> calculateMinCover(Set<FD> fds)
           Calculates the minimal cover of the set of FDs specified by the fds argument.
private static ArrayList<FD> calculateReducedCover(Set<FD> fds)
          Calculates a reduced cover for the set of FDs specified by the fds argument.
private static boolean containsKey(Collection<CandKey> candidateKeys, Collection<String> attributes)
          Helper method which determines if a key consisting of the attributes specified by attributes argument is contained in the candidate keys specified by the candidateKeys argument
static SortedSet<AttributeStringSet> determineCandidateKeys(AttributeStringSet allAttributes, Set<FD> minCover)
           Algorithm for determination of candidate keys for a table.
static SortedSet<AttributeStringSet> determineCandidateKeysGeneral(AttributeStringSet allAttributes, Set<FD> minCover)
           Algorithm for determination of candidate keys from
static boolean isMember(FD fd, Set<FD> fds)
          The Membership algorithm checks if the FD specified by argument fd is member of the set of FDs specified by argument fds, that means if it can be derived from this set.
static boolean isRedundant(FD fd, Set<FD> fds)
          This algorithm checks if an FD is redundant by using the Membership algorithm isMember(FD, Set).
static SortedSet<FD> mergeFDs(Collection<FD> fds)
           Merges FDs using the union inference rule, e.g.: {a -> b, a -> c} => {a -> bc} One can also say that the FDs are grouped by equal LHS.
static void setKeysForTable(Table table, Table oldTable)
           Computes and sets the candidate keys for a table and sets the PK (and where required, DB-PK) which is considered to be the best.
static SortedSet<FD> splitFDs(Collection<FD> fds)
          Splits the FDs specified by argument fds by using the augmentation inference rule, e.g.: {a -> bc} => {a -> b, a -> c}
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

private static DBNormalizerLogger logger
Constructor Detail

FDAlgorithms

public FDAlgorithms()
Method Detail

isMember

public static boolean isMember(FD fd,
                               Set<FD> fds)
The Membership algorithm checks if the FD specified by argument fd is member of the set of FDs specified by argument fds, that means if it can be derived from this set. The membership algorithm is very similar to the Attribute-Closure algorithm. Hence this implementation is based on the Attribute-Closure algorithm calculateAttributeClosure(Collection, AttributeStringSet). Descriptions of the membership algorithm are given in [The theory of relational databases, Maier, 1983] and [Database analysis and design, Hawryszkiewycz, 1991].

Parameters:
fd - a FD
fds - a set of FDs
Returns:
true, if the FD specified by argument fd is member of the set of FDs specified by argumemt fds; false, otherwise

isRedundant

public static boolean isRedundant(FD fd,
                                  Set<FD> fds)
This algorithm checks if an FD is redundant by using the Membership algorithm isMember(FD, Set). The FD specified by argument fd should be contained in the set of FDs specified by argument fds. If this FDs is still a member of the given set of FDs, if it is removed from this set, then this FD is redundant regarding the given set.

Parameters:
fd - a FD
fds - a set of FDs
Returns:
true, if the FD specified by argument fd is redundant regarding the set of FDs specified by argumemt fds; false, otherwise

calculateAttributeClosure

public static AttributeStringSet calculateAttributeClosure(Collection<FD> fds,
                                                           AttributeStringSet attributes)

Calculates the Attribute-Closure for the set of attributes given by argument attributes regarding the set of FDs given by argument fds. The Attribute-Closure contains all attributes which are functionally dependent on the the given set of attributes regarding the given set of FDs. The Attribute-Closure contains at least the attributes given by the attributes argument, because each attribute functionally determines itself due to the reflexivity inference rule.

This algorithm has been implemented based on the remarks in [Fundamentals of database systems, 5th Ed., Elmasri/Navathe, 2007], where the algorithm is called "Closure of F under X".

Parameters:
fds - a collection (may be a set) of fds
attributes - a set of attributes
Returns:
the set of attributes which form the Attribute-Closure. Is empty, if and only if the specified set of attributes is empty.

calculateMinCover

public static SortedSet<FD> calculateMinCover(Set<FD> fds)

Calculates the minimal cover of the set of FDs specified by the fds argument. The minimal cover is a set of FDs without redundant FDs and contains no extraneous attributes on both sides and only one attribute on the RHS of each FD.

This algorithm is based on [The theory of relational databases, Maier, 1983].

Parameters:
fds - a set of FDs
Returns:
the minimal cover for the given set of FDs

mergeFDs

public static SortedSet<FD> mergeFDs(Collection<FD> fds)

Merges FDs using the union inference rule, e.g.:

{a -> b, a -> c} => {a -> bc}
One can also say that the FDs are grouped by equal LHS.

If the set of of FDs specified by argument fds is a minimal cover, the returned set of FDs will be a grouped minimal cover.

Parameters:
fds - a set of FDs
Returns:
a set of FDs as described above

calculateReducedCover

private static ArrayList<FD> calculateReducedCover(Set<FD> fds)
Calculates a reduced cover for the set of FDs specified by the fds argument. This algorithm first removes all extraneous attributes on the LHS of each FD (Left reduction). As a second step, it removes all extraneous attributes on the RHS of each FD, removing the whole FD, if the RHS gets empty during this step.

The reduced cover is the return value of this method. The FDs contained in the fds argument are not changed by the algorithm.

This algorithm is based on [The theory of relational databases, Maier, 1983].

Parameters:
fds - a set of FDs
Returns:
a list representing the reduced cover for the given set of FDs

splitFDs

public static SortedSet<FD> splitFDs(Collection<FD> fds)
Splits the FDs specified by argument fds by using the augmentation inference rule, e.g.:
 {a -> bc} => {a -> b, a -> c}
 

Parameters:
fds - a set of FDs
Returns:
a "split" representation of the FDs as described above

determineCandidateKeysGeneral

public static SortedSet<AttributeStringSet> determineCandidateKeysGeneral(AttributeStringSet allAttributes,
                                                                          Set<FD> minCover)

Algorithm for determination of candidate keys from http://en.wikipedia.org/w/index.php?title=Database_normalization&oldid=616875.

This algorithm should not be used directly, because it is quite slow. Use the determineCandidateKeys(AttributeStringSet, Set) method. It calls this method, if the candidate keys cannot be found in an easier and faster way.

NOTE: The argument minCover must be a minimal cover. If you provide an arbitrary set of FDs, the result may be wrong!

Parameters:
allAttributes - all attributes of the table for which the candidate keys should be determined
minCover - a minimal cover of the FDs holding in the table for which the candidate keys should be determined
Returns:
a set containing all candidate keys for the table

determineCandidateKeys

public static SortedSet<AttributeStringSet> determineCandidateKeys(AttributeStringSet allAttributes,
                                                                   Set<FD> minCover)

Algorithm for determination of candidate keys for a table. The allAttributes argument specifies the table's attributes, the minCover argument specifies a minimal cover of the FDs holding in the Table. This algorithm first tries to find the candidate keys by analyzing the structure of the minimal cover and calls the determineCandidateKeysGeneral(AttributeStringSet, Set) method if the candidate keys cannot be determined this easy (and fast) way.

This algorithm is in the most cases faster than the general algorithm determineCandidateKeysGeneral(AttributeStringSet, Set) and calls the general algorithm only if it cannot find the candidate keys by analyzing the structure of the minimal cover.

This algorithm is based on the Article [An Efficient Algorithm to Compute the Candidate Keys of a Relational Database Schema, Saiedian/Spencer, The Computer Journal, 1996].

NOTE: The argument minCover must be a minimal cover. If you provide an arbitrary set of FDs, the result may be wrong!

Parameters:
allAttributes - all attributes of the table for which the candidate keys should be determined
minCover - a minimal cover of the FDs holding in the table for which the candidate keys should be determined
Returns:
a set containing all candidate keys for the table

setKeysForTable

public static void setKeysForTable(Table table,
                                   Table oldTable)

Computes and sets the candidate keys for a table and sets the PK (and where required, DB-PK) which is considered to be the best. For selecting the PK currently a trivial and in some cases insufficient approach has been taken, as you can see in the source code of this method.

NOTE: The minimal cover for this table must have been computed before and must have been set with the Table.setFDMinCover(Set) method.

Parameters:
table - the table to set the keys for
oldTable - the table from which the table was extracted by means of normalization, can be null

containsKey

private static boolean containsKey(Collection<CandKey> candidateKeys,
                                   Collection<String> attributes)
Helper method which determines if a key consisting of the attributes specified by attributes argument is contained in the candidate keys specified by the candidateKeys argument

Parameters:
candidateKeys - a collection of candidate keys
attributes - the attributes of a potential key
Returns:
true, if such a key is contained in the candidate keys; false, otherwise