
Package org.apache.commons.collections4.sequence

This package provides classes to compare two sequences of objects.


Package org.apache.commons.collections4.sequence Description


The two sequences can hold any object type, as only the equals method is used to compare their elements. It is guaranteed that comparisons will always be done as o1.equals(o2), where o1 belongs to the first sequence and o2 belongs to the second sequence. This matters when some elements of the first sequence are instances of a subclass with a specialized equals method, because o1.equals(o2) and o2.equals(o1) may then give different results.
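The direction of the comparison can be illustrated with a minimal sketch. The classes Word and LooseWord are hypothetical and exist only for this example; LooseWord deliberately specializes equals in a way that is not symmetric, which is exactly the situation in which the o1.equals(o2) guarantee becomes visible.

```java
import java.util.Locale;

// Hypothetical element type with a case-sensitive equals.
class Word {
    final String text;

    Word(String text) { this.text = text; }

    @Override
    public boolean equals(Object other) {
        return other instanceof Word && text.equals(((Word) other).text);
    }

    @Override
    public int hashCode() { return text.hashCode(); }
}

// Hypothetical subclass whose specialized equals ignores case.
class LooseWord extends Word {
    LooseWord(String text) { super(text); }

    @Override
    public boolean equals(Object other) {
        return other instanceof Word && text.equalsIgnoreCase(((Word) other).text);
    }

    @Override
    public int hashCode() { return text.toLowerCase(Locale.ROOT).hashCode(); }
}

final class EqualsDirectionDemo {
    // o1 comes from the first sequence, o2 from the second.
    static boolean firstToSecond() {
        Word o1 = new LooseWord("Hello");
        Word o2 = new Word("HELLO");
        return o1.equals(o2); // uses LooseWord.equals
    }

    // The reverse call uses Word.equals and is case-sensitive.
    static boolean secondToFirst() {
        Word o1 = new LooseWord("Hello");
        Word o2 = new Word("HELLO");
        return o2.equals(o1);
    }
}
```

Here firstToSecond() returns true and secondToFirst() returns false, so the outcome of a comparison would depend on which sequence holds the LooseWord instances.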

Comparison can be seen from two points of view: either as giving the smallest set of modifications that transforms the first sequence into the second one, or as giving the longest sequence which is a subsequence of both initial sequences. Modifications include deleting, inserting or keeping one object, starting from the beginning of the first sequence. As in most algorithms of this type, object transpositions are not supported. This means that if a sequence (A, B) is compared to (B, A), the result will be either the sequence of three commands delete A, keep B, insert A or the sequence insert B, keep A, delete B.
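The (A, B) versus (B, A) case can be reproduced with a small self-contained sketch. It is not the package's implementation: it uses a plain dynamic-programming longest-common-subsequence table rather than the Myers algorithm, and the class EditScriptSketch and its string commands are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class EditScriptSketch {
    /** Returns commands such as "delete A", "keep B", "insert A". */
    static <T> List<String> diff(List<T> first, List<T> second) {
        int n = first.size();
        int m = second.size();
        // lcs[i][j] = length of the LCS of first[i..] and second[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = first.get(i).equals(second.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        // Walk both sequences from the beginning, emitting one command per step.
        List<String> script = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (first.get(i).equals(second.get(j))) {
                script.add("keep " + first.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                script.add("delete " + first.get(i));
                i++;
            } else {
                script.add("insert " + second.get(j));
                j++;
            }
        }
        while (i < n) {
            script.add("delete " + first.get(i++));
        }
        while (j < m) {
            script.add("insert " + second.get(j++));
        }
        return script;
    }
}
```

Calling EditScriptSketch.diff(Arrays.asList("A", "B"), Arrays.asList("B", "A")) yields delete A, keep B, insert A. The other valid answer (insert B, keep A, delete B) would result from breaking the tie in favor of insertion; no transposition command appears in either case.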

The package uses an efficient comparison algorithm designed by Eugene W. Myers and described in his paper "An O(ND) Difference Algorithm and Its Variations", where N is the sum of the lengths of the two sequences and D is the number of insertions and deletions in the shortest edit script. This algorithm produces the shortest possible edit script containing all the commands needed to transform the first sequence into the second one. The entry point for the user to this algorithm is the SequencesComparator class.
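To give a feel for the O(ND) idea, the following sketch computes only D, the number of insertions and deletions in a shortest edit script, using the basic greedy search from the paper. It is an independent illustration rather than the code behind SequencesComparator; the class name MyersDistanceSketch is hypothetical, and keep commands are not counted in D.

```java
import java.util.List;

final class MyersDistanceSketch {
    /** Length of the shortest insert/delete script turning a into b. */
    static <T> int shortestEditLength(List<T> a, List<T> b) {
        int n = a.size();
        int m = b.size();
        int max = n + m;
        int offset = max + 1;             // shifts diagonal k into array range
        int[] v = new int[2 * max + 3];   // v[offset + k] = furthest x on diagonal k
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];      // move down: insertion
                } else {
                    x = v[offset + k - 1] + 1;  // move right: deletion
                }
                int y = x - k;
                // Follow the diagonal while elements match (keep commands).
                while (x < n && y < m && a.get(x).equals(b.get(y))) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        throw new IllegalStateException("unreachable: d never exceeds n + m");
    }
}
```

For (A, B) and (B, A) this returns 2, matching the one deletion and one insertion in the scripts described above. Elements are compared as a.get(x).equals(b.get(y)), with the receiver taken from the first sequence.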

As explained in Gene Myers' paper, the edit script is equivalent to all other representations and contains all the information needed either to perform the transformation or, for example, to retrieve the longest common subsequence.

A user who needs fine-grained access to the comparison result can walk through this script by providing a visitor implementing the CommandVisitor interface.
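The visitor idea can be sketched without the library. The interface EditVisitor below is a hypothetical stand-in with one callback per command type, not the package's CommandVisitor, and the script is replayed by hand instead of being produced by a comparator. The example shows how a single pass over the script recovers both a common subsequence and the second sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a command visitor; not the library's interface.
interface EditVisitor<T> {
    void onInsert(T object);
    void onKeep(T object);
    void onDelete(T object);
}

// Walks the script once and derives two views of it.
final class ScriptReader<T> implements EditVisitor<T> {
    final List<T> common = new ArrayList<>(); // kept objects
    final List<T> target = new ArrayList<>(); // kept + inserted objects

    @Override
    public void onInsert(T object) {
        target.add(object);
    }

    @Override
    public void onKeep(T object) {
        common.add(object);
        target.add(object);
    }

    @Override
    public void onDelete(T object) {
        // Present only in the first sequence: nothing to record here.
    }
}

final class VisitorSketch {
    // Replays a hand-written script for (A, B) -> (B, A):
    // delete A, keep B, insert A.
    static ScriptReader<String> replay() {
        ScriptReader<String> reader = new ScriptReader<>();
        reader.onDelete("A");
        reader.onKeep("B");
        reader.onInsert("A");
        return reader;
    }
}
```

After the replay, common holds (B) and target holds (B, A). When the script is a shortest one, the kept objects form a longest common subsequence of the two inputs.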

Sometimes, however, a more synthetic approach is needed. A user who prefers to see the differences between the two sequences as global replacement operations acting on complete subsequences of the original sequences can provide an object implementing the simple ReplacementsHandler interface, using an instance of the ReplacementsFinder class as a command-converting layer between that object and the edit script. The number of objects which are common to both initial sequences, and hence are skipped between each call to the user's handleReplacement method, is also provided. This allows the user to keep track of the current index in both sequences if needed.
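The grouping of individual commands into replacements can be sketched as follows. This is an illustration under stated assumptions, not the behavior of ReplacementsFinder itself: ReplacementSink and ReplacementGrouper are hypothetical names, and the explicit finish() call that flushes a trailing replacement is a design choice of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback receiving one grouped replacement at a time.
interface ReplacementSink<T> {
    void replaced(int skipped, List<T> from, List<T> to);
}

// Groups consecutive delete/insert commands between keep commands.
final class ReplacementGrouper<T> {
    private final ReplacementSink<T> sink;
    private final List<T> deleted = new ArrayList<>();
    private final List<T> inserted = new ArrayList<>();
    private int skipped; // common objects seen since the last replacement

    ReplacementGrouper(ReplacementSink<T> sink) {
        this.sink = sink;
    }

    void delete(T object) { deleted.add(object); }

    void insert(T object) { inserted.add(object); }

    void keep(T object) {
        flush();
        skipped++;
    }

    // Assumption of this sketch: the caller signals the end of the script.
    void finish() { flush(); }

    private void flush() {
        if (deleted.isEmpty() && inserted.isEmpty()) {
            return;
        }
        sink.replaced(skipped, new ArrayList<>(deleted), new ArrayList<>(inserted));
        deleted.clear();
        inserted.clear();
        skipped = 0;
    }
}

final class ReplacementDemo {
    // Script for (A, B, C, D) -> (A, X, Y, D, E).
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ReplacementSink<String> sink =
                (skipped, from, to) -> log.add("skip " + skipped + ": " + from + " -> " + to);
        ReplacementGrouper<String> grouper = new ReplacementGrouper<>(sink);
        grouper.keep("A");
        grouper.delete("B");
        grouper.delete("C");
        grouper.insert("X");
        grouper.insert("Y");
        grouper.keep("D");
        grouper.insert("E");
        grouper.finish();
        return log;
    }
}
```

ReplacementDemo.run() reports two replacements: (B, C) replaced by (X, Y) after skipping one common object, then an empty subsequence replaced by (E) after skipping one more. Adding the skipped count and the size of from gives the current position in the first sequence; adding the skipped count and the size of to gives the position in the second.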


Copyright © 2010 - 2020 Adobe. All Rights Reserved