Interface AttributedRun


  • public interface AttributedRun
    A run of element occurrences with attributes, with support for transformations, used by inline text formatters.

    For details on the formatting model, see the package overview.

    Overall model

    An AttributedRun represents a sequence of elements, which are either occurrences of characters or of glyphs. Each occurrence (or element) can have attributes (represented as key/value pairs), as can each position between two consecutive occurrences. Typically, these attributes describe constraints on the result of the layout process (e.g. “this character must be styled using this font”, or “it is possible to ligate the character occurrences on each side of this inter-element position”), or intermediate data that is created and used during the layout process.

    An AttributedRun provides methods to access the elements, methods to change or replace elements (e.g. to implement ligatures), as well as methods to query and set the attributes.

    While each element can in principle have attributes which are unrelated to the attributes of the neighboring elements, some attributes will be found on all elements and will typically have the same value for multiple successive elements (e.g. the font to use, or the point size). Because AttributedRun is only an interface, it does not impose nor preclude that the actual data structure used by the layout engine take advantage of that situation, e.g. to reduce the storage needed for those attributes.

    A design point for this interface is that the allocation of semi-permanent objects is minimized during processing. [By semi-permanent, we want to exclude the very short lived objects, which are easily amenable to reuse via pooling; and the permanent objects, which are not a problem.] In particular, we chose not to force an allocation when the processing moves from the character domain to the glyph domain. This explains why we say that the elements can be occurrences of either characters or glyphs: at the beginning of layout, each element usually represents a character occurrence, and at some point, each element will be transformed to represent a glyph occurrence (for the character it supersedes).

    This choice accommodates well the usual case where layout is done in a straightforward way: starting with some character occurrences, each is replaced by a single glyph occurrence (using something like a font 'cmap'), each occurrence may be refined a bit (e.g. switching to a small cap version), the advance widths are assigned, and the inline text formatting is done. This can be done without any allocation.

    The layout engine may need to have access to its input characters after formatting, and possibly needs access to the relationship between those characters and the glyphs that render them, e.g. to support visual selection (selecting among the input characters via a rendering of them) or visual editing (aka wysiwyg editing). While the elements of an AttributedRun object are indeed those characters originally, they are mutated over the course of the layout. Thus, it is the responsibility of the layout engine to keep the original characters if needed, as well as to track the correlation with the glyphs. The latter is possible because all the changes to the elements are made visible to the layout engine, via the AttributedRun interface.

    Characters and glyphs

    We refer to the value stored in a run to represent an element as the elementId of that element.

    When an element is a character, its elementId is the Unicode scalar value of that character. In particular, surrogate code points cannot be elements, since they are not scalar values; but supplementary characters are of course supported. It is up to the client to deal with character set conversions as needed.
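    As an illustration of the conversion a client might perform, the following sketch turns a Java (UTF-16) string into an array of scalar values suitable as character elementIds. The class and method names are hypothetical and not part of this package; only java.lang.String.codePoints() and the java.lang.Character constants are standard.

```java
final class ScalarValues {
    /**
     * Converts a UTF-16 string to Unicode scalar values, one per character.
     * Hypothetical helper: the names are illustrative, not part of the API.
     */
    static int[] toElementIds(String text) {
        // codePoints() pairs up surrogates; an unpaired one comes through as is.
        int[] ids = text.codePoints().toArray();
        for (int id : ids) {
            if (id >= Character.MIN_SURROGATE && id <= Character.MAX_SURROGATE) {
                throw new IllegalArgumentException(
                    "unpaired surrogate U+" + Integer.toHexString(id).toUpperCase());
            }
        }
        return ids;
    }
}
```

    A supplementary character such as U+1F600, stored as two UTF-16 code units, becomes a single element, while a lone surrogate is rejected because it is not a scalar value.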

    When an element is a glyph, its elementId is the glyph index of that glyph. Unlike characters, glyph indices (or gids, for short) do not have a universal meaning and can be interpreted only in the context of a font: gid 34 could be for a glyph representing “a” in one font, and for a glyph representing “z” in another one.

    To keep track of whether an element is a character or a glyph, we use the attribute ElementAttribute.isGlyph, the value of which is java.lang.Boolean.TRUE for glyphs, and java.lang.Boolean.FALSE for characters. The client of an inline text formatter can provide an AttributedRun which mixes characters and glyphs: this is useful to support user-specified glyphs, as via SVG’s <altGlyph>.

    During the course of formatting, it is possible for an element to have a gid value which does not correspond to an actual glyph in the font, i.e. is greater than or equal to the number of glyphs in the font. This can be unintentional (e.g. a bug in the font) or intentional (this “virtual” glyph is intended to be replaced by a “real” glyph later in processing). Because there is no good way to determine a priori whether such gids are intentional, our strategy is to accept them during layout, and to take care of those that remain at the end of layout (or more precisely at the end of glyph selection) by silently replacing them with the .notdef glyph. (A future version may instead throw an Exception; the most appropriate design is not entirely clear yet.)
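    The cleanup step described above could look roughly like the following sketch. It works on a plain int array rather than on an AttributedRun, and it assumes that .notdef is glyph 0, as is the convention in TrueType and OpenType fonts; the class and method names are hypothetical.

```java
final class NotdefFallback {
    /** Assumption: .notdef is glyph 0, as in TrueType and OpenType fonts. */
    static final int NOTDEF_GID = 0;

    /**
     * Replaces, in place, every gid outside [0, numGlyphs) by .notdef.
     * Returns the number of gids that were replaced.
     */
    static int replaceVirtualGlyphs(int[] gids, int numGlyphs) {
        int replaced = 0;
        for (int i = 0; i < gids.length; i++) {
            if (gids[i] < 0 || gids[i] >= numGlyphs) {
                gids[i] = NOTDEF_GID;
                replaced++;
            }
        }
        return replaced;
    }
}
```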

    Indexing model

    Each element in an AttributedRun is identified by an int index. The interface does not provide access to the indices of the first and last element. Rather, whenever an object implementing AttributedRun is handed to some component for processing, that component also receives the range of indices on which processing is to be performed, in the form of the index of the first element to process, and the index of the element following the last element to process (i.e. the range to process is [first, limit[, using the traditional Java names).

    The motivation for this choice is to support the creation of a single object underlying an AttributedRun, and to allow different processing of the various subruns, or possibly the concurrent processing of non-overlapping subruns. Exposing the range of application via the object implementing AttributedRun would imply either duplication of objects, and/or state in proxy objects.

    One cost of this choice is that the client of an AttributedRun object will need to maintain its first and limit as the run is modified, but since this maintenance only reflects the changes asked by the client, this should not be a big problem.
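    The bookkeeping involved is small, as the following sketch suggests. ListRun is a hypothetical, list-backed stand-in that models only elementAt and the contiguous replace; the client merges each "f", "i" pair in its range and shrinks its own limit by one for every merge. U+FB01 is used as the replacement element purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeMaintenance {
    /** Hypothetical stand-in for a run; not an implementation of AttributedRun. */
    static final class ListRun {
        final List<Integer> elements = new ArrayList<>();

        ListRun(int... ids) {
            for (int id : ids) {
                elements.add(id);
            }
        }

        int elementAt(int position) {
            return position >= 0 && position < elements.size()
                ? elements.get(position) : -1;
        }

        /** Merges the elements in [first, limit) into a single element. */
        void replace(int first, int limit, int elementId) {
            elements.subList(first, limit).clear();
            elements.add(first, elementId);
        }
    }

    static final int FI_LIGATURE = 0xFB01; // illustrative replacement element

    /** Merges "f","i" pairs in [first, limit) and returns the updated limit. */
    static int ligate(ListRun run, int first, int limit) {
        int i = first;
        while (i + 1 < limit) {
            if (run.elementAt(i) == 'f' && run.elementAt(i + 1) == 'i') {
                run.replace(i, i + 2, FI_LIGATURE);
                limit--; // two elements became one, so the range shrinks by one
            }
            i++;
        }
        return limit;
    }
}
```

    Elements outside the range handed to ligate are left untouched, which is what allows different subruns of the same underlying object to be processed independently.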

    Given a subrun where the elements are characters, those are stored in logical order as defined by Unicode. Thus, in a subrun for the word “school”, the elements are the characters s, c, h, o, o, l, in that order; in a subrun for the word “مدرسة” (school in Arabic), the elements are the characters meem, dal, reh, seen, teh marbuta in that order; in a subrun for the word “विद्यार्थी” (scholar in Hindi), the elements are the characters va, sign i, da, virama, ya, sign aa, ra, virama, tha, sign ii in that order. Those three subruns can be in the same AttributedRun object, presumably separated by spaces, to represent the string “school مدرسة विद्यार्थी”.

    It is typically the responsibility of the client of a formatter to label characters in an AttributedRun with their Unicode bidirectional level, via the attribute ElementAttribute.bidiLevel, the value of which is java.lang.Integer. If the run above is displayed in a left-to-right context, then the Latin characters, the spaces and the Devanagari characters will be at level 0, and the Arabic characters will be at level 1.
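    One possible way for a client to obtain those levels is the standard java.text.Bidi class, as in the sketch below. The BidiLevels class is hypothetical. The indices returned are UTF-16 offsets; they coincide with element indices here only because every character of the example is in the Basic Multilingual Plane. A client would then presumably store each level with setElementStyle, using ElementAttribute.bidiLevel and an Integer value.

```java
import java.text.Bidi;

final class BidiLevels {
    /** The example string: Latin, Arabic and Devanagari words separated by spaces. */
    static final String SAMPLE =
        "school "
        + "\u0645\u062F\u0631\u0633\u0629 "
        + "\u0935\u093F\u0926\u094D\u092F\u093E\u0930\u094D\u0925\u0940";

    /** Returns the resolved level of each UTF-16 code unit, in a left-to-right paragraph. */
    static int[] levels(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
        int[] levels = new int[text.length()];
        for (int i = 0; i < levels.length; i++) {
            levels[i] = bidi.getLevelAt(i);
        }
        return levels;
    }
}
```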

    Given a subrun where the elements are glyphs, those are stored in reading order, i.e. in the order in which the eye will see them in reading. For most writing systems, including English and Arabic, the reading order coincides with the logical order. The glyphs for the characters s, c, h, o, o, l are in that order in a run representing the word “school”, and the glyphs for the characters meem, dal, reh, seen, teh marbuta are in that order in a run representing the word “مدرسة”. The most notable writing systems in which the logical and reading order do not coincide are the Indic writing systems: as can be seen in the word “विद्यार्थी”, the glyph order is actually: sign i, va, dya, sign aa, tha, sign ii, reph.

    Formatting process

    The following picture shows an AttributedRun before, during and after formatting, using our example string “school مدرسة विद्यार्थी”.

    The left part shows the attributed run before formatting. It has 23 elements, which are all characters. The right-pointing triangles indicate that the bidi level of the corresponding character is 0, while the left-pointing triangles indicate that it is 1.

    The middle part is a snapshot during formatting: at that stage, the characters have been replaced by glyphs, one per character.

    The right part shows the result after formatting. Some glyphs have not been further changed (e.g. in the English text), others have been replaced one for one (shaping in the Arabic text, taking into account the directionality), while in the Devanagari text, the replacement is m to n.

    The arcs and lines illustrate the transformations applied to the AttributedRun; the trivial transformations, both from the “before” column to the “during” column and from the top part of the “during” column to the top part of the “after” column are omitted for clarity.

    Elements 13 and 14 have been replaced by two elements. While it is graphically clear that the two elements have simply been exchanged, this is not reflected in the way the attributed run is modified: rather, the two new glyphs should be considered as both participating in the rendering of each of the two original characters. (This behaviour is acceptable: each of the three groups in the Hindi word forms an orthographic cluster, which is an entity well understood by native readers.)

    Since all the operations on an AttributedRun are performed by invoking methods on that object, and the layout engine implements those methods, it can maintain the relationship between characters and glyphs. Still using our example, the layout engine knows that before formatting, the first element of the run renders the input character “s” (or more precisely, the first character occurrence in the input string), and similarly for the other elements. During formatting, when n elements are replaced by m elements, the layout engine can update its correlation to record that each of the m new elements renders the union of the character occurrences that each of the n original elements rendered. For instance, when the first element is replaced by a single glyph, that glyph renders whatever the first element was rendering, i.e. the first character occurrence.
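    A layout engine could keep that correlation in a side structure updated from its replace methods. The sketch below is one hypothetical way to do so, not the approach of any particular engine: each element carries the set of input character indices it renders, and an n to m replacement gives every new element the union of the sets of the elements it supersedes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class CharGlyphCorrelation {
    /** sources.get(i) holds the input character indices rendered by element i. */
    private final List<TreeSet<Integer>> sources = new ArrayList<>();

    /** Before formatting, element i renders exactly character occurrence i. */
    CharGlyphCorrelation(int characterCount) {
        for (int i = 0; i < characterCount; i++) {
            TreeSet<Integer> single = new TreeSet<>();
            single.add(i);
            sources.add(single);
        }
    }

    /**
     * Records that the elements at positions (in increasing order) were
     * replaced by newCount elements placed where positions[0] was.
     */
    void replaced(int[] positions, int newCount) {
        TreeSet<Integer> union = new TreeSet<>();
        for (int p : positions) {
            union.addAll(sources.get(p));
        }
        // Remove from the highest position down so lower indices stay valid.
        for (int k = positions.length - 1; k >= 0; k--) {
            sources.remove(positions[k]);
        }
        for (int k = 0; k < newCount; k++) {
            sources.add(positions[0] + k, new TreeSet<>(union));
        }
    }

    TreeSet<Integer> charactersRenderedBy(int element) {
        return sources.get(element);
    }

    int size() {
        return sources.size();
    }
}
```

    With the example string, recording the replacement of elements 13 and 14 by two glyphs leaves both new elements associated with character occurrences 13 and 14, which matches the cluster-level view described above.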

    Subruns

    A subrun with respect to an attribute set is a maximal subrun of the whole run such that for each attribute in the set, all the elements in the subrun either do not have this attribute, or have the same value. Attributes on positions between elements are irrelevant for the determination of subruns.
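    The definition can be illustrated with plain arrays, where values[i] is the value of one attribute on element i and null means the element does not have the attribute. This sketch reads the definition as "either no element of the subrun has the attribute, or all have the same value"; that reading, and the Subruns class itself, are assumptions made for illustration.

```java
import java.util.Objects;

final class Subruns {
    /**
     * Limit of the subrun, for one attribute, that starts at start.
     * values[i] is the attribute value on element i, or null if absent.
     */
    static int subrunLimit(Object[] values, int start, int limit) {
        int i = start + 1;
        while (i < limit && Objects.equals(values[i], values[start])) {
            i++;
        }
        return i;
    }

    /** For a set of attributes, the subrun ends where any one attribute changes. */
    static int subrunLimit(Object[][] valuesPerAttribute, int start, int limit) {
        int result = limit;
        for (Object[] values : valuesPerAttribute) {
            result = Math.min(result, subrunLimit(values, start, limit));
        }
        return result;
    }
}
```

    A caller can walk all subruns of a range by repeatedly setting start to the returned limit until it reaches the end of the range.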

    Coordinate system for glyphs

    Inline formatters determine the positions of glyphs. Those positions are not expressed in a single coordinate system, but rather each glyph has its own coordinate system, with the x axis in the direction of the line, increasing from left to right, and the y axis perpendicular to the line direction, increasing toward the top of the glyph. The units of that coordinate system are not specified; instead, quantities must be divided by Font.getUnitsPerEmX() (for horizontal metrics) or by Font.getUnitsPerEmY() (for vertical metrics) to obtain them in the em-space.

    Two points in that coordinate system are determined for each glyph:

    • the placement is the point to be used as the origin for the glyph outline
    • the advance is the point to be used as the origin for the coordinate system of the glyph on the right of this glyph (if the line is horizontal), or the glyph below (if the line is vertical)
    Alternatively, we can view those two points as vectors originating from (0,0).

    In picture:

    The red lines represent the coordinate system of the outline of the glyph. The blue cross is the origin of the coordinate system of the glyph. The purple arrow is the placement vector. The green arrow is the advance vector.

    When a subrun of text is processed as a unit, inline formatting engines determine the placement and advance of each glyph such that the subrun has the appropriate display. Using our previous example, the vectors on the glyphs for each subrun are arranged like this (the numbers indicate the indices of the occurrences):

    It is up to the layout engine to decide how to arrange the various pieces, and in particular to account for the directionality. If our example string is a paragraph with overall directionality left-to-right, it is up to the layout engine to align the origin of element 0 with the left margin, to make the origin of element 11 coincide with the extremity of the advance vector of element 6, and to make the origin of element 12 coincide with the extremity of the advance vector of element 7.

    Typically, for a run of English text set on the Latin baseline, the advance vector will be horizontal, and the placement vector will be [0,0]. Kerning is achieved by reducing or augmenting the advance vector. A synthesized superscript has a vertical placement vector. For these reasons, two different occurrences of the same glyph can have different vectors.
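    To make the chaining of coordinate systems concrete, the following sketch computes where each glyph outline lands on a horizontal line, in em units, given per-glyph placement and advance vectors in font units. It is a simplified model under stated assumptions: the glyphs are given in visual order, the origin of each glyph is the extremity of the advance vector of the previous one, and the two divisors stand in for the values returned by Font.getUnitsPerEmX() and Font.getUnitsPerEmY(). The PenPositions class is hypothetical.

```java
final class PenPositions {
    /**
     * placements[i] and advances[i] are {x, y} vectors in font units.
     * Returns, for each glyph, the origin of its outline in em units,
     * relative to the origin of the first glyph.
     */
    static double[][] outlineOrigins(double[][] placements, double[][] advances,
                                     double unitsPerEmX, double unitsPerEmY) {
        double[][] origins = new double[placements.length][2];
        double penX = 0;
        double penY = 0;
        for (int i = 0; i < placements.length; i++) {
            // The outline is drawn at the glyph origin offset by the placement.
            origins[i][0] = (penX + placements[i][0]) / unitsPerEmX;
            origins[i][1] = (penY + placements[i][1]) / unitsPerEmY;
            // The next glyph's origin is the extremity of this advance vector.
            penX += advances[i][0];
            penY += advances[i][1];
        }
        return origins;
    }
}
```

    In this model, kerning a pair amounts to changing the advance of the first glyph, which shifts every later origin, whereas a synthesized superscript changes only the y component of its own placement.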

    Representation of the vectors

    The placement and advance vectors could be handled with suitable attributes. However, those attributes have peculiarities which motivate a separate treatment in the interface:

    • these attributes typically change with each glyph occurrence, unlike the other styling attributes which typically have the same value for a significant number of successive elements (e.g. the same font is used in a whole paragraph).
    • these attributes are computed relatively late in the layout process.
    • after these attributes are set for the first time on a range, the elements in that range are usually not modified, nor are elements inserted or removed.

    Inline formatters indicate to their client the point where they start to compute the vectors by invoking the startWorkingWithPositions method. An implementation of the AttributedRun interface may choose to allocate some arrays for the vectors at that point.
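    A minimal sketch of such a deferred allocation follows. LazyVectors is a hypothetical fragment of an implementation, covering only the vector-related methods; it handles a single announced range and does not implement the rest of AttributedRun.

```java
final class LazyVectors {
    private double[] xPlacement;
    private double[] yPlacement;
    private double[] xAdvance;
    private double[] yAdvance;
    private int base; // index of the element stored in slot 0

    /** Allocates the arrays only once positions are about to be computed. */
    void startWorkingWithPositions(int start, int limit) {
        int n = limit - start;
        base = start;
        xPlacement = new double[n];
        yPlacement = new double[n];
        xAdvance = new double[n];
        yAdvance = new double[n];
    }

    boolean allocated() {
        return xAdvance != null;
    }

    void setElementPlacementAndAdvance(int position, double xp, double yp,
                                       double xa, double ya) {
        int i = position - base;
        xPlacement[i] = xp;
        yPlacement[i] = yp;
        xAdvance[i] = xa;
        yAdvance[i] = ya;
    }

    /** Adds the deltas to the current vectors, e.g. to apply kerning. */
    void adjustPlacementAndAdvance(int position, double xpDelta, double ypDelta,
                                   double xaDelta, double yaDelta) {
        int i = position - base;
        xPlacement[i] += xpDelta;
        yPlacement[i] += ypDelta;
        xAdvance[i] += xaDelta;
        yAdvance[i] += yaDelta;
    }

    double getElementXPlacement(int pos) { return xPlacement[pos - base]; }
    double getElementYPlacement(int pos) { return yPlacement[pos - base]; }
    double getElementXAdvance(int pos) { return xAdvance[pos - base]; }
    double getElementYAdvance(int pos) { return yAdvance[pos - base]; }
}
```

    Because elements in the announced range are not expected to be inserted or removed afterwards, fixed-size arrays indexed by position are sufficient in this model.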

    Interpretation of attributes

    In the larger scheme of things, the various components which interact to do a document layout need to have a common understanding of the attributes: one component will set an attribute (e.g. in response to the input) and another will actually implement that attribute (e.g. select glyphs which have the characteristic requested by the attribute, and therefore requested by the input).

    The AttributedRun interface, for the most part, does not include that common understanding. It is only a vehicle to carry those attributes. It is up to the parties exchanging an AttributedRun to ensure that they establish this common understanding.

    Concurrency

    The inline text formatters implemented in this package perform their work entirely in the thread in which they are called, so they do not force implementations of this interface to be synchronized.

    • Method Summary

      Modifier and Type Method Description
      void adjustPlacementAndAdvance​(int position, double xPlacementDelta, double yPlacementDelta, double xAdvanceDelta, double yAdvanceDetla)
      Increment the placement and advance of an element.
      int elementAt​(int position)
      Return the int identifying the element at some position.
      java.lang.Object getElementStyle​(int position, ElementAttribute att)
      Get the value of an attribute on an element.
      double getElementXAdvance​(int pos)
      Get the xAdvance of an element.
      double getElementXPlacement​(int pos)
      Get the xPlacement of an element.
      double getElementYAdvance​(int pos)
      Get the yAdvance of an element.
      double getElementYPlacement​(int pos)
      Get the yPlacement of an element.
      java.lang.Object getInterElementStyleBefore​(int position, InterElementAttribute att)
      Get the attribute on the position between two elements.
      int getSubrunLimit​(int start, int limit, ElementAttribute attribute)
      Return the index of the element following the run (with respect to attribute) that includes start.
      int getSubrunLimit​(int start, int limit, java.util.Set attributes)
      Return the index of the element following the run (with respect to attributes) that includes start.
      void remove​(int position)
      Remove the element at position.
      void replace​(int[] positions, int elementId)
      Merge multiple elements into a single element.
      void replace​(int[] positions, int[] elementIds)
      Merge multiple elements into multiple elements.
      void replace​(int position, int elementId)
      Replace the element at some position by a single element.
      void replace​(int position, int[] elementIds)
      Expand an element.
      void replace​(int first, int limit, int elementId)
      Merge multiple contiguous elements into a single element.
      void setElementAscentAndDescent​(int position, double ascent, double descent)
      Set the ascent and descent of an element.
      void setElementPlacementAndAdvance​(int position, double xPlacement, double yPlacement, double xAdvance, double yAdvance)
      Set the placement and advance of an element.
      void setElementStyle​(int start, int limit, ElementAttribute att, java.lang.Object attValue)
      Set an attribute on a range of elements.
      void setElementStyle​(int position, ElementAttribute att, java.lang.Object attValue)
      Set an attribute on an element.
      void setInterElementStyleBefore​(int start, int limit, InterElementAttribute att, java.lang.Object value)
      Set an attribute on a range of positions between elements.
      void setInterElementStyleBefore​(int position, InterElementAttribute att, java.lang.Object attValue)
      Set the attribute on the position between two elements.
      void startWorkingWithPositions​(int start, int limit)
      Announces that the elements in a range of the run will no longer change, and that the methods related to the x and y attributes will now be called on those elements.
    • Method Detail

      • elementAt

        int elementAt​(int position)
        Return the int identifying the element at some position.
        Parameters:
        position - the position of the element in the run
        Returns:
        -1 if the position does not exist in the run, or the int identifying the element.
      • replace

        void replace​(int[] positions,
                     int[] elementIds)
        Merge multiple elements into multiple elements.

        The elements identified by positions are merged, and replaced by the elements in elementIds. These elements occur where positions[0] is. For example, if a run contains: 'a b c d e f' (indexed from 0), positions is {1, 3, 4} (that is, identifies b, d and e) and elementIds is {y, z}, then after this operation, the run contains: 'a y z c f'.

        Both positions and elementIds have at least one value. The positions are in increasing order, but do not need to be contiguous.

        The replacement elements should generally have the same styling attributes as the element at positions[0].

        A typical case is the formation of orthographic clusters in Indic scripts. After this operation, each of the resulting elements contributes to the rendering of each of the original elements.

        The other replace methods are special cases, where positions contains a single element, or contains a contiguous range of elements, and/or where elementIds contains a single element.
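        The semantics of the general form, and the way a special case reduces to it, can be sketched on a plain list of elementIds. GeneralReplace is a hypothetical helper that returns a new list and ignores attributes entirely; it is not an implementation of this interface.

```java
import java.util.ArrayList;
import java.util.List;

final class GeneralReplace {
    /** Replaces the elements at positions (increasing) by elementIds, at positions[0]. */
    static List<Integer> replace(List<Integer> run, int[] positions, int[] elementIds) {
        List<Integer> result = new ArrayList<>(run);
        // Remove from the highest position down so lower indices stay valid.
        for (int k = positions.length - 1; k >= 0; k--) {
            result.remove(positions[k]);
        }
        for (int k = 0; k < elementIds.length; k++) {
            result.add(positions[0] + k, elementIds[k]);
        }
        return result;
    }

    /** Contiguous special case: positions are first, first + 1, ..., limit - 1. */
    static List<Integer> replace(List<Integer> run, int first, int limit, int elementId) {
        int[] positions = new int[limit - first];
        for (int k = 0; k < positions.length; k++) {
            positions[k] = first + k;
        }
        return replace(run, positions, new int[] {elementId});
    }
}
```

        Applied to the run 'a b c d e f' with positions {1, 3, 4} and replacement elements y and z, the first method yields 'a y z c f', as in the example above.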

      • replace

        void replace​(int position,
                     int elementId)
        Replace the element at some position by a single element. Equivalent to replace (positions, elementIds) where positions is an array with the single value position and elementIds is an array with the single value elementId. A typical case is the replacement of a character by a glyph representing it, or the shaping of a digit glyph into an old style digit glyph.
      • replace

        void replace​(int position,
                     int[] elementIds)
        Expand an element. Equivalent to replace (positions, elementIds) where positions is an array with the single value position. A typical case is the decomposition of a glyph representing an accented character into a pair of glyphs, one for the base character and one for the accent.
      • replace

        void replace​(int[] positions,
                     int elementId)
        Merge multiple elements into a single element. Equivalent to replace (positions, elementIds) where elementIds is an array with the single value elementId. A typical case is the formation of a ligature, say 'f', 'i' merged into a single 'fi' glyph. The positions are in increasing order, but do not need to be contiguous: for example, the original run could contain 'e', 'combining acute', 't', and positions could refer to 'e' and 't'; the merged element is at positions[0], so the result would be 'et', 'combining acute'.
      • replace

        void replace​(int first,
                     int limit,
                     int elementId)
        Merge multiple contiguous elements into a single element. Equivalent to replace (positions, elementIds) where positions is an array with the values first, first + 1, ..., limit - 1, and elementIds is an array with the single value elementId. A typical case is when multiple characters are mapped to a single glyph.
      • remove

        void remove​(int position)
        Remove the element at position. The case where this is used is when a control character, e.g. U+200D ZERO WIDTH JOINER, does not result in a glyph. Removing is a bit brutal, because it means there is no trace of the character in the glyph stream; on the other hand, it's not quite possible to have a trace.
      • startWorkingWithPositions

        void startWorkingWithPositions​(int start,
                                       int limit)
        Announces that the elements in a range of the run will no longer change, and that the methods related to the x and y attributes will now be called on those elements.
        Parameters:
        start - the index of the first element of the range
        limit - the index of the first element following the range
      • getElementXPlacement

        double getElementXPlacement​(int pos)
        Get the xPlacement of an element.
      • getElementYPlacement

        double getElementYPlacement​(int pos)
        Get the yPlacement of an element.
      • getElementXAdvance

        double getElementXAdvance​(int pos)
        Get the xAdvance of an element.
      • getElementYAdvance

        double getElementYAdvance​(int pos)
        Get the yAdvance of an element.
      • setElementPlacementAndAdvance

        void setElementPlacementAndAdvance​(int position,
                                           double xPlacement,
                                           double yPlacement,
                                           double xAdvance,
                                           double yAdvance)
        Set the placement and advance of an element.
      • adjustPlacementAndAdvance

        void adjustPlacementAndAdvance​(int position,
                                       double xPlacementDelta,
                                       double yPlacementDelta,
                                       double xAdvanceDelta,
                                       double yAdvanceDetla)
        Increment the placement and advance of an element.
      • setElementAscentAndDescent

        void setElementAscentAndDescent​(int position,
                                        double ascent,
                                        double descent)
        Set the ascent and descent of an element.
      • getElementStyle

        java.lang.Object getElementStyle​(int position,
                                         ElementAttribute att)
        Get the value of an attribute on an element.
        Parameters:
        position - the position of the element
        att - the attribute to look up
        Returns:
        null if the element does not have the attribute.
      • setElementStyle

        void setElementStyle​(int position,
                             ElementAttribute att,
                             java.lang.Object attValue)
        Set an attribute on an element.
        Parameters:
        position - the position of the element
        att - the attribute to set
        attValue - the value of the attribute, should not be null
      • setElementStyle

        void setElementStyle​(int start,
                             int limit,
                             ElementAttribute att,
                             java.lang.Object attValue)
        Set an attribute on a range of elements. Equivalent to:
         for (int i = start; i < limit; i++) {
           setElementStyle (i, att, attValue); }
         
      • getInterElementStyleBefore

        java.lang.Object getInterElementStyleBefore​(int position,
                                                    InterElementAttribute att)
        Get the attribute on the position between two elements.
        Parameters:
        position - the position of the second element
        att - the attribute to look up
        Returns:
        null if the position does not have the attribute.
      • setInterElementStyleBefore

        void setInterElementStyleBefore​(int position,
                                        InterElementAttribute att,
                                        java.lang.Object attValue)
        Set the attribute on the position between two elements.
        Parameters:
        position - the position of the second element
        att - the attribute to set
        attValue - the value of the attribute, should not be null
      • setInterElementStyleBefore

        void setInterElementStyleBefore​(int start,
                                        int limit,
                                        InterElementAttribute att,
                                        java.lang.Object value)
        Set an attribute on a range of positions between elements. Equivalent to:
         for (int i = start; i < limit; i++) {
           setInterElementStyleBefore (i, att, value); }
         
      • getSubrunLimit

        int getSubrunLimit​(int start,
                           int limit,
                           ElementAttribute attribute)
        Return the index of the element following the run (with respect to attribute) that includes start.
      • getSubrunLimit

        int getSubrunLimit​(int start,
                           int limit,
                           java.util.Set attributes)
        Return the index of the element following the run (with respect to attributes) that includes start.