This book page describes and compares several methods for sorting lists of names. It revisits using attributes for rapid sorting, with attention to their global effect on strings. It demonstrates the use of the lexorder[n] option to sort and it illustrates the use of convert/local. Occasionally it is useful to sort a list of names—usually variables extracted from an expression—into a repeatable order. The simple and obvious way to do this is with the sort command.
      sort([x, y, Z, a1, a1]);
                                     [Z, a1, a1, x, y]
    
That approach does not work if there are indexed names in the list.
      sort([x, y, Z, a[1], a[2]]);
                                    [x, y, a[1], a[2], Z]
    
The elements are sorted, but by machine address rather than alphabetically. This can be shown using the addressof command:
      map(addressof,%);
                [135350720, 141574716, 143418428, 143418444, 143451544]
    
The explanation can be found from a careful reading of the sort help page. A list of symbols are, by default, sorted lexically. However, an indexed name is not a symbol. Consequently, the list is sorted by machine address, which is session dependent and hence not repeatable. The following sections look at a few ways to use the Maple sort command to sort a list of names into a repeatable order. Ideally the same final permutation is returned for any given permutation of the elements in the list. The sort command takes an optional second argument which can be an ordering function that takes two arguments—elements of the list—and returns true if the first argument is sorted with respect to the second, false otherwise. To sort names we can compare them as strings.
      SortNames[1] := proc(names :: list)
          sort( names
                , (a,b) -> convert(a,string) <= convert(b,string)
              );
      end proc:

      SortNames[1]( [x, y, Z, a[1], a[2]] );
                                             [ Z, a[1], a[2], x, y ]
    
Note the use of the inequality `<=' in the comparison rather than strict inequality, `<'. That prevents an unnecessary swap if the strings are equal, and achieves a stable sorting, that is, one in which equal elements maintain their original order.

This procedure has two problems. First, it is rather slow, at least in comparison to later techniques. Second, because they map to the same string, it cannot distinguish the indexed variable x[1] and the symbol `x[1]`. We can correct this latter problem by using sprintf with the %a format string to convert names to strings:

      SortNames[2] := proc(names :: list)
          sort( names
                , (a,b) -> sprintf("%a",a) <= sprintf("%a",b)
              );
      end proc:

      SortNames[2]( [x, y, Z, `a[1]`, a[1], a[2], `b[1]` ] );
                                    [Z, `a[1]`, `b[1]`, a[1], a[2], x, y]

    
In a previous blog entry I explained how attributes can be used to sort a list of floating-point numbers with respect to a relatively expensive function. That speeded up the operation by (1) reducing the number of calls to the function to the number of elements in the list, and (2) removing the need to call a user-defined order function, which is invariably slower than the built-in Maple order functions. The same technique can be employed here. However, because we are sorting strings rather than floats, there are a few new subtleties to consider.

Here is the basic routine

      SortNames[3] := proc(names :: list)
      local n;
          map(attributes, sort([seq](setattribute(sprintf("%a",n),n)
                                     , n = names)
                               , 'string'));
      end proc:
    
If you have not read the previously mentioned article this may need a bit of explanation. The [seq](...) expression creates a list of strings, using sprintf as previously described. As each string is created it is passed to setattribute which assigns the original name as the attribute to the string. The resulting list of strings is then sorted lexically. Finally the attributes of the sorted strings are extracted and returned. Because the attributes are the original names of the strings, the returned list is the sorted list of names.

This procedure has two flaws: one minor, one major.

The minor flaw is that it has global side-effects due to the way that Maple handles strings. Each unique string in Maple exists in memory in one location. Adding an attribute to a string does not change this location. Calling setattribute replaces any preexisting attributes, so if some other procedure was using attributes with strings, and one of the strings matched a name passed to this procedure, its attribute would be changed.

The major flaw is subtle but potentially fatal. Suppose two distinct names mapped to the same string. When the attributes are extracted from the sorted strings, both strings are mapped to the same name, the second one in the original list. The returned list of names, then, does not contain one of the original names. Using sprintf rather than convert/string eliminates some of the collisions, however, it is still possible for two names to map to the same string. A simple example is escaped local variables, which can be readily generated with convert/local:

      lst := map(convert, [a,a], `local`):
      convert(lst, set);
                                                 {a,a}
    
To demonstrate that this really occurs convert the sorted list to a set.
      convert(Sort[3](lst), set);
                                                  {a}
    
Both flaws in SortNames[3] can be corrected by attaching the attributes to a list containing a string rather than the string itself. Unlike attributes attached to a string, attributes attached to a list (and most attributable objects) only affect that instance of the object; two lists that are otherwise identical but have separate attributes are different elements in Maple. This can be verified with
      lst := [1,2]:
      lst1 := setattribute(lst,1):
      lst2 := setattribute(lst,2):
      lst3 := setattribute(lst,2):
      map(evalb, [lst1=lst2, lst2=lst3]);
                                          [ false, true ]
    
Normally, enclosing elements of a list in lists requires providing a user-defined order function to sort to unpack and compare them, which reduces the advantage of this technique. However, sort takes the optional symbol lexorder[n] which tells it to lexically compare the n'th element of each sublist. The following procedure does just that, using n=1.
      SortNames[4] := proc(names :: list)
      local n;
         map(attributes, sort([seq](setattribute([sprintf]("%a",n),n)
                                    , n = names)
                              , 'lexorder'[1]));
      end proc:
    
Demonstrate that this fixes the collision problem.
      convert(SortNames[4](lst), set);
                                       {a,a}
    
The procedures discussed convert names to strings and then sort the strings. Another option is to convert names to symbols and sort the symbols. The Maple procedure convert/symbol may be used to convert names (and other expressions) to symbols. However, the distinct names a[1] and `a[1]` get mapped to the same symbol. To avoid that, we can use convert/local, which maps each name to a distinct symbol.
      SortNames[5] := proc(names :: list)
      local n;
          map(attributes, sort([seq](setattribute(convert(n,'`local`'),n)
                                     , n = names)));
      end proc:
    
      convert(SortNames[5](lst), set);
                                       {a,a}
    
The goal was to create a procedure that rapidly sorts any permutation of a list of names into a session-independent, repeatable, sorted order. The results, SortNames[4] and SortNames[5], do not quite meet that goal. If the list of names contains names that lexically match but are different (escaped local variables) then the resulting order of those names is identical to their relative input order. Otherwise those procedures work as required. This is a relatively benign failure; furthermore it is not clear it can be be avoided. As shown in the following table, using attributes (cases 3-5) is substantially faster and uses less memory than providing a user-defined order function to sort (cases 1-2). SortNames[4], which uses lists to avoid string collisions, is slighter slower, but more robust, than SortNames[3], but faster than SortNames[5], which uses symbols. SortNames[4] is the best overall choice.
      $ tmaple sort-names.mpl

	Compare performance of sorting names.
	Each test sorts the same list of 50 names of 5 characters 500 times.

	time (s)  used (MB)  alloc (MB)  gctimes  proc
	--------  ---------  ----------  -------  ----
	  2.28     54.70       36.12         2    SortNames[1]
	  3.73     47.19       35.93         2    SortNames[2]
	  0.40      4.90        3.44         1    SortNames[3]
	  0.46      6.25        3.50         1    SortNames[4]
	  0.60     10.63        6.56         1    SortNames[5]
    
The table was generated using custom tools for accurately and repeatably measuring the time and memory usage of Maple procedures. They shall be the subject of another article.

Please Wait...