Carl Love

Carl Love

26104 Reputation

25 Badges

11 years, 56 days
Wayland, Massachusetts, United States
My name was formerly Carl Devore.

MaplePrimes Activity

These are replies submitted by Carl Love

@C_R You wrote

  • The fact that the rendering is as expected after a restart (of Maple or the whole system?) indicates that the system starts Maple in a correct way with all involved software and hardware components configured correctly. 

The required "restart" in this case is of the Maple GUI (user interface), which necessitates the closure of all open worksheets. Thus, a system "reboot" (restart of the operating system) works also but is more than is necessary.

To avoid confusion. readers should note that the usage of the word "restart" in this discussion has nothing to do with Maple's restart command.

@Ronan For any procedure keyword option X, passing alone is equivalent to passing X= true, even if true isn't a valid value for X.

@Jesús Guillera You're welcome. There are many notations commonly used for pochhammer and related functions. Maple's Typesetting package allows the display (only) of many MathML formulae that are beyond the usual Maple displays. Let me know if you'd like something different. 

@mmcdara Thank you. I didn't expect my technique to be the fastest possible, just brief and reasonably fast and intuitive.

I realize that your Maple version may not have the command convert(M, list, dimension= 1). If is a matrix, this makes a list of the rows, but the rows themselves are still rtables. I haven't tested, but I suspect that that is more efficient than convert(M, listlist) or its modern replacement convert(M, list, nested); and I suspect it'd be even more efficient if has order= C_order.

If you care to rerun your test, I think that my code using ArrayTools:-Alias in the Reply immediately above will work in Maple 2015.

@sursumCorda You wrote:

  • Maybe such functionalities can be built into the underlying core so that Maple can sort rows of the rtable in-place.

The following pretty much does that. It does create a "dope vector" (i.e., vector of pointers/addresses) with number-of-rows elements, but that's much less effort than copying the whole rtable. This also requires that the sub-rtables being sorted (the rows in this case) are each stored in contiguous memory locations, which is why I used order= C_order.

M:= rtable(<4, 1, 1 | 0, 5, 2 | 7, 8, 9>, order= C_order):  
(r,c):= op(1, M):
    rtable(1..r, i-> ArrayTools:-Alias(M, c*(i-1), [1..c])),
    key= (R-> [seq](R[[1,3,2]])), 
    output= permutation

If doesn't have order= C_order, the above won't give an error message, but the results won't be meaningful.

There are no files attached to your Question. Please try attaching them again.

@Joe Riel It's frustrating that the expository section of help page ?object,create has 8 paragraphs devoted to "Inheritance", but the "Examples" section doesn't have a single example of it.

@sursumCorda The specific example that you show can be done by

M:= <4, 1, 1 | 0, 5, 2 | 7, 8, 9>:
<sort(convert(M, list, dimension= 1), key= (R-> [seq](R[[1,3,2]])))[]>;

Higher dimensions can also be accomodated.

@Mike Mc Dermott Okay, here's an example verifying real-time enhancement on both integer and hardware-float data:

nV:= 6:  V:= [x||(1..nV)]:
P0:= codegen:-makeproc(randpoly(V, dense, degree= nV), V):
P1:= codegen:-optimize(P0):
P2:= codegen:-optimize(P0, tryhard):
(lprint@codegen:-cost)~([P0, P1, P2]):

    'rtable(1..2^13, random(-9..9), datatype= integer[4], subtype= Vector[row])' 
    $ nV
for k from 0 to 2 do gc(); R[k]:= CodeTools:-Usage((P||k)~(DataInt)) od:
memory used=375.70MiB, alloc change=0 bytes, 
cpu time=1.95s, real time=1.96s, gc time=0ns

memory used=72.91MiB, alloc change=0 bytes, 
cpu time=844.00ms, real time=847.00ms, gc time=0ns

memory used=39.18MiB, alloc change=0 bytes, 
cpu time=469.00ms, real time=473.00ms, gc time=0ns

#Verify accuracy:
LinearAlgebra:-Norm~([R[0]-R[1], R[1]-R[2], R[0]-R[2]], infinity);
                           [0, 0, 0]
    'rtable(1..2^13, frandom(-2..2), datatype= hfloat, subtype= Vector[row])' 
    $ nV
for k from 0 to 2 do gc(); R[k]:= CodeTools:-Usage((P||k)~(DataHF)) od:
memory used=404.77MiB, alloc change=0 bytes, 
cpu time=7.02s, real time=4.07s, gc time=3.53s

memory used=195.47MiB, alloc change=0 bytes, 
cpu time=1.38s, real time=1.38s, gc time=0ns

memory used=92.04MiB, alloc change=0 bytes,
cpu time=563.00ms, real time=558.00ms, gc time=0ns

#Verify accuracy:
(lprint@LinearAlgebra:-Norm)~([R[0]-R[1], R[1]-R[2], R[0]-R[2]], infinity):


I was rushed when I wrote the Answer above. Here's a more-complete example:

    [1,2,3,4], #column heights
    width= 2, distance= 1,
    axis[1]= [
        tickmarks= [
            [seq](k+1 = sprintf("%a - %a", k, k+2), k= 0..11, 3), 
            rotation= Pi/4
    axesfont= ["Arial", 18]


@2cUniverse You can never make a valid comparison between two measurements both of which have been rounded down to 0. You must either use a finer measuring instrument, or increase the scale of the problem until the instrument returns positive values. The latter is usually easier in Maple.

iremFrac2a:=proc(n,d, b)
   local r1,r,i,li:
   r:= irem(n*b,d):
   if NumberTheory:-AreCoprime(d,b) then
     while true do
        r1:= irem(b*r,d):
        if r1=n then break fi:
     end do:
   else print("Denom-Base not coprime "): return [] fi:
end proc
iremFrac2b:= proc(n, d, b) local N:= irem(n,d), r:= irem(N*b, d), s; 
    [if igcd(b,d) = 1 then 
        (do [[r,r], [r, (s:= irem(b*r, d))]], [[r,s], [s,s]] until (r:= s)=N)
        print("Denom-Base not coprime")
end proc
d:= nextprime(2^17):  b:= irem(rand(), d);
                          b := 116956

(L1, CT1, RT1):= CodeTools:-Usage(
    iremFrac2a(1, d, b), output= [output, cputime, realtime]
memory used=14.24GiB, alloc change=-32.00MiB, 
cpu time=5.09m, real time=79.49s, gc time=4.93m

(L2, CT2, RT2):= CodeTools:-Usage(
    iremFrac2b(1, d, b), output= [output, cputime, realtime]
memory used=14.89MiB, alloc change=0 bytes, 
cpu time=172.00ms, real time=165.00ms, gc time=0ns

#Verify that computed results are equal:
evalb(L1 = L2);

#time ratios (real and CPU):
RT1/RT2, CT1/CT2;
                    481.7333333, 1776.529070

To be fair, those ratios are not as extreme if I increase the scale of the problem "horizontally" (i.e., smaller test cases but more of them) rather than "vertically" (i.e., a single large test case), as above. I'll explain why in a moment.

d:= nextprime(2^10):  b:= irem~(['rand()'$64], d);
b := [173, 917, 905, 706, 476, 906, 759, 1001, 827, 720, 297, 35, 
  933, 582, 61, 289, 139, 245, 817, 496, 11, 477, 1001, 541, 824, 
  911, 787, 901, 990, 853, 562, 160, 248, 313, 631, 873, 630, 
  281, 565, 282, 85, 761, 311, 1021, 475, 123, 593, 632, 209, 
  683, 931, 296, 348, 381, 890, 772, 720, 565, 529, 277, 861, 
  286, 147, 912]

(L1, CT1, RT1):= CodeTools:-Usage(
    iremFrac2a~(1, d, b), output= [output, cputime, realtime]
memory used=283.35MiB, alloc change=8.00KiB,
cpu time=6.34s, real time=1.70s, gc time=6.02s

(L2, CT2, RT2):= CodeTools:-Usage(
    iremFrac2b~(1, d, b), output= [output, cputime, realtime]
memory used=13.95MiB, alloc change=0 bytes,
cpu time=172.00ms, real time=172.00ms, gc time=0ns

#Verify that computed results are equal:
evalb(L1 = L2);

#time ratios (real and CPU):
RT1/RT2, CT1/CT2;
                    9.877906977, 36.88372093

The reason is that when you create a sequence (or list or set) by appending to an existing sequence, each assignment reconstructs the entire sequence. Of course, the time for each such reconstruction is proportional to the length of the sequence. Thus, the time for the whole loop (just for this reconstruction minutiae, not for the actual desired computation) is proportional to the square of the number of loop iterations. Using the standard asymtotic-order ("big O") notation, we say that the time for this process is O(n^2) (where n is the number of iterations).  Assuming that the actual computation time for each sequence element is independent of its position in the sequence (which is very often the case, and it's certainly true in this case), that time is O(n) in total. It's mathematically certain that there's a value of n, say n_c, such that the O(n^2) process will dominate the O(n) process for all n > n_c. In more-technical mathematical language, for any A > 0, B > 0,

limit(A*n^2 / (B*n) , n= infinity) = infinity.

@2cUniverse Your new version can be coded like this:

iremFrac2:= proc(n, d, b) local N:= irem(n,d), r:= irem(N*b, d), s; 
    [if igcd(b,d) = 1 then 
        (do [[r,r], [r, (s:= irem(b*r, d))]], [[r,s], [s,s]] until (r:= s)=N)
        print("Denom-Base not coprime")
end proc

The N:= irem(n,d) is needed because if n >= d then the loop will never reach n.

Oh, I just realized that this may have not been obvious before: The number of loop iterations is always MutiplicativeOrder(b,d), but dividing that number by two isn't always relevant.

@sursumCorda GMP is open source and publically licensed. You can find numerous websites with documentation of the whole package or of invidual components such as "GMP isprime". The Maple help page ?GMP contains a link to the source code as it's been modified for use in Maple.

@Mike Mc Dermott Your example is too small for a difference to be shown. For any expression of practical size for which I've used optimize(..., tryhard) over the past 20 years, it has made a great improvement in the optimization. The extra time taken by the optimizer has always been trivial compared to the improvement in execution time. Here's an example:

P0:= codegen:-makeproc(randpoly([x,y,z], dense), [x,y,z]);

proc (x, y, z) -27-16*x+18*y+51*z+33*x^5+59*y^5+52*z^5-89*x^4-46*y^4+36*z^4-60*x^3-34*y^3+91*z^3-20*x^2-10*y^2-22*z^2+12*x^3*y*z+7*x^2*y^2*z-70*x^2*y*z^2-89*x*y^3*z+69*x*y^2*z^2-42*x*y*z^3+34*x^2*y*z+80*x*y^2*z-33*x*y*z^2+21*x*y*z-25*x^3*y+10*x^4*y+7*x^4*z+65*x^3*y^2-96*x^3*z^2-42*x^2*y^3-60*x^2*z^3-4*x*y^4+97*x*z^4-69*y^4*z-33*y^3*z^2+40*y^2*z^3-65*y*z^4+50*x^3*z-89*x^2*y^2+16*x^2*z^2-77*x*y^3+30*x*z^3+87*y^3*z+77*y^2*z^2-85*y*z^3-68*x^2*y+52*x^2*z+28*x*y^2-64*x*z^2+y^2*z+54*y*z^2-35*x*y+89*x*z end proc

P1:= codegen:-optimize(P0);

proc (x, y, z) local t1, t10, t109, t13, t16, t19, t2, t22, t25, t28, t31, t34, t37, t40, t41, t44, t68, t9, t92; t1 := x^2; t2 := t1^2; t9 := t1*x; t10 := y^2; t13 := y*t9; t16 := z^2; t19 := t10*y; t22 := t10*t1; t25 := y*t1; t28 := t16*z; t31 := t10^2; t34 := t19*x; t37 := t10*x; t40 := -42*t19*t1-60*t1*t28+65*t10*t9+12*t13*z-70*t16*t25+69*t16*t37-96*t16*t9+33*t2*x+10*t2*y+7*t2*z+7*t22*z-4*t31*x-89*t34*z; t41 := x*y; t44 := t16^2; t68 := 16*t1*t16+40*t10*t28-33*t16*t19+34*t25*z-42*t28*t41+59*t31*y-69*t31*z+97*t44*x-65*t44*y+52*t44*z+50*t9*z-25*t13-89*t2-89*t22; t92 := 52*t1*z+77*t16*t10-33*t16*t41+87*t19*z+30*t28*x-85*t28*y+80*t37*z+21*t41*z-68*t25-46*t31-77*t34+28*t37+36*t44-60*t9; t109 := t10*z-64*t16*x+54*t16*y+89*x*z-20*t1-10*t10-22*t16-34*t19+91*t28-35*t41-16*x+18*y+51*z-27; t40+t68+t92+t109 end proc

P2:= codegen:-optimize(P0, tryhard);

proc (x, y, z) local result, t19, t2, t20, t3, t5, t6, t8, t9; t20 := 7*z-89; t19 := 52*z; t9 := x^2; t8 := x*t9; t6 := y^2; t5 := t6*y; t3 := z^2; t2 := t3*z; result := 91*t2-34*t5-60*t8-27+(87*t5+50*t8+51)*z+(-85*t2+18+(12*z-25)*t8)*y+(30*t2+89*z-16+(-89*z-77)*t5+(-42*t2+21*z-35)*y)*x+(-60*t2-42*t5+t19-20+(34*z-68)*y+(33*x+10*y+t20)*t9)*t9+(40*t2+65*t8-10+z+t20*t9+(80*z+28)*x+(59*y-46-69*z-4*x)*t6)*t6+(-33*t5+77*t6-96*t8-22+54*y+(-70*y+16)*t9+(69*t6-33*y-64)*x+(36+t19-65*y+97*x)*t3)*t3 end proc

P3:= codegen:-optimize(P1, tryhard);

proc (x, y, z) local t1, t10, t13, t16, t19, t2, t22, t25, t28, t31, t34, t37, t41, t44, t9, result; t1 := x^2; t2 := t1^2; t9 := t1*x; t10 := y^2; t13 := y*t9; t16 := z^2; t19 := t10*y; t22 := t10*t1; t25 := y*t1; t28 := t16*z; t31 := t10^2; t34 := t19*x; t37 := t10*x; t41 := x*y; t44 := t16^2; result := -27-20*t1-25*t13-34*t19-68*t25+91*t28-89*t22-46*t31-89*t2-16*x+18*y-4*t31*x-77*t34+36*t44-60*t9-35*t41+65*t9*t10+(-42*t19-60*t28)*t1+(-96*t9-70*t25+69*t37)*t16+(12*t13+7*t22-89*t34)*z+(33*x+10*y+7*z)*t2+(40*t10-42*t41)*t28+(16*t1-33*t19)*t16+(50*t9+34*t25)*z+(97*x-65*y+52*z)*t44+(59*y-69*z)*t31+(30*x-85*y)*t28+(77*t10-33*t41)*t16+(52*t1+87*t19+21*t41)*z+(80*z+28)*t37+(-64*x+54*y-22)*t16+(z-10)*t10+(89*x+51)*z end proc

P4:= codegen:-optimize(P2);

proc (x, y, z) local t19, t2, t20, t24, t3, t5, t6, t8, t9; t20 := 7*z-89; t19 := 52*z; t9 := x^2; t8 := x*t9; t6 := y^2; t5 := t6*y; t3 := z^2; t2 := t3*z; t24 := 89*z; 91*t2-34*t5-60*t8-27+(87*t5+50*t8+51)*z+(-85*t2+18+(12*z-25)*t8)*y+x*(30*t2+t24-16+t5*(-t24-77)+(-42*t2+21*z-35)*y)+(-60*t2-42*t5+t19-20+(34*z-68)*y+(33*x+10*y+t20)*t9)*t9+(40*t2+65*t8-10+z+t20*t9+(80*z+28)*x+(59*y-46-69*z-4*x)*t6)*t6+(-33*t5+77*t6-96*t8-22+54*y+(-70*y+16)*t9+(69*t6-33*y-64)*x+(36+t19-65*y+97*x)*t3)*t3 end proc


Vector(5, {(1) = 54*additions+207*multiplications, (2) = 19*storage+19*assignments+104*multiplications+54*additions, (3) = 9*storage+9*assignments+53*additions+57*multiplications, (4) = 16*storage+16*assignments+82*multiplications+54*additions, (5) = 9*storage+9*assignments+53*additions+56*multiplications})



@sursumCorda That's a good find on your part. Here are the new timings using option compile on all Iterators, and running twice so that the compilation time isn't counted.

LL:= [seq]([$0..k], k= 1..8): #So, product has 9! 8-tuples.
for k to 6 do cp[k]:= CodeTools:-Usage(CP[k](LL)) od:
memory used=189.79MiB, alloc change=0 bytes,
cpu time=2.66s, real time=2.15s, gc time=703.12ms

memory used=173.28MiB, alloc change=5.64MiB, 
cpu time=2.66s, real time=2.09s, gc time=796.88ms

memory used=106.80MiB, alloc change=2.87MiB, 
cpu time=2.56s, real time=2.29s, gc time=390.62ms

memory used=106.80MiB, alloc change=0 bytes, 
cpu time=2.69s, real time=2.36s, gc time=421.88ms

memory used=386.46MiB, alloc change=0 bytes,
cpu time=4.70s, real time=3.46s, gc time=1.72s

Carl's own cartesian product iterator:
memory used=138.49MiB, alloc change=-8.50MiB,
cpu time=1.89s, real time=1.62s, gc time=359.38ms


5 6 7 8 9 10 11 Last Page 7 of 680