MaplePrimes Activity


These are replies submitted by acer

My reasoning was as follows. Since the external routines for the matrix exponential demand full rectangular hardware-datatype Matrices, I had hermitian_basis produce those in the first place. This should avoid a lot of copying, and allows the evalf calls around Matrix results to be dropped. I also used some lower-level (but not undocumented, internal) functions to do the linear algebra, although I wouldn't expect that to compare with the effect of avoiding copies and using the hardware datatype. I removed some conversions, where it seemed possible. Please check for correctness, and adjust accordingly. :) I didn't get as much of a speedup and memory improvement as I hoped for, only about 33% or so at N=20.
################################################################################
hermitian_basis := proc(N::posint)
#
# returns a list of N^2-1 traceless, orthogonal, hermitian NxN matrices
# (generalized Pauli or Gell-Mann matrices) which form a basis of a
# vector space and can serve as the generators of SU(N). For N=2,
# this yields the normal Pauli matrices.
# The procedure follows [Tilma, Sudarshan, J. Phys. A 35 (2002) 10467]
#
options cache;
local kronecker_symbol, temp_list, temp_list_counter, lambda, lambda1,
      lambda2, i, j, mu, nu, st, scal;
st:=time();
Digits := max(Digits,trunc(evalhf(Digits)));
kronecker_symbol := (a,b) -> if a=b then 1 else 0 end if;
#
# following the reference, first two sets of linearly independent hermitian
# matrices (lambda1, lambda2) are generated and stored in a temporary list.
#
temp_list :=
[seq( seq( op([Matrix(N, (mu,nu)->kronecker_symbol(j,mu)*kronecker_symbol(i,nu)
                             +kronecker_symbol(j,nu)*kronecker_symbol(i,mu),
                 datatype=complex(float)),
          Matrix(N, (mu,nu)->-I*(kronecker_symbol(i,mu)*kronecker_symbol(j,nu)
                             -kronecker_symbol(i,nu)*kronecker_symbol(j,mu)),
                 datatype=complex(float)) ]),
          i=1..j-1 ), j=1..N )];
#
# the remaining N-1 matrices are generated (as described in the reference).
#
for i from 2 to N do
   lambda[i^2-1] := Matrix(N, datatype=complex(float)):
   # Since only diagonal elements are nonzero, scale them as assigned,
   # instead of scaling each whole Matrix. Check which is better.
   scal := evalf(sqrt(2/(i^2-i)));
   for j from 1 to i-1 do
      lambda[i^2-1][j,j] := scal; # was 1
   end do;
  lambda[i^2-1][i,i] := (-i+1)*scal; # was -i+1
  #LinearAlgebra:-LA_Main:-MatrixScalarMultiply(lambda[i^2-1],scal,inplace=true,outputoptions=[]);
end do:
#
# now the matrices from the temporary list and the matrices with fixed index
# are combined to a single list
#
temp_list_counter := 1;
for i from 1 to N^2-1 do
   if not assigned(lambda[i]) then
      lambda[i] := temp_list[temp_list_counter];
      temp_list_counter := temp_list_counter + 1;
   end if;
end do;
printf("hb time: %a \n", time()-st);
return lambda;
end proc:
####################################################################################
####################################################################################
parametrize_unitary_Euler := proc(N::posint)
#
# returns a general NxN unitary matrix, parametrized by a list of
# N^2-1 randomly generated parameters.
# The procedure follows [Tilma, Sudarshan, J. Phys. A 35 (2002) 10467] (see Eq. (19))
#
local lambda, i, j, k, m, alpha, used_lambdas, U, st, temp;
st := time();
#
# create a list of N^2-1 (uniformly distributed) random numbers in the
# range 0..2*Pi. These are actually the random parameters for the unitary matrix.
#
alpha := RandomTools[Generate](list(float(range=0..evalf(2*Pi), method=uniform), N^2-1));
#
# first, a list of the generalized Gell-Mann matrices is needed as
# a hermitian basis of the space of NxN matrices.
#
lambda := hermitian_basis(N);
#
# define auxiliary function j(m) as in the reference
#
#j := m -> piecewise(m=N, 0, sum(2*(m+l), l=0..N-m-1));
#
# actually the same but easier is the following
j := m -> (N+m-1)*(N-m);
#
# create a list of those lambda matrices that are actually used
# (in the order of later use, including multiple occurrences)
#
used_lambdas := Vector(N^2-1):
i := 1;
for m from N to 2 by -1 do
   for k from 2 to m do
      used_lambdas[i]   := 3;
      used_lambdas[i+1] := (k-1)^2 + 1;
      i := i+2;
   end do:
end do:
for m from 2 to N do
   used_lambdas[i] := m^2-1;
   i := i+1;
end do:
printf("puE 1st time: %a \n", time()-st);
temp := LinearAlgebra:-LA_Main:-MatrixScalarMultiply(
              lambda[used_lambdas[1]], I*alpha[used_lambdas[1]],
              inplace=false, outputoptions=[]);
U := LinearAlgebra:-LA_Main:-MatrixFunction(
              temp, exp(dummy), dummy, outputoptions=[]);
for k from 2 to op(1,used_lambdas) do
   temp := LinearAlgebra:-LA_Main:-MatrixScalarMultiply(
                 lambda[used_lambdas[k]], I*alpha[used_lambdas[k]],
                 inplace=false, outputoptions=[]);
   U := LinearAlgebra:-LA_Main:-MatrixMatrixMultiply(
                 U, LinearAlgebra:-LA_Main:-MatrixFunction(
                             temp, exp(dummy), dummy, outputoptions=[]),
                             inplace=false, outputoptions=[]);
end do;
printf("puE time: %a \n", time()-st);
return U;
end proc:
#########################################################################################
NN := 20:
st,ba:=time(),kernelopts(bytesalloc):parametrize_unitary_Euler(NN):time()-st,kernelopts(bytesalloc)-ba;
st,ba:=time(),kernelopts(bytesalloc):parametrize_unitary_Euler(NN):time()-st,kernelopts(bytesalloc)-ba;
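# (The second call is repeated deliberately: hermitian_basis has option cache,
# so the second timing excludes the cost of constructing the basis.)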
acer
Don't the Matrices need to commute for that property to hold? I thought of trying that, to be able to replace the product of matrix exponentials with a single matrix exponential of a sum. The result differs in general, I think. And although I did not look hard at the alternative result, it too might be unitary!? (If this is from an algorithm in a published paper, one could imagine that the original authors would have thought of this and discarded it for some reason, however.) I had some other performance improvements for this, which got both memory and speed (33%) gains. It might still be slower than Matlab. For the N=20 case there are something like 399 complex-float Matrix exponentials to be computed. Even at something like 1/100th of a second per 20x20 matrix exponential, it might still be comparatively slow. I will try to find time to post my code edits. I assume that the poster is most interested in following his algorithm, rather than working on another altogether.
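To illustrate the commutativity point (a quick sketch of my own, not part of the poster's algorithm): for non-commuting hermitian generators the product of exponentials differs from the exponential of the sum, even though both results are unitary.

with(LinearAlgebra):
A := Matrix([[0,1],[1,0]], datatype=complex(float)):    # Pauli sigma_x
B := Matrix([[0,-I],[I,0]], datatype=complex(float)):   # Pauli sigma_y
P := MatrixExponential(I*A) . MatrixExponential(I*B):   # product of exponentials
S := MatrixExponential(I*(A+B)):                        # exponential of the sum
Norm(P - S);                                            # nonzero: the two differ
Norm(P . HermitianTranspose(P) - IdentityMatrix(2));    # ~0: P is still unitary

acer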
The Maple "engine", which does the computations, is mserver (mserver.exe on Windows). It communicates with one of the three interfaces over sockets. The mserver is supposed to shut down when the interface shuts down, but it seems that under some rare circumstances this might not happen (or gets delayed, perhaps if the kernel has "gone to the think tank"). An unusual exit (crash) of the interface may also orphan the mserver. The mservers can use a lot of memory, because that's where the computations are done. The Std GUI can use a lot of memory because, well, it runs under Java. On unix/linux the interfaces are java (Std GUI), cmaple (TTY), and maplew (Classic; cwmaple on Windows). I'm not sure how "mjava" ties in, though it's likely a Std GUI thing. acer
How would one save the user-assigned variables in such a session as below?

a := proc() local x; x; end:
x := 5;
y := a();

Is y "user-assigned"? The fact that it was assigned an escaped local appears to be lost, using anames(user) or anames(alluser) and the technique suggested. Is there some easy modification to get around that? Are there other problematic cases? (Maybe using modules, or anonymous modules created by a parent module's ModuleApply, or ...?) Is a .mla sufficient to store the whole "user-defined" state? Should there be a mechanism to store everything (including the nature of the localness of all locals)?
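For concreteness, a minimal sketch of the escaped-local situation (the save idiom via parse/sprintf is my own construction, not necessarily the technique referred to above):

a := proc() local x; x; end proc:
x := 5:
y := a():   # y now holds an escaped local x, distinct from the global x
# build and execute a 'save' statement covering every user-assigned name
parse( sprintf("save %q, \"session.m\";", anames('user')), 'statement' );
# after restart and read "session.m", is y still distinct from the global x?

acer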
I believe that objects are all over the place, in memory. The only data that I know to be stored in contiguous segments is that of "hardware" datatype rtables (which is why vendor or cache-tuned BLAS are used). If memory is all over, and the simpl table stores a hash value dependent upon memory location (and if that value appears in many other objects), then compacting all stored data is problematic. So fragmentation can be a problem. You can see what I think is evidence of this in how the creation of large hardware rtables can cause somewhat "unexpected" allocation increases -- there's free memory already allocated, but not a large enough contiguous piece.
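A rough way to watch for this (my own probe; the 2000x2000 size is arbitrary): compare the allocation increase against the nominal size of a fresh hardware Matrix.

ba := kernelopts('bytesalloc'):
M := Matrix(2000, 2000, datatype=float[8]):   # nominally 2000*2000*8 bytes, 32 MB
kernelopts('bytesalloc') - ba;   # can exceed 32 MB when no contiguous free
                                 # block of that size is already available

acer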
The colour does vary, from leaf to leaf and tree to tree. As the green chlorophyll disappears, the yellows begin to show up. The red, which varies from tree to tree, is usually due to converted sugars, I believe. A quick web search indicates that the red is due to anthocyanins in the sap (pigments also found in cranberries, cherries, etc), and the yellow and orange are due to carotene-type pigments (also found in carrots). acer
showstat(curry);

It's a programming convenience.
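For example (an illustration of my own):

p := curry( `+`, 10 ):            # fix the first argument of `+`
p(32);                            # 42
map( curry(`*`, 2), [1,2,3] );    # [2, 4, 6]

acer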
Could you not have delayed evaluation by using uneval quotes?

define( '`&d`', 'orderless',
        &d( a::list, b::list )
           = 'simplify(sqrt(add( (a[i]-b[i])^2, i=1..nops(a))))' );
[1,2,3,4,5] &d [4,3,2,1,0];

Isn't define() just showing Maple's usual evaluation rules? If you would want that changed for define(), then how, precisely? In any case, the ?define help page could be made to explain this. acer
Yes, of course, sorry. The rest still holds, I think. But we seemed to agree that andmap only got called once for the product 2*a, which is wrong. I wonder whether it relates to the fact that 2*a dismantles to a SUM, and not a PROD. acer
It looks like it is going wrong for the multiplication case. andmap(A, 2*a) is calling A just once, with argument 2. Compare with ormap(A, c*a*b), for which A gets called three times, with arguments a, b, and c. Interestingly, andmap(A, c*a*b) also calls A just once. It seems that andmap is considering only a single operand of a product, while ormap considers all the operands. I wonder whether the problem is restricted to andmap and products.
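Here is the sort of count I mean (a sketch; the tallying predicate A is my own, with a, b, c left unassigned):

cnt := 0:
A := proc(e) global cnt; cnt := cnt + 1; true end proc:
cnt := 0: andmap(A, 2*a):    cnt;   # 1, as observed: called only with 2
cnt := 0: ormap(A, c*a*b):   cnt;   # 3: called with a, b, and c
cnt := 0: andmap(A, c*a*b):  cnt;   # 1 again

acer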