acer

32405 Reputation

29 Badges

19 years, 345 days
Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are replies submitted by acer

@itsme As I'm sure you're aware, more usual array plots have the problem that right-click-export only exports one single cell's plot to a graphics format file. That is, IMO, a serious bug/limitation in the GUI's handling of GUI Table's of plots. And it affects this colorbar approach as well. I hope that it gets fixed in general.

Apart from export of the whole worksheet (to .pdf say) I'm not aware of any other way to get proper export of side-by-side multiple 3D plots or 2D/3D plot mixes.

This is why for a 2D plot and its colorbar an appealing way to proceed (at present) is to display both in a single plot -- with tickmarks and other things tweaked or faked accordingly.

@Carl Love In the attached worksheet I inlined (a modified version of) Ceil into Sols_z6z7 and then compiled that to Sols_z6z7c. I gave option threadsafe to Sols_z6z7 in order that it could be run in parallel without its call_external being blocking. Then I changed SolsRecurse to call Sols_z6z7c without evalhf.

On my 64bit Linux i5 this brought timing for the final full example from 26 minutes down to 18 minutes, where the latter is 68% of the former. I used Maple 2015.1.

I noticed that when running the original evalhf based version `top` reported a steady 365 "% CPU" on my quad-core i5. The compiled version ran steady with 295 "% CPU". The load averages reflected those numbers too. I find that difference in peak processor usage interesting. (A hypothetical number of 400 %CPU would mean all four cores were running flat out.)

LatticeCountmodif4f.mw

I made earlier experiments by only compiling Ceil. For example, I did it with calls to compiled Ceil wrapped in eval, using evalhf around the calls to Sols_z6z7. And I did it with compiled Ceil (and no eval) in Sols_z6z7 without evalhf. And I tried it with float arguments and a few more trunc calls (on both, or either proc separately), and with integer arguments, and so on. But the best I got with any of that was a timing slightly worse than the original.

For the inlined version of Ceil I used a single operator-form call to `if`, rather than a pair of such where one condition allows a return that didn't need trunc. You'll know what I mean. That change made no difference on the original evalhf version, but it did seem to speed it up a little for the compiled version. You could retest both, naturally.

The initial complaint from Compiler:-Compile about not handling rational arithmetic I got around by simply multiplying x/y by the float 1.0. (In the inlined version this became 1.0*a1/a2.)

I didn't put types on the declarations of parameters and locals of Sols_z6z7, and am not sure whether doing so would change the performance.

The final long example call to SolsParallel "used" 225Gb, while allocation stays very low. This means that a great deal of garbage collection has taken place. I wonder whether SolsRecurse could be called under evalhf inside SolsParallel (which would have to wrap the call to Sols_z6z7c in eval). Of course this would require alterations in how SolsRecurse is coded. Perhaps Vector arguments and an extra argument to indicate how many values are yet present in the Vector might work. I'm not sure.

 

@Markiyan Hirnyk The post you cited is about creating a 3D surface plot that has custom shading (by z-values). But in itself that does not provide a way to obtain an associated colorbar, which I believe is the central point of the question.

Actually, a 3D plot with a custom coloring (such as the cited link demonstrates) makes for a slightly more difficult case in constructing the corresponding colorbar as a separate 2D plot. Eventually I'd like to accomodate this more involved case in some code I'm investigating for generating 2D colorbars. But to begin I'm just looking at the simple, stock HSV shading=zhue case, for which the colorbar construction is simple. To begin I'm looking at conjoined display of 3D plot and 2D colorbar.

In Maple 18 a vertical color gradient can be applied directly and efficienctly in the plotting call, using the colorscheme option.

Eg,

restart:
N:=25000:
xy:=LinearAlgebra:-RandomMatrix(N,2,generator=0.0..1.0,
                                outputoptions=[datatype=float[8]]):

st,str:=time(),time[real]():

P:=plots:-pointplot(xy, colorscheme=["zgradient",[black,red]],
                    symbolsize=4):

time()-st,time[real]()-str;

                          0.090, 0.093

kernelopts(version);

    Maple 18.02, X86 64 LINUX, Oct 20 2014, Build ID 991181

acer

@itsme Does your example get laid out acceptably better in Maple 2015.1?

Explore(plots:-display(Matrix((i,j)->plot(sin(a*x), x=0..5),4,4),
                       labels=["x", "this is y"]),
        parameters=[a=1..2], placement=left, size=[1200, 1000]);

@Carl Love It is not unquestionable that cp=3220 is mathematically a root of expression pp just because eval(pp,cp=3220) returns exact zero.

It may be true that cp=3220 is an exact root of the given expression pp for any finite value of x. But that fact is not demonstrated just because eval(pp,cp=3220) returns exact zero.

It is not even true for ee an expression which is univariate (in name cp) that cp=val is a root of ee just because eval(ee,cp=val) returns exact zero. And the situation is more complicated for multivariate expressions. Here's a univariate example:

ee:=cp/(8*sin(cp^2)*cos(4)*cos(cp)^4-8*sin(cp^2)*cos(4)*cos(cp)^2
        +sin(cp^2)*cos(4)+8*sin(cp^2)*sin(4)*sin(cp)*cos(cp)^3
        -4*sin(cp^2)*sin(4)*sin(cp)*cos(cp)-8*cos(cp^2)*cos(4)*sin(cp)*cos(cp)^3
        +4*cos(cp^2)*cos(4)*sin(cp)*cos(cp)+8*cos(cp^2)*sin(4)*cos(cp)^4
        -8*cos(cp^2)*sin(4)*cos(cp)^2+cos(cp^2)*sin(4))
        -2/(8*sin(cp^2)*cos(4)*cos(cp)^4
        -8*sin(cp^2)*cos(4)*cos(cp)^2+sin(cp^2)*cos(4)
        +8*sin(cp^2)*sin(4)*sin(cp)*cos(cp)^3
        -4*sin(cp^2)*sin(4)*sin(cp)*cos(cp)
        -8*cos(cp^2)*cos(4)*sin(cp)*cos(cp)^3
        +4*cos(cp^2)*cos(4)*sin(cp)*cos(cp)
        +8*cos(cp^2)*sin(4)*cos(cp)^4-8*cos(cp^2)*sin(4)*cos(cp)^2
        +cos(cp^2)*sin(4));

eval(ee,cp=2);
                                      0

ff:=combine(ee);
                                      cp - 2       
                          ff := -------------------
                                   /  2           \
                                sin\cp  - 4 cp + 4/

eval(ff,cp=2);
Error, numeric exception: division by zero

limit(ff,cp=2);
                                  undefined

limit(ee,cp=2);
                                  undefined

So, we would go wrong by accepting only the result of exact zero from eval(ee,cp=2). So I don't see anything wrong with Markiyan's wanting to examine things further than just accepting eval(pp,cp=3220)=0.

And relying on high Digits for a floating point evaluation can also be more tricky. I'm not sure how tools like shake might do with the original expression pp, though.

@Wang Gaoteng The nonconvergence problem depends on the values of Digits (which is being used for both working precision and the accuracy goal).

It can be seen even when supplying a point that should be close enough -- internally `fsolve/sysnewton` gets close to the solution but bails when it fails to attain enough accuracy.

Notice that, in the cases where fsolve does return a numeric result below, there are relatively few correct digits in the result (relative to the Digits used). It's an interesting example.

restart:
f1 := {y = tan(-8*Pi*(1/180))*x,
      (x-cos(-8*Pi*(1/180))-sin(-8*Pi*(1/180)))^2
       +(y-sin(-8*Pi*(1/180))+cos(-8*Pi*(1/180)))^2 = 1}:
solve(f1);
                /       /2    \          /2    \    /2    \\ 
               { x = cos|-- Pi|, y = -tan|-- Pi| cos|-- Pi| }
                \       \45   /          \45   /    \45   // 
evalf(%);
                    {x = 0.9902680687, y = -0.1391731010}
for d from 10 to 25 do
  Digits:=d:
  sol:=fsolve(f1,{x=0.5,y=-0.5});
  if type(eval(sol,1),specfunc(anything,fsolve)) then
    print(d, FAIL);
  else
    print(d, sol);
  end if;
end do:
                                  10, FAIL
                                  11, FAIL
                12, {x = 0.990266468986, y = -0.139172876129}
                                  13, FAIL
              14, {x = 0.99026777793041, y = -0.13917306008923}
                                  15, FAIL
                                  16, FAIL
                                  17, FAIL
                                  18, FAIL
         19, {x = 0.9902680681796783120, y = -0.1391731008810966729}
                                  20, FAIL
                                  21, FAIL
      22, {x = 0.9902680687192257411404, y = -0.1391731009569251190394}
                                  23, FAIL
    24, {x = 0.990268068742932805847978, y = -0.139173100960256929701771}
                                  25, FAIL

@Rouben Rostamian  I would not be surprised if there were things in PDETools that could make it easier to do with D.

@Alejandro Jakubi I would guess so, in the simplest case.

But there are a few mechanisms by which an externally called routine (call_external, set up using define_external) can call back to Maple using functions in the OpenMaple API. For example, EvalMapleProc and friends, or queryInterrupt, or even MapleUserInfo... and I do not know whether any of these would allow the kernel to check the timelimit.

@kegj It's difficult to debug your code efficiently when all we have to go on is vague description. (Eg. first you write print (A) gives "no result", and then later you write that it gets "an output A".) Please just use the green up-arrow icon on the response editor here, and upload a worksheet containing what you've tried so far. That way we can ascertain what kind of "array of elements" may or may be yet assigned to any name, and guide you, etc.

By the way, have you read through the Programming Guide? That is one of the product manuals. I'd suggest you start with chapters 1-6 and try the examples from chapters 2-5, and then consider chapters 7-8 if you have time. Don't get bogged down in the heavy details of chapter 6.

@nm ...Just for the other readers: the term builtin is being used to refer to a procedure with option builtin. See here.

That pretty much almost always means a procedure with little or no interpreted code in its body -- the code for it is written in C and compiled into the Maple kernel interpreter.

Sometimes routines get moved from interpreted Library code to precompiled kernel code, to get better speed. And on rare occasions routines can sometimes even get moved from kernel out to the interpreted Library -- usually to allow intermediate garbage collection (if otherwise problematic in memory use).

@kegj If you are using 2D Math input mode (the default for new users) then print (A) with a space before the bracket will get interpreted as the implicit multiplication print*A rather than a call to the print command.

@Preben Alsholm I have submitted a bug report.

What you've shown in your followup Matlab figure is sometimes referred to as a colorbar.

I have some code for Maple 2015 which automates construction and placement of such a thing, from a given 3D plot with HUE/HSV coloring.

But I wish to make some adjustments to make it more robust (to handle both GRID and MESH structures for 3D plots) and customizable (orientation, position, tickmarks), and am also still fiddling with the automated sizing. I hope to be able to branch off a post for it towards the middle of next week.

acer

 

With some triple bypass surgery the procedure InlinedNewton below gets the bodies of the procedures prccons, prcconsJ, and s3 inlined into it. And it is then singly Compiled.

 

The singly Compiled InlinedNewtonc runs somewhat faster than did the previous Compiled `Newtonc` (which called compiled versions of those three other procedures) on a single iteration loop from a single initial point, with tolerance target 2e-6 and problem size N=50. And `Newtonc` was faster than uncompiled `Newton` even when that too called the three compiled worker procedures.

 

It's appropriate now to find examples which need multiple initial points in order to converge.

 

There are also the "extra parameters" yet to introduce, which must be done somewhat carefully as some of the ToInert/FromInert inlining is hand-edited.


restart:

Digits := 15:

N := 50;

50

f := array([seq(evalf(-2*x[i]+x[i]^2+0.99/i^2),i=1..N)]):

CodeTools:-Usage(
  fsolve({seq(f[i],i=1..N)},{seq(x[i]=0.5,i=1..N)})  ):
type(eval(%,1),set(name=numeric));

memory used=62.57MiB, alloc change=96.00MiB, cpu time=1.98s, real time=1.96s, gc time=62.40ms

true

# s3 is not recreated for each new problem, so does not contribute
# to problem specific timing. Even compilation to s3c would be done
# at most once, upon package load say.
#
s3 := proc(n::posint,
           A::Array( datatype = float[8] ),
           ip::Array( datatype = integer[4] ),
           b::Array( datatype = float[8] ) )
   local i::integer, j::integer, k::integer, m::integer, t::float;
      ip[n] := 1;
      for k to n-1 do
        m := k;
        for i from k+1 to n do
          if abs(A[m,k]) < abs(A[i,k]) then
            m := i
           end if
         end do;
        ip[k] := m;
        if m <> k then
          ip[n] := -ip[n]
         end if;
       t := A[m,k];
       A[m,k] := A[k,k];
       A[k,k] := t;
       if t = 0 then
         ip[n] := 0;
         #return ip
         end if;
       for i from k+1 to n do
         A[i,k] := -A[i,k]/t
         end do;
       for j from k+1 to n do
         t := A[m,j];
         A[m,j] := A[k,j];
         A[k,j] := t;
         if t <> 0 then
           for i from k+1 to n do
             A[i,j] := A[i,j]+A[i,k]*t
             end do
           end if
         end do
       end do;
     if A[n,n] = 0 then
       ip[n] := 0
       end if;
     #ip[n]
      if ip[n] = 0 then
        error "The matrix A is numerically singular"
       end if;
      if 1 < n then
        for k to n-1 do
          m := ip[k];
          t := b[m];
          b[m] := b[k];
          b[k] := t;
          for i from k+1 to n do
           b[i] := b[i]+A[i,k]*t
           end do
         end do;
       for k from n by -1 to 2 do
         b[k] := b[k]/A[k,k];
         t := -b[k];
        for i to k-1 do
           b[i] := b[i]+A[i,k]*t
           end do
         end do
       end if;
     b[1] := b[1]/A[1,1];
     #return 0.0; # Better handling of explicit returns needed when inlining
end proc:

# Newtondummy is not recreated for each new problem. But its
# instantiation is done for each problem, and that is accounted
# for in the base timing below.
#
Newtondummy := proc(x::Array(datatype=float[8]),
               f0::Array(datatype=float[8]),
               j0::Array(datatype=float[8]),
               N::integer,
               ip::Array(datatype=integer[4]),
               maxiter::integer[4],
               tol::float[8])
   local ferr::float[8], temp::float[8],
         i::integer, k::integer, iter::integer;
      ferr := 10.0*tol;
      iter := 0;
      while ferr > tol and iter < maxiter do
        _DUMMY1; # f3(x, f0);
        _DUMMY2; # J3(x, j0);
        j0[1,1] := j0[1,1]; # else j0 not recognized as 2-dim?
        _DUMMY3; # s3c(N, j0, ip, f0);
        for i from 1 to N do
           x[i] := x[i] - f0[i];
        end do;
        ferr := abs(f0[1]);
        for i from 2 to N do
           temp := abs(f0[i]);
           if temp > ferr then
              ferr := temp;
           end if;
        end do;
        ferr := ferr/N;
        iter := iter + 1;
     end do:
     return ferr;
end proc:

# The following are all the problem specific assembly and compilation
# tasks, timed together. (They could be part of an appliable module,
# eventually).
#
(st,str) := time(),time[real]():

eqsA := [seq(F[i]=f[i],i=1..N)]:
irform2 := StatSeq(seq(Assign(op(i)),i=eqsA)):
prccons:= codegen[intrep2maple](Proc(Parameters(x::Array,F::Array),irform2)):
eqsJ := remove(type,[seq(seq(Jac[i,j]=evalf(diff(f[i],x[j])),j=1..N),i=1..N)],name=0.0):
irformJ := StatSeq(seq(Assign(op(i)),i=eqsJ)):
prcconsJ:= codegen[intrep2maple](Proc(Parameters(x::Array,Jac::Array),irformJ)):

(basetime,basetimer) := time()-st,time[real]()-str;

0.16e-1, 0.18e-1

# I need to document what I've done here.
# (I ought to consider writing a more general purpose "function-call Inliner".)
#
(st,str) := time(),time[real]():

UNewtondummy := ToInert(eval(Newtondummy)):
UNdlocals := [op([2,..],UNewtondummy)]:
Uprccons := ToInert(eval(prccons)):
UprcconsJ := ToInert(eval(prcconsJ)):
Us3 := ToInert(eval(s3)):
Us3locals := [op([2,..],Us3)]:
newlocals := [seq(`if`(t::'specfunc(anything,_Inert_DCOLON)',
                  _Inert_DCOLON(_Inert_NAME(cat(op([1,1],t),"_n1")),op([2..],t)),NULL),
                  t=Us3locals)]:
Ls3 := [seq(_Inert_LOCAL(i)=_Inert_LOCAL(i+nops(op(2,UNewtondummy))),i=1..nops(Us3locals))]:
T1 := subs(_Inert_NAME("_DUMMY1")=op([5,..],Uprccons), # param sequence 1st 2nd entries match
           _Inert_NAME("_DUMMY2")=op(subs([_Inert_PARAM(2)=_Inert_PARAM(3)],op([5],UprcconsJ))),
           _Inert_NAME("_DUMMY3")=op(subs([_Inert_PARAM(1)=_Inert_PARAM(4),
                                           _Inert_PARAM(2)=_Inert_PARAM(3),
                                           _Inert_PARAM(3)=_Inert_PARAM(5),
                                           _Inert_PARAM(4)=_Inert_PARAM(2),
                                           op(Ls3)],op([5],Us3))),
           subsop(2=_Inert_LOCALSEQ(op([2,..],UNewtondummy),op(newlocals)),UNewtondummy)):

# This is like `Newton`, but with the code from prccons,prcconsJ,s3 inlined.
InlinedNewton := FromInert(T1):

InlinedNewtonc := Compiler:-Compile(InlinedNewton):

(fusedcompiletime,fusedcompiletimer) := time()-st,time[real]()-str;

1.170, 1.159

# And now, essentially what was done previously. Ie, all of J3, f3, and s3c are
# Compiled. And so is `Newton`, except now it is accessing those three lexically
# while before they were explicitly declared global in `Newton`.
#
(st,str) := time(),time[real]():

f3 := Compiler:-Compile(prccons):
J3 := Compiler:-Compile(prcconsJ):
s3c := Compiler:-Compile(s3):
Newton := subs([_DUMMY1='f3(x, f0)',
                _DUMMY2='J3(x, j0)',
                _DUMMY3='s3c(N, j0, ip, f0)'],eval(Newtondummy)):
Newtonc := Compiler:-Compile(Newton):

(separatedcompiletime,separatedcompiletimer) := time()-st,time[real]()-str;

Warning, the function names {J3, f3, s3c} are not recognized in the target language

1.201, 1.157

# These need only be created for the first problem, and could be
# efficiently grown (but not replaced) for new problems with larger size N.
#
x0 := Array(1..N,datatype=float[8]):
f0 := Array(1..N,datatype=float[8]):
j0 := Array(1..N,1..N,datatype=float[8]):
ip := Array(1..N,datatype=integer[4]):

repeats := 1000:

ArrayTools:-Fill(0.0,f0):
ArrayTools:-Fill(1,ip):
(t11,t11r) := time(),time[real]():
for ii from 1 to repeats do
  ArrayTools:-Fill(0.5,x0):
  ArrayTools:-Fill(0.0,j0):
  InlinedNewtonc(x0,f0,j0,N,ip,100,HFloat(2e-6)); # singly compiled
end do:
(tottime,totrtime) := time()-t11,time[real]()-t11r:
evalf[4](tottime/repeats)*'seconds',evalf[4](totrtime/repeats)*'seconds[real]';

ArrayTools:-Fill(0.5,x0):
ArrayTools:-Fill(0.0,j0):
InlinedNewtonc(x0,f0,j0,N,ip,100,HFloat(2e-6)); # singly compiled

0.1092e-2*seconds, 0.1096e-2*seconds[real]

0.927128008641316376e-8

ArrayTools:-Fill(0.0,f0):
ArrayTools:-Fill(1,ip):
(t11,t11r) := time(),time[real]():
for ii from 1 to repeats do
  ArrayTools:-Fill(0.5,x0):
  ArrayTools:-Fill(0.0,j0):
  Newtonc(x0,f0,j0,N,ip,100,HFloat(2e-6)); # separetly compiled
end do:
(tottime,totrtime) := time()-t11,time[real]()-t11r:
evalf[4](tottime/repeats)*'seconds',evalf[4](totrtime/repeats)*'seconds[real]';

ArrayTools:-Fill(0.5,x0):
ArrayTools:-Fill(0.0,j0):
Newtonc(x0,f0,j0,N,ip,100,HFloat(2e-6)); # separately compiled

0.1529e-2*seconds, 0.1538e-2*seconds[real]

0.927128008641316376e-8

 


Download newtoncompiled5.mw

 

First 334 335 336 337 338 339 340 Last Page 336 of 593