Active CCI list - as of 10/23/95
-----------------
4 new or pending items for Group C.
*lots* of pending items for Group D, because I don't have record of how things
have been formally resolved, except for those few items reported in full
meeting.
1 new item for Group E, clear at the end.
-mary-
===========================
CCI #18  Defined assignment in FORALL              Group C
Submitted by: Henry Zongaro
Submitted: 05/04/95   Updated: 10/23/95
Status: in progress

Maybe resolved. CHK will provide clarification in the HPF 2 document. It will
be a change to the "sequentialization" of the FORALL to account for defined
assignments. BUT Henry suggests an alternative based on new X3J3 action.

Original question:

I was wondering whether there's not a problem with allowing defined assignment
to appear within a FORALL. Consider the following example.

      module mod
        integer :: a(3) = (/1,2,3/)
      contains
        pure subroutine def_assign(lhs, rhs)
          integer, intent(inout) :: lhs
          character, intent(in) :: rhs
          lhs = a(ichar(rhs)+1)
        end subroutine def_assign
      end module mod

      program p
        use mod
        interface assignment(=)
          module procedure def_assign
        end interface
        forall (i = 1:2) a(i) = char(i)   ! Sneaky way of passing
                                          ! "i" to def_assign
      end program p

The rules of FORALL specify that the right-hand side and the indices of the
left-hand side are evaluated, in any order, prior to assignment, which also
takes place in any order. In the above example, we have

      a(1) = char(1)
      a(2) = char(2)

as the two defined assignments which take place. Inside of def_assign, there's
a host-associated reference to a, so what ends up happening is the following:

      a(1) = a(2)
      a(2) = a(3)

The order in which these assignments occur affects the result. The value of a
after the FORALL statement is executed could be (/2,3,3/) or (/3,3,3/).
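The nondeterminism in this example can be checked with a small simulation. This
is Python, not HPF; forall_result is a made-up helper that stands in for one
sequentialization of the FORALL, with the array held 0-based internally:

```python
def forall_result(order):
    """Simulate FORALL (i = 1:2) a(i) = char(i) with the defined assignment
    lhs = a(ichar(rhs)+1), i.e. a(i) = a(i+1), executed in the given order."""
    a = [1, 2, 3]              # module variable a = (/1,2,3/); a[0] is a(1)
    for i in order:
        a[i - 1] = a[i]        # assignment body reads the host-associated a
    return a

assert forall_result([1, 2]) == [2, 3, 3]   # a(1)=a(2) first, then a(2)=a(3)
assert forall_result([2, 1]) == [3, 3, 3]   # a(2)=a(3) first, then a(1)=a(2)
```

Both orders are permitted by the FORALL rules, which is exactly the problem.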
Basically, the problem is that in defined assignment, completely evaluating the
right-hand side for all active combinations does not necessarily let the
compiler precompute everything which might also appear on the left-hand side.

Thanks, Henry

DISCUSSION AT MEETING: Henry and Jerry to circulate proposed wording for this
definition. No action from the July meeting was recorded ... was this resolved
or not?
-----
Rob's recollection: Resolved. CHK will provide clarification in the HPF 2
document. It will be a change to the "sequentialization" of the FORALL to
account for defined assignments.
-----
New note from Henry: X3J3 is taking a different approach on CCI 18. They're
trying to prohibit references, in the procedure that defines the assignment, to
the variable that appears on the left-hand side of the defined assignment. I
believe WG5 will decide on this in their November meeting. We might want to
pick up whatever they decide. One advantage of the restriction is that no
extra compiler mechanisms are required for this somewhat obscure case.
=====================================
=====================================
CCI #36  Independent and remapping                 Group C
Larry Meadows
Submitted: 10/7/95
Status: new

Question:

In Section 4.4 of the standard, one of the conditions on INDEPENDENT is that
realignment and redistribution cannot occur, since they may change the
processor storing a particular array element. I would argue that the same
reasoning applies to remapping of arguments in subroutines called inside of
INDEPENDENT DO loops. For example:

!hpf$ distribute a(block)
!hpf$ independent
      do i = 1,n
        call sub(a)
      enddo

      subroutine sub(a)
      real a(:)
!hpf$ distribute a(cyclic)
      ...
      return
      end

From an implementation point of view, remapping of arguments is a collective
operation, just as is realignment or redistribution, so it is difficult to
implement inside INDEPENDENT DO loops.
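To see why remapping inside the loop amounts to a collective operation,
consider how many elements change owners when a BLOCK array is remapped
CYCLIC. This is a Python sketch, not HPF; block_owner and cyclic_owner are
hypothetical helpers implementing the simplest even layouts:

```python
def block_owner(i, n, p):
    """Proc owning element i (1-based) of an n-element BLOCK array on p procs."""
    return (i - 1) // (-(-n // p))     # block size = ceil(n/p)

def cyclic_owner(i, n, p):
    """Proc owning element i under the corresponding CYCLIC distribution."""
    return (i - 1) % p

# Remapping 16 elements from BLOCK to CYCLIC on 4 processors moves 12 of them;
# every processor must both send and receive, i.e. collective data motion.
moved = [i for i in range(1, 17) if block_owner(i, 16, 4) != cyclic_owner(i, 16, 4)]
assert len(moved) == 12
```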
Couple of other points:
- Would be nice to have some examples in 4.4.1 where subroutines were called,
  legally and illegally.
(see CCI #37 for other point)
Thanks. lfm
=====================================
=====================================
CCI #37  Status of locals in procedures called inside INDEPENDENT DO   Group C
Larry Meadows
Submitted: 10/7/95
Status: new

Question:
- I seem to recall that we decided that local variables of subroutines called
inside INDEPENDENT DO loops were automatically NEW (and I assume that this
doesn't apply to SAVE or COMMON variables). Is this documented anywhere?
Thanks. lfm
=====================================
=====================================
CCI #29  Calling hpf_local from independent loop   Group C
Rob Schreiber
Submitted: 8/3/95   Updated: 10/23/95
Status: in progress

QUESTION:

I have two CCI questions.

Question I. Can an extrinsic(hpf_local) be invoked in an independent loop? In
a FORALL? Ex:

      forall (i = 1:10) a(i) = f(i,a(i))

Note that part of the calling sequence, as specified in Ver 1.1, Appendix A,
is "The processors are synchronized. In other words, all actions that
logically precede the call are completed." It seems clear that when this was
written it was tacitly assumed that the call did not occur in an independent
loop or FORALL.

Part 2: May any other kind of extrinsic be called in a FORALL or independent
loop?
================
Discussion begins here:

Summary provided by Rob S.
Status: Under discussion.

Summary of the discussion in September:

1. Only an HPF-type routine can be PURE. Thus, only an HPF_LOCAL routine could
be local, pure, and hence called in a FORALL.

2. There is no semantic problem with this, or with invocation of any extrinsic
routine in an independent loop.

3.
An example:

      real x(N, 100000)
!hpf$ distribute x(*, block)
!hpf$ independent
      do i = 1, N
        call extrinsic_fn(x(i,:))
      enddo

The independent loop is a mechanism for spawning N threads per processor, each
independent of the others; the load per thread may be variable. It would
possibly be useful here to NOT synchronize after each call to the extrinsic
routine! Is there any semantic reason to force the synchronization?

4. There are some very important issues in the implementation, with possible
language impacts. Let us assume an MPI-based implementation of HPF calling an
extrinsic local routine that uses MPI for communication. Because of the unity
of purpose and of hotel between HPF and MPI, it is arguably necessary for HPFF
to make this work cleanly and efficiently.

Issue MPI-1: Does HPF handle MPI_Init and MPI_Finalize automatically?

Issue MPI-2: Are any of the MPI routines PURE?

Issue MPI-3: Thread safety. In a naive implementation, HPF does a barrier
before and after the call to the extrinsic; but there is no guarantee that
there are no outstanding, nonreceived messages in the messaging system. Thus,
to be safe, any extrinsic routine should use its own communicator. To prevent
interference between separate calls to the routine, a new communicator should
be created for every call. An obvious way to do this is to call
MPI_Comm_dup(MPI_COMM_WORLD, New_comm) at the beginning of every such
extrinsic. However, an extrinsic that consumes all its messages would be
justified in doing this once, on its first invocation, and saving the
communicator for reuse on later invocations. But consider what happens if the
extrinsic is called in an independent DO loop as in the example above, and
there is no barrier used. Now we really need a separate communicator per
thread. On the other hand, a call to MPI_Comm_dup is a collective call, which
synchronizes the processes.
Perhaps this should be done by the calling HPF routine, so that the
MPI_COMM_WORLD communicator is different on every call.

Issue MPI-4: If called from the range of an ON_HOME directive, what set of
processors does MPI_COMM_WORLD correspond to? If it corresponds to the subset
executing the ON block, then how can the called routine access nonresident
data? Should there be a way to access a communicator that corresponds to these
executing processors, while MPI_COMM_WORLD always corresponds to all of the
processors?

Issue MPI-5: If called from separate ON_HOME blocks in the scope of a TASK
directive, with disjoint processor groups, so that the two ON blocks may be
executed concurrently, what communicators correspond to the two processor
groups? (If, in issue 4 above, the answer is that MPI_COMM_WORLD corresponds
to the executing subset of the processors, then the answer here is
MPI_COMM_WORLD.)

=========== comments from Jim Cownie =================

> Issue MPI-1: Does HPF handle MPI_Init and MPI_Finalize automatically?

I would say that the HPF run-time should have called MPI_Init before any user
code has run; therefore user extrinsic functions which need MPI can just use
it. This is actually not a big deal, since the user routine can always use
MPI_Initialized() to guard her call to MPI_Init. (Though if this is done, then
the HPF run-time needs to do the same, since MPI_Init should only be called
once.) That's why it's simpler to say that MPI_Init has already been called
before any user HPF code has run.

> Issue MPI-2: Are any of the MPI routines PURE?

Probably. For instance, one could cast reductions as functions which return
the result and only read the arguments (though why you'd want to use an MPI
reduction extrinsically rather than an HPF one is beyond me).

... after all 5 MPI issues ...

I would suggest that in all of these cases the HPF run-time should provide a
"current communicator" which includes the set of processes running the current
construct.
In some cases this will be MPI_COMM_WORLD (or a Comm_dup of MPI_COMM_WORLD);
in others (ON_HOME, task parallelism, processor subsets) it will represent a
subset of the available processes. In MPI, MPI_COMM_WORLD is always available
as the set of all processes (until MPI-2 introduces a dynamic process model,
though that shouldn't worry HPF implementations). Therefore I think:

1) MPI_COMM_WORLD is *always* the set of all processes. (This is the current
   MPI view.)
2) If you need subsets, then you should create new communicators and provide a
   way for the user code to access them.

MPI_COMM_WORLD should mean the same thing in a routine called from an HPF
extrinsic as it did in a "raw" MPI program. The HPF extrinsic MPI environment
should contain additions to the raw MPI environment (new communicators, maybe
pre-defined datatypes giving array distributions, etc.), but should not change
the meaning of things in the raw MPI world. In other words, you may need to
learn more to work in the HPF extrinsic environment, but you shouldn't have to
unlearn things you already knew about MPI.
-- Jim
=====================================
GROUP D ACTIVE ITEMS
=====================================
CCI #11  Pointer with sequence                     Group D
Henry Zongaro
Submitted: 02/16/95   Updated: 10/03/95
Status: in progress

Question:

Hello,

Things have been quiet here lately, so I thought I'd send a few questions that
I've been hoarding. All page and line references are relative to the 1.0 HPF
Language Spec. I'd like to hear other people's opinions on these, especially
the 2nd and 3rd items.

1) The response to CCI item 6.3 indicated that variables with the POINTER
attribute must be distributed or aligned. A related question - can they be
given the SEQUENCE attribute? Or can a pointer be associated with both
sequential and nonsequential targets?
Discussion:

Meeting minutes: needs more research
========
From July meeting - subgroup proposal: conforming dimensions of the pointer
object and target either must be (unmapped or both identically distributed) or
both must be sequential. The issue is ... does the pointer exist as an
instance by itself, or is it just associated with its bound arg? There is a
case of pointers to sections of arrays, so just knowing that a pointer points
to a block-distributed thing, one can't talk about pointers to sections. Andy
Meltzer recalls that was related to the fact that allocatable objects don't
have their distribution until after they are assigned. A straw poll was taken
with the special understanding that a substantial vote for "abstain" would
mean reconsider. The vote was 7-3-10, so this CCI item is returned to
committee for further clarification of the issue.
=========
From September meeting minutes:
CCI 11: Pointers cannot be mapped. Can they be associated with both sequential
and nonsequential targets? Subgroup recommends that pointers cannot point to
sequential variables. Full group didn't think this was right and asked the
subgroup to try again.
=====================================
=====================================
CCI #32  Changing distribution of SAVE array       Group D
Henry Zongaro
Submitted: 8/31/95   Updated: 9/18/95
Status: in progress

QUESTION:

Hello,

Page 41, lines 21-34 of the HPF 1.1 document specifies that an array or
template must not be distributed on a processor arrangement at the time the
arrangement becomes undefined, unless the array or template also becomes
undefined or the processor arrangement always has identical bounds.
Presumably this was done so that objects with the SAVE attribute would not
change mappings from one call to the next. Is this rule sufficiently strict?
Consider the following:

      program p
        call sub(5)
        call sub(10)
      end program p

      subroutine sub(n)
        integer, save :: a(10)
!hpf$   processors proc(2)
!hpf$   distribute a(block(n)) onto proc
      end subroutine sub

In the first call to sub, a is distributed block(5); in the second call, it is
distributed block(10). This is currently permitted because the bounds of the
processor arrangement have not changed. More complicated examples could be
drawn involving ALIGN. Similar text appears on page 44, line 43 - page 45,
line 3 for templates.

DISCUSSION

Comments from Rob 9/1/95:

We should tighten the language to prevent this. It's a way to remap, which
should be done only if the object remapped has the DYNAMIC attribute, and
only via executable directives. I believe this applies to objects in modules,
for example:

      module mod
        real a(10)
      end module mod

      subroutine sub(n)
        use mod
!hpf$   processors procs(2)
!hpf$   distribute a(block(n)) onto procs
      end subroutine sub

This should be proscribed; if REDISTRIBUTE is used and if A is DYNAMIC,
however, I think it's legal. Right?

No action reported from Sept. meeting.
=====================================
=====================================
CCI #34  Alignment of a single dimension           Group D
Adriaan Joubert
Submitted: 9/06/95   Updated: 9/18/95
Status: in progress

Question:

Hello,

I am trying to align one dimension of two 2-dimensional arrays, and cannot
find a way of expressing this in HPF. The problem is the following:

      PROGRAM MAIN
        REAL, ALLOCATABLE :: A(:,:), Res(:,:)
!HPF$   DISTRIBUTE A(BLOCK,*)
!HPF$   DISTRIBUTE Res(BLOCK,*)
        ...
        ALLOCATE(A(N,M))
        ALLOCATE(Res(N,M*2))
        Res = SUB (A)
        ...
      CONTAINS
        FUNCTION SUB (A) RESULT(B)
          REAL, INTENT(in) :: A(:,:)
!HPF$     DISTRIBUTE *(BLOCK,*) :: A
          REAL :: B(SIZE(A,1),SIZE(A,2)*2)
!HPF$     DISTRIBUTE (BLOCK,*) :: B
          ...
        END FUNCTION SUB
      END PROGRAM MAIN

So the 1st dimension of B and A in the subroutine will be distributed in the
same way.
It seems however that compilers can generate faster code if they know how
arrays are aligned with one another. Among other things, the compiler would
have to know that B is exactly aligned with Res. In other words, I would like
to add to the main program something like

!HPF$ TEMPLATE :: MyTemp(N,M*2)
!HPF$ DISTRIBUTE MyTemp(BLOCK,*)
!HPF$ ALIGN WITH MyTemp(:,I) :: A(:,I)
!HPF$ ALIGN WITH MyTemp :: Res

and in the subroutine

!HPF$ TEMPLATE :: MyTemp(SIZE(A,1),SIZE(A,2)*2)
!HPF$ DISTRIBUTE MyTemp(BLOCK,*)
!HPF$ ALIGN WITH *MyTemp(:,I) :: A(:,I)
!HPF$ ALIGN WITH MyTemp :: B

and then the compiler should be able to figure out that everything is nicely
aligned. But I cannot do this, as I do not know N and M at compile time. In
this case I could probably get away with the definitions

!HPF$ ALIGN WITH A(:,*) :: Res(:,*)

in the main program and

!HPF$ ALIGN WITH A(:,*) :: B(:,*)

in the subroutine. The replication of the second dimension would not matter,
as it is all on the same processor. But if I have a (BLOCK,BLOCK) distribution
for both arrays, and I still want to ensure that all elements in the first
section of every column are on the same row of processors, i.e.

      REAL A(20,20), B(20,40)
      P1: A(1:10,1:10)     P2: A(1:10,11:20)
          B(1:10,1:20)         B(1:10,21:40)
      P3: A(11:20,1:10)    P4: A(11:20,11:20)
          B(11:20,1:20)        B(11:20,21:40)

there seems to be no way of telling the compiler about this with a descriptive
statement. Well, what about

!HPF$ ALIGN WITH A(:,I) :: B(:,I*2-1)
!HPF$ ALIGN WITH A(:,I) :: B(:,I*2)

But is this legal? And this can be harder to do if the second dimension is
(SIZE(A,2)-1)*SIZE(A,2)/2, as in my case. I'd appreciate any help on this one.
Understanding the distribution directives seems to get harder the more you
know, instead of easier ;-(

Adriaan

========= Discussion =========

Rob comments 9/6/95:

I see you want allocatable templates, a hole in HPF that we know about. But in
your case, no template is needed. There is no reason to specify replication in
your alignment.
You can use

!HPF$ ALIGN A(I,J) WITH RES(I,J)

and drop the DISTRIBUTE directive for A in the main program. In the function,
you can use a DISTRIBUTE on B and a descriptive alignment of A to B.
.....
..."But is this legal? And this can be harder to do if the second dimension is
(SIZE(A,2)-1)*SIZE(A,2)/2, as in my case." ...

No, it's not legal. ALIGN is not a symmetric relation, and the alignment map
cannot be many-to-one unless it collapses a dimension; at least that's my
understanding. In case this is not clear, your statement is equivalent to

!HPF$ ALIGN B(:,2*I-1) WITH A(:,I)

which only explicitly aligns the odd-numbered columns of B! But why align the
bigger of the two arrays (B) with the smaller (A)? My version of SUB would be:

      FUNCTION SUB (A) RESULT(B)
        REAL, INTENT(in) :: A(:,:)
        REAL :: B(SIZE(A,1),SIZE(A,2)*2)
!HPF$   ALIGN *(I,J) WITH B(I,J) :: A
!HPF$   DISTRIBUTE B (BLOCK,*)

...."I'd appreciate any help on this one. Understanding the distribution
directives seems to get harder the more you know, instead of easier ;-( "...

Yup. -- Rob
=======
From Adam M. 9/5/95:

As I understand it, Adriaan Joubert wants to tell HPF to set the data up as
follows:

      REAL A(20,20), B(20,40)
      P1: A(1:10,1:10)     P2: A(1:10,11:20)
          B(1:10,1:20)         B(1:10,21:40)
      P3: A(11:20,1:10)    P4: A(11:20,11:20)
          B(11:20,1:20)        B(11:20,21:40)

I would say (like Rob Schreiber) that you include, in MAIN, the line

!HPF$ ALIGN A(I,J) WITH B(I,J*2-1)

or equivalently

!HPF$ ALIGN A(:,J) WITH B(:,J*2-1)

PLUS (distribute them together)

!HPF$ DISTRIBUTE B (BLOCK,BLOCK)

And I would claim that the procedure should be coded as follows:

      FUNCTION SUB (A) RESULT(B)
        REAL, INTENT(in) :: A(:,:)
        REAL :: B(SIZE(A,1),SIZE(A,2)*2)
!HPF$   ALIGN *(I,J) WITH B(I,J*2-1) :: A
!HPF$   DISTRIBUTE B *(BLOCK,BLOCK)

If I am wrong - why? - Adam

No action reported from Sept. meeting.
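Rob's observation that the directive reaches only the odd-numbered columns of
B can be confirmed by enumerating the mapped columns. A small Python sketch
(the set arithmetic is the point here, not HPF semantics):

```python
# Columns of B(20,40) reached by ALIGN B(:,2*I-1) WITH A(:,I) as I runs 1..20.
mapped = {2 * i - 1 for i in range(1, 21)}

assert mapped == set(range(1, 40, 2))   # exactly the odd-numbered columns
assert len(mapped) == 20                # the 20 even-numbered columns of B
                                        # get no explicit alignment from it
```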
=====================================
=====================================
I believe that the following Group D items were resolved in subcommittee -
without bringing the issues to full committee. But I have no record of
official action ... so they are still here as open items. PLEASE someone help
close these out. Many of them have answers that were proposed via email ...
but these answers need to be OK'd by subgroup.
=====================================
=====================================
CCI #10  Permutations in HPF                       Group D
Kenth Engo
Submitted: 01/11/95   Updated: 07/12/95
Status: in progress

Question:

I have a general question about the way HPF deals with permutations of data on
the different parallel architectures. In many applications in MIMD and SIMD
computations today, one often encounters the need to just perform a
permutation of the data distributed on the parallel computer, i.e. a
one-to-one mapping of the data set onto itself. It is then very important that
the routing of the data is done in such a fashion that traffic contentions are
eliminated in the interconnecting network. One does not want a situation where
10 processors are communicating data to the very same processor. This would
make all but one processor idle, since no more than one processor is allowed
to communicate with the destination processor at the same time. What I have in
mind is exemplified by the following HPF code:

C     8 PROCESSORS AND AN ARRAY OF 32 ELEMENTS
C
!HPF$ PROCESSORS SEDECIM(8)
      REAL CENTURY(32)
C
C     THE ARRAY IS DISTRIBUTED BY BLOCK.
C
!HPF$ DISTRIBUTE CENTURY(BLOCK) ONTO SEDECIM
C
C     THE ELEMENTS ARE DISTRIBUTED WITH ELEMENTS 1,2,3,4 ON PROC
C     #1, ELEMENTS 5,6,7,8 ON PROC. 2 AND SO ON.

Suppose one wants to redistribute the elements during execution to the
following arrangement:

C     DATA REDISTRIBUTED CYCLIC ON THE PROCESSORS
C
!HPF$ REDISTRIBUTE CENTURY(CYCLIC) ONTO SEDECIM
C
C     THE ELEMENTS ARE NOW REDISTRIBUTED WITH ELEMENTS 1,9,17,25 ON
C     PROC #1, ELEMENTS 2,10,18,26 ON PROC.
C     2 AND SO ON.

During this redistribution a permutation of the data set is performed. How is
this permutation implemented, and what is actually done? Is there a
strategy/theory for generally doing this permutation optimally, that is,
without any traffic contentions in the network? I will be grateful if someone
could answer this email, and possibly send or give me references to literature
or people where I can find out more about how HPF implements the permutations.

Best regards, Kenth Engo

Discussion:

Notes from May meeting: CCI #10 - not a CCI but a request for implementation
practice.
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 - still no action report
=====================================
=====================================
CCI #17  Defaults for distribution                 Group D
A.C. Marshall
Submitted: 04/18/95   Updated: 07/11/95
Status: in progress

Questions:

Forgive me for being dim (and only having v1.0 of the draft standard) but...
Looking at the syntax rules for DISTRIBUTE (p24 & 26), it would appear to me
that:

!HPF$ DISTRIBUTE A(BLOCK)        !H303/5/8
!HPF$ DISTRIBUTE ONTO P :: A     !H301/2/6/10

are valid but that

!HPF$ DISTRIBUTE A
!HPF$ DISTRIBUTE :: A

are not. Is this just me, or is this how things are supposed to be? And if so,
why is it not possible to use default distribution and processor grid in the
same statement? After all,

!HPF$ PROCESSORS P(NUMBER_OF_PROCESSORS())
!HPF$ DISTRIBUTE ONTO P :: A

is valid and has the same effect.

Adam Marshall

Discussion:

Notes from May meeting: needs more research
----
Scott Baden and Chuck Koelbel reply 07/07/95:

This is the way things are "supposed to be". Consult the relevant text (page
30, lines 20-21, v. 1.1) ...
"To prevent syntactic ambiguity, the dist-format-clause must be present in the
statement form [of a distribute spec]."

Chuck Koelbel adds that the "syntactic ambiguity" referred to here is due to
the problem of non-significant blanks in Fortran. Consider

> !HPF$ DISTRIBUTE PRONTO ONTO LOGY
> !HPF$ DISTRIBUTE PR ONTO ONTOLOGY

or the following example:

> !HPF$ ALIGN TWITHEADS WITH A
> !HPF$ ALIGN T WITH EADSWITHA

Disallowed for the same reason. Chuck also continues:

> It's not clear that his example has "the same effect". For example, consider
>
> !HPF$ PROCESSORS P(NUMBER_OF_PROCESSORS())
> !HPF$ PROCESSORS Q(4,NUMBER_OF_PROCESSORS()/4)
> !HPF$ DISTRIBUTE :: A
>
> Does A have a 1-dimensional or 2-dimensional distribution? (Yeah, this
> assumes that NUMBER_OF_PROCESSORS() is divisible by 4...)
>
> The reason for requiring at least one of the clauses is, "What information
> are you giving if you leave both out?" In effect,
>
> !HPF$ DISTRIBUTE :: A
>
> would be a no-op, and we didn't think there was a need for that.

Scott Baden
Chuck Koelbel
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 still don't know.
=====================================
=====================================
CCI #8  Dummy assertion asterisk                   Group D
Yasuharu Hayashi
Submitted: 04/25/95   Updated: 7/13/95
Status: in progress

Question:

I have a question about the interpretation of the assertion asterisk when the
template of a dummy argument is a natural template. According to the High
Performance Fortran Language Specification, November 10, 1994, Version 1.1,
p.51, l.31: "If the dummy argument has a natural template (no INHERIT
attribute) then things are more complicated. In certain situations the
programmer is justified in inferring a preexisting distribution for the
natural template ......"
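Returning to CCI #17 for a moment: Chuck's blank-insensitivity examples can be
checked mechanically. Since blanks are not significant here, the two
directives in each pair present the parser with identical character streams (a
Python sketch, stripping blanks to mimic the tokenizer's view):

```python
# Two visually different directives collapse to the same character stream
# once non-significant blanks are removed, hence the ambiguity.
a = "!HPF$ DISTRIBUTE PRONTO ONTO LOGY".replace(" ", "")
b = "!HPF$ DISTRIBUTE PR ONTO ONTOLOGY".replace(" ", "")
assert a == b

c = "!HPF$ ALIGN TWITHEADS WITH A".replace(" ", "")
d = "!HPF$ ALIGN T WITH EADSWITHA".replace(" ", "")
assert c == d
```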
When an actual argument is a whole array, the text on p.51, l.35 states only:
"In all these situations, the actual argument must be a whole array or array
section, and the template of the actual must be coextensive with the array
along any axis having a distribution format other than "*". If the actual
argument is a whole array, then the pre-existing distribution of the natural
template of the dummy is identical to that of the actual argument."

I think this description is ambiguous. For example:

      PROGRAM EX
        REAL A(10,10), B(5,5)
!HPF$   PROCESSORS P(5)
!HPF$   DISTRIBUTE A(BLOCK,*) ONTO P
!HPF$   ALIGN B(*,I) WITH A(2*I,*)
        CALL SUB(B)
        :
      END

      SUBROUTINE SUB(BB)
        REAL BB(5,5)
!HPF$   PROCESSORS P(5)
!HPF$   DISTRIBUTE *(*,BLOCK(1)) ONTO *P :: BB
        :
      END

Are the assertion asterisks for BB in SUB HPF-conforming? Isn't it necessary
to add a list such as the following, which shows what assertion asterisks for
a natural template are legal when an actual argument is a whole array?

"1. If the nth axis of an actual argument which is a whole array corresponds
to the T(n)th axis of its template and j > i, T(j) must be larger than T(i).
2. If the situation is not described below, no assertion about the
distribution of the natural template of a dummy is HPF-conforming.
(a) If the alignment of the actual array axis with its template is collapsed,
then * should appear in the distribution for the corresponding axis of the
natural template of the dummy.
(b) If the actual array is aligned with the axis of its template by
replication (or "replication-triplet") and that template axis is distributed
*, then no entry should appear in the distribution for the natural template of
the dummy.
(c) If the actual array is aligned with the axis of its template by int-expr
and that template axis is distributed *, then no entry should appear in the
distribution for the natural template of the dummy.
(d) If the alignment of the actual array axis with the axis of its template is
a subscript triplet l:u:s and that axis of its template is distributed *, then
* should appear in the distribution for the corresponding axis of the natural
template of the dummy.
(e) If the alignment of the actual array axis with the axis of its template is
a subscript triplet l:u:s, that axis of its template is distributed BLOCK(n),
and LB is the lower bound for that axis of the template, then BLOCK(n/s)
should appear in the distribution for the natural template of the dummy,
provided that s divides n evenly and that l - LB < s.
(f) If the alignment of the actual array axis with the axis of its template is
a subscript triplet l:u:s, that axis of its template is distributed CYCLIC(n),
and LB is the lower bound for that axis of the template, then CYCLIC(n/s)
should appear in the distribution for the natural template of the dummy,
provided that s divides n evenly and that l - LB < s."
(g) If the alignment of the actual array axis with the axis of its template is
a subscript triplet l:u:s, s must be positive.

Or it might be better to forbid the use of any assertion asterisks in a
DISTRIBUTE directive in the case that a dummy argument doesn't have the
INHERIT attribute and the corresponding actual argument isn't ultimately
aligned with itself, since it seems that this solution makes things far
simpler and causes little actual inconvenience (the same effect can also be
achieved by the ALIGN directive).

Discussion:

Rob replies ...

This example is nonconforming because axis 2 of B is NOT coextensive with
(one-to-one and onto mapping to) axis 1 of the template to which B is aligned.

> ...
example from original message

Now let the example be this:

> PROGRAM EX
> REAL A(10,10), B(5,10)
> !HPF$ PROCESSORS P(5)
> !HPF$ DISTRIBUTE A(BLOCK,*) ONTO P
> !HPF$ ALIGN B(*,I) WITH A(I,*)
> CALL SUB(B)
> :
> END
>
> SUBROUTINE SUB(BB)
> REAL BB(5,10)
> !HPF$ PROCESSORS P(5)
> !HPF$ DISTRIBUTE *(*,BLOCK(2)) ONTO *P :: BB
> :
> END

This example is correct. The replication over the second axis of the template
of the actual is not a problem because that is an axis whose distribution
format is *. B is not coextensive with that axis because it has a one-to-many
association with it, but since the template axis has a * distribution,
coextension is not a requirement. Is this reasonable?

Rob Schreiber
=============
Meeting discussion: Henry and Jerry to circulate proposed wording for this
definition.
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 still don't know
=====================================
=====================================
CCI #19  Mapping function results                  Group D
Henry Zongaro
Submitted: 05/04/95   Updated: 7/13/95
Status: in progress

Question:

I have a couple of questions related to specification of mappings for function
results.

1) Consider the following program fragment:

      program prog
        interface
          function f()
            integer f(100)
!hpf$       processors p(number_of_processors())
!hpf$       distribute f(block) onto p
          end function f
        end interface
        call sub(f())
      end program prog

      subroutine sub(i)
        integer :: i(100)
!hpf$   processors p(number_of_processors())
!hpf$   distribute i *(block) onto *p
      end subroutine sub

Is the above HPF-conforming? Does distribution of a function result variable
affect the distribution of the expression returned? The text on page 53, lines
27-28 indicates that the alignment of an expression is, in general,
unpredictable, except in the case of arrays and array sections, so I believe
the answer to my question is "No". However, this is actually spurred by
another question relating to the SEQUENCE directive.
According to page 151, line 47 of the 1.1 HPF Spec., an can be a . When I
first read this, I thought the explicit reference to was there to include
result variables. Now a co-worker has suggested an alternate interpretation,
and we were wondering which is correct. Her suggestion was that this is trying
to allow something like the following:

      program p
        integer, external :: f
!hpf$   sequence :: f
        i = f()
      end program p

Is this correct? Will this make the result of the function sequential? If so,
that brings up another question:

      program p
        interface
          function f()
            integer :: f(10)
!hpf$       sequence :: f
          end function f
        end interface
        call sub(f())
      end program p

      subroutine sub(a)
        integer a(2, 5)
!hpf$   sequence a
      end subroutine sub

According to page 155, lines 33-35, an array-valued expression cannot be
specified to be sequential; but if specifying f to be a sequential function
makes its result value sequential, this would be a contradiction.

Discussion:

May meeting minutes: needs more research
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 Still don't know.
=====================================
=====================================
CCI #21  Derived type mappings and documentation   Group D
Michael Hennecker
Submitted: 05/11/95   Updated: 7/19/95
Status: in progress

Question:

Hello,

I have some questions regarding data mapping of objects of derived type:

(1) Is it possible to DISTRIBUTE / ALIGN objects of derived type, or are the
data mapping attributes restricted to intrinsic types?

(2) If mapping of objects of derived type is not possible, shouldn't the v1.0
and v1.1 specs for HPF_ALIGNMENT, HPF_DISTRIBUTION and HPF_TEMPLATE

"ALIGNEE may be of any type." (5.7.15, 5.7.16)
"DISTRIBUTEE may be of any type." (5.7.17)

read "may be of any intrinsic type."?

Best regards, Michael

Discussion:

Reply by CHK 5/15/95 ...

It is possible to map objects of derived type.
(It is currently not possible to map components of derived type objects; this
is being discussed in the HPFF 95 meetings.) The second question is moot,
given the first answer. Thanks for asking.

Chuck Koelbel
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 still don't know.
=====================================
=====================================
CCI #22  Implementor note about processor distributions?   Group D
Henry Zongaro
Submitted: 6/15/95   Updated: 7/19/95
Status: in progress

Question:

Hello,

We came across something that didn't seem immediately obvious here, and might
not be immediately obvious to others, so we were wondering whether a note to
users and/or implementers might be justified. On page 30 of the 1.1 Language
Spec., it's stated that if the ONTO clause of a DISTRIBUTE directive is
omitted, an arbitrary processor arrangement is chosen for each distributee. In
some cases, there may be no suitable arrangement; I assume such a program
would not be HPF-conforming. For example:

      program p
        integer :: a(10, 10)
!hpf$   distribute a(block(5), block(5))
      end program p

Here, the processor arrangement created would have to have an extent of at
least two in each dimension (which, by the way, constrains how arbitrary the
selection of a processor arrangement can be), so this program could not run in
an environment in which the number of processors was fewer than four. Does a
note seem worthwhile here, or do others feel such a case is immediately
obvious?

Thanks, Henry
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 still don't know
=====================================
=====================================
CCI #26  Conditional realignment                   Group D
Fabien Coelho
Submitted: 07/01/95   Updated: 07/19/95
Status: in progress

Question:

Hi out there,

Is this kind of thing allowed in HPF?

      ! align A with T
      if (some runtime condition)
      ! realign A with T'
      endif
      ! redistribute T
      ...
After the redistribution, array A's mapping is not known statically; it depends on the runtime condition. I cannot remember anything that may forbid this. I guess it is not very nice for the compiler... should/could it be forbidden? Or am I wrong? Fabien.

DISCUSSION:
Chuck Koelbel replies 7/3/95: Yes, this is allowed. This was the intent of HPF REALIGN and REDISTRIBUTE - to allow the user to make run-time decisions about data mapping. Allowing run-time remapping will indeed require substantial support in the run-time system. We discussed this tradeoff, and the consensus was that users had valid reasons for wanting this capability, therefore it should be in the language. The difficulties with implementation were one reason that REALIGN and REDISTRIBUTE were not put in Subset HPF. In short, you are right that this is legal and hard to implement. You are wrong that it is/should be forbidden. Chuck
======================
NO ACTION FROM JULY MEETING RECORDED. WAS THIS RESOLVED OR NOT?
10/23/95 still don't know
=====================================
=====================================
CCI #30 Collapsing dimensions  Group D
Updated: 9/18/95
Rob Schreiber
Submitted: 8/3/95
Status: in progress
Question II. This is really a question for implementors, as much as a question for language lawyers. Consider this program:

      real a(100,200), b(100)
!hpf$ distribute a(block, *)
!hpf$ distribute b(*)
      forall (row = 1:100) a(row,:) = f(a(row,:))
      ...
      pure function f(x)
      real x(:)
      real f(size(x))
!hpf$ distribute *(*) :: x
!hpf$ align f(:) with x(:)

I cannot find any rule against this. (The issue is whether one may distribute an r-dimensional object with fewer than r instances of BLOCK or CYCLIC(k) in its dist-format list.) What is the mapping of b?
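The "on one processor only" property behind this question can be made concrete. Under a (BLOCK, *) distribution, the owner of a(row, col) depends only on row, so each row section a(row,:) lives entirely on one processor. This Python sketch is a hypothetical model (the 1-d arrangement of nprocs and the use of 4 processors are assumptions, not from the original):

```python
import math

# Sketch: which processor owns a(row, col) under
#   !hpf$ distribute a(block, *)
# onto a 1-d arrangement of nprocs processors.
def block_owner(row, extent, nprocs):
    """0-based owner of 1-based index `row`; the default BLOCK size
    is ceil(extent / nprocs), per the HPF BLOCK definition."""
    block = math.ceil(extent / nprocs)
    return (row - 1) // block

# Every element of row 37 of a(100,200) has the same owner, so
# a(37,:) is an "on one processor only" section; the column index
# never enters the computation:
owners = {block_owner(37, 100, 4)}   # a single owner for the whole row
```

This is exactly the situation the *(*) dummy distribution is meant to describe: the callee's x is known to reside on one (unspecified) processor.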
And should one be allowed to describe the mapping of x in this manner, or must one use the more cumbersome and specific:

      pure function f(x, row)
      integer row
      real x(:)
      real f(size(x))
!hpf$ template t(100,200)
!hpf$ align f(:) with x(:)
!hpf$ align x(:) with t(row, :)
!hpf$ distribute t(block, *)

-----------
Hello, Pres asked me to amplify my previous CCI request concerning the directive

!hpf$ distribute x(*)

Here is some additional commentary: The key issue is to let the compiler know what's going on when a subroutine is passed an "on one processor only" section of a distributed array; the call site is probably in a forall or independent loop. The obvious syntax is to say, prescriptively:

!hpf$ distribute dummy_arg(*)

or descriptively:

!hpf$ distribute dummy_arg *(*)

I was surprised that this is allowed by the HPF syntax: if this distribution is specified by the program for an array that is not a dummy arg, I don't know what to make of it. Would it mean to replicate the array? To store it on one processor of the compiler's choice? To store it on the "front end"? In shared memory? I think a reasonable proposal would be as follows:
--------------------------------------------------------------------------------------
In a (re)distribute directive, the number of non-* (i.e., block and cyclic[(k)]) entries in the dist-format-list must ordinarily be at least one, and must be the same as the rank of the processors arrangement in the ONTO clause, if present. If, however, the distributee is a dummy argument, then, if the distribute directive is descriptive, the requirement of at least one non-* entry in the dist-format-list is waived. Thus

      real dummy(:,:)
!hpf$ distribute dummy *(*,*)

is valid for a dummy argument; it asserts that the actual argument will be distributed on a single processor.
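The proposed rule above can be sketched as a validity check. This Python sketch is one reading of the proposal, not official HPF; the function name and parameters are hypothetical:

```python
# Sketch of the proposed rule: a dist-format-list must ordinarily
# contain at least one non-* entry matching the ONTO rank; the
# "at least one" requirement is waived only for descriptive
# mappings of dummy arguments (e.g.  distribute dummy *(*,*) ).
def dist_format_valid(formats, onto_rank=None,
                      is_dummy=False, descriptive=False):
    non_star = sum(1 for f in formats if f != "*")
    if onto_rank is not None and non_star != onto_rank:
        return False                  # must match rank of ONTO arrangement
    if non_star == 0:                 # all-collapsed, e.g. *(*, *)
        return is_dummy and descriptive
    return True

# distribute dummy *(*,*)  -- valid only as a descriptive dummy mapping:
#   dist_format_valid(["*", "*"], is_dummy=True, descriptive=True) -> True
#   dist_format_valid(["*", "*"])                                  -> False
```

The design point is that the all-* form carries information only as an assertion about an incoming actual argument; for an ordinary array it would be meaningless, which is why the waiver is restricted to descriptive dummy mappings.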
--------------------------------------------------------------------------------------
(Advice to language designers:) It's quite likely that a section of a processors arrangement will be allowed in the ONTO clause of (re)distribute. In that case, one could also use the following:

      subroutine act_on_local_info(dummy, iproc)
      real dummy(:,:)
!hpf$ processors all_procs(number_of_processors())
!hpf$ distribute dummy *(*,*) onto all_procs(iproc)

This would be appropriate in the following contexts:

      program main
      real actual_2d(8,16), actual_wide_2d(8,32), actual_3d(8, 16, 10)
!hpf$ processors procs(8)
!hpf$ distribute (*, block) onto procs :: actual_2d, actual_wide_2d
!hpf$ distribute (*, block, *) onto procs :: actual_3d
!hpf$ independent
      do j = 1, 16
        call act_on_local_info( actual_2d(:, j:j), (j+1)/2 )            ! dummy shape is (8,1)
        call act_on_local_info( actual_wide_2d(:, 2*j-1:2*j), (j+1)/2 ) ! dummy shape is (8,2)
        call act_on_local_info( actual_3d(:, j, :), (j+1)/2 )           ! dummy shape is (8,10)
      enddo
--------------------------------------------------------------------------------------
The alternative to this, as far as I can tell, is to make the programmer align the dummy to a template, as follows:

      subroutine act_on_located_info(dummy, iproc)
      real dummy(:,:)
!hpf$ processors all_procs(number_of_processors())
!hpf$ template, distribute onto all_procs :: all_temp(number_of_processors())
!hpf$ align *(*,*) with all_temp(iproc) :: dummy

      program main
      real actual_2d(8,16), actual_wide_2d(8,32), actual_3d(8, 16, 10)
!hpf$ processors procs(8)
!hpf$ distribute (*, block) onto procs :: actual_2d, actual_wide_2d
!hpf$ distribute (*, block, *) onto procs :: actual_3d
!hpf$ independent
      do j = 1, 16
        call act_on_located_info( actual_2d(:, j:j), (j+1)/2 )            ! dummy shape is (8,1)
        call act_on_located_info( actual_wide_2d(:, 2*j-1:2*j), (j+1)/2 ) ! dummy shape is (8,2)
        call act_on_located_info( actual_3d(:, j, :), (j+1)/2 )           ! dummy shape is (8,10)
      enddo

-- Rob

No action reported from the Sept meeting.
=====================================
GROUP E
New items
=====================================
CCI #35 F95 and reduction functions.  Group E
Adam Marshall
Submitted: 10/5/95
Status: new
Question:
I may well have missed something here, but does the HPFF intend to `redraft' the HPF V1.1 spec to define the language binding in terms of Fortran 95, or is that going to be left to HPF 2? For example, I am thinking of the argument lists to the reduction functions MINVAL, PRODUCT etc. Also there are some very "sensible" extensions in allowing user-defined functions in specification expressions. I guess, as F95 is a superset of F90, vendors could provide the new F95 features as extensions to HPF V1.1. It would seem a sensible thing to do. Adam Marshall
Discussion:
Partial reply from Mary ...
Adam, we plan to address F95 in HPF2 ... timing of approval for F95 is a bit awkward, since we may be a bit ahead of F95 formal approval. But the spirit of the group is that full HPF is F95 + ... It is highly unlikely that we will make these changes retroactive to the definition of the HPF1.1 specification. As you suggest, that would be left to the vendors. We are not in the mode of making extensions to HPF1.1 at this time. -mary zosel-
And partial reply from Chuck ...
The short answer is, "We'll leave Fortran 95 up to HPF 2.0." I agree that F95 has some nice features, taken from HPF and from other places. (I can't speak for the whole HPFF working group, but I think this is a common view.) Plans are to restructure the HPF 2.0 specification to be based on F95, but intermediate versions (corrections, clarifications, interpretations) will continue to be based on F90. Think of this as bundling all the big changes into one package. As with (almost) any language, vendors can add their own extensions. Their customers can decide whether they want to pay for worthwhile features that may be nonportable to other platforms.
F95 extensions to HPF sound very feasible, and should be more portable than extensions like "brand X active objects". (Well, I'm *trying* not to slam any particular company...) Chuck Koelbel

NOTE - these replies don't address the entire question ....