# Copyright (c) 1999 Greg London. All rights reserved.
# You can redistribute this file and/or
# modify it under the same terms as Perl itself.

# contact greg42@bellatlantic.net with any questions,
# corrections, clarifications, etc.


what follows is a first attempt to create a
specification for sensitive perl threads.

this document attempts to define all of the
concepts needed to build sensitive threads
into perl. The focus of the design is to
create an environment for threaded perl
that naturally leads designers to create
code that doesn't contain race conditions
and other problems normally associated with
threads sharing global data.

the key concepts are:
VARIABLES:
WIRES:
WIRE RESOLUTION:
TIME: 
DELTA:
THREAD:
SENSITIVITY LIST:

If you are a hardware engineer, you will recognize
this as the approach used in VHDL and Verilog languages
which are used to design hardware. 

the purpose of this document is to introduce 
software people to these concepts so that these ideas
get rolled into threaded perl at some point.
After describing these concepts to some software 
people, I found that these concepts were not
familiar in the software world.

since hardware requires designing stuff to run in
parallel, the hardware languages VHDL and Verilog
have developed around creating a paradigm for
designers to program within that encourages
good design free of race conditions and other
problems.  Get yourself a VHDL book for a good
reference.


VARIABLES:
==========

requirements:
-------------
Thread variables are exactly like perl variables.
When a variable is assigned a new value, the
assignment takes place immediately and the new
value can be read back immediately. 

Variables cannot be listed in a thread's 
SENSITIVITY list.

rules of thumb:
---------------
Variables tend to be local to a single thread.
It is generally not recommended that variables
be shared between threads, although this should
not be a requirement. 


WIRES:
========

requirements:
-------------
WIRES are implemented using tied variables in perl.
Assignment to a wire is delayed until the 
current DELTA has completed.
Reading a wire immediately after it has been assigned
and before the delta has completed will read the
wire's old value.
Only wires can be used in a thread's SENSITIVITY list.
Only wires can be connected to an ENTITY's PORTs.
When a wire is modified and the wire is 
contained within the sensitivity list of a thread,
that thread is scheduled to be executed as part of 
the next DELTA slice.
Temporal based functions can be called on wires.
tEvent(wire_name) returns a boolean to indicate
if the thread is being executed this DELTA because
wire_name changed in the previous DELTA slice.


rules of thumb:
---------------
wires are the means that threads communicate.
Communication between threads should generally
be used via wires.

wires are the preferred way to execute threads.
Another method to execute threads is to schedule
them based on a SIMULATION TIME.

distinctions of wires:
------------------------

there are two distinct types of wires:
wires with 1 driver and 
wires with 2 or more drivers.

a driver is simply a thread that assigns the wire.

much of the time, a wire will be assigned
within one thread only, and be read by many 
other threads as an input. nothing special
is needed to handle this case.

a wire that is driven by multiple threads
represents something similar to a shared
data bus in the expansion slots of a PC.
The PC might drive data onto the wire,
or one of many IO cards might drive data
onto the wire.  THe wire is shared by multiple
drivers.

with multiple drivers on a wire, some means
is necessary to "resolve" the final value of
the wire.

wire RESOLUTION:
==================

a wire driven by multiple threads must
have a means to determine the resulting 
value of the wire.

in hardware, if two outputs are tied together,
one driving high, the other driving low,
the result is undefined, or an "X".
Imagine two drivers tied to the same wire, 
one driving 5 volts and the other driving 0
volts. The result is undefined.
For multiple drivers to share a wire,
everyone drives a high impedance on the
wire, "Z", until they need to send 
actual data somewhere.

driving a "Z" and "1" on a wire resolves
to "1".

to perform wire resolution, the simulation
engine will need to know all of the threads
that are driving the wire, and the last 
value each thread assigned that wire.
everytime a driver changes its value, the
engine must combine the new output with all
the other current outputs, and determine
what the result is.

it is possible that the method of wire 
resolution is simply: "last assignment wins".
this is the simplest approach.

however for certain designs, this may be
insufficient. for one, it does not allow
threads to simulate hardware. 

even if threads will be used for software
designs only, "wire resolution"  is
another means to help find possible race 
conditions.

an example:

two processors are networked together, each
running a thread. both processes make an
assignment to a shared wire.

if they do it at the same time, and they
have different results, then the wire 
will resolve to "x". if they do not have
wire resolution, then the value will
be "last assignment wins".

having wire resolution will result in the
wire being an "x", and uncovering a possible
design error. perhaps the two processors were
not supposed to drive the same wire.

or if they are supposed to share the wire,
a means to say who gets final say should be 
designed into the thread. the thread that is
not doing the assignment should drive the wire
with Z's.  this reflects the designs intention
that the two threads were meant to share the
same wire, and the thread driving Z's is
relinquishing control of assignment at the 
moment.

--------

it may be possible to design the threading
engine without wire resolution and then
have a separate module which is used for
resolving wires.

my $res_sig = new Resolvedwire;

then each thread could register itself
as one of drivers on the wire.

process(clock)is
	RegisterAsDriverFor($res_sig);
begin
	$res_sig->assign($new_value);
end process;

the shortcomings of this approach would be:

1) perl subroutines do not have an initialization
	section. driver registration would have
	to occur somewhere separate from the thread
	that is doing the driving.

2) it requires the wire be handled as an object.
	not a big problem, but methods have to be
	created for any special functionality needed.
	
however, I currently do not know if wire 
resolution can be built into threaded perl
without resorting to making resolved wires
into objects.
it would be great if it could be done, but I cannot
see a way to get there.


TIME:
=====

requirements:
-------------
Threads must occur in time, either SIMULATION time 
or REAL-TIME.

Threads execute within a specific time. 
At the start of a simulation cycle, the current time
is copied and stored in a wire. From that point,
this stored time is used by all the threads that
are scheduled to execute within that cycle.

several threads may execute in sequence within a 
single cycle, but those threads will all report
the same CURRENT-TIME.

REPORTED-TIME:
=============

REPORTED-TIME is the time reported to all threads
executing within the same simulation cycle.
it can be either SIMULATION time or REAL time.
current time is grabbed at the start of the 
cycle and does not change until the start of the
next cycle.

			CYCLE	1	CYCLE	2	CYCLE 3
REPORTED_TIME	100		200		300
--------------------------------------------------
threads		thread1	thread1	thread1 
executed		thread2	thread3	thread4

When thread1 and thread2 execute, in cycle 1,
if they read REPORTED TIME, they will both read
the value 100, regardless of the order of execution
for either thread, or how long either thread takes
to execute.

DELTA TIME:
===========

requirements
------------
A single simulation CYCLE may be broken down into 
multiple DELTA's. Delta's control when wireS are
updated with the values they were assigned.

From the start of any DELTA until its completion,
a wire can be written a new value, but reading
that wire any time during that DELTA will always
return the value of that wire at the beginning
of the delta.

At the end of every DELTA, all values written
to wireS will be updated to become the
wireS new read value.

There may be only 1 delta for a simulation cycle,
or there may be many. Within a single DELTA, 
the THREAD ENGINE may execute only one thread, or it
may execute many, depending on the code being
executed.

The first thread within a simulation cycle
contains all the threads which are scheduled
to execute based on REPORTED-TIME contraints.
All the threads scheduled to execute at time
100 nanoseconds will execute within the first
DELTA of the simulation cycle at time 100 nanoseconds.

At the end of each simulation DELTA slice,
all assigned wire values are updated to
the wires' read values.

If any of these wires are contained in 
a thread's SENSITIVITY LIST, then all sensitive
threads are executed in the next DELTA for
simulation cycle at time 100 nanoseconds.

For example, at the end of the first DELTA slice,
"clock" wire and the "output" wire might change.
Another thread might be sensitive to changes
on the "clock" wire, and another thread might
be sensitive to changes on the "output" 
wire. These two threads would be scheduled to
execute in the second DELTA for the simulation
cycle at time 100 nanoseconds.

During DELTA 2, all assignments would be deferred
until then end of DELTA 2. The sensitive threads
are executed, order is undertermined. At the end
of DELTA 2, the wire assignments are updated to 
read value for the wires.

Updating the wires at the end of DELTA 2
may trigger another set of threads to execute.

example

			CYCLE	1	CYCLE	2	CYCLE 3
CURRENT_TIME	100		200		300
--------------------------------------------------
			thread1	thread1	thread1 
			-------	thread7	-------
			thread2	thread8	threadB
			thread3	-------	-------
			thread4	thread9
			-------	threadA
			thread5	-------
			thread6	
			-------


cycle 1 contains 3 deltas: 
	delta 1 : thread1
	delta 2 : thread 2,3,4
	delta 3 : thread 5,6

cycle 2 contains 2 deltas:
	delta 1 : thread 1,7,8
	delta 2 : thread 9,A

cycle 3 contains 2 deltas:
	delta 1 : thread1
	delta 2 : threadB


ERROR TRAPPING:
a thread should only execute once within a
single simulation cycle. if it executes
2 or more times, it indicates a race condition
or a faulty / unreliable design, or both, 
or worse.  If a thread executes twice within 
the same simulation cycle, the THREAD ENGINE
should report an error.

SIMULATION CYCLE:
=================

During a simulation cycle, REPORTED TIME
does not move forward, exept by delta counts.

A simulation cycle is divided into DELTA slices.
Each delta contains 1 or more threads which have
been scheduled to execute.

REPORTED TIME during a cycle will indicate the
time (either real time or simulated) and a
delta count.

time 300 delta 1
time 300 delta 2
etc.

the time can reflect real-time
based off of a real-time clock.
a thread might execute every 5 minutes
to back-up the data in a text editor.

time can also reflect simulation time.
Simulation time represents some passage
of time that is not neccasarily executed
in real-time. a thread might represent
a piece of hardware which will produce
an output 10 nanoseconds after a change
on its input. The thread that represents
this hardware might take a lot more time
than 10 nanoseconds to execute. simulation
time allows the threads to model instances
such as this.

random thought:

it might be possible to create this model
of time by having REPORTED TIME be a
wire. a thread that is scheduled to execute
at a specific time would register itself with 
a scheduling thread which is sensitive to changes
of the "time wire". the scheduler could
look at the simulation time and compare that
to when the thread was supposed to execute,
and if it matches, schedule it for execution
within the next delta slice.

at the begining of each delta slice, the 
wire containing the simulation time
would have its delta count incremented.

at the beginning of each simulation cycle,
simulation time would be updated to its next
value.

this model would allow time to be a wire,
simplifying designing threads somewhat,
and also allow others to have threads which
are sensitive to time as well.

it would also make it relatively easy to
use real-time or simulated time as the
value given for REPORTED TIME.


THREAD:
========

A thread is a subroutine which can run in
parallel with other threads. A thread
can suspend execution of itself, and return
control to some master code. A suspended
thread can be re-executed so that it
continues where it left off previously.

Threads may have SENSITIVITY LISTS.

SENSITIVITY LIST:
=================

A sensitivity list is a list of wire names
to which the thread is sensitive to changes.
If a wire in the sensitivity list is
modified, then the thread will be scheduled
to execute at the next DELTA slice.
A sensitivity list can contain only wires;
it cannot contain variables.


THREAD EXECUTION:
=================

a thread with a sensitivity list will not 
execute until a wire it is sensitive to changes.
a thread with NO sensitivity list will execute
once, at the begining of simulation, and 
cannot be reexecuted again.

once a thread begins execution, it may either
complete and return within a single DELTA slice,
or it may suspend itself and resume execution
at a later simulation cycle.

if a thread suspends itself, it must provide
the method to reactivate it. This can be
either by waiting for some simulation time
to occur or waiting until a wire or wires
is modified.

a thread may have a sensitivity list containing
the wire "address", and after partially executing 
itself, it may suspend execution until the 
"clock" wire changes. At this point, the thread
is no longer sensitive to "address", it is only
sensitive to "clock".

a change on the "address" wire will not cause
a new thread to execute from the beginning.
the thread will only respond to a change in "clock",
at which point, it will resume execution where
it left off previously.

also, a thread may have no sensitivity list,
and execute once at the start of simulation.
During execution, however, the thread may
suspend itself, and wait for a specific amount
of time or a wire to change.

if this thread is waiting for a wire to change,
it now is sensitive to that wire even though
it did not have a sensitivity list to begin with.

==================================================
==================================================

it might be possible to build this using only
the concept of THREADS, SENSITIVITY LISTS, and
DELTA SLICES.

Simulation time could be built on top of these
three things as a wire, and scheduling threads
based on time would simply be done by a thread
with TIME in its sensitivity list.

this would allow software designs that use threads
to be independent of time, and simply react to 
changes in wires.

it would also allow software designs using threads
to use REAL-TIME to schedule when threads execute.
this could be used to have a thread execute every 5
minutes to backup your data in a text editor,
or to schedule a real-time operating system in an
embedded processor to perform some function in
real time. an embedded application might read a 
sensor value 10 times a second and run a short piece
of code to determine what to do.

finally, it would allow hardware designs to be simulated
using software threads. This would use simulated time
rather than real-time for the value assigned
to SIMULATION TIME. a piece of hardware might
be sensitive to a wire and produce a valid
output 10 nanoseconds after the input changes.
but the software thread that is modeling this
might take much longer to execute and calculate
the output.

wire resolution is not a requirement to get the
functionality working, however, it is a concept
built into VHDL as a means to prevent accidental
race conditions from being hidden. "last assignment
wins" will not uncover that two or more threads
are driving the same wire when they aren't
supposed to. wire resolution allows threads to
indicate they are temporary relinquishing control
of a wire by driving it with "high impedance"
or "z"'s


===================================================
some internal things, used to make sensitive threads
do their magic, not generally visible to designers
using sensitive threads.
===================================================

THREADING ENGINE:
-----------------

MASTER SCHEDULER LIST:
----------------------

in my simulator, time starts out at zero.
all processes scheduled to run at time zero are run.
some of these processes shedule themselves to re-run
at a future time.  as each process schedules itself,
I add the time they want to start, and a code reference
to execute, to a master list. then I sort the list
by time, so that soonest time is first.

once the process has rescheduled itself, and is
done executing for the current time cycle, I delete
its entry from the master schedule list.

the next entry inthe schedule list will contain the
simulation time, and the code ref to execute.
so I update REPORTED TIME to the value of the
next entry in the master schedule list, and execute
the next code ref.

if you had a simple process like this:

process1() { a<=7; wait 10 ns; a<=42;}

the sequence would look something like this:

read in the processes, update initial schedule for time 0.
master scheduler list: [0,\&process1];

get first entry off master scheduler list,
set REPORTED TIME to time value in entry.
call the code reference.

calling the process causes the process to 
assign the value to wire A (read value gets updated later),
and schedule itself to resume at time 10 ns, then suspend.

master scheduler list [0, \&process1], [10 ns, \&process1]

process1 then suspends itself, and control returns to 
the master scheduling routine. the master scheduler 
looks at master schedule list and compares the
bottom time value with the next from bottom time value.
if they are differnt, then delta cycle has completed,
increment the delta count,

shift the bottom entry off the master scheduler list:
master scheduler list [10 ns, \&process1]

and update all the wires for this DELTA.

the read value for wire A is now updated to 7;

this update may cause a new set of processes to trigger
immediately, which would insert themselves into the bottom
of the master scheduler list. (they would execute at
time 0, delta 2).
in this case, it doesn't since there are no processes 
that are sensitive to changes in wire A.

since the end of the DELTA cycle did not cause
any processes to be scheduled for immediate 
execution, this indicates the end of the 
current simulation cycle (time 0).

(a cycle is a bunch of  DELTA's that all report the 
same REPORTED TIME).

at the end of a simulation CYCLE, the current
REPORTED TIME is updated to be equal to the
value of the next entry in the master scheduler
list. in this case, 10 ns.  the delta counter
is reset to 1.

so, blammo, simulation time is bumped from 0 to 10 ns,
instantaneously.  you dont count opcodes, dont pass
go, and whatever you do, do NOT collect 200 dollars.
counting opcodes is evil incarnate here, I promise.

if the engine is configured to use real time, then
it looks at the current time, figures out how long to
wait and schedules an alarm or something to restart 
itself at some point in the future. if the current
time has passed, then it should report an ERROR
that the threads took longer to executed than expected,
and the amount of time to wait has already passed
from the beginning of the last simulation CYCLE.

(remember, TIME is snap-shot'ed at the beginning
of a CYCLE. for all the DELTA's within that cycle,
TIME is reported to the threads as being the same
value.)

there is a method to this madness, I assure you.

it requires designers to think about syncronization
from the very beginning, and it also creates a 
framework which captures some of the common mistakes
made (it took longer to execute than you designed for),
flagging the problem, and forcing you to make your
design reliable.

it helps designers avoid race conditions and similar
problems. it helps hardware guys determine if they
are trying to force some piece of hardware to run
faster than it is designed to run. it would help
software guys to determine if they were too
optimistic, thinking you can schedule something
10 times a second, when the simulator will find
out that it can only fit about 5/second.

it also allows the same engine to be used for
real time and for simulated time, VERY important
if you want threaded perl to be generic and
have a wide user base.


=========================


all events scheduled based on absolute time will execute in the
first DELTA of the appropriate simulation CYCLE.

examples:
a clock generator that says: 
	loop { wait 5 ns; clock <= not clock;}
a thread that has time delays: 
	thread{ address<=0; wait 5 ns; strobe<=0; wait 3ns; strobe<=1;}
a wire assignment reflecting "gate delay":
	strobe_wire <= 1 after 5 ns;

ALL OTHER EVENTS during that CYCLE will happen as a result of
senstive threads being triggered.

CYCLE 100 NS:
================================ 
DELTA 1 :
	threads scheduled for 100 ns
	modifies wire a
	modifies wire b
	at end of DELTA1, update all wires
DELTA 2 :
	thread sensitive to A
	modifies wire C
	thread sensitive to B
	modifies D
	at end of DELTA2, update all wires
DELTA 3 :
	thread sensitive to C and D
	modifies wire E
	at end of DELTA3, updates wire E

NO THREADS ARE SENSITIVE TO wire E,
THEREFORE, DELTA 3 IS THE LAST DELTA SLICE
FOR THE SIMULATION CYCLE AT TIME 100 NS.

END OF SIMULATION CYCLE 100 NS
===============================

NEXT EVENT IS SCHEDULED TO OCCUR AT TIME 175 NS:
START NEW SIMULATION CYCLE

CYCLE 175 NS:
================================ 
DELTA 1 :
	threads scheduled for 175 ns
	modifies wire X
	modifies wire Y
	at end of DELTA1, update all wires

DELTA 2:
	thread sensitive to x
	modifies wire w
	thread sensitive to y
	modifies wire Z
	at end of DELTA2, update wire w,Z

NO THREADS ARE SENSITIVE TO wireS W OR Z
SO DELTA 2 IS THE LAST DELTA SLICE FOR SIMULATION
CYCLE 175 NS;




another attempt at an example:

#################################################################
#################################################################
# start of code
#################################################################
#################################################################

# declare the wires, queues, that connect the threads.
my $clock = new Queue(0);   # initial value for clock is 0;
my $a = new Queue (12);
my $b = new Queue(34);

#################################################################
# declare the thread that drives $clock
# no wakeup list, therefore it executes at time "zero"
# the subroutine never returns,
# instead it re-schedules its own execution every "100 ns"
# end result, every 100 ns, the value of $clock inverts itself.
#################################################################
my $clock_generator = new Thread;
$clock_generator->Code(
  sub{ 
   while(1)        
    {
    ThreadWait(100 ns);
    if ($clock)
      {$clock=0;}
    else
      {$clock=1;}
    }
  }
 );


#################################################################
# define a thread that assigns wire/queue $a
#################################################################

my $thread_a = new Thread;

# indicate that thread_a will "wake up" whenever $clock changes
$thread_a -> Wakeup ($clock);

# indicate that thread_a will assign $a the value of $b when it is called
$thread_a -> Code ( sub {$a=$b;} );


#################################################################
# define a thread that assigns wire/queue $b
#################################################################

my $thread_b = new Thread;

# indicate that thread_a will "wake up" whenever $clock changes
$thread_b -> Wakeup ($clock);

# indicate that thread_a will assign $b the value of $a when it is called
$thread_b -> Code ( sub {$b=$a;} );


#################################################################
# start the threading engine. call it what you want.
#################################################################
ThreadingLoop();  


#################################################################
#################################################################
# end of code
#################################################################
#################################################################



so what happens?

there are 3 threads involved: $clock_generator, $thread_a, $thread_b

when the ThreadingLoop is called, the $clock_generator thread is 
invoked. why? because it doesn't have a ->Wakeup list, and
therefore starts as soon as the threading loop starts.

----------------------------
Time: 0
start of Delta 1

the Code reference of $clock_generator is then called.
the first thing it does is immediately schedule itself
to be restarted in 100 ns. This causes the threading engine
to suspend execution of $clock_generator.

end of Delta 1
----------------------------

time advances to the next scheduled event.
which happens to be time 100 ns.


----------------------------
Time 100 ns
start delta 1

the code reference of $clock_generator is restarted
where it left off. which is right after it called
ThreadWait(100 ns).  the code continues and looks at
the value of wire/queue $clock, and then inverts it.
since $clock starts with an initial value of Zero,
the code assigns $clock to One.

the while(1) loop continues and loops back to the
ThreadWait(100 ns) call. $clock_generator suspends 
itself at this point, and schedules itself to restart
at time 200 ns;

The assignment to $clock isn't immediate.
$clock is queued to receive teh new assignment at the
end of the currently running Delta. all threads that
read $clock will still read Zero.

End of delta 1
----------------------------

the threading engine updates all queued assignments.
$clock is updated to One.

the threading engine looks to see if any threads will
be ->Wakeup'ed by $clock.  it sees that $thread_a
and $thread_b are supposed to be woken up by $clock.

so, the engine starts a new "delta" cycle at time
100 ns.

----------------------------
Time 100 ns
Start of delta 2

thread_a and thread_b are both called.

*** the order they are called is irrelevent ****

decide to call thread_b first. the ->Code reference
is called. $b is queued to be assigned the value of $a (12).
but any threads that read $b in this delta slice will
read $b's old value. i.e. 34.

then, call thread_a next. the ->code reference is called.
$a is queued to be assigned the value of $b,
reading $b gives you its old value of 34.

end of Delta 1
----------------------------

the threading engine updates all queued assignments.
$a is now 34
$b is now 12

since $a and $b have changed, the threading engine
looks to see if any threads are to be ->Wakeup'ed by
$a or $b.  none are.

there for the threading engine advances simulation time
to the next scheduled event, which is time 200 ns.

----------------------------
time 200 ns
delta 1


......................
end of example
......................


===============================================
Questions:
===============================================

> The major win, if any, is your suggestion to allow no other
> interconnection between threads other than Wires. This is what
> eliminates all contention.

actually, one of the major win comes in the form of the way the 
Delta slices are managed.

thread 1: sensitive to (clock) a = b;
thread 2: sensitive to (clock) b = a;

if Wire "clock" changes,
then thread 1 and 2 are both invoked.

since all updates are delayed until the end of a Delta slice,
the result is totally predictable:
a and b will always swap values.

this as opposed to all assignments happen immediately,
which means a and b will end up with a value
depending one which thread is executed first.

having the Delta slice built into the threading engine
allows programmers to not worry about one of the most
infuriating problems with current threads.



> how would you represent your ideas in perl syntax.

my $t1 = new Thread;

$t1->Sensitivity(clock, reset);

$t1->Code(sub($a=$b;});

what is needed here is for the perl code to execute and 
create all the instances of threads.

from that point, plain perl code stops execution,
and the threading engine takes over.

note that any thread without a Sensitivity list will
execute once at this time, and all other threads
will execute whenever the queues in the Sensitivity
list changes values.

 

[end]