Link Behavior Questions

To: <marcs>
Subject: Link Behavior Questions
From: Mark S. Miller <mark>
Date: Mon, 16 Oct 89 20:59:05 PDT
Cc: <xanatech>
In-reply-to: <Marc>,17 PDT <8910162200.AA10145@xanadu>
   Date: Mon, 16 Oct 89 15:00:17 PDT
   From: marcs (Marc Stiegler)

   Anyway, I have tentatively identified the following differences 
   between embedded links and first-class links:

   1) If embedded, the link falls under the version control of the 
   document.

Yes.  Absolutely.  This is the definition of "embedded".

   2) If embedded, the link falls under the permission control of 
   the document.

Currently this is certainly true.  It is interesting that the ent
structure which makes it eventually possible to separately endorse
parts of documents should make it eventually possible to separately
permit parts of documents (since, as far as the MDSE is concerned,
both endorsements & permissions are just baskets).  This is a case
where we understand the need, and we understand the
data-structure/algorithm, but we don't yet understand what it means in
the (orgls & berts level) semantics.  I feel somewhat confident that
this can be figured out in a clean way.  In any case, your statement
is true of the first product.

   3) If embedded, the link is more likely to be fetched without 
   another disk access, since it is so conceptually close to the 
   document that it is easy to lay the link physically close to 
   the document on the disk. Since, on the average, half the links 
   on a document would be embedded, this would cut in half the number 
   of links for which a fetch might be required.

I think this is a misleading issue.  If we go with independent links,
we can still heuristically locate each link on disk near (say) the
document named in the (first) from end.  This would have the same
performance effect, without changing the semantics.  I think the real
issue is not whether we can cut the number of seeks down by a factor
of two, but whether we can usually eliminate all seeks other than the
one required to follow a link into its target document.  We can do this
only if we make a front-end distinction between forward links (those
whose home is in this document), and backwards links (those whose
home is elsewhere, but which have an end in this document).  The front-end
distinction would have to allow it to be sensible to initially show a
document & just its forward links.  The backwards links would have to
be asked for, or would pop in one at a time (at a cost of at least a
seek per).  

It is conceivable that we can eventually get instantaneous (i.e., one
seek) access both ways, but only at the cost of maintaining double the
link storage overhead (from storing the link both places), and having
to worry about distributed update.  I REALLY REALLY don't want to have
to worry about distributed update of redundant link storage even
during the performance engineering phase.  This kind of messy performance
hack should really be left till after first product.

Conclusion: If we go with the current symmetrical front-end display,
the embedded vs independent link issue shouldn't make a meaningful
performance difference.  However, going embedded could be a big
performance win if we go with an asymmetrical front-end.
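
The seek arithmetic behind that conclusion can be sketched with a toy
cost model.  Everything here is an assumption made for illustration:
the function name, the parameters, and the flat "one seek per link
stored elsewhere" cost are not taken from the backend.

```python
def seeks_to_show(forward, backward, colocated, show_backward):
    """Count disk seeks needed to put a document and its links on screen.

    Toy model (an assumption): a link stored next to the document costs
    nothing extra, and a link stored anywhere else costs one seek.
    `colocated` covers both embedded links and independent links that a
    disk-grouping heuristic has placed near the document.
    """
    seeks = 1                     # one seek for the document itself
    if not colocated:
        seeks += forward          # forward links stored elsewhere
    if show_backward:
        seeks += backward         # backward links have their home elsewhere
    return seeks

# Symmetrical display with 5 forward and 5 backward links.
scattered = seeks_to_show(5, 5, colocated=False, show_backward=True)
grouped = seeks_to_show(5, 5, colocated=True, show_backward=True)

# Asymmetrical display: show only the forward links at first.
forward_only = seeks_to_show(5, 5, colocated=True, show_backward=False)

print(scattered, grouped, forward_only)
```

Under these assumptions, grouping roughly halves the seeks for a
symmetrical display, while only the asymmetrical display gets down to
the single seek for the document.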

   For reasons that I'll explain another day, I'd be happy to have 
   the links fall under the permissions control of the document, 
   but unhappy to see them fall under the version control of the 
   document. So permission control is a vote for embedded links, 
   version control is a vote for independent links. 

I'm quite curious about the reasons.  Let's schedule "another day".
Given a symmetrical front-end, which of the two documents would you
want the permission control of the link to fall under?

   The performance issue is probably the driver here, if it is true 
   that embedding the links might cut in half the number of fetches 
   associated with a document. This is a big vote in favor of embedded 
   links. However, I have a couple of other questions which may 
   drive the decision even more strongly. Let us examine the following 
   scenario:

See above.

   Mark creates document A1. He attaches a link-activated sensor 
   to the whole document. He creates derived document A2, i.e., 
   A2 starts life as a vcopy of A1, and only unimportant edits are 
   made to it. Now Mark runs a link from A1 to document B, with 
   the link embedded in A1. He also runs a link to document C from 
   A1, with the link embedded in C. Finally he creates more derivative 
   documents from A1, documents  A3 and A4.

Let's call the first one link X (the one from A1 to B in A1),
and the second one link Y (the one from A1 to C in C). 

I presume you mean that the recorder is on the entire data space of
the document (as opposed to the entire contents of the document, or to
the document itself).  Let's call this recorder recorder Q.

   Questions I would like someone (markm or Dean, I guess) to answer, 
   either because I have no idea or because I'm not sure enough 
   of my analysis:

   1) We do a backfollow from B. How many links do we find?

X in A1, X in A3, X in A4

   2) Eric runs a link to A3, with the link contained in document 
   D. How many sensors ring?

Let's call this link link Z.

I presume you mean that the link is to some data in the data space of
A3.  I just realized that this ambiguity exists in the previous
discussion as well.  I presume that we meant corresponding things
there (e.g., that link X is from some data in the data space of A1 and
to some data in the data space of B, and further that X specifies A1
as the from context and specifies B as the to context).

Recorder Q rings (are there any other recorders yet?).  It rings
because Z is to some data in A3, which is (I presume) data which is
also in A1.

   (uh, make that "how many recorders ring?" If these puppies still 
   do the things I understand, shouldn't they still be called sensors 
   for the end user? I will use the term "recorder" for the moment, 
   to avoid distracting one of the key Answerers Of These Questions).

Too bad.  I got distracted anyway.  I'm really answering in terms of
Orgls & Berts level semantic constructs.  It's up to Ravi what the
Docs & Links level analogue to recorders is, and what they should be
called.  It's up to you front-end guys what is presented to the user,
and what that should be called.  I'll stay out of that one, thank you.

The reason the Orgls & Berts level construct is called a recorder
instead of a sensor is that it doesn't necessarily alert the user at
the time when it rings (since the user could be logged off).  Instead,
it records the event by filling in a partial orgl.  A sensor is used
by a front-end in a live febe session to be asynchronously informed
(yes, we can do this with no scheduler) when (for example) a non-ready
partial orgl becomes ready (i.e., when there's stuff available in it).
It is the combination of a recorder & a sensor that lets a front-end
wait for links as above.  Recorders are persistent backend data
structures, whereas sensors only exist as part of a febe session.
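
A minimal sketch of the recorder/sensor split described above.  The
class and method names are hypothetical, and a Python callback stands
in for the asynchronous febe notification; this is not the actual
backend interface.

```python
class PartialOrgl:
    """Holds recorded results; 'ready' once there is stuff available in it."""

    def __init__(self):
        self.contents = []
        self.sensors = []          # transient observers, session-only

    @property
    def ready(self):
        return bool(self.contents)

    def fill(self, item):
        was_ready = self.ready
        self.contents.append(item)
        if not was_ready:          # non-ready -> ready transition
            for sensor in self.sensors:
                sensor.notify(self)


class Recorder:
    """Persistent: records the event whether or not anyone is logged on."""

    def __init__(self):
        self.results = PartialOrgl()

    def ring(self, link):
        self.results.fill(link)


class Sensor:
    """Transient: exists only as part of a live session."""

    def __init__(self, callback):
        self.callback = callback

    def watch(self, orgl):
        orgl.sensors.append(self)
        if orgl.ready:             # something was recorded while logged off
            self.callback(orgl)

    def notify(self, orgl):
        self.callback(orgl)


# The recorder rings while the user is logged off ...
q = Recorder()
q.ring("Z")

# ... and a later session learns about it by attaching a sensor.
seen = []
Sensor(lambda orgl: seen.append(list(orgl.contents))).watch(q.results)
print(seen)
```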

(We used to make this distinction by saying "persistent sensor" vs
"transient sensor", but this is bad).

   3) We restrict the permissions on A1.  A person with read access 
   to A3, A4,  and C backfollows from C. How many links do we find?

We find link Y in C.

   4) We attach sensors to documents B and  C. We create another 
   derivative document A5. How many recorders ring?

I presume the standard presumptions.  Let's call the recorder on the
data in the data space of B recorder R, and that on the data in the
data space of C recorder S.  I also presume that A5 is derivative from
A1 (as opposed to A2).  Q and R ring.  S does not.
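
The answers to questions 1 through 4 can be replayed in a toy model.
This is only a sketch under stated assumptions: the names are
hypothetical, each document's data space is reduced to a set of ids, a
recorder watches the data that was in a document when it was attached,
and vcopying a document re-creates the links embedded in it (which is
what makes recorders ring in question 4).

```python
class ToyBackend:
    def __init__(self):
        self.data = {}        # document name -> set of data ids
        self.links = []       # (link id, home document, from data, to data)
        self.recorders = {}   # recorder name -> set of watched data ids
        self.rings = []       # (recorder name, link id, home), in order

    def new_doc(self, name, data):
        self.data[name] = set(data)

    def add_recorder(self, name, doc):
        self.recorders[name] = set(self.data[doc])

    def add_link(self, link_id, home, from_data, to_data):
        self.links.append((link_id, home, set(from_data), set(to_data)))
        for rec, watched in self.recorders.items():
            if watched & (set(from_data) | set(to_data)):
                self.rings.append((rec, link_id, home))

    def vcopy(self, src, dst):
        self.data[dst] = set(self.data[src])   # shared data, modeled by ids
        for link_id, home, frm, to in list(self.links):
            if home == src:                    # embedded links come along
                self.add_link(link_id, dst, frm, to)

    def backfollow(self, doc):
        return [(i, h) for i, h, _, to in self.links if to & self.data[doc]]


be = ToyBackend()
for name, ids in [("A1", "a"), ("B", "b"), ("C", "c"), ("D", "d")]:
    be.new_doc(name, {ids})
be.add_recorder("Q", "A1")
be.vcopy("A1", "A2")                       # before any links exist
be.add_link("X", "A1", {"a"}, {"b"})       # A1 -> B, embedded in A1
be.add_link("Y", "C", {"a"}, {"c"})        # A1 -> C, embedded in C
be.vcopy("A1", "A3")
be.vcopy("A1", "A4")

q1 = be.backfollow("B")                    # question 1

mark = len(be.rings)
be.add_link("Z", "D", {"d"}, {"a"})        # question 2: link to data in A3
q2 = be.rings[mark:]

q3 = be.backfollow("C")                    # question 3

be.add_recorder("R", "B")
be.add_recorder("S", "C")
mark = len(be.rings)
be.vcopy("A1", "A5")                       # question 4
q4 = [rec for rec, _, _ in be.rings[mark:]]

print(q1, q2, q3, q4)
```

In this model A2 never holds X because it was vcopied before X
existed, and S stays silent because Y lives in C and so is not
re-created when A1 is vcopied.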

   DISCUSSION OF ANSWERS

   Markm, dean, if you plan to answer the above questions, you should 
   try to formulate the answers before reading the following discussion. 

I've done so.

   However, if you need clarification of the issue at hand, here's 
   my analysis:

   1) Two views of how many links you get backfollowing from B seem 
   possible: either there are 3 links, one each embedded in A1, 
   A3, and A4, or there is one link that terminates in 4 contexts 
   (the 4 contexts include A2, since the link ends on text in A2, 
   though the link is not contained by A2). 

   It is not clear to me that we would be doing the user a favor 
   by showing him 3 separate links, each of which ends with the 
   same 4 different contexts, even though the default context for 
   each of the 3 is different (uh, as I think about it, since the 
   default context is also vcopied, they all share that, too!). 
   Indeed, it is my strong opinion that this would be a very hostile 
   thing to do to a user. 

Except for the context issue, it is as you say.  There are in fact
three different documents (A1, A3, and A4) which link into B.

The way we deal with context here is neat and interesting, but beyond
the scope of this message.  Suffice it to say that it properly acts as
though "the default context for each of the 3 is different".  Ask me
about this.

   I would like to represent all these identical links using a single 
   viewing mechanism. Though we may pick a better mechanism, we 
   currently achieve this by showing a single link in the link pane 
   with a single default context, and letting the user select a 
   non-default context if he so desires. If the three links appear 
   as separate items in the link pane, it's real boring--the only 
   discriminator among them is the bert name of the container.

   Please note, those readers who don't like link panes anyway, 
   that the problem is more fundamental than the link pane representation.

Although it would be more work for the front-end, it could still
collapse multiple such lines into one based on link-id:  All links
which are derivative of a given link (either because they were edited,
or because their containing document was) would have the same link id.
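
A sketch of that front-end collapse, assuming backfollow results
arrive as (link id, containing document) pairs.  The function name and
the pair representation are made up for illustration.

```python
from collections import defaultdict


def collapse_by_link_id(found):
    """Group (link_id, containing_document) pairs into one entry per link id."""
    lines = defaultdict(list)
    for link_id, home in found:
        lines[link_id].append(home)
    return dict(lines)


# The three copies of X found by backfollowing from B become one line.
pane = collapse_by_link_id([("X", "A1"), ("X", "A3"), ("X", "A4")])
print(pane)
```

The grouped entry still lists every containing document, so the
front-end must still choose which one to present first.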

In your scheme, which of the three documents would be the default
context?  I see no natural choice.

   2) When the link goes to A3, I hope one recorder rings, the recorder 
   that was copied along with A1 when A3 was created. It would be 
   a real bore if recorders rang in A1 and A4 as well because the 
   link was attached to text inside their text space but outside 
   their context. It would be acceptable, even reasonable, if the 
   recorder didn't get copied from A1 to A3 and no rings occurred 
   at all. 

There's only one recorder.  It lives on the data which is in all the
A's.  I think it is confusing to talk about this as copying the
recorder (but then, you may say that it is confusing to talk about
vcopying the data.  The data isn't copied in any sense, it simply
becomes accessible from more than one place.  Similarly with the
recorder).
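
The "accessible from more than one place" point can be shown with
plain object references.  This is a sketch with hypothetical class
names, not the backend's data structures.

```python
class Data:
    def __init__(self, text):
        self.text = text
        self.recorders = []    # recorders live on the data, not on a document


class Doc:
    def __init__(self, name, data):
        self.name = name
        self.data = data       # a reference to shared data, never a copy


def vcopy(doc, new_name):
    """Make the same data accessible from one more place."""
    return Doc(new_name, doc.data)


a1 = Doc("A1", Data("some text"))
a1.data.recorders.append("Q")
a3 = vcopy(a1, "A3")

same_data = a3.data is a1.data         # one piece of data, two documents
recorders_via_a3 = a3.data.recorders   # the one recorder, reached through A3
print(same_data, recorders_via_a3)
```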

   Actually, a more correct model of the behavior I would like is 
   that there is conceptually a single recorder attached to all 
   these vcopied documents, and a link to any document sets off 
   the single recorder. If I understand the system correctly, this 
   actually is how it would work, but I figured I better check.

Ah, perhaps my presumption above was in error.  Did you attach the
recorder to the data in document A1, or to document (bert) A1 itself?
If the latter, many of the above answers are different.

   3) Though the permissions on the original document containing 
   the link no longer allow access, I would like to find 1 link 
   with 2 bert contexts, A3 and A4 (and perhaps A2, depending on 
   whether the reader has permission on it, of course).

Nope.  You find one link with context A1, which you can't read.  I
would argue that this is sensible and correct.  Of course, you can
designate A3 or A4 as alternate contexts and follow the link into
there, but the link itself doesn't designate these contexts.  If this
reader backfollows (as in find-docs-containing) from the context path
tree in the from end of the link, he'll indeed find all the A's that
he can read.
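
A sketch of that distinction between the context a link designates and
the contexts a reader can find for himself.  The dictionary layout and
the function signature are assumptions; only the name
find-docs-containing comes from the discussion above.

```python
def find_docs_containing(data, docs, readable):
    """Names of readable documents whose data space includes any of `data`."""
    return sorted(name for name, space in docs.items()
                  if space & data and name in readable)


docs = {"A1": {"a"}, "A2": {"a"}, "A3": {"a"}, "A4": {"a"}, "C": {"c"}}
link_y = {"id": "Y", "home": "C", "context": "A1", "from": {"a"}, "to": {"c"}}
reader = {"A3", "A4", "C"}     # no read access to A1 (or A2)

# The link itself designates only A1, which this reader cannot read.
can_read_context = link_y["context"] in reader

# Backfollowing from the from end finds the alternates he can read.
alternates = find_docs_containing(link_y["from"], docs, reader)
print(can_read_context, alternates)
```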

   4) I would like to have no link-activated-recorder rings go off 
   just because someone vcopied a document. In vcopying the document 
   containing a link, are we creating new links for every link embedded 
   in the document? This would set off a huge number of recorders 
   if someone embedded a link in a document that many people used 
   as a template. 

Interesting.  I understand your concern.  I fear that the opposite
decision may also have bad consequences.  Interesting.

   Of course, it would be a silly error to put a link in a template 
   document, but it would be an easy mistake to make, particularly 
   if Xanadu is successful, and all documents wind up being used 
   more often as templates. 

I think links in template documents are a natural thing to do, but
this of course weakens my position further.

   To understand the depth of my concern, please realize that I 
   expect people to put recorders on almost every interesting thing 
   they read, and on virtually everything they create. If vcopies 
   of documents cause link-activated recorder rings, people won't 
   be able to do this: the noise level would be horrendous. 

   In all these situations, first-class links give us the behavior 
   that seems most appropriate to me, if I understand how first 
   class links would work (which is in itself questionable). We 
   could implement first class links with a minimum of modifying 
   the backend by creating a new document for each link. This creates 
   more overhead for the link but I don't find that very disturbing. 
   However, the potential performance loss is disturbing.

As above, I wouldn't worry too much about the performance loss.  Disk
grouping heuristics can nullify it.  Performance is only an issue
if we're also considering an asymmetric front-end.

I believe that the whole of the above argument was predicated on the
assumption that we have a front-end much like the current design,
which presents links in a way independent of where their home is, and
doesn't take the issues of embedded links in forking documents as a
primary presentation problem.

I think this assumption is not a bad one!  As I've said, the front-end
and the back-end should not be pulling in two different directions.  I
think your examples above make it clear that doing a front-end which
deals well with the embedded link versioning presentation problem is
quite a chore.  I'd like to see it done eventually, but for now
independent links seem fine.  Your technique of making per-link
documents allows this without changing the fundamental organization
and definition of the docs & links layer.

   I have a few words in response to Greg's concern about overhead 
   in another message. 

   --marcs

Congratulations on being the first to think through these consequences
of the design.

--MarkM