Re: Why have a multipart document address?
- To: "xanadu@xxxxxxxxxx" <xanadu@xxxxxxxxxx>
- Subject: Re: Why have a multipart document address?
- From: roger gregory <roger@xxxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 10 Feb 2005 09:58:20 -0800
- In-reply-to: <4fdbc18c14b8dad71125938f840f190a@xxxxxxxxxxxxx>
- References: <4fdbc18c14b8dad71125938f840f190a@xxxxxxxxxxxxx>
OK Jack, here goes. The document address is an elaborate method to
ensure unique IDs in a distributed environment where the various
servers might not be in communication with each other. The server isn't
a location, it's an identification. The kind of mechanism Linda offers
isn't suitable for real distributed systems: it requires communication
between nodes. For a simple case, assume one node is at Alpha Centauri
and another is here. For many things, the delays that can ensue from
intermittent connectivity provide even more need for uniqueness without
communication.
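
A minimal sketch of that point, assuming each server already holds a
unique server ID. The class name, the tuple layout and the counter are
illustrative assumptions, not the actual address format:

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Server:
    """A node that mints document IDs from purely local state."""
    server_id: tuple  # assumed to be unique already (hypothetical layout)
    _next_doc: count = field(default_factory=lambda: count(1), repr=False)

    def new_document_id(self) -> tuple:
        # No network call: uniqueness comes from the server ID plus a
        # counter that only this server ever increments.
        return (self.server_id, next(self._next_doc))


# Two nodes that cannot talk to each other in any useful time frame.
here = Server(server_id=(1, 1))
alpha_centauri = Server(server_id=(1, 2))

ids_here = [here.new_document_id() for _ in range(3)]
ids_far = [alpha_centauri.new_document_id() for _ in range(3)]

# The two sets can never overlap, however long the link is down.
assert set(ids_here).isdisjoint(ids_far)
```

Neither node ever asks the other which numbers are taken, which is the
property a shared tuple space cannot give you once the nodes are out of
contact.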
The document, version and author fields are all there for similar
reasons. Yes, you can assign a different number if you want, but you can
use a version number if you want to. Remember it's not a URL and it
doesn't use DNS; the server name isn't a "machine" or an IP node, it's a
different but related abstraction.
The idea of having a server ID is to allow a localization of the
document ID in some automatic and automatable way, without incurring an
n-squared explosion in document notification traffic.
Suppose in a simple situation that there are 10^9 servers, each creating
a document a second. Just notifying every machine of the new documents
will overwhelm every machine, so we don't do it that way. We use the
spread of the knowledge of the document names to provide an abstract of
the locations of the documents, quite analogous to the way DNS keeps
track of where IP nodes are.
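
To put rough numbers on that, here is the arithmetic for the broadcast
case, followed by a sketch of the alternative. The longest-prefix lookup
and the `known_prefixes` table are my assumptions about one way a node
could use the server part of an address to decide whom to ask; they are
not taken from the spec:

```python
SERVERS = 10**9
DOCS_PER_SERVER_PER_SECOND = 1

# Broadcast: each new document is announced to every other server.
new_docs_per_second = SERVERS * DOCS_PER_SERVER_PER_SECOND
inbound_per_server = new_docs_per_second - DOCS_PER_SERVER_PER_SECOND
total_messages_per_second = SERVERS * inbound_per_server

print(f"inbound per server: {inbound_per_server:,} notifications/s")
print(f"network-wide:       {total_messages_per_second:.1e} messages/s")


def next_hop(server_address, known_prefixes):
    """Return the longest known prefix of server_address, or None.

    known_prefixes is a hypothetical local table of server-address
    prefixes this node has heard of; nothing is broadcast to fill it.
    """
    matches = [p for p in known_prefixes
               if server_address[:len(p)] == p]
    return max(matches, key=len, default=None)


known = {(1,), (1, 2), (1, 7, 3)}
hop = next_hop((1, 2, 5), known)  # (1, 2) is the longest matching prefix
```

The broadcast cost grows with the square of the number of servers, while
the lookup only needs whatever partial knowledge happens to have reached
the node doing the asking.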
Having a version number is merely a convenience, and almost orthogonal
to transclusions.
There were a lot of assumptions built into the naming scheme; mostly
they were driven by the need to deterministically derive uniqueness of
document names. Thus the server addresses descended from each other.
There are other ways to do this, but most are worse (dollar bill
serial numbers, random IDs and such). When we wrote the spec no one
thought about real distributed systems. Currently we are in an era
where everyone thinks about distributed systems as connected by the
internet. This too is a limitation we can't really accept. The
discipline imposed by looking at the requirements of long transmission
times and disconnected nodes will lead to a better design at minimal
increased cost.
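
The "server addresses descended from each other" idea can be sketched
roughly as follows. The `ServerNode` class and the tuple encoding are
hypothetical, chosen only to show why descent gives deterministic
uniqueness:

```python
class ServerNode:
    """A server whose address is its parent's address plus one number."""

    def __init__(self, address=(1,)):
        self.address = address
        self._children_issued = 0  # local state, never shared

    def spawn_child(self):
        # The parent alone numbers its children, so two children of the
        # same parent differ in the last field, and children of different
        # parents differ somewhere in the inherited prefix.
        self._children_issued += 1
        return ServerNode(self.address + (self._children_issued,))


root = ServerNode()
a = root.spawn_child()   # (1, 1)
b = root.spawn_child()   # (1, 2)
a1 = a.spawn_child()     # (1, 1, 1)
b1 = b.spawn_child()     # (1, 2, 1)

addresses = [n.address for n in (root, a, b, a1, b1)]
assert len(set(addresses)) == len(addresses)
```

Each address is just the path from the root, so uniqueness follows by
construction. A random ID scheme only makes collisions unlikely, and a
central serial-number scheme needs a round trip to the issuer.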
Remember DNS hides a lot of real problems, but it cheats. One simple
result of this is that I can't connect to Jack's machine because of
details of ISPs and URLs and network routing and NATs and dynamic IP
addresses, just to give an immediate example.
Sorry to be so rambling, but this needs a real discussion, at least
before anyone implements a real system intended to do more than just
reside on the internet under DNS, and even there a few simple hooks can
probably provide for extensibility (another level of indirection
usually is enough).
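
As a sketch of what such a hook might look like, assuming a directory
that maps a stable server ID to whatever transport endpoint reaches it
today. `EndpointDirectory` and the endpoint strings are made up for
illustration:

```python
class EndpointDirectory:
    """Maps stable server IDs to the transport address that works today."""

    def __init__(self):
        self._endpoints = {}

    def announce(self, server_id, endpoint):
        # Called whenever a server becomes reachable some new way.
        self._endpoints[server_id] = endpoint

    def resolve(self, server_id):
        # Returns None if no route to the server is known right now.
        return self._endpoints.get(server_id)


directory = EndpointDirectory()
doc_address = ((1, 2), 57)  # (server ID, document number), hypothetical

directory.announce((1, 2), "dns://docs.example.org:4000")
before = directory.resolve(doc_address[0])

# The server moves behind a NAT and is now reached through a relay.
directory.announce((1, 2), "relay://peer-17")
after = directory.resolve(doc_address[0])

# Only the directory entry changed; the document address did not.
assert before != after
```

The document address never mentions DNS or an IP address, so moving or
rehosting a server touches only the directory entry.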
On Tue, 2005-02-08 at 21:03, Jack Seay wrote:
> Is it necessary to have the server number as part of a document
> address? What if the server is shut down or the document needs to be
> moved? Maybe the author just wants to use a different server.
>
> If a language that implements the Linda functions designed by David
> Gelernter was used, it wouldn’t matter where the document was stored.
> Just make sure it is located in more than one place for safety. There
> wouldn’t be a server address at all. A query about a document would
> just request a document number and distributed agents would retrieve it
> from wherever it is located on the network of servers and return it to
> you.
>
> Also, why have a version number as part of the address? Why not just
> give each new added document a new document number? If it is a new
> version of another document, it will transclude much of the previous
> document. The revisor could be the original author, a group of writers,
> or a different writer. There could be major changes, or just a few. How
> do you decide if it’s a new version or just transcludes a lot from
> another document? If the version number is eliminated from the address,
> it doesn’t have to become a hardcoded item. A separate document could
> tie the various versions together with links. And by looking at all the
> documents that transclude the current one, newer versions will be
> found.
>
> Also, why have the author part of the document address? What if there
> are several authors? What if the author uses several names? What if it
> is published by a group of people working for a business? What if that
> business sells?
>
> Could a series of relational tables (or xanalogical documents or zigzag
> dimensions) store the information on authors, publishers, buyers,
> links, formats, etc. and combine the data in whatever way is needed to
> create the composited document?
--
Roger Gregory
roger@xxxxxxxxxxxxxxxxxxxxx
http://www.halfwaytoanywhere.com