oslc-core message
Subject: Fw: [lyo-dev] TRS Server paging & persistence proposal
- From: "Jim Amsden" <jamsden@us.ibm.com>
- To: OASIS (oslc-core@lists.oasis-open.org) <oslc-core@lists.oasis-open.org>
- Date: Tue, 5 Jun 2018 21:26:02 -0400
Forwarding from lyo-dev since this is important
for our TRS standards discussions. Nick makes some great and important points
for our consideration.
Jim Amsden, Senior Technical Staff Member
OSLC and Linked Lifecycle Data
919-525-6575
----- Forwarded by Jim Amsden/Raleigh/IBM on 06/05/2018 08:28 PM -----
From: "Nicholas Crossley" <nick_crossley@us.ibm.com>
To: Lyo project developer discussions <lyo-dev@eclipse.org>
Date: 06/05/2018 05:41 PM
Subject: Re: [lyo-dev] TRS Server paging & persistence proposal
Sent by: lyo-dev-bounces@eclipse.org
I believe it is unacceptable for a client
to need to re-read the base when the server rebases and/or truncates the
change log, unless this happens extremely rarely - perhaps once per several
years. Restarting processing of the entire base plus change log can take
4 weeks or more in some existing user installations of IBM's Requirement
Management applications with many millions of requirements. Such users
have been very unhappy if we tell them to rebuild the reporting index from
scratch, and have their reports incomplete for the next month until the
index catches up!
For this reason, I do not think it acceptable to replace all change log
pages at the time of a rebase, unless old pages are also kept for a reasonable
period of time (at least 15 days, and preferably double that). The proposal
allows for this, but does not make it clear that implementations really
should do this.
Under normal circumstances, the client should not need to re-read the base,
and should be able to completely ignore the server's rebase procedure -
it should not need to detect it, because normal processing of the change
log should suffice.
Considering TRS client performance, having the number of change events
in the TRS resource itself be reduced to 1, or a small number during the
period following a rebase, is not ideal. Ideally, clients that poll the
TRS feed at a reasonable frequency might expect to get all the change events
they have not yet processed in the initial GET of the TRS resource - so
keeping that first page fully populated with the most recent change events
is more efficient for the clients. With a reduced number of change events
in the TRS resource, the client has a higher chance of needing to read
the next page of the change log to find the change event it last processed.
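To illustrate the cost, here is a minimal sketch of a client catching up from its last processed event. It is not Lyo code: the in-memory `pages` dictionary stands in for HTTP GETs, the keys `trs`, `p1` and `p2` are made up, and events are reduced to their trs:order values.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for HTTP responses: each key maps to
# (trs:order values on that page, newest first; key of the trs:previous page).
Page = Tuple[List[int], Optional[str]]


def unprocessed_events(pages: Dict[str, Page], start: str,
                       last_processed: int) -> Tuple[List[int], int]:
    """Return (events newer than last_processed, oldest first; pages fetched)."""
    newer: List[int] = []
    fetched = 0
    key: Optional[str] = start
    while key is not None:
        events, previous = pages[key]
        fetched += 1
        for order in events:
            if order == last_processed:
                newer.reverse()
                return newer, fetched
            newer.append(order)
        key = previous
    # The whole log was walked without finding the event the client last saw.
    raise LookupError("last processed event not found; the base must be re-read")


# TRS resource kept fully populated with the most recent events.
full = {"trs": ([10, 9, 8, 7, 6], "p1"), "p1": ([5, 4, 3, 2, 1], None)}
# TRS resource reduced to a single event after a new page was cut.
sparse = {"trs": ([10], "p2"), "p2": ([9, 8, 7, 6, 5], "p1"),
          "p1": ([4, 3, 2, 1], None)}

print(unprocessed_events(full, "trs", 7))    # needs one fetch
print(unprocessed_events(sparse, "trs", 7))  # needs two fetches
```

Both calls return the same three events, but the sparse layout needs a second request to find event 7.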
Nick.
From: Andrii Berezovskyi <andriib@kth.se>
To: Lyo project developer discussions <lyo-dev@eclipse.org>
Date: 06/05/2018 11:51 AM
Subject: [lyo-dev] TRS Server paging & persistence proposal
Sent by: lyo-dev-bounces@eclipse.org
Hello,
The current TRS Server implementation in
Lyo is rather naïve when it comes to long-term server operation. One thing
that can be improved is how paging is done. Another is how Lyo Store
and/or Redis could be used to persist the pages.
Currently, a TRS Server keeps its Change Log in memory and uses simple
URI patterns:
- /services/trs/ points to
- /services/trs/base/1
- /services/trs/changeLog/2
- …
- /services/trs/changeLog/m
When a rebase happens, the server rebuilds the Change Log completely. Finally,
Change Log pages are formed on the fly.
Issues 1 & 2 can cause the following problems:
- Keeping things in memory means a rebase
would happen on every restart.
- A stateful TRS Server also prevents an OSLC
microservice from being placed behind a load balancer.
- When a new event happens, the contents
of all Change Log pages change, and a TRS Client sees previously
observed Change Events in the subsequent Change Log pages.
- The trs:order property may be assigned to different
resources upon rebase and/or restart, and the Cutoff Event would become meaningless.
A TRS Client would detect this and perform a full rebase.
Because of the URI patterns & issue 3, the only way a TRS Client can
detect a rebase is to follow the TRS Base link and check on its first page
whether the Cutoff Event URI has changed, or to wait until it fails to find the
Cutoff Event (or the most recent Change Event observed by the TRS Client).
Jad suggested using Lyo Store to persist Change Events in a
triplestore. But without nice and clean paging that allows pages to be
persisted once and deleted completely once they are “evicted”, doing
this would be challenging. After discussions in the OSLC Core committee (special
thanks to Nick for extensive analysis and detailed examples in the slides),
I came up with the following:
- The TRS Resource should display
a variable number of the most recent changes; the pages should have
a fixed size “n”.
- When the number of Change Events
to be returned with the TRS Resource exceeds the page size, a new page
is created and the number of Change Events returned in the TRS Resource
should go from n+1 to 1.
- The Change Log pages should be
numbered in reverse (see an example below).
- A truncated hash of the Cutoff Event
URI is used in the page URIs, so that the server can return 410 Gone or
404 Not Found when a rebase happens in the middle of the client’s traversal
of the change log pages.
Here is an example:
- /services/trs includes a TrackedResourceSet
resource with
- a ChangeLog resource that
- via trs:previous points to /services/trs/log/ABCD/9
which
- via trs:previous points to /services/trs/log/ABCD/8…
When m events get added, and the “root” ChangeLog resource grows beyond
the page size, we add a page log/ABCD/10:
- /trs includes a TrackedResourceSet resource
with
- a ChangeLog resource that
- via trs:previous points to /services/trs/log/ABCD/10
which
- via trs:previous points to /services/trs/log/ABCD/9…
- the contents of /services/trs/log/ABCD/9
are identical to its contents before page 10 was added
- the ChangeLog resource under /trs now contains
only one Change Event
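The rollover in the example above can be sketched roughly as follows. This is only an illustration under my own assumptions: the class name `PagedChangeLog`, the event labels, and numbering pages from 1 are not taken from Lyo, and a real server would persist pages instead of holding them in a dictionary.

```python
from typing import Dict, List, Optional, Tuple


class PagedChangeLog:
    """Hypothetical model: recent events live in the TRS Resource, older ones in fixed-size pages."""

    def __init__(self, page_size: int, key: str) -> None:
        self.page_size = page_size
        self.key = key  # truncated hash of the Cutoff Event URI, e.g. "ABCD"
        self.recent: List[str] = []  # events shown in the TRS Resource, oldest first
        self.pages: Dict[int, Tuple[str, ...]] = {}  # written once, never modified

    def add_event(self, event: str) -> None:
        self.recent.append(event)
        if len(self.recent) > self.page_size:
            # n+1 events: move the n oldest into a new page and keep only the newest.
            number = len(self.pages) + 1
            self.pages[number] = tuple(self.recent[:self.page_size])
            del self.recent[:self.page_size]

    def previous_uri(self) -> Optional[str]:
        """URI that the root ChangeLog's trs:previous would point to."""
        if not self.pages:
            return None
        return f"/services/trs/log/{self.key}/{len(self.pages)}"


log = PagedChangeLog(page_size=3, key="ABCD")
for i in range(1, 8):
    log.add_event(f"event-{i}")

print(log.recent)          # events still served with the TRS Resource
print(log.pages)           # immutable pages, oldest page has the lowest number
print(log.previous_uri())  # newest page
```

Because a page is only ever written at the moment it is cut, it can be stored once and served unchanged afterwards.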
When a rebase happens and we decide to keep 3 pages' worth of events:
- /trs includes a TrackedResourceSet resource
with
- a ChangeLog resource that
- via trs:previous points to /services/trs/log/9FE2/2
which
- via trs:previous points to /services/trs/log/9FE2/1…
- a request to /services/trs/log/ABCD/5 would
return 404 Not Found if we only keep the hash of the current base or 410
Gone if we also keep a set of older bases.
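A rough sketch of how the key and the status code could be derived is below. The proposal does not name a hash function or a key length, so SHA-256 truncated to 4 hex characters is purely my assumption, as are the function names and the example Cutoff Event URI.

```python
import hashlib
from typing import Set


def base_key(cutoff_event_uri: str, length: int = 4) -> str:
    """Truncated hash of the Cutoff Event URI (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(cutoff_event_uri.encode("utf-8")).hexdigest()
    return digest[:length].upper()


def page_status(requested_key: str, page_number: int, current_key: str,
                current_pages: Set[int], retired_keys: Set[str]) -> int:
    """HTTP status for a request to /services/trs/log/{requested_key}/{page_number}."""
    if requested_key == current_key:
        return 200 if page_number in current_pages else 404
    if requested_key in retired_keys:
        return 410  # the key belonged to an older base that the server remembers
    return 404      # unknown key, or older bases are not tracked


# Hypothetical Cutoff Event URI, only to show the shape of the key.
print(base_key("urn:example:cutoff-event:42"))

# After the rebase in the example: current key 9FE2, pages 1 and 2 exist.
print(page_status("ABCD", 5, "9FE2", {1, 2}, {"ABCD"}))  # older bases remembered
print(page_status("ABCD", 5, "9FE2", {1, 2}, set()))     # only the current base kept
print(page_status("9FE2", 2, "9FE2", {1, 2}, set()))
```

Remembering only the retired keys (not their pages) is enough to tell 410 from 404. Keeping the old pages themselves for a while is a separate decision.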
At the cost of keeping the list of the most recent changes separate from
the paged log, we get a perfectly cacheable solution with predictable behavior
that can be persisted in a triplestore:
- each page must be persisted once
- all pages from a given key get removed
upon rebase
- or we keep all pages for the current and
the last keys
- Varnish can be employed to cache whole
Change Log pages for extended periods of time (>60s) and the TRS Resource
page for shorter periods (<10s).
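One way to drive such a cache is through response headers. This is a small sketch assuming the server sets Cache-Control itself; the function name and the concrete TTLs of 300 s and 5 s are my own picks within the ranges above, and the path prefix follows the example URIs.

```python
def cache_control(path: str, page_ttl: int = 300, trs_ttl: int = 5) -> str:
    """Cache-Control value for a TRS response (TTL values are assumptions)."""
    if path.startswith("/services/trs/log/"):
        # Change Log pages never change once written, so a long TTL is safe.
        return f"public, s-maxage={page_ttl}"
    # The TRS Resource changes with every new event, so keep its TTL short.
    return f"public, s-maxage={trs_ttl}"


print(cache_control("/services/trs/log/ABCD/9"))
print(cache_control("/services/trs"))
```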
Feedback is welcome!
/Andrew
_______________________________________________
lyo-dev mailing list
lyo-dev@eclipse.org
To change your delivery options, retrieve your password, or unsubscribe
from this list, visit
https://dev.eclipse.org/mailman/listinfo/lyo-dev