Subject: AW: AW: [sdo] ChangeSummary and OrphanHolder properties
Hi Blaise,
Maybe I should move CS up on the agenda. I think we
are mostly agreeing, but I have some problems with the logic in the second
paragraph. What if customer is itself an orphan, "contained" via an
orphanHolder property? Unless the property is bidirectional, a Customer
would have no way of knowing how or if it is in scope of a CS. Or am I
missing something? In order to implement this,
or DataObject.getChangeSummary(), won't DataObjects have to store
a reference to the containing ChangeSummary, thereby increasing our memory
footprint? There's of course another way to implement this, by
keeping the contents of the orphanHolder properties "live", rather than
calculating them at the moment they are necessary (e.g., CS.beginLogging,
XMLHelper.save). It seems to me that this not only increases our footprint,
it is also unnecessarily complex.
I'm also wondering why we need
DataObject.getChangeSummary() at all. Isn't the normal case that the DAS
or whoever will receive the DataGraph (or whatever), and the ChangeSummary will
be a property of some high-level envelope type object? I certainly don't expect
to see CS properties on business objects. If that's the case, why not
retrieve the CS through get("changeSummary") instead of
getChangeSummary()? What is the use case for calling getChangeSummary() on
an ordinary DO?
Ron
From: Blaise Doughan [mailto:blaise.doughan@oracle.com]
Sent: Tuesday, 19 May 2009 16:59
To: Barack, Ron
Cc: Radu Preotiuc; sdo@lists.oasis-open.org
Subject: Re: AW: [sdo] ChangeSummary and OrphanHolder properties

Hi Ron,

I interpret your initial description as: "Containment still represents the ChangeSummary scope, and orphan properties can be thought of as containment properties in terms of this scoping". This is consistent with the role of orphan properties wrt the XML representation.

An orphan DataObject could become part of a ChangeSummary as demonstrated in the following example: when an Address DataObject is set on a non-containment property on a Customer DataObject, it looks at the Customer DataObject and recursively through its containers until it finds a DataObject with a suitable orphan property. Once found, it takes the ChangeSummary referenced by the DataObject with the requisite orphan property. Also, an orphan DataObject can and should return the ChangeSummary monitoring its changes from its getChangeSummary method.

It seems unnecessary to call endLogging before the ChangeSummary can be interrogated. If calculations need to be done, they could be triggered by the individual ChangeSummary calls.

I agree that DataObjects belong in only one ChangeSummary; I believe this also means that DataObjects are referenced by only one orphan property.

-Blaise

Barack, Ron wrote:

Hi Radu,

Thanks for the comments. Regarding endLogging(): my concern is that the consideration of orphanHolder properties would make the calculation of the changeSummary much more expensive. Certainly it is much harder to calculate ChangeSummary.isModified(orphanObject) than ChangeSummary.isModified(containedObject): to determine whether an object is in scope, in the latter case I can simply go up the containment tree, but in the former I have to search the entire tree, starting at the root.
Rather than making each call to isModified expensive, it seems to me that the only way to get reasonable performance without increasing memory footprint is for the list of changed objects to be calculated once, so that isModified can simply check if the object is contained in the list. Obviously, when "beginLogging" is called, we need to traverse the containment tree, and if any orphanHolder properties are present, we need to include any matching orphans in the scope of the change summary. We are going to have to do this again when calculating the end-state of the change summary. But change summary is really poorly defined: it doesn't really define an "endState", only a "currentState". So the question is, when should the "end-state", and therefore the list of changed objects, be calculated?

Since I'd be happy anyway to have some kind of reasonable meaning to associate with endLogging(), I thought maybe we could use endLogging for this purpose. Lately, however, I've been thinking about simply using getChangedObjects() for this purpose. After all, I don't think the normal use of isModified is that the client takes some arbitrary object and asks "is this object modified in reference to this change summary". I think the normal use is that the user (probably a DAS) first calls getChangedObjects, then iterates over the list, using isModified, isCreated and isDeleted to determine the nature of the changes. Since getChangedObjects has to traverse the tree anyway, it seems a natural place to consider orphanHolders in the traversal algorithm.

On the first question: no, I don't see the need to allow tracking of objects from multiple change summaries. This would add a lot of complexity, and I don't really see the use case. If we consider the standard SDO story, with a DAS providing disconnected data to a client who makes updates to the graphs, then I don't see orphan objects as floating between the graphs as some sort of shared objects.
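The "calculate the list once, then check membership" idea can be sketched roughly as follows. This is only an illustration in Java, not the SDO API: the class name CachedChangeSetSketch, the isChanged method and the traversals counter are all made up for the example, and the expensive tree walk (including orphans) is represented by a supplier passed in from outside.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the SDO API.
class CachedChangeSetSketch<T> {
    private final Supplier<Set<T>> traversal;  // stands in for the expensive walk, orphans included
    private Set<T> changed;                    // null until the first traversal
    int traversals;                            // counts how often the walk actually ran

    CachedChangeSetSketch(Supplier<Set<T>> traversal) {
        this.traversal = traversal;
    }

    /** Runs the walk and refreshes the cached set of changed objects. */
    Set<T> getChangedDataObjects() {
        changed = new LinkedHashSet<>(traversal.get());
        traversals++;
        return Collections.unmodifiableSet(changed);
    }

    /** Membership test against the cached set; walks only if nothing is cached yet. */
    boolean isChanged(T dataObject) {
        if (changed == null) {
            getChangedDataObjects();
        }
        return changed.contains(dataObject);
    }
}
```

With this shape, a DAS that calls getChangedDataObjects() once and then queries each returned object pays for a single traversal; the trade-off is that the per-object answers are only as fresh as the last call.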
I don't think orphanHolders necessarily bring us into a world where the association of an object with the DAS call that retrieved it is weaker than in 2.1. I guess the point of the proposal is that orphaned objects are as much "owned" by the graph as contained objects; it's just that they are owned via references and not via containment properties. All orphanHolder properties allow us to do is avoid imposing a containment structure just because we want to serialize to XML or have a change summary. Even in 2.1, objects could potentially be in scope of multiple change summaries, but we say this is an error. I don't really see the motivation to relax this restriction.

Best Regards,
Ron

-----Original Message-----
From: Radu Preotiuc [mailto:radu.preotiuc-pietro@oracle.com]
Sent: Tuesday, 19 May 2009 00:07
To: Barack, Ron
Cc: sdo@lists.oasis-open.org
Subject: Re: [sdo] ChangeSummary and OrphanHolder properties

Hi Ron, thanks for the write-up. A couple of initial observations:

One thing about orphans is that because they are not (by definition) contained anywhere, they can be referenced from two different containment trees, each with its own ChangeSummary. Is the intention of the proposal that the change be tracked in both places? That would seem necessary, in case the orphan is "removed" from one of the trees.

The second observation is in regard to calling endLogging(). Do you propose that methods on the ChangeSummary interface like ChangeSummary.isModified/Created/Deleted and ChangeSummary.getChangedDataObjects can only be called after endLogging()? That seems rather limiting.

Radu

On Tue, 2009-05-12 at 14:12 +0200, Barack, Ron wrote:

Hi Everyone,

I think it is about time to move the discussion of the ChangeSummary from the DAS group to this TC. I see two major questions here: how do orphanHolder properties affect the change summary, and how does projection affect the change summary. Here is one approach to the orphanHolder question.
MOTIVATION AND BACKGROUND

Containment is a central concept in SDO, corresponding to UML aggregation. However, in practice, containment is often used not to define the characteristics of the business model, but rather to control SDO functionalities such as XML serialization and ChangeSummary. In order to use these capabilities, a defined containment structure must be imposed on the data. Depending on the data source, this could be very unnatural and arbitrary. At least from our perspective, almost all data really comes from relational databases, and in this case, containment is very unnatural indeed. Even in cases where the data is structured, this is often only an (unwanted) by-product of the need to go over a WebService wire, and does not reflect the nature of the underlying data model.

SDO 3 provides two methods of serializing non-closed data graphs to XML. First, the transitive closure may be included by packing the data graph in an envelope object that has "orphanHolder" properties. During XML serialization, orphanHolders collect any referenced objects that are not otherwise contained in the XML document. This results in the transitive closure being included in the XML document, which may be what the user wants, but is potentially very much a performance killer. The other approach is to impose (or remove) a containment structure on-the-fly, using the "project" method.

Providing ChangeSummary is one of SDO's main talking points, but in SDO 2, ChangeSummaries are useful only if the data graph is hierarchically structured, because the set of data objects tracked by the change summary (its "scope") is defined using containment.
Since we've decided that SDO 3 will solve the serialization of non-closed data graphs and remove the restriction that "closed" is the normative state of data graphs, it makes sense to similarly loosen the restrictions on containment wrt ChangeSummary, that is, to find a definition of ChangeSummary that is meaningful when the graph does not have a containment structure.

REQUIREMENTS

The solution SHOULD provide a meaningful definition of the scope of the change summary that can be applied to models where no containment structure has been defined.

The solution MUST be backwards compatible, in regard to both functional and non-functional requirements. By non-functional requirements, I mean the performance hit implied by the solution should be minimal, at least in cases where only SDO 2 features are used.

APPROACH

The basic approach is to use the new SDO 3 structures, projection and orphanHolders, as the basis of a solution. This has the advantage of not unnecessarily further complicating our model, and also helps us achieve backwards compatibility: since the new behavior is defined only in regard to these SDO 3 constructs, applications that use only SDO 2 constructs will continue to behave as before.

The change summary is actually defined as a delta between the before-state and after-state of the data graph. As I formulate the approach, I will stick to this definition, and come back later to a discussion of more practical implementations. The spec talks about the "scope" of a change summary. I think this is confusing terminology, because it sounds as if "scope" is something that exists outside of operations (e.g., "beginLogging") on the change summary. We only have to calculate what is in scope of a ChangeSummary in order to determine the "before" image and the "after" image.

In SDO 2.1, the "before-image" of the change summary is the containment tree at the point in time that beginLogging() is called.
Our proposal is that this be extended in SDO 3.0 to include the contents of orphanHolder properties in the containment tree. If any orphanHolder properties are found, any DataObjects that are referenced but not contained by the containment tree are also part of the "before-image" of the change summary. If there are no orphanHolder properties, the behavior should be identical to SDO 2.1.

In cases where orphanHolder properties are present, it is clear that beginLogging can be an expensive operation. However, I believe this functionality can be implemented such that, for the 2.1 cases, where no orphanHolder properties are present, and also in cases where the graph itself is closed, performance of SDO 3 should be comparable with that of SDO 2. This is of course something that I would need a prototype to verify.

The expense involved in calculating the set of DataObjects to be included in the "after" image is a bit of a problem, because SDO 2.1 is pretty loose about the ChangeSummary lifecycle. In particular, at least as I read the spec, the user is not really required to call "endLogging". Clearly, if the user does call "endLogging", we have a concrete point at which to calculate the scope, and, in particular, to calculate the list to be returned by ChangeSummary.getChangedDataObjects() and the set of objects for which isModified, isCreated and isDeleted will return "true". If the user is not required to call endLogging(), then each of the ChangeSummary methods becomes potentially very expensive, which I think is bad design.

I'm going to raise an issue in the SDO TC to discuss how implementations interpret the endLogging() call. I think it's actually reasonable to require it, and define it as the time at which the ChangeSummary is calculated. If this is considered a breaking change, then we can always say that the list of orphan nodes is only calculated when endLogging is called.
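To make the extended "before-image" concrete, here is a rough Java sketch of the scope calculation at beginLogging(). It is an illustration under simplifying assumptions, not an SDO implementation: Node is a made-up stand-in for DataObject, and a single hasOrphanHolder flag stands in for a real orphanHolder property (type matching of orphans against a particular orphanHolder is ignored).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model; Node is not the commonj.sdo DataObject interface.
class ScopeSketch {
    static final class Node {
        final String name;
        final List<Node> contained = new ArrayList<>();   // values of containment properties
        final List<Node> referenced = new ArrayList<>();  // values of non-containment properties
        boolean hasOrphanHolder;                          // assumption: type declares an orphanHolder
        Node(String name) { this.name = name; }
    }

    /** Objects that belong to the "before" image when logging starts at root. */
    static Set<Node> beforeImage(Node root) {
        Set<Node> scope = new LinkedHashSet<>();
        boolean orphanHolderSeen = false;
        Deque<Node> work = new ArrayDeque<>();

        // Pass 1: the SDO 2.1 behavior, a plain walk of the containment tree.
        work.push(root);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (!scope.add(n)) continue;
            orphanHolderSeen |= n.hasOrphanHolder;
            for (Node c : n.contained) work.push(c);
        }
        if (!orphanHolderSeen) return scope;  // no orphanHolder: identical to 2.1

        // Pass 2: pull in everything that is referenced but not contained,
        // transitively, since orphans may reference further orphans.
        work.addAll(scope);
        while (!work.isEmpty()) {
            Node n = work.pop();
            for (Node r : n.referenced) if (scope.add(r)) work.push(r);
            for (Node c : n.contained) if (scope.add(c)) work.push(c);
        }
        return scope;
    }
}
```

The second pass runs only when an orphanHolder was seen, which is the property the backwards-compatibility requirement asks for: graphs that use only SDO 2 constructs pay for the containment walk and nothing else.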
GETTING THE CHANGE SUMMARY

In SDO 2.1, the DataObject.getChangeSummary() method can simply walk up the containment tree looking for a DataObject with a ChangeSummary property. Under the approach I'm outlining here, this won't be possible for objects that are included via orphanHolder properties. There are two possible approaches here. First, when CS.beginLogging is called, an implementation could find all the orphans and call some (internal) "setChangeSummary()" method. This has a major drawback in that it will increase the memory footprint of the objects. I would actually prefer the second: to say that getChangeSummary should be unchanged from 2.1, meaning that orphan objects may return "null". I think this is not a significant limitation, since DASs will typically know where the ChangeSummary is (namely, on the DataGraph envelope), and use "getChangedObjects" to find the changes to process. In fact, I wonder if we should consider deprecating getChangeSummary, since the change summary should be found through calling a normal getter.

XML

As I described above, I think the approach requires a slightly better defined ChangeSummary lifecycle, namely, it requires something like "endLogging" that tells the implementation when to walk the tree and calculate the nodes that are in the "after" image, used to calculate the ChangeSummary. I've defined everything so far in terms of the API only, that is, in-memory use cases. I think that when XMLHelper.save is called, the after-image should be updated, and the created, modified and deleted lists updated. When an XML document that contains a CS is loaded, these lists are current, and all changeSummary methods should reflect the state of the change summary as read from the XML. It's as if the user has just called "endLogging".
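The 2.1-style lookup discussed under GETTING THE CHANGE SUMMARY, and why it yields "null" for orphans, can be sketched like this. The classes below are hypothetical minimal stand-ins written for the example, not the commonj.sdo interfaces; in particular, the container and changeSummary fields are assumptions about how an implementation might hold this state.

```java
// Hypothetical minimal model; not the commonj.sdo API.
class ContainerWalkSketch {
    static final class ChangeSummary { }

    static final class DataObject {
        DataObject container;         // null for roots and for orphans
        ChangeSummary changeSummary;  // non-null only where a ChangeSummary property is set

        /** Walks up the containment chain until some object carries a ChangeSummary. */
        ChangeSummary getChangeSummary() {
            for (DataObject d = this; d != null; d = d.container) {
                if (d.changeSummary != null) {
                    return d.changeSummary;
                }
            }
            return null;  // orphans end up here: they have no container to climb to
        }
    }
}
```

An orphan Address that is only referenced from a Customer has no container, so the walk stops immediately. Returning the envelope's ChangeSummary for it would need either a stored back-reference (the memory cost mentioned above) or a search from the root.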
IMPLEMENTATION IDEAS

Although the "snapshot" mode is useful for defining the behavior of ChangeSummary, I imagine that most implementations do not "make images" of the before state, but rather, when a setter is called, do some sort of calculation of whether the node is "in scope" of a change summary, and, if it is, somehow remember the old value. As with getChangeSummary(), we have a problem here when orphans are included in the scope. For such implementations, it will be necessary to traverse the scope of the CS, including orphans, and set a bit indicating that the object is "in scope" of a change summary. Even if this requires storing an additional boolean, this would in all likelihood not increase the memory footprint of the data graph, at least not in Java, since objects are aligned on word boundaries. And, of course, it is possible to do better, combining several such flags into a single byte. So I think the costs here are very much acceptable. In fact, there's also an upside to the approach: going up the containment tree to find out if an object is in scope will necessarily be slower than checking a bit.

CONCLUSION AND FURTHER WORK

Again, the ideas here are intended to represent only a potential approach; prototyping the solution will definitely be necessary. However, I think the ideas are appealing, because they address the issue without breaking backwards compatibility. If these ideas find acceptance, I would like next week to issue a similar approach that uses projection.

Comments welcome!

Best Regards,
Ron