Subject: RE: binary formats
Ralf,

For efficient representation of tabular data with CORBA GIOP, in the past we used column-major order rather than row-major order. It is fast, simple, and efficient, and we’ve been using it successfully for nearly 20 years.

See the attached ZIP: a Column includes an attribute of type Data, which uses a discriminated union selecting one of a set of possible sequence types. The basic idea is that column metadata (like name and type) occurs once per column, rather than once per cell, and the actual column data is just a sequence of primitive values.

Just the other day I was considering mapping this to OData CSDL, where rather than a union type, Data could be an abstract complex type with various subtypes, each of which contains a field whose type is a collection of primitives. That’s just some background, in case we want something like Java ResultSet / .NET DataTable encoded in CSDL.

Now, if rather than defining a new set of data types we just want a more efficient (JSON) encoding for tabular data, we could consider allowing column-major order as an alternative encoding for a collection of entity values (or a collection of complex values).

So where we may now have (omitting metadata) a list of entities:

    { "value": [{"name":"A", "id":1}, {"name":"B", "id":2}, {"name":"C", "id":3}] }

we could instead have a list of entities encoded like this:

    { "name":["A","B","C"], "id":[1,2,3] }

If we need column-specific metadata, just add a field in the object, with a collection of objects each of which contains column metadata, e.g.

I am sure this could be the starting point for a workable proposal.

From: Handl, Ralf
Hi all,

SAP’s most important consumers of OData services are browser-based JavaScript apps, and the most critical performance indicator is the time it takes for the first results to show on the UI. Besides server-side data acquisition, this time includes sender-side packing + network transfer + receiver-side unpacking.

All relevant browsers have built-in GZIP (de)compression and JSON.parse(), so gzipped JSON is the baseline. A binary format that wants to compete has to be faster end-to-end, i.e. the time saved by sending less over the wire has to outweigh the time invested to pack and unpack the data. These considerations obviously went into the choice of GZIP as the predominant compression method; see http://tukaani.org/lzma/benchmarks.html for a comparison of compression algorithms in terms of speed and memory footprint.

We do have one pain point with the current JSON format: sending thousands of entities with hundreds of properties using DescriptivePropertyNamesExactlyDescribingTheSemanticsOfTheirPossibleValues. Here the ratio between data and “package material” is getting rather low, especially if many of the properties have their default value. While GZIP does a decent job of deflating this kind of payload, the time spent on the server to zip plus the time spent on the client to first unzip and then parse the large text stream into JavaScript objects is quite noticeable.
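
To make the size side of this trade-off concrete, here is a rough Python sketch that pivots a row-major entity list into the column-major form suggested in the reply at the top of this thread and compares raw and gzipped JSON sizes. The helper names (to_columns, to_rows, sizes) and the synthetic 1000-row payload are assumptions made for illustration only; they are not part of OData or of any proposal in this thread, and the sketch says nothing about pack/unpack time, which would have to be measured on real payloads in real browsers.

```python
import gzip
import json


def to_columns(rows):
    """Pivot a list of flat dicts (row-major) into a dict of lists (column-major)."""
    # Collect property names in first-seen order across all rows.
    names = list(dict.fromkeys(name for row in rows for name in row))
    # Assumption of this sketch: a property missing from a row becomes None.
    return {name: [row.get(name) for row in rows] for name in names}


def to_rows(columns):
    """Inverse pivot: rebuild the list of dicts from a dict of equal-length lists."""
    names = list(columns)
    length = len(columns[names[0]]) if names else 0
    return [{name: columns[name][i] for name in names} for i in range(length)]


def sizes(payload):
    """Return (raw byte count, gzipped byte count) of the compact JSON encoding."""
    text = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return len(text), len(gzip.compress(text))


# Synthetic data with one deliberately long property name (hypothetical payload).
LONG_NAME = "DescriptivePropertyNamesExactlyDescribingTheSemanticsOfTheirPossibleValues"
rows = [{"id": i, LONG_NAME: i % 7} for i in range(1000)]
columns = to_columns(rows)

row_raw, row_gz = sizes({"value": rows})
col_raw, col_gz = sizes(columns)
print("row-major:    raw", row_raw, "bytes, gzipped", row_gz, "bytes")
print("column-major: raw", col_raw, "bytes, gzipped", col_gz, "bytes")
```

The raw column-major payload is smaller because each property name is written once instead of once per entity; how much of that advantage survives GZIP, and whether it translates into faster end-to-end time, depends on the data.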
A more compact JSON format for “tabular” data as initially proposed in http://www.odata.org/blog/an-efficient-format-for-odata/ might help, but OData’s flexibility with $expand, dynamic properties, and inheritance doesn’t make that as straightforward as it might seem. We’ve experimented with “Recursive JSON”, http://www.cliws.com/e/06pogA9VwXylo_GknPEeFA/, which addresses the problem of long property names and deals with the flexible structure. Combined with omitting properties with default value, https://issues.oasis-open.org/browse/ODATA-818, this might take us far.

Thoughts?

Thanks in advance!
--Ralf

From:
odata@lists.oasis-open.org [mailto:odata@lists.oasis-open.org]
On Behalf Of Ireland, Evan

Mark,

I am not so sure of the shift to binary formats that you predict. I did some “real data” comparison of SAP enterprise data using JSON vs. BSON format, and found BSON to take up more space. Binary formats that encode lengths can chew up more space than text-based formats. (This is similar to CORBA GIOP, where you have a bunch of fixed-length (32-bit) length fields, as well as padding, in the binary encoding.)

Now for some OData services I am toying with, HTML is the default format. Not a standard format for OData, but very useful for system administration tasks like system monitoring (logs, metrics, etc.).

I think if the client doesn’t send an Accept header or specify $format in a URL, then all bets should be off. I suppose I am saying let the server decide a default, if it even has one, rather than insisting on finding a format in the URL or Accept header.

From:
odata@lists.oasis-open.org [mailto:odata@lists.oasis-open.org]
On Behalf Of Mark Stafford

Typically we say that the default is up to the server. The server only needs to support one of the standardized serialization formats, but since JSON is the only format that is fully standardized at this point, we typically expect JSON to be the default response.

I can also say that I *hope* it’s not put down somewhere in the standard as the default. I personally believe we’re at the height of the maturity curve for JSON, and I think that as HTTP debugging tools continue to rapidly improve, we will see a shift in the default serialization format from JSON to a binary format, similar to what we’re seeing happen with HTTP/2. So in my personal ideal future state, we would see something like Avro take over as the default serialization format for OData payloads, but of course that depends upon the ability of the server to choose the right default for the API.

Thoughts? Pushback?

From:
odata@lists.oasis-open.org [mailto:odata@lists.oasis-open.org]
On Behalf Of Mark Biamonte

I know that the default format for services in OData v4 changed to be JSON, but I am having difficulty finding where the spec explicitly states that. I would have expected to find something in the definition of the Accept header and the $format query parameter, something along the lines of: if the Accept header and $format query parameter are not present, then the JSON format is used.

Mark
Attachment:
TabularResults.zip
Description: TabularResults.zip