Jabber


kmj
08-01-2002, 09:41 AM
www.jabber.org


Has anyone else here taken a look at Jabber? Any thoughts on the subject?

jemfinch
08-01-2002, 11:22 AM
XML is an amazingly bloated, wasteful, and overly redundant format. It's like the Perl of data formatting.

That's what Jabber convinced me of :)

Jeremy

darelf
08-01-2002, 03:06 PM
Originally posted by jemfinch
XML is an amazingly bloated, wasteful, and overly redundant format. It's like the Perl of data formatting.

That's what Jabber convinced me of :)

Jeremy

The point of XML is to work everywhere for every kind of document. So of course it will be generalized, and not necessarily a good fit for everything....

Strike
08-01-2002, 05:38 PM
Originally posted by jemfinch
XML is an amazingly bloated, wasteful, and overly redundant format. It's like the Perl of data formatting.

That's what Jabber convinced me of :)

Jeremy
Hooray for unqualified statements.

I'm with darelf. Like your IRC bot, jemfinch, it's not designed to be good at one specific thing, but rather generic enough that you can do just about anything with it. If programmers of the world agreed with you, then MS Office wouldn't use XML, nor would GNOME, nor would W3C, etc.

jemfinch
08-01-2002, 05:59 PM
If the programmers of the world agreed with me, Strike, we wouldn't have buffer overflows or type errors or anything like that.

XML is overly redundant. There's always a battle when designing a DTD between making some bit of data an attribute on an already existing element or a new subelement of an already existing element. This is the kind of redundancy that we all love that Python shuns -- it's a practically useless way to make things "different" -- it's The Perl Way, TMTOWTDI. Someone who knows XML better than I is welcome to explain to me why an XML programmer should use attributes instead of subelements, and that's fine, but it's still duplicated functionality.
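
As a rough illustration of the attribute-versus-subelement choice described above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The element and attribute names (message, to, body) are made up for the example and are not taken from Jabber or any real DTD; the point is only that both encodings carry the same information and a reader has to handle whichever one the designer picked.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same message; the names are invented
# for illustration and do not come from Jabber or any particular DTD.
AS_ATTRIBUTE = '<message to="kmj">hello</message>'
AS_SUBELEMENT = '<message><to>kmj</to><body>hello</body></message>'


def recipient(xml_text):
    """Return the recipient regardless of which encoding was used."""
    root = ET.fromstring(xml_text)
    # Attribute form: the value lives on the element itself.
    if "to" in root.attrib:
        return root.attrib["to"]
    # Subelement form: the value is the text of a child element.
    return root.findtext("to")


print(recipient(AS_ATTRIBUTE))
print(recipient(AS_SUBELEMENT))
```

Both calls yield the same recipient, which is the duplicated functionality being complained about: the format itself does not say which form to prefer.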

XML is bloated. For the majority of uses of XML (including Jabber) the amount of text the XML metadata itself takes up is nearly as much or more than the amount of actual data tagged. It's wasteful for entirely this reason -- on the disk, it wastes space, on the network, it wastes bandwidth, and on the CPU, it wastes CPU time parsing it.
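
One way to put a number on the overhead claim is to compare payload characters with markup characters. The sketch below is only an approximation under stated assumptions: the sample message is invented (it is not an actual Jabber stanza), and "payload" is taken to mean element text plus attribute values, with everything else counted as markup.

```python
import xml.etree.ElementTree as ET

# A made-up chat message; not an actual Jabber stanza.
SAMPLE = '<message from="kmj" to="jemfinch"><body>any thoughts?</body></message>'


def markup_overhead(xml_text):
    """Return (payload_chars, markup_chars) for an XML string.

    Payload is counted as element text plus attribute values; everything
    else (tags, attribute names, quotes, brackets) is counted as markup.
    """
    root = ET.fromstring(xml_text)
    payload = 0
    for elem in root.iter():
        payload += len(elem.text or "") + len(elem.tail or "")
        payload += sum(len(value) for value in elem.attrib.values())
    return payload, len(xml_text) - payload


payload, markup = markup_overhead(SAMPLE)
print(f"payload: {payload} chars, markup: {markup} chars")
```

For a short message like this one the markup outweighs the data; for documents with long text bodies the ratio shifts the other way, so the result depends heavily on the kind of data being tagged.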

XML was designed to be easy to parse (they wanted it to be a two-week project for a CS grad student to write a non-validating XML parser). It was designed to be programming-language independent. But I don't think it's the best design for either of these (not that I'm offering any better designs).

Jeremy

Strike
08-01-2002, 06:22 PM
I'm no XML expert, but kmj is begging me to weigh in :)

First of all, the programmers of the world do agree with you on buffer overflows and type errors. "Correctness" isn't the only factor when you code, you know. I know what your priorities are, being a linguistics student, but that's not what the driving factor is in business decisions.

Originally posted by jemfinch
XML is overly redundant. There's always a battle when designing a DTD between making some bit of data an attribute on an already existing element or a new subelement of an already existing element. This is the kind of redundancy that we all love that Python shuns -- it's a practically useless way to make things "different" -- it's The Perl Way, TMTOWTDI. Someone who knows XML better than I is welcome to explain to me why an XML programmer should use attributes instead of subelements, and that's fine, but it's still duplicated functionality.

Subelements and attributes aren't very different syntactically; you are right. They are different in what they represent. XML is supposed to represent a structure, and certain things in documents lend themselves to being attributes of elements (such as the id and name), whereas some things are meant to be contained by those elements, not explicitly attached to them. In some DTDs, it is indeed difficult to tell the difference. For example, in DocBook, you often use an <AuthorInfo> block (I think it is, don't want to look it up right now) from within a document (that is, the document contains the AuthorInfo) to delineate who wrote the document. I think a better, more appropriate way of doing so would be to do something like <chapter author="StrikeInfo">...</chapter> and have "StrikeInfo" be essentially an xref to an AuthorInfo block with my info in it. So, yes, there are many ways of doing things, but you know yourself that there is no way of preventing this in any language, especially one designed to be as general purpose as XML. Even Python has bad ways of doing things. XML doesn't encourage poor design any more than Python does. Python just has a bit more freedom in that it can discourage BAD design more easily because it's more complex.
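
The xref idea above can be sketched in a few lines of Python with xml.etree.ElementTree. This is a minimal illustration, not real DocBook: the element names (book, authorinfo, chapter) and the id/author attributes are assumptions chosen to mirror the example in the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup loosely modeled on the idea above; these element and
# attribute names are illustrative and are not real DocBook.
DOC = """
<book>
  <authorinfo id="StrikeInfo"><name>Strike</name></authorinfo>
  <chapter author="StrikeInfo"><title>Jabber</title></chapter>
</book>
"""


def chapter_authors(xml_text):
    """Map each chapter title to the name in the author block it references."""
    root = ET.fromstring(xml_text)
    # Index author blocks by their id attribute.
    authors = {a.get("id"): a.findtext("name") for a in root.iter("authorinfo")}
    # Resolve each chapter's author attribute against that index.
    return {
        c.findtext("title"): authors.get(c.get("author"))
        for c in root.iter("chapter")
    }


print(chapter_authors(DOC))
```

Here the attribute holds a short reference attached to the chapter, while the author details stay in a contained block, which is roughly the attribute-versus-content distinction being argued for.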


Originally posted by jemfinch
XML is bloated. For the majority of uses of XML (including Jabber) the amount of text the XML metadata itself takes up is nearly as much or more than the amount of actual data tagged. It's wasteful for entirely this reason -- on the disk, it wastes space, on the network, it wastes bandwidth, and on the CPU, it wastes CPU time parsing it.

Oh please, you of all people bitching about CPU time? You are the one who advocates Python for many things and pretty much says "CPU cycles be damned" unless they really need to be fast. Granted, a lot of the uses for XML are often more metadata than data, but so what? People are still developing microkernels even though the "metaprocesses" (i.e., the message passing between kernel modules) consume the VAST majority of the kernel processing time. It's a tradeoff some are willing to make; it's not up to you to say what is right or wrong except in cases where you may elect to use it or not.


Originally posted by jemfinch
XML was designed to be easy to parse (they wanted it to be a two-week project for a CS grad student to write a non-validating XML parser). It was designed to be programming-language independent. But I don't think it's the best design for either of these (not that I'm offering any better designs).

It was also designed to be easy for humans to read (and, considering the vast number of uses it has, it does a pretty good job of that). And until you DO offer up an alternative, your argument of "I don't think it's the best design" is nothing but opinion.

Bradmont
08-01-2002, 10:46 PM
One of my instructors totally loved XML, for the reason that it's self-documenting. Lots of the time you end up with data files from some old program that doesn't exist any more, and the data are next to unreadable, since they are in some unknown, unreadable format. A major advantage of XML is that the data will still be usable a long time in the future, when the original program that handled those data is long gone/lost/unusable/whatnot.

kmj
08-05-2002, 10:37 AM
Originally posted by Strike
I'm no XML expert, but kmj is begging me to weigh in


I wouldn't quite say I was begging.. it was more along the lines of "I'd like to see your reply". :fu: :)

sicarius
08-06-2002, 10:42 AM
My favorite part about XML is the number of languages that have bindings for it. C, Java, Perl, etc. Not only is it platform independent, but by using a simple library I can access the same information in almost any language and still get the same info, even if I decide to treat that data differently.
I use XML quite a bit when I need a program that uses config files, for the simple reason that it takes a whole lot less time to do that than to make my own specialized language and parser each time. I'm not insinuating that XML is the only solution in that case, but it is a very handy one.
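
As a sketch of the config-file use case, the following Python example reads a small XML config with the standard library instead of a hand-written parser. The config layout (a server element with host and port attributes, plus option elements) is invented for the example and is not taken from any real program.

```python
import xml.etree.ElementTree as ET

# A hypothetical config file; the element and attribute names are invented.
CONFIG = """
<config>
  <server host="example.org" port="5222"/>
  <option name="debug">true</option>
</config>
"""


def load_config(xml_text):
    """Parse the config into a plain dict."""
    root = ET.fromstring(xml_text)
    server = root.find("server")
    settings = {
        "host": server.get("host"),
        "port": int(server.get("port")),
    }
    # Each <option name="..."> becomes a key holding its text content.
    for option in root.iter("option"):
        settings[option.get("name")] = option.text
    return settings


print(load_config(CONFIG))
```

The same file could be read from another language with that language's XML library, which is the portability argument made above.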

XML, like any other language, is prone to bad programming and style. So by saying that XML is bloated and redundant you are only right on a document-by-document basis. Also, if you (jemfinch) are so worried about the CPU time that it takes to parse the file, why are you writing code in an interpreted language instead of C? Most libraries that bind language foo to XML are going to have the data in a balanced tree anyhow, so accessing the data after it has been parsed is pretty fast.