            • November 24, 2004

              Python and XML/XSL/XPath

            • XML.com: Location, Location, Location

              Article seems fairly informative. If you are considering Python as a possible base development language for your XML and XML transformation related projects, this seems like a decent code-based sample overview.

              There are a few little annoyances I’ve noticed. I’ll document them in the extended portion of this entry.

              One of the more annoying things I come across in code samples is when authors use XML/XPath/XSLT that, while technically it will work, is simply bad form and should not be propagated to people who may be new to the technology and end up picking up bad habits (that will eventually come back to annoy me again at some point… what an awful cycle… bllaaahhh!)

              Here’s a perfect example…

              <?xml version="1.0" encoding="iso-8859-1"?>
              <labels>
                <label added="2003-06-20">
                  <quote>
                    <emph>Midwinter Spring</emph> is its own season&#8230;
                  </quote>
                  <name>Thomas Eliot</name>
                  <address>
                    <street>3 Prufrock Lane</street>
                    <city>Stamford</city>
                    <state>CT</state>
                  </address>
                </label>
                <label added="2003-06-10">
                  <name>Ezra Pound</name>
                  <address>
                    <street>45 Usura Place</street>
                    <city>Hailey</city>
                    <state>ID</state>
                  </address>
                </label>
              </labels>
              

              /labels[1]/label[1]/quote[1]/emph[1]/text()[1]: text node with value “Midwinter Spring”

              OK, can you see my point?

              First, what’s the position selector doing at the root element “labels”? There’s only one root element (not to be confused with the root node… search the archives or look at DaveP’s FAQ to understand the difference) in a well-formed XML file, and given that “labels” follows on the next line after the “<?xml snip…” declaration, the “it’s an XML fragment… there can be many labels within this document” argument isn’t going to fly.
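The single-root point is easy to check from Python. This is only a minimal sketch: it uses the standard library's `xml.dom.minidom` (my choice for illustration, not what the article uses), a trimmed copy of the sample with the XML declaration left out, and a made-up variable name `element_children`.

```python
from xml.dom import minidom

# Trimmed copy of the sample document (XML declaration omitted for brevity).
SAMPLE = """<labels>
  <label added="2003-06-20">
    <quote>
      <emph>Midwinter Spring</emph> is its own season&#8230;
    </quote>
    <name>Thomas Eliot</name>
  </label>
  <label added="2003-06-10">
    <name>Ezra Pound</name>
  </label>
</labels>"""

doc = minidom.parseString(SAMPLE)

# Collect the element children of the document node. A well-formed
# document has exactly one, so a positional predicate on that step
# can never pick anything other than the same element.
element_children = [n for n in doc.childNodes if n.nodeType == n.ELEMENT_NODE]

print(len(element_children))        # 1
print(element_children[0].tagName)  # labels
```

Since the count is always one, `/labels[1]` and `/labels` select the same node.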

              Second, there is no text node that is a child node of “emph”. The text contained within the “emph” element is the value of “emph”, not the value of a text node that is a child of “emph”. There’s a text node in this sample that is a sibling of “emph”. So you could use something like:

              /labels/label[1]/quote[1]/emph[1]/following-sibling::text()

              … this would be a nice way to showcase how you would access the value of the text that follows the “emph” element. Notice I took away the [1] that would specify the first text node sibling of “emph”. There can only ever be one text node following-sibling (or preceding-sibling for that matter) of any element, as the parser sees everything after the “>” of the closing element and before the “<” of the next opening element, if anything but white space exists, as one long string of text which it dubs a “text node”. (Well, the white space part would depend on the processor, to be completely accurate, but no need to bring that into play at this point… DEFINITELY worth its own blog entry at some point though!)
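For this particular sample, the sibling text can be inspected with a few lines of Python. Again a rough sketch only, assuming the standard library's `xml.dom.minidom` as a stand-in for an XPath engine. The DOM walk below approximates what `following-sibling::text()` selects, and the names `QUOTE` and `following_text` are mine.

```python
from xml.dom import minidom

# Just the <quote> element from the sample document.
QUOTE = """<quote>
  <emph>Midwinter Spring</emph> is its own season&#8230;
</quote>"""

quote = minidom.parseString(QUOTE).documentElement
emph = quote.getElementsByTagName("emph")[0]

# Walk the siblings that come after <emph> and keep only text nodes.
following_text = []
node = emph.nextSibling
while node is not None:
    if node.nodeType == node.TEXT_NODE:
        following_text.append(node.data)
    node = node.nextSibling

print(len(following_text))                # 1
print(repr(following_text[0].strip()))    # 'is its own season…'
```

Here the text between the closing “emph” tag and the closing “quote” tag comes back as a single text node, trailing white space included.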

              Well, enough annoying code for one day… I’m headed off to bed… Got a Thanksgiving Day Turkey Bowl to attend in the morning. For those wondering what I mean by Turkey Bowl, it refers to a football game played every year on Thanksgiving morning by your local crowd of weekend warriors, which, over the years, I have sadly become. Chances are there is a Turkey Bowl planned for somewhere near your home… a park, a school ball field, wherever there’s somewhere that can somehow be dubbed a makeshift football field, chances are good there will be a really bad game of football being played on its surface tomorrow morning… I’m going to start taking Ibuprofen now so I can ease into the pain later… :)

              <M:D/>
              
            • Posted by m.david : November 24, 2004 09:01 PM GMT


            Comments

              • Your comments are extremely unfair, because you pick quotes from the article entirely out of context in order to make an entirely bogus point. If you look at my published body of XSLT out there (I’ve forgotten more XSLT that I’ve published than most people have ever contemplated writing in the first place), you won’t find a single place where I use a construction such as /labels[1] to access the document element of a well-formed source document in XSLT.

                In the article I was demonstrating techniques for automatically *generating* XPath. I point out a couple of times in the article that the XPaths aren’t ideally constructed. After all, they’re examples of machine generation: how often do you see machine-generated code that is very elegantly constructed? Have you seen XSLT generated from schematron.xsl?

                Of course I know that there is only one document element in a well-formed XML file, and that XSLT requires text nodes to be automatically normalized. I expect that most people who can follow XPath know that as well. The article was not a tutorial on XPath, nor did it claim to be.
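To illustrate why mechanically generated paths tend to carry a predicate on every step, here is a hypothetical sketch of such a generator. It is not the code from the article: it uses only the standard library's `xml.dom.minidom`, and the function name `generated_xpath` is invented for this example.

```python
from xml.dom import minidom


def generated_xpath(node):
    """Build an absolute XPath for an element, one positional step per level."""
    steps = []
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        # Count preceding siblings with the same name to get the 1-based position.
        position = 1
        sibling = node.previousSibling
        while sibling is not None:
            if sibling.nodeType == sibling.ELEMENT_NODE and sibling.tagName == node.tagName:
                position += 1
            sibling = sibling.previousSibling
        # A generator has no cheap way to know a predicate is redundant,
        # so it emits one for every step, the root element included.
        steps.append(f"{node.tagName}[{position}]")
        node = node.parentNode
    return "/" + "/".join(reversed(steps))


doc = minidom.parseString(
    "<labels><label><quote><emph>Midwinter Spring</emph></quote></label>"
    "<label><name>Ezra Pound</name></label></labels>"
)
emph = doc.getElementsByTagName("emph")[0]
name = doc.getElementsByTagName("name")[0]

print(generated_xpath(emph))  # /labels[1]/label[1]/quote[1]/emph[1]
print(generated_xpath(name))  # /labels[1]/label[2]/name[1]
```

Dropping the redundant predicates would require an extra pass that checks whether each step is already unambiguous, which is the kind of polish generated code usually skips.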

                And by the way, while you were climbing your high horse, you made a fundamental error of your own:

                ‘Second, there is no text node that is a child node of “emph”. The text contained within the “emph” element is the value of “emph”, not the value of a text node that is a child of “emph”.’

                Are you telling me that you are not aware that

                string(/labels[1]/label[1]/quote[1]/emph[1]/text()[1])

                results in “Midwinter Spring”

                or, in terms you wouldn’t find “annoying”, that string(/labels/label[1]/quote/emph/text()) is the same value as string(/labels/label[1]/quote/emph)?

                Yes, indeed, I just re-read, and you are saying just that in an attempt to “correct” me. Sorry, but the spec contradicts you. I know, because I have implemented XPath and XSLT several times over and I know the spec very well.
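The data-model point can be checked without an XPath engine. A minimal sketch, assuming the standard library's `xml.dom.minidom` as an approximation of the XPath tree (the two models differ in details), with a hypothetical helper `string_value` that follows the XPath 1.0 rule that an element's string-value is the concatenation of its descendant text nodes in document order:

```python
from xml.dom import minidom


def string_value(node):
    """Concatenate all descendant text nodes in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


doc = minidom.parseString(
    "<quote><emph>Midwinter Spring</emph> is its own season</quote>"
)
quote = doc.documentElement
emph = doc.getElementsByTagName("emph")[0]

# Text nodes that are direct children of <emph>.
text_children = [n for n in emph.childNodes if n.nodeType == n.TEXT_NODE]

print(len(text_children))     # 1
print(text_children[0].data)  # Midwinter Spring
print(string_value(emph))     # Midwinter Spring
print(string_value(quote))    # Midwinter Spring is its own season
```

The element has one text node child, and for an element with only that child the two values coincide.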

                In conclusion, you were not able to point out a single actual error in my article, so perhaps you should not try to pull down the work of others by cutting their context to shreds in future. This is generally a nice site. I’d be as happy as any other reader if you didn’t ruin it with more of the same.

                Thanks.

                Uche Ogbuji

              • Posted by: uche at November 27, 2004 11:55 AM
              • Hello Uche,

                Thanks for taking the time to respond. I believe we all have a right to defend ourselves when we feel an attack is being made against us. Before you burst any more blood vessels over this, let me just point out a couple of things from my post.

                First, I posted the link because I thought the article, in general, was a good article and worthy of being posted.

                Second, I prefaced my comments with “..that, while technically it will work,”. I realize your code will work. I wasn’t contending that fact. My contention was simply a personal one… I find it annoying when authors propagate bad form in their code. There is an inherent responsibility that comes with being an author to be as accurate as possible in following good coding guidelines. So while I am aware that:

                “string(/labels[1]/label[1]/quote[1]/emph[1]/text()[1])”

                Will produce:

                “Midwinter Spring”

                I still hold tight to my claim that it’s simply bad use of XPath. When you implement style that does nothing except make your code more verbose and potentially confusing (someone new to XPath might have a hard time understanding the purpose of the position predicate for “labels” and therefore doubt that they understand things correctly), you risk your readers walking away more confused than enlightened.

                The question is not “will it work” but “does it represent good form and style that will aid my readers in better understanding the technologies I am representing.” I’m sorry, but you won’t find an expert in the XSLT/XPath community who will not evangelize the propagation of good form and style and “proper” use of syntax. When acting in the role of an expert you must be that expert you are representing. Suggesting that “it works” and therefore “it’s good enough” tempts me to suggest that, with such an attitude towards code development, anything you write from this point forward is not worth reading. With as many “examples” as I have gone through in books and online that simply do not work, I don’t have time anymore to pay attention to authors who don’t take care of the little things.

                With that said I will in no way suggest that every line of code that I have written and published is in perfect form or absolutely 100% technically correct. But I do know that when I realize for myself or have pointed out to me that something is incorrect (whether it will work or not) I have to fess up to my mistake and correct it. Having an “it works so who cares” attitude will lose you more respect than it will gain and pretty soon people will stop paying attention to anything you have to say.

                Whether it’s fair or not, those are the simple facts. Take them as you will.

                Best regards,

                <M:D/>

              • Posted by: M. David Peterson at November 27, 2004 01:28 PM
              • I read through your comments again and in all fairness I should re-qualify some of my statements. I appreciate the fact that you are representing generated code. And I recognize that generated code is not always perfect. I didn’t notice the comments you made in your article suggesting that the XPath format was not perfect. As soon as I saw the use of the XPath I pretty much turned things off and scanned the rest of the article just to see if the rest of it was worthy of posting, which it was. My point wasn’t to post it so I could then turn around and slam you. Nowhere does my post suggest that the rest of the article was bad. It’s a good article, otherwise I wouldn’t have posted it. But my earlier points are still valid. There needs to be a responsibility in writing articles to propagate good form. If you are going to showcase poorly written generated code, follow it up with an example of what it really should look like so that your readers can understand your “syntax isn’t perfect” comment.

                Please accept my comments as the constructive criticism they were meant to be. I would expect anybody reading something I posted to be as critical as necessary to ensure that the proper information is passed along to future readers of an article. If we accept the critique of our colleagues in the spirit in which it was meant (helping us become better at what we do), then only good things will come of what is said.

                Hope this helps clarify my comments!

                Best regards,

                <M:D/>

              • Posted by: M. David Peterson at November 27, 2004 01:49 PM


          • © 2005 :: <XSLT:Blog/> (xsltblog.com) is a product of M. David Peterson and FunctionalX Consulting. See Licensing Info Below.
          • Except where otherwise noted, this site’s content and source code are licensed under the Attribution License from Creative Commons.