Pull Parsing in C# and Java
Grok Headline matches for Pull Parsing in C# and Java
Boost performance when parsing (Java Pro)
09/12/2002 07:48 AM
Java command line option parsing suite
04/12/2005 11:54 AM
JArgs 1.0 Released
Parsing XML with Perl
07/21/2002 10:36 PM
CNET Jul 21 2002 10:12PM ET
Article on Parsing RSS
11/18/2002 12:58 PM
I have put up an article on how I parse RSS files. In the same
article I provide my RSS parser as a free download. I'd appreciate
any feedback on it.
WAI compliance will have to wait another day. One thing the Bobby
accessibility validator doesn't like about my site is the links below
every new post, specifically "permalink" and "comments". The same
link text is repeated for each news post, although each link points
to something (slightly) different. I don't want to get rid of these
links, so I'm looking for a suitable (perhaps graphical) alternative.
Parsing OWL in RDF/XML Published
01/22/2004 03:25 AM
2004-01-21: The Web Ontology Working Group has released Parsing OWL in
RDF/XML as a Working Group Note. The OWL language is used to publish
and share sets of terms called ontologies, supporting advanced Web
search, software agents and knowledge management. This document
describes a strategy for OWL-RDF parsers. Read about the Semantic Web.
Parsing RSS At All Costs
01/22/2003 07:41 PM
In his second Dive into XML column, Mark Pilgrim describes his
parse-at-all-costs parser of ill-formed RSS feeds, using Python's
sgmllib.
More XML: Parsing with Evolt.org
08/14/2002 08:16 AM
Independently Parsing Perl
06/17/2005 04:30 PM
Stodgy, boring languages have great editors. What's keeping Perl from
refactoring support, perfect syntax highlighting, and other advanced
transformation techniques? It's really difficult to parse Perl.
Fortunately, Adam Kennedy's PPI project provides a standalone Perl
parser that operates correctly on all but 28 of the 38,000 CPAN
modules. Here's how it works and what you can do with it.
RSS native parsing in the next Firebird
02/10/2004 02:42 AM
This is new to me. I was checking out the nightly builds of
Firebird 0.8 betas (windows and linux, mac) and they've got an rss
button and panel that parses RSS, with titles linking to the main
window. Slick, but they need to let you track which ones have
new/old items.
Update: It turns out I'm actually a dumbass. I installed this RSS
extension so long ago I forgot about it, and because I never saw
it show up in any menu, I figured it never "took" on my Firebird
install. Then when I had the new nightly build the toolbars were out
of whack on first run, so I went to customize them, saw the RSS
button for the first time, and assumed it came with Firebird 0.8. My
bad.
Parsing a Querystring With Perl
12/19/2002 07:40 PM
Stickysauce Dec 19 2002 6:46PM ET
Parsing the News.com RSS feed with PHP
12/11/2003 02:48 AM
CNET Dec 11 2003 2:44AM ET
dtddoc step 1: Parsing a DTD
10/02/2002 09:35 AM
Our quest to build a better automatic DTD documentation tool begins
with a quick look at some of the available DTD parsers for Java, Perl,
and PHP. By Michael Classen.
Simple XML parsing with SAX and DOM (OnJava.com)
07/01/2002 08:28 AM
Python parsing module
12/18/2003 01:00 PM
pyParsing Python library - version 1.0.1 released
BitFlux Blog: Parsing Bad XML in PHP 5.1
08/19/2004 10:10 AM
In a new note from the BitFlux blog, Christian Stocker has
information about the latest patch committed to the PHP 5.1 branch,
which allows you to parse XML documents that are not well-formed and
adds the missing elements, e.g. missing closing tags.
Functional XML Parsing Framework 5.1
09/16/2004 09:22 PM
SAX/DOM/SXML parsers with support for XML namespaces and validation.
Features: Non-Extractive Parsing for XML
05/19/2004 07:15 PM
Changing the way XML parsers are written can make parsing more
efficient and more flexible.
Making the News: Parsing RSS Feeds With PHP
11/13/2002 08:59 AM
dtddoc step 1: Parsing a DTD (WebReference.com)
10/08/2002 09:14 AM
Liberal XML parsing related to personality?
02/12/2004 07:41 PM
The heat of the discussion on liberal XML parsing has subsided, so
this is actually a little late. That's because I wasn't sure if I
should post this. But a post by Dave Winer today convinced me to post
it anyway. Let me just say up front that I could be completely wrong.
Introduction to Event-Driven XML Parsing
02/10/2004 02:49 AM
Apple documents the new-in-Panther NSXMLParser class.
S-exp-based XML parsing/query/conversion
09/16/2004 07:33 PM
SSAX-SXML Release 5.1
Re: Internet Explorer URL parsing vulnerability
12/09/2003 03:45 PM
soulshok_at_hippie.dk (Dec 09 2003)
Flaw in Microsoft JPEG Parsing
09/14/2004 06:12 PM
WPkontakt message parsing error
12/24/2004 12:36 PM
Jaroslaw Sajko (Dec 23 2004)
Parsing XML documents with Perl's XML::Simple
09/20/2004 12:46 AM
CNET Sep 20 2004 4:09AM GMT
BitFlux Blog: Parsing Bad XML - Part 2
08/20/2004 08:31 AM
In response to his introduction of the non-well-formed XML patch the
other day, Christian Stocker has a new posting with a bit of a
rebuttal on the subject.
Warcraft III Replay Parsing Library
08/09/2004 11:30 AM
W3RepLib 0.9 beta released!
The State of the Union Parsing Tool
02/05/2005 09:55 PM
style.org/stateoftheunion/parse
Parsing XML documents with Perl (Builder.com)
07/18/2002 07:34 PM
Internet Explorer URL parsing vulnerability
12/09/2003 01:22 PM
bugtraq_at_zapthedingbat.com (Dec 09 2003)
URL Parsing Bug in IE Invites Phishing Attacks
06/11/2004 09:09 PM
The bug, which affects fully patched versions of IE, lets malicious
sites assume the privileges of more trusted zones.
RE: Internet Explorer URL parsing vulnerability
12/10/2003 01:52 PM
http-equiv_at_excite.com (Dec 09 2003)
High Speed XML Parsing is Not Intuitive
02/11/2004 03:58 AM
For a PHP weblog, there haven't been many PHP articles or links
recently. This is because I feel most recent PHP articles I read have
nothing fresh to say, repeating material I linked to 2 or 3 years ago.
Perhaps I'm getting jaded. So to keep things fresh, here's a new
article, mostly original, and hopefully of some interest to everyone!
Last year, Tim Bray, one of the co-authors of the XML spec,
mentioned that he used Perl regular expressions to parse XML:
"Now here's the dirty secret; most of it is machine-generated XML,
and in most cases, I use the perl regexp engine to read and process
it."
I was struck by this because I would have thought XPath or SAX
would provide better performance, as they are APIs tuned specifically
for XML.
I decided to run some benchmarks to determine which techniques were
faster. I also wanted a realistic test, so I benchmarked parsing the
RSS feed of this web-site, searching for the contents of all title
tags, and returning the contents as an array. The RSS file is from
Nov 2003 (yes, I did this benchmark that long ago), is about 20K, and
has 12 title tags, so the returned array will have 12 title strings.
The techniques used were:
1. Regular expression:
preg_match_all('/<title>([^<]*)/',$rss,$titles_arr)
2. explode('<title>', $rss), then strip the matching </title>
tag using strpos() and substr().
3. XPath, using $title_nodes = $ctx->xpath_eval("//title");
4. SAX, with an element handler function that matched and
processed the title tag.
5. DOM, using $titles = $dom->get_elements_by_tagname('title').
Intuitively, this should have been the slowest, as the whole tree is
generated.
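For readers who want to try the comparison outside PHP, here is a minimal sketch of the five approaches in Python. It is not the author's benchmark code: the sample feed, the function names, and the use of Python's standard library (with ElementTree's limited path syntax standing in for XPath) are all assumptions made for illustration.

```python
import re
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical sample feed; the author's 20K RSS file is not reproduced here.
RSS = (
    '<rss version="2.0"><channel>'
    "<title>Example feed</title>"
    "<item><title>First post</title></item>"
    "<item><title>Second post</title></item>"
    "</channel></rss>"
)


def titles_regex(rss):
    # Technique 1: capture the text after <title> up to the next '<'.
    return re.findall(r"<title>([^<]*)", rss)


def titles_split(rss):
    # Technique 2: split on the opening tag, then cut at the closing tag.
    titles = []
    for chunk in rss.split("<title>")[1:]:
        end = chunk.find("</title>")
        if end != -1:
            titles.append(chunk[:end])
    return titles


def titles_path(rss):
    # Technique 3 stand-in: ElementTree supports only a subset of XPath.
    return [el.text or "" for el in ET.fromstring(rss).findall(".//title")]


class TitleHandler(xml.sax.ContentHandler):
    # Technique 4: collect character data seen between <title> and </title>.
    def __init__(self):
        super().__init__()
        self.titles = []
        self._buf = None

    def startElement(self, name, attrs):
        if name == "title":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title" and self._buf is not None:
            self.titles.append("".join(self._buf))
            self._buf = None


def titles_sax(rss):
    handler = TitleHandler()
    xml.sax.parseString(rss.encode("utf-8"), handler)
    return handler.titles


def titles_dom(rss):
    # Technique 5: build the whole tree, then look up elements by tag name.
    dom = xml.dom.minidom.parseString(rss)
    return [
        "".join(n.data for n in el.childNodes if n.nodeType == n.TEXT_NODE)
        for el in dom.getElementsByTagName("title")
    ]
```

All five functions return the same list of title strings for a well-formed feed like the sample. The regex and split versions do no XML processing at all, so they would not decode entities or handle CDATA sections the way the parser-based versions do.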
Results
Here are the timings for processing the RSS file 1000 times. Faster
is better.
Technique   Seconds   Relative to REGEX
REGEX        0.1080    1.00
EXPLODE      0.1696    1.57
DOM          6.3212   58.53
XPATH        8.3417   77.24
SAX         10.0851   93.38
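The "Relative to REGEX" column is each timing divided by the REGEX timing. A small sketch in Python, assuming hypothetical helper names (`relative_to`, `time_runs`), recomputes that column from the reported seconds and shows one way such a 1000-run timing could be taken. It is not the author's benchmark harness.

```python
import timeit

# Timings in seconds for 1000 runs, as reported in the table above.
REPORTED = {
    "REGEX": 0.1080,
    "EXPLODE": 0.1696,
    "DOM": 6.3212,
    "XPATH": 8.3417,
    "SAX": 10.0851,
}


def relative_to(baseline, timings):
    # Divide every timing by the baseline timing, rounded to two decimals.
    base = timings[baseline]
    return {name: round(seconds / base, 2) for name, seconds in timings.items()}


def time_runs(func, arg, runs=1000):
    # Total wall-clock seconds for `runs` calls of func(arg).
    return timeit.timeit(lambda: func(arg), number=runs)


if __name__ == "__main__":
    for name, ratio in relative_to("REGEX", REPORTED).items():
        print(f"{name:8s} {REPORTED[name]:8.4f} {ratio:6.2f}")
```

Recomputing the ratios this way reproduces the relative figures in the table, which is a quick consistency check on the reported numbers.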
Conclusion
Intuitively, I would have thought that XPath would be the fastest,
as XPath expressions can be compiled and tuned for XML. But the best
performance was achieved using regular expressions, which is
what Tim is using.
It appears that the DOM, SAX and XPath libraries remain immature
(compared to the Perl-compatible regex library) and are not highly
optimized. Strangely enough, DOM performance is better than XPath and
SAX! Perhaps someone else can explain why.
If anyone is interested, I can post the source.
Test platform: Windows 2000, PHP 4.3.3. I also tested on Linux, PHP
4.3.2, with similar results.

OpenSSL ASN.1 parsing bugs PoC / brute forcer
01/16/2004 10:59 AM
Bram Matthys (Syzop) (Jan 15 2004)
Codewalkers.com: Parsing INI Files Made Easy
12/16/2003 08:58 AM
When working with a PHP script, sometimes it's just easier to have
some of the configuration options outside of the source. Not only does
this make things a bit more friendly for the user, but it makes for
less debugging for you in the long run. But, to harness this feature,
you might need a shove in the right direction, and that's where this
new article comes in.
[OpenSSL Advisory] Denial of Service in ASN.1 parsing
11/04/2003 12:13 PM
Mark J Cox (Nov 04 2003)
[ESA-20031104-029] 'openssl' ASN.1 parsing denial of service
11/04/2003 01:23 PM
EnGarde Secure Linux (Nov 04 2003)
Incremental XML Parsing and Validation in a Text Editor
12/15/2003 02:29 AM
On 10 December 2003 at XML 2003 in Philadelphia, James Clark presented
the ideas and implementation behind his nXML XML editing mode for GNU
Emacs.
Grok Description matches for Pull Parsing in C# and Java
GrokA matches for Pull Parsing in C# and Java
Pull Parsing in C# and Java