Python and XML: Unicode Secrets 06/05/2005 11:54 PM

In his latest Python-XML column, Uche Ogbuji delves broadly and deeply into the world of Unicode, especially with regard to processing XML in Python.




Similar Items

Grok Headline matches for Python and XML: Unicode Secrets

Python and XML: More Unicode Secrets 06/17/2005 04:28 PM
In this month's Python and XML column, Uche Ogbuji continues his discussion of Unicode secrets with regard to XML processing in Python, especially BOMs and stream objects.

Python and XML: XML Namespaces Support in Python Tools, Part Three 06/30/2004 07:31 PM
In this month's Python and XML column Uche Ogbuji examines the namespace support in ElementTree, PyRXPU, and libxml.

Python and XML: XML Namespaces Support in Python Tools, Part Two 05/13/2004 07:55 PM
In his latest Python and XML column, Uche Ogbuji continues his tour of XML namespaces support in Python tools, focusing this time on 4Suite.

Backporting from Python 2.3 to Python 2.2 06/08/2004 11:18 PM

We have a home-grown templating system at work, which I intend to dedicate an entry to some time in the future. We originally wrote it in Python 2.2, but upgraded to Python 2.3 a while ago and have since been evolving our code in that environment. Today I found a need to load the most recent version of our templating system onto a small, long-neglected application that had been running the original version ever since it had enough features to be usable.

Unfortunately, this application was running on a server that only had Python 2.2. Installing Python 2.3 would have been somewhat more painful here than on other servers we run, for reasons I won't go into, so I decided to have a go at getting our current code to run under the older Python version.

In the end, I only had to make three minor changes, all at the top of the file in question.

  1. I added from __future__ import generators as the very first line of the file. We use generators (with the yield statement) in a few places - this feature was only properly added in Python 2.3, but was made available in Python 2.2 as a "future enhancement" through the aforementioned obscure import.

  2. I added True, False = 1, 0 on the next line down. Surprisingly, Python 2.2 had no support for a boolean type and relied on a test for non-zero instead. The above line defines constants that behave enough like Python 2.3's True and False to avoid any problems.

  3. I defined an enumerate function, which was introduced for real in Python 2.3. Here's the code I used:

    
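    # Fallback for Python 2.2; needs "from __future__ import generators".
    # Pairs each item with its index; obj must support len().
    def enumerate(obj):
        for i, item in zip(range(len(obj)), obj):
            yield i, item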
    

All in all it only took around ten minutes to put the above together, after which the script worked just fine. It was interesting to see how our code had grown to rely on Python 2.3 features without us realising it.
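
The enumerate fallback in step 3 builds its indexes from range(len(obj)), so it only works on objects that support len(), such as lists and strings. What follows is a minimal sketch of a variant that keeps its own counter and therefore also accepts iterators and generators. The names enumerate_compat and squares are hypothetical and do not come from the original code. The sketch is written to run on a current Python, and on Python 2.2 it would still need the from __future__ import generators line described above.

```python
def enumerate_compat(iterable):
    """Yield (index, item) pairs without calling len() on the input."""
    index = 0
    for item in iterable:
        yield index, item
        index += 1


def squares(limit):
    """Small generator with no len(), used only to exercise the fallback."""
    for n in range(limit):
        yield n * n


if __name__ == "__main__":
    # Walk a generator, which the len()-based version could not handle.
    for position, value in enumerate_compat(squares(4)):
        print(position, value)
```

The trade-off is one extra local variable in exchange for not depending on the input having a length.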


Big Unicode! 03/22/2005 03:19 PM
Via jwz, a monster Unicode chart about 6 by 12 feet. I want one!

Unicode-Japanese-0.19 01/19/2004 05:05 AM

Unicode-Normalize-0.30 05/02/2004 05:54 AM

Unicode-Collate-0.40 04/24/2004 12:36 AM

Unicode-Transform-0.31 11/15/2003 11:03 AM

Unicode-Normalize-0.27 11/16/2003 10:27 AM

Unicode-IMAPUtf7-1.99 11/16/2003 10:27 AM

rxvt-unicode 3.6 08/14/2004 01:19 AM
An rxvt clone supporting mixed fonts, Xft fonts, and Unicode.

Unicode-Transform-0.30 11/15/2003 05:27 AM

Joel on Unicode 11/13/2003 01:57 AM
Joel of Joel on Software has put together a great overview of Unicode that all programmers should read.

10,000 ways to Ni Hao with Unicode and PHP 10/29/2003 12:11 AM
Joel Spolsky has been cursing the lack of support for Unicode in PHP. So last week, he wrote this great article on The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Of course, that still doesn't answer his initial question of how to get Unicode working with PHP. Well, Scott Reynen had the solution and wrote How to develop multilingual, Unicode applications with PHP in response to Joel's frustration.

Scott's technique works on all versions of PHP. Or you can just use the UTF-8 character set and the mbstring functions, which should run faster as they are coded in C. To use mbstring, you need PHP 4.3 or later (it was buggy before 4.3). On Unix, you will need to compile the extension in. On Windows, you just need to modify your php.ini.

Update: l0t3k is working on a Unicode I18N extension based on IBM's open source International Components for Unicode. The extension includes a UnicodeString class with the requisite searching, replacing, casing, trimming, and classification methods. Get the CVS version (16 Oct 2003).

PS: Ni Hao means hello in Chinese.


Unicode-Normalize-0.26 11/15/2003 09:58 AM

Unicode-Collate-0.32 12/03/2003 06:07 PM

Unicode-Normalize-0.32 04/05/2005 11:43 AM

rxvt-unicode 4.7 12/29/2004 06:01 PM
An rxvt clone supporting mixed fonts, Xft fonts, and Unicode.

"Unicode Chart"


"Unicode Chart" 03/25/2005 06:44 AM

Unicode Rewriter 0.1 08/12/2004 03:58 AM
A tool for converting ID3 tags to Unicode.

rxvt-unicode 4.0 09/12/2004 11:26 PM
An rxvt clone supporting mixed fonts, Xft fonts, and Unicode.

Unicode and webl0gs 04/19/2004 08:33 AM
Hossein Derakhshan: We should promote Unicode standard among English speaking programmers. Many tools do not work well with Unicode and this sucks. Spread the meme. Please test your clients, servers, comments, and feeds. ...

Unicode-UTF8simple-1.06 12/27/2004 09:06 AM

Unicode-Collate-0.33 12/13/2003 04:49 AM

rxvt-unicode 3.4 08/07/2004 03:39 AM
An rxvt clone supporting mixed fonts, Xft fonts, and Unicode.

Unicode-IMAPUtf7-2.01 08/29/2004 06:06 PM

Unicode-Collate-0.31 11/16/2003 10:27 AM

Unicode-Japanese-0.22 05/31/2004 05:39 AM

rxvt-unicode 3.3 07/31/2004 10:13 AM
An rxvt clone supporting mixed fonts, Xft fonts, and Unicode.

Unicode-Japanese-0.21 05/26/2004 05:48 AM

Unicode-Normalize-0.31 04/05/2005 11:43 AM

DSI Announces Unicode Functionality 04/06/2005 04:38 AM
ZDNet India Apr 6 2005 8:56AM GMT

Unicode-IMAPUtf7-1.99_1 11/16/2003 06:17 PM

Universal Unicode Converter 0.2 01/05/2005 01:39 PM
A converter for Unicode to and from several common 7-bit and 8-bit "plain text"

International Components for Unicode (C/C++) 3.0 06/22/2004 10:12 PM
IBM Classes for Unicode (ICU) enable you to write fully cross-platform programs

Unicode for Syndication Consumers 04/22/2004 01:32 PM
Torsten Rendelmann: Hey, partially good news: my local RSS Bandit beta build 109 does not fail anymore on Sam's test feed, if it is compiled with .NET 1.0. Whether that is good news (or even news at all) is debatable; in any case, this should not be an accidental feature. If this is to be pursued, here are a few more things to think about.

Unicode Enabled Trackbacks 06/30/2004 01:07 PM
I've changed my weblogging software to send trackbacks in utf-8, and to try to respect the charset, if specified, on trackbacks received. This involved four changes. ...

Gurmukhi Unicode Conversion Application 05/26/2004 07:47 PM
New releases!