[JDEV] jabber.py problems
mallum
breakfast at 10.am
Wed Feb 6 19:15:45 CST 2002
Hmm, thanks for the work on the patch, I'll take a look at it later.
But I'm a little dubious. Given that you say yourself it's poor quality,
and that the dreaded "ordinal not in range" *may still* occur, I
can't really see it as a proper fix.
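
For readers who have not met the error under discussion, here is a minimal
sketch of where it comes from. It is written for modern Python 3 rather than
the Python 2 of this thread, and the sample text and function names are made
up for illustration. Encoding non-ASCII text to ASCII in strict mode raises
the "ordinal not in range(128)" error, while the "replace" error handler
silently substitutes "?".

```python
# Python 3 sketch (assumption: the thread itself concerns Python 2, where
# implicit str/unicode conversions produced the same message).
text = "za\u017c\u00f3\u0142\u0107"  # hypothetical message body with Polish letters


def to_ascii_strict(s):
    """Return the codec error message, or None if strict encoding succeeds."""
    try:
        s.encode("ascii")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


def to_ascii_lossy(s):
    """Encode for display only; unencodable characters become '?'."""
    return s.encode("ascii", "replace")


strict_error = to_ascii_strict(text)
lossy = to_ascii_lossy(text)
print(strict_error)
print(lossy)
```

The lossy form is what the quoted patch keeps for log and debug output only.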
I'd really like to understand what the problem is with changing
site.encoding, which seems to me the most logical and easiest way to
fix things 100%.
It seems that however you fix things in jabberpy, the Python expat
bindings will still barf unless you change site.encoding.
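
As a point of comparison only, and assuming a modern Python 3 interpreter
(whose expat bindings always pass text to the handlers), the sketch below
parses UTF-8 bytes without touching the interpreter's default encoding. It
says nothing definitive about the Python 2.x bindings discussed here;
`parse_body` and the sample stanza are hypothetical.

```python
import xml.parsers.expat


def parse_body(xml_bytes):
    """Parse UTF-8 encoded XML bytes and return all character data as text."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver character data in several pieces, so collect them all.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return "".join(chunks)


# Hypothetical stanza; no XML declaration, so expat assumes UTF-8.
stanza = "<message><body>za\u017c\u00f3\u0142\u0107</body></message>".encode("utf-8")
body = parse_body(stanza)
print(body)
```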
It's probably best to discuss this further (if you want) on the
jabber.py list.
-- mallum
on Wed, Feb 06, 2002 at 08:35:05PM +0100, Jacek Konieczny wrote:
> On Wed, Feb 06, 2002 at 06:14:17PM +0100, Igor Stroh wrote:
> > > > it doesn't work this way, don't ask me why :) to switch to utf-8, edit
> > > > your site.py and change the line that says "encoding = 'ascii'" to
> > > > "encoding = 'UTF-8'"
> > > It is not a good thing (one Python hacker told me this, with some
> > > arguments that convinced me).
> > > jabber.py should be fixed, so it uses proper encoding.
> >
> > there's no way to do it other than to follow the instructions at
> > http://www.python.org/cgi-bin/faqw.py?req=show&file=faq04.102.htp
> >
> ...
> >
> > again, this is a known issue, if you think there's a better way to handle
> > this problem, please send a patch
> Here is a patch, attached. It is not very good or pretty, but I wrote it
> just to show you how I think it should look.
>
> IMHO the jabber.py module should work on Unicode and it should not depend in
> any way on the system or locale encoding. Applications are responsible for
> encoding conversion, and if they don't do it well it is OK that they
> crash. Sometimes it is the only way to convince an ASCII-speaking developer
> to fix this :-)
>
> The problem is, that the expat python module doesn't support Unicode
> very well :-( That's why the patch is so ugly (but I am sure there are
> better ways to do this anyway).
>
> This patch makes the sample jabber client work for me, with
> international characters.
>
> A conversion error ("ordinal not in range") may still occur. If it is
> raised in jabber.py, it means something more has to be fixed in the
> module. If it is raised in the application, the application is broken.
> Making the module silently convert international characters to "?" is
> bad. I left this behaviour for log and debug messages; these are the
> only places where it seems OK to me.
>
> > or a solution proposal to jabber.py
> > mailing list
> > or just post in here, i'll forward the message to the list...
> Could you do this, please?
>
> Greets,
> Jacek
>
> The ugly patch follows...
>
> diff -durN jabber.py-0.3-1.orig/examples/test_client.py jabber.py-0.3-1/examples/test_client.py
> --- jabber.py-0.3-1.orig/examples/test_client.py Thu Jan 17 13:05:40 2002
> +++ jabber.py-0.3-1/examples/test_client.py Wed Feb 6 20:13:48 2002
> @@ -1,4 +1,4 @@
> -#!/usr/bin/env python2
> +#!/usr/bin/python
>
> # $Id: test_client.py,v 1.9 2002/01/17 12:05:40 mallum Exp $
>
> @@ -9,6 +9,7 @@
> from select import select
> from string import split,strip,join
> import sys,os
> +import locale
>
> sys.path.insert(1, os.path.join(sys.path[0], '..'))
>
> @@ -24,6 +25,12 @@
> MyStatus = ''
> MyShow = ''
>
> +loc = locale.getdefaultlocale()
> +if loc[1]:
> + LocalEncoding=loc[1]
> +else:
> + LocalEncoding=sys.getdefaultencoding()
> +
> def usage():
> print "%s: a simple python jabber client " % sys.argv[0]
> print "usage:"
> @@ -107,7 +114,7 @@
> if Who != '':
> msg = jabber.Message(Who, strip(txt))
> msg.setType('chat')
> - print "<%s> %s" % (JID, msg.getBody())
> + print "<%s> %s" % (JID.encode(LocalEncoding,"replace"), msg.getBody().encode(LocalEncoding,"replace"))
> con.send(msg)
> else:
> print colorize('Nobody selected','red')
> @@ -117,8 +124,8 @@
> """Called when a message is recieved"""
> if msg.getBody(): ## Dont show blank messages ##
> print colorize(
> - '<' + str(msg.getFrom()) + '>', 'green'
> - ) + ' ' + msg.getBody()
> + '<' + str(msg.getFrom()).encode(LocalEncoding,"replace") + '>', 'green'
> + ) + ' ' + msg.getBody().encode(LocalEncoding,"replace")
>
> def presenceCB(con, prs):
> """Called when a presence is recieved"""
> @@ -149,11 +156,23 @@
> print colorize("we are now unsubscribed to %s" % (who), 'blue')
>
> elif type == 'available':
> + sh=prs.getShow()
> + if sh:
> + sh=sh.encode(LocalEncoding,"replace")
> + st=prs.getStatus()
> + if st:
> + st=st.encode(LocalEncoding,"replace")
> print colorize("%s is available (%s / %s)" % \
> - (who, prs.getShow(), prs.getStatus()),'blue')
> + (who, sh, st),'blue')
> elif type == 'unavailable':
> + sh=prs.getShow()
> + if sh:
> + sh=sh.encode(LocalEncoding,"replace")
> + st=prs.getStatus()
> + if st:
> + st=st.encode(LocalEncoding,"replace")
> print colorize("%s is unavailable (%s / %s)" % \
> - (who, prs.getShow(), prs.getStatus()),'blue')
> + (who, sh, st),'blue')
>
>
> def iqCB(con,iq):
> @@ -243,7 +262,7 @@
> inputs, outputs, errors = select([sys.stdin], [], [],1)
>
> if sys.stdin in inputs:
> - doCmd(con,sys.stdin.readline())
> + doCmd(con,unicode(sys.stdin.readline(),LocalEncoding))
> else:
> con.process(1)
>
> diff -durN jabber.py-0.3-1.orig/jabber.py jabber.py-0.3-1/jabber.py
> --- jabber.py-0.3-1.orig/jabber.py Thu Jan 17 13:05:40 2002
> +++ jabber.py-0.3-1/jabber.py Wed Feb 6 20:18:05 2002
> @@ -155,7 +155,7 @@
>
> def send(self, what):
> """Sends a jabber protocol element (Node) to the server"""
> - xmlstream.Client.write(self,str(what))
> + xmlstream.Client.write(self,what)
>
> def dispatch(self, root_node ):
> """Called internally when a 'protocol element' is recieved.
> @@ -364,7 +364,7 @@
>
> def send(self, what):
> """Sends a jabber protocol element (Node) to the server"""
> - xmlstream.Client.write(self,str(what))
> + xmlstream.Client.write(self,what.unicode())
>
> def sendInitPresence(self):
> """Sends an empty presence protocol element to the
> @@ -603,6 +603,9 @@
> """returns an xmlstreamnode representation of the protocol element"""
> return self._node
>
> + def unicode(self):
> + return self._node.unicode()
> +
> def __str__(self):
> return self._node.__str__()
>
> diff -durN jabber.py-0.3-1.orig/xmlstream.py jabber.py-0.3-1/xmlstream.py
> --- jabber.py-0.3-1.orig/xmlstream.py Thu Jan 17 13:05:40 2002
> +++ jabber.py-0.3-1/xmlstream.py Wed Feb 6 20:22:18 2002
> @@ -44,11 +44,6 @@
> STDIO = 0
> TCP_SSL = 2
>
> -ENCODING = site.encoding ## fallback encoding to avoid random
> - ## random UnicodeError: ASCII decoding error:
> - ## ordinal not in range(128)
> - ## type errors - being looked into.
> -
> BLOCK_SIZE = 1024 ## Number of bytes to get at at time via socket
> ## transactions
>
> @@ -159,7 +154,28 @@
> return newnode
>
> def __str__(self):
> - return self._xmlnode2str()
> + return self.unicode()
> +
> + def unicode(self, parent=None):
> + """Returns an xml ( Unicode ) representation of the node
> + and it children"""
> + s = u"<" + self.name
> + if self.namespace:
> + if parent and parent.namespace != self.namespace:
> + s = s + u" xmlns = '%s' " % self.namespace
> + for key in self.attrs.keys():
> + val = str(self.attrs[key])
> + s = s + u" %s='%s'" % ( key, XMLescape(val) )
> + s = s + u">"
> + cnt = 0
> + if self.kids != None:
> + for a in self.kids:
> + if (len(self.data)-1) >= cnt: s = s + XMLescape(self.data[cnt])
> + s = s + a._xmlnode2str(parent=self)
> + cnt=cnt+1
> + if (len(self.data)-1) >= cnt: s = s + XMLescape(self.data[cnt])
> + s = s + u"</" + self.name + u">"
> + return s
>
> def _xmlnode2str(self, parent=None):
> """Returns an xml ( string ) representation of the node
> @@ -208,6 +224,7 @@
> method of Node"""
> def __init__(self,data):
> self._parser = xml.parsers.expat.ParserCreate(namespace_separator=' ')
> + self._parser.returns_unicode = 1
> self._parser.StartElementHandler = self.unknown_starttag
> self._parser.EndElementHandler = self.unknown_endtag
> self._parser.CharacterDataHandler = self.handle_data
> @@ -298,8 +315,10 @@
> self._logFH = None
>
> def DEBUG(self,txt):
> + if type(txt) is type(u""):
> + txt=txt.encode(sys.getdefaultencoding(),"replace")
> if self._debug:
> - sys.stderr.write("DEBUG: %s\n" % txt)
> + sys.stderr.write("DEBUG: %s\n" % txt )
>
> def getSocket(self):
> return self._sock
> @@ -368,45 +387,42 @@
> data_in = u''
> if self._connection == TCP:
> data_in = data_in + \
> - unicode(self._sock.recv(BLOCK_SIZE),'utf-8').encode(ENCODING,
> - 'replace')
> + unicode(self._sock.recv(BLOCK_SIZE),'utf-8')
> while data_in:
> data = data + data_in
> if len(data_in) != BLOCK_SIZE:
> break
> - data_in = unicode(self._sock.recv(BLOCK_SIZE),'utf-8').encode(
> - ENCODING, 'replace')
> -
> + data_in = unicode(self._sock.recv(BLOCK_SIZE),'utf-8')
> if self._connection == TCP_SSL:
> data_in = data_in + \
> - unicode(self._sslObj.recv(BLOCK_SIZE),'utf-8').encode(ENCODING,'replace')
> + unicode(self._sslObj.recv(BLOCK_SIZE),'utf-8')
> while data_in:
> data = data + data_in
> if len(data_in) != BLOCK_SIZE:
> break
> - data_in = unicode(self._sslObj.recv(BLOCK_SIZE),'utf-8').encode(ENCODING, 'replace')
> + data_in = unicode(self._sslObj.recv(BLOCK_SIZE),'utf-8')
>
> elif self._connection == STDIO:
> ## Hope this dont buffer !
> - data_in = data_in + unicode(sys.stdin.read(1024),'utf-8').encode(
> - ENCODING, 'replace')
> - while data_in:
> + data_in = data_in + unicode(sys.stdin.read(1024),'utf-8')
> + while data_in:
> data = data + data_in
> if len(data_in) != 1024:
> break
> - data_in = unicode(sys.stdin.read(1024),'utf-8').encode(
> - ENCODING, 'replace')
> + data_in = unicode(sys.stdin.read(1024),'utf-8')
> else:
> pass # should never get here
>
> self.DEBUG("got data %s" % data )
> self.log(data, 'RECV:')
> - self._parser.Parse(data)
> + self._parser.Parse(data.encode("utf-8"))
> return data
>
> def write(self,data_out=u''):
> """Writes raw outgoing data. blocks"""
> try:
> + if type(data_out) is type(u''):
> + data_out=data_out.encode("utf-8")
> if self._connection == TCP:
> self._sock.send (data_out)
> elif self._connection == TCP_SSL:
> @@ -418,6 +434,7 @@
> self.log(data_out, 'SENT:')
> self.DEBUG("sent %s" % data_out)
> except:
> + raise
> self.DEBUG("xmlstream write threw error")
> self.disconnected()
>
> @@ -461,9 +478,13 @@
> def log(self, data, inout=''):
> """Logs data to the specified filehandle. Data is time stamped
> and prefixed with inout"""
> + if type(data) is type(u""):
> + data=data.encode(sys.getdefaultencoding(),"replace")
> + if type(inout) is type(u""):
> + inout=inout.encode(sys.getdefaultencoding(),"replace")
> if self._logFH is not None:
> self._logFH.write("%s - %s - %s\n" %
> - (time.asctime(time.localtime(time.time())), inout, data ) )
> + (time.asctime(time.localtime(time.time())), inout, data))
>
> def getIncomingID(self):
> """Returns the streams ID"""
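
One caveat about decoding each received block on its own, as the xmlstream.py
read loop in the patch does: a multi-byte UTF-8 character can straddle a block
boundary, and decoding the two halves separately fails. Below is a hedged
sketch of one way around this in modern Python 3, using the standard library's
incremental decoder. The function names and sample data are assumptions for
illustration, not part of jabber.py.

```python
import codecs


def decode_blocks(blocks):
    """Decode an iterable of byte blocks from one UTF-8 stream into a str."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    # The decoder buffers an incomplete trailing sequence until the next block.
    parts = [decoder.decode(block) for block in blocks]
    # final=True makes a truncated sequence at end of stream an error.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


def decodes_alone(block):
    """Report whether a single block is valid UTF-8 when decoded by itself."""
    try:
        block.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical payload, split in the middle of the two-byte character U+017C.
payload = "<body>\u017c\u00f3\u0142w</body>".encode("utf-8")
blocks = [payload[:7], payload[7:]]

print(decodes_alone(blocks[0]))
print(decode_blocks(blocks) == payload.decode("utf-8"))
```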
> _______________________________________________
> jdev mailing list
> jdev at jabber.org
> http://mailman.jabber.org/listinfo/jdev