[JDEV] Introduction to the Worldwide Lexicon

Brian McConnell brianmsf at yahoo.com
Tue Jun 11 14:56:08 CDT 2002


Jeremie suggested I post an intro to the Jabber dev list about a new
project. The Worldwide Lexicon (www.worldwidelexicon.org) is an open source
initiative to create a standard procedure for locating and communicating
with language services throughout the web (e.g. dictionaries, machine
translation servers, and even pools of human translators). Think of this as
Gnutella for language services. WWL defines a fairly simple SOAP-based
interface that allows apps to find and query these services via a small
library of SOAP-RPC methods.
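
To give a rough idea of what a SOAP-RPC query to such a service could look
like on the wire, here is a minimal sketch in Python. The method name
"translate", its parameter names, and the namespace urn:example:wwl are
placeholders invented for illustration; they are not taken from the WWL
specification. Only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for WWL methods (an assumption, not from the spec).
WWL_NS = "urn:example:wwl"


def build_translate_request(text, source_lang, target_lang):
    """Return a SOAP envelope (as a string) for a hypothetical 'translate' call."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The RPC method element; its children are the call parameters.
    call = ET.SubElement(body, f"{{{WWL_NS}}}translate")
    params = (
        ("sourceLang", source_lang),
        ("targetLang", target_lang),
        ("text", text),
    )
    for name, value in params:
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_translate_request("hello", "en", "es"))
```

A client would POST this envelope to whichever server it located and read
the translation out of the SOAP response body.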

What does this have to do with Jabber, you might ask? Real-time communication
is one of the most interesting applications for this system. By creating a
simple client/server interface for talking to these services, WWL will make
it easier for developers to build chat clients that provide services such as
inline dictionaries, machine translation, and live human translation. The
chat client itself can be relatively dumb, since it communicates with these
services via the SOAP interface.

A WWL-enabled Jabber client would support three different modes of
operation, each of which is useful in different situations.

1) Machine translation mode: the chat client uses WWL to find MT servers
that support a specific language pair on the fly. It then uses WWL to send
translation requests to these servers, and inserts the translations into the
conversation. This is easy to implement (all the chat client does is send
queries to remote servers and then copy the results into the conversation).
The problem, of course, is that machine translation is often inaccurate.
However, it is better than no translation. Also, in a real-time
conversation, a user can always rephrase and resend a message if its
translation comes out garbled.

2) Inline dictionary mode: in this scenario, the chat client watches what
the user types while talking with another user. Each time the user types a
new word or phrase, the client attempts to find translations into the other
user's language. If the WWL dictionary replies with multiple translations,
the chat client prompts the sender to pick the best meaning via a dialog box
or extra keystroke. This mode will be useful for people who speak another
language, but have a limited vocabulary. The chat client merely assists the
user in composing messages in another language. This will also be very
useful as a teaching tool.

3) Human translation mode: the WWL protocol does not care if queries are
processed by a machine, or by humans linked to a WWL server via IM. In this
mode, the chat client uses the protocol to submit queries to WWL servers
that are backed by human translators. This will typically be run as a
commercial service where users pay a metered rate for human-assisted
translation, and translators (who may be located on the other side of the
world) are paid on a per work-unit basis.
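
The inline dictionary mode (2) is mostly client-side logic, so it is easy
to sketch. The Python below is only an illustration: the lookup function
stands in for a real WWL dictionary query, the choose callback stands in
for the dialog box or extra keystroke, and all names here are hypothetical.

```python
from typing import Callable, List

# Stand-in for a WWL dictionary query: (word, target language) -> candidates.
Lookup = Callable[[str, str], List[str]]
# Stand-in for the dialog box: (word, candidates) -> the sender's pick.
Chooser = Callable[[str, List[str]], str]


def translate_word(word: str, target_lang: str,
                   lookup: Lookup, choose: Chooser) -> str:
    """Translate one word, asking the sender only when it is ambiguous."""
    candidates = lookup(word, target_lang)
    if not candidates:
        return word            # nothing found: leave the word as typed
    if len(candidates) == 1:
        return candidates[0]   # unambiguous: no prompt needed
    return choose(word, candidates)


def compose(message: str, target_lang: str,
            lookup: Lookup, choose: Chooser) -> str:
    """Assist the sender by translating a message word by word."""
    return " ".join(
        translate_word(w, target_lang, lookup, choose)
        for w in message.split()
    )


if __name__ == "__main__":
    # Tiny in-memory dictionary used in place of a remote service.
    demo = {("bank", "es"): ["banco", "orilla"], ("river", "es"): ["río"]}

    def demo_lookup(word: str, lang: str) -> List[str]:
        return demo.get((word, lang), [])

    def pick_first(word: str, options: List[str]) -> str:
        return options[0]

    print(compose("river bank", "es", demo_lookup, pick_first))
```

Because the protocol does not care who answers a query, the same lookup
callback could be backed by an MT server (mode 1) or by human translators
(mode 3) without changing the client logic.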

We are interested in recruiting Jabber developers to create WWL-enabled
chat/IM clients. The project is at an early stage of development, so there
are many opportunities to get involved. There are also some great business
opportunities that will result from WWL. One I can envision is a
human-assisted translation network that allows chat/IM users to communicate
with people worldwide. While casual users may not want to pay for
translation, the business and government uses for such a system are
substantial.

To learn more about the Worldwide Lexicon project, visit our website at
www.worldwidelexicon.org.

Thanks for your time.

Brian McConnell, Project Leader



