[JDEV] dropped messages?

Dustin Puryear dpuryear at usa.net
Fri Jul 13 13:29:52 CDT 2001


I am trying to determine why messages appear to be dropped when jabberd 
is under a heavy load. I'm not sure if it's my code or jabberd, so I'm 
looking for ideas.

I have multiple user pairs, where each pair is composed of a sending 
client A and receiving client B. Under a high load not all messages sent 
from A appear to arrive at B. I find this odd because I would think that 
the delivery time would be the only thing affected under high load, not 
actual message delivery. Thus, I suspect my code, but can't rule out 
jabberd.

Following are Jabber Test Suite results generating my concern, and then 
a review of what is going on:

UserPs Rate  Duration MinDelTime MaxDelTime AvgDelTime MsgCnt ExpMsg 
MsgLossRate
10     1     60       0.00310    0.10796    0.00471    600    600    0.00000
20     1     60       0.00484    0.15478    0.00930    1200   1200   0.00000
30     1     60       0.00668    0.16320    0.01948    1800   1800   0.00000
40     1     60       0.00844    0.17075    0.02729    2400   2400   0.00000
50     1     60       0.01026    0.21504    0.04267    3000   3000   0.00000
60     1     60       0.00608    0.21595    0.05980    3600   3600   0.00000
70     1     60       0.00532    0.36635    0.05474    4200   4200   0.00000
80     1     60       0.01197    0.19757    0.03924    4800   4800   0.00000
90     1     60       0.00984    0.23845    0.04508    5400   5400   0.00000
100    1     60       0.00233    0.33065    0.06156    6000   6000   0.00000
110    1     60       0.00326    0.48642    0.05514    6600   6600   0.00000
120    1     60       0.00518    6.02132    0.07296    6960   7200   0.03333
130    1     60       0.00900    6.10875    0.10111    7380   7800   0.05385
140    1     60       0.01031    5.98667    0.09968    7980   8400   0.05000
150    1     60       0.01358    6.16324    0.10493    7980   9000   0.11333
160    1     60       0.01539    6.23972    0.14795    8280   9600   0.13750

Notice that at >= 120 user pairs (240 connected users), which equates to 
120 msg/sec in this test, my message loss rate varies from 3% to 13%. 
The average delivery also climbs to .14 seconds, but I don't consider 
that a problem. (However, the worst case delivery times are bad: > 6 
seconds for 150 and 160 user pairs.)

There are only two places that I feel messages could be getting lost: in 
jabberd and in msgloadrec, the receiving client. If it's in jabberd then 
I have to wonder why this is happening. If in msgloadrec, I'm also a bit 
bewildered.

Perhaps I am not handling my XML parsing correctly with expat? My XML 
character data handler is:

void char_data_hdlr(void *userdata, const XML_Char *s, int len)
{
	user_data_t *ud = userdata;
	char buf[MAX_XML_BUFSZ+1];
	struct timeval tv;
	reply_data_t *reply;
	int id;

	memcpy(buf, s, len);
	buf[len] = '\0';
	DPRINT("found message: %s\n", buf);

	/* scan for our start times at the beginning of the message */
	if (sscanf(buf, " %d %ld %ld ",
		&id, &(tv.tv_sec), &(tv.tv_usec)) == 3)
	{
		reply = malloc(sizeof(reply_data_t));
		if (reply == NULL)
		{
			perror("malloc()");
			exit(EXIT_FAILURE);
		}

		DPRINT("char_data_hdlr(): adding buf = %s with sec = %ld and usec = %ld\n",
				buf, tv.tv_sec, tv.tv_usec);

		reply->begin.tv_sec = tv.tv_sec;
		reply->begin.tv_usec = tv.tv_usec;
		reply->id = id;
		list_add(&(ud->reply_list), (void *) reply);
	}
}

All character data has the form: "decimal float float". Hmm, should I 
not be assuming that I will get the entire character data at once? 
Perhaps it is being split across multiple invocations of 
char_data_hdlr() by expat? Any ideas?

Regards, Dustin

-- 
Dustin Puryear <dpuryear at usa.net>
http://members.telocity.com/~dpuryear
In the beginning the Universe was created.
This has been widely regarded as a bad move. - Douglas Adams




More information about the JDev mailing list