
Server downtime Sep. 26

Posted: Wed Sep 26, 2007 9:02 am
by drwr
The server has suddenly started crashing hard this morning, while people are going about their ordinary business. My apologies for the downtime. I'll try to restart it as soon as I see that it's down.

I'm also trying to figure out why it's crashing. There's nothing unusual in the logs; they simply stop at the point where the process is suddenly no longer running. Usually a Python exception is caught and handled without shutting down the process, so it's probably not that, but I don't know what else would have suddenly started happening.

It might be crashing as an indirect result of some user action. For instance, maybe when user X goes to download his latest turn in game Y, the server tries to read a bad byte in the database and spits up on the floor. Unfortunately, since I don't log every individual user action, it's hard for me to tell if this is the case. Does anyone have any insight, for instance, "it seems to stop working every time I try to send a message to OptimusPrime"?
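
To illustrate what logging every individual user action could look like, here is a minimal Python sketch. It is not the server's actual code: `ActionLog`, `record`, `last_action`, and the log line format are all hypothetical names made up for this example. The idea is to write one line per action and force it to disk right away, so that after a hard crash the last line in the file points at whatever the server was doing when it died.

```python
import os
import time


class ActionLog:
    """Hypothetical append-only log: one line per user action."""

    def __init__(self, path):
        # Append mode, so earlier entries survive a server restart.
        self._f = open(path, "a", encoding="utf-8")

    def record(self, user, action, target):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        self._f.write("%s user=%s action=%s target=%s\n"
                      % (stamp, user, action, target))
        # Flush Python's buffer and ask the OS to write to disk, so the
        # entry is not lost if the process dies immediately afterwards.
        self._f.flush()
        os.fsync(self._f.fileno())

    def close(self):
        self._f.close()


def last_action(path):
    """Return the final line of the log, or None if there is none."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    return lines[-1] if lines else None
```

The server would call `record(...)` just before handling each request, and `last_action(...)` would be checked after a crash. Syncing on every write costs some speed, so this is something to switch on while hunting a crash rather than leave on forever.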

David

Posted: Wed Sep 26, 2007 10:26 am
by drwr
I think I've got a handle on what's causing it. Looks like a memory leak in the server.

David