AnimeSuki Forums

Old 2011-10-30, 09:08   Link #21
GHDpro
Administrator
*Administrator
 
 
Join Date: Jan 2001
Location: Netherlands
Age: 35
Update

Okay... seems like when I rebooted the server the BIOS thought it would be funny to swap the boot priority. And the SATA drive I mentioned above of course doesn't have an OS loaded on it.

When I fixed the boot priority the server booted without much of a problem.

Having said that, I will definitely check the logs to see what happened and will order a spare drive (or maybe two) so that if this drive (or another) does fail, I have the spare parts to replace it.
Old 2011-10-30, 14:11   Link #22
GHDpro
I've finished analyzing what went wrong and the cause appears to be quite simple. The main OS drive, which also holds the MySQL database and the web server (Apache/nginx) logs, is an SSD.

SSDs are known to be really fast, which is why I chose one as the main drive. However, they have one disadvantage: after a certain number of writes they will fail. Under normal circumstances (normal desktop use for example) you won't hit the "write limit" of an SSD within the lifetime of your PC (or server in this case).

But the various data gathering scripts are causing upwards of 50 MB of written data per minute. If you do the math, that adds up to more than 70 GB per day. And as I mentioned before, SSDs don't like that. So while fortunately the SSD isn't toast, it was "smoking" under the I/O write load.
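For anyone who wants to check the math, here is a quick back-of-the-envelope sketch in Python. The only input taken from the post is the 50 MB per minute figure; the function name and the decimal convention (1 GB = 1000 MB) are just assumptions for illustration, and the real write rate was described as "upwards of" this number.

```python
# Rough estimate of daily and yearly write volume from a per-minute rate.
MB_PER_MINUTE = 50           # lower bound quoted in the post
MINUTES_PER_DAY = 24 * 60    # 1440 minutes in a day


def daily_write_gb(mb_per_minute: float) -> float:
    """Return data written per day in GB, assuming 1 GB = 1000 MB."""
    return mb_per_minute * MINUTES_PER_DAY / 1000


per_day = daily_write_gb(MB_PER_MINUTE)
print(f"{per_day:.0f} GB per day")                 # 72 GB per day
print(f"{per_day * 365 / 1000:.1f} TB per year")   # 26.3 TB per year
```

So even at the low end this is on the order of tens of terabytes per year on a single drive.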

The solution would be to simply not use an SSD but an old-fashioned SATA HDD for storing everything. This is also what I've done as an immediate fix to prevent the problem from happening again: simply move the MySQL database and web server log files to the secondary SATA drive that is already in the server. In the long run I'll probably want to reinstall the OS on the SATA drive as well, as I'm not sure I can fully trust the SSD that is in the server anymore.

I'll also look into why the scripts are causing that much I/O as quite frankly it seems a bit much.

BTW for those wondering why this problem manifests itself now: the torrent index was moved to a different (colocated) server about a week ago.
Old 2012-01-29, 14:03   Link #23
GHDpro
The forum server was down for ~45 minutes just now because I had to run checks on the MySQL database and take a fresh backup for replication.

The reason is that sometime earlier this weekend the forum server rebooted itself uncleanly. That may have caused database problems and certainly did break replication. (The database is replicated in real time to another server, which is used mainly for backups. Taking a backup directly on the main server would normally freeze the database for several minutes each time, and backups are something you'd want to do at least daily.)

As far as I can see there do not appear to have been any major problems caused by the unclean reboot.
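As an illustration of what "replication is broken" looks like in practice, here is a minimal Python sketch that reads the text output of MySQL's `SHOW SLAVE STATUS\G` and reports whether both replication threads are running. This is not the check that was actually run on the forum server; the function names are hypothetical and the sample text is made up for the example. Only the field names `Slave_IO_Running`, `Slave_SQL_Running` and `Seconds_Behind_Master` come from MySQL itself.

```python
def parse_slave_status(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines from SHOW SLAVE STATUS\\G style output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields: dict[str, str]) -> bool:
    """Replication is only working if both the I/O and SQL threads run."""
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")


# Made-up sample output, NOT taken from the actual server.
SAMPLE = """\
Slave_IO_Running: Yes
Slave_SQL_Running: No
Seconds_Behind_Master: NULL
"""

status = parse_slave_status(SAMPLE)
print("replication OK" if replication_healthy(status) else "replication broken")
```

When the check fails after an unclean reboot, the usual safe route is the one described above: verify the tables and reseed the replica from a fresh backup.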
Old 2012-05-26, 11:31   Link #24
GHDpro
Notice

I have been informed by the datacenter that one of the drives in the RAID array of our forum server has failed and needs to be replaced.

To replace the drive, the server needs to be powered down, which obviously will cause some downtime. If the replacement goes well the downtime should not last long (about an hour at most, I figure), but after the server is back up the RAID array needs to be rebuilt, which may cause performance degradation for some time.

I've not yet received the precise time the drive will be replaced from the datacenter, but I'll update this thread when I know more.

(In the meantime I've run a backup so that if anything goes catastrophically wrong there should not be much data loss.)
Old 2012-05-26, 11:49   Link #25
GHDpro
Update

The drive replacement is now scheduled for 3 am PDT on May 27 (Sunday).

This is 12:00 (noon) CEST (Central European time) and 11:00 am BST (UK time).

As previously indicated if all goes well the downtime should be minimal, but any complications could extend the downtime.
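To double-check the conversion for other time zones, here is a small Python sketch. It assumes the fixed summer-time offsets in effect on that date (PDT = UTC-7, BST = UTC+1, CEST = UTC+2) instead of looking them up in a time zone database, and the helper name `convert` is made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for late May (daylight saving time is active in all three).
PDT = timezone(timedelta(hours=-7), "PDT")
BST = timezone(timedelta(hours=1), "BST")
CEST = timezone(timedelta(hours=2), "CEST")


def convert(moment: datetime, target: timezone) -> datetime:
    """Return the same instant expressed in the target time zone."""
    return moment.astimezone(target)


# Scheduled start of the drive replacement: 3 am PDT on May 27, 2012.
start = datetime(2012, 5, 27, 3, 0, tzinfo=PDT)

for tz in (timezone.utc, BST, CEST):
    print(convert(start, tz).strftime("%Y-%m-%d %H:%M %Z"))
```

This prints 10:00 UTC, 11:00 BST and 12:00 CEST, all on May 27.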
Old 2012-05-27, 05:43   Link #26
GHDpro
FYI

The drive replacement has been completed and (as you can see) the forum is back up.

Performance degradation due to the RAID array rebuild appears to be minimal.

All times are GMT -5.


Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2014, vBulletin Solutions, Inc.
We use Silk.