warthog9: Warthog9 (Default)
[personal profile] warthog9
Computer Science has only 3 numbers that matter.
  • 0 - There is nothing to worry about, this is easy!
  • 1 - Nearly as easy as 0, things are simple, doing things once isn't so hard & honestly this is how most people think.
  • Many / Infinite - This is where things get a little tricky. The step from 0 to 1 isn't a huge jump, but the step from 1 to Many is a doozy.
I've got maybe 60-70% of the infrastructure in place to deal with parallel MediaWiki installs on korg, and a lot of the grunt work with respect to PHP is done.
w00t progress!

However, a quick backstory on what's going on, so I have an easy place to point people who are grumbling about the speed of Bugzilla, Patchwork, the wikis, or basically any web app on kernel.org.

The box that has been, and still is, running the dynamic content is an older ProLiant DL385 G1. It's an awesome box, and it's served us well, having previously been the master box for kernel.org. However, at some point the cache card stopped being able to charge the battery, so the card now thinks the battery has failed and *TADA*, because the battery is toast, it won't enable the accelerator's cache. This doesn't mean a whole lot for reads, but writes are now painfully slow.

What does this mean? It means the box at times crawls to a near halt as the load spikes well above several hundred. We've tried replacing the battery, but it was fickle: it didn't work, then it did, and now it's back to not working. There's a quote sitting in my inbox for a new cache card, but I haven't dealt with it yet because it's lumped in with a bunch of other orders that are kinda expensive, and I want to get it all dealt with and done *PROPERLY* this time (i.e. without the sales guys slightly screwing up my order again).

All is not lost, however: we already have a box that is intended to run in parallel with the current one. Getting it set up and far enough along to actually start hosting content is painstaking, though, mainly because of the 0, 1, Many problem above. It doesn't help that the only times this seems to get solved, it gets solved in an unportable way that no one really shares, so I feel like I'm constantly breaking new ground and fighting with everything to make it all work. Such is the life of a sysadmin.

Where I'm at:
  1. Things that are currently working:
    • MySQL is Master-Master replicating across both boxes
    • I've now got what looks like a solid custom PHP session handler that stores sessions in the DB with a separate read and write path (read local, write remote). I still need to test "failover" with this eventually
    • MediaWiki seems to be set up far enough that it works on both machines (this does not mean it's ready to actually run)
  2. Things that still need doing:
    • linux-ha needs to be set up, and I need to define the network addresses that things like the DBs will be using
    • failover needs to be tested
    • Bugzilla needs to be set up with parallel instances
    • Redmine, Patchwork, Kerneloops are all going to need work before they will run in a parallel setup
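
For the curious, here's a rough sketch of the read-local, write-remote session idea from the list above. It is not the actual handler running on kernel.org: it's Python rather than PHP, two in-memory SQLite databases stand in for the two MySQL masters, and every name in it (SplitPathSessionStore, create_schema, replicate, the sessions table) is made up for illustration. The replicate() helper is a crude stand-in for real replication, there only to show that a read won't see a write until the change has made its way back to the local copy.

```python
import sqlite3
import time


def create_schema(conn):
    """Create the (hypothetical) sessions table on one database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "id TEXT PRIMARY KEY, data TEXT NOT NULL, updated REAL NOT NULL)"
    )
    conn.commit()


class SplitPathSessionStore:
    """Session store that reads from one connection and writes to another."""

    def __init__(self, read_conn, write_conn):
        self.read_conn = read_conn    # "local" database: cheap reads
        self.write_conn = write_conn  # "remote" database: all writes go here

    def read(self, session_id):
        """Return the stored session data, or "" if nothing is visible locally."""
        row = self.read_conn.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return row[0] if row else ""

    def write(self, session_id, data):
        """Insert or overwrite the session on the write path only."""
        # SQLite upsert syntax; a MySQL version would need its own equivalent.
        self.write_conn.execute(
            "INSERT OR REPLACE INTO sessions (id, data, updated) VALUES (?, ?, ?)",
            (session_id, data, time.time()),
        )
        self.write_conn.commit()


def replicate(src, dst):
    """Crude stand-in for replication: make dst's sessions match src's."""
    rows = src.execute("SELECT id, data, updated FROM sessions").fetchall()
    dst.execute("DELETE FROM sessions")
    dst.executemany(
        "INSERT INTO sessions (id, data, updated) VALUES (?, ?, ?)", rows
    )
    dst.commit()


# Demo: two independent databases playing the two boxes.
local = sqlite3.connect(":memory:")
remote = sqlite3.connect(":memory:")
create_schema(local)
create_schema(remote)

store = SplitPathSessionStore(read_conn=local, write_conn=remote)
store.write("abc123", "user=example")
seen_before = store.read("abc123")  # nothing yet: the write only hit "remote"
replicate(remote, local)
seen_after = store.read("abc123")   # visible once "replication" has run
print(repr(seen_before), repr(seen_after))
```

The point of the split is that every box writes sessions to the same place while reads stay cheap and local. The cost is that a session written a moment ago may not be readable until replication catches up, which is part of why failover still needs testing.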
I'm sure there's a pile of other stuff that I'm happily ignoring right now and will "discover" when I run across it.  I'm making forward progress, but I'm dealing with the leap from 1 to Many and I'm not quite there yet.  I think I'll finish getting linux-ha set up, and once I know that's working, the best solution is to just fail the current box out of rotation, plow forward with the rest of the setup, and bring things online slowly as they get done.

