Posts Tagged ‘system administration’

Administrating MongoDB

Thursday, March 17th, 2011

During the last two months I have spent a lot of time in close proximity to mongodb. Enough so that I feel like I’ve learned some things I should pass on. These are rooted in mistakes I have made and survived.

  • Think really hard about why you need mongodb. This is not because mongodb is bad, but it may be the wrong choice for what you’re trying to do.
    • Do you know what data will be the most valuable to you from your app? Then don’t use mongodb.
    • Can you afford to take downtime to modify the relationships between your data? Then don’t use mongodb.
    • Do you have existing trustable models which project the expected growth for your data? Then don’t use mongodb.
    • Are there one or more glaring unknowns about the data from your app? Then do use mongodb.
  • Get yourself a support contract with 10gen. They are the best source of mongodb information, advice, and help.
  • Your minimum production environment is nine servers.
    • Your servers should be as congruent as you can get them, for the same reason the drives in a RAID should be.
    • You’re going to want a dedicated RAID-10 device for each mongod process’s datafiles.
    • You want the drives to be maximized for speed. RPMs matter more than capacity when you are getting started.
    • RAM is the other thing which mongod is hungry for. The more the better.
    • The nine servers will be distributed like so:

      shard 1                                     | shard 2                                     | shard 3
      primary replSet member                      | primary replSet member                      | primary replSet member
      secondary replSet member                    | secondary replSet member                    | secondary replSet member
      delayed secondary replSet member + configdb | delayed secondary replSet member + configdb | delayed secondary replSet member + configdb

    • This is a reasonably robust setup to distribute data in a redundant manner. You can fail over to the non-delayed secondary in any given replSet. You can do stop+copy backups of your data at the delayed secondary member. You can put all three replSets into one sharded cluster, one shard per replSet, and avoid some lopsided conditions from, for example, disabling the Balancer process. If you have to cut costs, you can skimp on the bottom row of servers. Try not to have to do that.
  • Never delete anything. Have enough disk that you can ‘remove’ files by moving them to another place on the same system. This applies to configuration files, data files, log files, anything mongodb related.
  • Find out about numactl. This is a good clearinghouse post about numactl as it applies to mysql. At large core + large memory sizes, it applies to mongodb, as well.
  • Graph everything you can. Iostat, memory usage, swap usage, mongodb operations, throughput at every layer in front of mongodb. You’ll need to know when a problem is with mongodb and when it’s higher or lower in the stack.
  • Run your mongod instances on alternate ports from the default.
  • Favor mv and scp over rsync when moving mongodb data files around.
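The "never delete anything" rule is easier to follow if moving is less typing than deleting. Here is a minimal sketch of that habit; the function name retire and the graveyard directory are made up for illustration, and it assumes the graveyard sits on the same filesystem as the files so that mv is a rename rather than a copy.

```shell
# retire FILE...: "remove" files by moving them into a graveyard directory.
# GRAVEYARD is a hypothetical location; keep it on the same filesystem as the
# files being retired so that mv is a cheap rename instead of a copy.
GRAVEYARD="${GRAVEYARD:-/var/tmp/mongodb-graveyard}"

retire() (
  stamp=$(date +%Y%m%d-%H%M%S)
  mkdir -p "$GRAVEYARD" || exit 1
  for f in "$@"; do
    # Keep the original name and append a timestamp (to the second) so a
    # later retirement of the same file name lands next to the older copy.
    mv -- "$f" "$GRAVEYARD/$(basename "$f").$stamp" || exit 1
  done
)
```

Something like retire /etc/mongod.conf.old would then leave a timestamped copy in the graveyard instead of destroying it.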

Puppet for Absolute Beginners

Thursday, November 19th, 2009

This post is addressed to people who more or less are in the situation I was in two months ago.  Notably:

  1. you manage an environment which is mostly Ubuntu 8.04.mumble LTS
  2. you know enough Ruby to recognize it when you see it
  3. you want to start using Puppet to manage your environment

I wrote this post because you’ll run into problems trying to use the probably otherwise fine examples in the documentation, through a combination of puppet being a moving target and LTS being a static environment.

First, if you get inexplicable errors about certificates while configuring and tweaking and experimenting, shut down the puppetmasterd on the master server, kill any puppet process on the client machine, and remove the contents of /var/lib/puppet/ssl on both systems involved.  This seems especially common if you’re doing any hostname tomfoolery.
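The "remove the contents" step can be sketched as a small function. The name clear_puppet_ssl is made up, and the default path is simply the ssldir from the puppet.conf in this post; stopping the daemons is left out on purpose because how you do that depends on your init scripts.

```shell
# clear_puppet_ssl [SSLDIR]: empty Puppet's SSL directory but keep the
# directory itself. The default is the ssldir used in this post's puppet.conf.
clear_puppet_ssl() (
  ssldir="${1:-/var/lib/puppet/ssl}"
  [ -d "$ssldir" ] || { echo "no such directory: $ssldir" >&2; exit 1; }
  # -mindepth 1 spares $ssldir itself; -delete removes contents depth-first.
  find "$ssldir" -mindepth 1 -delete
)
```

Run it on both systems, and only after puppetmasterd is stopped on the master and no puppet process is left running on the client.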

Second, here’s what a very basic puppet.conf file might look like:

# This file is managed in //depot/ops/puppet/
[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
pluginsync=true
certname=mask.play

[puppetmasterd]
templatedir=/var/lib/puppet/templates
server=mask.play

The only real changes from the shipped default are an explicit certname and server value. It might not be obvious what I’ve set them to, here, but that’s a server name (on an internal only TLD). On the client machines you won’t need either of those set and you can squelch some noise by getting rid of the pluginsync= option.  In my environment, because of DNS tomfoolery, I had to explicitly make a record in /etc/hosts for the puppetmasterd’s server on the client systems, telling them to look for puppet at a specific address.
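The /etc/hosts record can be added in a repeatable way. This is a sketch under assumptions: it assumes the clients look for the master under the default name puppet, the function name is made up, and the address in the usage note is a placeholder from the documentation range, not a real one.

```shell
# add_puppet_host IP [HOSTS_FILE]: make the name "puppet" resolve to IP by
# appending a record to the hosts file, unless a record for "puppet" exists.
add_puppet_host() (
  ip="$1"
  hosts="${2:-/etc/hosts}"
  # Match "puppet" as a whole hostname field on a non-comment line.
  if grep -Eq '^[^#]*[[:space:]]puppet([[:space:]]|$)' "$hosts"; then
    exit 0
  fi
  printf '%s\tpuppet\n' "$ip" >> "$hosts"
)
```

For example, add_puppet_host 192.0.2.10 run as root on a client. Running it a second time changes nothing.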

To get started setting up puppet, refer to the Simplest Puppet Recipe page, specifically Step four: Run a client.  What you want to achieve here is an introduction between the puppet client node and the puppetmasterd server.  You’ll see your success or failure in /var/log/daemon.log on both systems.  Now about the rest of that Simplest Puppet Recipe page.
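Since the introduction succeeds or fails in /var/log/daemon.log, a small filter saves some scrolling. A sketch, assuming the daemons log to syslog under the tags puppetd and puppetmasterd (if the pattern finds nothing, check what your own log lines look like); the function name is made up.

```shell
# puppet_log [LOGFILE]: show the last 20 puppet-related syslog lines.
# Assumes syslog tags of the form "puppetd[PID]:" and "puppetmasterd[PID]:".
puppet_log() (
  log="${1:-/var/log/daemon.log}"
  grep -E 'puppet(d|masterd)\[[0-9]+\]:' "$log" | tail -n 20
)
```

Run it on the client and on the master right after the client's first run and compare what each side says.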

If you read it in conjunction with the Puppet Best Practices page, you may get confused like I got confused, especially if you’re a congruent kind of dumb.  After going round and round, here’s what my puppet tree looks like in SCM and implicitly on the puppetmasterd server:

puppet
puppet/services
puppet/manifests
puppet/plugins
puppet/notes
puppet/clients
puppet/modules
puppet/modules/appserver
puppet/modules/appserver/manifests
puppet/modules/curl
puppet/modules/curl/manifests
puppet/modules/snmpd
puppet/modules/snmpd/manifests
puppet/modules/snmpd/files
puppet/modules/user
puppet/modules/screen
puppet/modules/screen/manifests
puppet/modules/ntp
puppet/modules/ntp/manifests
puppet/modules/ntp/files
puppet/modules/sudo
puppet/modules/sudo/manifests
puppet/modules/sudo/files
puppet/modules/bacula
puppet/modules/bacula/templates
puppet/modules/bacula/manifests
puppet/tools

You can see that most of the action is in puppet/modules.  Specifically, I would have started (if I knew then what I know now) with puppet/modules/sudo.  Notice two subdirectories under that.  The puppet/modules/sudo/files directory has in it the sudoers file we want to deploy to client systems.  The puppet/modules/sudo/manifests directory has in it the init.pp file.  For reference, here’s what my version of their simplest recipe looks like:
# /etc/puppet/modules/sudo/manifests/init.pp

class sudo {
  package { sudo: ensure => latest }

  file {
    "/etc/sudoers":
      owner => "root",
      group => "root",
      mode  => 440,
      source => "puppet:///sudo/sudoers",
      require => Package["sudo"];

    "/usr/bin/sudo":
      owner => "root",
      group => "root",
      mode  => 4755;
  }
}

As I recall, the permissions on Ubuntu 8.04 are different from those in the example and my choice of layout and file placement dictated the source parameter.  Speaking of file placement, the fileserver.conf I use:
[files]
  path /etc/puppet/modules/*/files
  allow *.playsomething.com
  allow *.play

Upon shutdown, puppetmasterd complains bitterly about that asterisk but it seems to otherwise work. I surmise I could well enumerate all the paths explicitly and shut up that bit of noise but it hasn’t seemed worth it, to date. The default fileserver.conf that ships in Ubuntu 8.04 expects you to serve files from /etc/puppet/files but that seems to conflict with the Puppet Best Practices so I went with the fancier place.

The rules for which modules get applied to which systems live under puppet/manifests.  Mine has three files: modules.pp, nodes.pp and site.pp.  They look like this:

# /etc/puppet/manifests/modules.pp
# managed from //depot/ops/puppet/manifests/modules.pp

import "sudo"

and
# /etc/puppet/manifests/nodes.pp
# managed at //depot/ops/puppet/manifests/nodes.pp

node basenode {
  include curl
  include ntp
  include screen
  include snmpd
  include sudo
}

node 'exhaust', 'gasket' inherits basenode {
  include bacula
}

node 'manifold', 'header' inherits basenode {
  include appserver
}

and
# /etc/puppet/manifests/site.pp

import "modules"
import "nodes"

filebucket { main: server => 'mask.play' }

File { backup => main }
Exec { path => "/usr/bin:/usr/sbin/:/bin:/sbin" }

Package {
  provider => $operatingsystem ? {
    debian     => aptitude,
    openbsd    => freebsd,
  }
}

If you can read between the lines there, puppet reads the site.pp file. That tells it, among other things, to read the modules.pp and nodes.pp files. The openbsd operatingsystem part is untested so don’t count on it, but that Ubuntu systems identify as debian is true and somewhat important to know.

The nodes.pp file is where all the action happens.  I’ve got five modules there defined as necessary for all kinds of systems; your environment almost certainly varies.  Then there are two kinds of nodes which inherit from basenode and add another module; each of those child node types names two servers which are that kind of a node, and includes a different module.  In this case, exhaust, gasket, manifold and header are all names of servers.

You can probably generalize from this to get a boost on rolling out Puppet.  Each of the things you want a server or set of similar servers to do is a module in puppet/modules/NAME/ with manifests/init.pp telling puppet all the rules of that module, and files/WHATEVER containing any files which should be pushed as part of enforcing that module.  You may have noticed that my puppet/modules/bacula doesn’t have a files subdirectory, but does have a templates subdirectory.
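The layout described above is regular enough to stamp out. A minimal sketch, with a made-up function name, that creates the directories for a new module and an empty class in init.pp; the default root of puppet matches the tree listed earlier.

```shell
# new_module NAME [ROOT]: create ROOT/modules/NAME with manifests/ and
# files/ subdirectories, plus an empty class in manifests/init.pp.
new_module() (
  name="$1"
  root="${2:-puppet}"
  [ -n "$name" ] || { echo "usage: new_module NAME [ROOT]" >&2; exit 1; }
  dir="$root/modules/$name"
  mkdir -p "$dir/manifests" "$dir/files" || exit 1
  # Never clobber an init.pp that already exists.
  [ -e "$dir/manifests/init.pp" ] && exit 0
  printf 'class %s {\n}\n' "$name" > "$dir/manifests/init.pp"
)
```

Calling new_module ntp from the top of the SCM workspace gives you puppet/modules/ntp/manifests/init.pp and puppet/modules/ntp/files to fill in. A module that uses templates instead of files would want a templates directory, which this sketch does not create.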

Here’s what the init.pp for bacula looks like:

# /etc/puppet/modules/bacula/manifests/init.pp

class bacula {

  package { bacula-client: ensure => latest }

  file {
    "/etc/bacula/bacula-fd.conf":
      owner => "root",
      group => "root",
      mode  => 640,
      content => template("bacula/bacula-fd.conf.erb"),
      require => Package["bacula-client"];

    "/usr/sbin/bacula-fd":
      owner => "root",
      group => "root",
      mode  => 755
  }

  exec {
    "/etc/init.d/bacula-fd restart":
      subscribe   => File["/etc/bacula/bacula-fd.conf"],
      refreshonly  => true
  }
}

I’m providing it not because it’s the most awesome use of templates ever but because I found using templates very intimidating as a non-Ruby coder, so I had to do a couple shots of tequila before trying to write one. And here’s the puppet/modules/bacula/templates/bacula-fd.conf.erb file:
# /etc/bacula/bacula-fd.conf
# managed from //depot/ops/puppet/modules/bacula/templates/

Director {
  Name = akadi-dir
  Password = "yrmomsaidimthebestlickintown"
}

FileDaemon {
  Name = <%= hostname %>-fd
  FDport = 9102
  WorkingDirectory = /var/lib/bacula
  Pid Directory = /var/run/bacula
  Maximum Concurrent Jobs = 20
}

Messages {
  Name = Standard
  director = akadi-dir = all, !skipped, !restored
}

It’s a tiny substitution, just customizing the bacula-fd.conf file on the client systems so that each one names itself distinctly.  The point being I just took a bacula-fd.conf, figured out the piece which needs to be different on different systems, and substituted a variable to differentiate them.  It’s easy.  There, I just saved you $7 on tequila.
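If you want to see what a client would end up with before pushing anything, you can fake the substitution by hand. To be clear, this is plain sed standing in for ERB, it only understands the one literal tag used in this template, and the function name is made up.

```shell
# render_fd_conf TEMPLATE [HOSTNAME]: preview the template by replacing the
# single ERB tag it uses with a host name. This is sed, not ERB: it only
# handles the literal text "<%= hostname %>".
render_fd_conf() (
  template="$1"
  # Default to this machine's short host name when none is given.
  host="${2:-$(hostname -s)}"
  sed "s/<%= hostname %>/$host/g" "$template"
)
```

For example, render_fd_conf puppet/modules/bacula/templates/bacula-fd.conf.erb gasket prints the file with the FileDaemon Name line reading gasket-fd.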

Oh, so the last detail of absolute square one stuff.  Getting the important bits from your SCM workspace to the puppetmasterd workspace.  In our environment, we use a lot of Makefiles to deploy files from SCM so that’s what I did here.  It’s dead-simple but just in case you need a nudge, it looks like this:

# Makefile for puppetmasterd configuration.
# binder@manjusri.org takes the blame for this one

mask.push:
	rsync -av --delete --exclude '*/README' --exclude 'Makefile' . root@mask.play:/etc/puppet/

It’s just an invocation of rsync which blows away anything in /etc/puppet on the puppetmasterd server that is no longer in the SCM workspace, leaves the README and Makefile files behind in the SCM workspace, and copies everything else into the /etc/puppet directory on the puppetmasterd server. Reloading of the daemon is done manually at this point but you could always modify this Makefile to reload your configuration once you push.

OK, that’s it.  That’s everything I think I know about Puppet which I didn’t know two months ago.
