CouchDB from Src How-To

Well, this was more effort than it should have been. I have been dabbling with Erlang for a while, and after struggling with MySQL, Tomcat and JDBC I was looking for an alternative web-app stack. CouchDB looks to be perfect, although it is sufficiently new that there is not a lot of documentation. Both of the books available right now are OK, but not brilliant: in general I prefer the style of CouchDB: The Definitive Guide but prefer the examples from Beginning CouchDB. My personal preference with programming and software tool books is that they should provide detailed, hand-held, speak-to-me-like-I-am-an-imbecile walk-throughs of the most common basic use cases. Anyhow, CouchApp looks like a great way to develop apps for CouchDB, although its documentation is also fairly sparse.

The main problem I found is that the versions of Erlang and CouchDB available from the repositories for both Ubuntu and Debian are so far behind the cutting edge that the examples in the books won't run. I found that the best way to set up my CouchApp development environment is to avoid the repositories completely and build from source in a clean Ubuntu server VM using the following steps:

  • $ sudo apt-get update
  • $ sudo apt-get clean
  • $ sudo apt-get upgrade

We are going to be building some software so the following tools are useful:

  • $ sudo apt-get install build-essential subversion git-core openssh-server

Install Erlang

Install CouchDB

  • $ sudo apt-get build-dep couchdb
  • $ sudo apt-get install xulrunner-dev libicu-dev libcurl4-gnutls-dev libtool
  • $ wget http://mirrors.ukfast.co.uk/sites/ftp.apache.org/couchdb/1.0.0/apache-couchdb...
  • $ tar zxvf apache-couchdb-1.0.0.tar.gz
  • $ cd apache-couchdb-1.0.0
  • $ ./configure
  • $ ./configure --with-js-lib=/usr/lib/xulrunner-devel-1.9.2.3/lib --with-js-include=/usr/lib/xulrunner-devel-1.9.2.3/include
  • $ make
  • $ sudo make install
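
The xulrunner-devel version in the ./configure line changes with package updates, so it can help to detect it rather than type it. Here is a minimal sketch, assuming the package installs under /usr/lib/xulrunner-devel-<version> as it did for me; the function name xul_configure_flags is my own invention and not part of CouchDB:

```shell
#!/bin/sh
# Print the --with-js-* flags for ./configure, based on whichever
# xulrunner-devel-* directory exists under the given base (default /usr/lib).
# Assumption: the directory layout matches the one used in the steps above.
xul_configure_flags() {
    base=${1:-/usr/lib}
    found=
    for dir in "$base"/xulrunner-devel-*; do
        # The glob expands in lexical order, so the last match wins.
        if [ -d "$dir" ]; then
            found=$dir
        fi
    done
    if [ -z "$found" ]; then
        echo "no xulrunner-devel directory under $base" >&2
        return 1
    fi
    printf '%s %s\n' "--with-js-lib=$found/lib" "--with-js-include=$found/include"
}

# Usage (from the unpacked apache-couchdb source directory):
#   ./configure $(xul_configure_flags)
```

If more than one version is installed, the lexical ordering may not pick the one you want, so check the output before passing it to ./configure.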

Final Setup & Running CouchDB

  • $ sudo adduser --system --home /usr/local/var/lib/couchdb --no-create-home --shell /bin/bash --group --gecos "CouchDB Administrator" couchdb
  • $ sudo chown -R couchdb:couchdb /usr/local/etc/couchdb
  • $ sudo chown -R couchdb:couchdb /usr/local/var/lib/couchdb
  • $ sudo chown -R couchdb:couchdb /usr/local/var/log/couchdb
  • $ sudo chown -R couchdb:couchdb /usr/local/var/run/couchdb
  • $ sudo chmod -R 0770 /usr/local/etc/couchdb
  • $ sudo chmod -R 0770 /usr/local/var/lib/couchdb
  • $ sudo chmod -R 0770 /usr/local/var/log/couchdb
  • $ sudo chmod -R 0770 /usr/local/var/run/couchdb
  • $ sudo ln -s /usr/local/etc/init.d/couchdb /etc/init.d/couchdb
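
The eight chown and chmod commands above repeat the same four directories, so they could be folded into a loop. This is only a sketch: setup_couchdb_dirs and the RUN variable are names I made up, and the /usr/local prefix is assumed to match the default used by make install.

```shell
#!/bin/sh
# Apply the ownership and permission changes to all four CouchDB
# directories in one loop. RUN is a hypothetical knob: it defaults to
# sudo, and RUN=echo prints the commands instead of executing them.
setup_couchdb_dirs() {
    prefix=${1:-/usr/local}
    for dir in etc/couchdb var/lib/couchdb var/log/couchdb var/run/couchdb; do
        ${RUN:-sudo} chown -R couchdb:couchdb "$prefix/$dir" || return 1
        ${RUN:-sudo} chmod -R 0770 "$prefix/$dir" || return 1
    done
}

# Dry run:  RUN=echo setup_couchdb_dirs
# For real: setup_couchdb_dirs
```

The dry run is worth doing first, since a recursive chown on the wrong prefix is tedious to undo.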

Put xulrunner-devel on the Shared Library Search Path

  • $ sudo tee /etc/ld.so.conf.d/couchdb.conf
  • /usr/lib/xulrunner-devel-1.9.2.6 (NB. Check the version that you have installed!)
  • <CTRL-D>
  • $ sudo ldconfig

Running CouchDB Manually

  • $ sudo -i -u couchdb couchdb

Running CouchDB As a Daemon

  • $ sudo /etc/init.d/couchdb start
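
Once CouchDB is running, a GET on the server root returns a small JSON welcome document that includes the version, which makes for a quick sanity check. A rough sketch, assuming curl is installed and CouchDB is listening on its default 127.0.0.1:5984; couchdb_version is a helper name of my own:

```shell
#!/bin/sh
# Extract the "version" field from CouchDB's welcome JSON read on stdin.
# This is a crude sed match for a one-line response, not a JSON parser.
couchdb_version() {
    sed -n 's/.*"version" *: *"\([^"]*\)".*/\1/p'
}

# Usage, assuming the default bind address and port:
#   curl -s http://127.0.0.1:5984/ | couchdb_version
```

If nothing is printed, either the server is not answering or the response is not in the shape this sketch expects.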
 

Real World Shell Scripting

Just to prove to you that shell scripts do come in handy as a way to save time and make a boring job into something more fun, here is a little script called wikify.sh that I used today to turn a file containing a list of papers into an unordered list for inserting into a wiki page. The papers were of the following form:
ito2008digital.youth.pdf
and I wanted them to be like this instead:
* [[ito2008digital.youth]]
This involves cutting off the .pdf ending and putting two spaces, a star and double square brackets around the filename. Because there were more than a thousand papers in the list I obviously didn't want to do this by hand, which would have been both time-consuming and boring, so instead I let wikify.sh do the work.
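
The original wikify.sh is not reproduced here. As a stand-in, this is a minimal sketch of one way to do the same transformation with sed, assuming the input has one filename per line; it is not necessarily how the original script worked:

```shell
#!/bin/sh
# Hypothetical stand-in for wikify.sh: reads filenames on stdin, one per
# line, strips a trailing .pdf and wraps each name as a wiki list item.
wikify() {
    sed -e 's/\.pdf$//' -e 's/.*/  * [[&]]/'
}

# Usage (papers.txt and papers.wiki are example file names):
#   wikify < papers.txt > papers.wiki
```

The first expression only removes .pdf at the very end of the line, so the dots inside names like ito2008digital.youth are left alone.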
 

Wubi Ubuntu Installer

For those of you looking for an easier way to install Ubuntu, you can try the Wubi Ubuntu Installer, which promises to make dual booting an Ubuntu machine as non-invasive as using a virtualised machine yet as responsive as using a native install. Supposedly Wubi creates a file on the hard disk that appears to Linux to be a regular filesystem, and appears to Windows to be a regular file located at c:\ubuntu\disks\root.disk, so you can essentially run Ubuntu without having to create a partition for your Linux install to live in. Installing and uninstalling is therefore almost as simple as installing and deleting a regular Windows program. Many of you will have realised that I am not the biggest fan of virtualising the desktop; running Linux virtualised under Windows just feels abhorrent to me, so anything that helps you get Ubuntu up and running with the least amount of blood, sweat and tears is a good thing. If anybody tries this out, please tell me how you get on.
 

Bash Quoting

I found a couple of articles over at the Linux Journal that will give you a bit more understanding of the wonders of quoting in Bash. The first article will probably be of more use to you as it deals with the basics of quoting in scripts, whereas the second deals with the intricacies of quoting when a script drives another program; their example is scripting interaction with a MySQL database.
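
To give a flavour of what those articles cover, here is a small sketch of my own (not taken from either article) showing how unquoted, double-quoted and single-quoted expansions differ. The count_args helper is invented for the example:

```shell
#!/bin/sh
# count_args is a helper invented for this example: it reports how many
# separate arguments the shell passed in after expansion and word splitting.
count_args() {
    echo "$#"
}

name="two words"

count_args $name        # unquoted: split on whitespace into 2 arguments
count_args "$name"      # double quotes: expanded, kept as 1 argument
count_args '$name'      # single quotes: 1 argument, no expansion at all

echo "double: $name"    # prints: double: two words
echo 'single: $name'    # prints: single: $name
```

The unquoted case is the one that bites in practice, typically when a filename contains a space.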
 

The Editor of "Real Programmers"

The always excellent xkcd has this to say on the subject of editor choice and its relationship to real programmers:
[Image: xkcd comic]
It seemed apropos given the last post.
 

Choosing an Editor

During this morning's lecture I mentioned options for editing on Linux. There are two basic paths that you can take: a GUI editor or a text-based editor. If you go the GUI route then there are many options for Notepad-type functionality, such as gedit on GNOME or KEdit on KDE. However, to really take advantage of Linux you need to learn a proper editor, by which I mean a command-line editor such as VI (pronounced Vee-Eye), VIM (VI-Improved), or Emacs. Unfortunately the learning curve is very steep on these editors, so I can do nothing more than suggest that you try them out and see. Alternatives to the big command-line editors are lightweight editors like pico or nano, which at least give you some on-screen help to save you from immediately getting completely lost, as is usually the case with VI and Emacs the first time you try them.

For interest, you should be aware that the VI versus Emacs debate has been going on for many years and is akin to a holy war. Ultimately, the choice of editor is a personal decision based upon the features you need and your perseverance. That said, we can learn a lot from other people's reasons for choosing one editor over another. For example, over at Charlie Stross' blog there was a post a couple of days back about his choice of writing tools (he is one of the best sci-fi writers around at the moment and is part of a very strong Scottish science fiction landscape). Charlie is a VI user because Emacs gives him repetitive stress, and the choice tends to stick, as one of the commenters pointed out:
Your choice of a text editor is kind of like a tattoo, isn't it? After a while you have to look back on that decision you made when you were fifteen and realize, "yup, that's just never going to go away."
So I am interested to hear your views, especially if you have tried out the editors available in the Ubuntu VM and have settled on one that you will use during the remainder of the module.
 

First Steps With Tor

I finally got around to getting Tor working on one of my workstations. The process is quite simple to set up to enable anonymous browsing. My workstation is running Ubuntu so I firstly had to install the Debian/Ubuntu APT repository (NB. More detailed instructions for steps 1-3 are available from the Tor Project Debian/Ubuntu Installation Instructions and more details on steps 4 - 7 are available from the Tor Project general *nix installation instructions).
  1. To your /etc/apt/sources.list add (replacing <DISTRIBUTION> with the name of your distro which you can copy from your existing sources.list):
    deb http://deb.torproject.org/torproject.org <DISTRIBUTION> main
  2. Now you need to add the crypto keys
    gpg --keyserver keys.gnupg.net --recv 886DDD89
    gpg --export A3C4F0F979CAA22CDBA8F512EE8CBC9E886DDD89 | sudo apt-key add -
  3. Now update your apt package list then install both Tor and the Tor Geographical IP database
    $ sudo apt-get update
    $ sudo apt-get install tor tor-geoipdb
  4. Install Polipo:
    $ sudo apt-get install polipo
  5. Configure Polipo by downloading the ready-made Polipo configuration file from the Tor Project, moving it to /etc/polipo and renaming it to config
  6. Configure your web browser to use Tor. I am using Firefox and configuration was as easy as installing the TorButton plugin and restarting Firefox.
  7. Verify that Tor is working correctly by visiting the Tor Detector.
Remember that this is not a panacea for privacy issues on the web and all of the technical protection offered by Tor can be squandered in a heartbeat through user action. Good personal online security and privacy practice is essential.
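
Step 1 above asks you to copy your distribution name by hand. A small sketch that builds the line for you, assuming lsb_release is installed (it is on a stock Ubuntu desktop, but check); tor_sources_line is a name I made up:

```shell
#!/bin/sh
# Print the APT source line from step 1 for a given release codename.
# With no argument, fall back to lsb_release -cs (assumed to be installed).
tor_sources_line() {
    codename=${1:-$(lsb_release -cs)}
    printf 'deb http://deb.torproject.org/torproject.org %s main\n' "$codename"
}

# Usage:
#   tor_sources_line | sudo tee -a /etc/apt/sources.list
```

Passing the codename explicitly is handy when the Tor repository does not yet carry packages for the release you are actually running.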
 

Synchronising the Clocks on Linux Boxen

The rsync program relies upon accurate clocks on the machines that it is synchronising between, because it uses modification timestamps to determine which copy of a file was altered more recently. So, as the first stage in getting a distributed, bi-directional rsync system up and running to synchronise my files between several workstations and a fileserver, I have had to install and set up some tools that correct the clocks on each of these machines. The default is to use ntpdate, which is installed on many *nixes by default but is only set to update the system clock at startup. A cron job is therefore needed to run ntpdate more often. Create an executable (chmod 755) /etc/cron.daily/ntpdate file containing:
#!/bin/sh
ntpdate ntp.ubuntu.com pool.ntp.org
Because a machine's hardware clock can drift over time it also helps to install the ntp daemon, which calculates the drift in your system clock and adjusts it. To set it up, all that is required is:
sudo apt-get install ntp
followed by adding the ntp server pool to /etc/ntp.conf:
server ntp.ubuntu.com
server pool.ntp.org
This process needs to be followed on all machines that are going to be part of the synchronisation pool because they all need to have accurate clocks.
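
Since the same cron file has to be created on every machine, it may be worth scripting. A minimal sketch, with install_ntpdate_cron being a hypothetical helper name and the destination path made overridable so it can be tried without root:

```shell
#!/bin/sh
# Write the daily ntpdate job to the given path and make it executable.
# The default path is the one used above; pass another path to try it
# out without root.
install_ntpdate_cron() {
    dest=${1:-/etc/cron.daily/ntpdate}
    printf '%s\n' '#!/bin/sh' 'ntpdate ntp.ubuntu.com pool.ntp.org' > "$dest" || return 1
    chmod 755 "$dest"
}

# Usage (as root): install_ntpdate_cron
```

Running it twice is harmless because the file is simply overwritten with the same two lines.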
 

Debian Install with RAID 1

Whilst I use LVM without RAID to manage the drives on my media server (the media rarely changes, so the offline backup is easy to keep in sync), I have started using RAID 1 to mirror the contents of my file server to guard against the failure of a single drive. The data on the file server changes often enough that my regular backup might be out of sync by up to a day, which could lose me a day's work.
  1. Using the Debian alternate install CD, at the partition disks screen, select manual
  2. Select the free space on your first drive and create 3 new partitions as follows: 2GB for /boot, 2GB for swap, and the remainder for /
  3. For each partition, select "physical volume for RAID" as its use, and ensure that the boot partition is marked as bootable.
  4. Repeat the above steps for the second drive
  5. Now select "Configure software RAID" from the main partition disks screen
  6. Create 3 multidisk (MD) devices, each configured as RAID 1 with 2 active and 0 spare disks, one MD device for each of the boot, swap and root partitions. Select the correct corresponding matched partitions to include in each MD device, e.g. sda1 and sdb1, sda2 and sdb2, sda3 and sdb3
  7. After returning once more to the main Partition Disks screen you should see the RAID devices that you just created listed alongside the partitions from before. Create file systems on each RAID device in the same fashion as for normal, non-RAID, partitions, e.g. RAID device #0: type ext3 mounted at /boot, RAID device #1: type SWAP, RAID device #2: type ext3 mounted at /
  8. Write the changes to disk and continue with a normal installation
After installation you must also install grub to the master boot record on the second drive in the RAID array so that the system can boot from the other drive in the case of one drive failing.
$ sudo grub-install /dev/sda
$ sudo grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
Verify that you have the correct filesystems in the correct places, e.g.
$ grep /dev/md /etc/fstab
$ df -h /
The status of RAID devices can be checked using the /proc/mdstat file. Each mdN device contains two sdXN disks and each mdN device should have "2/2" and "UU":
$ cat /proc/mdstat
The mdadm utility gives more details:
$ sudo mdadm --query --detail /dev/md0
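
Eyeballing /proc/mdstat for "UU" gets old quickly, so the check can be scripted. A sketch under the assumption that the status string looks like [UU] when healthy and shows an underscore (e.g. [U_]) for a missing member; check_mdstat is a name I invented:

```shell
#!/bin/sh
# Read /proc/mdstat-style text on stdin and report any array whose status
# string (e.g. [UU]) contains an underscore, which marks a missing member.
# Exit status is 1 if any array is degraded, 0 otherwise.
check_mdstat() {
    awk '
        /^md[0-9]+ :/        { dev = $1 }
        /\[[U_]*_[U_]*\]/    { print dev " is degraded"; bad = 1 }
        END                  { exit bad + 0 }
    '
}

# Usage: check_mdstat < /proc/mdstat
```

Because the exit status reflects the result, this could be dropped into a cron job that only sends mail when something is wrong.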
 

The Watch Command or tail for directories

I had a long running job copying some huge media archives from external storage to a new server. I wanted to periodically keep tabs on the process and as I was using cp to copy the data I made sure to use the -v argument so that there was some feedback to the terminal to tell me how things were progressing:
cp -Rv usbhdd/ /home/media/
Whilst doing this I thought that it would be cool to have a version of tail that worked for directories rather than just for files. Well it turns out that you can, after a fashion. There is the watch command which gives similar functionality by causing a command to be executed periodically and the results displayed fullscreen:
watch -n 10 "ls -la"
This will cause the ls -la command to be executed every ten seconds and the result displayed. Whilst finding out about this I discovered that the watch command isn't on every system and that the following script can be used to provide the same functionality:
#!/bin/sh
# Poor man's watch: redraw the five most recently modified entries every 5 seconds
while true
do
     clear
     ls -lrt | tail -5
     sleep 5
done
This works, but clearing the screen each time is distracting, and it lacks the niceties of the watch command, such as the -d flag that highlights how the output changes between successive updates.
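
If the screen clearing bothers you, one variation is to print a timestamped snapshot each time and let the terminal scroll. This is just a sketch of that idea; poll is a name made up for it, and the run count is there so the loop ends on its own:

```shell
#!/bin/sh
# poll INTERVAL COUNT COMMAND...: run COMMAND every INTERVAL seconds,
# COUNT times, printing a timestamp before each run instead of clearing
# the screen. poll is a name made up for this sketch.
poll() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "== $(date) =="
        "$@"
        i=$((i + 1))
        # Skip the pause after the final run.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Usage: poll 10 360 ls -la /home/media
```

Because the command is passed as plain arguments, a pipeline such as ls -lrt | tail -5 needs wrapping, e.g. poll 5 100 sh -c 'ls -lrt | tail -5'.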