Computing that has Purpose & Meaning

I think that the "Land War in Asia (aka, CS undergraduate education)" post over at FCS' Difference Engine Blog skirts around some of the ideas I have been playing with recently about teaching Computer Science and Applied Computing. One phrase in particular stands out:

    We need to teach Computing that has purpose and meaning


which I think is what Applied Computing tries, but often fails, to do. The idea suggested in the linked post is for CS teaching to tell a story (perhaps metaphorically) with a beginning, middle, and end. In the real world many people do not need skills from the more theoretical end of CS. I say this with my tongue ever so slightly in my cheek, having thought that I wouldn't need any of those geometry and trigonometry skills from Mathematics until I started working on visualisations of argument structure. Although I say that for effect, as I actually needed those skills when I worked on building sites doing joinery in the real world. In computing though, the skills that people do need are applied computing skills. These are good, practical, real-world computing skills that cover at least formal skills in:

  •     problem solving,
  •     ideation,
  •     coding,
  •     source control,
  •     development methodology,
  •     team-working,
  •     communication
  •     probably others but I can't think of any right now...


Additionally there needs to be sufficient background and experience of the academic side of CS to allow the student to have some idea of the breadth of tools and techniques that CS can offer, as well as how to acquire and apply them to their own problems. However, I don't think that Applied Computing should be taught in isolation, as I do not want to turn Applied Computing into a vocational course. I think that what is required is the story, provided by an associated application domain that scaffolds the use of computational tools to solve problems. For example, Applied Computing with Life Sciences, Psychology, Law, Engineering, or any of a myriad of subject domains. At university level these should of course be selected from those areas that the University is already good at. For example, the University of Dundee has a large and successful College of Life Sciences and increasing numbers of vacancies for software developers which it struggles to fill. Unfortunately there is not sufficient joined-up thinking to ensure that LifeSci students are getting a thorough grounding in computing, so the graduates that are being produced are not suited to some of the jobs that are becoming available. With the increasing trend towards heavy computer use in the life sciences this is a bad situation. Additionally, the Applied Computing degree does not have close ties with many of the other subject domains that Dundee does well, so it ends up being a kind of theory-light CS. This gives graduates lots of tools to take away with them but no subject matter of their own to which they can apply them.

Interestingly this approach might be useful in tackling that age-old question of how to attract and retain women undergraduates in CS. In "Is Teaching Computer Science Different from Teaching Other Sciences?" (1997), Bernstein suggests that women see computers as tools and men see computers as toys. I think that this is an overly simplistic view, since gender bias and preferences are much more complicated than that. Even so, teaching computers as tools is essentially what I have suggested above, with the caveat that an application domain gives you the structure and narrative into which to fit those computing tools.

Posted
 

Getting Started with Network Analysis & Twitter

If you are interested in the network analysis of social media data but don't know where to start then this description of a BetaLabs skill sharing workshop should be enough to get you bootstrapped. Assuming you know some Python, the remaining tools are as follows:

  • Tweepy - Twitter API Python Library
  • NetworkX - Network/Graph Python Library
  • Gephi - Network Visualisation Application

Essentially the class did the following:

  1. Used Tweepy to access the Twitter API, grab lists of users related to BetaWorks and, for each user, access their relationships
  2. Used NetworkX to build a graph of relationships using the data retrieved from Twitter
  3. Exported the graph in GraphML format
  4. Imported the graph into Gephi for manipulation, personalisation, investigation, &c.

Example code is available from GitHub
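
Steps 2 and 3 above can be sketched roughly as follows. This is not the workshop's code: the FOLLOWS dictionary is made-up data standing in for whatever Tweepy returns from the Twitter API, and the function names and the output filename are my own assumptions. Only the NetworkX calls (DiGraph, add_node, add_edge, write_graphml) are real library functions.

```python
import os
import tempfile

import networkx as nx

# Hypothetical stand-in for data fetched with Tweepy:
# each user maps to the list of accounts they follow.
FOLLOWS = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
}


def build_graph(follows):
    """Build a directed graph with one edge per follower -> followed pair."""
    graph = nx.DiGraph()
    for user, followed in follows.items():
        graph.add_node(user)
        for other in followed:
            graph.add_edge(user, other)
    return graph


def export_graphml(graph, path):
    """Write the graph as GraphML, a format that Gephi can open."""
    nx.write_graphml(graph, path)


graph = build_graph(FOLLOWS)
out_path = os.path.join(tempfile.mkdtemp(), "twitter.graphml")
export_graphml(graph, out_path)
print(graph.number_of_nodes(), "users,", graph.number_of_edges(), "relationships")
```

The resulting .graphml file is what you would then open in Gephi for step 4.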

Posted
 

Scooby Doo as Critical Thinking Training for Children

Over at Comics Alliance there is a nice "Ask Chris" post that sets out to answer the Question, "On Scooby-Doo, do you prefer the monsters to be real or people in costumes?". What is nice about the answer "people in costumes" is that Chris goes further and claims that "there should never, ever be even a trace of the supernatural in the world of Scooby-Doo".

This is because Scooby-Doo is not about the supernatural. It's not really a cartoon about kids fighting monsters but about kids looking for truth.

"the world is full of grown-ups who lie to kids, and that it's up to those kids to figure out what those lies are and call them on it, even if there are other adults who believe those lies with every fiber of their being. And the way that you win isn't through supernatural powers, or even through fighting. The way that you win is by doing the most dangerous thing that any person being lied to by someone in power can do: You think."


Seen through the lens of late sixties idealism, Scooby-Doo becomes a cartoon in which the viewer is trained to understand that whatever the mystery, you just have to observe, ask questions, and think to come up with rational explanations. The monsters don't really exist, just bad people who want to make you too scared to look, let alone ask questions or think, so that they can take advantage. Against this backdrop Scooby-Doo becomes a really nice way to educate children in the prerequisite skills of critical thinking: (1) observing, (2) questioning, and (3) thinking.

 

Addendum: There is also a nice discussion thread about this over at BoingBoing

Posted
 

Deploying & Packaging Javascript

I have been doing a fair bit of web oriented coding recently, finally getting to grips with Javascript and newer web technologies. The upshot is that I nearly have a new tool for visualising argumentation structures ready to push to my github account. In getting ready to do this I started finding out about things like minification and packaging for Javascript files to make the downloads as efficient as possible. The general consensus seems to be that you minify and package everything up into a single Javascript file which gives you the best trade-off of minimising the number of HTTP connections and minimising the amount actually transferred but without the computational overhead of unpacking.

One thing I was unsure about was how to handle versioning at deploy time, especially as I move towards continuous deployment of various projects. It turns out that one way of doing this is to name the Javascript file with the hash of the file instead of just calling it all.js or adding a query string. This gives a unique name for your src file without subjecting you to the vagaries of any cloud deployment infrastructure that may be operating on eventually consistent principles. This is nicely explained, with diagrams, in this post over at Ben Kamen's blog. In the comment thread from that post, one commenter also suggests using an Amazon S3 bucket to hold Javascript files, and making sure that the new Javascript file is deployed there before any code that accesses it is made available to users.
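
As a rough sketch of the hashed-filename idea, here is how the renaming might look in Python. The function names, the choice of SHA-256, the 12-character truncation, and the decision to keep the original stem in front of the hash are all my own assumptions for illustration, not something taken from the linked post.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def hashed_name(path, digest_chars=12):
    """Return a filename that embeds a hash of the file's contents."""
    source = Path(path)
    digest = hashlib.sha256(source.read_bytes()).hexdigest()[:digest_chars]
    return f"{source.stem}.{digest}{source.suffix}"


def publish(path, out_dir):
    """Copy the bundle into out_dir under its content-hashed name."""
    target = Path(out_dir) / hashed_name(path)
    shutil.copyfile(path, target)
    return target


# Demonstration with a throwaway bundle in a temporary directory.
workdir = Path(tempfile.mkdtemp())
bundle = workdir / "all.js"
bundle.write_text("console.log('hello');\n")
deployed = publish(bundle, workdir)
print(deployed.name)
```

Because the name depends only on the contents, an unchanged bundle keeps its name between deploys, while any change produces a new name that pages can reference as soon as the file has been uploaded.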

Posted
 

The Argument Against Jeremy Clarkson

So Jeremy Clarkson said something else that was stupid and designed to inflame public opinion. As Ben Goldacre points out "Complaining about Clarkson expressing an offensive view is like complaining that the wheels just fell off your clown taxi". The problem I have with what he said was that he sought to isolate an identifiable group of people and advocate bad things happening to them. Do I want that kind of speech restricted? No. Definitely not. I support freedom of speech especially for those people who say things with which I disagree because freedom of speech does not exist without the freedom to utter unpopular speech. That said, if you say something that causes offense then there may be repercussions, I just don't believe that these should be legal repercussions but that repercussions, if any, should come from public opinion, perception, and reaction.

But that is an aside to what this post is really about. The thought that was stirred up by this was more about a particular type of argument associated with this kind of outburst, specifically the way that we react to this kind of statement. For example, Clarkson, referring to public-sector workers who were on strike on November 30th, said the following:

 

   "I’d have them all shot. I would take them outside and execute them in front of their families."


One of the reasons that he should not have said this is that if you replace the group "public-sector workers" with Black people, women, children, homosexuals, transgender people, Jewish people, or any other minority group then you get a much more serious statement. A statement that in some cases could conceivably see you landed in jail. For example:

    "I’d have Homosexuals shot. I would take them outside and execute them in front of their families."


In these cases the defense of "only joking" rings hollow on most occasions. I suspect that, given that Clarkson is also a public-sector worker, he doesn't literally advocate what he said but rather uttered something that stemmed partly from an "I'm alright Jack" perception of events and was partly calculated to cause some amount of offense, as his job in relation to anything that isn't related to cars seems to be taking potshots at easy targets like some kind of BBC-sponsored bully. If Clarkson had used the exact same words in reference to black people then he could have been charged under one of those heinous little incitement laws that target the effects of discrimination rather than the underlying causes so that the government can be seen to be doing something without really doing anything of consequence.


For the moment I will refer to this stereotypical pattern of reasoning as the argument for the substitution of subjects which essentially has the following structure:

CONCLUSION: Alpha's argument about subject X should not be accepted.

SUBSTITUTION PREMISE: The argument would not hold if you substitute subject Y for subject X.

EQUIVALENCE PREMISE: What holds for subject Y also holds for subject X.

MINOR PREMISE: Subjects X &/or Y are groups of individuals that have a special status.

What is interesting here is that an evaluation of the original argument or statement can be made based upon changing the variables. An argument with respect to group Y would not be acceptable so the argument with respect to group X is not acceptable either. This is interesting because it has a flavour of strawman about it. The defense against the position deliberately misrepresents the stated position of the speaker by ascribing a position to them that they haven't stated that they hold. Usually this is considered to be a rhetorical practice: misrepresent your opponent's position to create something that is easier to attack than their actual stated position. In this case though it is a quite devastating rhetorical technique, particularly because when used in public it creates an indefensible position. It is a difficult defense to maintain to say "of course I wouldn't support doing X to Y but Z are different" because it has the feeling of creating a minority that it is alright to do terrible things to because somehow they are different or 'other'.

We see this kind of argument quite often when dealing with tough real-world problems. This usually occurs in relation to problems that are affected by some form of prejudicial feeling. For example, some quite despicable things are said in discussions on so-called 'gay marriage' or 'gay adoption' (NB. these terms are quoted because in my opinion there is JUST marriage and JUST adoption regardless of the sexuality of those concerned).

For example, I have heard people say that homosexual people should not be allowed to raise adopted children. If you substitute an alternative minority group for gay then the argument immediately becomes ludicrous: the same people will rarely say that they believe that black or Jewish or disabled people should not be allowed to raise children. Generally these utterances and the associated substitution expose the prejudices of the speaker more than they provide an edifying and well reasoned argument for or against a given stance. They are in fact one of the ways to distinguish whether the entire dialogue is worth having in the first place. Gilbert suggests in "How To Win An Argument" that you should never argue with a fanatic; perhaps this should be extended to bigots as well.

Posted
 

More on Early Coding & Education

Yesterday I suggested that we start teaching children to code as early as possible. I suggested a couple of points in support of this. My main point was to instill in the computer users of the future that computer use is about much more than just utilising the software installed on the machine but is about shaping the computer into an appropriate tool for your problems. Coding, of some sort, is a necessary element of this shaping process. This applies whether we are connecting tools from the standard toolset together using a shell to create new pipelines, writing new tools in Python or other similar languages, or crafting software from scratch in assembly. They all require similar basic problem solving abilities which are actually not really about computing.

My other point was to give children a head start in learning their tools and to ensure that they have the time necessary to learn them well. There is an article by Peter Norvig named "Teach Yourself Programming in Ten Years" [ http://norvig.com/21-days.html ] which takes issue with the teach yourself programming in x days/weeks/months series of books. Basically you can learn the syntax of a language in a few days but to really program well using that language requires experience and knowledge of the capabilities and limitations of that language, not to mention learning the, probably extensive, standard and third-party libraries that mean that you don't have to constantly reinvent the wheel. This obviously takes time, and I am not for a moment suggesting that our children should be learning a vast array of languages and leaving junior school as ready-made software developers. What I do expect though is that, given that a lot of active learning is about assimilating and organising information, a child should be able to reach for the computer as the primary tool to work with information, in the same way that during the pre-computer age we would reach for a pen and paper when needing to make notes. I want to go further than this though. Using the computer as an information management tool is quite straightforward with tools such as text editors and spreadsheets, but to stop there is to deny the child the real power of the machine, the power to bend the machine to their will, the power of coding.

How do we do this? Well the approach taken by the Raspberry Pi [ http://www.raspberrypi.org/ ] project might be a start. However I worry that this approach distances us from the computers that we meet in everyday life. What might be better would be to make sure that the computers that our children encounter contain the basic tools that they need in order to use the machine in the ways that I outlined above. One approach might be to start installing Linux as the default learning environment so that programming tools are there as standard but that is a whole 'nother can of worms that I don't fancy opening today. Another approach might be through play or writing games. Yet another might be through embedding computing tools at all levels within the curriculum. Of course any of these approaches will require effort but isn't that a part of the continual process of ensuring that our children are taught the things that they need to know and that teachers have the skills necessary to do so?

Why is it worth doing this? In the "Reading, Writing, Programming?" [ http://www.itworld.com/software/228381/new-school-curriculum-reading-writing-... ] article at ITWorld there is the following comment:

We're living in an exciting time when someone with a hot business idea, Javascript coding skills, and a free Amazon cloud account can get an internet application up and running in a weekend - and put out iOS and Android apps almost as quickly. That kind of freedom should be the kind of thing that appeals to kids looking for their 'ticket out of the ghetto'.

Perhaps this is the path we should be enabling. Inspiring the entrepreneurial spirit, as early as possible, and giving young people the tools for success.

Posted
 

When & why should we teach children to code

The BBC news article "Coding - the new Latin" talks about a campaign to improve the teaching of computing skills, including teaching kids in school how to code. Partly this is to make it more clear to kids what is involved in real computing as opposed to clerical computing, with the goal being to produce better graduates. The article suggested to me the following question:

When and why should we start teaching our children how to code?

My own position is that this should start as early as possible. Why? Partly my belief is based on fond memories of doing simple programming in BASIC on my Commodore 64 at the age of seven; if I enjoyed it then why shouldn't anyone else? Quite apart from my personal belief is the knowledge that:

"The tool of the 21st century is the computer"

In other words, regardless of what area of work you are in, there is a good chance that you will need to use a computer. The sooner our kids are comfortable with computer use, up to and including coding, the better.

However, just using a computer is not always enough, and to say otherwise reiterates a common misunderstanding of computing. Computing can be delineated into three kinds: clerical computing, which is using the standard tools that somebody else wrote and applying them to our problems without any deep understanding of how they work; computer science, which is not really about computers and their use, which I explored in a previous article, and which can get rather theoretical; and applied computing, which is the use of computers to solve real problems by being a tool builder rather than merely a tool user. I think that teaching kids to code sooner will lead us to have fewer clerical computer users and more applied computer users, which can only be a good thing.

Doing applied computing, which is what most coding really is, needs us to understand that the computer is a practically limitless tool that we can shape to our needs. We don't just use a computer, we shape it, we mold the computer so that it provides the kinds of tools we need to provide solutions in our own individual problem domains. If you sit at another person's machine you will intuit this when you see that they have things set up differently, or different software installed. But this is just skin-deep. When you really use a computer you start building new software that nobody else has installed, because you, and possibly ONLY you, need that particular piece of software. This is most apparent when you sit at the machine of a Unix or Linux user. You will find scripts and tools written by the user to perform their own tasks, to solve their own problems. Usually these tools are created in the spirit of the Unix philosophy (do one thing, do it well, communicate with other tools, facilitate reuse, &c.) but most importantly they are symptomatic of a computer user taking control of their machine and fashioning new tools to supplement the standard ones.
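
To make that concrete, here is a minimal sketch of the kind of small personal tool I have in mind: a filter that counts words in whatever text it is given. The names count_words and main, and the sample text, are purely illustrative. It reads from one stream and writes plain tab-separated lines to another, so it does one thing and can be chained with other tools.

```python
import io
import sys
from collections import Counter


def count_words(lines):
    """Count lower-cased, whitespace-separated words across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def main(stream_in=sys.stdin, stream_out=sys.stdout, top=10):
    """Write the most common words as 'count<TAB>word' lines."""
    for word, n in count_words(stream_in).most_common(top):
        stream_out.write(f"{n}\t{word}\n")


# Demonstration on an in-memory stream rather than a real pipe.
sample = io.StringIO("the cat sat\nthe dog\n")
result = io.StringIO()
main(sample, result, top=1)
print(result.getvalue(), end="")
```

Saved as a script that calls main() with its default arguments, the same code would read standard input and write standard output, which is what lets it sit in the middle of a shell pipeline.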

I will reiterate, just using the computer is not enough, you need to own the computer. You need to ensure that the computer is your tool and that you know how to use it as such. You need to know the kinds of problems you can solve as standard with the computer and how to adapt that tool to solve new problems as they arise. It is the skill of tool adaptation that coding gives you, and the sooner you learn to code, the sooner it will become just another internalised skill that enables you to use your computer to solve problems quickly and accurately whenever they arise. A child that learns to code at high school age will have a lead over those who start coding at college or beyond. Similarly, a junior age child who can sit at a computer and produce code to satisfy their need will have a huge advantage over those that cannot. Note that I am not suggesting at this point that kids should be writing code that is ready for production use, although that would really change the nature of child labour and outsourcing in some areas of the world. What I am suggesting is that children are empowered to ask questions and seek answers for themselves, and that their primary tool for working with problems can be, and should be, the computer.

For me, a second, and more important, question is not when should we start teaching children to code, but how should we teach children to code? We need to frame this in such a way that children of all genders are empowered to code. Against this backdrop, the percentage of male applicants to undergraduate computer science courses is increasing despite some efforts being made to make computing more attractive to all genders. So my final corollary today is, how should we present coding as an interesting and useful skill for children of all genders to cultivate?

Posted
 

Software Development as Plumbing

Came across two posts today that compare the development of software to plumbing, emphasizing the importance of constructing the joins between software components, as much as building the components themselves. In his post, John Cook coins the term 'plumber programmer' to describe the job of the programmer who connects things together. He illustrates the concept with two diagrams showing how we traditionally, and often innately, construct software architecture diagrams that concentrate on the components that make up the system, whereas there can be as many interconnections between the components as there are distinct components. These connections can include API design and usage, delegation, interfaces, and all of the associated testing and artifacts that follow from these things. These diagrams often illustrate the interconnections with simple arrows showing interconnection or communication, whereas a more realistic diagram might draw those arrows in proportion to the amount of effort required to do those connections properly.

One of the commenters in the thread on that post illustrates this idea in relation to other engineering disciplines such as automotive, electrical, and civil engineering where the maturity of these fields can be evaluated in terms of the management of complexity within them and the development of a hierarchy of reusable components and agreed abstractions which mean that, unless you have exceptional circumstances, it is not necessary to make those components yourself. To some degree, software engineering as a discipline is maturing in this regard; we have middleware tools and database tools that stop us from having to re-engineer those layers ourselves, choosing instead to use off-the-shelf solutions, and we also have software design patterns, a language that enables us to communicate solutions with each other.

The second post also explores the same space, but instead of from the interconnection perspective, focusses on the widget building nature of software engineering. The key take away from this is the idea that "(software) engineering would benefit from a more explicit focus on plumbing skills", a more explicit focus on building, testing, and maintaining the APIs that allow these complex modern software systems to be built from individual components. I know that the design, implementation and use of programming interfaces is, at best, poorly handled in most formal taught software development & user centered design courses at university level in the UK. Whilst the value of well-designed graphical interfaces has long been an important part of computing degree courses, the same cannot be said of the design of textual & software interfaces. Given the increasing interconnectedness of modern software systems, this is something that has to change as we move towards software development as a real engineering discipline.

Posted
 

The Batteries Included CouchDB build system

I have been working on a new project, or at least a new incarnation of an old project, that is now in need of a NoSQL datastore. Having played with it in the past, CouchDB was an obvious choice. Because the repos generally seem to be behind the cutting edge in terms of libraries and actively developed tools, I looked into building from source, as I had already put some time into documenting the build process on Ubuntu in an earlier post some time ago. This time however I was happy to discover a new tool to ease the build process: build-couchdb

This tool does not just simplify the process of building CouchDB to the point of needing you only to kick off a rake process, it also manages all dependencies and allows you to specify particular builds of CouchDB from the git repo. The process is much simpler than my earlier attempt:

  • git clone git://github.com/iriscouch/build-couchdb
  • cd build-couchdb
  • git submodule init
  • git submodule update
  • rake

When the rake process has finished, which might take some time depending upon the speed of your build machine, you should be able to start CouchDB using:

  • build/bin/couchdb

then access CouchDB from the usual url: http://127.0.0.1:5984/
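
If you would rather check from a script than a browser, something like the following sketch should do. The function names are my own, and the sample response (including the version string) is only illustrative: the real server returns a small JSON welcome document that contains further fields depending on the build.

```python
import json
import urllib.request

COUCH_URL = "http://127.0.0.1:5984/"


def parse_welcome(body):
    """Extract the version from CouchDB's JSON welcome document."""
    info = json.loads(body)
    if info.get("couchdb") != "Welcome":
        raise ValueError("not a CouchDB welcome document")
    return info.get("version", "unknown")


def couchdb_version(url=COUCH_URL, timeout=5):
    """Ask a running CouchDB instance for its version string."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_welcome(response.read().decode("utf-8"))


# Illustrative response only; the version shown here is made up.
sample = '{"couchdb": "Welcome", "version": "1.1.1"}'
print(parse_welcome(sample))
```

With the server started as above, calling couchdb_version() should return the version of the build you have just made, and will raise an error if nothing is listening on that port.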

I have plans for an upcoming series of posts on getting started using CouchDB and Python to build webapps. If you can't wait for that then the "Getting Started with CouchDB" articles over at NetTutsplus are as good a practical intro as any. If like me you like to also have a printed book to support you then take a look at "CouchDB: The Definitive Guide" and "Beginning CouchDB". Whilst they are never going to be considered the classic texts on this technology they do provide a decent enough overview if you work at it.

Posted
 

Musings on Research Impact

There have been a lot of changes in my working life over the last year or so, with more details in an upcoming post. One of the things that I have come to reconsider is the basis of my research and how it connects to the wider real world. This is in the context of the "academic research" sphere where the drive to publish is currently the prime mover, at least in most computer-science departments. This drive to publish means that, in my experience, only journal publications are taken into consideration in setting the research agenda and managing the activities of early career researchers. This can be to the exclusion of meaningful real-world impact that actually gets the outputs of research into the hands of those whose problems the research was designed to tackle.

Over the last ten years I have seen many great pieces of research, that could have real-world impact, languish in computer science departments because the grant had run out, or the work was entirely theoretical so there was actually nothing to release, or the journal paper was published so there is nothing left to do, or the university could not see any way to profit from the research.

Two illustrative examples spring to mind here; in a recent interview Andrew Tanenbaum opines that:

"I published a paper in 1978 on something very close to the Java Virtual Machine, but we never got much credit for it although we were years ahead of Sun."

This illustrates my conjecture that just the paper is not enough when we are talking about the outputs of academic research. You need to release the code as well, ideally via a source control mechanism. This should be in a form that is easily reusable, including build scripts. This should be done no matter how "hacky" the code. One of the worst things to hear, and I am guilty of this as well, is "we will release the software as soon as we have cleaned up the source code". In my defense, at the time I didn't know any better, and am now actively taking steps to repair the situation. I actually heard this again a fortnight ago, and I know it is always said with the best of intentions but it generally means that the code won't be released because there is never the time to do the cleanup, and if you don't have the mindset to work in the open, warts and all, then you are unlikely to ever feel that your code is ready for release. This can best be tackled by starting the project in the open and continuing that way. If you already have code then just bite the bullet and release it. I suspect that if Tanenbaum had done this with his JVM code, assuming that there was such a thing to underpin the paper, then it is likely that such a useful technology would have garnered significant uptake and hence Sun would never have had to re-invent the technology to scratch their itch.

My second example deals with the important notion of uptake of an idea. If the fruits of your research are released in a form that is usable by people outside the research niche in which you are working, not just by those who happen to have read your journal paper, and if the idea solves some real problem, then there is every chance that someone outside of your immediate circle will find a reason to use it. This is the beginnings of community. I would suggest that a new software tool, based directly on current research, and which garners even a couple of thousand users, has at least an equivalent impact factor to the average mediocre journal publication. Yet it is this area, software released by academic departments around which a large community coalesces, which I have seen abandoned because "it doesn't buy us anything to continue with this", even when other universities move into the same area several years later and begin to spin out profitable, university-backed companies based on less-functional clones of the original software.

So, I suggest two lessons for computer science research:

  1. Where appropriate we should aim to release early and release often, whether that is code, data, discussion, or conjecture.
  2. If this leads to community-building and uptake, even if it is not a scientific community, then you must recognise the value of that community and argue strongly that the community is an existence proof of the impact of your research agenda.

An alternative would be to do the research in the open as much as possible. If code is written as part of the research then that code should be immediately released. Minimally the core project repository will be hosted on a publicly accessible server. Ideally the repo. will be located on GitHub (other distributed source control tools are available) where it is more easily discoverable than your departmental project server. Ultimately we need to effect a sea change in expectations with respect to academic research. Yes, we need to clearly and cogently write up our results in a manner that is concise and easily communicates our discoveries to others, something that papers are an OK mechanism for. This should not change, it is a necessary condition of being part of the existing research community. We also need to recognise that in the longer term, this is not enough, and that we will develop new and better methods of not just distributing results and findings but also of making those discoveries in the first place.

Posted