Usenix overview

 

 

Usenix held its annual conference in Boston the week of June 23-27.  The conference included about 20 tutorials, 3 days of technical sessions, guru-is-in discussions of a number of technical issues, and several interesting invited talks.  I’ve summarized some of the highlights below.

 

On Wednesday, June 25, Robert Lang gave an overview of origami and how it’s increasingly being used for practical problems, such as folding bulky items so they fit in confined spaces like the space shuttle’s cargo bay.  Drew Endy, who’s in the process of moving from MIT to Stanford, gave an overview of some of the challenges of synthetic biology and why they might interest computer programmers.  On Thursday, David Patterson described the challenge of programming multicore architectures and how various groups are starting to address it.  Jim Waldo talked about Darkstar and the role of games in driving developments in computer architecture.  On Friday, Ajay Anand described the Hadoop project, and Matthew Melis summarized the investigation into the space shuttle Columbia disaster and how the lessons learned may apply to other complex systems.

 

Origami

 

Origami has come a long way from the designs you learned in grade school.  The basic rule is “one sheet, no cuts”, but mathematical techniques have taken this remarkably far.  There are 36,000 origami designs listed at www.origamidatabase.com.  Two of the classical Greek construction problems that cannot be solved with straightedge and compass are solvable using origami:  Hisashi Abe used origami to trisect an angle, and Peter Messer used it to double a cube.  There are 7 axioms that define ways to construct a fold: the 6 Huzita axioms defined in 2002 and a seventh discovered independently by Koshiro Hatori and Jacques Justin and added to the set a little later.  You can construct solutions to all quadratic and cubic equations with rational coefficients using origami; straightedge and compass alone can handle the quadratics but not the general cubic.  Applications include folding stents for insertion into a blood vessel, airbag simulation and packing, and the design of space telescopes, solar arrays and deployable antennas that must be stored compactly for transport and then unfold smoothly into their final shape.
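
Angle trisection and cube doubling are out of reach for straightedge and compass because both reduce to cubic equations.  Trisecting an angle theta amounts to finding x = cos(theta/3), which satisfies 4x^3 - 3x - cos(theta) = 0 (the triple-angle identity), and doubling a unit cube amounts to solving x^3 = 2.  The sketch below only checks those two reductions numerically.  It does not model any folds, and the function names are made up for illustration.

```python
import math

def trisection_cubic(x, theta):
    """Value of 4x^3 - 3x - cos(theta); it is zero when x = cos(theta / 3)."""
    return 4 * x**3 - 3 * x - math.cos(theta)

def cube_doubling_cubic(x):
    """Value of x^3 - 2; it is zero when x is the edge of a cube of volume 2."""
    return x**3 - 2

def bisect_root(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi] by bisection (f must change sign on the interval)."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return (lo + hi) / 2

if __name__ == "__main__":
    theta = math.pi / 3  # 60 degrees, a classic angle that cannot be trisected by compass
    x = bisect_root(lambda v: trisection_cubic(v, theta), 0.5, 1.0)
    print("cos(20 degrees) from the cubic:", x, "direct:", math.cos(theta / 3))
    edge = bisect_root(cube_doubling_cubic, 1.0, 2.0)
    print("edge of the doubled cube:", edge, "direct:", 2 ** (1 / 3))
```

An origami construction finds these roots geometrically with a single fold that places two points onto two lines at once, which is the step that has no straightedge-and-compass counterpart.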

 

Multicore

 

David Patterson gave a talk on the challenges of parallelization and why it is important to solve them.

 

The computer architecture and chip industry is in a quandary.  Intel cancelled its high-clock-rate, long-pipeline designs to conserve energy.  Every company founded to sell parallel computer hardware has foundered, including Convex, Encore, Inmos (the Transputer), MasPar, NCUBE, Kendall Square Research, Sequent and Thinking Machines.  The physics of processor production now dictates many computational engines working in parallel rather than more powerful processors or faster clock rates, but programmers have not yet figured out how to program them effectively for more than a few specialized problems.

 

Jim Gray listed developing software for parallel architectures as one of the 12 Grand Challenges for the computer industry in 1998.  (Others included interpretation and generation of natural speech, understanding images, and real artificial intelligence.)  Krste Asanovic and Rastislav Bodik defined 7 computational dwarfs, patterns of communication and computation common across a set of applications, that need to be confronted if effective parallel programs are to become a reality.  David Patterson and the Berkeley View team expanded this into a list of 13 motifs: finite state machines, combinational logic, graph traversal, structured and unstructured grids, dense and sparse matrices, spectral methods (FFT), dynamic programming, n-body, MapReduce, backtrack/branch-and-bound, and graphical models.  They produced a matrix showing in which of 11 main application areas (embedded systems, desktop systems, games, databases, machine learning, high performance computing, medicine, music, speech, CBIR and browser core code) each motif is important.

 

 

 

Adrian Cockcroft on Millicomputing

 

Millicomputing is to microcomputers what micros were to minis.  The milli stands for milliwatts.  One goal of millicomputers is to run cool so lots of them can be packed together for enterprise computing without air conditioning.  Another goal is long battery life for mobile computing.  Millicomputers run slower, but use much less power, so they are attractive for mobile devices that you may want to keep active for a long time.

 

The current state of the art is a 620 MHz machine with 128 MB RAM and 8 GB of long-term (disk or flash) storage that runs cool and needs to be recharged daily.  The Freescale i.MX31 SoC (with an ARM1136 CPU) uses 250 mW of power when active, idles at 2 mW and costs about $100.  Check out the Homebrew Mobile Phone Club to learn more.
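
To see how the active and idle figures translate into time between charges, here is a rough back-of-the-envelope sketch.  Only the 250 mW and 2 mW numbers come from the talk.  The battery capacity and duty cycle in the example are hypothetical, and the calculation covers the SoC alone, ignoring the display, radio and storage.

```python
ACTIVE_MW = 250.0  # i.MX31 active power quoted in the talk
IDLE_MW = 2.0      # i.MX31 idle power quoted in the talk

def average_power_mw(active_fraction):
    """Average SoC power for a given fraction of time spent active (0.0 to 1.0)."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_fraction * ACTIVE_MW + (1.0 - active_fraction) * IDLE_MW

def runtime_hours(battery_mwh, active_fraction):
    """Hours of operation from a battery of the given capacity in milliwatt-hours."""
    return battery_mwh / average_power_mw(active_fraction)

if __name__ == "__main__":
    battery_mwh = 3000.0  # hypothetical battery capacity, not a figure from the talk
    for fraction in (1.0, 0.5, 0.1):  # hypothetical duty cycles
        hours = runtime_hours(battery_mwh, fraction)
        print(f"active {fraction:.0%} of the time: about {hours:.1f} hours")
```

The point of the exercise is that the idle figure dominates battery life for a device that spends most of its time waiting, which is why a 2 mW idle state matters as much as the 250 mW peak.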

 

Darkstar

 

Our attempts to escape into virtual reality have built online gaming into a multi-billion-dollar-per-year industry, which Sun hopes to support better with Darkstar technology.  Webkinz has 5 million subscribers, mostly 5 to 10 year olds and their mothers, who log on regularly to take care of their online pets.  World of Warcraft has 10 million subscribers who pay $15/month (a total of $1.8 billion/year) to participate in their online community.  These services require 24/7 uptime worldwide and low latency, but place only modest demands on throughput.  Game companies have effectively become service companies to support this.  The software is mostly event driven, uses client/server or publish/subscribe architectural models, and involves lots of graphics.  Much of the software is written in ActionScript 3 for Flash.  Sun supports this using Berkeley DB, extensive caching, and a distributed processing infrastructure.

 

Hadoop

 

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.

Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS). MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located. Hadoop has been demonstrated on clusters with 2000 nodes. The current design target is 10,000 node clusters.
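
The MapReduce model described above can be illustrated with a tiny single-process word count.  This is a minimal sketch in plain Python, not the Hadoop API: the function names are made up for illustration, and a real cluster would run the map and reduce steps on many nodes, close to the HDFS blocks holding the data.

```python
from collections import defaultdict

def map_block(block):
    """Map step: emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]

def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Reduce step: combine all the values emitted for one key into a total."""
    return key, sum(values)

def word_count(blocks):
    """Run map over every block, shuffle the results, then reduce each key."""
    pairs = []
    for block in blocks:  # on a cluster, each block would be mapped on a separate node
        pairs.extend(map_block(block))
    grouped = shuffle(pairs)
    return dict(reduce_counts(key, values) for key, values in grouped.items())

if __name__ == "__main__":
    blocks = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(blocks))
```

Because each block is mapped independently and each key is reduced independently, the work can be spread across as many nodes as there are blocks and keys, which is what lets the approach scale to clusters of thousands of machines.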

 

Security Issues

 

Bruce Potter gave a tutorial on botnets and what you can do about them.  The largest is the Storm Worm botnet, which so far has controlled between one million and ten million CPUs at one time and may have affected up to 50 million hosts in total.  So far it’s been used mainly for spam and seems to be under the control of a single person or entity.  The IT community is worried about what else it might be used for and is mobilizing against it.  Microsoft defends against it with its Malicious Software Removal Tool.  More bots are coming.

 

Digital forensics is now more advanced than ever.  Naïve attempts to delete data can actually be a flag for a canny investigator looking for interesting artifacts on a hard disk.  Under Windows, quick formats don’t actually delete much of anything (except the first character of each file name).  Long formats are mostly a read-only operation that looks for bad disk sectors and leaves more than 90% of the original data intact and susceptible to recovery.  Even reformatting the disk with a different file system for a different operating system leaves significant amounts of the original data accessible.  Even when data seems to have been deleted, artifacts tend to persist in sometimes unexpected places.  System Restore in XP contains a duplicate copy of the Windows registry.  The print spool directories (perhaps somewhere out on the network) contain duplicate copies of some data and may be backed up in unexpected ways.  Thermite destroys disk drives, as does shredding; anything less is suspect.  If you must do something that you want to keep absolutely private, try running a memory-only OS such as Knoppix, but be wary of connecting to any network.