Modern illuminations

The illuminated manuscripts of the Middle Ages blended words and images, continuing traditions established by the Ancient Egyptians. Words and pictures go together: one without the other is a rather flat experience, like silent cinema, or eating fine food with a cold. This is why I like comic books so much. 

One of the opening sessions on Day 1 at the recent ScienceOnline conference was an hour with sketchnoter and überdoodler Perrin Ireland of Alphachimp Studio. She basically gave away all her secrets for purposeful scientific doodling. Tips like building a canon of fonts, practising icons and dividing lines, and honing an eye for the deft use of colour. 

The result... well, I had a lot of fun scribing talks. Two of them I managed to get to a point we might call al dente, or maybe half baked. The first from a session on open notebook science, something that interests me quite a bit: 

If it looks like you have to really listen and concentrate to produce one of these, that's because you do. I did miss bits, though, as I fretted over important things like what kind of robot to draw. And you might have noticed that I can't draw people. Yeah, I noticed that too. It didn't stop me adding them to the next one, from a session on the semantic web:

I'm not alone in my happiness at finding this sketchy new world. Perrin has given her perspective, and Michele Arduengo has written a lovely post about learning to draw science, and you can see many of the other efforts in this awesome Flickr gallery—the scratchings of amateurs like me sit half-convincingly alongside the professional pieces, and together I think they're rather wonderful.

Amenhotep image from Flickr user wallyg, licensed BY-NC-ND. All Flickr slideshow images are copyright of their respective creators, and may be subject to restrictions. All my work is licensed CC-BY.

Ten things I loved about ScienceOnline2012

I spent Thursday and Friday at the annual Science Online unconference at North Carolina State University in Raleigh, NC. I had been looking forward to it since peeking in on, and even participating in, sessions last January at ScienceOnline2011. As soon as I had emerged from the swanky airport and navigated my way to the charmingly peculiar Velvet Cloak Inn I knew the first thing I loved was...

Raleigh, and NC State University. What a peaceful, unpretentious, human-scale place. And the university campus and facilities were beyond first class. I was born in Durham, England, and met my wife at university there, so I was irrationally prepared to have a soft spot for Durham, North Carolina, and by extension Raleigh too. And now I do. It's one of those rare places I've visited and known at once: I could live here. I was still basking in this glow of fondness when I opened my laptop at the hotel and found that the hard drive was doornail dead. So within 12 hours of arriving, I had...


The filtered earth

Ground-based image (top left) vs Hubble's image.

One of the reasons for launching the Hubble Space Telescope in 1990 was to eliminate the filter of the atmosphere that affects earth-bound observations of the night sky. The results speak for themselves: more than 10 000 peer-reviewed papers using Hubble data, around 98% of which have citations (only 70% of all astronomy papers are cited). There are plenty of other filters at work on Hubble's data: the optical system, the electronics of image capture and communication, space weather, and even the experience and perceptive power of the human observer. But it's clear: eliminating one filter changed the way we see the cosmos.

What is a filter? Mathematically, it's a subset of a larger set. In optics, it's a wavelength-selection device. In general, it's a thing or process which removes part of the input, leaving some output which may or may not be useful. For example, in seismic processing we apply filters which we hope remove noise, leaving signal for the interpreter. But if the filters are not under our control, if we don't even know what they are, then the relationship between output and input is not clear.

Imagine you fit a green filter to your petrographic microscope. You can't tell the difference between the scene on the left and the one on the right—they have the same amount and distribution of green. Indeed, without the benefit of geological knowledge, the range of possible inputs is infinite. If you could only see a monochrome view, and you didn't know what the filter was, or even if there was one, it's easy to see that the situation would be even worse. 
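To make the green-filter point concrete, here is a minimal sketch in Python. The pixel values and the `green_filter` function are invented for illustration; the only claim is that two different inputs can produce exactly the same filtered output, so the output alone cannot tell you which input you started with.

```python
def green_filter(image):
    """Keep only the green channel of each (r, g, b) pixel."""
    return [[(0, g, 0) for (_, g, _) in row] for row in image]

# Two hypothetical scenes: different red and blue values, identical green values.
scene_left = [[(200, 120, 30), (10, 80, 220)]]
scene_right = [[(5, 120, 250), (90, 80, 0)]]

filtered_left = green_filter(scene_left)
filtered_right = green_filter(scene_right)

# The inputs differ, but the filtered outputs are indistinguishable.
print(scene_left == scene_right)        # False
print(filtered_left == filtered_right)  # True
```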

Like astronomy, the goal of geoscience is to glimpse the objective reality via our subjective observations. All we can do is collect, analyse and interpret filtered data, the sifted ghost of the reality we tried to observe. This is the best we can do. 

What do our filters look like? In the case of seismic reflection data, the filters are mostly familiar: 

  • the design determines the spatial and temporal resolution you can achieve
  • the source system and near-surface conditions determine the wavelet
  • the boundaries and interval properties of the earth filter the wavelet
  • the recording system and conditions affect the image resolution and fidelity
  • the processing flow can destroy or enhance every aspect of the data
  • the data loading process can be a filter, though it should not be
  • the display and interpretation methods control what the interpreter sees
  • the experience and insight of the interpreter decide what comes out of the entire process

Every other piece of data you touch, from wireline logs to point-count analyses, and from pressure plots to production volumes, is a filtered expression of the earth. Do you know your filters? Try making a list—it might surprise you how long it is. Then ask yourself if you can do anything about any of them, and imagine what you might see if you could. 

Hubble image is public domain. Photomicrograph from Flickr user Nagem R., licensed CC-BY-NC-SA. 

News of the week

Some news from the last fortnight or so. Things seem to be getting going again after the winter break. If you see anything you think our readers would be interested in, please get in touch.

Shale education

Penn State University have put together an interactive infographic on the Marcellus Shale development in Pennsylvania. My first impression was that it was pro-industry. On reflection, I think it's quite objective, if idealized. As an industry, we need to get away from claims like "fracking fluid is 99% water" and "shale gas developments cover only 0.05% of the state". They may be true, but they don't give the whole story. Attractive, solid websites like this can be part of fixing this.

New technology

This week all the technology news has come from the Consumer Electronics Show in Las Vegas. It's mostly about tablets this year, it seems. That seems reasonable: we have been seeing them everywhere recently, even in the workplace. Indeed, the rumour is that Schlumberger is buying lots of iPads for field staff.

So what's new in tech? Well, one company has conjured up a 10-finger multi-touch display, bringing the famous Minority Report dream a step closer. I want one of these augmented reality monocles. Maybe we will no longer have to choose between paper and digital!

Geophysical magic?

A tiny press story piqued our interest. Who can resist the lure of Quantum Resonance Interferometry? Well, apparently some people can, because ViaLogy has yet to turn a profit, but we were intrigued. What is QRI? ViaLogy's website is not the most enlightening source of information (they really need some pictures!) but they seem to be inferring signal from subtle changes in noise. In our opinion, a little more openness might build trust and help their business. 

New things to read

Sometimes we check out the new and forthcoming books on Amazon. Notwithstanding their nonsensical prices, a few caught our eye this week:

Detect and Deter: Can Countries Verify the Nuclear Test Ban? Dahlman, et al, December 2011, Springer, 281 pages, $129. I've been interested in nuclear test monitoring since reading about the seismic insights of Tukey, Bogert, and others at Bell Labs in the 1960s. There's geophysics, nuclear physics and politics in here.

Deepwater Petroleum Exploration & Production: A Nontechnical Guide. Leffler, et al, October 2011, Pennwell, 275 pages, $79. This is the second edition of this book by ex-Shell engineer Bill Leffler, aimed at a broad industry audience. There are new chapters on geoscience, according to the blurb.

Petrophysics: Theory and Practice of Measuring Reservoir Rock and Fluid Transport Properties. Tiab and Donaldson, November 2011, Gulf Professional Publishing, 971 pages, $180. A five-star book on Amazon, this outrageously priced book is now in its third edition.

This regular news feature is for information only. We aren't connected with any of these organizations, and don't necessarily endorse their products or services. Low-res images of book and website considered fair use.

What do you mean by average?

I may need some help here. The truth is, while I can tell you what averages are, I can't rigorously explain when to use a particular one. I'll give it a shot, but if you disagree I am happy to be edificated. 

When we compute an average we are measuring the central tendency: a single quantity to represent the dataset. The trouble is, our data can have different distributions, different dimensionality, or different type (to use a computer science term): we may be dealing with lognormal distributions, or rates, or classes. To cope with this, we have different averages. 

Arithmetic mean

Everyone's friend, the plain old mean. The trouble is that it is, statistically speaking, not robust. This means that it's an estimator that is unduly affected by outliers, especially large ones. What are outliers? Data points that depart from some assumption of predictability in your data, from whatever model you have of what your data 'should' look like. Notwithstanding that your model might be wrong! Lots of distributions have important outliers. In exploration, the largest realizations in a gas prospect are critical to know about, even though they're unlikely.
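Here is a small Python sketch of what 'not robust' means in practice, using the standard library's `statistics` module. The volumes are made-up numbers, not real prospect data: one large value drags the mean a long way, while the median hardly notices.

```python
from statistics import mean, median

# Hypothetical volumes (arbitrary units); the last value is a large outlier.
volumes = [10, 12, 11, 13, 12, 95]

mean_all = mean(volumes)           # pulled upward by the single large value
mean_trimmed = mean(volumes[:-1])  # the same data without the outlier
median_all = median(volumes)       # the middle of the sorted data

print(mean_all)      # 25.5
print(mean_trimmed)  # 11.6
print(median_all)    # 12.0
```

Whether you want that sensitivity depends on the question: if the big outcomes matter, as they do in a gas prospect, the mean is telling you something the median hides.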

Geometric mean

Like the arithmetic mean, this is one of the classical Pythagorean means. It is always equal to or smaller than the arithmetic mean. It has a simple geometric visualization: the geometric mean of a and b is the side of a square having the same area as the rectangle with sides a and b. Clearly, it is only meaningfully defined for positive numbers. When might you use it? For quantities with exponential distributions — permeability, say. And this is the only mean to use for data that have been normalized to some reference value. 
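A quick Python sketch of both ideas, assuming Python 3.8 or later for `statistics.geometric_mean`. The permeability values are hypothetical, chosen only because they span several orders of magnitude.

```python
from math import isclose, sqrt
from statistics import geometric_mean, mean

# Geometric picture: a square with the same area as an a-by-b rectangle.
a, b = 4.0, 9.0
side = sqrt(a * b)           # side of the equal-area square
gm = geometric_mean([a, b])  # the same quantity, up to rounding
am = mean([a, b])            # the arithmetic mean is never smaller

print(isclose(side, gm), gm <= am)  # True True

# Hypothetical permeabilities in mD, spanning orders of magnitude.
perms = [1.0, 10.0, 100.0, 1000.0]
gm_perm = geometric_mean(perms)  # about 31.6, the middle on a log scale
am_perm = mean(perms)            # 277.75, dominated by the largest value

print(round(gm_perm, 1), am_perm)
```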

Harmonic mean

The third and final Pythagorean mean, always equal to or smaller than the geometric mean. It's sometimes (by 'sometimes' I mean 'never') called the subcontrary mean. It tends towards the smaller values in a dataset; if those small numbers are outliers, this is a bug not a feature. Use it for rates: if you drive 10 km at 60 km/hr (10 minutes), then 10 km at 120 km/hr (5 minutes), then your average speed over the 20 km is 80 km/hr, not the 90 km/hr the arithmetic mean might have led you to believe. 
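The driving example can be checked in a few lines of Python. This sketch computes the average speed directly, as total distance over total time, and compares it with `statistics.harmonic_mean`; the variable names are my own.

```python
from math import isclose
from statistics import harmonic_mean, mean

speeds = [60, 120]     # km/hr on each leg
leg_distance = 10      # km per leg

# Average speed is total distance divided by total time.
total_time = sum(leg_distance / v for v in speeds)     # hours
avg_speed = leg_distance * len(speeds) / total_time    # km/hr

hm = harmonic_mean(speeds)  # matches the true average speed
am = mean(speeds)           # overstates it

print(round(avg_speed, 6), round(hm, 6), am)
```

Note that this only works so neatly because the two legs are the same length; legs of different lengths would need weighting.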

Median average

The median is the central value in the sorted data. In some ways, it's the archetypal average: the middle, with 50% of values being greater and 50% being smaller. If there is an even number of data points, then it's the arithmetic mean of the middle two. In a probability distribution, the median is often called the P50. In a positively skewed distribution (the most common one in petroleum geoscience), it is larger than the mode and smaller than the mean.
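To see the ordering of mode, median and mean in a positively skewed dataset, here is a minimal Python sketch. The ten values are invented: a small sample with a long tail to the right.

```python
from statistics import mean, median, mode

# A made-up, positively skewed sample: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 8, 20]

mode_value = mode(data)      # the most frequent value
median_value = median(data)  # even count, so the mean of the middle two (3 and 3)
mean_value = mean(data)      # pulled to the right by the long tail

print(mode_value, median_value, mean_value)
print(mode_value < median_value < mean_value)  # True
```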

Mode average

The mode, or most likely, is the most frequent result in the data. We often use it for what are called nominal data: classes or names, rather than the cardinal numbers we've been discussing up to now. For example, the name Smith is not the 'average' name in the US, as such, since most people are called something else. But you might say it's the central tendency of names. One of the commonest applications of the mode is in a simple voting system: the person with the most votes wins. If you are averaging data like facies or waveform classes, say, then the mode is the only average that makes sense. 
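For nominal data the mode is easy to compute in Python. This sketch uses a hypothetical list of facies labels; `statistics.mode` works on strings as well as numbers, and `collections.Counter` gives the full tally if you want to see how close the vote was.

```python
from collections import Counter
from statistics import mode

# Hypothetical facies classes picked at six depths.
facies = ["sand", "shale", "sand", "silt", "shale", "sand"]

most_common = mode(facies)  # the class with the most 'votes'
counts = Counter(facies)    # the full tally behind that result

print(most_common)          # sand
print(counts["sand"], counts["shale"], counts["silt"])  # 3 2 1
```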

Honourable mentions

Most geophysicists know about the root mean square, or quadratic mean, because it's a measure of magnitude independent of sign, so works on sinusoids varying around zero, for example. 

The root mean square equation:

$$x_\mathrm{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$$
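As a rough illustration in Python, here is the root mean square of one full cycle of a unit-amplitude sinusoid. The `rms` helper and the sample count are my own choices; the point is that the arithmetic mean is zero while the RMS is 1/√2, about 0.707.

```python
import math

def rms(values):
    """Root mean square: the square root of the mean of the squared values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# One full cycle of a unit-amplitude sine wave, sampled at n points.
n = 1000
sine = [math.sin(2 * math.pi * k / n) for k in range(n)]

sine_mean = sum(sine) / n  # positive and negative halves cancel
sine_rms = rms(sine)       # magnitude, independent of sign

print(round(abs(sine_mean), 6))  # 0.0
print(round(sine_rms, 4))        # 0.7071
```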

Finally, the weighted mean is worth a mention. Sometimes this one seems intuitive: if you want to average two datasets, but they have different populations, for example. If you have a mean porosity of 21% from a set of 90 samples, and another mean of 14% from a set of 10 similar samples, then it's clear you can't simply take their arithmetic average; you have to weight them first: (0.9 × 0.21) + (0.1 × 0.14) = 0.20. But other times, it's not so obvious you need the weighted sum, like when you care about the perception of the data points.
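The porosity calculation can be sketched in Python like this, using the 0.21 and 0.14 values from the calculation in the text, weighted by sample counts of 90 and 10. The `weighted_mean` helper is my own, written out rather than taken from a library.

```python
from math import isclose

def weighted_mean(values, weights):
    """Sum of value times weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

porosity_means = [0.21, 0.14]  # mean porosity of each sample set
sample_counts = [90, 10]       # number of samples behind each mean

weighted = weighted_mean(porosity_means, sample_counts)
unweighted = sum(porosity_means) / len(porosity_means)

print(round(weighted, 3))    # 0.203
print(round(unweighted, 3))  # 0.175
```

Using the raw counts as weights is the same as using the fractions 0.9 and 0.1, because the helper divides by the total weight.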

Are there other averages you use? Do you see misuse and abuse of averages? Have you ever been caught out? I'm almost certain I have, but it's too late now...

There is an even longer version of this article in the wiki. I just couldn't bring myself to post it all here. 

How to keep up with Agile*

I mentioned the other day that there are a few ways to keep up with this blog. I thought I'd list some of them out, in case you have not yet found one you like. 

The easiest thing for many is probably to get the email updates. They go out early in the morning the day after we put up a new post. We do not use your email address for anything else and would certainly never share it. To get these, just enter your email address in the box to the right →

If you already get them, don't worry, nothing has changed.

For many diehard blog readers, the only way is the RSS feed. You can access this from the link in the box on the right too. Just copy the URL of the feed [http://feeds.feedburner.com/agilegeoscience] into an RSS reader, sometimes called an aggregator. There are dozens — here's a list. Lots of people like Google Reader. Some people don't.

Visit our Twitter account to see what it's all about (no account required).

Every new post is tweeted by the Twitter account @agilegeo. This is more or less all this Twitter account does, at least for now, so it's high signal-to-noise (if you consider our posts and comments signal, that is). These tweets also post to our Facebook page, so you can Like us to see the new posts in your Facebook feed.

We've started playing with Google+, but it's quite different from Facebook and Twitter, so is taking some getting used to. If you use Google+, follow Agile, me or Evan to get a smattering there. And Evan and I usually post about new writing in our LinkedIn profiles too, if you know us personally.

Lastly, there's always the trusty bookmark. Just remember to hit it occasionally. 

Thank you for reading! Seriously. Thank you.