Welly to the wescue

I apologize for the widiculous title.

Last week I described some headaches I was having with well data, and I introduced welly, an open source Python tool that we've built to help cure the migraine. The first versions of welly were built — along with the first versions of striplog — for the Nova Scotia Department of Energy, to help with their various data wrangling efforts.

Aside — all software projects funded by government should in principle be open source.

Today we're using welly to get data out of LAS files and into so-called feature vectors for a machine learning project we're doing for Canstrat (kudos to Canstrat for their support for open source software!). In our case, the features are wireline log measurements. The workflow looks something like this:

  1. Read LAS files into a welly 'project', which contains all the wells. This bit depends on lasio.
  2. Check what curves we have with the project table I showed you on Thursday.
  3. Check curve quality by passing a test suite to the project, and making a quality table (see below).
  4. Fix problems with curves with whatever tricks you like. I'm not sure how to automate this.
  5. Export as the X matrix, all ready for the machine learning task.

Let's look at these key steps as Python code.

1. Read LAS files

 
from welly import Project
p = Project.from_las('data/*.las')

2. Check what curves we have

Now we have a project full of wells and can easily make the table we saw last week. This time we'll use aliases to simplify things a bit — this trick allows us to refer to all GR curves as 'Gamma', so for a given well, welly will take the first curve it finds in the list of alternatives we give it. We'll also pass a list of the curves (called keys here) we are interested in:
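Here's a minimal sketch of what that might look like. The mnemonics in each alias list are just examples (yours will depend on your data), and I'm glossing over the tests argument until the next step:

from IPython.display import HTML

# Map a friendly name to a list of mnemonics to try, in order of preference.
alias = {
    'Gamma':   ['GR', 'GRC', 'NGT'],
    'Density': ['RHOB', 'DEN', 'RHOZ'],
    'Sonic':   ['DT', 'DT4P', 'AC'],
}

# The curves (or aliases) we care about.
keys = ['Gamma', 'Density', 'Sonic']

HTML(p.curve_table_html(keys=keys, alias=alias))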

The project table. The name of the curve selected for each alias is shown, along with the mean and units of each curve as a quick QC. A couple of those RHOB curves definitely look dodgy, and they turned out to be DRHO correction curves.


3. Check curve quality

Now we have to define a suite of tests. Lists of tests to run on each curve are held in a Python data structure called a dictionary. As well as tests for specific curves, there are two special test lists: Each and All, which are run on each curve encountered, and on all curves together, respectively. (The latter is required in order to, for example, compare curves to each other to look for duplicates.) The welly module quality contains some predefined tests, but you can also define your own test functions — these functions take a curve as input, and return either True (for a test pass) or False.
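For example, a hypothetical custom test might look something like this, assuming the curve data behaves like a NumPy array:

import numpy as np

def no_gaps(curve):
    """A made-up custom test: fail if the curve contains any NaNs."""
    return not np.isnan(curve).any()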

 
import welly.quality as qty
tests = {
    'All': [qty.no_similarities],
    'Each': [qty.no_monotonic],
    'Gamma': [
        qty.all_positive,
        qty.mean_between(10, 100),
    ],
    'Density': [qty.mean_between(1000, 3000)],
    'Sonic': [qty.mean_between(180, 400)],
    }

html = p.curve_table_html(keys=keys, alias=alias, tests=tests)
HTML(html)

The green dot means that all tests passed for that curve. Orange means some tests failed. If all tests fail, the dot is red. The quality score shows a normalized score for all the tests on that well. In this case, RHOB and DT are failing the 'mean_between' test because they have Imperial units.

4. Fix problems

Now we can fix any problems. This part is not yet automated, so it's a fairly hands-on process. Here's a very high-level example of how I fix one issue, just as an example:

 
import numpy as np

def fix_negs(c):
    c[c < 0] = np.nan
    return c

# Glossing over some details, we give a mnemonic, a test
# to apply, and the function to apply if the test fails.
fix_curve_if_bad('GAM', qty.all_positive, fix_negs)
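Those details matter, of course. Here's one hypothetical way a helper like fix_curve_if_bad() might work against the project p; this is a sketch, not part of welly:

def fix_curve_if_bad(mnemonic, test, fix):
    """Hypothetical helper: apply `fix` to a curve wherever `test` fails."""
    for well in p:                       # a welly Project behaves like a list of wells
        curve = well.data.get(mnemonic)  # each well holds its curves in a dict-like `data`
        if curve is not None and not test(curve):
            well.data[mnemonic] = fix(curve)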

What I like about this workflow is that the code itself is the documentation. Everything is fully reproducible: load the data, apply some tests, fix some problems, and export or process the data. There's no need for intermediate files called things like DT_MATT_EDIT or RHOB_DESPIKE_FINAL_DELETEME. The workflow is completely self-contained.

5. Export

The data can now be exported as a matrix, specifying a depth step that all data will be interpolated to:

 
X, _ = p.data_as_matrix(X_keys=keys, step=0.1, alias=alias)

That's it. We end up with a 2D array of log values that will go straight into, say, scikit-learn*. I've omitted here the process of loading the Canstrat data and exporting that, because it's a bit more involved. I will try to look at that part in a future post. For now, I hope this is useful to someone. If you'd like to collaborate on this project in the future — you know where to find us.

* For more on scikit-learn, don't miss Brendon Hall's tutorial in October's Leading Edge.


I'm happy to let you know that agilegeoscience.com and agilelibre.com are now served over HTTPS — so connections are private and secure by default. This is just a matter of principle for the Web, and we go to great pains to ensure our web apps modelr.io and pickthis.io are served over HTTPS. Find out more about SSL from DigiCert, the provider of Squarespace's (and Agile's) certs, which are implemented with the help of the non-profit Let's Encrypt, who we use and support with dollars.

Well data woes

I probably shouldn't be telling you this, but we've built a little tool for wrangling well data. I wanted to mention it, because it's doing some really useful things for us — and maybe it can help you too. But I probably shouldn't, because it's far from stable and we're messing with it every day.

But hey, what software doesn't have a few or several or loads of bugs?

Buggy data?

It's not just software that's buggy. Data is as buggy as heck, and subsurface data is, I assert, the buggiest data of all. Give units or datums or coordinate reference systems or filenames or standards or basically anything at all a chance to get corrupted in cryptic ways, and they take it. Twice if possible.

By way of example, we got a package of 10 wells recently. It came from a "data management" company. There are issues... Here are some of them:

  • All of the latitude and longitude data were in the wrong header fields. No coordinate reference system in sight anywhere. This is normal of course, and the only real side-effect is that YOU HAVE NO IDEA WHERE THE WELL IS.
  • Header chaos aside, the files were non-standard LAS sort-of-2.0 format, because tops had been added in their own little completely illegal section. But the LAS specification has a section for stuff like this (it's called OTHER in LAS 2.0).
  • Half the porosity curves had units of v/v, and half %. No big deal...
  • ...but a different half of the porosity curves were actually v/v. Nice.
  • One of the porosity curves couldn't make its mind up and changed scale halfway down. I am not making this up.
  • Several of the curves were repeated with other names, e.g. GR and GAM, DT and AC. Always good to have a spare, if only you knew if or how they were different. Our tool curvenam.es tries to help with this, but it's far from perfect.
  • One well's RHOB curve was actually the PEF curve. I can't even...

The remarkable thing is not really that I have this headache. It's that I expected it. But this time, I was out of paracetamol.

Cards on the table

Our tool welly, which I stress is very much still in development, tries to simplify the process of wrangling data like this. It has a project object for collecting a lot of wells into a single data structure, so we can get a nice overview of everything: 


Our goal is to include these curves in the training data for a machine learning task to predict lithology from well logs. The trained model can make really good lithology predictions... if we start with non-terrible data. Next time I'll tell you more about how welly has been helping us get from this chaos to non-terrible data.

x lines of Python: read and write SEG-Y

Reading SEG-Y files comes up a lot in the geophysicist's workflow. Writing, less often, but it does come up occasionally. As long as we're mostly concerned with trace data and not location, both of these tasks can be fairly easily accomplished with ObsPy. 

Today we'll load some seismic, compute an attribute on it, and save a new SEG-Y, in 10 lines of Python.


ObsPy is a rare thing. It demonstrates what a research group can accomplish with a little planning and a lot of perseverance (cf my whinging earlier this year about certain consortiums in our field). It's an open source Python package from the geophysicists at the University of Munich — Karl Bernhard Zoeppritz studied there for a while, so you know it's legit. The tool serves their research needs in earthquake and global seismology, and also happens to handle SEG-Y files quite nicely.

Aside: I think SixtyNorth's segpy is actually the way to go for reading and writing SEG-Y; ObsPy is probably overkill for most applications — it's about 80 times the size for one thing. I just happen to be familiar with it and it's super easy to install: conda install obspy. So, since minimalism is kind of the point here, look out for a future x lines of Python using that library.

The sentences

As before, we'd like to express the process in just a few sentences of plain English. Assuming we just want to read the data into a NumPy array, look at it, do something to it, and write a new file, here's what we're doing:

  1. Read (or really index) the file as an ObsPy Stream object.
  2. Stack (in the NumPy sense) the Trace objects into a single NumPy array. We have data!
  3. Get the 99th percentile of the amplitudes to make plotting easier.
  4. Plot the data so we can see it.
  5. Get the sample interval of the data from a trace header.
  6. Compute the similarity attribute using our library bruges.
  7. Make a new Stream object to hold the outbound data.
  8. Add a Stats object, which holds the header, and recycle some header info.
  9. Append info about our data to the header.
  10. Write a new SEG-Y file with our computed data in it!
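Here's a rough sketch of those ten sentences in code. The file names are placeholders, and the exact bruges.attribute.similarity() call is my assumption, so treat this as a sketch rather than a copy of the notebook:

import numpy as np
import matplotlib.pyplot as plt
from obspy import read, Trace, Stream
import bruges

stream = read('input.sgy', format='SEGY')               # 1. Read (index) the file as a Stream.
data = np.stack([t.data for t in stream])               # 2. Stack the Traces into a 2D array.
clip = np.percentile(data, 99)                          # 3. 99th percentile of the amplitudes.
plt.imshow(data.T, cmap='Greys', vmin=-clip, vmax=clip, aspect='auto')  # 4. Take a look.
plt.show()
dt = stream[0].stats.delta                              # 5. Sample interval, in seconds, from a trace header.
sim = bruges.attribute.similarity(data, duration=0.036, dt=dt)          # 6. Compute the attribute.
out = Stream()                                          # 7. A new Stream for the outbound data.
for trace_data in sim.astype(np.float32):
    out.append(Trace(trace_data, header={'delta': dt})) # 8. New Traces, recycling the sample interval.
out.stats = stream.stats                                # 9. Carry over the file-level headers, if present.
out.write('similarity.sgy', format='SEGY', data_encoding=5)             # 10. Write the new SEG-Y (IEEE floats).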

There's a bit more in the Jupyter Notebook (examining the file and trace headers, for example, and a few more plots) which, remember, you can run right in your browser! You don't need to install a thing. Please give it a look! Quick tip: Just keep hitting Shift+Enter to run the cells.

If you like this sort of thing, and are planning to be at the SEG Annual Meeting in Dallas next month, you might like to know that we'll be teaching our Creative Geocomputing class there. It's basically two days of this sort of thing, only with friends to learn with and us to help. Come and learn some new skills!

The seismic data used in this post is from the NPRA seismic repository of the USGS. The data is in the public domain.

x lines of Python: synthetic wedge model

Welcome to a new blog series! Like the A to Z and the Great Geophysicists, I expect it will be sporadic and unpredictable, but I know you enjoy life's little nonlinearities as much as I do.

The idea with this one — x lines of Python — is to share small geoscience workflows in x lines or fewer. I'm not sure about the value of x, but I think 10 seems reasonable for most tasks. If x > 10 then the task may have been too big... If x < 5 then it was probably too small.

Python developer Raymond Hettinger says that each line of code should be equivalent to a sentence... so let's say that that's the measure of what's OK to put in a single line. 

Synthetic wedge model

To kick things off, follow this link to a live Jupyter Notebook environment showing how you can make a simple synthetic three-rock wedge model in only 9 lines of code.

The sentences represented by the code that made the data in these images are:

  1. Set up the size of the model.
  2. Make the slanty bit, with 1's in the wedge and 2's in the base.
  3. Add the top of the model as 0; these numbers will turn into rocks.
  4. Define the velocity and density of rocks 0 to 2.
  5. Distribute those properties through the model.
  6. Calculate the acoustic impedance everywhere.
  7. Calculate the reflection coefficients in the model.
  8. Make a Ricker wavelet.
  9. Convolve the wavelet with the reflection coefficients.
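The notebook has the real thing, but here's an illustrative sketch along the same lines. The model size, rock properties, and wavelet frequency below are made up for illustration, not the ones in the notebook, and I'm assuming bruges.filters.ricker() returns just the wavelet array:

import numpy as np
import bruges

length, depth = 40, 100                                    # 1. Size of the model.
model = 1 + np.tri(depth, length, -depth//3, dtype=int)    # 2. The slanty bit: 1's in the wedge, 2's in the base.
model[:depth//3, :] = 0                                    # 3. The top of the model is rock 0.
rocks = np.array([[2540, 2260],                            # 4. Vp [m/s] and rho [kg/m3] for rocks 0 to 2.
                  [2400, 2150],
                  [2650, 2450]])
earth = rocks[model]                                       # 5. Distribute the properties through the model.
imp = np.prod(earth, axis=-1)                              # 6. Acoustic impedance = Vp * rho.
rc = (imp[1:] - imp[:-1]) / (imp[1:] + imp[:-1])           # 7. Reflection coefficients.
w = bruges.filters.ricker(duration=0.100, dt=0.001, f=40)  # 8. A Ricker wavelet.
synth = np.apply_along_axis(np.convolve, 0, rc, w, mode='same')  # 9. Convolve wavelet with RCs, trace by trace.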

Your turn!

All of the notebooks we share in this series will be hosted on mybinder.org. I'm excited about this because it means you can run and edit them live, without installing anything at all. Give it a go right now.

You can see them on GitHub too, and fork or clone them from there. Note that if you look at the notebook for this post on GitHub, you'll be able to view it, but not change or run code unless you get everything running on your own machine. (To do that, you can more or less follow the instructions in my User Guide to the TLE tutorials).

Please do take this notion of x as 'par' as a challenge. If you'd like to try to shoot under par, please do — and share your efforts. Code golf is a fun way to learn better coding habits. (And maybe some bad ones.) There is a good chance I will shoot some bogies on this course.

We will certainly take requests too — what tasks would you like to see in x lines of Python?

Helpful horizons

Ah, the smell of a new seismic interpretation project. All those traces, all that geology — perhaps unseen by humans or indeed any multicellular organism at all since the Triassic. The temptation is to Just Start Interpreting, why, you could have a map by lunchtime tomorrow! But wait. There are some things to do first.

Once I've made sure all is present and correct (see How to QC a seismic volume), I spend a bit of time making some helpful horizons... 

  • The surface. One of the fundamental horizons, the seafloor or ground surface is a must-have. You may have received it from the processor (did you ask for it?) or it may be hidden in the SEG-Y headers — ask whoever received or loaded the data. If not, ground elevation is usually easy enough to get from your friendly GIS guru. If you have to interpret the seafloor, at least it should autotrack quite well.
  • Seafloor multiple model. In marine data, I always make a seafloor multiple model — just multiply the seafloor pick by 2. This will help you make sense of any anomalous reflectors or amplitudes at that two-way time. Maybe make a 3× version too if things look really bad. Remember, the 2× multiple will be reverse polarity.
  • Other multiples. You can model the surface multiple of any strong reflectors with the same arithmetic — but the chances are that any residual multiple energy is quite subtle. You may want to seek help modeling them properly, once you have a 3D velocity model.

A 2D seismic dataset with some of the suggested helpful horizons. Please see the footnote about this dataset.

  • Water depth markers. I like to make flat horizons* at important water depths, eg shelf edge (usually about 100–200 m), plus 1000 m, 2000 m, etc. This mainly helps to keep track of where you are, and also to think about prospectivity, accessibility, well cost, etc. You only want these to exist in the water, so delete them anywhere they are deeper than the seafloor horizon. Your software should have an easy way to implement a simple model for time t in ms, given depth d in m and velocity** V in m/s, e.g.

$$ t = \frac{2000 d}{V} \approx \frac{2000 d}{1490} \qquad \qquad \mathrm{e.g.}\ \frac{2000 \times 1000}{1490} = 1342\ \mathrm{ms} $$

  • Hydrate stability zone. In marine data and in the Arctic you may want to model the bottom of the gas hydrate stability zone (GHSZ) to help interpret bottom-simulating reflectors, or BSRs. I usually do this by scanning the literature for reports of BSRs in the area, or data on hydrate encounters in wells. In the figure above, I just used the seafloor plus 400 ms. If you want to try for more precision, Bale et al. (2014) provided several models for computing the position of the GHSZ — thank you to Murray Hoggett at Birmingham for that tip.
  • Fold. It's very useful to be able to see seismic fold on a map along with your data, so I like to load fold maps at some strategic depths or, better yet, load the entire fold volume. That way you can check that anomalies (especially semblance) don't have a simple, non-geological explanation. 
  • Gravity and magnetics. These datasets are often readily available. You will have to shift and scale them to some sensible numbers, either at the top or the bottom of your sections. Gravity can be especially useful for interpreting rifted margins. 
  • Important boundaries. Your software may display these for you, but if not, you can fake it. Simply make a horizon that only exists within the polygon — a lease boundary perhaps — by interpolating within a polygon. Make this horizon flat and deep (deeper than the seismic), then merge it with a horizon that is flat and shallow (–1 ms, or anything shallower than the seismic). You should end up with almost-vertical lines at the edges of the feature.
  • Section headings. I like to organize horizons into groups — stratigraphy, attributes, models, markers, etc. I make empty horizons to act only as headings so I can see at a glance what's going on. Whether you need to do this, and how you achieve it, depends on your software.

Most of these horizons don't take long to make, and I promise you'll find uses for them throughout the interpretation project. 

If you have other helpful horizon hacks, I'd love to hear about them — put your favourites in the comments. 


Footnotes

* It's not always obvious how to make a flat horizon. A quick way is to take some ubiquitous horizon — the seafloor maybe — and multiply it by zero.

** The velocity of sound in seawater is not a simple subject. If you want to be precise about it, you can try this online calculator, or implement the equations yourself.

The 2D seismic dataset shown is from the Laurentian Basin, offshore Newfoundland. The dataset is copyright of Natural Resources Canada, and subject to the Open Government License – Canada. You can download it from the OpendTect Open Seismic Repository. The cultural boundary and gravity data is fictitious — I made them up for the purposes of illustration.

References

Bale, Sean, Tiago M. Alves, Gregory F. Moore (2014). Distribution of gas hydrates on continental margins by means of a mathematical envelope: A method applied to the interpretation of 3D seismic data. Geochem. Geophys. Geosyst. 15, 52–68, doi:10.1002/2013GC004938. Note: the equations are in the Supporting Information.

White magic: calibrating seismic attributes

This post is part of a series on seismic attributes; the previous posts were...

  1. An attribute analysis primer
  2. Attribute analysis and statistics

Last time, I hinted that there might be an often-overlooked step in attribute analysis:

Calibration is a gaping void in many published workflows. How can we move past "that red blob looks like a point bar so I drew a line around it in PowerPoint" to "there's a 70% chance of finding reservoir quality sand at that location"?

Why is this step such a 'gaping void'? A few reasons:

  • It's fun playing with attributes, and you can make hundreds without a second thought. Some of them look pretty interesting, geological even. "That looks geological" is, however, not an attribute calibration technique. You have to prove it.
  • Nobody will be around when we find out the answer. There's a good chance that well will never be drilled, but when it is, you'll be on a different project, in a different company, or have left the industry altogether and be running a kayak rental business in Belize.
  • The bar is rather low. Many published examples of attribute analysis include no proof at all, just a lot of maps with convincing-looking polygons on them, and claims of 'better reservoir quality over here'.

This is getting discouraging. Let's look at an example. Now, it's hard to present this without seeming over-critical, but I know these gentlemen can handle it, and this was only a magazine article, so we needn't make too much of it. But it illustrates the sort of thing I'm talking about, so here goes.

Quoting from Chopra & Marfurt (AAPG Explorer, April 2014), edited slightly for brevity:

While coherence shows the edges of the channel, it gives little indication of the heterogeneity or uniformity of the channel fill. Notice the clear definition of this channel on the [texture attribute — homogeneity].
We interpret [the] low homogeneity feature [...] to be a point bar in the middle of the incised valley (green arrow). This internal architecture was not delineated by coherence.

A nice story, making two claims:

  1. The coherence attribute incompletely represents the internal architecture of the channel.
  2. The labeled feature on the texture attribute is a point bar.

I know explorers have to be optimists, and geoscience is all about interpretation, but as scientists we must be skeptical optimists. Claims like this are nice hypotheses, but you have to take the cue: go off and prove them. Remember confirmation bias, and Feynman's words:

The first principle is that you must not fool yourself — and you are the easiest person to fool.

The twin powers

Making geological predictions with seismic attribute analysis requires two related workflows:

  1. Forward modeling — the best way to tune your intuition is to make a cartoonish model of the earth (2D, isotropic, homogeneous lithologies) and perform a simplified seismic experiment on it (convolutional, primaries only, noise-free). Then you can compare attribute behaviour to the known model.
  2. Calibration — you are looking for an explicit, quantitative relationship between a physical property you care about (porosity, lithology, fluid type, or whatever) and a seismic attribute. A common way to show this is with a cross-plot of the seismic amplitude against the physical property.
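To make the second point concrete, here's a toy cross-plot. The numbers are completely synthetic, generated just to show the sort of evidence calibration should produce:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
porosity = rng.uniform(0.05, 0.25, size=12)               # synthetic 'measured' porosities at 12 wells
amplitude = 3 * porosity + rng.normal(0, 0.05, size=12)   # synthetic attribute values extracted at those wells

slope, intercept = np.polyfit(amplitude, porosity, 1)     # a simple linear calibration
r = np.corrcoef(amplitude, porosity)[0, 1]

plt.scatter(amplitude, porosity)
x = np.linspace(amplitude.min(), amplitude.max(), 2)
plt.plot(x, slope * x + intercept)
plt.xlabel('seismic attribute')
plt.ylabel('porosity [v/v]')
plt.title(f'calibration cross-plot, r = {r:.2f}')
plt.show()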

When these foundations are not there, we can be sure that one or more bad things will happen:

  • The relationship produces a lot of type I errors (false positives).
  • It produces a lot of type II errors (false negatives).
  • It works at some wells and not at others.
  • You can't reproduce it with a forward model.
  • You can't explain it with physics.

As the industry shrivels and questions — as usual — the need for science and scientists, we have to become more stringent, more skeptical, and more rigorous. Doing anything else feeds the confirmation bias of the non-scientific contingent. Because it says, loud and clear: geoscience is black magic.


The image is part of the figure from Chopra, S and K Marfurt (2014). Extracting information from texture attributes. AAPG Explorer, April 2014. It is copyright of the Authors and AAPG.

What is AVO-friendly processing?

It's the Geophysics Hackathon next month! Come down to Propeller in New Orleans on 17 and 18 October, and we'll feed you and give you space to build something cool. You might even win a prize. Sign up — it's free!

Thank you to the sponsors, OpenGeoSolutions and Palladium Consulting — both fantastic outfits. Hire them.

AVO-friendly processing gets called various things: true amplitude, amplitude-friendly, and controlled amplitude, controlled phase (or just 'CACP'). If you've been involved in any processing jobs, you'll have noticed that these phrases get thrown around a lot. But seismic geophysics has a dirty little secret... we don't know exactly what AVO-friendly processing is. Or, at least, we can't agree on it.

A LinkedIn discussion in the Seismic Data Processing group earlier this month prompted this post.

I can't compile a list of exactly which processes will harm your AVO analysis (can anyone? Has anyone??), but I think I can start a list of things that you need to approach with caution and skepticism:

  • Anything that is not surface consistent. What does that mean? According to Oliver Kuhn (now at Quantec in Toronto):
Surface consistent: a shot-related [process] affects all traces within a shot gather in the same way, independent of their receiver positions, and, a receiver-related [process] affects all traces within a receiver gather in the same way, independent of their shot positions.
  • Anything with a window — spatial or temporal. If you must use windows, make them larger or longer than your areas and zones of interest. In this way, relative effects should be preserved.
  • Anything that puts the flattening of gathers before the accuracy of the data (<cough> trim statics). Some flat gathers don't look flat. (The thumbnail image for this post is from Duncan Emsley's essay in 52 Things.)
  • Anything that is a sort of last resort, post hoc attempt to improve the data — what we might call 'cosmetic' treatments. Things like wavelet stretch correction and spectral shaping are good for structural interpreters, but not for seismic analysts. At the very least, get volumes without them, and convince yourself they did no harm.
  • Anything of which people say, "This should be fine!" but offer no evidence.

Back to my fourth point there... spectral shaping and wavelet stretch correction (e.g. this patented technique I was introduced to at ConocoPhillips) have been the subject of quite a bit of discussion, in my experience. I don't know why; both are fairly easy to model, on the face of it. The problem is that we start to get into the sticky question of what wavelets 'see' and what's a wavelet anyway, and hang on a minute why does seismic reflection even work? Personally, I'm skeptical, especially as we get more used to, and better at, looking at spectral decompositions of stacked and pre-stack data.

Divergent paths

I have seen people use seismic data with very different processing paths for structural interpretation and for AVO analysis. This can happen on long-term projects, where the structural framework depends on an old post-stack migration that was later reprocessed for AVO friendliness. This is a bad idea — you won't be able to put the quantitative results into the structural framework without introducing substantial error.

What we need is a clinical trial of processing algorithms, in which they are tested against a known model like Marmousi, and their effect on attributes is documented. If such studies exist, I'd love to hear about them. Come to think of it, this would make a good topic for a hackathon some day... Maybe Dallas 2016?

How to QC a seismic volume

I've had two emails recently about quality checking seismic volumes. And last month, this question popped up on LinkedIn:

We have written before about making a data quality volume for your seismic — a handy way to incorporate uncertainty into risk maps — but these recent questions seem more concerned with checking a new volume for problems.

First things first

Ideally, you'd get to check the volume before delivery (at the processing shop, say), otherwise you might have to actually get it loaded before you can perform your QC. I am assuming you've already been through the processing, so you've seen shot gathers, common-offset gathers, etc. This is all about the stack. Nonetheless, the processor needs to prepare some things:

  • The stack volume, of course, with and without any 'cosmetic' filters (eg fxy, fk).
  • A semblance (coherency, similarity, whatever) volume.
  • A fold volume.
  • Make sure the processor has some software that can rapidly scan the data, plot amplitude histograms, compute a spectrum, pick a horizon, and compute phase. If not, install OpendTect (everyone should have it anyway), or you'll have to load the volume yourself.

There are also some things you can do ahead of time. 

  1. Be part of the processing from the start. You don't want big surprises at this stage. If a few lines got garbled during file creation, no problem. If there's a problem with ground-roll attenuation, you're not going to be very popular.
  2. Make sure you know how the survey was designed — where the corners are, where you would expect live traces to be, and which way the shot and receiver lines went (if it was an orthogonal design). Get maps, take them with you.
  3. Double-check the survey parameters. The initial design was probably changed. The PowerPoint presentation was never updated. The processor probably has the wrong information. General rule with subsurface data: all metadata is probably wrong. Ideally, talk to someone who was involved in the planning of the survey.
  4. You didn't skip (2), did you? I'm serious, double-check everything.

Crack open the data

OK, now you are ready for a visit with the processor. Don't fall into the trap of looking at the geology though — it will seduce you (it's always pretty, especially if it's the first time you've seen it). There is work to do first.

  1. Check the cornerpoints of the survey. I like the (0, 0) trace at the SW corner. The inline and crossline numbering should be intuitive and simple. Make sure the survey is the correct way around with respect to north.
  2. Scan through timeslices. All of them. Is the sample interval what you were expecting? Do you reach the maximum time you expected, based on the design? Make sure the traces you expect to be live are live, and the ones you expect to be dead are dead. Check for acquisition footprint. Start with greyscale, then try another colourmap.
  3. Repeat (2) but in a similarity volume (or semblance, coherency, whatever). Look for edges, and geometric shapes. Check again for footprint.
  4. Look through the inlines and crosslines. These usually look OK, because it's what processors tend to focus on.
  5. Repeat (4) but in a similarity volume.

Dive into the details

  1. Check some spectrums. Select some subsets of the data — at least 100 traces and 1000 ms from shallow, deep, north, south, east, west — and check the average spectrums. There should be no conspicuous notches or spikes, which could be signs of all sorts of things from poorly applied filters to reverberation.
  2. Check the amplitude histograms from those same subsets. It should be 32-bit data — accept no less. Check the scaling — the numbers don't mean anything, so you can make them range over whatever you like. Something like ±100 or ±1000 tends to make for convenient scaling of amplitude maps and so on; ±1.0 or less can be fiddly in some software. Check for any departures from an approximately Laplacian (double exponential) distribution: clipping, regular or irregular spikes, or a skewed or off-centre distribution. (A quick way to eyeball the spectrum and histogram checks in Python is sketched after this list.)
  3. Interpret a horizon and check its phase. See Purves (Leading Edge, October 2014) or SubSurfWiki for some advice.
  4. By this time, the fold volume should yield no surprises. If any of the rest of this checklist throws up problems, the fold volume might help troubleshoot.
  5. Check any other products you asked for. If you asked for gathers or angle stacks (you should), check them too.
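Here's a rough sketch of the spectrum and histogram checks in points 1 and 2, using a synthetic stand-in for the extracted subset of traces:

import numpy as np
import matplotlib.pyplot as plt

# Stand-in for a subset of real traces: 100 traces x 1000 samples, 4 ms sample interval.
rng = np.random.default_rng(0)
data = rng.laplace(scale=100, size=(100, 1000))
dt = 0.004

freqs = np.fft.rfftfreq(data.shape[-1], d=dt)
spectrum = np.abs(np.fft.rfft(data, axis=-1)).mean(axis=0)   # average amplitude spectrum

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 4))
ax1.plot(freqs, spectrum)                 # look for notches and spikes
ax1.set_xlabel('frequency [Hz]')
ax2.hist(data.ravel(), bins=101)          # look for clipping, spikes, skew
ax2.set_xlabel('amplitude')
plt.show()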

Last of all, before actual delivery, talk to whoever will be loading the data about what kind of media they prefer, and what kind of file organization. They may also have some preferences for the contents of the SEG-Y file and trace headers. Pass all of this on to the processor. And don't forget to ask for All The Seismic.

What about you?

Have I forgotten anything? Are there things you always do to check a new seismic volume? Or if you're really brave, maybe you have some pitfalls or even horror stories to share...

Introducing Bruges


Welcome to Bruges, a Python library (previously known as agilegeo) that contains a variety of geophysical equations used in processing, modeling and analysing seismic reflection and well log data. New stuff is being added every week; here's a taste of what's in the box so far.


Simple AVO example

            VP [m/s]   VS [m/s]   ρ [kg/m³]
  Rock 1    3300       1500       2400
  Rock 2    3050       1400       2075

Imagine we're studying the interface between the two layers whose rock properties are shown here...

To compute the reflection coefficient at zero offset, we pass our rock properties into the Aki-Richards equation and set the incidence angle to zero:

 >>> import bruges as b
 >>> vp1, vs1, rho1 = 3300, 1500, 2400
 >>> vp2, vs2, rho2 = 3050, 1400, 2075
 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=0)
 -0.111995777064

Similarly, compute the reflection coefficient at 30 degrees:

 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=30)
 -0.0965206980095

To calculate the reflection coefficients for a series of angles, we can pass in a list:

 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=[0,10,20,30])
 [-0.11199578 -0.10982911 -0.10398651 -0.0965207 ]

Similarly, we could compute all the reflection coefficients for all incidence angles from 0 to 70 degrees, in one degree increments, by passing in a range:

 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=range(70))
 [-0.11199578 -0.11197358 -0.11190703 ... -0.16646998 -0.17619878 -0.18696428]

A few more lines of code, shown in the Jupyter notebook, and we can make some plots:


Elastic moduli calculations

With the same set of rocks in the table above we could quickly calculate the Lamé parameters λ and µ, say for the first rock, like so (in SI units),

 >>> b.rockphysics.lam(vp1, vs1, rho1), b.rockphysics.mu(vp1, vs1, rho1)
 15336000000.0 5400000000.0

Sure, the equations for λ and µ in terms of P-wave velocity, S-wave velocity, and density are pretty straightforward:

$$ \lambda = \rho\,(V_\mathrm{P}^2 - 2V_\mathrm{S}^2) \qquad \qquad \mu = \rho\,V_\mathrm{S}^2 $$

but there are many other elastic moduli formulations that aren't. Bruges knows all of them, even the weird ones in terms of E and λ.


All of these examples, and lots of others — Backus averaging, for example — are available in this Jupyter notebook, if you'd like to work through them on your own.


Bruges is a...

...Bag of Really Useful Geophysical Equations and Stuff.

It is very much early days for Bruges, but the goal is to expose all the geophysical equations that geophysicists like us depend on in their daily work. If you can't find what you're looking for, tell us what's missing, and together, we'll make it grow.

What's a handy geophysical equation that you employ in your work? Let us know in the comments!

Attribute analysis and statistics

Last week I wrote a basic introduction to attribute analysis. The post focused on the different ways of thinking about sampling and intervals, and on how instantaneous attributes have to be interpolated from the discrete data. This week, I want to look more closely at those interval attributes. We'd often like to summarize the attributes of an interval into a single number, perhaps to make a map.

Before thinking about amplitudes and seismic traces, it's worth reminding ourselves about different kinds of average. This table from SubSurfWiki might help... 

A peculiar feature of seismic data, from a statistical point of view, is the lack of the very low frequencies needed to give it a trend. Because of this, it oscillates around zero, so the average amplitude over a window tends to zero — seismic data has a mean value of zero. So not only do we have to think about interpolation issues when we extract attributes, we also have to think about statistics.

Fortunately, once we understand the issue it's easy to come up with ways around it. Look at the trace (black line) below:

The mean is, as expected, close to zero. So I've applied some other statistics to represent the amplitude values, shown as black dots, in the window (the length of the plot):

  • Average absolute amplitude (light green) — treat all values as positive and take the mean.
  • Root-mean-square amplitude (dark green) — tends to emphasize large values, so it's a bit higher.
  • Average energy (magenta) — the mean of the magnitude of the complex trace, or the envelope, shown in grey.
  • Maximum amplitude (blue) — the absolute maximum value encountered, which is higher than the actual sample values (which are all integers in this fake dataset) because of interpolation.
  • Maximum energy (purple) — the maximum value of the envelope, which is higher still because it is phase independent.

There are other statistics besides these, of course. We could compute the median average, or some other mean. We could take the strongest trough, or the maximum derivative (steepest slope). The options are really only limited by your imagination, and the physical relationship with geology that you expect.
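Here's a sketch of how those statistics might be computed with NumPy and SciPy, on a synthetic trace standing in for the real one in the figure:

import numpy as np
from scipy.signal import hilbert

rng = np.random.default_rng(42)
trace = np.convolve(rng.normal(size=500), np.hanning(25), mode='same')  # a fake, roughly band-limited trace

envelope = np.abs(hilbert(trace))   # magnitude of the complex (analytic) trace

print('mean amplitude:            ', trace.mean())                # close to zero, as discussed
print('average absolute amplitude:', np.abs(trace).mean())
print('RMS amplitude:             ', np.sqrt(np.mean(trace**2)))
print('average energy:            ', envelope.mean())             # mean of the envelope
print('maximum amplitude:         ', np.abs(trace).max())
print('maximum energy:            ', envelope.max())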

We'll return to this series over the summer, asking questions like How do you know what to expect? and Does a physically realistic relationship even matter? 


To view and run the code that I used in creating the figures for this post, grab the iPython/Jupyter Notebook.