More precise SEG-Y?

The impending SEG-Y Revision 2 release allows the use of double-precision floating point numbers. This news might leave some people thinking: "What?".

Integers and floats

In most computing environments, there are various kinds of number. The main two are integers and floating point numbers. Let's take a quick look at integers, or ints, first.

Integers can only represent round numbers: 0, 1, 2, 3, and so on. They come in two main flavours, signed and unsigned, and various bit depths: 8-bit, 16-bit, and so on. An 8-bit unsigned integer can have values between 0 and 255; signed 8-bit ints go from −128 to +127, thanks to a representation called two's complement.

As you might guess, floating point numbers, or floats, are used to represent all the other numbers — you know, numbers like 4.1 and –7.2346312 × 10¹³ — we need lots of those.  

Floats in binary

OK, so we need to know about floats. To understand what double-precision means, we need to know how floats are represented in computers. In other words, how on earth can a binary number like 01000010011011001010110100010101 represent a floating point number?

It's fairly easy to understand how integers are stored in binary: the 8-bit binary number 01001101 is the integer 77 in decimal, or 4D in hexadecimal; 11111111 is 255 (base 10) or FF (base 16) if we're dealing with unsigned ints, or -1 decimal if we're in the two's complement realm of signed ints.
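
You can check all of this in a Python console. Here's a quick look, using NumPy arrays for the fixed-width types; everything follows directly from the text above:

    int('01001101', 2)                                     # 77
    hex(0b01001101)                                        # '0x4d'

    import numpy as np
    np.array([0b11111111], dtype=np.uint8)                 # [255]: unsigned
    np.array([0b11111111], dtype=np.uint8).view(np.int8)   # [-1]: same bits, two's complement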

Clearly we can only represent a certain number of values with, say, 16 bits. This would give us 65 536 integers... but that's not enough dynamic range to represent tiny or gigantic floats, not if we want any precision at all. So we have a bit of a paradox: we'd like to represent a huge range of numbers (down around the mass of an electron, say, and up to Avogadro's number), but with reasonably high precision, at least a few significant figures. This is where floating point representations come in.

Scientific notation, sort of

If you're thinking about scientific notation, you're thinking on the right lines. We raise some base (say, 10) to some integer exponent, and multiply by another integer (called the mantissa, or significand). That way, we can write a huge range of numbers with plenty of precision, using only two integers. So:

$$ 3.14159 = 314159 \times 10^{-5} \ \ \mathrm{and} \ \ 6.02214 \times 10^{23} = 602214 \times 10^{18} $$

If I have two bytes at my disposal (so-called 'half precision'), I could have an 8-bit int for the integer part, called the significand, and another 8-bit int for the exponent. Then we could have floats from \(0\) up to \(255 \times 10^{255}\). The range is pretty good, but clearly I need a way to get negative significands — maybe I could use one bit for the sign, and leave 7 bits for the exponent. I also need a way to get negative exponents — I could assign a bias of –64 to the exponent, so that 127 becomes 63 and an exponent of 0 becomes –64. More bits would help, and there are other ways to apportion the bits, and we can use tricks like assuming that the significand starts with a 1, storing only the fractional part and thereby saving a bit. Every bit counts!
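
If it helps to see that as code, here's a minimal sketch of the toy two-byte scheme described above: a sign bit, a 7-bit exponent with a bias of 64, and an 8-bit significand, in base 10. Real half-precision floats use base 2 and the implicit-leading-1 trick, so this is just for intuition:

    def toy_float(sign, exponent, significand):
        """sign: 0 or 1; exponent: 0 to 127; significand: 0 to 255."""
        return (-1)**sign * significand * 10.0**(exponent - 64)

    toy_float(0, 64, 255)   # 255.0, the biggest significand with an unbiased exponent of 0
    toy_float(1, 59, 42)    # -0.00042, i.e. -42 x 10**-5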

IBM vs IEEE floats

The IBM float and IEEE 754-2008 specifications are just different ways of splitting up the bits in a floating point representation. Single-precision (32-bit) IBM floats differ from single-precision IEEE floats in two ways: they use 7 bits and a base of 16 for the exponent. In contrast, IEEE floats — which are used by essentially all modern computers — use 8 bits and base 2 (usually) for the exponent. The IEEE standard also defines not-a-numbers (NaNs), and positive and negative infinities, among other conveniences for computing.

In double precision, IBM floats get 56 bits for the fraction of the significand, allowing greater precision. There are still only 7 bits for the exponent, so there's no change in the dynamic range. 64-bit IEEE floats, however, use 11 bits for the exponent, leaving 52 bits for the fraction of the significand. This scheme yields 15–17 significant figures in decimal numbers.

The diagram below shows how four bytes (0x42, 0x6C, 0xAD, 0x15) are interpreted under the two schemes. The results are quite different. Notice the extra bit for the exponent in the IEEE representation, and the different bases and biases.
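
You can reproduce the numbers in the diagram with a few lines of Python. The IEEE interpretation comes straight from the struct module; the IBM one we have to do by hand. This is a bare-bones sketch, ignoring special cases like zero:

    import struct

    word = bytes.fromhex('426CAD15')

    # IEEE 754: 1 sign bit, 8 exponent bits (bias 127, base 2), 23 fraction bits.
    ieee = struct.unpack('>f', word)[0]            # 59.169...

    # IBM: 1 sign bit, 7 exponent bits (bias 64, base 16), 24 fraction bits.
    sign = -1 if word[0] & 0x80 else 1
    exponent = word[0] & 0x7F
    fraction = int.from_bytes(word[1:], 'big') / 16**6
    ibm = sign * fraction * 16**(exponent - 64)    # 108.676...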

A four-byte word, 426CAD15 (in hexadecimal), interpreted as an IBM float (top) and an IEEE float (bottom). Most scientists would care about this difference!

IBM, IEEE, and seismic

When SEG-Y was defined in 1975, there were only IBM floats — the IEEE 754 standard was not published until 1985. The SEG allowed the use of IEEE floating-point numbers in Revision 1 (2002), and they are still allowed in the impending Revision 2 specification. This matters because most computers these days use IEEE float representations, so if you want to read or write IBM floats, you're going to need to do some work.

The floating-point format in a particular SEG-Y file should be indicated by a flag in bytes 3225–3226. A value of 0x01 indicates IBM floats, while 0x05 indicates IEEE floats. Then again, you can't believe everything you read in headers. And, unfortunately, you can't tell an IBM float just by looking at it. Meisinger (2004) wrote a nice article in CSEG Recorder about the perils of loading IBM as IEEE and vice versa — illustrated below. You should read it.
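
Checking the flag is easy enough once the file is open. A minimal sketch, assuming a big-endian file and a made-up filename:

    with open('data.sgy', 'rb') as f:           # 'data.sgy' is a placeholder
        f.seek(3224)                            # bytes 3225-3226, zero-indexed
        fmt = int.from_bytes(f.read(2), 'big')

    {1: 'IBM float', 5: 'IEEE float'}.get(fmt, 'something else')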

From Meisinger, D (2004). SEGY floating point confusion. CSEG Recorder 29 (7). Available online.

I wrote this post by accident while writing about endianness, another big change in the new SEG-Y revision. Stay tuned for that post! [Update: here it is!]

Burning the surface onto the subsurface

Previously, I described a few of the reasons why we don't get a clean ground-surface event on land seismic data like we do with the water bottom in marine seismic. In land data, the worst part of the image is right at the surface. But ground level is not just tricky to see, it's impossible to see. Since the vibe truck is on the ground, there's no reflection from that surface. And even if there were some kind of event there, processors apply a magic eraser to the top of the section — the mute — to erase the early arrivals. So it's not possible to see the ground in land data, and you can't pick what isn't there.

But I still want to know where the ground is. Why can't we slap a ground-level seismic 'reflection' event on the section? 

What you need

We need the ground level, which is in depth of course, expressed in the time domain of the seismic section. To compute this time, let's call it \(t_\mathrm{G}\), we need three pieces of information at every trace location: the ground elevation \(G\), the seismic reference datum (SRD), which I'll call \(D\), and the replacement velocity \(V_\mathrm{r}\):

$$ t_\mathrm{G} = \frac{2 (G - D)}{V_\mathrm{r}} $$
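
In code, that's a one-liner. Here's a sketch with made-up numbers, mostly to show the sign convention:

    def ground_time(G, D, Vr):
        """Two-way time to ground level in seconds; G, D in metres, Vr in m/s."""
        return 2 * (G - D) / Vr

    ground_time(G=350.0, D=500.0, Vr=3048.0)   # about -0.098 s: above time zero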

Ground elevation.  If you're lucky, you'll find the ground elevation corresponding to each trace stored in the trace headers. Ground elevation might be located in bytes 41-44 or 45-48 of the trace header, which correspond to the receiver group elevation and the surface elevation at the source, respectively. These should be the same for a stacked trace, but as with any metadata to do with SEG-Y, this info could be hiding somewhere else, or missing altogether. If it's missing, you might have to comb through processing reports for it. And if you're even more unlucky (as I was in this example), you won't have any kind of processing report to fall back on, and you'll have to concoct something else. In the accompanying Jupyter notebook, I resorted to interpolating a digitized elevation profile from a JPEG plot of the seismic line. So if you're all out of options, you might find refuge in those legacy plots!
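
If the elevations are in the trace headers, they're easy to get at. A bare-bones sketch, assuming 4-byte big-endian integers and a placeholder filename, and ignoring the elevation scalar in bytes 69-70, which you should check too:

    with open('line.sgy', 'rb') as f:
        f.seek(3200 + 400 + 40)   # textual + binary file headers, then offset to byte 41
        elev = int.from_bytes(f.read(4), 'big', signed=True)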

This profile is particularly wonky, because the seismic reference datum (red) is not the same across the profile.

Seismic reference datum. And to make life yet more complicated, the seismic reference datum is not flat across the profile. It goes downhill and then flattens out (red line in the figure above). Don't ask me what the advantages are of processing data to a variable datum, but whatever they are, I hope they offset the disadvantages of all-too-easily mistaking the datum to be flat.

The replacement velocity is given in the sidelabel of the raster image online (shown right). It's 10 000 ft/sec, or 3048 m/s. 

Byte locations 53-56 and 57-60 are the standard trace header placeholders reserved for holding the datum elevation at the receiver group and the datum elevation at source. Again, for a stacked trace, these should be the same value. If these fields are zeros, then check the fields of the Trace Header Extension. If they turn up empty, and if the datum is horizontal, it might be listed in the file's text header. 

Convert elevation to time

By definition, the seismic reference datum is horizontal in the time domain (red line below). Notice how the ground elevation, in the time domain, plots mostly as negative values, before time zero. In other words, most of the ground is being cut off by the top of the section. So, if we want to see it, we need to shift everything down into the field of view. Conceptually, this means adjusting the seismic reference datum so it floats entirely above ground level. Computationally, we can achieve this easily enough by padding the top of the data with zeros.
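
The padding step is straightforward in NumPy. A sketch, with made-up dimensions and sample interval:

    import numpy as np

    data = np.random.randn(1000, 300)   # stand-in for the section: (samples, traces)
    dt_ms = 4.0                         # sample interval
    shift_ms = 120.0                    # how far the datum needs to move down

    npad = int(np.ceil(shift_ms / dt_ms))
    padded = np.vstack([np.zeros((npad, data.shape[1])), data])   # zeros on top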

A time-domain representation of the ground level along the seismic profile. The surface of the earth extends above the start of the seismic data for most of the locations along the profile.

Make the ground a pickable event

As a final bit of post-processing, we could actually burn the ground level into the data as a sort of synthetic seismic event. The reason I like this concept is that it alleviates the need to dig up old processing reports, puzzle over missing header data, or, worse, maintain and munge external text files containing elevation information. I say, let's make it self-contained: let's put it directly into the data so that it can be treated like any other seismic reflection. Why would I do this?

  • You can see where there might be fold, velocity or other issues related to topography.
  • You can immediately see the polarity of the data. 
  • You could use the bandwidth of the data to make the pseudo-reflector, giving a visual hint to the interpreter.
  • Keeping track of amplitude adjustments and phase rotations would be self-documenting and reversible.
  • You could autotrack it to get a topographic map (or just get this from the processor).
  • It looks cool!

Seismic profile with ground level SYNTHETICALLY SLAPPED ON TOP. Band-limited, of course, so you can autotrack to your heart's content!

I've deliberately constructed a band-limited reflection, as opposed to placing a sharp spike at ground level. The problem with a spike is that it has infinite bandwidth: it contains higher frequencies than the image, so, as Carl Reine commented on that last post, it might not play nicely with seismic attributes. Also, there's the problem of selecting an amplitude value to assign to the spike: we don't want to introduce amplitudes that are ridiculously out of range of the existing data.
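
Here's one way to build such a pseudo-reflector: put a suitably scaled spike at the ground time on each trace, then convolve with a Ricker wavelet to band-limit it. Everything here is an assumption: the wavelet frequency, the amplitude scaling, and the made-up section and ground times.

    import numpy as np

    def ricker(f, dt, duration=0.128):
        """A zero-phase Ricker wavelet with peak frequency f in Hz."""
        t = np.arange(-duration / 2, duration / 2, dt)
        return (1 - 2 * (np.pi * f * t)**2) * np.exp(-(np.pi * f * t)**2)

    dt_ms = 4.0
    section = np.random.randn(1000, 300)             # stand-in for the padded section
    t_ground_ms = np.full(section.shape[1], 100.0)   # ground times, one per trace
    amp = np.percentile(np.abs(section), 95)         # stay within the data's range

    spikes = np.zeros_like(section)
    for i, t in enumerate(t_ground_ms):
        spikes[int(round(t / dt_ms)), i] = amp

    w = ricker(f=30.0, dt=dt_ms / 1000)              # roughly match the data's bandwidth
    pseudo = np.apply_along_axis(np.convolve, 0, spikes, w, mode='same')
    result = section + pseudo                        # burn the pseudo-reflector in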

The whole image

I hereby propose that this synthetic ground level trick be adopted as the new standard for land seismic processing and interpretation. The great thing is, it can be done just as easily by interpreters and seismic data technologists as by the processing companies that create the rest of the image. I realize we're adding stuff to the data that isn't actually signal, but we do non-real things to signals all the time. The question is, do the benefits outweigh the artificiality?

Here's the view of the entire section:

The whole section, ground level included.

The details of this exercise can be found in this Jupyter Notebook.

References

The seismic is line 36_77_PR from the USGS data repository.

SEG Technical Standards Committee (2015). SEG-Y rev 2 Data Exchange Format. Draft 2.0, January 2015.

Where is the ground?

This is the upper portion of a land seismic profile in Alaska. Can you pick a horizon where the ground surface is? Have a go at pickthis.io.

Pick the ground surface at the top of the seismic section at pickthis.io.

Picking the ground surface on land-based seismic data is not straightforward. Picking the seafloor reflection on marine data, on the other hand, is usually a piece of cake, a warm-up pick. You can often auto-track the whole thing with a few seeds.

Seafloor reflection on the Penobscot 3D survey, offshore Nova Scotia. From Matt's tutorial in the April 2016 issue of The Leading Edge, The function of interpolation.

Why aren't interpreters more nervous that we don't know exactly where the surface of the earth is? I'm sure I'm not the only one that would like to have this information while interpreting. Wouldn't it be great if land seismic were more like marine?

Treacherously Jagged TopographY or Near-Surface processing ArtifactS?

If you're new to land-based seismic data, you might notice that there isn't a nice pickable event across the top of the section like we find in marine seismic data. Shot noise at the surface has been muted (deleted) in processing, and the low fold produces an unclean, jagged look at the top of the section. Additionally, time zero at the top of the section — the seismic reference datum — usually floats somewhere above the land surface, and we can't know where it is unless it can be found in the file header, or looked up in the processing report.

The seismic reference datum, at a two-way time of zero seconds on seismic data, is typically set at mean sea level for offshore data. For land data, it is usually chosen to 'float' above the land surface.

Reframing the question

This challenge is a bit of a trick question. It asks the viewer to recognize that the seemingly simple task of mapping the ground level on a land seismic section is actually a rudimentary velocity modeling, or depth conversion, exercise in itself. Wouldn't it be nice to have the ground surface expressed as a pickable seismic event? Shouldn't we always have it in our images? Baked into our data, so to speak, such that we've always got an unambiguous pick? In the next post, I'll illustrate what I mean and show what's involved in putting it in.

In the meantime, I challenge you to pick where you think the (currently absent) ground surface is on this profile, so in the next post we can see how well you did.

x lines of Python: read and write SEG-Y

Reading SEG-Y files comes up a lot in the geophysicist's workflow. Writing, less often, but it does come up occasionally. As long as we're mostly concerned with trace data and not location, both of these tasks can be fairly easily accomplished with ObsPy. 

Today we'll load some seismic, compute an attribute on it, and save a new SEG-Y, in 10 lines of Python.


ObsPy is a rare thing. It demonstrates what a research group can accomplish with a little planning and a lot of perseverance (cf. my whinging earlier this year about certain consortiums in our field). It's an open source Python package from the geophysicists at the University of Munich — Karl Bernhard Zoeppritz studied there for a while, so you know it's legit. The tool serves their research needs in earthquake and global seismology, and also happens to handle SEG-Y files quite nicely.

Aside: I think SixtyNorth's segpy is actually the way to go for reading and writing SEG-Y; ObsPy is probably overkill for most applications — it's about 80 times the size for one thing. I just happen to be familiar with it and it's super easy to install: conda install obspy. So, since minimalism is kind of the point here, look out for a future x lines of Python using that library.

The sentences

As before, we'd like to express the process in just a few sentences of plain English. Assuming we just want to read the data into a NumPy array, look at it, do something to it, and write a new file, here's what we're doing (there's a code sketch after the list):

  1. Read (or really index) the file as an ObsPy Stream object.
  2. Stack (in the NumPy sense) the Trace objects into a single NumPy array. We have data!
  3. Get the 99th percentile of the amplitudes to make plotting easier.
  4. Plot the data so we can see it.
  5. Get the sample interval of the data from a trace header.
  6. Compute the similarity attribute using our library bruges.
  7. Make a new Stream object to hold the outbound data.
  8. Add a Stats object, which holds the header, and recycle some header info.
  9. Append info about our data to the header.
  10. Write a new SEG-Y file with our computed data in it!
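
Here's a minimal sketch of those ten steps. The filenames are placeholders, and I'm assuming the signature of bruges's similarity function; see the notebook for the real thing.

    import numpy as np
    import matplotlib.pyplot as plt
    from obspy import read, Stream, Trace
    from obspy.core import AttribDict
    import bruges

    stream = read('input.sgy', format='SEGY')                  # 1. read the file
    data = np.stack([t.data for t in stream.traces])           # 2. traces as rows
    clip = np.percentile(data, 99)                             # 3. robust clip value
    plt.imshow(data.T, aspect='auto', cmap='Greys',
               vmin=-clip, vmax=clip)                          # 4. take a look
    plt.show()
    dt = stream.traces[0].stats.delta                          # 5. sample interval, s
    sim = bruges.attribute.similarity(data, duration=0.036, dt=dt)  # 6. the attribute
    out = Stream(traces=[Trace(t, header={'delta': dt})
                         for t in np.float32(sim)])            # 7. & 8. new Stream
    out.stats = AttribDict()                                   # 9. add a header...
    out.stats.textual_file_header = b'Similarity volume.'      # ...with a note in it
    out.write('output.sgy', format='SEGY', data_encoding=5)    # 10. 5 = IEEE floats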

There's a bit more in the Jupyter Notebook (examining the file and trace headers, for example, and a few more plots) which, remember, you can run right in your browser! You don't need to install a thing. Please give it a look! Quick tip: Just keep hitting Shift+Enter to run the cells.

If you like this sort of thing, and are planning to be at the SEG Annual Meeting in Dallas next month, you might like to know that we'll be teaching our Creative Geocomputing class there. It's basically two days of this sort of thing, only with friends to learn with and us to help. Come and learn some new skills!

The seismic data used in this post is from the NPRA seismic repository of the USGS. The data is in the public domain.

What's that funny noise?

Seismic reflections are strange noises. Around 50 Hz, narrow band, very quiet, and difficult to interpret. It is possible to convert seismic traces (active or passive) into audible sound with a shift in pitch and a time stretch.
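
For example, here's the gist of the trick in a few lines of SciPy. The speed-up factor is an arbitrary choice, and the random trace is a stand-in for real data:

    import numpy as np
    from scipy.io import wavfile

    dt = 0.004                        # 4 ms sampling, i.e. a 250 Hz sample rate
    speedup = 100                     # 50 Hz seismic becomes 5 kHz audio
    trace = np.random.randn(2000)     # stand-in for a real seismic trace

    audio = np.int16(32767 * trace / np.abs(trace).max())
    wavfile.write('trace.wav', int(speedup / dt), audio)   # play back 100x too fast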

Made by the legendary Emory Cook, who recorded everything from steel bands to racing cars to ionospheric noises to this treatment of Hugo Benioff's earthquake recordings. Epic.

Curiously, audification has never really caught on in exploration geophysics — a bit surprising, given the fascination with spectral decomposition over the last 15 years or so. And especially so when you consider that our hearing has a dynamic range of about 100 dB, which is comparable to, indeed slightly greater than, our vision (about 90 dB).

Paolo Dell'Aversana of ENI wants to change that. Rather than listening to 'raw' seismic, he's sending it to a MIDI interface and listening to it as a piano roll. Just try to imagine playing seismic on a piano for a second, then listen to his weird and wonderful results — at 9:45 in this EAGE video:

In this EAGE E-Lecture Paolo Dell'Aversana discusses how digital music technology can support geophysical data analysis and interpretation. If you've read any of Dell'Aversana's articles, you'll know he has one of the most creative minds in exploration geophysics. Skip to 9:45 for the crazy seismic piano roll.

On the subject of weird sounds, one of my favourite Wikipedia pages is List of unexplained sounds. I especially love the eerie recordings of mysterious underwater noises, like this one called Upsweep:

Upsweep
National Oceanic and Atmospheric Administration

No-one knows what makes that noise! My money's on a volcanic vent, but that doesn't explain the seasonality. Maybe we should do a hackathon on these unexplained sounds some time. If you know of any others, I'd love to hear about them.


If you enjoy strange infrasound as much as I do, I recommend following these two scientists on Twitter:


If you really like strange noises, don't forget to check out the Undersampled Radio podcast!

Q is for Q

Quality factor, or \(Q\), is one of the more mysterious quantities of seismology. It's right up there with Lamé's \(\lambda\) and Thomsen's \(\gamma\). For one thing, it's wrapped up with the idea of attenuation, and sometimes the terms \(Q\) and 'attenuation' are bandied about seemingly interchangeably. For another thing, people talk about it like it's really important, but it often seems to be completely ignored.

A quick aside. There's another quality factor: the rock quality factor, popular among geomechanicists (geomechanics?). That \(Q\) describes the degree and roughness of jointing in rocks, and is probably related — coincidentally if not theoretically — to seismic \(Q\) in various nonlinear and probably profound ways. I'm not going to say any more about it, but if this interests you, read Nick Barton's book, Rock Quality, Seismic Velocity, Attenuation and Anisotropy (2006; CRC Press), if you can afford it.

So what is Q exactly?

We know intuitively that seismic waves lose energy as they travel through the earth. There are three loss mechanisms: scattering (elastic losses resulting from reflections and diffractions), geometrical spreading, and intrinsic attenuation. This last one, anelastic energy loss due to absorption — essentially the deviation from perfect elasticity — is what I'm trying to describe here.

I'm not going to get very far, by the way. For the full story, start at the seminal review paper entitled \(Q\) by Leon Knopoff (1964), which surely has the shortest title of any paper in geophysics. (Knopoff also liked short abstracts, as you see here.)

The dimensionless seismic quality factor \(Q\) is defined in terms of the energy \(E\) stored in one cycle, and the change in energy — the energy dissipated in various ways, such as fluid movement (AKA 'sloshing', according to Carl Reine's essay in 52 Things... Geophysics) and intergranular frictional heat ('jostling') — over that cycle:

$$ Q \stackrel{\mathrm{def}}{=} 2 \pi \frac{E}{\Delta E} $$

Remarkably, this same definition holds for any resonator, including pendulums and electronics. Physics is awesome!

Because the right-hand side of that relationship is sort of upside down — the loss is in the denominator — it's often easier to talk about \(Q^{-1}\) which is, more or less, the percentage loss of energy in a single wavelength. This inverse of \(Q\) is proportional to the attenuation coefficient. For more details on that relationship, check out Carl Reine's essay.

This connection with wavelengths means that we have to think about frequency. Because high frequencies have shorter cycles (by definition), they attenuate faster than low frequencies. You know this intuitively from hearing the beat, but not the melody, of distant music for example. This effect does not imply that \(Q\) depends on frequency... that's a whole other can of worms. (Confused yet?)
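
A quick back-of-the-envelope illustrates the point. Under the usual constant-Q assumption, amplitude decays as \(\exp(-\pi f t / Q)\); with a made-up \(Q\) of 100 and two seconds of travel:

    import numpy as np

    Q, t = 100, 2.0
    for f in [10, 50]:                       # Hz
        print(f, np.exp(-np.pi * f * t / Q))
    # 10 Hz keeps about 53% of its amplitude; 50 Hz keeps about 4%.
    # The melody fades before the beat does.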

The frequency dependence of \(Q\)

It's thought that \(Q\) is roughly constant with respect to frequency below about 1 Hz, then increases with \(f^\alpha\), where \(\alpha\) is about 0.7, up to at least 25 Hz (I'm reading this in Mirko van der Baan's 2002 paper), and probably beyond. Most people, however, seem to throw their hands up and assume a constant \(Q\) even in the seismic bandwidth... mainly to make life easier when it comes to seismic processing. Attempting to measure, let alone compensate for, \(Q\) in seismic data is, I think it's fair to say, an unsolved problem in exploration geophysics.

Why is it worth solving? I think the main point is that, if we could model and measure it better, it could be a semi-independent measure of some rock properties we care about, especially velocity. Actually, I think it's even a stretch to call velocity a rock property — most people know that velocity depends on frequency, at least across the gulf of frequencies between seismic and acoustic logging tools, but did you know that velocity also depends on amplitude? Paul Johnson tells the story of this effect in his essay in the forthcoming 52 Things... Rock Physics book — stay tuned for more on that.

For a really wacky story about negative values of \(Q\) — which imply transmission coefficients greater than 1 (think about that) — check out Chris Liner's essay in the same book (or his 2014 paper in The Leading Edge). It's not going to help \(Q\) get any less mysterious, but it's a good story. Here's the punchline from a Jupyter Notebook I made a while back; it follows along with Chris's lovely paper:

Top: Velocity and the Backus average velocity in the E-38 well offshore Nova Scotia. Bottom: Layering-induced attenuation, or 1/Q, in the same well. Note the negative numbers! Reproduction of Liner's 2014 results in a Jupyter Notebook.

Hm, I had hoped to shed some light on \(Q\) in this post, but I seem to have come full circle. Maybe explaining \(Q\) is another unsolved problem.

References

Barton, N (2006). Rock Quality, Seismic Velocity, Attenuation and Anisotropy. Florida, USA: CRC Press. 756 pages. ISBN 9780415394413.

Johnson, P (in press). The astonishing case of non-linear elasticity.  In: Hall, M & E Bianco (eds), 52 Things You Should Know About Rock Physics. Nova Scotia: Agile Libre, 2016, 132 pp.

Knopoff, L (1964). Q. Reviews of Geophysics 2 (4), 625–660. DOI: 10.1029/RG002i004p00625.

Reine, C (2012). Don't ignore seismic attenuation. In: Hall, M & E Bianco (eds), 52 Things You Should Know About Geophysics. Nova Scotia: Agile Libre, 2012, 132 pp.

Liner, C (2014). Long-wave elastic attenuation produced by horizontal layering. The Leading Edge 33 (6), 634–638. DOI: 10.1190/tle33060634.1. Chris also blogged about this article.

Liner, C (in press). Negative Q. In: Hall, M & E Bianco (eds), 52 Things You Should Know About Rock Physics. Nova Scotia: Agile Libre, 2016, 132 pp.

van der Baan, M (2002). Constant Q and a fractal, stratified Earth. Pure and Applied Geophysics 159 (7–8), 1707–1718. DOI: 10.1007/s00024-002-8704-0.

Copyright and seismic data

Seismic company GSI has sued a lot of organizations recently for sharing its copyrighted seismic data, undermining its business. A recent court decision found that seismic data is indeed copyrightable, but Canadian petroleum regulations can override the copyright. This allows data to be disclosed by the regulator and copied by others — made public, effectively.


Seismic data is not like other data

Data is uncopyrightable. Like facts and ideas, data is considered objective, uncreative — too cold to copyright. But in an important ruling last month, the Honourable Madam Justice Eidsvik of the Court of Queen's Bench of Alberta established that seismic data is not like ordinary data. According to this ruling:

...the creation of field and processed [seismic] data requires the exercise of sufficient skill and judgment of the seismic crew and processors to satisfy the requirements of [copyrightability].

These requirements were established in the case of the legal publisher CCH Canadian Limited vs The Law Society of Upper Canada (2004) in the Supreme Court of Canada. Quoting from that ruling:

What is required to attract copyright protection in the expression of an idea is an exercise of skill and judgment. By skill, I mean the use of one’s knowledge, developed aptitude or practised ability in producing the work. By judgment, I mean the use of one’s capacity for discernment or ability to form an opinion or evaluation by comparing different possible options in producing the work.

Interestingly:

There exist no cases expressly deciding whether Seismic Data is copyrightable under the American Copyright Act [in the US].

Fortunately, Justice Eidsvik added this remark to her ruling — just in case there was any doubt:

I agree that the rocks at the bottom of the sea are not copyrightable.

It's really worth reading through some of the ruling, especially sections 7 and 8, entitled Ideas and facts are not protected and Trivial and purely mechanical respectively. 

Why are we arguing about this?

This recent ruling about seismic data was the result of an action brought by Geophysical Service Incorporated against pretty much anyone they could accuse of infringing their rights in their offshore seismic data, by sharing it or copying it in some way. Specifically, the claim was that data they had been required to submit to regulators like the C-NLOPB and the C-NSOPB was improperly shared, undermining its business of shooting seismic data on spec.

You may not have heard of GSI, but the company has a rich history as a technical and business innovator. The company was the precursor to Texas Instruments, a huge player in the early development of computing hardware — seismic processing was the 'big data' of its time. GSI still owns the largest offshore seismic dataset in Canada. Recently, however, the company seems to have focused entirely on litigation.

The Calgary company brought more than 25 lawsuits in Alberta alone against corporations, petroleum boards, and others. There have been other actions in other jurisdictions. This ruling is just the latest one; here's the full list of defendants in this particular suit (some of them comprise multiple corporate entities):

  • Devon Canada Corporation
  • Statoil Canada Ltd.
  • Anadarko Petroleum Corporation
  • Anadarko US Offshore Corporation
  • NWest Energy Corp.
  • Shoal Point Energy Ltd.
  • Vulcan Minerals Inc.
  • Corridor Resources Inc.
  • CalWest Printing and Reproductions
  • Arcis Seismic Solutions Corp.
  • Exploration Geosciences (UK) Limited
  • Lynx Canada Information Systems Ltd.
  • Olympic Seismic Ltd.
  • Canadian Discovery Ltd.
  • Jebco Seismic UK Limited
  • Jebco Seismic (Canada) Company
  • Jebco Seismic, LP
  • Jebco/Sei Partnership LLC
  • Encana Corporation
  • ExxonMobil Canada Ltd.
  • Imperial Oil Limited
  • Plains Midstream Canada ULC
  • BP Canada Energy Group ULC
  • Total S.A.
  • Total E&P Canada Ltd.
  • Edison S.P.A.
  • Edison International S.P.A.
  • ConocoPhillips Canada Resources Corp.
  • Canadian Natural Resources Limited
  • MGM Energy Corp
  • Husky Oil Limited
  • Husky Oil Operations Limited
  • Nalcor Energy – Oil and Gas Inc.
  • Suncor Energy Inc.
  • Murphy Oil Company Ltd.
  • Devon ARL Corporation

Why did people share the data?

According to Section 101, Disclosure of Information, of the Canada Petroleum Resources Act (1985), geophysical data should be released to regulators — and thus, effectively, the public — five years after acquisition:

(2) Subject to this section, information or documentation is privileged if it is provided for the purposes of this Act [...]
(2.1) Subject to this section, information or documentation that is privileged under subsection 2 shall not knowingly be disclosed without the consent in writing of the person who provided it, except for the purposes of the administration or enforcement of this Act [...]

(7) Subsection 2 does not apply in respect of the following classes of information or documentation obtained as a result of carrying on a work or activity that is authorized under the Canada Oil and Gas Operations Act, namely, information or documentation in respect of

(d) geological work or geophysical work performed on or in relation to any frontier lands,
    (i) in the case of a well site seabed survey [...], or
    (ii) in any other case, after the expiration of five years following the date of completion of the work;

As far as I can tell, this does not necessarily happen, by the way. There seems to be a great deal of confusion in Canada about what 'seismic data' actually is — companies submit paper versions, sometimes with poor processing, or perhaps only every 10th line of a 3D. But the Canada Oil and Gas Geophysical Operations Regulations are quite clear. This is from the extensive and pretty explicit 'Final Report' requirements: 

(j) a fully processed, migrated seismic section for each seismic line recorded and, in the case of a 3-D survey, each line generated from the 3-D data set;

The intent is quite clear: the regulators are entitled to the stacked, migrated data. The full list is worth reading; it covers a large amount of data. If this is enforced at all, it is not enforced very rigorously. And if these datasets ever make it into the hands of the regulators (I doubt all of it ever does), they are still subject to the haphazard data management practices that this industry has ubiquitously adopted.

GSI argued that 'disclosure', as set out in Section 101 of the Act, does not imply the right to copy, but the court was unmoved:

Nonetheless, I agree with the Defendants that [Section 101] read in its entirety does not make sense unless it is interpreted to mean that permission to disclose without consent after the expiry of the 5 year period [...] must include the ability to copy the information. In effect, permission to access and copy the information is part of the right to disclose.

So this is the heart of the matter: the seismic data was owned and copyrighted by GSI, but the regulations specify that seismic data must be submitted to regulators, and that they can disclose that data to others. There's obvious conflict between these ideas, so which one prevails? 

The decision

There is a principle in law called Generalia Specialibus Non Derogant. Quoting from another case involving GSI:

Where two provisions are in conflict and one of them deals specifically with the matter in question while the other is of more general application, the conflict may be avoided by applying the specific provision to the exclusion of the more general one. The specific prevails over the more general: it does not matter which was enacted first.

Quoting again from the recent ruling in GSI vs Encana et al.:

Parliament was aware of the commercial value of seismic data and attempted to take this into consideration in its legislative drafting. The considerations balanced in this regard are the same as those found in the Copyright Act, i.e. the rights of the creator versus the rights of the public to access data. To the extent that GSI feels that this policy is misplaced, its rights are political ones – it is not for this Court to change the intent of Parliament, unfair as it may be to GSI’s interests.

Finally:

[...the Regulatory Regime] is a complete answer to the suggestion that the Boards acted unlawfully in disclosing the information and documentation to the public. The Regulatory Regime is also a complete answer to whether the copying companies and organizations were entitled to receive and copy the information and documentation for customers. For the oil companies, it establishes that there is nothing unlawful about accessing or copying the information from the Boards [...]

So that's it: the data was copyrighted, but the regulations effectively override the copyright. The regulations are legal, and — while GSI might find the result unfair — it must operate under them.

The decision must be another step towards the end of this ugly matter. Maybe it's the end. I'm sure those (non-lawyers) involved can't wait for it to be over. I hope GSI finds a way back to its technical core and becomes a great company again. And I hope the regulators find ways to better live up to the fundamental idea behind releasing data in the first place: that the availability of the data to the public should promote better science and better decisions for Canada's offshore. As things stand today, the whole issue of 'public subsurface data' in Canada is, frankly, a mess.

Helpful horizons

Ah, the smell of a new seismic interpretation project. All those traces, all that geology — perhaps unseen by humans, or indeed any multicellular organism at all, since the Triassic. The temptation is to Just Start Interpreting: why, you could have a map by lunchtime tomorrow! But wait. There are some things to do first.

Once I've made sure all is present and correct (see How to QC a seismic volume), I spend a bit of time making some helpful horizons... 

  • The surface. One of the fundamental horizons, the seafloor or ground surface is a must-have. You may have received it from the processor (did you ask for it?) or it may be hidden in the SEG-Y headers — ask whoever received or loaded the data. If not, ground elevation is usually easy enough to get from your friendly GIS guru. If you have to interpret the seafloor, at least it should autotrack quite well.
  • Seafloor multiple model. In marine data, I always make a seafloor multiple model — just multiply the seafloor pick by 2. This will help you make sense of any anomalous reflectors or amplitudes at that two-way time. Maybe make a 3× version too if things look really bad. Remember, the 2× multiple will be reverse polarity.
  • Other multiples. You can model the surface multiple of any strong reflectors with the same arithmetic — but the chances are that any residual multiple energy is quite subtle. You may want to seek help modeling them properly, once you have a 3D velocity model.

A 2D seismic dataset with some of the suggested helpful horizons. Please see the footnote about this dataset. Click the image to enlarge.

  • Water depth markers. I like to make flat horizons* at important water depths, e.g. the shelf edge (usually about 100–200 m), plus 1000 m, 2000 m, etc. These mainly help to keep track of where you are, and also to think about prospectivity, accessibility, well cost, and so on. You only want these to exist in the water, so delete them anywhere they are deeper than the seafloor horizon (there's a code sketch after this list). Your software should have an easy way to implement a simple model for time t in ms, given depth d in m and velocity** V in m/s, e.g.

$$ t = \frac{2000 d}{V} \approx \frac{2000 d}{1490} \qquad \qquad \mathrm{e.g.}\ \frac{2000 \times 1000}{1490} = 1342\ \mathrm{ms} $$

  • Hydrate stability zone. In marine data and in the Arctic you may want to model the bottom of the gas hydrate stability zone (GHSZ) to help interpret bottom-simulating reflectors, or BSRs. I usually do this by scanning the literature for reports of BSRs in the area, or data on hydrate encounters in wells. In the figure above, I just used the seafloor plus 400 ms. If you want to try for more precision, Bale et al. (2014) provided several models for computing the position of the GHSZ — thank you to Murray Hoggett at Birmingham for that tip.
  • Fold. It's very useful to be able to see seismic fold on a map along with your data, so I like to load fold maps at some strategic depths or, better yet, load the entire fold volume. That way you can check that anomalies (especially semblance) don't have a simple, non-geological explanation. 
  • Gravity and magnetics. These datasets are often readily available. You will have to shift and scale them to some sensible numbers, either at the top or the bottom of your sections. Gravity can be especially useful for interpreting rifted margins. 
  • Important boundaries. Your software may display these for you, but if not, you can fake it. Simply make a horizon that only exists within the polygon — a lease boundary perhaps — by interpolating within a polygon. Make this horizon flat and deep (deeper than the seismic), then merge it with a horizon that is flat and shallow (–1 ms, or anything shallower than the seismic). You should end up with almost-vertical lines at the edges of the feature.
  • Section headings. I like to organize horizons into groups — stratigraphy, attributes, models, markers, etc. I make empty horizons to act only as headings so I can see at a glance what's going on. Whether you need do this, and how you achieve it, depends on your software.
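
Here's the sort of thing I mean for the water-depth markers: a flat horizon in two-way time, deleted below the seafloor. The picks are made up, and your software will have its own way of doing this.

    import numpy as np

    V = 1490.0                                         # speed of sound in seawater, m/s
    seafloor_twt = np.array([1200.0, 1350.0, 1500.0])  # made-up seafloor picks, ms

    def water_depth_marker(depth_m):
        """A flat horizon at a given water depth, clipped to the water column."""
        twt = np.full_like(seafloor_twt, 2000 * depth_m / V)
        twt[twt > seafloor_twt] = np.nan               # delete it below the seafloor
        return twt

    water_depth_marker(1000)   # [nan, 1342.3, 1342.3]: absent where water is shallower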

Most of these horizons don't take long to make, and I promise you'll find uses for them throughout the interpretation project. 

If you have other helpful horizon hacks, I'd love to hear about them — put your favourites in the comments. 


Footnotes

* It's not always obvious how to make a flat horizon. A quick way is to take some ubiquitous horizon — the seafloor maybe — and multiply it by zero.

** The velocity of sound in seawater is not a simple subject. If you want to be precise about it, you can try this online calculator, or implement the equations yourself.

The 2D seismic dataset shown is from the Laurentian Basin, offshore Newfoundland. The dataset is copyright of Natural Resources Canada, and subject to the Open Government License – Canada. You can download it from the OpendTect Open Seismic Repository. The cultural boundary and gravity data is fictitious — I made them up for the purposes of illustration.

References

Bale, Sean, Tiago M. Alves, Gregory F. Moore (2014). Distribution of gas hydrates on continental margins by means of a mathematical envelope: A method applied to the interpretation of 3D seismic data. Geochem. Geophys. Geosyst. 15, 52–68, doi:10.1002/2013GC004938. Note: the equations are in the Supporting Information.

New open data and a competition

First, a quick announcement. EMC, the data storage and cloud computing company, has stepped up to sponsor the Subsurface Hackathon in Vienna in a few weeks. Their generous help will ensure a fun event with some awesome prizes — so get signed up and start planning your project!


New open data

A correspondent got in touch last week about an exciting new open seismic dataset. In the late summer and early autumn of 2015, WesternGeco acquired a large new 2D seismic survey in the Rockall Basin and the Mid North Sea High for the UK Government. The survey cost about £20 million and consists of 20,000 km of new broadband PreSTM data. At the end of March, the dataset will be released to the public for free download, along with about 20,000 km of legacy 2D data, 40,000 km of new gravity and magnetic data, and wells.

© Crown Copyright — Used under fair use provision.

If you are interested in downloading the data, the government is asking that you fill out this form — it will help them figure out what to make available, and how much infrastructure to provision. Excitingly, they are asking about angle stacks and PSTM gathers, not just the full stack. It sounds like it will be an important resource for our community.

They are even asking about interest in the field data — all 60 TB of it. There will almost certainly be a fee associated with the larger datasets, by the way. I asked about this, and it sounds like it will likely be on the order of several thousand pounds to handle the full SEG-D field data, because of course it will be on physical media. But the government is open to suggestions if the geophysical community would like to find another way to distribute the data — do let me know if you'd like to talk about this.

New seed funding

Along with the data package, the government has announced an exciting new competition for 'seed funding':

The £500,000 competition has been designed to encourage geoscientists and engineers to develop innovative interpretations and products potentially using [this new open data]...

The motivation for the competition is clear:

It is hoped the competition will not only significantly increase the understanding of these frontier areas in respect of the 29th Seaward Licensing Round later in the year, but also retain talent in the oil and gas community which has been affected by the oil and gas industry downturn.

The parameters of the competition are spelled out in the Word document on this tender notice. It sounds like almost anything goes: data analysis, product development, even exploration activity. So get creative — and pitch the coolest thing you can think of!

You'll have to get cracking though, because applications to take part must be in by 1 April. If selected, the project must be delivered on 11 November.

Old skool plot tool

It's not very glamorous, but sometimes you just want to plot a SEG-Y file. That's why we crafted seisplot. OK, that's why we cobbled seisplot together out of various scripts and functions we had lying around, after a couple of years of blog posts and Leading Edge tutorials and the like.

Pupils of the old skool — when everyone knew how to write a bash script, pencil crayons and lead-filled beanbags ruled the desktop, and Carpal Tunnel Syndrome was just the opening act to the Beastie Boys — will enjoy seisplot. For a start, it's command line only: 

    python seisplot.py -R -c config.py ~/segy_files -o ~/plots

Isn't that... reassuring? In this age of iOS and Android and Oculus Rift... there's still the command line interface.

Features galore

So what sort of features can you look forward to? Other than all the usual things you've come to expect of subsurface software, like a complete lack of support or documentation. (LOL, I'm kidding.) Only these awesome selling points:

  • Make wiggle traces or variable density plots... or don't choose — do both!
  • If you want, the script will descend into subdirectories and make plots for every SEG-Y file it finds.
  • There are plenty of colourmaps to choose from, or if you're insane you can make your own.
  • You can make PNGs, JPGs, SVGs or PDFs. But not CGM, sorry about that.

Well, I say 'selling points', but the tool is 100% free. We think this is a fair price. It's also open source, of course, so please — seriously, please — improve the source code, then share it with the world! The code is on GitHub, natch.

Never go full throwback

There is one more feature: you can go full throwback and add scribbles and coffee stains. Here's one for your wall:


The 2D seismic line in this post is from the USGS NPRA Seismic Data Archive, and is in the public domain. This is line number 31-81-PR (the link goes directly to the SEG-Y file).