May 31, 2018

Weekend worship in Salt Lake City

May 31, 2018/ Evan Bianco

The Salt Lake City hackathon — only the second we've done with a strong geology theme — is a thing of history, but you can still access the event page to check out who showed up and who did what. (This events page is a new thing we launched in time for this hackathon; it will serve as a public document of what happens at our events, in addition to being a platform for people to register, sponsor, and connect around our events.)

In true seat-of-the-pants hackathon style we managed to set up an array of webcams and microphones to record the finale. The demos are the icing on the cake. Teams were selected at random and were given 4 minutes to wow the crowd. Here is the video, followed by a summary of what each team got up to...

Unconformist.ai

Didi Ooi (University of Bristol), Karin Maria Eres Guardia (Shell), Alana Finlayson (UK OGA), Zoe Zhang (Chevron). The team used machine learning the automate the mapping of unconformities in subsurface data. One of the trickiest parts is building up a catalog of data-model pairs for GANs to train on. Instead of relying on thousand or hundreds of thousands of human-made seismic interpretations, the team generated training images by programmatically labelling pixels on synthetic data as being either above (white) or below (black) the unconformity. Project page. Slides.

Outcrops Gee Whiz

Thomas Martin (soon Colorado School of Mines), Zane Jobe (Colorado School of Mines), Fabien Laugier (Chevron), and Ross Meyer (Colorado School of Mines). The team wrote some programs to evaluate facies variability along drone-derived digital outcrop models. They did this by processing UAV point cloud data in Python and classified different rock facies using using weather profiles, local cliff face morphology, and rock colour variations as attributes. This research will help in the development drone assisted 3D scanning to automate facies boundaries mapping and rock characterization. Repo, Slides.

Jet Loggers

Eirik Larsen and Dimitrios Oikonomou (Earth Science Analytics), and Steve Purves (Euclidity). This team of European geoscientists, with their circadian clocks all out of whack, investigated if a language of stratigraphy can be extracted from the rock record and, if so, if it can be used as another tool for classifying rocks. They applied natural language processing (NLP) to an alphabetic encoding of well logs as a means to assist or augment the labour-intensive tasks of classifying stratigraphic units and picking tops. Slides.

Book Cliffs Bandits

Tom Creech (ExxonMobil) and Jesse Pisel (Wyoming State Geological Survey). The team started munging datasets in the Book Cliffs. Unfortunately, they really did not have the perfect, ready to go data, and by the time they pivoted to some workable open data from Alaska, their team name had already became a thing. The goal was build a tool to assist with lithology and stratigraphic correlation. They settled on change-point detection using Bayesian statistics, which they were using to build richer feature sets to test if it could produce more robust automatic stratigraphic interpretation. Repo, and presentation.

A channel runs through it

Nam Pham (UT Austin), Graham Brew (Dynamic Graphics), Nathan Suurmeyer (Shell). Because morphologically realistic 3D synthetic seismic data is scarce, this team wrote a Python program that can take seismic horizon interpretations from real data, then construct richer training data sets for building an AI that can automatically delineate geological entities in the subsurface. The pixels enclosed by any two horizons are labelled with ones, pixels outside this region are labelled with zeros. This work was in support of Nam's thesis research which is using the SegNet architecture, and aims to extract not only major channel boundaries in seismic data, but also the internal channel structure and variability – details that many seismic interpreters, armed even with state-of-the art attribute toolboxes, would be unable to resolve. Project page, and code.

GeoHacker

Malcolm Gall (UK OGA), Brendon Hall and Ben Lasscock (Enthought). Innovation happens when hackers have the ability to try things... but they also need data to try things out on. There is a massive shortage of geoscience datasets that have been staged and curated for machine learning research. Team Geohacker's project wasn't a project per se, but a platform aimed at the sharing, distribution, and long-term stewardship of geoscience data benchmarks. The subsurface realm is swimming with disparate data types across a dizzying range of length scales, and indeed community efforts may be the only way to prove machine-learning's usefulness and keep the hype in check. A place where we can take geoscience data, and put it online in a ready-to-use for for machine learning. It's not only about being open, online and accessible. Good datasets, like good software, need to be hosted by individuals, properly documented, enriched with tutorials and getting-started guides, not to mention properly funded. Website.

Petrodict

Mark Mlella (Univ. Louisiana, Lafayette), Matthew Bauer (Anchutz Exploration), Charley Le (Shell), Thomas Nguyen (Devon). Petrodict is a machine-learning driven, cloud-based lithology prediction tool that takes petrophysics measurements (well logs) and gives back lithology. Users upload a triple combo log to the app, and the app returns that same log with with volumetric fractions for it's various lithologic or mineralogical constituents. For training, the team selected several dozen wells that had elemental capture spectroscopy (ECS) logs – a premium tool that is run only in a small fraction of wells – as well as triple combo measurements to build a model for predicting lithology. Repo.

Seismizor

George Hinkel, Vivek Patel, and Alex Waumann (all from University of Texas at Arlington). Earthquakes are hard. This team of computer science undergraduate students drove in from Texas to spend their weekend with all the other geo-enthusiasts. What problem in subsurface oil and gas did they identify as being important, interesting, and worthy of their relatively unvested attention? They took on the problem of induced seismicity. To test whether machine learning and analytics can be used to predict the likelihood that injected waste water from fracking will cause an earthquake like the ones that have been making news in Oklahoma. The majority of this team's time was spent doing what all good scientists do –understanding the physical system they were trying to investigate – unabashedly pulling a number of the more geomechanically inclined hackers from neighbouring teams and peppering them with questions. Induced seismicity is indeed a complex phenomenon, but George's realization that, "we massively overestimated the availability of data", struck a chord, I think, with the judges and the audience. Another systemic problem. The dynamic earth – incredible in its complexity and forces – coupled with the fascinating and politically charged technologies we use for drilling and fracking might be one of the hardest problems for machine learning to attack in the subsurface.

AAPG next year is in San Antonio. If it runs, the hackathon will be 18–19 of May. Mark your calendar and stay tuned!

May 25, 2018

Productive chaos

May 25, 2018/ Matt Hall

Wednesday was a good day.

Over 150 participants came to Room 251 for all or part of the first 'unsession' at the AAPG Annual Conference and Exhibition in Salt Lake City. I was one of the hosts of the event, and emceed the afternoon.

In a nutshell, it was awesome. I have facilitated unsessions before, but this event was on a new scale. Twelve tables of 8–10 seats — covered in sticky notes, stickers, coloured pens, and large sheets of paper — quickly filled up. Together, we burned about 10 person-weeks of human productivity, raising the temperature in the room by several degrees in the process.

Diversity means good conversation

On the way in, people self-identified as mostly software (blue name tags) or mostly soft rocks (red), as a non-serious way to get a handle on how many data scientists we had vs how many people are focused on the rocks themselves — without, I hope, any kind of value judgment. The ratio was about 1:2.

As people continued to drift in, we counted people identifying with various categories, to get a very rough idea of who was in the room. The results are shown here. In addition, I counted 24 women present at the start. Part of the point here is to introduce participants to each other, but there's another purpose too. AAPG, like many scientific organizations, is grappling with diversity today. Like others, it needs to do much better. A small part of the solution is, I think, to name it and measure how we're doing at every opportunity. It's one way to pay more attention.

Harder to capture is the profound level of job diversity. People responsible for billion-dollar budgets sat with graduate students, AAPG medal winners with SEC executives. We even had a venture capitalist and a physician.

Look at all these lovely people:

Tangible and intangible output

At the start of the session, I told the room I wanted to fill the walls with things we made — with data. We easily achieved this, producing a survey of the skills geoscientists will need in the future, hundreds of high-value machine learning tasks in geoscience, a ranked list of the most interesting of these, and even some problem analysis of some of them. None of this was definitive, but I hope it will provide grist for the mill of future conversations about machine learning in geoscience.

As well as these tangible products, each person in the room walked away with new connections and new ideas — about machine learning, about collaboration, and about what scientific meetings can be like.

Acknowledgments

A lot of people contributed to making this event happen.

My unsession co-chairs, Brendon Hall and Yan Zaretskiy of Enthought — spent several hours on the phone with me over the last few weeks, shaping the content and flow of an event that was a bit, er, fuzzy.

We seeded the tables with some of the Software Underground crowd who were in town for the hackathon and AAPG. This ensures that there's no failure case: twelve people are definitely coming. And in the unlikely event that 100 people come, there are twelve allies to manage some of the chaos. Heartfelt thanks to the table hosts:

Didi Ooi of the University of Bristol
Graham Ganssle of Expero
Lisa Stright of Colorado State University
Thomas Martin of Colorado School of Mines
Tom Creech of ExxonMobil
David Holmes of Dell EMC
Steve Purves of Euclidity
Diego Castaneda of Agile
Evan Bianco of Agile

Jenny Cole of SEG came along to observe the session and I appreciated her enthusiastic help as it became clear we were in for more than the usual amount of entropy in the room. Theresa Curry of AAPG did an amazing job getting the venue set up, providing refreshments, and ensuring the photographers were there to capture some of the action. The ACE 2018 organizing committee, especially Zane Jobe and Lauren Birgenheier, did their part by agreeing to supprt including such a weird-sounding thing in the program.

Finally, thank you to the 100+ scientists that came to the event, not knowing at all what to expect. It was a privilege to receive your enthusiastic participation and thoughtful contributions. Let's do it again some time!

We will digitize the ideas and products of the unsession over the coming weeks. They will be released under an open license. Watch this space for updates.

If you're interested in the methodology we use for these events, check out Proceedings of an unsession in CSEG Recorder, November 2013. If you'd like help running an event like this, get in touch.

May 21, 2018

Woo yeah perfect: hacking in Salt Lake City

May 21, 2018/ Evan Bianco

Thirty geoscientist-coders swarmed into Salt Lake City this past weekend to hack at Church & State, a co-working space in a converted church. There, we spent two days appealing to the almighty power of machine learning.

Nine teams worked on the usual rich variety of projects around the theme. Projects included AIs that pick unconformities, natural language processing to describe stratigraphy, and designing an open data platform in service of machine learning.

I'll do a run-down of the projects soon, but if you can't wait until then for my summary, you can watch the demos here; the first presentation starts at the 38 minute mark of the video. And you can check out some pictures from the event:

Pictures can say a lot but a few simple words, chosen at the right time, can speak volumes too. Shortly before we launched the demos, we asked the participants to choose words that best described how they were feeling. Here's what we got:

Each participant was able to submit three responses, and although we aren't able to tell who said what, we were able to scrape the data and look at each person's chosen triplet of words. A couple of noteworthy ones were: educated, naptime, inspired and the expressive woo, yeah, perfect. But my personal favorite, by far, has to be the combination of: dead, defeated, inspired.

The creative process can be a rollercoaster of emotions. It's not easy. It's not always comfortable. Things don't always work out. But that's entirely ok. Indeed, facing up to this discomfort, as individuals and as organizations, is a necesary step in the path to digital transformation.

Enough Zen! To all the participants who put in the hard work this weekend, and to our wonderful sponsors who brought all kinds of support, I thank you and I salute you.

May 10, 2018

An invitation to start something

May 10, 2018/ Matt Hall

Most sessions at your average conference are about results — the conclusions and insights from completed research projects. But at AAPG this year, there's another kind of session, about beginnings. The 'Unsession' is back!

Machine Learning Unsession
Room 251 B/C, 1:15 pm, Wednesday 23 May

The topic is machine learning in geoscience. My hope is that there's a lot of emphasis on geological problems, especially in stratigraphy. But we don't know exactly where it will go, because the participants themselves will determine the topic and direction of the session.

Importantly, most of the session will not involve technical discussion. It's not a computational geology session. It's a session for everyone — we absolutely need input from anyone who's interested in how computers can help us do geoscience.

What to expect

Echoing our previous unconference-style sessions, here's the vibe my co-hosts (Brendon Hall and Yan Zaretskiy of Enthought) and I are going for:

Conferences are too one-way, too passive. We want more action, more tangible outcomes.
We want open, honest, inclusive conversations about our science, and our technical challenges. Bring your most courageous, opinionated, candid self. The stuff you’re scared to mention, or you’d normally only talk about over a beer? Bring that.
Listen with an open mind. The minute you think you’re right, you’ve checked out of the conversation.
Whoever shows up — they are the right people. (This is a rule of Open Space Technology.)
What happens is the only thing that could have happened. (This is a rule of Open Space Technology.)
There is no finish line; when it's over, it's over.
What we are doing is not definitive. It's just a thing that we're doing.

The session is an experiment. Failure is most definitely an option, just the least desirable one. Conversely, perfection is the least likely outcome.

If you're going to AAPG this year, I hope you'll come along to this conversation. Bring a friend!

Here's a reminder of the very first Unsession that Evan and I facilitated, way back in 2013. Argh, that's 5 years ago...

May 03, 2018

The geospatial sport

May 03, 2018/ Matt Hall

If you love studying maps or solving puzzles, and you love being outside, then orienteering — the thinking runner's sport — might be the sport you've been looking for.

There are many, many flavours of orienteering (on foot, on skis, in kayaks, etc), but here's how it generally works:

Competitors make their way to an event, perhaps on a weekday evening, maybe a weekend morning.
Several courses are offered, varying in length (usually 2 to 12 km) and difficulty (from walk-in-the-park to he's-still-not-back-call-search-and-rescue).
A course consists of about 20 or so 'controls', which must be visited in order. Visits are recorded on an electronic 'dibber' carried by the orienteer, or by shapes punched on a card.
Each person chooses a course , and is allotted a start time.
You can't see your course — or the map — until you start. You have 0 seconds to prepare.
You walk or run or ski or bike around the controls, at various speeds and in various (occasionally incorrect) directions.
After making it to the finish, everyone engages in at least 30 minutes of analysis and dissection of route choices and split times, while eating everything in sight.

The catch is that your navigation system is entirely analog: you are only allowed a paper map and an analog compass, plus a whistle for safety. The only digital components are the timing system and the map-making process — which starts with LiDAR and ends in a software package like OCAD or OOM.

Orienteering maps are especially awesome. They are usually made especially for the sport, typically at 1:5000 or 1:7500, with a 2.5 m or 5 m contour interval. Many small features are mapped, for example walls and fences, small pits and mounds, and even individual trees and boulders.

The sample orienteering map from the Open Orienteering Mapper software, licensed GNU GPL. White areas correspond to open, runnable (high velocity) woodland, with darker shades of green indicating slower running. Yellow areas are open. Olive green ar… — The sample orienteering map from the Open Orienteering Mapper software, licensed GNU GPL. White areas correspond to open, runnable (high velocity) woodland, with darker shades of green indicating slower running. Yellow areas are open. Olive green areas are out of bounds.

Other than the contours and paths, the most salient feature is usually the vegetation, which is always carefully mapped. Geophysicists will like this: the colours correspond more to the speed with which you can run than to the type of vegetation. Orienteering maps are velocity maps!

Here's part of another map, this one from Debert, Nova Scotia:

So, sporty cartophile friends, I urge you to get out and give it a try. My family loves it because it's something we can do together — we all get to compete on our own terms, with our own peers, and there's a course for everyone. I'm coming up on 26 years in the sport, and every event is still a new adventure!

World Orienteering Day — really a whole week — is in the last week in May. It's a great time to give orienteering a try. There are events all over the world, but especially in Europe. If you can't find one near you, track down your national organization and check for events near you.

Blog