The Origin and Evolution of Old Weather

See what is happening on the project, our discoveries, and what others have discovered
Michael
Posts: 4457
Joined: Sat Mar 14, 2020 7:09 pm
Location: Victoria, B.C. Canada

Post by Michael »

In early February, a couple of us were asked to sit in on a Zoom meeting to evaluate a commercial program being developed to read the weather data from a log book using Optical Character Recognition (OCR). Among others, the meeting included a couple of people from the company, Philip Brohan, Gil Compo, and myself. During the meeting, Philip told the people from the company that Old Weather was the only Zooniverse project that was using a data and processing system designed, maintained and run by the volunteers! This is how that came about.

In the beginning
About 15 years ago, a group of astronomers had a problem. There were hundreds of thousands of digital images of the night sky and only a few grad students to search through them to find and categorize galaxies. The astronomers had an idea: put the images online and let thousands of volunteers, i.e. Citizen Scientists, look at the images and collect the data. This project was called The Galaxy Zoo, and it was a tremendous success.

Philip Brohan had a problem. He is a climate scientist in England, and he wanted historic weather data from the oceans to feed into the various climate models to improve their analysis of historic weather. He contacted the Galaxy Zoo people about having Citizen Scientists extract weather data from Royal Navy logbooks.

The two groups designed such a system, called Old Weather, and the two projects, Galaxy Zoo and Old Weather, formed the Zooniverse.

Naval historian Gordon Smith joined Old Weather in April 2010. His job was to broaden the scope of the project: to demonstrate the value of the ships' logbooks as historical records, not just sources of pressure and temperature observations.

The success of Old Weather as a history project (naval-history.net) has also helped the work in climate science. Volunteers produce edited ship histories, based on Old Weather transcriptions, and the publication of these histories is announced on this forum and on social media. Expanding the project in this way has been vital in sustaining the public interest that has kept Old Weather going so long.

Old Weather: a short history

Old Weather has had several phases:
  • Phase I: Royal Navy logs ~1880 to ~1930;
  • Phase II: Royal Navy gunboats operating in China in the 1930s;
  • Phase III: U.S. Navy and Coast Guard 1844 to 1930s;
  • Phase IV: U.S. Whaling ships; and,
  • Phase V: It was so badly set up that we never did much with it, so it doesn't matter.
  • OW Arctic, now called OW Federal Ships: U.S. Navy and Coast Guard 1867 - 1955.
The collection of weather data used the original design for Phases I, II and III, and we had a bulletin board, the Old Weather Forum. The volunteers loved the Old Weather project for three things:
  1. Collecting valuable data for climate research;
  2. Connecting with people around the world through the Forum; and,
  3. Having a sense of connection to our ships, the people on them and the history.
The volunteers did not like:
  1. The original data entry system, which was slow, awkward and inflexible.
Problems with Phase V brought about the transition from the Zooniverse to our own system:
  1. The new data entry system gave the transcriber random pages for a ship, so the sense of continuity was lost.
  2. The old Forum was abandoned by the Zooniverse, and they moved to the dreaded Talk program.
  3. The last straw was the discovery that the Zooniverse's new program had a bug that prevented anyone from doing more than five pages for one ship-year and the news that the Zooniverse could not fix it.
The science team challenged us to come up with a solution. We did.

A new way of doing Old Weather

About two-thirds of the way through Phase III, Stuart thought it would be better to enter the data straight into a spreadsheet, with the logbook image as a background. It was an intriguing idea, and there were many advantages. I set up a test using Excel, and it was immediately obvious that such a system was faster, easier and more efficient than using the old system.

Bob, a very active transcriber until his job started taking all his time, took on the project. We had to use a free system, so he looked at OpenOffice and LibreOffice, and he chose the latter. He designed and built the first spreadsheet and a couple of us tested it. We made suggestions, and Bob implemented them and worked out all the kinks.

Randi looked around the internet and found a very good bulletin board for our new forum. Bob set up the hosting for the forum and even paid for a five-year subscription. Randi, Gordon and a couple of others set up the new forum, and we switched over.

From data collection to data processing

With the old system, we never saw the collected data. With Bob's new system, the data we collected were available on the internet in a tagged text format. For the first time, we could download these data and process them ourselves.

During the entire OW project, we collected place names and their locations, so we could see where the ships were. Matteo in Italy was, and still is, in charge of this project. He has been busy collating the data, and maintaining the lists in the forum. He also set up our online tools for finding places and maps, and for calculating positions and voyages.

In order to make the job of finding new places easier, I built a small program to calculate a ship's position given its starting position and a few hours of courses and bearings. Over the years, as the needs changed, it became more sophisticated, to the point where I could calculate the hourly positions for an entire year.
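A minimal sketch of this kind of voyage calculation follows. It is not the actual program described above. It assumes each hour is logged as a true course in degrees plus a speed in knots (classic dead reckoning), and the names advance and hourly_positions are made up for illustration.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def advance(lat, lon, course_deg, distance_nm):
    """Position reached after sailing distance_nm on an initial true course.

    Uses the spherical destination-point formula; over a one-hour run this is
    practically the same as holding a constant compass course.
    """
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(course_deg)
    delta = distance_nm / EARTH_RADIUS_NM  # angular distance in radians

    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))

    lon2 = (math.degrees(lam2) + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return math.degrees(phi2), lon2


def hourly_positions(start_lat, start_lon, legs):
    """Return the track [(lat, lon), ...], starting position first.

    legs holds one (true_course_deg, speed_knots) pair per hour (an assumed
    input layout), so each leg covers speed_knots nautical miles.
    """
    track = [(start_lat, start_lon)]
    for course_deg, speed_knots in legs:
        lat, lon = track[-1]
        track.append(advance(lat, lon, course_deg, speed_knots))
    return track
```

For example, hourly_positions(50.0, -5.0, [(180, 10)] * 6) steps a ship due south at 10 knots for six hours and returns seven positions, the first being the starting point.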

Now that we had our collected data online, I could decode the collected data, put them into a spreadsheet and run my voyage calculator. I then put the hourly positions back into the spreadsheet.

OW Arctic / OW Federal Ships

Kevin Wood, who was working on Arctic ice projects for the National Oceanic and Atmospheric Administration (NOAA), was very interested in the OW project, and he proposed and took charge of OW Whaling and OW Arctic. His interest was in climate change and Arctic sea ice, and he wanted to know how good the climate models were at predicting where sea ice was historically, and how the presence of sea ice had changed over the years.

As he was setting up OW Whaling and OW Arctic, and before Phase III was finished, he asked me to do two things:
  • Calculate a few voyages using Phase III data; and,
  • Collect some ice observations from a few U.S. Coast Guard ships and plot the position of the ship at those times.
Kevin sent me the Phase III data for a dozen ships, and I calculated their hourly positions. The question Kevin and I had was this: "Do the hourly positions make the weather data more useful?" The original climate models had a very coarse grid, perhaps on the order of 200 km. As computers got more powerful, the grids became finer, with the spacing halving about every seven years or so. Would the better positions be more useful with the newer models? The answer was, "Yes!" And that's why we now calculate the hourly positions for every voyage.

Kevin's other question was: how useful are the ice data from the U.S. Coast Guard ships? I collected ice data from six or eight ships and Kevin tested them against the models. He discovered that those data were extremely important. The models were already very good at predicting the presence of ice, but having the actual data was helping improve them even more.

So, with these results in hand, Kevin proposed that the logbooks from the Coast Guard and other ships be scanned for OW Arctic, and that logs from whaling ships be scanned and used for OW Whaling. The project was approved and funded.

By now, Phase III was ending and our spreadsheet method for data collection was operational.

New questions, new projects

Now that we had our new system set up, we were able to ask and answer new questions. I was curious to see how the data from the different ships compared with each other. It was easy to do: get the data for all the ships for a given year, and compare the data whenever any two ships were within five miles of each other. I was doing this just out of curiosity, but Kevin became interested. He has given the data to an expert in statistics, who is using them to better correlate the readings between the ships, making the data even better.
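The comparison described above can be sketched roughly as follows. This is only an illustration, not the code actually used. It assumes "five miles" means nautical miles, that each observation is a dictionary with hypothetical keys ship, time, lat and lon, and that two observations count as simultaneous when their time values are equal.

```python
import math
from collections import defaultdict
from itertools import combinations

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def close_pairs(observations, max_nm=5.0):
    """Return (obs_a, obs_b) pairs from different ships, taken at the same
    time and no more than max_nm apart."""
    by_time = defaultdict(list)
    for obs in observations:
        by_time[obs["time"]].append(obs)

    pairs = []
    for time in sorted(by_time):
        for a, b in combinations(by_time[time], 2):
            if a["ship"] == b["ship"]:
                continue
            if distance_nm(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_nm:
                pairs.append((a, b))
    return pairs
```

Each pair returned carries both ships' full records, so the pressures, temperatures and winds for that hour can then be compared side by side.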

In order to make the comparisons, I had to "clean" the data. Pressures like 3030 had to be converted to 30.30. Entries that were just ditto marks had to be converted to their actual values. Values that were out of range, like a temperature of 667, had to be fixed by checking against the value in the logs, etc. Wind speeds were converted from Beaufort force to knots, and visibilities were converted from code values to nautical miles.
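Three of these cleaning steps can be sketched in a few lines. This is a hedged illustration rather than the real cleaning code. It assumes pressures are in inches of mercury, that ditto entries appear as one of the marks in the hypothetical DITTO_MARKS set, and that the midpoint of each standard Beaufort range is an acceptable speed in knots (the project's own conversion table may differ).

```python
# Marks treated as "same as the line above" (an assumed, illustrative set).
DITTO_MARKS = {'"', "''", "〃", "do", "ditto"}

# Standard Beaufort scale wind speed ranges in knots, indexed by force.
# Force 0 ("less than 1 knot") is treated as calm; force 12 has no upper limit.
BEAUFORT_KNOTS = [
    (0, 0), (1, 3), (4, 6), (7, 10), (11, 16), (17, 21), (22, 27),
    (28, 33), (34, 40), (41, 47), (48, 55), (56, 63), (64, None),
]


def fix_pressure(raw):
    """Turn a pressure typed without its decimal point ('3030') into 30.30."""
    text = str(raw).strip()
    if text.isdigit() and len(text) == 4:
        return int(text) / 100.0
    return float(text)


def fill_dittos(values):
    """Replace ditto marks with the most recent real value in the column."""
    filled = []
    previous = None  # stays None if the column starts with a ditto mark
    for value in values:
        if value.strip().lower() in DITTO_MARKS:
            filled.append(previous)
        else:
            filled.append(value)
            previous = value
    return filled


def beaufort_to_knots(force):
    """Representative wind speed in knots for a Beaufort force of 0 to 12."""
    if not 0 <= force <= 12:
        raise ValueError(f"Beaufort force out of range: {force}")
    low, high = BEAUFORT_KNOTS[force]
    if high is None:
        return float(low)
    return (low + high) / 2.0
```

Out-of-range values such as a temperature of 667 are deliberately not "fixed" here, since those need a look at the logbook page itself.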

I went through all the data, cleaning and fixing what I could. I also added verification checks to the spreadsheet to catch any errors before they were saved. You would be interested to know that there are VERY FEW errors in our transcribed data. So few, in fact, that when we demonstrated the spreadsheet method to Philip and he saw the first results, he was so impressed that he relaxed the three-person rule to just one. For Phases I, II and III, each log book was transcribed by three separate people. This was done to weed out transcription errors. With our system, only one person is needed per log!
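To give a feel for what a verification check of this kind looks like, here is a small sketch of a range check run on one row before saving. The real checks live in the spreadsheet and are not shown in this post, so the field names and the limits in LIMITS are purely hypothetical.

```python
# Hypothetical plausible ranges; the spreadsheet's real limits may differ.
LIMITS = {
    "pressure": (27.0, 32.0),    # inches of mercury
    "air_temp": (-60.0, 130.0),  # degrees Fahrenheit
    "wind_force": (0, 12),       # Beaufort force
}


def check_row(row):
    """Return a list of problems found in one transcribed row (empty if none)."""
    problems = []
    for field, (low, high) in LIMITS.items():
        value = row.get(field)
        if value is None:
            continue  # blank cells are allowed
        if not low <= value <= high:
            problems.append(f"{field}={value} outside {low}..{high}")
    return problems
```

A row with a temperature of 667 would come back with one problem listed, which is the cue to go back and look at the logbook image.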

Yangtze River Floods. Gil Compo and other researchers were interested in weather data from the Yangtze River area in China for 1930-31. This was the year of the most severe floods in history, and was probably the most destructive weather event in the world. We transcribed the data from three U.S. gunboats for this project.

Converting wind directions from magnetic to true. Two days ago, Gil wondered if having magnetic rather than true wind directions would make a difference in the models. He asked me if I could do that conversion. I did the conversion for three ships: an "old" Bear, a "new" Northwind and a voyage from Ashuelot, which sailed most of the way around the world. I chose that voyage because it would have the greatest range of magnetic declinations. Gil did the test and asked me to convert all the wind directions, which I have just done.
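The arithmetic behind the conversion is simple: true direction equals magnetic direction plus the local magnetic declination, counting easterly declination as positive and westerly as negative. The sketch below assumes the declination for the ship's position and date is already known (looking it up needs a geomagnetic model, which is not shown) and that wind directions may be written as 16-point compass names. The function names are illustrative only.

```python
COMPASS_POINTS = [
    "N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
    "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW",
]


def point_to_degrees(point):
    """Convert a 16-point compass name such as 'NNE' to degrees (22.5)."""
    return COMPASS_POINTS.index(point.strip().upper()) * 22.5


def magnetic_to_true(magnetic_deg, declination_deg):
    """Convert a magnetic direction to true.

    declination_deg is positive for easterly declination and negative for
    westerly. The result is wrapped into the range 0 to 360.
    """
    return (magnetic_deg + declination_deg) % 360.0
```

For instance, a wind logged as NW by a magnetic compass where the declination is 15 degrees east works out as magnetic_to_true(point_to_degrees("NW"), 15), or 330 degrees true.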

Conclusion

The OW volunteers have come up with a system that is:
  1. Extremely efficient;
  2. Flexible; and,
  3. Able to evolve to meet the needs of the transcribers and the science team.
Bob's initial spreadsheet has stood the test of time, and it has had almost 400 modifications in order to meet the needs of the transcribers. In Phases I to V our requests for changes were met with, "Sorry, we can't do that. The original designer is no longer with us."

Because Bob no longer has the time and Craig has passed, Gordon, Chris (aka Hanibal) and I have all made any necessary or requested changes.

Randi, Joan, Gordon and Caro set up the new forum and keep it clean, organized and easy to use.

Matteo keeps the location data up to date, and he maintains his OW online tools.

When Kevin died, much, much too soon, Gil and Lawrence both reassured us that our work is very valuable and is being used in more ways than we imagined. I know that they are using each file as soon as it is uploaded, because I had a question about Kearsarge 1875 the day after I sent it up.

We should all be proud of what we have done. The science team certainly appreciates it.

PS
I wasn't too impressed with the OCR demonstration. I suggested that they wouldn't get too many long-term volunteers to use it if it wasn't flexible, easy to use and able to be modified by the user. For example, some people like to use a mouse, while others prefer a keyboard. I sent them a spreadsheet to give them an idea of all the options we give our users: Enhance images as required? Yes. Magnify the image? Yes. Change font colours? Yes. And so on.
AvastMH
Posts: 2675
Joined: Mon Mar 16, 2020 7:48 pm
Location: Oxford, England

Re: The Origin and Evolution of Old Weather

Post by AvastMH »

What a wonderful write up. Thank you Michael. I could spend a lot of words in reaction, but 'proud' and 'glad' are the only two required. Proud to be part of this immense project, and glad to have a share in it with all the dedicated, and talented, friends of the forum.
Hanibal94
Posts: 1017
Joined: Thu Jun 11, 2020 6:05 pm
Location: Leipzig, Germany

Re: The Origin and Evolution of Old Weather

Post by Hanibal94 »

I'll second that, Joan - all of it. This is one of the best things I have ever read on the OW forum EVER!
pommystuart
Posts: 1570
Joined: Mon May 18, 2020 12:48 am
Location: Cooranbong, NSW, Australia.

Re: The Origin and Evolution of Old Weather

Post by pommystuart »

Nice one Michael. And what do you do in your spare time?
Michael
Posts: 4457
Joined: Sat Mar 14, 2020 7:09 pm
Location: Victoria, B.C. Canada

Re: The Origin and Evolution of Old Weather

Post by Michael »

Bought you beer in Sydney after the Bondi Beach Art Walk.

pommystuart
Posts: 1570
Joined: Mon May 18, 2020 12:48 am
Location: Cooranbong, NSW, Australia.

Re: The Origin and Evolution of Old Weather

Post by pommystuart »

Nice find. (looks more like chips to me, but I do remember a beer somewhere along the line)
Must do that again.
Michael
Posts: 4457
Joined: Sat Mar 14, 2020 7:09 pm
Location: Victoria, B.C. Canada

Re: The Origin and Evolution of Old Weather

Post by Michael »

Part of an email I got from the Science Team this morning...
We (the science team) have now entered into the second phase of the "20CRv3 Wind Data Assimilation Project", which will involve primarily utilizing and examining the wind direction observations and the wind speed observations from the OldWeather ships, as well as making comparisons of the wind direction observations and the wind speed observations from the OldWeather ships with the wind direction observations and the wind speed observations from the 20CRv3 (the 20th Century Reanalysis, Version 3) for different years. For the second phase of this particular research project, we also will be creating another (second) set of IMMA data files from the OldWeather "WX.tsv" files that contain both wind direction observations and wind speed observations, so we will be converting the meteorological data from TSV file format to IMMA (International Meteorological Maritime Archive) file format, and all of the created IMMA data files will have both the wind direction data and the wind speed data included in them.