GOES 17 Outage
As I’ve previously posted, I’ve some automated watchdog code which looks for when something isn’t quite right with the capture and processing of satellite images. This results in a few emails a month at the most, so whilst nothing unusual, the emails still get me investigating what is happening.
The first email showed that the last relayed image from GOES 16 had been received over three hours earlier, when there is normally one every hour. However I also noticed that the situation was similar for the Himawari 8 relayed images and for GOES 17, as they were just a few minutes short of the three-hour lateness threshold.
Sure enough, there was an email ten minutes later which confirmed that the threshold was passed for both of these satellites too.
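For anyone wanting to build something similar, the kind of lateness check described above can be sketched in a few lines of Python. This is only an illustration and not the actual watchdog code: the function name `late_feeds`, the feed labels and the timestamps are all made up for the example, and the only figures taken from this post are the hourly relay and the three-hour threshold.

```python
from datetime import datetime, timedelta, timezone

# Three-hour lateness threshold, as described in the post.
LATE_THRESHOLD = timedelta(hours=3)


def late_feeds(last_received, now, threshold=LATE_THRESHOLD):
    """Return the sorted names of feeds whose newest image is older than the threshold.

    last_received maps a feed name to the time its latest image arrived.
    (Hypothetical helper; the real watchdog may work quite differently.)
    """
    return sorted(
        name for name, seen in last_received.items() if now - seen > threshold
    )


# Made-up timestamps: one feed just over the threshold, two just under it.
now = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
last_received = {
    "GOES 16": now - timedelta(hours=3, minutes=5),
    "GOES 17": now - timedelta(hours=2, minutes=55),
    "Himawari 8": now - timedelta(hours=2, minutes=55),
}

first_alert = late_feeds(last_received, now)
second_alert = late_feeds(last_received, now + timedelta(minutes=10))
print(first_alert)
print(second_alert)
```

With these example values the first run flags only one feed, and a run ten minutes later flags all three, which is the same pattern as the two emails.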
As this was a convenient time to investigate, i.e. not 3am, I kicked off my usual search for an issue. The first step was to check whether any images at all had been received from these satellites, as the watchdog only looks at a single image type which is relayed hourly. The last image from any of these satellites was received at 5:37pm, which aligns with the watchdog timing.
Having learnt from GK-2A issues, it is always worth having a quick look at the last image received, however this looked totally normal. Beautiful as ever, it was showing exactly what I’d expect with the darkness just starting to arrive over New Zealand and no obvious issues.
To see if the problem was that I’d lost connectivity with Zl1LAC’s (Leith) receiver, the simple way was to reboot the VM I use to fetch the feed for decoding. But this didn’t help, so I got in touch with Leith via Discord and luckily he was able to help investigate. His first check was to ensure that he was getting a good signal, and this rapidly highlighted the issue: the vit(avg) was over 2200, and it needs to be under 400 at worst to get data that can be decoded. This also showed that the Pi, the LNA and the SDR were all working well enough to produce output.
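As a rough illustration of that check, the snippet below compares a vit(avg) reading against the 400 limit mentioned above. The function name `signal_decodable` is hypothetical, and how the reading is obtained from the receiver is left out entirely, so treat this as a sketch and not as part of any real tool.

```python
# Upper limit for vit(avg), taken from the figure quoted in the post.
MAX_DECODABLE_VIT = 400


def signal_decodable(vit_avg, limit=MAX_DECODABLE_VIT):
    """Return True when the vit(avg) reading is low enough to expect decodable data."""
    return vit_avg < limit


# 2200 is the reading seen during the outage; 350 is an invented healthy value.
for reading in (2200, 350):
    status = "decodable" if signal_decodable(reading) else "too high to decode"
    print(f"vit(avg) {reading}: {status}")
```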
Usually seeing a vit(avg) value so high means that the antenna is pointing well away from where the satellite is, although this could be just a few degrees off, so it isn’t too unusual if it gets really windy as has happened before. But it was a calm, dry evening. So this wasn’t the problem either.
Of course it could have been a problem with the LNA or SDR, working well enough to give the output seen but not well enough to capture a signal. Equally it could have been an issue with the connectors or cable to the antenna. However that wasn’t too easy to check in the dark.
My GK-2A experience has taught me that satellites can have issues too, which has now happened a couple of times. So I checked the GOES website, which has live images, however their last image had a similar time stamp, and they should be the experts at receiving their own satellite. They are helped by having a budget significantly higher than mine. Of course they might have had an issue with their antennae too, but this was really pointing at it being a satellite problem.
Leith found a Twitter post which helped to confirm this.
However their status page wasn’t updating too quickly as it still showed a green status!
Clearly the watchdog and our investigation were working faster than the GOES team!
Working off a suggestion to look at the frequency (1.693GHz) used by the CDA, which is the command up link / down link which is used to control the satellite, Leith looked for a weak signal using SDR#.
By confirming that the CDA signal could be seen we knew that the satellite was almost certainly in one piece and not tumbling after being hit by debris or some explosion from a battery or rocket fuel. This is always a good sign!
It should be possible to decode the CDA signal, however it would need a much higher signal-to-noise ratio than we had access to. However Aang23 on Discord found someone who had been looking at the CDA data, which contained only some status signals and nothing else. This suggested a significant issue, with the satellite almost certainly being in safe mode (as happened recently with the Hubble Space Telescope).
Later creinemann on Discord was able to get in touch by phone with NOAA operations, who operate GOES 17. He got the confirmation that the satellite was in safe mode with an unknown problem.
Later an outage notice was posted to the NOAA website.
GOES 15, which is the satellite that GOES 17 replaced, is still in orbit and is often returned to service during the northern hemisphere hurricane season and this was due to happen in a few months. So there was the option of bringing this back online earlier, but if GOES 17 was still not available it would mean that a lot of dish antennae would need to be re-pointed which isn’t always straightforward.
The next update was that recovery operations had started at 14:40 UTC (3:40pm NZ on the Friday), with the issue being described as – “there is no broadcast due to its current pointing/alignment position”. However this could be just a symptom and not the real issue.
I checked the GOES status page at 9pm and it was showing a green status, however yet again this was slightly optimistic. As is often the case, Twitter was useful for finding updates.
I’m not quite sure what is meant by “…aggressively troubleshooting..” but I suspect it probably has all the engineers working their socks off, fueled by coffee, energy drinks and pizza which is all too common worldwide!
The next update was that they had got the satellite out of safe mode and instruments were coming online, starting with the magnetometer, which was functioning with its data being downloaded successfully. Another key step, although this data isn’t downloaded via the 1.6941GHz signal which can be received using a WiFi grid antenna.
At 1:32am Saturday morning Leith posted an image showing he was getting signal lock.
Just after that creinemann posted:
GOES-17’s HRIT and GRB broadcast has been activated today at 1326 UTC. All ABI data is coming through and all appears nominal at this point in time. There is still recovery work ongoing onboard the satellite, but NESDIS hopes to have all returned back to operations by 17 UTC today. – NOAA (as reported by creinemann)
Confirming there had been some positioning / orientation issue, vit(avg) values were up compared to normal for at least two people who could monitor the signal. But it was good enough.
Then the best update was seeing my first GOES 17 image just after 2am, with GOES 16 and Himawari 8 relayed images soon after. GOES 17 lives to fight another day.
The latest update from NOAA is:
GOES 17 NEWS RELEASE July 23, 2021
NOAA’s GOES-17 is out of safe-hold mode and engineers expect its six instruments to return to normal operations soon. The probable cause of yesterday’s anomaly appears to be a memory-bit error in the spacecraft computer. The engineering team says the computer has been responding correctly to commands. Earlier this morning, the Advanced Baseline Imager and Magnetometer were restored and data are flowing. The remaining four instruments are expected to come online later this morning. The team expects some minor, short-term data quality issues while the instruments are being recalibrated, but GOES-17 is on track for a full recovery with no lasting effects to the satellite. NOAA will provide an update as new information becomes available.