"The Ugly:
We had an issue with our Lustre cluster. Ours is laid out with a metadata server and two object storage targets/servers (OSTs, or basically, file servers). You can think of the metadata server (MDS) as being like the MBR or index of a hard drive: the MDS lists where every file is on the OSTs and contains metadata, and if it gets wiped out, recovery of data is nigh-impossible. As it turns out, when the new OST was brought online, it caused a race in the MDS, where it thought the new OST was the old one and that all data was missing... and started wiping out the data. By the time everything was over, the cluster was up and running normally, but all the metadata was gone. Because of that, all of the files were inaccessible."
Further down:
"We apologize for the disruption this data loss has caused. We had picked Lustre specifically because it was designed to be very reliable and provide high throughput to all of our distribution servers. Despite this setback, Goo.im will continue to operate, and we will work to restore what we can."