Summary/Abstract:
The card failed because of a write abort caused by an unstable power supply in the user's host.
Failure Remarks:
Upon return, the unit was visually inspected with no obvious external abnormalities found.
At the application level, the unit was confirmed to be unrecognized by a standard host. In addition, the unit failed to produce valid Card Identification (CID) and Card Specific Data (CSD) register data. With this failure, it would also fail recognition on all host devices. The unit was then tested in a diagnostic host, where it also failed to initialize. As a result, further testing at the card level was not possible.
Observing the signal activity between the memory and controller, it was found that during the initialization process the firmware attempted to read a location with corrupted data in the memory. Since the data in this location was necessary to complete the loading of the firmware into the controller for normal operation, the initialization process failed. As a result, the unit became unrecognized at the system level.
Using a memory tester which directly accesses the memory bus, the failing pages identified above were examined. Here, the cell voltage distributions of the corrupted areas were confirmed to be under-programmed (many bits did not reach their expected programming state). This is the typical indication of power loss during an active write operation. As a result, when reading the data in this incompletely programmed area the operation would fail due to Uncorrectable ECC (UECC).
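The relationship between under-programmed bits and an uncorrectable read can be illustrated with a minimal sketch. It assumes a simplified single-bit-per-cell model in which an erased cell reads as 1 and a programmed cell reads as 0, and it assumes the tester knows the expected data pattern. The function names and the `correctable_bits` parameter are hypothetical and do not describe the actual tester or the card's ECC engine.

```python
def under_programmed_bits(expected: bytes, actual: bytes) -> int:
    """Count bits expected as 0 (programmed) that read back as 1 (erased)."""
    if len(expected) != len(actual):
        raise ValueError("buffers must have the same length")
    # ~e selects the bits that should be programmed (0 in expected);
    # masking with the read-back byte keeps those that still read as 1.
    return sum(bin(~e & a & 0xFF).count("1") for e, a in zip(expected, actual))


def is_uncorrectable(expected: bytes, actual: bytes, correctable_bits: int) -> bool:
    """Hypothetical UECC check: more failing bits than the ECC can correct."""
    return under_programmed_bits(expected, actual) > correctable_bits


# Example: two bits in the first byte never reached the programmed state.
expected_page = bytes([0x00, 0xF0])
read_page = bytes([0x03, 0xF0])
failing = under_programmed_bits(expected_page, read_page)
```

A real ECC engine works on codewords without knowing the expected data, so this comparison only illustrates why a page interrupted mid-program can exceed the correction limit.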
The failing blocks were then tested with program and erase operations with no issue, resulting in normal cell voltage distributions. To ensure memory integrity, random memory blocks were programmed and erased with no problem found.
To further verify memory integrity, the unit was put through the production memory test, which resulted in a passing status. The unit was then put through the production system test, which also resulted in a passing status.
Following the production test, the unit was subjected to a stress test which executes write/read/compare operations on numerous random files and file types. The unit completed 24 hours of testing, cycling the entire capacity of the card without reproducing the failure or encountering any other issues. After the test, the card's error log and mapped-out data blocks were checked; no errors had been logged and no data blocks had been mapped out.
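A write/read/compare cycle of the kind described above might look like the following minimal sketch. It is not the stress tool used in this analysis. The function name, file naming scheme, and parameters are assumptions chosen for illustration, and the target directory is assumed to be a mount point on the card under test.

```python
import hashlib
import os
import random


def write_read_compare(directory: str, num_files: int = 8,
                       max_size: int = 64 * 1024, seed: int = 0) -> list:
    """Write random files, read them back, and return paths that mismatch."""
    rng = random.Random(seed)  # seeded so the file sizes are repeatable
    mismatches = []
    for index in range(num_files):
        size = rng.randint(1, max_size)
        payload = os.urandom(size)
        path = os.path.join(directory, f"stress_{index:04d}.bin")
        with open(path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push data to the device
        # Note: the read below may be served from the host page cache;
        # a real test would bypass or drop the cache before reading.
        with open(path, "rb") as handle:
            read_back = handle.read()
        if hashlib.sha256(read_back).digest() != hashlib.sha256(payload).digest():
            mismatches.append(path)
    return mismatches
```

An empty return list means every file compared correctly. Repeating the call until the total bytes written exceed the card capacity would approximate cycling the entire capacity.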
Root Cause Remarks:
Based on the analysis above, the unit failed as a result of a write abort caused by power loss or an unstable host power supply, an issue that originates in the external environment. A write abort occurs when the power supply to the card is interrupted during an active write process.