As technology advances, the use of Solid State Drives (SSDs) has become more popular for computers and laptops. However, even with their advantages over traditional hard disk drives (HDD), SSDs can eventually suffer from degradation over time. If you have an SSD or are planning to get one, it’s important to understand why they are subject to performance drops as they age.
SSDs degrade over time as data can only be written on them a specific number of times. Once their program and erase cycles are exceeded, SSDs become unreliable. The error checking and correction (ECC) mechanisms can’t keep up with the number of errors, resulting in drive failure.
One of the reasons why HDDs fail is because of mechanical parts. As time goes on and more data gets written to these drives, they experience wear and tear, ultimately resulting in failure.
On the other hand, SSDs read and write data electronically – so how can they experience wear and tear? To fully understand wear and tear in SSDs, we need to look at SSD architecture and how it stores and deletes data.
Every SSD in the market has the following components:
- NAND flash memory: the storage medium, which retains data even when the SSD has no power. It's made up of several flash dies, each containing multiple blocks that store large volumes of data.
- Pages: the smallest unit of data an SSD can write.
- SSD controllers: the "brains" of the SSD, which manage where and how data is stored.
- Cells: the basic building blocks of NAND flash. Each cell stores one or more bits of data, and the architecture determines how many bits a cell can hold at a time.
There are four types of NAND flash storage you should know:
- Single-Level Cell (SLC): One bit per cell
- Multi-Level Cell (MLC): Two bits per cell
- Triple-Level Cell (TLC): Three bits per cell
- Quad-Level Cell (QLC): Four bits per cell
Computers understand and communicate in binary language – 1s and 0s – represented physically by on and off states: 1 when on, 0 when off.

With this in mind, let's look closely at SLC NAND flash. Each SLC cell holds a single bit, so there are only two possible states – 1 and 0 – and the controller only has to check whether the cell's voltage sits above or below a single threshold.

Now look at MLC, which stores two bits in every cell: the number of possible states doubles from two to four. Similarly, TLC stores three bits per cell, which means there are eight possible voltage states.
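The relationship between bits per cell and voltage states is simple exponential arithmetic, which we can sketch in a few lines of Python (the architecture names and bit counts are the ones listed above):

```python
# The number of distinct voltage states a NAND cell must distinguish
# grows as 2^(bits per cell).
architectures = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}

for name, bits in architectures.items():
    states = 2 ** bits
    print(f"{name}: {bits} bit(s) per cell -> {states} voltage states")
```

Each extra bit doubles the number of levels the controller must tell apart, which is why denser cells demand more electrical precision.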
The best way to understand why SSDs degrade is to look at how they store data in the first place. When you copy a file to the drive, the controller breaks it down into multiple page-sized chunks, with page sizes typically ranging from 4 KB to 16 KB.
HDDs have no problem overwriting data in place: new information is simply written over an unused portion of the spinning platter. SSDs function differently, as they must erase existing data before they can store new data.

An SSD can't erase individual pages when it wants to write new data. It can only erase whole blocks, each of which is a group of many pages. This is true even if you only want to delete the data on a single page.
Every NAND flash storage has a limit to the number of times it can write and erase data, known as program and erase (P/E) cycles.
You can see the flaw in how SSDs handle data, especially when deleting information. Repeatedly erasing entire blocks causes significant wear over time. Worse, the deterioration isn't even across the storage device: some cells experience more wear and tear than others.
An SSD whose wear concentrates on a few blocks will fail early. So how can these storage devices keep the wear and tear even across all the cells? They do so with the help of wear-leveling algorithms. Depending on how they're designed, these algorithms move pages around, even across different blocks. This ensures that wear and tear are spread throughout the cells.
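A minimal wear-leveling policy can be sketched as "always write to the least-worn block" (real controllers use far more sophisticated static and dynamic schemes; the block names and counts here are made up):

```python
# Toy wear leveling: new writes always go to the block with the
# fewest program/erase cycles, so wear spreads evenly over time.
erase_counts = {"block_a": 12, "block_b": 3, "block_c": 7}

def pick_block(counts):
    # Choose the least-worn block as the next write target.
    return min(counts, key=counts.get)

for _ in range(5):
    target = pick_block(erase_counts)
    erase_counts[target] += 1  # pretend we erased and rewrote it

print(erase_counts)  # the gap between most- and least-worn blocks shrinks
```

Without this policy, the same hot block would absorb every write and exhaust its P/E cycles while others sat idle.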
You may have noticed a discrepancy between the advertised capacity of an SSD and what your operating system reports: a 500 GB SSD, for example, shows up as roughly 465 GB. Most of that gap is a units mismatch (manufacturers advertise decimal gigabytes while operating systems report binary gibibytes), but manufacturers also reserve hidden spare capacity the user never sees. This reserve is known as overprovisioning, and it plays a crucial role in keeping wear and tear uniform across the SSD.
If an SSD wants to delete bits from a single page, it must erase a whole block, because NAND flash memory only allows data to be written to a page when it is empty.
Fortunately, overprovisioning gives the drive spare blocks to work with. The controller can move valid data from a block's pages into this spare area, erase the block, and write the data back without any loss. This process is known as garbage collection, and it is essential to the proper functioning of SSDs.
The overprovisioned space allows the controller to perform various housekeeping functions that even out wear and tear across the cells. It also comes to the drive's rescue when a block can no longer store information reliably: the controller retires the failed block and maps in a working one from the overprovisioned space.
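The garbage-collection flow described above can be sketched as a toy model (page contents, block sizes, and the "STALE" marker are illustrative):

```python
# Toy garbage collection: still-valid pages are copied into a spare
# block drawn from the overprovisioned pool, then the dirty block is
# erased and returned to the free pool.
dirty_block = ["file_a", None, "file_b", "STALE"]  # STALE = marked deleted
spare_block = [None, None, None, None]             # from overprovisioned space

# 1. Copy only the still-valid pages into the spare block.
valid = [p for p in dirty_block if p not in (None, "STALE")]
for i, page in enumerate(valid):
    spare_block[i] = page

# 2. Erase the dirty block; it becomes free space again.
dirty_block = [None] * 4

print(spare_block)  # ['file_a', 'file_b', None, None]
```

Note that reclaiming one stale page cost copying two valid pages plus a full-block erase, which is where write amplification (discussed next) comes from.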
If the drive is retiring blocks frequently, it's a sign the SSD is nearing the end of its life and you should plan to replace it.
When you copy a file from your computer to the SSD, the drive doesn't necessarily write an amount of data equal to the size of the file.

In fact, the data physically written to the NAND can be greater than the file size. The ratio of internal NAND writes to host writes is known as the write amplification factor (WAF). Ideally, a WAF of 1 means every byte the host writes results in exactly one byte written to flash, keeping wear to a minimum.

In practice, WAF varies with the SSD, the workload, and the manufacturer's firmware. A higher WAF means the drive performs extra internal writes for every host write, so the cells wear out faster and the drive fails earlier than expected.
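The WAF calculation itself is a simple ratio; the sketch below uses made-up numbers to show how it's computed:

```python
# Write amplification factor = bytes physically written to NAND
# divided by bytes the host asked to write. 1.0 is the ideal.
def write_amplification(nand_bytes_written, host_bytes_written):
    return nand_bytes_written / host_bytes_written

# Illustrative example: the host writes 100 GB, but garbage collection
# and block-level erases cause 250 GB of internal NAND writes.
waf = write_amplification(250, 100)
print(waf)  # 2.5 -> the cells wear 2.5x faster than the host writes suggest
```

A drive reporting a WAF of 2.5 is burning through its P/E cycles two and a half times faster than the host workload alone would.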
Program/erase cycles refer to the number of times an SSD can write and erase data in a given block before it starts to degrade. As more data is written to an SSD, its blocks become increasingly worn out until one or more of them eventually fails, which can lead to data loss.

To avoid this, SSD users should understand the limitations of their particular drive and how often it should be replaced, or at least backed up regularly, to keep their data safe.
Here are the corresponding P/E cycles for different architectures:
- SLC – 100,000 P/E cycles
- MLC – 10,000 P/E cycles
- TLC – 3,000 P/E cycles
- QLC – 1,000 P/E cycles
Although SSDs are rated for this many P/E cycles, techniques like wear leveling, overprovisioning, and garbage collection help extend their lifespan.
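These ratings can be turned into a rough endurance estimate. A common back-of-the-envelope formula (assuming perfect wear leveling; the drive size and WAF below are illustrative) is total bytes written ≈ capacity × P/E cycles ÷ WAF:

```python
# Rough endurance estimate: how much host data can be written before
# the rated P/E cycles are exhausted, assuming even wear.
#   TBW ≈ capacity * P/E cycles / write amplification factor
def estimated_tbw(capacity_tb, pe_cycles, waf):
    return capacity_tb * pe_cycles / waf

# Illustrative 1 TB TLC drive rated for 3,000 P/E cycles with a WAF of 3.
print(estimated_tbw(1, 3000, 3))  # 1000.0 TB of host writes
```

Even with a pessimistic WAF, a typical consumer writing tens of gigabytes a day would take many years to exhaust such a drive, which is why P/E limits rarely bite before the drive is obsolete.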
MLC and TLC require significant electrical precision to store data because of the number of possible voltage states in each cell. For example, since TLC stores 3 bits per cell (eight voltage states), the controller needs roughly three sensing passes to read a cell, compared with one for SLC.
How does this come into play when it comes to reliability?
Over time, the device’s ability to produce precise electrical charges diminishes. You can think of this as how humans become physically weaker later in life.
Due to this lack of precision, the SSD's controller finds it harder to distinguish the various voltage states. For example, it can start to mistake one state for an adjacent one, reading back bits that were never written.
Generally, SSDs have error checking and correction (ECC) to deal with this problem, like how a pair of glasses can help people with bad eyesight see properly. These mechanisms help identify and correct errors, ensuring the storage device reads the data accurately.
ECC mechanisms can’t handle an infinite number of errors, and they may become overwhelmed over time. As a result, the drive will fail and switch to a read-only mode. This means you can no longer write data to the storage unit and will have to back up everything on the drive immediately.
At this stage, the data remains readable only for a limited time. NAND cells slowly leak charge, so once this retention period is over, the SSD's lifespan ends and the data is lost forever.
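The correct-until-overwhelmed behavior of ECC can be illustrated with a toy 3x repetition code. Real SSDs use far stronger codes (such as BCH or LDPC), so this is only a sketch of the principle:

```python
# Toy ECC via a 3x repetition code: each bit is stored three times and
# read back by majority vote. One flipped copy is corrected; two flips
# in the same triple overwhelm the scheme, just as too many worn cells
# overwhelm a real drive's (much stronger) ECC.
def encode(bits):
    return [b for bit in bits for b in (bit, bit, bit)]

def decode(stored):
    out = []
    for i in range(0, len(stored), 3):
        triple = stored[i:i + 3]
        out.append(1 if sum(triple) >= 2 else 0)
    return out

data = [1, 0, 1]
stored = encode(data)

stored[0] ^= 1                 # one bad cell: majority vote still wins
print(decode(stored) == data)  # True

stored[1] ^= 1                 # second error in the same triple: ECC fails
print(decode(stored) == data)  # False
```

Every error-correcting code has such a threshold; degradation pushes the raw error rate past it, and that is the moment the drive "fails".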
If you already have an SSD in your system, you might wonder how long you have until it fails. Fortunately, drives report their own health through Self-Monitoring, Analysis, and Reporting Technology (SMART).

SMART exposes information about the SSD, such as how many times the drive has been powered on, its operating temperature, total power-on hours, and total data written. Depending on who manufactured your SSD, you can download their monitoring software to read these values.
If your SSD manufacturer doesn’t provide any software or you’re looking for software compatible with most storage devices, you can check out CrystalDiskInfo. This free software provides information like:
- Lifetime remaining
- Raw read error rate
- Erase fail count
- Reallocated NAND block count
With this information, you will have a clear idea of when the storage device is nearing the end of its lifespan.
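On Linux and macOS, the open-source smartmontools package offers a command-line route to the same data: `smartctl` can emit machine-readable output with its `--json` flag. The exact fields vary by drive and smartctl version, so the sketch below parses a hand-written sample of NVMe output rather than querying real hardware:

```python
import json

# Hand-written sample resembling `smartctl --json -a /dev/nvme0` output
# for an NVMe drive; field names vary by drive and smartctl version.
sample = """
{
  "power_on_time": {"hours": 4200},
  "nvme_smart_health_information_log": {
    "percentage_used": 7,
    "media_errors": 0
  }
}
"""

# On a real system you would capture smartctl's stdout instead, e.g. via
# subprocess.run(["smartctl", "--json", "-a", "/dev/nvme0"], ...).
report = json.loads(sample)

health = report["nvme_smart_health_information_log"]
print(f"Power-on hours:  {report['power_on_time']['hours']}")
print(f"Rated life used: {health['percentage_used']}%")
print(f"Media errors:    {health['media_errors']}")
```

A "percentage used" creeping toward 100 is the drive's own estimate of how much of its rated endurance is gone, which maps directly onto the P/E-cycle budget discussed earlier.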
If you go through any e-commerce website, you’ll find a vast difference in price among SSD manufacturers.
Keep in mind that expensive SSDs may not always be better than their cheaper counterparts. There are many factors to consider, so it is always better to check reviews from tech reviewers and users before making a decision.
As highlighted earlier, every SSD has a specific number of P/E cycles depending on its NAND flash architecture. In practice, though, it is often the controller on cheaper storage devices that causes problems first.
Controllers are essential because they directly influence the storage device's wear and tear. They decide how to break data into chunks, where to store those chunks, how important they are, and how long they'll remain on the drive.

The controller also has to ensure the drive doesn't use the same cells repeatedly. In other words, it will be copying, deleting, and moving data throughout the lifespan of the storage device.

All of these responsibilities call for robust controllers and algorithms.
Another factor influencing cost is the dynamic random access memory (DRAM) on the SSD. The controller uses DRAM to cache the map of where data physically lives on the flash, speeding up the lookups needed for read and write operations.

This improves performance when the SSD is under continuous use, ensuring you don't face lags or stutters. Without DRAM, SSD performance takes a hit and, in the worst cases, can fall below the read and write speeds of HDDs.
High-quality controllers and algorithms can improve how long an SSD lasts, which is why different manufacturers quote different endurance ratings, and why better drives cost more.
You might wonder why you rarely see SLC SSDs in the consumer marketplace, given that they offer excellent reliability and are quicker than the other architectures.

The answer is cost: building SLC drives with large storage capacity is expensive. A better option, especially for consumers, is TLC SSDs, which provide far greater storage space at a fraction of the cost.
When SSDs are used beyond their P/E cycles, they become unreliable at storing data. Other factors, like cheap controllers and weak wear-leveling algorithms, also influence a drive's reliability over its lifespan.