Possibly faulty RAM
Introduction (for the curious; skip to next chapter if you like)
RAM is the hardware that computers use as working storage, also known as “memory”. Due to the technology used to manufacture it, defects are quite common. Most of these are weeded out at the factory, but some slip through to consumers.
There are two common reasons for faulty RAM:
- Manufacturing defects. In this case, usually just one of the sticks is faulty and the others are good.
- Incompatible overclocking settings.
If you encounter any of these problems, you should run the test:
- Operating system crashes (such as Blue Screens on Windows).
- Random crashes in random programs.
- You submitted a SmartGit crash report and support replied that the crash happened in a reliable component. Around 50% of users who reported such crashes found their hardware to be faulty.
It is a good idea to run the test on every computer you use, whether or not it has visible problems. Running the test once per computer is sufficient, as long as its hardware does not change. Replacing a faulty RAM stick is usually better than tolerating random bugs and data corruption for years.
Running the test
It is recommended to run the test overnight, while you’re not using the computer.
If the test is not complete by the time you need the computer, note whether it has found any problems so far and then cancel it. You could also run a quick test if the full test takes too long, but note that the quick test only finds the most obvious problems. It’s quite common that only the full test reveals the problems.
Select one of the following, depending on your OS:
Windows - Quick test
- If you have a few spare hours when you don’t need your computer, please run Extended test instead.
- The test will take around half an hour. You won’t be able to use your computer while the test runs. It could be convenient to run it during the lunch break.
- Run from Win+R or from command line:
MdSched.exe
- Choose whether to restart now or postpone it until next restart.
- Once your computer is restarted, the test will run automatically.
- If you need your computer before the test completes, you can interrupt it with Esc and run again later.
- Once the test is finished, the computer will automatically boot back into Windows.
- You can find the test results in the Windows Event Log:
  - Go to Control Panel > System and Security > Administrative Tools > Event Viewer > Windows Logs > System
  - Find the event with Source equal to MemoryDiagnostics-Results
- It is quite common that a quick test overlooks existing problems. Please also run Extended test. It is convenient to run it overnight when you’re not using the computer.
- Continue to “What to do after the test”.
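If you prefer the command line to Event Viewer, the lookup described above can probably be scripted. The following is a minimal Python sketch, not an official tool. It assumes that the results are logged under the provider name Microsoft-Windows-MemoryDiagnostics-Results (the full name behind the MemoryDiagnostics-Results source) and it relies on the wevtutil utility that ships with Windows. The function names are made up for illustration.

```python
import subprocess

# Assumption: full provider name behind the "MemoryDiagnostics-Results"
# source shown in Event Viewer.
PROVIDER = "Microsoft-Windows-MemoryDiagnostics-Results"


def build_query_command(max_events=5):
    """Build a wevtutil command that lists the newest memory test results."""
    xpath = f"*[System[Provider[@Name='{PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",  # query events from the System log
        f"/q:{xpath}",               # keep only events from the provider above
        f"/c:{max_events}",          # limit the number of events returned
        "/rd:true",                  # newest events first
        "/f:text",                   # human-readable output
    ]


def read_memory_test_results(max_events=5):
    """Run the query (Windows only) and return the raw text output."""
    completed = subprocess.run(
        build_query_command(max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

On Windows, calling print(read_memory_test_results()) should show the same events that Event Viewer lists. An empty result most likely means that the test has not finished yet or was interrupted.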
Windows - Extended test
- The test will take a few hours. You won’t be able to use your computer while the test runs. It could be convenient to run during the night.
- Run from Win+R or from command line:
MdSched.exe
- Choose whether to restart now or postpone it until next restart.
- Once your computer is restarted, the test will run automatically.
- Press F1 and select Extended test. Press F10 to apply.
- If you need your computer before the test completes, you can interrupt it with Esc and run again later.
- Once the test is finished, the computer will automatically boot back into Windows.
- You can find the test results in the Windows Event Log:
  - Go to Control Panel > System and Security > Administrative Tools > Event Viewer > Windows Logs > System
  - Find the event with Source equal to MemoryDiagnostics-Results
- Continue to “What to do after the test”.
Ubuntu
- Hold Shift while Ubuntu boots.
- Select memtest86
- Give it a few hours to run. It’s convenient to do that overnight.
- See if it found any errors (“Errors:” text in bottom right).
- Continue to “What to do after the test”.
Any OS
- Download memtest86+
- Create a bootable USB according to instructions
- Boot and test according to instructions
- Continue to “What to do after the test”.
What to do after the test
In any case, please reply to support’s mail! Your feedback is really valuable for understanding which weird crashes are in fact due to hardware and which are not.
If the test didn’t find any problems
Congratulations, your RAM is good. If you still encounter a lot of weird problems on your computer, this could be due to:
- CPU overheating.
  Use monitor software of your choice to check.
- Buggy drivers.
  On Windows, Driver Verifier can be used to find which driver causes trouble. Alternatively, update as many drivers as you can and see if it helps.
- Buggy programs that inject their components into all other programs.
  It is possible to investigate using crash dumps.
Please contact support if you’d like to get further hints.
If the test detected lots of errors
It’s likely that your RAM is incorrectly overclocked in BIOS. Please check the settings there.
For example, it’s possible that the RAM manufacturer has set overly optimistic timings in the XMP settings.
If the test detected just a few errors
How to fix
- If you used the Windows test, Windows will remember the faulty regions and avoid them. If the test detected all defects, this should be enough to solve the problem. However, it still makes sense to replace the faulty sticks (continue reading for that).
  At some later time Windows may forget the regions, and you will need to repeat the test. You can check the current list with this command line (run as admin):
      bcdedit /enum {badmemory}
  If the list is empty, it will show:
      RAM Defects
      -----------
      identifier              {badmemory}
  Otherwise, if there are avoided regions in the list, the output will look like:
      RAM Defects
      -----------
      identifier              {badmemory}
      badmemorylist           0x4dd360
                              0x4dd361
                              0x4dd363
                              0x4dd364
                              0x4dd366
                              0x4dd367
  Note that in this example, the faulty addresses form a small cluster. This is what is expected for localized manufacturing defects.
- It is likely that you have multiple RAM sticks in your computer. It is also likely that just one of them is faulty. Test each stick individually to find which one is faulty and remove it.
- Replace the faulty stick. If the old one is still under warranty, return it to the seller.
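To check whether the entries reported by bcdedit /enum {badmemory} form a small cluster, as in the example in the first step, the output can be parsed with a few lines of code. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, and it assumes the entries follow the badmemorylist field as hexadecimal values, as in the sample output.

```python
def parse_bad_memory(output):
    """Return the entries listed after `badmemorylist` as integers.

    Works on whitespace-separated tokens, so it does not matter whether
    the values appear on one line or on separate lines.
    """
    tokens = output.split()
    lowered = [token.lower() for token in tokens]
    if "badmemorylist" not in lowered:
        return []  # empty list: no regions are being avoided
    start = lowered.index("badmemorylist") + 1
    entries = []
    for token in tokens[start:]:
        if not token.lower().startswith("0x"):
            break  # a different field has started
        entries.append(int(token, 16))
    return entries


def cluster_span(entries):
    """Distance from the lowest to the highest entry, inclusive."""
    return max(entries) - min(entries) + 1 if entries else 0


# Sample output copied from the example in this chapter.
sample = """RAM Defects
-----------
identifier              {badmemory}
badmemorylist           0x4dd360
                        0x4dd361
                        0x4dd363
                        0x4dd364
                        0x4dd366
                        0x4dd367
"""

entries = parse_bad_memory(sample)
print(len(entries), "entries within a span of", cluster_span(entries))
```

For the sample, the six entries fall within a span of 8, which matches a localized defect. A list whose entries are spread over a very wide range would look more like the "lots of errors" case and would suggest checking the BIOS settings instead.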
What happens if you leave it as is
- Having a faulty region in RAM means that any data stored there will get corrupted randomly.
- On every computer boot, the region can get allocated to the OS or to one of the programs. The region can also be reallocated to a different purpose while the computer is running, so the problem will move around.
- Depending on which program gets this region, the consequences of data corruption will be different, such as:
- OS crashes (Blue Screens on Windows)
- Program crashes
- Data corruption in files
- Non-permanent artifacts when viewing images/videos
- Nothing, if some non-important data was corrupted