Wondering why your computer just crashed again? Its memory might be to blame, according to real-world Google research that finds error rates higher than what earlier work showed.
With hundreds of thousands of computers in its data centers, Google can collect an abundance of real-world data about how those machines actually work. That's exactly what the company did for a research paper that found error rates are surprisingly high.
"We found the incidence of memory errors and the range of error rates across different DIMMs (dual in-line memory modules) to be much higher than previously reported," according to the paper, written jointly by Bianca Schroeder, a professor at the University of Toronto, and Google's Eduardo Pinheiro and Wolf-Dietrich Weber. "Memory errors are not rare events."...
"In the olden days of personal computing and into the 1990s, memory was unreliable enough that people ran reliability tests."
When I did RAM tests back then (both during bootstrap & with diagnostic apps), I do not recall ever seeing an error detected. Maybe others have contrary experience?
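Those diagnostic apps generally ran pattern tests. As an illustration (not any particular vendor's diagnostic), here is a minimal sketch of one classic pattern, "walking ones": write each single-bit pattern to every word, read it all back, and flag the first mismatch. The function name and interface are mine, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Walking-ones RAM test (illustrative sketch): for each of the 32
   single-bit patterns, fill the region, then read it back.
   Returns the index of the first faulty word, or -1 if none. */
static long walking_ones_test(volatile uint32_t *mem, size_t words) {
    for (int bit = 0; bit < 32; bit++) {
        uint32_t pattern = (uint32_t)1 << bit;
        for (size_t i = 0; i < words; i++)
            mem[i] = pattern;               /* write pass */
        for (size_t i = 0; i < words; i++)
            if (mem[i] != pattern)          /* read-back pass */
                return (long)i;             /* first bad word */
    }
    return -1;                              /* no errors detected */
}
```

On healthy RAM this always returns -1, which matches my experience back then: the tests ran, and nothing was ever found.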
BTW, the original 1981 IBM PC had parity-checked RAM. While by no means the best way of detecting errors, it was better than nothing.
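The idea behind per-byte parity is simple: store one extra bit per byte recording whether the byte has an odd or even number of 1 bits. The PC did this in hardware, but the check itself can be sketched in a few lines of C (this is a generic illustration, not the PC's actual circuitry):

```c
#include <assert.h>
#include <stdint.h>

/* Compute the parity of a byte: returns 1 if the byte contains an
   odd number of 1 bits, 0 if even. Storing this bit alongside the
   byte lets a later re-check detect any single-bit flip -- though
   it cannot correct it, and two flipped bits cancel out unseen. */
static int parity_bit(uint8_t byte) {
    int ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (byte >> i) & 1;
    return ones & 1;
}
```

Any single-bit flip changes the parity, so the stored bit no longer matches and the hardware can raise an error (on the PC, a parity error halted the machine). That is what makes it better than nothing, but well short of the error-correcting codes (ECC) used in servers.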
While this is speculation, I think latent software defects (the effect of the Fall upon human rationality) cause the majority of PC (if not server) crashes. It is *very* difficult (if not impossible) to make software reliable enough to handle all contingencies, especially anything as complex as modern operating systems & office apps with millions of lines of code & dozens of developers. Using software w/o incident for a long time does not mean it has no flaws (fallacy of induction).
Both programmers & users have forgotten the virtue of simplicity.