Alexey Guzey’s sleep deprivation self-experiment

Alexey “Matthew Walker’s ‘Why We Sleep’ Is Riddled with Scientific and Factual Errors” Guzey writes:

I [Guzey] recently finished my 14-day sleep deprivation self experiment and I ended up analyzing the data I have only in the standard p < 0.05 way and then interpreting it by writing explicitly about how much I believe I should update based on this data. I honestly have absolutely no knowledge of Bayesian data analysis, so I'd be curious if you think the data I have is worth analyzing in some more sophisticated manner or if you have general pointers to resources that would help me figure this out (unless the answer to this is that I should just google something like "bayesian data analysis"..) Here’s the experiment.

One concern that I have is that my Psychomotor Vigilance Task data (as an example) is just not very good (which I note explicitly in the post), and I would be worried that if I try doing any fancy analysis on it, people would be led to believe that the data is more trustworthy than it really is, based on the fancy methods (when in reality it’s garbage in garbage out type of a situation).

Here’s the background (from the linked post):

I [Guzey] slept 4 hours a night for 14 days and didn’t find any effects on cognition (assessed via Psychomotor Vigilance Task, a custom first-person shooter scenario, and SAT). I’m a 22-year-old male and normally I sleep 7-8 hours. . . .

I did not measure my sleepiness. However, for the entire duration of the experiment I had to resist regular urges to sleep . . . This sleep schedule was extremely difficult to maintain.

Lack of effect on cognitive ability is surprising and may reflect true lack of cognitive impairment, my desire to demonstrate lack of cognitive impairment due to chronic sleep deprivation and lack of blinding biasing the measurements, lack of statistical power, and/or other factors.

I believe that this experiment provides strong evidence that I experienced no major cognitive impairment as a result of sleeping 4 hours per day for 12-14 days and that it provides weak suggestive evidence that there was no cognitive impairment at all.

I [Guzey] plan to follow this experiment up with an acute sleep deprivation experiment (75 hours without sleep) and longer partial sleep deprivation experiments (4 hours of sleep per day for (potentially) 30 and more days). . . .

His main finding is a null effect, in comparison with Van Dongen et al., 2003, who reported large and consistent declines in performance after sleep deprivation.

My quick answer to Guzey’s question (“I’d be curious if you think the data I have is worth analyzing in some more sophisticated manner”) is, No, I don’t think any fancy statistical analysis is needed here. Not given the data we see here. An essentially null effect is an essentially null effect, no matter how you look at it. Looking forward, yes, I think a multilevel Bayesian approach (as described here and here) would make sense. One reason I say this is because I noticed this bit of confusion from Guzey’s description:

The more hypotheses I have, the more samples I need to collect for each hypothesis, in order to maintain the same false positive probability. This is a n=1 study and I’m barely collecting enough samples to measure medium-to-large effects and will spend 10 hours performing PVT. I’m not in a position to test many hypotheses at once.

This is misguided. The goal should be to learn, not to test hypotheses, and the false positive probability has nothing to do with anything relevant. It would arise if your plan were to perform a bunch of hypothesis tests and then record the minimum p-value, but it would make no sense to do this, as p-values are super-noisy.
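
For readers who want to see where that false positive probability comes from, here is a minimal sketch. It is an illustration only, not part of Guzey's analysis. It assumes independent tests with no true effects, so that each p-value is uniform on (0, 1); the function names and the simulation settings are hypothetical.

```python
import random


def familywise_false_positive_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_min_p(num_tests: int, alpha: float = 0.05,
                   num_sims: int = 20000, seed: int = 0) -> float:
    """Monte Carlo version: draw null p-values and check the smallest one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_sims):
        # Under the null hypothesis each p-value is Uniform(0, 1).
        min_p = min(rng.random() for _ in range(num_tests))
        if min_p < alpha:
            hits += 1
    return hits / num_sims


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, familywise_false_positive_rate(k), simulate_min_p(k))
```

The rate only grows with the number of tests if the analysis plan is to pick out the smallest p-value; reporting every measurement sidesteps that selection step.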

Guzey has a whole bunch of this alpha-level test stuff, and I can see why he’d do this, because that’s what it says to do in some textbooks and online tutorials, and it seems like a rigorous thing to do, but this sort of hypothesis testing is not actually rigorous, it’s just a way to add noise to your data.

Anyway, none of this is really an issue here because he’s sharing his raw data. That’s really all the preregistration you need. For his next study, I recommend that Guzey just preregister exactly what measurements to take, then commit to posting the data and making some graphs.

There’s not much to say about the data analysis because Guzey’s data don’t show much. It could be, though, that, as Guzey says, he’s particularly motivated to perform well so he can find that sleep deprivation isn’t so bad.

Why do we go short on sleep and why do we care?

God is in every leaf of every tree.

As is so often the case, we can think better about this problem by thinking harder about the details and losing a layer or two of abstraction. In this case, the abstraction we can lose is the idea of “the effect of sleep deprivation on performance.”

To unpack “the effect of sleep deprivation on performance,” we have to ask: What sleep deprivation? What performance?

There are lots of reasons for sleep deprivation. For example, maybe you work 2 jobs, or maybe you’re up all night caring for a child or some other family member, or maybe you have some medical condition so you keep waking up in the middle of the night, or maybe you stay up all night sometimes to finish your homework.

Similarly, there are different performances you might care about. If you’re short on sleep because you’re working 2 jobs, maybe you don’t want to crash your car driving home one morning. Or maybe you’re operating heavy machinery and would like to avoid cutting your arm off. Or, if you’re staying up all night for work, maybe you want to do a good job on that assignment.

Given all this, it’s hard for me to make sense of general claims about the impact, or lack of impact, of lack of sleep on performance. I have the same concerns about measuring cognitive ability, as ability depends a lot on motivation.

These concerns are not unique to Guzey’s experiment; they also arise in other research, such as the cited paper by Van Dongen et al.


  1. Ben says:

    I think I get his question, but when you have a sample size of 1, there really isn’t much statistics you can do.

    • Andy says:

      It depends on what the goal of your experiment is, no? If you want to show the effect of sleep deprivation in the human population, of course his N = 1. If you want to show the effect in *yourself* (you *are* the population), then every day can be counted as an observation, I suppose (although not an independent one).

  2. jim says:

    Interesting experiment.

    I don’t think a video game is an adequate or appropriate or reasonable test of cognitive impairment. I’m sure a lot of people can relate to having played video games late into the night without too much difficulty or even finding it hard to fall asleep after playing. Screen time is widely reputed to have a negative effect on sleep, so how that plays into it isn’t clear.

    I don’t know about the SAT as a test of cognitive impairment, but if a person is taking it every day isn’t there some acquired benefit from repetition?

    A video game is also a poor proxy for real-life situations like operating heavy equipment or driving. Nothing is at stake in a video game. There’s no risk to toppling a warehouse full of 3-story-tall shelving and killing yourself or others. There’s no risk to swerving your Kenworth across the centerline and killing yourself or others. So there’s no stress associated with that risk.

    That brings up another issue. Alex’s tests are only a few minutes long. How do the effects of sleep deprivation play out when doing a repetitive activity like driving for an extended period – like 10-14 hours?

    Last but not least, Alex’s experiment has N=1, and at age 22 he is generally not representative of the working population that would be impacted by sleep deprivation.

    • jim says:

      At first I was surprised to learn Alexey is 22 years old, but suddenly I get it: he’s in the top demographic for risky accidents! Too funny.

    • Alexey Guzey says:

      >That brings up another issue. Alex’s tests are only a few minutes long.

      SAT is 3 hours long (with two breaks).

    • Joshua says:

      Seems to me that conducting a cognitive test on yourself is a non-starter for valid results.

      • Andrew says:


        “Non-starter” is a bit strong! One reason for formal self-experimentation and formal self-measurement is that people are doing informal self-experimentation and informal self-measurement all the time. I agree that there are concerns with bias (I wrote about this a few years ago regarding Seth Roberts, who I think became all too adept at fooling himself), but I don’t think this means that self-measurement is hopeless.

        • Joshua says:

          Andrew –

          > but I don’t think this means that self-measurement is hopeless

          I think it depends on what you’re measuring. For many things, (say diet, physical activity, height and weight etc.) self-measurement has proven to be very problematic, but I don’t think a non-starter. I think that self-assessment of cognitive performance should be considered a non-starter.

          The reason being that it isn’t only potential bias in the quantification that’s the problem, but in addition to that you have a potential bias imbedded into the cognitive tasks in themselves.

          It might be analogous to a longitudinal assessment of the relationship of diet and weight – where bias might show up in the measurement of weight in itself but also an influence of bias could show up over time in the very food intake behaviors themselves. You are compounding the potential bias.

          • Joshua says:

            In other words, you could quantify the “error” in some forms of self-assessment. You couldn’t ever measure the one aspect of bias embedded in a self-assessment of cognitive performance (the performance aspect, not the outcomes measurement aspect).

            It would be invisible.

          • Phil says:

            Do you think if someone else administered the SAT, that would somehow be a more valid test than Guzey just taking it on his own? I don’t see how that would work.

            I could see various mechanisms for getting a biased result, such as making an extra effort to pay attention when doing it tired, but being more perfunctory when doing it other times, but I don’t see a fundamental problem with administering a test to yourself.

            • Joshua says:

              Phil –

              What I’m suggesting is this:

              “The first principle is that you must not fool yourself and you are the easiest person to fool.”

              Let’s say I had a theory that reading your comment would make me stupider. So I performed a cognitive test on myself before I read your comment. Then I read your comment and did the cognitive assessment again.

              And sure enough, I did worse the second time, which is evidence my theory was right.

              We know that in general self-reported data is quite unreliable. Social desirability is one of the problems with self-report. I’d suggest that an unconscious desire to prove yourself right is a kind of social desirability bias.

              > Do you think if someone else administered the SAT, that would somehow be a more valid test than Guzey just taking it on his own? I don’t see how that would work.

              I think the biggest problem is his investment in the outcome. It’s not only that he’s administering the test to himself. It’s also that his theory that he’s evaluating rides on the outcome of his performance.

              This is a problem in general, that the researcher has an investment in the outcome of the research. When you’re evaluating your own performance, you add in another potential and unmeasurable biasing mechanism.

              • Phil says:

                Yes, he could deliberately or subconsciously perform worse than his true ability when taking the SAT when fully rested. What he cannot do, unless he doesn’t just fool himself but actively cheats, is do better than his true ability when tired.
                So the only thing he has to worry about is that he’ll perform badly when he’s rested. He says in the article that he’s aware of this. He also says in the article that he knows fairly accurately how he should be able to do on the test, from previous experience. Sure, he could be lying, how would I know, but if we accept that he is genuinely trying to learn something about himself then I think this testing is OK. If there were a large effect he would have seen it. He recognizes that there could be a small effect that is still big enough to be important to him.
                If you’re saying it would be better to somehow have a blind experiment in which the subject doesn’t know whether he’s tired, good luck with that. You’d have to instead have a study in which someone is sometimes sleep-deprived, sometimes not, and you measure their cognitive performance either without them knowing or without them caring how much sleep deprivation matters for them. That’s not easy. Try it if you want.
                But more to the point, it would not come even close to answering the question Guzey cares about, which is how much sleep deprivation matters -to him-.
                And finally, the idea that self-experimentation can’t teach someone something is simply not true. Yes absolutely it has its own pitfalls, but it also has many advantages. Sometimes the advantages outweigh the negatives.

              • Joshua says:

                Phil –

                I mostly agree. Except that…

                > What he cannot do, unless he doesn’t just fool himself but actively cheats, is do better than his true ability when tired.

                “true ability” is a pretty moving target in cognitive assessments.

                > So the only thing he has to worry about is that he’ll perform badly when he’s rested.

                I get your point, but I don’t quite agree with that, either. He could be motivated to do well when rested, and really motivated to do well when tired. Motivation affects how people perform, how well they concentrate, etc.

                Perhaps part of the problem might be that I have less faith in the validity (and reliability) of cognitive testing than you.

              • Phil says:

                But if what you mean is that ‘cognitive ability’ is not a thing, so even defining a ‘true cognitive ability’ at a single moment in time is not possible, I agree. We have a huge range of cognitive skills: language processing (which itself includes many sub-skills), mathematical ability (ditto), short- and long-term memory recall, and on and on and on. Being tired couldn’t decrease all of these in exactly the same way, and any test is going to test only some sort of weighted average of a subset of these.

                The concept with any test is that there is some ‘true’ number that you are trying to measure, although your measurement instrument may be poor at capturing it. Alfred Binet, co-inventor of a popular early IQ test, once famously answered the question “what is ‘IQ'” with the reply “IQ is what my test measures.” He was well aware of the problems with boiling all of cognitive ability into one number.

                Guzey tested his ability to play video games (emphasizes one weighting and one subset of cognitive skills) and to take the SAT (a very different weighting and subset) and didn’t see an effect on either. That doesn’t mean there’s no effect on anything anywhere. Maybe Guzey can do fine with stuff he knows already when he’s tired, but would be terrible learning new material, for example.

                There are caveats about what he’s learned about himself, but I certainly think he’s learned something.

              • Alexey Guzey says:



                >I get your point, but I don’t quite agree with that, either. He could be motivated to do well when rested, and really motivated to do well when tired. Motivation affects how people perform, how well they concentrate, etc.

                I do think that motivation is an important part of this and you’re correct to point out that being very motivated to perform in some conditions but not the other ones will affect the results. However, given that I did not consciously try to dampen my performance in any condition and maybe the motivation was 80% vs 90%, we would still expect to see some effect of sleep deprivation, if it was seriously affecting my performance. But we did not see it, which is why I concluded that:

                >I believe that the SAT data strongly suggests that there was no major or moderate cognitive deterioration in many aspects of my cognition, given that test includes challenging (to me) reading comprehension questions and requires quick mathematical thinking. I believe that it provides barely any evidence regarding minor cognitive deterioration, given that I usually end up having free time at the end of every section, which I use to double check my responses, and minor cognitive deterioration could result in me getting roughly the same score but with less time left.

  3. Phil says:

    Self-experimentation is great for a lot of reasons. (1) Anyone can do it; (2) if you decide to stop or to change something, you can do so without messing things up for a bunch of other people; (3) if you are interested in how something affects you specifically, it is much better to use yourself as a subject than to try to quantify the average effect on a lot of other people; (4) you don’t have to get clearance from an ethics board or anyone else.

    The late Seth Roberts was a big fan of self-experimentation and had written part of a book about it, but I think never found a publisher and lost interest. I think Seth was insufficiently critical of his own findings and tended to ‘chase noise’ — I like the fact that Guzey put some effort into thinking about ways his findings could mislead him, something I think Seth should have done more. I’ll stop bad-mouthing Seth because he’s not here to fight his corner. I will say that I admired the fact that (like Guzey) Seth would come up with an idea for something that might improve his life, figure out a way to quantify it, and have the discipline to carry out an intervention and quantify its effects over time. Ironically, or interestingly, or somethingly, one of Seth’s major issues was pretty much the opposite of Guzey’s: he was troubled by insomnia, wanted to sleep more, and felt that he did not perform well if he didn’t get enough sleep. Seth would try an intervention, such as not eating anything in the morning until 3 hours after his desired waking time, and would stick with it for several weeks while recording its effects, before concluding that it did or didn’t help him.

    Me, I have the interest but rarely the discipline to stick with a regular plan and do the necessary record-keeping. But all of this stuff is getting easier through technology. To give some examples, if you are interested in something related to fitness, your watch can record your workout duration and intensity (as measured through heart rate, anyway). And for sleep, depending on the watch, you may be able to automatically record your sleep duration and quality (as measured by movement during sleep). Your scale can keep track of your weight. For those of us too lazy to even stick to writing this sort of stuff down every time, these automated tools can make self-experimenting a lot easier.

    Finally, I’ll mention that I enjoyed the book “Smoking Ears and Screaming Teeth”, by Trevor Norton, which is about self-experimentation. It’s mostly just a collection of interesting stories, doesn’t really have much that would constitute advice for budding self-experimenters, but it’s a fun read.

    • jd says:

      I think athletes do this all the time and have done so for years.

      Every athlete is an experiment of one – I believe this is a quote from Peter Coe, coach and father of Seb Coe, from “Better Training for Distance Runners” by David Martin and Peter Coe, which was like part training book and part exercise physiology textbook. (although I don’t have the time to search the book to make sure).

      All good training involves self-experimentation.
      Also, I think trainers and athletes employ partial pooling of their self-experiments, naturally. As a simplified example, if I introduce 2x20min @ 300w into my cycling training program on a day during a week, then I might log my resting HR and subjective fatigue the next morning. But I wouldn’t go on this one experiment alone, because the work stress of the week might differ from next week or something. So I might try it a few more times (if the workload wasn’t horribly off the mark), and this would give me some idea of how difficult the workout was on average.

  4. Shravan says:

    An experimenter with skin in the game should not be the subject. Clear COI there.

    This is partly why linguists screwed up their entire theoretical space. They were the theory developers and they were their own subjects.

    With n=1 you can’t say much about variability between individuals, which seems like the most interesting thing here (who cares what the average effect is?)

    PS in 2007 i would have been thrown into confusion by Andrew’s comment that the point of data analysis is not hypothesis testing. WHAT?

    • John Richters says:

      n=1 self-experimentation is often, but not per se, unreliable and untrustworthy. Case in point, discovery of “the bacterium Helicobacter pylori and its role in gastritis and peptic ulcer disease”

      • Phil says:

        The book I mentioned in an earlier comment is full of examples. For instance, there’s a guy who wanted to figure out how rapidly one can decompress after diving. He considered experimenting on other people to be unethical, so he would get in his compression chamber, up the pressure to the equivalent of 50 ft under water or something, wait a while, then decompress at some rate. If he didn’t get the bends, the next time he would try it with a faster decompression rate. Eventually he mapped out the space of (pressure, time at that pressure, decompression rate) that would not give him the bends. He was aware that he might be more or less sensitive than other people in some way, but you have to start somewhere and this is where he started. It’s a remarkable experimental measure because he suffered a great deal and had some episodes of extremely severe injury; indeed, the goal of each section of his experimental program was to reach the point where he had injured himself.

        The book has chapter after chapter of stuff like this.

    • morris39 says:

      As regards self experimentation results and not statistics, I strongly disagree based on experience. First thing needed is that the objective must be vitally important (this example is not) so that there is skin in the game. Secondly there must be real results, in other words strong motivation to not cheat. Very difficult to do but very worthwhile if you can pull it off. In my case the experimentation is about personal well being.

    • Phil says:

      Everything you say is reasonable, but I disagree with everything you say!

      1. An experimenter with skin in the game can certainly be the subject. If I’m trying to find an intervention that works for me — such as finding something that helps me sleep better at night, to give one of Seth Roberts’ issues — why on earth would I not be allowed to try different things and see?

      2. I do not agree that the variability between individuals is necessarily “the most interesting thing.” Sometimes I just want to find something that helps me in some way. If I find that I concentrate better after I take a 20-minute nap at 2 in the afternoon, that’s great, I don’t have to be interested in whether that’s true of other people, and how true it is, etc. etc.

      3. Andrew has often made the point (with which I agree) that ‘hypothesis testing’, in the sense it is usually used in statistical analysis — to test whether there is a ‘statistically significant difference’ between treatment and control — should not be the point of data analysis because you already know the hypothesis is wrong. There’s pretty much no way your cognitive ability with 4 hours of sleep could be _exactly_ the same as with 8 hours of sleep. One should think about this sort of experiment as an attempt to quantify the difference, or maybe determine whether it is smaller than an effect of _practical_ importance. Yes, one could say one is “testing the hypothesis that decreasing sleep by several hours has a large enough effect to be of practical significance”, but when people talk about hypothesis testing in statistics that is not usually what they are talking about.

      • Shravan says:

        Sure, i agree with you if the goal is only learning something about oneself and not making a general claim about sleep or weight loss or whatever. I don’t think this self-experimenter was solely interested in what works only for him. If he was, then my comments are irrelevant.

  5. Mikhail Shubin says:

    Sleeping 4 hours per night and playing video games?
    This is exactly how I spend my 22

  6. Alexey Guzey says:

    Hi Andrew,

    Thanks for the discussion of my experiment. Regarding:

    >This is misguided. The goal should be to learn, not to test hypotheses, and the false positive probability has nothing to do with anything relevant. It would arise if your plan were to perform a bunch of hypothesis tests and then record the minimum p-value, but it would make no sense to do this, as p-values are super-noisy.

    I agree with you in the ideal world. However, in the real world, I do think that this is relevant. Here’s why: suppose I perform 10 different tests. One of them ends up showing large gains or large deficits on X due to sleep deprivation while the rest show ~no effect. Then, whenever people are going to discuss my experiment, they will not say “oh, guzey found no effects and the one that was big was probably just by chance”, instead, the most intuitive interpretation and the way people will see the experiment will be “sleep deprivation has a huuuuuge effect on X” and will just avoid discussing the no effects. I really would rather avoid this happening, which is why I’m trying to test as few things as possible but to be able to be confident about them.

    What do you think?

    • Andrew says:


      In the real world, just do all 10 tests and present all the data!

      • Alexey Guzey says:

        Of course I would present all the data in the example I gave. It would not prevent most people from interpreting the result as “guzey found a huge effect on X” rather than “given the number of tests guzey did, the most reasonable interpretation is 0s across the board”.

        • jim says:

          ‘“given the number of tests guzey did, the most reasonable interpretation is 0s across the board”.’

          If you believe that you’re introducing a very strong bias. Nine zeros and one ten could be the natural response pattern.

          • Phil says:

            Strongly disagree. You’d expect a range of effects, maybe many of them small but none exactly zero, unless your test is overly discretized.

            • jim says:

              99 sleepy drivers weave their way home safely. One weaves and smashes into a bridge abutment and dies. Harm per incident of sleepy driving: 99 zeros, one ten. Not all phenomena are incremental.

              You don’t know what the distribution is until you measure it. With ten measurements you know *nothing*, so to claim that the one measurement was an anomaly and the rest are the “true” measurements is just throwing away data.

              • Alexey Guzey says:

                I’m not saying anything about truth. Note that I say “reasonable” and “likely” in my comments. If I did in fact get 9 null effects and 1 huge effect, I would definitely try to replicate that 1 effect and see if it’s real or if it was just a fluke.

              • jim says:

                “I would definitely try to replicate that 1 effect and see if it’s real or if it was just a fluke”

                Perfect! That’s the right thing to do. Multiple times.

              • Phil says:

                jim and Alexey,
                I don’t think the null effect is plausible. The effect cannot really be 0.000000000000 in any reasonable units.

                jim’s ‘sleepy drivers’ example is misleading. Driving ability is continuous, not discrete, and looking at ‘fatal accident’ is an example of an overly discretized test. It’s as if Alexey did his test and instead of recording the SAT score he only recorded 0/1 based on under or over 1450 or something.
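
A rough sketch of the "overly discretized test" point, added as an illustration. Every number in it (mean score, standard deviation, size of the drop, cutoff, number of test sittings) is a hypothetical assumption, not a value from Alexey's data, and the function names are made up. It compares the expected z-score for a difference in mean scores with the expected z-score when only "above or below the cutoff" is recorded.

```python
from math import sqrt
from statistics import NormalDist


def z_continuous(shift: float, sd: float, n: int) -> float:
    """Expected z-score for a difference in means, n scores per condition."""
    return abs(shift) / (sd * sqrt(2.0 / n))


def z_dichotomized(mean: float, shift: float, sd: float,
                   cutoff: float, n: int) -> float:
    """Expected z-score when only 'score above cutoff' (0/1) is kept."""
    p_rested = 1.0 - NormalDist(mean, sd).cdf(cutoff)
    p_tired = 1.0 - NormalDist(mean + shift, sd).cdf(cutoff)
    # Standard error of a difference between two independent proportions.
    se = sqrt(p_rested * (1.0 - p_rested) / n + p_tired * (1.0 - p_tired) / n)
    return abs(p_rested - p_tired) / se


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the comparison.
    mean, sd, shift, cutoff, n = 1490.0, 30.0, -15.0, 1450.0, 7
    print("continuous  :", z_continuous(shift, sd, n))
    print("dichotomized:", z_dichotomized(mean, shift, sd, cutoff, n))
```

Under these assumptions the same underlying drop produces a weaker signal once the score is collapsed to 0/1, which is the sense in which a binary outcome can hide a real but moderate effect.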

                Alexey, for sure going without sleep has some effect. I think you’ve done an excellent job at trying to estimate how big that effect is, with an experimental design that is necessarily somewhat burdensome but something you were, obviously, willing to do. But I don’t think you should think of this as a binary choice, no effect vs effect.

  7. Kevin says:

    What appears to be missing is a discussion motivating Guzey’s hypothesis. As Andrew discusses in the last section, that specific hypothesis may be uninformative towards a specific goal or objective. Guzey’s only written objective appears to be replicating an existing study. In that study, the stated objective is: “To inform the debate over whether human sleep can be chronically reduced without consequences,”. I think Guzey should address if he shares that objective and then motivate his specific hypothesis and his methods for testing that hypothesis, especially because there are major differences between his experiment and the study. The original study actually tests multiple doses of 0, 4, 6, or 8 hours of sleep whereas Guzey only tests 4 vs his existing baseline of 8. Furthermore, Guzey motivates using PVT to test his specific hypothesis (also motivated by appearing in the original study), but does not motivate SAT testing (it is not used in the original study). The unmotivated SAT results appear alongside the PVT results and so could be misleading. In an academic paper, I could envision the SAT results as “unpublished data” motivating the main study, but I find them problematic appearing alongside the main results.

    Furthermore, I think there are differences in the design of experiments for N>1 vs N=1 that are not addressed by Guzey. N>1 has implicit considerations of time/money/feasibility. There may be random errors that cannot be addressed at N>1 but should be addressed at N=1 because their effect will not average out and they may be more easily controlled for. For instance, Guzey states that he is near the threshold for SAT and seems to conclude that that makes it a justifiable method. I disagree. At a threshold, you are expecting non-linear outcomes. When designing an experiment, you should try to find a method that has the most sensitive region covering your expected outcomes and fits within your resources. For N>1, the SAT may be the best option, however, the SAT may be a poor choice for measuring changes in any given individual’s cognitive abilities (I don’t know if this is true or relevant because Guzey does not discuss it).

  8. Benjamin Anndersson says:

    “Given all this, it’s hard for me to make sense of general claims about the impact, or lack of impact, of lack of sleep on performance. I have the same concerns about measuring cognitive ability, as ability depends a lot on motivation.”

    Do you mean concerns about measuring cognitive ability related to sleep deprivation, or do you mean cognitive ability in general (i.e. IQ-tests)?

  9. Anonymous says:

    So wait:

    I never took the SATs so I had to look up what the scores mean.

    Alexey scored 1470 on his first (pre-experiment) SAT. He scored an average of 1495 on all tests and his lowest score was 1440. Do I have this right? Taking his lowest score, he’s still in the 97th percentile??!!

    Holy crap no experiment he could run on himself regarding cognition could be remotely representative of the general population or a “normal” or “typical” response to anything! Hilarious!

    • David J. Littleboy says:

      No, no. It’s important that he can actually do the task at hand. If he were in some sort of more “normal” percentile, he’d be testing himself on a task he couldn’t do anyway, so the test would be meaningless.

      If he were, say, a professional juggler and juggles 5 clubs regularly, then sleep deprivation testing on his juggling ability would be meaningful. If you watched me juggle, you wouldn’t know if I were awake or asleep.

      Look, the SAT/GRE tests are really good tests. I re-applied to grad schools (in a different field) after messing up my first try. Since it had been 9 or so years since I had taken the SATs, I bought a book on the GREs and took a sample test. And got 50% or so. Oops, maybe I’m in trouble. But I carefully re-read the questions. They were bad. They’d have multiple possible answers, no good answer. All sorts of dizziness. I acquired a real sample test, and it was fine. All the questions were sensible good writing with a unique correct answer. I did so well, I even checked the “I won’t attend the program if you don’t provide financial support” box, and they coughed up tuition and about half my living expenses. And that was in the humanities…

      The language part of the SAT/GRE tests is about reading comprehension, information retention, logical thought (and grokking the cultural milieu of the blokes who wrote the questions, but that’s another rant). The sorts of things you’d think would be trashed by sleep deprivation. But, as we now know, you thought wrong.

      Alexey has pulled off a brilliant hack on some bad work, convincingly demonstrating that it’s ridiculous _as stated_.

      (OK, I’m overstating the case. That 99 sleep-deprived drivers make it home but one dies is a good point. But people here seem not to be getting the joyous brilliance of this experiment.)

      • Anonymous says:

        It’s plausible that highly intelligent and capable people are just less affected cognitively by sleep deprivation. Also it seems likely that taking the test several days in a row would confer an advantage – just as coaching and practice do.

  10. Phil says:

    Back In The Day, I was the subject of mild pity among my friends for only scoring 740 (out of 800) on the math portion. Most of my friends, including Andrew, scored 800, and the ones that didn’t were 780 or so. It worked out ok for me, I was still able to get a PhD in physics and have a successful career using my math skills, but…well, 97% of Americans are terrible at math, even compared to me. And I probably got better at it with all those graduate courses and stuff.

    I guess what I’m saying is: yeah, Guzey is great at math compared to most people, but I’m guessing a lot of this blog’s readers find his score unremarkable. If you hadn’t raised the issue I wouldn’t have thought about it at all.

    • Phil says:

      This comment was meant as a reply to Anonymous, above.

    • Anonymous says:

      No doubt most of the people who frequent a stats/modelling blog have some math skills! :) Just the same his total score puts him **WAY** out of the ordinary, so whatever his cognition self-experimentation means for him personally there’s good reason to suspect it doesn’t apply to the larger population.
