
Graphs of school shootings in the U.S.

Bert Gunter writes:

This link is to an online CNN “analysis” of school shootings in the U.S. I think it is a complete mess (you may disagree, of course).

The report in question is by Christina Walker and Sam Petulla.

Gunter lists two problems:

1. Graph labeled “Race Plays A Factor in When School Shootings Occur”:
AFAICT, they are graphing number of casualties vs. time of shooting. But they should be graphing the number of shootings vs time; in fact, as they should be comparing incident *rates* vs time by race, they should be graphing the proportion of each category of schools that have shooting incidents vs time (I of course ignore more formal statistical modeling, which would not be meaningful for a mass market without a good deal of explanatory work).

2. Graph of “Shootings at White Schools Have More Casualties”:
The area of the rectangles in the graph appears to be proportional to the casualties per incident, but with both different lengths and widths it is not possible to glean clear information by eye (for me anyway). And aside from the obvious huge 3 or 4 largest incidents in the White Majority schools, I do not see any notable differences by category. Paraphrasing Bill Cleveland, the graph is a puzzle to be deciphered: it appears to violate most of the principles of good graphics.

Moreover, it is not clear that casualties per incident is all that meaningful anyway. Maybe White schools involved in shootings just have more students so that it’s easier for a shooter to amass more casualties.

The “appropriate” analysis is: “Most school shootings everywhere involve 1 or 2 people, except for a handful of mass shootings at White schools. The graph is a deliberate attempt to mislead, not just merely bad.”

Unfortunately, as you are well aware, due to intense competition for viewer eyeballs, both formerly print-only (NYT, WSJ, etc.) and purely online news media are now full of such colorful, sometimes interactive, and increasingly animated data analyses whose quality is, ummm… rather uneven. So it is impossible to discuss the statistical deficiencies, and the possible political/sociological consequences of such mass media data analytical malfeasance, in all of it.

My reply:

I think the report is pretty good. Sure, some of the graphs don’t present data patterns so clearly, but as Antony Unwin and I wrote a few years ago, infovis and statistical graphics have different goals and different looks. In this case, I think these are the main messages being conveyed by these plots:
– There have been a lot of school shootings in the past decade.
– They’ve been happening all over the place, at all different times and to all different sorts of students.
– This report is based on real data that the researchers collected.
Indeed, at the bottom of the report they provide a link to the data on Github.

Regarding Gunter’s points 1 and 2 above, sure, there are other ways of analyzing and graphing the data. But (a) I don’t see why he says the graph is a deliberate attempt to mislead, and (b) I think the graphs are admirably transparent.
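
As a minimal sketch of one of those other ways, here is the rate comparison Gunter asks for in point 1: incidents per 1,000 schools in each category, by time of day, rather than raw counts. The category names, school counts, and incident records are all invented for illustration and are not taken from the CNN data; `incident_rates` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Hypothetical inputs, invented purely for illustration (NOT the CNN data).
schools_in_category = {"category A": 2000, "category B": 500}
incidents = (
    [("category A", "morning")] * 4
    + [("category B", "morning")] * 2
)

def incident_rates(incidents, schools_in_category, per=1000):
    """Return incidents per `per` schools for each (category, period) pair."""
    counts = Counter(incidents)
    return {
        (category, period): per * n / schools_in_category[category]
        for (category, period), n in counts.items()
    }

rates = incident_rates(incidents, schools_in_category)
for (category, period), rate in sorted(rates.items()):
    print(f"{category:<12} {period:<8} {rate:.1f} per 1000 schools")
```

In this made-up example, category A has more incidents (4 vs. 2) but the lower rate, because it contains four times as many schools. That is the distinction between counts and rates that the point turns on.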

Consider for example the first two graphs in the report, here:

and here:

Both these graphs have issues, and there are places where I would’ve made different design choices. For example, I think the color scheme is confusing in that the same palette is used in two different ways. I also think it’s just wack to make three different graphs for early morning, daytime, and late afternoon and evening (and to compress the time scales for some of these). It’s also a mistake to compress Sat/Sun into one date: distorting the scale obscures the data. Instead, they could simply have rotated that second graph 90 degrees, running day of week down from Monday to Sunday on the vertical axis and time of day from 00:00 to 24:00 on the horizontal axis. One clean graph would then display all the shootings and their times.

The above graph has a problem that I see a lot in data graphics, and in statistical analysis more generally, which is that it is overdesigned. The breaking up into three graphs, the distortion of the hour and day scales, the extraneous colors (which convey no information, as time is already indicated by position on the plot) all just add confusion and make a simple story look more complicated.
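
The single-graph layout suggested above (day of week running down the vertical axis, time of day from 00:00 to 24:00 along the horizontal axis, no compressed scales) can be sketched in plain text. This is only an illustration under stated assumptions: the incident records are invented, not taken from the report’s data, and `day_hour_grid` and `render` are hypothetical helper names.

```python
from collections import Counter

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical (weekday, hour) records, invented for illustration only.
incidents = [
    ("Mon", 8), ("Mon", 15), ("Wed", 12), ("Wed", 12),
    ("Fri", 15), ("Fri", 19), ("Sat", 22),
]

def day_hour_grid(records):
    """Count incidents per (day, hour); every day and hour keeps its own cell."""
    counts = Counter(records)
    return [[counts[(day, hour)] for hour in range(24)] for day in DAYS]

def render(grid):
    """Draw the grid as text: one row per day, one column per hour."""
    lines = ["     " + " ".join(f"{hour:02d}" for hour in range(24))]
    for day, row in zip(DAYS, grid):
        cells = " ".join(" ." if n == 0 else f"{n:2d}" for n in row)
        lines.append(f"{day}  {cells}")
    return "\n".join(lines)

grid = day_hour_grid(incidents)
print(render(grid))
```

Because Saturday and Sunday each get their own row and every hour its own column, position alone carries the timing information and no extra colors are needed.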

So, sure, the graphs are not perfect. Which is no surprise. We all have deadlines. My own published graphs could be improved too.

The thing I really like about the graphs in Walker and Petulla’s report is that they are so clearly tied to the data. That’s important.

If someone were to do more about this, I think the next step would be to graph shootings and other violent crimes that occur outside of schools.


  1. Antony Unwin says:

    Thanks, Bert, for drawing attention to this. I agree with most of Andrew’s comments, but the headline about more shootings happening on Fridays and during the afternoon does not look right. Checking the data shows that there were more killings on Fridays because two of the worst events happened on Fridays, but the number of shootings was actually highest (just) on Wednesdays. I see no evidence of the number of shootings (or killings) being higher in the afternoon.

    Graphics are made up of the graphic and various bits of text (legend, caption, title, annotations, accompanying text). It is always disturbing when these different components are inconsistent. How many people only read a headline and assume the accompanying graphic is evidence for it?

  2. Kevin says:

    I agree with Andrew that this isn’t the most egregious set of plots. I do think it’s a serious mistake to lump the 9 mass shootings in with the other 171 shootings though, especially given that much of their analysis is on casualties rather than just frequency. Skews everything.

  3. Kevin Dick says:

    It also seems like they should have provided additional graphics adjusting for student population and showing school shooting deaths relative to all shooting deaths, all violent deaths, and all accidental/violent deaths in the relevant population.

  4. jim says:

    I think the graphics are fine.

    I didn’t notice that they had used the same color scheme for different meanings and wasn’t the least bit confused by it. I like the scheme, the colors are distinctive and pleasant. Repeating the same scheme isn’t a problem because all the graphics have totally different meanings and the scheme is simple and clear.

    My only concern is the confounding of “shooting at a school” with “mass school shooting”. I think people see these as completely different types of events and the report makes no distinction at all except in the one graphic. All shootings at schools are bad, but there’s a difference between street violence and disputes spilling into schools (crime) and a massive shooting rampage (terrorism). People used to think of the latter as a special kind of event, but after shootings like Las Vegas and others, it’s not really clear that the fact that some of these terrorist shootings happen at schools even matters.

    So that makes me wonder if any “school shooting” is special or significant in any way. Both types – street crime and terrorism – both seem to reflect broader trends in society.

    • Ben says:

      > Both types – street crime and terrorism – both seem to reflect broader trends in society.

      > If someone were to do more about this, I think the next step would be to graph shootings and other violent crimes that occur outside of schools.

      Initially I was thinking that I would want more specific information about these shootings. Things that might affect policy, for example: were the perpetrators from within the school or outside it, and where did the guns come from?

      But I guess what you’re both suggesting is that we figure out where shootings at schools fit in the bigger picture of shootings everywhere. I guess the advantage to that would be that it’d be easier to figure out what is unusual or not in the details after you’ve done the high level stuff.

  5. Random says:

    For once, something positive! I like it; you could try it more often too.

  6. Phillip Middleton says:

    Agreed with the general thoughts here. What I would add is that the crux of the report was simplified for the general public; we have to take *baby steps* to get mortal humans to think in more than one dimension. The plots make their simplified points but, along with Antony’s comment, I too do not see from the data a clear distinction in morning/afternoon shooting frequencies. I do agree that there is more transparency in the data comprising the plots w/ CNN versus, well, other publications. It’s the beginning of a brave new world :)

    I get Walker/Petulla’s point, however, which, rightly or wrongly, could be an implicit acknowledgement of the question: “If one needed to direct limited resources at the most likely periods of shooting (assuming every school cannot have a police officer present the entire day), given some level of evidence, where would be ‘better’ and when?” (Note I didn’t say ‘best’.) If anything, maybe it’s a conversation starter for the combined efforts of the law enforcement and primary education communities. But this is also where the backend statistical analysis would be needed.

  7. Terry says:

    The CNN analysis is an interesting little study in how to shape a narrative to a desired end, and, perhaps more importantly, how to shape it to avoid undesired ends.

    So you can treat it as a little exercise in how savvy a consumer of manufactured analyses you are by looking for the ways the narrative was shaped. What forks were chosen, and what forks were avoided?
