Automating Police Report Text: A Product Manager's View

Recently on Twitter, Peter Moskos asked why paperwork could not be made easier for cops.

One Way We Automated Report Content at StreetCred

My former company, StreetCred - which as CEO I crashed into the side of a mountain - had some pretty fantastic law enforcement technology: not “for its time” but just, full stop, great tech. Sure, it sucked, too. It is tech. But it was great because it removed toil and provided all the information needed to do the job without preaching about how the job should be done. It anticipated beautifully what the users would need to do next and provided it, plus an easy way to break out of our expected workflow when the user wanted to do something different.

My failure was funding; the nearly 20 agencies running the tool all loved it, and reported low- to mid-double-digit percentage improvements in the closure rates of cases they worked with it.

So I think I am well-suited to talk about tech. No one should listen to me about money. If you’re thinking about making law enforcement technology, you should read this and learn from our toil, mistakes, and false starts; by which I mean, our wisdom.


As experts in the processes of serving arrest warrants, we noticed that writing the narrative for police reports was a real slog. Cops would need to copy stuff from the warrant to the report - stuff that wasn’t easy for humans to copy, like driver license numbers, penal code entries, street addresses, descriptions… All the basic identifiers and references that a computer can copy faster than we can think about how to copy them.

This is stupid, we thought; all of that should just be automated.

Using the data we had and some very simple (tickboxes and dropdowns) original data entry in the field, StreetCred would generate arrest report narratives. This article describes what that was and how it worked.

A Sample Automated Report Narrative

On its face, it would seem really easy to generate the following sample text:

Here’s another version, with a female prisoner:

The rest of this article is how we did it, and why it is not easy at all.

What StreetCred Did

It aggregated records from multiple local, county, state, and federal systems, and then algorithmically analyzed the data to help, for example, determine the current location of wanted fugitives and, more important, analyze police interaction with civilians over time to determine which officers were behaving poorly. By “algorithmically analyzed” I mean that this was not Machine Learning, but it was orderly: a series of nested if/then/else statements.
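To make "a series of nested if/then/else statements" concrete, here is a minimal sketch of what rule-based ranking of a fugitive's possible addresses could look like. The source labels, thresholds, and scores are all invented for illustration; they are not StreetCred's actual rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AddressRecord:
    address: str
    source: str      # hypothetical labels: "jail", "dmv", or anything else
    last_seen: date  # when the source last reported this address


def rank_address(rec: AddressRecord, today: date) -> int:
    """Return a rough priority score; higher means check this address first."""
    age_days = (today - rec.last_seen).days
    # Plain nested if/else rules: no model, no training, easy to explain.
    if rec.source == "jail":
        if age_days <= 1:
            return 100  # in custody as of yesterday or today
        else:
            return 10   # old booking, probably released
    elif rec.source == "dmv":
        if age_days <= 365:
            return 60
        else:
            return 30
    else:
        if age_days <= 180:
            return 50
        else:
            return 20


def best_address(records: List[AddressRecord], today: date) -> Optional[AddressRecord]:
    """Pick the highest-scoring record, or None if there are no records."""
    if not records:
        return None
    return max(records, key=lambda r: rank_address(r, today))
```

The appeal of rules like these is that every result can be traced to a specific branch, which matters when an officer or an attorney asks why the system suggested something.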

This aggregation of data from heterogeneous systems required many things that many LE tech vendors are unwilling or unable to do:

Finding The Data

I said our police expertise gave us insight into police data sprawl. As an example, consider the police record of a vehicle stop that led to a probable-cause search of the vehicle, that led to an arrest, and after the person was released, an arrest warrant was issued on more complex charges. An everyday thing.

But, as examples:

Knowing where to find these records - often (in smaller agencies) stored in different, non-integrated systems made by different manufacturers - is just half the battle. Then one must access the data within: legally, politically, organizationally, technically. We chose interoperability over the much more complicated, expensive, and time-consuming “integration”. Interoperable meant that we asked these systems regularly for updates and updated our own records separately, continuing to treat the original system as the One Source of Truth. That meant we could not write our findings back. Their loss.
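The interoperability approach described here amounts to a one-way, read-only poll. The sketch below shows the general shape under some assumptions: the `fetch_updates` callable, the `id` and `updated_at` field names, and the ISO-8601 timestamp strings are all hypothetical stand-ins for whatever a given source system actually exposes.

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, str]


def sync_once(
    fetch_updates: Callable[[str], Iterable[Record]],
    local_copy: Dict[str, Record],
    since: str,
) -> str:
    """Pull records changed after `since` and refresh our local copy.

    The source system stays the single source of truth: we only read
    from it and never write anything back. Returns the newest
    `updated_at` value seen, to be used as `since` on the next poll.
    """
    newest = since
    for rec in fetch_updates(since):
        local_copy[rec["id"]] = dict(rec)  # overwrite our copy with theirs
        # ISO-8601 timestamps in one format compare correctly as strings.
        if rec["updated_at"] > newest:
            newest = rec["updated_at"]
    return newest
```

Calling `sync_once` on a schedule for each source keeps the local records current without touching the original system, which is also why findings could not flow back the other way.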

Opening Our Data

So as not to be part of the larger problem ourselves, we made a decision at the very beginning to open our data:

Report Writing Automation: Use The Data We Have

With respect to the fugitive, we had the following information:

With respect to the officer, we knew:

With respect to the agency, we knew:

Putting This Together

Data we knew we needed based on agency policy and procedure - this is an example, it changed per agency - included:

With the information I have just given you, our system could then produce a narrative provided that the officer enters the vehicle number, and ticks the following boxes:

The system knew the sex of the officer and the fugitive, so if the former was male and the latter female:

How This Worked

The officer would use a page in the StreetCred system to get all the information available on a particular fugitive. When they made an arrest, they would click the Generate Arrest Narrative button, and the dialogue would guide them through the necessary tickbox questions, tailored to the sex of the officer and the fugitive. The officer would enter the requested data and click “Generate”, and the system would then present the text of the report. Because StreetCred was required to run over a persistent VPN tunnel, it would also simultaneously send the text to the officer in an internal agency email (to avoid sending unencrypted information over the public Internet). The officer would then copy the narrative text and paste it into their arrest report.
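The generation step can be sketched as a template filled from data already on file plus the few items the officer enters in the field. Everything specific below is an assumption made for illustration: the field names, the tickboxes, and above all the sentence wording, which is placeholder text and not the narrative StreetCred produced.

```python
from typing import Dict

# Subject and object pronouns keyed by the sex recorded for the fugitive.
PRONOUNS = {"M": ("he", "him"), "F": ("she", "her")}


def generate_narrative(officer: Dict, fugitive: Dict, warrant: Dict, field: Dict) -> str:
    """Build narrative text from known data plus the officer's field entries."""
    subj, obj = PRONOUNS[fugitive["sex"]]
    parts = [
        f"On {field['date']}, I, Officer {officer['name']} (badge {officer['badge']}), "
        f"located {fugitive['name']} and advised {obj} that {subj} was under arrest "
        f"on warrant {warrant['number']} for {warrant['charge']}."
    ]
    # Each tickbox adds a sentence only if the officer ticked it.
    if field["warrant_confirmed"]:
        parts.append("I confirmed the warrant with Dispatch before the arrest.")
    if field["handcuffed"]:
        parts.append(f"I handcuffed {obj} and double-locked the handcuffs.")
    parts.append(f"I transported {obj} to jail in vehicle {field['vehicle_number']}.")
    return " ".join(parts)
```

Because each sentence is tied either to a stored record or to a box the officer ticked, the output contains nothing the officer did not select or confirm.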

What Was The Pushback?

Initially there were questions: is the machine writing this, or is the cop? Ultimately in conversations with city attorneys, it became clear that the officer was in fact writing this narrative because the only verbiage that was being created by StreetCred was created as a direct result of the officer’s actions: the officer selects the fugitive of interest, the officer confirms the information within StreetCred and validates the warrant with Dispatch, and the officer initiates the arrest narrative process and enters in the field the information that is unique to the arrest. There is, they and we argued, no difference between the officer holding the arrest warrant and copying from it the fugitive’s height, weight, and driver license information, and StreetCred copying that data for the officer.

What Was The Reaction?

Hey, I’d like to remove totally unnecessary toil from your work with no downside, is that OK with you?

How Did You Do It?

The attentive reader will note that the data simply must be aggregated from a range of sources. For example, we got the warrant data from the court record management system, the vehicle information and fugitive driver license photograph from the state DMV, lots of the situational awareness information (like gang members and sex offenders near the fugitive’s house) from the agency record management systems, current felony information from the National Crime Information Center, regional (misdemeanor) wants and warrants from a regional warrant database, death information from the Social Security Death Index, jail information from an internal proprietary system that queried the jails of all the surrounding counties, etc.

Some of the records that helped us locate people came from outside law enforcement. For example, if the driver license said Mr. Jones lives at 123 Any Lane, are there any municipal construction permits with that name at that address? Is he the owner of the building? Are any pets registered to him at that address? Any hunting licenses issued to him at that address? Etc. These came from municipal, county, and state records, and other agencies.

In each case, we would find the easiest way to export raw data from the system, then normalize and ingest it. This often required agreements (or, sometimes, non-agreements in which the vendor would not approve but told us they wouldn’t object). Often this required writing scripts in COBOL or other languages to interact with the system’s internal workings.

Why Don’t More People Do This?

Think of the level of intricate subject matter expertise described here. First, we needed to understand that the data existed. Second, we needed to figure out where it was. Third, we needed to get the permission of the agency overseeing the desired data set to allow it to be used for the purpose. Then we needed to get the permission of the vendor making the system that stored the data. Then we needed to reverse engineer the process of getting the data out of whatever weird, proprietary, non-relational flat file or relational database it was in, normalize it, then ingest it and de-moronize it to place it in an open standard format. Finally, we got to process the data.
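As a small illustration of the "get it out, normalize it, put it in an open format" steps, the sketch below parses one line of a fixed-width flat-file export and emits JSON. The column layout and field names are hypothetical; every real system had its own format, and working out that layout was the hard part.

```python
import json
from typing import Dict

# Hypothetical fixed-width layout: (field name, start column, end column).
LAYOUT = [
    ("last_name", 0, 12),
    ("first_name", 12, 22),
    ("dob", 22, 30),        # YYYYMMDD
    ("dl_number", 30, 40),
]


def parse_fixed_width(line: str) -> Dict[str, str]:
    """Slice one export line into raw fields, trimming the padding."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


def normalize(raw: Dict[str, str]) -> Dict[str, str]:
    """Clean up casing and date format so records from any source look alike."""
    dob = raw["dob"]
    return {
        "name": f"{raw['first_name'].title()} {raw['last_name'].title()}",
        "date_of_birth": f"{dob[0:4]}-{dob[4:6]}-{dob[6:8]}",
        "driver_license": raw["dl_number"].upper(),
    }


def to_open_format(line: str) -> str:
    """Flat-file line in, JSON text out."""
    return json.dumps(normalize(parse_fixed_width(line)), sort_keys=True)
```

Each new source would need its own parser, but once records pass through a shared `normalize` step, everything downstream can treat them the same way.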

That’s a lot of expertise, a lot of horse-trading, and a lot of hard work, and most vendors won’t do it and most chiefs, judges, and city administrators and attorneys won’t allow it.

When you ask yourself why we can’t “just” automate report writing, this should give you some insights into some of the answers.


Automating Police Report Text A Product Managers View - September 25, 2022 -