Adding missing end-of-file line endings

I recently started work in a code base where many files didn’t end with a newline character. The way my text editor is configured, it adds a trailing newline to the last line if there isn’t already one, which meant that any time I edited a file, the git diff would show the last line as changed along with a message saying “No newline at end of file”. This change in the file’s last line had nothing to do with the rest of the changes I made in the file, so ideally I wouldn’t want to put the end-of-file line ending change in the same commit as whatever feature I was working on.

Solution 1: Reconfigure my editor to not insist upon adding a newline at the end of files. This would certainly be possible, but undesirable. According to POSIX, a text file “contains characters organized into zero or more lines”^[1], and a line is “a sequence of zero or more non-newline characters plus a terminating newline character”^[2]. By this definition, POSIX expects the last line of the file to end with a newline. Since some tools assume that text files conform to this definition and don’t work quite as expected if they don’t, we might as well follow the POSIX definition.
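As a small illustration of a tool that relies on this definition, consider wc -l, which counts newline characters rather than "visible" lines. The following is a minimal sketch; the temporary file and variable names are made up for the example.

```shell
# A file whose last line lacks the terminating newline.
tmp=$(mktemp)
printf 'first\nsecond' > "$tmp"

# wc -l counts newline characters, so the unterminated last line is not counted.
unterminated_count=$(wc -l < "$tmp" | tr -d ' ')

# With the final newline present, the count matches what a reader would expect.
printf 'first\nsecond\n' > "$tmp"
terminated_count=$(wc -l < "$tmp" | tr -d ' ')

echo "without final newline: $unterminated_count"
echo "with final newline:    $terminated_count"
rm -f "$tmp"
```

The first count comes out as 1 and the second as 2, even though both versions of the file show two lines in an editor.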

Solution 2: Fix them all! After some experimentation, I came up with:

git grep -I --name-only -e '' | xargs -d '\n' sed -i -e '$a\'

Quick note if you’re on macOS: this makes use of some arguments for xargs and sed found in the GNU variants of those programs, but not in the BSD variants that come on your computer. You should install the GNU versions, which if you’re using Homebrew you can do using brew install findutils gnu-sed (findutils provides xargs, gnu-sed provides sed). By default they’ll be installed as gxargs and gsed, so you’ll need to substitute those in the above command accordingly.
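If the same command needs to work on both macOS and Linux, one option is to pick the tool names at run time. This is only a sketch: SED, XARGS and fix_eof_newlines are names invented here, and it assumes that a plain sed or xargs is the GNU variant whenever the g-prefixed one is absent.

```shell
# Prefer the g-prefixed GNU tools when present (Homebrew's default names);
# otherwise fall back to the plain names, assumed here to be the GNU variants.
if command -v gsed >/dev/null 2>&1; then SED=gsed; else SED=sed; fi
if command -v gxargs >/dev/null 2>&1; then XARGS=gxargs; else XARGS=xargs; fi

# Hypothetical wrapper around the command above; run it from inside the repository.
fix_eof_newlines() {
  git grep -I --name-only -e '' | "$XARGS" -d '\n' "$SED" -i -e '$a\'
}
```

Calling fix_eof_newlines from the repository root then behaves like the one-liner, using whichever tool names were found.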

How it works:

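git grep -I --name-only -e '' gives a list of all text files that git knows about. The empty pattern given with -e matches every line, -I skips binary files, and --name-only prints just the name of each matching file rather than the matching lines. (I chose git grep instead of git ls-files because I don’t want to touch binary files such as image assets.) It will print out each filename on its own line.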
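To see the difference between the two listings, here is a small sketch that builds a throwaway repository containing one text file and one file with a NUL byte, which git treats as binary. The file names and the repo, all_files and text_files variables are invented for the example.

```shell
# Throwaway repository with one text file and one binary-looking file.
repo=$(mktemp -d)
git -C "$repo" init -q
printf 'hello\n' > "$repo/notes.txt"
printf 'a\000b' > "$repo/image.bin"   # the NUL byte makes git treat it as binary
git -C "$repo" add notes.txt image.bin

# git ls-files lists every tracked file, binary or not.
all_files=$(git -C "$repo" ls-files | sort)

# git grep -I with an empty pattern lists only the tracked text files.
text_files=$(git -C "$repo" grep -I --name-only -e '')

echo "ls-files: $all_files"
echo "grep -I:  $text_files"
```

Only notes.txt appears in the git grep output, which is why it is the safer source of filenames to hand to sed.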

This list of filenames is piped into xargs. -d '\n' tells xargs to treat only newlines as separators in its input. By default it would also split on spaces and tabs, which would break on any filename that includes a space. xargs then runs sed -i -e '$a\' with the filenames as arguments, passing as many per invocation as fit on a command line rather than starting one sed per file.
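The effect of -d '\n' is easy to check in isolation. In this sketch (GNU xargs assumed, with made-up filenames and variable names), printf stands in for sed and wraps each argument it receives in brackets.

```shell
# With -d '\n', each input line becomes exactly one argument.
split_on_newlines=$(printf 'my file.txt\nother.txt\n' | xargs -d '\n' printf '[%s]\n')

# With the default separators, the space inside the first name splits it in two.
split_on_whitespace=$(printf 'my file.txt\nother.txt\n' | xargs printf '[%s]\n')

# All three names arrive as arguments of a single command invocation.
batch_size=$(printf 'a\nb\nc\n' | xargs -d '\n' sh -c 'echo "$#"' sh)

echo "$split_on_newlines"
echo "$split_on_whitespace"
echo "arguments in one invocation: $batch_size"
```

The first run yields two bracketed names, while the second yields three because "my file.txt" is broken at the space.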

sed’s -i argument causes it to edit each file in place instead of printing to stdout. GNU sed treats every file given with -i separately, so line addresses apply to each file in turn. -e means it will run the command that follows on the file. In the command, $ is an address that tells sed to only look at the last line of the file. a\ is the append command, and will append the text that follows the backslash after that line. In our case, there’s nothing after the backslash, and so nothing gets appended. It would seem that this ought to be a no-op, but sed operates under the POSIX definition of a line, meaning that all of the lines it outputs will end with newlines.
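A quick way to convince yourself of this is to run the sed command on two scratch files, one missing the final newline and one that already has it. This is a sketch assuming GNU sed; the directory, file and variable names are invented.

```shell
dir=$(mktemp -d)
printf 'no newline' > "$dir/missing.txt"      # 10 bytes, no terminating newline
printf 'has newline\n' > "$dir/present.txt"   # 12 bytes, already terminated
cp "$dir/present.txt" "$dir/present.orig"

# Same sed invocation as in the pipeline, applied to both files at once.
sed -i -e '$a\' "$dir/missing.txt" "$dir/present.txt"

missing_size=$(wc -c < "$dir/missing.txt" | tr -d ' ')
present_size=$(wc -c < "$dir/present.txt" | tr -d ' ')

echo "missing.txt is now $missing_size bytes"
echo "present.txt is now $present_size bytes"
```

missing.txt grows by exactly one byte, the newline, while present.txt is left byte-for-byte identical, so running the command a second time changes nothing.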

Take a look at your git diff to make sure the changes look reasonable, and if everything looks as you expected, go ahead and commit these end-of-file changes in one big commit.
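Besides reading the diff, you can list any tracked text files that still lack a final newline. The function below is a sketch: list_missing_eof_newline is a made-up name, and it assumes filenames without embedded newlines or characters that git would quote in its output.

```shell
# Prints the name of every tracked text file whose last byte is not a newline.
# Run it from the top of the repository.
list_missing_eof_newline() {
  git grep -I --name-only -e '' | while IFS= read -r file; do
    # $(...) strips a trailing newline, so a non-empty result means
    # the last byte of the file is something other than a newline.
    if [ -n "$(tail -c 1 "$file")" ]; then
      printf '%s\n' "$file"
    fi
  done
}
```

Run before the fix, it gives a preview of which files will change; run afterwards, it should print nothing.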