xargs and the unruly tags

A tale of two commands

I thought I was really clever when I configured my CI/CD pipeline to tag commits that got deployed and push the tags back into the repo, but I’m rarely as clever as I like to think: I had forgotten to put the proper checks in place to avoid these tag pushes triggering subsequent runs of the pipeline, and things got a little … out of hand.

I’d gone to bed just after pushing an update, and when I arose to check on it, I found that the deploy tagging stage had been running over and over and over and over and … you get the point. Thankfully, it had failed after about 130 rounds, so it could have been a lot worse, but I was left with a large number of useless and unwanted tags in the remote repo.

So how do you fix something like this? Yup, xargs to the rescue!

Where there’s a will …

At first, I didn’t really know how I’d go about it. I was hoping git would have some nice, built-in functionality for mass-deleting remote tags, but while I have found in retrospect that it does (see the postmortem), I couldn’t find it at the time.

However, because all the tags were for a specific commit, I did know that I could list all the relevant tags separated by newlines, using git tag --contains <SHA>.

So, with some helpful advice from Stack Overflow and this guy, I constructed this little command which sorted me out just fine:
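The snippet itself isn’t reproduced here, so what follows is a reconstruction from the pieces described in the rest of this post (git tag --contains <SHA>, the -I % placeholder, and git push origin :refs/tags/<tag_name>), not a verbatim copy. To keep it safe to run, the sketch uses a hypothetical list_tags function with made-up tag names in place of git tag --contains <SHA>, and puts echo in front of git so the commands are printed rather than executed:

```shell
# Hypothetical stand-in for `git tag --contains <SHA>`: prints one tag per line.
list_tags() {
  printf '%s\n' deploy-1 deploy-2 deploy-3
}

# Dry run: the leading `echo` prints each command instead of running it.
# For the real thing, pipe `git tag --contains <SHA>` in and drop the `echo`.
list_tags | xargs -I % echo git push origin :refs/tags/%
# prints:
#   git push origin :refs/tags/deploy-1
#   git push origin :refs/tags/deploy-2
#   git push origin :refs/tags/deploy-3
```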

Now, I’d come across xargs before, even done the ol’ copying and pasting from Stack Overflow trick, but it had always looked really complicated and no-one had ever told me why I’d need it or what it does; so I just carried on in blissful ignorance. Not this time, though. It was time to figure out what was going on.

Grokking xargs

The way xargs was sold to me was: “execute a command for each item in a list”. It’s actually more powerful than that, but that’s a great place to start.

Let’s use the man page to find out what that -I % bit means:

-I replace-str: “Replace occurrences in the initial-arguments with names read from standard input”

The string to use to indicate where to place arguments in the command to run. In the command above, we chose to use %, but you’re not limited to this.

Similar to printf and format strings in general, this places your arguments at your desired place in the command. In our case it both limits us to using one argument (tag) at a time, and it lets us append it to :refs/tags/ without being separated by a space.
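A small sketch of the placeholder at work, this time using {} instead of % to show that the choice of string is up to you. The tag names are made up, and echo is only there so the substituted argument can be seen:

```shell
# Two hypothetical tag names, one per line, fed to xargs on stdin.
# -I {} runs the command once per input line and swaps each {} for that line,
# even when the placeholder is glued to other text inside an argument.
printf '%s\n' deploy-1 deploy-2 | xargs -I {} echo ":refs/tags/{}"
# prints:
#   :refs/tags/deploy-1
#   :refs/tags/deploy-2
```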

That means that in the above snippet, xargs would, for each tag listed, run the command git push origin :refs/tags/<tag_name>, which pushes an empty source to that tag’s ref on the remote, thereby deleting it.

If all you want is to put the argument at the end of the command, you can even do without the -I. Say you want to recursively delete all the .swp files in a directory:
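A sketch of what that could look like, assuming find is what produces the list of files. It runs inside a throwaway directory with made-up file names so nothing real gets deleted; in practice you would point find at your own directory:

```shell
# Scratch directory with a few hypothetical swap files, so the demo is safe to run.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src/nested"
touch "$demo_dir/a.swp" "$demo_dir/src/b.swp" "$demo_dir/src/nested/c.swp" "$demo_dir/keep.txt"

# find prints one matching path per line; xargs appends them to the end of `rm`.
find "$demo_dir" -name '*.swp' | xargs rm

# List what survived: only keep.txt should be left.
find "$demo_dir" -type f
```

Paths containing spaces or newlines would trip this version up. Both GNU and BSD find and xargs offer -print0 and -0 respectively to deal with that.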

Be aware, though, that without either -I or -n (which limits the number of arguments used for each command), xargs will split the list you give it into sizeable chunks and apply as many arguments to the command as it can each time. That means that in this case, it’d likely end up running a single command along the lines of rm <file1> <file2> <file3> …, which is usually fine and what you want, but keep this in mind for when it isn’t.
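To make the difference visible, here is a dry run with echo in front of rm, so the commands are printed instead of executed. The list_files function and its file names are hypothetical stand-ins for whatever produces your list:

```shell
# Hypothetical file names, one per line, standing in for find's output.
list_files() {
  printf '%s\n' ./a.swp ./src/b.swp ./src/nested/c.swp
}

# Default behaviour: all names are appended to a single command line.
list_files | xargs echo rm
# prints: rm ./a.swp ./src/b.swp ./src/nested/c.swp

# With -n 1: at most one name per command, so three separate invocations.
list_files | xargs -n 1 echo rm
# prints:
#   rm ./a.swp
#   rm ./src/b.swp
#   rm ./src/nested/c.swp
```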

This is only scratching the surface of what xargs can do, but it’s enough to make it do some pretty heavy lifting. It might not be something to reach for very often, but for when you do need it, it’s a great tool to have in your belt.

Postmortem ⚰️

Now, you might have noticed that I did a git push for each tag that I was deleting, and you might be thinking that for over a hundred tags, it must have taken quite some time. You would be right. Luckily, I was working on something else, so I could happily let it run in the background. But we can do better!

xargs has an option -P or --max-procs, which you can use to decide how many processes to run in parallel. The default is 1, but if you set it to 0, it will run as many as it can. This could have saved us quite some time, assuming git would let us run multiple push operations from the same repo at the same time. But there is an even better way:
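Before getting to that, here is a rough sketch of what the parallel variant could look like. It is a dry run with echo and a hypothetical list_tags function with made-up tag names; I haven’t checked how git behaves with several pushes running at once from the same repo, so treat it as an illustration of -P rather than a recommendation:

```shell
# Hypothetical stand-in for `git tag --contains <SHA>`: prints one tag per line.
list_tags() {
  printf '%s\n' deploy-1 deploy-2 deploy-3 deploy-4
}

# -P 4 lets xargs run up to four commands at the same time;
# -I % still means one tag per command.
# Dry run via echo; the order of the printed lines may vary between runs.
list_tags | xargs -P 4 -I % echo git push origin :refs/tags/%
```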

As outlined in this Stack Overflow response, you can use a whitespace-separated list of tag names (<tags>) with git push; so we could have run git push --delete origin <tags> to achieve the same outcome as deleting them one by one.

If we rewrite the command from earlier, we can both simplify it and do it all in a single push:
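Again reconstructed from the description above rather than copied verbatim: without -I, xargs appends all the tag names to the end of git push --delete origin, so everything goes out in one push. As before, list_tags and its tag names are hypothetical, and echo turns it into a dry run:

```shell
# Hypothetical stand-in for `git tag --contains <SHA>`: prints one tag per line.
list_tags() {
  printf '%s\n' deploy-1 deploy-2 deploy-3
}

# No -I this time, so every tag is appended to one single command.
# For the real thing, pipe `git tag --contains <SHA>` in and drop the `echo`.
list_tags | xargs echo git push --delete origin
# prints: git push --delete origin deploy-1 deploy-2 deploy-3
```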

… yeah, that would have been a lot more efficient 😅