Pair Programming with ChatGPT: Bash Retry with STDIN Replay
===========================================================
ChatGPT can be a very useful programming assistant. I’ve used it for debugging, as a documentation reference, and more recently for trying out random ideas that I think I would find useful. Its conversational nature and correctability make it well suited to problem solving.
Unreliable services outside of your control are disruptive to automation. Sometimes the best you can do is retry while praying that the problem goes away after a few attempts. It would be useful to be able to wrap routines that depend on such services in a retry function.
A basic formulation of the problem is relatively straightforward to implement, but a trickier aspect to handle is retrying commands that accept input on STDIN. Properly retrying such a command requires capturing and replaying what is received on STDIN. This added complication made it feel like a fun challenge to try using ChatGPT as a pair-programming assistant.
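For reference, the basic formulation without any STDIN handling might look like the following minimal sketch. The name `retry_simple` is hypothetical, and the semantics (the first argument is the number of attempts, with 0 meaning forever) are borrowed from the problem statement given to ChatGPT below.

```bash
#!/usr/bin/env bash

# Hypothetical helper: run a command up to ATTEMPTS times (0 = forever),
# stopping at the first success. STDIN is not captured, so piped input
# would only be seen by the first attempt.
retry_simple() {
    local attempts="$1"
    shift
    local count=0
    local exit_code=0
    while ((attempts == 0 || count < attempts)); do
        "$@"
        exit_code=$?
        if ((exit_code == 0)); then
            break
        fi
        count=$((count + 1))
    done
    return "${exit_code}"
}
```

A call such as `retry_simple 3 some_flaky_command --flag` (a placeholder command) would run it at most three times and return the status of the last attempt.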
Before highlighting the progression of the effort, here is a direct link to the final implementation: retry.sh
Here’s the original problem statement I gave it:
Hello! I need a bash script that provides a function called `retry`. Its first argument is the number of times to retry a command, with 0 meaning to retry forever. The remaining arguments are a command and its arguments to retry. The tricky part about this retry function is that I need it to be able to have buffered what it receives on STDIN, such that it can replay it to the command on each retry. For example, if I write `retry 4 bash -e < <(echo "date && false")`, the inner command (`bash -e`) will receive the text `date && false` every time it’s retried, even though the `retry` function only received that text on its STDIN once.
It cheerfully produced the following output:
#!/bin/bash
retry() {
    local retries="$1"
    shift
    local cmd=("$@")
    local tmp_input_file
    tmp_input_file="$(mktemp)"
    cat >"${tmp_input_file}"
    local count=0
    local exit_code
    while [[ "${retries}" -eq 0 || "${count}" -lt "${retries}" ]]; do
        "${cmd[@]}" <"${tmp_input_file}"
        exit_code="$?"
        if [[ "${exit_code}" -eq 0 ]]; then
            break
        else
            count=$((count + 1))
        fi
    done
    rm -f "${tmp_input_file}"
    return "${exit_code}"
}
I was pleasantly surprised to find that it behaved correctly for the example I provided!
➜ ~ retry 4 bash -e < <(echo "date && false")
Wed Apr 5 07:20:24 PM EDT 2023
Wed Apr 5 07:20:24 PM EDT 2023
Wed Apr 5 07:20:24 PM EDT 2023
Wed Apr 5 07:20:24 PM EDT 2023
➜ ~ echo $?
1
At this point I started wondering about a handful of corner cases, and how many of them it could help me address. The first one I pursued was not needing to buffer all of STDIN before starting the first attempt.
Is it possible to modify this to not need to wait for all of stdin to be received before running the command?
Its idea was to replace the temporary file with a named pipe fed by a background job, which implemented the behavior I specified. However, in doing so, it broke the ability for reattempts to receive the same STDIN data as the first attempt because data in named pipes is consumed when read.
@@ -5,25 +5,28 @@ retry() {
     shift
     local cmd=("$@")
-    local tmp_input_file
-    tmp_input_file="$(mktemp)"
-    cat >"${tmp_input_file}"
+    local fifo
+    fifo="$(mktemp -u)"
+    mkfifo "${fifo}"
+
+    cat >"${fifo}" &
     local count=0
     local exit_code
     while [[ "${retries}" -eq 0 || "${count}" -lt "${retries}" ]]; do
-        "${cmd[@]}" <"${tmp_input_file}"
+        "${cmd[@]}" <"${fifo}"
         exit_code="$?"
         if [[ "${exit_code}" -eq 0 ]]; then
             break
         else
             count=$((count + 1))
+            echo >&2 "Retrying... (${count}/${retries})"
         fi
     done
-    rm -f "${tmp_input_file}"
+    rm -f "${fifo}"
     return "${exit_code}"
 }
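The difference between a regular file and a named pipe can be seen in isolation. The following is a small sketch with made-up file names, assuming GNU coreutils `timeout` is available: the file can be read twice, while the pipe's data is gone after the first read and a second reader simply blocks.

```bash
#!/usr/bin/env bash

demo_dir="$(mktemp -d)"
file="${demo_dir}/input.txt"
fifo="${demo_dir}/input.fifo"
mkfifo "${fifo}"

# A regular file can be read any number of times.
echo "date && false" > "${file}"
file_first="$(cat "${file}")"
file_second="$(cat "${file}")"

# A named pipe hands its data to whichever reader consumes it first.
echo "date && false" > "${fifo}" &
fifo_first="$(cat "${fifo}")"
wait

# Nothing is left to replay. With no writer, opening the pipe blocks,
# so give up after one second (timeout exits with status 124).
fifo_second_status=0
fifo_second="$(timeout 1 cat "${fifo}")" || fifo_second_status=$?

echo "file:  first='${file_first}' second='${file_second}'"
echo "fifo:  first='${fifo_first}' second='${fifo_second}' (status ${fifo_second_status})"

rm -rf "${demo_dir}"
```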
When I challenged it on this, it acknowledged the error and switched back from a named pipe to a temporary file (retaining the background job to feed it), using `tail` to feed the contents of the file into the command.
Will this be able to replay everything that was received on stdin? I forgot if a named pipe can be read from multiple times without the data that has been read becoming consumed.
@@ -5,28 +5,30 @@ retry() {
     shift
     local cmd=("$@")
-    local fifo
-    fifo="$(mktemp -u)"
-    mkfifo "${fifo}"
-
-    cat >"${fifo}" &
+    local tmp_input_file
+    tmp_input_file="$(mktemp)"
+
+    # Background process to write stdin to temporary file
+    cat > "${tmp_input_file}" &
     local count=0
     local exit_code
     while [[ "${retries}" -eq 0 || "${count}" -lt "${retries}" ]]; do
-        "${cmd[@]}" <"${fifo}"
+        # Read from the temporary file and feed it to the command
+        "${cmd[@]}" < <(tail -n +1 -f "${tmp_input_file}")
         exit_code="$?"
         if [[ "${exit_code}" -eq 0 ]]; then
             break
         else
             count=$((count + 1))
-            echo >&2 "Retrying... (${count}/${retries})"
         fi
     done
-    rm -f "${fifo}"
+    # Clean up background process and temporary file
+    wait
+    rm -f "${tmp_input_file}"
     return "${exit_code}"
 }
I also asked it about correctly handling interrupts such as Ctrl-C. It reasonably added a `cleanup` function and registered it with `trap`.
Will this clean up the background `cat` process if interrupted with ctrl-c? Will the retry function itself exit on ctrl-c or a termination signal? It should.
@@ -10,6 +10,15 @@ retry() {
     # Background process to write stdin to temporary file
     cat > "${tmp_input_file}" &
+    bg_pid=$!
+
+    cleanup() {
+        kill "${bg_pid}" 2>/dev/null
+        rm -f "${tmp_input_file}"
+    }
+
+    # Trap to handle signals and clean up
+    trap cleanup EXIT INT TERM
     local count=0
     local exit_code
@@ -27,8 +36,7 @@ retry() {
     done
     # Clean up background process and temporary file
-    wait
-    rm -f "${tmp_input_file}"
+    cleanup
     return "${exit_code}"
 }
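The cleanup pattern can be sketched on its own. One caveat worth knowing: in bash, an `EXIT` trap fires when the shell process exits, not when a function returns, and a trap set inside a function stays in effect after the function returns. The sketch below (the function name `with_cleanup` is hypothetical) sidesteps this by giving the function a subshell body, so the trap is scoped to that subshell and runs when the function finishes.

```bash
#!/usr/bin/env bash

# Hypothetical example. The parentheses make the function body a subshell,
# so the EXIT trap runs as soon as the function finishes.
with_cleanup() (
    tmp_file="$(mktemp)"
    sleep 30 >/dev/null &
    bg_pid=$!

    cleanup() {
        kill "${bg_pid}" 2>/dev/null
        wait "${bg_pid}" 2>/dev/null
        rm -f "${tmp_file}"
    }
    trap cleanup EXIT
    # Turn signals into a normal exit so the EXIT trap does the cleanup once.
    trap 'exit 130' INT
    trap 'exit 143' TERM

    # Report what was created so a caller can check that it was cleaned up.
    echo "${tmp_file} ${bg_pid}"
)
```

Running `with_cleanup` prints the temporary file path and the background PID; by the time it returns, the file is removed and the background job is gone.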
When I tested this version, I found that the script was not exiting when the command succeeded. I suspected the use of `tail` was responsible for this.
This works when the command to retry fails. When the command succeeds, the script hangs and never exits, which is undesirable. I think this is because the tail command is not terminating. I’m not sure why it terminates when the command fails, but not when the command succeeds.
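One plausible explanation, which the conversation itself did not confirm: `tail -f` never closes its output, so a command that reads STDIN until end-of-file never sees one. `bash -e` exits as soon as `false` fails, without needing end-of-file, whereas after `true` it goes back to waiting for more input. The sketch below illustrates the effect with `wc -l` instead, assuming GNU coreutils `timeout` is available; the file name is arbitrary.

```bash
#!/usr/bin/env bash

input_file="$(mktemp)"
printf 'one\ntwo\n' > "${input_file}"

# Reading the file directly reaches end-of-file immediately.
direct_count="$(wc -l < "${input_file}")"

# Fed by `tail -f`, wc keeps waiting for more input. timeout gives up after
# two seconds, terminates the pipeline, and exits with status 124.
follow_status=0
timeout 2 bash -c 'tail -n +1 -f "$1" | wc -l' _ "${input_file}" || follow_status=$?

echo "direct read counted ${direct_count} lines"
echo "read through tail -f ended with status ${follow_status}"

rm -f "${input_file}"
```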
ChatGPT’s next idea was to switch to using `cat` instead of `tail`, and to nest it in a loop to continuously read data from the temporary file.
@@ -25,14 +25,18 @@ retry() {
     while [[ "${retries}" -eq 0 || "${count}" -lt "${retries}" ]]; do
         # Read from the temporary file and feed it to the command
-        "${cmd[@]}" < <(tail -n +1 -f "${tmp_input_file}")
-        exit_code="$?"
-
-        if [[ "${exit_code}" -eq 0 ]]; then
-            break
-        else
-            count=$((count + 1))
-        fi
+        while ! "${cmd[@]}" < <(cat "${tmp_input_file}"); do
+            exit_code="$?"
+
+            if [[ "${exit_code}" -eq 0 ]]; then
+                break 2
+            else
+                count=$((count + 1))
+                if [[ "${retries}" -ne 0 && "${count}" -ge "${retries}" ]]; then
+                    break 2
+                fi
+            fi
+        done
     done
     # Clean up background process and temporary file
This is a confusing change, as the value of having two nested loops is unclear. I observed that it never exits when the command succeeds, and reports success when a command fails.
This updated version results in strange behavior. `echo "date && false" | retry 4 bash -e` produces one line of output before exiting with status 0, when I expected four lines of output (for four attempts) and an exit status of 1. When I run `echo "date && true" | retry 4 bash -e`, it retries indefinitely and never returns.
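Both symptoms are consistent with how `$?` behaves after a negated command. Inside the body of `while ! cmd`, `$?` holds the status of the negated pipeline, which is always 0 there, so `exit_code` can never record the failure. When the command succeeds, the inner loop body never runs, so `count` never advances and the outer loop spins forever. A small sketch of the first point, using hypothetical function names:

```bash
#!/usr/bin/env bash

# Prints the value of $? as seen inside the body of `while ! cmd`.
status_inside_negated_loop() {
    while ! "$@"; do
        echo "$?"
        break
    done
}

# Prints the command's real exit status by capturing it right after the
# command itself runs, before anything else can overwrite $?.
status_captured_directly() {
    local status=0
    "$@" || status=$?
    echo "${status}"
}

status_inside_negated_loop false   # prints 0
status_captured_directly false     # prints 1
```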
The next attempt was to revert to using `tail`, depending on the `cleanup` function to kill the background `cat` process, which was supposed to cause `tail` to exit.
@@ -25,18 +25,14 @@ retry() {
     while [[ "${retries}" -eq 0 || "${count}" -lt "${retries}" ]]; do
         # Read from the temporary file and feed it to the command
-        while ! "${cmd[@]}" < <(cat "${tmp_input_file}"); do
-            exit_code="$?"
-
-            if [[ "${exit_code}" -eq 0 ]]; then
-                break 2
-            else
-                count=$((count + 1))
-                if [[ "${retries}" -ne 0 && "${count}" -ge "${retries}" ]]; then
-                    break 2
-                fi
-            fi
-        done
+        "${cmd[@]}" < <(tail -n +1 -f "${tmp_input_file}")
+        exit_code="$?"
+
+        if [[ "${exit_code}" -eq 0 ]]; then
+            break
+        else
+            count=$((count + 1))
+        fi
     done
     # Clean up background process and temporary file
This change also makes no sense: the command blocks waiting for input, so `cleanup` never runs and the background `cat` process is never killed. I decided that I needed to help it out a bit, and proposed that the background process stream STDIN to the command line-by-line through a named pipe, exiting if it detects that the command running in the background has exited.
The `false` test works correctly, but the `true` test hangs after printing the expected one line of output. It seems like the `tail` command is still not terminating, preventing progression of the script from reaching the `cleanup` function that would kill the background `cat` command. Instead of using `tail`, another function may need to be introduced that reads from `tmp_input_file`, checking if the command receiving input has terminated or not in between each line read and sent. Maybe a named pipe could be used by the function that reads from `tmp_input_file` to stream stdin to the command? I’m not sure if this would work though, or if it’s the best approach.
The implementation it generated served as the starting point for my own manual iterations on the program.
@@ -1,5 +1,20 @@
 #!/bin/bash
+stream_input() {
+    local tmp_input_file="$1"
+    local cmd_pid="$2"
+    local fifo="$3"
+
+    while kill -0 "${cmd_pid}" 2>/dev/null; do
+        while read -r line; do
+            echo "${line}" > "${fifo}"
+            if ! kill -0 "${cmd_pid}" 2>/dev/null; then
+                break
+            fi
+        done < "${tmp_input_file}"
+    done
+}
+
 retry() {
     local retries="$1"
     shift
@@ -12,9 +27,14 @@ retry() {
     cat > "${tmp_input_file}" &
     bg_pid=$!
+    local fifo
+    fifo="$(mktemp -u)"
+    mkfifo "${fifo}"
+
     cleanup() {
         kill "${bg_pid}" 2>/dev/null
         rm -f "${tmp_input_file}"
+        rm -f "${fifo}"
     }
     # Trap to handle signals and clean up
@@ -24,8 +44,15 @@ retry() {
     local exit_code
     while [[ "${retries}" -eq 0 || "${count}" -lt "${retries}" ]]; do
-        # Read from the temporary file and feed it to the command
-        "${cmd[@]}" < <(tail -n +1 -f "${tmp_input_file}")
+        # Start the command
+        "${cmd[@]}" < "${fifo}" &
+        cmd_pid=$!
+
+        # Stream input to the command
+        stream_input "${tmp_input_file}" "${cmd_pid}" "${fifo}" &
+
+        # Wait for the command to finish
+        wait "${cmd_pid}"
         exit_code="$?"
         if [[ "${exit_code}" -eq 0 ]]; then
It was at this point that I decided I had reached the point of diminishing returns: the program was becoming more complex than ChatGPT was capable of usefully understanding. The fact that it generated a plainly broken implementation even after being pointed in a solid direction was good evidence of this (`stream_input` will repeatedly dump the contents of `tmp_input_file` into the named pipe).
The final implementation ended up being significantly more complicated than I was originally expecting this problem to require. It is included below with explanations.
retry.sh
#! /usr/bin/env bash
# usage: retry ATTEMPTS COMMAND [...]
#
# This script provides a retry function capable of capturing and replaying data
# from stdin to the command being retried, so that each attempt receives the
# input. The data from stdin is streamed to the command immediately, it is not
# pre-buffered, and thus will not lock up if the input from stdin never ends.
#
# If ATTEMPTS is 0, COMMAND is retried forever until it succeeds.
# Suppress some of the job control messages. Unfortunately it is impossible to
# suppress all of them.
set +m
retry() {
    local retries="$1"
    shift
    local cmd=("$@")
    # Background process which reads stdin data from stdin_reader through
    # in_fifo, buffers it in replay_file, and writes it to the stdin of each
    # command attempt through out_fifo.
    stream_input() {
        local cmd_pid="$1"
        local stdin_reader_pid="$2"
        local in_fifo="$3"
        local out_fifo="$4"
        local replay_file="$5"
        # Hold out_fifo open for writing. The command might otherwise exit
        # prematurely if the pipe is closed by a bash I/O redirect operator in
        # between individual line writes.
        exec 6>"${out_fifo}"
        while read -r line; do
            echo "${line}" >&6
            if ! kill -0 "${cmd_pid}" 2>/dev/null; then
                # Stop replaying from replay_file to out_fifo if the command
                # has exited.
                break
            fi
        done < "${replay_file}"
        # This is used to signal if data is done being sent over in_fifo by
        # stdin_reader.
        local final_read=false
        exec 5<"${in_fifo}"
        while true; do
            if read -t 0 -r -u 5; then
                read -r -u 5 line
                echo "${line}" >> "${replay_file}"
                echo "${line}" >&6
            elif ${final_read}; then
                break
            fi
            if ! kill -0 "${cmd_pid}" 2>/dev/null; then
                # Stop passing lines from in_fifo to out_fifo if the command
                # has exited.
                break
            elif ! kill -0 "${stdin_reader_pid}" 2>/dev/null; then
                # Since stdin_reader has exited, nothing new will come over
                # in_fifo. Signal the next iteration of the loop to finish
                # reading whatever remains in the pipe and exit.
                final_read=true
            fi
        done
        # Close out_fifo and in_fifo.
        exec 5<&-
        exec 6>&-
    }
    # Create a named pipe to receive data from stdin asynchronously. Bind the
    # pipe to &3 to keep it open on both ends. It needs to be held open for
    # writing because subsequent instances of stream_input will block on
    # opening it for reading if stdin_reader has exited, which would otherwise
    # have been the only thing holding it open. It needs to be held open for
    # reading to prevent holding it open for writing from blocking.
    local in_fifo
    in_fifo="$(mktemp -u)"
    mkfifo "${in_fifo}"
    exec 3<>"${in_fifo}"
    cat > "${in_fifo}" &
    stdin_reader_pid=$!
    # The replay file buffers data received on stdin so that it can be passed
    # to reattempts of the command.
    local replay_file
    replay_file="$(mktemp)"
    cleanup() {
        # Kill all background jobs
        # https://unix.stackexchange.com/a/609300
        for j in $(jobs | awk '{gsub("[^0-9]","",$1);printf "%%%s\n", $1}'); do
            kill $j
        done
        [[ -e "${in_fifo}" ]] && rm -f "${in_fifo}"
        [[ -e "${replay_file}" ]] && rm -f "${replay_file}"
        [[ -e "${out_fifo}" ]] && rm -f "${out_fifo}"
        wait
    }
    trap cleanup EXIT INT TERM
    local count=0
    local exit_code
    while ((retries == 0 || count < retries)); do
        # Create a named pipe to receive buffered and replayed data from stdin
        # from stream_input.
        local out_fifo
        out_fifo="$(mktemp -u)"
        mkfifo "${out_fifo}"
        "${cmd[@]}" < "${out_fifo}" &
        cmd_pid=$!
        stream_input \
            "${cmd_pid}" \
            "${stdin_reader_pid}" \
            "${in_fifo}" \
            "${out_fifo}" \
            "${replay_file}" &
        stream_input_pid=$!
        wait "${cmd_pid}"
        exit_code="$?"
        rm -f "${out_fifo}"
        if ((exit_code == 0)); then
            break
        else
            count=$((count + 1))
        fi
    done
    cleanup
    return "${exit_code}"
}
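The file-descriptor handling is probably the least obvious part of the script. A reader of a named pipe sees end-of-file once the last writer closes it, and a `>` redirect opens and closes the pipe for every single command. The following sketch, with made-up paths and descriptor 6 as in the script, shows several separate writes reaching one reader through a descriptor that stays open:

```bash
#!/usr/bin/env bash

demo_dir="$(mktemp -d)"
fifo="${demo_dir}/out.fifo"
received="${demo_dir}/received.txt"
mkfifo "${fifo}"

# The reader runs until it sees end-of-file on the pipe.
cat "${fifo}" > "${received}" &
reader_pid=$!

# Open the pipe for writing once and keep it open across several writes.
exec 6>"${fifo}"
echo "first" >&6
echo "second" >&6
echo "third" >&6

# Closing the descriptor is what finally delivers end-of-file to the reader.
exec 6>&-
wait "${reader_pid}"

received_lines="$(wc -l < "${received}")"
received_text="$(cat "${received}")"
echo "reader received ${received_lines} lines"

rm -rf "${demo_dir}"
```

Writing each line with its own `> "${fifo}"` redirect instead would let the reader see end-of-file after the first line, which is the failure mode the comment in `stream_input` describes.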
Now that I have a functional implementation, it is easy to see in hindsight that bash was likely not the best choice for this. If nothing else, it’s an interesting example of the kinds of limits you can run into when using ChatGPT for programming problem-solving.