commit: e8e526bdb58ea53fafea5fd82e3535b7fe218b23
parent a146d09884ac03cd824fbb4b32f35d2f4d17a683
Author: Haelwenn (lanodan) Monnier <contact@hacktivis.me>
Date: Fri, 31 Oct 2025 15:03:27 +0100
cmd/shuf.1: document shuffling method
Diffstat:
2 files changed, 28 insertions(+), 7 deletions(-)
diff --git a/cmd/shuf.1 b/cmd/shuf.1
@@ -1,7 +1,7 @@
.\" utils-std: Collection of commonly available Unix tools
.\" Copyright 2017 Haelwenn (lanodan) Monnier <contact+utils@hacktivis.me>
.\" SPDX-License-Identifier: MPL-2.0
-.Dd January 17, 2025
+.Dd October 31, 2025
.Dt SHUF 1
.Os
.Sh NAME
@@ -18,17 +18,23 @@
.Op Fl n Ar num
.Op Ar string...
.Sh DESCRIPTION
+In its first form,
.Nm
-reads each
+reads lines from each
.Ar file
-in sequence and writes it on the standard output with some shuffling applied to each line.
-If no
+or, if none is given or when
.Ar file
-is given or if
-.Ar file is
+is
.Qq - ,
+lines are read from standard input.
+They are then shuffled and printed using a reservoir shuffle; see
+.Sx SHUFFLING
+for details.
+.Pp
+In its second form,
.Nm
-reads from the standard input.
+uses a Fisher-Yates shuffle to swap-shuffle all the strings,
+and then prints them out as lines.
.Sh OPTIONS
.Bl -tag -width _n_num
.It Fl e
@@ -42,8 +48,22 @@ lines.
.It Fl z
Use NULL as line delimiter, not newline.
.El
+.Sh SHUFFLING
+In its first form,
+.Nm
+.\" LINES_LEN
+uses a reservoir of 512 lines.
+It picks a random location,
+prints a line if present,
+then inserts a newly read line.
+Once all lines are read it prints the lines still present in the reservoir.
+.br
+While this isn't a fully random shuffle, as lines beyond the first 512 are never printed first,
+it allows using a bounded amount of memory.
.Sh EXIT STATUS
.Ex -std
+.Sh SEE ALSO
+.Xr sort 1
.Sh HISTORY
An
.Nm
diff --git a/cmd/shuf.c b/cmd/shuf.c
@@ -19,6 +19,7 @@
// Not a full shuffle, if there is more than 512 lines then last lines are never going to be printed first.
// But this allows bounded memory usage.
+// /!\ Make sure to modify the manpage as well if this gets changed /!\
// FIXME: handle newline-less lines
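
For readers of this commit, the two methods named in the new man page text can be sketched roughly as follows. This is only an illustration written from the SHUFFLING description, not the code in cmd/shuf.c: the names RESERVOIR_LEN, reservoir_shuffle and fisher_yates are hypothetical, the input is an in-memory array standing in for lines read from a file, and rand() is used purely for brevity (it has modulo bias and is not what a real tool should rely on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical name; the man page states a reservoir of 512 lines. */
#define RESERVOIR_LEN 512

/*
 * Reservoir shuffle as described in SHUFFLING (illustrative sketch).
 * in:  n non-NULL lines, standing in for lines read from input
 * out: receives the same n pointers, in the order they would be printed
 * Returns the number of lines written to out.
 */
size_t
reservoir_shuffle(const char **in, size_t n, const char **out)
{
	const char *reservoir[RESERVOIR_LEN] = {NULL};
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		assert(in[i] != NULL); /* NULL marks an empty slot below */

		/* Pick a random location (rand() only for illustration). */
		size_t pos = (size_t)rand() % RESERVOIR_LEN;

		/* Print the line stored there, if any... */
		if (reservoir[pos] != NULL)
			out[written++] = reservoir[pos];

		/* ...then insert the newly read line in its place. */
		reservoir[pos] = in[i];
	}

	/* Input exhausted: print whatever is still in the reservoir. */
	for (size_t i = 0; i < RESERVOIR_LEN; i++)
		if (reservoir[i] != NULL)
			out[written++] = reservoir[i];

	return written;
}

/*
 * Fisher-Yates shuffle over an array of strings, in place
 * (illustrative sketch of the method named for the second form).
 */
void
fisher_yates(const char **items, size_t n)
{
	for (size_t i = n; i > 1; i--) {
		size_t j = (size_t)rand() % i; /* 0 <= j < i */
		const char *tmp = items[i - 1];
		items[i - 1] = items[j];
		items[j] = tmp;
	}
}
```

In this sketch the first line printed by reservoir_shuffle always comes from the first 512 inputs, since a line can only be printed after a later line lands on its slot. That is the limitation the man page mentions, traded for memory use that does not grow with the input.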