A coworker of mine is fond of this as an interview question.
I liked the question so much I decided to do a complete analysis of it.
Here's the question:



You have an array of n + m bytes. You also have a single byte of temporary storage space.
Using that single byte of storage, rearrange the bytes in the array such that the first
n bytes are now at the end of the array (and the last m bytes are now at the beginning of
the array).

Let's take an example. It's easiest to think of the array of bytes as a character string.
Thus, suppose n = 5 and m = 8, and the array contains: "BlockChopping". Then the goal is to reorder
the bytes to form "ChoppingBlock".
The questions to ask are:
 What are some algorithms to solve this problem?
 How many operations do those algorithms take?
 What are the best and worst cases (values of n and m) for this problem?



Some definitions:
 (n, m) is shorthand for the problem with n and m bytes to swap.
 r = n + m, the total number of bytes in the array
 s(n, m) = number of swap operations used by an algorithm to solve the problem for the
given values of n and m. A swap
is defined as taking byte A, copying it into the storage byte, copying byte B into byte A's
original position, and then copying the storage byte into byte B's original position.
 c(n, m) = number of copy operations used by an algorithm to solve the problem for the
given values of n and m.
c(n, m) = 3*s(n, m) if only swap operations are used.
 s_max(r) = maximum number of swap operations used by an algorithm to solve the problem
for the given value of r.
 s_min(r) = minimum number of swap operations used by an algorithm to solve the problem
for the given value of r.
 c_max(r) = maximum number of copy operations used by an algorithm to solve the problem
for the given value of r.
 c_min(r) = minimum number of copy operations used by an algorithm to solve the problem
for the given value of r.

Before continuing, you may want to solve this problem on your own.
Algorithm 1
One way to solve this is to use a recursive approach:
 Let p = minimum(n, m)
 Move those p blocks to their final destination, swapping them with
whatever bytes were already there.
 You now have p bytes in the correct place. What remains is the same
problem, only smaller. If n < m then what remains are (m  n) and n
bytes to rearrange. If n > m, then what remains are (n  m) and m
bytes to rearrange. If n == m, then you're already done, of course!
In our example, here are the steps using algorithm 1. The blue line block is
just for clarity  it doesn't actually represent a space, but rather where the
first block ends and the second begins. In the example below, the darker squares
are ones which have reached their correct positions.
B 
l 
o 
c 
k 

C 
h 
o 
p 
p 
i 
n 
g 
To start with, n = 5 and m = 8. We now move "Block" into the correct position.
p 
p 
i 
n 
g 

C 
h 
o 
B 
l 
o 
c 
k 
Now we've reduced this to a smaller problem, where n = 5 and m = 3. We now move "Cho" into
the correct position.
C 
h 
o 
n 
g 

p 
p 
i 
B 
l 
o 
c 
k 
Now we've reduced this to a smaller problem, where n = 2 and m = 3. We now move "ng" into
the correct position.
C 
h 
o 
p 
i 

p 
n 
g 
B 
l 
o 
c 
k 
Now we've reduced this to a smaller problem, where n = 2 and m = 1. We now move "p" into
the correct position. Note that we are swapping p's here; if you were doing this by hand
of course you wouldn't do that, but that only works by chance; in the general case you have
to actually do the swap.
C 
h 
o 
p 
i 

p 
n 
g 
B 
l 
o 
c 
k 
Now we've reduced this to a smaller problem, where n = 1 and m = 1. We simply swap the
two characters and we're done.
C 
h 
o 
p 
p 

i 
n 
g 
B 
l 
o 
c 
k 
Clearly, the best case scenario for this algorithm is for cases where n == m. In that
case, we simply swap one half with the other and we're done. Thus:



For algorithm 1:
 s(n, n) = n
 s_min(r) = ceiling(r / 2)
 c_min(r) = 3 * ceiling(r / 2)

What's the worst case scenario? The worst case is when the problem never reduces to the
n == m case until n == 1. How many swaps are required in this case? Each swap except for
the last one puts
exactly one character into the right position (the last swap puts two characters in the
right position). Thus, (n + m  1) swaps are required. So:



For algorithm 1:
 s_max(r) = r  1
 c_max(r) = 3 * (r  1)

This doesn't exactly answer the question, though. For what values of n, m does the worst case
occur?



THEOREM:
For algorithm 1, c(n, m) = 3 * (r  1) if and only if gcd(n, m) = 1.

Proof:
To prove this statement, it is sufficient to prove that (n, m) with gcd(n, m) = 1 always
reduces to a problem where the gcd remains 1.
 The (n, m) problem reduces to (n  m, m) or (m  n, n).
 In the first case, suppose gcd(n  m, m) = g. Since g divides m, there exists a such that
a * g = m. Since g divides n  m, there exists b such that b * g = n  m. Adding
those equations, we get (a + b) * g = n. Thus, g divides n and m. But we know
gcd(n, m) = 1, so gcd(n  m, m) = g = 1.
 In the second case, suppose gcd(m  n, n) = g. Since g divides n, there exists a such that
a * g = n. Since g divides m  n, there exists b such that b * g = m  n. Adding
those equations, we get (a + b) * g = m. Thus, g divides n and m. But we know
gcd(n, m) = 1, so gcd(m  n, n) = g = 1.
 Since the problem always reduces to a case where the gcd is 1, it will eventually
reduce to (1, 1) and thus is the worst case.
Similarly, if gcd(n, m) = g (not equal to 1), then the problem always reduces to a case
where the gcd is still g. So it will eventually reduce to the problem (g, g). c(g, g) = 3 * g,
which is a savings of 3 * (g  1) copy operations. Thus, c(n, m) = 3 * (r  1)  3 * (g  1)
= 3 * (r  g).



For algorithm 1, c(n, m) = 3 * (r  g)
where g = gcd(n, m).

Algorithm 2
A tricky way to solve this is to reverse each block, then reverse the entire array. Reversing
a block is easy  simply swap the two bytes on either end, then move inward until you reach
the middle.
B 
l 
o 
c 
k 

C 
h 
o 
p 
p 
i 
n 
g 
Reverse each side:
k 
c 
o 
l 
B 

n 
g 
i 
p 
p 
o 
h 
C 
Then reverse the whole thing:
C 
h 
o 
p 
p 
i 
n 
g 

B 
l 
o 
c 
k 
How many swaps does this take? floor(n / 2) + floor(m / 2) swaps to reverse the two
blocks, then floor((n + m) / 2) swaps to reverse the whole array. So:



For algorithm 2:
 s(n, m) = floor(n / 2) + floor(m / 2) + floor((n + m) / 2)
 s_max(r) = r
 c_max(r) = 3 * r
 s_min(r) = r  1
 c_min(r) = 3 * (r  1)

Thus, algorithm 2 is never better than algorithm 1, and usually worse. But it's kinda neat.
Algorithm 3
There's a fairly straightforward method which is the most efficient (but hardest to
program and somewhat difficult to analyze). The idea is to not swap but to instead always
copy a byte into its correct position, only using the storage byte to hold what used to
be in that position. To be precise:
 Start with the leftmost byte that is not in the correct position, call this byte A.
 Copy byte A into the storage byte. There is now an empty slot where byte A was; call
it the "empty slot".
 Copy the correct byte into the empty slot. After this step, the empty slot is now
where the last moved byte was.
 Repeat the above step until byte A is placed into an empty slot.
 If not done, then repeat starting with the first step.
To illustrate, with the storage byte displayed on the far right:
B 
l 
o 
c 
k 
C 
h 
o 
p 
p 
i 
n 
g 


Put "B" into the storage slot.

l 
o 
c 
k 
C 
h 
o 
p 
p 
i 
n 
g 

B 
"C" should go into the empty slot.
C 
l 
o 
c 
k 

h 
o 
p 
p 
i 
n 
g 

B 
"i" should go into the empty slot.
C 
l 
o 
c 
k 
i 
h 
o 
p 
p 

n 
g 

B 
"o" should go into the empty slot.
C 
l 

c 
k 
i 
h 
o 
p 
p 
o 
n 
g 

B 
There follows o, g, k, p, l, h, n, c, p, and finally B (from the storage byte).
C 
h 
o 
p 
p 
i 
n 
g 
B 
l 
o 
c 
k 


We got lucky here in that we didn't need to repeat the steps. This is the best case, requiring 1 copy
for the initial copy into the storage byte, then r copies to put everything into their correct
positions.
The worst case is when n == m, in which case we would have to put the storage byte into the empty
slot right away, and this algorithm would be equivalent to algorithm 1.



For algorithm 3:
 c(n, n) = 3 * n
 c_max(r) = 3 * floor(r / 2)

An interesting question to ask is: how do we tell if (n, m) is the best case and, more
generally, is there an easy way to compute c(n, m) for algorithm 3? What is c(4, 6),
c(5, 7), etc.?



THEOREM:
For algorithm 3, c(n, m) = g * (1 + (r / g)) where g = gcd(n, m).

Proof:
We will show that g repetitions of the algorithm are necessary, each repetition placing
r / g bytes into their correct slots.
 Let e = the position of the empty slot (numbered 0 through r  1). e = 0 at the
start of the algorithm.
 After 1 step, e = n. After another step, e = 2n (mod r). After t steps,
e = tn (mod r).
 We stop once e = m. So we want to find the smallest t such that tn == m (mod (n + m)).
 I leave it as an exercise to the reader to show that
t = r / g is the solution to that congruence.
 After these t steps, we move the byte from the storage byte into its empty slot,
so we need t + 1 copy operations, or 1 + (r / g) copies. We repeat this g times,
thus, g * (1 + (r / g)) copies are required.
Summary
Let's summarize the performance of the three algorithms:
Algorithm 
c_min(r) 
c_max(r) 
c(n, m) 
1 
3 * ceiling(r / 2) 
3 * (r  1) 
3 * (r  g) 
2 
3 * (r  1) 
3 * r 
3 * r if n and m are both even
3 * (r  1) otherwise 
3 
r + 1 
3 * floor(r / 2) 
g * (1 + (r/g)) 
Note, g = gcd(n, m).
Algorithm 3 is the best  it is always at least as good as algorithm 1, which in turn is
always at least as good as algorithm 2. Algorithm 3 is always better than algorithm 2.
