Finding the minimum number of rectangular pieces in a rectangular chocolate bar, with a rule
I'm having trouble with my school homework. I have a chocolate bar that consists of black, white, or black & white (mixed) squares. I'm supposed to divide it into two groups: one that has only white or mixed pieces, and the other that has only black or mixed pieces. Dividing the chocolate bar means cracking it either horizontally or vertically along a line that separates individual squares.
Given a layout of a chocolate bar, I am to find an optimal division that separates dark and white squares and results in the smallest possible number of pieces. The chocolate bar is no bigger than 50x50 squares.
The chocolate bar is given on standard input like this: the first line consists of two integers, M (the number of rows in the chocolate bar) and N (the number of columns); then there are M rows, each consisting of N characters symbolizing individual squares (0 = black, 1 = white, 2 = mixed).
Two example inputs follow (the correct outputs are 3 and 7, respectively):
3 3
1 1 2
1 2 0
2 0 0
4 4
0 1 1 1
1 0 1 0
1 0 1 0
2 0 0 0
My problem is that I managed to work out a solution, but the algorithm I'm using isn't fast enough when the chocolate bar is big, like this one for example:
40 40
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 1 2 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 0 2 1 2 1 2 0 0 1 2 2 0 0 0 0 0 0 0 0 1 1 2 1 2 0 0 0 0 0 0 0 0 0 0
0 0 0 1 2 2 0 1 1 1 1 1 0 0 1 2 2 0 0 0 0 0 1 0 0 2 2 1 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 2 2 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1 1 2 2 0 0 0 1 2 2 1 2 1 0 0 0 0 0 1 2 1 2 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 2 2 1 2 0 0 0 0 0 2 1 2 2 0 0 0 0 0 2 1 2 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 2 2 2 1 1 0 0 0 0 0 2 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 0 0
0 2 1 2 1 0 2 2 2 2 1 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 1 2 0 2 2 1 0 0 0 0 0 0
0 2 2 1 2 0 1 2 2 1 1 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0
0 2 2 1 2 0 0 0 0 2 1 2 1 2 1 1 2 0 2 0 0 0 0 0 0 0 1 2 2 2 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 2 2 2 2 1 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 2 1 1 2 2 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2 2 0 0 0 0
0 0 0 0 0 0 0 2 1 2 0 0 2 2 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 2 2 0 0 0 0
0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 2 2 1 0 0 0 0 2 0 1 1 1 2 1 2 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 2 0 0 0 0 0 0 2 1 2 2 2 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 2 0 0 0 0 0 0 1 2 1 1 2 2 0 0 0 0 0
0 0 0 0 0 0 1 2 1 2 2 1 0 0 0 0 0 0 0 1 2 1 2 0 0 0 0 0 0 0 0 0 2 1 2 0 0 0 0 0
0 0 0 0 0 0 1 2 2 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 2 0 0 0 0 0 0 0 0 0 0 2 1 1 0 0
0 0 0 0 0 0 1 1 1 1 1 2 2 0 0 0 0 0 0 0 0 1 1 1 2 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0
0 0 0 0 0 0 1 2 2 2 1 1 1 0 0 0 0 0 0 0 0 1 2 1 2 0 0 0 0 0 0 0 0 0 0 2 2 2 1 0
0 0 0 0 0 0 0 0 0 1 2 1 2 0 0 0 0 0 0 0 0 1 1 1 2 2 0 0 0 0 0 0 0 0 0 1 2 1 1 0
0 0 0 2 1 1 2 2 0 1 2 1 1 0 0 0 0 0 2 2 1 2 2 1 2 2 0 0 0 0 0 0 0 0 0 1 2 2 2 0
0 0 0 2 2 2 1 1 0 0 1 2 2 2 0 0 0 0 2 2 2 1 1 2 1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 2 1 2 2 1 1 0 2 1 2 1 2 1 2 1 1 2 1 1 1 1 1 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 2 2 2 1 0 1 1 1 1 1 1 2 1 1 2 2 1 0 1 2 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 0 0 2 1 1 1 2 1 2 0 0 1 2 1 2 1 2 2 0 0 0 0 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 2 2 1 1 2 2 1 1 1 1 1 1 1 2 1 0 0 0 0 0 0 0 2 2 2 0 0 0
0 0 0 0 0 0 0 1 1 1 2 0 0 1 1 1 2 2 1 2 2 2 1 0 0 0 1 1 1 0 0 0 0 0 1 2 1 0 0 0
0 0 0 0 0 0 0 2 1 1 2 0 0 0 0 0 0 2 2 2 1 1 1 0 0 0 1 2 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 2 2 2 2 0 0 0 0 0 0 2 1 1 1 2 0 0 0 0 1 2 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 2 2 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 0 0 0 1 1 2 0 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 0 0 0 0 0 0 0 1 2 1 0 0
0 0 0 0 0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 1 2 1 0 0
0 0 0 0 0 0 0 0 0 2 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
For that input my program takes 10 seconds (the correct answer for that one is 126, and I should be able to solve it in under 2 seconds!).
My algorithm, with some minor optimizations, works roughly like this: iterate through all the lines where it's possible to cut, then recursively do the same for the two newly emerged rectangles; if a rectangle cannot be divided any further, return 1.
After iterating through all the possible cuts, the function returns the minimum. Once the minimum is found, I store it, so if I happen to need to solve the same rectangle again, I just return the stored value.
I thought that if I have already solved a particular rectangle and now need to solve one that is one row or column bigger or smaller, I could somehow reuse the solution I already have. But I really don't know how I would implement such a feature. Right now my algorithm treats it as a completely new, unsolved rectangle.
My code so far:
#include <stdio.h>
#include <stdlib.h>

unsigned int M, N;
// Already solved rectangles: the value of pieces[y0][x0][y1][x1] is the optimal
// number of pieces into which the particular rectangle (that has its upper-left
// corner in [x0,y0] and its bottom-right corner in [x1,y1]) can be divided.
unsigned int ****pieces;
int ****checked;
unsigned int inf;

unsigned int minbreaks(int mat[M][N], unsigned int starti, unsigned int startj, unsigned int maxi, unsigned int maxj) {
    if (pieces[starti][startj][maxi][maxj] != 0) {
        return pieces[starti][startj][maxi][maxj];
    } else {
        unsigned int vbreaks[maxj - 1];
        unsigned int hbreaks[maxi - 1];
        for (unsigned int i = 0; i < maxj - 1; i++) {
            vbreaks[i] = inf;
        }
        for (unsigned int i = 0; i < maxi - 1; i++) {
            hbreaks[i] = inf;
        }
        unsigned int currentmin = inf;
        for (unsigned int i = starti; i < maxi; i++) {
            for (unsigned int j = startj; j < maxj - 1; j++) {
                if (mat[i][j] != 2) {
                    for (unsigned int k = startj + 1; k < maxj; k++) {
                        if (vbreaks[k - 1] == inf) {
                            for (unsigned int z = starti; z < maxi; z++) {
                                if (!checked[i][j][z][k]) {
                                    if (mat[z][k] != 2 && mat[i][j] != mat[z][k]) {
                                        vbreaks[k - 1] = minbreaks(mat, starti, startj, maxi, k) + minbreaks(mat, starti, k, maxi, maxj);
                                        if (vbreaks[k - 1] < currentmin) {
                                            currentmin = vbreaks[k - 1];
                                        }
                                        break;
                                    }
                                    checked[i][j][z][k] = 1;
                                }
                            }
                        }
                    }
                }
            }
        }
        for (unsigned int i = starti; i < maxi - 1; i++) {
            for (unsigned int j = startj; j < maxj; j++) {
                if (mat[i][j] != 2) {
                    for (unsigned int k = starti + 1; k < maxi; k++) {
                        if (hbreaks[k - 1] == inf) {
                            for (unsigned int z = startj; z < maxj; z++) {
                                if (!checked[i][j][k][z]) {
                                    if (mat[k][z] != 2 && mat[i][j] != mat[k][z]) {
                                        hbreaks[k - 1] = minbreaks(mat, starti, startj, k, maxj) + minbreaks(mat, k, startj, maxi, maxj);
                                        if (hbreaks[k - 1] < currentmin) {
                                            currentmin = hbreaks[k - 1];
                                        }
                                        break;
                                    }
                                    checked[i][j][k][z] = 1;
                                }
                            }
                        }
                    }
                }
            }
        }
        if (currentmin == inf) {
            currentmin = 1;
        }
        pieces[starti][startj][maxi][maxj] = currentmin;
        return currentmin;
    }
}

int main(void) {
    FILE *file = stdin;
    fscanf(file, "%u %u", &M, &N);
    int mat[M][N];
    pieces = malloc(sizeof (unsigned int***)*M);
    checked = malloc(sizeof (int***)*M);
    for (unsigned int i = 0; i < M; i++) { // initialize the pieces, checked and mat arrays
        pieces[i] = malloc(sizeof (unsigned int**)*N);
        checked[i] = malloc(sizeof (int**)*N);
        for (unsigned int j = 0; j < N; j++) {
            int x;
            fscanf(file, "%d", &x);
            mat[i][j] = x;
            pieces[i][j] = malloc(sizeof (unsigned int*)*(M + 1));
            checked[i][j] = malloc(sizeof (int*)*M);
            for (unsigned int y = i; y < M + 1; y++) {
                pieces[i][j][y] = malloc(sizeof (unsigned int)*(N + 1));
                for (unsigned int x = j; x < N + 1; x++) {
                    pieces[i][j][y][x] = 0;
                }
            }
            for (unsigned int y = 0; y < M; y++) {
                checked[i][j][y] = malloc(sizeof (int)*N);
                for (unsigned int x = 0; x < N; x++) {
                    checked[i][j][y][x] = 0;
                }
            }
        }
    }
    inf = M * N + 1; // one more than the maximal theoretically possible number of pieces
    unsigned int result = minbreaks(mat, 0, 0, M, N);
    printf("%u\n", result);
    return (EXIT_SUCCESS);
}
Does anybody have any ideas for improvements?
3 answers

There is a dynamic programming approach to this, but it won't be cheap either. You need to fill in a set of tables giving, for each size and position of rectangle within the main square, the minimum number of divisions necessary to divide up that smaller rectangle fully.
For a rectangle of size 1x1 the answer is 0.
For a rectangle of size AxB, check whether all of its cells are uniform enough that the answer is 0 for that rectangle. If so, fine. If not, try all possible horizontal and vertical divisions. Each of these divisions gives you two smaller rectangles. If you work out the answers for all rectangles of size (A-1)xB and smaller and size Ax(B-1) and smaller before you try to work out the answers for rectangles of size AxB, you already know the answers for the two smaller rectangles. So for each possible division, add up the answers for the two smaller rectangles and add one to get the cost for that division. Choose the division that gives you the smallest cost; that is the answer for your current AxB rectangle.
If you work out the answers for all smaller rectangles before larger rectangles, the very last answer you work out gives you the optimum number of divisions for the full square. Each division adds exactly one piece, so the number of pieces asked for in the question is that number plus one. The easiest way to recover what the best division actually is, is to keep a little extra information for each rectangle, recording the best division found.
For an NxN square there are O(N^4) rectangles: any two points in the square define a rectangle as opposite corners. A rectangle of size O(N)xO(N) has O(N) possible divisions, so you have something like an O(N^5) algorithm, or O(S^2.5) in terms of the input size S, since an NxN square has input data of size S = O(N^2).
(You could also do something very like this by taking your original code and storing the results from calls to minbreaks(), so that if minbreaks() is called more than once with the same arguments it simply returns the stored answer instead of recalculating it with yet more recursive calls to minbreaks().)
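As a rough sanity check of that bound against the 50x50 limit from the question, here is a back-of-the-envelope count (an estimate of table entries and split evaluations, not a measured running time). It assumes each rectangle is solved exactly once and has at most 2(N-1) possible breaks:

```latex
\[
\#\text{rectangles} = \left(\frac{N(N+1)}{2}\right)^{2}
  = \left(\frac{50 \cdot 51}{2}\right)^{2} = 1275^{2} = 1\,625\,625
\]
\[
\#\text{split evaluations} \le 1\,625\,625 \cdot 2(N-1)
  = 1\,625\,625 \cdot 98 = 159\,311\,250
\]
```

So the worst case is on the order of 10^8 simple table lookups, plus whatever the uniformity check for each rectangle costs.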
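The bottom-up filling order described above can be sketched in C roughly as follows. This is only a minimal illustration, not the code from the question: the names `cuts`, `is_uniform`, `min_pieces` and the fixed `MAXN` bound are assumptions made for the example, and the uniformity test is a plain scan rather than anything clever.

```c
#include <assert.h>
#include <limits.h>

#define MAXN 50  /* size limit taken from the question */

/* cuts[r0][c0][r1][c1]: fewest breaks needed for the rectangle spanning
   rows r0..r1-1 and columns c0..c1-1 (half-open bounds, hypothetical layout). */
static unsigned short cuts[MAXN][MAXN][MAXN + 1][MAXN + 1];

/* Returns 1 if the rectangle contains no black (0) or no white (1) square.
   Plain O(area) scan, kept simple for the sketch. */
static int is_uniform(int grid[MAXN][MAXN], int r0, int c0, int r1, int c1) {
    int has_black = 0, has_white = 0;
    for (int r = r0; r < r1; r++) {
        for (int c = c0; c < c1; c++) {
            if (grid[r][c] == 0) has_black = 1;
            else if (grid[r][c] == 1) has_white = 1;
        }
    }
    return !(has_black && has_white);
}

/* Returns the minimum number of pieces for the top-left m x n part of grid. */
int min_pieces(int grid[MAXN][MAXN], int m, int n) {
    assert(m >= 1 && m <= MAXN && n >= 1 && n <= MAXN);
    /* Smaller heights first, then smaller widths, so both halves of any
       break are already in the table when a rectangle is processed. */
    for (int h = 1; h <= m; h++) {
        for (int w = 1; w <= n; w++) {
            for (int r0 = 0; r0 + h <= m; r0++) {
                for (int c0 = 0; c0 + w <= n; c0++) {
                    int r1 = r0 + h, c1 = c0 + w;
                    if (is_uniform(grid, r0, c0, r1, c1)) {
                        cuts[r0][c0][r1][c1] = 0;
                        continue;
                    }
                    int best = INT_MAX;
                    for (int r = r0 + 1; r < r1; r++) {  /* horizontal breaks */
                        int cost = 1 + cuts[r0][c0][r][c1] + cuts[r][c0][r1][c1];
                        if (cost < best) best = cost;
                    }
                    for (int c = c0 + 1; c < c1; c++) {  /* vertical breaks */
                        int cost = 1 + cuts[r0][c0][r1][c] + cuts[r0][c][r1][c1];
                        if (cost < best) best = cost;
                    }
                    cuts[r0][c0][r1][c1] = (unsigned short)best;
                }
            }
        }
    }
    return cuts[0][0][m][n] + 1;  /* each break adds one piece */
}
```

Calling `min_pieces` on the two small sample inputs from the question gives 3 and 7. The table stores breaks rather than pieces, matching the description above, and the final `+ 1` converts back to the piece count the question asks for.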

For any arbitrary rectangle, we can know whether it contains either no white or no black pieces in O(1) time, with O(M * N) preprocessing of matrix prefix sums for white and black separately (count 1 for each piece).
We can store potential horizontal and vertical split points separately in two k-d trees for O(log(splitPoints) + k) retrieval for an arbitrary rectangle, again preprocessing the entire input.
After that, a general recursive algorithm could look like:
f(tl, br):
  if storedSolution(tl, br):
    return storedSolution(tl, br)
  else if isValid(tl, br):
    return setStoredSolution(tl, br, 0)

  best = Infinity

  for p in vSplitPoints(tl, br):
    best = min(
      best,
      1 + f(tl, (p.x - 1, br.y)) + f((p.x, tl.y), br)
    )

  for p in hSplitPoints(tl, br):
    best = min(
      best,
      1 + f(tl, (br.x, p.y - 1)) + f((tl.x, p.y), br)
    )

  return setStoredSolution(tl, br, best)
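The O(1) validity check from the first paragraph of this answer could look roughly like the following C sketch. The names `build_prefix`, `rect_count`, `is_valid` and the `MAXN` bound are made up for the illustration, rectangles use half-open bounds (rows r0..r1-1, columns c0..c1-1), and the k-d tree part is not covered.

```c
#include <assert.h>

#define MAXN 50  /* size limit taken from the question */

/* black_sum[i][j] = number of black (0) squares in rows 0..i-1, columns 0..j-1;
   white_sum is the same for white (1) squares. Mixed (2) squares count in neither. */
static int black_sum[MAXN + 1][MAXN + 1];
static int white_sum[MAXN + 1][MAXN + 1];

/* O(M * N) preprocessing of the two prefix-sum tables. */
void build_prefix(int grid[MAXN][MAXN], int m, int n) {
    assert(m >= 0 && m <= MAXN && n >= 0 && n <= MAXN);
    for (int i = 0; i <= m; i++) black_sum[i][0] = white_sum[i][0] = 0;
    for (int j = 0; j <= n; j++) black_sum[0][j] = white_sum[0][j] = 0;
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            int v = grid[i - 1][j - 1];
            black_sum[i][j] = (v == 0) + black_sum[i - 1][j]
                            + black_sum[i][j - 1] - black_sum[i - 1][j - 1];
            white_sum[i][j] = (v == 1) + white_sum[i - 1][j]
                            + white_sum[i][j - 1] - white_sum[i - 1][j - 1];
        }
    }
}

/* Number of counted squares inside rows r0..r1-1, columns c0..c1-1. */
static int rect_count(int sum[MAXN + 1][MAXN + 1], int r0, int c0, int r1, int c1) {
    return sum[r1][c1] - sum[r0][c1] - sum[r1][c0] + sum[r0][c0];
}

/* O(1): 1 if the rectangle has no black squares or no white squares. */
int is_valid(int r0, int c0, int r1, int c1) {
    return rect_count(black_sum, r0, c0, r1, c1) == 0
        || rect_count(white_sum, r0, c0, r1, c1) == 0;
}
```

After one call to `build_prefix`, every `is_valid` query costs four table lookups per color, regardless of how large the rectangle is.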

Thanks to everybody who helped me. My mistake was that in those nested loops I tried to avoid some unnecessary breaks, like this one for example:
1 1      1 | 1
1 1  ->  1 | 1
1 1      1 | 1
thinking it would speed up the runtime, but the correct approach was simply to always break the chocolate bar everywhere possible. Anyway, for anyone interested, here is my working code:
#include <stdio.h>
#include <stdlib.h>

unsigned int M, N;
// Already solved rectangles: the value of pieces[y0][x0][y1][x1] is the optimal
// number of pieces into which the particular rectangle (that has its upper-left
// corner in [x0,y0] and its bottom-right corner in [x1,y1]) can be divided.
unsigned int ****pieces;
unsigned int inf;

int isOneColor(int mat[M][N], unsigned int starti, unsigned int startj, unsigned int maxi, unsigned int maxj) {
    int c = 2;
    for (unsigned int i = starti; i < maxi; i++) {
        for (unsigned int j = startj; j < maxj; j++) {
            if (c == 2) {
                if (mat[i][j] != 2) {
                    c = mat[i][j];
                }
            } else if (c != mat[i][j] && mat[i][j] != 2) {
                return 0;
            }
        }
    }
    return 1;
}

unsigned int minbreaks(int mat[M][N], unsigned int starti, unsigned int startj, unsigned int maxi, unsigned int maxj) {
    if (pieces[starti][startj][maxi][maxj] != 0) {
        return pieces[starti][startj][maxi][maxj];
    } else if (isOneColor(mat, starti, startj, maxi, maxj)) {
        return pieces[starti][startj][maxi][maxj] = 1;
    } else {
        unsigned int currentmin = inf;
        for (unsigned int j = startj; j < maxj - 1; j++) {
            unsigned int c = minbreaks(mat, starti, startj, maxi, j + 1) + minbreaks(mat, starti, j + 1, maxi, maxj);
            if (c < currentmin) {
                currentmin = c;
            }
        }
        for (unsigned int i = starti; i < maxi - 1; i++) {
            unsigned int c = minbreaks(mat, starti, startj, i + 1, maxj) + minbreaks(mat, i + 1, startj, maxi, maxj);
            if (c < currentmin) {
                currentmin = c;
            }
        }
        pieces[starti][startj][maxi][maxj] = currentmin;
        return currentmin;
    }
}

int main(void) {
    FILE *file = stdin;
    //FILE *file = fopen("inputfile", "r");
    fscanf(file, "%u %u", &M, &N);
    int mat[M][N];
    pieces = malloc(sizeof (unsigned int***)*M);
    for (unsigned int i = 0; i < M; i++) {
        pieces[i] = malloc(sizeof (unsigned int**)*N);
        for (unsigned int j = 0; j < N; j++) {
            int x;
            fscanf(file, "%d", &x);
            mat[i][j] = x;
            pieces[i][j] = malloc(sizeof (unsigned int*)*(M + 1));
            for (unsigned int y = i; y < M + 1; y++) {
                pieces[i][j][y] = malloc(sizeof (unsigned int)*(N + 1));
                for (unsigned int x = j; x < N + 1; x++) {
                    pieces[i][j][y][x] = 0;
                }
            }
        }
    }
    inf = M * N + 1; // one more than the maximal theoretically possible number of pieces
    unsigned int result = minbreaks(mat, 0, 0, M, N);
    printf("%u\n", result);
    return (EXIT_SUCCESS);
}