MLP performing worse than SGD in supervised classification

The training dataset I'm using is a grayscale image that was flattened so that each pixel represents an individual sample. A second image will then be classified pixel by pixel after training the stochastic gradient descent (SGD) and multilayer perceptron (MLP) classifiers on the first one.
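To illustrate the preprocessing described above, here is a minimal sketch of flattening a grayscale image into a per-pixel sample array (the image values here are made up for illustration; scikit-learn estimators expect a 2-D `(n_samples, n_features)` array):

```python
import numpy as np

# Hypothetical grayscale image: a 2-D array of pixel intensities
image = np.arange(12, dtype=np.float64).reshape(3, 4)

# Flatten so each pixel becomes one sample with a single feature
data = image.reshape(-1, 1)

print(data.shape)  # (12, 1)
```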

The problem I have is that SGD is performing far better than the MLP, even when I keep the default parameters provided by scikit-learn in both cases. Below is the code for both (note that, because the training dataset contains millions of samples, I had to use partial_fit() to train the MLP in chunks, while this was not necessary for the SGD):

import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.neural_network import MLPClassifier

def batcherator(data, target, chunksize):
    # Yield the data in consecutive chunks of chunksize samples
    for i in range(0, len(data), chunksize):
        yield data[i:i+chunksize], target[i:i+chunksize]

def classify(algorithm, data, target):
    if algorithm == 'sgd':
        classifier = SGDClassifier(verbose=True)
        classifier.fit(data, target)
    elif algorithm == 'mlp':
        classifier = MLPClassifier(verbose=True)

        # Train incrementally, since the full dataset is too large to fit at once
        gen = batcherator(data, target, 1000)
        for chunk_data, chunk_target in gen:
            classifier.partial_fit(chunk_data, chunk_target,
                                   classes=np.array([0, 1]))
    return classifier

My question is: which parameters should I adjust in the MLP classifier to make its results comparable to those obtained with SGD?

I've tried increasing the number of neurons in the hidden layer with hidden_layer_sizes, but I didn't see any improvement. There was no improvement either when I changed the activation function of the hidden layer from the default relu to logistic via the activation parameter.
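For concreteness, the two variations mentioned above look like this (the layer size shown is illustrative, not the exact value I used):

```python
from sklearn.neural_network import MLPClassifier

# A wider hidden layer (default is (100,))
mlp_wide = MLPClassifier(hidden_layer_sizes=(200,), verbose=True)

# Logistic activation instead of the default relu
mlp_logistic = MLPClassifier(activation='logistic', verbose=True)
```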