keras predict_generator is shuffling its output when using a keras.utils.Sequence





I am using keras to build a model that inputs 720x1280 images and outputs a value.



I am having a problem with keras.models.Sequential.predict_generator when using the keras.utils.Sequence class to obtain the values corresponding to images on the validation/training sets. The values returned are shuffled, so I don't know which output corresponds to which image.





This is how my generators are defined:


import numpy as np
from skimage.io import ImageCollection, imread
from keras.utils import Sequence

def load_images(f):
    # Load one image file as a float64 array
    return imread(f).astype(np.float64)

class DataSetImageKeras(Sequence):
    def __init__(self, image_collection, values, batch_size):
        self.images = image_collection
        self.hf = values
        self.batch_size = batch_size
        self.n = len(self.images)
        self.x_scale = 250
        self.y_scale = 1e4

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.images) / float(self.batch_size)))

    def __getitem__(self, idx):
        # batch_x is a numpy.ndarray
        batch_x = (
            self.images[idx:min(idx + self.batch_size, self.n)]
            .concatenate()
            .reshape(self.batch_size, 720, 1280, 1)
        )
        batch_y = self.hf[idx:min(idx + self.batch_size, self.n)]

        return batch_x/self.x_scale, batch_y/self.y_scale

images_train = ImageCollection(images_paths_train, load_func=load_images)
images_val = ImageCollection(images_paths_test, load_func=load_images)

data_train = DataSetImageKeras(images_train, values_train, n_batch)
data_val = DataSetImageKeras(images_val, values_val, n_batch)


from keras.models import load_model
model = load_model('model001')  # this model is already trained



If I use the following code:


import matplotlib.pyplot as plt

val_result = []
val_hf = []
for (batch_x, batch_y) in data_val:
    val_result.append(model.predict_on_batch(batch_x))
    val_hf.append(batch_y)

val_result = np.concatenate(val_result)
val_hf = np.concatenate(val_hf)

plt.plot(val_hf,
         val_result,
         marker='.',
         linestyle='')



The correct result is obtained (in the resulting plot, x is the desired value and y is the predicted value).



However, if I use the predict_generator function, as below:


val_result = model.predict_generator(data_val, verbose=1,
                                     workers=1,
                                     max_queue_size=50,
                                     use_multiprocessing=False)



The output is shuffled.



My problem is similar to #5048 and #6745, which should be solved by #6891 API, but I am using keras version 2.1.6 and it is still shuffling my predictions, even when using workers=1.





It is also similar to this, but I didn't find anything that could reset the generators, and the problem persists if I define a new generator and run predict_generator on it.





I also found something stating that it could be related to the batch size not dividing the number of samples exactly, but the problem is still present if I use n_batch=1.
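
The batch arithmetic behind that concern can be sketched as follows. This is a minimal illustration, not code from the question: batch_bounds is a hypothetical helper and the sample counts are made up. It uses the same ceil(n / batch_size) formula as __len__ above and shows that only the last batch can be short, and that a batch size of 1 never produces a partial batch.

```python
import math

def batch_bounds(n_samples, batch_size):
    """Return the (start, stop) sample range of every batch.

    Hypothetical helper for illustration; the number of batches uses the
    same ceil(n / batch_size) formula as __len__ in the question.
    """
    n_batches = math.ceil(n_samples / batch_size)
    return [
        (i * batch_size, min((i + 1) * batch_size, n_samples))
        for i in range(n_batches)
    ]

# 10 samples with a batch size of 4: the last batch holds only 2 samples.
print(batch_bounds(10, 4))   # [(0, 4), (4, 8), (8, 10)]

# With a batch size of 1, every batch holds exactly one sample.
print(batch_bounds(3, 1))    # [(0, 1), (1, 2), (2, 3)]
```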





As a side note, it might be that predict_generator is not shuffling data, but only returning it with an index offset, since the input data on values and images_paths are already shuffled.






1 Answer



predict_generator was not shuffling my predictions after all. The problem was in the __getitem__ method, which used idx (the batch index) directly as the starting sample index. For instance, using n_batch=32, the method would yield values from 1 to 32, then from 2 to 33 and so forth, instead of from 1 to 32, then 33 to 64, etc.
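
The difference between the two indexing schemes can be sketched with plain NumPy. This is an illustration under assumptions, not the original code: buggy_slice and fixed_slice are hypothetical helpers, and a batch size of 4 over 12 integer "samples" stands in for the images.

```python
import numpy as np

BATCH_SIZE = 4               # stand-in for n_batch
samples = np.arange(12)      # stand-in for the image collection
n = len(samples)

def buggy_slice(idx):
    # Treats the batch index as a sample index: windows overlap.
    return samples[idx:min(idx + BATCH_SIZE, n)]

def fixed_slice(idx):
    # Converts the batch index to a sample index first: windows tile the data.
    start = idx * BATCH_SIZE
    return samples[start:min(start + BATCH_SIZE, n)]

n_batches = int(np.ceil(n / BATCH_SIZE))   # 3 batches

buggy = np.concatenate([buggy_slice(i) for i in range(n_batches)])
fixed = np.concatenate([fixed_slice(i) for i in range(n_batches)])

print(buggy)   # [0 1 2 3 1 2 3 4 2 3 4 5]
print(fixed)   # [ 0  1  2  3  4  5  6  7  8  9 10 11]
```

The buggy output has the right length but repeats early samples and never reaches the later ones. Because the buggy version slices batch_y with the same wrong indices, pairing each prediction with the batch_y returned alongside it still looks consistent, which would explain why the manual loop gave a good plot while comparing the predict_generator output against the original values did not.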





Changing the method as follows solves the problem


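def __getitem__(self, idx):
    # idx is the batch index, so convert it to a range of sample indices
    idx_min = idx * self.batch_size
    idx_max = min(idx_min + self.batch_size, self.n)
    # batch_x is a numpy.ndarray; -1 (rather than self.batch_size) lets a
    # smaller final batch reshape without error
    batch_x = (
        self.images[idx_min:idx_max]
        .concatenate()
        .reshape(-1, 720, 1280, 1)
    )
    batch_y = self.hf[idx_min:idx_max]

    return batch_x / self.x_scale, batch_y / self.y_scale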
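
A quick way to check a batch class like this one is to confirm that concatenating its batches reproduces the original data in order. The sketch below is only an assumption-laden stand-in: it uses in-memory NumPy arrays instead of an ImageCollection and a plain Python class instead of keras.utils.Sequence, so it runs without Keras. ArrayBatches and its data are hypothetical.

```python
import numpy as np

class ArrayBatches:
    """Hypothetical stand-in for the Sequence subclass, backed by arrays."""

    def __init__(self, x, y, batch_size):
        self.x = x
        self.y = y
        self.batch_size = batch_size
        self.n = len(x)

    def __len__(self):
        # Number of batches, counting a possibly shorter last one.
        return int(np.ceil(self.n / float(self.batch_size)))

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Same batch-index to sample-index conversion as the fix above.
        idx_min = idx * self.batch_size
        idx_max = min(idx_min + self.batch_size, self.n)
        return self.x[idx_min:idx_max], self.y[idx_min:idx_max]

x = np.arange(10, dtype=np.float64).reshape(10, 1)
y = np.arange(10, dtype=np.float64) * 2.0
batches = ArrayBatches(x, y, batch_size=4)

all_x = np.concatenate([batches[i][0] for i in range(len(batches))])
all_y = np.concatenate([batches[i][1] for i in range(len(batches))])

# Every sample appears exactly once and in its original position.
assert np.array_equal(all_x, x)
assert np.array_equal(all_y, y)
print(len(batches), all_x.shape)   # 3 (10, 1)
```

The same check applied to the original __getitem__ would fail, since its batches overlap instead of covering the data once.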





