🤔 What is CIFAR-10?
CIFAR-10 is a famous dataset of 60,000 tiny color images (32x32 pixels) across 10 classes. It's like the "Hello World" of computer vision!
The 10 classes: ✈️ Airplane · 🚗 Automobile · 🐦 Bird · 🐱 Cat · 🦌 Deer · 🐕 Dog · 🐸 Frog · 🐎 Horse · 🚢 Ship · 🚛 Truck
Why CIFAR-10 is perfect for learning:
- Small images: 32x32 pixels = fast training
- Diverse classes: Animals, vehicles, objects
- Challenging: Low resolution makes it tricky
- Balanced: 6,000 images per class
# Load CIFAR-10 dataset
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(f"Training images: {x_train.shape}") # (50000, 32, 32, 3)
print(f"Test images: {x_test.shape}") # (10000, 32, 32, 3)
print(f"Classes: {len(set(y_train.flatten()))}") # 10 classes
🏗️ Building a CNN for CIFAR-10
The CNN Architecture Strategy:
Think of your CNN like a detective that gets better at seeing details as it goes deeper. Early layers spot edges and colors, deeper layers recognize shapes and objects!
🎯 Our CNN Blueprint:
- Layer 1: 32 filters (3x3) → Find basic edges
- Layer 2: 64 filters (3x3) → Combine edges into shapes
- Layer 3: 128 filters (3x3) → Recognize complex patterns
- Dense Layer: 512 neurons → Make final decisions
- Output: 10 neurons → One for each class
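You can sanity-check this blueprint by tracing the spatial size by hand: each 3x3 convolution (with default 'valid' padding) shrinks each side by 2, and each 2x2 pooling halves it. A quick sketch:

```python
# Trace the spatial size through the blueprint
# (valid 3x3 conv: side - 2; 2x2 max pool: side // 2)
size = 32                  # CIFAR-10 input is 32x32
size -= 2                  # Conv2D, 32 filters (3x3)  -> 30x30
size //= 2                 # MaxPooling2D (2x2)        -> 15x15
size -= 2                  # Conv2D, 64 filters (3x3)  -> 13x13
size //= 2                 # MaxPooling2D (2x2)        -> 6x6
size -= 2                  # Conv2D, 128 filters (3x3) -> 4x4
flat = size * size * 128   # Flatten                   -> 2048 features
print(size, flat)          # 4 2048
```

So the dense classifier sees 2,048 features, which the 512-neuron layer then compresses before the final 10-way decision.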
# Complete CIFAR-10 CNN Model
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    # First Conv Block
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    # Second Conv Block
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    # Third Conv Block
    layers.Conv2D(128, (3, 3), activation='relu'),
    # Classifier
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])
🚀 Training & Getting High Accuracy
Data Preprocessing - The Secret Sauce:
Raw pixel values (0-255) are on an awkward scale for neural networks; large, unnormalized inputs make training slower and less stable. We scale them to the 0-1 range, like adjusting the volume on your stereo!
# Preprocess the data
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# Convert labels to categorical
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
# Compile model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
history = model.fit(
    x_train, y_train,
    batch_size=32,
    epochs=20,
    validation_data=(x_test, y_test)
)
🎯 Pro Tips for Better Accuracy:
- Data Augmentation: Rotate, flip, zoom images for more training data
- Dropout: Prevents overfitting (like studying different topics, not just one)
- Learning Rate Scheduling: Start fast, slow down as you get closer
- Batch Normalization: Keeps training stable
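Data augmentation is the easiest win on that list. Here is a minimal NumPy sketch of one common augmentation, the horizontal flip, which doubles the training set for free (in Keras you would typically use preprocessing layers such as RandomFlip or RandomZoom instead of doing this by hand):

```python
import numpy as np

def augment_with_flips(images, labels):
    """Double a training set by appending horizontally flipped copies."""
    flipped = images[:, :, ::-1, :]  # flip along the width axis (NHWC layout)
    return (np.concatenate([images, flipped]),
            np.concatenate([labels, labels]))

# Tiny demo on fake data shaped like CIFAR-10 batches
x = np.zeros((8, 32, 32, 3), dtype='float32')
y = np.zeros((8, 10), dtype='float32')
x_aug, y_aug = augment_with_flips(x, y)
print(x_aug.shape)  # (16, 32, 32, 3)
```

Flips work well for CIFAR-10 because a mirrored cat is still a cat; be careful with datasets where orientation matters (e.g. digits).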
💡 Making Predictions & Understanding Results
How to interpret your model's predictions:
Your model outputs 10 probabilities (one for each class). The highest probability wins! It's like a voting system where each neuron votes for what it thinks the image is.
# Make predictions
import numpy as np

predictions = model.predict(x_test[:5])

# Get class names
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Show predictions
for i in range(5):
    predicted_class = np.argmax(predictions[i])
    confidence = np.max(predictions[i]) * 100
    print(f"Image {i}: {class_names[predicted_class]} ({confidence:.1f}% confident)")
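To see exactly what argmax is doing with the "votes", here is a self-contained sketch using a made-up softmax output (real softmax outputs always sum to 1):

```python
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# A made-up softmax output: 10 probabilities summing to 1
probs = np.array([0.02, 0.01, 0.05, 0.60, 0.03,
                  0.15, 0.04, 0.05, 0.03, 0.02])
assert abs(probs.sum() - 1.0) < 1e-9

winner = np.argmax(probs)                  # index of the highest "vote"
print(class_names[winner], probs[winner])  # cat 0.6
```

Note that a 60% winner still leaves 40% spread across the other classes; low-confidence predictions like this are worth inspecting when you debug your model.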
🎯 Expected Performance:
- Basic CNN: ~70-75% accuracy
- With Data Augmentation: ~80-85% accuracy
- Advanced Techniques: ~90%+ accuracy
- State-of-the-art: ~99% accuracy (large pretrained models, e.g. EfficientNet or vision transformers)
Remember: CIFAR-10 is challenging because images are only 32x32 pixels!
# Evaluate model
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"Test Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")
# Save your trained model ('.h5' is the legacy HDF5 format;
# recent Keras versions also support the native '.keras' format)
model.save('cifar10_cnn_model.h5')
print("Model saved successfully!")