Passionate about building machine learning models. Currently exploring transformer architectures and sharing insights on X (Twitter). Always learning! :)
Handling class imbalance was tough: some species had very few samples, which led to overfitting. I used weighted loss functions and oversampling to mitigate it. Simple models and even transfer learning models weren't performing up to the mark, so I had to use ensemble learning. Deployment on Hugging Face also required optimizing model size for faster inference.
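A minimal sketch of the class-weighting part, assuming the training generator defined in the code below (its .classes attribute holds one integer label per image) and scikit-learn's compute_class_weight helper:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Weight each class inversely to its frequency so rare species count more in the loss
labels = train_data_augmented.classes  # integer labels from the generator defined below
weights = compute_class_weight(class_weight='balanced', classes=np.unique(labels), y=labels)
class_weight = dict(enumerate(weights))
# Passed to training as: model.fit(train_data_augmented, epochs=10, class_weight=class_weight)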
Then improved the initial model with transfer learning using Xception.
Preprocessed images with resizing and normalization, achieving ~96% accuracy on a Kaggle dataset.
Code Snippet
# Imports (TensorFlow / Keras)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.applications import Xception
from tensorflow.keras.optimizers import Adam

# Using data augmentation
train_data_generator = ImageDataGenerator(
rescale=1/255., # Normalize the pixel values
rotation_range=30, # Randomly rotate the images by up to 30 degrees
width_shift_range=0.2, # Randomly shift images horizontally by up to 20%
height_shift_range=0.2, # Randomly shift images vertically by up to 20%
shear_range=0.2, # Apply shear transformations
zoom_range=0.2, # Randomly zoom in/out by up to 20%
horizontal_flip=True # Randomly flip images horizontally
)
# Load augmented training data
train_data_augmented = train_data_generator.flow_from_directory(train_dir, batch_size=32, class_mode='binary', target_size=Image_size)
# Model 1: A baseline CNN trained on the augmented data
model_1 = Sequential([
Conv2D(32, (3, 3), input_shape=(224, 224, 3), activation='relu'),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(64, (3, 3), activation='relu'),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(128, (3, 3), activation='relu'),
MaxPooling2D(pool_size=(2, 2)),
Flatten(),
Dense(128, activation='relu'),
Dense(1, activation='sigmoid')
])
# Compile the model
model_1.compile(loss='binary_crossentropy', optimizer=Adam(), metrics=['accuracy'])
# Fit the model with augmented data
history_1 = model_1.fit(train_data_augmented, epochs=10, validation_data=val_data, callbacks=[checkpoint_callback])
# Trying transfer learning
xception_model = Xception(weights='imagenet', include_top=False, input_shape=(224, 224, 3)) # Xception pre-trained on ImageNet, used here as a feature extractor
# include_top=False because I am going to add my own classification head
xception_model.trainable = False # Freeze the base model to retain the pre-trained weights
# Add new classification head
model_2 = Sequential([
xception_model, # The pre-trained xception model
GlobalAveragePooling2D(), # Global average pooling to reduce spatial dimensions
Dense(64, activation='relu'), # Dense layer on top of the extracted features
Dropout(0.5), # Drop 50% of the neurons to reduce overfitting
Dense(1, activation='sigmoid') # Single sigmoid unit for binary output
])
# Compile Model 2
model_2.compile(loss='binary_crossentropy', optimizer=Adam(), metrics=['accuracy'])
# Fit Model 2 with augmented data
history_2 = model_2.fit(train_data_augmented, epochs=10, validation_data=val_data, callbacks=[checkpoint_callback])
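The snippet above relies on a few objects defined elsewhere in the notebook (Image_size, train_dir, val_data, checkpoint_callback). A minimal sketch of how they might be set up; the directory paths and checkpoint filename are hypothetical:
from tensorflow.keras.callbacks import ModelCheckpoint

Image_size = (224, 224)
train_dir = 'data/train'       # hypothetical path
val_dir = 'data/validation'    # hypothetical path

# Validation images are only rescaled, never augmented
val_data = ImageDataGenerator(rescale=1/255.).flow_from_directory(val_dir, batch_size=32, class_mode='binary', target_size=Image_size)

# Keep the best weights seen so far, judged on validation accuracy
checkpoint_callback = ModelCheckpoint('best_model.keras', monitor='val_accuracy', save_best_only=True)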
Challenges Faced
The initial model was getting confused between the classes and couldn't predict the dog class reliably. Even after I fixed that, the results weren't up to the mark, so I had to use transfer learning. I also faced memory issues with large batch sizes on my GPU.
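A sketch of the kind of mitigation for the memory issues (not the exact code from the project): let TensorFlow grow GPU memory on demand and drop to a smaller batch size, reusing train_data_generator from the snippet above.
import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all up front
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# A smaller batch also keeps activations within GPU memory
train_data_augmented = train_data_generator.flow_from_directory(train_dir, batch_size=16, class_mode='binary', target_size=Image_size)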
Created models using Logistic Regression, KNeighborsClassifier, and a Random Forest classifier on the UCI Heart Disease dataset.
Performed feature engineering (e.g., normalizing cholesterol levels) and analyzed feature importance to identify key predictors like age and chest pain type.
Missing data in the dataset required imputation strategies; I tried mean and median imputation and settled on KNN imputation. Tuning hyperparameters took time, since grid search ran on the CPU and was slow on my machine.
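A minimal sketch of that pipeline, assuming a CSV with the usual UCI Heart Disease columns and a binary target column named target (the file name, column name, and parameter grids here are illustrative, not the exact ones used):
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv('heart.csv')  # hypothetical file name
X, y = df.drop(columns='target'), df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

models = {
    'logreg': (LogisticRegression(max_iter=1000), {'model__C': [0.01, 0.1, 1, 10]}),
    'knn': (KNeighborsClassifier(), {'model__n_neighbors': [3, 5, 7, 9]}),
    'rf': (RandomForestClassifier(random_state=42), {'model__n_estimators': [100, 300]}),
}

for name, (estimator, grid) in models.items():
    pipe = Pipeline([
        ('impute', KNNImputer(n_neighbors=5)),  # fill missing values from similar patients
        ('scale', StandardScaler()),            # e.g. normalize cholesterol levels
        ('model', estimator),
    ])
    search = GridSearchCV(pipe, grid, cv=5, n_jobs=-1)  # n_jobs=-1 uses all CPU cores
    search.fit(X_train, y_train)
    print(name, search.best_params_, search.score(X_test, y_test))
Feature importances for the random forest (backing observations like age and chest pain type being key predictors) can then be read from search.best_estimator_.named_steps['model'].feature_importances_.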