0
// STAGE 0 — CORE FOUNDATIONS (NON-NEGOTIABLE)
Math + Python + ML Fundamentals
Linear Algebra · Calculus · Probability · NumPy · Pandas · Scikit-learn
⚠️ Don't jump straight into DL: without the foundations you will hit a wall. You need the chain rule to understand backpropagation, derivatives for gradient descent, and linear algebra for PCA and Attention.
Linear Algebra
Vector & matrix operations
Matrix multiplication (dot product)
Transpose, determinant
Eigenvalue / eigenvector
SVD (singular value decomposition)
Attention · PCA · Weights
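A quick NumPy sketch of the SVD entry above; the matrix values here are made up purely for illustration:

```python
import numpy as np

# Toy data matrix: 4 samples, 3 features (arbitrary values)
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 1.0],
              [1.0, 1.0, 4.0],
              [2.0, 2.0, 0.0]])

# SVD: A = U @ diag(S) @ Vt — the workhorse behind PCA and low-rank tricks
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruction from all singular values is exact (up to float error)
A_rec = U @ np.diag(S) @ Vt
print(np.allclose(A, A_rec))  # True

# Rank-1 approximation: keep only the largest singular value
A_r1 = S[0] * np.outer(U[:, 0], Vt[0])
```

Singular values come back sorted in decreasing order, which is exactly why truncating them gives the best low-rank approximation.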
Calculus
Derivative — f'(x) as a limit
Chain rule (🔥 CRITICAL)
Partial derivatives
Gradient vector
Jacobian & Hessian (advanced)
Backprop · Optimizer
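The chain rule can be sanity-checked numerically — the same trick used to debug backprop implementations. A minimal sketch with an arbitrary test function:

```python
import numpy as np

def f(x):
    return np.sin(x ** 2)

def f_prime(x):
    # Chain rule: d/dx sin(x²) = cos(x²) · 2x
    return np.cos(x ** 2) * 2 * x

x, h = 1.3, 1e-6
# Central finite difference approximates the true derivative
numeric = (f(x + h) - f(x - h)) / (2 * h)
print(abs(numeric - f_prime(x)) < 1e-6)  # True — the chain rule checks out
```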
Probability & Statistics
Conditional probability, Bayes' rule
Normal, Bernoulli, Categorical distributions
Expected value, variance
MLE (Maximum Likelihood Estimation)
KL Divergence (advanced)
Loss · VAE · GAN
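The MLE and KL Divergence entries above, each in a few lines of NumPy; the distribution parameters are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=5.0, scale=2.0, size=100_000)

# MLE for a Gaussian: μ̂ = sample mean, σ̂² = (biased) sample variance
mu_hat, var_hat = data.mean(), data.var()
print(f"mu ≈ {mu_hat:.2f}, var ≈ {var_hat:.2f}")  # close to 5 and 4

# KL divergence between two discrete distributions: ≥ 0, zero iff p == q
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
kl = np.sum(p * np.log(p / q))
```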
# NumPy — the matrix machinery underneath DL
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 784))   # mini-batch: 64 flattened 28×28 images
m = X.shape[0]
lr = 0.1

# Forward pass — manual implementation
W1 = rng.standard_normal((784, 128)) * np.sqrt(2 / 784)  # He init: √(2/n_in)
b1 = np.zeros(128)
z1 = np.dot(X, W1) + b1              # z = Wx + b
a1 = np.maximum(0, z1)               # ReLU activation

# Chain rule — gradient computation
# dL/dW = dL/da × da/dz × dz/dW
da1 = np.ones_like(a1)               # stand-in for the upstream gradient dL/da1 from the loss
dz1 = da1 * (z1 > 0)                 # ReLU derivative
dW1 = np.dot(X.T, dz1) / m
db1 = np.mean(dz1, axis=0)

# Gradient descent update
W1 -= lr * dW1
b1 -= lr * db1
Goal: write a 2-layer neural network from scratch in NumPy, with no framework. Those who do this truly understand what the frameworks are doing.
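The goal above, sketched end to end on XOR — the classic task a single-layer perceptron cannot learn. Layer sizes, learning rate, and iteration count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.standard_normal((2, 8)) * 0.5; b1 = np.zeros(8)
W2 = rng.standard_normal((8, 1)) * 0.5; b2 = np.zeros(1)
lr, losses = 0.3, []

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for step in range(5000):
    # forward pass
    z1 = X @ W1 + b1
    a1 = np.maximum(0, z1)                 # ReLU hidden layer
    p = sigmoid(a1 @ W2 + b2)              # sigmoid output (binary)
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    # backward pass — chain rule, layer by layer
    dz2 = (p - y) / len(X)                 # gradient of BCE through the sigmoid
    dW2 = a1.T @ dz2; db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (z1 > 0)          # ReLU derivative gates the gradient
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)
    # gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"loss {losses[0]:.3f} → {losses[-1]:.3f}")
```

Watching the loss fall with no framework in sight is the whole point of the exercise.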
1
// STAGE 1 — INTRODUCTION TO DEEP LEARNING
Perceptron → Backprop → PyTorch
Aktivasyon · Loss Functions · Optimizer · Backpropagation · Training Loop
INPUT: x₁ x₂ x₃
HIDDEN 1: z=Wx+b → ReLU
HIDDEN 2: BatchNorm + Dropout
OUTPUT: Softmax → ŷ
ReLU
max(0,x)
Default choice. Fast, sparse. Risk of dying ReLU.
GELU
x·Φ(x)
Preferred in BERT/GPT. Smoother than ReLU.
Sigmoid
1/(1+e⁻ˣ)
Output layer (binary). Vanishing gradient!
Softmax
e^(xᵢ) / Σⱼ e^(xⱼ)
Multi-class output. Yields a probability distribution.
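The four activations in the table, as a minimal NumPy sketch (the GELU here uses the common tanh approximation):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def gelu(x):
    # tanh approximation of x·Φ(x), as used in GPT-style models
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))

def softmax(x):
    e = np.exp(x - x.max())   # subtract max for numerical stability
    return e / e.sum()

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))                # negatives clipped to zero
print(softmax(x).sum())       # sums to 1 (up to float error)
```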
# PyTorch — full training loop
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class DeepNet(nn.Module):
    def __init__(self, in_dim: int, hidden: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.BatchNorm1d(hidden), nn.GELU(), nn.Dropout(0.3),
            nn.Linear(hidden, hidden // 2), nn.BatchNorm1d(hidden // 2), nn.GELU(), nn.Dropout(0.2),
            nn.Linear(hidden // 2, n_classes)
        )
        self._init_weights()

    def _init_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, nonlinearity="relu")

    def forward(self, x):
        return self.net(x)

# Training loop — production quality
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# X_train: (N, 784) float tensor, y_train: (N,) long tensor of class labels
loader = DataLoader(TensorDataset(X_train, y_train), batch_size=128, shuffle=True)

model = DeepNet(784, 512, 10).to(device)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-3,
                                                steps_per_epoch=len(loader), epochs=50)

for epoch in range(50):
    model.train()
    for X_batch, y_batch in loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
        optimizer.zero_grad()
        loss = criterion(model(X_batch), y_batch)
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
        optimizer.step()
        scheduler.step()
✏️ MNIST Digit Recognition
📧 Spam Classification
🖼️ Simple Image Classifier
2
// STAGE 2 — INTERMEDIATE
CNN (Computer Vision) + RNN/LSTM (Sequence)
Conv2d · Pooling · Feature Maps · Transfer Learning · LSTM · GRU · Sentiment
# CNN — for images (ResNet-style block)
class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels)
        )

    def forward(self, x):
        return nn.functional.relu(self.block(x) + x)  # skip connection!

# Transfer learning — the most efficient route
import torchvision.models as models

backbone = models.efficientnet_b0(weights="DEFAULT")
for param in backbone.features.parameters():
    param.requires_grad = False  # freeze — train only the classifier
backbone.classifier = nn.Sequential(
    nn.Dropout(0.3),
    nn.Linear(backbone.classifier[1].in_features, 2)
)

# LSTM — time series / NLP
class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int, hidden: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, num_layers=2,
                            batch_first=True, dropout=0.3, bidirectional=True)
        self.head = nn.Linear(hidden * 2, 1)  # bidirectional → ×2

    def forward(self, x):
        e = self.embed(x)
        out, (h, _) = self.lstm(e)
        h_cat = torch.cat([h[-2], h[-1]], dim=1)  # both directions of the top layer
        return self.head(h_cat).squeeze(-1)
Data Augmentation
RandomHorizontalFlip, RandomCrop, ColorJitter, MixUp, CutMix. Fast pipelines with the Albumentations library.
Skip Connection (ResNet)
A fix for the vanishing gradient problem. Makes training deep networks feasible. The foundation of modern architectures.
Vanishing Gradient
Solve the RNN version with LSTM/GRU. Add gradient clipping. Residual connections are the most effective fix.
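The vanishing-gradient card above can be made concrete with two lines of arithmetic (the per-layer derivative values are illustrative):

```python
import numpy as np

# Sigmoid's derivative σ'(z) = σ(z)(1 − σ(z)) never exceeds 0.25, so the
# backpropagated signal through 30 sigmoid layers shrinks geometrically:
vanished = 0.25 ** 30
print(vanished)   # ~8.7e-19 — effectively zero

# A residual connection adds an identity path: d(x + f(x))/dx = 1 + f'(x).
# Even with a tiny f'(x) per layer, the product stays near 1:
survived = np.prod(np.full(30, 1.0 + 0.01))
print(survived)   # ≈ 1.35 — the signal survives
```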
🐱 Cat vs. Dog Classification
😊 Sentiment Analysis
📝 Text Generation
🎬 Movie Review Analysis
3
// STAGE 3 — ADVANCED (THE GAME STARTS HERE)
Transformer + Self-Attention + NLP + Adv. CV
Attention · BERT · GPT · HuggingFace · YOLO · Segmentation · ViT
🔥 "Attention Is All You Need" (2017): this paper is the turning point of modern AI. BERT, GPT, T5, LLaMA, Stable Diffusion — all build on this architecture. You owe it to yourself to read it.
# Multi-Head Self-Attention — the heart of the Transformer
class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_head = d_model // n_heads
        self.n_heads = n_heads
        self.Wq = nn.Linear(d_model, d_model)
        self.Wk = nn.Linear(d_model, d_model)
        self.Wv = nn.Linear(d_model, d_model)
        self.Wo = nn.Linear(d_model, d_model)

    def forward(self, q, k, v, mask=None):
        B, T, D = q.shape
        # split Q, K, V into heads
        Q = self.Wq(q).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        K = self.Wk(k).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        V = self.Wv(v).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        # Scaled dot-product attention
        scores = torch.matmul(Q, K.transpose(-2, -1)) / (self.d_head ** 0.5)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.nn.functional.softmax(scores, dim=-1)
        return self.Wo(torch.matmul(attn, V).transpose(1, 2).contiguous().view(B, T, D))
# HuggingFace — modern NLP pipeline
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# BERT fine-tuning — just a few lines
model_name = "dbmdz/bert-base-turkish-cased"  # Turkish BERT!
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3, learning_rate=2e-5,
    per_device_train_batch_size=16, weight_decay=0.01,
    evaluation_strategy="epoch", save_strategy="epoch",  # strategies must match for best-model loading
    load_best_model_at_end=True, fp16=True  # mixed precision
)

# train_ds / val_ds: tokenized datasets (e.g. built with datasets.load_dataset)
trainer = Trainer(model=model, args=training_args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
Vision Transformer (ViT)
Split the image into patches → process them as a sequence. Outperforms CNNs on large datasets. DeiT, Swin Transformer.
YOLO v8/v9
Real-time object detection. Five lines of code with the Ultralytics API. Fine-tuning on a custom dataset is easy.
Segment Anything (SAM)
Meta's universal segmentation model. Zero-shot segmentation. A revolution in medical imaging.
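The "split into patches" step from the ViT card, sketched in NumPy for the standard 224×224 image and 16×16 patch setup:

```python
import numpy as np

# A 224×224 RGB image → 16×16 patches → a sequence of tokens, as in ViT
img = np.zeros((224, 224, 3), dtype=np.float32)
P = 16

patches = img.reshape(224 // P, P, 224 // P, P, 3)   # split both spatial axes
patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, P * P * 3)
print(patches.shape)  # (196, 768) — 196 tokens of dimension 768
```

After this, each 768-dim patch vector is linearly projected and fed to a Transformer exactly like a word embedding.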
🇹🇷 Turkish Chatbot
📰 Text Summarization
❓ Q&A System
🚦 Traffic Sign Recognition
📷 Security Camera
4
// STAGE 4 — SPECIALIZATION TRACKS
NLP · Computer Vision · Speech · RL · Generative AI
LLM Fine-tuning · Diffusion · RL · RLHF · Voice · Multimodal
Decide: don't dive into all of them at once. One year spent in a single field is worth far more than two months in five. The highest industry demand right now is in LLM + CV.
🔵 NLP / LLM
• LLM fine-tuning (LoRA, QLoRA)
• RAG (Retrieval Augmented Generation)
• Prompt engineering
• LangChain / LlamaIndex
• RLHF (how ChatGPT was trained)
👁️ Computer Vision
• Object detection (YOLO, DETR)
• Semantic segmentation (Mask R-CNN)
• Image generation (Stable Diffusion)
• 3D vision (NeRF)
• Video understanding (VideoMAE)
🎮 Reinforcement Learning
• Markov Decision Process (MDP)
• Q-learning → DQN
• Policy gradient (PPO, A3C)
• AlphaGo / MuZero mimarisi
• Gymnasium (formerly OpenAI Gym)
🎙️ Speech Processing
• Speech-to-text (Whisper)
• Text-to-speech (Tortoise, XTTS)
• Speaker recognition
• Audio classification
• Turkish ASR models
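The Q-learning entry in the RL track above can be demonstrated on a toy corridor environment; the state count, reward, and hyperparameters here are made up for illustration:

```python
import numpy as np

# Minimal tabular Q-learning on a 5-state corridor: start at 0, reward at 4.
# Actions: 0 = left, 1 = right.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != 4:
        # ε-greedy exploration
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s2 = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s2 == 4 else 0.0
        # Bellman update: move Q(s,a) toward r + γ·max_a' Q(s',a')
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1)[:4])  # greedy policy for states 0–3: always go right
```

DQN is this same update with the table replaced by a neural network.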
# LoRA — fine-tune a large model with few trainable parameters
from peft import LoraConfig, get_peft_model, TaskType
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b", load_in_4bit=True  # QLoRA
)
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16, lora_alpha=32, lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"]  # LoRA on the attention weights
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # ~0.1% of parameters trainable!
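The parameter-count claim above can be checked with plain arithmetic; below is a minimal NumPy sketch of the LoRA forward pass, with illustrative dimensions:

```python
import numpy as np

# Parameter count: full fine-tuning vs. a rank-16 LoRA update for a d×d weight
d, r = 4096, 16
full_params = d * d                       # every weight trainable
lora_params = r * d + d * r               # A: r×d, B: d×r
print(lora_params / full_params)          # 0.0078 — under 1% of the weights

# Forward pass: h = W x + (α/r)·B(A x), with W frozen and B initialized to zero
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
x = rng.standard_normal(64)
A = rng.standard_normal((r, 64)) * 0.01
B = np.zeros((64, r))                     # zero init → no change at step 0
alpha = 32
h = W @ x + (alpha / r) * (B @ (A @ x))
print(np.allclose(h, W @ x))              # True — LoRA is an identity at init
```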
5
// STAGE 5 — PRODUCTION + MLOPS
Deployment + Docker + Cloud + Model Monitoring
FastAPI · TorchServe · Triton · ONNX · Docker · AWS SageMaker · Wandb
# Model → ONNX → Triton Inference Server
import torch.onnx

# 1. Convert to ONNX
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"],
                  opset_version=17, dynamic_axes={"input": {0: "batch"}})

# 2. FastAPI service
from fastapi import FastAPI, UploadFile
import onnxruntime as ort
import numpy as np

app = FastAPI(title="DL Inference API")
sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])

def softmax(x):
    e = np.exp(x - x.max())  # NumPy has no built-in softmax
    return e / e.sum()

@app.post("/predict")
async def predict(file: UploadFile):
    img = preprocess_image(await file.read())  # → (1, 3, 224, 224) float32 array
    out = sess.run(None, {"input": img})[0]
    prob = softmax(out[0])
    return {"class": int(np.argmax(prob)), "confidence": float(prob.max())}
Model Quantization
Model size ÷4 with INT8/FP16. Inference speed ×2-4. Via torch.quantization or bitsandbytes.
Weights & Biases
Experiment tracking, hyperparameter sweeps, model registry. An MLflow alternative. Standard on Kaggle.
Gradient Checkpointing
Memory savings when training large models. Recompute activations in the backward pass instead of storing them all.
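The quantization card above, sketched as symmetric INT8 quantize/dequantize in NumPy (the weight values are random stand-ins):

```python
import numpy as np

# Symmetric INT8 quantization: map float32 weights to [-127, 127] with one scale
rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)

scale = np.abs(w).max() / 127.0
w_int8 = np.round(w / scale).astype(np.int8)      # 1 byte per weight instead of 4
w_deq = w_int8.astype(np.float32) * scale         # dequantize for compute

print(w_int8.nbytes, w.nbytes)                    # 1000 vs 4000 — size ÷4
print(np.abs(w - w_deq).max() < scale)            # rounding error bounded by the scale
```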
6
// STAGE 6 — PORTFOLIO THROUGH PROJECTS
GitHub + Kaggle + Paper Reading + Open Source
End-to-end projects · Reproductions · Blogging · Conferences
🔥 Projects You Should Build
📹 Real-time camera system (CNN+YOLO)
🤖 Turkish-capable NLP chatbot (BERT+RAG)
🎬 Recommendation system (collaborative filtering)
🏗️ End-to-end ML pipeline (Docker+API+Monitoring)
🎨 Image generation fine-tuning (Stable Diffusion)
📄 Papers You Should Read
• Attention Is All You Need (2017)
• BERT (2018) + GPT-2 (2019)
• ResNet: Deep Residual Learning (2015)
• Denoising Diffusion (DDPM 2020)
• LoRA: Low-Rank Adaptation (2021)
💡 Write a Blog: share what you learn on Medium or a personal blog. A post titled "I implemented a Transformer from scratch" can be the strongest line on your CV. Writing reinforces learning twofold.