Phase 3: First-Pass Filtering Analysis
Novelty Quick-Check Results
⚠️ PARTIAL OVERLAP - Needs Differentiation
Idea 1: Modality-Aware Adaptive LoRA (MA-LoRA)
- Closest work:
- Multimodal Low-Rank Adaptation (MokA) - Already does modality-aware parameter allocation
- Hierarchical and Dynamic Rank Adaptation for Mobile VLM - Dynamic rank for multimodal
- MARS: Multimodal Adaptive Rank Search - Adaptive rank search for multimodal
- Differentiation needed: These papers already combine adaptive rank + multimodal. Need stronger angle.
- Status: ⚠️ NEEDS REFINEMENT
Idea 2: Cross-Modal Budget Allocation (CMBA)
- Closest work:
- Towards Efficient Visual-Language Alignment of Q-Former - Uses AdaLoRA on Q-Former
- Cross-Modal Low-rank Adaptation - Cross-modal LoRA
- Gap: No systematic study of budget ratios across modalities; existing work applies a single uniform allocation method to all modules (see the budget-split sketch below).
- Status: ✅ NOVEL - diagnostic angle is unique
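To make the diagnostic concrete, here is a minimal sketch of how a budget ratio could be turned into per-layer LoRA ranks. The group names, layer counts, hidden dimensions, and ratio grid are illustrative assumptions, not settled experimental choices.

```python
# Hypothetical sketch: turn a total LoRA parameter budget and a
# vision:projector:language ratio into a per-layer rank for each group.
# Layer counts and hidden dims below are illustrative assumptions.

GROUPS = {
    # name: (number of adapted layers, in_dim + out_dim of the adapted projection)
    "vision":    (24, 1024 + 1024),
    "projector": (2,  1024 + 4096),
    "language":  (32, 4096 + 4096),
}

def ranks_for_ratio(total_budget: int, ratio: dict[str, float]) -> dict[str, int]:
    """Split total_budget LoRA parameters across groups; return rank per layer."""
    norm = sum(ratio.values())
    ranks = {}
    for name, (n_layers, dims) in GROUPS.items():
        group_budget = total_budget * ratio[name] / norm
        # A rank-r LoRA pair on a d_in x d_out projection costs ~ r * (d_in + d_out) params.
        ranks[name] = max(1, round(group_budget / (n_layers * dims)))
    return ranks

# One pilot run per ratio point in a small ablation grid.
for v, p, l in [(1, 1, 1), (2, 1, 1), (1, 2, 1), (1, 1, 2), (4, 1, 1), (1, 1, 4)]:
    ratio = {"vision": v, "projector": p, "language": l}
    print((v, p, l), ranks_for_ratio(total_budget=2_000_000, ratio=ratio))
```

Six ratio points at ~2h each lines up with the 12h CMBA pilot estimate in the Phase 4 plan.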
Idea 3: Zero-Shot Rank Predictor (ZRP)
- Closest work:
- Model Prior-Guided Rank Allocation (SR-LoRA) - Uses stable rank (intrinsic dimensionality)
- Geometric Adaptive Ranks - Geometry-based rank selection
- Gap: SR-LoRA uses stable rank but still requires training. True zero-shot prediction from pre-trained statistics alone is unexplored (see the sketch below).
- Status: ⚠️ PARTIAL OVERLAP - SR-LoRA is very close
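For reference, a zero-shot predictor in the intended sense might look like the sketch below: it reads only frozen pre-trained weight statistics (here, stable rank) and maps them to a LoRA rank with no training step. The mapping heuristic and the rank bounds are assumptions for illustration.

```python
import torch

def stable_rank(weight: torch.Tensor) -> float:
    """Stable rank = ||W||_F^2 / ||W||_2^2, computed from the frozen pre-trained weight."""
    w = weight.float()
    fro_sq = w.pow(2).sum()
    spectral = torch.linalg.matrix_norm(w, ord=2)  # largest singular value
    return (fro_sq / spectral.pow(2)).item()

def predict_lora_rank(weight: torch.Tensor, r_min: int = 2, r_max: int = 32) -> int:
    """Assumed heuristic: scale the LoRA rank with the layer's normalized stable rank."""
    frac = stable_rank(weight) / min(weight.shape)  # fraction of the maximum possible rank
    return max(r_min, min(r_max, round(r_min + frac * (r_max - r_min))))

# Example on random stand-in weights; real usage would iterate over the frozen VLM's layers.
for shape in [(1024, 1024), (1024, 4096), (4096, 4096)]:
    print(shape, predict_lora_rank(torch.randn(shape)))
```

The differentiation against SR-LoRA would hinge entirely on this step needing no fine-tuning signal at all.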
✅ NOVEL - Clear Differentiation
Idea 4: Hierarchical Rank Allocation (HRA)
- No direct match found. Existing work allocates ranks layer-wise or modality-wise, but not hierarchically (coarse→fine); see the allocator sketch below.
- Status: ✅ NOVEL
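A rough sketch of a coarse-to-fine allocator under my reading of the idea; the group shares and per-layer importance scores are placeholders (in a real run they might come from gradient or SVD statistics).

```python
# Hypothetical two-stage allocator: stage 1 (coarse) splits the total rank
# budget across modality groups, stage 2 (fine) distributes each group's
# share over its layers in proportion to a per-layer importance score.

def hierarchical_ranks(total_rank_budget: int,
                       group_shares: dict[str, float],
                       layer_scores: dict[str, list[float]]) -> dict[str, list[int]]:
    norm = sum(group_shares.values())
    allocation = {}
    for group, share in group_shares.items():
        group_budget = total_rank_budget * share / norm                      # coarse stage
        scores = layer_scores[group]
        total_score = sum(scores)
        allocation[group] = [max(1, round(group_budget * s / total_score))   # fine stage
                             for s in scores]
    return allocation

# Placeholder inputs; real scores would come from the base model, not constants.
print(hierarchical_ranks(
    total_rank_budget=256,
    group_shares={"vision": 1.0, "projector": 0.5, "language": 2.0},
    layer_scores={"vision": [1.0] * 12, "projector": [1.0] * 2, "language": [1.5] * 24},
))
```

The efficiency claim against AdaLoRA would presumably come from doing this allocation once (or a few times) up front rather than re-estimating importance continuously during training.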
Idea 5: Modality-Specific Learning Rate Scaling (MSLR)
- No work combines adaptive rank allocation with modality-specific learning rates (see the optimizer sketch below).
- Status: ✅ NOVEL
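Implementation-wise this is a small change: per-modality learning rates map directly onto optimizer parameter groups. The module names, base LR, and scaling factors below are illustrative; the scaling factors themselves would be the experimental variable.

```python
import torch
from torch import nn

# Toy stand-in for a VLM's LoRA modules grouped by modality (names are illustrative).
lora_modules = nn.ModuleDict({
    "vision":    nn.Linear(1024, 1024, bias=False),
    "projector": nn.Linear(1024, 4096, bias=False),
    "language":  nn.Linear(4096, 4096, bias=False),
})

base_lr = 1e-4
lr_scale = {"vision": 0.5, "projector": 2.0, "language": 1.0}  # assumed sweep point

optimizer = torch.optim.AdamW(
    [{"params": m.parameters(), "lr": base_lr * lr_scale[name]}
     for name, m in lora_modules.items()],
    weight_decay=0.0,
)

print([g["lr"] for g in optimizer.param_groups])
```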
Idea 10: Dynamic Rank Adjustment During Training
- Existing work uses a fixed rank or post-hoc pruning, not gradual rank decay during training (see the schedule sketch below).
- Status: ✅ NOVEL
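As a concrete example of "gradual decay during training", the sketch below uses an assumed cosine schedule for the active rank; in practice the adapter would be allocated at the initial rank and lower-importance rank components masked out as the schedule shrinks.

```python
import math

def rank_schedule(step: int, total_steps: int, r_init: int = 32, r_final: int = 8) -> int:
    """Assumed cosine decay of the active LoRA rank from r_init down to r_final."""
    progress = min(step / max(total_steps, 1), 1.0)
    r = r_final + 0.5 * (r_init - r_final) * (1.0 + math.cos(math.pi * progress))
    return max(r_final, round(r))

# Active rank at a few checkpoints of a 1000-step run.
print([rank_schedule(s, total_steps=1000) for s in (0, 250, 500, 750, 1000)])
```

The three-schedule pilot in the Phase 4 plan would compare shapes like this (e.g., cosine vs. linear vs. step decay).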
❌ HIGH RISK / HIGH COMPUTE
Idea 6: Task-Conditioned Rank Allocation (TCRA)
- Interesting but requires 72 GPU-hours for 3 tasks. Too expensive for pilot.
- Status: ❌ ELIMINATE (compute budget)
Idea 7: Gradient-Free Rank Search via Evolutionary Algorithm
- 80 GPU-hours for search. Exceeds MAX_TOTAL_GPU_HOURS=8.
- Status: ❌ ELIMINATE (compute budget)
Idea 8: Cross-Architecture Rank Transfer
- 96 GPU-hours across 3 architectures. Too expensive.
- Status: ❌ ELIMINATE (compute budget)
Idea 9: Information Bottleneck-Guided Rank Allocation
- High risk (theory may not hold) + 48 GPU-hours.
- Status: ❌ ELIMINATE (risk + compute)
Feasibility Check
| Idea | Compute (GPU-hours) | Data | Implementation effort | Verdict |
|---|---|---|---|---|
| 1. MA-LoRA | 40h | VQAv2 ✅ | Medium | ⚠️ Needs differentiation |
| 2. CMBA | 24h | VQAv2 ✅ | Easy | ✅ PASS |
| 3. ZRP | 60h | VQAv2 ✅ | Hard | ⚠️ Overlaps SR-LoRA |
| 4. HRA | 32h | VQAv2 ✅ | Medium | ✅ PASS |
| 5. MSLR | 24h | VQAv2 ✅ | Easy | ✅ PASS |
| 6. TCRA | 72h | 3 datasets | Hard | ❌ Too expensive |
| 7. Evolutionary | 80h | 2 datasets | Medium | ❌ Too expensive |
| 8. Transfer | 96h | 3 models | Hard | ❌ Too expensive |
| 9. IB-Guided | 48h | VQAv2 ✅ | Very Hard | ❌ High risk + cost |
| 10. Dynamic | 24h | VQAv2 ✅ | Easy | ✅ PASS |
Impact Estimation
High Impact (clear "so what"):
- Idea 2 (CMBA): Would reveal where the parameter budget should go in multimodal models. Actionable for practitioners.
- Idea 4 (HRA): Projected to be ~2x faster than AdaLoRA while maintaining performance. Clear efficiency win if it holds.
- Idea 10 (Dynamic): Projected 40-50% parameter savings at the same accuracy. Strong practical value.
Medium Impact:
- Idea 5 (MSLR): Faster convergence is useful but not groundbreaking.
Unclear Impact:
- Idea 1 (MA-LoRA): Too similar to existing work (MokA, MARS). Needs stronger differentiation.
- Idea 3 (ZRP): SR-LoRA already does this partially. Incremental improvement at best.
Surviving Ideas (6 after feasibility → 4 after impact)
✅ Top Tier (pilot these)
- Idea 2: Cross-Modal Budget Allocation (CMBA) - LOW risk, HIGH impact, NOVEL
- Idea 4: Hierarchical Rank Allocation (HRA) - MEDIUM risk, HIGH impact, NOVEL
- Idea 10: Dynamic Rank Adjustment (Dynamic) - LOW risk, HIGH impact, NOVEL
⚠️ Second Tier (validate on paper, pilot if budget allows)
- Idea 5: Modality-Specific Learning Rate Scaling (MSLR) - LOW risk, MEDIUM impact, NOVEL
❌ Eliminated (6 ideas)
- Idea 1: Too similar to MokA/MARS - needs stronger angle
- Idea 3: SR-LoRA already covers this - incremental at best
- Ideas 6, 7, 8: Exceed compute budget (72-96 GPU-hours each)
- Idea 9: High risk + high compute (48h)
Recommendation for Phase 4
Pilot these 3 ideas in parallel (full-run estimates total 24+32+24 = 80h, but the pilots are scaled down):
- CMBA - 6 ratio ablations × 2h = 12h pilot
- HRA - 3 allocation strategies × 3h = 9h pilot
- Dynamic - 3 rank schedules × 2h = 6h pilot
Total pilot budget: ~27 GPU-hours. Note: this averages ~9h per idea, and the CMBA pilot (12h) exceeds the MAX_TOTAL_GPU_HOURS=8 per-idea cap, so some ratio points may need to be cut to fit.
If pilots show positive signal, proceed to deep validation (Phase 4).
Key Findings from Literature
Recent relevant work (2024-2025):
- MokA: Modality-aware LoRA with shared/specific parameters
- MARS: Adaptive rank search for multimodal
- SR-LoRA: Stable rank-guided allocation
- Hierarchical Dynamic Rank: Dynamic rank for mobile VLMs
- Q-Former PEFT: AdaLoRA on cross-modal projector
Structural gap confirmed: none of the surveyed work systematically studies budget allocation ratios across modalities (vision:projector:language). This is Idea 2's unique angle.