VJAI Paper Hub

Paper Archive

Papers the group has read, presented, and reproduced — independent of the nomination cycle. Each entry includes session notes, reproduction code, and a vibe score from the group.

9 Papers · 4 Reproduced · 6 Venues · 91% Avg Vibe
Transformers · NLP · Foundations
Reproduced

NeurIPS 2017 · Duc Vo

Attention Is All You Need

  • Introduces multi-head self-attention to replace recurrence for sequence-to-sequence tasks.
  • Positional encodings allow the model to reason about sequence order without RNN loops.
  • Trains 3x faster than RNN seq2seq and sets new BLEU SOTA on WMT English-German.
99% hackable
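The scaled dot-product attention at the heart of the paper fits in a few lines. A minimal NumPy sketch of multi-head self-attention (dimensions and weight names are illustrative, not from the paper's codebase):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Multi-head self-attention over a (seq_len, d_model) input."""
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Split into heads: (num_heads, seq_len, d_head).
    split = lambda M: M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    Q, K, V = split(Q), split(K), split(V)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # scaled dot product
    A = softmax(scores, axis=-1)                         # attention weights
    out = (A @ V).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

rng = np.random.default_rng(0)
d_model, seq_len, heads = 16, 5, 4
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
Y = multi_head_attention(X, Wq, Wk, Wv, Wo, heads)
print(Y.shape)  # (5, 16)
```

Because every position attends to every other in one matrix product, the whole sequence is processed in parallel — this is where the training speedup over recurrent models comes from.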
Diffusion · Image Generation · Generative Models
In Review

High-Resolution Image Synthesis with Latent Diffusion Models

  • Encodes images into 4x compressed latent space with a VQ-regularized autoencoder.
  • Diffusion happens in latent space, reducing compute vs pixel-space diffusion by 4-8x.
  • Cross-attention conditioning enables flexible text, image, and semantic map control.
97% hackable
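The core compute win is that diffusion runs on a small latent, not on pixels. A toy sketch of the forward noising process applied in a downsampled latent space — the block-averaging "encoder" here is a stand-in for the paper's learned VQ-regularized autoencoder, and the schedule values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(img, f=4):
    """Toy encoder: f-times spatial downsampling by block averaging.
    (The real model uses a learned autoencoder; this only shows the shapes.)"""
    H, W, C = img.shape
    return img.reshape(H // f, f, W // f, f, C).mean(axis=(1, 3))

def q_sample(z0, t, alpha_bar):
    """Forward diffusion q(z_t | z_0), applied in latent space."""
    eps = rng.normal(size=z0.shape)
    return np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1 - alpha_bar[t]) * eps

img = rng.normal(size=(64, 64, 3))          # stand-in "image"
z0 = encode(img)                             # (16, 16, 3)
betas = np.linspace(1e-4, 0.02, 1000)        # illustrative linear schedule
alpha_bar = np.cumprod(1 - betas)
zt = q_sample(z0, t=500, alpha_bar=alpha_bar)
print(z0.shape, img.size / z0.size)          # (16, 16, 3) 16.0
```

With a factor-4 downsampling on each spatial axis, the denoiser sees 16x fewer positions per step, which is where the 4-8x compute reduction over pixel-space diffusion comes from.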
LLM · Fine-tuning · Efficiency
Reproduced

LoRA: Low-Rank Adaptation of Large Language Models

  • Freezes pre-trained weights and injects trainable rank decomposition matrices into each layer.
  • Reduces GPU memory requirements by >3x, enabling fine-tuning on consumer hardware.
  • Achieves comparable or better performance vs full fine-tuning on NLP benchmarks.
95% hackable
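The whole trick is a frozen weight plus a trainable low-rank update. A minimal sketch (layer sizes, rank, and scaling are illustrative; the zero-init of B matches the paper's recipe so training starts from the pre-trained behavior):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init

def lora_forward(x):
    # Frozen path plus low-rank update, scaled by alpha / r.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialized, the adapted layer matches the frozen layer exactly.
assert np.allclose(lora_forward(x), W @ x)
# Trainable parameters: r*(d_in + d_out) vs d_in*d_out for full fine-tuning.
print(r * (d_in + d_out), d_in * d_out)  # 1024 4096
```

Only A and B receive gradients, so optimizer state and gradient memory shrink with the rank r rather than with the full weight matrix — the source of the >3x GPU memory reduction.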
Multimodal · Vision · Contrastive Learning
Reproduced

Learning Transferable Visual Models From Natural Language Supervision

  • Contrastive loss pulls matching image-text pairs together and pushes non-matching pairs apart.
  • Zero-shot transfer: describe a class in text and classify images without any fine-tuning.
  • Matches supervised ResNet-50 accuracy on ImageNet zero-shot, without using any of its labeled examples.
94% hackable
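The contrastive objective is a symmetric cross-entropy over cosine similarities, with matching pairs on the diagonal. A toy NumPy sketch (the temperature value is illustrative; the paper learns it):

```python
import numpy as np

def log_softmax(M):
    M = M - M.max(axis=1, keepdims=True)
    return M - np.log(np.exp(M).sum(axis=1, keepdims=True))

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image-text pairs."""
    I = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    T = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = I @ T.T / temperature    # cosine similarities, temperature-scaled
    # Matching pairs sit on the diagonal: pull them up, push the rest down.
    loss_img = -np.mean(np.diag(log_softmax(logits)))    # image -> text
    loss_txt = -np.mean(np.diag(log_softmax(logits.T)))  # text -> image
    return (loss_img + loss_txt) / 2

# Perfectly aligned, mutually orthogonal pairs give a near-zero loss.
aligned = clip_loss(np.eye(4), np.eye(4))
rng = np.random.default_rng(0)
rand_loss = clip_loss(rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
```

Zero-shot classification then reuses the same machinery: embed one text prompt per class and pick the class whose text embedding has the highest cosine similarity with the image.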
LLM · Quantization · Efficiency
Reproduced

QLoRA: Efficient Finetuning of Quantized LLMs

  • Introduces 4-bit NormalFloat (NF4), a new data type optimal for normally-distributed weights.
  • Double quantization further reduces memory by ~0.5 bits per parameter.
  • Guanaco models fine-tuned with QLoRA match ChatGPT on human benchmarks.
92% hackable
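Blockwise 4-bit quantization is simple to sketch: scale each block by its absolute maximum, then snap each weight to the nearest of 16 code levels. The evenly spaced levels below are a simplified stand-in for the NF4 code, whose levels are instead quantiles of a standard normal (which is what makes it a better fit for normally-distributed weights):

```python
import numpy as np

def quantize_block(w, levels):
    """Blockwise absmax quantization: scale to [-1, 1], snap to nearest level."""
    scale = np.abs(w).max()
    idx = np.abs(w[:, None] / scale - levels[None, :]).argmin(axis=1)
    return idx.astype(np.uint8), scale

def dequantize_block(idx, scale, levels):
    return levels[idx] * scale

# 16 uniform levels; NF4 would place these at normal-distribution quantiles.
levels = np.linspace(-1, 1, 16)
rng = np.random.default_rng(0)
w = rng.normal(size=64)                 # one quantization block of weights
idx, scale = quantize_block(w, levels)
w_hat = dequantize_block(idx, scale, levels)
# Reconstruction error is bounded by half the level spacing, times the scale.
print(np.abs(w - w_hat).max() <= scale / 15)  # True
```

"Double quantization" in the paper then quantizes the per-block `scale` constants themselves, which is where the extra ~0.5 bits per parameter of savings comes from.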
RAG · LLM · Retrieval
Archived

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

  • Fine-tunes both the retriever and generator end-to-end for knowledge-intensive NLP.
  • Outperforms pure parametric models on open-domain QA with far less compute.
  • Enables easy knowledge updates without retraining the full model.
90% hackable
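The retrieval half reduces to nearest-neighbor search over dense embeddings. A toy sketch with random vectors standing in for a trained DPR retriever (the real system retrieves passages and feeds them to a BART generator, trained end-to-end):

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=2):
    """Return the indices and scores of the k most similar documents
    by cosine similarity. Embeddings here are random stand-ins."""
    q = query_vec / np.linalg.norm(query_vec)
    D = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = D @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 32))            # 100 "document" embeddings
query = docs[7] + 0.1 * rng.normal(size=32)  # a query near document 7
top, scores = retrieve(query, docs, k=3)
print(top[0])  # 7
```

Because knowledge lives in the document index rather than in the model weights, updating what the system "knows" means re-indexing documents, not retraining — the last bullet above.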
Architecture · SSM · Efficiency
In Review

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

  • A selective state space model (SSM) whose dynamics depend on the input, letting it remember or forget context as needed.
  • 5x faster inference than Transformers of equivalent size at long sequence lengths.
  • Outperforms Transformers on multiple language, audio, and genomics benchmarks.
88% hackable
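The "selective" part means the recurrence gates are computed from the current input, so the state can keep or drop context per token. A heavily simplified sketch — real Mamba discretizes a structured continuous-time SSM with a hardware-aware scan; this only shows the input-dependent gating idea, with made-up projection names:

```python
import numpy as np

def selective_scan(x, Wa, Wb):
    """Toy selective recurrence: decay a_t and input gate b_t are functions
    of the current input, so the state selectively remembers or forgets."""
    T, d = x.shape
    h = np.zeros(d)
    ys = np.empty((T, d))
    for t in range(T):
        a = 1.0 / (1.0 + np.exp(-(x[t] @ Wa)))  # in (0, 1): forget vs keep
        b = x[t] @ Wb                           # input-dependent input gate
        h = a * h + b * x[t]                    # linear recurrence in t
        ys[t] = h
    return ys

rng = np.random.default_rng(0)
T, d = 8, 4
x = rng.normal(size=(T, d))
y = selective_scan(x, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(y.shape)  # (8, 4)
```

Because the update is a linear recurrence, it admits an associative parallel scan and needs no quadratic attention matrix — the source of the inference-speed advantage at long sequence lengths.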
Robotics · Diffusion · RL
Archived

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

  • Formulates robot policy as a conditional denoising diffusion process over actions.
  • Handles multimodal action distributions better than behavior cloning or IBC.
  • Achieves 46% improvement over best prior method on 12 robot manipulation tasks.
82% hackable
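Treating the policy as a denoiser means sampling an action by iteratively refining Gaussian noise. A toy DDPM-style sampler over a 2-D action — the `denoise` function here is a hypothetical stand-in for the learned, observation-conditioned noise predictor, and the schedule is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_action(denoise, obs, action_dim=2, steps=50):
    """Sample an action by iteratively denoising Gaussian noise,
    conditioned on the observation via denoise(a, obs, t)."""
    betas = np.linspace(1e-4, 0.05, steps)   # illustrative noise schedule
    alphas = 1 - betas
    alpha_bar = np.cumprod(alphas)
    a = rng.normal(size=action_dim)          # start from pure noise
    for t in reversed(range(steps)):
        eps_hat = denoise(a, obs, t)         # predicted noise at step t
        a = (a - betas[t] / np.sqrt(1 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                            # add noise on all but final step
            a += np.sqrt(betas[t]) * rng.normal(size=action_dim)
    return a

# Hypothetical "trained" predictor that steers toward one target action;
# a real policy network conditions on visual observations instead.
target = np.array([0.5, -0.3])
denoise = lambda a, obs, t: a - target
action = sample_action(denoise, obs=None)
```

Because the sampler returns different actions from different noise seeds, the policy can represent several valid ways to do a task at once — the multimodality that behavior cloning struggles with.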
Biology · Protein · Diffusion
Archived

Accurate structure prediction of biomolecular interactions with AlphaFold 3

  • Replaces AlphaFold 2's structure module with a diffusion module that generates 3D atom coordinates directly.
  • Covers proteins, DNA, RNA, and small molecules in a single unified model.
  • Sets SOTA on diverse benchmarks for protein-ligand and antibody-antigen docking.
78% hackable