
This week we stocked up on popcorn at MultiCortex, because it was an absolute blockbuster week for papers in the AI field. We believe it was the most remarkable week of 2024, marked by significant advances in artificial intelligence, with the publication of numerous papers ranging from improvements in model architectures to questions of safety and efficiency. Here is the list of these works:
- Byte Latent Transformer
- Training Large Language Models to Reason in a Continuous Latent Space
- Language Modeling in a Sentence Representation Space
- Phi-4 Technical Report
- Best-of-N Jailbreaking
- Forking Paths in Neural Text Generation
- Refusal Tokens
- [MASK] is All You Need
- Explore Theory-of-Mind
- Obfuscated Activations Bypass LLM Latent-Space Defenses
- The Pitfalls of Memorization
- How to Merge Your Multimodal Models Over Time?
- Machine Unlearning Doesn’t Do What You Think
- Understanding Gradient Descent through the Training Jacobian
- An Evolved Universal Transformer Memory
- Transformers Struggle to Learn to Search
- Transformers Can Navigate Mazes With Multi-Step Prediction
- Frontier Models are Capable of In-context Scheming
- Mixture of Monosemantic Experts for Transformers
- Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
- Scalable Text and Image Conditioned Video Generation
- Hidden in the Noise: Two-Stage Robust Watermarking for Images
- Learned Compression for Compressed Learning
- Learning Flow Fields in Attention for Controllable Person Image Generation
- ProcessBench: Identifying Process Errors in Mathematical Reasoning
- Unraveling the Complexity of Memory in RL Agents
- Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
- APOLLO: SGD-like Memory, AdamW-level Performance
- Neural LightRig