Warning: this is a very bad bibliography (that is part of the exercise):
Do not do it like this.
Chen, Meng, Jiawei Tu, Chao Qi, et al. 2025. “Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation.” arXiv:2409.10071. Version 5. Preprint, arXiv, August 15. https://doi.org/10.48550/arXiv.2409.10071.
Cools, Kasper, Clara Maathuis, Alexander M. van Oers, et al. 2025. “Vision Transformers: The Threat of Realistic Adversarial Patches.” arXiv:2509.21084. Preprint, arXiv, September 25. https://doi.org/10.48550/arXiv.2509.21084.
Goldblum, Micah, Dimitris Tsipras, Chulin Xie, et al. 2021. “Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses.” arXiv:2012.10544. Preprint, arXiv, March 31. https://doi.org/10.48550/arXiv.2012.10544.
Gu, Jindong, Xiaojun Jia, Pau de Jorge, et al. 2024. “A Survey on Transferability of Adversarial Examples across Deep Neural Networks.” arXiv:2310.17626. Preprint, arXiv, May 2. https://doi.org/10.48550/arXiv.2310.17626.
Laugros, Alfred, Alice Caplier, and Matthieu Ospici. 2021. “Using Synthetic Corruptions to Measure Robustness to Natural Distribution Shifts.” arXiv:2107.12052. Preprint, arXiv, November 18. https://doi.org/10.48550/arXiv.2107.12052.
Li, Yiquan, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Bo Li, and Chaowei Xiao. 2024. “Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness.” arXiv:2407.00623. Version 1. Preprint, arXiv, June 30. https://doi.org/10.48550/arXiv.2407.00623.
Lu, Liming, Shuchao Pang, Siyuan Liang, et al. 2025. “Adversarial Training for Multimodal Large Language Models against Jailbreak Attacks.” arXiv:2503.04833. Preprint, arXiv, March 18. https://doi.org/10.48550/arXiv.2503.04833.
Lyu, Saiyue, Shadab Shaikh, Frederick Shpilevskiy, Evan Shelhamer, and Mathias Lécuyer. 2025. “Adaptive Randomized Smoothing: Certified Adversarial Robustness for Multi-Step Defences.” arXiv:2406.10427. Version 3. Preprint, arXiv, July 10. https://doi.org/10.48550/arXiv.2406.10427.
Mahmood, Kaleel, Rigel Mahmood, and Marten van Dijk. 2021. “On the Robustness of Vision Transformers to Adversarial Examples.” arXiv:2104.02610. Preprint, arXiv, June 5. https://doi.org/10.48550/arXiv.2104.02610.
Wang, Jiakai, Xianglong Liu, Jin Hu, et al. 2024. “Adversarial Examples in the Physical World: A Survey.” arXiv:2311.01473. Version 2. Preprint, arXiv, July 19. https://doi.org/10.48550/arXiv.2311.01473.
Akki, Shivayogi, and Tan Chen. 2025. “Benchmarking Model Predictive Control and Reinforcement Learning Based Control for Legged Robot Locomotion in MuJoCo Simulation.” arXiv:2501.16590. Preprint, arXiv, January 28. https://doi.org/10.48550/arXiv.2501.16590.
Brohan, Anthony, Noah Brown, Justice Carbajal, et al. 2023a. “RT-1: Robotics Transformer for Real-World Control at Scale.” arXiv:2212.06817. Preprint, arXiv, August 11. https://doi.org/10.48550/arXiv.2212.06817.
Brohan, Anthony, Noah Brown, Justice Carbajal, et al. 2023b. “RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control.” arXiv:2307.15818. Preprint, arXiv, July 28. https://doi.org/10.48550/arXiv.2307.15818.
Burchi, Maxime, and Radu Timofte. 2024. “MuDreamer: Learning Predictive World Models without Reconstruction.” arXiv:2405.15083. Preprint, arXiv, May 23. https://doi.org/10.48550/arXiv.2405.15083.
Chittepu, Yaswanth, Blossom Metevier, Will Schwarzer, Austin Hoag, Scott Niekum, and Philip S. Thomas. 2025. “Reinforcement Learning from Human Feedback with High-Confidence Safety Constraints.” arXiv:2506.08266. Version 1. Preprint, arXiv, June 9. https://doi.org/10.48550/arXiv.2506.08266.
Gajewski, Paul, Dominik Żurek, Marcin Pietroń, and Kamil Faber. 2024. “Solving Multi-Goal Robotic Tasks with Decision Transformer.” arXiv:2410.06347. Preprint, arXiv, October 8. https://doi.org/10.48550/arXiv.2410.06347.
Li, Zezeng, Alexandre Chapin, Enda Xiang, et al. 2025. “Robotic Manipulation via Imitation Learning: Taxonomy, Evolution, Benchmark, and Challenges.” arXiv:2508.17449. Version 1. Preprint, arXiv, August 24. https://doi.org/10.48550/arXiv.2508.17449.
Morad, Steven, Ajay Shankar, Jan Blumenkamp, and Amanda Prorok. 2024. “Language-Conditioned Offline RL for Multi-Robot Navigation.” arXiv:2407.20164. Preprint, arXiv, July 29. https://doi.org/10.48550/arXiv.2407.20164.
Nair, Suraj, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, and Abhinav Gupta. 2022. “R3M: A Universal Visual Representation for Robot Manipulation.” arXiv:2203.12601. Preprint, arXiv, November 18. https://doi.org/10.48550/arXiv.2203.12601.
Zakka, Kevin, Baruch Tabanpour, Qiayuan Liao, et al. 2025. “MuJoCo Playground.” arXiv:2502.08844. Version 1. Preprint, arXiv, February 12. https://doi.org/10.48550/arXiv.2502.08844.
Chen, Lei, Yuan Meng, Chen Tang, et al. 2024. “Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers.” arXiv:2406.17343. Version 1. Preprint, arXiv, June 25. https://doi.org/10.48550/arXiv.2406.17343.
Fladmark, Eirik, Muhammad Hamza Sajjad, and Laura Brinkholm Justesen. 2023. “Exploring the Performance of Pruning Methods in Neural Networks: An Empirical Study of the Lottery Ticket Hypothesis.” arXiv:2303.15479. Preprint, arXiv, March 26. https://doi.org/10.48550/arXiv.2303.15479.
Gu, Yuxian, Li Dong, Furu Wei, and Minlie Huang. 2025. “MiniLLM: Knowledge Distillation of Large Language Models.” arXiv:2306.08543. Preprint, arXiv, November 21. https://doi.org/10.48550/arXiv.2306.08543.
Huang, Xijie, Zhiqiang Shen, Pingcheng Dong, and Kwang-Ting Cheng. 2024. “Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precision.” arXiv:2307.00331. Preprint, arXiv, October 12. https://doi.org/10.48550/arXiv.2307.00331.
Jayanth, Rakshith, Neelesh Gupta, and Viktor Prasanna. 2024. “Benchmarking Edge AI Platforms for High-Performance ML Inference.” arXiv:2409.14803. Version 1. Preprint, arXiv, September 23. https://doi.org/10.48550/arXiv.2409.14803.
Liang, Jessica, and Anirudh Bharadwaj. 2025. “QR-LoRA: QR-Based Low-Rank Adaptation for Efficient Fine-Tuning of Large Language Models.” arXiv:2508.21810. Preprint, arXiv, August 29. https://doi.org/10.48550/arXiv.2508.21810.
Mansourian, Amir M., Rozhan Ahmadi, Masoud Ghafouri, et al. 2025. “A Comprehensive Survey on Knowledge Distillation.” arXiv:2503.12067. Preprint, arXiv, October 11. https://doi.org/10.48550/arXiv.2503.12067.
Rakka, Mariam, Marios Fournarakis, Olga Krestinskaya, et al. 2025. “Mixed-Precision Quantization for Language Models: Techniques and Prospects.” arXiv:2510.16805. Preprint, arXiv, October 19. https://doi.org/10.48550/arXiv.2510.16805.
Tian, Chunlin, Xuyang Wei, Huanrong Liu, Zhijiang Guo, and Li Li. 2025. “Less Is More: Resource-Efficient Low-Rank Adaptation.” arXiv:2512.00878. Preprint, arXiv, November 30. https://doi.org/10.48550/arXiv.2512.00878.
Wu, Junyi, Haoxuan Wang, Yuzhang Shang, Mubarak Shah, and Yan Yan. 2024. “PTQ4DiT: Post-Training Quantization for Diffusion Transformers.” arXiv:2405.16005. Version 3. Preprint, arXiv, October 17. https://doi.org/10.48550/arXiv.2405.16005.
Chen, Xiaoyang, Ben He, Hongyu Lin, et al. 2024. “Spiral of Silence: How Is Large Language Model Killing Information Retrieval? -- A Case Study on Open Domain Question Answering.” arXiv:2404.10496. Preprint, arXiv, June 23. https://doi.org/10.48550/arXiv.2404.10496.
Csizmadia, Daniel, Andrei Codreanu, Victor Sim, et al. 2025. “Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation.” arXiv:2505.21549. Version 2. Preprint, arXiv, May 29. https://doi.org/10.48550/arXiv.2505.21549.
Guu, Kelvin, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. 2020. “REALM: Retrieval-Augmented Language Model Pre-Training.” arXiv:2002.08909. Preprint, arXiv, February 10. https://doi.org/10.48550/arXiv.2002.08909.
Izacard, Gautier, Mathilde Caron, Lucas Hosseini, et al. 2022. “Unsupervised Dense Information Retrieval with Contrastive Learning.” arXiv:2112.09118. Preprint, arXiv, August 29. https://doi.org/10.48550/arXiv.2112.09118.
Karpukhin, Vladimir, Barlas Oğuz, Sewon Min, et al. 2020. “Dense Passage Retrieval for Open-Domain Question Answering.” arXiv:2004.04906. Preprint, arXiv, September 30. https://doi.org/10.48550/arXiv.2004.04906.
Khattab, Omar, and Matei Zaharia. 2020. “ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT.” arXiv:2004.12832. Preprint, arXiv, June 4. https://doi.org/10.48550/arXiv.2004.12832.
Lewis, Patrick, Ethan Perez, Aleksandra Piktus, et al. 2021. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” arXiv:2005.11401. Preprint, arXiv, April 12. https://doi.org/10.48550/arXiv.2005.11401.
Pan, Zhenyu, Haozheng Luo, Manling Li, and Han Liu. 2024. “Conv-CoA: Improving Open-Domain Question Answering in Large Language Models via Conversational Chain-of-Action.” arXiv:2405.17822. Preprint, arXiv, May 28. https://doi.org/10.48550/arXiv.2405.17822.
Thakur, Nandan, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, and Iryna Gurevych. 2021. “BEIR: A Heterogenous Benchmark for Zero-Shot Evaluation of Information Retrieval Models.” arXiv:2104.08663. Preprint, arXiv, October 21. https://doi.org/10.48550/arXiv.2104.08663.
Zhong, Ming, Zhizhi Wu, and Nanako Honda. 2024. “Deep Learning Based Dense Retrieval: A Comparative Study.” arXiv:2410.20315. Version 1. Preprint, arXiv, October 27. https://doi.org/10.48550/arXiv.2410.20315.
Bao, Guangsheng, Zhiyang Teng, and Yue Zhang. 2023. “Target-Side Augmentation for Document-Level Machine Translation.” arXiv:2305.04505. Preprint, arXiv, June 4. https://doi.org/10.48550/arXiv.2305.04505.
Bogoychev, Nikolay, and Pinzhen Chen. 2023. “Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting.” arXiv:2310.05824. Preprint, arXiv, October 9. https://doi.org/10.48550/arXiv.2310.05824.
Dale, David, Elena Voita, Janice Lam, et al. 2023. “HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation.” arXiv:2305.11746. Preprint, arXiv, December 6. https://doi.org/10.48550/arXiv.2305.11746.
Guerreiro, Nuno M., Duarte Alves, Jonas Waldendorf, et al. 2023. “Hallucinations in Large Multilingual Translation Models.” arXiv:2303.16104. Preprint, arXiv, March 28. https://doi.org/10.48550/arXiv.2303.16104.
He, Zhiwei, Tian Liang, Wenxiang Jiao, et al. 2023. “Exploring Human-Like Translation Strategy with Large Language Models.” arXiv:2305.04118. Preprint, arXiv, November 29. https://doi.org/10.48550/arXiv.2305.04118.
Herold, Christian, and Hermann Ney. 2023. “Improving Long Context Document-Level Machine Translation.” arXiv:2306.05183. Preprint, arXiv, June 8. https://doi.org/10.48550/arXiv.2306.05183.
Lu, Hongyuan, Haoran Yang, Haoyang Huang, Dongdong Zhang, Wai Lam, and Furu Wei. 2024. “Chain-of-Dictionary Prompting Elicits Translation in Large Language Models.” arXiv:2305.06575. Preprint, arXiv, August 17. https://doi.org/10.48550/arXiv.2305.06575.
Sennrich, Rico, Jannis Vamvas, and Alireza Mohammadshahi. 2024. “Mitigating Hallucinations and Off-Target Machine Translation with Source-Contrastive and Language-Contrastive Decoding.” arXiv:2309.07098. Preprint, arXiv, January 29. https://doi.org/10.48550/arXiv.2309.07098.
Wang, Longyue, Chenyang Lyu, Tianbo Ji, et al. 2023. “Document-Level Machine Translation with Large Language Models.” arXiv:2304.02210. Preprint, arXiv, October 24. https://doi.org/10.48550/arXiv.2304.02210.
Zhu, Wenhao, Hongyi Liu, Qingxiu Dong, et al. 2024. “Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis.” arXiv:2304.04675. Preprint, arXiv, June 14. https://doi.org/10.48550/arXiv.2304.04675.
Berner, Julius, Miguel Liu-Schiaffini, Jean Kossaifi, et al. 2025. “Principled Approaches for Extending Neural Architectures to Function Spaces for Operator Learning.” arXiv:2506.10973. Preprint, arXiv, June 12. https://doi.org/10.48550/arXiv.2506.10973.
Ganga, Sai, and Ziya Uddin. 2024. “Exploring Physics-Informed Neural Networks: From Fundamentals to Applications in Complex Systems.” arXiv:2410.00422. Preprint, arXiv, October 1. https://doi.org/10.48550/arXiv.2410.00422.
Karumuri, Sharmila, Lori Graham-Brady, and Somdatta Goswami. 2025. “Physics-Informed Latent Neural Operator for Real-Time Predictions of Time-Dependent Parametric PDEs.” arXiv:2501.08428. Version 3. Preprint, arXiv, October 28. https://doi.org/10.48550/arXiv.2501.08428.
Lassen, Oskar Bohn, Serio Angelo Maria Agriesti, Filipe Rodrigues, and Francisco Camara Pereira. 2025. “Climate Surrogates for Scalable Multi-Agent Reinforcement Learning: A Case Study with CICERO-SCM.” arXiv:2510.07971. Version 1. Preprint, arXiv, October 9. https://doi.org/10.48550/arXiv.2510.07971.
Lejarza, Fernando, and Michael Baldea. 2022. “DySMHO: Data-Driven Discovery of Governing Equations for Dynamical Systems via Moving Horizon Optimization.” Scientific Reports 12 (1): 11836. https://doi.org/10.1038/s41598-022-13644-w.
Oommen, Vivek, Siavash Khodakarami, Aniruddha Bora, Zhicheng Wang, and George Em Karniadakis. 2025. “Learning Turbulent Flows with Generative Models: Super-Resolution, Forecasting, and Sparse Flow Reconstruction.” arXiv:2509.08752. Preprint, arXiv, September 10. https://doi.org/10.48550/arXiv.2509.08752.
Owens, Katherine, and J. Nathan Kutz. 2022. “Data-Driven Discovery of Governing Equations for Coarse-Grained Heterogeneous Network Dynamics.” arXiv:2205.10965. Preprint, arXiv, May 23. https://doi.org/10.48550/arXiv.2205.10965.
Tauberschmidt, Jan, Sophie Fellenz, Sebastian J. Vollmer, and Andrew B. Duncan. 2025. “Physics-Constrained Fine-Tuning of Flow-Matching Models for Generation and Inverse Problems.” arXiv:2508.09156. Preprint, arXiv, August 5. https://doi.org/10.48550/arXiv.2508.09156.
Toscano, Juan Diego, Vivek Oommen, Alan John Varghese, et al. 2024. “From PINNs to PIKANs: Recent Advances in Physics-Informed Machine Learning.” arXiv:2410.13228. Preprint, arXiv, October 22. https://doi.org/10.48550/arXiv.2410.13228.
You, Wen, Shaoqian Zhou, and Xuhui Meng. 2025. “Self-Supervised Neural Operator for Solving Partial Differential Equations.” arXiv:2509.00867. Version 1. Preprint, arXiv, August 31. https://doi.org/10.48550/arXiv.2509.00867.
Huang, Rongjie, Mingze Li, Dongchao Yang, et al. 2023. “AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.” arXiv:2304.12995. Preprint, arXiv, April 25. https://doi.org/10.48550/arXiv.2304.12995.
Luo, Zhengxiong, Dayou Chen, Yingya Zhang, et al. 2023. “VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation.” arXiv:2303.08320. Preprint, arXiv, October 13. https://doi.org/10.48550/arXiv.2303.08320.
Moser, Brian B., Arundhati S. Shanbhag, Federico Raue, Stanislav Frolov, Sebastian Palacio, and Andreas Dengel. 2025. “Diffusion Models, Image Super-Resolution and Everything: A Survey.” IEEE Transactions on Neural Networks and Learning Systems 36 (7): 11793–813. https://doi.org/10.1109/TNNLS.2024.3476671.
Nichol, Alex, Prafulla Dhariwal, Aditya Ramesh, et al. 2022. “GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.” arXiv:2112.10741. Preprint, arXiv, March 8. https://doi.org/10.48550/arXiv.2112.10741.
Saharia, Chitwan, William Chan, Saurabh Saxena, et al. 2022. “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding.” arXiv:2205.11487. Preprint, arXiv, May 23. https://doi.org/10.48550/arXiv.2205.11487.
Sordo, Zineb, Eric Chagnon, and Daniela Ushizima. 2025. “A Review on Generative AI for Text-to-Image and Image-to-Image Generation and Implications to Scientific Images.” arXiv:2502.21151. Version 2. Preprint, arXiv, March 10. https://doi.org/10.48550/arXiv.2502.21151.
Sun, Quan, Qiying Yu, Yufeng Cui, et al. 2023. “Generative Pretraining in Multimodality.” arXiv:2307.05222. Version 1. Preprint, arXiv, July 11. https://doi.org/10.48550/arXiv.2307.05222.
Xu, Katherine, Lingzhi Zhang, and Jianbo Shi. 2025. “Detecting Origin Attribution for Text-to-Image Diffusion Models.” arXiv:2403.19653. Preprint, arXiv, April 16. https://doi.org/10.48550/arXiv.2403.19653.
Zhang, Jinjin, Qiuyu Huang, Junjie Liu, Xiefan Guo, and Di Huang. 2025. “Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models.” arXiv:2503.18352. Version 1. Preprint, arXiv, March 24. https://doi.org/10.48550/arXiv.2503.18352.
Zhang, Lvmin, Anyi Rao, and Maneesh Agrawala. 2023. “Adding Conditional Control to Text-to-Image Diffusion Models.” arXiv:2302.05543. Preprint, arXiv, November 26. https://doi.org/10.48550/arXiv.2302.05543.
Frantar, Elias, Carlos Riquelme, Neil Houlsby, Dan Alistarh, and Utku Evci. 2023. “Scaling Laws for Sparsely-Connected Foundation Models.” arXiv:2309.08520. Preprint, arXiv, September 15. https://doi.org/10.48550/arXiv.2309.08520.
Liu, Fan, Tianshu Zhang, Wenwen Dai, Wenwen Cai, Xiaocong Zhou, and Delong Chen. 2024. “Few-Shot Adaptation of Multi-Modal Foundation Models: A Survey.” arXiv:2401.01736. Preprint, arXiv, January 4. https://doi.org/10.48550/arXiv.2401.01736.
Liu, Xu, Tong Zhou, Yuanxin Wang, et al. 2023. “Towards the Unification of Generative and Discriminative Visual Foundation Model: A Survey.” arXiv:2312.10163. Preprint, arXiv, December 15. https://doi.org/10.48550/arXiv.2312.10163.
Lu, Jianglin, Hailing Wang, Yi Xu, Yizhou Wang, Kuo Yang, and Yun Fu. 2025. “Representation Potentials of Foundation Models for Multimodal Alignment: A Survey.” arXiv:2510.05184. Preprint, arXiv, October 5. https://doi.org/10.48550/arXiv.2510.05184.
Schneider, Johannes, Christian Meske, and Pauline Kuss. 2024. “Foundation Models.” Business & Information Systems Engineering 66 (2): 221–31. https://doi.org/10.1007/s12599-024-00851-0.
Subramanian, Shashank, Peter Harrington, Kurt Keutzer, et al. 2023. “Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior.” arXiv:2306.00258. Preprint, arXiv, June 1. https://doi.org/10.48550/arXiv.2306.00258.
Sun, Weigao, Jiaxi Hu, Yucheng Zhou, et al. 2025. “Speed Always Wins: A Survey on Efficient Architectures for Large Language Models.” arXiv:2508.09834. Preprint, arXiv, August 13. https://doi.org/10.48550/arXiv.2508.09834.
Xu, Mengwei, Wangsong Yin, Dongqi Cai, et al. 2024. “A Survey of Resource-Efficient LLM and Multimodal Foundation Models.” arXiv:2401.08092. Preprint, arXiv, September 23. https://doi.org/10.48550/arXiv.2401.08092.
Yuan, Yang. 2024. “On the Power of Foundation Models.” arXiv:2211.16327. Preprint, arXiv, October 22. https://doi.org/10.48550/arXiv.2211.16327.
Afchar, Darius, Gabriel Meseguer-Brocal, and Romain Hennequin. 2025. “AI-Generated Music Detection and Its Challenges.” arXiv:2501.10111. Version 1. Preprint, arXiv, January 17. https://doi.org/10.48550/arXiv.2501.10111.
Agostinelli, Andrea, Timo I. Denk, Zalán Borsos, et al. 2023. “MusicLM: Generating Music From Text.” arXiv:2301.11325. Preprint, arXiv, January 26. https://doi.org/10.48550/arXiv.2301.11325.
Chen, Yanxu, Linshu Huang, and Tian Gou. 2024. “Applications and Advances of Artificial Intelligence in Music Generation: A Review.” arXiv:2409.03715. Preprint, arXiv, September 3. https://doi.org/10.48550/arXiv.2409.03715.
Copet, Jade, Felix Kreuk, Itai Gat, et al. 2024. “Simple and Controllable Music Generation.” arXiv:2306.05284. Preprint, arXiv, January 30. https://doi.org/10.48550/arXiv.2306.05284.
Evans, Zach, Julian D. Parker, C. J. Carr, Zack Zukowski, Josiah Taylor, and Jordi Pons. 2024. “Long-Form Music Generation with Latent Diffusion.” arXiv:2404.10301. Version 2. Preprint, arXiv, July 29. https://doi.org/10.48550/arXiv.2404.10301.
Huang, Qingqing, Daniel S. Park, Tao Wang, et al. 2023. “Noise2Music: Text-Conditioned Music Generation with Diffusion Models.” arXiv:2302.03917. Preprint, arXiv, March 6. https://doi.org/10.48550/arXiv.2302.03917.
Lam, Max W. Y., Qiao Tian, Tang Li, et al. 2023. “Efficient Neural Music Generation.” arXiv:2305.15719. Preprint, arXiv, May 25. https://doi.org/10.48550/arXiv.2305.15719.
Lehmkuhl, Jonathan, Ábel Ilyés-Kun, Nico Bremes, Cemhan Kaan Özaltan, Frederik Muthers, and Jiayi Yuan. 2025. “Generating Piano Music with Transformers: A Comparative Study of Scale, Data, and Metrics.” arXiv:2511.07268. Preprint, arXiv, November 10. https://doi.org/10.48550/arXiv.2511.07268.
Wu, Shih-Lun, and Yi-Hsuan Yang. 2022. “MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE.” arXiv:2105.04090. Preprint, arXiv, December 19. https://doi.org/10.48550/arXiv.2105.04090.
Yuan, Ruibin, Hanfeng Lin, Yi Wang, et al. 2024. “ChatMusician: Understanding and Generating Music Intrinsically with LLM.” arXiv:2402.16153. Preprint, arXiv, February 25. https://doi.org/10.48550/arXiv.2402.16153.
Chang, Amy, Nicholas Conley, Harish Santhanalakshmi Ganesan, and Adam Swanda. 2025. “Death by a Thousand Prompts: Open Model Vulnerability Analysis.” arXiv:2511.03247. Version 1. Preprint, arXiv, November 5. https://doi.org/10.48550/arXiv.2511.03247.
Chen, Sizhe, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, and Chuan Guo. 2024. “Aligning LLMs to Be Robust Against Prompt Injection.” arXiv:2410.05451. Version 1. Preprint, arXiv, October 7. https://doi.org/10.48550/arXiv.2410.05451.
Du, Chenghao, Quanfeng Huang, Tingxuan Tang, Zihao Wang, Adwait Nadkarni, and Yue Xiao. 2025. “Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels.” arXiv:2510.27140. Preprint, arXiv, November 6. https://doi.org/10.48550/arXiv.2510.27140.
Jia, Feiran, Tong Wu, Xin Qin, and Anna Squicciarini. 2024. “The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents.” arXiv:2412.16682. Preprint, arXiv, December 21. https://doi.org/10.48550/arXiv.2412.16682.
Kumar, Aounon, Chirag Agarwal, Suraj Srinivas, Aaron Jiaxun Li, Soheil Feizi, and Himabindu Lakkaraju. 2025. “Certifying LLM Safety against Adversarial Prompting.” arXiv:2309.02705. Preprint, arXiv, February 4. https://doi.org/10.48550/arXiv.2309.02705.
Peng, Benji, Ziqian Bi, Qian Niu, et al. 2024. “Jailbreaking and Mitigation of Vulnerabilities in Large Language Models.” arXiv:2410.15236. Version 1. Preprint, arXiv, October 20. https://doi.org/10.48550/arXiv.2410.15236.
Shang, Zhengchun, and Wenlan Wei. 2025. “Evolving Security in LLMs: A Study of Jailbreak Attacks and Defenses.” arXiv:2504.02080. Version 1. Preprint, arXiv, April 2. https://doi.org/10.48550/arXiv.2504.02080.
Shi, Jiawen, Zenghui Yuan, Yinuo Liu, et al. 2025. “Optimization-Based Prompt Injection Attack to LLM-as-a-Judge.” arXiv:2403.17710. Preprint, arXiv, August 24. https://doi.org/10.48550/arXiv.2403.17710.
Yi, Sibo, Yule Liu, Zhen Sun, et al. 2024. “Jailbreak Attacks and Defenses Against Large Language Models: A Survey.” arXiv:2407.04295. Preprint, arXiv, August 30. https://doi.org/10.48550/arXiv.2407.04295.
Zhao, Andrew, Reshmi Ghosh, Vitor Carvalho, et al. 2025. “Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-Based Optimizers.” arXiv:2510.14381. Preprint, arXiv, October 16. https://doi.org/10.48550/arXiv.2510.14381.
Catalano, Nico, and Matteo Matteucci. 2024. “Few Shot Semantic Segmentation: A Review of Methodologies, Benchmarks, and Open Challenges.” arXiv:2304.05832. Preprint, arXiv, May 20. https://doi.org/10.48550/arXiv.2304.05832.
Ke, Lei, Mingqiao Ye, Martin Danelljan, et al. 2023. “Segment Anything in High Quality.” arXiv:2306.01567. Preprint, arXiv, October 23. https://doi.org/10.48550/arXiv.2306.01567.
Kirillov, Alexander, Eric Mintun, Nikhila Ravi, et al. 2023. “Segment Anything.” arXiv:2304.02643. Preprint, arXiv, April 5. https://doi.org/10.48550/arXiv.2304.02643.
Li, Feng, Hao Zhang, Peize Sun, et al. 2023. “Semantic-SAM: Segment and Recognize Anything at Any Granularity.” arXiv:2307.04767. Preprint, arXiv, July 10. https://doi.org/10.48550/arXiv.2307.04767.
Li, Feng, Hao Zhang, Huaizhe Xu, et al. 2022. “Mask DINO: Towards A Unified Transformer-Based Framework for Object Detection and Segmentation.” arXiv:2206.02777. Preprint, arXiv, December 12. https://doi.org/10.48550/arXiv.2206.02777.
Liu, Xinyu, Beiwen Tian, Zhen Wang, et al. 2023. “Delving Into Shape-Aware Zero-Shot Semantic Segmentation.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2999–3009. https://openaccess.thecvf.com/content/CVPR2023/html/Liu_Delving_Into_Shape-Aware_Zero-Shot_Semantic_Segmentation_CVPR_2023_paper.html.
Rajič, Frano, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, and Fisher Yu. 2023. “Segment Anything Meets Point Tracking.” arXiv:2307.01197. Preprint, arXiv, December 3. https://doi.org/10.48550/arXiv.2307.01197.
Wang, Xinlong, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, and Tiejun Huang. 2023. “SegGPT: Towards Segmenting Everything in Context.” In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 1130–40. https://openaccess.thecvf.com/content/ICCV2023/html/Wang_SegGPT_Towards_Segmenting_Everything_in_Context_ICCV_2023_paper.html.
Xu, Jiarui, Shalini De Mello, Sifei Liu, et al. 2022. “GroupViT: Semantic Segmentation Emerges from Text Supervision.” arXiv:2202.11094. Preprint, arXiv, July 18. https://doi.org/10.48550/arXiv.2202.11094.
Xu, Jilan, Junlin Hou, Yuejie Zhang, et al. 2023. “Learning Open-Vocabulary Semantic Segmentation Models From Natural Language Supervision.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2935–44. https://openaccess.thecvf.com/content/CVPR2023/html/Xu_Learning_Open-Vocabulary_Semantic_Segmentation_Models_From_Natural_Language_Supervision_CVPR_2023_paper.html.
Chen, Ting, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. “A Simple Framework for Contrastive Learning of Visual Representations.” arXiv:2002.05709. Preprint, arXiv, July 1. https://doi.org/10.48550/arXiv.2002.05709.
Chen, Wenxi, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, and Xie Chen. 2024. “EAT: Self-Supervised Pre-Training with Efficient Audio Transformer.” arXiv:2401.03497. Preprint, arXiv, January 7. https://doi.org/10.48550/arXiv.2401.03497.
Guo, Huijie, Jingyao Wang, Peizheng Guo, Xingchen Shen, Changwen Zheng, and Wenwen Qiang. 2025. “Exploring Transferability of Self-Supervised Learning by Task Conflict Calibration.” arXiv:2511.13787. Preprint, arXiv, November 16. https://doi.org/10.48550/arXiv.2511.13787.
Hondru, Vlad, Florinel Alin Croitoru, Shervin Minaee, Radu Tudor Ionescu, and Nicu Sebe. 2024. “Masked Image Modeling: A Survey.” arXiv:2408.06687. Version 1. Preprint, arXiv, August 13. https://doi.org/10.48550/arXiv.2408.06687.
Liu, Ziyu, Azadeh Alavi, Minyi Li, and Xiang Zhang. 2024. “Self-Supervised Learning for Time Series: Contrastive or Generative?” arXiv:2403.09809. Version 1. Preprint, arXiv, March 14. https://doi.org/10.48550/arXiv.2403.09809.
Ma, Duo, Xianghu Yue, Junyi Ao, Xiaoxue Gao, and Haizhou Li. 2024. “Text-Guided HuBERT: Self-Supervised Speech Pre-Training via Generative Adversarial Networks.” arXiv:2402.15725. Version 3. Preprint, arXiv, July 22. https://doi.org/10.48550/arXiv.2402.15725.
Naiman, Ilan, Emanuel Ben-Baruch, Oron Anschel, et al. 2025. “LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders.” arXiv:2504.03501. Preprint, arXiv, October 7. https://doi.org/10.48550/arXiv.2504.03501.
Shi, Yuge, Imant Daunhawer, Julia E. Vogt, Philip H. S. Torr, and Amartya Sanyal. 2022. “How Robust Is Unsupervised Representation Learning to Distribution Shift?” arXiv:2206.08871. Preprint, arXiv, December 16. https://doi.org/10.48550/arXiv.2206.08871.
Tan, Fuwen, Fatemeh Saleh, and Brais Martinez. 2023. “Effective Self-Supervised Pre-Training on Low-Compute Networks without Distillation.” arXiv:2210.02808. Preprint, arXiv, October 2. https://doi.org/10.48550/arXiv.2210.02808.
Zong, Yongshuo, Oisin Mac Aodha, and Timothy Hospedales. 2024. “Self-Supervised Multimodal Learning: A Survey.” arXiv:2304.01008. Preprint, arXiv, August 16. https://doi.org/10.48550/arXiv.2304.01008.
Borsos, Zalán, Raphaël Marinier, Damien Vincent, et al. 2023. “AudioLM: A Language Modeling Approach to Audio Generation.” arXiv:2209.03143. Preprint, arXiv, July 26. https://doi.org/10.48550/arXiv.2209.03143.
Chan, William, Daniel Park, Chris Lee, Yu Zhang, Quoc Le, and Mohammad Norouzi. 2021. “SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network.” arXiv:2104.02133. Preprint, arXiv, April 27. https://doi.org/10.48550/arXiv.2104.02133.
Chen, Sanyuan, Chengyi Wang, Zhengyang Chen, et al. 2022. “WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.” IEEE Journal of Selected Topics in Signal Processing 16 (6): 1505–18. https://doi.org/10.1109/JSTSP.2022.3188113.
Cui, Wenqian, Dianzhi Yu, Xiaoqi Jiao, et al. 2025. “Recent Advances in Speech Language Models: A Survey.” arXiv:2410.03751. Preprint, arXiv, August 7. https://doi.org/10.48550/arXiv.2410.03751.
Gulati, Anmol, James Qin, Chung-Cheng Chiu, et al. 2020. “Conformer: Convolution-Augmented Transformer for Speech Recognition.” arXiv:2005.08100. Preprint, arXiv, May 16. https://doi.org/10.48550/arXiv.2005.08100.
Ju, Zeqian, Yuancheng Wang, Kai Shen, et al. 2024. “NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.” arXiv:2403.03100. Preprint, arXiv, April 23. https://doi.org/10.48550/arXiv.2403.03100.
Lu, Yizhou, Mingkun Huang, Xinghua Qu, Pengfei Wei, and Zejun Ma. 2022. “Language Adaptive Cross-Lingual Speech Representation Learning with Sparse Sharing Sub-Networks.” arXiv:2203.04583. Preprint, arXiv, March 9. https://doi.org/10.48550/arXiv.2203.04583.
Mohamed, Abdelrahman, Hung-yi Lee, Lasse Borgholt, et al. 2022. “Self-Supervised Speech Representation Learning: A Review.” IEEE Journal of Selected Topics in Signal Processing 16 (6): 1179–210. https://doi.org/10.1109/JSTSP.2022.3207050.
Radford, Alec, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. 2022. “Robust Speech Recognition via Large-Scale Weak Supervision.” arXiv:2212.04356. Preprint, arXiv, December 6. https://doi.org/10.48550/arXiv.2212.04356.