Research Lab

Research experiments, UDOM analysis, and lab projects.

📄️ Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

Tommi Vatanen, Tapani Raiko, Harri Valpola (Department of Information and Computer Science, Aalto University School of Science, P.O. Box 15400, FI-00076 Aalto, Espoo, Finland); Yann LeCun (New York University, 715 Broadway, New York, NY 10003, USA)

📄️ Differentially- and non-differentially-private random decision trees

Mariusz Bojarski (NYU Polytechnic School of Engineering, Brooklyn, NY); Yann LeCun (Courant Institute of Mathematical Sciences and Facebook, New York, NY)

📄️ Entropy-SGD: Biasing Gradient Descent Into Wide Valleys. Code: https://github.com/ucla-vision/entropy-sgd

Pratik Chaudhari (1), Anna Choromanska (2), Stefano Soatto (1), Yann LeCun (3,4), Carlo Baldassi (5), Christian Borgs (6), Jennifer Chayes (6), Levent Sagun (3), Riccardo Zecchina (5). (1) Computer Science Department, University of California, Los Angeles; (2) Department of Electrical and Computer Engineering, New York University; (3) Courant Institute of Mathematical Sciences, New York University; (4) Facebook AI Research, New York; (5) Dipartimento di Scienza Applicata e Tecnologia, Politecnico di Torino; (6) Microsoft Research New England, Cambridge

📄️ GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations

Zhilin Yang (1)*, Jake (Junbo) Zhao (2,3)*, Bhuwan Dhingra (1), Kaiming He (3), William W. Cohen (4), Ruslan Salakhutdinov (1), Yann LeCun (2,3). *Equal contribution. (1) Carnegie Mellon University; (2) New York University; (3) Facebook AI Research; (4) Google, Inc.

📄️ Unsupervised Learning of Structured Representations via Closed-Loop Transcription

Shengbang Tong (1), Xili Dai (1,2)*, Yubei Chen (3), Mingyang Li (5), Zengyi Li (1), Brent Yi (1), Yann LeCun (3,4), Yi Ma (1,5). (1) University of California, Berkeley; (2) Hong Kong University of Science and Technology (Guangzhou); (3) Center for Data Science, New York University; (4) Courant Institute, New York University; (5) Tsinghua-Berkeley Shenzhen Institute (TBSI)

📄️ Graph MLP-Mixer

Ziyue Qi, Zixuan Lu, Yuchen Lu (School of Computing and Information, University of Pittsburgh, Pittsburgh, PA)

📄️ Latent Variable Energy Based Models Lecun

Current automated systems have crucial limitations that need to be addressed before artificial intelligence can reach human-like levels and bring new technological revolutions. Among others, our societies still lack Level 5 self-driving cars, domestic robots, and virtual assistants that learn reliable world models, reason, and plan complex action sequences. In these notes, we summarize the main ideas behind the architecture of autonomous intelligence of the future proposed by Yann LeCun. In particular, we introduce energy-based and latent variable models and combine their advantages in the building block of LeCun's proposal, that is, in the hierarchical joint embedding predictive architecture (H-JEPA).
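To make the energy-based latent-variable idea concrete, here is a minimal sketch (the toy predictor and all names are illustrative, not from the notes): an energy E(x, y, z) scores how compatible a prediction y is with an observation x under a latent variable z, and inference minimizes the energy over z.

```python
def predictor(x, z):
    # Toy world model: the next state is the current state shifted by latent z.
    return x + z

def energy(x, y, z):
    # Squared prediction error serves as the energy: low energy = compatible.
    return (y - predictor(x, z)) ** 2

def infer_latent(x, y, candidates):
    # Inference in a latent-variable EBM: pick the z that minimizes the energy.
    return min(candidates, key=lambda z: energy(x, y, z))

# With x = 2 and observed y = 5, the best latent shift is z = 3 (zero energy).
z_hat = infer_latent(x=2.0, y=5.0, candidates=[-1.0, 0.0, 1.0, 3.0])
```

In the H-JEPA proposal this minimization happens in a learned embedding space rather than raw pixel space, but the structure of inference (search over latents to minimize an energy) is the same.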

📄️ Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation. © 2025 IEEE. To appear in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun, Lawrence H. Staib, John A. Onofrey

📄️ V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Mido Assran, Adrien Bardes, David Fan, Quentin Garrido, Russell Howes, Mojtaba Komeili, Matthew Muckley, Ammar Rizvi, Claire Roberts, Koustuv Sinha, Artem Zholus, Sergio Arnaud, Abha Gejji, Ada Martin, Francois Robert Hogan, Daniel Dugas, Piotr Bojanowski, Vasil Khalidov, Patrick Labatut, Francisco Massa, Marc Szafraniec, Kapil Krishnakumar, Yong Li, Xiaodong Ma, Sarath Chandar, Franziska Meier, Yann LeCun, Michael Rabbat, Nicolas Ballas

📄️ Memory in the Age of AI Agents: A Survey of Forms, Functions and Dynamics

Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhenhong Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yuwei Niu, Yu Wang, Zhenfei Yin, Xiaobin Hu, Yue Liao, Qiankun Li, Kun Wang, Wangchunshu Zhou, Yixin Liu, Dawei Cheng, Qi Zhang, Tao Gui, Shirui Pan, Yan Zhang, Philip Torr, Zhicheng Dou, Ji-Rong Wen, Xuanjing Huang, Yu-Gang Jiang, Shuicheng Yan

📄️ research-lab-research-perplexity-google-cloud-workstati

You want a per‑tenant "dev pod" on Google Cloud Workstations that Coditect provisions at signup, with Coditect‑core licensed and enforced inside each environment. A clean way to do this: Coditect owns the control plane (tenants, users, licensing, projects/IAM), and Cloud Workstations is treated as an internal runtime that the control plane drives via API, with one or more workstation configs per Coditect plan and IAM‑based single- versus multi‑tenant sharing.
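The control-plane flow above can be sketched as follows. This is illustrative only: the plan names, machine types, payload shape, and `build_workstation_spec` helper are assumptions, and the actual Cloud Workstations API call (e.g. the REST `workstations.create` method) is left stubbed out; only the IAM role `roles/workstations.user` is a real Cloud Workstations role.

```python
from dataclasses import dataclass, field

# Illustrative plan catalog: each Coditect plan maps to a Cloud Workstations
# config choice (machine type, idle timeout) made by the control plane.
PLAN_CONFIGS = {
    "starter": {"machine_type": "e2-standard-4", "idle_timeout_s": 1800},
    "team": {"machine_type": "e2-standard-8", "idle_timeout_s": 7200},
}

@dataclass
class WorkstationRequest:
    tenant_id: str
    plan: str
    users: list = field(default_factory=list)

def build_workstation_spec(req: WorkstationRequest) -> dict:
    """Translate a signup event into a workstation-create payload.

    Single- vs multi-tenant sharing is expressed as per-user IAM bindings
    on the workstation; multi-user plans bind every tenant user.
    """
    cfg = PLAN_CONFIGS[req.plan]
    return {
        "workstation_id": f"ws-{req.tenant_id}",
        "machine_type": cfg["machine_type"],
        "idle_timeout": f"{cfg['idle_timeout_s']}s",
        "iam_bindings": [
            {"role": "roles/workstations.user", "member": f"user:{u}"}
            for u in req.users
        ],
        # License enforcement runs inside the environment; the control plane
        # just injects the tenant identity for Coditect-core to validate.
        "env": {"CODITECT_LICENSE_TENANT": req.tenant_id},
    }

spec = build_workstation_spec(
    WorkstationRequest(tenant_id="acme", plan="team", users=["dev@acme.io"])
)
```

The key design choice is that the payload is derived entirely from control-plane state (tenant, plan, user list), so the Workstations API remains a dumb executor and licensing policy stays in one place.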