Direct preference optimization: Your language model is secretly a reward model
R Rafailov, A Sharma, E Mitchell, CD Manning, S Ermon, C Finn
Advances in Neural Information Processing Systems 36, 2024. Cited by 530.

COMBO: Conservative offline model-based policy optimization
T Yu, A Kumar, R Rafailov, A Rajeswaran, S Levine, C Finn
Advances in Neural Information Processing Systems 34, 28954-28967, 2021. Cited by 324.

Offline reinforcement learning from images with latent space models
R Rafailov, T Yu, A Rajeswaran, C Finn
Learning for Dynamics and Control, 1154-1168, 2021. Cited by 112.

Offline meta-reinforcement learning with advantage weighting
E Mitchell, R Rafailov, XB Peng, S Levine, C Finn
International Conference on Machine Learning, 7780-7791, 2021. Cited by 96.

Open X-Embodiment: Robotic learning datasets and RT-X models
A Padalkar, A Pooley, A Jain, A Bewley, A Herzog, A Irpan, A Khazatsky, ...
arXiv preprint arXiv:2310.08864, 2023. Cited by 82.

Just ask for calibration: Strategies for eliciting calibrated confidence scores from language models fine-tuned with human feedback
K Tian, E Mitchell, A Zhou, A Sharma, R Rafailov, H Yao, C Finn, ...
arXiv preprint arXiv:2305.14975, 2023. Cited by 76.

Visual adversarial imitation learning using variational models
R Rafailov, T Yu, A Rajeswaran, C Finn
Advances in Neural Information Processing Systems 34, 3016-3028, 2021. Cited by 36.

Vision-based manipulators need to also see from their hands
K Hsu, MJ Kim, R Rafailov, J Wu, C Finn
arXiv preprint arXiv:2203.12677, 2022. Cited by 27.

On the sum of powered distances to certain sets of points on the circle
N Nikolov, R Rafailov
Pacific Journal of Mathematics 253 (1), 157-168, 2011. Cited by 23.

On extremums of sums of powered distances to a finite set of points
N Nikolov, R Rafailov
Geometriae Dedicata 167 (1), 69-89, 2013. Cited by 18.

Diffusion model alignment using direct preference optimization
B Wallace, M Dang, R Rafailov, L Zhou, A Lou, S Purushwalkam, S Ermon, ...
arXiv preprint arXiv:2311.12908, 2023. Cited by 16.

Contrastive Prefence Learning: Learning from human feedback without RL
J Hejna, R Rafailov, H Sikchi, C Finn, S Niekum, WB Knox, D Sadigh
arXiv preprint arXiv:2310.13639, 2023. Cited by 15.

Open X-Embodiment: Robotic learning datasets and RT-X models
Q Vuong, S Levine, HR Walke, K Pertsch, A Singh, R Doshi, C Xu, J Luo, ...
Towards Generalist Robots: Learning Paradigms for Scalable Skill Acquisition …, 2023. Cited by 11.

An emulator for fine-tuning large language models using small language models
E Mitchell, R Rafailov, A Sharma, C Finn, CD Manning
arXiv preprint arXiv:2310.12962, 2023. Cited by 9.

Disentangling length from quality in direct preference optimization
R Park, R Rafailov, S Ermon, C Finn
arXiv preprint arXiv:2403.19159, 2024. Cited by 4.

MOTO: Offline pre-training to online fine-tuning for model-based robot learning
R Rafailov, KB Hatch, V Kolev, JD Martin, M Phielipp, C Finn
Conference on Robot Learning, 3654-3671, 2023. Cited by 3.

Aligning modalities in vision large language models via preference fine-tuning
Y Zhou, C Cui, R Rafailov, C Finn, H Yao
arXiv preprint arXiv:2402.11411, 2024. Cited by 2.

Offline retraining for online RL: Decoupled policy learning to mitigate exploration bias
MS Mark, A Sharma, F Tajwar, R Rafailov, S Levine, C Finn
arXiv preprint arXiv:2310.08558, 2023. Cited by 2.

Example-based offline reinforcement learning without rewards
K Hatch, T Yu, R Rafailov, C Finn
Proceedings of Machine Learning Research vol 144, 1-17, 2022. Cited by 2.

From r to Q*: Your Language Model is Secretly a Q-Function
R Rafailov, J Hejna, R Park, C Finn
arXiv preprint arXiv:2404.12358, 2024. Cited by 1.