Woof to Husky’Log!

This is Husky’Log, a beginner explorer in the AI world. I write about my projects, thoughts, and more. I hope to develop the habit of writing high-quality blog posts.

How to Write a Good Letter of Recommendation?

When requesting a recommendation letter from professors, it’s quite common nowadays to be asked to provide a timeline, a work summary, or even a draft of your letter. Professors are busy (you may not imagine how many emails they need to reply to in a day), so it’s a reasonable request. Besides, this is actually good news, since you can shape the content of the recommendation letter yourself. Therefore, you need to learn how to write a good recommendation letter, even as a student. Here I have collected several sources and my notes on them; I hope they are helpful. ...

October 22, 2024 · 10 min · 1985 words · Benhao Huang

Paper Reading: Cheating Popular LLM Benchmarks

Anti-cheating has long been a critical consideration when designing the rules for leaderboards, but this remains unexplored in the context of LLM benchmarks [1] (X. Zheng, T. Pang, C. Du, Q. Liu, J. Jiang, M. Lin, "Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates", 2024). Introduction: There are many well-known LLM benchmarks, such as AlpacaEval 2.0 [2] (Y. Dubois, B. Galambosi, P. Liang, T. Hashimoto, "Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators", 2024), Arena-Hard-Auto [3] (T. Li, W. Chiang, E. Frick, L. Dunlap, T. Wu, B. Zhu, J. Gonzalez, I. Stoica, "From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline", 2024) and [4] (L. Zheng, W. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, Z. Lin, Z. Li, D. Li, E. Xing, H. Zhang, J. Gonzalez, I. Stoica, "Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena", 2023), and MT-Bench [4]. They are widely used in the research community to evaluate the performance of LLMs. ...

October 11, 2024 · 8 min · 1630 words · Benhao Huang