LlamaCast

By: Shahriar Shariati
  • Summary

  • A daily podcast about published articles in the LLM field.
Episodes
  • Inference Scaling for Long-Context RAG
    Oct 20 2024
    🗓 Inference Scaling for Long-Context Retrieval Augmented Generation

    This research paper explores the effectiveness of inference scaling for retrieval augmented generation (RAG), a technique that enhances large language models (LLMs) by incorporating external knowledge. The authors introduce two strategies, demonstration-based RAG (DRAG) and iterative demonstration-based RAG (IterDRAG), for effectively scaling inference computation. They demonstrate that increasing inference computation, when optimally allocated, leads to nearly linear gains in RAG performance. They also develop a computation allocation model that predicts the optimal test-time compute allocation across tasks and scenarios, and show that its predictions align with experimental results. A hedged code sketch of the iterative strategy appears after this entry.

    📎 Link to paper
    12 mins
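    The sketch below illustrates, in a hedged way, the iterative demonstration-based retrieval loop described in this summary: retrieval and generation are interleaved so that additional inference compute buys additional retrieval rounds. The `retrieve` and `generate` callables and their return shapes are hypothetical placeholders, not the paper's API.

    ```python
    # Minimal IterDRAG-style sketch, assuming hypothetical `retrieve(query, k)` and
    # `generate(...)` helpers (not the paper's actual interface).
    def iter_drag(question, retrieve, generate, max_rounds=3, docs_per_round=5):
        context = []                      # accumulated retrieved passages
        query = question
        for _ in range(max_rounds):
            context.extend(retrieve(query, k=docs_per_round))
            # The model either answers or proposes a follow-up sub-query.
            step = generate(question=question, context=context)
            if step["type"] == "answer":
                return step["text"]
            query = step["text"]          # sub-query drives the next retrieval round
        # Budget exhausted: answer with whatever context was gathered.
        return generate(question=question, context=context, force_answer=True)["text"]
    ```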
  • Model Swarms
    Oct 19 2024
    🤝 Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

    This paper presents a new method called MODEL SWARMS, a collaborative search algorithm for adapting large language models (LLMs) using swarm intelligence. The researchers treat each LLM expert as a "particle" in a swarm and use particle swarm optimization (PSO) to collaboratively search the weight space for better models. This approach lets LLMs adapt to a variety of objectives, including single tasks, multi-task domains, reward models, and human interests, without requiring large amounts of training data. Extensive experiments show that MODEL SWARMS outperforms existing model composition baselines and surfaces previously unseen capabilities in LLMs. A hedged code sketch of the swarm update appears after this entry.

    📎 Link to paper
    12 mins
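    The sketch below illustrates the core idea from this summary under stated assumptions: each expert's weights are treated as a particle and updated with standard particle swarm optimization. The `utility` scoring function and the flat weight vectors are hypothetical simplifications, not the paper's implementation.

    ```python
    import numpy as np

    # Minimal PSO-over-weights sketch; `experts` is a list of flat weight vectors
    # and `utility(w)` is an assumed scoring function (e.g. validation accuracy of
    # the model reconstructed from w).
    def model_swarm(experts, utility, steps=50, inertia=0.7, c1=1.5, c2=1.5):
        particles = [np.asarray(w, dtype=float).copy() for w in experts]
        velocities = [np.zeros_like(p) for p in particles]
        best_pos = [p.copy() for p in particles]          # per-particle bests
        best_score = [utility(p) for p in particles]
        g = int(np.argmax(best_score))
        global_pos, global_score = best_pos[g].copy(), best_score[g]

        for _ in range(steps):
            for i in range(len(particles)):
                r1, r2 = np.random.rand(2)
                # Standard PSO velocity update: inertia + pull toward personal
                # and global bests.
                velocities[i] = (inertia * velocities[i]
                                 + c1 * r1 * (best_pos[i] - particles[i])
                                 + c2 * r2 * (global_pos - particles[i]))
                particles[i] = particles[i] + velocities[i]
                score = utility(particles[i])
                if score > best_score[i]:
                    best_pos[i], best_score[i] = particles[i].copy(), score
                    if score > global_score:
                        global_pos, global_score = particles[i].copy(), score
        return global_pos                                  # best weights found
    ```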
  • Agent-as-a-Judge
    Oct 18 2024
    🤖 Agent-as-a-Judge: Evaluate Agents with Agents

    The paper details a new framework for evaluating agentic systems, called Agent-as-a-Judge, in which agentic systems are used to assess the performance of other agentic systems. To test this framework, the authors created DevAI, a benchmark dataset of 55 realistic automated AI development tasks. They compared Agent-as-a-Judge to LLM-as-a-Judge and Human-as-a-Judge on DevAI, finding that Agent-as-a-Judge outperforms both and aligns closely with human evaluations. The authors also discuss the benefits of Agent-as-a-Judge for providing intermediate feedback and creating a flywheel effect, in which both the judge and the evaluated agents improve through an iterative process. A hedged code sketch of the requirement-level judging loop appears after this entry.

    📎 Link to paper
    🤗 See their HuggingFace

    9 mins
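    The sketch below illustrates the judge idea from this summary in a hedged form: a judging routine checks an evaluated agent's output against each task requirement and returns requirement-level feedback rather than a single score. The `llm` callable and the prompt format are assumptions, not the paper's framework.

    ```python
    # Minimal Agent-as-a-Judge-style sketch, assuming a hypothetical `llm(prompt)`
    # callable that returns a string reply.
    def judge_agent_output(task, requirements, agent_output, llm):
        verdicts = []
        for req in requirements:
            prompt = (f"Task: {task}\n"
                      f"Requirement: {req}\n"
                      f"Agent output:\n{agent_output}\n"
                      "Does the output satisfy this requirement? "
                      "Start your reply with 'yes' or 'no', then explain.")
            reply = llm(prompt)
            verdicts.append({"requirement": req,
                             "satisfied": reply.strip().lower().startswith("yes"),
                             "feedback": reply})
        score = sum(v["satisfied"] for v in verdicts) / len(verdicts)
        return {"score": score, "verdicts": verdicts}  # per-requirement intermediate feedback
    ```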
